
About URLEncoder and URLDecoder

2015-11-04 15:17, 323 views

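Today I'll briefly summarize these two classes for encoding and decoding. Since I haven't used them much yet, this post mainly covers the basics.

First, here is how the API documentation describes URLEncoder: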

Utility class for HTML form encoding. This class contains static methods for converting a String to the application/x-www-form-urlencoded MIME format. For more information about HTML form encoding, consult the HTML specification.

When encoding a String, the following rules apply:

The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.

The special characters ".", "-", "*", and "_" remain the same.

The space character " " is converted into a plus sign "+".

All other characters are unsafe and are first converted into one or more bytes using some encoding scheme. Then each byte is represented by the 3-character string "%xy", where xy is the two-digit hexadecimal representation of the byte. The recommended encoding scheme to use is UTF-8. However, for compatibility reasons, if an encoding is not specified, then the default encoding of the platform is used.

For example using UTF-8 as the encoding scheme the string "The string ü@foo-bar" would get converted to "The+string+%C3%BC%40foo-bar" because in UTF-8 the character ü is encoded as two bytes C3 (hex) and BC (hex), and the character @ is encoded as one byte 40 (hex).

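In short, this class converts a string into the application/x-www-form-urlencoded MIME format. Letters and digits are unchanged by encoding, a space becomes "+", and unsafe characters are first converted to bytes using the specified charset and then written in "%xy" form. For example, "ü" encoded with UTF-8 becomes the two bytes C3 and BC (hexadecimal), that is, %C3%BC.
URLDecoder is the corresponding class for decoding.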
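
The rules above can be checked with a short program. This is only a minimal sketch: the class name FormEncodingRules and the helpers encode and decode are made up for illustration, and the non-ASCII character is written as a Unicode escape so the result does not depend on the source file's encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormEncodingRules {
    // Hypothetical helper: URLEncoder.encode with UTF-8, without the checked exception.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // every Java platform supports UTF-8
        }
    }

    // Hypothetical helper: URLDecoder.decode with UTF-8, without the checked exception.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Letters, digits and . - * _ are left unchanged.
        System.out.println(encode("abcXYZ019.-*_"));
        // A space becomes "+"; \u00FC (ü) and @ are percent-encoded from their UTF-8 bytes.
        String encoded = encode("The string \u00FC@foo-bar");
        System.out.println(encoded); // The+string+%C3%BC%40foo-bar
        // URLDecoder reverses the conversion.
        System.out.println(decode(encoded).equals("The string \u00FC@foo-bar")); // true
    }
}
```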

An example makes this clear:

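import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UrlCodecDemo {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String a = "你好";
        String encoded = URLEncoder.encode(a, "utf-8"); // percent-encode the UTF-8 bytes
        System.out.println(encoded);
        a = URLDecoder.decode(encoded, "utf-8"); // decode back to the original string
        System.out.println(a);
    }
}

The output is: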

%E4%BD%A0%E5%A5%BD

你好

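With "utf-8" as the charset, every Chinese character counts as a special character. "你" is encoded as the three bytes E4 BD A0, written as %E4%BD%A0. "好" is encoded as E5 A5 BD, written as %E5%A5%BD.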
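
Each UTF-8 byte becomes one "%xy" triple, so a three-byte character turns into nine characters of output. The sketch below checks this for both characters; the class name Utf8ByteCount and its helpers are hypothetical, and the characters are written as Unicode escapes (\u4F60 is 你, \u597D is 好).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class Utf8ByteCount {
    // Hypothetical helper: number of bytes s occupies in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Hypothetical helper: URLEncoder.encode with UTF-8, without the checked exception.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // every Java platform supports UTF-8
        }
    }

    public static void main(String[] args) {
        for (String ch : new String[] {"\u4F60", "\u597D"}) {
            // Prints the byte count, then the percent-encoded form (3 characters per byte).
            System.out.println(utf8Length(ch) + " bytes -> " + encode(ch));
        }
    }
}
```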

That covers the basics. Here is a question that came up along the way:

import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class GetBytesDemo {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String a = "你好";

        // Print each UTF-8 byte as a Java (signed) byte value, with no separator
        byte[] b = a.getBytes("utf-8");
        for (int i = 0; i < b.length; i++) {
            System.out.print(b[i]);
        }
        System.out.println();

        String encoded = URLEncoder.encode(a, "utf-8");
        System.out.println(encoded);
        a = URLDecoder.decode(encoded, "utf-8");
        System.out.println(a);
    }
}

The output is:

-28-67-96-27-91-67

%E4%BD%A0%E5%A5%BD

你好

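Calling string.getBytes("utf-8") by hand and printing the resulting bytes appears to give different values from the bytes that show up in the output of URLEncoder.encode(a, "utf-8"). Why?

I asked on CSDN and found an explanation: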

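The bytes returned by string.getBytes("utf-8") are the same bytes; only the way they are printed differs. Java's byte is a signed 8-bit two's-complement type with the range -128 to 127, so any byte whose highest bit is set prints as a negative decimal number, whereas URLEncoder writes each byte as two unsigned hexadecimal digits. To read off the magnitude of such a negative value, invert the bits and add one.

For example, E4 is 11100100 in binary. Inverting the bits and adding one gives 00011100, which is 28 in decimal, so the byte prints as -28.

The remaining bytes can be verified the same way and all follow this pattern: BD prints as -67, A0 as -96, E5 as -27, and A5 as -91.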
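
The relationship between the two outputs can be sketched in code. Java's byte type is signed, so masking with & 0xFF recovers the unsigned value (Byte.toUnsignedInt does the same). The class SignedByteDemo and the helper percentEncodeAll are hypothetical names for illustration, and the helper percent-encodes every byte without any of URLEncoder's safe-character handling.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SignedByteDemo {
    // Hypothetical helper: writes every UTF-8 byte of s as %XY, treating each byte as unsigned.
    static String percentEncodeAll(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%%%02X", b & 0xFF)); // & 0xFF maps -28 to 228 (0xE4)
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String a = "\u4F60\u597D"; // 你好
        for (byte b : a.getBytes(StandardCharsets.UTF_8)) {
            // Signed value, unsigned value, and two-digit hex of the same byte
            System.out.println(b + " -> " + (b & 0xFF) + " -> " + String.format("%02X", b & 0xFF));
        }
        // The hand-built string matches what URLEncoder produces for this input.
        System.out.println(percentEncodeAll(a).equals(URLEncoder.encode(a, "UTF-8")));
    }
}
```

The first line printed is "-28 -> 228 -> E4", which is the same byte shown three ways.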
