[Java] The Java encoding I can't figure out (Original, Software Technology, Computers and Networking)
Posted by 邢红瑞 on 2007/5/15 18:15:56
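I recently worked on a project that spans three platforms (platforms, not just languages): PHP, C, and Java. Chinese text sent from PHP is UTF-8 encoded on the Java side and then written to an XML file whose encoding is UTF-8. The string in the XML file is н¨¾ø¶Ôʱ¼ä. When I parse the file with JAXP and read it into Java, I get 脨脗陆篓戮酶露脭脢卤录盲. However, converting н¨¾ø¶Ôʱ¼ä from ISO-8859-1 to GB2312 yields the intended Chinese, 新建绝对时间 ("new absolute time"). And if I run н¨¾ø¶Ôʱ¼ä through new String(node.getTextContent().getBytes("UTF-8"),"GBK"), I get 脨脗陆篓戮酶露脭脢卤录盲 again.

After thinking it over, it suddenly dawned on me: the JDK had fooled me. I am on Chinese Windows XP, so the JDK picked GBK as its default encoding at install time, and every String I build from bytes without naming a charset is decoded as GBK. The only way around that is to name the charset explicitly with public String(byte[] bytes, String charsetName) throws UnsupportedEncodingException, which constructs a new String by decoding the given byte array with the specified charset.

What I still don't understand is why Sun's documentation says that a String represents a string in UTF-16 format:

Note: UTF-16 depends on the system's byte-ordering conventions. Although in most
systems, high-order bytes follow low-order bytes in a 16-bit or 32-bit "word,"
some systems use the reverse order. UTF-16 documents cannot be interchanged
between such systems without a conversion.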
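The chain of conversions described above can be reproduced in a few lines. This is only a sketch under assumptions: it guesses that the bytes PHP sent were GBK and that something along the way read them as ISO-8859-1, which is consistent with the ISO-8859-1 to GB2312 conversion recovering 新建绝对时间. The class and method names (MojibakeDemo, misreadAsLatin1, and so on) are made up for illustration, the Chinese text is written with \u escapes so the source file stays ASCII, and the GBK charset has to be available in the JRE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Assumption: the JRE ships the GBK charset (it lives in the extended charsets).
    static final Charset GBK = Charset.forName("GBK");

    // Layer 1: GBK bytes read as ISO-8859-1. Every byte becomes one Latin-1 char.
    static String misreadAsLatin1(String original) {
        return new String(original.getBytes(GBK), StandardCharsets.ISO_8859_1);
    }

    // Undo layer 1: recover the raw bytes, then decode them as GBK.
    static String repairFromLatin1(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), GBK);
    }

    // Layer 2: the call from the post, getBytes("UTF-8") followed by decoding as GBK.
    static String reencodeUtf8AsGbk(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), GBK);
    }

    public static void main(String[] args) {
        // The six characters of the intended text, as Unicode escapes.
        String original = "\u65b0\u5efa\u7edd\u5bf9\u65f6\u95f4";

        String latin1View = misreadAsLatin1(original);
        System.out.println("chars after layer 1: " + latin1View.length());
        System.out.println("layer 1 undone: " + repairFromLatin1(latin1View).equals(original));

        String secondLayer = reencodeUtf8AsGbk(latin1View);
        System.out.println("chars after layer 2: " + secondLayer.length());
    }
}
```

In none of these steps does the String itself "have" an encoding. A charset only comes into play when chars are turned into bytes (getBytes) or bytes into chars (the String constructor), which is why picking the wrong one at either boundary produces garbage.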
|
Re: The Java encoding I can't figure out (Original, Software Technology, Computers and Networking)
kruce (guest) commented on 2007/5/16 19:31:57
Constructs a new String by decoding the specified byte array using the platform's default charset. The length of the new String is a function of the charset, and hence may not be equal to the length of the byte array. The behavior of this constructor when the given bytes are not valid in the default charset is unspecified.
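To make the quoted constructor behaviour concrete, here is a small sketch that contrasts decoding with the platform default against decoding with a named charset. DefaultCharsetDemo and its method names are hypothetical; the only library calls used are Charset.defaultCharset(), the String constructors, and String.getBytes(Charset).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultCharsetDemo {
    // Decoding with a named charset gives the same result on every machine.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decoding with the platform default, which is what new String(bytes) does.
    static String decodeWithPlatformDefault(byte[] bytes) {
        return new String(bytes, Charset.defaultCharset());
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes to keep the source ASCII.
        String text = "\u4e2d\u6587";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        System.out.println("platform default: " + Charset.defaultCharset());
        System.out.println("explicit UTF-8 round trip ok: " + decodeUtf8(utf8).equals(text));
        // Whether this prints true depends on the machine's default charset.
        System.out.println("default round trip ok: " + decodeWithPlatformDefault(utf8).equals(text));
    }
}
```

On a machine whose default is GBK, as in the post, the last line would be expected to report a mismatch, while the explicit UTF-8 decode does not depend on the machine at all.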
What exactly is a charset? Technically, it is "the combination of a coded character set and a character-encoding scheme". In other words, a charset is a bunch of characters, the numerical codes that represent them, and a way to translate back and forth between a sequence of character codes and a sequence of bytes. The translation scheme differs greatly among charsets. Some have a one-to-one mapping between characters and bytes; most do not. |
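As a rough illustration of how differently charsets map characters to bytes, the sketch below encodes the same two-character string several ways and prints the bytes in hex. It also shows the byte-order point from the note quoted in the post: UTF-16BE and UTF-16LE hold the same code units with the two bytes swapped. CharsetMappingDemo, byteLength, and hex are hypothetical names chosen for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMappingDemo {
    // Number of bytes the string occupies once encoded with the given charset.
    static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    // Upper-case hex dump of a byte array, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Latin letter 'A' followed by one Chinese character (U+4E2D).
        String s = "A\u4e2d";

        Charset[] charsets = {
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE,
            StandardCharsets.UTF_16LE,
        };
        for (Charset cs : charsets) {
            System.out.println(cs.name() + ": " + byteLength(s, cs)
                    + " bytes, " + hex(s.getBytes(cs)));
        }
    }
}
```

In memory a Java String is a sequence of 16-bit char values; byte order only becomes a question once those values are written out as bytes, which is where the BE and LE variants differ.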