
Python Study Notes --- The Difference Between the utf-8 and utf-8-sig Encodings

2016-09-27 15:06
As UTF-8 is an 8-bit encoding no BOM is required and any U+FEFF character in the decoded Unicode string
(even if it’s the first character) is treated as a ZERO WIDTH NO-BREAK SPACE.


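UTF-8 uses the byte as its code unit, so its byte order is the same on every system and there is no endianness problem. For that reason it does not actually need a BOM ("Byte Order Mark"). UTF-8 with BOM, which Python calls utf-8-sig, does include a BOM at the start of the data.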
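As a minimal sketch of the point above (the sample string "abc" is just an illustrative assumption), the plain utf-8 codec neither writes a BOM when encoding nor strips one when decoding, so a leading BOM in the input survives as a U+FEFF character:

```python
import codecs

# Encoding with plain utf-8 adds no BOM.
plain = "abc".encode("utf-8")
print(plain)  # b'abc'

# codecs.BOM_UTF8 is the UTF-8 encoded BOM: the three bytes EF BB BF.
data = codecs.BOM_UTF8 + plain

# Decoding with plain utf-8 keeps the BOM as a U+FEFF character.
text = data.decode("utf-8")
print(repr(text))  # '\ufeffabc'
print(len(text))   # 4
```

The extra character is invisible when printed, which is why a file saved "with BOM" can look fine yet fail a comparison such as `text == "abc"`.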


Detailed explanation:

Python 'utf-8-sig' Codec
This works similarly to UTF-8 with the following changes:

* On encoding/writing a UTF-8 encoded BOM will be prepended/written as the
first three bytes.

* On decoding/reading if the first three bytes are a UTF-8 encoded BOM, these
bytes will be skipped.
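
The two rules above can be checked with a short sketch. The sample string is an illustrative assumption; the codec names and `codecs.BOM_UTF8` come from the standard library:

```python
import codecs

sample = "abc"

# Encoding: utf-8-sig prepends the three-byte BOM, utf-8 does not.
plain = sample.encode("utf-8")
with_bom = sample.encode("utf-8-sig")
print(plain)     # b'abc'
print(with_bom)  # b'\xef\xbb\xbfabc'

# Decoding: utf-8-sig skips a leading BOM if one is present...
decoded_bom = with_bom.decode("utf-8-sig")
print(decoded_bom)  # abc

# ...and decodes BOM-less input the same way plain utf-8 would.
decoded_plain = plain.decode("utf-8-sig")
print(decoded_plain)  # abc
```

Because utf-8-sig accepts input with or without a BOM, it is a reasonable choice for reading files whose origin is unknown, for example `open(path, encoding="utf-8-sig")`, where `path` is a placeholder for your own file.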