
utf8 to utf8mb4: making MySQL support all of Unicode

2016-08-04 16:37
Are you using MySQL’s utf8 charset in your databases? In this write-up I’ll explain why you should switch to utf8mb4 instead, and how to do it.


UTF-8

The UTF-8 encoding can represent every symbol in the Unicode character set, which ranges from U+000000 to U+10FFFF. That’s 1,114,112 possible symbols. (Not all of these Unicode code points have been assigned characters yet, but that doesn’t stop UTF-8 from being able to encode them.)

UTF-8 is a variable-width encoding; it encodes each symbol using one to four 8-bit bytes. Symbols with lower numerical code point values are encoded using fewer bytes. This way, UTF-8 is optimized for the common case where ASCII characters and other BMP symbols (whose code points range from U+000000 to U+00FFFF) are used, while still allowing astral symbols (whose code points range from U+010000 to U+10FFFF) to be stored.
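
To make the variable-width behaviour concrete, here is a minimal Python sketch. It is not part of the original write-up. The helper name utf8_length and the sample symbols are chosen purely for illustration; the only library call used is the built-in str.encode.

```python
def utf8_length(symbol: str) -> int:
    """Return how many bytes UTF-8 needs to encode a single symbol."""
    return len(symbol.encode("utf-8"))


# Illustrative samples (hypothetical picks, not taken from the article's data).
samples = {
    "A": "U+0041 (ASCII)",
    "\u00E9": "U+00E9 (BMP)",
    "\u20AC": "U+20AC (BMP)",
    "\U0001D306": "U+1D306 (astral)",
}

for symbol, label in samples.items():
    print(f"{label}: {utf8_length(symbol)} byte(s)")

# Number of code points in the range U+000000 through U+10FFFF, inclusive.
total_code_points = 0x10FFFF + 1
print(total_code_points)
```

The ASCII sample takes one byte, the two BMP samples take two and three bytes, and the astral sample takes four, which matches the one-to-four byte range described above.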


MySQL’s utf8

For a long time, I was using MySQL’s utf8 charset for databases, tables, and columns, assuming it mapped to the UTF-8 encoding described above. By using utf8, I’d be able to store any symbol I want in my database, or so I thought.

While writing about JavaScript’s internal character encoding, I noticed that there was no way to insert the U+1D306 TETRAGRAM FOR CENTRE (