utf8-to-utf8mb4: full Unicode support in MySQL
2016-08-04 16:37
Are you using MySQL’s utf8 charset in your databases? In this write-up I’ll explain why you should switch to utf8mb4 instead, and how to do it.
UTF-8
The UTF-8 encoding can represent every symbol in the Unicode character set, which ranges from U+000000 to U+10FFFF. That’s 1,114,112 possible symbols. (Not all of these Unicode code points have been assigned characters yet, but that doesn’t stop UTF-8 from being able to encode them.)
UTF-8 is a variable-width encoding; it encodes each symbol using one to four 8-bit bytes. Symbols with lower numerical code point values are encoded using fewer bytes. This way, UTF-8 is optimized for the common case where ASCII characters and other BMP symbols (whose code points range from U+000000 to U+00FFFF) are used, while still allowing astral symbols (whose code points range from U+010000 to U+10FFFF) to be stored.
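To make the variable width concrete, here is a minimal Python sketch that reports how many bytes UTF-8 needs for a few sample symbols. The sample symbols and the helper name utf8_length are my own choices for illustration, not part of the original write-up.

```python
# Sample symbols chosen for illustration: one per UTF-8 byte length.
SAMPLES = {
    "U+0041 LATIN CAPITAL LETTER A": "\u0041",
    "U+00E9 LATIN SMALL LETTER E WITH ACUTE": "\u00e9",
    "U+20AC EURO SIGN": "\u20ac",
    "U+1D306 TETRAGRAM FOR CENTRE": "\U0001D306",
}


def utf8_length(symbol: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of `symbol`."""
    return len(symbol.encode("utf-8"))


for name, symbol in SAMPLES.items():
    print(f"{name}: {utf8_length(symbol)} byte(s)")

# The code point range U+000000..U+10FFFF holds 0x10FFFF + 1 values.
print(0x10FFFF + 1)  # 1114112
```

The first three samples are BMP symbols and take one, two, and three bytes respectively; the last one is an astral symbol and takes four.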
MySQL’s utf8
For a long time, I was using MySQL’s utf8 charset for databases, tables, and columns, assuming it mapped to the UTF-8 encoding described above. By using utf8, I’d be able to store any symbol I want in my database, or so I thought.
While writing about JavaScript’s internal character encoding, I noticed that there was no way to insert the U+1D306 TETRAGRAM FOR CENTRE (𝌆) symbol into the database.
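One way to check application-side whether a string would run into this limitation is sketched below in Python. It assumes, as the MySQL documentation describes, that the utf8 charset stores at most three bytes per symbol, so any astral symbol (code point above U+FFFF) does not fit. The function names astral_symbols and fits_in_three_bytes are hypothetical helpers, not part of MySQL or any library.

```python
def astral_symbols(text: str) -> list:
    """Return the symbols in `text` whose code points lie above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


def fits_in_three_bytes(text: str) -> bool:
    """Return True if every symbol in `text` needs at most three UTF-8 bytes.

    Assumption: this mirrors the three-bytes-per-symbol limit of MySQL's
    utf8 charset; it does not talk to a database.
    """
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


sample = "tetragram: \U0001D306"
print(fits_in_three_bytes(sample))  # False
print([f"U+{ord(ch):04X}" for ch in astral_symbols(sample)])  # ['U+1D306']
```

A check like this only flags the problematic input; storing such symbols still requires a charset that allows four bytes per symbol, which is what utf8mb4 is for.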