您的位置:首页 > 数据库 > MySQL

Mysql 字符集研究

2015-08-07 13:50 573 查看
mysql> show databases ;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

mysql> use information_schema ;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> use information_schema ;
Database changed
mysql> select * from CHARACTER_SETS ;

----- 字符集 符(Character)是各种文字和符号的总称,包括各国家文字、标点符号、图形符号、数字等。
-----字符集(Character set)是多个字符的集合,字符集种类较多,每个字符集包含的字符个数不同,常见字符集名称:ASCII字符集、GB2312字符集、BIG5字符集、 GB18030字符集、Unicode字符集等。
-----计算机要准确的处理各种字符集文字,需要进行字符编码,以便计算机能够识别和存储各种文字。

-- CHARACTER_SET_NAME 字符集名称
-- 默认的校验规则
-- 字符集描述
-- 单个字符可占用的最多字节数

+--------------------+----------------------+-----------------------------+--------+
| CHARACTER_SET_NAME | DEFAULT_COLLATE_NAME | DESCRIPTION                 | MAXLEN |
+--------------------+----------------------+-----------------------------+--------+
| big5               | big5_chinese_ci      | Big5 Traditional Chinese    |      2 |
| dec8               | dec8_swedish_ci      | DEC West European           |      1 |
| cp850              | cp850_general_ci     | DOS West European           |      1 |
| hp8                | hp8_english_ci       | HP West European            |      1 |
| koi8r              | koi8r_general_ci     | KOI8-R Relcom Russian       |      1 |
| latin1             | latin1_swedish_ci    | cp1252 West European        |      1 |
| latin2             | latin2_general_ci    | ISO 8859-2 Central European |      1 |
| swe7               | swe7_swedish_ci      | 7bit Swedish                |      1 |
| ascii              | ascii_general_ci     | US ASCII                    |      1 |
| ujis               | ujis_japanese_ci     | EUC-JP Japanese             |      3 |
| sjis               | sjis_japanese_ci     | Shift-JIS Japanese          |      2 |
| hebrew             | hebrew_general_ci    | ISO 8859-8 Hebrew           |      1 |
| tis620             | tis620_thai_ci       | TIS620 Thai                 |      1 |
| euckr              | euckr_korean_ci      | EUC-KR Korean               |      2 |
| koi8u              | koi8u_general_ci     | KOI8-U Ukrainian            |      1 |
| gb2312             | gb2312_chinese_ci    | GB2312 Simplified Chinese   |      2 |
| greek              | greek_general_ci     | ISO 8859-7 Greek            |      1 |
| cp1250             | cp1250_general_ci    | Windows Central European    |      1 |
| gbk                | gbk_chinese_ci       | GBK Simplified Chinese      |      2 |
| latin5             | latin5_turkish_ci    | ISO 8859-9 Turkish          |      1 |
| armscii8           | armscii8_general_ci  | ARMSCII-8 Armenian          |      1 |
| utf8               | utf8_general_ci      | UTF-8 Unicode               |      3 |
| ucs2               | ucs2_general_ci      | UCS-2 Unicode               |      2 |
| cp866              | cp866_general_ci     | DOS Russian                 |      1 |
| keybcs2            | keybcs2_general_ci   | DOS Kamenicky Czech-Slovak  |      1 |
| macce              | macce_general_ci     | Mac Central European        |      1 |
| macroman           | macroman_general_ci  | Mac West European           |      1 |
| cp852              | cp852_general_ci     | DOS Central European        |      1 |
| latin7             | latin7_general_ci    | ISO 8859-13 Baltic          |      1 |
| utf8mb4            | utf8mb4_general_ci   | UTF-8 Unicode               |      4 |
| cp1251             | cp1251_general_ci    | Windows Cyrillic            |      1 |
| utf16              | utf16_general_ci     | UTF-16 Unicode              |      4 |
| utf16le            | utf16le_general_ci   | UTF-16LE Unicode            |      4 |
| cp1256             | cp1256_general_ci    | Windows Arabic              |      1 |
| cp1257             | cp1257_general_ci    | Windows Baltic              |      1 |
| utf32              | utf32_general_ci     | UTF-32 Unicode              |      4 |
| binary             | binary               | Binary pseudo charset       |      1 |
| geostd8            | geostd8_general_ci   | GEOSTD8 Georgian            |      1 |
| cp932              | cp932_japanese_ci    | SJIS for Windows Japanese   |      2 |
| eucjpms            | eucjpms_japanese_ci  | UJIS for Windows Japanese   |      3 |
+--------------------+----------------------+-----------------------------+--------+

mysql> show character set ;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| dec8     | DEC West European           | dec8_swedish_ci     |      1 |
| cp850    | DOS West European           | cp850_general_ci    |      1 |
| hp8      | HP West European            | hp8_english_ci      |      1 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
| ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
| sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
| hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
| tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
| euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
| koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
| gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
| greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
| cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
| gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
| armscii8 | ARMSCII-8 Armenian          | armscii8_general_ci |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
| cp866    | DOS Russian                 | cp866_general_ci    |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak  | keybcs2_general_ci  |      1 |
| macce    | Mac Central European        | macce_general_ci    |      1 |
| macroman | Mac West European           | macroman_general_ci |      1 |
| cp852    | DOS Central European        | cp852_general_ci    |      1 |
| latin7   | ISO 8859-13 Baltic          | latin7_general_ci   |      1 |
| utf8mb4  | UTF-8 Unicode               | utf8mb4_general_ci  |      4 |
| cp1251   | Windows Cyrillic            | cp1251_general_ci   |      1 |
| utf16    | UTF-16 Unicode              | utf16_general_ci    |      4 |
| utf16le  | UTF-16LE Unicode            | utf16le_general_ci  |      4 |
| cp1256   | Windows Arabic              | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic              | cp1257_general_ci   |      1 |
| utf32    | UTF-32 Unicode              | utf32_general_ci    |      4 |
| binary   | Binary pseudo charset       | binary              |      1 |
| geostd8  | GEOSTD8 Georgian            | geostd8_general_ci  |      1 |
| cp932    | SJIS for Windows Japanese   | cp932_japanese_ci   |      2 |
| eucjpms  | UJIS for Windows Japanese   | eucjpms_japanese_ci |      3 |
+----------+-----------------------------+---------------------+--------+

字符集相关参数
·         character_set_client       来自客户端的语句的字符集。
·         character_set_connection   用于没有字符集导入符的文字和数字-字符串转换。
·         character_set_database     默认数据库使用的字符集。当默认数据库更改时,服务器则设置该变量。如果没有默认数据库,变量的值同character_set_server。
·         character_set_results      用于向客户端返回查询结果的字符集。
·         character_set_ server      服务器的默认字符集。
·         character_set_system       服务器用来保存识别符的字符集。该值一定是utf8。
·         character_sets_dir         字符集安装目录。

推荐所有参数设置的字符集相同以避免乱码

字符集参数的修改
1 修改mysqld启动选项
[mysql@server ~]$   mysqld  --defaults-file=/home/mysql/my.cnf   --character-set-server=latin1 >> 1.log &
mysql> show variables like 'character%' ;
+--------------------------+---------------------------------+
| Variable_name            | Value                           |
+--------------------------+---------------------------------+
| character_set_client     | utf8                            |
| character_set_connection | utf8                            |
| character_set_database   | latin1                          |
| character_set_filesystem | binary                          |
| character_set_results    | utf8                            |
| character_set_server     | latin1                          |
| character_set_system     | utf8                            |
| character_sets_dir       | /mysql/software/share/charsets/ |

2 修改参数文件

[mysql@server ~]$ vi my.cnf
[mysqld_mulit]
mysqld  = /mysql/software/bin/mysqld_safe
mysqladmin  =  /mysql/software/bin/mysqladmin

[mysqld]
socket     = /mysql/data/db_3306/mysql3306.sock
port       = 3306
pid-file   = /mysql/data/db_3306/mysq3306.pid
datadir    = /mysql/data/db_3306
character_set_server  =  latin1

[mysqld3306]
socket     = /mysql/data/db_3306/mysql3306.sock
port       = 3306
pid-file   = /mysql/data/db_3306/mysq3306.pid
datadir    = /mysql/data/db_3306

[mysqld3307]
socket     = /mysql/data/db_3307/mysql3307.sock
port       = 3307
pid-file   = /mysql/data/db_3307/mysq3307.pid
datadir    = /mysql/data/db_3307

[mysqld3308]
socket     = /mysql/data/db_3308/mysql3308.sock
port       = 3308
pid-file   = /mysql/data/db_3308/mysq3308.pid
datadir    = /mysql/data/db_3308

[mysql@server ~]$   mysqld  --defaults-file=/home/mysql/my.cnf  >> 1.log &
[1] 6827
[mysql@server ~]$ 2015-08-07 11:15:22 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2015-08-07 11:15:22 0 [Note] mysqld (mysqld 5.6.26) starting as process 6827 ...
2015-08-07 11:15:22 6827 [Note] Plugin 'FEDERATED' is disabled.
2015-08-07 11:15:22 6827 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-08-07 11:15:22 6827 [Note] InnoDB: The InnoDB memory heap is disabled
2015-08-07 11:15:22 6827 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-08-07 11:15:22 6827 [Note] InnoDB: Memory barrier is not used
2015-08-07 11:15:22 6827 [Note] InnoDB: Compressed tables use zlib 1.2.3
2015-08-07 11:15:22 6827 [Note] InnoDB: Using Linux native AIO
2015-08-07 11:15:22 6827 [Note] InnoDB: Using CPU crc32 instructions
2015-08-07 11:15:22 6827 [Note] InnoDB: Initializing buffer pool, size = 128.0M
2015-08-07 11:15:22 6827 [Note] InnoDB: Completed initialization of buffer pool
2015-08-07 11:15:22 6827 [Note] InnoDB: Highest supported file format is Barracuda.
2015-08-07 11:15:22 6827 [Note] InnoDB: 128 rollback segment(s) are active.
2015-08-07 11:15:22 6827 [Note] InnoDB: Waiting for purge to start
2015-08-07 11:15:22 6827 [Note] InnoDB: 5.6.26 started; log sequence number 1626107
2015-08-07 11:15:22 6827 [Note] Server hostname (bind-address): '*'; port: 3306
2015-08-07 11:15:22 6827 [Note] IPv6 is not available.
2015-08-07 11:15:22 6827 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
2015-08-07 11:15:22 6827 [Note] Server socket created on IP: '0.0.0.0'.
2015-08-07 11:15:22 6827 [Note] Event Scheduler: Loaded 0 events
2015-08-07 11:15:22 6827 [Note] mysqld: ready for connections.
Version: '5.6.26'  socket: '/mysql/data/db_3306/mysql3306.sock'  port: 3306  Source distribution

[mysql@server ~]$
[mysql@server ~]$
[mysql@server ~]$
[mysql@server ~]$  mysql -uroot -S /mysql/data/db_3306/mysql3306.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.26 Source distribution

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>  show variables like 'character%' ;
+--------------------------+---------------------------------+
| Variable_name            | Value                           |
+--------------------------+---------------------------------+
| character_set_client     | utf8                            |
| character_set_connection | utf8                            |
| character_set_database   | latin1                          |
| character_set_filesystem | binary                          |
| character_set_results    | utf8                            |
| character_set_server     | latin1                          |
| character_set_system     | utf8                            |
| character_sets_dir       | /mysql/software/share/charsets/ |
+--------------------------+---------------------------------+
8 rows in set (0.00 sec)

mysql>

各个层级字符集设置
1 实例级
参见上文
2 数据库级别
mysql> CREATE DATABASE dao CHARACTER set utf32 ;
Query OK, 1 row affected (0.01 sec)

mysql> use dao ;
Database changed
mysql> show create database dao ;
+----------+---------------------------------------------------------------+
| Database | Create Database                                               |
+----------+---------------------------------------------------------------+
| dao      | CREATE DATABASE `dao` /*!40100 DEFAULT CHARACTER SET utf32 */ |
+----------+---------------------------------------------------------------+
1 row in set (0.00 sec)

3 表级别
mysql> create table dao_test1 (a varchar(10))  default charset=utf8;
Query OK, 0 rows affected (0.05 sec)

mysql>  show create table dao_test1 ;
+-----------+------------------------------------------------------------------------------------------------+
| Table     | Create Table                                                                                   |
+-----------+------------------------------------------------------------------------------------------------+
| dao_test1 | CREATE TABLE `dao_test1` (
`a` varchar(10) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-----------+------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

4 列级别

mysql> create table dao_test2  (a varchar(10)  CHARACTER SET utf16  ,b varchar(100))  ;
Query OK, 0 rows affected (0.01 sec)

mysql>  show create table dao_test2 ;
+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table     | Create Table                                                                                                                                         |
+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| dao_test2 | CREATE TABLE `dao_test2` (
`a` varchar(10) CHARACTER SET utf16 DEFAULT NULL,
`b` varchar(100) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf32 |
+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: