MySQL 全文索引基础操作
2015-11-15 10:44
701 查看
当前测试的 mysql 版本: 5.0.95
MYSQL MYISAM引擎支持全文索引;MYSQL 5.6 后,InnoDB 也完全支持全文索引。
全文索引支持的字段类型包括 char ,varchar , text 。
创建测试表:(36行数据)
5.0 中 InnoDB 引擎创建失败,改为 MYISAM 再创建:
使用全文索引搜索数据:
使用 IN BOOLEAN MODE 设置特定搜索:
上面的一个查询为例,匹配度越高,匹配的结果值(如下字段 ref )越大。
现在查询一个单词,却找不出结果。 词语 'DEC' 不属于过滤的单词,为什么查不出?
查看全文索引变量,就知道了。最小的长度为 4 字符,4 个字符以下的都忽略不当作单词了!
#变量说明
http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html
ft_boolean_syntax :IN BOOLEAN MODE 匹配的选项
ft_max_word_len :最长匹配的字符数
ft_min_word_len :最短匹配的字符数
ft_query_expansion_limit :使用WITH QUERY EXPANSION 搜索出的前N(20)个结果再次搜索
ft_stopword_file :停词表文件,用于设置不需要匹配的词表
#默认停止词表
http://dev.mysql.com/doc/refman/5.0/en/fulltext-stopwords.html
设置变量,允许查询3个长度的单词:
再查询,结果出来了!
而词语“for” 却查询不出!被忽略掉了!
变量 ft_stopword_file 没有指定文件(built-in),使用系统默认设置(参考:Full-Text Stopwords)
现在手动设置:
再查询,结果出来了!
现在需要把 “for” 忽略掉,怎么设置呢?如果添加过滤词,编辑文件添加该字符:
再查询,找不到了!已经被忽略!~
参考: Full-Text Search Functions
MYSQL MYISAM引擎支持全文索引;MYSQL 5.6 后,InnoDB 也完全支持全文索引。
全文索引支持的字段类型包括 char ,varchar , text 。
创建测试表:(36行数据)
CREATE TABLE test.ft_tab ( id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY, charname CHAR(10), collatename VARCHAR(200), description TEXT(500), maxlen int ) ENGINE=InnoDB; insert into test.ft_tab (charname,collatename,description,maxlen) select CHARACTER_SET_NAME,DEFAULT_COLLATE_NAME,DESCRIPTION,MAXLEN from information_schema.character_sets ;
5.0 中 InnoDB 引擎创建失败,改为 MYISAM 再创建:
mysql> CREATE FULLTEXT INDEX idx_ft_tab ON test.ft_tab(description); ERROR 1214 (HY000): The used table type doesn't support FULLTEXT indexes mysql> mysql> ALTER TABLE test.ft_tab ENGINE = MYISAM; Query OK, 36 rows affected (0.02 sec) Records: 36 Duplicates: 0 Warnings: 0 mysql> CREATE FULLTEXT INDEX idx_ft_tab ON test.ft_tab(description); Query OK, 36 rows affected (0.01 sec) Records: 36 Duplicates: 0 Warnings: 0
使用全文索引搜索数据:
mysql> explain select * from test.ft_tab where description like '%Windows%'; +----+-------------+--------+------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------+------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | ft_tab | ALL | NULL | NULL | NULL | NULL | 36 | Using where | +----+-------------+--------+------+---------------+------+---------+------+------+-------------+ 1 row in set (0.00 sec) mysql> explain select * from test.ft_tab where match(description) against('Windows'); +----+-------------+--------+----------+---------------+------------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------+----------+---------------+------------+---------+------+------+-------------+ | 1 | SIMPLE | ft_tab | fulltext | idx_ft_tab | idx_ft_tab | 0 | | 1 | Using where | +----+-------------+--------+----------+---------------+------------+---------+------+------+-------------+ 1 row in set (0.00 sec)
使用 IN BOOLEAN MODE 设置特定搜索:
#至少包含一个单词(或) select * from test.ft_tab where match(description) against('Windows Central' IN BOOLEAN MODE); #括号,至少包含一个单词(或) select * from test.ft_tab where match(description) against('(Windows Central)' IN BOOLEAN MODE); select * from test.ft_tab where match(description) against('+(Windows Central)' IN BOOLEAN MODE); #括号/大于小于好,至少包含一个单词(或),大于号的排序最高,小于号的最低,同时存在的处中间 select * from test.ft_tab where match(description) against('+(>Windows <Central)' IN BOOLEAN MODE); select * ,(match(description) against('+(>Windows <Central)' IN BOOLEAN MODE)) AS REF from test.ft_tab where match(description) against('+(>Windows <Central)' IN BOOLEAN MODE); #两个(多个)单词同时包含(与) select * from test.ft_tab where match(description) against('+Windows +Central' IN BOOLEAN MODE); #包含一个(多个)单词,且不包含另一个(多个)单词 select * from test.ft_tab where match(description) against('+Windows -Central' IN BOOLEAN MODE); #包含一个(多个)单词,如果包含一个单词(如 Central),排序更高 select * from test.ft_tab where match(description) against('+Windows Central' IN BOOLEAN MODE); select * ,(match(description) against('+Windows Central' IN BOOLEAN MODE)) AS REF from test.ft_tab where match(description) against('+Windows Central' IN BOOLEAN MODE); #双引号,内部作为一个词组查询 select * from test.ft_tab where match(description) against('"Windows Central"' IN BOOLEAN MODE); #星号,模糊匹配,只在单词尾部 select * from test.ft_tab where match(description) against('Windows*' IN BOOLEAN MODE); select * from test.ft_tab where match(description) against('Cen*' IN BOOLEAN MODE); #包含一个(多个)单词,如果包含一个单词(如 ~Central),排序更低 select * from test.ft_tab where match(description) against('+Windows ~Central' IN BOOLEAN MODE); select * ,(match(description) against('+Windows ~Central' IN BOOLEAN MODE)) AS REF from test.ft_tab where match(description) against('+Windows ~Central' IN BOOLEAN MODE);
上面的一个查询为例,匹配度越高,匹配的结果值(如下字段 ref )越大。
mysql> select * ,(match(description) against('+(>Windows <Central)' IN BOOLEAN MODE)) AS REF -> from test.ft_tab where match(description) against('+(>Windows <Central)' IN BOOLEAN MODE); +----+----------+---------------------+-----------------------------+--------+-------------------+ | id | charname | collatename | description | maxlen | REF | +----+----------+---------------------+-----------------------------+--------+-------------------+ | 7 | latin2 | latin2_general_ci | ISO 8859-2 Central European | 1 | 0.666666686534882 | | 18 | cp1250 | cp1250_general_ci | Windows Central European | 1 | 1.16666674613953 | | 26 | macce | macce_general_ci | Mac Central European | 1 | 0.666666686534882 | | 28 | cp852 | cp852_general_ci | DOS Central European | 1 | 0.666666686534882 | | 30 | cp1251 | cp1251_general_ci | Windows Cyrillic | 1 | 1.5 | | 31 | cp1256 | cp1256_general_ci | Windows Arabic | 1 | 1.5 | | 32 | cp1257 | cp1257_general_ci | Windows Baltic | 1 | 1.5 | | 35 | cp932 | cp932_japanese_ci | SJIS for Windows Japanese | 2 | 1.5 | | 36 | eucjpms | eucjpms_japanese_ci | UJIS for Windows Japanese | 3 | 1.5 | +----+----------+---------------------+-----------------------------+--------+-------------------+
现在查询一个单词,却找不出结果。 词语 'DEC' 不属于过滤的单词,为什么查不出?
mysql> select * from test.ft_tab where match(description) against('DEC'); Empty set (0.00 sec)
查看全文索引变量,就知道了。最小的长度为 4 字符,4 个字符以下的都忽略不当作单词了!
mysql> show variables like '%ft%'; +--------------------------+----------------+ | Variable_name | Value | +--------------------------+----------------+ | ft_boolean_syntax | + -><()~*:""&| | | ft_max_word_len | 84 | | ft_min_word_len | 4 | | ft_query_expansion_limit | 20 | | ft_stopword_file | (built-in) | +--------------------------+----------------+
#变量说明
http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html
ft_boolean_syntax :IN BOOLEAN MODE 匹配的选项
ft_max_word_len :最长匹配的字符数
ft_min_word_len :最短匹配的字符数
ft_query_expansion_limit :使用WITH QUERY EXPANSION 搜索出的前N(20)个结果再次搜索
ft_stopword_file :停词表文件,用于设置不需要匹配的词表
#默认停止词表
http://dev.mysql.com/doc/refman/5.0/en/fulltext-stopwords.html
设置变量,允许查询3个长度的单词:
[root@localhost]# vi /etc/my.cnf [mysqld] ft_min_word_len=3 #重启服务,修复表 [root@localhost]# service mysqld restart mysql> repair table test.ft_tab;
再查询,结果出来了!
mysql> select * from test.ft_tab where match(description) against('DEC'); +----+----------+-----------------+-------------------+--------+ | id | charname | collatename | description | maxlen | +----+----------+-----------------+-------------------+--------+ | 2 | dec8 | dec8_swedish_ci | DEC West European | 1 | +----+----------+-----------------+-------------------+--------+
而词语“for” 却查询不出!被忽略掉了!
mysql> select * from test.ft_tab where match(description) against('for'); Empty set (0.00 sec)
变量 ft_stopword_file 没有指定文件(built-in),使用系统默认设置(参考:Full-Text Stopwords)
现在手动设置:
[root@localhost]# vi /etc/my.cnf [mysqld] ft_stopword_file=/usr/share/mysql/ft_stopword.txt [root@localhost]# service mysqld restart mysql> repair table test.ft_tab; mysql> show variables like '%ft%';
mysql> show variables like '%ft%'; +--------------------------+----------------------------------+ | Variable_name | Value | +--------------------------+----------------------------------+ | ft_boolean_syntax | + -><()~*:""&| | | ft_max_word_len | 84 | | ft_min_word_len | 3 | | ft_query_expansion_limit | 20 | | ft_stopword_file | /usr/share/mysql/ft_stopword.txt | +--------------------------+----------------------------------+
再查询,结果出来了!
mysql> select * from test.ft_tab where match(description) against('for'); +----+----------+---------------------+---------------------------+--------+ | id | charname | collatename | description | maxlen | +----+----------+---------------------+---------------------------+--------+ | 35 | cp932 | cp932_japanese_ci | SJIS for Windows Japanese | 2 | | 36 | eucjpms | eucjpms_japanese_ci | UJIS for Windows Japanese | 3 | +----+----------+---------------------+---------------------------+--------+
现在需要把 “for” 忽略掉,怎么设置呢?如果添加过滤词,编辑文件添加该字符:
[root@localhost]# vi /usr/share/mysql/ft_stopword.txt 'for','in','at' [root@localhost]# service mysqld restart
再查询,找不到了!已经被忽略!~
mysql> select * from test.ft_tab where match(description) against('for'); Empty set (0.00 sec)
参考: Full-Text Search Functions
相关文章推荐
- MySQL入门
- 使用mysql时遇到的各种问题
- mysql使用笔记:相关术语和定义
- mysql子查询
- mysqldump:Couldn't execute 'show create table `tablename`': Table tablename' doesn't exist (1146)
- ajax+mysql
- mac 使用mysql 5.7.9
- MySQL备份与恢复
- 不同场景下 MySQL 的迁移方案
- ubuntu下 pdo_mysql扩展项详细安装
- MySQL的备份方式
- MYSQL的COMMIT和ROLLBACK
- mysql绿色版安装
- mysql设置事务隔离级别
- mysql-5.6 升级 5.7
- mysql启用skip-name-resolve模式时出现Warning的处理办法
- windos安装mysql(zip版本msql包) 初始化密码、新增用户
- MySQL——数据类型、列属性
- MySQL——列属性、查询select、模糊查询、聚合函数
- Mysql5.7设定root密码