Index invalidation caused by computation on indexed columns: fixes and caveats

2012-08-21 14:20

Two examples

Example 1

Table structure

DROP TABLE IF EXISTS `account`;
CREATE TABLE IF NOT EXISTS `account` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`account` int(10) unsigned NOT NULL,
`password` char(32) NOT NULL,
`ip` char(15) NOT NULL,
`time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `time` (`time`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

For example, to count the members who registered on 2012-08-15:

SELECT count(id) FROM account WHERE DATEDIFF("2012-08-15",time)=0

Example 2

Table structure

DROP TABLE IF EXISTS `user`;
CREATE TABLE IF NOT EXISTS `user` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`userid` int(10) unsigned NOT NULL,
`lastactive` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `lastactive` (`lastactive`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

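Count the users who were active in the last 3 minutes:

SELECT count(id) FROM user WHERE unix_timestamp()-lastactive < 180

In both examples an index exists on the filtered column, yet the query does not use it and falls back to a full table scan.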
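One way to check this is to prefix each query with EXPLAIN and read the type and key columns. This is a minimal sketch, assuming both tables above exist and hold some rows; the exact output depends on the MySQL version and the data.

```sql
-- Example 1: the indexed column `time` is an argument of DATEDIFF().
EXPLAIN SELECT count(id) FROM account WHERE DATEDIFF('2012-08-15', time) = 0;

-- Example 2: the indexed column `lastactive` sits inside an arithmetic expression.
EXPLAIN SELECT count(id) FROM user WHERE unix_timestamp() - lastactive < 180;
```

When the index is not used, type shows ALL and key shows NULL.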

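Why this happens

If the WHERE clause wraps a column in a function, as in functionName(colname), or applies arithmetic to it, MySQL cannot use the index on colname. To use the index, the condition has to compare the bare column.

The index stops working because it is a B-tree built over the original column values; once the column value has been transformed by a computation, that tree can no longer be used to locate the result.

To avoid this, move the computation off the indexed column and onto the other side of the comparison, so that the column stands on its own.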

Solution

Example 1: SELECT count(id) FROM account WHERE time between "2012-08-15 00:00:00" and "2012-08-15 23:59:59"

Example 2: SELECT count(id) FROM user WHERE lastactive > unix_timestamp() - 180
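For Example 1, a half-open range is a common alternative to BETWEEN. The sketch below assumes the same account and user tables. It avoids spelling out the last second of the day and still leaves the column bare. Whether the optimizer then picks the index still depends on how many rows it expects to match, so EXPLAIN is used here to inspect the plan.

```sql
-- Rows from 2012-08-15 00:00:00 up to, but not including, 2012-08-16 00:00:00.
EXPLAIN SELECT count(id)
FROM account
WHERE time >= '2012-08-15 00:00:00'
  AND time <  '2012-08-16 00:00:00';

-- Example 2 rewritten: the arithmetic happens on the constant side only.
EXPLAIN SELECT count(id)
FROM user
WHERE lastactive > unix_timestamp() - 180;
```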

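A necessary caveat

We create indexes and rewrite queries in order to get better plans, but MySQL does not always execute a query the way we designed it. MySQL builds its execution plan from statistics, which involves the available indexes and their selectivity, the amount of data in the table, and some further factors.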

Each table index is queried, and the best index is used unless the optimizer believes that it is more efficient to use a table scan. At one time, a scan was used based on whether the best index spanned more than 30% of the table, but a fixed percentage no longer determines the choice between using an index or a scan. The optimizer now is more complex and bases its estimate on additional factors such as table size, number of rows, and I/O block size.

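In short, once MySQL estimates that more than roughly 30% of the rows match, it stops using the index, because it considers the index path more expensive than a scan; the optimizer chooses whichever plan it estimates to be cheapest. As the quoted passage says, 30% is a historical rule of thumb rather than a fixed threshold. In practice, this is indeed how it behaves.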
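The optimizer's estimates can be inspected directly. The following is a sketch using the account table from Example 1; the statements are standard MySQL, but the numbers they report depend on the data and the server version.

```sql
-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE account;

-- The Cardinality column is the estimated number of distinct values per index.
SHOW INDEX FROM account;

-- The rows column is the optimizer's estimate of how many rows it must examine.
EXPLAIN SELECT * FROM account
WHERE time >= '2012-08-15 00:00:00'
  AND time <  '2012-08-16 00:00:00';
```

Comparing the rows estimate with the table's total row count gives a rough idea of the selectivity the optimizer is working with.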

Testing it

Table structure

DROP TABLE IF EXISTS `active`;
CREATE TABLE IF NOT EXISTS `active` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`userid` int(10) unsigned NOT NULL,
`lastactive` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `lastactive` (`lastactive`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Insert data

insert into active values
(null,10000, unix_timestamp("2012-08-20 15:10:02")),
(null,10001, unix_timestamp("2012-08-20 15:10:02")),
(null,10002, unix_timestamp("2012-08-20 15:10:03")),
(null,10003, unix_timestamp("2012-08-20 15:10:03")),
(null,10004, unix_timestamp("2012-08-20 15:10:03")),
(null,10005, unix_timestamp("2012-08-20 15:10:04")),
(null,10006, unix_timestamp("2012-08-20 15:10:04")),
(null,10007, unix_timestamp("2012-08-20 15:10:05")),
(null,10008, unix_timestamp("2012-08-20 15:10:06"));

explain select * from active where lastactive > unix_timestamp()-3;

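The statement above uses the index.

However, the timestamps I inserted were far from the current time when I ran the test, so I rewrote the query as follows: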

explain select * from active where lastactive > unix_timestamp("2012-08-20 15:10:06") - 3;

This time, however, EXPLAIN shows type ALL and key NULL. In other words, the index is not used.

I then rewrote the statement once more:

explain select * from active where lastactive > unix_timestamp("2012-08-20 15:10:06");

With this statement the index is used again.
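These two results are consistent with the selectivity argument. As a quick check, assuming the table holds exactly the nine rows inserted above, the matching rows for each cutoff can be counted directly:

```sql
-- Cutoff 15:10:03: the rows at :04, :04, :05 and :06 match, i.e. 4 of 9.
SELECT SUM(lastactive > unix_timestamp('2012-08-20 15:10:06') - 3) AS matching,
       COUNT(*) AS total
FROM active;

-- Cutoff 15:10:06: no row is later than that, i.e. 0 of 9.
SELECT SUM(lastactive > unix_timestamp('2012-08-20 15:10:06')) AS matching,
       COUNT(*) AS total
FROM active;
```

With about 44% of the table matching, a full scan is plausibly the cheaper plan for the first cutoff, while the empty range in the second makes the index attractive.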

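An open question

I happened to have a table with 12,016 rows at hand, so I used it to check the claim that MySQL stops using the index once it estimates that more than 30% of the rows match. In that table, queries matching fewer than 1,854 rows used the index, and queries matching more did not. That threshold is about 15.4% of the table, so the 30% figure is doubtful. As noted above, the MySQL optimizer weighs several factors and chooses the plan it estimates to be cheapest.

MySQL decides on its own whether to use an index. If you are confident that the index is faster, you can force it with FORCE INDEX (index_name).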
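As a sketch of the hint syntax, using the active table and its lastactive index from the test above (whether forcing the index is actually faster has to be measured on real data):

```sql
-- FORCE INDEX follows the table name and precedes WHERE.
EXPLAIN SELECT *
FROM active FORCE INDEX (lastactive)
WHERE lastactive > unix_timestamp('2012-08-20 15:10:06') - 3;
```

Comparing the type and key columns with the unhinted EXPLAIN shows whether the hint changed the plan.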
