Exploring Database Query Optimization Through MySQL EXPLAIN (Part 2)

2012-04-29 17:01

mysql> explain select A.id,A.title,B.title from jos_content A,jos_categories B where A.catid=B.id;
+----+-------------+-------+--------+---------------+---------+---------+---------------------+-------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                 | rows  | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+-------+-------------+
|  1 | SIMPLE      | A     | ALL    | NULL          | NULL    | NULL    | NULL                | 46585 |             |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.catid |     1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+-------+-------------+
2 rows in set (0.00 sec)

This is a style of query we use all the time. Table B is joined with type eq_ref through its PRIMARY index, but table A uses no index at all, which is probably not what we want. Looking at the SQL above, the natural idea is to add an index on A.catid.

mysql> alter table jos_content add index idx_catid(`catid`);
Query OK, 46585 rows affected (0.75 sec)
Records: 46585 Duplicates: 0 Warnings: 0

mysql> explain select A.id,A.title,B.title from jos_content A,jos_categories B where A.catid=B.id;
+----+-------------+-------+--------+---------------+---------+---------+---------------------+-------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                 | rows  | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+-------+-------------+
|  1 | SIMPLE      | A     | ALL    | idx_catid     | NULL    | NULL    | NULL                | 46585 |             |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.catid |     1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+-------+-------------+
2 rows in set (0.00 sec)

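Now idx_catid appears in possible_keys for table A. Note, however, that key is still NULL and type is still ALL: the index has become a candidate, but it is not actually used here, because A is read first and nothing in the query restricts its rows.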
Next, let us join three tables:

mysql> explain select A.id,A.title,B.title from jos_content A,jos_categories B,jos_sections C where A.catid=B.id and A.sectionid=C.id;
+----+-------------+-------+--------+---------------+---------+---------+---------------------+-------+--------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                 | rows  | Extra                          |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+-------+--------------------------------+
|  1 | SIMPLE      | C     | index  | PRIMARY       | PRIMARY | 4       | NULL                |     2 | Using index                    |
|  1 | SIMPLE      | A     | ALL    | idx_catid     | NULL    | NULL    | NULL                | 46585 | Using where; Using join buffer |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.catid |     1 | Using where                    |
+----+-------------+-------+--------+---------------+---------+---------+---------------------+-------+--------------------------------+
3 rows in set (0.00 sec)

The output shows that MySQL reads table C first, using its PRIMARY index, and then joins table A. For A the type is ALL: idx_catid is listed as a possible key but is not used. The reason is plain: the condition that links A to C is A.sectionid=C.id, so let us first add an index on A.sectionid.

mysql> alter table jos_content add index idx_section(`sectionid`);
Query OK, 46585 rows affected (0.89 sec)
Records: 46585 Duplicates: 0 Warnings: 0

mysql> explain select A.id,A.title,B.title from jos_content A,jos_categories B,jos_sections C where A.catid=B.id and A.sectionid=C.id;
+----+-------------+-------+--------+-----------------------+-------------+---------+---------------------+-------+-------------+
| id | select_type | table | type   | possible_keys         | key         | key_len | ref                 | rows  | Extra       |
+----+-------------+-------+--------+-----------------------+-------------+---------+---------------------+-------+-------------+
|  1 | SIMPLE      | C     | index  | PRIMARY               | PRIMARY     | 4       | NULL                |     2 | Using index |
|  1 | SIMPLE      | A     | ref    | idx_catid,idx_section | idx_section | 4       | joomla_test.C.id    | 23293 | Using where |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY               | PRIMARY     | 4       | joomla_test.A.catid |     1 | Using where |
+----+-------------+-------+--------+-----------------------+-------------+---------+---------------------+-------+-------------+
3 rows in set (0.00 sec)

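This time the effect is obvious. When joining table A the type has become ref and the index used is idx_section. Looking at the last two columns (rows and Extra), the estimated number of rows examined in A has dropped by about half, from 46585 to 23293, and the join buffer is no longer used. The order in which the tables are read was chosen for us by the MySQL optimizer. The lesson is that driving the join from the table with fewer rows gives better efficiency.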
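
To see how much the optimizer's choice of driving table matters, one option is to force a different join order and compare the two EXPLAIN outputs. The following is only a sketch: it uses MySQL's STRAIGHT_JOIN modifier, which the original test did not use, and it assumes the same joomla_test tables with the idx_catid and idx_section indexes added above. No output is shown because it was not part of the original run.

```sql
-- STRAIGHT_JOIN makes MySQL read the tables in the order they are listed
-- in the FROM clause (here A, then B, then C) instead of letting the
-- optimizer pick the order.
EXPLAIN SELECT STRAIGHT_JOIN A.id, A.title, B.title
FROM jos_content A, jos_categories B, jos_sections C
WHERE A.catid = B.id
  AND A.sectionid = C.id;

-- The same query without the modifier, for comparison: the optimizer is
-- free to start from the small jos_sections table, as in the output above.
EXPLAIN SELECT A.id, A.title, B.title
FROM jos_content A, jos_categories B, jos_sections C
WHERE A.catid = B.id
  AND A.sectionid = C.id;
```

Comparing the table order and the rows column of the two plans shows which table each plan starts from and how many rows it expects to examine at each step.
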
Now let us write the statement above in a different way:

mysql> explain select A.id,A.title,B.title from jos_content A left join jos_categories B on A.catid=B.id left join jos_sections C on A.sectionid=C.id;
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+-------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                     | rows  | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+-------+-------------+
|  1 | SIMPLE      | A     | ALL    | NULL          | NULL    | NULL    | NULL                    | 46585 |             |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.catid     |     1 |             |
|  1 | SIMPLE      | C     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.sectionid |     1 | Using index |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+-------+-------------+
3 rows in set (0.00 sec)

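The order in which MySQL reads the tables has changed. This means that when we join with left join, MySQL reads the tables in the order in which they appear in the SQL statement. The other change is that the joins to both B and C now have type eq_ref. As mentioned earlier, this means MySQL can find exactly one matching row, which is more efficient than ref.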
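
One point worth checking before swapping the comma join for left join is that the two forms do not necessarily return the same rows: the comma (inner) join drops rows of jos_content that have no matching category, while left join keeps them with NULLs. A minimal sketch for counting such rows, assuming the same tables as above (the column alias is made up for illustration):

```sql
-- Rows of jos_content whose catid has no match in jos_categories.
-- The inner join omits these rows; the left join returns them with
-- B.title set to NULL.
SELECT COUNT(*) AS content_without_category
FROM jos_content A
LEFT JOIN jos_categories B ON A.catid = B.id
WHERE B.id IS NULL;
```

If this count is zero (and the same holds for jos_sections), the two statements return the same result and only the plan differs.
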
Now an example with sorting:

mysql> explain select A.id,A.title,B.title from jos_content A left join jos_categories B on A.catid=B.id left join jos_sections C on A.sectionid=C.id order by B.id;
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+-------+---------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                     | rows  | Extra                           |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+-------+---------------------------------+
|  1 | SIMPLE      | A     | ALL    | NULL          | NULL    | NULL    | NULL                    | 46585 | Using temporary; Using filesort |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.catid     |     1 |                                 |
|  1 | SIMPLE      | C     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.sectionid |     1 | Using index                     |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+-------+---------------------------------+
3 rows in set (0.00 sec)

mysql> explain select A.id,A.title,B.title from jos_content A left join jos_categories B on A.catid=B.id left join jos_sections C on A.sectionid=C.id order by A.id;
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+-------+----------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                     | rows  | Extra          |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+-------+----------------+
|  1 | SIMPLE      | A     | ALL    | NULL          | NULL    | NULL    | NULL                    | 46585 | Using filesort |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.catid     |     1 |                |
|  1 | SIMPLE      | C     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.sectionid |     1 | Using index    |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+-------+----------------+

The two statements differ only in the sort column, yet the first one shows Using temporary and the second does not. In day-to-day site maintenance, seeing Using temporary is a sign that some optimization is needed. So why does the first statement use a temporary table while the second does not? Because a temporary table is created if there is an ORDER BY clause and a different GROUP BY clause, or if the columns in ORDER BY or GROUP BY come from a table other than the first table in the join order. For the first statement above, where we need to sort by the id of jos_categories, the SQL can be changed as follows:

mysql> explain select B.id,B.title,A.title from jos_categories A left join jos_content B on A.id=B.catid left join jos_sections C on B.sectionid=C.id order by A.id;
+----+-------------+-------+--------+---------------+-----------+---------+-------------------------+------+----------------+
| id | select_type | table | type   | possible_keys | key       | key_len | ref                     | rows | Extra          |
+----+-------------+-------+--------+---------------+-----------+---------+-------------------------+------+----------------+
|  1 | SIMPLE      | A     | ALL    | NULL          | NULL      | NULL    | NULL                    |   18 | Using filesort |
|  1 | SIMPLE      | B     | ref    | idx_catid     | idx_catid | 4       | joomla_test.A.id        | 3328 |                |
|  1 | SIMPLE      | C     | eq_ref | PRIMARY       | PRIMARY   | 4       | joomla_test.B.sectionid |    1 | Using index    |
+----+-------------+-------+--------+---------------+-----------+---------+-------------------------+------+----------------+
3 rows in set (0.00 sec)

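Now Using temporary is gone, and the rows estimate for jos_content has dropped by an order of magnitude, from 46585 to 3328, because idx_catid on jos_content is now doing its job. (The rows value is an estimate per row of the preceding table, so it has to be read together with the 18 rows of jos_categories. Swapping the tables of a left join also changes which unmatched rows are returned.) The conclusion: sort on an indexed key of the first table whenever possible, as that is the efficient case.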
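We also notice that every statement with sorting showed Using filesort. Taken literally, this might be read as "sorting using a file" or "sorting from a file". That reading is wrong; the term is misleading. Using filesort simply means that MySQL cannot read the rows in index order and has to perform a separate sort pass, for example when we sort by a column that has no usable index. It does not necessarily involve any file: internally it is a quicksort performed in the sort buffer, and temporary files come into play only when the data does not fit in memory.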
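
Whether a particular filesort stayed in memory can be checked from the server's sort counters. The following is a rough sketch, not part of the original test; it relies on the standard sort_buffer_size variable and Sort_% status counters and reuses one of the queries from above.

```sql
-- Size of the per-session buffer that filesort works in.
SHOW VARIABLES LIKE 'sort_buffer_size';

-- Reset the session counters so the numbers below belong to one query.
FLUSH STATUS;

SELECT A.id, A.title, B.title
FROM jos_content A
LEFT JOIN jos_categories B ON A.catid = B.id
LEFT JOIN jos_sections C ON A.sectionid = C.id
ORDER BY A.id;

-- Sort_rows: rows that went through a sort.
-- Sort_merge_passes: greater than 0 means the sort did not fit in the
-- sort buffer and had to merge chunks written to temporary files.
SHOW SESSION STATUS LIKE 'Sort%';
```

A Sort_merge_passes value of 0 after the query indicates the sort was done entirely in memory.
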
However, when we go back to one of the statements we ran earlier, we find the following:

mysql> explain select A.id,A.title,B.title from jos_content A,jos_categories B,jos_sections C where A.catid=B.id and A.sectionid=C.id order by C.id;
+----+-------------+-------+--------+-----------------------+-------------+---------+---------------------+-------+-------------+
| id | select_type | table | type   | possible_keys         | key         | key_len | ref                 | rows  | Extra       |
+----+-------------+-------+--------+-----------------------+-------------+---------+---------------------+-------+-------------+
|  1 | SIMPLE      | C     | index  | PRIMARY               | PRIMARY     | 4       | NULL                |     1 | Using index |
|  1 | SIMPLE      | A     | ref    | idx_catid,idx_section | idx_section | 4       | joomla_test.C.id    | 23293 | Using where |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY               | PRIMARY     | 4       | joomla_test.A.catid |     1 | Using where |
+----+-------------+-------+--------+-----------------------+-------------+---------+---------------------+-------+-------------+
3 rows in set (0.00 sec)

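This is a statement we ran a moment ago, with only a sort added. Here the primary key of table C serves the sort, and Using filesort is gone. Yet in the earlier left join statements we also sorted by the primary key of the first table and did not get this effect: the first table's primary key was not used. Why? In fact, in every left join statement run so far, no index of the first table was used, and sorting by the first table's primary key made no difference. That is somewhat strange.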
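
One way to probe this further is to ask MySQL explicitly to use the primary key of the first table and see whether the plan changes. This experiment is not in the original article; it is a sketch using MySQL's FORCE INDEX hint on the same query. A plausible explanation, stated here only as an assumption, is that without a WHERE or LIMIT the optimizer judges a full scan followed by a sort to be cheaper than walking the whole primary key.

```sql
-- Same left join as before, but with a hint telling the optimizer to
-- use the PRIMARY index of jos_content rather than a plain table scan.
EXPLAIN SELECT A.id, A.title, B.title
FROM jos_content A FORCE INDEX (PRIMARY)
LEFT JOIN jos_categories B ON A.catid = B.id
LEFT JOIN jos_sections C ON A.sectionid = C.id
ORDER BY A.id;
```

If the type for A changes from ALL to index and Using filesort disappears, the earlier plan was a cost decision rather than a limitation of left join. Whether the hinted plan is actually faster still has to be measured.
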
So we went on to test the next SQL statement:

mysql> explain select A.id,A.title,B.title from jos_content A left join jos_categories B on A.catid=B.id left join jos_sections C on A.sectionid=C.id where A.id < 100;
+----+-------------+-------+--------+----------------+---------+---------+-------------------------+------+-------------+
| id | select_type | table | type   | possible_keys  | key     | key_len | ref                     | rows | Extra       |
+----+-------------+-------+--------+----------------+---------+---------+-------------------------+------+-------------+
|  1 | SIMPLE      | A     | range  | PRIMARY        | PRIMARY | 4       | NULL                    |   90 | Using where |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY        | PRIMARY | 4       | joomla_test.A.catid     |    1 |             |
|  1 | SIMPLE      | C     | eq_ref | PRIMARY        | PRIMARY | 4       | joomla_test.A.sectionid |    1 | Using index |
+----+-------------+-------+--------+----------------+---------+---------+-------------------------+------+-------------+
3 rows in set (0.05 sec)

Then, when the sort is added again, Using filesort no longer appears either:

mysql> explain select A.id,A.title,B.title from jos_content A left join jos_categories B on A.catid=B.id left join jos_sections C on A.sectionid=C.id where A.id < 100 order by A.id;
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                     | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+------+-------------+
|  1 | SIMPLE      | A     | range  | PRIMARY       | PRIMARY | 4       | NULL                    |  105 | Using where |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.catid     |    1 |             |
|  1 | SIMPLE      | C     | eq_ref | PRIMARY       | PRIMARY | 4       | joomla_test.A.sectionid |    1 | Using index |
+----+-------------+-------+--------+---------------+---------+---------+-------------------------+------+-------------+
3 rows in set (0.00 sec)

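This result shows that MySQL uses an index to search on the columns referenced in the where condition, and that using this index also makes the sort considerably more efficient.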
I wrote a small program to test this, running each of the following four SQL statements 200 times:
1. select A.id,A.title,B.title from jos_content A left join jos_categories B on A.catid=B.id left join jos_sections C on A.sectionid=C.id
2. select A.id,A.title,B.title from jos_content A,jos_categories B,jos_sections C where A.catid=B.id and A.sectionid=C.id
3. select A.id,A.title,B.title from jos_content A left join jos_categories B on A.catid=B.id left join jos_sections C on A.sectionid=C.id order by rand() limit 10
4. select A.id from jos_content A left join jos_categories B on B.id=A.catid left join jos_sections C on A.sectionid=C.id order by A.id

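The results: statement (1) took 20s on average, statement (2) 44s, statement (3) 70s, and statement (4) 2s. Moreover, if we examine statement (3) with explain, we find that it creates a temporary table to perform the sort.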
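
For the random-sampling case in statement (3), a common alternative is to pick a random starting point on the primary key and read forward from it, so that MySQL never has to attach a random value to every row and sort the whole set. The following is a minimal sketch, not the author's method. It assumes that jos_content.id is a numeric primary key without large gaps, and the user variables @max_id and @start are names made up for illustration.

```sql
-- Highest id currently in the table.
SELECT MAX(id) INTO @max_id FROM jos_content;

-- Random starting id between 1 and @max_id.
SET @start := FLOOR(1 + RAND() * @max_id);

-- Read 10 rows forward from the starting point using the primary key.
SELECT A.id, A.title, B.title
FROM jos_content A
LEFT JOIN jos_categories B ON A.catid = B.id
WHERE A.id >= @start
ORDER BY A.id
LIMIT 10;
```

The trade-off is that the 10 rows are consecutive rather than independently random, rows after large id gaps are picked more often, and fewer than 10 rows come back when @start lands near the end of the table. Where that matters, the starting point can be drawn several times with a smaller LIMIT.
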
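To sum up, we can draw the following conclusions:
1. Add indexes to the columns that are searched and sorted on.
2. Under certain conditions, left join is more efficient than an ordinary join query. Still, join as few tables as possible, and when writing join queries check whether the indexes are actually being used.
3. Sort on an indexed column of the first table whenever possible. This keeps MySQL from creating a temporary table, which is very expensive.
4. Add appropriate indexes to the columns referenced in the where condition. This also helps optimize the sort.
5. When rows have to be picked at random, avoid order by rand(). As the example above shows, it wastes a great deal of database resources: checking with show processlist while it runs shows statement (3) in the state Copying to tmp table on disk. The comparison between (3) and (4) shows that it is better to find another way to implement this feature and take the load off MySQL.
6. Item (4) of the test above shows that if pagination first fetches the primary keys and then looks up the related content by primary key, the query is optimized as well. According to tests by the authors of High Performance MySQL, SQL statements of the form "SELECT ... FROM ... WHERE id = ..." (where id is the PRIMARY KEY) can be processed more than 10000 times per second, while ordinary SELECT queries manage only tens to hundreds per second. As for the efficiency of paginated queries: as the amount of content available online keeps growing, search becomes ever more important, and this is where third-party search engines such as Sphinx and Lucene come into their own.
7. In day-to-day work, turn on MySQL's slow query log, check regularly which statements are slowing MySQL down, and optimize them periodically.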
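
Conclusions 6 and 7 can be sketched in SQL as follows. This is an illustration under assumptions rather than a tested recipe: the page size of 20, the offset of 1000, and the one-second threshold are arbitrary values chosen for the example, and the variable names are those used by MySQL 5.1 and later.

```sql
-- Conclusion 6: fetch only the primary keys for the requested page,
-- then join back to pick up the wider columns for just those rows.
SELECT A.id, A.title, B.title
FROM (
    SELECT id
    FROM jos_content
    ORDER BY id
    LIMIT 20 OFFSET 1000
) AS page
JOIN jos_content A ON A.id = page.id
LEFT JOIN jos_categories B ON A.catid = B.id
ORDER BY A.id;

-- Conclusion 7: enable the slow query log at runtime (requires
-- administrative privileges) and log statements slower than 1 second.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- Check the settings, the log file location, and the running count.
SHOW VARIABLES LIKE 'slow_query_log%';
SHOW GLOBAL STATUS LIKE 'Slow_queries';
```

The inner query touches only the primary key index, so the expensive row lookups are limited to the 20 ids of the page. The changed long_query_time applies to connections opened after the SET GLOBAL statement.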
