PostgreSQL Query Optimization: Using CTEs
2016-12-02 16:47
Scenario 1:
The goal is to find the most recent order for each of a large number of clients.
Reference: http://bonesmoses.org/2014/05/08/trumping-the-postgresql-query-planner/
Create the test table
CREATE TABLE test_order (
    client_id  INT NOT NULL,
    order_date TIMESTAMP NOT NULL,
    filler     TEXT NOT NULL
);

Insert test data

INSERT INTO test_order
SELECT s1.id,
       (CURRENT_DATE - INTERVAL '1000 days')::DATE + generate_series(1, s1.id%1000),
       repeat(' ', 20)
  FROM generate_series(1, 10000) s1 (id);

CREATE INDEX idx_test_order_client_id_order_date
    ON test_order (client_id, order_date DESC);

Run the plain SQL
Does not use the index

EXPLAIN ANALYZE
SELECT client_id, max(order_date)
  FROM test_order
 GROUP BY client_id;

"Execution time: 5741.682 ms"

Uses the index

EXPLAIN ANALYZE
SELECT DISTINCT ON (client_id) client_id, order_date
  FROM test_order
 ORDER BY client_id, order_date DESC;

"Execution time: 4628.510 ms"

Optimized SQL
EXPLAIN ANALYZE
WITH RECURSIVE skip AS (
  (SELECT client_id, order_date
     FROM test_order
    ORDER BY client_id, order_date DESC
    LIMIT 1)
  UNION ALL
  (SELECT (SELECT min(client_id)
             FROM test_order
            WHERE client_id > skip.client_id
          ) AS client_id,
          (SELECT max(order_date)
             FROM test_order
            WHERE client_id = (
                    SELECT min(client_id)
                      FROM test_order
                     WHERE client_id > skip.client_id
                  )
          ) AS order_date
     FROM skip
    WHERE skip.client_id IS NOT NULL)
)
SELECT * FROM skip;

"Execution time: 865.889 ms"

Query result

client_id; order_date
1;"2014-03-09 00:00:00"
2;"2014-03-10 00:00:00"
3;"2014-03-11 00:00:00"
4;"2014-03-12 00:00:00"
5;"2014-03-13 00:00:00"
6;"2014-03-14 00:00:00"
7;"2014-03-15 00:00:00"
8;"2014-03-16 00:00:00"
9;"2014-03-17 00:00:00"
10;"2014-03-18 00:00:00"
11;"2014-03-19 00:00:00"
12;"2014-03-20 00:00:00"
13;"2014-03-21 00:00:00"
14;"2014-03-22 00:00:00"
15;"2014-03-23 00:00:00"
16;"2014-03-24 00:00:00"
17;"2014-03-25 00:00:00"
18;"2014-03-26 00:00:00"
19;"2014-03-27 00:00:00"
20;"2014-03-28 00:00:00"
21;"2014-03-29 00:00:00"
22;"2014-03-30 00:00:00"
23;"2014-03-31 00:00:00"
24;"2014-04-01 00:00:00"
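The skip-scan idea in Scenario 1 can be tried without a PostgreSQL server. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in engine and a tiny made-up data set (5 clients, dates stored as ISO text); the index name idx_client_date is hypothetical. SQLite does not accept a parenthesized ORDER BY ... LIMIT arm in a compound select, so the anchor is wrapped in a subquery. It only checks that the recursive CTE returns the same rows as GROUP BY; it says nothing about PostgreSQL timings.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the data set is a toy one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_order (client_id INTEGER NOT NULL, order_date TEXT NOT NULL)"
)
# Client c gets c orders on consecutive days, so its latest order is day c.
orders = [(c, f"2014-03-{d:02d}") for c in range(1, 6) for d in range(1, c + 1)]
conn.executemany("INSERT INTO test_order VALUES (?, ?)", orders)
conn.execute(
    "CREATE INDEX idx_client_date ON test_order (client_id, order_date DESC)"
)

# Baseline: one aggregate pass over the whole table.
baseline = conn.execute(
    "SELECT client_id, max(order_date) FROM test_order "
    "GROUP BY client_id ORDER BY client_id"
).fetchall()

# Skip scan: each recursive step jumps to the next distinct client_id
# and looks up only that client's latest order.
skip_scan = conn.execute(
    """
    WITH RECURSIVE skip(client_id, order_date) AS (
        SELECT client_id, order_date
          FROM (SELECT client_id, order_date
                  FROM test_order
                 ORDER BY client_id, order_date DESC
                 LIMIT 1)
        UNION ALL
        SELECT (SELECT min(client_id)
                  FROM test_order
                 WHERE client_id > skip.client_id),
               (SELECT max(order_date)
                  FROM test_order
                 WHERE client_id = (SELECT min(client_id)
                                      FROM test_order
                                     WHERE client_id > skip.client_id))
          FROM skip
         WHERE skip.client_id IS NOT NULL
    )
    SELECT client_id, order_date
      FROM skip
     WHERE client_id IS NOT NULL
     ORDER BY client_id
    """
).fetchall()

print(skip_scan == baseline)
```

The final WHERE client_id IS NOT NULL is an addition in this sketch: the last recursive step finds no larger client_id and emits one all-NULL row, which a plain SELECT * FROM skip would include.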
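Scenario 2:
Reference: https://yq.aliyun.com/articles/65202?spm=5176.8091938.0.0.tZZBTS
Consider a small table A that stores a few hundred IDs (for example, the IDs of patrol cars, sanitation trucks, buses, or micro-buses).
A second table B is a log. Every ID in a log record comes from the small table, but not every ID appears in the log: on a given day, perhaps only a few dozen IDs show up in that day's data. (Think of vehicle trajectory data reported once per second, which makes the table very large.)
How can we quickly find the IDs that did not appear today? (Which patrol cars never showed up in their area, and were they slacking off? Which sanitation trucks, buses, or micro-buses never went out?)

select id from A where id not in (select id from B where time between ? and ?);

This query is very slow. How can it be optimized? You could of course require vehicles to check in, but some will always fail to check in, and some systems are not designed that way. What can be done then?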
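-- A
create table a(id int primary key, info text);
-- B
create table b(id int primary key, aid int, crt_time timestamp);
create index b_aid on b(aid);

-- insert 1,000 rows into a
insert into a select generate_series(1,1000), md5(random()::text);

-- insert 5 million rows into b, covering only 500 distinct aid values
-- (note: pairing two generate_series of different lengths in one select list
--  repeats the shorter one only on PostgreSQL versions before 10;
--  from version 10 on, the shorter series is padded with NULLs instead)
insert into b select generate_series(1,5000000), generate_series(1,500), clock_timestamp();

Before optimization:

select * from a where id not in (select aid from b);

Execution time: more than 1 min

After optimization:

select * from a where id not in (
  with recursive skip as (
    (
      select min(aid) aid from b where aid is not null
    )
    union all
    (
      select (select min(aid) aid from b where b.aid > s.aid and b.aid is not null)
        from skip s
       where s.aid is not null
    )  -- the "where s.aid is not null" here is required; without it the recursion never terminates
  )
  select aid from skip where aid is not null
);

Execution time: 46 msec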
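The anti-join from Scenario 2 can be sketched the same way. This is a minimal, hedged example that assumes Python's built-in sqlite3 module in place of PostgreSQL and a made-up data set (10 ids in a, 1,000 log rows in b that mention only ids 1 to 5). The WITH clause is placed at the top of the statement rather than inside the NOT IN subquery, a form that should be equivalent. The sketch only checks that both queries return the same ids, not how fast they run.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; table contents are toy data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, info TEXT)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, aid INTEGER, crt_time TEXT)")
conn.execute("CREATE INDEX b_aid ON b (aid)")

conn.executemany(
    "INSERT INTO a VALUES (?, ?)", [(i, f"vehicle-{i}") for i in range(1, 11)]
)
# 1,000 log rows, but only ids 1..5 ever appear in them.
conn.executemany(
    "INSERT INTO b (aid, crt_time) VALUES (?, ?)",
    [(n % 5 + 1, "2016-12-02") for n in range(1000)],
)

# Plain form: the subquery scans every log row.
naive = conn.execute(
    "SELECT id FROM a WHERE id NOT IN (SELECT aid FROM b) ORDER BY id"
).fetchall()

# Skip-scan form: the recursive CTE visits each distinct aid once,
# so the NOT IN list holds 5 values instead of 1,000.
missing = conn.execute(
    """
    WITH RECURSIVE skip(aid) AS (
        SELECT min(aid) FROM b WHERE aid IS NOT NULL
        UNION ALL
        SELECT (SELECT min(aid)
                  FROM b
                 WHERE b.aid > skip.aid AND b.aid IS NOT NULL)
          FROM skip
         WHERE skip.aid IS NOT NULL  -- stops the recursion after the last aid
    )
    SELECT id
      FROM a
     WHERE id NOT IN (SELECT aid FROM skip WHERE aid IS NOT NULL)
     ORDER BY id
    """
).fetchall()

print(missing)
```

Filtering NULL out of the list matters for correctness as well as for termination: if the list given to NOT IN contains a NULL, the comparison is never true and the query returns no rows.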
Scenario 3:
Generate a tree structure
Reference: http://blog.databasepatterns.com/2014/02/trees-paths-recursive-cte-postgresql.html
create table subregions (
  id smallint primary key,
  name text not null,
  parent_id smallint null references subregions(id)
);

insert into subregions values
(1,'World',null),
(2,'Africa',1),
(5,'South America',419),
(9,'Oceania',1),
(11,'Western Africa',2),
(13,'Central America',419),
(14,'Eastern Africa',2),
(15,'Northern Africa',2),
(17,'Middle Africa',2),
(18,'Southern Africa',2),
(19,'Americas',1),
(21,'Northern America',19),
(29,'Caribbean',419),
(30,'Eastern Asia',142),
(34,'Southern Asia',142),
(35,'South-Eastern Asia',142),
(39,'Southern Europe',150),
(53,'Australia and New Zealand',9),
(54,'Melanesia',9),
(57,'Micronesia',9),
(61,'Polynesia',9),
(142,'Asia',1),
(143,'Central Asia',142),
(145,'Western Asia',142),
(150,'Europe',1),
(151,'Eastern Europe',150),
(154,'Northern Europe',150),
(155,'Western Europe',150),
(419,'Latin America and the Caribbean',19);

And you wanted to make a pretty tree like this:

World
 Africa
  Eastern Africa
  Middle Africa
  Northern Africa
  Southern Africa
  Western Africa
 Americas
  Latin America and the Caribbean
   Caribbean
   Central America
   South America
  Northern America
 Asia
  Central Asia
  Eastern Asia
  South-Eastern Asia
  Southern Asia
  Western Asia
 Europe
  Eastern Europe
  Northern Europe
  Southern Europe
  Western Europe
 Oceania
  Australia and New Zealand
  Melanesia
  Micronesia
  Polynesia

Here's how you'd do it:

with recursive my_expression as (
  --start with the "anchor", i.e. all of the nodes whose parent_id is null:
  select id, name as path, name as tree, 0 as level
  from subregions
  where parent_id is null

  union all

  --then the recursive part:
  select
    current.id as id,
    previous.path || ' > ' || current.name as path,
    repeat(' ', previous.level + 1) || current.name as tree,
    previous.level + 1 as level
  from subregions current
  join my_expression as previous on current.parent_id = previous.id
)
select tree from my_expression order by path

To show each node together with its parent nodes, joined by a separator, select the path column instead:

select path from my_expression order by path

Output:

World
World > Africa
World > Africa > Eastern Africa
World > Africa > Middle Africa
World > Africa > Northern Africa
World > Africa > Southern Africa
World > Africa > Western Africa
World > Americas
World > Americas > Latin America and the Caribbean
World > Americas > Latin America and the Caribbean > Caribbean
World > Americas > Latin America and the Caribbean > Central America
World > Americas > Latin America and the Caribbean > South America
World > Americas > Northern America
World > Asia
World > Asia > Central Asia
World > Asia > Eastern Asia
World > Asia > South-Eastern Asia
World > Asia > Southern Asia
World > Asia > Western Asia
World > Europe
World > Europe > Eastern Europe
World > Europe > Northern Europe
World > Europe > Southern Europe
World > Europe > Western Europe
World > Oceania
World > Oceania > Australia and New Zealand
World > Oceania > Melanesia
World > Oceania > Micronesia
World > Oceania > Polynesia
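The path-building recursion in Scenario 3 can also be checked on a handful of rows. This is a minimal sketch, assuming Python's built-in sqlite3 module instead of PostgreSQL and a five-row subset of the subregions data. Because SQLite has no repeat() function, the indentation is built in Python from the level column; the names tree_cte, child, and parent are made up for this illustration.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; only five regions are loaded.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subregions ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " parent_id INTEGER REFERENCES subregions(id))"
)
conn.executemany(
    "INSERT INTO subregions VALUES (?, ?, ?)",
    [
        (1, "World", None),
        (2, "Africa", 1),
        (9, "Oceania", 1),
        (15, "Northern Africa", 2),
        (54, "Melanesia", 9),
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE tree_cte(id, path, level) AS (
        -- anchor: the root nodes, which have no parent
        SELECT id, name, 0 FROM subregions WHERE parent_id IS NULL
        UNION ALL
        -- recursive part: attach each child to its parent's path
        SELECT child.id,
               parent.path || ' > ' || child.name,
               parent.level + 1
          FROM subregions AS child
          JOIN tree_cte AS parent ON child.parent_id = parent.id
    )
    SELECT path, level FROM tree_cte ORDER BY path
    """
).fetchall()

# Indent the last path component by one space per level.
tree_lines = [" " * level + path.rsplit(" > ", 1)[-1] for path, level in rows]
print("\n".join(tree_lines))
```

Sorting by path is what keeps each child directly under its parent, since a parent's path is always a prefix of its children's paths.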