Optimizing PostgreSQL for Flash-Sale (Seckill) Scenarios
2015-10-23 10:56
The typical bottleneck in a flash-sale scenario is a flood of update requests against the same record: only one, or a small number, of those requests succeeds, while all the others fail or find nothing left to update.
For example, consider a 1-yuan flash sale of an iPhone. If only one iPhone is released, we can model it as a single record; once the sale starts, whoever grabs it first (i.e., acquires the lock on that record and updates it) wins.
For example, use a flag to indicate whether the record has already been updated, or a counter to track how many times it has been updated (i.e., how many iPhones have been sold):
update tbl set xxx=xxx,upd_cnt=upd_cnt+1 where id=pk and upd_cnt+1<=5; -- assume 5 units are available
Drawbacks of this approach:
The user who acquires the lock may succeed or fail while processing the record, or may take a long time (for example, if the database is responding slowly); until that transaction ends, every other session can only wait.
Waiting like this is pure waste: for the users who did not get the lock, it buys them nothing.
The usual optimization is therefore to use for update nowait, so that a session that cannot acquire the lock immediately does not wait at all.
For example:

begin;
select 1 from tbl where id=pk for update nowait;
-- If the lock cannot be acquired immediately, an error is raised and the transaction rolls back.
update tbl set xxx=xxx,upd_cnt=upd_cnt+1 where id=pk and upd_cnt+1<=5;
end;
This reduces how long users wait, because a session that cannot get the lock right away simply returns.
But this approach still has a drawback: if multiple units of one product can be sold, storing them all in a single record limits the concurrency of the sale, because all requests contend on one row lock.
There are many ways to solve this; ultimately they all come down to increasing concurrency, for example:
1. Segmented flash sales: split the product quantity across multiple records (segments), so requests contend on several row locks instead of one.
Overall, the idea is to reduce lock-wait time and replace serial execution with parallel execution wherever possible.
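The segmentation idea above can be sketched roughly as follows; the table and column names here are hypothetical, not from the original article:

```sql
-- Hypothetical segmented stock table: 5 units split into 5 single-unit segments.
create table goods_seg (
  seg_id  int primary key,  -- segment id, 0..4
  upd_cnt int not null      -- units already sold from this segment
);
insert into goods_seg select i, 0 from generate_series(0,4) i;

-- Each request targets a random segment, so contention is spread
-- across 5 row locks instead of 1.
update goods_seg
   set upd_cnt = upd_cnt + 1
 where seg_id = floor(random()*5)::int
   and upd_cnt + 1 <= 1;   -- at most 1 unit per segment
```

A request that lands on a sold-out segment could retry on another segment; that retry logic is left out of this sketch.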
Are we done optimizing? Clearly not. Everything so far can be done in any database; if we stopped here, nothing would showcase what is special about PostgreSQL.
PostgreSQL also provides another lock type: the advisory lock. Advisory locks are lighter-weight than row locks and come in session-level and transaction-level variants. (Note that advisory lock IDs are global within a database, so they can interfere with each other: every ID used for flash sales, or for any other advisory-lock purpose, must be unique across that database.)
Example:
update tbl set xxx=xxx,upd_cnt=upd_cnt+1 where id=pk and upd_cnt+1<=5 and pg_try_advisory_xact_lock(:id);
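If the global-uniqueness caveat above is a concern, PostgreSQL also offers a two-argument form, pg_try_advisory_xact_lock(key1 int, key2 int), where the first key can serve as a namespace. A sketch (the namespace value 1 is an arbitrary choice, not from the original):

```sql
-- key1 = 1 acts as a hypothetical "flash sale" namespace and key2 is
-- the record id; other advisory-lock users in the same database can
-- pick a different key1 and will not collide with these locks.
update tbl
   set xxx=xxx, upd_cnt=upd_cnt+1
 where id=pk
   and upd_cnt+1<=5
   and pg_try_advisory_xact_lock(1, :id);
```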
Finally, we should compare the performance of for update nowait against advisory locks.
The following tests were run on a local virtual machine.
Create a flash-sale test table:
postgres=# \d t1
      Table "public.t1"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 info   | text    |
Indexes:
    "t1_pkey" PRIMARY KEY, btree (id)
It holds a single record, which is updated over and over:
postgres=# select * from t1;
 id |             info
----+-------------------------------
  1 | 2015-09-14 09:47:04.703904+08
(1 row)
Benchmark for the for update nowait approach:
CREATE OR REPLACE FUNCTION public.f1(i_id integer)
 RETURNS void
 LANGUAGE plpgsql
AS $function$
declare
begin
  perform 1 from t1 where id=i_id for update nowait;
  update t1 set info=now()::text where id=i_id;
exception when others then
  return;
end;
$function$;
postgres@digoal-> cat test1.sql
\setrandom id 1 1
select f1(:id);
Benchmark for the advisory lock approach:
postgres@digoal-> cat test.sql
\setrandom id 1 1
update t1 set info=now()::text where id=:id and pg_try_advisory_xact_lock(:id);
Reset the statistics before the run:
postgres=# select pg_stat_reset();
 pg_stat_reset
---------------

(1 row)

postgres=# select * from pg_stat_all_tables where relname='t1';
-[ RECORD 1 ]-------+-------
relid               | 184731
schemaname          | public
relname             | t1
seq_scan            | 0
seq_tup_read        | 0
idx_scan            | 0
idx_tup_fetch       | 0
n_tup_ins           | 0
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 0
n_dead_tup          | 0
n_mod_since_analyze | 0
last_vacuum         |
last_autovacuum     |
last_analyze        |
last_autoanalyze    |
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0
Benchmark results:
postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test1.sql -c 20 -j 20 -T 60
......
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 20
number of threads: 20
duration: 60 s
number of transactions actually processed: 792029
latency average: 1.505 ms
latency stddev: 4.275 ms
tps = 13196.542846 (including connections establishing)
tps = 13257.270709 (excluding connections establishing)
statement latencies in milliseconds:
        0.002625        \setrandom id 1 1
        1.502420        select f1(:id);
postgres=# select * from pg_stat_all_tables where relname='t1';
-[ RECORD 1 ]-------+-------
relid               | 184731
schemaname          | public
relname             | t1
seq_scan            | 0
seq_tup_read        | 0
idx_scan            | 896963   -- mostly wasted work
idx_tup_fetch       | 896963   -- mostly wasted work
n_tup_ins           | 0
n_tup_upd           | 41775
n_tup_del           | 0
n_tup_hot_upd       | 41400
n_live_tup          | 0
n_dead_tup          | 928
n_mod_since_analyze | 41774
last_vacuum         |
last_autovacuum     |
last_analyze        |
last_autoanalyze    |
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0
postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 20 -j 20 -T 60
......
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 20
number of threads: 20
duration: 60 s
number of transactions actually processed: 1392372
latency average: 0.851 ms
latency stddev: 2.475 ms
tps = 23194.831054 (including connections establishing)
tps = 23400.411501 (excluding connections establishing)
statement latencies in milliseconds:
        0.002594        \setrandom id 1 1
        0.848536        update t1 set info=now()::text where id=:id and pg_try_advisory_xact_lock(:id);
postgres=# select * from pg_stat_all_tables where relname='t1';
-[ RECORD 1 ]-------+--------
relid               | 184731
schemaname          | public
relname             | t1
seq_scan            | 0
seq_tup_read        | 0
idx_scan            | 1368933  -- mostly wasted work
idx_tup_fetch       | 1368933  -- mostly wasted work
n_tup_ins           | 0
n_tup_upd           | 54957
n_tup_del           | 0
n_tup_hot_upd       | 54489
n_live_tup          | 0
n_dead_tup          | 1048
n_mod_since_analyze | 54957
last_vacuum         |
last_autovacuum     |
last_analyze        |
last_autoanalyze    |
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0
Note that with either method, a large share of the index scans are wasted work.
To eliminate those wasted scans, the following function can be used. (An even better approach would make this transparent to the user.)
CREATE OR REPLACE FUNCTION public.f(i_id integer)
 RETURNS void
 LANGUAGE plpgsql
AS $function$
declare
  a_lock boolean := false;
begin
  select pg_try_advisory_xact_lock(i_id) into a_lock;
  if a_lock then
    update t1 set info=now()::text where id=i_id;
  end if;
exception when others then
  return;
end;
$function$;
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 20
number of threads: 20
duration: 60 s
number of transactions actually processed: 1217195
latency average: 0.973 ms
latency stddev: 3.563 ms
tps = 20283.314001 (including connections establishing)
tps = 20490.143363 (excluding connections establishing)
statement latencies in milliseconds:
        0.002703        \setrandom id 1 1
        0.970209        select f(:id);
postgres=# select * from pg_stat_all_tables where relname='t1';
-[ RECORD 1 ]-------+-------
relid               | 184731
schemaname          | public
relname             | t1
seq_scan            | 0
seq_tup_read        | 0
idx_scan            | 75927
idx_tup_fetch       | 75927
n_tup_ins           | 0
n_tup_upd           | 75927
n_tup_del           | 0
n_tup_hot_upd       | 75902
n_live_tup          | 0
n_dead_tup          | 962
n_mod_since_analyze | 75927
last_vacuum         |
last_autovacuum     |
last_analyze        |
last_autoanalyze    |
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 0
Besides the higher throughput, the number of real updates (n_tup_upd) also increased, so this not only reduced wait latency but actually raised processing capacity.
Finally, for reference, here are numbers from a physical server, with 128 concurrent connections all updating the same single record.
Throughput with no optimization at all:
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 128
number of threads: 128
duration: 100 s
number of transactions actually processed: 285673
latency average: 44.806 ms
latency stddev: 45.751 ms
tps = 2855.547375 (including connections establishing)
tps = 2855.856976 (excluding connections establishing)
statement latencies in milliseconds:
        0.002509        \setrandom id 1 1
        44.803299       update t1 set info=now()::text where id=:id;
Throughput with for update nowait:
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 128
number of threads: 128
duration: 100 s
number of transactions actually processed: 6663253
latency average: 1.919 ms
latency stddev: 2.804 ms
tps = 66623.169445 (including connections establishing)
tps = 66630.307999 (excluding connections establishing)
statement latencies in milliseconds:
        0.001934        \setrandom id 1 1
        1.917297        select f1(:id);
Throughput with advisory locks:
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 128
number of threads: 128
duration: 100 s
number of transactions actually processed: 19154754
latency average: 0.667 ms
latency stddev: 1.054 ms
tps = 191520.550924 (including connections establishing)
tps = 191546.208051 (excluding connections establishing)
statement latencies in milliseconds:
        0.002085        \setrandom id 1 1
        0.664420        select f(:id);
With advisory locks, throughput is roughly 67 times that of the unoptimized version (191520 tps vs 2855 tps) and roughly 2.9 times that of for update nowait (191520 tps vs 66623 tps).
This optimization tells users almost immediately whether they can grab the item, instead of making them wait for other users' updates to finish. It greatly reduces response time and raises throughput.
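The tests above exercise only the transaction-level advisory lock functions; the session-level variants mentioned earlier are also available. A minimal sketch, not part of the original benchmark:

```sql
-- Session-level advisory lock: held until explicitly released or the
-- session ends, independent of transaction boundaries.
select pg_try_advisory_lock(1);  -- non-blocking acquire; returns true/false
-- ... work spanning one or more transactions ...
select pg_advisory_unlock(1);    -- release; returns true if the lock was held
```

Session-level locks survive commits and rollbacks, so they fit workflows that span transactions, at the cost of having to release them explicitly.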