您的位置:首页 > 数据库 > Oracle

ORACLE 11g 通过ASH结合AWR实战解决cpu高负载的详细过程

2016-09-07 22:11 549 查看
  ASH结合AWR实战解决oracle高负载             线上oracle数据库负载比较高,所以需要优化,一般进去看负载高的进程,如果是oracle进程,那么就是oracle运行不健康出异常了,我们通常可以用ash来分析问题,如果ash还无法解决,我们就需要扩大分析范围,需要采用awr分析日志来进行分析。
 1、ASH分析日志分析EventsTop SQL with Top Events
SQL IDPlanhashSampled # of Executions% ActivityEvent% EventTop Row Source% RwSrcSQL Text
2xg4n5sj1sx5h2047223008506611.18latch: cache buffers chains5.88TABLE ACCESS - FULL2.34select * from (select a.bis_pr...
    CPU + Wait for CPU5.24TABLE ACCESS - FULL2.35
19gkjf4r3zkpw293954107310132.23latch: cache buffers chains1.27TABLE ACCESS - FULL1.00select round(sum(t.current_fac...
6qhrc4ax6zr1a29395410739602.12latch: cache buffers chains1.21TABLE ACCESS - FULL0.96select round(sum(t.current_fac...
6nmdjjxfv7aqp29395410738471.87latch: cache buffers chains1.03TABLE ACCESS - FULL0.78select round(sum(t.current_fac...
1qq2h1qkxkaha29395410737761.71latch: cache buffers chains1.02TABLE ACCESS - FULL0.83select round(sum(t.current_fac...
 Top SQL with Top Row Sources
SQL IDPlanHashSampled # of Executions% ActivityRow Source% RwSrcTop Event% EventSQL Text
2xg4n5sj1sx5h2047223008506611.18TABLE ACCESS - FULL4.70CPU + Wait for CPU2.35select * from (select a.bis_pr...
    TABLE ACCESS - FULL3.90latch: cache buffers chains2.30
    TABLE ACCESS - FULL1.13latch: cache buffers chains0.66
19gkjf4r3zkpw293954107310132.23TABLE ACCESS - FULL1.73latch: cache buffers chains1.00select round(sum(t.current_fac...
6qhrc4ax6zr1a29395410739602.12TABLE ACCESS - FULL1.59latch: cache buffers chains0.96select round(sum(t.current_fac...
6nmdjjxfv7aqp29395410738471.87TABLE ACCESS - FULL1.39latch: cache buffers chains0.78select round(sum(t.current_fac...
1qq2h1qkxkaha29395410737761.71TABLE ACCESS - FULL1.34latch: cache buffers chains0.83select round(sum(t.current_fac...
 从这里分析,可以看到2xg4n5sj1sx5h的SQL引发了性能瓶颈,因为短短的20分钟内执行了5066次,而且诱发了latch: cache buffers chains以及TABLE ACCESS – FULL全表扫描,所以针对这个sql,分析后,建立优化索引:createindex IDX_PROJECT_ID_BRRQon BIS_REPORT_RENT_QUICK(BIS_PROJECT_ID, SEQUNCE_NO, YEAR, MONTH); 之后继续观察cpu负载,已经降低了一半,但是还不够,还没有达到正常值,而看ash报告已经不能提供帮助了。接下来我们需要更加全面的awr报告。  2、Awr看以下个主要方向(1)SQL ordered by Sharable Memory
(2)SQL ordered by CPU Tim
 
 
 3、先看SQL ordered by Sharable Memory
Only Statements with Sharable Memory greater than 1048576 are displayed
Sharable Mem (b)Executions% TotalSQL IdSQL ModuleSQL Text
5,890,119190.048jfndnfr9hct8 select * from ( select * from ...
4,980,36730.043gnf4zamgyb7x select * from ( select * from ...
3,816,03680.03277rt4gknr76h update POWERDESK.cont_ledger s...
2,966,255210.02f2b3a59mdr7wb select * from ( select * from ...
2,694,091120.02a8y6k4gt09xxn select * from ( select * from ...
2,316,487210.02amyymj7a06gum select * from ( select * from ...
2,003,767180.017z4hxbs26uz8d select * from ( select * from ...
2,003,74360.0196264xfwmxcm7 select * from ( select * from ...
1,979,443120.01a8y6k4gt09xxn select * from ( select * from ...
1,626,167180.017z4hxbs26uz8d select * from ( select * from ...
1,354,01190.01g371ncnjwg9zx select * from ( select * from ...
1,325,8571,8270.01fm2hxa2nb9gzx select count(*) from ( select ...
1,303,1251920.01dq8xug6mzj7dj select count(*) from (select r...
1,277,136110.011jubuhx6cdxw8 select sum(rent_square) from( ...
1,147,40040.016gj7zqb64m29q select count(*) from ( select ...
1,135,23240.016gj7zqb64m29q select count(*) from ( select ...
1,126,67210.01242g0k9qdh0q0 select sum(rent_square) from( ...
1,064,6073,7230.017sum7dh4hcmbh select * from ( select jbpmtas...
 
 
看到有fm2hxa2nb9gzx这条sql记录,执行次数比较多,有优化的空间:
SELECT   COUNT(*) FROM   (SELECT     res.*,     1 rec_status   FROM     Res_Approve_Info res   WHEREEXISTS     (SELECT       1     FROM       Res_Approve_User u     WHERE u.res_Approve_Info_Id = res.res_Approve_Info_Id       AND u.user_Cd = :1)            UNION     SELECT       res.*,       9 rec_status     FROM       Res_Approve_Info res     WHEREEXISTS       (SELECT         1       FROM         Res_Approve_User u,         Res_Accredit_Info a       WHERE u.res_Approve_Info_Id = res.res_Approve_Info_Id         AND U.USER_CD = A.USER_CD         AND A.ACC_USER_CD = :2)) res   WHERE1 = 1     AND res.status_cd = '1'
 
 
解决办法,添加索引:
createindex IXU_RES_USER_CD_2 on RES_APPROVE_USER(USER_CD);
 
 
 4、再看SQL ordered by CPU Time
Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
%Total - CPU Time as a percentage of Total DB CPU
%CPU - CPU Time as a percentage of Elapsed Time
%IO - User I/O Time as a percentage of Elapsed Time
Captured SQL account for 23.1% of Total CPU Time (s): 11,858
Captured PL/SQL account for 0.0% of Total CPU Time (s): 11,858
CPU Time (s)ExecutionsCPU per Exec (s)%TotalElapsed Time (s)%CPU%IOSQL IdSQL ModuleSQL Text
146.6711,0550.011.24192.1676.330.004198t22zum3dr select t1.module_cd, count(t1....
140.39652.161.18300.4546.730.00a8s9kjk7mk6yc select nvl(sum(nvl(sales_money...
133.99632.131.13365.3936.670.0015uv0hbq9dmm7 select nvl(sum(nvl(sales_money...
130.97652.011.10286.2345.750.00bhnjxfa4av3kw select nvl(sum(nvl(sales_money...
127.934,2150.031.08157.8881.030.00aajc7r9y7z1qs select * from ( select this_.d...
124.72631.981.05317.8139.240.001na6hypkn1q2f select nvl(sum(nvl(sales_money...
112.563470.320.95444.8725.300.00cpw6nx6gdv937 select sum(rent_square) from( ...
110.8119,0130.010.93156.7170.710.00ckkvz3r38xa4k select * from ( select kmpush0...
103.0019,0130.010.87136.8275.280.008n23w37kwdn3p select count(*) as col_0_0_ fr...
97.717,4000.010.82130.3274.980.00f1z0k5gx90tqz select (case when t1.module_cd...
 
 
 
依据以上awr统计分析cpu time(s)得出的结论,点击进去,找出sql记录,并给出优化方案如下:
 
属于sql写法不规范的优化:
 
4198t22zum3drselect t1.module_cd, count(t1.jbpm_task_id) num from jbpm_task t1, jbpm_task_candidate t2 where t1.jbpm_task_id=t2.jbpm_task_id and (t2.user_cd=:1 or t2.user_cd like :2 ) group by t1.module_cd order by decode(t1.module_cd, 'mesMeetingInfo', '0', 'resApprove', '1', 'planTarget', '2') asc优化建议: 将order by后面的decode(t1.module_cd, 'mesMeetingInfo', '0', 'resApprove', '1', 'planTarget', '2') asc移到from前面,在外层做order by操作
 
bhnjxfa4av3kwselect nvl(sum(nvl(sales_money, 0)), 0) from bis_sales_day where to_char(sales_date, 'yyyy-MM')='2016-06' and bis_cont_id in (select bis_cont_id from bis_cont bc where cont_type_cd in ('1', '2') and ((bc.status_cd = '2' and '2016-06' >= to_char(bc.cont_start_date, 'yyyy-MM') and to_char(bc.cont_to_fail_date, 'yyyy-MM')>='2016-06') or (bc.status_cd in ('1', '3') and '2016-06' >= to_char(bc.cont_start_date, 'yyyy-MM') and to_char(bc.cont_end_date, 'yyyy-MM')>='2016-06')) and bc.bis_project_id in (select bis_project_id from bis_project where is_business_project ='1' ) and bc.store_type='1' )优化建议: 将where后面 in的子查询判断字句变成 exists判断。
15uv0hbq9dmm7select nvl(sum(nvl(sales_money, 0)), 0) from bis_sales_day where to_char(sales_date, 'yyyy-MM')='2016-07' and bis_cont_id in (select bis_cont_id from bis_cont bc where cont_type_cd in ('1', '2') and ((bc.status_cd = '2' and '2016-07' >= to_char(bc.cont_start_date, 'yyyy-MM') and to_char(bc.cont_to_fail_date, 'yyyy-MM')>='2016-07') or (bc.status_cd in ('1', '3') and '2016-07' >= to_char(bc.cont_start_date, 'yyyy-MM') and to_char(bc.cont_end_date, 'yyyy-MM')>='2016-07')) and bc.bis_project_id in (select bis_project_id from bis_project where is_business_project ='1' ) and bc.store_type in ('1', '2') )优化建议: 将where后面 in的子查询判断字句变成 exists判断。
bhnjxfa4av3kwselect nvl(sum(nvl(sales_money, 0)), 0) from bis_sales_day where to_char(sales_date, 'yyyy-MM')='2016-06' and bis_cont_id in (select bis_cont_id from bis_cont bc where cont_type_cd in ('1', '2') and ((bc.status_cd = '2' and '2016-06' >= to_char(bc.cont_start_date, 'yyyy-MM') and to_char(bc.cont_to_fail_date, 'yyyy-MM')>='2016-06') or (bc.status_cd in ('1', '3') and '2016-06' >= to_char(bc.cont_start_date, 'yyyy-MM') and to_char(bc.cont_end_date, 'yyyy-MM')>='2016-06')) and bc.bis_project_id in (select bis_project_id from bis_project where is_business_project ='1' ) and bc.store_type='1' )优化建议: 将where后面 in的子查询判断字句变成 exists判断。
 
 
 
 
属于sql中添加索引的优化:
aajc7r9y7z1qsselect * from ( select this_.dly_note_id as dly1_396_0_, this_.created_center_cd as created2_396_0_, this_.created_date as created3_396_0_, this_.created_dept_cd as created4_396_0_, this_.created_position_cd as created5_396_0_, this_.creator as creator396_0_, this_.note_biz_type_cd as note7_396_0_, this_.note_content as note8_396_0_, this_.note_title as note9_396_0_, this_.record_version as record10_396_0_, this_.remark as remark396_0_, this_.send_mail_flg as send12_396_0_, this_.seq as seq396_0_, this_.tip_end_date as tip14_396_0_, this_.tip_flg as tip15_396_0_, this_.tip_period_cd as tip16_396_0_, this_.tip_start_date as tip17_396_0_, this_.tip_type_cd as tip18_396_0_, this_.updated_center_cd as updated19_396_0_, this_.updated_date as updated20_396_0_, this_.updated_dept_cd as updated21_396_0_, this_.updated_position_cd as updated22_396_0_, this_.updator as updator396_0_ from POWERDESK.dly_note this_ where this_.creator=:1 order by this_.seq asc ) where rownum <= :2优化方案,添加索引:createindex IDX_CREATOR on DLY_NOTE (creator);
 
 
 
 
在添加好IDX_CREATOR、IXU_RES_USER_CD_2 这2个索引后,cpu负载已经再次降低了一半,虽然没有到正常值,但是已经接近最低级别的告警线,暂时数据库脱离危险了,就等其它的几个需要修改sql的优化继续完成后,cpu负载基本就可以恢复到正常值了。
 
所以,一般oracle服务器负载飙升,90%都是由不规范的sql、性能差的sql造成的,而ash和awr分析报告是我们解决的最有效的途径。
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: