您的位置：首页 > 数据库 > Oracle

ETL系列：一种针对oracle分区表的统计信息收集策略

2016-03-22 14:40 671 查看

*************************************************

** 转发请注明原文，尊重原创

** 原文来自： blog.csdn.net/clark_xu 徐长亮的专栏
*************************************************
oracle分区表有一维分区表和组合分区表；

组合分区呢在10g 中有两种：range-hash，range-list

11g 在组合分区功能这块有所增强，又推出了range-range,list-range,list-list,list-hash

统计信息对于oracle 11g的执行计划非常重要；

当单表数据量变化超过10%的时候，表的统计信息就会失效；

对于分区表来说：每个分区，子分区都有自己的统计信息；当单个分区（或者子分区）的数据量变化超过10%的时候，分区的统计信息也会失效；就会影响表的执行计划；查看分区表统计信息失效的语句为：

------------------------------------------------------------------------------
--子分区统计信息失效
select distinct owner,table_name,partition_name,subpartition_name
from dba_tab_statistics
where (last_analyzed is null or stale_stats = 'YES')
and owner = 'DW'
and partition_name is not null
and subpartition_name is not null;
------------------------------------------------------------------------------
--表的统计信息失效
select distinct owner,table_name,partition_name
from dba_tab_statistics
where (last_analyzed is null or stale_stats = 'YES')
and owner = 'DW'
and partition_name is null
and subpartition_name is null;
------------------------------------------------------------------------------
--分区统计信息失效
select distinct owner,table_name,partition_name
from dba_tab_statistics
where (last_analyzed is null or stale_stats = 'YES')
and owner = 'DW'
and partition_name is not null
AND subpartition_name is  null;

针对分区表，希望有这样一种收集统计信息的策略：只针对统计失效的某几个分区（或者子分区）进行统计信息收集；

替代一般的策略：只要有一个分区统计信息失效，就针对所有的分区表，子分区做统计信息收集；

后一种策略，简单，但是浪费了IO资源；

针对第一种策略，我写了如下的存储过程；

create or replace procedure P_gather_table_stats(avc_schema in varchar2 --输入需要收集统计信息的用户
) is
--01. stale_table_global处理不是分区表的统计信息失效
cursor stale_table_global is
select /*+ unnest */
distinct owner, table_name
from dba_tab_statistics
where (last_analyzed is null or stale_stats = 'YES')
and owner = avc_schema
--排除partition_name,subpartition_name
and partition_name is null
and subpartition_name is null;

--02. stale_table_part处理一维分区表的统计信息失效
cursor stale_table_part is
select /*+ unnest */
distinct owner, table_name, partition_name
from dba_tab_statistics
where (last_analyzed is null or stale_stats = 'YES')
and owner = avc_schema
--确定partition_name，排除subpartition_name
and partition_name is not null
and subpartition_name is null;

--03.stale_table_subpart处理二维分区表的统计信息失效
cursor stale_table_subpart is
select distinct owner, table_name, partition_name
from dba_tab_statistics
where (last_analyzed is null or stale_stats = 'YES')
and owner = avc_schema
and partition_name is not null
and subpartition_name is not null;

begin

--01.处理不是分区表的统计信息失效
--SELECT * FROM USER_TAB_MODIFICATIONS来展示统计结果，因为信息不是实时刷新到数据字典，所以
dbms_stats.flush_database_monitoring_info;
for stale in stale_table_global loop
dbms_stats.gather_table_stats(ownname          => stale.owner,
tabname          => stale.table_name,
estimate_percent => dbms_stats.auto_sample_size,
--for all columns size repeat替换for all indexed columns
method_opt  => 'for all indexed columns',
degree      => 8,
granularity => 'GLOBAL',
cascade     => true,
--force - gather statistics of table even if it is locked.
force => true);
end loop;

--02.处理一维分区表的统计信息失效
dbms_stats.flush_database_monitoring_info;
for stale_part in stale_table_part loop
dbms_stats.gather_table_stats(ownname          => stale_part.owner,
tabname          => stale_part.table_name,
partname         => stale_part.partition_name,
estimate_percent => dbms_stats.auto_sample_size,
--for all columns size repeat替换for all indexed columns
method_opt  => 'for all indexed columns',
degree      => 8,
granularity => 'PARTITION',
cascade     => true,
--force - gather statistics of table even if it is locked.
force => true);
end loop;

--03.处理二维分区表的统计信息失效
dbms_stats.flush_database_monitoring_info;
for stale_subpart in stale_table_subpart loop
dbms_stats.gather_table_stats(ownname          => stale_subpart.owner,
tabname          => stale_subpart.table_name,
partname         => stale_subpart.partition_name,
estimate_percent => dbms_stats.auto_sample_size,
--for all columns size repeat替换for all indexed columns
method_opt  => 'for all indexed columns',
degree      => 8,
granularity => 'SUBPARTITION',
cascade     => true,
--force - gather statistics of table even if it is locked.
force => true);
end loop;
--异常处理,简化版
exception
when others then
dbms_output.put_line(substr(sqlerrm, 1, 80);

end;

参数说明：(avc_schema in varchar2 --输入需要收集统计信息的用户

我写的这个脚本，在实际的生产数据库，每天都执行，所需时间，一个小时以内；

注：在oracle 11g中，从oracle 表的级别看，两个因素影响工作效率：统计信息，高水位线问题；

下次分析高水位线表的处理；

内容来自用户分享和网络整理，不保证内容的准确性，如有侵权内容，可联系管理员处理

标签：

相关文章推荐

新的分享

章节导航