您的位置:首页 > 数据库 > Oracle

详解oracle bitmap位图索引

2013-11-24 21:14 387 查看
位图索引是oracle中非常重要的一种索引形式。本文通过总结有关位图索引的资料,尝试回答如下几个问题:1:什么是位图索引?2:位图索引适合什么场景,不适合什么场景?3:位图索引的性能如何?

什么是位图索引?

位图索引,顾名思义,与“位”有关。大家都知道,计算机中的所有信息最终都是通过“位bit”来运算的, 二进制位运算在计算机中是非常高效的。每一个二进制位都可以取值0或者1,而取值的确切含义是由具体的上下文环境决定的。在oracle位图索引中,每一个二进制位代表了某一行中索引列的取值情况。例如,学生表中性别列的位图索引结构如下:男:0101001101女:1010110010在上面的位图结构中,存储了10条学生记录的性别分布情况,以“男”性别为例,从左到右的第n个二进制位代表了第n条记录是否性别为男,如果二进制位为1,代表true即性别为男,0代表false即性别不为男。以此类推,从图中可以看出,第一条记录的性别为女,第二条记录的性别为男,...第九条记录的性别为女,第十条记录的性别为男。大家都知道,在oracle中是根据rowid来定位记录的,因此,我们需要引入start rowid和end rowid,通过start rowid ,end rowid 和二进制位的偏移,我们就可以非常快速的计算出二进制位所代表的表记录rowid。位图索引的最终逻辑结构如下图:

位图索引适合什么场景,不适合什么场景?

现在我们已经了解了位图索引的逻辑结构,我们称每一单元的<key ,startrowid,end rowid,bitmap>为一个位图片段。当我们修改某一行数据的时候,我们需要锁定该行列值所对应的位图片段,如果我们进行的是更新操作,同时还会锁定更新后新值所在的位图片段。例如我们将列值从01修改为03,就需要同时锁定01和03位图片段,此时如果有其他用户需要修改与01或者03关联的表记录上的索引字段,就会被阻塞,因此位图索引不适合并发环境,在并发环境下可能会造成大量事务的阻塞。从位图索引的逻辑结构也可以看出,当索引列的distinct cardinality较大时,索引所占用的存储空间也会成比例扩大。下面我们测试一下位图索引占用空间与distinct cardinality的关系:数据库环境:oracle 11g  数据块大小:8k
CREATE or replace FUNCTION ind_spc_test(rn NUMBER) RETURN NUMBER
AS
v_j     NUMBER;
v_dis   NUMBER;
v_bm_sp NUMBER;
v_bt_sp NUMBER;
BEGIN
FOR i IN 1 .. 10LOOP
EXECUTE immediate 'truncate table t_easy1';
EXECUTE immediate 'truncate table t_easy2';
SELECT floor(rn/(11-i)) INTO v_j FROM dual;
FOR j IN 1 .. rn LOOP
INSERT INTO t_easy1 VALUES (mod(j,v_j));
INSERT INTO t_easy2 VALUES (mod(j,v_j));
END LOOP;
commit;
select count(distinct id) into v_j from t_easy1;
EXECUTE immediate 'analyze index i_easy1 COMPUTE STATISTICS';
SELECT lEAF_BLOCKS INTO v_bt_sp FROM user_indexes where index_name='I_EASY1';
EXECUTE immediate 'analyze index i_easy2 COMPUTE STATISTICS';
SELECT LEAF_BLOCKS INTO v_bm_sp FROM user_indexes where index_name='I_EASY2';
INSERT INTO bitmap_ind_space VALUES (v_j,v_bm_sp,v_bt_sp,rn );
COMMIT;
END LOOP;
RETURN 0;
END;
SQL> select * from bitmap_ind_space order by 1;

DISTINCT_VAL BITMAP_IND_BLKS BTREE_IND_BLKS    ROW_NUM
------------ --------------- -------------- ----------
10000		 139		300	100000
11111		  79		335	100000
12500		  89		285	100000
14285		 103		220	100000
16666		 120		257	100000
20000		 146		310	100000
25000		 183		293	100000
33333		 246		262	100000
50000		 371		296	100000
100000		 408		200	100000
这里的测试比较简单,下面看看大师的实验结果:从这里可以看出,随着distinct columns值的增加,位图索引占用空间逐步增大,但即便在最坏的情况下,位图索引占用的空间也仅仅是普通索引的2~3倍,在存储日益普遍的今天,这恐怕并不是很大的问题。

位图索引的查询性能如何?

下面我们看一下位图索引的查询性能如何。在很多资料中,都可以看到这样的论述:位图索引适合于 low distict cardinality的列。实际上,对于high distinct cardinality 的列,位图索引的查询性能也是非常不错的。下面我们来验证这个结论。首先我们创建两张表:emp_normal和emp_random.
SQL> create table emp_normal(empno number(10), ename varchar2(30), sal number(10));

表已创建。

Begin
For i in 1..1000000
Loop
Insert into emp_normal
values(i, dbms_random.string('U',30), dbms_random.value(1000,7000));
If mod(i, 10000) = 0 then
Commit;
End if;
End loop;
10  End;
11  /

PL/SQL 过程已成功完成。

SQL> create table emp_random as select /* +append */ * from emp_normal order by dbms_random.random;
emp_random由于其记录是随机分布的,因此该表上索引的CLUSTERING_FACTOR要高一些。我们首先看一下emp_normal表等值查询情况下,索引的效率如何:
SQL> create bitmap index bm_normal on emp_normal(empno);

索引已创建。

SQL> analyze table emp_normal compute statistics for table for all indexes for all indexed columns;

表已分析。

SQL> select index_name,clustering_factor from user_indexes;

INDEX_NAME		       CLUSTERING_FACTOR
------------------------------ -----------------
BM_NORMAL				 1000000

SQL> set autot traceonly
SQL> select * from emp_normal where empno=&empno;
输入 empno 的值:  1000
原值    1: select * from emp_normal where empno=&empno
新值    1: select * from emp_normal where empno=1000

执行计划
----------------------------------------------------------
Plan hash value: 1526426521

-------------------------------------------------------------------------------------------
| Id  | Operation		     | Name	  | Rows  | Bytes | Cost (%CPU)| Time	  |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |		  |	1 |    34 |	3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | EMP_NORMAL |	1 |    34 |	3   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|		  |	  |	  |	       |	  |
|*  3 |    BITMAP INDEX SINGLE VALUE | BM_NORMAL  |	  |	  |	       |	  |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("EMPNO"=1000)

统计信息
----------------------------------------------------------
1  recursive calls
0  db block gets
5  consistent gets
0  physical reads
0  redo size
702  bytes sent via SQL*Net to client
520  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed

SQL> select * from emp_normal where empno=&empno;
输入 empno 的值:  2398
原值    1: select * from emp_normal where empno=&empno
新值    1: select * from emp_normal where empno=2398

执行计划
----------------------------------------------------------
Plan hash value: 1526426521

-------------------------------------------------------------------------------------------
| Id  | Operation		     | Name	  | Rows  | Bytes | Cost (%CPU)| Time	  |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |		  |	1 |    34 |	3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | EMP_NORMAL |	1 |    34 |	3   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|		  |	  |	  |	       |	  |
|*  3 |    BITMAP INDEX SINGLE VALUE | BM_NORMAL  |	  |	  |	       |	  |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("EMPNO"=2398)

统计信息
----------------------------------------------------------
1  recursive calls
0  db block gets
5  consistent gets
0  physical reads
0  redo size
703  bytes sent via SQL*Net to client
520  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed

SQL> select * from emp_normal where empno=&empno;
输入 empno 的值:  8545
原值    1: select * from emp_normal where empno=&empno
新值    1: select * from emp_normal where empno=8545

执行计划
----------------------------------------------------------
Plan hash value: 1526426521

-------------------------------------------------------------------------------------------
| Id  | Operation		     | Name	  | Rows  | Bytes | Cost (%CPU)| Time	  |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |		  |	1 |    34 |	3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | EMP_NORMAL |	1 |    34 |	3   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|		  |	  |	  |	       |	  |
|*  3 |    BITMAP INDEX SINGLE VALUE | BM_NORMAL  |	  |	  |	       |	  |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("EMPNO"=8545)

统计信息
----------------------------------------------------------
1  recursive calls
0  db block gets
5  consistent gets
0  physical reads
0  redo size
703  bytes sent via SQL*Net to client
520  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed

SQL> select * from emp_normal where empno=&empno;
输入 empno 的值:  128444
原值    1: select * from emp_normal where empno=&empno
新值    1: select * from emp_normal where empno=128444

执行计划
----------------------------------------------------------
Plan hash value: 1526426521

-------------------------------------------------------------------------------------------
| Id  | Operation		     | Name	  | Rows  | Bytes | Cost (%CPU)| Time	  |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |		  |	1 |    34 |	3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | EMP_NORMAL |	1 |    34 |	3   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|		  |	  |	  |	       |	  |
|*  3 |    BITMAP INDEX SINGLE VALUE | BM_NORMAL  |	  |	  |	       |	  |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("EMPNO"=128444)

统计信息
----------------------------------------------------------
1  recursive calls
0  db block gets
5  consistent gets
0  physical reads
0  redo size
704  bytes sent via SQL*Net to client
520  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed

SQL> drop index bm_normal;

索引已删除。

SQL> create index bt_normal on emp_normal(empno);

索引已创建。

SQL> analyze table emp_normal compute statistics for table for all indexes for all indexed columns;

表已分析。

SQL> select index_name,clustering_factor from user_indexes;

INDEX_NAME		       CLUSTERING_FACTOR
------------------------------ -----------------
BT_NORMAL				    6210
SYS_IL0000076897C00002$$
PK_EMP					       1
PK_DEPT 				       1

SQL> set autot traceonly
SQL> select * from emp_normal where empno=&empno;
输入 empno 的值:  1000
原值    1: select * from emp_normal where empno=&empno
新值    1: select * from emp_normal where empno=1000

执行计划
----------------------------------------------------------
Plan hash value: 733975378

------------------------------------------------------------------------------------------
| Id  | Operation		    | Name	 | Rows  | Bytes | Cost (%CPU)| Time	 |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	    |		 |     1 |    34 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP_NORMAL |     1 |    34 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN	    | BT_NORMAL  |     1 |	 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("EMPNO"=1000)

统计信息
----------------------------------------------------------
1  recursive calls
0  db block gets
5  consistent gets
0  physical reads
0  redo size
702  bytes sent via SQL*Net to client
520  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed

SQL> select * from emp_normal where empno=&empno;
输入 empno 的值:  128444
原值    1: select * from emp_normal where empno=&empno
新值    1: select * from emp_normal where empno=128444

执行计划
----------------------------------------------------------
Plan hash value: 733975378

------------------------------------------------------------------------------------------
| Id  | Operation		    | Name	 | Rows  | Bytes | Cost (%CPU)| Time	 |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	    |		 |     1 |    34 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP_NORMAL |     1 |    34 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN	    | BT_NORMAL  |     1 |	 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("EMPNO"=128444)

统计信息
----------------------------------------------------------
1  recursive calls
0  db block gets
5  consistent gets
0  physical reads
0  redo size
704  bytes sent via SQL*Net to client
520  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed

SQL> select * from emp_normal where empno=&empno;
输入 empno 的值:  2398
原值    1: select * from emp_normal where empno=&empno
新值    1: select * from emp_normal where empno=2398

执行计划
----------------------------------------------------------
Plan hash value: 733975378

------------------------------------------------------------------------------------------
| Id  | Operation		    | Name	 | Rows  | Bytes | Cost (%CPU)| Time	 |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	    |		 |     1 |    34 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP_NORMAL |     1 |    34 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN	    | BT_NORMAL  |     1 |	 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("EMPNO"=2398)

统计信息
----------------------------------------------------------
1  recursive calls
0  db block gets
5  consistent gets
0  physical reads
0  redo size
703  bytes sent via SQL*Net to client
520  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed
总结如下:
BITMAPEMPNOB-TREE
Consistent ReadsPhysical ReadsConsistent ReadsPhysical Reads
50100050
50239850
50854550
509800850
508534250
5012844450
5085850
对emp_random表进行实验,得出的结果与之类似,这里不再獒述。从这里可以看出,在唯一列上的等值查询,位图索引与btree索引的效率相当。下面,我们在针对范围查询来进行测试。SQL> create bitmap index bm_random on emp_random(empno);索引已创建。SQL> analyze table emp_random compute statistics for table for all indexes for all columns;表已分析。SQL> select index_name,clustering_factor from user_indexes;INDEX_NAME CLUSTERING_FACTOR------------------------------ -----------------BM_RANDOM 1000000SQL> set autot traceonlySQL> select * from emp_random where empno between &range1 and &range2;输入 range1 的值: 1输入 range2 的值: 2300原值 1: select * from emp_random where empno between &range1 and &range2新值 1: select * from emp_random where empno between 1 and 2300已选择2300行。执行计划----------------------------------------------------------Plan hash value: 811843605-------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |-------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 2299 | 85063 | 418 (1)| 00:00:06 || 1 | TABLE ACCESS BY INDEX ROWID | EMP_RANDOM | 2299 | 85063 | 418 (1)| 00:00:06 || 2 | BITMAP CONVERSION TO ROWIDS| | | | | ||* 3 | BITMAP INDEX RANGE SCAN | BM_RANDOM | | | | |-------------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------3 - access("EMPNO">=1 AND "EMPNO"<=2300)统计信息----------------------------------------------------------1 recursive calls0 db block gets2463 consistent gets0 physical reads0 redo size130225 bytes sent via SQL*Net to client2203 bytes received via SQL*Net from client155 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)2300 rows processedSQL> select * from emp_random where empno between &range1 and &range2;输入 range1 的值: 8输入 range2 的值: 1980原值 1: select * from emp_random where empno between &range1 and &range2新值 1: select * from emp_random where empno between 8 and 1980已选择1973行。执行计划----------------------------------------------------------Plan hash value: 811843605-------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |-------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1972 | 72964 | 366 (0)| 00:00:05 || 1 | TABLE ACCESS BY INDEX ROWID | EMP_RANDOM | 1972 | 72964 | 366 (0)| 00:00:05 || 2 | BITMAP CONVERSION TO ROWIDS| | | | | ||* 3 | BITMAP INDEX RANGE SCAN | BM_RANDOM | | | | |-------------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------3 - access("EMPNO">=8 AND "EMPNO"<=1980)统计信息----------------------------------------------------------1 recursive calls0 db block gets2114 consistent gets0 physical reads0 redo size111758 bytes sent via SQL*Net to client1961 bytes received via SQL*Net from client133 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)1973 rows processedSQL> select * from emp_random where empno between &range1 and &range2;输入 range1 的值: 28888输入 range2 的值: 31850原值 1: select * from emp_random where empno between &range1 and &range2新值 1: select * from emp_random where empno between 28888 and 31850已选择2963行。执行计划----------------------------------------------------------Plan hash value: 811843605-------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |-------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 2962 | 107K| 513 (0)| 00:00:07 || 1 | TABLE ACCESS BY INDEX ROWID | EMP_RANDOM | 2962 | 107K| 513 (0)| 00:00:07 || 2 | BITMAP CONVERSION TO ROWIDS| | | | | ||* 3 | BITMAP INDEX RANGE SCAN | BM_RANDOM | | | | |-------------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------3 - access("EMPNO">=28888 AND "EMPNO"<=31850)统计信息----------------------------------------------------------1 recursive calls0 db block gets3172 consistent gets0 physical reads0 redo size170625 bytes sent via SQL*Net to client2687 bytes received via SQL*Net from client199 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)2963 rows processedSQL> drop index bm_random;索引已删除。SQL> create index bt_random on emp_random(empno);索引已创建。SQL> analyze table emp_random compute statistics for table for all indexes for all columns;表已分析。SQL> set autot offSQL> select index_name,clustering_factor from user_indexes;INDEX_NAME CLUSTERING_FACTOR------------------------------ -----------------BT_RANDOM 999834SQL> set autot traceonlySQL> select * from emp_random where empno between &range1 and &range2;输入 range1 的值: 1输入 range2 的值: 2300原值 1: select * from emp_random where empno between &range1 and &range2新值 1: select * from emp_random where empno between 1 and 2300已选择2300行。执行计划----------------------------------------------------------Plan hash value: 731629521--------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |--------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 2299 | 85063 | 1735 (1)| 00:00:21 ||* 1 | TABLE ACCESS FULL| EMP_RANDOM | 2299 | 85063 | 1735 (1)| 00:00:21 |--------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------1 - filter("EMPNO"<=2300 AND "EMPNO">=1)统计信息----------------------------------------------------------1 recursive calls0 db block gets6410 consistent gets0 physical reads0 redo size121081 bytes sent via SQL*Net to client2203 bytes received via SQL*Net from client155 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)2300 rows processedSQL> select * from emp_random where empno between &range1 and &range2;输入 range1 的值: 8输入 range2 的值: 1980原值 1: select * from emp_random where empno between &range1 and &range2新值 1: select * from emp_random where empno between 8 and 1980已选择1973行。执行计划----------------------------------------------------------Plan hash value: 731629521--------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |--------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1972 | 72964 | 1735 (1)| 00:00:21 ||* 1 | TABLE ACCESS FULL| EMP_RANDOM | 1972 | 72964 | 1735 (1)| 00:00:21 |--------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------1 - filter("EMPNO"<=1980 AND "EMPNO">=8)统计信息----------------------------------------------------------1 recursive calls0 db block gets6388 consistent gets0 physical reads0 redo size103922 bytes sent via SQL*Net to client1961 bytes received via SQL*Net from client133 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)1973 rows processed归纳如下,
BITMAPEMPNO (Range)B-TREE
Consistent ReadsPhysical ReadsConsistent ReadsPhysical Reads
246301-230064100
211408-198063880
257201850-425064180
3172028888-3185064560
2762082900-8547864310
72540984888-100000072540
从这里可以看出,位图索引要优于btree索引,这是因为btree索引的cluster  factor 较大,从而优化器选择了全表扫描。即便在emp_normal 表下,即clustering factor较小时,位图索引btree索引相当的。因此在distinct cardinality 较大的情况下,范围扫描的效率位图索引也是不逊色与btree索引。总结如下:位图索引的查询性能经常是优于btree索引的,即便在distinct cardinality较大的情况下位图索引不适合与dml频繁的环境位图索引适用于DSS系统位图索引可以进行逻辑运算,多个索引和同时在查询语句中发挥作用,这是一个非常重要的地方

                                            
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签:  oracle index bitmap index