
A First Look at Full-Text Indexes

2014-04-03 11:33
    Generally speaking, full-text indexes are mostly used in OLAP environments, where they excel at fast word searches.
I. Full-text index vs. ordinary B-tree index
SQL> create table t1 (id int, name varchar(10));

Table created.

SQL> create index t1_ind on t1(name);

Index created.

SQL> create table t2 as select * from t1;

Table created.

SQL> create index t2_ind on t2(name) indextype is ctxsys.context;
 
Index created.
 
 
SQL> select * from t1 where name like '%tom%';

        ID NAME
---------- ----------
         1 tom
         2 tom tom
         2 tom tom


Execution Plan
----------------------------------------------------------
Plan hash value: 3589342044

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     1 |    10 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |     1 |    10 |     2   (0)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | T1_IND |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("NAME" LIKE '%tom%' AND "NAME" IS NOT NULL)
 
 
Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          4 consistent gets
          0 physical reads
          0 redo size
        676 bytes sent via SQL*Net to client
        519 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          3 rows processed
         
SQL> select * from t2 where contains(name,'tom')>0;

        ID NAME
---------- ----------
         1 tom
         2 tom tom
         2 tom tom


Execution Plan
----------------------------------------------------------
Plan hash value: 785228215

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     3 |    30 |     7   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2     |     3 |    30 |     7   (0)| 00:00:01 |
|*  2 |   DOMAIN INDEX              | T2_IND |       |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CTXSYS"."CONTAINS"("NAME",'tom')>0)
 
 
Statistics
----------------------------------------------------------
         10 recursive calls
          0 db block gets
         14 consistent gets
          0 physical reads
          0 redo size
        676 bytes sent via SQL*Net to client
        519 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          3 rows processed
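
The two queries above return the same rows here, but they do not test the same thing: LIKE '%tom%' matches any substring, while CONTAINS matches indexed words and can also report a relevance score. A minimal sketch, assuming the t2 table and t2_ind index created above; the label 1 is an arbitrary number chosen here to tie SCORE to the CONTAINS call:

```sql
-- Rank the rows that contain the word 'tom' by relevance.
-- The third argument of CONTAINS (1) is a label that SCORE(1) refers back to.
select score(1) as relevance, id, name
  from t2
 where contains(name, 'tom', 1) > 0
 order by score(1) desc;
```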
 
SQL> select object_name, object_type from user_objects order by 1;

OBJECT_NAME                    OBJECT_TYPE
------------------------------ -------------------
                --The four tables whose names begin with DR$ are the base tables of the full-text index
DR$T2_IND$X                    INDEX
DRC$T2_IND$R                   INDEX
SYS_IL0000236119C00006$$       INDEX
SYS_IL0000236124C00002$$       INDEX
SYS_IOT_TOP_236122             INDEX
SYS_IOT_TOP_236128             INDEX
SYS_LOB0000236119C00006$$      LOB
 
OBJECT_NAME                    OBJECT_TYPE
------------------------------ -------------------
SYS_LOB0000236124C00002$$      LOB
T1                             TABLE
T1_IND                         INDEX
T2                             TABLE
T2_IND                         INDEX
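
To look at the base tables on their own, the dictionary can be filtered by name. This is only a sketch, assuming the t2_ind index from above; for a context index the base tables are typically named DR$<index>$I, $K, $N and $R:

```sql
-- List the tables Oracle Text created for the t2_ind index.
select table_name
  from user_tables
 where table_name like 'DR$T2_IND$%'
 order by table_name;
```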
 
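II. Effect of DML operations on a full-text index
A full-text index created with the context index type is not transactional: by default, even after a DML operation has been committed, the new data is not written to the full-text index.
1. insert
SQL> create table t (name varchar2(30));

Table created.

SQL> create index t_ind on t(name) indextype is ctxsys.context;

Index created.

SQL> insert into t values('i am an oracle dba');

1 row created.

SQL> commit;

The inserted row has been committed. Let's check whether the full-text index has been updated.

SQL> set autotrace on
SQL> select * from t where name like '%dba%';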
 
NAME
------------------------------
i am an oracle dba
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    17 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    17 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("NAME" IS NOT NULL AND "NAME" LIKE '%dba%')
 
Note
-----
   - dynamic sampling used for this statement(level=2)
 
 
Statistics
----------------------------------------------------------
          5 recursive calls
          0 db block gets
         15 consistent gets
          0 physical reads
          0 redo size
        538 bytes sent via SQL*Net to client
        520 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed
 
SQL> set line 200
SQL> select * from t where contains(name,'dba') > 0;

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 315187259

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    29 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    29 |     4   (0)| 00:00:01 |
|*  2 |   DOMAIN INDEX              | T_IND |       |       |     4   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CTXSYS"."CONTAINS"("NAME",'dba')>0)
 
Note
-----
   - dynamic sampling used for this statement(level=2)
 
 
Statistics
----------------------------------------------------------
       1829 recursive calls
          0 db block gets
       2696 consistent gets
         30 physical reads
          0 redo size
        332 bytes sent via SQL*Net to client
        509 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
        164 sorts (memory)
          0 sorts (disk)
          0 rows processed
As the output above shows, the full-text index was not updated automatically; instead, the pending row is recorded in the ctxsys.dr$pending table, shown below.

SQL> set autotrace off
SQL> select * from ctxsys.dr$pending;

   PND_CID    PND_PID PND_ROWID          PND_TIMESTAMP      P
---------- ---------- ------------------ ------------------ -
      1084          0 AABGmVAAEAAAADmAAA 03-APR-14          N

SQL> insert into t values('he is an oracle dba');
 
1 row created.
 
SQL> commit;
 
Commit complete.
 
SQL> select * from ctxsys.dr$pending;

   PND_CID    PND_PID PND_ROWID          PND_TIMESTAMP      P
---------- ---------- ------------------ ------------------ -
      1084          0 AABGmVAAEAAAADmAAA 03-APR-14          N
      1084          0 AABGmVAAEAAAADmAAB 03-APR-14          N
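
Reading ctxsys.dr$pending directly requires access to the CTXSYS schema. As a sketch of an alternative, assuming the index is owned by the current user, the CTX_USER_PENDING view exposes the same queue by index name:

```sql
-- Show the rows still waiting to be synchronized into t_ind.
select pnd_index_name, pnd_rowid, pnd_timestamp
  from ctx_user_pending
 where pnd_index_name = 'T_IND';
```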
 
SQL> select * from t where contains(name,'dba') > 0;

no rows selected

To propagate the pending rows to the full-text index, we have to synchronize it manually:
SQL> alter index t_ind rebuild parameters ('sync');

Index altered.

SQL> select * from t where contains(name,'dba') > 0;
 
NAME
------------------------------
i am an oracle dba
he is an oracle dba
 
SQL> select * from ctxsys.dr$pending;
 
no rows selected
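
The same synchronization can also be requested through the CTX_DDL package. A minimal sketch, assuming the current user has execute privilege on CTX_DDL; the '2M' memory setting is only an illustrative value, not one taken from the session above:

```sql
-- Process the pending queue of t_ind, using up to 2 MB of indexing memory.
begin
  ctx_ddl.sync_index('t_ind', '2M');
end;
/
```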
 
2. delete
SQL> select * from t;
 
NAME
------------------------------
i am an oracle dba
he is an oracle dba
 
SQL> delete from t where name='he is an oracle dba';

1 row deleted.

SQL> select * from t where contains(name,'dba') > 0;
 
NAME
------------------------------
i am an oracle dba
 
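SQL> select * from ctxsys.dr$pending;

no rows selected

SQL> select * from ctxsys.dr$delete;

DEL_IDX_ID DEL_IXP_ID  DEL_DOCID
---------- ---------- ----------
      1084          0          2

Here we can see that the delete is visible to the full-text query immediately. The row in ctxsys.dr$delete is an intermediate state of the delete operation; it exists only to track the transaction, whichever way the transaction ends (commit or rollback).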
 
SQL> rollback;
 
Rollback complete.
 
SQL> select * from t where contains(name,'dba') > 0;

NAME
------------------------------
i am an oracle dba
he is an oracle dba

SQL> select * from ctxsys.dr$delete;
 
no rows selected
 
3. update
An update is equivalent to a delete followed by an insert, so by default the full-text index has to be refreshed manually.
SQL> update t set name='oracle dba' where name='i am an oracle dba';

1 row updated.

SQL> select * from ctxsys.dr$delete;

DEL_IDX_ID DEL_IXP_ID  DEL_DOCID
---------- ---------- ----------
      1084          0          1

SQL> select * from ctxsys.dr$pending;

   PND_CID    PND_PID PND_ROWID          PND_TIMESTAMP      P
---------- ---------- ------------------ ------------------ -
      1084          0 AABGmVAAEAAAADmAAA 03-APR-14          N

SQL> select * from t where contains(name,'dba') > 0;
 
NAME
------------------------------
he is an oracle dba
 
SQL> alter index t_ind rebuild parameters ('sync');

Index altered.

SQL> select * from t where contains(name,'dba') > 0;
 
NAME
------------------------------
he is an oracle dba
oracle dba
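
Since each update is handled as a delete plus an insert, repeated updates and syncs leave obsolete entries behind and fragment the index over time. A hedged sketch of the usual clean-up step, assuming execute privilege on CTX_DDL; 'FULL' is one of several optimization levels, and which one suits a given system is a judgment call:

```sql
-- Defragment t_ind and remove index entries that belong to deleted rows.
begin
  ctx_ddl.optimize_index('t_ind', 'FULL');
end;
/
```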
 
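Full-text indexes are mostly built on tables holding very large amounts of data, so updating the index in real time on every DML operation would hurt system performance. Three synchronization options are available when creating a full-text index:
manual: the default; the index is synchronized only on request
every: the index is synchronized at a regular time interval
on commit: the index is synchronized when the DML transaction commits
The syntax is as follows:
create index t_ind on t(name) indextype is ctxsys.context parameters('sync (on commit)');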
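
For comparison, here is a sketch of the every option and of changing the mode of an index that already exists. The one-hour interval is an arbitrary example value, and the two statements are alternatives, not a script to run in sequence:

```sql
-- Alternative 1: create the index so that it is synchronized once an hour.
-- The interval is a date expression enclosed in double quotes.
create index t_ind on t(name) indextype is ctxsys.context
  parameters ('sync (every "SYSDATE+1/24")');

-- Alternative 2: switch an existing index to on-commit synchronization
-- by replacing only its metadata, without re-indexing the data.
alter index t_ind rebuild parameters ('replace metadata sync (on commit)');
```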
 
The ctx_report package provides tools for viewing information about a full-text index and its performance.
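
A minimal sketch of two ctx_report functions, assuming the t_ind index from the examples above. Both return a CLOB, so the SQL*Plus long setting is raised first to avoid truncated output:

```sql
set long 100000
set pagesize 0

-- Describe the index: its preferences, storage settings and sync mode.
select ctx_report.describe_index('t_ind') from dual;

-- Report the space used by the index's internal tables.
select ctx_report.index_size('t_ind') from dual;
```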