Oracle Index Clustering Factor(集群因子)

今天在做测试的时候发现字段上有索引,但是执行计划就是不走索引,经过在网上查找才发现原来是索引的集群因子过高导致的。

二、官网说明

The index clustering factor measures row order in relation to an indexed value suches employee last name.The more order that exists in rowstorage for this value,the lower the clustering factor.

----row存储的越有序,clustering factor的值越低。

The clustering factor is useful as a rough measure of the number of I/Os required to read an entire table by means of an index:

(1)、If the clustering factor is high,then Oracle Database performs a relatively high number of I/Os during a large index range scan.The index entriespoint to random table blocks,so the database may have to read and reread the same blocks over and over again to retrieve the data pointed to by the index.

----当clustering factor很高时,说明index entry (rowid) 是随机指向一些block的,在一个大的index range scan时,这样为了读取这些rowid指向的block,就需要一次又一次重复的去读这些block。

(2)、If the clustering factor is low,then Oracle Database performs a relatively low number of I/Os during a large index range scan.The index keys in arange tend to point to the same data blcok,so the database does not have to read and reread the same blocks over and over.

----当clustering factor值低时,说明index keys (rowid) 是指向的记录是存储在相同的block里,这样去读row时,只需要在同一个block里读取就可以了,这样减少重复读取blocks的次数。

The clustering factor is relevant for index scans because it can show:

(1)、Whether the database will use an index for large range scans;

(2)、The degree of table organization in relation to the index key;

(3)、Whether you should consider using an index-organized table,partitioning,or table cluster if rows must be ordered by the index key.

三、Index Clustering Factor说明

简单的说,Index Clustering Factor是通过一个索引扫描一张表,需要访问的表的数据块的数量,即对I/O的影响,也代表索引键存储位置是否有序。

(1)、如果越有序,即相邻的键值存储在相同的block,那么这时候Clustering Factor的值就越低;

(2)、如果不是很有序,即键值是随机的存储在block上,这样在读取键值时,可能就需要一次又一次的去访问相同的block,从而增加了I/O。

Clustering Factor的计算方式如下:

(1)、扫描一个索引(large index range scan);

(2)、比较某行的rowid和前一行的rowid,如果这两个rowid不属于同一个数据块,那么cluster factor增加1;

(3)、整个索引扫描完毕后,就得到了该索引的clustering factor。

如果clustering factor接近于表存储的块数,说明这张表是按照索引字段的顺序存储的。

如果clustering factor接近于行的数量,那说明这张表不是按索引字段顺序存储的。

在计算索引访问成本的时候,这个值十分有用。Clustering Factor乘以选择性参数(selectivity)就是访问索引的开销。

如果这个统计数据不能真实反映出索引的真实情况,那么可能会造成优化器错误的选择执行计划。另外如果某张表上的大多数访问是按照某个索引做索引扫描,那么将该表的数据按照索引字段的顺序重新组织,可以提高该表的访问性能。

四、测试

  4.1、产生问题:

----查看一下数据库的版本----
SQL> select * from v$version where rownum=1;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production

----创建一张测试表jack----
SQL> create table jack as select * from dba_objects where 1=2;

Table created.

----将数据无序的插入jack表中----
SQL> begin
  2      for i in 1..10 loop
  3        insert /*+ append */ into jack select * from dba_objects order by i;
  4      commit;
  5    end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> select count(*) from jack;

COUNT(*)
----------
    725460

----查看一下表的大小-----
SQL> set wrap off
SQL> col owner for a10;
SQL> col segment_name for a15;
SQL> select segment_name,blocks,extents,bytes/1024/1024||'M' "size" from user_segments where segment_name='JACK';

SEGMENT_NAME    BLOCKS    EXTENTS  size
------------- ---------- ---------- --------
JACK            11264      82      88M

----在object_id上创建索引----
SQL> create index jack_ind on jack(object_id);

Index created.

----查看一下索引的大小----
SQL> select segment_name,segment_type,blocks,extents,bytes/1024/1024||'M' "size" from user_segments where segment_name='JACK_IND';

内容版权声明:除非注明,否则皆为本站原创文章。

转载注明出处:https://www.heiqu.com/645dd8712db027c5f87d83ca17e465b8.html