30926 及MERGE 临时表空占用问题

今天进行MERGE操作的时候是根据两个2个大表大约每张表10G左右,进行更新,服务器使用了PCI-E闪存盘,速度还是相当快的,大概语句如下:

merge into TEST_EVENT a
using
TEST_EVENT_1 b
on(a.MAIN_TARGET=b.MAIN_TARGET and a.EVENT_TYPE=b.EVENT_TYPE)
when matched then
update
set
a.CARD_NO=b.CARD_NO,a.EVENT_ACCOUNT=b.EVENT_ACCOUNT,a.EXTEND_PROPERTIES=b.EXTEND_PROPERTIES
where
b.EVENT_NAME= 'WITHDRAW' AND b.EVENT_TYPE IN ('SYNCHRONIZE','ASYNCHRONIZE')
AND
b.raw_add_time>=to_date('2014-01-01','yyyy-mm-dd')
AND
b.raw_add_time

出现了2个问题如下进行分析和说明:

1、merge 链接不稳定
报错
ORA-30926: 无法在源表中获得一组稳定的行
MERGE is a deterministic statement. You therefore need to evaluate the data returned by
the USING clause to ensure that there are no duplicate values in the join. Modify
the merge statement to include a deterministic where clause.
进行举例
SQL> select * from testmerg;

        ID NAME
----------- --------------------
          1 gaopeng
          2 gaopeng
          3 gaopeng

SQL> select * from testmerg2;

        ID NAME
----------- --------------------
          1 yanlei
          2 yanlei
          2 yanlei1

SQL>
SQL> merge  into  testmerg a
  2  using testmerg2 b
  3  on (a.id=b.id)
  4  when matched then update
  5  set a.name=b.name;

merge  into  testmerg a
using testmerg2 b
on (a.id=b.id)
when matched then update
set a.name=b.name

ORA-30926: 无法在源表中获得一组稳定的行

可以看到我们根据ID进行连接(a.id=b.id),但是源表ID有2个ID为2的行,所以MERGE不知道如何修改,到底把ID=2的 2 gaopeng 这一行更改为 yanlei,还是 yanlei1呢?
所以这是MERGE需要注意地方,如果不能满足这个条件MERGE是做不了的。

2、merge 临时表空间过大,并且伴随了大量的 direct path read temp等待事件

Explanation:
============
We might consume more temp space on disk only when the build side(the smaller
side in hash join) can't be fit in memory entirely. If it fits in memory, we
don't need any buffering for the probe side(the rightside).

If the staging table (smaller table participating in the hash join)
is too big to fit into the build side (in memory),
the right hand side's hash partitions need to be buffered, and thus the
huge temp segments requirement.

If the hash area size is large enough to accomodate the build side in memory, you
should not see this type of consumption.

如上解释,
SQL>  select SQL_ID,BLOCKS*8/1024/1024,SEGTYPE from v$sort_usage;

SQL_ID        BLOCKS*8/1024/1024 SEGTYPE
------------- ------------------ ---------
08mh0r0xsj3h6      0.0009765625 DATA
08mh0r0xsj3h6      0.0009765625 LOB_DATA
08mh0r0xsj3h6      0.0009765625 INDEX
dyk4dprp70d74      7.0634765625 HASH
如下看到类型为HASH大小约为7G,查看执行计划
PLAN_TABLE_OUTPUT
------------------------------------------------------------
Plan hash value: 3625956359
------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes |
------------------------------------------------------------
|  0 | MERGE STATEMENT      |            |  2680K|  7953M|
|  1 |  MERGE              | SYN_EVENT  |      |      |
|  2 |  VIEW              |            |      |      |
|*  3 |    HASH JOIN        |            |  2680K|  4605M|
|  4 |    TABLE ACCESS FULL| SYN_EVENT_1 |  2680K|  2259M|
|  5 |    TABLE ACCESS FULL| SYN_EVENT  |    21M|    18G|
------------------------------------------------------------

可以看到临时表空间的增加来自于HASH JION,其实HASH JION 并不是MERGE的唯一的
链接方式.因为HASH JION是使用不到索引的,所以想使用到索引需要使用USE_NL的提示如下
merge /*+ leading( b ) use_nl( a b ) */ into SYN_EVENT a
using SYN_EVENT_1 b
on (a.id = b.id )
when matched then
  update
    set a.CARD_NO          = b.CARD_NO,
        a.EVENT_ACCOUNT    = b.EVENT_ACCOUNT,
        a.EXTEND_PROPERTIES = b.EXTEND_PROPERTIES
  where b.EVENT_NAME = 'WITHDRAW'
    AND b.EVENT_TYPE IN ('SYNCHRONIZE', 'ASYNCHRONIZE')
    AND b.raw_add_time >= to_date('2014-01-01', 'yyyy-mm-dd')
    AND b.raw_add_time < to_date('2015-01-01', 'yyyy-mm-dd');
         
但是这样仅仅是使用了NEST LOOP 代替了HASH JION,NEST LOOP的被驱动表是可以使用索引的,可以测试出确实TEMPSPACE来自于HASH JION。
在我的机器上测试syn_event数据量500W大约6G数据,SYN_EVENT_1数据量大约250W大约3G数据,的数据包含CLOB字段进行更新,更改执行计划后耗时大约为750 S

内容版权声明:除非注明,否则皆为本站原创文章。

转载注明出处:https://www.heiqu.com/18923d8c039957c5371452218afdcdaa.html