RAC环境的死锁案例一
环境: linxu 64 ,Oracle 10g ,RAC
现象: 用户报告现象:
费用审批时与通讯服务器出现拥堵的现象描述:
条件:
(1)分公司财务正在进行费用申请单的审核;
(2)通讯服务器正在进行门店费用申请单的导入;
出现的状况:
(1)。分公司费用申请单的审核一直出现漏斗现象;
(2)。通讯服务导入进程出现死循环现象,必须手工删除进程,删除以后,分公司费用审批才能正常;
(3)问题产生时,DBA那里发现有一个SQL语句一直在运转
现象: DBA报告现象:
经查数据库日志发现,由于应用程序问题,使数据库死锁发生,从而导致数据库响应时间很长。数据库做了相关解锁处理后,数据库恢复正常,数据库alert.log报如下信息:
Fri Nov 14 14:17:07 2008
Global Enqueue Services Deadlock detected. More info in file
/u03/prd/prddb/10.2.0/admin/PRD_erpdb1/bdump/prd1_lmd0_26307.trc.
Fri Nov 14 14:25:54 2008
Global Enqueue Services Deadlock detected. More info in file
/u03/prd/prddb/10.2.0/admin/PRD_erpdb1/bdump/prd1_lmd0_26307.trc.
Fri Nov 14 14:44:02 2008
Fri Nov 14 14:47:42 2008
Global Enqueue Services Deadlock detected. More info in file
/u03/prd/prddb/10.2.0/admin/PRD_erpdb1/bdump/prd1_lmd0_26307.trc.
Fri Nov 14 14:47:45 2008
Global Enqueue Services Deadlock detected. More info in file
/u03/prd/prddb/10.2.0/admin/PRD_erpdb1/bdump/prd1_lmd0_26307.trc.
Fri Nov 14 14:47:50 2008
Global Enqueue Services Deadlock detected. More info in file
/u03/prd/prddb/10.2.0/admin/PRD_erpdb1/bdump/prd1_lmd0_26307.trc.
Global Enqueue Services Deadlock detected. More info in file
/u03/prd/prddb/10.2.0/admin/PRD_erpdb1/bdump/prd1_lmd0_26307.trc.
Fri Nov 14 14:56:42 2008
Global Enqueue Services Deadlock detected. More info in file
/u03/prd/prddb/10.2.0/admin/PRD_erpdb1/bdump/prd1_lmd0_26307.trc.
分析:
我们在prd1_lmd0_26307.trc文件中主要关注下面这段信息:
user session for deadlock lock 0x3f7764e88 pid=154 serial=659 audsid=73846000 user: 173/APPS O/S info: user: applprd, term: , ospid: 21397, machine: erpapp2 program: client info: 2022 0 application name: DPOSCOSTAPPROVE, hash value=50960439 action name: FRM::DPHEB鍝堝皵婊―POS鏅€氱敤, hash value=3707827426 Current SQL Statement: UPDATE EXPENDITUREFORM SET SERIES=NVL(:b1,0),EXPENDITUREFORMID=:b2,BARGAINORCOMPANYID=NVL(:b3,' '),CHANGEOUTCOMPANYID=:b4,CHANGEOUTSHOPPEID=:b4,FORMDATE=:b6,CURRENTDATE=:b7,CHANGETYPEID=NVL(:b8,' '),CHANGEOUTPERSON=NVL(:b9,' '),AUDITINGPERSON=NVL(:b10,' '),CHANGESTATEID=NVL(:b11,'01'),ZBSTATE=NVL(:b12,'02'),REMARK=NVL(:b13,' '),FILENO=NVL(:b14,' '),CANCELSIGN=NVL(:b15,'N'),INSERTDATA=NVL(:b16,'N'),UPDATEDATA=NVL(:b17,'N'),DELETEDATA=NVL(:b18,'N'),SENDDATA=NVL(:b19,'N'),ALREADYSEND=NVL(:b20,'N'),POSINPUTSIGN=:b21,HEAD_ID=NVL(:b22,0),EXPORTSIGN=NVL(:b23,'N'),CLOSEDATE=TRUNC(:b24),PAY_DATE=TRUNC(:b25) WHERE ROWID = :b26 ENQUEUE DUMP REQUEST: from 3.20635 on [0xbbda2][0x0],[TM] for reason 3 mtype 0 DUMP LOCAL BLOCKER/HOLDER: block level 4 res [0xbbda2][0x0],[TM] ----------resource 0x0x3c602b658---------------------- resname : [0xbbda2][0x0],[TM] Local node : 0 dir_node : 1 master_node : 1 hv idx : 118 hv last r.inc : 4 current inc : 8 hv status : 0 hv master : 1 open options : dd cached Held mode : KJUSERNL Cvt mode : KJUSERPR Next Cvt mode : KJUSERCW msg_seq : 0xad00ab res_seq : 15 grant_bits : KJUSERNL grant mode : KJUSERNL KJUSERCR KJUSERCW KJUSERPR KJUSERPW KJUSEREX count : 2 0 0 0 0 0 val_state : KJUSERVS_NOVALUE valblk : 0x00000000000000000000000000000000 . access_node : 1 vbreq_state : 0 state : x8 resp : 0x3c602b658 On Scan_q? : N cache level : 0 Total accesses: 4456 Imm. accesses: 3978 Granted_locks : 0 Cvting_locks : 2 value_block: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 GRANTED_Q : CONVERT_Q: lp 0x3f7764e88 gl KJUSERNL rl KJUSERPR rp 0x3c602b658 [0xbbda2][0x0],[TM] master 1 gl owner 0x3fe650d98 possible pid 31759 xid 9A000-0001-000004E0 bast 0 rseq 15 mseq 0 history 0x4951449a convert opt KJUSERGETVALUE lp 0x3f7bdb2b8 gl KJUSERNL rl KJUSERPR rp 0x3c602b658 [0xbbda2][0x0],[TM] master 1 gl owner 0x3fe5e05c8 possible pid 9917 xid 132000-0001-0000012F bast 0 rseq 15 mseq 0 history 0x4951449a convert opt KJUSERGETVALUE ----------enqueue 0x0x3f7764e88------------------------ lock version : 112601 Owner node : 0 grant_level : KJUSERNL req_level : KJUSERPR bast_level : KJUSERNL notify_func : (nil) resp : 0x3c602b658 procp : 0x3f86dc8e8 pid : 31759 proc version : 335 oprocp : (nil) opid : 0 group lock owner : 0x3fe650d98 possible pid : 31759 xid : 9A000-0001-000004E0 dd_time : 100.0 secs dd_count : 6 timeout : 0.0 secs On_timer_q? : N On_dd_q? : Y lock_state : OPENING CONVERTING Open Options : KJUSERDEADLOCK Convert options : KJUSERGETVALUE History : 0x4951449a Msg_Seq : 0x0 res_seq : 15 valblk : 0x00000000000000000000000000000000 . DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK possible owner[154.31759] Submitting asynchronized dump request [28] Global blockers dump start:--------------------------------- DUMP LOCAL BLOCKER/HOLDER: block level 3 res [0xbbda2][0x0],[TM] ----------resource 0x0x3c602b658---------------------- resname : [0xbbda2][0x0],[TM] Local node : 0 dir_node : 1 master_node : 1 hv idx : 118 hv last r.inc : 4 current inc : 8 hv status : 0 hv master : 1 open options : dd cached Held mode : KJUSERNL Cvt mode : KJUSERPR Next Cvt mode : KJUSERCW msg_seq : 0xad00ab res_seq : 15 grant_bits : KJUSERNL grant mode : KJUSERNL KJUSERCR KJUSERCW KJUSERPR KJUSERPW KJUSEREX count : 2 0 0 0 0 0 val_state : KJUSERVS_NOVALUE valblk : 0x00000000000000000000000000000000 . access_node : 1 vbreq_state : 0 state : x8 resp : 0x3c602b658 On Scan_q? : N cache level : 0 Total accesses: 4458 Imm. accesses: 3979 Granted_locks : 0 Cvting_locks : 2 value_block: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 GRANTED_Q : CONVERT_Q: lp 0x3f7764e88 gl KJUSERNL rl KJUSERPR rp 0x3c602b658 [0xbbda2][0x0],[TM] master 1 gl owner 0x3fe650d98 possible pid 31759 xid 9A000-0001-000004E0 bast 0 rseq 15 mseq 0 history 0x4951449a convert opt KJUSERGETVALUE lp 0x3f7bdb2b8 gl KJUSERNL rl KJUSERPR rp 0x3c602b658 [0xbbda2][0x0],[TM] master 1 gl owner 0x3fe5e05c8 possible pid 9917 xid 132000-0001-0000012F bast 0 rseq 15 mseq 0 history 0x4951449a convert opt KJUSERGETVALUE DUMP LOCAL BLOCKER/HOLDER: block level 4 res [0xbbda2][0x0],[TM] ----------resource 0x0x3c602b658---------------------- resname : [0xbbda2][0x0],[TM] Local node : 0 dir_node : 1 master_node : 1 hv idx : 118 hv last r.inc : 4 current inc : 8 hv status : 0 hv master : 1 open options : dd cached Held mode : KJUSERNL Cvt mode : KJUSERPR Next Cvt mode : KJUSERCW msg_seq : 0xad00ab res_seq : 15 grant_bits : KJUSERNL grant mode : KJUSERNL KJUSERCR KJUSERCW KJUSERPR KJUSERPW KJUSEREX count : 2 0 0 0 0 0 val_state : KJUSERVS_NOVALUE valblk : 0x00000000000000000000000000000000 . access_node : 1 vbreq_state : 0 state : x8 resp : 0x3c602b658 On Scan_q? : N cache level : 0 Total accesses: 4459 Imm. accesses: 3979 Granted_locks : 0 Cvting_locks : 2 value_block: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 GRANTED_Q : CONVERT_Q: lp 0x3f7764e88 gl KJUSERNL rl KJUSERPR rp 0x3c602b658 [0xbbda2][0x0],[TM] master 1 gl owner 0x3fe650d98 possible pid 31759 xid 9A000-0001-000004E0 bast 0 rseq 15 mseq 0 history 0x4951449a convert opt KJUSERGETVALUE lp 0x3f7bdb2b8 gl KJUSERNL rl KJUSERPR rp 0x3c602b658 [0xbbda2][0x0],[TM] master 1 gl owner 0x3fe5e05c8 possible pid 9917 xid 132000-0001-0000012F bast 0 rseq 15 mseq 0 history 0x4951449a convert opt KJUSERGETVALUE ----------enqueue 0x0x3f7764e88------------------------ lock version : 112601 Owner node : 0 grant_level : KJUSERNL req_level : KJUSERPR bast_level : KJUSERNL notify_func : (nil) resp : 0x3c602b658 procp : 0x3f86dc8e8 pid : 31759 proc version : 335 oprocp : (nil) opid : 0 group lock owner : 0x3fe650d98 possible pid : 31759 xid : 9A000-0001-000004E0 dd_time : 100.0 secs dd_count : 6 timeout : 0.0 secs On_timer_q? : N On_dd_q? : Y lock_state : OPENING CONVERTING Open Options : KJUSERDEADLOCK Convert options : KJUSERGETVALUE History : 0x4951449a Msg_Seq : 0x0 res_seq : 15 valblk : 0x00000000000000000000000000000000 . DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK possible owner[154.31759] Submitting asynchronized dump request [28] Global blockers dump end:----------------------------------- Global Wait-For-Graph(WFG) at ddTS[0.c58] : BLOCKED 0x3f7764e88 3 [0xbbda2][0x0],[TM] [9A000-0001-000004E0] 0 BLOCKER 0x3fedb9208 3 [0xbbda2][0x0],[TM] [71000-0004-0000058E] 3 BLOCKED 0x3fedb9208 4 [0xbbda2][0x0],[TM] [71000-0004-0000058E] 3 BLOCKER 0x3f7764e88 4 [0xbbda2][0x0],[TM] [9A000-0001-000004E0] 0 *** 2008-10-30 10:21:13.429 |
找到77502行有信息: user session for deadlock lock 0x3f7764e88
而在77511行有SQL语句: UPDATE EXPENDITUREFORM SET SERIES=NVL……
这两行信息提示我们: 用户在执行费用单审批的时候遇到了死锁. 那么究竟遇到了怎样的死锁情况呢, 我们搜索 0x3f7764e88 (备注: 0x3f7764e88是lock point) ,不停的按F3搜索后,我们搜索到在 77704行开始有如下信息:
Global Wait-For-Graph(WFG) at ddTS[0.c58] : BLOCKED 0x3f7764e88 3 [0xbbda2][0x0],[TM] [9A000-0001-000004E0] 0 BLOCKER 0x3fedb9208 3 [0xbbda2][0x0],[TM] [71000-0004-0000058E] 3 BLOCKED 0x3fedb9208 4 [0xbbda2][0x0],[TM] [71000-0004-0000058E] 3 BLOCKER 0x3f7764e88 4 [0xbbda2][0x0],[TM] [9A000-0001-000004E0] 0 |
这个信息提示我们,当时的死锁情况是这样的: 节点0上的费用审批程序尝试获得一个Mode=3 的TM锁 0x3f7764e88, 但这个锁被节点3上的 TM 锁 0x3fedb9208阻塞. 而0x3fedb9208 又反过来被节点0上的0x3f7764e88阻塞,这就形成了死锁.
那么怎么解决呢? 首先我们要搞清楚 UPDATE EXPENDITUREFORM SET SERIES=NVL…… 为什么会尝试获得一个排它型的TM锁? 可能的原因是 它更新的表存在子表,而子表上的外键上没有建立索引. 这种情况往往是最常见的原因.
我们打开EXPENDITUREFORM的子表 : EXPENDITUREFORMST查看,发现确实在字段EXPENDITUREFORMID上存在外键,而且在这个外键上不存在索引.
解决方法:
尝试在EXPENDITUREFORMST.EXPENDITUREFORMID上建立索引.
相关知识可参考:http://blog.retailsolution.cn/index.php/oracle-db-deadlock
关于作者:
昵称:商云方 档案信息:顾问, HAND张江技术中心 联系方式:你可以通过yunfang.shang@hand-china.com联系作者 点击查看商云方发表过的所有文章... 本文永久链接: http://blog.retailsolution.cn/archives/2080 |
对本文的评价: