DFS Lock Handle During RMAN Backup on Standby Database

Hi all!
Some time ago I faced some session on wait event DFS Lock Handle during a RMAN Backup on Standby Database. Btw, running backup on a Standby is a very interesting approach to avoid running backups on primary, so all nodes can be fully dedicated to application ends.

Turn that in my situation I noticed there was a lock with my apply process. The fix? Quite simple: Cancel apply process, run backup, and restart apply again. In my case, using DG Broker:

dgmgrl
connect / 
show configuration
show database 
edit database  set state = 'apply-off';

— Run Backup

edit database set state = 'apply-on';

 

Hope it helps!
Cheers!

Oracle Database Lock Performance / False Locks

Hello all!
This is a very interesing one… what about database taking too long or even showing locks that don’t really exists?
I faced some weird situations related to the size of DBMS_LOCK_ALLOCATED.

After some research, seems root cause is descripted in MOS DBMS_LOCK_ALLOCATED Table Keeps Growing Eventually Depleting Tablespace (Doc ID 1054343.6).

The issue is:
– DBMS_LOCK_ALLOCATED is the table that keep all locks on database.
– This table keeps growing even though many of the locks are probably no longer used.
– Upon checking the EXPIRED column, it reveals that the locks’ expiration date is a year or more in the future.

Besides the mentioned “storage space issue” in MOS Doc, there is also the situation that we have actually 1073741823 “lockhandlers” available. And what if this ‘not releasing’ cause we run out of handlers? Unlike, but possible.
Having a big table can also cause bad performance on DBMS_LOCK.REQUEST, that is the basic mechanism for locks (passing the handler, lockmode, timeout and release instructions).

Continue reading

Oracle Database 9i: Where is my SQL_ID? Which SQL is in lock?

Struggling with that, right?
As you know, in Oracle Database 9i the view V$SESSION doesn’t have SQL_ID column…
So how to map SQLs in my database? And, for example, how to get the SQLs causing a lock?

In the end of the day, the SQL_ID is only a representation of the hash_value of an SQL. You can even make the translation from SQL_ID to Hash Value as you can check on this post by Tanel Poder.

Ok, but I have to map which sql is causing the lock in my 9i database, how can I do that?

Here it goes:

If session status is ACTIVE:

SELECT s1.sql_text from v$sqlarea s1,v$session s2 where s2.SID=&sid and s2.SQL_ADDRESS = s1.ADDRESS

If session status is INACTIVE:

SELECT s1.sql_text from v$sqlarea s1,v$session s2 where s2.SID=&sid and s2.prev_sql_addr = s1.ADDRESS

You’re welcome! 😉

See you next week!

Solving Simple Locks Through @lock2s and @killlocker

Hi guys!
This post is to show the most simple and most common kind of locks for objects and the simpliest way to solve it (killing the locker). 🙂
It’s so common that I scripted it. Take a look:

greporadb> @lock2s
 Inst        SID SERIAL# UserName  STATUS   LOGON_TIME          LMODE  REQUEST LC_ET TY       ID1        ID2       CTIME LOCKWAIT         EVENT                                                                                           
----- ---------- ------- --------- -------- ------------------- ------ ------- ----- -- ---------- ---------- ---------- ---------------- -----------------------------------                                                             
    1        354   18145 MATHEUS   ACTIVE   17/06/2016 14:25:19 X      NONE    4032  TX     393238     424490        715 00000000DB0DF900 enq: TX - row lock contention                                                                   
    1        169   25571 GREPORA   ACTIVE   17/06/2016 14:22:48 NONE   X        714  TX     393238     424490        714 00000000DB0D5ED8 enq: TX - row lock contention                                                                   
    1        252   63517 MATHEUS   INACTIVE 17/06/2016 14:17:49 X      NONE     714  TX     655363    1550347       4195                  SQL*Net message from client                                                                     
    1        846   65011 GREPORA   ACTIVE   17/06/2016 14:20:18 NONE   X       4075  TX     655363    1550347        715 00000000DB0ECB88 enq: TX - row lock contention                                                                   
    1        354   18145 GREPORA   ACTIVE   17/06/2016 14:25:19 NONE   S       4032  TX     655363    1550347        715 00000000DB0DF900 enq: TX - row lock contention                                                                   
5 rows selected.

You can identify the Locker by LMODE column. And all his Waiters by REQUEST column marked by not ‘NONE’, below each Locker…

So, let’s kill the lockers:

greporadb> @killlocker
'ALTERSYSTEMKILLSESSION'''||SID||','||SERIAL#||'''IMMEDIATE;'                                                                                                                                                                                                                                                                                                                                          
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------                                                                                                                                                                                                                       
alter system kill session '252,63517' immediate;                                                                                                                                                                                                                                                                                                                                                       
alter system kill session '354,18145' immediate;                                                                                                                                                                                                                                                                                                                                                       
2 rows selected.
greporadb> alter system kill session '252,63517' immediate;      
System altered.
greporadb> alter system kill session '354,18145' immediate;      
System altered.
greporadb> @lock2s
no rows selected

Solved!
My magic scripts? Here it goes:

get lock2s.sql:

set lines 10000
set trimspool on
col serial# for 999999
col lc_et for 999999
col l1name for a50
col lmode for a6
col username for a25
select /*+ rule */
distinct b.inst_id,a.sid,b.serial#,b.username,b.status,
--b.audsid,
--b.module,
--b.machine,b.osuser,
b.logon_time,
decode(lmode,1,'null',2,'RS',3,'RX',4,'S',5,'SRX',6,'X',0,'NONE',lmode) lmode,
decode(request,1,'null',2,'RS',3,'RX',4,'S',5,'SRX',6,'X',0,'NONE',request) request,
b.last_call_et LC_ET,a.type TY,a.id1,a.id2,
d.name||'.'||c.name l1name,a.ctime,b.lockwait,b.event
--distinct b.inst_id,a.sid,b.username,a.type,d.name||'.'||c.name l1name,a.id1,a.id2,
--decode(lmode,1,'null',2,'RS',3,'RX',4,'S',5,'SRX',6,'X',0,'NONE',lmode) lmode,
--decode(request,1,'null',2,'RS',3,'RX',4,'S',5,'SRX',6,'X',0,'NONE',request) request,a.ctime,b.lockwait,b.last_call_et
from gv$lock a, gv$session b,sys.obj$ c,sys.user$ d,(select a.id1 from gv$lock a where a.request  0) lock1
where a.id1    = c.OBJ# (+)
and a.sid      = b.sid
and c.owner#   = d.user# (+)
and a.inst_id=b.inst_id
and b.username is not null
and a.id1 = lock1.id1
order by id1,id2, lmode desc
/

get killlocker.sql:

select 'alter system kill session '''||sid||','||serial#||''' immediate;' 
from v$session where sid in (select BLOCKING_SESSION from v$session where BLOCKING_SESSION is not null);

Now you can put in your Linkedin you are a JR DBA… 😛
haha

Matheus.

Lock by DBLink – How to locate the remote session?

And if you identify a lock or other unwanted operation by a DBLink session, how to identify the original session in remote database (origin dabatase)?
The one million answer is simple: by process of v$session. By the way, looks like is easier than find the local process (spid)… Take a look in my example (scripts in the end of post):

dest> @sid
Sid:10035
Inst:1
SEQ# EVENT MODULE STATUS SID SERIAL# INST_ID
----- --------- ---------- ---------- ---------- ---------- ----------
29912 SQL*Net message from client oracle@origin2(TNS V1-V3) INACTIVE 10035 35 1
dest> @spid
SPID SID PID PROCESS_FOR_DB_LINK MACHINE LOGON_TIME
------ ---------- ---------- ----------- ----------- -----------
16188960 10035 882 17302472 origin2 24/08/2015 07:43:40

Now I know the sid 10035 refers to local process 16188960 and the process on origin database is 17302472. What I do what I want if this process:

root@origin2:/oracle/diag/rdbms/origin/origin2/trace>ps -ef |grep 17302472
grid 17302472 1 97 07:42:42 - 5:58 oracleorigin2 (LOCAL=NO)
root 24445782 36700580 0 08:05:45 pts/3 0:00 grep 17302472

What include to locae the session in the database by spid, see the sql, and etecetera:

origin> @spid2
Enter value for process: 17302472
SID SERIAL# USERNAME OSUSER PROGRAM STATUS
------- ---------- ----------- ----------- --------------- ----------
7951 41323 USER_XPTO scheduler_user sqlplus@scheduler_app.domain.net (TNS V1-V3) ACTIVE
database2> @sid
Sid:7951
Inst: 2
SQL_ID SEQ# EVENT MODULE STATUS SID SERIAL# INST_ID
---------- ----- --------- ------- --------- ----- ------ ----------
1w1wz2mdunya1 56778 db file sequential read REMOTE_LOAD ACTIVE 7951 41323 2

That’s OK?
Simple isn’t?

The used Scripts (except the “sid”, that is a simple SQL on gv$session):

Get SPID and PROCESS FOR DBLINK from a SID:

# spid:
col machine format a30
col process format 999999
select p.spid,b.sid, p.pid, b.process as process_for_db_link, machine, logon_time
from v$session b, v$process p
where b.paddr=p.addr
and sid=&sid
/

Get SID from SPID:

#spid2:
SELECT s.sid, s.serial#, s.username,
s.osuser, s.program, s.status,
FROM v$session s, v$process p
WHERE s.paddr = p.addr
AND p.spid IN (&process);
/

See ya!
Matheus.

Application Looping Until Lock a Row with NOWAIT Clause

Yesterday I treated an interesting situation:
A BATCH stayed on “SQL*Net message from client” event but the last_call_et was always on 1 or 0. Seems OK, with some client contention to send the commands to the DBMS, right? Nope.

It was caused by a loop in the application code “waiting” for a row lock but without “DBMS waiting events” (something like “select * from table for update nowait”). Take a look in how it was identified below.

First the session with no SQL_ID, no wait events and last_Call_et=0 of a “BATH_PROCESS” user:

proddb2> @sid
Sid:9796
Inst:
LAST_CALL_ET SQL_ID   EVENT STATUS SID SERIAL# INST_ID USERNAME
------------ ------- ------------- ---------- ------------------------
0 SQL*Net message from client INACTIVE 9796 45117 2 BATCH_PROCESS
proddb2> @trace
Enter value for sid: 9796
Enter value for serial: 45117
PL/SQL procedure successfully completed.

As you see, with no idea about what is happening, I started a trace. The trace was stuck with this:

*** 2015-06-15 14:03:25.755
WAIT #4574470448: nam='SQL*Net message from client' ela=993072 driver id=1413697536 #bytes=1 p3=0 obj#=23141074 tim=12833326636999
CLOSE #4574470448:c=10,e=15,dep=0,type=3,tim=12833326637228
PARSE #4574470448:c=25,e=41,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1139820409,tim=12833326637286
BINDS #4574470448:
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=01 fl2=1000000 frm=00 csi=00 siz=24 off=0
kxsbbbfp=110a8d0d8 bln=22 avl=05 flg=05
value=5022011
WAIT #4574470448: nam='gc cr block 2-way' ela= 709 p1=442 p2=5944 p3=8483 obj#=0 tim=12833326638533
WAIT #4574470448: nam='gc cr block 2-way' ela= 541 p1=3 p2=2088264 p3=4367 obj#=0 tim=12833326639352
WAIT #4574470448: nam='gc cr block 2-way' ela= 651 p1=442 p2=5944 p3=8483 obj#=0 tim=12833326641673
WAIT #4574470448: nam='enq: TX - row lock contention' ela= 1093 name|mode=1415053318 usn<obj#=23141074 tim=12833326643029
EXEC #4574470448:c=1776,e=5836,p=0,cr=117,cu=1,mis=0,r=0,dep=0,og=1,plh=1139820409,tim=12833326643150
ERROR #4574470448:err=54 tim=12833326643172
WAIT #4574470448: nam='SQL*Net break/reset to client' ela= 9 driver id=1413697536 break?=1 p3=0 obj#=23141074 tim=12833326643373
WAIT #4574470448: nam='SQL*Net break/reset to client' ela= 503 driver id=1413697536 break?=0 p3=0 obj#=23141074 tim=12833326643891
WAIT #4574470448: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=23141074 tim=12833326643915

AHÁ!
Did you see the “err=54” there? Yes. You know this error:

ORA-00054: Resource busy and acquire with NOWAIT specified

It’s caused by a SELECT FOR UPDATE NOWAIT in the code.
But, this select is in a loop, so the session don’t go ahead until have it.
(Obviously it could be coded with some treatment/better logic for this loop and errors, buuuut…)

What can we do now?
The easy way is to discover the holding session and kill it.
And sometimes the easy way is the best way. 😉

For that, we use the “obj#” and “value”, also bolded in the trace.
As I know the application, I know that the used field in all “where clauses” is the “RECNO” column. But if you don’t, it’s needed to discover. With this information in mind:

proddb2>select * from dba_objects where object_id='23141074';
OWNER OBJECT_NAME
------------------------------ ----------------
OWNER_EXAMPLE TABLE_XPTO
proddb2> select * from OWNER_EXAMPLE.TABLE_XPTO WHERE recno=5022011;
COL_KEY FSAMED0 FSAMED1 FSMNEG1 FSMNEG2 FSMNEG3 COL_DATE RECNO
------- ---------- ---------- ---------- ---------- ---------- -----
1002974 0 0 -516.8 0 0 15/06/2015 00:00:00 5022011

Ok, I know the row that is holded by the other session.
Let’s discover which session is causing a lock by myself (but in my case, without “NOWAIT” clause, to have time to find the holder):

proddb5>select * from OWNER_EXAMPLE.TABLE_XPTO WHERE recno=5022011 for update;

In another sqlplus session:

proddb2> @me
INST_ID SID SERIAL# USERNAME EVENT BLOCKING_SE BLOCKING_SESSION BLOCKING_INSTANCE
------- ---------- ---------- --------------- ----------------------
5 14174 479 MATHEUS_BOESING enq:TX - row lock contention VALID 11006 1
2 4233 12879 MATHEUS_BOESING PX Deq: Execution Msg NOT IN WAIT
1 15410 7697 MATHEUS_BOESING PX Deq: Execution Msg NOT IN WAIT

AHÁ again!
The SID 11006. Let’s see who is there:

proddb2> @sid
Sid:11006
Inst:
SQL_ID SEQ# EVENT STATUS SID SERIAL# INST_ID USERNAME
-------------------- ---------- --------------------------------------
9jzm6vn5j06js 24919 enq: TX - row lock contention ACTIVE 11006 44627 1 DBLINK_OTHER_BATCH_SCHEMA

Ok, it’s another session of a different batch process in a remote database holding this row. As it’s less relevant, lets kill! Muahaha!
Then, you’ll see, my session get the lock and is in the middle of a transaction:

proddb1> @kill
***
sid : 11006
serial : 44627
***
System altered.
***
proddb1> @me
INST_ID SID SERIAL# USERNAME EVENT BLOCKING_SE BLOCKING_SESSION BLOCKING_INSTANCE
------- ---------- ---------- --------------- --------------------
5 14174 479 MATHEUS_BOESING transaction UNKNOWN
2 4332 56037 MATHEUS_BOESING PX Deq: Execution Msg NOT IN WAIT
1 12058 9 MATHEUS_BOESING class slave wait NO HOLDER

To release the “row locked” to my principal process, lets suicide (kill my own session, this case, that is holding the row lock right now).

proddb5> @kill
***
sid : 14174
serial : 479
***
System altered.
***

After kill all the holding sessions, my BATCH_PROCESS just gone! 😀
Take a look on the trace (running ok):

WAIT #4576933904: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=23141074 tim=12833981531019
FETCH #4576933904:c=45,e=71,p=0,cr=3,cu=0,mis=0,r=5,dep=0,og=1,plh=419358542,tim=12833981531062
WAIT #4576933904: nam='SQL*Net message from client' ela= 562 driver id=1413697536 #bytes=1 p3=0 obj#=23141074 tim=12833981531654
WAIT #4576933904: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=23141074 tim=12833981531788
FETCH #4576933904:c=55,e=86,p=0,cr=2,cu=0,mis=0,r=5,dep=0,og=1,plh=419358542,tim=12833981531826
WAIT #4576933904: nam='SQL*Net message from client' ela= 715 driver id=1413697536 #bytes=1 p3=0 obj#=23141074 tim=12833981532576
WAIT #4576933904: nam='SQL*Net message to client' ela= 4 driver id=1413697536 #bytes=1 p3=0 obj#=23141074 tim=12833981532721
FETCH #4576933904:c=61,e=96,p=0,cr=2,cu=0,mis=0,r=5,dep=0,og=1,plh=419358542,tim=12833981532758
WAIT #4576933904: nam='SQL*Net message from client' ela= 600 driver id=1413697536 #bytes=1 p3=0 obj#=23141074 tim=12833981533617
WAIT #4576933904: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=23141074 tim=12833981534163
FETCH #4576933904:c=52,e=82,p=0,cr=2,cu=0,mis=0,r=5,dep=0,og=1,plh=419358542,tim=12833981534203
WAIT #4576933904: nam='SQL*Net message from client' ela= 517 driver id=1413697536 #bytes=1 p3=0 obj#=23141074 tim=12833981534752

Now, with the problem solved, lets disable the trace and continue the other daily tasks… 🙂

proddb2> @untrace
Enter value for sid: 9796
Enter value for serial: 45117
PL/SQL procedure successfully completed.

I hope it was useful!
If helped you, make a comment! 😀

See ya!
Matheus.