I already posted about Hang Analyze (part1, part2) and ASHDUMP. Now, in the same ‘package’, let me show you about SYSTEMSTATE DUMP.
Systemstate is basically made by the process state for all process in instance (or instances) at the time the systemstate is called.
Through a systemstate it’s possible to identify enqeues, rowcache locks, mutexes, library cache pins and locks, latch free situations, and other kind of chains.
It’s a good tool to add in a SR, but it’s quite hard to habituate on reading/interpreting the file. To undertand exactly how to read a systemstate I’d recommend you the best: Read the manual!
The doc Reading and Understanding Systemstate Dumps (Doc ID 423153.1) has a very good explanation with examples, I’m not able to to it better. 🙂
What I can do is share about the SYSTEMSTATE levels. I had some difficult to find it…
But before I show how you can do it, it’s important to mention that Oracle does not recommend you to use ‘numeric events’ without a SR (MOS), according to Note: 75713.1.
So, systemstate dump has several levels:
|2||dump (not including the lock element)|
|11||dump + global cache of RAC|
|256||short stack (function stack)|
|258||256 + 2 -> short stack + dump (not including the lock element)|
|266||256 + 10 -> short stack + dump|
|267||256 + 11 -> short stack + dump + global cache of RAC|
Levels 11 and 267 will dump global cache, will generate a large trace file, under normal circumstances is not recommended.
Under normal circumstances, if the process is not too much, it is recommended to use 266 because it can dump out the process of the function stack, it can be used to analyze the process in what to do.
But the more time-consuming to generate short stack, if the process is very large, such as the 2000 process, it may take more than 30 minutes. In this case, you can generate a level 10 or level 258, level 258 will collect more than level 10 short short stack, but less than level 10 to collect some lock element data.
In addition to the RAC system, please pay attention Bug 11800959 – A SYSTEMSTATE dump with level> = 10 in RAC dumps huge BUSY GLOBAL CACHE ELEMENTS – can hang / crash instances (Doc ID 11800959.8). The Bug is fixed in the 22.214.171.124. For versions <= 126.96.36.199 of the RAC, when the system lock an element for a bunch of time, when executing level 10, 266 or 267 of systemstate dump, it can cause database hang or crash. It may be solved by using level 258.
To generate it:
oradebug setmypid; oradebug unlimit; oradebug dump systemstate 266 oradebug tracefile_name
OR, for all cluster:
oradebug setmypid; oradebug unlimit; oradebug -g all dump systemstate 266 oradebug tracefile_name
An example of execution:
SQL> oradebug setmypid; Statement processed. SQL> oradebug unlimit; Statement processed. SQL> oradebug dump systemstate 266 Statement processed. SQL> oradebug tracefile_name /db/oracle/diag/rdbms/grepora/GREPORA/trace/GREPORA_ora_18256.trc
If you cannot create a connection in database with SQLPlus (even as SYSDBA), regarding it’s a hang situation, you can use an preliminar connection, like shown in the post “HANGANALYZE Part 2”.
I’d recommend you to read:
How to Collect Systemstate Dumps When you Cannot Connect to Oracle (Doc ID 121779.1)
Important Customer information about using Numeric Events (Doc ID 75713.1)
An example of a SYSTEMSTATE level 266 dumpfile:
*** 2016-06-15 16:59:00.180 Processing Oradebug command 'dump systemstate 266' =================================================== SYSTEM STATE (level=10, with short stacks) ------------ System global information: processes: base 0xbb5b6850, size 1000, cleanup 0xbb63b780 allocation: free sessions 0xbba6fb48, free calls (nil) control alloc errors: 0 (process), 0 (session), 0 (call) PMON latch cleanup depth: 0 seconds since PMON's last scan for dead processes: 39 system statistics: 0 OS CPU Qt wait time 727657760 Requests to/from client 7106305 logons cumulative 274 logons current 231675649 opened cursors cumulative 1737 opened cursors current 43388218 user commits 1682106 user rollbacks 835078962 user calls 752889391 recursive calls 12720247 recursive cpu usage 150 pinned cursors current 15100586388 session logical reads 0 session logical reads in local numa group 0 session logical reads in remote numa group [...] PROCESS 1: ---------------------------------------- SO: 0xbb63a6d0, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=0xbb63a6d0, name=process, file=ksu.h LINE:12616, pg=0 (process) Oracle pid:1, ser:0, calls cur/top: (nil)/(nil) flags : (0x20) PSEUDO flags2: (0x0), flags3: (0x10) intr error: 0, call error: 0, sess error: 0, txn error 0 intr queue: empty ksudlp FALSE at location: 0 (post info) last post received: 0 0 0 last post received-location: No post last process to post me: none last post sent: 0 0 0 last post sent-location: No post last process posted by me: none (latch info) wait_event=0 bits=0 O/S info: user: , term: , ospid: (DEAD) OSD pid info: Unix process pid: 0, image: PSEUDO ---------------------------------------- SO: 0x6000c838, type: 5, owner: 0xbb63a6d0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=(nil), name=kss parent, file=kss2.h LINE:138, pg=0 PSO child state object changes : Dump of memory from 0x00000000BB5BA8E8 to 0x00000000BB5BAAF0 0BB5BA8E0 00000000 00000000 [........] 0BB5BA8F0 00000000 00000000 00000000 00000000 [................] Repeat 31 times PROCESS 2: PMON ---------------------------------------- SO: 0xbb63b780, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=0xbb63b780, name=process, file=ksu.h LINE:12616, pg=0 (process) Oracle pid:2, ser:1, calls cur/top: 0xbaea88a8/0xbaea88a8 flags : (0xe) SYSTEM flags2: (0x0), flags3: (0x10) intr error: 0, call error: 0, sess error: 0, txn error 0 intr queue: empty ksudlp FALSE at location: 0 (post info) last post received: 0 0 16 last post received-location: ksu.h LINE:13945 ID:ksupsc last process to post me: bb75a490 25 0 last post sent: 0 0 19 last post sent-location: ksu.h LINE:13957 ID:ksuxfd last process posted by me: bb63b780 1 14 (latch info) wait_event=0 bits=0 Process Group: DEFAULT, pseudo proc: 0xbba5e848 O/S info: user: oracle, term: UNKNOWN, ospid: 14438 OSD pid info: Unix process pid: 14438, image: oracle@devdb09 (PMON) Short stack dump: [...]