SYSTEMSTATE DUMP

Hi guys!
I already posted about Hang Analyze (part1, part2) and ASHDUMP. Now, in the same ‘package’, let me show you about SYSTEMSTATE DUMP.

Systemstate is basically made by the process state for all process in instance (or instances) at the time the systemstate is called.
Through a systemstate it’s possible to identify enqeues, rowcache locks, mutexes, library cache pins and locks, latch free situations, and other kind of chains.

It’s a good tool to add in a SR, but it’s quite hard to habituate on reading/interpreting the file. To undertand exactly how to read a systemstate I’d recommend you the best: Read the manual!
The doc Reading and Understanding Systemstate Dumps (Doc ID 423153.1) has a very good explanation with examples, I’m not able to to it better. ūüôā

What I can do is share about the SYSTEMSTATE levels. I had some difficult to find it…
But before I show how you can do it, it‚Äôs important to mention that Oracle does not recommend you to use ‚Äėnumeric events‚Äô without a SR (MOS), according to Note: 75713.1.

So, systemstate dump has several levels:

Level Content
2 dump (not including the lock element)
10 dump
11 dump + global cache of RAC
256 short stack (function stack)
258 256 + 2 -> short stack + dump (not including the lock element)
266 256 + 10 -> short stack + dump
267 256 + 11 -> short stack + dump + global cache of RAC

Levels 11 and 267 will dump global cache, will generate a large trace file, under normal circumstances is not recommended.

Under normal circumstances, if the process is not too much, it is recommended to use 266 because it can dump out the process of the function stack, it can be used to analyze the process in what to do.
But the more time-consuming to generate short stack, if the process is very large, such as the 2000 process, it may take more than 30 minutes. In this case, you can generate a level 10 or level 258, level 258 will collect more than level 10 short short stack, but less than level 10 to collect some lock element data.

In addition to the RAC system, please pay attention Bug 11800959 – A SYSTEMSTATE dump with level> = 10 in RAC dumps huge BUSY GLOBAL CACHE ELEMENTS – can hang / crash instances (Doc ID 11800959.8). The Bug is fixed in the 11.2.0.3. For versions <= 11.2.0.2 of the RAC, when the system lock an element for a bunch of time, when executing level 10, 266 or 267 of systemstate dump, it can cause database hang or crash. It may be solved by using level 258.

To generate it:

oradebug setmypid;
oradebug unlimit;
oradebug dump systemstate 266
oradebug tracefile_name

OR, for all cluster:

oradebug setmypid;
oradebug unlimit;
oradebug -g all dump systemstate 266
oradebug tracefile_name

An example of execution:

SQL> oradebug setmypid;
Statement processed.
SQL> oradebug unlimit;
Statement processed.
SQL> oradebug dump systemstate 266
Statement processed.
SQL> oradebug tracefile_name
/db/oracle/diag/rdbms/grepora/GREPORA/trace/GREPORA_ora_18256.trc

If you cannot create a connection in database with SQLPlus (even as SYSDBA), regarding it‚Äôs a hang situation, you can use an preliminar connection, like shown in the post ‚ÄúHANGANALYZE Part 2‚ÄĚ.

I’d recommend you to read:
How to Collect Systemstate Dumps When you Cannot Connect to Oracle (Doc ID 121779.1)
Important Customer information about using Numeric Events (Doc ID 75713.1)

An example of a SYSTEMSTATE level 266 dumpfile:

*** 2016-06-15 16:59:00.180
Processing Oradebug command 'dump systemstate 266'
===================================================
SYSTEM STATE (level=10, with short stacks)
------------
System global information:
     processes: base 0xbb5b6850, size 1000, cleanup 0xbb63b780
     allocation: free sessions 0xbba6fb48, free calls (nil)
     control alloc errors: 0 (process), 0 (session), 0 (call)
     PMON latch cleanup depth: 0
     seconds since PMON's last scan for dead processes: 39
     system statistics:
0 OS CPU Qt wait time
727657760 Requests to/from client
7106305 logons cumulative
274 logons current
231675649 opened cursors cumulative
1737 opened cursors current
43388218 user commits
1682106 user rollbacks
835078962 user calls
752889391 recursive calls
12720247 recursive cpu usage
150 pinned cursors current
15100586388 session logical reads
0 session logical reads in local numa group
0 session logical reads in remote numa group
[...]
PROCESS 1:
  ----------------------------------------
  SO: 0xbb63a6d0, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3
   proc=0xbb63a6d0, name=process, file=ksu.h LINE:12616, pg=0
  (process) Oracle pid:1, ser:0, calls cur/top: (nil)/(nil)
            flags : (0x20) PSEUDO
            flags2: (0x0),  flags3: (0x10)
            intr error: 0, call error: 0, sess error: 0, txn error 0
            intr queue: empty
    ksudlp FALSE at location: 0
  (post info) last post received: 0 0 0
              last post received-location: No post
              last process to post me: none
              last post sent: 0 0 0
              last post sent-location: No post
              last process posted by me: none
    (latch info) wait_event=0 bits=0
    O/S info: user: , term: , ospid:  (DEAD)
    OSD pid info: Unix process pid: 0, image: PSEUDO
    ----------------------------------------
    SO: 0x6000c838, type: 5, owner: 0xbb63a6d0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
     proc=(nil), name=kss parent, file=kss2.h LINE:138, pg=0
  PSO child state object changes :
Dump of memory from 0x00000000BB5BA8E8 to 0x00000000BB5BAAF0
0BB5BA8E0                   00000000 00000000          [........]
0BB5BA8F0 00000000 00000000 00000000 00000000  [................]
  Repeat 31 times
PROCESS 2: PMON
  ----------------------------------------
  SO: 0xbb63b780, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3
   proc=0xbb63b780, name=process, file=ksu.h LINE:12616, pg=0
  (process) Oracle pid:2, ser:1, calls cur/top: 0xbaea88a8/0xbaea88a8
            flags : (0xe) SYSTEM
            flags2: (0x0),  flags3: (0x10)
            intr error: 0, call error: 0, sess error: 0, txn error 0
            intr queue: empty
    ksudlp FALSE at location: 0
  (post info) last post received: 0 0 16
              last post received-location: ksu.h LINE:13945 ID:ksupsc
              last process to post me: bb75a490 25 0
              last post sent: 0 0 19
              last post sent-location: ksu.h LINE:13957 ID:ksuxfd
              last process posted by me: bb63b780 1 14
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 0xbba5e848
    O/S info: user: oracle, term: UNKNOWN, ospid: 14438
    OSD pid info: Unix process pid: 14438, image: oracle@devdb09 (PMON)
    Short stack dump:
[...]

Matheus.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s