Exadata: Cell Server Crashing on ORA-00600: [LinuxBlockIO::reap]

Hi all,

So I started facing this in a client environment. Here is the alert message:

Target name=db01cel08.xxx.com
Message=ORA-00600: internal error code, arguments: [LinuxBlockIO::reap], [0x60000D502388], [], [], [], [], [], [], [], [], [], []
Event reported time=Dec 19, 2019 2:14:16 AM EDT

When checking on the cellserver I see this message:

[root@db01 ~]# ssh db01cel08
Last login: Thu Dec 19 04:45:13 2019 from db01.xxx.com
[root@db01cel08 ~]# cellcli
CellCLI: Release - Production on Fri Dec 19 17:13:31 EDT 2019

Copyright (c) 2007, 2016, Oracle. All rights reserved.



name: 10
alertDescription: "ORA-07445: exception encountered: core dump [__intel_new_memset()+62] [11] [0x000000000] [] [] []"
alertMessage: "ORA-07445: exception encountered: core dump [__intel_new_memset()+62] [11] [0x000000000] [] [] []"
alertSequenceID: 10
alertShortName: ADR
alertType: Stateless
beginTime: 2019-12-19T02:00:04-04:00
notificationState: 1
sequenceBeginTime: 2019-12-19T02:00:04-04:00
severity: critical
alertAction: "Errors in file /opt/oracle/cell/log/diag/asm/cell/SYS_112331_170406/trace/cellofltrc_19796_53.trc (incident=25). Diagnostic package is attached. It is also accessible at https://db01cel08.xxx.com/diagpack/download?name=db01cel08_2019_12_19T02_00_04_10.tar.bz2 It will be retained on the storage server for 7 days. If the diagnostic package has expired, then it can be re-created at https://db01cel08.xxx.com/diagpack"

name: 11
alertDescription: "ORA-00600: internal error code, arguments: [LinuxBlockIO::reap], [0x60000D502388], [], [], [], [], [], [], [], [], [], []"
alertMessage: "ORA-00600: internal error code, arguments: [LinuxBlockIO::reap], [0x60000D502388], [], [], [], [], [], [], [], [], [], []"
alertSequenceID: 11
alertShortName: ADR
alertType: Stateless
beginTime: 2019-12-19T02:00:04-04:00
notificationState: 1
sequenceBeginTime: 2019-12-19T02:00:04-04:00
severity: critical
alertAction: "Errors in file /opt/oracle/cell/log/diag/asm/cell/db01cel08/trace/svtrc_9174_12.trc (incident=25). Diagnostic package is attached. It is also accessible at https://db01cel08.xxxx.com/diagpack/download?name=jdb01cel08_2019_12_19T02_00_04_11.tar.bz2 It will be retained on the storage server for 7 days. If the diagnostic package has expired, then it can be re-created at https://db01cel08.xxx.com/diagpack"

name: 12_1
alertDescription: "A SQL PLAN quarantine has been added"
alertMessage: "A SQL PLAN quarantine has been added. As a result, Smart Scan is disabled for SQL statements with the quarantined SQL plan. Quarantine id : 21 Quarantine type : SQL PLAN Quarantine reason : Crash Quarantine Plan : SYSTEM Quarantine Mode : FULL_Quarantine DB Unique Name : XPTODB Incident id : 25 SQLID : 8j0az9sgxs5yh SQL Plan details : {SQL_PLAN_HASH_VALUE=281152830, PLAN_LINE_ID=9} In addition, the following disk region has been quarantined, and Smart Scan will be disabled for this region: Disk Region : {Grid Disk Name=Unknown, offset=186750337024, size=1M} "
alertSequenceID: 12
alertShortName: Software
alertType: Stateful
beginTime: 2019-12-19T02:00:12-04:00
metricObjectName: QUARANTINE/21
notificationState: 1
sequenceBeginTime: 2019-12-19T02:00:12-04:00
severity: critical
alertAction: "A SQL statement caused the Cell Server (CELLSRV) service on the cell to crash. A SQL PLAN quarantine has been created to prevent the same SQL statement from causing the same cell to crash. When possible, disable offload for the SQL statement or apply the RDBMS patch that fixes the crash, then remove the quarantine with the following CellCLI command: CellCLI> drop quarantine 21 All quarantines are automatically removed when a cell is patched or upgraded. For information about how to disable offload for the SQL statement, refer to the section about 'SQL Processing Offload' in Oracle Exadata Storage Server User's Guide. Diagnostic package is attached. It is also accessible at https://db01cel08.xxx.com/diagpack/download?name=db01cel08_2019_12_19T02_00_12_12_1.tar.bz2 It will be retained on the storage server for 7 days. If the diagnostic package has expired, then it can be re-created at https://db01cel08.xxx.com/diagpack"


After some research, we could match the situation to Bug 13245134 – Query may fail with errors ORA-27618, ORA-27603, ORA-27626 or ORA-00600[linuxblockio::reap_1] or ora-600 [cacheput::process_1]

It’s also described as per: Exadata/SuperCluster: 11.2 databases missing fix for the bug 13245134 may lead to cell service crash with ora-600 [linuxblockio::reap_1]/ora-600 [cacheput::process_1] or ORA-27626: Exadata error: 242/Smart scan issues on the RDBMS side

In order to resolve the crashes quickly, I applied the patch online with:

After applying, all got solved:

[oracle@db01 ~]$ /oracle/xptodb/product/ lsinventory -OH $ORACLE_HOME | grep 13245134
Patch (online) 13245134: applied on Thu Dec 19 23:34:50 EST 2019
[oracle@db01 ~]$

Hope it helps!

Automatic Killing Inactive Sessions with Resource Manager

Hello All!

So, your are manually (or via script) killing idle sessions on your database?

As consequence, your users are getting error “ORA-00028: your session has been killed.” and getting angry on you?

What about doing it automatically in a much more graceful way and be seen as a nicer DBA? You can do it using Resource Manager.

Are you already using Resource Manager on your database? Yes – Great!

No – Shame on you. Read this and put RM in place ASAP. This is a great tool for the database to manage the database resources, plus no additional licensing is needed. So go for it!

Ok, but what is the trick?
It’s the limit max_idle_time. You can use it either on existing groups or subgroups of your plan (or subplan) or switch to “KILL” groups with this you can even use the same criteria you’d use for any script to perform this action. For this you might user the parameter new_switch_group and have a created a different group only for those kills.

Have this option it’s nicer, if you want to avoid killing sessions on database, by the way, as you can always switch a session for the killing group manually, not demanding it to fill the requirements to automatic switching.

How? Quick example:

Creating new Plan Groups with MAX_IDLE_TIME:

dbms_resource_manager.create_plan_directive( plan => '&RM_PLAN', group_or_subplan =>
'LONG_RUN', comment => 'Limit idle time to 5 minutes', max_idle_time => 300);
dbms_resource_manager.create_plan_directive( plan => '&RM_PLAN', group_or_subplan =>
'SHORT_RUN', comment => 'Limit idle time to 1 minute', max_idle_time => 60);


And what would be the error for the user that get’s the session killed?

ORA-3113 End of file on communication channel

Much nicer: Now you are a nice DBA and don’t kill sessions anymore.
You automated it!

Some additional recommendations:

  • Use this solution for Databases above or, due some known bugs:
    • Bug 8891495 – NON-IDLE SESSIONS ARE KILLED ( only)
  • Some MOS references on it:
    • How To Automatic Kill Inactive Sessions using Resource Manager (Doc ID 1935739.1)
    • Using Resource Manager to Detect and Kill Idle Sessions (Doc ID 1557657.1)

Hope it helps!

Advise: Setting underScore/Hidden Parameters

Please be aware no hidden parameters (parameters starting with “_”) should not be set unless recommended by Oracle explicitly or in case of a perfect match to the case.

This is explained on MOS Best Practices and Guidelines for Setting underScore/Hidden Parameter to Workaround and/or Resolve Customers Problems (Doc ID 2524119.1)

So, in case of any diagnostics you have to lead to setting it, SUGGESTED:

  • Get the explicit approval/agreement from Oracle.
  • Include evidence on the root cause for this setting. A comment should work, something like:
alter system set "_replace_virtual_columns"=false Comment= 'Workaround Doc ID 1512347.1' SID= '*' scope= both;


ORA-20001: Statistics Advisor: Invalid task name for the current user

Hi all,
So, I start seeing on my alert log:

Errors in file /u01/app/oracle/diag/rdbms/cdb1/CDB1/trace/CDB1_j000_89793.trc:
ORA-12012: error on auto execute of job "SYS"."ORA$AT_OS_OPT_SY_36807"
ORA-20001: Statistics Advisor: Invalid task name for the current user
ORA-06512: at "SYS.DBMS_STATS", line 47207
ORA-06512: at "SYS.DBMS_STATS_ADVISOR", line 882
ORA-06512: at "SYS.DBMS_STATS_INTERNAL", line 20059
ORA-06512: at "SYS.DBMS_STATS_INTERNAL", line 22201
ORA-06512: at "SYS.DBMS_STATS", line 47197

After some research I found Bug 22879263 – BETA 12.2 – ORA-12012 , ORA-20001 AND ORA-06512 FROM ORA$AT_OS_OPT_SY_88 JOB.

This was suppose to be fixed on, however I’m still getting. Anyway, I executed the below and it got fixed:

EXEC dbms_stats.init_package();

Ref: Recurring ORA-12012, ORA-20001, ORA-06512 In Container Database (Doc ID 2420581.1)

Hope it helps!

Diagsnap Causing Node Eviction on 12c

Hi all,

So, you know how it is when we are having node evictions over a RAC. Lot’s of sessions getting killed, some effects for the clients and applications, also boss bossing in our shoulders to have a quick resolution over this. Plus it’s never an easy thing to drill down and understand.

Troubleshooting node reboots/evictions within Grid Infrastructure (GI) often is difficult due to the lack of Network and OS level resource information. To help circumvent this situation the diagsnap feature has been developed and integrated with Grid Infrastructure. Diagsnap is triggered to collect Network and OS level resource information when a given node is about to get evicted or when Grid Infrastructure is about to crash.

The diagsnap feature is enabled automatically starting from Oct2017 PSU and Oct2017 RU.
For more information about the diagsnap feature, refer to the Document 2345654.1 “What is diagsnap resource in 12c GI and above?”

However, after a lot of research and SR logs sending and interaction with Oracle to investigate a node eviction, we ended up finding a match to Bug 25785073 – OCSSD hangs while DIAGSNAP takes pstack causing a node reboot(Doc ID 25785073.8).

So Diagsnap is not helping, it is the cause of the issues.

After some research, seems this is not the only bug related to it. See some more:

  • Bug 27182006 – Auxiliary commands generated by DIAGSNAP spin CPU(Doc ID 27182006.8)
  • Bug 24692439 – Auxiliary commands generated by DIAGSNAP consumes high CPU(Doc ID 24692439.8)
  • Bug 23101338 – Disable diagsnap after GI PSU patch was installed(Doc ID 23101338.8)
  • Bug 28462215 – The Process diagsnap.pl is Restarted Every 2 Minutes(Doc ID 28462215.8)


At least for the 12Cs, I’m disabling it in all my environments:

– Check if diagsnap is disabled or not (DIAGSNAP=Disable if disabled, nothing if enabled)

$ egrep '^DIAGSNAP|^PSTACK' /u01/app/$(hostname -s).ora

– Disable diagsnap (this will disable diagsnap on all nodes)

$ /u01/app/ manage -disable diagsnap

– Check that diagsnap is disabled (DIAGSNAP=Disable if disabled, nothing if enabled) — to be done on each node

$ egrep '^DIAGSNAP|^PSTACK' /u01/app/$(hostname -s).ora

Hope it helps!

Automatic SQL Tuning Advisor Raising ORA-00600: internal error code, arguments: [qksvcReplaceVC0]

Hi all,

So I got to receive frequently this error, always on same hour, from a database:

ORA-00600: internal error code, arguments: [qksvcReplaceVC0], [], [], [], [], [], [], [], [], [], [], []

Not much was required for matching it to the Automatic SQL Tuning Advisor.

This only seems to happen during execution of Automatic SQL Tuning Advisor. Several bugs have been logged for the issue but have not been resolved as the error is not reproducible at will. For example:

Bug 13959984: ORA-00600 [QKSVCREPLACEVC0]

How to fix it? Apply the patches!

To workaround it?

A few options:

1. Setting “_replace_virtual_columns” to false.

You can set this parameter at both session (where automatic SQL Tuning Advisor starts)
and system level with the following commands-

SQL> alter session set "_replace_virtual_columns"=false;

SQL> alter system set "_replace_virtual_columns"=false

2. Since it is only failing in the SQL Tuning Advisor auto task and has no effect on the database the error can be ignored.
You can disable that auto task and just run it manually when required:

–check auto job status

SQL> select client_name,status from dba_autotask_task;

SQL> select client_name,status from dba_autotask_client;

SQL> select client_name, operation_name, status from dba_autotask_operation;

–disable SQL Tuning Advisor job

SQL> exec dbms_auto_task_admin.disable ('sql tuning advisor', null, null);


SQL> exec dbms_auto_task_admin.disable (client_name => 'sql tuning advisor', operation => null, window_name => null);

–enable SQL Tuning Advisor job

SQL> exec dbms_auto_task_admin.enable ('sql tuning advisor', null, null);


SQL> exec dbms_auto_task_admin.enable (client_name => 'sql tuning advisor', operation => null, window_name => null);


Hope it helps!

ORA-02056: 2PC: k2lcom: bad two-phase command number rdonly from coord

Hi all,

So, you found this error?

Well, I received this in a client environment. After checking MOS ORA-02056: 2PC: K2gCheckGlobalCommit: Bad Two-phase Command Number 1 From Coord (Doc ID 2498134.1), which points to a known issue when dealing with committing transactions as per Bug 22016049 I noticed my problem could be different…

My real root cause is my transaction being done on a remote object and when checking on remote database I found pending distributed transactions.

For more informations about distributed transactions, please check: https://docs.oracle.com/database/121/ADMIN/ds_txns.htm#ADMIN12211


So, if you getting this situation, before planning and applying the patch I suggest you:

  • Check if your process is not working on remote databases.
  • Check for any possible distributed remote transactions.

How? Easy check (on remote db):

select * from dba_2pc_pending;

I hope it helps!

PS NOTE: I just found this very good and complete post by Rodrigo Mufalani on the same topic: ORA-01591: lock held by in-doubt distributed transaction

Have a look for an Action Plan!

ORA-04030: out of process memory when trying to allocate % bytes (kkoutlCreatePh,ub1 : kkoabr)

Hi all,

So, I started seen this on a client environment. Researching on the  case after no crear reference on MOS, I noticed some high PGA allocation as per below.

SYS@proddb AS SYSDBA PROD> select pid, serial#,category, allocated/1024/1024 MB, used/1024/1024 MB_used, max_allocated/1024/1024 MB_MAX_ALLOCATED_ON_PGA
2 from v$process_memory where pid=852;

---------- ---------- --------------- ---------- ---------- -----------------------
852 91 SQL .086807251 .00806427 .672416687
852 91 PL/SQL .087730408 .078926086 .126182556
852 91 Freeable .5625 0
852 91 Other 2.25187302 2.25187302

Seems a match MOS ORA-04030 Error With High “kkoutlCreatePh” (Doc ID 1618444.1).

The solution? Simply disabled the following parameter:



Trace for additional info:

Oracle Database 12c Enterprise Edition Release - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/12.1.0/db_1
System name: Linux
Node name: proddb.local
Release: 4.1.12-94.8.3.el7uek.x86_64
Version: #2 SMP Wed Apr 25 19:57:32 PDT 2018
Machine: x86_64
Instance name: proddb
Redo thread mounted by this instance: 1
Oracle process number: 854
Unix process pid: 28899, image: oracle@proddb.local

*** 2019-01-09 11:20:09.130
*** SESSION ID:(5429.55092) 2019-01-09 11:20:09.130
*** CLIENT ID:() 2019-01-09 11:20:09.130
*** SERVICE NAME:(XXXXX) 2019-01-09 11:20:09.130
*** MODULE NAME:(JDBC Thin Client) 2019-01-09 11:20:09.130
*** CLIENT DRIVER:(jdbcthin) 2019-01-09 11:20:09.130
*** ACTION NAME:() 2019-01-09 11:20:09.130

Jump to table of contents
Dump continued from file: /u01/app/oracle/diag/rdbms/proddb/proddb/trace/proddb_ora_28899.trc
ORA-04030: out of process memory when trying to allocate 34392040 bytes (kkoutlCreatePh,ub1 : kkoabr)

========= Dump for incident 2710965 (ORA 4030) ========
----- Beginning of Customized Incident Dump(s) -----
72% 18 GB, 1356 chunks: "free memory " SQL
kxs-heap-c ds=0x7fa8df330220 dsprt=0x7fa8df49dbe0
27% 6809 MB, 3366 chunks: "permanent memory " SQL
kxs-heap-c ds=0x7fa8df330220 dsprt=0x7fa8df49dbe0
0% 100 MB, 894 chunks: "permanent memory " SQL
kkoutlCreatePh ds=0x7fa8cce16708 dsprt=0x7fa8df330220
0% 23 MB, 589515 chunks: "chedef : qcuatc "
TCHK^2a9688d9 ds=0x7fa8df33feb8 dsprt=0x7fa8df49c9e0
0% 18 MB, 150124 chunks: "opndef: qcopCreateOpnViaM "
TCHK^2a9688d9 ds=0x7fa8df33feb8 dsprt=0x7fa8df49c9e0
0% 16 MB, 214829 chunks: "logdef: qcopCreateLog "
TCHK^2a9688d9 ds=0x7fa8df33feb8 dsprt=0x7fa8df49c9e0
0% 11 MB, 241 chunks: "free memory "
top call heap ds=0x7fa8df49dbe0 dsprt=(nil)
0% 9643 KB, 4623 chunks: "qkkele " SQL
kxs-heap-c ds=0x7fa8df330220 dsprt=0x7fa8df49dbe0
0% 6534 KB, 58578 chunks: "optdef: qcopCreateOptInte "
TCHK^2a9688d9 ds=0x7fa8df33feb8 dsprt=0x7fa8df49c9e0
0% 4134 KB, 15399 chunks: "kccdef : qcsvwsci "
TCHK^2a9688d9 ds=0x7fa8df33feb8 dsprt=0x7fa8df49c9e0

25 GB total:
25 GB commented, 794 KB permanent
12 MB free (0 KB in empty extents),
24 GB, 1 heap: "kxs-heap-c " 67 KB free held
Summary of subheaps at depth 1
25 GB total:
203 MB commented, 6809 MB permanent
18 GB free (0 KB in empty extents),
6398 MB, 9243 chunks: "allocator state " 6398 MB free held
3758 MB, 4623 chunks: "qkkele " 3749 MB free held
3650 MB, 4623 chunks: "qkkkey " 3650 MB free held


Dump of Real-Free Memory Allocator Heap [0x7fa8df317000]
mag=0xfefe0001 flg=0x5000003 fds=0x0 blksz=65536
blkdstbl=0x7fa8df317010, iniblk=524288 maxblk=524288 numsegs=318
In-use num=2965 siz=641597440, Freeable num=0 siz=0, Free num=254 siz=3586195456


Dumping Work Area Table (level=1)

Global SGA Info

global target: 102400 MB
auto target: 62376 MB
max pga: 2048 MB
pga limit: 4096 MB
pga limit known: 0
pga limit errors: 0

pga inuse: 33104 MB
pga alloc: 35238 MB
pga freeable: 1225 MB
pga freed: 433664026 MB
pga to free: 0 %
broker request: 0

pga auto: 12 MB
pga manual: 0 MB

pga alloc (max): 35238 MB
pga auto (max): 12284 MB
pga manual (max): 2 MB

# workareas : 0
# workareas(max): 551

ORA-02019 While SELECT From A View Owned By Another User Using Dblink

Quick case today.

This week I had a client experiencing ORA-02019 while SELECT from a View with dblink and CONNECT BY PRIOR … START WITH into SELECT.
The situation involved Views on a DB which need to be accessed by a groups of users from another DB using proxy user and a DB link but encounter this ORA error.

The root cause?

ORA-02019 while performing select on a view or while selecting a view owned by another user, with dblink, is a match to Bug 26558437 – DATABASE LINK FAILS WITH ORA-2019 WHEN SELECT ANOTHER USER VIEW.

But MOS doesn’t have workaround besides applying patch, as usual. What we did and solved on our case?

We created a materialized view refreshed every 15 mins (solution supported this delay0 using the DBLink if the view owner.
In this case, the other users instead of executing the query on the view, will be actually querying the table created (and refreshed) by the mview code, which would be only be executed by the mview owner.

By the way, 2 good side effects:
1) Once view was executed more often then the period of refresh, this solution is also saving some efforts database wise, once executing select from mview is way better then the view code, besides not using the network on dblink
2) If the remote database get slow, or down, the data would be still available from last mview refresh.

2) MOS not always give you all the steps. Sometimes you can easily solve your problem by thinking a little bit more on the root cause problem.


ORA-00600: internal error code, arguments: [kghfrempty:ds]

Hi all,
Annoying thing, just because that sounded veeery unsual to me. I would expect a different error.

I have seen this “[kghfrempty:ds]” in past a couple times, and if you did see it too you may have noticed, this is almost as much generic as the ORA-600 itself. It happens because this is usually in the top of KGH errors in general. Some examples:


kghnerror <- kghfrempty <- kghfrempty_ex <- qerhjFreeSpace
<- qerhjFetch <- qersoProcessULS <- qersoFetch <- opifch2 <- kpoal8
<- opiodr <- kpoodrc <- rpiswu2 <- kpoodr <- upirtrc
<- kpurcsc <- kpufch0 <- kpufch <- OCIStmtFetch2 <- qksanExecSql <- qksanAnalyzeSql


kghfrempty <- kghfrempty_partial <- kghgex <- kghfnd <- kghalo <- kghgex <- kghalf <- ktmrProcessCRClone 
<- ktmbRead <- ktmrget_int <- ktmrget <- kdmsTransGet <- kdst_fetch_imc  <- kdsttgr  <- qertbFetch

And as being it can be related either to analyze bugs (as per stack 1), In-Memory Column Store – IMC (as per stack 2), create views, rollback over cluster, trigger creation, use function based indexes, or several other generic situations (there is no pattern, as you can see).

So what do to if you get this message? Well, generically, read all the MOS notes available and try to match you case. IF you match, pray to have a bug for your current version, or open an SR to either have a backport or start a new investigation.

A few examples of things to check:

  • ALERT: Bug 13605839 ORA-600 [ktbsdp1] ORA-600 [kghfrempty:ds] ORA-600 [kdBlkCheckError]. Corruption in Rollback with Clusterwide Global Transactions in RAC (Doc ID 1527740.1)
  • SQL Analyze Fails with ORA-600[kghfrempty:ds] (Doc ID 2247180.1)
    Bug 21556276 – dataguide createviewonpath hit ora 600 [kghfrempty:ds] (Doc ID 21556276.8)
  • Bug 20878625 – ORA-600[kghfrempty:ds] with IMC enabled (Doc ID 20878625.8)
  • Bug 19212166 – ORA-600 [KGHFREMPTY:DS] With Cache Line Alignment Enabled (Doc ID 19212166.8)
  • Bug 6797925 – ORA-600 [kghfrempty:ds] During Creation Of Trigger In XML Schema (Doc ID 6797925.8)
  • 11.1: ORA-600 [17147] AND [kghfrempty:ds] On Select Using Functional Indexes (Doc ID 884882.1)
  • ORA-600 [kghfrempty:ds] And ORA-600 [kghrcdepth:ds] While Gathering Statistics (Doc ID 1480132.1)

BUUT I just noticed some undocumented scenario causing this and it might help you on fixing this as well.
I noticed I was getting this error for different queries but always from same application. So I addressed a complete review on application layer and noticed the JDBC 7 on this application. Happens we had just upgraded from 12.1 to 19c.

SO, after upgrading jdbc, it started to work again properly and no MOS SR was required. So lesson learned: When you receive this kghfrempty:ds], speacially after migration, also check for your application JDBC version.

For compatibility Matrix: Starting With Oracle JDBC Drivers – Installation, Certification, and More! ( Doc ID 401934.1 )

Hope it helps, cheers!