DBCA “Recovery manager failed to restore datafiles”

Hi all,

If that’s the 5th Blog Post with the same title one open, don’t close, READ THIS ONE!

This one is different.

It’s actually another case about Oracle throwing generic errors for DBCA where 99% of times it’s the same error, so all blogs are different but the same in essence, and none resolve your problem. So, let’s go by parts:

The Error from Client:

DBCA_Error

 

Generic Case (if this is the first blog you open about the subject):

  • Make sure the file $ORACLE_HOME/bin/oracle has privilege set to chmod 6751 in both (ASM and DB) homes. it should look like this:
[oracle@PROD01 bin]$ cd /u01/app/oracle/product/19c/db/bin
[oracle@PROD01 bin]$
[oracle@PROD01 bin]$ ls -ltr oracle*
-rwsr-s--x. 1 oracle asmadmin 441253104 Aug 27 22:29 oracle
  • If you are not sure, set it accordingly:
cd $ORACLE_HOME/bin && chmod 6751 oracle
  • Not yet? Check the disks assigned to ASM privileges and groups:
kfod status=TRUE asm_diskstring='/dev/asm*' disk=ALL
  • Ater all this, still not working? Go for the atypical case below:

 

Atypical Case (Exception)

After some struggle and no success, I started validating everything I could. Some piece of words before the silver bullet here:

Noticed I knew you do have ASM? How come that? Well, most likely the error below the ASM happens at the point the DBCA runs a RMAN restore to create the database base files and metadata. This most likely happens at the time the write is happening on the ASM, once this is the most unstable process involved.

By looking deeper on the installation logs I could see:

[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: restoring datafile 00001 to +DATA
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: reading from backup piece /ora01/app/oracle/product/19c/db/assistants/dbca/templates/Seed_Database.dfb
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: ORA-19870: error while restoring backup piece /ora01/app/oracle/product/19c/db/assistants/dbca/templates/Seed_Database.dfb
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-19504: failed to create file "+DATA"
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-17502: ksfdcre:4 Failed to create file +DATA
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-15001: diskgroup "DATA" does not exist or is not mounted
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-01017: invalid username/password; logon denied

Bingo, so it’s a password issue?

Well, I’m creating the database and this actually matches with all the chmod 6751 thing…

What then?

Well, after a while going crazy validating passwd files and so one, realized something about the oracle user:

[oracle@PROD01 bin]$ id -a
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),10(wheel),203(dba),503(asmadmin),504(asmoper),525(madhoc) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[oracle@PROD01 bin]$ grep oracle /etc/group
wheel:x:10:oracle
asmadmin:x:503:oracle
asmoper:x:504:oracle
madhoc:x:525:oracle

Can you see the oracle user is part of the oinstall group but not really appearing on /etc/group? Also not in osusergroup dba.

Well, let’s force it?

[oracle@PROD01 bin]$ sudo su -
Last login: Fri Aug 28 14:13:21 PDT 2020 on pts/3
[root@DMSDB1PA ~]# usermod -g oinstall -G oinstall,dba,asmadmin,asmoper,madhoc oracle
[root@PROD01 ~]# id oracle
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),10(wheel),203(dba),503(asmadmin),504(asmoper),525(madhoc)
[root@PROD01 ~]# grep oracle /etc/group
wheel:x:10:oracle
dba:x:203:oracle
asmadmin:x:503:oracle
asmoper:x:504:oracle
oinstall:x:501:oracle
madhoc:x:525:oracle
[root@PROD01 ~]#

Well done!

Now try running DBCA again. That was a very tricky issue to find.

Know something else? At the point I was writing this I decided to have another look and ended up finding this MOS note: “ORA-17502 /ORA-01017: invalid username/password; logon denied ” While Creating 19c Database (Doc ID 2545858.1). We have a bug for it: BUG:29821687 – ORA-17502 /ORA-01017: INVALID USERNAME/PASSWORD; LOGON DENIED ” WHILE CREATING 19C DATABASE

You have the workaround already though. Go champs!

Hope it helps you, cheers!

19c: Could not execute DBMS_LOGMNR.START_LOGMNR: ORA-44609: CONTINOUS_MINE is desupported for use with DBMS_LOGMNR.START_LOGMNR.

Hi all,
This is to show you we should never trust 100% on documentation and how running on new versions yearly can put additional pressure on the documentation and cause errors…

So, I started supporting a new tool for data mining. There were no version restrictions as per their documentation, so I was more than happy about creating a PDB on my brand new 19c CDB, proudly using 19c for this new tool!

What happened?

Could not execute DBMS_LOGMNR.START_LOGMNR: ORA-44609: CONTINOUS_MINE is desupported for use with DBMS_LOGMNR.START_LOGMNR.

Doing my validation against the database:

SQL> execute dbms_logmnr.start_logmnr( options => dbms_logmnr.dict_from_online_catalog + SYS.DBMS_LOGMNR.CONTINUOUS_MINE);
BEGIN dbms_logmnr.start_logmnr( options => dbms_logmnr.dict_from_online_catalog + SYS.DBMS_LOGMNR.CONTINUOUS_MINE); END;

*
ERROR at line 1:
ORA-44609: CONTINOUS_MINE is desupported for use with DBMS_LOGMNR.START_LOGMNR.

Why is this happening?

This happens because the continuous_mine feature is deprecated since 12.2 and desupported/unavailable 19c on.

It seems this LogMining tool devs don’t check what changes on the new DBs versions before confirm compatibility, right?

And they don’t do it or a while, as this as announced as deprecated on 12.2…

Which makes me ask myself how are their confidence on the tool running on other clients they have… LOL

Anyway, are you facing the same? Here is some reference documentation to answer your boss about:

From: 19.1 Announcement – ORA-44609: CONTINOUS_MINE is Desupported For Use With DBMS_LOGMNR.START_LOGMNR (Doc ID 2456684.1):

  • CONTINUOUS_MINE was deprecated in Oracle Database 12c release 2 (12.2) and starting with 19.1 is desuppported. There is no replacement functionality.
  • The continuous_mine option for the dbms_logmnr.start_logmnr package is desupported in Oracle Database 19c (19.1), and is no longer available.

The real reasoning behind is: Nothing seems to be populating the V$LOGMNR_LOGS, so the ORA-1291 occurs.

Here is a quick test case or you, from the 19c version for Oracle Utilities Guide version 19c “Mining without specifying the list of redo log files“. (19c – 22. Using LogMiner to Analyze Redo Log Files)

I just put it all together to show you how it does [not] work:

  • Setting up it all:
ALTER DATABASE add SUPPLEMENTAL LOG DATA;
SELECT SUPPLEMENTAL_LOG_DATA_MIN FROM V$DATABASE;
EXECUTE DBMS_LOGMNR_D.BUILD (OPTIONS=>DBMS_LOGMNR_D.STORE_IN_REDO_LOGS);
@?/rdbms/admin/dbmslm.sql
alter system switch logfile;
alter system switch logfile;
alter system switch logfile;
create user BLABLA identified by BLABLA default tablespace users quota unlimited on users profile default;
grant connect, resource to ;
  • Doing some stuff to generate logs:
connect BLABLA/BLABLA
alter session set nls_date_format='dd-mon-yyyy hh24:mi:ss';
set time on
CREATE TABLE TEST_NULLS (COLUMNA1 NUMBER(3,0));
ALTER TABLE TEST_NULLS ADD (COLUMNA2 NUMBER(3) DEFAULT 0 NOT NULL);
insert into TEST_NULLS(columna1) values (4);
commit;
select * from TEST_NULLS;
update TEST_NULLS set columna2=221 where columna1=4;
commit;
select * from TEST_NULLS;
  • Gathering info or mining:
connect / as sysdba;
set echo on
set serveroutput on
alter session set nls_date_format='dd-mon-yyyy hh24:mi:ss';
set linesize 254
set pagesize 3000
column name format a40;
SELECT FILENAME name FROM V$LOGMNR_LOGS;
SELECT NAME, FIRST_TIME FROM V$ARCHIVED_LOG;
SELECT NAME, FIRST_TIME FROM V$ARCHIVED_LOG WHERE SEQUENCE# = (SELECT MAX(SEQUENCE#) FROM V$ARCHIVED_LOG WHERE DICTIONARY_BEGIN = 'YES');
  • Introduce the first_time to recover for the previous query:
EXEC DBMS_LOGMNR.START_LOGMNR(STARTTIME =>'&1',ENDTIME => SYSDATE,OPTIONS => DBMS_LOGMNR.DICT_FROM_REDO_LOGS + DBMS_LOGMNR.COMMITTED_DATA_ONLY + DBMS_LOGMNR.PRINT_PRETTY_SQL);
EXEC BEGIN DBMS_LOGMNR.START_LOGMNR(STARTTIME =>'&1',ENDTIME => SYSDATE,OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + DBMS_LOGMNR.COMMITTED_DATA_ONLY + DBMS_LOGMNR.PRINT_PRETTY_SQL);
select FIRST_CHANGE#,NEXT_CHANGE# from V$archived_log;
SELECT CHECKPOINT_CHANGE#, CURRENT_SCN FROM V$DATABASE;
  • Use the checkpoint at startscn and the current_scn at endscn:
EXEC DBMS_LOGMNR.START_LOGMNR(STARTSCN =>&1,ENDSCN => &2,OPTIONS => DBMS_LOGMNR.DICT_FROM_REDO_LOGS + DBMS_LOGMNR.COMMITTED_DATA_ONLY + DBMS_LOGMNR.PRINT_PRETTY_SQL);
EXEC DBMS_LOGMNR.START_LOGMNR(STARTSCN =>&1,ENDSCN => &2,OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + DBMS_LOGMNR.COMMITTED_DATA_ONLY + DBMS_LOGMNR.PRINT_PRETTY_SQL);
EXEC DBMS_LOGMNR.START_LOGMNR(STARTTIME =>SYSDATE, ENDTIME => SYSDATE +5/24,OPTIONS => DBMS_LOGMNR.DICT_FROM_REDO_LOGS + DBMS_LOGMNR.COMMITTED_DATA_ONLY + DBMS_LOGMNR.PRINT_PRETTY_SQL);
EXEC DBMS_LOGMNR.START_LOGMNR(STARTTIME =>SYSDATE ,ENDTIME => SYSDATE +5/24 ,OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + DBMS_LOGMNR.COMMITTED_DATA_ONLY + DBMS_LOGMNR.PRINT_PRETTY_SQL);

Ref: 19c – 22. Using LogMiner to Analyze Redo Log Files

  • You can expect or this:
ERROR at line 1:
ORA-01291: missing log file
ORA-06512: at "SYS.DBMS_LOGMNR", line 72
ORA-06512: at line 1

Hmmmm… So, the 19c Documentation is not working? Precisely.

As per (Doc ID 2613490.1), this will be fixed in 20.1 documentation.

  • Sections 22.13.2 Examples of Mining without specifying the list of redo log files explicitly and child example topics will be removed.
  • Section 22.4.2. Automatic Redo log Files options will be changed to Specifying Redo Log Files for Data Mining.
  • Section 22.7.4 The next sentence will be removed too.

In summary, whatever reference related to automatic mining in the documentation will be removed as this feature it’s not supported in 19.1 version and higher.

Ok, but this doesn’t solve my problem. What should I do with the client tool?

So, The continous_mine was the only method of starting LogMiner without first adding logfiles using dbms_logmnr.add_logfile.

What can I do, to workaround it:

  1. Add a logfile manually with dbms_logmnr.add_logfile for each logfile.
  2. Remove SYS.DBMS_LOGMNR.CONTINUOUS_MINE from the code.
    1. For this, you’ll need to specify the logfile explicitly. So I guess, some additional codding will be needed from your side…

I hope it helps!
Matheus.

Exadata: Cell Server Crashing on ORA-00600: [LinuxBlockIO::reap]

Hi all,

So I started facing this in a client environment. Here is the alert message:

Target name=db01cel08.xxx.com
Message=ORA-00600: internal error code, arguments: [LinuxBlockIO::reap], [0x60000D502388], [], [], [], [], [], [], [], [], [], []
Event reported time=Dec 19, 2019 2:14:16 AM EDT

When checking on the cellserver I see this message:

[root@db01 ~]# ssh db01cel08
Last login: Thu Dec 19 04:45:13 2019 from db01.xxx.com
[root@db01cel08 ~]# cellcli
CellCLI: Release 12.1.2.3.5 - Production on Fri Dec 19 17:13:31 EDT 2019

Copyright (c) 2007, 2016, Oracle. All rights reserved.

CellCLI> LIST ALERTHISTORY detail

[...]

name: 10
alertDescription: "ORA-07445: exception encountered: core dump [__intel_new_memset()+62] [11] [0x000000000] [] [] []"
alertMessage: "ORA-07445: exception encountered: core dump [__intel_new_memset()+62] [11] [0x000000000] [] [] []"
alertSequenceID: 10
alertShortName: ADR
alertType: Stateless
beginTime: 2019-12-19T02:00:04-04:00
endTime:
examinedBy:
notificationState: 1
sequenceBeginTime: 2019-12-19T02:00:04-04:00
severity: critical
alertAction: "Errors in file /opt/oracle/cell/log/diag/asm/cell/SYS_112331_170406/trace/cellofltrc_19796_53.trc (incident=25). Diagnostic package is attached. It is also accessible at https://db01cel08.xxx.com/diagpack/download?name=db01cel08_2019_12_19T02_00_04_10.tar.bz2 It will be retained on the storage server for 7 days. If the diagnostic package has expired, then it can be re-created at https://db01cel08.xxx.com/diagpack"

name: 11
alertDescription: "ORA-00600: internal error code, arguments: [LinuxBlockIO::reap], [0x60000D502388], [], [], [], [], [], [], [], [], [], []"
alertMessage: "ORA-00600: internal error code, arguments: [LinuxBlockIO::reap], [0x60000D502388], [], [], [], [], [], [], [], [], [], []"
alertSequenceID: 11
alertShortName: ADR
alertType: Stateless
beginTime: 2019-12-19T02:00:04-04:00
endTime:
examinedBy:
notificationState: 1
sequenceBeginTime: 2019-12-19T02:00:04-04:00
severity: critical
alertAction: "Errors in file /opt/oracle/cell/log/diag/asm/cell/db01cel08/trace/svtrc_9174_12.trc (incident=25). Diagnostic package is attached. It is also accessible at https://db01cel08.xxxx.com/diagpack/download?name=jdb01cel08_2019_12_19T02_00_04_11.tar.bz2 It will be retained on the storage server for 7 days. If the diagnostic package has expired, then it can be re-created at https://db01cel08.xxx.com/diagpack"

name: 12_1
alertDescription: "A SQL PLAN quarantine has been added"
alertMessage: "A SQL PLAN quarantine has been added. As a result, Smart Scan is disabled for SQL statements with the quarantined SQL plan. Quarantine id : 21 Quarantine type : SQL PLAN Quarantine reason : Crash Quarantine Plan : SYSTEM Quarantine Mode : FULL_Quarantine DB Unique Name : XPTODB Incident id : 25 SQLID : 8j0az9sgxs5yh SQL Plan details : {SQL_PLAN_HASH_VALUE=281152830, PLAN_LINE_ID=9} In addition, the following disk region has been quarantined, and Smart Scan will be disabled for this region: Disk Region : {Grid Disk Name=Unknown, offset=186750337024, size=1M} "
alertSequenceID: 12
alertShortName: Software
alertType: Stateful
beginTime: 2019-12-19T02:00:12-04:00
examinedBy:
metricObjectName: QUARANTINE/21
notificationState: 1
sequenceBeginTime: 2019-12-19T02:00:12-04:00
severity: critical
alertAction: "A SQL statement caused the Cell Server (CELLSRV) service on the cell to crash. A SQL PLAN quarantine has been created to prevent the same SQL statement from causing the same cell to crash. When possible, disable offload for the SQL statement or apply the RDBMS patch that fixes the crash, then remove the quarantine with the following CellCLI command: CellCLI> drop quarantine 21 All quarantines are automatically removed when a cell is patched or upgraded. For information about how to disable offload for the SQL statement, refer to the section about 'SQL Processing Offload' in Oracle Exadata Storage Server User's Guide. Diagnostic package is attached. It is also accessible at https://db01cel08.xxx.com/diagpack/download?name=db01cel08_2019_12_19T02_00_12_12_1.tar.bz2 It will be retained on the storage server for 7 days. If the diagnostic package has expired, then it can be re-created at https://db01cel08.xxx.com/diagpack"

CellCLI>

After some research, we could match the situation to Bug 13245134 – Query may fail with errors ORA-27618, ORA-27603, ORA-27626 or ORA-00600[linuxblockio::reap_1] or ora-600 [cacheput::process_1]

It’s also described as per: Exadata/SuperCluster: 11.2 databases missing fix for the bug 13245134 may lead to cell service crash with ora-600 [linuxblockio::reap_1]/ora-600 [cacheput::process_1] or ORA-27626: Exadata error: 242/Smart scan issues on the RDBMS side

In order to resolve the crashes quickly, I applied the patch online with:

After applying, all got solved:

[oracle@db01 ~]$ /oracle/xptodb/product/11.2.0.4/OPatch/opatch lsinventory -OH $ORACLE_HOME | grep 13245134
Patch (online) 13245134: applied on Thu Dec 19 23:34:50 EST 2019
13245134
[oracle@db01 ~]$

Hope it helps!

Automatic Killing Inactive Sessions with Resource Manager

Hello All!

So, your are manually (or via script) killing idle sessions on your database?

As consequence, your users are getting error “ORA-00028: your session has been killed.” and getting angry on you?

What about doing it automatically in a much more graceful way and be seen as a nicer DBA? You can do it using Resource Manager.

Are you already using Resource Manager on your database? Yes – Great!

No – Shame on you. Read this and put RM in place ASAP. This is a great tool for the database to manage the database resources, plus no additional licensing is needed. So go for it!

Ok, but what is the trick?
It’s the limit max_idle_time. You can use it either on existing groups or subgroups of your plan (or subplan) or switch to “KILL” groups with this you can even use the same criteria you’d use for any script to perform this action. For this you might user the parameter new_switch_group and have a created a different group only for those kills.

Have this option it’s nicer, if you want to avoid killing sessions on database, by the way, as you can always switch a session for the killing group manually, not demanding it to fill the requirements to automatic switching.

How? Quick example:

Creating new Plan Groups with MAX_IDLE_TIME:

begin
dbms_resource_manager.create_plan_directive( plan => '&RM_PLAN', group_or_subplan =>
'LONG_RUN', comment => 'Limit idle time to 5 minutes', max_idle_time => 300);
dbms_resource_manager.create_plan_directive( plan => '&RM_PLAN', group_or_subplan =>
'SHORT_RUN', comment => 'Limit idle time to 1 minute', max_idle_time => 60);
end;
/

Cool!

And what would be the error for the user that get’s the session killed?

ORA-3113 End of file on communication channel

Much nicer: Now you are a nice DBA and don’t kill sessions anymore.
You automated it!

Some additional recommendations:

  • Use this solution for Databases above 11.2.0.4 or 12.1.0.2, due some known bugs:
    • Bug 9523768 – IDLE SESSIONS AREN’T ACTUALLY KILLED IMMEDIATELY (affecting 11.2.0.1 to 11.2.0.3)
    • Bug 13837378 – ALTER SYSTEM KILL SESSION IMMEDIATE DOES NOT KILL QC SESSIONS (11.2.0.1 to 12.1.0.1)
    • Bug 8891495 – NON-IDLE SESSIONS ARE KILLED (11.2.0.1 only)
  • Some MOS references on it:
    • How To Automatic Kill Inactive Sessions using Resource Manager (Doc ID 1935739.1)
    • Using Resource Manager to Detect and Kill Idle Sessions (Doc ID 1557657.1)

Hope it helps!

Advise: Setting underScore/Hidden Parameters

Please be aware no hidden parameters (parameters starting with “_”) should not be set unless recommended by Oracle explicitly or in case of a perfect match to the case.

This is explained on MOS Best Practices and Guidelines for Setting underScore/Hidden Parameter to Workaround and/or Resolve Customers Problems (Doc ID 2524119.1)

So, in case of any diagnostics you have to lead to setting it, SUGGESTED:

  • Get the explicit approval/agreement from Oracle.
  • Include evidence on the root cause for this setting. A comment should work, something like:
alter system set "_replace_virtual_columns"=false Comment= 'Workaround Doc ID 1512347.1' SID= '*' scope= both;

Cheers!

ORA-20001: Statistics Advisor: Invalid task name for the current user

Hi all,
So, I start seeing on my alert log:

2019-12-26T00:42:03.543268-05:00
Errors in file /u01/app/oracle/diag/rdbms/cdb1/CDB1/trace/CDB1_j000_89793.trc:
ORA-12012: error on auto execute of job "SYS"."ORA$AT_OS_OPT_SY_36807"
ORA-20001: Statistics Advisor: Invalid task name for the current user
ORA-06512: at "SYS.DBMS_STATS", line 47207
ORA-06512: at "SYS.DBMS_STATS_ADVISOR", line 882
ORA-06512: at "SYS.DBMS_STATS_INTERNAL", line 20059
ORA-06512: at "SYS.DBMS_STATS_INTERNAL", line 22201
ORA-06512: at "SYS.DBMS_STATS", line 47197

After some research I found Bug 22879263 – BETA 12.2 – ORA-12012 , ORA-20001 AND ORA-06512 FROM ORA$AT_OS_OPT_SY_88 JOB.

This was suppose to be fixed on 12.2.0.1, however I’m still getting. Anyway, I executed the below and it got fixed:

EXEC dbms_stats.init_package();

Ref: Recurring ORA-12012, ORA-20001, ORA-06512 In Container Database (Doc ID 2420581.1)

Hope it helps!

Diagsnap Causing Node Eviction on 12c

Hi all,

So, you know how it is when we are having node evictions over a RAC. Lot’s of sessions getting killed, some effects for the clients and applications, also boss bossing in our shoulders to have a quick resolution over this. Plus it’s never an easy thing to drill down and understand.

Troubleshooting node reboots/evictions within Grid Infrastructure (GI) often is difficult due to the lack of Network and OS level resource information. To help circumvent this situation the diagsnap feature has been developed and integrated with Grid Infrastructure. Diagsnap is triggered to collect Network and OS level resource information when a given node is about to get evicted or when Grid Infrastructure is about to crash.

The diagsnap feature is enabled automatically starting from 12.1.0.2 Oct2017 PSU and 12.2.0.1 Oct2017 RU.
For more information about the diagsnap feature, refer to the Document 2345654.1 “What is diagsnap resource in 12c GI and above?”

However, after a lot of research and SR logs sending and interaction with Oracle to investigate a node eviction, we ended up finding a match to Bug 25785073 – OCSSD hangs while DIAGSNAP takes pstack causing a node reboot(Doc ID 25785073.8).

So Diagsnap is not helping, it is the cause of the issues.

After some research, seems this is not the only bug related to it. See some more:

  • Bug 27182006 – Auxiliary commands generated by DIAGSNAP spin CPU(Doc ID 27182006.8)
  • Bug 24692439 – Auxiliary commands generated by DIAGSNAP consumes high CPU(Doc ID 24692439.8)
  • Bug 23101338 – Disable diagsnap after 12.1.0.2.160419 GI PSU patch was installed(Doc ID 23101338.8)
  • Bug 28462215 – The Process diagsnap.pl is Restarted Every 2 Minutes(Doc ID 28462215.8)

Well,

At least for the 12Cs, I’m disabling it in all my environments:

– Check if diagsnap is disabled or not (DIAGSNAP=Disable if disabled, nothing if enabled)

$ egrep '^DIAGSNAP|^PSTACK' /u01/app/12.2.0.1/grid/crf/admin/crf$(hostname -s).ora

– Disable diagsnap (this will disable diagsnap on all nodes)

$ /u01/app/12.2.0.1/grid/bin/oclumon manage -disable diagsnap

– Check that diagsnap is disabled (DIAGSNAP=Disable if disabled, nothing if enabled) — to be done on each node

$ egrep '^DIAGSNAP|^PSTACK' /u01/app/12.2.0.1/grid/crf/admin/crf$(hostname -s).ora

Hope it helps!
Cheers!

Automatic SQL Tuning Advisor Raising ORA-00600: internal error code, arguments: [qksvcReplaceVC0]

Hi all,

So I got to receive frequently this error, always on same hour, from a database:

ORA-00600: internal error code, arguments: [qksvcReplaceVC0], [], [], [], [], [], [], [], [], [], [], []

Not much was required for matching it to the Automatic SQL Tuning Advisor.

This only seems to happen during execution of Automatic SQL Tuning Advisor. Several bugs have been logged for the issue but have not been resolved as the error is not reproducible at will. For example:

Bug 17401718: ORA-600 [QKSVCREPLACEVC0] USING SQL TUNING ADVISOR
Bug 16491690: ORA-600 [QKSVCREPLACEVC0] WHEN AUTOMATIC SQL TUNING ADVISOR EXECUTED
Bug 13959984: ORA-00600 [QKSVCREPLACEVC0]

How to fix it? Apply the patches!

To workaround it?

A few options:

1. Setting “_replace_virtual_columns” to false.

You can set this parameter at both session (where automatic SQL Tuning Advisor starts)
and system level with the following commands-

SQL> alter session set "_replace_virtual_columns"=false;

SQL> alter system set "_replace_virtual_columns"=false

2. Since it is only failing in the SQL Tuning Advisor auto task and has no effect on the database the error can be ignored.
You can disable that auto task and just run it manually when required:

–check auto job status

SQL> select client_name,status from dba_autotask_task;

SQL> select client_name,status from dba_autotask_client;

SQL> select client_name, operation_name, status from dba_autotask_operation;

–disable SQL Tuning Advisor job

SQL> exec dbms_auto_task_admin.disable ('sql tuning advisor', null, null);

-OR-

SQL> exec dbms_auto_task_admin.disable (client_name => 'sql tuning advisor', operation => null, window_name => null);

–enable SQL Tuning Advisor job

SQL> exec dbms_auto_task_admin.enable ('sql tuning advisor', null, null);

-OR-

SQL> exec dbms_auto_task_admin.enable (client_name => 'sql tuning advisor', operation => null, window_name => null);

 

Hope it helps!

ORA-02056: 2PC: k2lcom: bad two-phase command number rdonly from coord

Hi all,

So, you found this error?

Well, I received this in a client environment. After checking MOS ORA-02056: 2PC: K2gCheckGlobalCommit: Bad Two-phase Command Number 1 From Coord (Doc ID 2498134.1), which points to a known issue when dealing with committing transactions as per Bug 22016049 I noticed my problem could be different…

My real root cause is my transaction being done on a remote object and when checking on remote database I found pending distributed transactions.

For more informations about distributed transactions, please check: https://docs.oracle.com/database/121/ADMIN/ds_txns.htm#ADMIN12211

GUID-D521B4E9-E916-4B02-8B0E-4FAF4DC61423-default

So, if you getting this situation, before planning and applying the patch I suggest you:

  • Check if your process is not working on remote databases.
  • Check for any possible distributed remote transactions.

How? Easy check (on remote db):

select * from dba_2pc_pending;

I hope it helps!

PS NOTE: I just found this very good and complete post by Rodrigo Mufalani on the same topic: ORA-01591: lock held by in-doubt distributed transaction

Have a look for an Action Plan!
Cheers!

ORA-04030: out of process memory when trying to allocate % bytes (kkoutlCreatePh,ub1 : kkoabr)

Hi all,

So, I started seen this on a client environment. Researching on the  case after no crear reference on MOS, I noticed some high PGA allocation as per below.

SYS@proddb AS SYSDBA PROD> select pid, serial#,category, allocated/1024/1024 MB, used/1024/1024 MB_used, max_allocated/1024/1024 MB_MAX_ALLOCATED_ON_PGA
2 from v$process_memory where pid=852;

PID SERIAL# CATEGORY MB MB_USED MB_MAX_ALLOCATED_ON_PGA
---------- ---------- --------------- ---------- ---------- -----------------------
852 91 SQL .086807251 .00806427 .672416687
852 91 PL/SQL .087730408 .078926086 .126182556
852 91 Freeable .5625 0
852 91 Other 2.25187302 2.25187302

Seems a match MOS ORA-04030 Error With High “kkoutlCreatePh” (Doc ID 1618444.1).

The solution? Simply disabled the following parameter:

"_b_tree_bitmap_plans"=false

 

Trace for additional info:

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/12.1.0/db_1
System name: Linux
Node name: proddb.local
Release: 4.1.12-94.8.3.el7uek.x86_64
Version: #2 SMP Wed Apr 25 19:57:32 PDT 2018
Machine: x86_64
Instance name: proddb
Redo thread mounted by this instance: 1
Oracle process number: 854
Unix process pid: 28899, image: oracle@proddb.local

*** 2019-01-09 11:20:09.130
*** SESSION ID:(5429.55092) 2019-01-09 11:20:09.130
*** CLIENT ID:() 2019-01-09 11:20:09.130
*** SERVICE NAME:(XXXXX) 2019-01-09 11:20:09.130
*** MODULE NAME:(JDBC Thin Client) 2019-01-09 11:20:09.130
*** CLIENT DRIVER:(jdbcthin) 2019-01-09 11:20:09.130
*** ACTION NAME:() 2019-01-09 11:20:09.130

[TOC00000]
Jump to table of contents
Dump continued from file: /u01/app/oracle/diag/rdbms/proddb/proddb/trace/proddb_ora_28899.trc
[TOC00001]
ORA-04030: out of process memory when trying to allocate 34392040 bytes (kkoutlCreatePh,ub1 : kkoabr)

[TOC00001-END]
[TOC00002]
========= Dump for incident 2710965 (ORA 4030) ========
[TOC00003]
----- Beginning of Customized Incident Dump(s) -----
=======================================
TOP 10 MEMORY USES FOR THIS PROCESS
---------------------------------------
72% 18 GB, 1356 chunks: "free memory " SQL
kxs-heap-c ds=0x7fa8df330220 dsprt=0x7fa8df49dbe0
27% 6809 MB, 3366 chunks: "permanent memory " SQL
kxs-heap-c ds=0x7fa8df330220 dsprt=0x7fa8df49dbe0
0% 100 MB, 894 chunks: "permanent memory " SQL
kkoutlCreatePh ds=0x7fa8cce16708 dsprt=0x7fa8df330220
0% 23 MB, 589515 chunks: "chedef : qcuatc "
TCHK^2a9688d9 ds=0x7fa8df33feb8 dsprt=0x7fa8df49c9e0
0% 18 MB, 150124 chunks: "opndef: qcopCreateOpnViaM "
TCHK^2a9688d9 ds=0x7fa8df33feb8 dsprt=0x7fa8df49c9e0
0% 16 MB, 214829 chunks: "logdef: qcopCreateLog "
TCHK^2a9688d9 ds=0x7fa8df33feb8 dsprt=0x7fa8df49c9e0
0% 11 MB, 241 chunks: "free memory "
top call heap ds=0x7fa8df49dbe0 dsprt=(nil)
0% 9643 KB, 4623 chunks: "qkkele " SQL
kxs-heap-c ds=0x7fa8df330220 dsprt=0x7fa8df49dbe0
0% 6534 KB, 58578 chunks: "optdef: qcopCreateOptInte "
TCHK^2a9688d9 ds=0x7fa8df33feb8 dsprt=0x7fa8df49c9e0
0% 4134 KB, 15399 chunks: "kccdef : qcsvwsci "
TCHK^2a9688d9 ds=0x7fa8df33feb8 dsprt=0x7fa8df49c9e0

=======================================
PRIVATE MEMORY SUMMARY FOR THIS PROCESS
---------------------------------------
******************************************************
PRIVATE HEAP SUMMARY DUMP
25 GB total:
25 GB commented, 794 KB permanent
12 MB free (0 KB in empty extents),
24 GB, 1 heap: "kxs-heap-c " 67 KB free held
------------------------------------------------------
Summary of subheaps at depth 1
25 GB total:
203 MB commented, 6809 MB permanent
18 GB free (0 KB in empty extents),
6398 MB, 9243 chunks: "allocator state " 6398 MB free held
3758 MB, 4623 chunks: "qkkele " 3749 MB free held
3650 MB, 4623 chunks: "qkkkey " 3650 MB free held

=========================================
REAL-FREE ALLOCATOR DUMP FOR THIS PROCESS
-----------------------------------------

Dump of Real-Free Memory Allocator Heap [0x7fa8df317000]
mag=0xfefe0001 flg=0x5000003 fds=0x0 blksz=65536
blkdstbl=0x7fa8df317010, iniblk=524288 maxblk=524288 numsegs=318
In-use num=2965 siz=641597440, Freeable num=0 siz=0, Free num=254 siz=3586195456

==========================================
INSTANCE-WIDE PRIVATE MEMORY USAGE SUMMARY
------------------------------------------

Dumping Work Area Table (level=1)
=====================================

Global SGA Info
---------------

global target: 102400 MB
auto target: 62376 MB
max pga: 2048 MB
pga limit: 4096 MB
pga limit known: 0
pga limit errors: 0

pga inuse: 33104 MB
pga alloc: 35238 MB
pga freeable: 1225 MB
pga freed: 433664026 MB
pga to free: 0 %
broker request: 0

pga auto: 12 MB
pga manual: 0 MB

pga alloc (max): 35238 MB
pga auto (max): 12284 MB
pga manual (max): 2 MB

# workareas : 0
# workareas(max): 551