Microsoft Ignite – Certification Voucher

Hey folks,

Do you enjoy learning new things? Between September 22 and September 24, Microsoft will be hosting their yearly Microsoft Ignite event. Of course, it’ll be online this time and free of charge :).

This year, you can earn a certification voucher and that’s awesome. You’ll be able to choose one certification from a specific list. Check the list of available certifications and the Terms and Conditions accessing the following link: https://docs.microsoft.com/pt-br/learn/certifications/microsoft-ignite-cloud-skills-challenge-2020-free-certification-exam.

Plan Stability: SQLs switching to Bad Plans (HPV) and Load Very High (12c+)

Hi all,

That’s not the first post about this topic. Please read this one first.

Thing is: I access this post soooo often, as I face this sort of issue sooo frequently I decided to make a new one with some updated info. Even because, 3 years ago I wrote this article for 12.1 databases, now we have some relevant changes to 12.2 onwards.

Ok, so, for a little bit of context from the previous post:

  • After some critical issues with Plan Stability (new HSP being created out of nowhere very often) I mapped this to the usage of Adaptive Query Features.
  • At that point, I had a match to the latchfree events and mapped it specifically to latch misses on “Result Cache: RC Latch”.
  • As per MOS at the time, when Automatic Dynamic Sampling is used for the SQL statements, it can decide, based upon these statistics, that a better response time could be achieved by using the result cache for those queries. This can cause heavy usage of the result cache leading to the contention on latch free for “Result Cache: RC Latch”
  • So the villain was the Automatic Dynamic Sampling, as part to the SQL Plan Directives.
  • On the following months and migrations I started seeing more and more issues related to the SQL Plan Directives as part of the Adaptive Query Optimization features. So, I have been following the referred steps more and more.
  • Happens this means setting a few underscore parameters, which we should avoid unless approved/recommended by Oracle, so I’ve been trying to be less intrusive and simply disabling the whole Adaptive Features.
  • How to do it? That’s the tricky part and the thing to keep in mind, as we had changes from 12.1 to 12.2

Some words about the parameters settings:

  • The parameter “optimizer_adaptive_features” is a valid parameter for 12.1 and it’s enabled by default. I recommend leaving it as FALSE if you are facing this sort of issues.
  • This parameter is obsolete con 12.2 onwards though. It was replaced by both optimizer_adaptive_statistics and optimizer_adaptive_plans.
    • 12.1:

adap1

    • 12.2+:

adap2

(images from https://blogs.oracle.com/optimizer/optimizer-adaptive-features-in-oracle-database-12c-release-2)

  • Also, keep one eye on all %optim% parameters, as specific mechanisms can be enabled (for some reason) as I recently faced in a client ( _optimizer_adaptive_plan = TRUE).

When Adaptive Queries are good?

Ok Matheus, but if we should disable it, why did Oracle created in the first place?

Well, if something would be always bad, it would be disabled by default… There are some good usage cases: Dynamic Sampling is typically recommended when you know you are getting a bad execution plan due to complex predicates.

oh, noice, so I’ll change on my database I have a lot of complex queries” . Don’t, I suggest you to try and use an alter session statement to set the value for optimizer_dynamic_sampling parameter as it can be extremely difficult to come up with a system-wide setting. And you can mess up with other things.

When Adaptive Queries are bad?

If the queries compile times need to be as fast as possible, for example, unrepeated OLTP queries where you can’t amortize the additional cost of compilation over many executions. Serial statements are typically short running and any Dynamic Sampling overhead at compile time can have a large impact on overall system performance (if statements are frequently hard parsed).

For systems that match this profile, setting OAF=FALSE is recommended. For Oracle Database 12c Release 2 onwards, using the default OAS=FALSE is recommended.

I hope this helps you!

DBCA “Recovery manager failed to restore datafiles”

Hi all,

If that’s the 5th Blog Post with the same title one open, don’t close, READ THIS ONE!

This one is different.

It’s actually another case about Oracle throwing generic errors for DBCA where 99% of times it’s the same error, so all blogs are different but the same in essence, and none resolve your problem. So, let’s go by parts:

The Error from Client:

DBCA_Error

 

Generic Case (if this is the first blog you open about the subject):

  • Make sure the file $ORACLE_HOME/bin/oracle has privilege set to chmod 6751 in both (ASM and DB) homes. it should look like this:
[oracle@PROD01 bin]$ cd /u01/app/oracle/product/19c/db/bin
[oracle@PROD01 bin]$
[oracle@PROD01 bin]$ ls -ltr oracle*
-rwsr-s--x. 1 oracle asmadmin 441253104 Aug 27 22:29 oracle
  • If you are not sure, set it accordingly:
cd $ORACLE_HOME/bin && chmod 6751 oracle
  • Not yet? Check the disks assigned to ASM privileges and groups:
kfod status=TRUE asm_diskstring='/dev/asm*' disk=ALL
  • Ater all this, still not working? Go for the atypical case below:

 

Atypical Case (Exception)

After some struggle and no success, I started validating everything I could. Some piece of words before the silver bullet here:

Noticed I knew you do have ASM? How come that? Well, most likely the error below the ASM happens at the point the DBCA runs a RMAN restore to create the database base files and metadata. This most likely happens at the time the write is happening on the ASM, once this is the most unstable process involved.

By looking deeper on the installation logs I could see:

[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: restoring datafile 00001 to +DATA
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: reading from backup piece /ora01/app/oracle/product/19c/db/assistants/dbca/templates/Seed_Database.dfb
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: ORA-19870: error while restoring backup piece /ora01/app/oracle/product/19c/db/assistants/dbca/templates/Seed_Database.dfb
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-19504: failed to create file "+DATA"
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-17502: ksfdcre:4 Failed to create file +DATA
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-15001: diskgroup "DATA" does not exist or is not mounted
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-01017: invalid username/password; logon denied

Bingo, so it’s a password issue?

Well, I’m creating the database and this actually matches with all the chmod 6751 thing…

What then?

Well, after a while going crazy validating passwd files and so one, realized something about the oracle user:

[oracle@PROD01 bin]$ id -a
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),10(wheel),203(dba),503(asmadmin),504(asmoper),525(madhoc) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[oracle@PROD01 bin]$ grep oracle /etc/group
wheel:x:10:oracle
asmadmin:x:503:oracle
asmoper:x:504:oracle
madhoc:x:525:oracle

Can you see the oracle user is part of the oinstall group but not really appearing on /etc/group? Also not in osusergroup dba.

Well, let’s force it?

[oracle@PROD01 bin]$ sudo su -
Last login: Fri Aug 28 14:13:21 PDT 2020 on pts/3
[root@DMSDB1PA ~]# usermod -g oinstall -G oinstall,dba,asmadmin,asmoper,madhoc oracle
[root@PROD01 ~]# id oracle
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),10(wheel),203(dba),503(asmadmin),504(asmoper),525(madhoc)
[root@PROD01 ~]# grep oracle /etc/group
wheel:x:10:oracle
dba:x:203:oracle
asmadmin:x:503:oracle
asmoper:x:504:oracle
oinstall:x:501:oracle
madhoc:x:525:oracle
[root@PROD01 ~]#

Well done!

Now try running DBCA again. That was a very tricky issue to find.

Know something else? At the point I was writing this I decided to have another look and ended up finding this MOS note: “ORA-17502 /ORA-01017: invalid username/password; logon denied ” While Creating 19c Database (Doc ID 2545858.1). We have a bug for it: BUG:29821687 – ORA-17502 /ORA-01017: INVALID USERNAME/PASSWORD; LOGON DENIED ” WHILE CREATING 19C DATABASE

You have the workaround already though. Go champs!

Hope it helps you, cheers!

OPatch Auto Failing to Apply Datapatch for a Removed Database

Hi all,

So, just a few days ago, during a client support, crossed the following case.

A few database creations initially failed with DBCA due to other issues, but it seems the DBCA didn’t cleread all creation steps after the failure and theoretical rollback.

As a consequence, whenever running DBCA with GUI client was seeing an old database. When trying to remove it with DBCA, the removal fail as the database can’t be brought up (creation has failed, remember?). Ok, we can leave with it, right?

Yes until we reached the point where Opatchauto failed with the following:

Verifying SQL patch applicability on home /u01/app/oracle/product/19c/db

"/bin/sh -c 'cd /u01/app/oracle/product/19c/db; ORACLE_HOME=/u01/app/oracle/product/19c/db ORACLE_SID=DB1 /u01/app/oracle/product/19c/db/OPatch/datapatch -prereq -verbose'" command failed with errors. Please refer to logs for more details. SQL changes, if any, can be analyzed by manually retrying the same command.

The reason? See the complete log about the failing step:

Executing command as oracle: 
 /bin/sh -c 'cd /u01/app/oracle/product/19c/db;ORACLE_HOME=/u01/app/oracle/product/19c/db ORACLE_SID=DB1 /u01/app/oracle/product/19c/db/OPatch/datapatch -verbose'
2020-09-02 16:26:56,362 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - COMMAND Looks like this: /bin/sh -c 'cd /u01/app/oracle/product/19c/db;ORACLE_HOME=/u01/app/oracle/product/19c/db ORACLE_SID=DB1 /u01/app/oracle/product/19c/db/OPatch/datapatch -verbose'
2020-09-02 16:26:57,662 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.GISystemCall - Is retry required=false
2020-09-02 16:26:57,662 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - status: 1
2020-09-02 16:26:57,662 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
COMMAND EXECUTION FAILURE :
SQL Patching tool version 19.8.0.0.0 Production on Wed Sep  2 16:26:57 2020
Copyright (c) 2012, 2020, Oracle.  All rights reserved.

Log file for this invocation: /u01/app/oracle/base/cfgtoollogs/sqlpatch/sqlpatch_25218_2020_09_02_16_26_57/sqlpatch_invocation.log

Connecting to database...
Error: prereq checks failed!
Database connect failed with: ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
Linux-x86_64 Error: 2: No such file or directory
Additional information: 4376
Additional information: 1275019259 (DBD ERROR: OCISessionBegin)

Please refer to MOS Note 1609718.1 and/or the invocation log
/u01/app/oracle/base/cfgtoollogs/sqlpatch/sqlpatch_25218_2020_09_02_16_26_57/sqlpatch_invocation.log
for information on how to resolve the above errors.

SQL Patching tool complete on Wed Sep  2 16:26:57 2020

Clearly, the database is still in place.
As per MOS (ORA-01078 Can Not Delete Database From Dbca on Linux (Doc ID 1559634.1)) -> See the /etc/oratab!

Thing is, the doesn’t have the DB1 line. Also, all related files, logs directory structure, passwd, init, etc… all wiped out. What else?

Here is goes what seems Oracle forgot to tell:

[oracle@PRODB01 dbca]$ srvctl status database -d DB1
Database is not running.
[oracle@DMSDB1PA dbca]$ srvctl config database -d DB1
Database unique name: DB1
Database name: DB1
Oracle home: /u01/app/oracle/product/19c/db
Oracle user: oracle
Spfile: 
Password file: 
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Disk Groups: DATA
Services: 
OSDBA group: oinstall
OSOPER group: 
Database instance: DB1
Ohhh, that took me a while to realize, this was a Standalone server. Once understood, the fix is straight forward:
[oracle@PRODB01 dbca]$ srvctl remove database -d DB1
Remove the database dms? (y/[n]) y
[oracle@PRODB01 dbca]$

I hope this can buy you some minutes of MOS, in case you are googling it first.
Or save you in case you gave up MOS already.

Cheers!