Failed Logon Delay Causing Performance Issues

On the other day when I got to the office I was called to check a database that was running slow. They had implemented a new process there and wanted to make sure it was not impacted.

When checked I saw this issue using OEM

 

User SYS causing a strange wait event Failed Logon Delay

Someone had  created a process running with the user SYS but they did not fully configured and a part of the process was trying to connect with the wrong password.

While they were looking in the configuration files and servers to see from where the issue was coming from, I started my own investigation to speed up the process.

1st I had to enable audit as it was disabled for unsuccessful loging attemps

SQL> audit session whenever not successful;

Audit succeeded.

 

Than I was able to see from where the failed connection came from, I just needed to look for the return code 1017 as ORA-1017 is invalid username/password; logon denied on sys.aud$

col ntimestamp# for a30 heading "Timestamp"
col userid for a6 heading "Username"
col userhost for a15 heading "Machine"
col spare1 for a10 heading "OS User"
col comment$text for a80 heading "Details"

select ntimestamp#, userid, userhost, spare1, comment$text,returncode from sys.aud$ where returncode=1017 or returncode=28000;

 

Oldie but goldie =)

Hope it helps,

Elisson Almeida

MySQL won’t start [ERROR] Found option without preceding group in config file

Hey folks,

have you ever received a call for a MySQL on windows that stopped working after someone did something at their my.cnf? Then you try to start the service by cmd and get the following error.

mysqld: [ERROR] Found option without preceding group in config file C:\ProgramData\MySQL\MySQL Server 8.0\my.ini at line 1.
mysqld: [ERROR] Fatal error in defaults handling. Program aborted!

Well, for some reason, the editor that was used (no idea which one was), threw some random byte at the beginning of the file. To solve that (on windows at least), open the file on Notepad++, go to Format > Convert to ANSI. Save the file and start again the service.

What was the weirdest thing that happened to you on a Windows Server?

MySQL Error1075 – Incorrect table definition; What’s happening?

Hey Folks,

A few months ago, I found an issue, where, for some reason, someone ignored the warnings and tried to restore a backup from a different version of MySQL (or even MariaDB, IDK). And as a result, half the database was running without Primary Keys. So when a system was trying to update their schema, we were getting errors (like the error bellow) trowed at our face.

1075 - Incorrect table definition; there can be only one auto column and it must be defined as a key

Ok, first things first, you would like to run an ALTER TABLE TABLE_NAME_HERE
ADD PRIMARY KEY (ID);
and see if it works.  The error was being thrown because the table key doesn’t have a single index on it… if you have problems with duplicated records on it, you can try the following script to solve the issue.

First, get the max id from the table, and then run the following: 

UPDATE TABLE_NAME_HERE JOIN (SELECT @sequence := MAX_ID_HERE ) r SET id=@sequence:=@sequence+1 where id= DUPLICATED_ID_HERE;

WARNING

Be aware that, if the rows that were duplicated, where referenced as FK on another table, you will get some headache (well, you already have problems…) !!

MySQL Error ‘Unknown or incorrect time zone’ at a replica

Have you ever tried to do a MySQL replication at a different timezone/SO and got the following error message at SHOW REPLICA STATUS? Well, there are two ways to solve this, Error ‘Unknown or incorrect time zone: ‘America/Cuiaba” on query. Default database: ‘glpi_tiab’. Query: ‘BEGIN’

If you are on Linux, you can run the following and solve the problem (after a restart of the service).

$ mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -u root mysql -p

On Windows, I found it to be easier if you download the timezone instead of build/import your own. You can download from here https://dev.mysql.com/downloads/timezones.html.

use mysql;
source /path/to/file/timezone_posix.sql;

Well, that’s it for today. See ya.

Retrieve the SQL Server Version from a Backup File

Have you ever been in the need to retrieve the SQL Server version that was used on a backup file?

Well, if for some weird reason that happens to you, the following SQL can help you. It won’t restore the database, it’ll just retrieve some basic info about it.

With that, you’ll have the DatabaseVersion (Internal Number Version) where the backup was from. You can also grab some useful information like the Server Name, Creation Date, and more.

RESTORE HEADERONLY FROM DISK = N'b:\backup\data_backup.bak'

Below we have a table of  Versions x Internal Number.

Version Internal Number Version Compat. Level
SQL Server 2019 895 – 904 150
SQL Server 2017 868 – 869 140
SQL Server 2016 852 130
SQL Server 2014 782 120
SQL Server 2012 706 110

I hope it helps!

19c+ DBMS_STATS.GATHER_TABLE_STATS Memory Leaking on PGA

Hi all,

So, here is a quick post but which wasn’t so eaasy to be mapped, though.

It happens that right after a 19c upgrade from 18c, we started facing Memory Leak messages related to PGA area on a regular basis. After a while following the v$process_memory allocation, it was possible to map it to an ODI routing with DBMS_STATS.GATHER_TABLE_STATS. The PGA exceeds the limit just a few minutes after.

Nice han, so what is happening?

Long story short: Bug 30846782 : 19C+ FAST/EXCESSIVE PGA GROWTH WHEN USING DBMS_STATS.GATHER_TABLE_STATS.

According to Oracle Docs: No Workaround!

But here is the golden info I want to share after an SR:

EXECUTE IMMEDIATE 'alter session set "_fix_control"='20424684:OFF';

You are welcome!

Cheers!

How to send Telegram messages from SQL Server

Hi folks,
You usually receive notifications from SQLServer at your e-mail, right? A nice way to receive updates from your SQL Server Schedule job, it’s sending them through telegram. To do so, you’ll have to configure one additional step and choose Operating System (CmdExec) as a type of command. And to call the Telegram API, we can use a PowerShell command like the following:

powershell "Invoke-RestMethod -Uri 'https://api.telegram.org/bot{API-KEY}/sendMessage?chat_id={CHAT-ID}&text={TEXT}'"

Just replace the following keys with the proper value (don’t forget to remove the curly brackets too).

API-KEY: Your bot API key
CHAT-ID: The ID of the chat where you’ll receive the message, could be a single user, or a chat group
TEXT: Your beloved message.

What else do you use to achieve this kind of alers? Slack? healthchecks.io? Tell us 😉

Plan Stability: SQLs switching to Bad Plans (HPV) and Load Very High (12c+)

Hi all,

That’s not the first post about this topic. Please read this one first.

Thing is: I access this post soooo often, as I face this sort of issue sooo frequently I decided to make a new one with some updated info. Even because, 3 years ago I wrote this article for 12.1 databases, now we have some relevant changes to 12.2 onwards.

Ok, so, for a little bit of context from the previous post:

  • After some critical issues with Plan Stability (new HSP being created out of nowhere very often) I mapped this to the usage of Adaptive Query Features.
  • At that point, I had a match to the latchfree events and mapped it specifically to latch misses on “Result Cache: RC Latch”.
  • As per MOS at the time, when Automatic Dynamic Sampling is used for the SQL statements, it can decide, based upon these statistics, that a better response time could be achieved by using the result cache for those queries. This can cause heavy usage of the result cache leading to the contention on latch free for “Result Cache: RC Latch”
  • So the villain was the Automatic Dynamic Sampling, as part to the SQL Plan Directives.
  • On the following months and migrations I started seeing more and more issues related to the SQL Plan Directives as part of the Adaptive Query Optimization features. So, I have been following the referred steps more and more.
  • Happens this means setting a few underscore parameters, which we should avoid unless approved/recommended by Oracle, so I’ve been trying to be less intrusive and simply disabling the whole Adaptive Features.
  • How to do it? That’s the tricky part and the thing to keep in mind, as we had changes from 12.1 to 12.2

Some words about the parameters settings:

  • The parameter “optimizer_adaptive_features” is a valid parameter for 12.1 and it’s enabled by default. I recommend leaving it as FALSE if you are facing this sort of issues.
  • This parameter is obsolete con 12.2 onwards though. It was replaced by both optimizer_adaptive_statistics and optimizer_adaptive_plans.
    • 12.1:

adap1

    • 12.2+:

adap2

(images from https://blogs.oracle.com/optimizer/optimizer-adaptive-features-in-oracle-database-12c-release-2)

  • Also, keep one eye on all %optim% parameters, as specific mechanisms can be enabled (for some reason) as I recently faced in a client ( _optimizer_adaptive_plan = TRUE).

When Adaptive Queries are good?

Ok Matheus, but if we should disable it, why did Oracle created in the first place?

Well, if something would be always bad, it would be disabled by default… There are some good usage cases: Dynamic Sampling is typically recommended when you know you are getting a bad execution plan due to complex predicates.

oh, noice, so I’ll change on my database I have a lot of complex queries” . Don’t, I suggest you to try and use an alter session statement to set the value for optimizer_dynamic_sampling parameter as it can be extremely difficult to come up with a system-wide setting. And you can mess up with other things.

When Adaptive Queries are bad?

If the queries compile times need to be as fast as possible, for example, unrepeated OLTP queries where you can’t amortize the additional cost of compilation over many executions. Serial statements are typically short running and any Dynamic Sampling overhead at compile time can have a large impact on overall system performance (if statements are frequently hard parsed).

For systems that match this profile, setting OAF=FALSE is recommended. For Oracle Database 12c Release 2 onwards, using the default OAS=FALSE is recommended.

I hope this helps you!

DBCA “Recovery manager failed to restore datafiles”

Hi all,

If that’s the 5th Blog Post with the same title one open, don’t close, READ THIS ONE!

This one is different.

It’s actually another case about Oracle throwing generic errors for DBCA where 99% of times it’s the same error, so all blogs are different but the same in essence, and none resolve your problem. So, let’s go by parts:

The Error from Client:

DBCA_Error

 

Generic Case (if this is the first blog you open about the subject):

  • Make sure the file $ORACLE_HOME/bin/oracle has privilege set to chmod 6751 in both (ASM and DB) homes. it should look like this:
[oracle@PROD01 bin]$ cd /u01/app/oracle/product/19c/db/bin
[oracle@PROD01 bin]$
[oracle@PROD01 bin]$ ls -ltr oracle*
-rwsr-s--x. 1 oracle asmadmin 441253104 Aug 27 22:29 oracle
  • If you are not sure, set it accordingly:
cd $ORACLE_HOME/bin && chmod 6751 oracle
  • Not yet? Check the disks assigned to ASM privileges and groups:
kfod status=TRUE asm_diskstring='/dev/asm*' disk=ALL
  • Ater all this, still not working? Go for the atypical case below:

 

Atypical Case (Exception)

After some struggle and no success, I started validating everything I could. Some piece of words before the silver bullet here:

Noticed I knew you do have ASM? How come that? Well, most likely the error below the ASM happens at the point the DBCA runs a RMAN restore to create the database base files and metadata. This most likely happens at the time the write is happening on the ASM, once this is the most unstable process involved.

By looking deeper on the installation logs I could see:

[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: restoring datafile 00001 to +DATA
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: reading from backup piece /ora01/app/oracle/product/19c/db/assistants/dbca/templates/Seed_Database.dfb
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: ORA-19870: error while restoring backup piece /ora01/app/oracle/product/19c/db/assistants/dbca/templates/Seed_Database.dfb
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-19504: failed to create file "+DATA"
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-17502: ksfdcre:4 Failed to create file +DATA
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-15001: diskgroup "DATA" does not exist or is not mounted
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-01017: invalid username/password; logon denied

Bingo, so it’s a password issue?

Well, I’m creating the database and this actually matches with all the chmod 6751 thing…

What then?

Well, after a while going crazy validating passwd files and so one, realized something about the oracle user:

[oracle@PROD01 bin]$ id -a
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),10(wheel),203(dba),503(asmadmin),504(asmoper),525(madhoc) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[oracle@PROD01 bin]$ grep oracle /etc/group
wheel:x:10:oracle
asmadmin:x:503:oracle
asmoper:x:504:oracle
madhoc:x:525:oracle

Can you see the oracle user is part of the oinstall group but not really appearing on /etc/group? Also not in osusergroup dba.

Well, let’s force it?

[oracle@PROD01 bin]$ sudo su -
Last login: Fri Aug 28 14:13:21 PDT 2020 on pts/3
[root@DMSDB1PA ~]# usermod -g oinstall -G oinstall,dba,asmadmin,asmoper,madhoc oracle
[root@PROD01 ~]# id oracle
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),10(wheel),203(dba),503(asmadmin),504(asmoper),525(madhoc)
[root@PROD01 ~]# grep oracle /etc/group
wheel:x:10:oracle
dba:x:203:oracle
asmadmin:x:503:oracle
asmoper:x:504:oracle
oinstall:x:501:oracle
madhoc:x:525:oracle
[root@PROD01 ~]#

Well done!

Now try running DBCA again. That was a very tricky issue to find.

Know something else? At the point I was writing this I decided to have another look and ended up finding this MOS note: “ORA-17502 /ORA-01017: invalid username/password; logon denied ” While Creating 19c Database (Doc ID 2545858.1). We have a bug for it: BUG:29821687 – ORA-17502 /ORA-01017: INVALID USERNAME/PASSWORD; LOGON DENIED ” WHILE CREATING 19C DATABASE

You have the workaround already though. Go champs!

Hope it helps you, cheers!

OPatch Auto Failing to Apply Datapatch for a Removed Database

Hi all,

So, just a few days ago, during a client support, crossed the following case.

A few database creations initially failed with DBCA due to other issues, but it seems the DBCA didn’t cleread all creation steps after the failure and theoretical rollback.

As a consequence, whenever running DBCA with GUI client was seeing an old database. When trying to remove it with DBCA, the removal fail as the database can’t be brought up (creation has failed, remember?). Ok, we can leave with it, right?

Yes until we reached the point where Opatchauto failed with the following:

Verifying SQL patch applicability on home /u01/app/oracle/product/19c/db

"/bin/sh -c 'cd /u01/app/oracle/product/19c/db; ORACLE_HOME=/u01/app/oracle/product/19c/db ORACLE_SID=DB1 /u01/app/oracle/product/19c/db/OPatch/datapatch -prereq -verbose'" command failed with errors. Please refer to logs for more details. SQL changes, if any, can be analyzed by manually retrying the same command.

The reason? See the complete log about the failing step:

Executing command as oracle: 
 /bin/sh -c 'cd /u01/app/oracle/product/19c/db;ORACLE_HOME=/u01/app/oracle/product/19c/db ORACLE_SID=DB1 /u01/app/oracle/product/19c/db/OPatch/datapatch -verbose'
2020-09-02 16:26:56,362 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - COMMAND Looks like this: /bin/sh -c 'cd /u01/app/oracle/product/19c/db;ORACLE_HOME=/u01/app/oracle/product/19c/db ORACLE_SID=DB1 /u01/app/oracle/product/19c/db/OPatch/datapatch -verbose'
2020-09-02 16:26:57,662 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.GISystemCall - Is retry required=false
2020-09-02 16:26:57,662 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - status: 1
2020-09-02 16:26:57,662 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
COMMAND EXECUTION FAILURE :
SQL Patching tool version 19.8.0.0.0 Production on Wed Sep  2 16:26:57 2020
Copyright (c) 2012, 2020, Oracle.  All rights reserved.

Log file for this invocation: /u01/app/oracle/base/cfgtoollogs/sqlpatch/sqlpatch_25218_2020_09_02_16_26_57/sqlpatch_invocation.log

Connecting to database...
Error: prereq checks failed!
Database connect failed with: ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
Linux-x86_64 Error: 2: No such file or directory
Additional information: 4376
Additional information: 1275019259 (DBD ERROR: OCISessionBegin)

Please refer to MOS Note 1609718.1 and/or the invocation log
/u01/app/oracle/base/cfgtoollogs/sqlpatch/sqlpatch_25218_2020_09_02_16_26_57/sqlpatch_invocation.log
for information on how to resolve the above errors.

SQL Patching tool complete on Wed Sep  2 16:26:57 2020

Clearly, the database is still in place.
As per MOS (ORA-01078 Can Not Delete Database From Dbca on Linux (Doc ID 1559634.1)) -> See the /etc/oratab!

Thing is, the doesn’t have the DB1 line. Also, all related files, logs directory structure, passwd, init, etc… all wiped out. What else?

Here is goes what seems Oracle forgot to tell:

[oracle@PRODB01 dbca]$ srvctl status database -d DB1
Database is not running.
[oracle@DMSDB1PA dbca]$ srvctl config database -d DB1
Database unique name: DB1
Database name: DB1
Oracle home: /u01/app/oracle/product/19c/db
Oracle user: oracle
Spfile: 
Password file: 
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Disk Groups: DATA
Services: 
OSDBA group: oinstall
OSOPER group: 
Database instance: DB1
Ohhh, that took me a while to realize, this was a Standalone server. Once understood, the fix is straight forward:
[oracle@PRODB01 dbca]$ srvctl remove database -d DB1
Remove the database dms? (y/[n]) y
[oracle@PRODB01 dbca]$

I hope this can buy you some minutes of MOS, in case you are googling it first.
Or save you in case you gave up MOS already.

Cheers!