GUORS: 1st RS Oracle Users Meeting 2020 (Online)

Hi all,

We will have a GUORS event in the first half of the year, as usual!

But this time it will be online!

It will be on May 21st (Thursday) at 8:00 PM.
Where? In the comfort of your home!

We will count on the illustrious presence of Oracle ACE Marcus Vinicius, speaking about OCI for Oracle DBA’s: Introduction and Key Concepts.

In this session, our guest will talk about OCI (Oracle Cloud Infrastructure), its key concepts, and the challenges an Oracle DBA faces when converting their career to Cloud architect, from the perspective of someone who lives this experience every day.

Marcus Vinicius is a Senior Oracle DBA, an Oracle ACE, and holds several Oracle Database certifications, with more than 15 years of experience in the field. He was a postgraduate professor of the “MBA em Administração de Banco de Dados Oracle” course at Veris IBTA, at the São Paulo and Campinas units. He was also an Oracle DBA and Operations Manager at Discover Technology and a Senior Oracle Consultant at Oracle (ACS). He is currently a Senior Oracle Consultant at Accenture Enkitec Group and a technical advisor of GUOB. He has maintained a blog about Oracle for more than 10 years – https://www.viniciusdba.com.br/blog (be sure to check out Vini's blog too!).

You're not going to miss this one, are you?

Register HERE!


ALTERNATE Archive log destination configuration that can help you – at least it helped me =)

Hi all,

I was working on a database that was waiting on a backup policy to be configured on the NetBackup side, so basically it had no backups. Nothing new here, right? =)

The dev team did not wait long to hammer the database, which caused archive log generation to spike.

They were planning to work all weekend long and I needed to make sure that the database was available.

I had a spare diskgroup that I could use when the FRA got full, but I did not want to monitor the space usage over the weekend just to switch the archive log destination. I also did not want to set the second destination up ahead of time, since that would double the archive logs by sending them to both locations and waste space.

So I wanted one active location at a time: if/when the 1st location filled up, archiving would switch to the secondary location.

The configuration that helped me was the ALTERNATE attribute of log_archive_dest_n.

Per the documentation, the attribute “Specifies an alternate archiving destination to be used when the original destination fails.”

If you want, you can read all the details of the parameter here.

Most configurations using this attribute are related to Data Guard standby setups, but it works in this scenario as well.

This is how I used it:
*I used the NOREOPEN option as I knew the space issue would not be resolved during the weekend.

alter system set log_archive_dest_1='location=use_db_recovery_file_dest noreopen alternate=log_archive_dest_2' scope=both sid='*';
alter system set log_archive_dest_2='location=+NEW_FLASH' scope=both sid='*';
alter system set log_archive_dest_state_2='ALTERNATE' scope=both sid='*';

On the database, you can see the status of the archive destinations:

select dest_id, dest_name, status from v$archive_dest_status where status <> 'INACTIVE';

DEST_ID DEST_NAME STATUS
---------- -------------------- ---------
1 LOG_ARCHIVE_DEST_1 VALID
2 LOG_ARCHIVE_DEST_2 UNKNOWN

I saw this error in the alert log when the space on the 1st destination was exhausted (along with some hiccups on the database's services):

ORA-17502: ksfdcre:4 Failed to create file +FLASH
ORA-15041: diskgroup "+FLASH" space exhausted
ORA-16038: log 8 sequence# 1059 cannot be archived
ORA-19504: failed to create file ""

Checking the log_archive_dest status again, the 1st destination got disabled and only the 2nd one was valid. The hiccups were barely noticed and the database kept working the whole time.

select dest_id, dest_name, status from v$archive_dest_status where status <> 'INACTIVE';

DEST_ID DEST_NAME STATUS
---------- -------------------- ---------
1 LOG_ARCHIVE_DEST_1 DISABLED
2 LOG_ARCHIVE_DEST_2 VALID

When the space issue was resolved, all I needed to do was enable the 1st location again and set the 2nd one back to ALTERNATE.
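Roughly, that step looks like this (a minimal sketch, assuming the same destinations configured above):

-- Resume archiving to the FRA (1st destination)
alter system set log_archive_dest_state_1='ENABLE' scope=both sid='*';
-- Put the 2nd destination back on standby as the alternate
alter system set log_archive_dest_state_2='ALTERNATE' scope=both sid='*';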

select dest_id, dest_name, status from v$archive_dest_status where status <> 'INACTIVE';

DEST_ID DEST_NAME STATUS
---------- -------------------- ---------
1 LOG_ARCHIVE_DEST_1 VALID
2 LOG_ARCHIVE_DEST_2 UNKNOWN

Thanks and hope it helps

Elisson Almeida

Virtual Conference: Systematic Oracle SQL Optimization in 2020 (Review)

Hi all,

I’m here for a review and for a tip: please don’t miss it in case those guys decide to do it again!

It was a really great conference, with a lot of really valuable sessions with real content.

I’ll just share the agenda so you can judge for yourself. There is more at https://tanelpoder.com/conference/.

Day 1
Time          Speaker        Topic
11:00-11:15   Tanel          Welcome & Introduction
11:15-12:30   Cary           A Richer Understanding of Software Performance
12:30-13:00   Slack          Break/Q&A/Chat
13:00-14:30   Jonathan       Trouble-shooting with Execution Plans
14:30-15:00   All Speakers   Panel: What Has Changed in 10 Years and What Has Not

Day 2
Time          Speaker        Topic
11:00-11:15   Kerry          How to Stay Relevant
11:15-12:30   Tanel          Scripts and Tools for Optimizing SQL Execution and Indexing
12:30-13:00   Slack          Break/Q&A/Chat
13:00-14:30   Mauro          Chase the Optimizer Every Step of the Way
14:30-15:00   All Speakers   Panel: What Will Change in the Next 10 Years

Thanks, guys, for organizing such a great ending to the week. You are the best!

ASM: Process PPAx holds disk from being dropped

So I was trying to remove some no-longer-needed disks from a cluster: first I dismounted the diskgroup on the other cluster nodes, and from the last one I executed DROP DISKGROUP DG_NAME INCLUDING CONTENTS. Nothing new, right? But the SA told me that the disks were still being accessed.
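For reference, the sequence was roughly this (a minimal sketch; DG_NAME is a placeholder, run from the ASM instances):

-- On every node except the last one:
alter diskgroup DG_NAME dismount;

-- On the last node that still has it mounted:
drop diskgroup DG_NAME including contents;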

I checked with kfod and kfed, and also using asmcmd lsdsk -g --candidate; the disks were there, available to be used in another diskgroup, but they were not part of any diskgroup.
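The checks were along these lines (a minimal sketch; the device path is one of the multipath devices shown further below):

# Candidate disks as seen by ASM across the cluster
asmcmd lsdsk -g --candidate

# Disk discovery outside of ASM
kfod disks=all

# Header status of a specific device; FORMER is expected after the drop
kfed read /dev/mapper/mp_db5_redos2p1 | grep kfdhdb.hdrsts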

I tried to check with asmcmd lsod, but it did not list the disks I was after, while with lsof and fuser I was able to see the process holding them: the ASM process PPA7.
*The disks I was looking for were from db5, which do not show up in the lsod output below.*

ASMCMD> lsod --process ppa
Instance Process OSPID Path
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_data1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_data3p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_flash1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_redos1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_redos6p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_redos7p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db2_data1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db2_data3p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db2_flash1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db2_redos1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db2_redos2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db3_data1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db3_data6p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db3_flash1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db3_redos1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db3_redos2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ocrvoting2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ocrvoting5p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_data1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_data2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_flash1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_redo1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_redo2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_redo3p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_redo4p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_data1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_data6p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_flash1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_flash2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_redos1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_redos2p1

[oracrs@server01 ~]$ fuser /dev/mapper/mp_db5_redos2p1
/dev/dm-99: 38908
[oracrs@server01 ~]$ lsof /dev/mapper/mp_db5_redos2p1
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
asm_ppa7_ 38908 oracrs 270u BLK 253,99 0t0 70663 /dev/mapper/../dm-99

Checking MOS, I found some notes like Database Process Hold Locks Dropped ASM Disk (Doc ID 2159941.1), but they still would not help me, as there was no workaround for the issue.

So basically I needed to kill the process, but could I? It was a production system, and killing ASM processes is not something one should take lightly.

In the documentation, the PPA process is listed under the Parallel Query Slave Process, which made my life easier, right? Not really.

How could I make sure that killing the process would not bring the instance down?

There is a script and blog post from Tanel Poder which also helped me, and it is worth reading:

Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0

SYS@+ASM1> SELECT indx, ksuprpnm, TO_CHAR(ksuprflg,'XXXXXXXXXXXXXXXX')
             FROM x$ksupr
            WHERE BITAND(ksuprflg,4) = 4
            ORDER BY indx;

INDX KSUPRPNM TO_CHAR(KSUPRFLG,
---------- ------------------------------------------------ -----------------
2 oracle@server01 (PMON) E
3 oracle@server01 (CLMN) E
4 oracle@server01 (PSP0) 6
5 oracle@server01 (VKTM) 6
6 oracle@server01 (GEN0) 6
7 oracle@server01 (MMAN) 6
13 oracle@server01 (PMAN) 6
15 oracle@server01 (LMON) 6
16 oracle@server01 (LMD0) 6
17 oracle@server01 (LMS0) 6
20 oracle@server01 (LCK1) 6
21 oracle@server01 (DBW0) 6
22 oracle@server01 (LGWR) 6
23 oracle@server01 (CKPT) 6
24 oracle@server01 (SMON) 6
25 oracle@server01 (LREG) 6
27 oracle@server01 (RBAL) 6
28 oracle@server01 (GMON) 6
31 oracle@server01 (IMR0) 6
32 oracle@server01 (LCK0) 6
38 oracle@server01 (ASMB) 6

Cutting the story short: as my process was not in the query output, I did some other testing and was comfortable enough to kill it and release the disks to the SA team.
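The final step was roughly this (a minimal sketch; the OSPID is the one reported by lsod/fuser above):

# Confirm the same OSPID still holds the device, then terminate the PPA7 slave
fuser /dev/mapper/mp_db5_redos2p1
kill 38908

# Re-check: the device should no longer be held
fuser /dev/mapper/mp_db5_redos2p1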

Hope it helps

Elisson Almeida

Exadata: Cell Server Crashing on ORA-00600: [LinuxBlockIO::reap]

Hi all,

So I started facing this in a client environment. Here is the alert message:

Target name=db01cel08.xxx.com
Message=ORA-00600: internal error code, arguments: [LinuxBlockIO::reap], [0x60000D502388], [], [], [], [], [], [], [], [], [], []
Event reported time=Dec 19, 2019 2:14:16 AM EDT

When checking on the cell server, I saw these messages:

[root@db01 ~]# ssh db01cel08
Last login: Thu Dec 19 04:45:13 2019 from db01.xxx.com
[root@db01cel08 ~]# cellcli
CellCLI: Release 12.1.2.3.5 - Production on Fri Dec 19 17:13:31 EDT 2019

Copyright (c) 2007, 2016, Oracle. All rights reserved.

CellCLI> LIST ALERTHISTORY detail

[...]

name: 10
alertDescription: "ORA-07445: exception encountered: core dump [__intel_new_memset()+62] [11] [0x000000000] [] [] []"
alertMessage: "ORA-07445: exception encountered: core dump [__intel_new_memset()+62] [11] [0x000000000] [] [] []"
alertSequenceID: 10
alertShortName: ADR
alertType: Stateless
beginTime: 2019-12-19T02:00:04-04:00
endTime:
examinedBy:
notificationState: 1
sequenceBeginTime: 2019-12-19T02:00:04-04:00
severity: critical
alertAction: "Errors in file /opt/oracle/cell/log/diag/asm/cell/SYS_112331_170406/trace/cellofltrc_19796_53.trc (incident=25). Diagnostic package is attached. It is also accessible at https://db01cel08.xxx.com/diagpack/download?name=db01cel08_2019_12_19T02_00_04_10.tar.bz2 It will be retained on the storage server for 7 days. If the diagnostic package has expired, then it can be re-created at https://db01cel08.xxx.com/diagpack"

name: 11
alertDescription: "ORA-00600: internal error code, arguments: [LinuxBlockIO::reap], [0x60000D502388], [], [], [], [], [], [], [], [], [], []"
alertMessage: "ORA-00600: internal error code, arguments: [LinuxBlockIO::reap], [0x60000D502388], [], [], [], [], [], [], [], [], [], []"
alertSequenceID: 11
alertShortName: ADR
alertType: Stateless
beginTime: 2019-12-19T02:00:04-04:00
endTime:
examinedBy:
notificationState: 1
sequenceBeginTime: 2019-12-19T02:00:04-04:00
severity: critical
alertAction: "Errors in file /opt/oracle/cell/log/diag/asm/cell/db01cel08/trace/svtrc_9174_12.trc (incident=25). Diagnostic package is attached. It is also accessible at https://db01cel08.xxxx.com/diagpack/download?name=jdb01cel08_2019_12_19T02_00_04_11.tar.bz2 It will be retained on the storage server for 7 days. If the diagnostic package has expired, then it can be re-created at https://db01cel08.xxx.com/diagpack"

name: 12_1
alertDescription: "A SQL PLAN quarantine has been added"
alertMessage: "A SQL PLAN quarantine has been added. As a result, Smart Scan is disabled for SQL statements with the quarantined SQL plan. Quarantine id : 21 Quarantine type : SQL PLAN Quarantine reason : Crash Quarantine Plan : SYSTEM Quarantine Mode : FULL_Quarantine DB Unique Name : XPTODB Incident id : 25 SQLID : 8j0az9sgxs5yh SQL Plan details : {SQL_PLAN_HASH_VALUE=281152830, PLAN_LINE_ID=9} In addition, the following disk region has been quarantined, and Smart Scan will be disabled for this region: Disk Region : {Grid Disk Name=Unknown, offset=186750337024, size=1M} "
alertSequenceID: 12
alertShortName: Software
alertType: Stateful
beginTime: 2019-12-19T02:00:12-04:00
examinedBy:
metricObjectName: QUARANTINE/21
notificationState: 1
sequenceBeginTime: 2019-12-19T02:00:12-04:00
severity: critical
alertAction: "A SQL statement caused the Cell Server (CELLSRV) service on the cell to crash. A SQL PLAN quarantine has been created to prevent the same SQL statement from causing the same cell to crash. When possible, disable offload for the SQL statement or apply the RDBMS patch that fixes the crash, then remove the quarantine with the following CellCLI command: CellCLI> drop quarantine 21 All quarantines are automatically removed when a cell is patched or upgraded. For information about how to disable offload for the SQL statement, refer to the section about 'SQL Processing Offload' in Oracle Exadata Storage Server User's Guide. Diagnostic package is attached. It is also accessible at https://db01cel08.xxx.com/diagpack/download?name=db01cel08_2019_12_19T02_00_12_12_1.tar.bz2 It will be retained on the storage server for 7 days. If the diagnostic package has expired, then it can be re-created at https://db01cel08.xxx.com/diagpack"

CellCLI>

After some research, we could match the situation to Bug 13245134 – Query may fail with errors ORA-27618, ORA-27603, ORA-27626 or ORA-00600[linuxblockio::reap_1] or ora-600 [cacheput::process_1]

It’s also described in: Exadata/SuperCluster: 11.2 databases missing fix for the bug 13245134 may lead to cell service crash with ora-600 [linuxblockio::reap_1]/ora-600 [cacheput::process_1] or ORA-27626: Exadata error: 242/Smart scan issues on the RDBMS side.

In order to resolve the crashes quickly, I applied patch 13245134 online to the database home.
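As a rough sketch, an online (hot) OPatch apply looks like this (the SID and node name below are placeholders, assuming OS authentication and that the patch was already unzipped to the current directory):

# Run as the oracle user from the unzipped patch 13245134 directory
$ORACLE_HOME/OPatch/opatch apply online -connectString xptodb1:::db01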

After applying it, everything was solved:

[oracle@db01 ~]$ /oracle/xptodb/product/11.2.0.4/OPatch/opatch lsinventory -OH $ORACLE_HOME | grep 13245134
Patch (online) 13245134: applied on Thu Dec 19 23:34:50 EST 2019
13245134
[oracle@db01 ~]$
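On the cell side, the crash also left the SQL PLAN quarantine behind (alert 12_1 above). As the alert action itself suggests, once the fix is in place it can be reviewed and removed with CellCLI (a minimal sketch; quarantine 21 is the one from the alert):

CellCLI> LIST QUARANTINE DETAIL
CellCLI> DROP QUARANTINE 21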

Hope it helps!