DBCA “Recovery manager failed to restore datafiles”

Hi all,

If this is the fifth blog post with this title that you have opened, don’t close it. READ THIS ONE!

This one is different.

It’s yet another case of Oracle throwing a generic error from DBCA. Since 99% of the time the underlying cause is the same, the blog posts about it all look different but say essentially the same thing, and none of them solves your problem. So, let’s go step by step:

The Error from Client:

[Screenshot: DBCA error dialog]

 

Generic Case (if this is the first blog you open about the subject):

  • Make sure the file $ORACLE_HOME/bin/oracle has its permissions set to 6751 in both homes (ASM and DB). It should look like this:
[oracle@PROD01 bin]$ cd /u01/app/oracle/product/19c/db/bin
[oracle@PROD01 bin]$
[oracle@PROD01 bin]$ ls -ltr oracle*
-rwsr-s--x. 1 oracle asmadmin 441253104 Aug 27 22:29 oracle
  • If you are not sure, set it accordingly:
cd $ORACLE_HOME/bin && chmod 6751 oracle
  • Not yet? Check the ownership and permissions of the disks assigned to ASM:
kfod status=TRUE asm_diskstring='/dev/asm*' disk=ALL
  • After all this, still not working? Go for the atypical case below:
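
If you want to script the first two checks above, here is a minimal sketch. It is only an illustration, not part of any Oracle tooling: the function names perm_is_6751 and check_oracle_binary are made up, and it assumes GNU stat is available. Run it once per home (ASM and DB).

```shell
#!/bin/sh
# Hypothetical helpers; the names are not from any Oracle tool.

# Return 0 when an ls-style permission string corresponds to mode 6751
# (rwsr-s--x). A trailing "." or "+" (SELinux/ACL marker) is accepted.
perm_is_6751() {
    case "$1" in
        -rwsr-s--x|-rwsr-s--x.|-rwsr-s--x+) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the current mode, owner and group of <home>/bin/oracle and say
# whether the mode matches. Read-only: it changes nothing. Assumes GNU stat.
check_oracle_binary() {
    bin="$1/bin/oracle"
    [ -f "$bin" ] || { echo "MISSING $bin"; return 2; }
    set -- $(stat -c '%A %U %G' "$bin")
    if perm_is_6751 "$1"; then
        echo "OK $bin $1 $2:$3"
    else
        echo "FIX $bin $1 $2:$3 (expected -rwsr-s--x)"
        return 1
    fi
}

# Example: check_oracle_binary /u01/app/oracle/product/19c/db
```

A FIX line means the chmod 6751 step above is still needed for that home.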

 

Atypical Case (Exception)

After some struggle and no success, I started validating everything I could. A few words before the silver bullet:

Did you notice I assumed you are using ASM? How come? Well, this error most likely surfaces when DBCA runs an RMAN restore to create the database’s base files and metadata, that is, at the moment the files are written to ASM, since this is the most fragile step involved.

Looking deeper into the installation logs, I could see:

[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: restoring datafile 00001 to +DATA
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: reading from backup piece /ora01/app/oracle/product/19c/db/assistants/dbca/templates/Seed_Database.dfb
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: ORA-19870: error while restoring backup piece /ora01/app/oracle/product/19c/db/assistants/dbca/templates/Seed_Database.dfb
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-19504: failed to create file "+DATA"
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-17502: ksfdcre:4 Failed to create file +DATA
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-15001: diskgroup "DATA" does not exist or is not mounted
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-01017: invalid username/password; logon denied

Bingo, so it’s a password issue?

Well, I’m creating the database and this actually matches with all the chmod 6751 thing…

What then?

Well, after a while going crazy validating passwd files and so on, I noticed something about the oracle user:

[oracle@PROD01 bin]$ id -a
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),10(wheel),203(dba),503(asmadmin),504(asmoper),525(madhoc) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[oracle@PROD01 bin]$ grep oracle /etc/group
wheel:x:10:oracle
asmadmin:x:503:oracle
asmoper:x:504:oracle
madhoc:x:525:oracle

Can you see that, according to id, the oracle user is a member of the oinstall group, but it is not listed under that group in /etc/group? The same goes for the OS group dba.
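
A quick way to spot this kind of mismatch is to compare the groups you expect against the member lists in the group file. The sketch below is just that, a sketch: missing_group_entries is a made-up helper, and the group names passed to it are the ones from this server. Keep in mind that a user’s primary group normally does not list the user in /etc/group, so a hit on oinstall alone is not unusual by itself.

```shell
#!/bin/sh
# Hypothetical helper: print every group from the argument list in which
# USER is not explicitly listed as a member (4th field) in the group file.
# Usage: missing_group_entries USER GROUP_FILE GROUP...
missing_group_entries() {
    user="$1"; file="$2"; shift 2
    for grp in "$@"; do
        awk -F: -v u="$user" -v g="$grp" '
            $1 == g {
                n = split($4, m, ",")
                for (i = 1; i <= n; i++) if (m[i] == u) found = 1
            }
            END { exit found ? 0 : 1 }
        ' "$file" || echo "$grp"
    done
}

# Example: missing_group_entries oracle /etc/group oinstall dba asmadmin asmoper
```

With the /etc/group content shown above, the example call reports oinstall and dba.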

Well, let’s force it?

[oracle@PROD01 bin]$ sudo su -
Last login: Fri Aug 28 14:13:21 PDT 2020 on pts/3
[root@PROD01 ~]# usermod -g oinstall -G oinstall,dba,asmadmin,asmoper,madhoc oracle
[root@PROD01 ~]# id oracle
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),10(wheel),203(dba),503(asmadmin),504(asmoper),525(madhoc)
[root@PROD01 ~]# grep oracle /etc/group
wheel:x:10:oracle
dba:x:203:oracle
asmadmin:x:503:oracle
asmoper:x:504:oracle
oinstall:x:501:oracle
madhoc:x:525:oracle
[root@PROD01 ~]#

Well done!

Now try running DBCA again. That was a very tricky issue to find.

Know something else? While writing this post I decided to have another look and ended up finding this MOS note: “ORA-17502 / ORA-01017: invalid username/password; logon denied” While Creating 19c Database (Doc ID 2545858.1). There is a bug filed for it: BUG 29821687 – ORA-17502 / ORA-01017: INVALID USERNAME/PASSWORD; LOGON DENIED WHILE CREATING 19C DATABASE

You have the workaround already though. Go champs!

Hope it helps you, cheers!

ASM: Process PPAx holds disk from being dropped

So I was trying to remove some disks that were no longer needed from a cluster. First I dismounted the diskgroups on the other cluster nodes, and from the last node I executed DROP DISKGROUP DG_NAME INCLUDING CONTENTS. Nothing new, right? But the SA told me that the disks were still being accessed.

I checked with kfod and kfed, and also with asmcmd lsdsk -g --candidate: the disks were there, available to be used in another diskgroup, and they were not part of any diskgroup.

I tried to check using asmcmd lsod, but it did not return anything for those disks. Using lsof and fuser, however, I was able to see the process holding them, which was an ASM process, PPA7:
*the disks I was looking for were the ones from db5*

ASMCMD> lsod --process ppa
Instance Process OSPID Path
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_data1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_data3p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_flash1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_redos1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_redos6p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db1_redos7p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db2_data1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db2_data3p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db2_flash1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db2_redos1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db2_redos2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db3_data1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db3_data6p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db3_flash1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db3_redos1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db3_redos2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ocrvoting2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ocrvoting5p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_data1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_data2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_flash1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_redo1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_redo2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_redo3p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_ora_redo4p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_data1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_data6p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_flash1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_flash2p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_redos1p1
1 oracle@server01 (PPA7) 38908 /dev/mapper/mp_db4_redos2p1

[oracrs@server01 ~]$ fuser /dev/mapper/mp_db5_redos2p1
/dev/dm-99: 38908
[oracrs@server01 ~]$ lsof /dev/mapper/mp_db5_redos2p1
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
asm_ppa7_ 38908 oracrs 270u BLK 253,99 0t0 70663 /dev/mapper/../dm-99
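
When several devices are involved, checking them one by one gets tedious. Here is a small sketch that loops over a list of devices and prints who is holding each one. The helper names lsof_holders and report_holders are hypothetical, and it assumes lsof is installed and that you run it with enough privileges to see the ASM processes.

```shell
#!/bin/sh
# Hypothetical helper: read `lsof <device>` output on stdin and print
# "PID COMMAND USER" once per holder, skipping the header line.
lsof_holders() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2, $1, $3 }' | sort -u
}

# Hypothetical helper: run lsof against each device given as an argument.
report_holders() {
    for dev in "$@"; do
        echo "== $dev"
        lsof "$dev" 2>/dev/null | lsof_holders
    done
}

# Example: report_holders /dev/mapper/mp_db5_redos2p1
```

A device with no lines under its header is not held open by any process visible to the current user.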

Checking MOS I found some notes, like Database Process Hold Locks Dropped ASM Disk (Doc ID 2159941.1), but they did not help me, as they offered no workaround for the issue.

So basically I needed to kill the process, but could I? It was a production system, and killing ASM processes is not something to take lightly.

In the documentation, the PPA process is listed under Parallel Query Slave Process, which made my life easier, right? Not really.

How to make sure that killing the process would not bring the instance down?

There is a script and a blog post from Tanel Poder which helped me here, and it is well worth reading. As I understand it from his post, the query below lists the background processes whose loss would take the instance down:

Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0

SYS@+ASM1>SELECT indx,ksuprpnm,TO_CHAR(ksuprflg,'XXXXXXXXXXXXXXXX')
FROM x$ksupr
WHERE BITAND(ksuprflg,4) = 4 ORDER BY indx
/

INDX KSUPRPNM TO_CHAR(KSUPRFLG,
---------- ------------------------------------------------ -----------------
2 oracle@server01 (PMON) E
3 oracle@server01 (CLMN) E
4 oracle@server01 (PSP0) 6
5 oracle@server01 (VKTM) 6
6 oracle@server01 (GEN0) 6
7 oracle@server01 (MMAN) 6
13 oracle@server01 (PMAN) 6
15 oracle@server01 (LMON) 6
16 oracle@server01 (LMD0) 6
17 oracle@server01 (LMS0) 6
20 oracle@server01 (LCK1) 6
21 oracle@server01 (DBW0) 6
22 oracle@server01 (LGWR) 6
23 oracle@server01 (CKPT) 6
24 oracle@server01 (SMON) 6
25 oracle@server01 (LREG) 6
27 oracle@server01 (RBAL) 6
28 oracle@server01 (GMON) 6
31 oracle@server01 (IMR0) 6
32 oracle@server01 (LCK0) 6
38 oracle@server01 (ASMB) 6

Cutting the story short: as my process was not in the query output, I did some other testing until I was comfortable enough to kill it and release the disks to the SA team.
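
To read the query output above: the query keeps only the rows where the value 4 is set in ksuprflg, which is why every flag shown (E and 6, in hexadecimal) passes the BITAND test. The sketch below reproduces that test and checks whether a given process name shows up in a saved copy of the output. Both helper names are made up, and a process being absent from the list is only one indication, not a guarantee that killing it is safe.

```shell
#!/bin/sh
# Hypothetical helper: true when the hexadecimal flag value (e.g. E or 6)
# has the value 4 set, mirroring BITAND(ksuprflg,4) = 4 from the query.
has_flag_4() {
    [ $(( 0x$1 & 4 )) -ne 0 ]
}

# Hypothetical helper: read the query output on stdin and succeed only when
# the given process name (e.g. PPA7) is NOT among the listed rows.
not_in_list() {
    if grep -F -q "($1)"; then
        return 1
    fi
    return 0
}

# Example, with the query output saved to a file named ksupr.out (assumed):
#   not_in_list PPA7 < ksupr.out && echo "PPA7 is not in the list"
```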

Hope it helps

Elisson Almeida

Orphan ASM File Cleanup Script

Hi all,

So I was asked by a client to check ASM for orphan files, since databases are created and dropped frequently in this environment, it being a development one.

Also, lots of databases shared the same data diskgroup, so I had to work this out for all databases, including ones that possibly no longer exist.

Some basic approaches I raised:

1) Locating uncataloged files in ASM per database.
– Source: https://oraganism.wordpress.com/2012/09/09/orphaned-files-in-asm/
– This approach assumes that files in ASM not cataloged in any database are the orphaned ones, which is a fair assumption.
– But I understand that files can be cataloged yet unmounted, which would break this approach.

2) Listing files in ASM but not in database (v$datafile, v$datafile_copy, v$controlfile, v$tempfile, v$logfile) by database.
– Source: https://oracledba.blogspot.com/2018/11/orphaned-files-in-asm.html
– This seems a fair assumption. It would need to be run from each database.
– There is no guarantee that it works properly.
– It is also not clear whether PDB files are included.
– There is another similar one: https://dbaliveblog.wordpress.com/asm-orphaned-file-identification-script/
– Also this one: https://anjo.pt/wp/keyword-oracle/2013/02/26/find-orphan-asm-files/

3) MOS: Query That Can Be Used to Find Orphaned Datafiles on a 12c ASM Instance (Doc ID 2228573.1)
– From MOS, seems the most recommended approach.
– Attention point: PDB$SEED may not be shown as per: PDB$SEED Datafiles Not Appear In CDB_DATA_FILES (Doc ID 1940806.1)
— On 12.1.0.2, recommended to use “EXCLUDE_SEED_CDB_VIEW”. To check if it can be done on session level.

I downloaded the script from the MOS note Script to report the list of files stored in ASM and CURRENTLY NOT OPENED (Doc ID 552082.1) and ran it on the environment, but the results didn’t look correct.

After a while, I ended up building my own script based on all the approaches mentioned, and it worked just fine.

Once approved, I dropped all the listed files, freeing several TB of space with no database affected. So I’d consider it correct and would really recommend it to you.

So, what did I use?

SQL to Check ASM Space per Database:

set pages 350 timing on
col gname form a10
col dbname form a10
col file_type form a16
break on gname skip 2 on dbname skip 1
compute sum label total_db of gb on dbname
compute sum label total_diskg of gb on gname  
SELECT
    gname,
    dbname,
    file_type,
    round(SUM(space)/1024/1024) mb,
    round(SUM(space)/1024/1024/1024) gb,
    COUNT(*) "#FILES"
FROM
    (
        SELECT
            gname,
            regexp_substr(full_alias_path, '[[:alnum:]_]*',1,4) dbname,
            file_type,
            space,
            aname,
            system_created,
            alias_directory
        FROM
            (
                SELECT
                    concat('+'||gname, sys_connect_by_path(aname, '/')) full_alias_path,
                    system_created,
                    alias_directory,
                    file_type,
                    space,
                    level,
                    gname,
                    aname
                FROM
                    (
                        SELECT
                            b.name            gname,
                            a.parent_index    pindex,
                            a.name            aname,
                            a.reference_index rindex ,
                            a.system_created,
                            a.alias_directory,
                            c.type file_type,
                            c.space
                        FROM
                            v$asm_alias a,
                            v$asm_diskgroup b,
                            v$asm_file c
                        WHERE
                            a.group_number = b.group_number
                        AND a.group_number = c.group_number(+)
                        AND a.file_number = c.file_number(+)
                        AND a.file_incarnation = c.incarnation(+) ) START WITH (mod(pindex, power(2, 24))) = 0
                AND rindex IN
                    (
                        SELECT
                            a.reference_index
                        FROM
                            v$asm_alias a,
                            v$asm_diskgroup b
                        WHERE
                            a.group_number = b.group_number
                        AND (
                                mod(a.parent_index, power(2, 24))) = 0
                    ) CONNECT BY prior rindex = pindex )
        WHERE
            NOT file_type IS NULL
            and system_created = 'Y' )
GROUP BY
    gname,
    dbname,
    file_type
ORDER BY
    gname,
    dbname,
    file_type
/

Expected Output:

SQL> @asm_sizebydb

GNAME	   DBNAME     FILE_TYPE 	       MB	  GB	 #FILES
---------- ---------- ---------------- ---------- ---------- ----------
DATAC1	   DATABSE1   CONTROLFILE	     2316	   2	      1
		      DATAFILE		  7620756	7442	     49
		      DATAGUARDCONFIG	       16	   0	      2
		      ONLINELOG 	    82536	  81	     14
		      PARAMETERFILE		8	   0	      1
	   **********				  ----------
	   total_db					7525

	   DATABSE2   CONTROLFILE	      492	   0	      1
		      DATAFILE		  3081604	3009	     47
		      ONLINELOG 	      416	   0	      4
		      PARAMETERFILE	       16	   0	      2
		      PASSWORD			0	   0	      2
		      TEMPFILE		    83372	  81	      3
	   **********				  ----------
	   total_db					3090

	   DATABSE3   CONTROLFILE	      588	   1	      1
		      DATAFILE		  1430712	1397	      8
		      DATAGUARDCONFIG	       16	   0	      2
		      ONLINELOG 	   147816	 144	     18
		      PARAMETERFILE		8	   0	      1
	   **********				  ----------
	   total_db					1542
[...]

**********					  ----------
total_disk					       76868

SQL To list Orphan files per Database:

SET VERIFY OFF
SET LINESIZE 200
SET SERVEROUTPUT ON
SET PAGESIZE 50000

DECLARE
   cmd   CLOB;
BEGIN
   FOR c IN (SELECT name Diskgroup
               FROM V$ASM_DISKGROUP)
   LOOP
      FOR l
         IN (SELECT 'rm ' || files files
               FROM
                    (SELECT '+' || c.Diskgroup || files files, TYPE
                       FROM (    SELECT UPPER
                                        (
                                           SYS_CONNECT_BY_PATH (aa.name, '/')
                                        )
                                           files
                                      , aa.reference_index
                                      , b.TYPE
                                   FROM (SELECT file_number
                                              , alias_directory
                                              , name
                                              , reference_index
                                              , parent_index
                                           FROM v$asm_alias) aa
                                      , (SELECT parent_index
                                           FROM (SELECT parent_index
                                                   FROM v$asm_alias
                                                  WHERE     group_number =
                                                               (SELECT group_number
                                                                  FROM v$asm_diskgroup
                                                                 WHERE name =
                                                                          c.Diskgroup)
                                                        AND alias_index = 0)) a
                                      , (SELECT file_number, TYPE
                                           FROM (SELECT file_number, TYPE
                                                   FROM v$asm_file
                                                  WHERE group_number =
                                                           (SELECT group_number
                                                              FROM v$asm_diskgroup
                                                             WHERE name =
                                                                      c.Diskgroup)))
                                        b
                                  WHERE     aa.file_number = b.file_number(+)
                                        AND aa.alias_directory = 'N'
                                        AND b.TYPE IN
                                               ('DATAFILE'
                                              , 'ONLINELOG'
                                              , 'CONTROLFILE'
                                              , 'TEMPFILE')
                             START WITH aa.PARENT_INDEX = a.parent_index
                             CONNECT BY PRIOR aa.reference_index =
                                           aa.parent_index)
                      WHERE SUBSTR
                            (
                               files
                             , INSTR (files, '/', 1, 1)
                             ,   INSTR (files, '/', 1, 2)
                               - INSTR (files, '/', 1, 1)
                               + 1
                            ) =
                               (SELECT dbname
                                  FROM (SELECT    '/'
                                               || UPPER (db_unique_name)
                                               || '/'
                                                  dbname
                                          FROM v$database))
                     MINUS
                     (SELECT UPPER (name) files, 'DATAFILE' TYPE
                        FROM v$datafile
                      UNION ALL
                      SELECT UPPER (name) files, 'TEMPFILE' TYPE
                        FROM v$tempfile
                      UNION ALL
                      SELECT UPPER (name) files, 'CONTROLFILE' TYPE
                        FROM v$controlfile
                       WHERE name LIKE '+' || c.Diskgroup || '%'
                      UNION ALL
                      SELECT UPPER (name) files, 'DATAFILE' TYPE
                        FROM v$datafile_copy
                       WHERE deleted = 'NO'
                      UNION ALL
                      SELECT UPPER (MEMBER) files, 'ONLINELOG' TYPE
                        FROM v$logfile
                       WHERE MEMBER LIKE '+' || c.Diskgroup || '%')))
      LOOP
         DBMS_OUTPUT.put_line (l.files);
      END LOOP;
   END LOOP;
END;
/

Expected Output:

rm +DATA/XPTODB/CONTROLFILE/CURRENT.4928.955985765
rm +DATA/XPTODB/CONTROLFILE/CURRENT.4934.955986589
rm +DATA/XPTODB/CONTROLFILE/CURRENT.4962.955998825
rm +DATA/XPTODB/CONTROLFILE/CURRENT.5063.956480113
rm +DATA/XPTODB/CONTROLFILE/CURRENT.6374.955984145
rm +DATA/XPTODB/CONTROLFILE/CURRENT.7547.955968953
rm +DATA/XPTODB/DATAFILE/TBSEXEMPLE.4936.955985803
rm +DATA/XPTODB/DATAFILE/TBSEXEMPLE.4966.955998847
rm +DATA/XPTODB/DATAFILE/TBSEXEMPLE.7540.955968995
rm +DATA/XPTODB/DATAFILE/TBSEXEMPLE.7574.955984177
rm +DATA/XPTODB/DATAFILE/TBSEXEMPLE.4937.955985803
rm +DATA/XPTODB/DATAFILE/TBSEXEMPLE.4967.955998847
rm +DATA/XPTODB/DATAFILE/TBSEXEMPLE.7542.955968995
rm +DATA/XPTODB/DATAFILE/TBSEXEMPLE.7558.955984177
rm +DATA/XPTODB/DATAFILE/SYSAUX.4935.955986599
rm +DATA/XPTODB/DATAFILE/SYSAUX.4963.955998847
rm +DATA/XPTODB/DATAFILE/SYSAUX.6286.955984161
rm +DATA/XPTODB/DATAFILE/SYSAUX.7544.955968963
rm +DATA/XPTODB/DATAFILE/SYSTEM.4930.955986599
rm +DATA/XPTODB/DATAFILE/SYSTEM.4964.955998847
rm +DATA/XPTODB/DATAFILE/SYSTEM.7536.955968965
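
Before running any of these rm commands in asmcmd, it is worth looking at each file first. One hedged way to do it: turn the generated list into ls -l commands and review what comes back. The helper name rm_to_ls and the spool file name orphans.lst below are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: read the generated "rm +DG/..." lines on stdin and
# emit the matching "ls -l +DG/..." lines, so each file can be inspected in
# asmcmd first. Lines that do not look like "rm +<diskgroup>/<path>" (for
# example SQL*Plus feedback) are dropped.
rm_to_ls() {
    sed -n 's/^rm \(+[A-Za-z0-9_]*\/..*\)$/ls -l \1/p'
}

# Example, assuming the script output was spooled to orphans.lst:
#   rm_to_ls < orphans.lst | while read -r cmd; do asmcmd $cmd; done
```

Only once the listing looks right (file type, creation date, owning database) would I feed the original rm lines to asmcmd.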

To run this for all databases on server (RAC Databases):

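export ORAENV_ASK=NO
# One iteration per running instance, taken from its ora_pmon_<SID> process
for DBSID in `ps -ef | grep ora_pmon | grep -v grep | awk -F_ '{ print $3}'`
do
echo "######" ${DBSID}
export ORACLE_SID=${DBSID}
. oraenv
# Feed the script through a here-document so sqlplus runs it and then exits
sqlplus -s / as sysdba <<EOF
@script.sql
exit
EOF
done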

Hope it helps you!

Waiting for ASM to startup after upgrading to Oracle 18c

Hi,
Something strange happened to me and I would like to share it with you. I was upgrading one of my virtual machines, an Oracle Restart (SIHA) setup running 11.2.0.4 on Oracle Linux 6.9, to 18c.

The upgrade went fine using the GUI, with no issues or strange log messages, but the problem appeared when I rebooted the VM. For some reason (which I didn’t know at the time) the boot hung while trying to start HAS and then printed the error below:

PRCR-1070 :Failed to check if resource ora.asm is registered
CRS-0184 : Cannot communicate with the CRS daemon
Waiting for ASM to startup

Searching MOS, I found this note: Oracle Linux 6 server hangs on boot with error: Waiting for ASM to startup (Doc ID 2495023.1)

The boot was hanging on acfssihamount, which could not start because ASM was not up yet.

To fix it, you need to boot the OS in rescue mode and execute:

chroot /mnt/sysimage

Once you do that, you will be able to change the startup configuration using chkconfig and disable acfssihamount:

chkconfig --list acfssihamount
chkconfig acfssihamount off
chkconfig --list acfssihamount

After this, exit rescue mode and the system should boot normally.
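
If you want to double check the result before leaving rescue mode, the output of chkconfig --list can be tested from the shell. A minimal sketch, with a made-up helper name (service_is_off):

```shell
#!/bin/sh
# Hypothetical helper: read `chkconfig --list <service>` output on stdin and
# succeed only when the service is listed and is off in every runlevel.
service_is_off() {
    awk '{ for (i = 2; i <= NF; i++) if ($i ~ /:on$/) bad = 1 }
         END { exit (NR == 0 || bad) ? 1 : 0 }'
}

# Example:
#   chkconfig --list acfssihamount | service_is_off && echo "acfssihamount is disabled"
```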

Hope it helps!

Elisson Almeida

Starting ASM: ORA-29701: unable to connect to Cluster Synchronization Service

Hey all,
So, I bet you have seen this error already, as it is quite common when messing with the cluster configuration, which DBAs love to do… no?

Well, here is what you may be facing:

SQL> startup
ORA-01078: failure in processing system parameters
ORA-29701: unable to connect to Cluster Synchronization Service
SQL>

The error is fairly clear: the Cluster Synchronization Service (CSS) is not available. So, let’s start the cluster resources (or HAS, on a standalone setup). For a cluster:

$GRID_HOME/bin/crsctl start resource -all

Or, for Standalone:

$GRID_HOME/bin/crsctl start has

To check on status:

$GRID_HOME/bin/crsctl status resource -t

Complete example (attention to CSSD):

[root@greporasrv1 ~]# crsctl start has
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
[root@greporasrv1 ~]# crsctl start resource -all
CRS-5702: Resource 'ora.evmd' is already running on 'greporasrv1'
CRS-2501: Resource 'ora.ons' is disabled
CRS-2672: Attempting to start 'ora.cssd' on 'greporasrv1'
CRS-2672: Attempting to start 'ora.diskmon' on 'greporasrv1'
CRS-2676: Start of 'ora.diskmon' on 'greporasrv1' succeeded
CRS-2676: Start of 'ora.cssd' on 'greporasrv1' succeeded
CRS-4000: Command Start failed, or completed with errors.
[root@greporasrv1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ons
OFFLINE OFFLINE greporasrv1 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
1 ONLINE ONLINE greporasrv1 STABLE
ora.diskmon
1 OFFLINE OFFLINE STABLE
ora.evmd
1 ONLINE ONLINE greporasrv1 STABLE
--------------------------------------------------------------------------------
[root@greporasrv1 ~]#
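
To check the CSSD state from a script instead of reading the table by eye, something like the sketch below can help. resource_state is a hypothetical helper that only parses the text printed by crsctl status resource -t, assuming the layout shown above (resource name on its own line, followed by the line with target and state).

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl status resource -t` output on stdin and
# print the STATE column (second status word) of the first instance of the
# named resource, e.g. ONLINE or OFFLINE. Prints nothing if it is not found.
resource_state() {
    awk -v res="$1" '
        NF == 1 && $1 == res { want = 1; next }
        want {
            n = 0
            for (i = 1; i <= NF; i++)
                if ($i ~ /^(ONLINE|OFFLINE|INTERMEDIATE|UNKNOWN)$/ && ++n == 2) {
                    print $i
                    exit
                }
            want = 0
        }
    '
}

# Example:
#   $GRID_HOME/bin/crsctl status resource -t | resource_state ora.cssd
```

Anything other than ONLINE for ora.cssd means ASM will keep failing with ORA-29701.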

Hope that worked! 😀

Oh, it didn’t? Did you change the hostname or something? In this case, you may want to deconfigure HAS and reconfigure it using root.sh (part of a regular installation):

cd $ORACLE_HOME
./crs/install/roothas.pl -deconfig -force
./crs/install/roothas.pl -delete -force
./root.sh

 

Hey! Be careful with that, it might be unrecoverable. 😉


									

GRID Infrastructure life after rootcrs.pl -deconfig -force -verbose -lastnode

Hi all,

I had a client who asked us to reconfigure the Grid Infrastructure on 11g after they ran:

$GI_HOME/crs/install/rootcrs.pl -deconfig -force -verbose -lastnode

The “deconfig” option is used when we need to remove the GI configuration cleanly, and “lastnode” is specified when running it on the last cluster node.

But what do we need to do to recreate the cluster? Well, most would say “run root.sh again”, and that should solve it in most cases.

But when I tried to execute it, I hit several issues with the crsconfig_params file. I tried to add the missing data manually, but there was a lot of information to add. So what to do next?

A colleague pointed to 2 MOS notes:

How to Deconfigure/Reconfigure(Rebuild OCR) or Deinstall Grid Infrastructure (Doc ID 1377349.1)
How to Configure or Re-configure Grid Infrastructure With config.sh/config.bat (Doc ID 1354258.1)

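So, following those notes, you should prepare a response file, run config.sh to create a proper crsconfig_params, and then run root.sh. The output below shows my failed root.sh attempt first, followed by the config.sh run and the root.sh runs that succeeded: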

$GI_HOME/root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /app/11.2.0.4/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /app/11.2.0.4/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'greporarac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'greporarac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'greporarac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'greporarac1'
CRS-2676: Start of 'ora.diskmon' on 'greporarac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'greporarac1' succeeded
PROT-1: Failed to initialize ocrconfig
PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup

Failed to create Oracle Cluster Registry configuration, rc 255
Oracle Grid Infrastructure Repository configuration failed at /app/11.2.0.4/grid/crs/install/crsconfig_lib.pm line 6911.
/app/11.2.0.4/grid/perl/bin/perl -I/app/11.2.0.4/grid/perl/lib -I/app/11.2.0.4/grid/crs/install /app/11.2.0.4/grid/crs/install/rootcrs.pl execution failed
$GI_HOME/crs/config/config.sh -silent -responseFile $GI_HOME/crs/config/grid_configwizard_1.rsp -ignorePreReq

As a root user, execute the following script(s):
1. /app/11.2.0.4/grid/root.sh

Execute /app/11.2.0.4/grid/root.sh on the following nodes:
[rac1, rac1]

Successfully Setup Software.
[WARNING] [INS-32091] Software installation was successful. But some configuration assistants failed, were cancelled or skipped.
ACTION: Refer to the logs or contact Oracle Support Services.
oracle@rac1:/app/11.2.0.4/grid/crs/config>


root@rac1 /app/11.2.0.4/grid/root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /app/11.2.0.4/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /app/11.2.0.4/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to oracle-ohasd.conf

CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded

ASM created and started successfully.

Disk Group CRS created successfully.

clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Successful addition of voting disk 04713db813e14f5abf6b385896b1ca1d.
Successful addition of voting disk 8910f31e58db4f30bfdca40e34a0ffbd.
Successful addition of voting disk 7fefe24a8d4b4fcbbfd2d25a4307a1e0.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 04713db813e14f5abf6b385896b1ca1d (/dev/mapper/ora-pure-ractestwt1-crs-1) [CRS]
2. ONLINE 8910f31e58db4f30bfdca40e34a0ffbd (/dev/mapper/ora-pure-ractestwt1-crs-2) [CRS]
3. ONLINE 7fefe24a8d4b4fcbbfd2d25a4307a1e0 (/dev/mapper/ora-pure-ractestwt1-crs-3) [CRS]
Located 3 voting disk(s).


CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.CRS.dg' on 'rac1'
CRS-2676: Start of 'ora.CRS.dg' on 'rac1' succeeded
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

#1) Respect the privacy of others.
#2) Think before you type.
#3) With great power comes great responsibility.

p_raghu's password on rac1:
Performing root user operation for Oracle 11g

The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /app/11.2.0.4/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /app/11.2.0.4/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to oracle-ohasd.conf

CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node rac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

 

Even Better Script: Map ASM Disks to Physical Devices

Enjoyed last week’s post?

Cool, because while looking further into the subject I found this pretty similar post by Mohammad Nazmul Huda.

The additional script there doesn’t actually work on my server, but the idea is great. So I made a few small adjustments and it’s working just fine now:

# asm_report.sh (Adjusted by Matheus):

printf "\n%-15s %-14s %-11s %-7s\n" "ASM disk" "based on" "Minor,Major" "Size (MB)"
printf "%-15s %-14s %-11s %-7s\n" "===============" "=============" "===========" "========="
export ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome1
# Start the running total at zero so the first expr call below has two operands
Total_size=0
for i in `/usr/sbin/oracleasm listdisks`
do
v_asmdisk=`/usr/sbin/oracleasm querydisk -d $i | awk '{print $2}'| sed 's/\"//g'`
v_minor=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $1}'`
v_major=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $2}'`
v_device=`ls -la /dev | awk -v v_minor="$v_minor," -v v_major=$v_major '{if ( $5==v_minor ) { if ( $6==v_major ) { print $10}}}'`
v_size_bt=`blockdev --getsize64 /dev/${v_device}`
v_size=`expr $v_size_bt / 1024 / 1024`
Total_size=`expr $Total_size + $v_size`
Formated_size=`echo $v_size | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'`
printf "%-15s %-14s %-11s %-7s\n" $v_asmdisk "/dev/$v_device" "[$v_minor $v_major]" $Formated_size
done
Formated_Total_size=`echo $Total_size | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'`
printf "\nTotal (MB): %43s\n\n" $Formated_Total_size

OK, and how does it work?
[root@greporasrv ~]# sh asm_report.sh

ASM disk        based on      Minor,Major Size (MB)
=============== ============= =========== =========
DATA01          /dev/sdg1     [8 97]       255,999
DATA02          /dev/sdh1     [8 113]      255,999
DATA03          /dev/sdi1     [8 129]      255,999
DATA04          /dev/sdj1     [8 145]      255,999
FRA01           /dev/sdk1     [8 161]      307,199

Total (MB): 1,331,195

Even better, right?
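The size handling in the script is the least obvious part, so here is a minimal sketch of just that piece, pulled out into two helper functions. The function names are mine (hypothetical, not part of the original script), and the sed loop assumes GNU sed, as on OEL.

```shell
#!/bin/sh
# Convert a byte count (e.g. from blockdev --getsize64) to whole megabytes.
# Integer division, same as the expr call in the script.
bytes_to_mb() {
    echo $(( $1 / 1024 / 1024 ))
}

# Insert a comma every three digits, working from the right.
# Each pass of the sed loop adds one comma; "ta" jumps back to label "a"
# for as long as the substitution keeps matching.
group_thousands() {
    echo "$1" | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'
}

group_thousands 1331195   # prints 1,331,195
```

Keeping the formatting in a function means the per-disk line and the total line share one definition instead of two copies of the same sed expression.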

Cheers!

CRS-10051: CVU found following errors with Clusterware setup : PRCW-1015 : Wallet % does not exist

Hello all!
So, recently I found this error in a CRS alert log from a client environment. Interesting error…

2018-03-26 16:33:53.277 [SRVM(9624)]CRS-10051: CVU found following errors with Clusterware setup : PRCW-1015 : Wallet greporadb does not exist. 
CLSW-9: The cluster wallet to be operated on does not exist. :[1015]PRCW-1015 : Wallet greporadb does not exist.

I also found it related to this error:

PRVG-1512 : Failed to retrieve current selection of public and private network classifications

So, it was mapped to known Bug 18234669, as described in CRS-10051: CVU Found Following Errors With Clusterware Setup :PRCW-1015 : Wallet ora603ut does not exist (Doc ID 2008466.1).

Ok, but what to do?

1) In case you have the wallet, you can simply add it to the database:

crsctl add wallet -type CVUDB -name [dbname]

2) In case you don’t, you can simply disable the ora.cvu resource, which is the one running this check:

> Checking Status

[root@grepora-srv ~]# crsctl stat res ora.cvu -p | grep CHECK_RESULT
CHECK_RESULTS=PRVG-1512 : Failed to retrieve current selection of public and private network classifications,PRCW-1015 : Wallet greporadb does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRCW-1015 : Wallet greporadb does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRCW-1015 : Wallet c4prod does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRVG-1512 : Failed to retrieve current selection of public and private network classifications,PRCW-1015 : Wallet greporadb does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRCW-1015 : Wallet greporadb does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRCW-1015 : Wallet c4prod does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRVG-1512 : Failed to retrieve current selection of public and private network classifications,PRCW-1015 : Wallet greporadb does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRCW-1015 : Wallet greporadb does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRCW-1015 : Wallet c4prod does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRVG-1512 : Failed to retrieve current selection of public and private network classifications,PRCW-1015 : Wallet greporadb does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRCW-1015 : Wallet greporadb does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015],PRCW-1015 : Wallet greporadb does not exist. ,CLSW-9: The cluster wallet to be operated on does not exist. :[1015]

> Disable CVU:

oracle:grepora-srv: srvctl disable cvu
oracle:grepora-srv:
oracle:grepora-srv: crsctl stat res ora.cvu -p | grep ENABLED
ENABLED=0
oracle:grepora-srv: srvctl status cvu                       
CVU is disabled
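As the CHECK_RESULTS line above repeats the same few messages over and over, a small filter can make it readable before you decide what to do. This is only a sketch: the function name is hypothetical, and it assumes the messages are comma-separated and contain no commas themselves, which holds for the output shown here.

```shell
#!/bin/sh
# Collapse a CHECK_RESULTS line into one row per distinct message, with a count.
# Assumption: messages are separated by commas and contain none themselves.
summarise_check_results() {
    sed 's/^CHECK_RESULTS=//' |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        sed '/^$/d' |
        sort | uniq -c | sort -rn
}

# Usage on a cluster node (not run here):
#   crsctl stat res ora.cvu -p | grep '^CHECK_RESULTS=' | summarise_check_results
```

With the sample above, this should boil the wall of text down to a handful of lines (PRVG-1512, PRCW-1015 per wallet name, and CLSW-9).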

Hope it helps!

Script: Map ASM Disks to Physical Devices

Hey all!
So, I had to map a couple of ASM disks to physical devices. But the mapping isn’t direct, which means some manual work.

To save me from this, I found this great post by Alejandro Vargas, with a very nice script to make this mapping easier.

However, I found it was written for RHEL/OEL 6 and older, and I’m on OEL 7. So I made some small changes to adapt it.

Anyway, I decided to share it, as this is a great script to have handy. 🙂

# Alejandro’s script (RHEL/OEL 6 and older):

#!/bin/ksh
for i in `/etc/init.d/oracleasm listdisks`
do
v_asmdisk=`/etc/init.d/oracleasm querydisk $i | awk  '{print $2}'`
v_minor=`/etc/init.d/oracleasm querydisk $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk '{print $1}'`
v_major=`/etc/init.d/oracleasm querydisk $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk '{print $2}'`
v_device=`ls -la /dev | grep $v_minor | grep $v_major | awk '{print $10}'`
echo "ASM disk $v_asmdisk based on /dev/$v_device  [$v_minor $v_major]"
done

# Adjustments by Matheus (RHEL/OEL7):

#!/bin/ksh
for i in `/usr/sbin/oracleasm listdisks`
do
v_asmdisk=`/usr/sbin/oracleasm querydisk -d $i | awk '{print $2}'`
v_minor=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $1}'`
v_major=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $2}'`
v_device=`ls -la /dev | grep $v_minor | grep $v_major | awk '{print $10}'`
echo "ASM disk $v_asmdisk based on /dev/$v_device [$v_minor $v_major]"
done

# Example of execution:

[root@greporasrv]$ for i in `/usr/sbin/oracleasm listdisks`
> do
> v_asmdisk=`/usr/sbin/oracleasm querydisk -d $i | awk '{print $2}'`
> v_minor=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $1}'`
> v_major=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $2}'`
> v_device=`ls -la /dev | grep $v_minor | grep $v_major | awk '{print $10}'`
> echo "ASM disk $v_asmdisk based on /dev/$v_device [$v_minor $v_major]"
> done
ASM disk "DATA01" based on /dev/sdg1 [8 97]
ASM disk "DATA02" based on /dev/sdh1 [8 113]
ASM disk "DATA03" based on /dev/sdi1 [8 129]
ASM disk "DATA04" based on /dev/sdj1 [8 145]
ASM disk "FRA01" based on /dev/sdk1 [8 161]
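One caveat with the scripts above: filtering `ls -la /dev` with two greps can match more than one line (any entry containing both numbers). A possible alternative, sketched below, is to resolve the device through sysfs instead. The function names are hypothetical, the sample line format is assumed from what the scripts parse, and the pair printed by oracleasm is treated here as major,minor (8 is the usual SCSI disk major).

```shell
#!/bin/sh
# Extract "MAJOR MINOR" from one line of `oracleasm querydisk -d` output.
# Assumed line format: Disk "DATA01" is a valid ASM disk on device [8,97]
parse_dev_numbers() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([0-9]*\), *\([0-9]*\)].*/\1 \2/p'
}

# Resolve a major/minor pair to a kernel device name through sysfs.
# /sys/dev/block/MAJOR:MINOR is a symlink to the device's sysfs directory,
# so the last path component is the device name (e.g. sdg1).
dev_name_for() {
    link="/sys/dev/block/$1:$2"
    [ -e "$link" ] || return 1
    basename "$(readlink -f "$link")"
}

# Usage on an ASMLib host (not run here):
#   for d in $(/usr/sbin/oracleasm listdisks); do
#       set -- $(parse_dev_numbers "$(/usr/sbin/oracleasm querydisk -d "$d")")
#       echo "ASM disk $d based on /dev/$(dev_name_for "$1" "$2") [$1 $2]"
#   done
```

The lookup is exact, so a disk on [8,1] cannot be confused with one on [8,17] or [8,161].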

Hope you enjoy it like I did.
Cheers!

GRID upgrade FREEZES – 11g to 12c

Hey guys,
Upgrading is always a critical and delicate operation, but it’s even harder when you get no feedback on the screen.

I was working on an upgrade, using the GUI to upgrade the GRID from 11g to 12c. The 11gR2 (11.2.0.4) stack was working without issue, and so was ASM (note this point, we will come back to it later on).

When it was time to run rootupgrade.sh, it just got stuck. No matter what, the GRID upgrade to 12c just FROZE. Checking the logs, the last message was only this:

CLSRSC-467: Shutdown of the current Oracle Grid Infrastructure stack has successfully completed.

Looking at the other logs (/u01/app/12.1.0/grid/cfgtoollogs/crsconfig), there were messages related to OCR, saying it could not get the OCR key with CLUUTIL and suggesting to try OCRDUMP. I checked the OCR with ocrdump and ocrcheck. No issues there either. Also, as I said before, the cluster was working without any issues.

As I had no error code or anything that would point to a more specific cause, I went for a broad search on Google and MOS. I saw all kinds of things until I found this MOS note: Wrong DiscoveryString /dev/*: rootupgrade.sh/root.sh hangs: Check OCR key using ocrdump (Doc ID 1916106.1).

I checked, and my ASM disk discovery string was set to /dev/*, which did not strike me as an issue since, as I mentioned, it was working… BUT when I changed the discovery string in ASM to /dev/asm-* the upgrade worked like a charm.
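If you want to check for this before starting an upgrade, here is a minimal sketch of a pre-check that flags a bare /dev/* entry in a discovery string. The function name is hypothetical, and the asmcmd calls in the comments are shown as I recall them (dsget to read the string, dsset to change it); please confirm them against your Grid version before relying on them.

```shell
#!/bin/sh
# Return 0 (true) when any comma-separated entry of the discovery string
# is the bare /dev/* pattern, which is the case described in the MOS note.
is_broad_diskstring() {
    printf '%s\n' "$1" |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -qxF '/dev/*'
}

# Usage as the Grid owner, with the ASM environment set (not run here):
#   asmcmd dsget                 # shows the current discovery string
#   asmcmd dsset '/dev/asm-*'    # narrows it, as done in this post
if is_broad_diskstring '/dev/*'; then
    echo "WARNING: discovery string is /dev/*, consider narrowing it before the upgrade"
fi
```

The pattern is always passed in single quotes so the shell never expands it against the real /dev directory.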

Also, as a side note, there is this note with some best practices for upgrading: How to Upgrade to/Downgrade from Grid Infrastructure 12.1 and Known Issues (Doc ID 1579762.1).

Hope this helps and saves you some time in your troubleshooting.

Élisson Almeida