CRS Not Starting after Removing OS User: How to Work Around It and How to Solve It!

Hello all!
It turns out that a few days ago a client reached out to me because his CRSD was simply not starting. Like this:

[root@proddb proddb]$ ./crsctl start res ora.crsd -init
CRS-2672: Attempting to start 'ora.crsd' on 'proddb'
CRS-2676: Start of 'ora.crsd' on 'proddb' succeeded

[root@proddb proddb]$ ps -ef |grep crsd
root 19217 13424 0 11:53 pts/0 00:00:00 grep crsd

After some investigation, I found the following:

2017-01-24 14:00:06.859: [ CRSSEC][1690195712]{1:51052:2} Exception: OwnerEntry construction failed to retrieve user id by name with ACL string: owner:jacknobody:rwx and error: 1
2017-01-24 14:00:06.912: [ CRSSEC][1690195712]{1:51052:2} Exception: ACL entry creation failed for: owner:jacknobody:rwx
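
If the log is long, a small filter can pull out just the user names that appear in these ACL strings. This is only a sketch: the function name acl_owners is mine, and it assumes the messages keep the "owner:<user>:<perms>" shape shown above.

```shell
#!/bin/sh
# Print the unique user names found in "owner:<user>:<perms>" ACL strings.
# Reads log text on standard input. The function name is hypothetical.
acl_owners() {
    grep -o 'owner:[^:]*:' | cut -d: -f2 | sort -u
}

# Usage (replace the path with your own crsd log location):
#   acl_owners < /path/to/crsd.log
```

Each name it prints is a candidate to check against the OS user list.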

Hmmm, it seems some CRS resources are owned by “Jack Nobody”… It turns out this user had been removed from the OS:

[root@proddb proddb]$ cat /etc/passwd |grep jacknobody
[root@proddb proddb]$ 
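
A slightly more robust check, in case users come from LDAP or NIS rather than /etc/passwd, is to ask the id command instead of grepping the file. A minimal sketch, where the function name user_exists is just my own helper:

```shell
#!/bin/sh
# Return success if the given OS user can be resolved, failure otherwise.
# "id" goes through the system's name service, not only /etc/passwd.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

if user_exists jacknobody; then
    echo "jacknobody exists"
else
    echo "jacknobody is missing"
fi
```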

What to do now?

First, I’d recommend reviewing these MOS references:
“CRSD Daemon Fails to start with Exception Error during ACL Entry Lookup for a resource” (Doc ID 1491367.1)
“ora.crsd Resource No Longer Starts on Either Node All Other Grid Infrastructure Initialization Resources Starting Fine” (Doc ID 2179122.1)

The fix is pretty simple: just re-add the user to the OS and everything will work as expected!
The password does not need to be the same; the user only needs to exist again.

After recreating the user, the CRS was successfully started. 🙂

But what if I really need to remove the OS user?

Here is the quick step-by-step:

1. Check resources related to jacknobody.

– Connect to server proddb as root and take an OLR and OCR dump to verify which resources are owned by this user:

    $ ocrdump -local /tmp/olr.log
    $ ocrdump /tmp/ocr.log
    $ grep jacknobody /tmp/olr.log /tmp/ocr.log

2. Change Ownership of resources related to jacknobody.
** For each resource found in 1 **

– Change the ownership of the respective resource type to the new user (oranew, for example), as root user.
Examples:

    $ crsctl setperm type ora.cluster_resource.type -o oranew
    $ crsctl setperm type ora.database.type -o oranew
    $ crsctl setperm type ora.listener.type -o oranew
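
When step 1 returns many resource types, it can help to generate the commands first and review them before running anything. The sketch below only prints the crsctl setperm lines; the function name setperm_commands is hypothetical, and the type names are just the examples above.

```shell
#!/bin/sh
# Print (do not run) one "crsctl setperm" command per resource type.
# $1 = new owner, remaining arguments = resource type names.
setperm_commands() {
    new_owner=$1
    shift
    for res_type in "$@"; do
        printf 'crsctl setperm type %s -o %s\n' "$res_type" "$new_owner"
    done
}

# Dry run for the three example types:
setperm_commands oranew ora.cluster_resource.type ora.database.type ora.listener.type
```

After reviewing the output, the printed lines can be run as root one at a time.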

– Verify the change for affected types:
Examples:

    $ crsctl getperm type ora.cluster_resource.type
    Name: ora.cluster_resource.type
    owner:oranew:rwx,pgrp:oinstall:rwx,other::r--
    
    $ crsctl getperm type ora.database.type
    Name: ora.database.type
    owner:oranew:rwx,pgrp:oinstall:rwx,other::r--
    
    $ crsctl getperm type ora.listener.type
    Name: ora.listener.type
    owner:oranew:rwx,pgrp:oinstall:rwx,other::r--
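
To check the owner without reading every ACL line by eye, the getperm output can be piped through a small filter. This is a sketch that assumes the output keeps the "owner:<user>:<perms>,..." layout shown above; getperm_owner is a name I made up.

```shell
#!/bin/sh
# Read "crsctl getperm" output on standard input and print the owner name
# taken from the line that starts with "owner:" (leading spaces allowed).
getperm_owner() {
    sed -n 's/^[[:space:]]*owner:\([^:]*\):.*/\1/p'
}

# Usage:
#   crsctl getperm type ora.database.type | getperm_owner
```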

3. Check that no resources are still related to jacknobody.
– Check that all resources were correctly changed:

    $ ocrdump -local /tmp/olr_2.log
    $ ocrdump /tmp/ocr_2.log
    $ grep jacknobody /tmp/olr_2.log /tmp/ocr_2.log
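
If you want this check to gate the userdel in a script, the grep can be wrapped so it returns a clear yes or no. A minimal sketch, assuming the dump files from the commands above; no_refs_left is a hypothetical helper name, and -w is used so that only the whole word matches.

```shell
#!/bin/sh
# Succeed (status 0) only if none of the given files mentions the user.
# $1 = user name, remaining arguments = dump files.
# Status 1 = references remain (file names are printed), 2 = grep error.
no_refs_left() {
    user=$1
    shift
    grep -l -w -- "$user" "$@"
    case $? in
        0) return 1 ;;  # at least one file still mentions the user
        1) return 0 ;;  # no matches anywhere
        *) return 2 ;;  # e.g. a dump file could not be read
    esac
}

# Usage:
#   no_refs_left jacknobody /tmp/olr_2.log /tmp/ocr_2.log && echo "safe to continue"
```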

– If any database resources are still owned by jacknobody, remove them from the OCR and add them back after deleting the OS user:

    $ srvctl config database -d  # To get database configuration
    $ srvctl remove database -d  # To remove
    $ srvctl add database -d  # To re-add database. Include in this command desired options using configuration from first command.

4. Remove user jacknobody
Remove user from OS:

    $ userdel jacknobody

This procedure is also described in the MOS note “Oracle Restart is not coming up due to non-existing owner” (Doc ID 1931046.1).

And this is it for today!
See you next week!
