Exadata DNS Change – Pitfalls to be avoided

Hi all, it’s been a while but here I am!

There were some changes in the infrastructure at the place I work, and I was asked to do a DNS change on a somewhat old Exadata X5. I had never done one before, so the idea of this post is to help others who might face the issues I had.

The first thing I did was look up the documentation to see the steps. Yes, there are blogs about it, but the documentation helps to get at least a first overview of the situation.

Long story short: Exadata has lots of components, and the DNS must be changed on all of them.

Here is a summary of the steps.

Infiniband switches

Connect to each switch, su to ilom-admin, and change the DNS:

su - ilom-admin
show /SP/clients/dns
set /SP/clients/dns nameserver=192.168.16.1,192.168.16.2,192.168.16.3
show /SP/clients/dns

 

Database nodes

For my image I only needed to change /etc/resolv.conf; if you have a newer one, you will need to use ipconf. That’s why you need to go to the documentation: at least there we hope they will mention the pitfalls (well, keep reading and you will see that was not my case).
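After editing /etc/resolv.conf by hand it is easy to miss a server. Here is a minimal sketch of a check, assuming a POSIX shell and awk on the node. The function name missing_nameservers is made up for this illustration, and the IPs are the example addresses used in this post.

```shell
# missing_nameservers FILE IP...
# Prints every expected IP that has no "nameserver <IP>" line in FILE.
# Returns non-zero if at least one is missing.
missing_nameservers() {
    file="$1"; shift
    missing=0
    for ip in "$@"; do
        # awk exits 0 only when a matching nameserver line was found
        if ! awk -v ip="$ip" '$1 == "nameserver" && $2 == ip { found = 1 } END { exit !found }' "$file"; then
            echo "$ip"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (not executed here):
#   missing_nameservers /etc/resolv.conf 192.168.16.1 192.168.16.2 192.168.16.3
```

It only reads the file, so it is safe to run before and after the change.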

I also changed the DNS on each database node’s ILOM, running ipmitool from each node:

ipmitool sunoem cli 'show /SP/clients/dns'
ipmitool sunoem cli 'set /SP/clients/dns nameserver=192.168.16.1,192.168.16.2,192.168.16.3'
ipmitool sunoem cli 'show /SP/clients/dns'


Cell nodes – Here things start to get interesting

For the storage cells there are some points that need to be taken into consideration:

Increase the ASM disk_repair_time. The goal here is to avoid a full rebalance, as long as you finish the work within that timeframe. If you don’t know this parameter: ASM will wait for up to the interval specified in DISK_REPAIR_TIME for the disk(s) to come online. If the disk(s) come back online within this interval, a resync operation occurs, where only the extents that were modified while the disks were offline are written to the disks once they are back online. If the disk(s) do not come back within this interval, ASM initiates a forced drop of the disk(s), which triggers a rebalance.
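Raising the attribute is one ALTER DISKGROUP statement per disk group. As a rough sketch, the helper below only prints the statements, so they can be reviewed before being run in sqlplus as SYSASM. The function name, the disk group names DATAC1 and RECOC1, and the 8h value are placeholders for illustration.

```shell
# repair_time_sql TIME DISKGROUP...
# Prints one ALTER DISKGROUP statement per disk group; it does not run anything.
repair_time_sql() {
    repair_time="$1"; shift
    for dg in "$@"; do
        printf "ALTER DISKGROUP %s SET ATTRIBUTE 'disk_repair_time' = '%s';\n" "$dg" "$repair_time"
    done
}

# Example call (placeholder names and value):
#   repair_time_sql 8h DATAC1 RECOC1
```

Remember to set the attribute back to its previous value once all cells are done.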

On each cell node we need to make sure all disks are OK, inactivate all grid disks, stop all cell services, and use ipconf to change the DNS configuration.

#Check that putting the grid disks offline will not cause a problem for Oracle ASM - the 3rd column (asmdeactivationoutcome) should say Yes for every grid disk
cellcli -e LIST GRIDDISK ATTRIBUTES name,asmmodestatus,asmdeactivationoutcome

#Inactivate all grid disks on the cell - may take a while to complete
cellcli -e ALTER GRIDDISK ALL INACTIVE


#Confirm the grid disks are offline, it should show asmmodestatus=OFFLINE or asmmodestatus=UNUSED, and asmdeactivationoutcome=Yes for all grid disks
cellcli -e LIST GRIDDISK ATTRIBUTES name,asmmodestatus,asmdeactivationoutcome

#Confirm that the disks are offline
cellcli -e LIST GRIDDISK

#Shut down the cell services and ocrvottargetd service
cellcli -e ALTER CELL SHUTDOWN SERVICES ALL
service ocrvottargetd stop # on some images this service does not exist
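With many grid disks, checking the third column by eye is error-prone. Here is a small sketch of a gate for that first check, assuming the whitespace-separated name, asmmodestatus, asmdeactivationoutcome layout that the LIST GRIDDISK command above prints. The function name is hypothetical.

```shell
# griddisks_safe_to_offline
# Reads the LIST GRIDDISK ATTRIBUTES output on stdin.
# Succeeds only if there is at least one row and every row has Yes in column 3.
griddisks_safe_to_offline() {
    awk '
        NF == 0 { next }                                   # skip blank lines
        { rows++ }
        tolower($3) != "yes" { print "NOT SAFE: " $0; bad = 1 }
        END { if (rows == 0) bad = 1; exit bad }
    '
}

# Example usage on a cell (not executed here):
#   cellcli -e LIST GRIDDISK ATTRIBUTES name,asmmodestatus,asmdeactivationoutcome \
#     | griddisks_safe_to_offline && cellcli -e ALTER GRIDDISK ALL INACTIVE
```

Empty input is treated as a failure on purpose, so a cellcli error does not look like a green light.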

To run ipconf the old way we only need to call it and follow the prompts, but if you have a newer image you will need to provide its parameters as shown in the documentation.

The documentation says that after this we can start the cell services back up, but I would recommend validating the DNS before doing that. Why? Because mine did not work, and I could have had a bigger issue with a cell node trying to start its services without DNS.

So, how to test? Use nslookup, dig and curl:

nslookup dns_domain.com
curl -v 192.168.16.1:53
dig another_server_in_the_network
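To turn these manual checks into a single pass/fail before restarting the cell services, something like the sketch below could be used. It assumes getent is available and uses `getent hosts` by default, which goes through the system resolver and can therefore also be satisfied by /etc/hosts. RESOLVE_CMD is a hypothetical override hook for any command that exits non-zero on a failed lookup.

```shell
# resolve_all NAME...
# Succeeds only if every name resolves with the configured resolver command.
resolve_all() {
    resolver="${RESOLVE_CMD:-getent hosts}"
    failed=0
    for name in "$@"; do
        # $resolver is intentionally unquoted so "getent hosts" splits into two words
        if $resolver "$name" >/dev/null 2>&1; then
            echo "OK   $name"
        else
            echo "FAIL $name"
            failed=1
        fi
    done
    return "$failed"
}

# Example call, with the placeholder names from the commands above:
#   resolve_all dns_domain.com another_server_in_the_network
```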

 

My tests did not work: I was able to ping the DNS servers but not to resolve any name. I had an SR open on MOS, but it did not help much either. As this is a production system, I checked whether the firewall was up on the Linux side, and to my surprise it was.

I tried to manually add rules to iptables, but that did not work. Then I came across this note: Exadata: New DNS server is not accessible after changing using IPCONF (Doc ID 1581417.1).

And there it was: I needed to restart the cellwall service to recreate the iptables rules.

# Restart cellwall service
service cellwall restart
service cellwall status

One final point: check whether ASM started a rebalance. If it did, do not bring down another cell node until the rebalance has finished, otherwise you may run into deeper issues.
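One way to make that check hard to skip is to poll gv$asm_operation until it is empty before touching the next cell. This is only a sketch. It assumes it runs as the Grid Infrastructure owner with the ASM environment set, and the function names, the 60-second default and the injectable counter command are choices made for this illustration.

```shell
# rebalance_count: prints the number of running ASM operations.
rebalance_count() {
    sqlplus -s "/ as sysasm" <<'EOF'
set heading off feedback off pagesize 0
select count(*) from gv$asm_operation;
EOF
}

# wait_for_rebalance [COUNT_CMD] [INTERVAL_SECONDS]
# Returns 0 once COUNT_CMD prints 0, or 2 if its output is not a number.
wait_for_rebalance() {
    count_cmd="${1:-rebalance_count}"
    interval="${2:-60}"
    while :; do
        n=$($count_cmd | tr -d '[:space:]')
        case "$n" in
            ''|*[!0-9]*) echo "unexpected output: $n" >&2; return 2 ;;
        esac
        if [ "$n" -eq 0 ]; then
            return 0
        fi
        echo "ASM operations still running: $n"
        sleep "$interval"
    done
}
```

Returning an error on non-numeric output means a failed sqlplus login is not mistaken for "no rebalance running".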

 

I hope it helps!

Elisson Almeida

Opatchauto Failing on “CheckActiveFilesAndExecutables” during Prerequisite Check

Hi all,
So, very recently when applying the 2021 January CPU in a client environment, the following happened:

[root@dbserver01 32226239]# $ORACLE_HOME/OPatch/opatchauto apply

OPatchauto session is initiated at Sun Mar 14 03:00:06 2021

System initialization log file is /u01/app/oracle/product/19c/grid/cfgtoollogs/opatchautodb/systemconfig2021-03-14_03-00-08AM.log.

Session log file is /u01/app/oracle/product/19c/grid/cfgtoollogs/opatchauto/opatchauto2021-03-14_03-00-13AM.log
The id for this session is 1J89

Executing OPatch prereq operations to verify patch applicability on home /u01/app/oracle/product/19c/db
Patch applicability verified successfully on home /u01/app/oracle/product/19c/db


Executing patch validation checks on home /u01/app/oracle/product/19c/db
Patch validation checks successfully completed on home /u01/app/oracle/product/19c/db


Verifying SQL patch applicability on home /u01/app/oracle/product/19c/db
SQL patch applicability verified successfully on home /u01/app/oracle/product/19c/db


Executing OPatch prereq operations to verify patch applicability on home /u01/app/oracle/product/19c/grid
Patch applicability verified successfully on home /u01/app/oracle/product/19c/grid


Executing patch validation checks on home /u01/app/oracle/product/19c/grid
Patch validation checks successfully completed on home /u01/app/oracle/product/19c/grid


Preparing to bring down database service on home /u01/app/oracle/product/19c/db
Successfully prepared home /u01/app/oracle/product/19c/db to bring down database service


Bringing down database service on home /u01/app/oracle/product/19c/db
Following database has been stopped and will be restarted later during the session: er1pprd,obiee
Database service successfully brought down on home /u01/app/oracle/product/19c/db


Performing prepatch operations on CRS - bringing down CRS service on home /u01/app/oracle/product/19c/grid
Prepatch operation log file location: /u01/app/oracle/product/crsdata/dbserver01/crsconfig/hapatch_2021-03-14_03-06-15AM.log
CRS service brought down successfully on home /u01/app/oracle/product/19c/grid


Start applying binary patch on home /u01/app/oracle/product/19c/db
Failed while applying binary patches on home /u01/app/oracle/product/19c/db

Execution of [OPatchAutoBinaryAction] patch action failed, check log for more details. Failures:
Patch Target : dbserver01->/u01/app/oracle/product/19c/db Type[sidb]
Details: [
---------------------------Patching Failed---------------------------------
Command execution failed during patching in home: /u01/app/oracle/product/19c/db, host: dbserver01.
Command failed: /u01/app/oracle/product/19c/db/OPatch/opatchauto apply /ora02/soft/jan21cpu/32126842/32226239 -oh /u01/app/oracle/product/19c/db -target_type oracle_database -binary -invPtrLoc /u01/app/oracle/product/19c/grid/oraInst.loc -jre /u01/app/oracle/product/19c/grid/OPatch/jre -persistresult /u01/app/oracle/product/19c/db/opatchautocfg/db/sessioninfo/sessionresult_dbserver01_sidb_2.ser -analyzedresult /u01/app/oracle/product/19c/db/opatchautocfg/db/sessioninfo/sessionresult_analyze_dbserver01_sidb_2.ser
Command failure output:
==Following patches FAILED in apply:

Patch: /ora02/soft/jan21cpu/32126842/32226239/32218454
Log: /u01/app/oracle/product/19c/db/cfgtoollogs/opatchauto/core/opatch/opatch2021-03-14_03-17-58AM_1.log
Reason: Failed during Patching: oracle.opatch.opatchsdk.OPatchException: Prerequisite check "CheckActiveFilesAndExecutables" failed.

After fixing the cause of failure Run opatchauto resume

]
OPATCHAUTO-68061: The orchestration engine failed.
OPATCHAUTO-68061: The orchestration engine failed with return code 1
OPATCHAUTO-68061: Check the log for more details.
OPatchAuto failed.

OPatchauto session completed at Sun Mar 14 03:19:25 2021
Time taken to complete the session 19 minutes, 19 seconds

opatchauto failed with error code 42

OK, going by parts, let’s see what we have in the referenced log:

[Mar 14, 2021 3:33:32 AM] [INFO] Start fuser command /sbin/fuser /u01/app/oracle/product/19c/grid/bin/expdp at Sat Mar 14 03:33:32 PDT 2021
[Mar 14, 2021 3:33:32 AM] [INFO] Finish fuser command /sbin/fuser /u01/app/oracle/product/19c/grid/bin/expdp at Sat Mar 14 03:33:32 PDT 2021
[Mar 14, 2021 3:33:32 AM] [INFO] Following active executables are not used by opatch process :


Following active executables are used by opatch process :
/u01/app/oracle/product/19c/grid/lib/libclntsh.so.19.1
[Mar 14, 2021 3:33:32 AM] [INFO] Prerequisite check "CheckActiveFilesAndExecutables" failed.
The details are:


Following active executables are not used by opatch process :


Following active executables are used by opatch process :
/u01/app/oracle/product/19c/grid/lib/libclntsh.so.19.1
[Mar 14, 2021 3:33:33 AM] [INFO] UtilSession failed: Prerequisite check "CheckActiveFilesAndExecutables" failed.
[Mar 14, 2021 3:33:33 AM] [SEVERE] OUI-67073:UtilSession failed: Prerequisite check "CheckActiveFilesAndExecutables" failed.
[Mar 14, 2021 3:33:33 AM] [INFO] Finishing UtilSession at Sat Mar 14 03:33:33 PDT 2021
[Mar 14, 2021 3:33:33 AM] [INFO] Log file location: /u01/app/oracle/product/19c/grid/cfgtoollogs/opatchauto/core/opatch/oapatch_2021-03-14_03-06-15AM.log
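When the log is long, it helps to pull out just the files that the check complains about. Here is a minimal sketch, assuming the log layout shown above (a "Following active executables" line followed by absolute paths). The function name is made up for this example.

```shell
# active_files_from_log
# Reads an OPatch log on stdin and prints the unique file paths listed
# under the "Following active executables ..." headers.
active_files_from_log() {
    awk '
        /Following active executables/ { grab = 1; next }   # header line
        grab && /^\//                  { print; next }       # a listed path
        grab && NF                     { grab = 0 }          # any other text ends the list
    ' | sort -u
}

# Example usage (the log path is a placeholder):
#   active_files_from_log < /path/to/opatch.log
# Each path can then be checked with /sbin/fuser, the same tool OPatch uses.
```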

This is an interesting situation.

After some validations (making sure no service was online, the path was writable, and oracle and root had the required privileges and access), I found some relevant Oracle notes:

  • 19c Installation Fails with error “libclntsh.so: file format not recognized; treating as linker script” (Doc ID 2631283.1): Pointing to file corruption
  • While Applying a Weblogic Patch, opatch Fails with “Prerequisite check “CheckActiveFilesAndExecutables” failed” Error (Doc ID 2705809.1): Not a DB note and pointing to other processes using the files.
  • Opatch failure due to “CheckActiveFilesAndExecutables” as Remote registry service holding files (Doc ID 2462952.1): Remote registry holding the binaries.
  • Prerequisite Check “Checkactivefilesandexecutables” Failed (Doc ID 1281644.1): A missing patch prerequisite on 10g.
  • Failed to apply PSU due to CheckActiveFilesAndExecutables check failure (Doc ID 2506432.1): SQLPlus holding the binaries.
  • [OCI]: Database System Patching Failed With Error “DCS-10001:Internal Error Encountered: Failure : Failed To Apply” And Opatch Log Shows “Prerequisite check “CheckActiveFilesAndExecutables” failed” (Doc ID 2687607.1): My case is neither on OCI nor RAC.

So, no matches at all.

However, this last note gave me the hints I needed. From Doc ID 2687607.1, for RAC environments:

/u01/app/19.0.0.0/grid/crs/install/rootcrs.sh -unlock
/u01/app/19.0.0.0/grid/crs/install/rootcrs.sh -init
/u01/app/19.0.0.0/grid/crs/install/rootcrs.sh -prepatch
/u01/app/19.0.0.0/grid/crs/install/rootcrs.sh -postpatch

So, in my case, a Standalone On-Premise Database (and GI):

/u01/app/oracle/product/19c/grid/crs/install/roothas.sh -unlock
/u01/app/oracle/product/19c/grid/crs/install/roothas.sh -init
/u01/app/oracle/product/19c/grid/crs/install/roothas.sh -prepatch
[ Apply the patch! ]
/u01/app/oracle/product/19c/grid/crs/install/roothas.sh -postpatch

Check the output:

[root@dbserver01 jan21cpu]# /u01/app/oracle/product/19c/grid/crs/install/roothas.sh -unlock
Using configuration parameter file: /u01/app/oracle/product/19c/grid/crs/install/crsconfig_params
The log of current session can be found at:
/u01/app/oracle/product/crsdata/dbserver01/crsconfig/haunlock__2021-03-14_04-00-35AM.log
2021/03/14 04:01:01 CLSRSC-347: Successfully unlock /u01/app/oracle/product/19c/grid
[root@dbserver01 jan21cpu]# /u01/app/oracle/product/19c/grid/crs/install/roothas.sh -init
Using configuration parameter file: /u01/app/oracle/product/19c/grid/crs/install/crsconfig_params
The log of current session can be found at:
/u01/app/oracle/product/crsdata/dbserver01/crsconfig/roothas_2021-03-14_04-01-09AM.log
[root@dbserver01 jan21cpu]# /u01/app/oracle/product/19c/grid/crs/install/roothas.sh -prepatch
Using configuration parameter file: /u01/app/oracle/product/19c/grid/crs/install/crsconfig_params
The log of current session can be found at:
/u01/app/oracle/product/crsdata/dbserver01/crsconfig/hapatch_2021-03-14_04-01-16AM.log
2021/03/14 04:01:27 CLSRSC-347: Successfully unlock /u01/app/oracle/product/19c/grid
2021/03/14 04:01:27 CLSRSC-671: Pre-patch steps for patching GI home successfully completed.

And now resuming the Opatchauto:

[root@dbserver01 jan21cpu]# cd 32126842/32226239/
[root@dbserver01 32226239]# $ORACLE_HOME/OPatch/opatchauto resume

OPatchauto session is initiated at Sun Mar 14 04:02:07 2021
Session log file is /u01/app/oracle/product/19c/grid/cfgtoollogs/opatchauto/opatchauto2021-03-14_04-02-10AM.log
Resuming existing session with id 1J89

Start applying binary patch on home /u01/app/oracle/product/19c/db
Binary patch applied successfully on home /u01/app/oracle/product/19c/db


Start applying binary patch on home /u01/app/oracle/product/19c/grid

Binary patch applied successfully on home /u01/app/oracle/product/19c/grid


Performing postpatch operations on CRS - starting CRS service on home /u01/app/oracle/product/19c/grid
Postpatch operation log file location: /u01/app/oracle/product/crsdata/dbserver01/crsconfig/hapatch_2021-03-14_04-27-58AM.log
CRS service started successfully on home /u01/app/oracle/product/19c/grid


Preparing home /u01/app/oracle/product/19c/db after database service restarted
No step execution required.........


Trying to apply SQL patch on home /u01/app/oracle/product/19c/db
SQL patch applied successfully on home /u01/app/oracle/product/19c/db

OPatchAuto successful.

--------------------------------Summary--------------------------------

Patching is completed successfully. Please find the summary as follows:

Host:dbserver01
SIDB Home:/u01/app/oracle/product/19c/db
Version:19.0.0.0.0
Summary:

==Following patches were SKIPPED:

Patch: /ora02/soft/jan21cpu/32126842/32226239/32218663
Reason: This patch is not applicable to this specified target type - "oracle_database"

Patch: /ora02/soft/jan21cpu/32126842/32226239/29340594
Reason: This patch is not applicable to this specified target type - "oracle_database"

Patch: /ora02/soft/jan21cpu/32126842/32226239/32240590
Reason: This patch is not applicable to this specified target type - "oracle_database"


==Following patches were SUCCESSFULLY applied:

Patch: /ora02/soft/jan21cpu/32126842/32226239/32218454
Log: /u01/app/oracle/product/19c/db/cfgtoollogs/opatchauto/core/opatch/opatch2021-03-14_04-02-36AM_1.log

Patch: /ora02/soft/jan21cpu/32126842/32226239/32222571
Log: /u01/app/oracle/product/19c/db/cfgtoollogs/opatchauto/core/opatch/opatch2021-03-14_04-02-36AM_1.log


Host:dbserver01
SIHA Home:/u01/app/oracle/product/19c/grid
Version:19.0.0.0.0
Summary:

==Following patches were SUCCESSFULLY applied:

Patch: /ora02/soft/jan21cpu/32126842/32226239/29340594
Log: /u01/app/oracle/product/19c/grid/cfgtoollogs/opatchauto/core/opatch/opatch2021-03-14_04-11-34AM_1.log

Patch: /ora02/soft/jan21cpu/32126842/32226239/32218454
Log: /u01/app/oracle/product/19c/grid/cfgtoollogs/opatchauto/core/opatch/opatch2021-03-14_04-11-34AM_1.log

Patch: /ora02/soft/jan21cpu/32126842/32226239/32218663
Log: /u01/app/oracle/product/19c/grid/cfgtoollogs/opatchauto/core/opatch/opatch2021-03-14_04-11-34AM_1.log

Patch: /ora02/soft/jan21cpu/32126842/32226239/32222571
Log: /u01/app/oracle/product/19c/grid/cfgtoollogs/opatchauto/core/opatch/opatch2021-03-14_04-11-34AM_1.log

Patch: /ora02/soft/jan21cpu/32126842/32226239/32240590
Log: /u01/app/oracle/product/19c/grid/cfgtoollogs/opatchauto/core/opatch/opatch2021-03-14_04-11-34AM_1.log

OPatchauto session completed at Sun Mar 14 04:31:48 2021
Time taken to complete the session 29 minutes, 43 seconds

And here is the relevant point: this has been happening to me on several environments and servers over recent weeks, always with the January 2021 CPU.
My guess is that this has something to do with this CPU’s set of binaries or, most likely, with the latest OPatch version:

[oracle@dbserver01 ~]$ $ORACLE_HOME/OPatch/opatch version
OPatch Version: 12.2.0.1.24

I hope it helps you as well!

5 Best Practices for Setting Dispatchers for Shared Connections

Hi all,

Here are 5 best practices / tips for setting dispatchers with shared connections:

1. Set local_listener on both instances of the database

alter system set LOCAL_LISTENER='(address=(protocol=tcp)(port=1521)(host=yourhost))' scope=both sid='instance_name';
Ref: Shared Server: Dispatchers Are Not Registered With Listener (Doc ID 465881.1)

2. Dispatchers parameter should be set to utilize the VIP name of the host

alter system set dispatchers='(address=(protocol=tcp)(host=node1-vip))(dispatchers=2)' scope=both sid='instance_name';
Ref: How To Configure Shared Server Dispatchers For RAC Environment (Doc ID 578524.1)

3. Dispatchers count should be set appropriately considering the number of sessions expected to connect to the database

A general rule of thumb is that 1 dispatcher can handle 50 shared server connections with minimal performance impact.
Ref: Shared Server Only: TNS-12518, TNS-12564 and TNS-12602 Errors at Connect Time (Doc ID 1539104.1)
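As a quick illustration of that rule of thumb, the number of dispatchers is the expected number of shared connections divided by 50, rounded up. The helper name below is hypothetical, and the divisor of 50 comes from the rule above, not from a hard limit.

```shell
# dispatchers_needed SESSIONS [PER_DISPATCHER]
# Prints ceil(SESSIONS / PER_DISPATCHER); PER_DISPATCHER defaults to 50.
dispatchers_needed() {
    sessions="$1"
    per_dispatcher="${2:-50}"
    echo $(( (sessions + per_dispatcher - 1) / per_dispatcher ))
}

# Example: 120 expected shared connections
#   dispatchers_needed 120    # prints 3
```

Treat the result as a starting point and adjust it after watching how busy the dispatchers actually are.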

4. Arguments can be used with the dispatchers parameter for closer control of how the shared server sessions are used

SESSIONS – Determines the max sessions allowed for each dispatcher.
CONNECTIONS – The maximum number of network connections to allow for each dispatcher.
Ref: https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/DISPATCHERS.html#GUID-DCBCCF94-8A73-4805-9138-412DA413FC7C

5. Shared_servers parameter can be set to control total number of shared servers spawned by the database

shared_servers set to 1 –> This will enable shared server sessions on the database.
max_shared_servers  –> Specifies the maximum number of shared servers that can run simultaneously.
shared_server_sessions  –> Specifies the total number of shared server user sessions that can run simultaneously. Setting this parameter lets you reserve user sessions for dedicated servers.
Ref: Automatic Shared Server Configuration (Doc ID 265931.1)

See you next post!

AWS AQUA for Redshift

Hi all,

Quick one today. Did you see this new release? I’m a bit behind schedule but trying to catch up with the news. It seems very interesting:

AQUA (Advanced Query Accelerator) for Amazon Redshift is available in preview. AQUA provides a new distributed and hardware-accelerated cache that brings compute to the storage layer for Amazon Redshift and delivers up to 10x faster query performance than other cloud data warehouses.

AQUA is a high-speed cache on top of Redshift Managed Storage that can scale out and process data in parallel across many AQUA nodes. AQUA uses AWS-designed analytics processors that dramatically accelerate data compression, encryption, and data processing on queries that scan, filter, and aggregate large data sets. With this new architecture, customers can run queries quicker than ever before, allowing them to query data directly, even at scale, and giving them more up-to-date dashboards, reducing development time, and making system maintenance easier.

It is available for preview in US East (Ohio), US East (N. Virginia), and US West (Oregon) regions at this point.

For more references:

  • There is a very detailed technical article about it HERE.
  • There is also a very nice TechTalk presenting it HERE.

Let’s keep up to date!

Microsoft Ignite 2021!

Hey Folks,
Better late than never… Yesterday was the first day of the Microsoft Ignite event, and again, like last year, you have a chance to earn a voucher for a free certification exam.

Same as last year, to receive the voucher all you have to do is complete at least one of the challenges.

Below, you have two links. One shows which exams will be available, and the other takes you to the challenges page.