OEM Information Reports: ORA-00600 [kpndbcon-svchpnotNULL]

Getting this error from an Information Publisher report?

ORA-00600 [kpndbcon-svchpnotNULL]
ORA-00600: internal error code, arguments: [kpndbcon-svchpnotNULL], [], [], [], [], [], [], [], [], [], [], []

Don’t worry… This is not a bug in Oracle itself. The error appears when, while the report is running (it takes 2 or 3 minutes), one of the following happens:

  • The database session in the OEM Repository (Repository Database) is killed.
  • The database session in the target database (where OEM has to connect and get the data) is killed.
  • Network issues between the OEM Repository and the target database cause timeouts or make the session end abnormally.
  • A high workload on one of the databases causes a timeout, making the session end abnormally.

So basically this is a communication problem between the OEM Repository and the database the data is fetched from. Running reports like this over database links is something Oracle doesn’t support at all, because any network issue can cause the report to fail.

Some references about it:

  • ORA-00600 [kpndbcon-svchpnotNULL] Errors (Doc ID 1615517.1)
  • ORA-00600 [kpndbcon-svchpnotNULL] query through dblink (Doc ID 1490700.1)
  • Information Publisher Report fails with Error Rendering Element. Exception: ORA-00600 [kpndbcon-svchpnotNULL] (Doc ID 1930280.1)


So what’s the solution?
The solution here is easy: just re-run the report.

Hope it helps. Cheers!

Creating a Free Oracle Database at AWS

Have you ever heard about Amazon RDS?

Amazon RDS is a Relational Database cloud based service, which intends to help you to automate administrative tasks like hardware provisioning, database creation, backups, etc. Currently six database engines are available under this service: Oracle Database, Microsoft SQL Server, Amazon Aurora, PostgreSQL, MySQL and MariaDB.

This post provides a quick step-by-step on how to create your first Oracle Database RDS.

Requirements:
– Have an Amazon AWS account. If you don’t have one, don’t panic! You can still create one at: https://portal.aws.amazon.com/gp/aws/developer/registration/index.html

Now, how to create a RDS? Here it goes:


OEM after a Maintenance: A memory component is suspected of causing a fault with a 100% certainty. Component Name : % Fault class : fault.memory.intel.dimm_ce

Hi all!
So, I had this message from a memory component in my Exadata:

Message=A memory component is suspected of causing a fault with a 100% certainty. Component Name : /SYS/MB/P0/D3 Fault class : fault.memory.intel.dimm_ce

But this happened right after maintenance on the server. Checking on ILOM:

-> show /SYS/MB/P0/D3

 /SYS/MB/P0/D3
    Targets:
        PRSNT
        SERVICE

    Properties:
        type = DIMM
        ipmi_name = MB/P0/D3
        fru_name = 16384MB DDR4 SDRAM DIMM
        fru_manufacturer = Samsung
        fru_part_number = %
        fru_rev_level = 01
        fru_serial_number = %
        fault_state = OK
        clear_fault_action = (none)

Checking on CellCLI alert history:

CellCLI> list alerthistory detail

	 name:                   13_1
	 alertDescription:       "A memory component suspected of causing a fault"
	 alertMessage:           "A memory component is suspected of causing a fault with a 100% certainty.  Component Name : /SYS/MB/P0/D3  Fault class    : fault.memory.intel.dimm_ce  Fault message  : http://support.oracle.com/msg/SPX86A-8002-XM"
	 alertSequenceID:        13
	 alertShortName:         Hardware
	 alertType:              Stateful
	 beginTime:              %
	 endTime:                %
	 examinedBy:             
	 metricObjectName:       /SYS/MB/P0/D3_FAULT
	 notificationState:      1
	 sequenceBeginTime:      %
	 severity:               critical
	 alertAction:            "For additional information, please refer to http://support.oracle.com/msg/SPX86A-8002-XM Automatic Service Request has been notified with Unique Identifier: %.  Diagnostic package is attached. It is also accessible at % It will be retained on the storage server for 7 days. If the diagnostic package has expired, then it can be re-created at %"

Hm… Let’s read the MOS note: SPX86A-8002-XM – Memory Correctable ECC (Doc ID 1615285.1)

“Suggested Action for System Administrator

Replace the faulty memory DIMM at the earliest possible convenience.”

Hmm… But as I said, this happened right after maintenance on the server. What if it is related?
OK, some additional information:

-> version 
SP firmware 3.2.9.23 
SP firmware build number: 116695 
SP firmware date: Thu Mar 30 11:38:01 CST 2017 
SP filesystem version: 0.2.10

At the current SP firmware level (3.2.9.23), the memory correctable error (CE) threshold for DIMM replacement is 240 CEs in a 72-hour period.

So, the suggestion is:
– Clear all the error messages after completing the maintenance, then check whether the threshold is reached again. If so, we may really need to replace the DIMM.

How to do it? Easy:

ssh root@grepora01-ilom
-> show /SYS/MB/P0/D3
[Expected]
[...]
        fault_state = Faulted
[...]
-> set /SYS/MB/P0/D3 clear_fault_action=true
Are you sure you want to clear /SYS/MB/P0/D3 (y/n)? y
-> show /SYS/MB/P0/D3
[Expected]
 /SYS/MB/P0/D3
    Targets:
        PRSNT
        SERVICE

    Properties:
        type = DIMM
        ipmi_name = MB/P0/D3
        fru_name = 16384MB DDR4 SDRAM DIMM
        fru_manufacturer = Samsung
        fru_part_number = %
        fru_rev_level = 01
        fru_serial_number = %
        fault_state = OK
        clear_fault_action = (none)
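
To make the threshold mentioned earlier (240 CEs in a 72-hour period) more concrete, here is a minimal Python sketch of a sliding-window check. It is only an illustration, not how ILOM implements it. The function name, the list of timestamps, and the choice to treat exactly 240 events as "reached" are all assumptions.

```python
from datetime import datetime, timedelta


def ce_threshold_reached(events, limit=240, window=timedelta(hours=72)):
    """Return True if any window of length `window` contains at least `limit` events.

    `events` is an iterable of datetime objects, one per correctable error.
    The defaults mirror the 240 CEs / 72 hours figure quoted in the post.
    """
    times = sorted(events)
    start = 0
    for end, current in enumerate(times):
        # Slide the left edge forward until the window ending at `current`
        # spans no more than `window`.
        while current - times[start] > window:
            start += 1
        if end - start + 1 >= limit:
            return True
    return False


# Hypothetical data: 240 errors, one per minute, all inside a single window.
base = datetime(2018, 1, 1)
burst = [base + timedelta(minutes=i) for i in range(240)]
print(ce_threshold_reached(burst))  # True
```

The same number of errors spread one per hour would not reach the limit, because no 72-hour window would hold 240 of them.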

Hope it helps!
Cheers!

/bin/rm: cannot execute [Argument list too long]

Hey all!

Just a quick and useful tip today. How many times have you run into this?

/bin/rm: cannot execute [Argument list too long]

OK, so, first thing: is it related to Oracle logs? If so, you may want to use ADRCI. Check this post for more info: ADRCI Retention Policy and Ad-Hoc Purge Script for all Bases.

If not, you can solve this using find with rm, which also lets you keep the most recent files.

Some examples for you, removing audit files:

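# Remove files older than 1 day (-mtime +1 counts whole 24-hour periods, so it matches files last modified at least 2 days ago):

find /oracle/greporadb/admin/greporadb/adump -name "*.aud" -mtime +1 -exec rm {} \;

# Remove files changed more than 1 hour ago (-cmin tests the inode change time; use -mmin +60 to test the modification time instead):

find /oracle/greporadb/admin/greporadb/adump -name "*.aud" -cmin +60 -exec rm {} \;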
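
If you would rather keep a fixed number of the newest files regardless of age, the same idea can be sketched in Python. This is only an illustration under assumptions: the function name and its parameters (`keep_newest`, `dry_run`) are made up for this example. It walks the directory with `os.scandir`, so no long argument list is ever built.

```python
import os
import time


def purge_old_files(directory, suffix=".aud", max_age_seconds=86400,
                    keep_newest=0, dry_run=True):
    """Remove files ending in `suffix` that are older than `max_age_seconds`,
    always sparing the `keep_newest` most recently modified ones.

    Returns the list of paths removed (or that would be removed in a dry run).
    """
    now = time.time()
    with os.scandir(directory) as entries:
        files = [(entry.stat().st_mtime, entry.path)
                 for entry in entries
                 if entry.is_file() and entry.name.endswith(suffix)]
    files.sort(reverse=True)  # newest first
    removed = []
    for mtime, path in files[keep_newest:]:
        if now - mtime > max_age_seconds:
            if not dry_run:
                os.remove(path)
            removed.append(path)
    return removed


# Example (dry run by default, nothing is deleted):
# purge_old_files("/oracle/greporadb/admin/greporadb/adump", keep_newest=100)
```

Running it first with the default `dry_run=True` shows what would be deleted before anything is actually removed.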


IPv6 Formatting for JDBC and SQLPlus

Hey all!
Seems new, right? But it has been available since 11gR2.
No need to explain what IPv6 is, right? If you have any questions, go here.

In summary, the only thing you need to do is enclose the IPv6 address in square brackets. Like this:

For Easy Connect on IPv4:

SQL> connect user/pass@172.23.10.40:1521/GREPORADB
Connected.

For Easy Connect on IPv6:

SQL> connect user/pass@[1:95e05a:g0d:da7a:2007]:1521/GREPORADB
Connected.

For JDBC (thin) IPv4:

url="jdbc:oracle:thin:@(DESCRIPTION=
(LOAD_BALANCE=on) (ADDRESS_LIST=
(ADDRESS=(PROTOCOL=TCP)(HOST=172.23.10.40) (PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=172.23.10.41)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME=GREPORADB)))"

For JDBC (OCI) IPv4:

url="jdbc:oracle:oci:@(DESCRIPTION=(ADDRESS=
(PROTOCOL=TCP)(HOST=172.23.10.40)(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=GREPORADB)))"

For JDBC (thin) IPv6:

url="jdbc:oracle:thin:@(DESCRIPTION=
(LOAD_BALANCE=on) (ADDRESS_LIST=
(ADDRESS=(PROTOCOL=TCP)(HOST=[1:95e05a:g0d:da7a:2007]) (PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=[1:95e05a:g0d:da7a:2006])(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME=GREPORADB)))"

For JDBC (OCI) IPv6:

url="jdbc:oracle:oci:@(DESCRIPTION=(ADDRESS=
(PROTOCOL=TCP)(HOST=[1:95e05a:g0d:da7a:2007])(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=GREPORADB)))"

As you can imagine, the same applies to your TNSNAMES entries.

Also, according to this, it can be used even for your LISTENER:

LISTENER =
 (DESCRIPTION_LIST =
  (DESCRIPTION =
   (ADDRESS =
    (PROTOCOL = TCP)
    (HOST = [1:95e05a:g0d:da7a:2007])(PORT =1521))
  )
 )
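
If you build connect strings in a script, the bracket rule is easy to automate. Below is a minimal Python sketch, assuming hypothetical helper names (`format_host`, `easy_connect`) and using the documentation address 2001:db8::1, which does not come from the examples above. It relies on the standard `ipaddress` module to tell IPv6 literals apart from IPv4 addresses and hostnames.

```python
import ipaddress


def format_host(host):
    """Wrap IPv6 literals in square brackets; leave IPv4 addresses and hostnames as-is."""
    try:
        if ipaddress.ip_address(host).version == 6:
            return f"[{host}]"
    except ValueError:
        # Not an IP literal (a hostname, or an address that is already bracketed).
        pass
    return host


def easy_connect(host, port, service):
    """Build the host:port/service part of an Easy Connect string."""
    return f"{format_host(host)}:{port}/{service}"


print(easy_connect("172.23.10.40", 1521, "GREPORADB"))  # 172.23.10.40:1521/GREPORADB
print(easy_connect("2001:db8::1", 1521, "GREPORADB"))   # [2001:db8::1]:1521/GREPORADB
```

The same `format_host` helper could feed the HOST= part of a JDBC or TNSNAMES descriptor.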

Cheers!

Managing AWR Warehouse Repository Database

1. Change Retention Period Of AWR Warehouse Repository Database

The retention period of the AWR data in the Repository Database can be changed as follows:

<OMS_HOME>/bin>./emcli awrwh_reconfigure -retention=<New retention period (in years)>
For example: 
[oracle@oem13c oms]$ emcli awrwh_reconfigure -retention=5

2. Change Staging Location Of AWR Dump Files

For the AWR Warehouse, the target database creates its dump files in the home directory by default. So after adding the target to the AWR Warehouse, we need to reconfigure it from the OEM CLI to change the dump file directory, as follows:

<OMS_HOME>/bin>./emcli awrwh_reconfigure_src -target_name=<target database name> -target_type=rac_database -src_dir="<directory path>"
For example: 
[oracle@oem13c ]$ ./emcli awrwh_reconfigure_src -target_name=greporadb -target_type=rac_database -src_dir="/awrdata/awrw"

3. Change Upload Interval Of SnapShots In AWR Warehouse Repository Database

This configuration can be changed as follows:

<OMS_HOME>/bin>./emcli awrwh_reconfigure -upload_interval=<New upload interval>
For example: 
[oracle@oem13c oms]$ ./emcli awrwh_reconfigure -upload_interval=12

4. List Current Configuration

This can be accomplished by the following:

[oracle@oem13c oms]$ emcli awrwh_reconfigure -list
Upload Interval (hours) = 12
Retention (years) = 5
Dump Location = /awrdata/awrw/
AWR Warehouse reconfigured successfully
[oracle@oem13c oms]$
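
If you script these changes, you may want to verify the result automatically. Here is a minimal Python sketch that parses the listing into a dictionary. It assumes the output keeps the "Key = Value" layout shown above; the function name `parse_awrwh_list` is made up for this example, and the sample text is simply the output pasted above.

```python
def parse_awrwh_list(output):
    """Turn 'Key = Value' lines into a dict, ignoring any other lines."""
    settings = {}
    for line in output.splitlines():
        key, separator, value = line.partition(" = ")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


sample = """Upload Interval (hours) = 12
Retention (years) = 5
Dump Location = /awrdata/awrw/
AWR Warehouse reconfigured successfully"""

config = parse_awrwh_list(sample)
print(config["Retention (years)"])  # 5
```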

Reference
EM13c: How To Change Retention Period Of AWR Warehouse Repository Database In 13.2 OEM Cloud Control (Doc ID 2247437.1)
EM13c: How To Change Staging Location Of Dump Files On AWR Warehouse Repository Database In 13.2 OEM Cloud Control (Doc ID 2247439.1)
EM13c: How To Change Upload Interval Of SnapShots In AWR Warehouse Repository Database In 13.2 OEM Cloud Control (Doc ID 2247438.1)

OEM: The ILOM server is currently offline or unreachable on the network.

Hi all!
Just got an alarm from OEM with this message. How to check it?
– The first step is to connect to the ILOM from a DBNode.
– From there, we can test the IPv4 and/or IPv6 interfaces with ping, as shown below.

This is also documented here: Oracle Integrated Lights Out Manager (ILOM) 3.0 HTML Documentation Collection – Test IPv4 or IPv6 Network Configuration (CLI)

In my case, it was only a false alarm, as I was able to reach the DBNodes from this ILOM:

[root@greporasrv01db01 ~]# ssh greporasrv01-ilom.grepora.com
The authenticity of host 'greporasrv01-ilom.grepora.com (10.48.18.64)' can't be established.
RSA key fingerprint is 59:c5:9f:b1:60:59:15:16:94:c8:94:88:7b:4e:52:57.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'greporasrv01-ilom.grepora.com' (RSA) to the list of known hosts.
Password: 

Oracle(R) Integrated Lights Out Manager

Version 3.2.9.23 r116695

Copyright (c) 2017, Oracle and/or its affiliates. All rights reserved.

Warning: HTTPS certificate is set to factory default.

Hostname: greporasrv01-ilom

-> show /SP/network

 /SP/network
    Targets:
        interconnect
        ipv6
        test

    Properties:
        commitpending = (Cannot show property)
        dhcp_clientid = none
        dhcp_server_ip = none
        ipaddress = 10.50.12.64
        ipdiscovery = static
        ipgateway = 10.50.12.1
        ipnetmask = 255.255.255.0
        macaddress = 00:10:E0:95:73:E6
        managementport = MGMT
        outofbandmacaddress = 00:10:E0:95:73:E6
        pendingipaddress = 10.50.12.64
        pendingipdiscovery = static
        pendingipgateway = 10.50.12.1
        pendingipnetmask = 255.255.255.0
        pendingmanagementport = MGMT
        pendingvlan_id = (none)
        sidebandmacaddress = 00:10:E0:95:73:E7
        state = ipv4-only
        vlan_id = (none)

    Commands:
        cd
        set
        show

-> cd /SP/network/test
/SP/network/test

-> show

 /SP/network/test
    Targets:

    Properties:
        ping = (Cannot show property)
        ping6 = (Cannot show property)

    Commands:
        cd
        set
        show

-> set ping=10.50.12.51       -- DBNode1
Ping of 10.50.12.51 succeeded

-> set ping=10.50.12.52       -- DBNode2
Ping of 10.50.12.52 succeeded

 

OEM 13C Patching Agent: [ERROR- Failed to Update Target Type Metadata]

While applying a patch to OEM 13c, specifically to the OMS Agent, I got this error when trying to start the agent again:

Collection Status                            : Collections enabled
Heartbeat Status       : OMS responded illegally [ERROR- Failed to Update Target Type Metadata]
Last attempted heartbeat to OMS              : 2018-04-17 12:12:51
Last successful heartbeat to OMS             : (none)
Next scheduled heartbeat to OMS              : 2018-04-17 12:13:21
---------------------------------------------------------------
Agent is Running and Ready
[oracle@oem13c oms]$ ./emctl upload agent
Oracle Enterprise Manager Cloud Control 13c Release 2
Copyright (c) 1996, 2016 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
EMD upload error:full upload has failed: uploadXMLFiles skipped :: OMS version not checked yet. If this issue persists check trace files for ping to OMS related errors. (OMS_DOWN)
[oracle@oem13c oms]$ ./emctl pingOMS
Oracle Enterprise Manager Cloud Control 13c Release 2
Copyright (c) 1996, 2016 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
EMD pingOMS error: OMS sent an invalid response: “ERROR- Failed to Update Target Type Metadata”

Nice, huh?
I found the MOS note EM 13c Agent : pingOMS error: OMS sent an invalid response: “ERROR- Failed to Update Target Type Metadata” (Doc ID 2318564.1), which says:

“This particular issue is caused when any Agent Plugin is upgraded to a higher level than the OMS plugin.”

The solution, according to the MOS note, is to roll back the Agent plugins that are ahead of the OMS version. Checking it:

Monitoring Your Oracle Database With Grafana

Hi everybody,

Let’s talk about Dashboarding Oracle Databases with Grafana.

I always felt the need for a graphical monitoring tool for basic database things such as archive volume, archive backups, state of services, offline disks, diskgroup space, UNDO usage, TEMP usage, filesystem space, and the space of every diskgroup in all clusters. OEM seems just too complicated to give a simple online graphical dashboard for this.

So I developed a data “collector” that sends the data to InfluxDB and generates these graphs. As simple as that.

Have a look at how it looks:

[Image: grafana1]

[Image: grafana2]

OK, but how did I do it?
Here goes a piece of code:

Verifying topology of Exadata

Hey all!
Some time ago I needed to check the topology of a client’s Exadata due to a network issue and made a very useful note. Sharing it with you now. 😀

This and other cool commands can be found here: Network Diagnostics information for Oracle Database Machine Environments (Doc ID 1053498.1)

# /opt/oracle.SupportTools/ibdiagtools/verify-topology -t quarterrack

Newer versions don’t require the -t option.
For a half rack, -t halfrack should be used instead.

OK, but how do you know the rack type? You can get it from here:

[root@greporaexa onecommand]# grep -i MACHINETYPES databasemachine.xml
[MACHINETYPES]X4-2 Eighth Rack HC 4TB[/MACHINETYPES]

Hope it helps! 🙂