OEM: Metric “Tablespace Allocation Metric” not Collected – Agent is Running but Not Ready

Hi all,

Here’s an interesting case with OEM. A client reported that the metric “Tablespace Allocation Metric” was not being updated in OEM for a specific database. In this case, the last collection was in November 2020, as you’ll see.

As usual, the first step was to check the OEM agent status, and here is what I got:

oracle:dbserver@mydb02 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version          : 12.1.0.5.0
OMS Version            : (unknown)
Protocol Version       : 12.1.0.1.0
Agent Home             : /u01/app/oracle/product/agent12c/agent_inst
Agent Log Directory    : /u01/app/oracle/product/agent12c/agent_inst/sysman/log
Agent Binaries         : /u01/app/oracle/product/agent12c/core/12.1.0.5.0
Agent Process ID       : 61641
Parent Process ID      : 61394
Currently initializing component             : Target Manager (2) (54 of 70)
Receivelet Interaction Manager Current Activity: Outstanding receivelet event tasks
----------------------------------
        TargetID = oracle_pdb.c4test_PDB1 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29 
        TargetID = oracle_pdb.c3test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29 
        TargetID = oracle_pdb.c3test_PDB2 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30 
        TargetID = oracle_pdb.c4test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29 
        TargetID = oracle_pdb.c6test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29 
        TargetID = oracle_pdb.c3test_PDB3 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30 
        TargetID = rac_database.c1prod - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30 


Target Manager Current Activity              : Compute Dynamic Properties (total operations: 37, active: 7, finished: 28)


Current target operations in progress
-------------------------------------
        oracle_pdb.c6test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
        oracle_pdb.c4test_PDB1 - LOAD_TARGET_DYNAMIC running for 120 seconds
        oracle_pdb.c3test_PDB2 - LOAD_TARGET_DYNAMIC running for 120 seconds
        oracle_pdb.c3test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
        oracle_pdb.c4test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
        oracle_pdb.c3test_PDB3 - LOAD_TARGET_DYNAMIC running for 120 seconds
        rac_database.c1test - LOAD_TARGET_DYNAMIC running for 120 seconds


Dynamic property executor tasks running
------------------------------


---------------------------------------------------------------
Agent is Running but Not Ready
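
The status output above lists the targets whose dynamic properties are still being computed while the agent reports Not Ready. As a rough sketch, assuming the output is piped in and keeps the line layout shown here, a small awk filter can pull out each target and how long it has been running. The function name `stuck_targets` is made up for this example:

```shell
# Hypothetical helper: reads `emctl status agent` output on stdin and prints
# "<target> <seconds>" for every LOAD_TARGET_DYNAMIC operation still in progress.
stuck_targets() {
  awk '/ - LOAD_TARGET_DYNAMIC running for [0-9]+ seconds/ { print $1, $(NF - 1) }'
}

# Usage (binary path taken from the post):
#   /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent | stuck_targets
```

Piping the result through `sort -k2 -n` would put the longest-running targets last.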

The agent is not ready, which is interesting.
Next, I tried clearing the agent state, as this has solved similar cases before:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl clearstate agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
EMD clearstate completed successfully

Now running the problematic metric manually:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl control agent runCollection c1test_DW:oracle_pdb tbspAllocation
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
EMD runCollection error:The agent is running but is currently not ready to accept client requests

OK, let’s just try to force an upload:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl upload
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD upload error:The agent is running but is currently not ready to accept client requests

Maybe something is stuck, so let’s stop the agent and start all over again:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl stop agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
Stopping agent ...

 stopped.

Making sure we have no agent process running:

oracle:dbserver02@c1test2 /u01/app/oracle:  ps -ef | grep java | grep agent
oracle:dbserver02@c1test2 /u01/app/oracle:

Also adjusting the threshold for metric runs:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl  setproperty agent -a
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
EMD setproperty succeeded
oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl setproperty agent -allow_new -name _cancelThread  -value 210
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
EMD setproperty succeeded

And starting the agent:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
Starting agent ........................................................................................................................... started but not ready.
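
Since the agent can stay in “started but not ready” for a while, polling the status is less tedious than re-running it by hand. The loop below is only a sketch: the helper name `wait_for_agent` and its arguments are hypothetical, and it simply looks for the “Agent is Running and Ready” line that `emctl status agent` prints once the agent is healthy.

```shell
# Hypothetical helper: run a status command up to <attempts> times, sleeping
# <delay> seconds after each failed try, until its output reports a ready agent.
# Usage: wait_for_agent <attempts> <delay> <status command...>
wait_for_agent() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" | grep -q 'Agent is Running and Ready'; then
      echo "ready after $i check(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still not ready after $attempts check(s)"
  return 1
}

# Example: check every 30 seconds, 20 times at most.
#   wait_for_agent 20 30 /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent
```

The non-zero return code on timeout makes it easy to chain into a script that should only continue once the agent is ready.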

On the Agent log:

oracle:dbserver02@c1test2 /u01/app/oracle: tail /u01/app/oracle/product/agent12c/agent_inst/sysman/log/gcagent.log
oracle:dbserver02@c1test2 /u01/app/oracle:
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - StatusAgentAction: satisfyRequest Begin
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - Agent Overall Health: 0
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - StatusAgentAction: satisfyRequest End
Response:
initializing
2020-12-20 13:15:03,457 [35:686116F5] INFO - >>> Reporting response: StatusAgentResponse (initializing) (request id 1) <<<
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - closing request input stream for "StatusAgentRequest (AGENT timeout:300)"
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - overriding the buffer with a thread local copy (size: 8192b)
2020-12-20 13:15:03,458 [35:686116F5] DEBUG - closing request output stream for "StatusAgentRequest (AGENT timeout:300)"
2020-12-20 13:15:03,458 [35:686116F5] DEBUG - StatusAgentAction.call() is complete.
2020-12-20 13:15:03,458 [35:B5326F3F:HTTP Listener-35 - /emd/lifecycle/main/] DEBUG - removing entry for emdctl@18081@dbserver02=>[160849530330001] completely
2020-12-20 13:15:03,458 [35:B5326F3F] DEBUG - requests executed.
2020-12-20 13:15:03,458 [35:B5326F3F] DEBUG - HTTPListener Threads deallocated resource back to LifecycleRequestHandler partition
2020-12-20 13:15:03,458 [35:3C0B0663:HTTP Listener-35] DEBUG - using connection SCEP@1197017148 [d=true,io=1,w=true,b=false|false],NOT_HANDSHAKING, in/out=0/0 Status = OK HandshakeStatus = NOT_HANDSHAKING
bytesConsumed = 5 bytesProduced = 26
2020-12-20 13:15:03,780 [35:3C0B0663] DEBUG - using connection SCEP@1197017148  [d=true,io=1,w=true,b=false|false],NOT_HANDSHAKING, in/out=0/0 Status = OK HandshakeStatus = NOT_HANDSHAKING
bytesConsumed = 26 bytesProduced = 5
2020-12-20 13:15:06,986 [31:858161EB] DEBUG - Submitting task SchedulerHeartbeat for execution
2020-12-20 13:15:06,986 [395:1AE716D8] DEBUG - Begin task SchedulerHeartbeat on Thread: GC.SysExecutor.8
2020-12-20 13:15:06,986 [395:F944F4C8:GC.SysExecutor.8 (SchedulerHeartbeat)] DEBUG - Scheduler heartbeat
2020-12-20 13:15:06,988 [395:F944F4C8] DEBUG - Scheduling next SchedulerHeartbeat after delay 29998 including periodShift of 0 milliseconds
2020-12-20 13:15:06,988 [395:1AE716D8:GC.SysExecutor.8] DEBUG - End task SchedulerHeartbeat
2020-12-20 13:15:07,016 [31:858161EB] DEBUG - Submitting task HeapMonitorTask for execution
2020-12-20 13:15:07,017 [396:1AE716D9] DEBUG - Begin task HeapMonitorTask on Thread: GC.SysExecutor.9
2020-12-20 13:15:07,017 [396:391F60D7:GC.SysExecutor.9 (HeapMonitorTask)] DEBUG - Scheduling next HeapMonitorTask after delay 5000 including periodShift of 0 milliseconds
2020-12-20 13:15:07,017 [396:1AE716D9:GC.SysExecutor.9] DEBUG - End task HeapMonitorTask
2020-12-20 13:15:12,017 [31:858161EB] DEBUG - Submitting task HeapMonitorTask for execution
2020-12-20 13:15:12,017 [37:1AE716D0] DEBUG - Begin task HeapMonitorTask on Thread: GC.SysExecutor.0
2020-12-20 13:15:12,017 [37:FE21F10E:GC.SysExecutor.0 (HeapMonitorTask)] DEBUG - Scheduling next HeapMonitorTask after delay 5000 including periodShift of 0 milliseconds
2020-12-20 13:15:12,017 [37:1AE716D0:GC.SysExecutor.0] DEBUG - End task HeapMonitorTask
2020-12-20 13:15:12,189 [33:6D553CF6] DEBUG - HTTPListener Threads deallocated resource back to LifecycleRequestHandler partition
2020-12-20 13:15:12,190 [35:3C0B0663] DEBUG - using connection SCEP@1611645943  [d=true,io=1,w=true,b=false|false],NOT_HANDSHAKING, in/out=0/0 Status = OK HandshakeStatus = NOT_HANDSHAKING
bytesConsumed = 100 bytesProduced = 121
2020-12-20 13:15:12,191 [35:7107E334:HTTP Listener-35 - /emd/persistence/main/] DEBUG - HTTPListener Threads allocated resource from LifecycleRequestHandler partition
2020-12-20 13:15:17,017 [31:858161EB] DEBUG - Submitting task HeapMonitorTask for execution
2020-12-20 13:15:17,018 [45:1AE716D1] DEBUG - Begin task HeapMonitorTask on Thread: GC.SysExecutor.1
2020-12-20 13:15:17,018 [45:CBCC52CF:GC.SysExecutor.1 (HeapMonitorTask)] DEBUG - Scheduling next HeapMonitorTask after delay 5000 including periodShift of 0 milliseconds
2020-12-20 13:15:17,018 [45:1AE716D1:GC.SysExecutor.1] DEBUG - End task HeapMonitorTask
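
The tail above is almost entirely DEBUG noise, so it can help to see the mix of log levels before reading line by line. This is a minimal sketch, assuming every entry starts with a date and carries its level as `] LEVEL - `, as in the excerpt. The function name is made up, and WARN is an assumed level name that does not appear in this post:

```shell
# Hypothetical helper: count gcagent.log entries per level, reading the log on stdin.
# Continuation lines (such as "Response:" or "bytesConsumed = ...") are ignored.
log_level_counts() {
  awk '
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / {
      if (match($0, /\] (DEBUG|INFO|WARN|ERROR) - /)) {
        level = substr($0, RSTART + 2, RLENGTH - 5)
        count[level]++
      }
    }
    END { for (l in count) print l, count[l] }
  ' | sort
}

# Usage (log path taken from the post):
#   log_level_counts < /u01/app/oracle/product/agent12c/agent_inst/sysman/log/gcagent.log
```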

Following MOS note Enterprise Manager12c: Oracle Database Tablespace Monthly Space Usage shows no data (Doc ID 1536654.1), a few changes were made:

$AGENT_INST/bin/emctl setproperty agent -allow_new -name MaxInComingConnections -value 150
$AGENT_INST/bin/emctl setproperty agent -allow_new -name _cancelThread  -value 210
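
To confirm that both properties were actually stored, the values can be read back. The sketch below assumes the agent keeps them as plain `name=value` lines in `emd.properties` under the agent instance’s `sysman/config` directory. That location and the helper name `prop_value` are assumptions for illustration, not something stated in the MOS note:

```shell
# Hypothetical helper: print the value of one property from name=value lines on stdin.
prop_value() {
  awk -F= -v key="$1" '$1 == key { sub(/^[^=]*=/, ""); print; exit }'
}

# Usage (assumed file location):
#   prop_value MaxInComingConnections < /u01/app/oracle/product/agent12c/agent_inst/sysman/config/emd.properties
```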

The status before the change:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version          : 12.1.0.5.0
OMS Version            : (unknown)
Protocol Version       : 12.1.0.1.0
Agent Home             : /u01/app/oracle/product/agent12c/agent_inst
Agent Log Directory    : /u01/app/oracle/product/agent12c/agent_inst/sysman/log
Agent Binaries         : /u01/app/oracle/product/agent12c/core/12.1.0.5.0
Agent Process ID       : 61641
Parent Process ID      : 61394
Currently initializing component             : Target Manager (2) (54 of 70)
Receivelet Interaction Manager Current Activity: Outstanding receivelet event tasks
----------------------------------
        TargetID = oracle_pdb.c4test_PDB1 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29 
        TargetID = oracle_pdb.c3test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29 
        TargetID = oracle_pdb.c3test_PDB2 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30 
        TargetID = oracle_pdb.c4test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29 
        TargetID = oracle_pdb.c6test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29 
        TargetID = oracle_pdb.c3test_PDB3 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30 
        TargetID = rac_database.c1test - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30 

Target Manager Current Activity              : Compute Dynamic Properties (total operations: 37, active: 7, finished: 28)

Current target operations in progress
-------------------------------------
        oracle_pdb.c6test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
        oracle_pdb.c4test_PDB1 - LOAD_TARGET_DYNAMIC running for 120 seconds
        oracle_pdb.c3test_PDB2 - LOAD_TARGET_DYNAMIC running for 120 seconds
        oracle_pdb.c3test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
        oracle_pdb.c4test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
        oracle_pdb.c3test_PDB3 - LOAD_TARGET_DYNAMIC running for 120 seconds
        rac_database.c1test - LOAD_TARGET_DYNAMIC running for 120 seconds

Dynamic property executor tasks running
------------------------------


---------------------------------------------------------------
Agent is Running but Not Ready

And the status after the change:

oracle:dbserver02@c1test2 /u01/app/oracle:  /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version          : 12.1.0.5.0
OMS Version            : 12.1.0.5.0
Protocol Version       : 12.1.0.1.0
Agent Home             : /u01/app/oracle/product/agent12c/agent_inst
Agent Log Directory    : /u01/app/oracle/product/agent12c/agent_inst/sysman/log
Agent Binaries         : /u01/app/oracle/product/agent12c/core/12.1.0.5.0
Agent Process ID       : 56994
Parent Process ID      : 56654
Agent URL              : https://dbserver02:3872/emd/main/
Local Agent URL in NAT : https://dbserver02:3872/emd/main/
Repository URL         : https://omsweb:4903/empbs/upload
Started at             : 2020-12-20 13:08:35
Started by user        : oracle
Operating System       : Linux version 3.10.0-957.27.2.el7.x86_64 (amd64)
Last Reload            : (none)
Last successful upload                       : 2020-12-20 13:40:41
Last attempted upload                        : 2020-12-20 13:40:41
Total Megabytes of XML files uploaded so far : 1.02
Number of XML files pending upload           : 0
Size of XML files pending upload(MB)         : 0
Available disk space on upload filesystem    : 10.85%
Collection Status                            : Collections enabled
Heartbeat Status                             : Ok
Last attempted heartbeat to OMS              : 2020-12-20 13:40:40
Last successful heartbeat to OMS             : 2020-12-20 13:40:40
Next scheduled heartbeat to OMS              : 2020-12-20 13:41:40

---------------------------------------------------------------
Agent is Running and Ready

Great! Agent issue resolved.
However, the metric is still not being gathered, not even after running it manually:

oracle:dbserver01@c1test1 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl control agent runCollection c1test_CDBROOT:oracle_pdb tbspAllocation
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
EMD runCollection completed successfully

oracle:dbserver01@c1test1 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent scheduler | grep tbspAllocation
2020-12-28 23:05:14.562 : rac_database:c1test:tbspAllocation_cdb
2020-12-29 03:07:21.988 : rac_database:c4prod:tbspAllocation_cdb
2020-12-29 03:08:11.888 : rac_database:c6prod:tbspAllocation_cdb
2020-12-29 03:09:39.103 : rac_database:c2prod:tbspAllocation_cdb
2020-12-29 03:09:55.372 : rac_database:c3prod:tbspAllocation_cdb

oracle:dbserver01@c1test1 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl control agent runCollection c1test_DW:oracle_pdb tbspAllocation
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
EMD runCollection completed successfully

On OEM Repository database:

SQL> select TARGET_NAME,TARGET_TYPE,TARGET_GUID,max(ROLLUP_TIMESTAMP )
from mgmt$metric_daily where TARGET_NAME like '%c1test%'
and TARGET_TYPE='oracle_pdb'
and METRIC_NAME='tbspAllocation'
group by TARGET_NAME,TARGET_TYPE,TARGET_GUID;

TARGET_NAME                    TARGET_TYPE          TARGET_GUID                      MAX(ROLLUP_TIMESTAM
------------------------------ -------------------- -------------------------------- -------------------
c1test_DW         oracle_pdb           7B1DF5DD4555EB978330A6D522004D44 2020-11-12 00:00:00
c1test_CDBROOT    oracle_pdb           4CE72911295C0287E053837F649B7D0E 2020-11-12 00:00:00
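
To put a number on how stale the metric is, the last rollup date can be compared with the day of the agent investigation (2020-12-20 in the status output earlier). A small sketch, assuming GNU `date` as found on Linux; the helper name is hypothetical:

```shell
# Hypothetical helper: whole days between two dates (YYYY-MM-DD), using GNU date.
days_between() {
  start=$(date -u -d "$1" +%s)
  end=$(date -u -d "$2" +%s)
  echo $(( (end - start) / 86400 ))
}

# Last rollup vs. the day the agent was investigated:
days_between 2020-11-12 2020-12-20   # prints 38
```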


SQL> select TARGET_NAME,TARGET_TYPE,TARGET_GUID,ROLLUP_TIMESTAMP from mgmt$metric_daily where TARGET_NAME like '%c1test%' and TARGET_TYPE='oracle_pdb' and ROLLUP_TIMESTAMP>sysdate-3 order by 4

TARGET_NAME                    TARGET_TYPE          ROLLUP_TIMESTAMP       AVERAGE
------------------------------ -------------------- ------------------- ----------
c1test_DW         oracle_pdb           2020-11-06 00:00:00  1575.9375
c1test_DW         oracle_pdb           2020-11-07 00:00:00  1575.9375
c1test_DW         oracle_pdb           2020-11-08 00:00:00  1575.9375
c1test_DW         oracle_pdb           2020-11-09 00:00:00  1575.9375
c1test_DW         oracle_pdb           2020-11-10 00:00:00  1575.9375
c1test_DW         oracle_pdb           2020-11-11 00:00:00  1575.9375
c1test_DW         oracle_pdb           2020-11-12 00:00:00  1575.9375
c1test_CDBROOT    oracle_pdb           2020-11-05 00:00:00 37581.5625


TARGET_NAME                    TARGET_TYPE          ROLLUP_TIMESTAMP       AVERAGE
------------------------------ -------------------- ------------------- ----------
c1test_CDBROOT    oracle_pdb           2020-11-08 00:00:00  227138.75
c1test_CDBROOT    oracle_pdb           2020-11-09 00:00:00 455087.688
c1test_CDBROOT    oracle_pdb           2020-11-10 00:00:00 278230.875
c1test_CDBROOT    oracle_pdb           2020-11-11 00:00:00 208727.188
c1test_CDBROOT    oracle_pdb           2020-11-12 00:00:00 454964.063

OK, so in summary: after fixing all the issues on the OEM side, with everything running fine, the database metrics were still not being updated.

Long story short: after some investigation, I bumped into MOS note Database Hangs With Simple Queries like on view dba_data_files & dba_free_space (Doc ID 2665935.1).
It turns out this seemed to be a match, so I proceeded with the recommendation:

SQL> alter session set container=DW;

Session altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         3 DW                             READ WRITE NO
SQL> select count(*) from dba_recyclebin;

  COUNT(*)
----------
     28522

SQL> purge recyclebin;

Recyclebin purged.

SQL> purge dba_recyclebin;

DBA Recyclebin purged.

Once this was done, all issues were solved and the metric was being collected again.

Some additional references:

  • Database Tablespace Metrics: Tablespace Allocation Is Not Collected (Metric tbspAllocation) (Doc ID 404692.1)
  • EM 12c : emctl start agent Fails With Error ‘Starting agent … started but not ready’ (Doc ID 1591477.1)
  • EM12c : emctl start / status agent ‘Agent Running but Not Ready’ ‘ERROR – The agent is overloaded [current requests: 30]’ Reported in gcagent.log (Doc ID 1546529.1)

I hope it helps!

Exadata Healthcheck – Top 5 Tools and Features!

Hi all,
The countless great tools we have out of the box to help us with our daily tasks are nothing new for Oracle DBAs: ORAchk/EXAchk/ODAchk, Database Security Assessment Tool (DBSAT), Hang Manager, Cluster Health Advisor (CHA), Cluster Verification Utility (CVU), Memory Guard, Trace File Analyzer (TFA) with tools like oratop, procwatcher, oswatcher, pstack, RDA, and the list goes on and on…

The good news is, most of the tools are now together on the Autonomous Health Framework (AHF), since version 12.2. None of those tools are running by default though, so you might need to choose some to start and enable on your environment.

But out of this whole list, what if we had to choose the top 5 features we can and should use as a start for an Exadata environment? Well, I made my list; see it below.

Oh, and by the way, you don’t pay anything extra for them, provided you already have Oracle Support Services!

1. Cluster Health Advisor – Calibrate your Exa Environment!

Available along with the AHF since 12.2, the CHA works alongside the Cluster Health Monitor to provide you with fine-grained notifications and correlations about your environment. And when I say it, I mean it: YOUR environment. This is because the CHA works better if you calibrate it with your statistics. As usual, not the worst problematic day or the low-workload night, but an average day which can be used as a reference. All this is stored in the GIMR (as shown below) and used for future comparison and model inference.

This means the CHA is not a long list of IFs with fixed metrics, but an intelligent tool monitoring over 127 processes that perform work based on your workload. Not only this, the CHA is enriched with Machine Learning algorithms that model over 30 known DB problems based on over 150 metric predictors.

 

An example of inference can be seen below, where network and Global Cache statistics are used to infer a network issue.

Not rocket science, but always nice to have someone digesting tons of logs and metrics and reaching this sort of conclusion unassisted, right? You as DBA can steal all credits for the finding, no hard feelings.

And this is just one of the things CHA provides. It has tons of other functionalities. You should try using it more!

 

2. EXAchk – Daily Automated Runs (and Reports)

Most likely, if you have an Exadata, you are used to running EXAchk from time to time to review the recommendations and best practices for your environment. It requires almost no effort to run and to copy the reports, or you have most likely created a script to do so. What if I told you Oracle has now automated this with AHF?

All you need to do is to confirm the scheduled runs and set the address for the reports to be sent. Find below a quick Cheatsheet:

a. Checking Status of the EXAchk

[root@exa01dbadm01 ~]# exachk -d info
------------------------------------------------------------

Master node = exa01dbadm01

exachk daemon version = 211300

Install location = /opt/oracle.ahf/exachk

Started at = Wed Jun 16 11:58:03 MDT 2021

Scheduler type = TFA Scheduler


[root@exa01dbadm01 ~]# exachk -d status
exachk is using TFA Scheduler. TFA PID: 369350

b. Checking TFA Daemon Status and Auto Start

[root@exa01dbadm01 ~]# ahfctl statusahf

.-----------------------------------------------------------------------------------------------------.
| Host         | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+--------------+---------------+--------+------+------------+----------------------+------------------+
| exa01dbadm01 | RUNNING       | 369350 | 5000 | 21.1.3.0.0 | 21130020210607124914 | COMPLETE         |
| exa01dbadm02 | RUNNING       | 118950 | 5000 | 21.1.3.0.0 | 21130020210607124914 | COMPLETE         |
'--------------+---------------+--------+------+------------+----------------------+------------------'

------------------------------------------------------------

Master node = exa01dbadm01

exachk daemon version = 211300

Install location = /opt/oracle.ahf/exachk

Started at = Wed Jun 16 11:58:03 MDT 2021

Scheduler type = TFA Scheduler

------------------------------------------------------------
ID: exachk.autostart_client_exatier1
------------------------------------------------------------
AUTORUN_FLAGS = -usediscovery -profile exatier1 -syslog -dball -showpass -tag autostart_client_exatier1 -readenvconfig
COLLECTION_RETENTION = 7
AUTORUN_SCHEDULE = 3 2 * * 1,2,3,4,5,6
------------------------------------------------------------
------------------------------------------------------------
ID: exachk.autostart_client
------------------------------------------------------------
AUTORUN_FLAGS = -usediscovery -syslog -tag autostart_client -readenvconfig
COLLECTION_RETENTION = 14
AUTORUN_SCHEDULE = 3 3 * * 0
------------------------------------------------------------

Next auto run starts on Jun 17, 2021 02:03:00

ID:exachk.AUTOSTART_CLIENT_EXATIER1

c. Gather EXAchk Next Automated Run

[root@exa01dbadm01 ~]# exachk -d nextautorun

Next auto run starts on Jun 17, 2021 02:03:00

ID:exachk.AUTOSTART_CLIENT_EXATIER1

[root@exa01dbadm01 ~]#

d. Changing EXAchk Notifications:

[root@exa01dbadm01 ~]# exachk -get NOTIFICATION_EMAIL,AUTORUN_SCHEDULE,COLLECTION_RETENTION
------------------------------------------------------------
ID: exachk.autostart_client_exatier1
------------------------------------------------------------
COLLECTION_RETENTION = 7
AUTORUN_SCHEDULE = 3 2 * * 1,2,3,4,5,6
------------------------------------------------------------
------------------------------------------------------------
ID: exachk.autostart_client
------------------------------------------------------------
COLLECTION_RETENTION = 14
AUTORUN_SCHEDULE = 3 3 * * 0
------------------------------------------------------------


[root@exa01dbadm01 ~]# exachk -id autostart_client -set NOTIFICATION_EMAIL=boesing@pythian.com

Updated attribute ['NOTIFICATION_EMAIL=boesing@pythian.com'] for Id[exachk.AUTOSTART_CLIENT]

Successfully copied Daemon Store to Remote Nodes


[root@exa01dbadm01 ~]# exachk -get NOTIFICATION_EMAIL,AUTORUN_SCHEDULE,COLLECTION_RETENTION
------------------------------------------------------------
ID: exachk.autostart_client_exatier1
------------------------------------------------------------
COLLECTION_RETENTION = 7
AUTORUN_SCHEDULE = 3 2 * * 1,2,3,4,5,6
------------------------------------------------------------
------------------------------------------------------------
ID: exachk.autostart_client
------------------------------------------------------------
NOTIFICATION_EMAIL = boesing@pythian.com
COLLECTION_RETENTION = 14
AUTORUN_SCHEDULE = 3 3 * * 0
------------------------------------------------------------

[root@exa01dbadm01 ~]# exachk -id autostart_client_exatier1 -set NOTIFICATION_EMAIL=boesing@pythian.com
Updated attribute ['NOTIFICATION_EMAIL=boesing@pythian.com'] for Id[exachk.AUTOSTART_CLIENT_EXATIER1]

Successfully copied Daemon Store to Remote Nodes


[root@exa01dbadm01 ~]# exachk -get NOTIFICATION_EMAIL,AUTORUN_SCHEDULE,COLLECTION_RETENTION
------------------------------------------------------------
ID: exachk.autostart_client_exatier1
------------------------------------------------------------
NOTIFICATION_EMAIL = boesing@pythian.com
COLLECTION_RETENTION = 7
AUTORUN_SCHEDULE = 3 2 * * 1,2,3,4,5,6
------------------------------------------------------------
------------------------------------------------------------
ID: exachk.autostart_client
------------------------------------------------------------
NOTIFICATION_EMAIL = boesing@pythian.com
COLLECTION_RETENTION = 14
AUTORUN_SCHEDULE = 3 3 * * 0
------------------------------------------------------------

e. Change EXAchk Schedule and Retention

[root@exa01dbadm01 ~]# exachk -id autostart_client_exatier1 -set "AUTORUN_SCHEDULE=0 3 * * *"     # 3 AM daily
[root@exa01dbadm01 ~]# exachk -id autostart_client -set "collection_retention=90"
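
The AUTORUN_SCHEDULE values in this section use the familiar five-field cron layout: minute, hour, day of month, month, day of week, with 0 being Sunday. As a quick sanity check before setting a new schedule, a tiny sketch like the following can spell out the fields. The helper name is hypothetical, and it only labels the fields without validating ranges:

```shell
# Hypothetical helper: label the five cron-style fields of an AUTORUN_SCHEDULE value.
describe_schedule() {
  # A here-document is used so that "*" is read literally, not expanded as a glob.
  read -r minute hour dom month dow <<EOF
$1
EOF
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$minute" "$hour" "$dom" "$month" "$dow"
}

# Usage:
#   describe_schedule "3 2 * * 1,2,3,4,5,6"
```

Read this way, `3 2 * * 1,2,3,4,5,6` means 02:03 from Monday to Saturday, which matches the “Next auto run” time reported by the daemon.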

f. EXAchk: Testing Email Sending and Running EXAchk Report over email

This is for ad hoc testing of email delivery, outside of the scheduled runs.

[root@exa01dbadm01 ~]# exachk -testemail notification_email=boesing@pythian.com
Email Successfully sent to ['boesing@pythian.com'] from 'root@exa01dbadm01
[root@exa01dbadm01 ~]# exachk -sendemail notification_email=boesing@pythian.com


Searching for running databases . . . . .

. . . . . . . . . . . .
List of running databases registered in OCR

1. xxxxxx
2. yyyy
3. None of above

Select databases from list for checking best practices. For multiple databases, select 3 for All or comma separated number like 1,2 etc [1-3][3].
[...]
Detailed report (html) - /u01/app/oracle/oracle.ahf/data/exa01dbadm01/exachk/user_root/output/exachk_exa01dbadm01_xxxxx_061621_134748/exachk_exa01dbadm01_xxxxx_061621_134748.html

UPLOAD [if required] - /u01/app/oracle/oracle.ahf/data/exa01dbadm01/exachk/user_root/output/exachk_exa01dbadm01_xxxxxx_061621_134748.zip
Email Successfully sent to ('boesing@pythian.com',) from 'root@exa01dbadm01' with attachment

3. TFA – Sanitize and Mask Options

Even with all the concerns on sensitive data being more and more relevant, this is something that actually surprised me. It’s possible to Sanitize and Mask data in collections. For example, mask will hide your inner data (let’s say table names):

[root@exa01dbadm01 ~]# tfactl diagcollect -srdc ORA-00600 -mask

Sanitize will hide your hardware setting. Not that useful if you have an Exadata, but might be interesting if you have commodity hardware you don’t want Oracle to know about.

[root@exa01dbadm01 ~]# tfactl diagcollect -srdc ORA-00600 -sanitize

4. TFA Changes – “Nothing was Changed” Resolver Tool

This is for all the DBAs who have already had this dialogue:

Client: Yesterday was running fine, and today it’s veeeery slow. Nothing was changed!
DBA: Something changed, that’s for sure.
Client: Absolutely nothing changed.

So now we can assess whether indeed nothing changed from the client’s perspective (perhaps an automatic statistics gathering or something) or whether somebody did something that is hard to identify.

It takes parameters from the OS and DB and keeps track of old and new values, reporting changes:

[root@exa01dbadm01 ~]# tfactl changes

Output from host : exa01dbadm02
------------------------------
No Changes Found

Output from host : exa01dbadm01
------------------------------
[Nov/14/2021 00:08:33.000]: [db.dbprod19.dbprod191]: Parameter: log_archive_dest_2: Value: service=dbprod19stb => ASYNC NOAFFIRM delay=240 optional compression=disable max_failure=0 reopen=300 db_unique_name=dbprod19stb net_timeout=300
[Nov/14/2021 00:08:33.000]: [db.dbprod19.dbprod191]: Parameter: log_archive_dest_2: Value: service=dbprod19stb => valid_for=(online_logfile,all_roles)
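
On a busy system this report can get long, and a single parameter may show up on several lines, as above. A rough sketch for condensing it, assuming the `[timestamp]: [component]: Parameter: name: Value: ...` layout seen in this sample (the helper name is made up, and other kinds of change lines are simply skipped):

```shell
# Hypothetical helper: reduce `tfactl changes` parameter lines on stdin to
# "<timestamp> <component> <parameter>", dropping duplicates.
changed_params() {
  sed -n -E 's/^\[([^]]+)\]: \[([^]]+)\]: Parameter: ([^:]+): Value: .*$/\1 \2 \3/p' | sort -u
}

# Usage:
#   tfactl changes | changed_params
```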

5. Oracle Health Check Collections Manager

Not a surprise if you don’t know this tool, but I’d really recommend you do look for it now. It’s a great tool and as with everything in this post, it’s free!

Oracle Health Check Collections Manager is an APEX companion application to Oracle EXAchk that gives you an enterprise-wide view of your health check collection data. All you need is APEX version 4.2 or 5 to deploy the tool. The main idea is that you can consolidate all your reports in one place and, as a plus, you can manage all your EXAchk reports over time, including a view of any regressions in the checked items.

This is an example of the view of the collections:

And this is an example of a new best practices failure:

Do you agree with my top list? Let me know your thoughts!

OEM 12.5: Tablespace Allocation Metric Not Collected—Agent Is Running but Not Ready

Hi all,

First of all, accept our apologies for the long period without posts. We are about to resolve it and restart with posts. 2021 was such a crazy year, but all is settling down.

Jumping to what you are here for: I ran across an interesting case within OEM 12 (Oracle Enterprise Manager) Release 5. A client reported the “tablespace allocation metric” was not being updated on OEM for a specific database. In this case, the most recent gathering was done in November 2020, as I’ll show you shortly. We discovered the problem one month later. This post describes what we did to solve the issue.

As usual, the first thing I did was check on the Oracle Enterprise Manager (OEM) agent status. This is what it said:

oracle:dbserver@mydb02 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 12.1.0.5.0
OMS Version : (unknown)
Protocol Version : 12.1.0.1.0
Agent Home : /u01/app/oracle/product/agent12c/agent_inst
Agent Log Directory : /u01/app/oracle/product/agent12c/agent_inst/sysman/log
Agent Binaries : /u01/app/oracle/product/agent12c/core/12.1.0.5.0
Agent Process ID : 61641
Parent Process ID : 61394
Currently initializing component : Target Manager (2) (54 of 70)
Receivelet Interaction Manager Current Activity: Outstanding receivelet event tasks
----------------------------------
TargetID = oracle_pdb.c4test_PDB1 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_PDB2 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30
TargetID = oracle_pdb.c4test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c6test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_PDB3 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30
TargetID = rac_database.c1prod - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30


Target Manager Current Activity : Compute Dynamic Properties (total operations: 37, active: 7, finished: 28)


Current target operations in progress
-------------------------------------
oracle_pdb.c6test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c4test_PDB1 - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_PDB2 - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c4test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_PDB3 - LOAD_TARGET_DYNAMIC running for 120 seconds
rac_database.c1test - LOAD_TARGET_DYNAMIC running for 120 seconds


Dynamic property executor tasks running
------------------------------


---------------------------------------------------------------
Agent is Running but Not Ready

“Agent not ready.” Now, that’s interesting.
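If you want to script this check instead of reading the whole status output, a small helper can look only at the final status line. This is just a minimal sketch: the function name agent_is_ready is hypothetical, and it assumes the output ends with the same "Agent is Running and Ready" line shown in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `emctl status agent` output on stdin and
# succeeds only when it contains the line "Agent is Running and Ready".
agent_is_ready() {
    grep -q '^Agent is Running and Ready[[:space:]]*$'
}

# Example usage (emctl path taken from this post; adjust it to your agent home):
# EMCTL=/u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl
# if "$EMCTL" status agent | agent_is_ready; then echo "ready"; else echo "not ready"; fi
```

Matching the whole line keeps "Running but Not Ready" from being mistaken for a healthy agent.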

Next, I tried to clear the agent state as this had solved some previous similar cases:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl clearstate agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
EMD clearstate completed successfully

Then, I ran the problematic metric manually:

oracle:dbserver02@c1test2 /u01/app/oracle: runCollection c1test_DW:oracle_pdb tbspAllocation <
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD runCollection error:The agent is running but is currently not ready to accept client requests

I also tried to force an upload:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl upload
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD upload error:The agent is running but is currently not ready to accept client requests

I thought maybe something was stuck, so I decided to kill the process and start all over again:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl stop agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Stopping agent ...

stopped.

Here, I made sure I had no agent process running:

oracle:dbserver02@c1test2 /u01/app/oracle: ps -ef | grep java | grep agent
oracle:dbserver02@c1test2 /u01/app/oracle:

I also adjusted the threshold for metric running:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl setproperty agent -a
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
EMD setproperty succeeded
oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl setproperty agent -allow_new -name _cancelThread -value 210
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
EMD setproperty succeeded

Then I started the agent:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Starting agent ........................................................................................................................... started but not ready.

This is what the agent log showed:

oracle:dbserver02@c1test2 /u01/app/oracle: tail /u01/app/oracle/product/agent12c/agent_inst/sysman/log/gcagent.log
oracle:dbserver02@c1test2 /u01/app/oracle:
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - StatusAgentAction: satisfyRequest Begin
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - Agent Overall Health: 0
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - StatusAgentAction: satisfyRequest End
Response:
initializing
2020-12-20 13:15:03,457 [35:686116F5] INFO - >>> Reporting response: StatusAgentResponse (initializing) (request id 1) <<<
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - closing request input stream for "StatusAgentRequest (AGENT timeout:300)"
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - overriding the buffer with a thread local copy (size: 8192b)
2020-12-20 13:15:03,458 [35:686116F5] DEBUG - closing request output stream for "StatusAgentRequest (AGENT timeout:300)"
2020-12-20 13:15:03,458 [35:686116F5] DEBUG - StatusAgentAction.call() is complete.
2020-12-20 13:15:03,458 [35:B5326F3F:HTTP Listener-35 - /emd/lifecycle/main/] DEBUG - removing entry for emdctl@18081@dbserver02=>[160849530330001] completely
2020-12-20 13:15:03,458 [35:B5326F3F] DEBUG - requests executed.
2020-12-20 13:15:03,458 [35:B5326F3F] DEBUG - HTTPListener Threads deallocated resource back to LifecycleRequestHandler partition
2020-12-20 13:15:03,458 [35:3C0B0663:HTTP Listener-35] DEBUG - using connection SCEP@1197017148 [d=true,io=1,w=true,b=false|false],NOT_HANDSHAKING, in/out=0/0 Status = OK HandshakeStatus = NOT_HANDSHAKING
bytesConsumed = 5 bytesProduced = 26
2020-12-20 13:15:03,780 [35:3C0B0663] DEBUG - using connection SCEP@1197017148 [d=true,io=1,w=true,b=false|false],NOT_HANDSHAKING, in/out=0/0 Status = OK HandshakeStatus = NOT_HANDSHAKING
bytesConsumed = 26 bytesProduced = 5
2020-12-20 13:15:06,986 [31:858161EB] DEBUG - Submitting task SchedulerHeartbeat for execution
2020-12-20 13:15:06,986 [395:1AE716D8] DEBUG - Begin task SchedulerHeartbeat on Thread: GC.SysExecutor.8
2020-12-20 13:15:06,986 [395:F944F4C8:GC.SysExecutor.8 (SchedulerHeartbeat)] DEBUG - Scheduler heartbeat
2020-12-20 13:15:06,988 [395:F944F4C8] DEBUG - Scheduling next SchedulerHeartbeat after delay 29998 including periodShift of 0 milliseconds
2020-12-20 13:15:06,988 [395:1AE716D8:GC.SysExecutor.8] DEBUG - End task SchedulerHeartbeat
2020-12-20 13:15:07,016 [31:858161EB] DEBUG - Submitting task HeapMonitorTask for execution
2020-12-20 13:15:07,017 [396:1AE716D9] DEBUG - Begin task HeapMonitorTask on Thread: GC.SysExecutor.9
2020-12-20 13:15:07,017 [396:391F60D7:GC.SysExecutor.9 (HeapMonitorTask)] DEBUG - Scheduling next HeapMonitorTask after delay 5000 including periodShift of 0 milliseconds
2020-12-20 13:15:07,017 [396:1AE716D9:GC.SysExecutor.9] DEBUG - End task HeapMonitorTask
2020-12-20 13:15:12,017 [31:858161EB] DEBUG - Submitting task HeapMonitorTask for execution
2020-12-20 13:15:12,017 [37:1AE716D0] DEBUG - Begin task HeapMonitorTask on Thread: GC.SysExecutor.0
2020-12-20 13:15:12,017 [37:FE21F10E:GC.SysExecutor.0 (HeapMonitorTask)] DEBUG - Scheduling next HeapMonitorTask after delay 5000 including periodShift of 0 milliseconds
2020-12-20 13:15:12,017 [37:1AE716D0:GC.SysExecutor.0] DEBUG - End task HeapMonitorTask
2020-12-20 13:15:12,189 [33:6D553CF6] DEBUG - HTTPListener Threads deallocated resource back to LifecycleRequestHandler partition
2020-12-20 13:15:12,190 [35:3C0B0663] DEBUG - using connection SCEP@1611645943 [d=true,io=1,w=true,b=false|false],NOT_HANDSHAKING, in/out=0/0 Status = OK HandshakeStatus = NOT_HANDSHAKING
bytesConsumed = 100 bytesProduced = 121
2020-12-20 13:15:12,191 [35:7107E334:HTTP Listener-35 - /emd/persistence/main/] DEBUG - HTTPListener Threads allocated resource from LifecycleRequestHandler partition
2020-12-20 13:15:17,017 [31:858161EB] DEBUG - Submitting task HeapMonitorTask for execution
2020-12-20 13:15:17,018 [45:1AE716D1] DEBUG - Begin task HeapMonitorTask on Thread: GC.SysExecutor.1
2020-12-20 13:15:17,018 [45:CBCC52CF:GC.SysExecutor.1 (HeapMonitorTask)] DEBUG - Scheduling next HeapMonitorTask after delay 5000 including periodShift of 0 milliseconds
2020-12-20 13:15:17,018 [45:1AE716D1:GC.SysExecutor.1] DEBUG - End task HeapMonitorTask

Following the MOS note Enterprise Manager 12c: Oracle Database Tablespace Monthly Space Usage shows no data (Doc ID 1536654.1), I made a few changes:

$AGENT_INST/bin/emctl setproperty agent -allow_new -name MaxInComingConnections -value 150
$AGENT_INST/bin/emctl setproperty agent -allow_new -name _cancelThread -value 210

Here’s the status before the change:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 12.1.0.5.0
OMS Version : (unknown)
Protocol Version : 12.1.0.1.0
Agent Home : /u01/app/oracle/product/agent12c/agent_inst
Agent Log Directory : /u01/app/oracle/product/agent12c/agent_inst/sysman/log
Agent Binaries : /u01/app/oracle/product/agent12c/core/12.1.0.5.0
Agent Process ID : 61641
Parent Process ID : 61394
Currently initializing component : Target Manager (2) (54 of 70)
Receivelet Interaction Manager Current Activity: Outstanding receivelet event tasks
----------------------------------
TargetID = oracle_pdb.c4test_PDB1 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_PDB2 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30
TargetID = oracle_pdb.c4test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c6test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_PDB3 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30
TargetID = rac_database.c1test - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30

Target Manager Current Activity : Compute Dynamic Properties (total operations: 37, active: 7, finished: 28)

Current target operations in progress
-------------------------------------
oracle_pdb.c6test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c4test_PDB1 - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_PDB2 - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c4test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_PDB3 - LOAD_TARGET_DYNAMIC running for 120 seconds
rac_database.c1test - LOAD_TARGET_DYNAMIC running for 120 seconds

Dynamic property executor tasks running
------------------------------


---------------------------------------------------------------
Agent is Running but Not Ready

And this was the status after the change:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 12.1.0.5.0
OMS Version : 12.1.0.5.0
Protocol Version : 12.1.0.1.0
Agent Home : /u01/app/oracle/product/agent12c/agent_inst
Agent Log Directory : /u01/app/oracle/product/agent12c/agent_inst/sysman/log
Agent Binaries : /u01/app/oracle/product/agent12c/core/12.1.0.5.0
Agent Process ID : 56994
Parent Process ID : 56654
Agent URL : https://dbserver02:3872/emd/main/
Local Agent URL in NAT : https://dbserver02:3872/emd/main/
Repository URL : https://omsweb:4903/empbs/upload
Started at : 2020-12-20 13:08:35
Started by user : oracle
Operating System : Linux version 3.10.0-957.27.2.el7.x86_64 (amd64)
Last Reload : (none)
Last successful upload : 2020-12-20 13:40:41
Last attempted upload : 2020-12-20 13:40:41
Total Megabytes of XML files uploaded so far : 1.02
Number of XML files pending upload : 0
Size of XML files pending upload(MB) : 0
Available disk space on upload filesystem : 10.85%
Collection Status : Collections enabled
Heartbeat Status : Ok
Last attempted heartbeat to OMS : 2020-12-20 13:40:40
Last successful heartbeat to OMS : 2020-12-20 13:40:40
Next scheduled heartbeat to OMS : 2020-12-20 13:41:40

---------------------------------------------------------------
Agent is Running and Ready

Great! Agent issue resolved.

However, the metric was not being gathered—not even after running it manually:

oracle:dbserver01@c1test1 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl control agent runCollection c1test_CDBROOT:oracle_pdb tbspAllocation
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD runCollection completed successfully

oracle:dbserver01@c1test1 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent scheduler | grep tbspAllocation
2020-12-28 23:05:14.562 : rac_database:c1test:tbspAllocation_cdb
2020-12-29 03:07:21.988 : rac_database:c4prod:tbspAllocation_cdb
2020-12-29 03:08:11.888 : rac_database:c6prod:tbspAllocation_cdb
2020-12-29 03:09:39.103 : rac_database:c2prod:tbspAllocation_cdb
2020-12-29 03:09:55.372 : rac_database:c3prod:tbspAllocation_cdb

oracle:dbserver01@c1test1 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl control agent runCollection c1test_DW:oracle_pdb tbspAllocation
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD runCollection completed successfully

Meanwhile, on the OEM repository database:

SQL> select TARGET_NAME,TARGET_TYPE,TARGET_GUID,max(ROLLUP_TIMESTAMP )
from mgmt$metric_daily where TARGET_NAME like '%c1test%'
and TARGET_TYPE='oracle_pdb'
and METRIC_NAME='tbspAllocation'
group by TARGET_NAME,TARGET_TYPE,TARGET_GUID;

TARGET_NAME                    TARGET_TYPE          TARGET_GUID                      MAX(ROLLUP_TIMESTAM
------------------------------ -------------------- -------------------------------- -------------------
c1test_DW                      oracle_pdb           7B1DF5DD4555EB978330A6D522004D44 2020-11-12 00:00:00
c1test_CDBROOT                 oracle_pdb           4CE72911295C0287E053837F649B7D0E 2020-11-12 00:00:00


SQL> select TARGET_NAME,TARGET_TYPE,TARGET_GUID,ROLLUP_TIMESTAMP from mgmt$metric_daily where TARGET_NAME like '%c1test%' and TARGET_TYPE='oracle_pdb' and ROLLUP_TIMESTAMP>sysdate-3 order by 4

TARGET_NAME                    TARGET_TYPE          ROLLUP_TIMESTAMP       AVERAGE
------------------------------ -------------------- ------------------- ----------
c1test_DW                      oracle_pdb           2020-11-06 00:00:00  1575.9375
c1test_DW                      oracle_pdb           2020-11-07 00:00:00  1575.9375
c1test_DW                      oracle_pdb           2020-11-08 00:00:00  1575.9375
c1test_DW                      oracle_pdb           2020-11-09 00:00:00  1575.9375
c1test_DW                      oracle_pdb           2020-11-10 00:00:00  1575.9375
c1test_DW                      oracle_pdb           2020-11-11 00:00:00  1575.9375
c1test_DW                      oracle_pdb           2020-11-12 00:00:00  1575.9375
c1test_CDBROOT                 oracle_pdb           2020-11-05 00:00:00 37581.5625


TARGET_NAME                    TARGET_TYPE          ROLLUP_TIMESTAMP       AVERAGE
------------------------------ -------------------- ------------------- ----------
c1test_CDBROOT                 oracle_pdb           2020-11-08 00:00:00  227138.75
c1test_CDBROOT                 oracle_pdb           2020-11-09 00:00:00 455087.688
c1test_CDBROOT                 oracle_pdb           2020-11-10 00:00:00 278230.875
c1test_CDBROOT                 oracle_pdb           2020-11-11 00:00:00 208727.188
c1test_CDBROOT                 oracle_pdb           2020-11-12 00:00:00 454964.063

In summary: even after fixing all the issues on the OEM side, with everything running fine, the database metrics were still not being updated.

To make a long story short, after some investigation I came across the following MOS (My Oracle Support) note: Database Hangs With Simple Queries like on view dba_data_files & dba_free_space (Doc ID 2665935.1).

This seemed to be a match. So, I proceeded with its recommendations on OMS database:

SQL> alter session set container=DW;

Session altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         3 DW                             READ WRITE NO
SQL> select count(*) from dba_recyclebin;

COUNT(*)
----------
28522

SQL> purge recyclebin;

Recyclebin purged.

SQL> purge dba_recyclebin;

DBA Recyclebin purged.

With that done, all the issues were solved and the metric was being collected again:

Tablespace allocation metric.

 

Here are some additional references:

  • Database Tablespace Metrics: Tablespace Allocation Is Not Collected (Metric tbspAllocation) (Doc ID 404692.1)
  • EM 12c : emctl start agent Fails With Error ‘Starting agent … started but not ready’ (Doc ID 1591477.1)
  • EM12c : emctl start / status agent ‘Agent Running but Not Ready’ ‘ERROR – The agent is overloaded [current requests: 30]’ Reported in gcagent.log (Doc ID 1546529.1)

 

I hope this helps!

If you have any questions or thoughts, please leave them in the comments. And don’t forget to sign up for the next post.

 

 

AWS RDS: Reading Oracle Traces from SQL*Plus

Hi all,

One more for AWS services. Let’s say you need to read trace files from an RDS instance. How do you do it?

Here is an example of how to list and read those files through the directory objects defined in the database. To see the existing directories:

SQL> select * from dba_directories;

OWNER      DIRECTORY_NAME            DIRECTORY_PATH                                               ORIGIN_CON_ID
---------- ------------------------- ------------------------------------------------------------ -------------
SYS        BDUMP                     /rdsdbdata/log/trace                                                     0
SYS        ADUMP                     /rdsdbdata/log/audit                                                     0
SYS        OPATCH_LOG_DIR            /rdsdbbin/oracle/QOpatch                                                 0
SYS        OPATCH_SCRIPT_DIR         /rdsdbbin/oracle/QOpatch                                                 0
SYS        DATA_PUMP_DIR             /rdsdbdata/datapump                                                      0
SYS        OPATCH_INST_DIR           /rdsdbbin/oracle/OPatch                                                  0
SYS        TMP                       /rdsdbdata/userdirs/01                                                   0

7 rows selected.

Once you have identified the directory you need, you can list its files as follows:

SQL> select * from table (rdsadmin.rds_file_util.listdir( p_directory => 'BDUMP')) order by mtime;

FILENAME                                 TYPE         FILESIZE MTIME
---------------------------------------- ---------- ---------- ---------
CPROD1_ora_48800.trm                     file               73 24-DEC-18
CPROD1_ora_48800.trc                     file              998 24-DEC-18
CPROD1_ora_86597.trc                     file              998 24-DEC-18
CPROD1_ora_86597.trm                     file               73 24-DEC-18
CPROD1_ora_7999.trc                      file              881 24-DEC-18
CPROD1_ora_7999.trm                      file               71 24-DEC-18
CPROD1_ora_7997.trm                      file               71 24-DEC-18
CPROD1_ora_7997.trc                      file              881 24-DEC-18
CPROD1_ora_8240.trm                      file               71 24-DEC-18
CPROD1_ora_8240.trc                      file              881 24-DEC-18
CPROD1_ora_8381.trm                      file               72 24-DEC-18
CPROD1_ora_8381.trc                      file              995 24-DEC-18
CPROD1_ora_8540.trc                      file              881 24-DEC-18
CPROD1_ora_8540.trm                      file               71 24-DEC-18
CPROD1_ora_9876.trc                      file              881 24-DEC-18
CPROD1_ora_9876.trm                      file               71 24-DEC-18
CPROD1_ora_11142.trm                     file               72 24-DEC-18
CPROD1_ora_11142.trc                     file              883 24-DEC-18
CPROD1_ora_11182.trc                     file              883 24-DEC-18
CPROD1_ora_11182.trm                     file               72 24-DEC-18
CPROD1_ora_55077.trm                     file               73 24-DEC-18
CPROD1_ora_55077.trc                     file              997 24-DEC-18
CPROD1_ora_92260.trm                     file               73 24-DEC-18
CPROD1_ora_92260.trc                     file              997 24-DEC-18
CPROD1_ora_123869.trc                    file             1000 24-DEC-18
CPROD1_ora_123869.trm                    file               74 24-DEC-18
CPROD1_ora_41305.trc                     file              998 24-DEC-18
CPROD1_ora_41305.trm                     file               73 24-DEC-18
CPROD1_j002_3293.trc                     file           114049 24-DEC-18
CPROD1_j002_3293.trm                     file              370 24-DEC-18
CPROD1_mmon_71739.trc                    file          7511332 24-DEC-18
CPROD1_mmon_71739.trm                    file           738330 24-DEC-18
CPROD1_ora_92888.trc                     file              997 24-DEC-18
CPROD1_ora_92888.trm                     file               73 24-DEC-18
trace/                                   directory      323584 24-DEC-18
alert_CPROD1.log                         file           204808 24-DEC-18
CPROD1_ora_70145.trc                     file             1470 24-DEC-18
CPROD1_ora_70145.trm                     file              109 24-DEC-18

3845 rows selected.

With the trace file name confirmed, you can see the content with the following:

SQL> select * from table (rdsadmin.rds_file_util.read_text_file( p_directory => 'BDUMP', p_filename => 'trace/CPROD1_s003_81573.trc'));

From the trace file:
=========================
..................................................................................
index undo for leaf key operations
KTB Redo
op: 0x02 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: C uba: 0x000e289e.51da.47
Dump kdilk : itl=98, kdxlkflg=0x1 sdc=0 indexid=0x1c53db block=0x0019bdc0
(kdxlpu): purge leaf row
key :(24):
07 78 76 0c 17 17 09 32 08 c7 07 1a 02 28 15 01 18 06 00 1a 3c 99 00 1e

File 3 is not mirrored.

End dump previous blocks for kdsgrp
* kdsgrp1-2: ***********************************************
kdsDumpState: RID context dump

45511581 rows selected.
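With more than 45 million rows coming back for a single trace, scrolling it on screen is not practical, so spooling it to a local file tends to be handier. Here is a minimal sketch of a shell helper that prints a SQL*Plus script for that. The function name build_trace_spool_sql and the output path are hypothetical; the query itself is the same rdsadmin.rds_file_util.read_text_file call shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a SQL*Plus script that spools one RDS trace file
# to a local file. Arguments: directory object, file name, local output path.
build_trace_spool_sql() {
    local dir="$1" file="$2" out="$3"
    cat <<EOF
set pagesize 0 linesize 32767 trimspool on feedback off heading off
spool ${out}
select * from table (rdsadmin.rds_file_util.read_text_file( p_directory => '${dir}', p_filename => '${file}'));
spool off
exit
EOF
}

# Example usage (the connect string is an assumption; use your own):
# build_trace_spool_sql BDUMP trace/CPROD1_s003_81573.trc /tmp/trace.txt > /tmp/get_trace.sql
# sqlplus -s admin_user@my_rds_alias @/tmp/get_trace.sql
```

Generating the script rather than typing it each time keeps the directory and file name in one place when you need to pull several traces.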

Also, as a side note, filtering the listing with a LIKE clause may help you out:

SQL> select * from table (rdsadmin.rds_file_util.listdir( p_directory => 'BDUMP')) where filename like '%CPROD1_s003_81573.trc%';

FILENAME                                 TYPE         FILESIZE MTIME
---------------------------------------- ---------- ---------- ---------
CPROD1_s003_81573.trc                    file       1948134047 23-DEC-18

I hope it helps! See you next post!

Exadata DNS Change – Pitfalls to be avoided

Hi all, it’s been a while but here I am!

There were some changes in the infrastructure at the place I work, and I was asked to do a DNS change on a somewhat old Exadata X5. I had never done one before, so the idea of this post is to help others who might face the issues I had.

The first thing I did was look up the documentation and go through the steps. Yes, there are blogs about it, but the documentation gives you at least a first glance at the situation.

Long story short: Exadata has lots of components, and the DNS must be changed on all of them.

Here is a summary of the steps.

Infiniband switches

Connect to the switches, switch to the ilom-admin user, and change the DNS:

su - ilom-admin
show /SP/clients/dns
set /SP/clients/dns nameserver=192.168.16.1,192.168.16.2,192.168.16.3
show /SP/clients/dns

 

Database nodes

For my image I only needed to change /etc/resolv.conf. If you have a newer one, you will need to use ipconf. That’s why you need to go to the documentation: at least there, we hope, the pitfalls get a mention (keep reading and you will see that was not my case).

I also changed the DNS on each database node ILOM, running ipmitool from each node:

ipmitool sunoem cli 'show /SP/clients/dns'
ipmitool sunoem cli 'set /SP/clients/dns nameserver=192.168.16.1,192.168.16.2,192.168.16.3'
ipmitool sunoem cli 'show /SP/clients/dns'


Cell nodes – Here things start to get interesting

For the storage cells there are some points that need to be taken into consideration:

Increase the ASM disk_repair_time. The goal is to avoid a full rebalance, provided you finish the work within that window. If you don’t know this parameter: ASM waits up to the interval specified in DISK_REPAIR_TIME for the disk(s) to come back online. If they come back within this interval, a resync occurs, where only the extents modified while the disks were offline are written to them. If they do not come back within this interval, ASM initiates a forced drop of the disk(s), which triggers a rebalance.
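The attribute is set per disk group with ALTER DISKGROUP ... SET ATTRIBUTE, so with several disk groups it can help to generate the statements. A minimal sketch, assuming you run the generated SQL in the ASM instance yourself; the function name build_repair_time_sql, the disk group names and the 8h value are only illustrative, not values from my environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints one ALTER DISKGROUP statement per disk group
# to set disk_repair_time. First argument is the value (for example 8h),
# the remaining arguments are disk group names.
build_repair_time_sql() {
    local value="$1"
    shift
    local dg
    for dg in "$@"; do
        printf "ALTER DISKGROUP %s SET ATTRIBUTE 'disk_repair_time' = '%s';\n" "$dg" "$value"
    done
}

# Example usage (disk group names are assumptions):
# build_repair_time_sql 8h DATAC1 RECOC1 > /tmp/repair_time.sql
```

The current value can be checked beforehand in v$asm_attribute (where name = 'disk_repair_time'), so you can put it back once the maintenance is over.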

On each cell node we need to make sure all disks are OK, inactivate all grid disks, stop all cell services, and use ipconf to change the DNS configuration:

#Check that putting the grid disks offline will not cause a problem for Oracle ASM - asmdeactivationoutcome (the 3rd column) should say Yes for every disk
cellcli -e LIST GRIDDISK ATTRIBUTES name,asmmodestatus,asmdeactivationoutcome

#Inactivate all grid disks on the cell - may take a while to complete
cellcli -e ALTER GRIDDISK ALL INACTIVE


#Confirm the grid disks are offline, it should show asmmodestatus=OFFLINE or asmmodestatus=UNUSED, and asmdeactivationoutcome=Yes for all grid disks
cellcli -e LIST GRIDDISK ATTRIBUTES name, asmmodestatus,asmdeactivationoutcome

#Confirm that the disks are offline
cellcli -e LIST GRIDDISK

#Shut down the cell services and ocrvottargetd service
cellcli -e ALTER CELL SHUTDOWN SERVICES ALL
service ocrvottargetd stop #on some images this service does not exist
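Checking that third column by eye is easy to get wrong on a cell with many grid disks, so here is a minimal sketch of a helper that does it. The function name all_griddisks_safe is hypothetical, and it assumes the three-column output (name, asmmodestatus, asmdeactivationoutcome) of the LIST GRIDDISK command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the output of
#   cellcli -e LIST GRIDDISK ATTRIBUTES name,asmmodestatus,asmdeactivationoutcome
# on stdin and succeeds only if at least one grid disk is listed and every
# listed grid disk reports "Yes" in the third column.
all_griddisks_safe() {
    awk 'NF > 0 { seen = 1; if ($3 != "Yes") bad = 1 }
         END { if (seen && !bad) exit 0; exit 1 }'
}

# Example usage on a cell node:
# if cellcli -e LIST GRIDDISK ATTRIBUTES name,asmmodestatus,asmdeactivationoutcome | all_griddisks_safe; then
#     echo "safe to inactivate the grid disks"
# else
#     echo "do NOT take this cell down yet"
# fi
```

Any other value in that column, or an empty listing, makes the check fail, which is the safer default before taking a cell offline.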

With the old way of running ipconf, we only need to call it and follow the prompts, but if you have a newer image you will need to provide its parameters as shown in the documentation.

The documentation says that after this we can start the cell services back up, but I would recommend validating the DNS before doing that. Why? Because mine did not work, and I could have had a bigger issue with a cell node trying to start its services without DNS.

So, how do you test it? Use nslookup, dig and curl:

nslookup dns_domain.com
curl -v 192.168.16.1:53
dig another_server_in_the_network
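To make sure every new server answers, and not just the first one in the list, each nameserver can be queried directly. A minimal sketch, assuming dig is installed on the node; both function names are hypothetical, and the name you resolve is whatever host makes sense in your network.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the nameserver addresses found in a
# resolv.conf-style file (defaults to /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Hypothetical helper: asks each configured nameserver directly to resolve
# one name and reports which servers answer. Requires dig.
check_nameservers() {
    local name="$1" conf="${2:-/etc/resolv.conf}" ns answer
    for ns in $(list_nameservers "$conf"); do
        # dig exits non-zero when the server cannot be reached, so check both
        # the exit status and that an answer actually came back.
        if answer="$(dig +short +time=2 +tries=1 "@${ns}" "$name")" && [ -n "$answer" ]; then
            echo "OK   ${ns}"
        else
            echo "FAIL ${ns} gave no answer for ${name}"
        fi
    done
}

# Example usage:
# check_nameservers another_server_in_the_network
```

A server that answers ping but shows FAIL here is the situation described next.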

 

My tests did not work: I was able to ping the DNS servers but not to resolve any name. I had an SR open on MOS, but it did not help much either. As this is a production system, I checked whether the firewall was up on the Linux side and, to my surprise, it was.

I tried to manually add rules to iptables, but it did not work. Then I came across this note: Exadata: New DNS server is not accessible after changing using IPCONF (Doc ID 1581417.1).

And there it was: I needed to restart the cellwall service to recreate the iptables rules.

# Restart cellwall service
service cellwall restart
service cellwall status

One final point: check whether ASM started a rebalance. If it did, do not bring down another cell node until the rebalance has finished, otherwise you may run into deeper issues.

 

I hope it helps!

Elisson Almeida

AWS AQUA for Redshift

Hi all,

Quick one today. Did you see this new release? I’m a bit behind schedule but trying to catch up with the news. It seems very interesting:

AQUA (Advanced Query Accelerator) for Amazon Redshift is available in preview. AQUA provides a new distributed and hardware-accelerated cache that brings compute to the storage layer for Amazon Redshift and delivers up to 10x faster query performance than other cloud data warehouses.

AQUA is a high-speed cache on top of Redshift Managed Storage that can scale out and process data in parallel across many AQUA nodes. AQUA uses AWS-designed analytics processors that dramatically accelerate data compression, encryption, and data processing on queries that scan, filter, and aggregate large data sets. With this new architecture, customers can run queries quicker than ever before, allowing them to query data directly, even at scale, and giving them more up-to-date dashboards, reducing development time, and making system maintenance easier.

It is available for preview in US East (Ohio), US East (N. Virginia), and US West (Oregon) regions at this point.

For more references:

  • There is a very detailed technical article about it HERE.
  • There is also a very nice TechTalk presenting it HERE.

Let’s keep up to date!

ORA-07445 [kxsPurgeCursor()]

Hi all,
We started getting lots of ORA-07445 errors reported in a client’s Exadata production database.
Checking the traces, we can see that someone is running the SQL below, which is triggering these errors:
declare
  i number := 0;
begin
  for c1 in (
    select address, hash_value, sql_id from v$sql where
    last_active_time < sysdate - 1/24
    and executions < 2 ) loop
    begin
      dbms_shared_pool.purge(c1.address || ' ' || to_char(c1.hash_value), 'C');
      i := i + 1;
    exception when others then
      dbms_output.put_line(c1.sql_id);
    end;
  end loop;
  dbms_output.put_line('Cursors purged = ' || to_char(i));
end;
/
I’m not even going to discuss the PL/SQL above: it purges every cursor with fewer than 2 executions that has not been active in the last hour (sysdate - 1/24). I don’t really agree with this…
What matters is that, in the end, the errors happen because the hash value of a non-cursor object is being passed to dbms_shared_pool.purge(). This is a match for Bug 29281112 – ORA-7445: [kxspurgecursor()+517] [sigsegv] (Doc ID 29281112.8).
There is no actual workaround besides not passing a non-cursor hash to dbms_shared_pool.purge. However, there is a fix that handles those cases better on:

Quick Reference: Oracle Exadata default passwords

It goes without saying how important it is to have all the passwords reset, even though on new Exadata hardware SSH is usually disabled. However, we often need access to one of the consoles knowing the password was never changed, but unaware of what it should be.

For those cases (until you change it, ASAP), here is a quick reference:

Database Server:

  • root/welcome1
  • oracle/welcome1
  • grid/welcome1
  • grub/sos1Exadata

Exadata Storage Servers:

  • root/welcome1
  • celladmin/welcome1
  • cellmonitor/welcome1

InfiniBand switches:

  • root/welcome1
  • nm2user/changeme

Ethernet switches:

  • admin/welcome1

Power distribution units (PDUs):

  • admin/welcome1
  • root/welcome1

Database server ILOMs:

  • root/welcome1

Exadata Storage Server ILOMs:

  • root/welcome1

InfiniBand ILOMs:

  • ilom-admin/ilom-admin
  • ilom-operator/ilom-operator

Keyboard, video, mouse (KVM):

  • admin/welcome1

Change them!

I hope this was useful for you.

Removing user from SQL Server database

So, have you ever needed to drop a user from a SQL Server database, but got stuck with the following error?

Error: 15138 The database principal owns a schema in the database, and cannot be dropped.

If you don’t have access to the SSMS to see which schema or objects the user owns, the following SQL should do the job.

USE [DATABASENAME]
GO
SELECT so.name OBJECT, su.name OWNER
FROM sys.schemas so
INNER JOIN sysusers su ON (so.principal_id = su.uid)
WHERE su.name = 'username'
GO
SELECT so.name OBJECT, su.name OWNER, so.xtype TP
FROM sys.sysobjects so
INNER JOIN sysusers su ON (so.uid = su.uid)
WHERE su.name = 'username'
GO

Once you have the owned objects/schemas, you can change them with the following SQL (schema example):

USE [DATABASENAME]
GO
ALTER AUTHORIZATION ON SCHEMA::[db_datareader] TO [newowner] /* new owner username */
ALTER AUTHORIZATION ON SCHEMA::[db_datawriter] TO [newowner]
GO

Then you’re ready to drop the user with DROP USER [username].

Bonus SQL: The following will list the schemas and their owner.

SELECT schema_name(schema_id) as SCHEMANAME,
user_name(s.principal_id) as USERNAME
FROM sys.schemas AS s

AWS CloudFormation

Hello everyone,

We are going to start a series of posts showing how to use AWS CloudFormation.

The next posts will cover the creation of an RDS instance and a DMS setup using an AWS CloudFormation stack.

First, a quick introduction. Let’s start by explaining what AWS CloudFormation is:

“AWS CloudFormation provides a common language for you to model and provision AWS and third party application resources in your cloud environment.”

In practice, it’s a JSON/YAML file in which we describe the AWS resources to create.
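To give an idea of how small such a file can be, here is a minimal sketch that writes a one-resource YAML template to disk. The function name write_minimal_template is hypothetical, and the single S3 bucket is only an illustration, not one of the RDS or DMS scripts used in this series.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a minimal CloudFormation template (YAML) to the
# path given as the first argument. The S3 bucket resource is an illustration only.
write_minimal_template() {
    cat > "$1" <<'EOF'
AWSTemplateFormatVersion: '2010-09-09'
Description: Minimal illustrative template
Resources:
  ExampleBucket:
    Type: AWS::S3::Bucket
EOF
}

# Example usage (needs the AWS CLI and valid credentials):
# write_minimal_template /tmp/minimal.yaml
# aws cloudformation validate-template --template-body file:///tmp/minimal.yaml
```

The Resources section is the only mandatory part of a template; everything else in the file is optional metadata around it.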

Let’s code!

The first step is to connect to the AWS Console. In the search field, type CloudFormation, as in the picture below.

Click on CloudFormation to open the service console.

 

Click on Stacks.

 

 

After this, click to create a new stack and select “With new resources”.

Then click on Create template in Designer, and you will be redirected to a page like the one below.

 

Click on Template and the code editor is going to open.

The next step is to create a script to deploy a service. In this example, we are going to use a DMS script and an RDS PostgreSQL script. The examples used in this article will be available in the next article.

To execute the script, the first step is to validate it by clicking the highlighted button in the image below.

The result can be OK or an error. If it is OK, you can create the stack by clicking the highlighted button in the image below.

You can check the execution events on the Events page. The result is similar to the image below.

In the next articles, we are going to see the source code and use the AWS DMS service to replicate data from an Oracle database to RDS PostgreSQL.