Exadata Healthcheck – Top 5 Tools and Features!

Hi all,
Oracle DBAs have long had a wealth of great out-of-the-box tools to help with daily tasks: ORAchk/EXAchk/ODAchk, Database Security Assessment Tool (DBSAT), Hang Manager, Cluster Health Advisor (CHA), Cluster Verification Utility (CVU), Memory Guard, and Trace File Analyzer (TFA) with tools like oratop, procwatcher, oswatcher, pstack, RDA, and the list goes on and on…

The good news is that, since version 12.2, most of these tools are bundled together in the Autonomous Health Framework (AHF). None of them run by default, though, so you need to choose which ones to enable and start in your environment.

But out of this whole list, what if we had to choose the top 5 features we can and should use as a starting point for an Exadata environment? Well, I made my list; see it below.

Oh, and by the way, you don’t pay anything extra for them, provided you already have Oracle Support Services!

1. Cluster Health Advisor – Calibrate your Exa Environment!

Available along with AHF since 12.2, CHA works alongside the Cluster Health Monitor to provide fine-grained notifications and correlations about your environment. And when I say it, I mean it: YOUR environment. This is because CHA works better if you calibrate it with your own statistics. As usual, pick neither the worst, most problematic day nor the low-workload night, but an average day that can be used as a reference. All this is stored in the GIMR (as shown below) and used for future comparison and model inference.

This means the CHA is not a long list of IFs with fixed metrics, but an intelligent tool monitoring over 127 processes that perform work based on your workload. Not only this, the CHA is enriched with Machine Learning algorithms that model over 30 known DB problems based on over 150 metric predictors.
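To make the calibration idea concrete, here is a minimal Python sketch that only builds the command line for calibrating CHA against an "average day" window. It assumes the `chactl calibrate database -db ... -model ... -timeranges 'start=...,end=...'` syntax from the CHA documentation; the model name `average_day` and the time window are made up for illustration, so check the chactl help on your own version before running anything.

```python
from datetime import datetime


def build_calibrate_command(db_unique_name, model_name, start, end):
    """Build a chactl calibration command for a single reference time range.

    The flag names are an assumption based on the documented chactl syntax;
    nothing is executed here, the function only returns a string.
    """
    if end <= start:
        raise ValueError("end must be later than start")
    fmt = "%Y-%m-%d %H:%M:%S"
    time_range = f"start={start.strftime(fmt)},end={end.strftime(fmt)}"
    return (
        f"chactl calibrate database -db {db_unique_name} "
        f"-model {model_name} -timeranges '{time_range}'"
    )


# Hypothetical example: a regular business day used as the reference workload.
cmd = build_calibrate_command(
    "dbprod19",
    "average_day",
    datetime(2021, 6, 15, 8, 0, 0),
    datetime(2021, 6, 15, 18, 0, 0),
)
print(cmd)
```

Building the string first makes it easy to review the window with the team before anyone runs the calibration as the Grid Infrastructure owner.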

 

An example of inference can be seen below, where network and Global Cache statistics are used to infer a network issue.

Not rocket science, but always nice to have someone digesting tons of logs and metrics and reaching this sort of conclusion unassisted, right? You, as the DBA, can take all the credit for the finding, no hard feelings.

And this is just one of the things CHA provides. It has tons of other functionalities. You should try using it more!

 

2. EXAchk – Daily Automated Runs (and Reports)

If you have an Exadata, you are most likely used to running EXAchk from time to time to review the recommendations and best practices for your environment. It takes almost no effort to run it and copy the reports, and you have most likely created a script to do so. What if I told you Oracle has now automated this with AHF?

All you need to do is confirm the scheduled runs and set the address the reports are sent to. Below is a quick cheat sheet:

a. Checking Status of the EXAchk

[root@exa01dbadm01 ~]# exachk -d info
------------------------------------------------------------

Master node = exa01dbadm01

exachk daemon version = 211300

Install location = /opt/oracle.ahf/exachk

Started at = Wed Jun 16 11:58:03 MDT 2021

Scheduler type = TFA Scheduler


[root@exa01dbadm01 ~]# exachk -d status
exachk is using TFA Scheduler. TFA PID: 369350

b. Checking TFA Daemon Status and Auto Start

[root@exa01dbadm01 ~]# ahfctl statusahf

.-----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+--------------+---------------+--------+------+------------+----------------------+------------------+
| exa01dbadm01 | RUNNING | 369350 | 5000 | 21.1.3.0.0 | 21130020210607124914 | COMPLETE |
| exa01dbadm02 | RUNNING | 118950 | 5000 | 21.1.3.0.0 | 21130020210607124914 | COMPLETE |
'--------------+---------------+--------+------+------------+----------------------+------------------'

------------------------------------------------------------

Master node = exa01dbadm01

exachk daemon version = 211300

Install location = /opt/oracle.ahf/exachk

Started at = Wed Jun 16 11:58:03 MDT 2021

Scheduler type = TFA Scheduler

------------------------------------------------------------
ID: exachk.autostart_client_exatier1
------------------------------------------------------------
AUTORUN_FLAGS = -usediscovery -profile exatier1 -syslog -dball -showpass -tag autostart_client_exatier1 -readenvconfig
COLLECTION_RETENTION = 7
AUTORUN_SCHEDULE = 3 2 * * 1,2,3,4,5,6
------------------------------------------------------------
------------------------------------------------------------
ID: exachk.autostart_client
------------------------------------------------------------
AUTORUN_FLAGS = -usediscovery -syslog -tag autostart_client -readenvconfig
COLLECTION_RETENTION = 14
AUTORUN_SCHEDULE = 3 3 * * 0
------------------------------------------------------------

Next auto run starts on Jun 17, 2021 02:03:00

ID:exachk.AUTOSTART_CLIENT_EXATIER1
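If you want to check this status from a script instead of reading it by eye, the TFA table is easy to parse. The sketch below is only an illustration in Python: it assumes the pipe-delimited layout shown above (which may differ between AHF versions), and the sample text is copied from that output.

```python
SAMPLE_STATUS = """\
.-----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+--------------+---------------+--------+------+------------+----------------------+------------------+
| exa01dbadm01 | RUNNING | 369350 | 5000 | 21.1.3.0.0 | 21130020210607124914 | COMPLETE |
| exa01dbadm02 | RUNNING | 118950 | 5000 | 21.1.3.0.0 | 21130020210607124914 | COMPLETE |
'--------------+---------------+--------+------+------------+----------------------+------------------'
"""


def parse_tfa_status(text):
    """Return one dict per host row; the first pipe-delimited row is the header."""
    header, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders and anything outside the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


rows = parse_tfa_status(SAMPLE_STATUS)
not_running = [r["Host"] for r in rows if r["Status of TFA"] != "RUNNING"]
versions = {r["Version"] for r in rows}
print("Hosts not running TFA:", not_running)
print("Versions in the cluster:", sorted(versions))
```

More than one entry in `versions` would suggest the nodes are not on the same AHF release, which is worth fixing before relying on the scheduled runs.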

c. Checking the Next EXAchk Automated Run

[root@exa01dbadm01 ~]# exachk -d nextautorun

Next auto run starts on Jun 17, 2021 02:03:00

ID:exachk.AUTOSTART_CLIENT_EXATIER1

[root@exa01dbadm01 ~]#

d. Changing EXAchk Notifications:

[root@exa01dbadm01 ~]# exachk -get NOTIFICATION_EMAIL,AUTORUN_SCHEDULE,COLLECTION_RETENTION
------------------------------------------------------------
ID: exachk.autostart_client_exatier1
------------------------------------------------------------
COLLECTION_RETENTION = 7
AUTORUN_SCHEDULE = 3 2 * * 1,2,3,4,5,6
------------------------------------------------------------
------------------------------------------------------------
ID: exachk.autostart_client
------------------------------------------------------------
COLLECTION_RETENTION = 14
AUTORUN_SCHEDULE = 3 3 * * 0
------------------------------------------------------------


[root@exa01dbadm01 ~]# exachk -id autostart_client -set NOTIFICATION_EMAIL=boesing@pythian.com

Updated attribute ['NOTIFICATION_EMAIL=boesing@pythian.com'] for Id[exachk.AUTOSTART_CLIENT]

Successfully copied Daemon Store to Remote Nodes


[root@exa01dbadm01 ~]# exachk -get NOTIFICATION_EMAIL,AUTORUN_SCHEDULE,COLLECTION_RETENTION
------------------------------------------------------------
ID: exachk.autostart_client_exatier1
------------------------------------------------------------
COLLECTION_RETENTION = 7
AUTORUN_SCHEDULE = 3 2 * * 1,2,3,4,5,6
------------------------------------------------------------
------------------------------------------------------------
ID: exachk.autostart_client
------------------------------------------------------------
NOTIFICATION_EMAIL = boesing@pythian.com
COLLECTION_RETENTION = 14
AUTORUN_SCHEDULE = 3 3 * * 0
------------------------------------------------------------

[root@exa01dbadm01 ~]# exachk -id autostart_client_exatier1 -set NOTIFICATION_EMAIL=boesing@pythian.com
Updated attribute ['NOTIFICATION_EMAIL=boesing@pythian.com'] for Id[exachk.AUTOSTART_CLIENT_EXATIER1]

Successfully copied Daemon Store to Remote Nodes


[root@exa01dbadm01 ~]# exachk -get NOTIFICATION_EMAIL,AUTORUN_SCHEDULE,COLLECTION_RETENTION
------------------------------------------------------------
ID: exachk.autostart_client_exatier1
------------------------------------------------------------
NOTIFICATION_EMAIL = boesing@pythian.com
COLLECTION_RETENTION = 7
AUTORUN_SCHEDULE = 3 2 * * 1,2,3,4,5,6
------------------------------------------------------------
------------------------------------------------------------
ID: exachk.autostart_client
------------------------------------------------------------
NOTIFICATION_EMAIL = boesing@pythian.com
COLLECTION_RETENTION = 14
AUTORUN_SCHEDULE = 3 3 * * 0
------------------------------------------------------------
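With several profiles it is easy to miss one that has no notification address. As a rough sketch (Python, assuming the `ID:` / `KEY = value` layout printed by `exachk -get` above), the output can be turned into a dictionary and checked. The sample is the intermediate output from this section, where only one profile had the email set.

```python
SAMPLE_GET = """\
------------------------------------------------------------
ID: exachk.autostart_client_exatier1
------------------------------------------------------------
COLLECTION_RETENTION = 7
AUTORUN_SCHEDULE = 3 2 * * 1,2,3,4,5,6
------------------------------------------------------------
------------------------------------------------------------
ID: exachk.autostart_client
------------------------------------------------------------
NOTIFICATION_EMAIL = boesing@pythian.com
COLLECTION_RETENTION = 14
AUTORUN_SCHEDULE = 3 3 * * 0
------------------------------------------------------------
"""


def parse_exachk_get(text):
    """Map each profile ID to a dict of its attributes."""
    profiles, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line or set(line) == {"-"}:
            continue  # blank line or dashed separator
        if line.startswith("ID:"):
            current = line.split(":", 1)[1].strip()
            profiles[current] = {}
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            profiles[current][key.strip()] = value.strip()
    return profiles


profiles = parse_exachk_get(SAMPLE_GET)
missing = [pid for pid, attrs in profiles.items() if "NOTIFICATION_EMAIL" not in attrs]
print("Profiles without NOTIFICATION_EMAIL:", missing)
```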

e. Change EXAchk Schedule and Retention

[root@exa01dbadm01 ~]# exachk -id autostart_client_exatier1 -set "AUTORUN_SCHEDULE=0 3 * * *"    # 3 AM daily
[root@exa01dbadm01 ~]# exachk -id autostart_client -set "collection_retention=90"
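The AUTORUN_SCHEDULE value uses five cron-style fields: minute, hour, day of month, month, and day of week. The small Python helper below is just a reading aid I sketched for the schedules seen in this post. It assumes the usual cron convention of 0 = Sunday (consistent with the "Next auto run" on Thursday, Jun 17 at 02:03 for the `3 2 * * 1,2,3,4,5,6` profile) and deliberately handles only fixed minute/hour values with `*` for day of month and month.

```python
DAY_NAMES = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def describe_schedule(schedule):
    """Describe a simple five-field cron-style schedule in plain words."""
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields: minute hour day-of-month month day-of-week")
    minute, hour, day_of_month, month, day_of_week = fields
    if not (minute.isdigit() and hour.isdigit()):
        raise ValueError("this sketch only supports fixed minute and hour values")
    if day_of_month != "*" or month != "*":
        raise ValueError("this sketch only supports '*' for day-of-month and month")
    if day_of_week == "*":
        days = "every day"
    else:
        # Comma-separated day numbers only; ranges such as 1-5 are not handled.
        days = ",".join(DAY_NAMES[int(d)] for d in day_of_week.split(","))
    return f"{int(hour):02d}:{int(minute):02d} on {days}"


print(describe_schedule("3 2 * * 1,2,3,4,5,6"))
print(describe_schedule("3 3 * * 0"))
print(describe_schedule("0 3 * * *"))
```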

f. EXAchk: Testing Email Sending and Running EXAchk Report over email

This is an ad-hoc test of email delivery, outside the scheduled runs.

[root@exa01dbadm01 ~]# exachk -testemail notification_email=boesing@pythian.com
Email Successfully sent to ['boesing@pythian.com'] from 'root@exa01dbadm01
[root@exa01dbadm01 ~]# exachk -sendemail notification_email=boesing@pythian.com


Searching for running databases . . . . .

. . . . . . . . . . . .
List of running databases registered in OCR

1. xxxxxx
2. yyyy
3. None of above

Select databases from list for checking best practices. For multiple databases, select 3 for All or comma separated number like 1,2 etc [1-3][3].
[...]
Detailed report (html) - /u01/app/oracle/oracle.ahf/data/exa01dbadm01/exachk/user_root/output/exachk_exa01dbadm01_xxxxx_061621_134748/exachk_exa01dbadm01_xxxxx_061621_134748.html

UPLOAD [if required] - /u01/app/oracle/oracle.ahf/data/exa01dbadm01/exachk/user_root/output/exachk_exa01dbadm01_xxxxxx_061621_134748.zip
Email Successfully sent to ('boesing@pythian.com',) from 'root@exa01dbadm01' with attachment

3. TFA – Sanitize and Mask Options

Even with concerns about sensitive data becoming more and more relevant, this one actually surprised me: it’s possible to sanitize and mask data in collections. For example, mask hides your inner data (let’s say, table names):

[root@exa01dbadm01 ~]# tfactl diagcollect -srdc ORA-00600 -mask

Sanitize will hide your hardware setting. Not that useful if you have an Exadata, but might be interesting if you have commodity hardware you don’t want Oracle to know about.

[root@exa01dbadm01 ~]# tfactl diagcollect -srdc ORA-00600 -sanitize

4. TFA Changes – “Nothing was Changed” Resolver Tool

This is for all the DBAs who have already had this dialogue:

Client: Yesterday was running fine, and today it’s veeeery slow. Nothing was changed!
DBA: Something changed, that’s for sure.
Client: Absolutely nothing changed.

So now we can assess whether indeed nothing changed from the client’s perspective (perhaps it was an automatic statistics gathering or something similar) or whether somebody did change something that is hard to identify.

It takes parameters from the OS and the DB, keeps track of old and new values, and reports the changes:

[root@exa01dbadm01 ~]# tfactl changes

Output from host : exa01dbadm02
------------------------------
No Changes Found

Output from host : exa01dbadm01
------------------------------
[Nov/14/2021 00:08:33.000]: [db.dbprod19.dbprod191]: Parameter: log_archive_dest_2: Value: service=dbprod19stb => ASYNC NOAFFIRM delay=240 optional compression=disable max_failure=0 reopen=300 db_unique_name=dbprod19stb net_timeout=300
[Nov/14/2021 00:08:33.000]: [db.dbprod19.dbprod191]: Parameter: log_archive_dest_2: Value: service=dbprod19stb => valid_for=(online_logfile,all_roles)
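When the list of changes gets long, it helps to load it into something you can filter. Here is a hedged Python sketch that parses the lines above into records. It assumes the line layout shown in this output and treats the text before `=>` as the old value and the text after it as the new one; if your TFA version prints changes differently, the regular expression needs adjusting.

```python
import re
from datetime import datetime

SAMPLE_CHANGES = """\
Output from host : exa01dbadm02
------------------------------
No Changes Found

Output from host : exa01dbadm01
------------------------------
[Nov/14/2021 00:08:33.000]: [db.dbprod19.dbprod191]: Parameter: log_archive_dest_2: Value: service=dbprod19stb => ASYNC NOAFFIRM delay=240 optional compression=disable max_failure=0 reopen=300 db_unique_name=dbprod19stb net_timeout=300
[Nov/14/2021 00:08:33.000]: [db.dbprod19.dbprod191]: Parameter: log_archive_dest_2: Value: service=dbprod19stb => valid_for=(online_logfile,all_roles)
"""

CHANGE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]: \[(?P<target>[^\]]+)\]: "
    r"Parameter: (?P<param>[^:]+): Value: (?P<old>.*?) => (?P<new>.*)$"
)


def parse_changes(text):
    """Return a list of dicts, one per reported parameter change."""
    changes = []
    for line in text.splitlines():
        match = CHANGE_RE.match(line.strip())
        if not match:
            continue  # host banners, separators, "No Changes Found"
        record = match.groupdict()
        record["ts"] = datetime.strptime(record["ts"], "%b/%d/%Y %H:%M:%S.%f")
        changes.append(record)
    return changes


changes = parse_changes(SAMPLE_CHANGES)
for change in changes:
    print(change["ts"], change["target"], change["param"])
```

From here it is a one-liner to keep only the changes that fall inside the window when the client says things got slow.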

5. Oracle Health Check Collections Manager

It’s no surprise if you don’t know this tool, but I’d really recommend you look into it now. It’s a great tool and, as with everything in this post, it’s free!

Oracle Health Check Collections Manager is an APEX companion application to Oracle EXAchk that gives you an enterprise-wide view of your health check collection data. All you need is APEX version 4.2 or 5, and then you deploy the tool. The main idea is that you can consolidate all your reports in one place and, as a plus, you can manage all your EXAchk reports over time, including a view of any regressions in the checked items.

This is an example of the view of the collections:

And this is an example of a new best practices failure:

Do you agree with my top list? Let me know your thoughts!

OEM 12.5: Tablespace Allocation Metric Not Collected—Agent Is Running but Not Ready

Hi all,

First of all, accept our apologies for the long period without posts. We are about to resolve it and restart with posts. 2021 was such a crazy year, but all is settling down.

Jumping to what you are here for: I ran across an interesting case with Oracle Enterprise Manager (OEM) 12c Release 5. A client reported that the “tablespace allocation metric” was not being updated in OEM for a specific database. In this case, the most recent gathering was done in November 2020, as I’ll show you shortly, and we discovered the problem one month later. This post describes what we did to solve the issue.

As usual, the first thing I did was check on the Oracle Enterprise Manager (OEM) agent status. This is what it said:

oracle:dbserver@mydb02 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 12.1.0.5.0
OMS Version : (unknown)
Protocol Version : 12.1.0.1.0
Agent Home : /u01/app/oracle/product/agent12c/agent_inst
Agent Log Directory : /u01/app/oracle/product/agent12c/agent_inst/sysman/log
Agent Binaries : /u01/app/oracle/product/agent12c/core/12.1.0.5.0
Agent Process ID : 61641
Parent Process ID : 61394
Currently initializing component : Target Manager (2) (54 of 70)
Receivelet Interaction Manager Current Activity: Outstanding receivelet event tasks
----------------------------------
TargetID = oracle_pdb.c4test_PDB1 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_PDB2 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30
TargetID = oracle_pdb.c4test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c6test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_PDB3 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30
TargetID = rac_database.c1prod - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30


Target Manager Current Activity : Compute Dynamic Properties (total operations: 37, active: 7, finished: 28)


Current target operations in progress
-------------------------------------
oracle_pdb.c6test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c4test_PDB1 - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_PDB2 - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c4test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_PDB3 - LOAD_TARGET_DYNAMIC running for 120 seconds
rac_database.c1test - LOAD_TARGET_DYNAMIC running for 120 seconds


Dynamic property executor tasks running
------------------------------


---------------------------------------------------------------
Agent is Running but Not Ready

“Agent not ready.” Now, that’s interesting.
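If you look after many agents, this check can be scripted. The following Python sketch is only an illustration: it assumes the plain-text layout of `emctl status agent` shown above, reads the final state line, and lists the target operations that are still running. The sample is a shortened copy of that output.

```python
import re

SAMPLE_AGENT_STATUS = """\
Agent Version : 12.1.0.5.0
OMS Version : (unknown)
Currently initializing component : Target Manager (2) (54 of 70)

Current target operations in progress
-------------------------------------
oracle_pdb.c6test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c4test_PDB1 - LOAD_TARGET_DYNAMIC running for 120 seconds
rac_database.c1test - LOAD_TARGET_DYNAMIC running for 120 seconds

---------------------------------------------------------------
Agent is Running but Not Ready
"""

OPERATION_RE = re.compile(
    r"^(?P<target>\S+) - (?P<operation>\w+) running for (?P<seconds>\d+) seconds$"
)


def summarize_agent_status(text):
    """Return (ready, operations) where operations are (target, operation, seconds)."""
    ready = False
    operations = []
    for line in text.splitlines():
        line = line.strip()
        if line == "Agent is Running and Ready":
            ready = True
        match = OPERATION_RE.match(line)
        if match:
            operations.append(
                (match["target"], match["operation"], int(match["seconds"]))
            )
    return ready, operations


ready, operations = summarize_agent_status(SAMPLE_AGENT_STATUS)
print("Agent ready:", ready)
for target, operation, seconds in operations:
    print(f"{target}: {operation} for {seconds}s")
```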

Next, I tried to clear the agent state as this had solved some previous similar cases:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl clearstate agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
EMD clearstate completed successfully

Then, I ran the problematic metric manually:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl control agent runCollection c1test_DW:oracle_pdb tbspAllocation
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD runCollection error:The agent is running but is currently not ready to accept client requests

Next, I tried to force an upload:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl upload
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD upload error:The agent is running but is currently not ready to accept client requests

I thought maybe something was stuck, so I decided to kill the process and start all over again:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl stop agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Stopping agent ...

stopped.

Here, I made sure I had no agent process running:

oracle:dbserver02@c1test2 /u01/app/oracle: ps -ef | grep java | grep agent
oracle:dbserver02@c1test2 /u01/app/oracle:

I also adjusted the threshold for metric running:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl setproperty agent -a
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
EMD setproperty succeeded
oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl setproperty agent -allow_new -name _cancelThread -value 210
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
EMD setproperty succeeded

Then I started the agent:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Starting agent ........................................................................................................................... started but not ready.

This was on the agent log:

oracle:dbserver02@c1test2 /u01/app/oracle: tail /u01/app/oracle/product/agent12c/agent_inst/sysman/log/gcagent.log
oracle:dbserver02@c1test2 /u01/app/oracle:
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - StatusAgentAction: satisfyRequest Begin
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - Agent Overall Health: 0
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - StatusAgentAction: satisfyRequest End
Response:
initializing
2020-12-20 13:15:03,457 [35:686116F5] INFO - >>> Reporting response: StatusAgentResponse (initializing) (request id 1) <<<
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - closing request input stream for "StatusAgentRequest (AGENT timeout:300)"
2020-12-20 13:15:03,457 [35:686116F5] DEBUG - overriding the buffer with a thread local copy (size: 8192b)
2020-12-20 13:15:03,458 [35:686116F5] DEBUG - closing request output stream for "StatusAgentRequest (AGENT timeout:300)"
2020-12-20 13:15:03,458 [35:686116F5] DEBUG - StatusAgentAction.call() is complete.
2020-12-20 13:15:03,458 [35:B5326F3F:HTTP Listener-35 - /emd/lifecycle/main/] DEBUG - removing entry for emdctl@18081@dbserver02=>[160849530330001] completely
2020-12-20 13:15:03,458 [35:B5326F3F] DEBUG - requests executed.
2020-12-20 13:15:03,458 [35:B5326F3F] DEBUG - HTTPListener Threads deallocated resource back to LifecycleRequestHandler partition
2020-12-20 13:15:03,458 [35:3C0B0663:HTTP Listener-35] DEBUG - using connection SCEP@1197017148 [d=true,io=1,w=true,b=false|false],NOT_HANDSHAKING, in/out=0/0 Status = OK HandshakeStatus = NOT_HANDSHAKING
bytesConsumed = 5 bytesProduced = 26
2020-12-20 13:15:03,780 [35:3C0B0663] DEBUG - using connection SCEP@1197017148 [d=true,io=1,w=true,b=false|false],NOT_HANDSHAKING, in/out=0/0 Status = OK HandshakeStatus = NOT_HANDSHAKING
bytesConsumed = 26 bytesProduced = 5
2020-12-20 13:15:06,986 [31:858161EB] DEBUG - Submitting task SchedulerHeartbeat for execution
2020-12-20 13:15:06,986 [395:1AE716D8] DEBUG - Begin task SchedulerHeartbeat on Thread: GC.SysExecutor.8
2020-12-20 13:15:06,986 [395:F944F4C8:GC.SysExecutor.8 (SchedulerHeartbeat)] DEBUG - Scheduler heartbeat
2020-12-20 13:15:06,988 [395:F944F4C8] DEBUG - Scheduling next SchedulerHeartbeat after delay 29998 including periodShift of 0 milliseconds
2020-12-20 13:15:06,988 [395:1AE716D8:GC.SysExecutor.8] DEBUG - End task SchedulerHeartbeat
2020-12-20 13:15:07,016 [31:858161EB] DEBUG - Submitting task HeapMonitorTask for execution
2020-12-20 13:15:07,017 [396:1AE716D9] DEBUG - Begin task HeapMonitorTask on Thread: GC.SysExecutor.9
2020-12-20 13:15:07,017 [396:391F60D7:GC.SysExecutor.9 (HeapMonitorTask)] DEBUG - Scheduling next HeapMonitorTask after delay 5000 including periodShift of 0 milliseconds
2020-12-20 13:15:07,017 [396:1AE716D9:GC.SysExecutor.9] DEBUG - End task HeapMonitorTask
2020-12-20 13:15:12,017 [31:858161EB] DEBUG - Submitting task HeapMonitorTask for execution
2020-12-20 13:15:12,017 [37:1AE716D0] DEBUG - Begin task HeapMonitorTask on Thread: GC.SysExecutor.0
2020-12-20 13:15:12,017 [37:FE21F10E:GC.SysExecutor.0 (HeapMonitorTask)] DEBUG - Scheduling next HeapMonitorTask after delay 5000 including periodShift of 0 milliseconds
2020-12-20 13:15:12,017 [37:1AE716D0:GC.SysExecutor.0] DEBUG - End task HeapMonitorTask
2020-12-20 13:15:12,189 [33:6D553CF6] DEBUG - HTTPListener Threads deallocated resource back to LifecycleRequestHandler partition
2020-12-20 13:15:12,190 [35:3C0B0663] DEBUG - using connection SCEP@1611645943 [d=true,io=1,w=true,b=false|false],NOT_HANDSHAKING, in/out=0/0 Status = OK HandshakeStatus = NOT_HANDSHAKING
bytesConsumed = 100 bytesProduced = 121
2020-12-20 13:15:12,191 [35:7107E334:HTTP Listener-35 - /emd/persistence/main/] DEBUG - HTTPListener Threads allocated resource from LifecycleRequestHandler partition
2020-12-20 13:15:17,017 [31:858161EB] DEBUG - Submitting task HeapMonitorTask for execution
2020-12-20 13:15:17,018 [45:1AE716D1] DEBUG - Begin task HeapMonitorTask on Thread: GC.SysExecutor.1
2020-12-20 13:15:17,018 [45:CBCC52CF:GC.SysExecutor.1 (HeapMonitorTask)] DEBUG - Scheduling next HeapMonitorTask after delay 5000 including periodShift of 0 milliseconds
2020-12-20 13:15:17,018 [45:1AE716D1:GC.SysExecutor.1] DEBUG - End task HeapMonitorTask

Following MOS Enterprise Manager12c: Oracle Database Tablespace Monthly Space Usage shows no data (Doc ID 1536654.1), I made a few changes:

$/AGENT_INST/bin/emctl setproperty agent -allow_new -name MaxInComingConnections -value 150
$/AGENT_INST/bin/emctl setproperty agent -allow_new -name _cancelThread -value 210

Here’s the status before the change:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 12.1.0.5.0
OMS Version : (unknown)
Protocol Version : 12.1.0.1.0
Agent Home : /u01/app/oracle/product/agent12c/agent_inst
Agent Log Directory : /u01/app/oracle/product/agent12c/agent_inst/sysman/log
Agent Binaries : /u01/app/oracle/product/agent12c/core/12.1.0.5.0
Agent Process ID : 61641
Parent Process ID : 61394
Currently initializing component : Target Manager (2) (54 of 70)
Receivelet Interaction Manager Current Activity: Outstanding receivelet event tasks
----------------------------------
TargetID = oracle_pdb.c4test_PDB1 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_PDB2 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30
TargetID = oracle_pdb.c4test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c6test_CDBROOT - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:29
TargetID = oracle_pdb.c3test_PDB3 - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30
TargetID = rac_database.c1test - EventType - TARGET_EVENT for operation LOAD_TARGET submitted at 2020-12-20 12:54:30

Target Manager Current Activity : Compute Dynamic Properties (total operations: 37, active: 7, finished: 28)

Current target operations in progress
-------------------------------------
oracle_pdb.c6test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c4test_PDB1 - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_PDB2 - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c4test_CDBROOT - LOAD_TARGET_DYNAMIC running for 120 seconds
oracle_pdb.c3test_PDB3 - LOAD_TARGET_DYNAMIC running for 120 seconds
rac_database.c1test - LOAD_TARGET_DYNAMIC running for 120 seconds

Dynamic property executor tasks running
------------------------------


---------------------------------------------------------------
Agent is Running but Not Ready

And this was the status after the change:

oracle:dbserver02@c1test2 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 12.1.0.5.0
OMS Version : 12.1.0.5.0
Protocol Version : 12.1.0.1.0
Agent Home : /u01/app/oracle/product/agent12c/agent_inst
Agent Log Directory : /u01/app/oracle/product/agent12c/agent_inst/sysman/log
Agent Binaries : /u01/app/oracle/product/agent12c/core/12.1.0.5.0
Agent Process ID : 56994
Parent Process ID : 56654
Agent URL : https://dbserver02:3872/emd/main/
Local Agent URL in NAT : https://dbserver02:3872/emd/main/
Repository URL : https://omsweb:4903/empbs/upload
Started at : 2020-12-20 13:08:35
Started by user : oracle
Operating System : Linux version 3.10.0-957.27.2.el7.x86_64 (amd64)
Last Reload : (none)
Last successful upload : 2020-12-20 13:40:41
Last attempted upload : 2020-12-20 13:40:41
Total Megabytes of XML files uploaded so far : 1.02
Number of XML files pending upload : 0
Size of XML files pending upload(MB) : 0
Available disk space on upload filesystem : 10.85%
Collection Status : Collections enabled
Heartbeat Status : Ok
Last attempted heartbeat to OMS : 2020-12-20 13:40:40
Last successful heartbeat to OMS : 2020-12-20 13:40:40
Next scheduled heartbeat to OMS : 2020-12-20 13:41:40

---------------------------------------------------------------
Agent is Running and Ready

Great! Agent issue resolved.

However, the metric was not being gathered—not even after running it manually:

oracle:dbserver01@c1test1 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl control agent runCollection c1test_CDBROOT:oracle_pdb tbspAllocation
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD runCollection completed successfully

oracle:dbserver01@c1test1 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl status agent scheduler | grep tbspAllocation
2020-12-28 23:05:14.562 : rac_database:c1test:tbspAllocation_cdb
2020-12-29 03:07:21.988 : rac_database:c4prod:tbspAllocation_cdb
2020-12-29 03:08:11.888 : rac_database:c6prod:tbspAllocation_cdb
2020-12-29 03:09:39.103 : rac_database:c2prod:tbspAllocation_cdb
2020-12-29 03:09:55.372 : rac_database:c3prod:tbspAllocation_cdb

oracle:dbserver01@c1test1 /u01/app/oracle: /u01/app/oracle/product/agent12c/core/12.1.0.5.0/bin/emctl control agent runCollection c1test_DW:oracle_pdb tbspAllocation
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD runCollection completed successfully

Meanwhile, on the OEM repository database:

SQL> select TARGET_NAME,TARGET_TYPE,TARGET_GUID,max(ROLLUP_TIMESTAMP)
from mgmt$metric_daily where TARGET_NAME like '%c1test%'
and TARGET_TYPE='oracle_pdb'
and METRIC_NAME='tbspAllocation'
group by TARGET_NAME,TARGET_TYPE,TARGET_GUID;

TARGET_NAME TARGET_TYPE TARGET_GUID MAX(ROLLUP_TIMESTAM
------------------------------ -------------------- -------------------------------- -------------------
c1test_DW oracle_pdb 7B1DF5DD4555EB978330A6D522004D44 2020-11-12 00:00:00
c1test_CDBROOT oracle_pdb 4CE72911295C0287E053837F649B7D0E 2020-11-12 00:00:00


SQL> select TARGET_NAME,TARGET_TYPE,TARGET_GUID,ROLLUP_TIMESTAMP from mgmt$metric_daily where TARGET_NAME like '%c1test%' and TARGET_TYPE='oracle_pdb' and ROLLUP_TIMESTAMP>sysdate-3 order by 4

TARGET_NAME TARGET_TYPE ROLLUP_TIMESTAMP AVERAGE
------------------------------ -------------------- ------------------- ----------
c1test_DW oracle_pdb 2020-11-06 00:00:00 1575.9375
c1test_DW oracle_pdb 2020-11-07 00:00:00 1575.9375
c1test_DW oracle_pdb 2020-11-08 00:00:00 1575.9375
c1test_DW oracle_pdb 2020-11-09 00:00:00 1575.9375
c1test_DW oracle_pdb 2020-11-10 00:00:00 1575.9375
c1test_DW oracle_pdb 2020-11-11 00:00:00 1575.9375
c1test_DW oracle_pdb 2020-11-12 00:00:00 1575.9375
c1test_CDBROOT oracle_pdb 2020-11-05 00:00:00 37581.5625


TARGET_NAME TARGET_TYPE ROLLUP_TIMESTAMP AVERAGE
------------------------------ -------------------- ------------------- ----------
c1test_CDBROOT oracle_pdb 2020-11-08 00:00:00 227138.75
c1test_CDBROOT oracle_pdb 2020-11-09 00:00:00 455087.688
c1test_CDBROOT oracle_pdb 2020-11-10 00:00:00 278230.875
c1test_CDBROOT oracle_pdb 2020-11-11 00:00:00 208727.188
c1test_CDBROOT oracle_pdb 2020-11-12 00:00:00 454964.063
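To put a number on how stale the metric was, here is a small Python sketch that compares the latest rollup timestamps from the first query with a reference date. The reference date (2020-12-29, taken from the scheduler output earlier) and the 2-day threshold are my own assumptions for illustration, not OEM settings.

```python
from datetime import datetime

# Latest rollup per target, copied from the MAX(ROLLUP_TIMESTAMP) query output.
LAST_ROLLUPS = {
    "c1test_DW": "2020-11-12 00:00:00",
    "c1test_CDBROOT": "2020-11-12 00:00:00",
}


def stale_targets(last_rollups, as_of, max_age_days=2):
    """Return {target: age_in_days} for targets whose last rollup is too old."""
    stale = {}
    for target, timestamp in last_rollups.items():
        last = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        age_days = (as_of - last).days
        if age_days > max_age_days:
            stale[target] = age_days
    return stale


stale = stale_targets(LAST_ROLLUPS, as_of=datetime(2020, 12, 29))
for target, age_days in stale.items():
    print(f"{target}: last rollup {age_days} days ago")
```

A check like this, run daily against the repository, would have flagged the gap long before the client noticed it.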

In summary: after fixing all the issues on the OEM side, with everything running fine, the database metrics were still not being updated.

To make a long story short, after some investigation, I came across the following in MOS (My Oracle Support): Database Hangs With Simple Queries like on view dba_data_files & dba_free_space (Doc ID 2665935.1)

This seemed to be a match. So, I proceeded with its recommendations on OMS database:

SQL> alter session set container=DW;

Session altered.

SQL> show pdbs

CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
3 DW READ WRITE NO
SQL> select count(*) from dba_recyclebin;

COUNT(*)
----------
28522

SQL> purge recyclebin;

Recyclebin purged.

SQL> purge dba_recyclebin;

DBA Recyclebin purged.

With that done, all the issues were solved and the metric was being collected again:

Tablespace allocation metric.

 

Here are some additional references:

  • Database Tablespace Metrics: Tablespace Allocation Is Not Collected (Metric tbspAllocation) (Doc ID 404692.1)
  • EM 12c : emctl start agent Fails With Error ‘Starting agent … started but not ready’ (Doc ID 1591477.1)
  • EM12c : emctl start / status agent ‘Agent Running but Not Ready’ ‘ERROR – The agent is overloaded [current requests: 30]’ Reported in gcagent.log (Doc ID 1546529.1)

 

I hope this helps!

If you have any questions or thoughts, please leave them in the comments. And don’t forget to sign up for the next post.