Autonomous Health Framework – TFA events, changes and set

Continuing on this series, there are 2 commands which are very useful and we’ll see today: “events” and “changes”. Besides “set”, which we’ll cover too.

On the events command you will be able to drill down to a specific date and time, passing a specific component if needed as well you can check a database or instance as well. Using the last option you can, this is very useful when you want to see a history for the system and lists all important events in the system:

Usage : /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl events [-search <keyword> | -component <ASM|CRS> | -database <db_name> | -instance <db_instance_name> | -source <filename> | -from <time> -to <time> | -json | -fields all|<fields_list> ]
-component [ASM|CRS] Searches all ASM or CRS events
-database Searches all events from a particular database
-instance Searches all events from a particular instance
-source Searches all events from a particular alert file
-json Outputs event information in JSON format
-fields When provided with the -json option, the command will only return the requested fields
-from "Mon/dd/yyyy hh:mm:ss" From <time>
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-to "Mon/dd/yyyy hh:mm:ss" To <time>
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-for "Mon/dd/yyyy" For <date>
or "yyyy-mm-dd"
-last <n><h|d>

[root@servertst01 bin]# ./tfactl events -last 30d -database dbaseprd


Output from host : servertst02
------------------------------

Event Summary:
INFO :7
ERROR :3
WARNING :0

Event Timeline:
[Apr/24/2020 01:43:06.000]: [db.dbaseprd.dbaseprd2]: Reconfiguration started (old inc 21, new inc 23)
[Apr/24/2020 02:09:47.000]: [db.dbaseprd.dbaseprd2]: Reconfiguration started (old inc 23, new inc 25)
[Apr/24/2020 02:12:36.000]: [db.dbaseprd.dbaseprd2]: Instance terminated by USER, pid = 246331
[Apr/24/2020 02:12:36.000]: [db.dbaseprd.dbaseprd2]: Shutting down instance (abort)
[Apr/24/2020 02:28:29.000]: [db.dbaseprd.dbaseprd2]: Starting ORACLE instance (normal)
[Apr/24/2020 02:28:37.000]: [db.dbaseprd.dbaseprd2]: Reconfiguration started (old inc 0, new inc 29)
[Apr/24/2020 02:31:06.000]: [db.dbaseprd.dbaseprd2]: Reconfiguration started (old inc 29, new inc 31)
[Apr/24/2020 03:14:38.000]: [db.dbaseprd.dbaseprd2]: Reconfiguration started (old inc 31, new inc 33)
[Apr/25/2020 11:46:12.000]: [db.dbaseprd.dbaseprd2]: ORA-00060: deadlock detected while waiting for resource
[May/07/2020 04:18:24.000]: [db.dbaseprd.dbaseprd2]: ORA-08103: object no longer exists

Output from host : servertst03
------------------------------

Event Summary:
INFO :7
ERROR :5
WARNING :0

Event Timeline:
[Apr/16/2020 04:12:25.000]: [db.dbaseprd.dbaseprd3]: ORA-08103: object no longer exists
[Apr/23/2020 04:12:17.000]: [db.dbaseprd.dbaseprd3]: ORA-08103: object no longer exists
[Apr/24/2020 01:43:06.000]: [db.dbaseprd.dbaseprd3]: Reconfiguration started (old inc 21, new inc 23)
[Apr/24/2020 02:09:47.000]: [db.dbaseprd.dbaseprd3]: Reconfiguration started (old inc 23, new inc 25)
[Apr/24/2020 02:12:37.000]: [db.dbaseprd.dbaseprd3]: Reconfiguration started (old inc 25, new inc 27)
[Apr/24/2020 02:28:38.000]: [db.dbaseprd.dbaseprd3]: Reconfiguration started (old inc 27, new inc 29)
[Apr/24/2020 02:31:05.000]: [db.dbaseprd.dbaseprd3]: Instance terminated by USER, pid = 6814
[Apr/24/2020 02:31:05.000]: [db.dbaseprd.dbaseprd3]: Shutting down instance (abort)
[Apr/24/2020 03:14:28.000]: [db.dbaseprd.dbaseprd3]: Starting ORACLE instance (normal)
[Apr/24/2020 03:14:37.000]: [db.dbaseprd.dbaseprd3]: Reconfiguration started (old inc 0, new inc 33)
[Apr/28/2020 10:12:44.000]: [db.dbaseprd.dbaseprd3]: ORA-00060: deadlock detected while waiting for resource
[May/10/2020 13:40:21.000]: [db.dbaseprd.dbaseprd3]: ORA-00060: deadlock detected while waiting for resource

Output from host : servertst01
------------------------------

Event Summary:
INFO :8
ERROR :2
WARNING :0

Event Timeline:
[Apr/24/2020 01:43:05.000]: [db.dbaseprd.dbaseprd1]: Instance terminated by USER, pid = 37711
[Apr/24/2020 01:43:05.000]: [db.dbaseprd.dbaseprd1]: Shutting down instance (abort)
[Apr/24/2020 02:09:38.000]: [db.dbaseprd.dbaseprd1]: Starting ORACLE instance (normal)
[Apr/24/2020 02:09:46.000]: [db.dbaseprd.dbaseprd1]: Reconfiguration started (old inc 0, new inc 25)
[Apr/24/2020 02:12:37.000]: [db.dbaseprd.dbaseprd1]: Reconfiguration started (old inc 25, new inc 27)
[Apr/24/2020 02:28:39.000]: [db.dbaseprd.dbaseprd1]: Reconfiguration started (old inc 27, new inc 29)
[Apr/24/2020 02:31:06.000]: [db.dbaseprd.dbaseprd1]: Reconfiguration started (old inc 29, new inc 31)
[Apr/24/2020 03:14:38.000]: [db.dbaseprd.dbaseprd1]: Reconfiguration started (old inc 31, new inc 33)
[Apr/24/2020 14:40:10.000]: [db.dbaseprd.dbaseprd1]: Incident details in: /oraadm/dbaseprd/admin/traces/diag/rdbms/dbaseprd/dbaseprd1/incident/incdir_115657/dbaseprd1_ora_151392_i115657.trc
[Apr/24/2020 14:40:10.000]: [db.dbaseprd.dbaseprd1]: ORA-00600: internal error code, arguments: [pevmexe.c: ENTER: which perc?], [], [], [], [], [], [], [], [], [], [], []

[root@servertst01 bin]#

The “changes” command, this shows the changes in the system, as I did not have any changes made after the TFA was installed was not able to reproduce it here. It lists all changes in the system:

Usage : /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl changes [ -from <time> -to <time> | -for <time> | last <time_length> ]
-from "Mon/dd/yyyy hh:mm:ss" From <time>
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-to "Mon/dd/yyyy hh:mm:ss" To <time>
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-for "Mon/dd/yyyy" For <date>
or "yyyy-mm-dd"
-last <n><h|d>

Another way to see the changes in your environment is to use the search command passing a part to it and a specific component.

[root@servertst01 bin]# ./tfactl analyze -search "ALTER" -comp db
INFO: analyzing db (DB Alert Logs) logs for the last 60 minutes... Please wait...
INFO: analyzing host: servertst01

Report title: DB Alert Logs
Report date range: last ~1 hour(s)
Report (default) time zone: WET - Western European Time
Analysis started at: 11-May-2020 03:04:17 PM WEST
Elapsed analysis time: 2 second(s).
Configuration file: /oraadm/dba/ahf/oracle.ahf/tfa/ext/tnt/conf/tnt.prop
Configuration group: db
Parameter: ALTER
Total message count: 44,397, from 10-Feb-2020 01:19:51 PM WET to 11-May-2020 03:01:46 PM WEST
Messages matching last ~1 hour(s): 18, from 11-May-2020 02:17:23 PM WEST to 11-May-2020 03:01:46 PM WEST
Matching regex: ALTER
Case sensitive: false
Match count: 1
May 11 15:01:46 2020 - generic - ALTER SYSTEM SET open_cursors=6000 SCOPE=BOTH SID='*';

INFO: analyzing db (DB Alert Logs) logs for the last 60 minutes... Please wait...
INFO: analyzing host: servertst02

Report title: DB Alert Logs
Report date range: last ~1 hour(s)
Report (default) time zone: WET - Western European Time
Analysis started at: 11-May-2020 03:04:23 PM WEST
Elapsed analysis time: 0 second(s).
Configuration file: /oraadm/dba/ahf/oracle.ahf/tfa/ext/tnt/conf/tnt.prop
Configuration group: db
Parameter: ALTER
Total message count: 3,598, from 10-Feb-2020 03:03:42 PM WET to 11-May-2020 02:30:16 PM WEST
Messages matching last ~1 hour(s): 5, from 11-May-2020 02:30:05 PM WEST to 11-May-2020 02:30:16 PM WEST
Matching regex: ALTER
Case sensitive: false
Match count: 2
May 11 14:30:05 2020 - generic - ALTER SYSTEM ARCHIVE LOG
May 11 14:30:16 2020 - generic - ALTER SYSTEM ARCHIVE LOG
Thread 2 advanced to log sequence 1096 (LGWR switch)
Current log# 4 seq# 1096 mem# 0: +REDO1/testdba/onlinelog/group_4.274.1032015207
Current log# 4 seq# 1096 mem# 1: +REDO2/testdba/onlinelog/group_4.274.1032015207

INFO: analyzing db (DB Alert Logs) logs for the last 60 minutes... Please wait...
INFO: analyzing host: servertst03

Report title: DB Alert Logs
Report date range: last ~1 hour(s)
Report (default) time zone: WET - Western European Time
Analysis started at: 11-May-2020 03:04:25 PM WEST
Elapsed analysis time: 0 second(s).
Configuration file: /oraadm/dba/ahf/oracle.ahf/tfa/ext/tnt/conf/tnt.prop
Configuration group: db
Parameter: ALTER
Total message count: 1,687, from 19-Mar-2020 06:28:29 PM WET to 11-May-2020 02:30:18 PM WEST
Messages matching last ~1 hour(s): 4, from 11-May-2020 02:30:06 PM WEST to 11-May-2020 02:30:18 PM WEST
Matching regex: ALTER
Case sensitive: false
Match count: 0

And for last the set command which will configure several options for TFA:

[root@servertst01 bin]# ./tfactl set -h

Usage : tfact set ahfnotificationaddress="<space separated email ids>"
ahfnotificationaddress="<email1> <email2>" set AHF Notification Address

Turn ON/OFF or Modify various TFA features

Usage : /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set [ autodiagcollect=<ON|OFF> | trimfiles=<ON|OFF> | tracelevel=<COLLECT|SCAN|INVENTORY|OTHER|ISA|HANDLER|MAIN|CLIENT|CONSOLE>:<FATAL|ERROR|WARN|INFO|DEBUG|TRACE> | reposizeMB=<n> [repositorydir=<dir>] [-force] | repositorydir=<dir> [reposizeMB=<n>] [-force] | logsize=<n> [-local] | logcount=<n> [-local] | maxcorefilesize=<n> [-local] | maxcorecollectionsize=<n> [-local] | maxfilecollectionsize=<n> | autopurge=<ON|OFF> | publicip=<ON|OFF> | minSpaceForRTScan=<n> | rtscan=<ON|OFF> | diskUsageMon=<ON|OFF> | diskUsageMonInterval=<n> | manageLogsAutoPurge=<ON|OFF> | manageLogsAutoPurgeInterval=<n> | manageLogsAutoPurgePolicyAge=<d|h> | minagetopurge=<n> | tfaIpsPoolSize=<n> | tfaDbUtlPurgeAge=<n> | tfaDbUtlPurgeMode=<simple|resource|profile> | tfaDbUtlPurgeThreadDelay=<n> | tfaDbUtlCrsProfileDelay=<n> | indexRecoveryMode=<recreate|restore>] [-c]

autodiagcollect allow for automatic diagnostic collection when an event
is observed (default ON)
trimfiles allow trimming of files during diagcollection (default ON)
tracelevel control the trace level of log files in /oraadm/dba/ahf/oracle.ahf/data/servertst01/diag/tfa
(default INFO for all facilities)
reposizeMB=<n> set the maximum size of diagcollection repository to <n>MB
repositorydir=<dir> set the diagcollection repository to <dir>
-force skip inital checks while changing repository (Not Recommended)
logsize=<n> set the maximum size of each TFA log to <n>MB (default 50 MB)
logcount=<n> set the maximum number of TFA logs to <n> (default 10)
port=<n> set TFA Port to <n>
maxcorefilesize=<n> set the maximum size of Core File to <n>MB (default 20 MB )
maxcorecollectionsize=<n> set the maximum collection size of Core Files to <n>MB (default 200 MB )
maxfilecollectionsize=<n> set the maximum file collection size to <n>MB (default 5 GB )
autopurge allow automatic purging of collections when less space
is observed in repository (default OFF)
publicip allow TFA to run on public network
smtp Update SMTP Configuration
minSpaceForRTScan=<n> Minimun space required to run RT Scanning(default 500)
rtscan allow Alert Log Scanning
diskUsageMon allow Disk Usage Monitoring
diskUsageMonInterval=<n> Time interval between consecutive Disk Usage Snapshot(default 60 minutes)
manageLogsAutoPurge allow Manage Log Auto Purging
manageLogsAutoPurgeInterval=<n> Time interval between consecutive Managelogs Auto Purge(default 60 minutes)
manageLogsAutoPurgePolicyAge=<d|h> Logs older than the time period will be auto purged(default 30 days)
minagetopurge set the age in hours for collections to be skipped by
AutoPurge (default 12 Hours)
tfaIpsPoolSize set the TFA IPS pool size
tfaDbUtlPurgeAge set the TFA ISA Purge Age (in seconds)
tfaDbUtlPurgeMode set the TFA ISA Purge Mode (simple/resource)
tfaDbUtlPurgeThreadDelay set the TFA ISA Purge Thread Delay (in minutes)
tfaDbUtlCrsProfileDelay set the TFA ISA Crs Profile Delay
indexRecoveryMode set the Lucene index recovery mode (recreate/restore)
rediscoveryInterval set the time interval for running lite rediscovery
-c set the value on all nodes (Does not apply to repository
settings)
-local set the value on the local node (if option is not included
the value will be set on all the nodes)
Examples:
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set autodiagcollect=ON
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set autopurge=ON
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set tracelevel=INVENTORY:3
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set reposizeMB=20480
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set logsize=100
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set port=5000

Hope it helps, there are still many scenarios of TFA which I would like to bring but not that much time, I will try to make long posts for the next ones. We´ll see.

Thanks for reading!

Elisson Almeida

Autonomous Health Framework – Managing logs and REST API

Hi all,

We are used to use adrci to manage our database logs right, if you don’t about adrci please check here and here.

But did you know that TFA can cleanup the CRS and database logs as well?

You might be thinking: 2 set of tools for the same job? Yyeah kind off… =)

Adrci is configurable and performs the purges per configured policy and tfactl managelogs will not. Also the tfactl managelogs will also clean the listener logs as well and you can clean all cluster logs from one node without having to connect to each server to execute the purge from adrci for example.

The cleanup can be done by database or GI which comes in handy as well.

You have a dry run option which will show what will be deleted and total freed space without actually deleting any files

Usage : /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl [run] managelogs [ -purge [[-older <n><m|h|d>] | [-gi] | [-database <all|d1,d2..>] | [-dryrun] ]] [ -show [usage|variation] [ [-older <n><d>] | [-gi] | [-database <all|d1,d2..>] ] ] [-node <all|local|node1,node2..>]

Options:
-purge           Purge logs
   -older        Timeperiod for log purge
   -gi           Purge Grid Infrastructure logs(all ADR homes under GIBASE/diag and crsdata(cvu dirs))
   -database     Purge Database logs (Default all else provide list)
   -dryrun       Estimate logs which will be cleared by purge command
-show            Print usage/variation details
   -older         Timeperiod for change in log volume
   -gi           Space utilization under GIBASE
   -database     Space utilization for Database logs (Default all else provide list)


 -older <n><m|h|d>  Files from past 'n' [d]ays or 'n' [h]ours or 'n' [m]inutes


e.g:
   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl managelogs -purge -older 30d -dryrun
   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl managelogs -purge -older 30d
   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl managelogs -show usage

   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl run managelogs -purge -older 30d -dryrun
   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl run managelogs -purge -older 30d
   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl run managelogs -show usage

 

TFA also comes with a REST API which you can use POST and GET http which you need to setup prior usage, it will setup 2 users, tfaadmin and tfarest, and start a ORDS service on the server.

[root@servertst01 bin]# ./tfactl rest status

TFA REST Services is not running

[root@servertst01 bin]# ./tfactl rest start

TFA REST Services is not running

[root@servertst01 bin]# ./tfactl rest -start

Configuring TFA REST Services using ORDS :

This might take couple of minutes. Please be patient.

Adding Dependency Jars to ORDS

Adding users to ORDS :

Password must be at least 8 characters and contain a combination of Upper case and Lower case letters, numbers and one of [!@#%_]

Enter a password for user tfaadmin:
Confirm password for user tfaadmin:

Enter a password for user tfarest:
Confirm password for user tfarest:

Starting TFA REST Services

Successfully started TFA REST Services [PID : 158682]

TFA URL : https://servertest01:9090/ords/tfactl/print/status

COMPLIANCE URL : https://servertest01:9090/ords/tfactl/orachk/version

 

The commands are most like the ones from the CLI but using this REST API, you will receive JSON outputs which you can process and automate the use of TFA

Below are some examples

/tfactl/print/status

[root@servertst01 ~]$ curl -k --user tfaadmin:tfaadmin https://servertst01:9090/ords/tfactl/print/status
[ {
"status" : "CheckOK",
"hostname" : "servertst01",
"pid" : 135195,
"port" : 5000,
"version" : "20.1.3.0.0",
"buildId" : "20130020200429095054",
"inventoryStatus" : "COMPLETE"
}, {
"status" : "CheckOK",
"hostname" : "servertst02",
"pid" : 241723,
"port" : 5000,
"version" : "20.1.3.0.0",
"buildId" : "20130020200429095054",
"inventoryStatus" : "COMPLETE"
} ]

/print/actions

[root@servertst01 ~]$ curl -k --user tfaadmin:tfaadmin https://servertst01:9090/ords/tfactl/print/actions
[ {
"actionName" : "Run inventory",
"hostname" : "servertst01",
"client" : "DiagCollectionThread",
"startTime" : "May 13 15:48:52 WEST",
"endTime" : "May 13 15:49:58 WEST",
"status" : "COMPLETE",
"comments" : "<components><component><name>EMAGENT</name></component><component><name>DBCLIENT</name></component><component><name>ACFS</name></component><component><name>RDBMS</name></component><component><name>OS</name></component><component><name>CRS</name></component><component><name>EM</name></component><component><name>ASMPROXY</name></component><component><name>ASMIO</name></component><component><name>CRSCLIENT</name></component><component><name>OMS</name></component><component><name>DBWLM</name></component><component><name>INSTALL</name></component><component><name>TNS</name></component><component><name>OCM</name></component><component><name>ASM</name></component><component><name>RHP</name></component><component><name>CHA</name></component><component><name>AFD</name></component><component><name>EMPLUGINS</name></component><component><name>CFGTOOLS</name></component><component><name>WLS</name></component></components>"
}, {
"actionName" : "Collect traces",
"hostname" : "servertst01",
"client" : "tfactl",
"startTime" : "May 13 15:48:51 WEST",
"endTime" : "May 13 15:53:21 WEST",
"status" : "COMPLETE",
"comments" : "-database all -rdbms -asm -crsclient -dbclient -dbwlm -tns -rhp -afd -crs -wls -emagent -oms -ocm -emplugins -em -acfs -install -cfgtools -os -sundiag -chmos -monitor -copy -since 12h -node all -z TFA_DEF_ZIP_20200513154851 -user root -tag TFA_DEF_TAG_20200513154851 -logid 20200513154851:20200513154851servertst01"
}, {
"actionName" : "Run inventory",
"hostname" : "servertst01",
"client" : "DiagCollectionThread",
"startTime" : "May 13 15:47:48 WEST",
"endTime" : "May 13 15:48:20 WEST",
"status" : "COMPLETE",
"comments" : "<components><component><name>EMAGENT</name></component><component><name>DBCLIENT</name></component><component><name>ACFS</name></component><component><name>RDBMS</name></component><component><name>OS</name></component><component><name>CRS</name></component><component><name>EM</name></component><component><name>ASMPROXY</name></component><component><name>ASMIO</name></component><component><name>CRSCLIENT</name></component><component><name>OMS</name></component><component><name>DBWLM</name></component><component><name>INSTALL</name></component><component><name>TNS</name></component><component><name>OCM</name></component><component><name>ASM</name></component><component><name>RHP</name></component><component><name>CHA</name></component><component><name>AFD</name></component><component><name>EMPLUGINS</name></component><component><name>CFGTOOLS</name></component><component><name>WLS</name></component></components>"
}, {
"actionName" : "Collect traces",
"hostname" : "servertst01",
"client" : "tfactl",
"startTime" : "May 13 15:47:48 WEST",
"endTime" : "May 13 15:52:29 WEST",
"status" : "COMPLETE",
"comments" : "-database all -rdbms -asm -crsclient -dbclient -dbwlm -tns -rhp -afd -crs -wls -emagent -oms -ocm -emplugins -em -acfs -install -cfgtools -os -sundiag -chmos -monitor -copy -since 12h -node all -z TFA_DEF_ZIP_20200513154748 -user root -tag TFA_DEF_TAG_20200513154748 -logid 20200513154748:20200513154748servertst01"
}, {
"actionName" : "Run inventory",
"hostname" : "servertst02",
"client" : "DiagCollectionThread",
"startTime" : "May 13 15:48:54 WEST",
"endTime" : "May 13 15:49:04 WEST",
"status" : "COMPLETE",
"comments" : "<components><component><name>EMAGENT</name></component><component><name>DBCLIENT</name></component><component><name>ACFS</name></component><component><name>RDBMS</name></component><component><name>OS</name></component><component><name>CRS</name></component><component><name>EM</name></component><component><name>ASMPROXY</name></component><component><name>ASMIO</name></component><component><name>CRSCLIENT</name></component><component><name>OMS</name></component><component><name>DBWLM</name></component><component><name>INSTALL</name></component><component><name>OCM</name></component><component><name>TNS</name></component><component><name>ASM</name></component><component><name>RHP</name></component><component><name>CHA</name></component><component><name>AFD</name></component><component><name>EMPLUGINS</name></component><component><name>CFGTOOLS</name></component><component><name>WLS</name></component></components>"
}, {
"actionName" : "Collect traces",
"hostname" : "servertst02",
"client" : "tfactl",
"startTime" : "May 13 15:48:53 WEST",
"endTime" : "May 13 15:53:27 WEST",
"status" : "COMPLETE",
"comments" : "-timezone Europe/Lisbon -nodelist servertst02_servertst01 -s 05.13.2020.03.48 -e 05.13.2020.15.48 -updateendtime -z TFA_DEF_ZIP_20200513154851.zip -tag TFA_DEF_TAG_20200513154851 -logid 20200513154851 -emagent -dbclient -acfs -rdbms -os -crs -em -asmproxy -asmio -crsclient -oms -dbwlm -install -tns -ocm -asm -rhp -cha -afd -emplugins -cfgtools -wls -chmos -copy -user root -uid 9a79d417-a877-4355-a2f2-210f1d45024a:20200513154851servertst01"
}, {
"actionName" : "Run inventory",
"hostname" : "servertst02",
"client" : "DiagCollectionThread",
"startTime" : "May 13 15:47:50 WEST",
"endTime" : "May 13 15:48:12 WEST",
"status" : "COMPLETE",
"comments" : "<components><component><name>EMAGENT</name></component><component><name>DBCLIENT</name></component><component><name>ACFS</name></component><component><name>RDBMS</name></component><component><name>OS</name></component><component><name>CRS</name></component><component><name>EM</name></component><component><name>ASMPROXY</name></component><component><name>ASMIO</name></component><component><name>CRSCLIENT</name></component><component><name>OMS</name></component><component><name>DBWLM</name></component><component><name>INSTALL</name></component><component><name>OCM</name></component><component><name>TNS</name></component><component><name>ASM</name></component><component><name>RHP</name></component><component><name>CHA</name></component><component><name>AFD</name></component><component><name>EMPLUGINS</name></component><component><name>CFGTOOLS</name></component><component><name>WLS</name></component></components>"
}, {
"actionName" : "Collect traces",
"hostname" : "servertst02",
"client" : "tfactl",
"startTime" : "May 13 15:47:49 WEST",
"endTime" : "May 13 15:52:37 WEST",
"status" : "COMPLETE",
"comments" : "-timezone Europe/Lisbon -nodelist servertst02_servertst01 -s 05.13.2020.03.47 -e 05.13.2020.15.47 -updateendtime -z TFA_DEF_ZIP_20200513154748.zip -tag TFA_DEF_TAG_20200513154748 -logid 20200513154748 -emagent -dbclient -acfs -rdbms -os -crs -em -asmproxy -asmio -crsclient -oms -dbwlm -install -tns -ocm -asm -rhp -cha -afd -emplugins -cfgtools -wls -chmos -copy -user root -uid 94229462-6332-4a7e-849b-9469c1ad86ef:20200513154748servertst01"
} ]

/tfactl/diagcollect

[root@servertst01 ~]$ curl -k --user tfaadmin:tfaadmin https://servertst01:9090/ords/tfactl/diagcollect
{
"collectionId" : "20200513161625servertst01",
"zipName" : "TFA_DEF_ZIP_20200513161625",
"tagName" : "TFA_DEF_TAG_20200513161625",
"message" : [ "Diagcollect request will be processed soon by TFA" ]
}

 

You can download the report using the collectionId as below

curl -k --user tfaadmin:tfaadmin -L0 https://servertst01:9090/ords/tfactl/download/20200513161625servertst01 --output remoteTFA.zip

 % Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 196M 0 196M 0 0 195M 0 --:--:-- 0:00:01 --:--:-- 195M

On the next ones I will do a review on a ORACHK and OSWatcher which are also part of the Autonomous Health Framework

I Hope it helps!

Elisson Almeida

Autonomous Health Framework – OSWatcher

Hi all,

Another one for our series about Autonomous Health Framework:

Along with the AHF, we have some tools that we can take advantage of, one of them is OSWatcher. OSwatcher is a utility to capture performance metrics from the operating system using native OS tools for IO, network, CPU, memory, etc.

It gathers a snapshot of your system and stores it in a directory which you can then use it to parse the information there and perform a system wide analysis.

You can stop it if you want, out of the box it will gather OS information every 30 minutes

To see if it is running you can use tfactl toolstatus as below

[root@servertst01 bin]# ./tfactl toolstatus

.------------------------------------------------------------------.
| TOOLS STATUS - HOST : servertst01 |
+----------------------+--------------+--------------+-------------+
| Tool Type | Tool | Version | Status |
+----------------------+--------------+--------------+-------------+
| Development Tools | orachk | 19.3.0.0.0 | DEPLOYED |
| | oratop | 14.1.2 | DEPLOYED |
+----------------------+--------------+--------------+-------------+
| Support Tools Bundle | darda | 2.10.0.R6036 | DEPLOYED |
| | oswbb | 8.3.2 | RUNNING |
| | prw | 12.1.13.11.4 | NOT RUNNING |
+----------------------+--------------+--------------+-------------+
| TFA Utilities | alertsummary | 19.3.0.0.0 | DEPLOYED |
| | calog | 19.3.0.0.0 | DEPLOYED |
| | dbcheck | 18.3.0.0.0 | DEPLOYED |
| | dbglevel | 19.3.0.0.0 | DEPLOYED |
| | grep | 19.3.0.0.0 | DEPLOYED |
| | history | 19.3.0.0.0 | DEPLOYED |
| | ls | 19.3.0.0.0 | DEPLOYED |
| | managelogs | 19.3.0.0.0 | DEPLOYED |
| | menu | 19.3.0.0.0 | DEPLOYED |
| | param | 19.3.0.0.0 | DEPLOYED |
| | ps | 19.3.0.0.0 | DEPLOYED |
| | pstack | 19.3.0.0.0 | DEPLOYED |
| | summary | 19.3.0.0.0 | DEPLOYED |
| | tail | 19.3.0.0.0 | DEPLOYED |
| | triage | 19.3.0.0.0 | DEPLOYED |
| | vi | 19.3.0.0.0 | DEPLOYED |
'----------------------+--------------+--------------+-------------'

Note :-
DEPLOYED : Installed and Available - To be configured or run interactively.
NOT RUNNING : Configured and Available - Currently turned off interactively.
RUNNING : Configured and Available.

[root@servertst01 bin]#

And to run a simple test you can call tfactl oswbb and it should parse all data in its archive directory BUT and I tried to do so I ran into a java out of memory error
I bumped the memory of the process up to 4G and still gor the same error.

If you want to try to increase your process memory you will need to edit the file oswbb.pm under oracle.ahf/tfa/ext/oswbb

You will see a line like below, and you will need to change the memory valeu to one that your system can handle, I´m not saying it will not work you can try.

system("$java -Xmx512M -jar $oswjar -i $adir @flags");

Here is the error:

[root@servertst01 bin]# ./tfactl oswbb

Starting OSW Analyzer V8.3.0
OSWatcher Analyzer Written by Oracle Center of Expertise
Copyright (c) 2019 by Oracle Corporation

Parsing Data. Please Wait...

Scanning file headers for version and platform info...


Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at u.c(Unknown Source)
at u.a(Unknown Source)
at OSWGraph.OSWGraph.main(Unknown Source)


Analysis results are saved in /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/oswbb/root/oswbb

And there process memory which was a bit ovee 4G when it died

[root@servertst01 ~]$ ps -eo rss,pid,euser,lstart,args:100 --sort %mem | grep -v grep | grep java | awk '{printf $1/1024 "MB"; $1=""; print }'| sort |grep osw
4364.3MB 103531 root Wed May 13 18:09:12 2020 /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/jre/bin/java -Xmx4096M -jar /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/oswbb/root/oswbb/oswbba.jar -i /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/oswbb/oracrs/archive

I had over 4 days of data there, so by using the options -B and -E I was able to workaround the issue. If you are doing some troubleshooting I would advise to read 4 days of data anyway as we could be see averages and a spike could be reduced and not see in the analysis.
So it did not bothered that much.

Here when passing the date range I was able to execute the process as needed.

[root@servertst01 bin]# ./tfactl oswbb -6 -B May 13 09:25:00 2020 -E May 13 09:30:00 2020

Validating times in the archive...


Starting OSW Analyzer V8.3.0
OSWatcher Analyzer Written by Oracle Center of Expertise
Copyright (c) 2019 by Oracle Corporation

Parsing Data. Please Wait...

Scanning file headers for version and platform info...


Parsing file servertst01_pidstat_20.05.13.0900.dat ...

Parsing file servertst01_iostat_20.05.13.0900.dat ...
This directory already exists. Rewriting...

Parsing file servertst01_vmstat_20.05.13.0900.dat ...

Parsing file servertst01_netstat_20.05.13.0900.dat ...

Parsing file servertst01_top_20.05.13.0900.dat ...

Parsing file servertst01_ps_20.05.13.0900.dat ...


Parsing Completed.


Enter 1 to Display CPU Process Queue Graphs
Enter 2 to Display CPU Utilization Graphs
Enter 3 to Display CPU Other Graphs
Enter 4 to Display Memory Graphs
Enter 5 to Display Disk IO Graphs
Enter 61 to Display Individual OS Process I/O RPS Graphs
Enter 62 to Display Individual OS Process I/O WPS Graphs
Enter 63 to Display Individual OS Process Percent User CPU Graphs
Enter 64 to Display Individual OS Process Percent System CPU Graphs
Enter 65 to Display Individual OS Process Percent Total CPU (User + System) Graphs
Enter 66 to Display Individual OS Process Percent Memory Graphs

Enter GP to Generate Individual Process Profile
Enter GC to Generate All CPU Gif Files
Enter GM to Generate All Memory Gif Files
Enter GD to Generate All Disk Gif Files
Enter GN to Generate All Network Gif Files

Enter L to Specify Alternate Location of Gif Directory
Enter Z to Zoom Graph Time Scale (Does not change analysis dataset)
Enter B to Returns to Baseline Graph Time Scale (Does not change analysis dataset)
Enter R to Remove Currently Displayed Graphs

Enter X to Export Parsed Data to Flat File
Enter S to Analyze Subset of Data(Changes analysis dataset including graph time scale)
Enter A to Analyze Data
Enter D to Generate DashBoard

Enter Q to Quit Program

Please Select an Option:

Hope it helps!

Elisson Almeida

Autonomous Health Framework – TFA data collection

Hi all,

Here I am or another post on the Autonomous Health Framework series!

Based on previous posts, we now have the AHF installed, what can we do with it?
I would say a lot! Everything? I don’t think so.

I will post here what I usually use more on TFA and then I will focus on orachk (which is a monster on its own as well).

Imagine that you have a problem and need to gather different information from the OS and the Oracle Stack from different nodes?

That´s where TFA – Trace file analyzer comes into play. It will collect the data for you and if you run as its root. You can specify the component that you need if you need to drill down to a very specify issue.

Also when you have a Service request with MOS they will probably ask for it to be executed but if they don’t, you should do it and upload any way as it will save a lot of time to troubleshooting and back and forth messages.

And if you have Internet access, you can upload it using the TFA as well, see it here.

[root@servertst01 bin]# ./tfactl diagcollect –h

Invalid Option for diagcollect: –h

Collect logs from across nodes in cluster

Usage : /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect [ [component_name1] [component_name2] ... [component_nameN] | [-srdc ] | [-defips]] [-sr <SR#>] [-node <all|local|n1,n2,..>] [-tag ] [-z ] [-last <m|h|d>| -from  -to  | -for ] [-nocopy] [-notrim] [-silent] [-nocores][-collectalldirs][-collectdir <dir1,dir2..>][-examples]
components:-ips|-database|-asm|-crsclient|-dbclient|-dbwlm|-tns|-rhp|-procinfo|-cvu|-afd|-crs|-cha|-chastore|-wls|-emagent|-oms|-ocm|-emplugins|-em|-acfs|-install|-cfgtools|-os|-ashhtml|-ashtext|-awrhtml|-awrtext|-qos
-srdc Service Request Data Collection (SRDC).
-defips Include in the default collection the IPS Packages for:
ASM, CRS and Databases
-sr Enter SR number to which the collection will be uploaded
-node Specify comma separated list of host names for collection
-tag  The files will be collected into tagname directory inside
repository
-z  The collection zip file will be given this name within the
TFA collection repository
-last <m|h|d> Files from last 'n' [m]inutes, 'n' [d]ays or 'n' [h]ours
-since Same as -last. Kept for backward compatibility.
-from "Mon/dd/yyyy hh:mm:ss" From 
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-to "Mon/dd/yyyy hh:mm:ss" To 
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-for "Mon/dd/yyyy" For .
or "yyyy-mm-dd"
-nocopy Does not copy back the zip files to initiating node from all nodes
-notrim Does not trim the files collected
-silent This option is used to submit the diagcollection as a background
process
-nocores Do not collect Core files when it would normally have been
collected
-collectalldirs Collect all files from a directory marked "Collect All"
flag to true
-collectdir Specify comma separated list of directories and collection will
include all files from these irrespective of type and time constraints
in addition to components specified
-examples Show diagcollect usage examples
For detailed help on each component use:
/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect [component_name1] [component_name2] ... [component_nameN] -help
[root@servertst01 bin]#

[root@servertst01 bin]# ./tfactl diagcollect -examples
Examples:
/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect
Trim and Zip all files updated in the last 12 hours as well as chmos/osw data
from across the cluster and collect at the initiating node
Note: This collection could be larger than required but is there as the
simplest way to capture diagnostics if an issue has recently occurred.

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -last 8h
Trim and Zip all files updated in the last 8 hours as well as chmos/osw data
from across the cluster and collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -database hrdb,fdb -last 1d -z foo
Trim and Zip all files from databases hrdb & fdb in the last 1 day and
collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -crs -os -node node1,node2 -last 6h
Trim and Zip all crs files, o/s logs and chmos/osw data from node1 & node2
updated in the last 6 hours and collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -asm -node node1 -from "May/07/2020" -to "May/07/2020 21:00:00"
Trim and Zip all ASM logs from node1 updated between from and to time and
collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -for "May/07/2020"
Trim and Zip all log files updated on "May/07/2020" and collect at the
collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -for "May/07/2020 21:00:00"
Trim and Zip all log files updated from 09:00 on "May/07/2020" to 09:00 on "May/08/2020"
(i.e. 12 hours before and after the time given) and collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -crs -collectdir /tmp_dir1,/tmp_dir2
Trim and Zip all crs files updated in the last 12 hours
Also collect all files from /tmp_dir1 and /tmp_dir2 at the initiating node

 

Autonomous Health Framework – Installation and Usage Basics

Hi all,

So, this week we’ll have an Autonomous Health Framework series o posts! One post per day, stay tuned!

Today I’ll review some points of the Autonomous Health Framework and how it can help us on the day to day tasks as well as on specific problem resolutions. This framework is a bundle of tools that were already available like orachk/exachk, tfa plus other features that I will try to cover focusing on practical cases.

So, first of all, you need to install it, if you have a RAC cluster most likely you already have it but it is always good to update it, as Oracle security recommendations as also part of the orachk analyze. For reference: In an ideal world you have root access or sudo, as its daemons under root provides a more wide log collection capabilities.

  1. Download the latest version from Autonomous Health Framework (AHF) – Including TFA and ORAchk/EXAChk (Doc ID 2550798.1)
  2. Transfer and unzip the file to your server and as root follow the installation as below
root@servertst01 oracrs]# ./ahf_setup -ahf_loc /oraadm/oracrs/product/19.0.0/ahf -data_dir /oraadm/oracrs/product/19.0.0/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_120962_2020_05_07-22_41_11.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf

AHF Data Directory : /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst01 | RUNNING | 135195 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------------------------+
| AHF Location | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf |
| TFA Location | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data |
| Repository | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/servertst01/diag |
'-----------------+--------------------------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_120962_2020_05_07-22_41_11.log to /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/servertst01/diag/ahf/

As you can see as the root user is not allowed to ssh, the installation was done only on the local node. Repeat the process on the remaning ones

Once you have it completed you will be able to to a tfactl status and a toolstatus and see the processes and tools available.
[root@servertst01 bin]# ./tfactl status

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst01 | RUNNING | 135195 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst02 | RUNNING | 241723 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'
[root@servertst01 bin]#
[root@servertst01 bin]# ./tfactl toolstatus

.------------------------------------------------------------------.
| TOOLS STATUS - HOST : servertst01 |
+----------------------+--------------+--------------+-------------+
| Tool Type | Tool | Version | Status |
+----------------------+--------------+--------------+-------------+
| Development Tools | orachk | 19.3.0.0.0 | DEPLOYED |
| | oratop | 14.1.2 | DEPLOYED |
+----------------------+--------------+--------------+-------------+
| Support Tools Bundle | darda | 2.10.0.R6036 | DEPLOYED |
| | oswbb | 8.3.2 | RUNNING |
| | prw | 12.1.13.11.4 | NOT RUNNING |
+----------------------+--------------+--------------+-------------+
| TFA Utilities | alertsummary | 19.3.0.0.0 | DEPLOYED |
| | calog | 19.3.0.0.0 | DEPLOYED |
| | dbcheck | 18.3.0.0.0 | DEPLOYED |
| | dbglevel | 19.3.0.0.0 | DEPLOYED |
| | grep | 19.3.0.0.0 | DEPLOYED |
| | history | 19.3.0.0.0 | DEPLOYED |
| | ls | 19.3.0.0.0 | DEPLOYED |
| | managelogs | 19.3.0.0.0 | DEPLOYED |
| | menu | 19.3.0.0.0 | DEPLOYED |
| | param | 19.3.0.0.0 | DEPLOYED |
| | ps | 19.3.0.0.0 | DEPLOYED |
| | pstack | 19.3.0.0.0 | DEPLOYED |
| | summary | 19.3.0.0.0 | DEPLOYED |
| | tail | 19.3.0.0.0 | DEPLOYED |
| | triage | 19.3.0.0.0 | DEPLOYED |
| | vi | 19.3.0.0.0 | DEPLOYED |
'----------------------+--------------+--------------+-------------'

Note :-
DEPLOYED : Installed and Available - To be configured or run interactively.
NOT RUNNING : Configured and Available - Currently turned off interactively.
RUNNING : Configured and Available.

[root@servertst01 bin]#

3. Using it you can ‘query’ all cluster nodes for information from a single point which can really speed up your information collection. Here is an example of the summary command:

[root@servertst01 bin]# ./tfactl summary -h
---------------------------------------------------------------------------------
Usage : TFACTL [run] summary -help
---------------------------------------------------------------------------------
Command : /oraptp/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl [run] summary [OPTIONS]
Following Options are supported:
        [no_components] : [Default] Complete Summary Collection
        -overview       : [Optional/Default] Complete Summary Collection - Overview
        -crs            : [Optional/Default] CRS Status Summary
        -asm            : [Optional/Default] ASM Status Summary
        -acfs           : [Optional/Default] ACFS Status Summary
        -database       : [Optional/Default] DATABASE Status Summary
        -exadata        : [Optional/Default] EXADATA Status Summary
                          Not enabled/ignored in Windows and Non-Exadata machine
        -patch          : [Optional/Default] Patch Details
        -listener       : [Optional/Default] LISTENER Status Summary
        -network        : [Optional/Default] NETWORK Status Summary
        -os             : [Optional/Default] OS Status Summary
        -tfa            : [Optional/Default] TFA Status Summary
        -summary        : [Optional/Default] Summary Tool Metadata

        -json           : [Optional] - Prepare json report
        -html           : [Optional] - Prepare html report
        -print          : [Optional] - Display [html or json] Report at Console
        -silent         : [Optional] - Interactive console by defauly
        -history <num>  : [Optional] - View Previous <numberof> Summary Collection History in Interpreter
        -node <node(s)> : [Optional] - local or Comma Separated Node Name(s)
        -help           : Usage/Help.
---------------------------------------------------------------------------------
  - Data Collection From Node - servertst02 .. Done.
[root@servertst01 bin]#

[root@servertst01 bin]# ./tfactl summary

  Executing Summary in Parallel on Following Nodes:
    Node : servertst01
    Node : servertst02

LOGFILE LOCATION : /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/summary/root/20200508124309/log/summary_command_20200508124309_servertst01_35756.log

  Component Specific Summary collection :
    - Collecting CRS details ... Done.
    - Collecting ASM details ... Done.
    - Collecting ACFS details ... Done.
    - Collecting DATABASE details ... Done.
    - Collecting PATCH details ... Done.
    - Collecting LISTENER details ... Done.
    - Collecting NETWORK details ... Done.
    - Collecting OS details ... Done.
    - Collecting TFA details ... Done.
    - Collecting SUMMARY details ... Done.

  Remote Summary Data Collection : In-Progress - Please wait ...
  - Data Collection From Node - servertst02 .. Done.

  Prepare Clusterwide Summary Overview ... Done
      cluster_status_summary

  COMPONENT   STATUS    DETAILS
+-----------+---------+---------------------------------------------------------------------------------------------------+
  CRS         PROBLEM   .-----------------------------------------------.
                        | CRS_SERVER_STATUS   : ONLINE                  |
                        | CRS_STATE           : ONLINE                  |
                        | CRS_INTEGRITY_CHECK : FAIL                    |
                        | CRS_RESOURCE_STATUS : OFFLINE Resources Found |
                        '-----------------------------------------------'
  ASM         PROBLEM   .-------------------------------------------------------.
                        | ASM_DISK_SIZE_STATUS : WARNING - Available Size < 20% |
                        | ASM_BLOCK_STATUS     : PASS                           |
                        | ASM_CHAIN_STATUS     : PASS                           |
                        | ASM_INCIDENTS        : PASS                           |
                        | ASM_PROBLEMS         : PASS                           |
                        '-------------------------------------------------------'
  ACFS        OFFLINE   .-----------------------.
                        | ACFS_STATUS : OFFLINE |
                        '-----------------------'
  DATABASE    PROBLEM   .-----------------------------------------------------------------------------------------------.
                        | ORACLE_HOME_DETAILS                                                        | ORACLE_HOME_NAME |
                        +----------------------------------------------------------------------------+------------------+
                        | .------------------------------------------------------------------------. | OraDb11g_home1   |
                        | | PROBLEMS | INCIDENTS | DB_BLOCKS | DATABASE_NAME | STATUS  | DB_CHAINS | |                  |
                        | +----------+-----------+-----------+---------------+---------+-----------+ |                  |
                        | | PROBLEM  | PROBLEM   | PASS      | FSIGNTST      | PROBLEM | PROBLEM   | |                  |
                        | '----------+-----------+-----------+---------------+---------+-----------' |                  |
                        '----------------------------------------------------------------------------+------------------'
  PATCH       OK        .----------------------------------------------.
                        | CRS_PATCH_CONSISTENCY_ACROSS_NODES      : OK |
                        | DATABASE_PATCH_CONSISTENCY_ACROSS_NODES : OK |
                        '----------------------------------------------'
  LISTENER    OK        .-----------------------.
                        | LISTNER_STATUS   : OK |
                        '-----------------------'
  NETWORK     OK        .---------------------------.
                        | CLUSTER_NETWORK_STATUS :  |
                        '---------------------------'
  OS          OK        .-----------------------.
                        | MEM_USAGE_STATUS : OK |
                        '-----------------------'
  TFA         OK        .----------------------.
                        | TFA_STATUS : RUNNING |
                        '----------------------'
  SUMMARY     OK        .-----------------------------------.
                        | SUMMARY_EXECUTION_TIME : 0H:2M:9S |
                        '-----------------------------------'
+-----------+---------+---------------------------------------------------------------------------------------------------+


        ### Entering in to SUMMARY Command-Line Interface ###

tfactl_summary>list

  Components : Select Component - select [component_number|component_name]
        1 => overview
        2 => crs_overview
        3 => asm_overview
        4 => acfs_overview
        5 => database_overview
        6 => patch_overview
        7 => listener_overview
        8 => network_overview
        9 => os_overview
        10 => tfa_overview
        11 => summary_overview

 

See you tomorrow at the next post o this series!

Elisson Almeida

 

Monitoring database storage over-committing with OEM extention metric when using autoextend datafiles

Hi all,

Working on a project, I was asked to stop using autoextend on the database file, while setting up a new environment on a client, to avoid ASM diskgroup space issues.

In case the free space was not enough to handle all datafiles growth when often people would add datafiles in autoextend without checking the complete database e storage, which could cause a bigger problem.

I really like the autoextend, as it can prevent processes from failing and even more. Instead of be wasting space as often, in multiple tablespace environment, as the tablespace grows differently and as I mention not all validations are often done when adding space. We just want to resolve the issue and move to the next one.

So how can we save space by using it on where is needed and not being at risk of over-committing the storage use?

The idea is to monitor the database possible growth and compare with the current free space in the ASM diskgroup being used by the database.

So to accomplish that, I used column MAXBYTES in dba_data_files and dba_temp_files. In the example below we can see the current size from the BYTES column and the MAXBYTES as the limit to which the file can grow to be.  If the MAXBYTES is 0 the file does not have autoextend on.

SYS@db011>select TABLESPACE_NAME,FILE_NAME,BYTES,MAXBYTES from dba_data_files where TABLESPACE_NAME like 'SYS%' or TABLESPACE_NAME='UNDOTBS1';

TABLESPACE_NAME FILE_NAME BYTES MAXBYTES
------------------------------ -------------------------------------------------------------------------- ---------- ------------
SYSTEM +ORA_DATA/db01/datafile/system.264.1032015007 2147483648 32212254720
SYSAUX +ORA_DATA/db01/datafile/sysaux.319.1032015009 2751463424 32212254720
UNDOTBS1 +ORA_DATA/db01/datafile/undotbs1.320.103201500 4294967296 0

So, to gather the max database size, I summed the greatest value between BYTES and MAXBYTES (when available) and using the v$asm_diskgroup to grab the current avaliable space from the diskgroup used in the db_create_file_dest parameter and finally subtracting the available space from the max database size:

select dg.FREE_MB - round((dbf.max+tempf.max),0) as total_free
from
(select trunc(sum(greatest(maxbytes,bytes)/1024/1024)) as max
from dba_data_files ) dbf,
(select trunc(sum(greatest(maxbytes,bytes)/1024/1024)) as max
from dba_temp_files) tempf,
(select FREE_MB
from v$asm_diskgroup where name in (select LTRIM( value, '+' ) from v$parameter where name='db_create_file_dest')) dg;

Based on this we can create several ways to avoid the storage over-committing.

The one I used was to create a metric extension:

If the value of the above query is negative, means that the max database size is greater than the available size therefore the capacity plan for that database must be reviewed.

  1. To create the metric extension, go to, on the OEM menu  – Enterprise -> Monitoring -> Metric Extension > Actions -> Create -> Metric Extension.
  2. In the General Properties page select the target type as Cluster database if it´s RAC or Database instance.
  3. In the Adapter, select “SQL”
  4. In Collection schedule, enable the metric, upload the data and set the collection frequently to your environment needs.
  5. In the Adapter page add the above query.
  6. In the Columns page, add a column for the result of the query.
    1. To fit my environment I only set the critical alert and in case the value of the metric is below 0.
  7. Continue filling the necessary information on the next pages, as well as test the new metric on a selected target, Afterwards save the new metric.
  8. Prior to deploying it to any target you need save the metric as a deployable draft, select the created metric and go the action menu and select Save as Deployable draft

Now you can deploy the new metric to the targets.

Hope this can help!

Elisson Almeida

Oracle Patching with OPlan

Everyone that I’ve worked with knows that I don’t like patching (and sometimes I try to imagine who does), but they are necessary to corrects bugs and improve the Oracle software stability.

When you have a single node server with one database, the patch planing is no brainer but when you have a RAC with multiple nodes, different Oracle homes and so on, the planning and preparations start to get more complex and it is easy to miss or overlook a step in the planning which can lead to issues during your patching.

So to help me with all that I use oplan. Oplan is a tool which comes with along OPatch and you can get its latest version in patch 6880880

More informations on oplan can be found here: Oracle Software Patching with OPLAN (Doc ID 1306814.1)

OK, so what do I used it most for?

Generating the apply patching steps, which are very in handy:

$ORACLE_HOME/OPatch/oplan/oplan generateApplySteps <bundle patch location>

And my favorite, rollback steps, which I have done more times that I would like to admit:

$ORACLE_HOME/OPatch/oplan/oplan generateRollbackSteps <bundle patch location>

Also as rollback, I do tar of the oracle binaries being patched prior as there times even the rollback did not work :-/

Both files will be created under the directory below and you will see an html and text files.

$ORACLE_HOME/cfgtoollogs/oplan/<TimeStamp>/

This process is to help you organise your steps, read it through prior executing to make sure it makes sense in your environment

Oplan has its limitations, from the Oracle note which I mentioned above:

Data Guard configurations are not supported.
OPlan can be used to create patch plans for Oracle home's running Oracle Data Guard configurations, but OPlan does not consider such an environment usable as 'Data Guard Standby-First Patch Apply' alternative. See the following for additional information on 'Data Guard Standby-First Patch Apply'

<Document 1265700.1> Oracle Patch Assurance - Data Guard Standby-First Patch Apply

Shared Oracle Home Configurations are not supported.

Single Instance Databases running in the same configuration are not supported

Even so I would still use it as it generates a plan based on your target environment adding more information that you would need to do manually if you were only to read the README files from the patching

Hope it helps.

Thanks and until next time

Elisson Almeida

Getting Oracle version – new utility on 18c oraversion

While scripting on an environment with different Oracle versions, I often needed to get the Oracle version for the target database as usually the SQL to be executed is version depended.

I used different methods to grab the database  version some you can see below:

SQL> select 
substr(
substr(
banner, 
instr(banner,' ',-(length(banner)-instr(banner,'.')),1)+1, 
instr(banner,' ',+instr(banner,'.'),1)-instr(banner,' ',-(length(banner)-instr(banner,'.')),1)-1
),
1, 
instr(
substr(
banner, 
instr(banner,' ',-(length(banner)-instr(banner,'.')),1)+1,
instr(banner,' ',+instr(banner,'.'),1)-instr(banner,' ',-(length(banner)-instr(banner,'.')),1)-1
),
'.'
)-1
) version
from v$version
where rownum = 1
; 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

VERSION
--------------------------------------------------------------------------------
18

Or like this

SQL> select substr(version,1,instr(version,'.')-1) version from product_component_version where product like 'Oracle%';

VERSION
--------------------------------------------------------------------------------
18

But on Oracle 18c there is a new utility called oraversion which you can get the same result as the above queries.

[oracle@server01 ~]$ oraversion
This program prints release version information.
These are its possible arguments:
-compositeVersion: Print the full version number: a.b.c.d.e.
-baseVersion: Print the base version number: a.0.0.0.0.
-majorVersion: Print the major version number: a.
-buildStamp: Print the date/time associated with the build.
-buildDescription: Print a description of the build.
-help: Print this message.
[oracle@server1 ~]$

[oracle@server01 ~]$ oraversion -majorVersion
18
[oracle@server01 ~]$

This could be somewhat useful but I though it was worth sharing.

Until next time.

Elisson Almeida

Quick ways to transfer files to Oracle Support – MOS directly from the Database Server

Hi all,

This will be a quick one but helped me a lot last week.

I was working on an Ora-0600 issue on a 2 node RAC cluster. Working with Oracle Support I was asked to transfer a TFA data as well ADRCI data so they could move forward with the SR, very common process right?

After I generated all the files, it was over 3gb of data to be sent over. If I was to get the file from the database server to my laptop I would need to transfer these files multiple times due to the jump servers in the middle.

This time I was luck as the database server was able to ping the site https://transport.oracle.com so I tried sending the files directly to MOS.

I will show you how I did it. This is not a mistery but can really save you time

You have 2 options which are straight forward

Using curl

curl -T FILE_YOU_WANT_TO_SEND -u MOS_USER https://transport.oracle.com/upload/issue/SR_NUMBER/

Using tfacfl

tfactl upload -sr SR_NUMBER -user MOS_USER FILE_YOU_WANT_TO_SEND

On both you need to provide your MOS credentials the SR number and the file you want o upload

After the upload is done, you will see in the attachments in the SR that the files is with the status like “Transferring to the SR” after a few minutes later you will see the update on the SR saying that the file was uploaded to the SR.

Hope this helps!

Elisson Almeida

Oracle New Visual Online Tools

Hi all,
So, after a long time of same old documentation style, Oracle started sharing some nice and visual pages for some specifics. Here are 2 examples:

1. Oracle Move to Cloud  (http://www.oracle.com/goto/move)
This one was shown to me by Ricardo Gonzalez, and it’s a GREAT tool if you are planning migrations to Oracle Cloud in general.

The part I like most is where you select the source and destination formats/versions you want and all recommended migration methods are suggested. Of course additional validation on database are required for each method, but it can potentially remind you of things you may be forgetting.

NewScreenshot 2019-05-08 às 16.56.46

Something else I like very much as well is the section with executive summaries of each migration tool. It’s basically ready for a print screen to put in the slides you are going to present to your boss… 🙂

NewScreenshot 2019-05-08 às 16.57.16.png

Also, you have handy some links to papers for the most different migration scenarios, as per:

NewScreenshot 2019-05-08 às 16.57.24

2. Oracle Database Features: (https://apex.oracle.com/database-features/

NewScreenshot 2019-05-08 às 16.52.14

And by selecting any of the features you can see more info in a very simple way, as per below. By the way, with one click you can be directed to the documentation (the old fashioned one):

NewScreenshot 2019-05-08 às 16.55.08

 

Hope you enjoy those as much as I did and hope Oracle continue providing those sort of interfaces to us!

Cheers!