DBCA “Recovery manager failed to restore datafiles”

Hi all,

If that’s the 5th Blog Post with the same title one open, don’t close, READ THIS ONE!

This one is different.

It’s actually another case about Oracle throwing generic errors for DBCA where 99% of times it’s the same error, so all blogs are different but the same in essence, and none resolve your problem. So, let’s go by parts:

The Error from Client:

DBCA_Error

 

Generic Case (if this is the first blog you open about the subject):

  • Make sure the file $ORACLE_HOME/bin/oracle has privilege set to chmod 6751 in both (ASM and DB) homes. it should look like this:
[oracle@PROD01 bin]$ cd /u01/app/oracle/product/19c/db/bin
[oracle@PROD01 bin]$
[oracle@PROD01 bin]$ ls -ltr oracle*
-rwsr-s--x. 1 oracle asmadmin 441253104 Aug 27 22:29 oracle
  • If you are not sure, set it accordingly:
cd $ORACLE_HOME/bin && chmod 6751 oracle
  • Not yet? Check the disks assigned to ASM privileges and groups:
kfod status=TRUE asm_diskstring='/dev/asm*' disk=ALL
  • Ater all this, still not working? Go for the atypical case below:

 

Atypical Case (Exception)

After some struggle and no success, I started validating everything I could. Some piece of words before the silver bullet here:

Noticed I knew you do have ASM? How come that? Well, most likely the error below the ASM happens at the point the DBCA runs a RMAN restore to create the database base files and metadata. This most likely happens at the time the write is happening on the ASM, once this is the most unstable process involved.

By looking deeper on the installation logs I could see:

[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: restoring datafile 00001 to +DATA
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: reading from backup piece /ora01/app/oracle/product/19c/db/assistants/dbca/templates/Seed_Database.dfb
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=channel ORA_DISK_1: ORA-19870: error while restoring backup piece /ora01/app/oracle/product/19c/db/assistants/dbca/templates/Seed_Database.dfb
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-19504: failed to create file "+DATA"
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-17502: ksfdcre:4 Failed to create file +DATA
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-15001: diskgroup "DATA" does not exist or is not mounted
[Thread-527] [ 2020-08-27 23:50:04.942 PDT ] [RMANUtil$RMANUtilErrorListener.handleError:1386] ERROR=ORA-01017: invalid username/password; logon denied

Bingo, so it’s a password issue?

Well, I’m creating the database and this actually matches with all the chmod 6751 thing…

What then?

Well, after a while going crazy validating passwd files and so one, realized something about the oracle user:

[oracle@PROD01 bin]$ id -a
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),10(wheel),203(dba),503(asmadmin),504(asmoper),525(madhoc) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[oracle@PROD01 bin]$ grep oracle /etc/group
wheel:x:10:oracle
asmadmin:x:503:oracle
asmoper:x:504:oracle
madhoc:x:525:oracle

Can you see the oracle user is part of the oinstall group but not really appearing on /etc/group? Also not in osusergroup dba.

Well, let’s force it?

[oracle@PROD01 bin]$ sudo su -
Last login: Fri Aug 28 14:13:21 PDT 2020 on pts/3
[root@DMSDB1PA ~]# usermod -g oinstall -G oinstall,dba,asmadmin,asmoper,madhoc oracle
[root@PROD01 ~]# id oracle
uid=500(oracle) gid=501(oinstall) groups=501(oinstall),10(wheel),203(dba),503(asmadmin),504(asmoper),525(madhoc)
[root@PROD01 ~]# grep oracle /etc/group
wheel:x:10:oracle
dba:x:203:oracle
asmadmin:x:503:oracle
asmoper:x:504:oracle
oinstall:x:501:oracle
madhoc:x:525:oracle
[root@PROD01 ~]#

Well done!

Now try running DBCA again. That was a very tricky issue to find.

Know something else? At the point I was writing this I decided to have another look and ended up finding this MOS note: “ORA-17502 /ORA-01017: invalid username/password; logon denied ” While Creating 19c Database (Doc ID 2545858.1). We have a bug for it: BUG:29821687 – ORA-17502 /ORA-01017: INVALID USERNAME/PASSWORD; LOGON DENIED ” WHILE CREATING 19C DATABASE

You have the workaround already though. Go champs!

Hope it helps you, cheers!

OPatch Auto Failing to Apply Datapatch for a Removed Database

Hi all,

So, just a few days ago, during a client support, crossed the following case.

A few database creations initially failed with DBCA due to other issues, but it seems the DBCA didn’t cleread all creation steps after the failure and theoretical rollback.

As a consequence, whenever running DBCA with GUI client was seeing an old database. When trying to remove it with DBCA, the removal fail as the database can’t be brought up (creation has failed, remember?). Ok, we can leave with it, right?

Yes until we reached the point where Opatchauto failed with the following:

Verifying SQL patch applicability on home /u01/app/oracle/product/19c/db

"/bin/sh -c 'cd /u01/app/oracle/product/19c/db; ORACLE_HOME=/u01/app/oracle/product/19c/db ORACLE_SID=DB1 /u01/app/oracle/product/19c/db/OPatch/datapatch -prereq -verbose'" command failed with errors. Please refer to logs for more details. SQL changes, if any, can be analyzed by manually retrying the same command.

The reason? See the complete log about the failing step:

Executing command as oracle: 
 /bin/sh -c 'cd /u01/app/oracle/product/19c/db;ORACLE_HOME=/u01/app/oracle/product/19c/db ORACLE_SID=DB1 /u01/app/oracle/product/19c/db/OPatch/datapatch -verbose'
2020-09-02 16:26:56,362 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - COMMAND Looks like this: /bin/sh -c 'cd /u01/app/oracle/product/19c/db;ORACLE_HOME=/u01/app/oracle/product/19c/db ORACLE_SID=DB1 /u01/app/oracle/product/19c/db/OPatch/datapatch -verbose'
2020-09-02 16:26:57,662 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.GISystemCall - Is retry required=false
2020-09-02 16:26:57,662 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - status: 1
2020-09-02 16:26:57,662 INFO  [438] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
COMMAND EXECUTION FAILURE :
SQL Patching tool version 19.8.0.0.0 Production on Wed Sep  2 16:26:57 2020
Copyright (c) 2012, 2020, Oracle.  All rights reserved.

Log file for this invocation: /u01/app/oracle/base/cfgtoollogs/sqlpatch/sqlpatch_25218_2020_09_02_16_26_57/sqlpatch_invocation.log

Connecting to database...
Error: prereq checks failed!
Database connect failed with: ORA-01034: ORACLE not available
ORA-27101: shared memory realm does not exist
Linux-x86_64 Error: 2: No such file or directory
Additional information: 4376
Additional information: 1275019259 (DBD ERROR: OCISessionBegin)

Please refer to MOS Note 1609718.1 and/or the invocation log
/u01/app/oracle/base/cfgtoollogs/sqlpatch/sqlpatch_25218_2020_09_02_16_26_57/sqlpatch_invocation.log
for information on how to resolve the above errors.

SQL Patching tool complete on Wed Sep  2 16:26:57 2020

Clearly, the database is still in place.
As per MOS (ORA-01078 Can Not Delete Database From Dbca on Linux (Doc ID 1559634.1)) -> See the /etc/oratab!

Thing is, the doesn’t have the DB1 line. Also, all related files, logs directory structure, passwd, init, etc… all wiped out. What else?

Here is goes what seems Oracle forgot to tell:

[oracle@PRODB01 dbca]$ srvctl status database -d DB1
Database is not running.
[oracle@DMSDB1PA dbca]$ srvctl config database -d DB1
Database unique name: DB1
Database name: DB1
Oracle home: /u01/app/oracle/product/19c/db
Oracle user: oracle
Spfile: 
Password file: 
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Disk Groups: DATA
Services: 
OSDBA group: oinstall
OSOPER group: 
Database instance: DB1
Ohhh, that took me a while to realize, this was a Standalone server. Once understood, the fix is straight forward:
[oracle@PRODB01 dbca]$ srvctl remove database -d DB1
Remove the database dms? (y/[n]) y
[oracle@PRODB01 dbca]$

I hope this can buy you some minutes of MOS, in case you are googling it first.
Or save you in case you gave up MOS already.

Cheers!

I bet you heard about ORACLE_PDB_SID, right? But do you know how it works?

This one is more on a curious side =) but somehow interesting.

Not too long ago Mike Dietrichde published a post on to connect directly to a PDB without a password.

After it, lots of DBAs started to post their testing showing the same, But how it works? As there is only a bequeath connection available to a CDB?

Well, I was doing some digging for another project and stumbled on a curious thing. A logon trigger called DBMS_SET_PDB

To my surprise, this trigger alters the session´s container to the one specified by the ORACLE_PDB_SID variable. Similar to Tanel Poder´s script called cc.sql or ccr.sql which basically are short cuts to avoid a lot of typing =)

SYS@CDB19c>select
DBMS_Metadata.Get_DDL('TRIGGER',t.trigger_name,t.owner)
From
DBA_Triggers t
Where
TRIGGER_NAME ='DBMS_SET_PDB'; 2 3 4 5 6

DBMS_METADATA.GET_DDL('TRIGGER',T.TRIGGER_NAME,T.OWNER)
--------------------------------------------------------------------------------

CREATE OR REPLACE NONEDITIONABLE TRIGGER "SYS"."DBMS_SET_PDB" after logon on database
WHEN (user = 'SYS' or user = 'SYSTEM') declare
pdb_name varchar2(64);
begin
DBMS_SYSTEM.get_env ('ORACLE_PDB_SID', pdb_name);
if(pdb_name is not null)
then
EXECUTE IMMEDIATE 'alter session set container = ' || '"' || pdb_name
|| '"';
end if;
exception
when others then
NULL;
end dbms_set_pdb;
ALTER TRIGGER "SYS"."DBMS_SET_PDB" ENABLE

Also is mentioned in the article that this is this would only work on versions 18.8 and above. But well, as we have the code for the trigger, should it possible to work on 12c as well?

Let´s try it| =) the oldest container which I have at the moment is 12r2  but that should do it.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SYS@CDB12CR2>select * from v$version;

BANNER CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production 0
PL/SQL Release 12.2.0.1.0 - Production 0
CORE 12.2.0.1.0 Production 0
TNS for Linux: Version 12.2.0.1.0 - Production 0
NLSRTL Version 12.2.0.1.0 - Production 0

SYS@CDB12CR2>CREATE OR REPLACE NONEDITIONABLE TRIGGER "SYS"."DBMS_SET_PDB_EFA" after logon on database
2 WHEN (user = 'SYS' or user = 'SYSTEM') declare
3 pdb_name varchar2(64);
begin
4 5 DBMS_SYSTEM.get_env ('ORACLE_PDB_SID', pdb_name);
6 if(pdb_name is not null)
7 then
8 EXECUTE IMMEDIATE 'alter session set container = ' || '"' || pdb_name
9 || '"';
10 end if;
exception
when others then
NULL;
end DBMS_SET_PDB_EFA; 11 12 13 14
15 /

Trigger created.

SYS@CDB12CR2>ALTER TRIGGER "SYS"."DBMS_SET_PDB_EFA" ENABLE
2 /

Trigger altered.

SYS@CDB12CR2>show pdbs

CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PDB12CR2 READ WRITE NO
SYS@CDB12CR2>exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
[oracle@server01 ~]$ export ORACLE_PDB_SID=PDB12CR2;
[oracle@server01 ~]$ CDB12CR2
The Oracle base remains unchanged with value /oraadm/oracle
LD_LIBRARY_PATH=/oraadm/oracle/product/12.2.0.1/lib
ORACLE_BASE=/oraadm/oracle
ORACLE_HOME=/oraadm/oracle/product/12.2.0.1
ORACLE_PDB_SID=PDB12CR2
ORACLE_SID=CDB12CR2
ORAENV_ASK=NO

[oracle@server01 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Thu Jul 30 11:30:34 2020

Copyright (c) 1982, 2016, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SYS@CDB12CR2>show pdbs

CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
3 PDB12CR2 READ WRITE NO
SYS@CDB12CR2>alter session set container=cdb$root;

Session altered.

SYS@CDB12CR2>show pdbs

CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PDB12CR2 READ WRITE NO
SYS@CDB12CR2>

And as you can see, the trigger works as expected right, it’s not a rocket science as it gets the context environment variable value and alters the session´s container.

Thanks and until next time.

Cheers.

Elisson Almeida

 

Note:

Right after I wrote this post, I went to look for more information on ORACLE_PDB_SID and found another post from Mike Dietrichde which also shows the trigger code and some pitfalls that you must be aware.

Autonomous Health Framework – TFA events, changes and set

Continuing on this series, there are 2 commands which are very useful and we’ll see today: “events” and “changes”. Besides “set”, which we’ll cover too.

On the events command you will be able to drill down to a specific date and time, passing a specific component if needed as well you can check a database or instance as well. Using the last option you can, this is very useful when you want to see a history for the system and lists all important events in the system:

Usage : /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl events [-search <keyword> | -component <ASM|CRS> | -database <db_name> | -instance <db_instance_name> | -source <filename> | -from <time> -to <time> | -json | -fields all|<fields_list> ]
-component [ASM|CRS] Searches all ASM or CRS events
-database Searches all events from a particular database
-instance Searches all events from a particular instance
-source Searches all events from a particular alert file
-json Outputs event information in JSON format
-fields When provided with the -json option, the command will only return the requested fields
-from "Mon/dd/yyyy hh:mm:ss" From <time>
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-to "Mon/dd/yyyy hh:mm:ss" To <time>
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-for "Mon/dd/yyyy" For <date>
or "yyyy-mm-dd"
-last <n><h|d>

[root@servertst01 bin]# ./tfactl events -last 30d -database dbaseprd


Output from host : servertst02
------------------------------

Event Summary:
INFO :7
ERROR :3
WARNING :0

Event Timeline:
[Apr/24/2020 01:43:06.000]: [db.dbaseprd.dbaseprd2]: Reconfiguration started (old inc 21, new inc 23)
[Apr/24/2020 02:09:47.000]: [db.dbaseprd.dbaseprd2]: Reconfiguration started (old inc 23, new inc 25)
[Apr/24/2020 02:12:36.000]: [db.dbaseprd.dbaseprd2]: Instance terminated by USER, pid = 246331
[Apr/24/2020 02:12:36.000]: [db.dbaseprd.dbaseprd2]: Shutting down instance (abort)
[Apr/24/2020 02:28:29.000]: [db.dbaseprd.dbaseprd2]: Starting ORACLE instance (normal)
[Apr/24/2020 02:28:37.000]: [db.dbaseprd.dbaseprd2]: Reconfiguration started (old inc 0, new inc 29)
[Apr/24/2020 02:31:06.000]: [db.dbaseprd.dbaseprd2]: Reconfiguration started (old inc 29, new inc 31)
[Apr/24/2020 03:14:38.000]: [db.dbaseprd.dbaseprd2]: Reconfiguration started (old inc 31, new inc 33)
[Apr/25/2020 11:46:12.000]: [db.dbaseprd.dbaseprd2]: ORA-00060: deadlock detected while waiting for resource
[May/07/2020 04:18:24.000]: [db.dbaseprd.dbaseprd2]: ORA-08103: object no longer exists

Output from host : servertst03
------------------------------

Event Summary:
INFO :7
ERROR :5
WARNING :0

Event Timeline:
[Apr/16/2020 04:12:25.000]: [db.dbaseprd.dbaseprd3]: ORA-08103: object no longer exists
[Apr/23/2020 04:12:17.000]: [db.dbaseprd.dbaseprd3]: ORA-08103: object no longer exists
[Apr/24/2020 01:43:06.000]: [db.dbaseprd.dbaseprd3]: Reconfiguration started (old inc 21, new inc 23)
[Apr/24/2020 02:09:47.000]: [db.dbaseprd.dbaseprd3]: Reconfiguration started (old inc 23, new inc 25)
[Apr/24/2020 02:12:37.000]: [db.dbaseprd.dbaseprd3]: Reconfiguration started (old inc 25, new inc 27)
[Apr/24/2020 02:28:38.000]: [db.dbaseprd.dbaseprd3]: Reconfiguration started (old inc 27, new inc 29)
[Apr/24/2020 02:31:05.000]: [db.dbaseprd.dbaseprd3]: Instance terminated by USER, pid = 6814
[Apr/24/2020 02:31:05.000]: [db.dbaseprd.dbaseprd3]: Shutting down instance (abort)
[Apr/24/2020 03:14:28.000]: [db.dbaseprd.dbaseprd3]: Starting ORACLE instance (normal)
[Apr/24/2020 03:14:37.000]: [db.dbaseprd.dbaseprd3]: Reconfiguration started (old inc 0, new inc 33)
[Apr/28/2020 10:12:44.000]: [db.dbaseprd.dbaseprd3]: ORA-00060: deadlock detected while waiting for resource
[May/10/2020 13:40:21.000]: [db.dbaseprd.dbaseprd3]: ORA-00060: deadlock detected while waiting for resource

Output from host : servertst01
------------------------------

Event Summary:
INFO :8
ERROR :2
WARNING :0

Event Timeline:
[Apr/24/2020 01:43:05.000]: [db.dbaseprd.dbaseprd1]: Instance terminated by USER, pid = 37711
[Apr/24/2020 01:43:05.000]: [db.dbaseprd.dbaseprd1]: Shutting down instance (abort)
[Apr/24/2020 02:09:38.000]: [db.dbaseprd.dbaseprd1]: Starting ORACLE instance (normal)
[Apr/24/2020 02:09:46.000]: [db.dbaseprd.dbaseprd1]: Reconfiguration started (old inc 0, new inc 25)
[Apr/24/2020 02:12:37.000]: [db.dbaseprd.dbaseprd1]: Reconfiguration started (old inc 25, new inc 27)
[Apr/24/2020 02:28:39.000]: [db.dbaseprd.dbaseprd1]: Reconfiguration started (old inc 27, new inc 29)
[Apr/24/2020 02:31:06.000]: [db.dbaseprd.dbaseprd1]: Reconfiguration started (old inc 29, new inc 31)
[Apr/24/2020 03:14:38.000]: [db.dbaseprd.dbaseprd1]: Reconfiguration started (old inc 31, new inc 33)
[Apr/24/2020 14:40:10.000]: [db.dbaseprd.dbaseprd1]: Incident details in: /oraadm/dbaseprd/admin/traces/diag/rdbms/dbaseprd/dbaseprd1/incident/incdir_115657/dbaseprd1_ora_151392_i115657.trc
[Apr/24/2020 14:40:10.000]: [db.dbaseprd.dbaseprd1]: ORA-00600: internal error code, arguments: [pevmexe.c: ENTER: which perc?], [], [], [], [], [], [], [], [], [], [], []

[root@servertst01 bin]#

The “changes” command, this shows the changes in the system, as I did not have any changes made after the TFA was installed was not able to reproduce it here. It lists all changes in the system:

Usage : /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl changes [ -from <time> -to <time> | -for <time> | last <time_length> ]
-from "Mon/dd/yyyy hh:mm:ss" From <time>
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-to "Mon/dd/yyyy hh:mm:ss" To <time>
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-for "Mon/dd/yyyy" For <date>
or "yyyy-mm-dd"
-last <n><h|d>

Another way to see the changes in your environment is to use the search command passing a part to it and a specific component.

[root@servertst01 bin]# ./tfactl analyze -search "ALTER" -comp db
INFO: analyzing db (DB Alert Logs) logs for the last 60 minutes... Please wait...
INFO: analyzing host: servertst01

Report title: DB Alert Logs
Report date range: last ~1 hour(s)
Report (default) time zone: WET - Western European Time
Analysis started at: 11-May-2020 03:04:17 PM WEST
Elapsed analysis time: 2 second(s).
Configuration file: /oraadm/dba/ahf/oracle.ahf/tfa/ext/tnt/conf/tnt.prop
Configuration group: db
Parameter: ALTER
Total message count: 44,397, from 10-Feb-2020 01:19:51 PM WET to 11-May-2020 03:01:46 PM WEST
Messages matching last ~1 hour(s): 18, from 11-May-2020 02:17:23 PM WEST to 11-May-2020 03:01:46 PM WEST
Matching regex: ALTER
Case sensitive: false
Match count: 1
May 11 15:01:46 2020 - generic - ALTER SYSTEM SET open_cursors=6000 SCOPE=BOTH SID='*';

INFO: analyzing db (DB Alert Logs) logs for the last 60 minutes... Please wait...
INFO: analyzing host: servertst02

Report title: DB Alert Logs
Report date range: last ~1 hour(s)
Report (default) time zone: WET - Western European Time
Analysis started at: 11-May-2020 03:04:23 PM WEST
Elapsed analysis time: 0 second(s).
Configuration file: /oraadm/dba/ahf/oracle.ahf/tfa/ext/tnt/conf/tnt.prop
Configuration group: db
Parameter: ALTER
Total message count: 3,598, from 10-Feb-2020 03:03:42 PM WET to 11-May-2020 02:30:16 PM WEST
Messages matching last ~1 hour(s): 5, from 11-May-2020 02:30:05 PM WEST to 11-May-2020 02:30:16 PM WEST
Matching regex: ALTER
Case sensitive: false
Match count: 2
May 11 14:30:05 2020 - generic - ALTER SYSTEM ARCHIVE LOG
May 11 14:30:16 2020 - generic - ALTER SYSTEM ARCHIVE LOG
Thread 2 advanced to log sequence 1096 (LGWR switch)
Current log# 4 seq# 1096 mem# 0: +REDO1/testdba/onlinelog/group_4.274.1032015207
Current log# 4 seq# 1096 mem# 1: +REDO2/testdba/onlinelog/group_4.274.1032015207

INFO: analyzing db (DB Alert Logs) logs for the last 60 minutes... Please wait...
INFO: analyzing host: servertst03

Report title: DB Alert Logs
Report date range: last ~1 hour(s)
Report (default) time zone: WET - Western European Time
Analysis started at: 11-May-2020 03:04:25 PM WEST
Elapsed analysis time: 0 second(s).
Configuration file: /oraadm/dba/ahf/oracle.ahf/tfa/ext/tnt/conf/tnt.prop
Configuration group: db
Parameter: ALTER
Total message count: 1,687, from 19-Mar-2020 06:28:29 PM WET to 11-May-2020 02:30:18 PM WEST
Messages matching last ~1 hour(s): 4, from 11-May-2020 02:30:06 PM WEST to 11-May-2020 02:30:18 PM WEST
Matching regex: ALTER
Case sensitive: false
Match count: 0

And for last the set command which will configure several options for TFA:

[root@servertst01 bin]# ./tfactl set -h

Usage : tfact set ahfnotificationaddress="<space separated email ids>"
ahfnotificationaddress="<email1> <email2>" set AHF Notification Address

Turn ON/OFF or Modify various TFA features

Usage : /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set [ autodiagcollect=<ON|OFF> | trimfiles=<ON|OFF> | tracelevel=<COLLECT|SCAN|INVENTORY|OTHER|ISA|HANDLER|MAIN|CLIENT|CONSOLE>:<FATAL|ERROR|WARN|INFO|DEBUG|TRACE> | reposizeMB=<n> [repositorydir=<dir>] [-force] | repositorydir=<dir> [reposizeMB=<n>] [-force] | logsize=<n> [-local] | logcount=<n> [-local] | maxcorefilesize=<n> [-local] | maxcorecollectionsize=<n> [-local] | maxfilecollectionsize=<n> | autopurge=<ON|OFF> | publicip=<ON|OFF> | minSpaceForRTScan=<n> | rtscan=<ON|OFF> | diskUsageMon=<ON|OFF> | diskUsageMonInterval=<n> | manageLogsAutoPurge=<ON|OFF> | manageLogsAutoPurgeInterval=<n> | manageLogsAutoPurgePolicyAge=<d|h> | minagetopurge=<n> | tfaIpsPoolSize=<n> | tfaDbUtlPurgeAge=<n> | tfaDbUtlPurgeMode=<simple|resource|profile> | tfaDbUtlPurgeThreadDelay=<n> | tfaDbUtlCrsProfileDelay=<n> | indexRecoveryMode=<recreate|restore>] [-c]

autodiagcollect allow for automatic diagnostic collection when an event
is observed (default ON)
trimfiles allow trimming of files during diagcollection (default ON)
tracelevel control the trace level of log files in /oraadm/dba/ahf/oracle.ahf/data/servertst01/diag/tfa
(default INFO for all facilities)
reposizeMB=<n> set the maximum size of diagcollection repository to <n>MB
repositorydir=<dir> set the diagcollection repository to <dir>
-force skip inital checks while changing repository (Not Recommended)
logsize=<n> set the maximum size of each TFA log to <n>MB (default 50 MB)
logcount=<n> set the maximum number of TFA logs to <n> (default 10)
port=<n> set TFA Port to <n>
maxcorefilesize=<n> set the maximum size of Core File to <n>MB (default 20 MB )
maxcorecollectionsize=<n> set the maximum collection size of Core Files to <n>MB (default 200 MB )
maxfilecollectionsize=<n> set the maximum file collection size to <n>MB (default 5 GB )
autopurge allow automatic purging of collections when less space
is observed in repository (default OFF)
publicip allow TFA to run on public network
smtp Update SMTP Configuration
minSpaceForRTScan=<n> Minimun space required to run RT Scanning(default 500)
rtscan allow Alert Log Scanning
diskUsageMon allow Disk Usage Monitoring
diskUsageMonInterval=<n> Time interval between consecutive Disk Usage Snapshot(default 60 minutes)
manageLogsAutoPurge allow Manage Log Auto Purging
manageLogsAutoPurgeInterval=<n> Time interval between consecutive Managelogs Auto Purge(default 60 minutes)
manageLogsAutoPurgePolicyAge=<d|h> Logs older than the time period will be auto purged(default 30 days)
minagetopurge set the age in hours for collections to be skipped by
AutoPurge (default 12 Hours)
tfaIpsPoolSize set the TFA IPS pool size
tfaDbUtlPurgeAge set the TFA ISA Purge Age (in seconds)
tfaDbUtlPurgeMode set the TFA ISA Purge Mode (simple/resource)
tfaDbUtlPurgeThreadDelay set the TFA ISA Purge Thread Delay (in minutes)
tfaDbUtlCrsProfileDelay set the TFA ISA Crs Profile Delay
indexRecoveryMode set the Lucene index recovery mode (recreate/restore)
rediscoveryInterval set the time interval for running lite rediscovery
-c set the value on all nodes (Does not apply to repository
settings)
-local set the value on the local node (if option is not included
the value will be set on all the nodes)
Examples:
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set autodiagcollect=ON
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set autopurge=ON
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set tracelevel=INVENTORY:3
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set reposizeMB=20480
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set logsize=100
/oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl set port=5000

Hope it helps, there are still many scenarios of TFA which I would like to bring but not that much time, I will try to make long posts for the next ones. We´ll see.

Thanks for reading!

Elisson Almeida

Autonomous Health Framework – Managing logs and REST API

Hi all,

We are used to use adrci to manage our database logs right, if you don’t about adrci please check here and here.

But did you know that TFA can cleanup the CRS and database logs as well?

You might be thinking: 2 set of tools for the same job? Yyeah kind off… =)

Adrci is configurable and performs the purges per configured policy and tfactl managelogs will not. Also the tfactl managelogs will also clean the listener logs as well and you can clean all cluster logs from one node without having to connect to each server to execute the purge from adrci for example.

The cleanup can be done by database or GI which comes in handy as well.

You have a dry run option which will show what will be deleted and total freed space without actually deleting any files

Usage : /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl [run] managelogs [ -purge [[-older <n><m|h|d>] | [-gi] | [-database <all|d1,d2..>] | [-dryrun] ]] [ -show [usage|variation] [ [-older <n><d>] | [-gi] | [-database <all|d1,d2..>] ] ] [-node <all|local|node1,node2..>]

Options:
-purge           Purge logs
   -older        Timeperiod for log purge
   -gi           Purge Grid Infrastructure logs(all ADR homes under GIBASE/diag and crsdata(cvu dirs))
   -database     Purge Database logs (Default all else provide list)
   -dryrun       Estimate logs which will be cleared by purge command
-show            Print usage/variation details
   -older         Timeperiod for change in log volume
   -gi           Space utilization under GIBASE
   -database     Space utilization for Database logs (Default all else provide list)


 -older <n><m|h|d>  Files from past 'n' [d]ays or 'n' [h]ours or 'n' [m]inutes


e.g:
   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl managelogs -purge -older 30d -dryrun
   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl managelogs -purge -older 30d
   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl managelogs -show usage

   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl run managelogs -purge -older 30d -dryrun
   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl run managelogs -purge -older 30d
   /oraadm/dba/ahf/oracle.ahf/tfa/bin/tfactl run managelogs -show usage

 

TFA also comes with a REST API which you can use POST and GET http which you need to setup prior usage, it will setup 2 users, tfaadmin and tfarest, and start a ORDS service on the server.

[root@servertst01 bin]# ./tfactl rest status

TFA REST Services is not running

[root@servertst01 bin]# ./tfactl rest start

TFA REST Services is not running

[root@servertst01 bin]# ./tfactl rest -start

Configuring TFA REST Services using ORDS :

This might take couple of minutes. Please be patient.

Adding Dependency Jars to ORDS

Adding users to ORDS :

Password must be at least 8 characters and contain a combination of Upper case and Lower case letters, numbers and one of [!@#%_]

Enter a password for user tfaadmin:
Confirm password for user tfaadmin:

Enter a password for user tfarest:
Confirm password for user tfarest:

Starting TFA REST Services

Successfully started TFA REST Services [PID : 158682]

TFA URL : https://servertest01:9090/ords/tfactl/print/status

COMPLIANCE URL : https://servertest01:9090/ords/tfactl/orachk/version

 

The commands are most like the ones from the CLI but using this REST API, you will receive JSON outputs which you can process and automate the use of TFA

Below are some examples

/tfactl/print/status

[root@servertst01 ~]$ curl -k --user tfaadmin:tfaadmin https://servertst01:9090/ords/tfactl/print/status
[ {
"status" : "CheckOK",
"hostname" : "servertst01",
"pid" : 135195,
"port" : 5000,
"version" : "20.1.3.0.0",
"buildId" : "20130020200429095054",
"inventoryStatus" : "COMPLETE"
}, {
"status" : "CheckOK",
"hostname" : "servertst02",
"pid" : 241723,
"port" : 5000,
"version" : "20.1.3.0.0",
"buildId" : "20130020200429095054",
"inventoryStatus" : "COMPLETE"
} ]

/print/actions

[root@servertst01 ~]$ curl -k --user tfaadmin:tfaadmin https://servertst01:9090/ords/tfactl/print/actions
[ {
"actionName" : "Run inventory",
"hostname" : "servertst01",
"client" : "DiagCollectionThread",
"startTime" : "May 13 15:48:52 WEST",
"endTime" : "May 13 15:49:58 WEST",
"status" : "COMPLETE",
"comments" : "<components><component><name>EMAGENT</name></component><component><name>DBCLIENT</name></component><component><name>ACFS</name></component><component><name>RDBMS</name></component><component><name>OS</name></component><component><name>CRS</name></component><component><name>EM</name></component><component><name>ASMPROXY</name></component><component><name>ASMIO</name></component><component><name>CRSCLIENT</name></component><component><name>OMS</name></component><component><name>DBWLM</name></component><component><name>INSTALL</name></component><component><name>TNS</name></component><component><name>OCM</name></component><component><name>ASM</name></component><component><name>RHP</name></component><component><name>CHA</name></component><component><name>AFD</name></component><component><name>EMPLUGINS</name></component><component><name>CFGTOOLS</name></component><component><name>WLS</name></component></components>"
}, {
"actionName" : "Collect traces",
"hostname" : "servertst01",
"client" : "tfactl",
"startTime" : "May 13 15:48:51 WEST",
"endTime" : "May 13 15:53:21 WEST",
"status" : "COMPLETE",
"comments" : "-database all -rdbms -asm -crsclient -dbclient -dbwlm -tns -rhp -afd -crs -wls -emagent -oms -ocm -emplugins -em -acfs -install -cfgtools -os -sundiag -chmos -monitor -copy -since 12h -node all -z TFA_DEF_ZIP_20200513154851 -user root -tag TFA_DEF_TAG_20200513154851 -logid 20200513154851:20200513154851servertst01"
}, {
"actionName" : "Run inventory",
"hostname" : "servertst01",
"client" : "DiagCollectionThread",
"startTime" : "May 13 15:47:48 WEST",
"endTime" : "May 13 15:48:20 WEST",
"status" : "COMPLETE",
"comments" : "<components><component><name>EMAGENT</name></component><component><name>DBCLIENT</name></component><component><name>ACFS</name></component><component><name>RDBMS</name></component><component><name>OS</name></component><component><name>CRS</name></component><component><name>EM</name></component><component><name>ASMPROXY</name></component><component><name>ASMIO</name></component><component><name>CRSCLIENT</name></component><component><name>OMS</name></component><component><name>DBWLM</name></component><component><name>INSTALL</name></component><component><name>TNS</name></component><component><name>OCM</name></component><component><name>ASM</name></component><component><name>RHP</name></component><component><name>CHA</name></component><component><name>AFD</name></component><component><name>EMPLUGINS</name></component><component><name>CFGTOOLS</name></component><component><name>WLS</name></component></components>"
}, {
"actionName" : "Collect traces",
"hostname" : "servertst01",
"client" : "tfactl",
"startTime" : "May 13 15:47:48 WEST",
"endTime" : "May 13 15:52:29 WEST",
"status" : "COMPLETE",
"comments" : "-database all -rdbms -asm -crsclient -dbclient -dbwlm -tns -rhp -afd -crs -wls -emagent -oms -ocm -emplugins -em -acfs -install -cfgtools -os -sundiag -chmos -monitor -copy -since 12h -node all -z TFA_DEF_ZIP_20200513154748 -user root -tag TFA_DEF_TAG_20200513154748 -logid 20200513154748:20200513154748servertst01"
}, {
"actionName" : "Run inventory",
"hostname" : "servertst02",
"client" : "DiagCollectionThread",
"startTime" : "May 13 15:48:54 WEST",
"endTime" : "May 13 15:49:04 WEST",
"status" : "COMPLETE",
"comments" : "<components><component><name>EMAGENT</name></component><component><name>DBCLIENT</name></component><component><name>ACFS</name></component><component><name>RDBMS</name></component><component><name>OS</name></component><component><name>CRS</name></component><component><name>EM</name></component><component><name>ASMPROXY</name></component><component><name>ASMIO</name></component><component><name>CRSCLIENT</name></component><component><name>OMS</name></component><component><name>DBWLM</name></component><component><name>INSTALL</name></component><component><name>OCM</name></component><component><name>TNS</name></component><component><name>ASM</name></component><component><name>RHP</name></component><component><name>CHA</name></component><component><name>AFD</name></component><component><name>EMPLUGINS</name></component><component><name>CFGTOOLS</name></component><component><name>WLS</name></component></components>"
}, {
"actionName" : "Collect traces",
"hostname" : "servertst02",
"client" : "tfactl",
"startTime" : "May 13 15:48:53 WEST",
"endTime" : "May 13 15:53:27 WEST",
"status" : "COMPLETE",
"comments" : "-timezone Europe/Lisbon -nodelist servertst02_servertst01 -s 05.13.2020.03.48 -e 05.13.2020.15.48 -updateendtime -z TFA_DEF_ZIP_20200513154851.zip -tag TFA_DEF_TAG_20200513154851 -logid 20200513154851 -emagent -dbclient -acfs -rdbms -os -crs -em -asmproxy -asmio -crsclient -oms -dbwlm -install -tns -ocm -asm -rhp -cha -afd -emplugins -cfgtools -wls -chmos -copy -user root -uid 9a79d417-a877-4355-a2f2-210f1d45024a:20200513154851servertst01"
}, {
"actionName" : "Run inventory",
"hostname" : "servertst02",
"client" : "DiagCollectionThread",
"startTime" : "May 13 15:47:50 WEST",
"endTime" : "May 13 15:48:12 WEST",
"status" : "COMPLETE",
"comments" : "<components><component><name>EMAGENT</name></component><component><name>DBCLIENT</name></component><component><name>ACFS</name></component><component><name>RDBMS</name></component><component><name>OS</name></component><component><name>CRS</name></component><component><name>EM</name></component><component><name>ASMPROXY</name></component><component><name>ASMIO</name></component><component><name>CRSCLIENT</name></component><component><name>OMS</name></component><component><name>DBWLM</name></component><component><name>INSTALL</name></component><component><name>OCM</name></component><component><name>TNS</name></component><component><name>ASM</name></component><component><name>RHP</name></component><component><name>CHA</name></component><component><name>AFD</name></component><component><name>EMPLUGINS</name></component><component><name>CFGTOOLS</name></component><component><name>WLS</name></component></components>"
}, {
"actionName" : "Collect traces",
"hostname" : "servertst02",
"client" : "tfactl",
"startTime" : "May 13 15:47:49 WEST",
"endTime" : "May 13 15:52:37 WEST",
"status" : "COMPLETE",
"comments" : "-timezone Europe/Lisbon -nodelist servertst02_servertst01 -s 05.13.2020.03.47 -e 05.13.2020.15.47 -updateendtime -z TFA_DEF_ZIP_20200513154748.zip -tag TFA_DEF_TAG_20200513154748 -logid 20200513154748 -emagent -dbclient -acfs -rdbms -os -crs -em -asmproxy -asmio -crsclient -oms -dbwlm -install -tns -ocm -asm -rhp -cha -afd -emplugins -cfgtools -wls -chmos -copy -user root -uid 94229462-6332-4a7e-849b-9469c1ad86ef:20200513154748servertst01"
} ]

/tfactl/diagcollect

[root@servertst01 ~]$ curl -k --user tfaadmin:tfaadmin https://servertst01:9090/ords/tfactl/diagcollect
{
"collectionId" : "20200513161625servertst01",
"zipName" : "TFA_DEF_ZIP_20200513161625",
"tagName" : "TFA_DEF_TAG_20200513161625",
"message" : [ "Diagcollect request will be processed soon by TFA" ]
}

 

You can download the report using the collectionId as below

curl -k --user tfaadmin:tfaadmin -L0 https://servertst01:9090/ords/tfactl/download/20200513161625servertst01 --output remoteTFA.zip

 % Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 196M 0 196M 0 0 195M 0 --:--:-- 0:00:01 --:--:-- 195M

On the next ones I will do a review on a ORACHK and OSWatcher which are also part of the Autonomous Health Framework

I Hope it helps!

Elisson Almeida

Autonomous Health Framework – OSWatcher

Hi all,

Another one for our series about Autonomous Health Framework:

Along with the AHF, we have some tools that we can take advantage of, one of them is OSWatcher. OSwatcher is a utility to capture performance metrics from the operating system using native OS tools for IO, network, CPU, memory, etc.

It gathers a snapshot of your system and stores it in a directory which you can then use it to parse the information there and perform a system wide analysis.

You can stop it if you want, out of the box it will gather OS information every 30 minutes

To see if it is running you can use tfactl toolstatus as below

[root@servertst01 bin]# ./tfactl toolstatus

.------------------------------------------------------------------.
| TOOLS STATUS - HOST : servertst01 |
+----------------------+--------------+--------------+-------------+
| Tool Type | Tool | Version | Status |
+----------------------+--------------+--------------+-------------+
| Development Tools | orachk | 19.3.0.0.0 | DEPLOYED |
| | oratop | 14.1.2 | DEPLOYED |
+----------------------+--------------+--------------+-------------+
| Support Tools Bundle | darda | 2.10.0.R6036 | DEPLOYED |
| | oswbb | 8.3.2 | RUNNING |
| | prw | 12.1.13.11.4 | NOT RUNNING |
+----------------------+--------------+--------------+-------------+
| TFA Utilities | alertsummary | 19.3.0.0.0 | DEPLOYED |
| | calog | 19.3.0.0.0 | DEPLOYED |
| | dbcheck | 18.3.0.0.0 | DEPLOYED |
| | dbglevel | 19.3.0.0.0 | DEPLOYED |
| | grep | 19.3.0.0.0 | DEPLOYED |
| | history | 19.3.0.0.0 | DEPLOYED |
| | ls | 19.3.0.0.0 | DEPLOYED |
| | managelogs | 19.3.0.0.0 | DEPLOYED |
| | menu | 19.3.0.0.0 | DEPLOYED |
| | param | 19.3.0.0.0 | DEPLOYED |
| | ps | 19.3.0.0.0 | DEPLOYED |
| | pstack | 19.3.0.0.0 | DEPLOYED |
| | summary | 19.3.0.0.0 | DEPLOYED |
| | tail | 19.3.0.0.0 | DEPLOYED |
| | triage | 19.3.0.0.0 | DEPLOYED |
| | vi | 19.3.0.0.0 | DEPLOYED |
'----------------------+--------------+--------------+-------------'

Note :-
DEPLOYED : Installed and Available - To be configured or run interactively.
NOT RUNNING : Configured and Available - Currently turned off interactively.
RUNNING : Configured and Available.

[root@servertst01 bin]#

And to run a simple test you can call tfactl oswbb and it should parse all data in its archive directory BUT and I tried to do so I ran into a java out of memory error
I bumped the memory of the process up to 4G and still gor the same error.

If you want to try to increase your process memory you will need to edit the file oswbb.pm under oracle.ahf/tfa/ext/oswbb

You will see a line like below, and you will need to change the memory valeu to one that your system can handle, I´m not saying it will not work you can try.

system("$java -Xmx512M -jar $oswjar -i $adir @flags");

Here is the error:

[root@servertst01 bin]# ./tfactl oswbb

Starting OSW Analyzer V8.3.0
OSWatcher Analyzer Written by Oracle Center of Expertise
Copyright (c) 2019 by Oracle Corporation

Parsing Data. Please Wait...

Scanning file headers for version and platform info...


Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at u.c(Unknown Source)
at u.a(Unknown Source)
at OSWGraph.OSWGraph.main(Unknown Source)


Analysis results are saved in /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/oswbb/root/oswbb

And there process memory which was a bit ovee 4G when it died

[root@servertst01 ~]$ ps -eo rss,pid,euser,lstart,args:100 --sort %mem | grep -v grep | grep java | awk '{printf $1/1024 "MB"; $1=""; print }'| sort |grep osw
4364.3MB 103531 root Wed May 13 18:09:12 2020 /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/jre/bin/java -Xmx4096M -jar /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/oswbb/root/oswbb/oswbba.jar -i /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/oswbb/oracrs/archive

I had over 4 days of data there, so by using the options -B and -E I was able to workaround the issue. If you are doing some troubleshooting I would advise to read 4 days of data anyway as we could be see averages and a spike could be reduced and not see in the analysis.
So it did not bothered that much.

Here when passing the date range I was able to execute the process as needed.

[root@servertst01 bin]# ./tfactl oswbb -6 -B May 13 09:25:00 2020 -E May 13 09:30:00 2020

Validating times in the archive...


Starting OSW Analyzer V8.3.0
OSWatcher Analyzer Written by Oracle Center of Expertise
Copyright (c) 2019 by Oracle Corporation

Parsing Data. Please Wait...

Scanning file headers for version and platform info...


Parsing file servertst01_pidstat_20.05.13.0900.dat ...

Parsing file servertst01_iostat_20.05.13.0900.dat ...
This directory already exists. Rewriting...

Parsing file servertst01_vmstat_20.05.13.0900.dat ...

Parsing file servertst01_netstat_20.05.13.0900.dat ...

Parsing file servertst01_top_20.05.13.0900.dat ...

Parsing file servertst01_ps_20.05.13.0900.dat ...


Parsing Completed.


Enter 1 to Display CPU Process Queue Graphs
Enter 2 to Display CPU Utilization Graphs
Enter 3 to Display CPU Other Graphs
Enter 4 to Display Memory Graphs
Enter 5 to Display Disk IO Graphs
Enter 61 to Display Individual OS Process I/O RPS Graphs
Enter 62 to Display Individual OS Process I/O WPS Graphs
Enter 63 to Display Individual OS Process Percent User CPU Graphs
Enter 64 to Display Individual OS Process Percent System CPU Graphs
Enter 65 to Display Individual OS Process Percent Total CPU (User + System) Graphs
Enter 66 to Display Individual OS Process Percent Memory Graphs

Enter GP to Generate Individual Process Profile
Enter GC to Generate All CPU Gif Files
Enter GM to Generate All Memory Gif Files
Enter GD to Generate All Disk Gif Files
Enter GN to Generate All Network Gif Files

Enter L to Specify Alternate Location of Gif Directory
Enter Z to Zoom Graph Time Scale (Does not change analysis dataset)
Enter B to Returns to Baseline Graph Time Scale (Does not change analysis dataset)
Enter R to Remove Currently Displayed Graphs

Enter X to Export Parsed Data to Flat File
Enter S to Analyze Subset of Data(Changes analysis dataset including graph time scale)
Enter A to Analyze Data
Enter D to Generate DashBoard

Enter Q to Quit Program

Please Select an Option:

Hope it helps!

Elisson Almeida

Autonomous Health Framework – TFA data collection

Hi all,

Here I am or another post on the Autonomous Health Framework series!

Based on previous posts, we now have the AHF installed, what can we do with it?
I would say a lot! Everything? I don’t think so.

I will post here what I usually use more on TFA and then I will focus on orachk (which is a monster on its own as well).

Imagine that you have a problem and need to gather different information from the OS and the Oracle Stack from different nodes?

That´s where TFA – Trace file analyzer comes into play. It will collect the data for you and if you run as its root. You can specify the component that you need if you need to drill down to a very specify issue.

Also when you have a Service request with MOS they will probably ask for it to be executed but if they don’t, you should do it and upload any way as it will save a lot of time to troubleshooting and back and forth messages.

And if you have Internet access, you can upload it using the TFA as well, see it here.

[root@servertst01 bin]# ./tfactl diagcollect –h

Invalid Option for diagcollect: –h

Collect logs from across nodes in cluster

Usage : /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect [ [component_name1] [component_name2] ... [component_nameN] | [-srdc ] | [-defips]] [-sr <SR#>] [-node <all|local|n1,n2,..>] [-tag ] [-z ] [-last <m|h|d>| -from  -to  | -for ] [-nocopy] [-notrim] [-silent] [-nocores][-collectalldirs][-collectdir <dir1,dir2..>][-examples]
components:-ips|-database|-asm|-crsclient|-dbclient|-dbwlm|-tns|-rhp|-procinfo|-cvu|-afd|-crs|-cha|-chastore|-wls|-emagent|-oms|-ocm|-emplugins|-em|-acfs|-install|-cfgtools|-os|-ashhtml|-ashtext|-awrhtml|-awrtext|-qos
-srdc Service Request Data Collection (SRDC).
-defips Include in the default collection the IPS Packages for:
ASM, CRS and Databases
-sr Enter SR number to which the collection will be uploaded
-node Specify comma separated list of host names for collection
-tag  The files will be collected into tagname directory inside
repository
-z  The collection zip file will be given this name within the
TFA collection repository
-last <m|h|d> Files from last 'n' [m]inutes, 'n' [d]ays or 'n' [h]ours
-since Same as -last. Kept for backward compatibility.
-from "Mon/dd/yyyy hh:mm:ss" From 
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-to "Mon/dd/yyyy hh:mm:ss" To 
or "yyyy-mm-dd hh:mm:ss"
or "yyyy-mm-ddThh:mm:ss"
or "yyyy-mm-dd"
-for "Mon/dd/yyyy" For .
or "yyyy-mm-dd"
-nocopy Does not copy back the zip files to initiating node from all nodes
-notrim Does not trim the files collected
-silent This option is used to submit the diagcollection as a background
process
-nocores Do not collect Core files when it would normally have been
collected
-collectalldirs Collect all files from a directory marked "Collect All"
flag to true
-collectdir Specify comma separated list of directories and collection will
include all files from these irrespective of type and time constraints
in addition to components specified
-examples Show diagcollect usage examples
For detailed help on each component use:
/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect [component_name1] [component_name2] ... [component_nameN] -help
[root@servertst01 bin]#

[root@servertst01 bin]# ./tfactl diagcollect -examples
Examples:
/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect
Trim and Zip all files updated in the last 12 hours as well as chmos/osw data
from across the cluster and collect at the initiating node
Note: This collection could be larger than required but is there as the
simplest way to capture diagnostics if an issue has recently occurred.

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -last 8h
Trim and Zip all files updated in the last 8 hours as well as chmos/osw data
from across the cluster and collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -database hrdb,fdb -last 1d -z foo
Trim and Zip all files from databases hrdb & fdb in the last 1 day and
collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -crs -os -node node1,node2 -last 6h
Trim and Zip all crs files, o/s logs and chmos/osw data from node1 & node2
updated in the last 6 hours and collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -asm -node node1 -from "May/07/2020" -to "May/07/2020 21:00:00"
Trim and Zip all ASM logs from node1 updated between from and to time and
collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -for "May/07/2020"
Trim and Zip all log files updated on "May/07/2020" and collect at the
collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -for "May/07/2020 21:00:00"
Trim and Zip all log files updated from 09:00 on "May/07/2020" to 09:00 on "May/08/2020"
(i.e. 12 hours before and after the time given) and collect at the initiating node

/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl diagcollect -crs -collectdir /tmp_dir1,/tmp_dir2
Trim and Zip all crs files updated in the last 12 hours
Also collect all files from /tmp_dir1 and /tmp_dir2 at the initiating node

 

Autonomous Health Framework – TFA repository purge

Hello again,

Here I am for the third posts on this series about Autonomous Health Framework.

And today’s it’s a quick one for a very specific item. Let’s say TFA related information is stored in the cluster nodes local storage under the directory specified in the -data_dir parameter during the installation:

[root@servertst01 bin]# ./tfactl showrepo -h
usage: tfactl showrepo [-h] [-all] [-tfa] [-compliance]
Get repository location of AHF components
optional arguments:
-h, --help show this help message and exit
-all
-tfa
-compliance
[root@servertst01 bin]# ./tfactl showrepo -all
.-------------------------------------------------------------------------------------.
| servertst02 |
+----------------------+--------------------------------------------------------------+
| Repository Parameter | Value |
+----------------------+--------------------------------------------------------------+
| Location | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository |
| Maximum Size (MB) | 4425 |
| Current Size (MB) | 170 |
| Free Size (MB) | 4255 |
| Status | OPEN |
'----------------------+--------------------------------------------------------------'

.-------------------------------------------------------------------------------------.
| servertst01 |
+----------------------+--------------------------------------------------------------+
| Repository Parameter | Value |
+----------------------+--------------------------------------------------------------+
| Location | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository |
| Maximum Size (MB) | 3833 |
| Current Size (MB) | 898 |
| Free Size (MB) | 2935 |
| Status | OPEN |
'----------------------+--------------------------------------------------------------'

orachk repository: /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/servertst01/orachk

[root@servertst01 bin]# ./tfactl showrepo -tfa
.-------------------------------------------------------------------------------------.
| servertst02 |
+----------------------+--------------------------------------------------------------+
| Repository Parameter | Value |
+----------------------+--------------------------------------------------------------+
| Location | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository |
| Maximum Size (MB) | 4425 |
| Current Size (MB) | 172 |
| Free Size (MB) | 4253 |
| Status | OPEN |
'----------------------+--------------------------------------------------------------'

.-------------------------------------------------------------------------------------.
| servertst01 |
+----------------------+--------------------------------------------------------------+
| Repository Parameter | Value |
+----------------------+--------------------------------------------------------------+
| Location | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository |
| Maximum Size (MB) | 3833 |
| Current Size (MB) | 900 |
| Free Size (MB) | 2933 |
| Status | OPEN |
'----------------------+--------------------------------------------------------------'

Space running out could be a real problem so purging it from time to time would help avoid problems:

[root@servertst01 bin]# ./tfactl purge -h

Delete collections from TFA repository

Usage : /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl purge -older x[h|d] [-force]
Examples:
/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl purge -older 30d - To remove file(s) older than 30 days.
/oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl purge -older 10h - To remove file(s) older than 10 hours.
[root@servertst01 bin]#

 

This worked for me, I hope it works for you!

Elisson Almeida


									

Autonomous Health Framework – Missing node on TFA

Hi all,

Welcome to the second post of this series!

Yesterday we one about Autonomous Health Framework basics and tomorrow we have another one!

For today: While manually installing AHF – when the root user can not ssh to the cluster nodes – you might have a missing communication between the nodes while installing
which you don’t have all cluster nodes found by all TFA installations.

In my example, I have a 3 nodes cluster which I was doing the manual install, on the 1st two nodes I was able to see them but on the 3rd node the tfa was not being able to see the other 2.

Here are the steps or the manual installation:

  • Node1
[root@servertst01 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_178647_2020_05_08-16_24_07.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/dba/ahf/oracle.ahf

AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/dba/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------+
| AHF Location | /oraadm/dba/ahf/oracle.ahf |
| TFA Location | /oraadm/dba/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/dba/ahf/oracle.ahf/data |
| Repository | /oraadm/dba/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/dba/ahf/oracle.ahf/data/servertst01/diag |
'-----------------+--------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_178647_2020_05_08-16_24_07.log to /oraadm/dba/ahf/oracle.ahf/data/servertst01/diag/ahf/

Node 2

[root@servertst02 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_236841_2020_05_08-16_27_47.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

[ERROR] : AHF-00009: AHF Location directory [/oraadm/dba/ahf] not found

[root@servertst02 oraadm]# mkdir -p /oraadm/dba/ahf/data
[root@servertst02 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_237428_2020_05_08-16_27_59.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/dba/ahf/oracle.ahf

AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/dba/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst02 | RUNNING | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------+
| AHF Location | /oraadm/dba/ahf/oracle.ahf |
| TFA Location | /oraadm/dba/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/dba/ahf/oracle.ahf/data |
| Repository | /oraadm/dba/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/dba/ahf/oracle.ahf/data/servertst02/diag |
'-----------------+--------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_237428_2020_05_08-16_27_59.log to /oraadm/dba/ahf/oracle.ahf/data/servertst02/diag/ahf/

[root@servertst02 oraadm]# ls /oraadm/dba/ahf/oracle.ahf/data/servertst02/

Node 3

[root@servertst03 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_129104_2020_05_08-16_31_08.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/dba/ahf/oracle.ahf

AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/dba/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst03 | RUNNING | 135279 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------+
| AHF Location | /oraadm/dba/ahf/oracle.ahf |
| TFA Location | /oraadm/dba/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/dba/ahf/oracle.ahf/data |
| Repository | /oraadm/dba/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/dba/ahf/oracle.ahf/data/servertst03/diag |
'-----------------+--------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_129104_2020_05_08-16_31_08.log to /oraadm/dba/ahf/oracle.ahf/data/servertst03/diag/ahf/

As you can see the node 2 was able to see the existing installation of node 1 but when it came to node 3 it was not able to.

To resolve the issue, we need to use the syncnodes command:

[root@servertst01 bin]# ./tfactl status

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst02 | RUNNING | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'
[root@servertst01 bin]# ./tfactl syncnodes

Login using root is disabled in sshd config. Please enable it or

Please copy these files manually to remote node and restart TFA
1. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/server.jks
2. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/client.jks
3. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/internal/ssl.properties

These files must be owned by root and should have 600 permissions.

I copied the files to the specified location and stopped and started TFA and it was able to see all nodes in the cluster.

[root@servertst03 bin]# ./tfactl statusahf

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst03 | RUNNING | 135279 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'

orachk scheduler is running [PID: 135279] [Version: 20.1.3]
[root@servertst03 bin]# ./tfactl stop
Stopping TFA from the Command Line
Stopped OSWatcher
Nothing to do !
Killing TFA running with pid 135279
. . .
Successfully stopped TFA..
[root@servertst03 bin]# ./tfactl start
Starting TFA..
Waiting up to 100 seconds for TFA to be started..
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
Successfully started TFA Process..
. . . . .
TFA Started and listening for commands
[root@servertst03 bin]# ./tfactl status

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst03 | RUNNING | 191886 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst02 | RUNNING | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'

And that solved the issue!

I hope it helps you!

Elisson Almeida

Autonomous Health Framework – Installation and Usage Basics

Hi all,

So, this week we’ll have an Autonomous Health Framework series o posts! One post per day, stay tuned!

Today I’ll review some points of the Autonomous Health Framework and how it can help us on the day to day tasks as well as on specific problem resolutions. This framework is a bundle of tools that were already available like orachk/exachk, tfa plus other features that I will try to cover focusing on practical cases.

So, first of all, you need to install it, if you have a RAC cluster most likely you already have it but it is always good to update it, as Oracle security recommendations as also part of the orachk analyze. For reference: In an ideal world you have root access or sudo, as its daemons under root provides a more wide log collection capabilities.

  1. Download the latest version from Autonomous Health Framework (AHF) – Including TFA and ORAchk/EXAChk (Doc ID 2550798.1)
  2. Transfer and unzip the file to your server and as root follow the installation as below
root@servertst01 oracrs]# ./ahf_setup -ahf_loc /oraadm/oracrs/product/19.0.0/ahf -data_dir /oraadm/oracrs/product/19.0.0/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_120962_2020_05_07-22_41_11.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf

AHF Data Directory : /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst01 | RUNNING | 135195 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------------------------+
| AHF Location | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf |
| TFA Location | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data |
| Repository | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/servertst01/diag |
'-----------------+--------------------------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_120962_2020_05_07-22_41_11.log to /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/servertst01/diag/ahf/

As you can see as the root user is not allowed to ssh, the installation was done only on the local node. Repeat the process on the remaning ones

Once you have it completed you will be able to to a tfactl status and a toolstatus and see the processes and tools available.
[root@servertst01 bin]# ./tfactl status

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst01 | RUNNING | 135195 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst02 | RUNNING | 241723 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'
[root@servertst01 bin]#
[root@servertst01 bin]# ./tfactl toolstatus

.------------------------------------------------------------------.
| TOOLS STATUS - HOST : servertst01 |
+----------------------+--------------+--------------+-------------+
| Tool Type | Tool | Version | Status |
+----------------------+--------------+--------------+-------------+
| Development Tools | orachk | 19.3.0.0.0 | DEPLOYED |
| | oratop | 14.1.2 | DEPLOYED |
+----------------------+--------------+--------------+-------------+
| Support Tools Bundle | darda | 2.10.0.R6036 | DEPLOYED |
| | oswbb | 8.3.2 | RUNNING |
| | prw | 12.1.13.11.4 | NOT RUNNING |
+----------------------+--------------+--------------+-------------+
| TFA Utilities | alertsummary | 19.3.0.0.0 | DEPLOYED |
| | calog | 19.3.0.0.0 | DEPLOYED |
| | dbcheck | 18.3.0.0.0 | DEPLOYED |
| | dbglevel | 19.3.0.0.0 | DEPLOYED |
| | grep | 19.3.0.0.0 | DEPLOYED |
| | history | 19.3.0.0.0 | DEPLOYED |
| | ls | 19.3.0.0.0 | DEPLOYED |
| | managelogs | 19.3.0.0.0 | DEPLOYED |
| | menu | 19.3.0.0.0 | DEPLOYED |
| | param | 19.3.0.0.0 | DEPLOYED |
| | ps | 19.3.0.0.0 | DEPLOYED |
| | pstack | 19.3.0.0.0 | DEPLOYED |
| | summary | 19.3.0.0.0 | DEPLOYED |
| | tail | 19.3.0.0.0 | DEPLOYED |
| | triage | 19.3.0.0.0 | DEPLOYED |
| | vi | 19.3.0.0.0 | DEPLOYED |
'----------------------+--------------+--------------+-------------'

Note :-
DEPLOYED : Installed and Available - To be configured or run interactively.
NOT RUNNING : Configured and Available - Currently turned off interactively.
RUNNING : Configured and Available.

[root@servertst01 bin]#

3. Using it you can ‘query’ all cluster nodes for information from a single point which can really speed up your information collection. Here is an example of the summary command:

[root@servertst01 bin]# ./tfactl summary -h
---------------------------------------------------------------------------------
Usage : TFACTL [run] summary -help
---------------------------------------------------------------------------------
Command : /oraptp/oracrs/product/19.0.0/ahf/oracle.ahf/tfa/bin/tfactl [run] summary [OPTIONS]
Following Options are supported:
        [no_components] : [Default] Complete Summary Collection
        -overview       : [Optional/Default] Complete Summary Collection - Overview
        -crs            : [Optional/Default] CRS Status Summary
        -asm            : [Optional/Default] ASM Status Summary
        -acfs           : [Optional/Default] ACFS Status Summary
        -database       : [Optional/Default] DATABASE Status Summary
        -exadata        : [Optional/Default] EXADATA Status Summary
                          Not enabled/ignored in Windows and Non-Exadata machine
        -patch          : [Optional/Default] Patch Details
        -listener       : [Optional/Default] LISTENER Status Summary
        -network        : [Optional/Default] NETWORK Status Summary
        -os             : [Optional/Default] OS Status Summary
        -tfa            : [Optional/Default] TFA Status Summary
        -summary        : [Optional/Default] Summary Tool Metadata

        -json           : [Optional] - Prepare json report
        -html           : [Optional] - Prepare html report
        -print          : [Optional] - Display [html or json] Report at Console
        -silent         : [Optional] - Interactive console by defauly
        -history <num>  : [Optional] - View Previous <numberof> Summary Collection History in Interpreter
        -node <node(s)> : [Optional] - local or Comma Separated Node Name(s)
        -help           : Usage/Help.
---------------------------------------------------------------------------------
  - Data Collection From Node - servertst02 .. Done.
[root@servertst01 bin]#

[root@servertst01 bin]# ./tfactl summary

  Executing Summary in Parallel on Following Nodes:
    Node : servertst01
    Node : servertst02

LOGFILE LOCATION : /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/summary/root/20200508124309/log/summary_command_20200508124309_servertst01_35756.log

  Component Specific Summary collection :
    - Collecting CRS details ... Done.
    - Collecting ASM details ... Done.
    - Collecting ACFS details ... Done.
    - Collecting DATABASE details ... Done.
    - Collecting PATCH details ... Done.
    - Collecting LISTENER details ... Done.
    - Collecting NETWORK details ... Done.
    - Collecting OS details ... Done.
    - Collecting TFA details ... Done.
    - Collecting SUMMARY details ... Done.

  Remote Summary Data Collection : In-Progress - Please wait ...
  - Data Collection From Node - servertst02 .. Done.

  Prepare Clusterwide Summary Overview ... Done
      cluster_status_summary

  COMPONENT   STATUS    DETAILS
+-----------+---------+---------------------------------------------------------------------------------------------------+
  CRS         PROBLEM   .-----------------------------------------------.
                        | CRS_SERVER_STATUS   : ONLINE                  |
                        | CRS_STATE           : ONLINE                  |
                        | CRS_INTEGRITY_CHECK : FAIL                    |
                        | CRS_RESOURCE_STATUS : OFFLINE Resources Found |
                        '-----------------------------------------------'
  ASM         PROBLEM   .-------------------------------------------------------.
                        | ASM_DISK_SIZE_STATUS : WARNING - Available Size < 20% |
                        | ASM_BLOCK_STATUS     : PASS                           |
                        | ASM_CHAIN_STATUS     : PASS                           |
                        | ASM_INCIDENTS        : PASS                           |
                        | ASM_PROBLEMS         : PASS                           |
                        '-------------------------------------------------------'
  ACFS        OFFLINE   .-----------------------.
                        | ACFS_STATUS : OFFLINE |
                        '-----------------------'
  DATABASE    PROBLEM   .-----------------------------------------------------------------------------------------------.
                        | ORACLE_HOME_DETAILS                                                        | ORACLE_HOME_NAME |
                        +----------------------------------------------------------------------------+------------------+
                        | .------------------------------------------------------------------------. | OraDb11g_home1   |
                        | | PROBLEMS | INCIDENTS | DB_BLOCKS | DATABASE_NAME | STATUS  | DB_CHAINS | |                  |
                        | +----------+-----------+-----------+---------------+---------+-----------+ |                  |
                        | | PROBLEM  | PROBLEM   | PASS      | FSIGNTST      | PROBLEM | PROBLEM   | |                  |
                        | '----------+-----------+-----------+---------------+---------+-----------' |                  |
                        '----------------------------------------------------------------------------+------------------'
  PATCH       OK        .----------------------------------------------.
                        | CRS_PATCH_CONSISTENCY_ACROSS_NODES      : OK |
                        | DATABASE_PATCH_CONSISTENCY_ACROSS_NODES : OK |
                        '----------------------------------------------'
  LISTENER    OK        .-----------------------.
                        | LISTNER_STATUS   : OK |
                        '-----------------------'
  NETWORK     OK        .---------------------------.
                        | CLUSTER_NETWORK_STATUS :  |
                        '---------------------------'
  OS          OK        .-----------------------.
                        | MEM_USAGE_STATUS : OK |
                        '-----------------------'
  TFA         OK        .----------------------.
                        | TFA_STATUS : RUNNING |
                        '----------------------'
  SUMMARY     OK        .-----------------------------------.
                        | SUMMARY_EXECUTION_TIME : 0H:2M:9S |
                        '-----------------------------------'
+-----------+---------+---------------------------------------------------------------------------------------------------+


        ### Entering in to SUMMARY Command-Line Interface ###

tfactl_summary>list

  Components : Select Component - select [component_number|component_name]
        1 => overview
        2 => crs_overview
        3 => asm_overview
        4 => acfs_overview
        5 => database_overview
        6 => patch_overview
        7 => listener_overview
        8 => network_overview
        9 => os_overview
        10 => tfa_overview
        11 => summary_overview

 

See you tomorrow at the next post o this series!

Elisson Almeida