Autonomous Health Framework – Missing node on TFA

Hi all,

Welcome to the second post of this series!

Yesterday we one about Autonomous Health Framework basics and tomorrow we have another one!

For today: While manually installing AHF – when the root user can not ssh to the cluster nodes – you might have a missing communication between the nodes while installing
which you don’t have all cluster nodes found by all TFA installations.

In my example, I have a 3 nodes cluster which I was doing the manual install, on the 1st two nodes I was able to see them but on the 3rd node the tfa was not being able to see the other 2.

Here are the steps or the manual installation:

  • Node1
[root@servertst01 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_178647_2020_05_08-16_24_07.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/dba/ahf/oracle.ahf

AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/dba/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------+
| AHF Location | /oraadm/dba/ahf/oracle.ahf |
| TFA Location | /oraadm/dba/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/dba/ahf/oracle.ahf/data |
| Repository | /oraadm/dba/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/dba/ahf/oracle.ahf/data/servertst01/diag |
'-----------------+--------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_178647_2020_05_08-16_24_07.log to /oraadm/dba/ahf/oracle.ahf/data/servertst01/diag/ahf/

Node 2

[root@servertst02 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_236841_2020_05_08-16_27_47.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

[ERROR] : AHF-00009: AHF Location directory [/oraadm/dba/ahf] not found

[root@servertst02 oraadm]# mkdir -p /oraadm/dba/ahf/data
[root@servertst02 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_237428_2020_05_08-16_27_59.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/dba/ahf/oracle.ahf

AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/dba/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst02 | RUNNING | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------+
| AHF Location | /oraadm/dba/ahf/oracle.ahf |
| TFA Location | /oraadm/dba/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/dba/ahf/oracle.ahf/data |
| Repository | /oraadm/dba/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/dba/ahf/oracle.ahf/data/servertst02/diag |
'-----------------+--------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_237428_2020_05_08-16_27_59.log to /oraadm/dba/ahf/oracle.ahf/data/servertst02/diag/ahf/

[root@servertst02 oraadm]# ls /oraadm/dba/ahf/oracle.ahf/data/servertst02/

Node 3

[root@servertst03 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_129104_2020_05_08-16_31_08.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/dba/ahf/oracle.ahf

AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/dba/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst03 | RUNNING | 135279 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------+
| AHF Location | /oraadm/dba/ahf/oracle.ahf |
| TFA Location | /oraadm/dba/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/dba/ahf/oracle.ahf/data |
| Repository | /oraadm/dba/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/dba/ahf/oracle.ahf/data/servertst03/diag |
'-----------------+--------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_129104_2020_05_08-16_31_08.log to /oraadm/dba/ahf/oracle.ahf/data/servertst03/diag/ahf/

As you can see the node 2 was able to see the existing installation of node 1 but when it came to node 3 it was not able to.

To resolve the issue, we need to use the syncnodes command:

[root@servertst01 bin]# ./tfactl status

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst02 | RUNNING | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'
[root@servertst01 bin]# ./tfactl syncnodes

Login using root is disabled in sshd config. Please enable it or

Please copy these files manually to remote node and restart TFA
1. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/server.jks
2. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/client.jks
3. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/internal/ssl.properties

These files must be owned by root and should have 600 permissions.

I copied the files to the specified location and stopped and started TFA and it was able to see all nodes in the cluster.

[root@servertst03 bin]# ./tfactl statusahf

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst03 | RUNNING | 135279 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'

orachk scheduler is running [PID: 135279] [Version: 20.1.3]
[root@servertst03 bin]# ./tfactl stop
Stopping TFA from the Command Line
Stopped OSWatcher
Nothing to do !
Killing TFA running with pid 135279
. . .
Successfully stopped TFA..
[root@servertst03 bin]# ./tfactl start
Starting TFA..
Waiting up to 100 seconds for TFA to be started..
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
Successfully started TFA Process..
. . . . .
TFA Started and listening for commands
[root@servertst03 bin]# ./tfactl status

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst03 | RUNNING | 191886 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst02 | RUNNING | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'

And that solved the issue!

I hope it helps you!

Elisson Almeida

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.