Autonomous Health Framework – Missing node on TFA

Hi all,

Welcome to the second post of this series!

Yesterday we had one about Autonomous Health Framework basics, and tomorrow we have another one!

For today: when you install AHF manually on each node because the root user cannot SSH to the other cluster nodes, the TFA installations may fail to communicate with each other, so not every TFA installation finds all the cluster nodes.

In my example, I was doing the manual install on a 3-node cluster. The first two nodes could see each other, but on the 3rd node TFA was not able to see the other 2.
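One thing worth checking before running the installer on each node: as the Node 2 output further down shows, ahf_setup stops with AHF-00009 when the directory passed to -ahf_loc does not exist. A minimal sketch of a pre-check, assuming the same paths used in this post (ensure_ahf_dirs is a hypothetical helper name, not part of AHF):

```shell
# Sketch only: create the AHF location and data directory if they are missing.
# ensure_ahf_dirs is a hypothetical helper, not an AHF command.
ensure_ahf_dirs() {
  ahf_loc="$1"
  data_dir="$2"
  for dir in "$ahf_loc" "$data_dir"; do
    if [ ! -d "$dir" ]; then
      mkdir -p "$dir" || return 1
      echo "created $dir"
    fi
  done
}

# Example with the paths from this post (run as root on each node):
# ensure_ahf_dirs /oraadm/dba/ahf /oraadm/dba/ahf/data
```

Running it on a node where the directories already exist prints nothing and changes nothing.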

Here are the steps of the manual installation:

Node 1

[root@servertst01 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_178647_2020_05_08-16_24_07.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/dba/ahf/oracle.ahf

AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/dba/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------+
| AHF Location | /oraadm/dba/ahf/oracle.ahf |
| TFA Location | /oraadm/dba/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/dba/ahf/oracle.ahf/data |
| Repository | /oraadm/dba/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/dba/ahf/oracle.ahf/data/servertst01/diag |
'-----------------+--------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_178647_2020_05_08-16_24_07.log to /oraadm/dba/ahf/oracle.ahf/data/servertst01/diag/ahf/

Node 2

[root@servertst02 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_236841_2020_05_08-16_27_47.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

[ERROR] : AHF-00009: AHF Location directory [/oraadm/dba/ahf] not found

[root@servertst02 oraadm]# mkdir -p /oraadm/dba/ahf/data
[root@servertst02 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_237428_2020_05_08-16_27_59.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/dba/ahf/oracle.ahf

AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/dba/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst02 | RUNNING | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------+
| AHF Location | /oraadm/dba/ahf/oracle.ahf |
| TFA Location | /oraadm/dba/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/dba/ahf/oracle.ahf/data |
| Repository | /oraadm/dba/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/dba/ahf/oracle.ahf/data/servertst02/diag |
'-----------------+--------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_237428_2020_05_08-16_27_59.log to /oraadm/dba/ahf/oracle.ahf/data/servertst02/diag/ahf/

[root@servertst02 oraadm]# ls /oraadm/dba/ahf/oracle.ahf/data/servertst02/

Node 3

[root@servertst03 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_129104_2020_05_08-16_31_08.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.3 Build Date: 202004290950

TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Installed TFA Version : 192000 Build ID : 20190426041420

AHF Location : /oraadm/dba/ahf/oracle.ahf

AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data

Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home

Do you want to add AHF Notification Email IDs ? [Y]|N : n

Login using root is disabled in sshd config. Installing AHF only on Local Node

Extracting AHF to /oraadm/dba/ahf/oracle.ahf

Configuring TFA Services

Copying TFA Data Files to AHF

Discovering Nodes and Oracle Resources

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.---------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-------------+---------------+--------+------+------------+----------------------+
| servertst03 | RUNNING | 135279 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------.
| Summary of AHF Configuration |
+-----------------+--------------------------------------------------+
| Parameter | Value |
+-----------------+--------------------------------------------------+
| AHF Location | /oraadm/dba/ahf/oracle.ahf |
| TFA Location | /oraadm/dba/ahf/oracle.ahf/tfa |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk |
| Data Directory | /oraadm/dba/ahf/oracle.ahf/data |
| Repository | /oraadm/dba/ahf/oracle.ahf/data/repository |
| Diag Directory | /oraadm/dba/ahf/oracle.ahf/data/servertst03/diag |
'-----------------+--------------------------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_129104_2020_05_08-16_31_08.log to /oraadm/dba/ahf/oracle.ahf/data/servertst03/diag/ahf/

As you can see, node 2 was able to see the existing installation on node 1, but node 3 was not able to see either of the other nodes.
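A quick way to spot this on any node is to compare the hosts in the tfactl status table with the nodes you expect. The following is a rough sketch that only parses the table layout shown in this post; tfa_hosts and missing_tfa_nodes are hypothetical helper names, and the parsing assumes the pipe-separated format does not change between versions:

```shell
# List the host names found in a `tfactl status` table read from stdin.
# Assumes the pipe-separated layout shown in this post.
tfa_hosts() {
  awk -F'|' 'NF >= 7 {
    host = $2
    gsub(/^[ \t]+|[ \t]+$/, "", host)
    if (host != "" && host != "Host") print host
  }'
}

# Print each expected node (given as arguments) that is absent from the table on stdin.
missing_tfa_nodes() {
  seen_hosts="$(tfa_hosts)"
  for node in "$@"; do
    printf '%s\n' "$seen_hosts" | grep -qxF -- "$node" || echo "$node"
  done
}

# Example (run from the AHF bin directory on each node):
# ./tfactl status | missing_tfa_nodes servertst01 servertst02 servertst03
```

An empty result means the local TFA already knows about every node you listed.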

To resolve the issue, we need to use the syncnodes command:

[root@servertst01 bin]# ./tfactl status

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst02 | RUNNING | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'
[root@servertst01 bin]# ./tfactl syncnodes

Login using root is disabled in sshd config. Please enable it or

Please copy these files manually to remote node and restart TFA
1. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/server.jks
2. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/client.jks
3. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/internal/ssl.properties

These files must be owned by root and should have 600 permissions.

I copied the three files to the corresponding location on node 3, then stopped and started TFA there, and it was able to see all nodes in the cluster.
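The copy itself is not shown in the output. Since root cannot SSH between the nodes, one option is to move the files across as a regular user into a staging directory and then put them in place as root on the target node. Here is a minimal sketch of that second step. place_tfa_sync_files is a hypothetical helper, and the target directory on node 3 (data/servertst03/tfa) is my assumption by analogy with the node 1 paths listed by syncnodes:

```shell
# Sketch only: place the three files listed by `tfactl syncnodes` with 600 permissions.
# $1 = staging directory holding server.jks, client.jks and ssl.properties
# $2 = TFA data directory on the target node (assumed: <data_dir>/<hostname>/tfa)
place_tfa_sync_files() {
  stage_dir="$1"
  tfa_dir="$2"
  for name in server.jks client.jks ssl.properties; do
    if [ ! -f "$stage_dir/$name" ]; then
      echo "missing $stage_dir/$name" >&2
      return 1
    fi
  done
  mkdir -p "$tfa_dir/internal" || return 1
  cp "$stage_dir/server.jks" "$tfa_dir/server.jks" &&
  cp "$stage_dir/client.jks" "$tfa_dir/client.jks" &&
  cp "$stage_dir/ssl.properties" "$tfa_dir/internal/ssl.properties" || return 1
  for path in "$tfa_dir/server.jks" "$tfa_dir/client.jks" "$tfa_dir/internal/ssl.properties"; do
    chmod 600 "$path" || return 1
    # Only root can change ownership, so this step is skipped for other users.
    if [ "$(id -u)" -eq 0 ]; then
      chown root:root "$path" || return 1
    fi
  done
}

# Example (as root on node 3, after staging the files in /tmp/tfa_sync):
# place_tfa_sync_files /tmp/tfa_sync /oraadm/dba/ahf/oracle.ahf/data/servertst03/tfa
```

Consider backing up the existing files on the target node first, and remove the staging copies afterwards since they contain keystores. TFA still has to be restarted on the target node, as in the output that follows.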

[root@servertst03 bin]# ./tfactl statusahf

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst03 | RUNNING | 135279 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'

orachk scheduler is running [PID: 135279] [Version: 20.1.3]
[root@servertst03 bin]# ./tfactl stop
Stopping TFA from the Command Line
Stopped OSWatcher
Nothing to do !
Killing TFA running with pid 135279
. . .
Successfully stopped TFA..
[root@servertst03 bin]# ./tfactl start
Starting TFA..
Waiting up to 100 seconds for TFA to be started..
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
Successfully started TFA Process..
. . . . .
TFA Started and listening for commands
[root@servertst03 bin]# ./tfactl status

.----------------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst03 | RUNNING | 191886 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst01 | RUNNING | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
| servertst02 | RUNNING | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE |
'-------------+---------------+--------+------+------------+----------------------+------------------'

And that solved the issue!

I hope it helps you!

Elisson Almeida
