Hi all,
Welcome to the second post of this series!
Yesterday we had one about Autonomous Health Framework basics, and tomorrow we have another one!
For today: when you install AHF manually because the root user cannot ssh to the other cluster nodes, the TFA installations may not be able to communicate with each other during the install, so not every TFA installation discovers all the cluster nodes.
In my example, I was doing the manual install on a 3-node cluster. The first two nodes could see each other, but on the 3rd node TFA was not able to see the other two.
Here are the steps of the manual installation:
Node 1
[root@servertst01 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data
AHF Installer for Platform Linux Architecture x86_64
AHF Installation Log : /tmp/ahf_install_178647_2020_05_08-16_24_07.log
Starting Autonomous Health Framework (AHF) Installation
AHF Version: 20.1.3 Build Date: 202004290950
TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home
Installed TFA Version : 192000 Build ID : 20190426041420
AHF Location : /oraadm/dba/ahf/oracle.ahf
AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data
Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home
Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home
Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst01/tfa_home
Do you want to add AHF Notification Email IDs ? [Y]|N : n
Login using root is disabled in sshd config. Installing AHF only on Local Node
Extracting AHF to /oraadm/dba/ahf/oracle.ahf
Configuring TFA Services
Copying TFA Data Files to AHF
Discovering Nodes and Oracle Resources
Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
.---------------------------------------------------------------------------------.
| Host        | Status of TFA | PID    | Port | Version    | Build ID             |
+-------------+---------------+--------+------+------------+----------------------+
| servertst01 | RUNNING       | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'
Running TFA Inventory...
Adding default users to TFA Access list...
.--------------------------------------------------------------------.
|                    Summary of AHF Configuration                    |
+-----------------+--------------------------------------------------+
| Parameter       | Value                                            |
+-----------------+--------------------------------------------------+
| AHF Location    | /oraadm/dba/ahf/oracle.ahf                       |
| TFA Location    | /oraadm/dba/ahf/oracle.ahf/tfa                   |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk                |
| Data Directory  | /oraadm/dba/ahf/oracle.ahf/data                  |
| Repository      | /oraadm/dba/ahf/oracle.ahf/data/repository       |
| Diag Directory  | /oraadm/dba/ahf/oracle.ahf/data/servertst01/diag |
'-----------------+--------------------------------------------------'
Starting orachk daemon from AHF ...
AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin
AHF is successfully installed
Moving /tmp/ahf_install_178647_2020_05_08-16_24_07.log to /oraadm/dba/ahf/oracle.ahf/data/servertst01/diag/ahf/
Node 2
[root@servertst02 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data
AHF Installer for Platform Linux Architecture x86_64
AHF Installation Log : /tmp/ahf_install_236841_2020_05_08-16_27_47.log
Starting Autonomous Health Framework (AHF) Installation
AHF Version: 20.1.3 Build Date: 202004290950
TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home
Installed TFA Version : 192000 Build ID : 20190426041420
[ERROR] : AHF-00009: AHF Location directory [/oraadm/dba/ahf] not found
[root@servertst02 oraadm]# mkdir -p /oraadm/dba/ahf/data
[root@servertst02 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data
AHF Installer for Platform Linux Architecture x86_64
AHF Installation Log : /tmp/ahf_install_237428_2020_05_08-16_27_59.log
Starting Autonomous Health Framework (AHF) Installation
AHF Version: 20.1.3 Build Date: 202004290950
TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home
Installed TFA Version : 192000 Build ID : 20190426041420
AHF Location : /oraadm/dba/ahf/oracle.ahf
AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data
Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home
Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home
Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst02/tfa_home
Do you want to add AHF Notification Email IDs ? [Y]|N : n
Login using root is disabled in sshd config. Installing AHF only on Local Node
Extracting AHF to /oraadm/dba/ahf/oracle.ahf
Configuring TFA Services
Copying TFA Data Files to AHF
Discovering Nodes and Oracle Resources
Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
.---------------------------------------------------------------------------------.
| Host        | Status of TFA | PID    | Port | Version    | Build ID             |
+-------------+---------------+--------+------+------------+----------------------+
| servertst02 | RUNNING       | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
| servertst01 | RUNNING       | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'
Running TFA Inventory...
Adding default users to TFA Access list...
.--------------------------------------------------------------------.
|                    Summary of AHF Configuration                    |
+-----------------+--------------------------------------------------+
| Parameter       | Value                                            |
+-----------------+--------------------------------------------------+
| AHF Location    | /oraadm/dba/ahf/oracle.ahf                       |
| TFA Location    | /oraadm/dba/ahf/oracle.ahf/tfa                   |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk                |
| Data Directory  | /oraadm/dba/ahf/oracle.ahf/data                  |
| Repository      | /oraadm/dba/ahf/oracle.ahf/data/repository       |
| Diag Directory  | /oraadm/dba/ahf/oracle.ahf/data/servertst02/diag |
'-----------------+--------------------------------------------------'
Starting orachk daemon from AHF ...
AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin
AHF is successfully installed
Moving /tmp/ahf_install_237428_2020_05_08-16_27_59.log to /oraadm/dba/ahf/oracle.ahf/data/servertst02/diag/ahf/
[root@servertst02 oraadm]# ls /oraadm/dba/ahf/oracle.ahf/data/servertst02/
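The AHF-00009 error on node 2 came from the AHF location directory not existing yet. A small pre-check along these lines could create the directories before calling the installer on each node. This is only a sketch: the function name `ensure_ahf_dirs` is made up for illustration, and the paths in the usage comment are the ones used in this post.

```shell
# Hypothetical helper (not part of AHF): make sure the AHF location and data
# directories exist before running ahf_setup, to avoid AHF-00009.
ensure_ahf_dirs() {
    # $1 = AHF location, $2 = data directory
    mkdir -p "$1" "$2" || return 1
    # Succeed only if both paths are now directories.
    [ -d "$1" ] && [ -d "$2" ]
}

# Example with the paths from this post (run as root on each node):
# ensure_ahf_dirs /oraadm/dba/ahf /oraadm/dba/ahf/data &&
#   /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data
```

Since `mkdir -p` does nothing when the directories already exist, the same call can be used unchanged on every node.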
Node 3
[root@servertst03 oraadm]# /oraadm/home/oracrs/ahf_setup -ahf_loc /oraadm/dba/ahf -data_dir /oraadm/dba/ahf/data
AHF Installer for Platform Linux Architecture x86_64
AHF Installation Log : /tmp/ahf_install_129104_2020_05_08-16_31_08.log
Starting Autonomous Health Framework (AHF) Installation
AHF Version: 20.1.3 Build Date: 202004290950
TFA is already installed at : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home
Installed TFA Version : 192000 Build ID : 20190426041420
AHF Location : /oraadm/dba/ahf/oracle.ahf
AHF Data Directory : /oraadm/dba/ahf/oracle.ahf/data
Shutting down TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home
Copying TFA Data Files from /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home
Uninstalling TFA : /oraadm/oracrs/product/19.0.0/tfa/servertst03/tfa_home
Do you want to add AHF Notification Email IDs ? [Y]|N : n
Login using root is disabled in sshd config. Installing AHF only on Local Node
Extracting AHF to /oraadm/dba/ahf/oracle.ahf
Configuring TFA Services
Copying TFA Data Files to AHF
Discovering Nodes and Oracle Resources
Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
.---------------------------------------------------------------------------------.
| Host        | Status of TFA | PID    | Port | Version    | Build ID             |
+-------------+---------------+--------+------+------------+----------------------+
| servertst03 | RUNNING       | 135279 | 5000 | 20.1.3.0.0 | 20130020200429095054 |
'-------------+---------------+--------+------+------------+----------------------'
Running TFA Inventory...
Adding default users to TFA Access list...
.--------------------------------------------------------------------.
|                    Summary of AHF Configuration                    |
+-----------------+--------------------------------------------------+
| Parameter       | Value                                            |
+-----------------+--------------------------------------------------+
| AHF Location    | /oraadm/dba/ahf/oracle.ahf                       |
| TFA Location    | /oraadm/dba/ahf/oracle.ahf/tfa                   |
| Orachk Location | /oraadm/dba/ahf/oracle.ahf/orachk                |
| Data Directory  | /oraadm/dba/ahf/oracle.ahf/data                  |
| Repository      | /oraadm/dba/ahf/oracle.ahf/data/repository       |
| Diag Directory  | /oraadm/dba/ahf/oracle.ahf/data/servertst03/diag |
'-----------------+--------------------------------------------------'
Starting orachk daemon from AHF ...
AHF binaries are available in /oraadm/dba/ahf/oracle.ahf/bin
AHF is successfully installed
Moving /tmp/ahf_install_129104_2020_05_08-16_31_08.log to /oraadm/dba/ahf/oracle.ahf/data/servertst03/diag/ahf/
As you can see, node 2 was able to see the existing installation on node 1, but node 3 was not able to see either of the other nodes.
To resolve the issue, we need to use the syncnodes command:
[root@servertst01 bin]# ./tfactl status
.----------------------------------------------------------------------------------------------------.
| Host        | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst01 | RUNNING       | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE         |
| servertst02 | RUNNING       | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE         |
'-------------+---------------+--------+------+------------+----------------------+------------------'
[root@servertst01 bin]# ./tfactl syncnodes
Login using root is disabled in sshd config. Please enable it or
Please copy these files manually to remote node and restart TFA
1. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/server.jks
2. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/client.jks
3. /oraadm/dba/ahf/oracle.ahf/data/servertst01/tfa/internal/ssl.properties
These files must be owned by root and should have 600 permissions.
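After a manual copy it is easy to end up with the wrong owner or mode, for example when the files travel through a non-root account. The check below is a minimal sketch of how the requirement above could be verified on the remote node. The function name `check_tfa_file` is hypothetical, and it assumes GNU `stat` as found on Linux.

```shell
# Hypothetical helper (not part of TFA): succeed only if the file exists,
# is owned by the expected numeric UID (default 0 = root) and has mode 600.
# Assumes GNU stat (Linux).
check_tfa_file() {
    file="$1"
    expected_uid="${2:-0}"
    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
    # %u = numeric owner UID, %a = octal permission bits
    actual=$(stat -c '%u %a' "$file") || return 1
    if [ "$actual" != "$expected_uid 600" ]; then
        echo "unexpected owner/mode on $file: $actual" >&2
        return 1
    fi
}

# If a check fails, the usual fix as root is:
#   chown root:root <file> && chmod 600 <file>
```

Running the check against each of the three copied files before restarting TFA gives a quick confirmation that the ownership and permission requirement is met.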
I copied the files to the specified location on node 3, then stopped and started TFA there, and after that it was able to see all nodes in the cluster.
[root@servertst03 bin]# ./tfactl statusahf
.----------------------------------------------------------------------------------------------------.
| Host        | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst03 | RUNNING       | 135279 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE         |
'-------------+---------------+--------+------+------------+----------------------+------------------'
orachk scheduler is running [PID: 135279] [Version: 20.1.3]
[root@servertst03 bin]# ./tfactl stop
Stopping TFA from the Command Line
Stopped OSWatcher
Nothing to do !
Killing TFA running with pid 135279
. . .
Successfully stopped TFA..
[root@servertst03 bin]# ./tfactl start
Starting TFA..
Waiting up to 100 seconds for TFA to be started..
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Successfully started TFA Process..
. . . . .
TFA Started and listening for commands
[root@servertst03 bin]# ./tfactl status
.----------------------------------------------------------------------------------------------------.
| Host        | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+-------------+---------------+--------+------+------------+----------------------+------------------+
| servertst03 | RUNNING       | 191886 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE         |
| servertst01 | RUNNING       | 189966 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE         |
| servertst02 | RUNNING       | 249057 | 5000 | 20.1.3.0.0 | 20130020200429095054 | COMPLETE         |
'-------------+---------------+--------+------+------------+----------------------+------------------'
And that solved the issue!
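If you want to script the final check instead of reading the table by eye, something like the following could work. It is a rough sketch that assumes the `tfactl status` table keeps the pipe-separated layout shown in this post. The helper names `tfa_hosts` and `all_nodes_visible` are invented for the example.

```shell
# Hypothetical helper: read `tfactl status` table output on stdin and print
# the Host column, one host per line (border and header lines are skipped).
tfa_hosts() {
    awk -F'|' 'NF > 2 { gsub(/ /, "", $2); if ($2 != "" && $2 != "Host") print $2 }'
}

# Hypothetical helper: $1 is the captured status output, the remaining
# arguments are the hosts that are expected to appear in it.
all_nodes_visible() {
    status_output="$1"
    shift
    for host in "$@"; do
        printf '%s\n' "$status_output" | tfa_hosts | grep -qxF "$host" || {
            echo "not visible: $host" >&2
            return 1
        }
    done
}

# Example, run from the AHF bin directory on each node:
# all_nodes_visible "$(./tfactl status)" servertst01 servertst02 servertst03
```

The check only looks at host names, so it does not replace reading the Status of TFA and Inventory Status columns.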
I hope it helps you!
Elisson Almeida