Hi all,
Another one for our series about Autonomous Health Framework:
Alongside AHF we get several tools we can take advantage of, and one of them is OSWatcher. OSWatcher is a utility that captures performance metrics from the operating system using native OS tools for I/O, network, CPU, memory, etc.
It takes periodic snapshots of your system and stores them in an archive directory, which you can then parse to perform a system-wide analysis.
You can stop it if you want; out of the box it gathers OS information every 30 seconds.
To see if it is running you can use tfactl toolstatus as below
[root@servertst01 bin]# ./tfactl toolstatus
.------------------------------------------------------------------.
|                TOOLS STATUS - HOST : servertst01                 |
+----------------------+--------------+--------------+-------------+
| Tool Type            | Tool         | Version      | Status      |
+----------------------+--------------+--------------+-------------+
| Development Tools    | orachk       | 19.3.0.0.0   | DEPLOYED    |
|                      | oratop       | 14.1.2       | DEPLOYED    |
+----------------------+--------------+--------------+-------------+
| Support Tools Bundle | darda        | 2.10.0.R6036 | DEPLOYED    |
|                      | oswbb        | 8.3.2        | RUNNING     |
|                      | prw          | 12.1.13.11.4 | NOT RUNNING |
+----------------------+--------------+--------------+-------------+
| TFA Utilities        | alertsummary | 19.3.0.0.0   | DEPLOYED    |
|                      | calog        | 19.3.0.0.0   | DEPLOYED    |
|                      | dbcheck      | 18.3.0.0.0   | DEPLOYED    |
|                      | dbglevel     | 19.3.0.0.0   | DEPLOYED    |
|                      | grep         | 19.3.0.0.0   | DEPLOYED    |
|                      | history      | 19.3.0.0.0   | DEPLOYED    |
|                      | ls           | 19.3.0.0.0   | DEPLOYED    |
|                      | managelogs   | 19.3.0.0.0   | DEPLOYED    |
|                      | menu         | 19.3.0.0.0   | DEPLOYED    |
|                      | param        | 19.3.0.0.0   | DEPLOYED    |
|                      | ps           | 19.3.0.0.0   | DEPLOYED    |
|                      | pstack       | 19.3.0.0.0   | DEPLOYED    |
|                      | summary      | 19.3.0.0.0   | DEPLOYED    |
|                      | tail         | 19.3.0.0.0   | DEPLOYED    |
|                      | triage       | 19.3.0.0.0   | DEPLOYED    |
|                      | vi           | 19.3.0.0.0   | DEPLOYED    |
'----------------------+--------------+--------------+-------------'
Note :-
  DEPLOYED    : Installed and Available - To be configured or run interactively.
  NOT RUNNING : Configured and Available - Currently turned off interactively.
  RUNNING     : Configured and Available.
[root@servertst01 bin]#
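If you want to check the OSWatcher status from a script instead of reading the table by eye, something like the sketch below could work. It assumes the pipe-delimited layout shown in the output above; oswbb_status is a helper name I made up for illustration, it is not part of tfactl.

```shell
# Hypothetical helper: reads `tfactl toolstatus` output on stdin and prints
# the Status column of the oswbb row. Assumes the table layout shown above.
oswbb_status() {
  awk -F'|' '
    {
      tool = $3; status = $5
      gsub(/^[ \t]+|[ \t]+$/, "", tool)     # trim padding around the tool name
      gsub(/^[ \t]+|[ \t]+$/, "", status)   # trim padding around the status
      if (tool == "oswbb") { print status; found = 1 }
    }
    END { exit (found ? 0 : 1) }            # non-zero exit if no oswbb row was seen
  '
}

# Usage (not run here):
#   ./tfactl toolstatus | oswbb_status
```

The function only parses text, so it does not change anything in the AHF installation.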
To run a simple test you can call tfactl oswbb, which should parse all the data in its archive directory. BUT when I tried it, I ran into a Java out-of-memory error.
I bumped the memory of the process up to 4G and still got the same error.
If you want to try increasing the process memory, you will need to edit the file oswbb.pm under oracle.ahf/tfa/ext/oswbb.
You will see a line like the one below, and you will need to change the -Xmx value to one that your system can handle. I'm not saying it will not work for you, so you can try.
system("$java -Xmx512M -jar $oswjar -i $adir @flags");
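If you prefer not to edit the file by hand, a rough sketch of the same change with sed is below. set_osw_heap is a hypothetical helper name, and the sketch assumes the line in your oswbb.pm looks like the one above (an -Xmx value followed by -jar $oswjar); check the file first, since other AHF versions may differ.

```shell
# Hypothetical helper: set the -Xmx value on the line that launches the
# analyzer jar. Assumes the line has the form shown above.
set_osw_heap() {
  file=$1     # full path to oswbb.pm in your AHF install
  size=$2     # new heap size, e.g. 2048M or 4G
  case $size in
    ''|*[!0-9MmGg]*) echo "invalid heap size: $size" >&2; return 1 ;;
  esac
  # Keeps the original as <file>.bak and touches only the "-jar $oswjar" line.
  sed -i.bak '/-jar \$oswjar/ s/-Xmx[0-9][0-9]*[MmGgKk]/-Xmx'"$size"'/' "$file"
}

# Usage (not run here; replace the argument with the real path to the file):
#   set_osw_heap oswbb.pm 4096M
```

The .bak copy makes it easy to put the original value back if the bigger heap does not help.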
Here is the error:
[root@servertst01 bin]# ./tfactl oswbb
Starting OSW Analyzer V8.3.0
OSWatcher Analyzer Written by Oracle Center of Expertise
Copyright (c) 2019 by Oracle Corporation
Parsing Data. Please Wait...
Scanning file headers for version and platform info...
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at u.c(Unknown Source)
        at u.a(Unknown Source)
        at OSWGraph.OSWGraph.main(Unknown Source)
Analysis results are saved in /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/oswbb/root/oswbb
And here is the process memory, which was a bit over 4G when it died:
[root@servertst01 ~]$ ps -eo rss,pid,euser,lstart,args:100 --sort %mem | grep -v grep | grep java | awk '{printf $1/1024 "MB"; $1=""; print }'| sort |grep osw
4364.3MB 103531 root Wed May 13 18:09:12 2020 /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/jre/bin/java -Xmx4096M -jar /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/oswbb/root/oswbb/oswbba.jar -i /oraadm/oracrs/product/19.0.0/ahf/oracle.ahf/data/repository/suptools/servertst01/oswbb/oracrs/archive
I had over 4 days of data there, so by using the options -B and -E to limit the time range I was able to work around the issue. If you are doing some troubleshooting I would not advise reading 4 days of data at once anyway, as we would be looking at averages, and a spike could be smoothed out and not show up in the analysis.
So it did not bother me that much.
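To make the point about averages concrete, here is a tiny sketch with made-up numbers (this is not real OSWatcher data, and the helper names are mine): a CPU busy value of 10 with a short burst of 10 samples at 100, averaged first over a long range and then over just the burst.

```shell
# Mean of the numbers on stdin, one per line.
mean() { awk '{ s += $1 } END { if (NR) printf "%.2f\n", s / NR }'; }

# Hypothetical samples: $1 values in total, the first $2 of them at 100,
# the rest at a baseline of 10.
samples() {
  awk -v n="$1" -v burst="$2" \
    'BEGIN { for (i = 1; i <= n; i++) print (i <= burst ? 100 : 10) }'
}

# Usage:
#   samples 11520 10 | mean    # long range, prints 10.08
#   samples 10 10 | mean       # only the burst, prints 100.00
```

Over the long range the burst barely moves the average, which is why a narrow -B/-E window around the problem is usually more useful.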
Here when passing the date range I was able to execute the process as needed.
[root@servertst01 bin]# ./tfactl oswbb -6 -B May 13 09:25:00 2020 -E May 13 09:30:00 2020
Validating times in the archive...
Starting OSW Analyzer V8.3.0
OSWatcher Analyzer Written by Oracle Center of Expertise
Copyright (c) 2019 by Oracle Corporation
Parsing Data. Please Wait...
Scanning file headers for version and platform info...
Parsing file servertst01_pidstat_20.05.13.0900.dat ...
Parsing file servertst01_iostat_20.05.13.0900.dat ...
This directory already exists. Rewriting...
Parsing file servertst01_vmstat_20.05.13.0900.dat ...
Parsing file servertst01_netstat_20.05.13.0900.dat ...
Parsing file servertst01_top_20.05.13.0900.dat ...
Parsing file servertst01_ps_20.05.13.0900.dat ...
Parsing Completed.
Enter 1 to Display CPU Process Queue Graphs
Enter 2 to Display CPU Utilization Graphs
Enter 3 to Display CPU Other Graphs
Enter 4 to Display Memory Graphs
Enter 5 to Display Disk IO Graphs
Enter 61 to Display Individual OS Process I/O RPS Graphs
Enter 62 to Display Individual OS Process I/O WPS Graphs
Enter 63 to Display Individual OS Process Percent User CPU Graphs
Enter 64 to Display Individual OS Process Percent System CPU Graphs
Enter 65 to Display Individual OS Process Percent Total CPU (User + System) Graphs
Enter 66 to Display Individual OS Process Percent Memory Graphs
Enter GP to Generate Individual Process Profile
Enter GC to Generate All CPU Gif Files
Enter GM to Generate All Memory Gif Files
Enter GD to Generate All Disk Gif Files
Enter GN to Generate All Network Gif Files
Enter L to Specify Alternate Location of Gif Directory
Enter Z to Zoom Graph Time Scale (Does not change analysis dataset)
Enter B to Returns to Baseline Graph Time Scale (Does not change analysis dataset)
Enter R to Remove Currently Displayed Graphs
Enter X to Export Parsed Data to Flat File
Enter S to Analyze Subset of Data(Changes analysis dataset including graph time scale)
Enter A to Analyze Data
Enter D to Generate DashBoard
Enter Q to Quit Program
Please Select an Option:
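Typing those timestamps by hand is easy to get wrong, so here is a small sketch that builds the -B and -E arguments in the same format used in the command above. osw_window is a hypothetical helper name, and the sketch assumes GNU date (for -d and @epoch), which is what you normally have on Linux.

```shell
# Hypothetical helper: prints "-B <start> -E <end>" in the timestamp format
# used above (e.g. May 13 09:25:00 2020). Assumes GNU date.
osw_window() {
  start=$1      # window start, e.g. "2020-05-13 09:25:00" (local time)
  minutes=$2    # window length in minutes
  fmt='+%b %d %H:%M:%S %Y'
  start_epoch=$(date -d "$start" +%s) || return 1
  end_epoch=$((start_epoch + minutes * 60))
  # LC_ALL=C keeps the month abbreviation in English regardless of locale.
  printf -- '-B %s -E %s\n' \
    "$(LC_ALL=C date -d "@$start_epoch" "$fmt")" \
    "$(LC_ALL=C date -d "@$end_epoch" "$fmt")"
}

# Usage (not run here; left unquoted on purpose so each word is its own argument):
#   ./tfactl oswbb -6 $(osw_window "2020-05-13 09:25:00" 5)
```

The end time is computed from the epoch value rather than with a "+ 5 minutes" date string, because GNU date can read a "+5" after a time as a time zone offset.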
Hope it helps!
Elisson Almeida