Oracle memory usage on Linux / Unix

Hi all,

So one of the most important things that we need to do when setting up a new server or checking the capacity of the server is to see how much memory Oracle is using.

When checking the capacity there are some practical things that always help me to get a fast glimpse of the system:

  • When opening topas and hitting M you will see this below
Topas Monitor for host: SERVER1 Interval: 2 Sat Dec 8 03:39:59 2019
================================================================================
REF1 SRAD TOTALMEM INUSE FREE FILECACHE HOMETHRDS CPUS
--------------------------------------------------------------------------------
0 0 59.8G 59.6G 212.3 16.3G 528 0-15
1 1 61.4G 61.2G 188.8 15.7G 536 16-31

On the memory session you will see 3 categories, INUSE, FREE and FILECACHE. There you may see what is being using for what but there is not much granularity there.

  • When using top you have this summary below
top - 11:48:08 up 119 days, 10:18, 1 user, load average: 26.76, 26.16, 25.95
Tasks: 1936 total, 38 running, 1898 sleeping, 0 stopped, 0 zombie
Cpu(s): 79.3%us, 1.1%sy, 0.0%ni, 15.1%id, 4.3%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 263750172k total, 219075656k used, 44674516k free, 797476k buffers
Swap: 16773116k total, 505760k used, 16267356k free, 88055108k cached

Same you have a high level usage. So here comes the question:

How are you to prove that you have a memory shortage?

I often use vmstat on Linux looking on the columns si and so equals to 0 (swap in and swap out) and when the free command, the free column you will also have no or very low swap being used

/home/oracle> vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
15 3 505760 44896608 797480 88062288 0 0 7037 1304 0 0 29 2 61 8 0
16 1 505760 44922964 797480 88062320 0 0 432272 144314 38784 31348 41 2 52 5 0
14 2 505760 44943072 797480 88062320 0 0 468904 155424 32676 27522 34 1 60 5 0
15 2 505760 44943032 797480 88062328 0 0 431032 144275 32596 27469 34 1 60 5 0
15 2 505760 44920136 797480 88062352 0 0 396232 145052 30772 26657 32 1 62 6 0
19 1 505760 44928576 797480 88062360 0 0 429360 160158 33640 28012 36 1 58 5 0
15 3 505760 44935340 797480 88062368 0 0 477232 161849 28393 21423 41 1 53 5 0
17 1 505760 44924744 797480 88062368 0 0 515265 160212 27478 20578 40 1 54 5 0
16 1 505760 44921596 797480 88062368 0 0 495408 159304 25458 19548 37 1 58 5 0
18 1 505760 44918144 797480 88062384 0 0 552880 168895 28203 22774 38 1 56 5 0
15 2 505760 44922344 797480 88062392 0 0 546920 160463 25321 19151 37 1 58 5 0
16 4 505760 44921544 797480 88062400 0 0 571544 153810 25429 20011 36 1 58 5 0
16 1 505760 44919620 797480 88062400 0 0 577552 160004 27132 20111 40 1 54 5 0
19 2 505760 44360240 797480 88062400 0 0 584969 155553 29467 22145 41 2 52 5 0
/home/oracle> free
total used free shared buffers cached
Mem: 263750172 219060896 44689276 91608 797480 88062464
-/+ buffers/cache: 130200952 133549220
Swap: 16773116 505760 16267356

To check a process specific memory usage (RSS) I often use ps along with other commands to calculate the process memory for a specific process id as below:

/home/oracle> ps -eo rss,pid,euser,lstart,args:100 --sort %mem | grep -v grep | grep 35796 | awk '{printf $1/1024 "MB"; $1=""; print }'| sort
19.6016MB 35796 oracle Sat Sep 8 02:43:54 2018 ora_lg00_ORC1
34.957MB 32340 oracle Sat Jan 5 11:50:09 2019 oracleORC1 (LOCAL=NO)

RSS is resident memory, but when comes to shared memory like the Oracle SGA the methods above could be miss leading – not to say wrong – but as Oracle memory is shared we may see double counting on the results. I sometimes use pmap to check a process memory as well when available

/home/oracle> pmap 35796
35796: ora_lg00_ORC1
total 0K

But, still when checking a server wide scope, do you want to keep doing manual work and lots of math? I don’t think so. 🙂

That’s why when I came across SMEM made my life a lot easier. It is a python script which gives you a nice breakdown of the memory usage and without the miss leading double counting.

You can see the commands and processes and their memory:

[root@srv01 smem-1.4]# ./smem -trk | head
PID User Command Swap USS PSS RSS
4829 root /opt/stackdriver/collectd/s 444.0K 4.0G 4.0G 4.0G
5647 oracle asm_gen0_+ASM 50.1M 424.4M 425.0M 437.8M
16512 oracle rman software/product/11.2. 0 172.9M 173.7M 177.8M
85107 oracle ora_n001_db01 42.3M 147.8M 147.8M 185.8M
85103 oracle ora_n000_db01 42.4M 146.5M 146.6M 184.6M
85109 oracle ora_n002_db01 42.2M 145.6M 145.6M 183.5M
85111 oracle ora_n003_db01 42.1M 145.1M 145.2M 183.1M
7287 oracle ora_dia0_db01 1.6M 68.6M 68.8M 107.8M

As well the overall server per user:

root@srv01 smem-1.4]# ./smem -turk 
User Count Swap USS PSS RSS oracle 1358 4.8G 7.8G 8.0G 76.6G 
root 43 12.0M 4.1G 4.1G 4.2G user1 10 0 321.0M 328.0M 369.2M 
nobody 2 96.0K 2.1M 2.3M 6.0M user2 2 0 684.0K 1.7M 7.7M 
user4 2 0 632.0K 1.7M 7.9M user4 1 72.0K 536.0K 540.0K 2.1M 
ntp 1 424.0K 332.0K 368.0K 2.4M 
smmsp 1 1.3M 160.0K 298.0K 1.9M 
rpc 1 336.0K 68.0K 73.0K 1.7M 
rpcuser 1 808.0K 4.0K 16.0K 1.9M 
--------------------------------------------------- 
1422 4.8G 12.2G 12.5G 81.3G

Hope it helps, see you next time!

Amazon EC2: X11 Forwarding After Sudo SSH Session

Hello all!

So, now with more use of resources like Cloud servers, more and more silent instalations are being done, right? Myself, I do it in silent always I can.

What if I need to export X. Plus, if I need to export it from user oracle, but I can only login with ec2-user, as usual?

Here is the process for that:

1) Connect to AWS EC2 instance

[user@securehost ~]$ ssh -X ec2-user@ipaddress
Last login: Fri Dec 7 14:41:41 2018 from grepora.srv.com
__| __|_ )
 _| ( / Amazon Linux AMI
 ___|\___|___|
https://aws.amazon.com/amazon-linux-ami/2018.03-release-notes/
13 package(s) needed for security, out of 16 available
Run "sudo yum update" to apply all updates.

2) Test xclock works from ec2-user

[ec2-user@ipaddress ~]$ xclock
Warning: Missing charsets in String to FontSet conversion
^C

3) Show all magic cookie

[ec2-user@ipaddress ~]$ xauth list
ipaddress/unix:12 MIT-MAGIC-COOKIE-1 7e53e7600ff4177d7bbc66bde0a1b1ca
ipaddress/unix:11 MIT-MAGIC-COOKIE-1 e3d1a8915484c929ef3e809b047e6352
ipaddress/unix:10 MIT-MAGIC-COOKIE-1 07b3de3093cef835c19239ea952231b7

4) Show DISPLAY variable

[ec2-user@ipaddress ~]$ env|grep DISPLAY
DISPLAY=localhost:10.0

5) Create /tmp/xauth based on current DISPLAY variable

[ec2-user@ipaddress ~]$ xauth list | grep unix`echo $DISPLAY | cut -c10-12` > /tmp/xauth
[ec2-user@ipaddress ~]$ ll /tmp/xauth ; cat /tmp/xauth 
-rw-rw-r-- 1 ec2-user ec2-user 78 Dec 7 14:47 /tmp/xauth
ipaddress/unix:10 MIT-MAGIC-COOKIE-1 07b3de3093cef835c19239ea952231b7

6) Sudo to oracle

[ec2-user@ipaddress ~]$ sudo su - oracle
Last login: Fri Dec 7 14:43:12 UTC 2018 on pts/0

7) Add and Verify xauth

[oracle@ipaddress ~]$ xauth add `cat /tmp/xauth`
[oracle@ipaddress ~]$ xauth list
ipaddress/unix:10 MIT-MAGIC-COOKIE-1 07b3de3093cef835c19239ea952231b7

8) Verify and Add DISPLAY variable

[oracle@ipaddress ~]$ env|grep DISPLAY
[oracle@ipaddress ~]$ export DISPLAY=localhost:10.0

9) Test xclock works from oracle

[oracle@ipaddress ~]$ xclock
Warning: Missing charsets in String to FontSet conversion
^C
[oracle@ipaddress ~]$

Now you should be able to see the clock and so other graphical resources, like DBCA and so on.

Hope it helps!

Even Better Script: Map ASM Disks to Physical Devices

Enjoyed last week post?

Cool, because looking further on the subject I found this pretty similar post, by Mohammad Nazmul Huda.

The additional script there is actually not working in my server, but the idea is great. So, I did just some small adjustments and it’s working pretty fine now:

# asm_report.sh (Adjusted by Matheus):

printf "\n%-15s %-14s %-11s %-7s\n" "ASM disk" "based on" "Minor,Major" "Size (MB)"
printf "%-15s %-14s %-11s %-7s\n" "===============" "=============" "===========" "========="
export ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome1
for i in `/usr/sbin/oracleasm listdisks`
do
v_asmdisk=`/usr/sbin/oracleasm querydisk -d $i | awk '{print $2}'| sed 's/\"//g'`
v_minor=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $1}'`
v_major=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $2}'`
v_device=`ls -la /dev | awk -v v_minor="$v_minor," -v v_major=$v_major '{if ( $5==v_minor ) { if ( $6==v_major ) { print $10}}}'`
v_size_bt=`blockdev --getsize64 /dev/${v_device}`
v_size=`expr $v_size_bt / 1024 / 1024`
Total_size=`expr $Total_size + $v_size`
Formated_size=`echo $v_size | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'`
printf "%-15s %-14s %-11s %-7s\n" $v_asmdisk "/dev/$v_device" "[$v_minor $v_major]" $Formated_size
done
Formated_Total_size=`echo $Total_size | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'`
printf "\nTotal (MB): %43s\n\n" $Formated_Total_size

Ok, and how it works?
[root@greporasrv ~]# sh asm_report.sh

ASM disk        based on      Minor,Major Size (MB)
=============== ============= =========== =========
DATA01          /dev/sdg1     [8 97]       255,999
DATA02          /dev/sdh1     [8 113]      255,999
DATA03          /dev/sdi1     [8 129]      255,999
DATA04          /dev/sdj1     [8 145]      255,999
FRA01           /dev/sdk1     [8 161]      307,199

Total (MB): 1,331,195

Even better, right?

Cheers!

Script: Map ASM Disks to Physical Devices

Hey all!
So, I had to map a couple ASM disks to physical devices. But it’s not direct, which causes some manual work.

To save me from this, I found this great post by Alejandro Vargas, with a very nice script to make this mapping easier.

I found however, it was done for RHEL/OEL 6 and older, and I’m in OEL7. So I did some small changes to adapt it.

Anyway, decided to share as this is a great script to have handy. 🙂

# Alejandro’s script (RHEL/OEL 6 and older):

#!/bin/ksh
for i in `/etc/init.d/oracleasm listdisks`
do
v_asmdisk=`/etc/init.d/oracleasm querydisk $i | awk  '{print $2}'`
v_minor=`/etc/init.d/oracleasm querydisk $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk '{print $1}'`
v_major=`/etc/init.d/oracleasm querydisk $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk '{print $2}'`
v_device=`ls -la /dev | grep $v_minor | grep $v_major | awk '{print $10}'`
echo "ASM disk $v_asmdisk based on /dev/$v_device  [$v_minor $v_major]"
done

# Adjustments by Matheus (RHEL/OEL7):

#!/bin/ksh
for i in `/usr/sbin/oracleasm listdisks`
do
v_asmdisk=`/usr/sbin/oracleasm querydisk -d $i | awk '{print $2}'`
v_minor=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $1}'`
v_major=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $2}'`
v_device=`ls -la /dev | grep $v_minor | grep $v_major | awk '{print $10}'`
echo "ASM disk $v_asmdisk based on /dev/$v_device [$v_minor $v_major]"
done

# Example of execution:

[root@greporasrv]$ for i in `/usr/sbin/oracleasm listdisks`
> do
> v_asmdisk=`/usr/sbin/oracleasm querydisk -d $i | awk '{print $2}'`
> v_minor=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $1}'`
> v_major=`/usr/sbin/oracleasm querydisk -d $i | awk -F[ '{print $2}'| awk -F] '{print $1}' | awk -F, '{print $2}'`
> v_device=`ls -la /dev | grep $v_minor | grep $v_major | awk '{print $10}'`
> echo "ASM disk $v_asmdisk based on /dev/$v_device [$v_minor $v_major]"
> done
ASM disk "DATA01" based on /dev/sdg1 [8 97]
ASM disk "DATA02" based on /dev/sdh1 [8 113]
ASM disk "DATA03" based on /dev/sdi1 [8 129]
ASM disk "DATA04" based on /dev/sdj1 [8 145]
ASM disk "FRA01" based on /dev/sdk1 [8 161]

Hope you enjoy it like I did.
Cheers!

/bin/rm: cannot execute [Argument list too long]

Hey all!

Just a quickie and useful thing today. How many times you found this?

/bin/rm: cannot execute [Argument list too long]

Ok, so, first thing: Is it related to Oracle logs? If so, you may want to ADCRI. Check this post for more info: ADRCI Retention Policy and Ad-Hoc Purge Script for all Bases.

If not, you may solve this using find with rm. Ok, but want to keep the most recent files?

Some examples for you, removing audit files:

# Remove older then 1 day:

find /oracle/greporadb/admin/greporadb/adump -name "*.aud" -mtime +1 -exec rm {} \;

# Remove older then 1 hour:

find /oracle/greporadb/admin/greporadb/adump -name "*.aud" -cmin +60 -exec rm {} \;

Continue reading

Tips to Improve your Oracle DB Shellscripting!

There are some small tricks to speed up your shellscripts for Oracle DBs. I’m pretty sure you can do almost magic only with those:

# To get all OS Users with an Oracle Instance in server:

ps -ef | grep ora_pmon | grep -v grep| awk -F_ '{ print $1}'|cut -f1 -d" "|sort -u

# To get all DBSids in server:

ps -ef | grep ora_pmon | grep -v grep | awk -F_ '{ print $3}'

# To get all DBNames in server:

ps -ef | grep ora_pmon | grep -v grep | awk -F_ '{ print substr($3, 1, length($3)-1)}'

# To get Oracle Home from /etc/oratab for a database:

grep "dbname:" /etc/oratab|cut -d: -f2 -s

# Looping for any of the lists from commands above:

for DBUSER|DBNAME|DBINST in `$Comand_From_ListAbove` do [...] done

# Example:
Take DBInstance and Oracle Home for Each DBName on Each OSUser:

for DBUSER in `ps -ef | grep ora_pmon | grep -v grep| awk -F_ '{ print $1}'|cut -f1 -d" "|sort -u`
do
for DBNAME in `ps -ef | grep ora_pmon | grep -v grep |grep ${DBUSER} | awk -F_ '{ print substr($3, 1, length($3)-1)}'`
do
DBSID=`ps -ef | grep ora_pmon | grep -i ${DBNAME^^}| awk -F_ '{ print $3}'`
DBOH=`grep "^${DBNAME,,}:" /etc/oratab|cut -d: -f2 -s`
done
done

# Variable Upper Case or Lower Case:

For uppercase: ${DBNAME^^}
For lowercase: ${DBNAME,,}

# Putting Colors on your messages:

RED='\033[0;31m'
GREEN='\033[0;32m'
PURPLE='\033[0;35m'
NC='\033[0m' # No Color
echo -e "Status: ${GREEN}All Good!${NC}";
echo -e "Status: ${RED}ERROR!${NC}"
echo -e "Status: ${PURPLE}Do you like messages in purple?${NC}"

# Conditional: Is there any line with a specific content?

if [`cat myfile.log | grep "text_I_want"| wc -l |awk '{print $1}'` -gt 0 ]; then
echo -e "Count Greater Then Zero: ${GREEN}Line Exists!${NC}";
else echo -e "Count equals to Zero: ${RED}No line with this text...${NC}"
fi;

# Some other related posts:
Shellscript & Oracle: Run Script for a List of Databases
Shellscript: Using eval and SQLPlus
Recursive string change
Grepping Entries from Alert.log
Grepping Alert by Day
Shellscript: Master Blaster KB!

Anything I forgot to put here? Tell me with a comment!
Cheers!

Disk Space is not Released After Deleting Files

When deleting a large files, the file is deleted successfully but the size of the filesystem does not reflect the change.
The files was be deleted, but only restarting the jvm or java process released the disk space.
Usually occurs with log files.

The lsof command show the files opened in system.
For example: lsof |grep deleted

java 15138 oracle 2959r REG 253,3 5875027 131163 /logs/soa_domain/WLS1_SOA1/WLS1_SOA1.out03422 (deleted)
java 15138 oracle 3054r REG 253,3 10480928 131166 /logs/soa_domain/WLS1_SOA1/WLS1_SOA1-diagnostic-81.log (deleted)
java 15138 oracle 3062r REG 253,3 10479417 131200 /logs/soa_domain/WLS1_SOA1/WLS1_SOA1-diagnostic-82.log (deleted)

The command output shows pid, owner, file descriptor (fd), size and file.

Workaround:
If can’t restart the process, it is possible to force the system to de-allocate the space consumed by an in-use file by forcing the system to truncate the file.
$ echo > /proc/pid/fd/fd_number

Be careful not to truncate unwanted files.

In my case:
$ echo > /proc/15138/fd/2959
$ echo > /proc/15138/fd/3054
$ echo > /proc/15138/fd/3062

That’s all for today.

Shellscript & Oracle: Run Script for a List of Databases

Hey all!
Quick tip/script for today: How to run same script for a list of databases quickly?

In my case I have same password/user in all databases, but in case you haven’t you can make a similar awk command to retrieve users and passwords from a file.

The list:

$: cat /tmp/dbs.cfg
db01
db02
db03
testdb

The script:

for DBSID in ${*-$(awk -F: '!/^#/ {print $1}' /tmp/dbs.cfg}
do
    print "
        connect user/password@${DBSID}
        @script_to_run.sql
        exit " |
    sqlplus /nolog >> /tmp/output_test.log
done

Hope it helps!
Cheers!

Verifying topology of Exadata

Hey all!
Some time ago I needed to check topology of a client’s Exadata due a network issue and made a very useful note. Sharing with you now. 😀

This and other cool commands can be found here: Network Diagnostics information for Oracle Database Machine Environments (Doc ID 1053498.1)

# /opt/oracle.SupportTools/ibdiagtools/verify-topology -t quarterrack

Newer versions don’t require -t option.
In case of halfrack, -t halfrack should be used in my case.

Ok, but how to know it? You can have it from here:

[root@greporaexa onecommand]# grep -i MACHINETYPES databasemachine.xml
[MACHINETYPES]X4-2 Eighth Rack HC 4TB[/MACHINETYPES]

Hope it helps! 🙂

Scheduler Job for OS Audit Trail Cleanup Routine

Hello all!
As you all know, most of OS files generated by oracle, like traces and logs can be cleared/managed using ADRCI. However, more than once, I saw filesystem get full of Audit Trail files.
In general, I see DBAs and companies implementing shell scripts to house keep those files. Usually some variation of the find with rm I posted some weeks ago. However, we have a very good “official” solition for this, using the DBMS_AUDIT_MGMT.

Basically we can create Scheduler Jobs in our databases to keep track on this. The advantages are seveal. The job clear in both nodes, this keep all Oracle information correctly updated, you may receive OEM notification for failure if monitoring jobs, etc.
You can see more detaiils here in MOS Note 731908.1 New Feature DBMS_AUDIT_MGMT to Manage and Purge Audit Information.

How I did?

1. Initial Cleanup to clear files older than a week.
2. Set last archive to a week ago.
3. Created Purge Job, purging older than a week

As per:

Continue reading