ASM: Disk Size Imbalance Query

These queries can be useful if you work frequently with OEM metrics…

# OEM’s Query

SELECT file_num,
       MAX(extent_count) max_disk_extents,
       MIN(extent_count) min_disk_extents,
       MAX(extent_count) - MIN(extent_count) disk_extents_imbalance
FROM (SELECT number_kffxp file_num, disk_kffxp disk_num, COUNT(xnum_kffxp) extent_count
      FROM x$kffxp
      WHERE group_kffxp = 1
      AND disk_kffxp != 65534
      GROUP BY number_kffxp, disk_kffxp
      ORDER BY number_kffxp, disk_kffxp)
GROUP BY file_num
HAVING MAX(extent_count) - MIN(extent_count) > 5
ORDER BY disk_extents_imbalance DESC;
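
To make the logic of the OEM query easier to follow, here is a minimal Python sketch of the same aggregation. It assumes the inner query has already been run and its rows are available as (file_num, disk_num, extent_count) tuples; the function name and the sample numbers are made up for illustration and do not come from a real diskgroup.

```python
from collections import defaultdict


def extent_imbalance(rows, threshold=5):
    """Mimic the outer query: per file, max and min extents per disk and their difference.

    rows: iterable of (file_num, disk_num, extent_count) tuples (hypothetical sample data).
    Returns (file_num, max_extents, min_extents, imbalance) tuples for files whose
    imbalance is greater than threshold, largest imbalance first.
    """
    per_file = defaultdict(list)
    for file_num, _disk_num, extent_count in rows:
        per_file[file_num].append(extent_count)

    result = []
    for file_num, counts in per_file.items():
        spread = max(counts) - min(counts)
        if spread > threshold:  # same filter as the HAVING clause
            result.append((file_num, max(counts), min(counts), spread))

    result.sort(key=lambda r: r[3], reverse=True)  # same as ORDER BY ... DESC
    return result


# Made-up sample: three files spread over three disks.
sample_rows = [
    (256, 0, 120), (256, 1, 100), (256, 2, 118),
    (257, 0, 40), (257, 1, 42), (257, 2, 41),
    (258, 0, 300), (258, 1, 260), (258, 2, 310),
]
imbalanced = extent_imbalance(sample_rows)
print(imbalanced)
```

Like the query, the sketch only sees disks that hold at least one extent of a file, so a disk with no extents at all for that file does not count as the minimum.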

# Matheus’ Query

select max(free_mb) biggest, min(free_mb) lowest, avg(free_mb) AVG,
trunc(GREATEST ((avg(free_mb)*100/max(free_mb)),(min(free_mb)*100/avg(free_mb))),2)||'%' as balanced,
trunc(100-(GREATEST ((avg(free_mb)*100/max(free_mb)),(min(free_mb)*100/avg(free_mb)))),2)||'%' as inbalanced
from v$asm_disk
where group_number in
(select group_number from v$asm_diskgroup where name = upper('&DG'));

I wrote my own query for two reasons:
1) I didn’t have the OEM query at the time I wrote it.
2) My query measures the imbalance against the average of the disks (how far the disks are from where they would be if every disk were balanced), rather than the actual difference between the disks with the maximum and the minimum usage…
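
As a rough illustration of what my query calculates, the sketch below repeats the same arithmetic in Python on a list of free_mb values. The helper names and the sample values are assumptions for the example only; in practice the numbers would come from v$asm_disk for the chosen diskgroup.

```python
import math


def trunc2(value):
    """Truncate to two decimals, like TRUNC(value, 2) in the query."""
    return math.trunc(value * 100) / 100


def balance_report(free_mb):
    """Repeat the query's arithmetic on a list of per-disk free_mb values."""
    biggest = max(free_mb)
    lowest = min(free_mb)
    avg = sum(free_mb) / len(free_mb)
    # GREATEST(avg*100/max, min*100/avg), exactly as written in the query
    balanced = max(avg * 100 / biggest, lowest * 100 / avg)
    return {
        "biggest": biggest,
        "lowest": lowest,
        "avg": avg,
        "balanced": trunc2(balanced),
        "inbalanced": trunc2(100 - balanced),
    }


# Made-up free space for three disks, in MB.
report = balance_report([1000, 800, 600])
print(report)
```

With these sample values the average is 800, so the two ratios are 80% and 75% and the query would report 80% balanced and 20% inbalanced.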

So, you can choose the one you need… 🙂

Matheus.

GoldenGate: RAC One Node Archivelog Missing

The situation:

We have GoldenGate in ALO (Archived Log Only) mode running some extracts on a RAC One Node database, reading the archivelogs. Then, suddenly, the instance crashed (the network lost contact with the server) and the other instance (thread) was started automatically by CRS. For the database there was no problem: the other node’s redo logs were used during the startup recovery and everything was OK.

The application, running on WebLogic with a server pool and GridLink, had just a little contention and continued operating through the newly started instance. The GoldenGate switch was made manually, but some sequences were lost. What did we find? The sequences were still in the old thread’s redo log files. They would have been archived if fast_start_mttr_target had been set to a non-zero value. Buuut, the world is not so beautiful:

raconenodedb> show parameters mttr
NAME                    TYPE     VALUE
----------------------- -------- -----
fast_start_mttr_target  integer  0

How did we solve it?
Simple solution: we identified the group/thread and copied the redo log out of ASM with cp. The copied redo log was given to GoldenGate as an archivelog and everything was OK.

raconenodedb> select sequence#,group#,thread# from v$log where thread#=2 order by 1;
SEQUENCE# GROUP# THREAD#
---------- ---------- ----------
39636 6 2
39637 7 2
39638 8 2
39639 9 2
39640 10 2
ASMCMD> cp group_10.288.859482805 /oracle/grup10_thread2
copying +DGDATA/MYDB/ONLINELOG/group_10.288.859482805 -> /oracle/grup10_thread2
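
The lookup step can be sketched like this, assuming the v$log rows shown above are available as (sequence#, group#, thread#) tuples. The function name is hypothetical; the rows are the ones from the output above, and sequence 39640 is the one held by the group that was copied.

```python
def find_log_group(log_rows, thread, sequence):
    """Return the online log group# holding the given sequence# for a thread, or None."""
    for seq, group, thr in log_rows:
        if thr == thread and seq == sequence:
            return group
    return None


# Rows from the v$log output above: (sequence#, group#, thread#)
v_log_rows = [
    (39636, 6, 2),
    (39637, 7, 2),
    (39638, 8, 2),
    (39639, 9, 2),
    (39640, 10, 2),
]

group = find_log_group(v_log_rows, thread=2, sequence=39640)
print(group)
```

If the function returns None, the sequence is not in the online logs of that thread any more, and copying a redo log will not help.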

Easy like that.

Matheus.

Adding ASM Disks on RHEL Cluster with Failgroups

# Marking the devices as ASM disks with ASMLib (oracleasm):

1) All cluster nodes: /etc/init.d/oracleasm scandisks
[root@db1host1p ~]# /etc/init.d/oracleasm scandisks
Scanning the system for Oracle ASMLib disks: [ OK ]
[root@db2host2p ~]# /etc/init.d/oracleasm scandisks
Scanning the system for Oracle ASMLib disks: [ OK ]

2) On one of the cluster nodes:
[root@db1host1p ~]# /etc/init.d/oracleasm createdisk DGDATA059 /dev/asmdsk/DGDATA059
Marking disk "DGDATA059" as an ASM disk: [ OK ]
[root@db1host1p ~]# /etc/init.d/oracleasm createdisk DGDATA060 /dev/asmdsk/DGDATA060
Marking disk "DGDATA060" as an ASM disk: [ OK ]
[root@db1host1p ~]# /etc/init.d/oracleasm createdisk DGDATA061 /dev/asmdsk/DGDATA061
Marking disk "DGDATA061" as an ASM disk: [ OK ]
[root@db1host1p ~]# /etc/init.d/oracleasm createdisk DGDATA062 /dev/asmdsk/DGDATA062
Marking disk "DGDATA062" as an ASM disk: [ OK ]
[root@db1host1p ~]# /etc/init.d/oracleasm createdisk DGDATA159 /dev/asmdsk/DGDATA159
Marking disk "DGDATA159" as an ASM disk: [ OK ]
[root@db1host1p ~]# /etc/init.d/oracleasm createdisk DGDATA160 /dev/asmdsk/DGDATA160
Marking disk "DGDATA160" as an ASM disk: [ OK ]
[root@db1host1p ~]# /etc/init.d/oracleasm createdisk DGDATA161 /dev/asmdsk/DGDATA161
Marking disk "DGDATA161" as an ASM disk: [ OK ]
[root@db1host1p ~]# /etc/init.d/oracleasm createdisk DGDATA162 /dev/asmdsk/DGDATA162
Marking disk "DGDATA162" as an ASM disk: [ OK ]

3) All cluster nodes: /etc/init.d/oracleasm scandisks
[root@db1host1p ~]# /etc/init.d/oracleasm scandisks
Scanning the system for Oracle ASMLib disks: [ OK ]
[root@db2host2p ~]# /etc/init.d/oracleasm scandisks
Scanning the system for Oracle ASMLib disks: [ OK ]
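
With many disks, typing each createdisk line by hand gets error-prone. Here is a small sketch that only builds the command strings for review (it does not execute anything). It assumes, as in the session above, that each device lives under /dev/asmdsk with the same name as the ASM disk label; the function name is made up.

```python
def createdisk_commands(disk_names, device_dir="/dev/asmdsk"):
    """Build one 'oracleasm createdisk' command line per disk (strings only, nothing is run)."""
    return [
        f"/etc/init.d/oracleasm createdisk {name} {device_dir}/{name}"
        for name in disk_names
    ]


# Same disk numbers as in the session above.
disks = [f"DGDATA{n:03d}" for n in (59, 60, 61, 62, 159, 160, 161, 162)]
commands = createdisk_commands(disks)
for command in commands:
    print(command)
```

The printed lines can be reviewed and then pasted as root on one node, followed by scandisks on all nodes.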

# Adding the disks to the diskgroup (sqlplus / as sysasm, on the ASM instance)
1) Listing Failgroups
SQL> select distinct failgroup from v$asm_disk where group_number in (select group_number from v$asm_diskgroup where name='DGDATA');
FAILGROUP
----------------------------------------------------
FGMASTER
FGAUX

2) Adding Disks (naming and setting rebalance power):
SQL> alter diskgroup DGDATA
add failgroup FG01 disk
'ORCL:DGDATA059' name DGDATA059,
'ORCL:DGDATA060' name DGDATA060,
'ORCL:DGDATA061' name DGDATA061,
'ORCL:DGDATA062' name DGDATA062
add failgroup FG02 disk
'ORCL:DGDATA159' name DGDATA159,
'ORCL:DGDATA160' name DGDATA160,
'ORCL:DGDATA161' name DGDATA161,
'ORCL:DGDATA162' name DGDATA162
rebalance power 10 nowait;
Diskgroup altered

3) Be patient and wait for the rebalance to finish:
SQL> select * from v$asm_operation;
GROUP_NUMBER OPERATION STATE POWER ACTUAL  SOFAR EST_WORK EST_RATE EST_MINUTES ERROR_CODE
------------ --------- ----- ----- ------ ------ -------- -------- ----------- ----------
           4 REBAL     RUN      10     10 191386   540431     1651         211
           5 REBAL     WAIT      4
SQL> /
GROUP_NUMBER OPERATION STATE POWER ACTUAL  SOFAR EST_WORK EST_RATE EST_MINUTES ERROR_CODE
------------ --------- ----- ----- ------ ------ -------- -------- ----------- ----------
           4 REBAL     RUN      10     10 443438   548118     2345          44
           5 REBAL     WAIT      4
SQL> /
no rows selected
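
If you wonder where EST_MINUTES comes from: in the output above it matches the remaining work (EST_WORK minus SOFAR) divided by EST_RATE. A quick sketch of that arithmetic, using the two group 4 rows from above. The function names are made up, and the estimate is only as good as EST_RATE, which changes while the rebalance runs.

```python
def remaining_minutes(sofar, est_work, est_rate):
    """Remaining allocation units divided by units moved per minute, rounded down."""
    return (est_work - sofar) // est_rate


def percent_done(sofar, est_work):
    """Share of the estimated work already done, in percent."""
    return 100 * sofar / est_work


# SOFAR, EST_WORK, EST_RATE taken from the two v$asm_operation outputs above.
first = remaining_minutes(191386, 540431, 1651)
second = remaining_minutes(443438, 548118, 2345)
print(first, second)
print(round(percent_done(191386, 540431), 1), round(percent_done(443438, 548118), 1))
```

Both results agree with the EST_MINUTES column in the outputs above (211 and 44).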

Well done! 😀
Matheus.

Manually Mounting ACFS

A server rebooted and I needed to remount the ACFS file system where the Oracle Home lives. About that:
Today’s post: Manually Mounting ACFS
Someday’s post: Kludge: Mounting ACFS Through Shell Script
Another day’s post: Auto Mounting Cluster Services Through Oracle Restart

But first, some useful links:
– ACFS Introduction
– ACFS Advanced
– ACFS Command-Line Utilities

# Manually Mounting ACFS
I found that my $ORACLE_HOME (mounted on ACFS) was not available, so the database could not start, and that the ACFS service was down. So, let’s go through the whole process:

# Starting ACFS
[root@db1host1p ~]$ $GRID_HOME/bin/acfsload start -s

# Volume OFFLINE: let’s enable it:
[root@db1host1p ~]$ $GRID_HOME/bin/crsctl stat res -t |grep acfs
ora.dghome.sephome.acfs
ONLINE OFFLINE db1host1p
[root@db1host1p ~]$ su - grid
[grid@db1host1p ~]$ asmcmd
ASMCMD> volinfo -a
Diskgroup Name: DGHOME
Volume Name: LVHOME
Volume Device: /dev/asm/lvhome-270
State: DISABLED
Size (MB): 10240
Resize Unit (MB): 32
Redundancy: MIRROR
Stripe Columns: 4
Stripe Width (K): 128
Usage: ACFS
Mountpath: /oracle/MYDB
ASMCMD> volenable -a
ASMCMD> volinfo -a
Diskgroup Name: DGHOME
Volume Name: LVHOME
Volume Device: /dev/asm/lvhome-270
State: ENABLED
Size (MB): 10240
Resize Unit (MB): 32
Redundancy: MIRROR
Stripe Columns: 4
Stripe Width (K): 128
Usage: ACFS
Mountpath: /oracle/MYDB
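
When there are several volumes, it helps to pick out the disabled ones from the volinfo output. Below is a minimal sketch that parses text in the "Key: value" layout shown above and lists the devices still in State DISABLED. The parser is an assumption based only on the output in this post, not a general asmcmd parser, and the function name is hypothetical.

```python
def parse_volinfo(text):
    """Parse 'Key: value' lines from 'volinfo -a' output into one dict per volume."""
    volumes = []
    diskgroup = None
    current = None
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Diskgroup Name":
            diskgroup = value
        elif key == "Volume Name":
            # A new volume starts here; remember which diskgroup it belongs to.
            current = {"Diskgroup Name": diskgroup, "Volume Name": value}
            volumes.append(current)
        elif current is not None:
            current[key] = value
    return volumes


# The first volinfo output from this post, before volenable.
sample_output = """\
Diskgroup Name: DGHOME
Volume Name: LVHOME
Volume Device: /dev/asm/lvhome-270
State: DISABLED
Size (MB): 10240
Resize Unit (MB): 32
Redundancy: MIRROR
Stripe Columns: 4
Stripe Width (K): 128
Usage: ACFS
Mountpath: /oracle/MYDB
"""

volumes = parse_volinfo(sample_output)
disabled = [v["Volume Device"] for v in volumes if v.get("State") == "DISABLED"]
print(disabled)
```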

[root@db1host1p ~]$ $GRID_HOME/bin/crsctl stat res -t |grep acfs
ora.dghome.sephome.acfs
ONLINE ONLINE db1host1p mounted on /oracle/MYDB
ONLINE ONLINE db2host2p mounted on /oracle/MYDB

# As root, let’s mount it:
[root@db1host1p ~]# mount -t acfs /dev/asm/lvhome-270 /oracle/MYDB

# Then, with the $ORACLE_HOME available:
[oracle@db1host1p ~]$ srvctl start instance -d MYDB -i MYDB001

Matheus.

ASM: Adding disk “_DROPPED%” FORCE

Okey-dokey,
First, let me make it clear: adding a disk with FORCE should be avoided, mainly because of all the rebalancing involved. The best choice, if you have “time”, is to just put the disks back online, like:

1) ALTER DISKGROUP <diskgroup_name> ONLINE DISK <disk_name>; or
2) ALTER DISKGROUP <diskgroup_name> ONLINE DISKS IN FAILGROUP <failgroup_name>; or
3) ALTER DISKGROUP <diskgroup_name> ONLINE ALL;

But this post is about adding dropped disks back to the diskgroup.
To understand my situation, imagine you lost contact with the storage at one of your two sites… In this example, that storage is represented by failgroup FGAUX. You would see the disks like this:

SQL> select name,failgroup,state from v$asm_disk a where state <> 'NORMAL';

NAME FAILGROUP STATE
------------------------------ ------------------------------ --------
_DROPPED_0000_DGDATA FGAUX FORCING
_DROPPED_0001_DGDATA FGAUX FORCING
_DROPPED_0002_DGDATA FGAUX FORCING

So, you know your disks by the name pattern (DGDATA0xx belongs to FGMAIN and DGDATA1xx to FGAUX, the problematic one). You can do something like:

[root@database-host ~]# /etc/init.d/oracleasm listdisks |grep DGDATA
DGDATA001
DGDATA002
DGDATA003
DGDATA101
DGDATA102
DGDATA103
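
The naming rule can be written down as a tiny sketch. It assumes the convention used in this environment only (first digit after DGDATA: 0 for FGMAIN, 1 for FGAUX); the function name is made up and other sites will have their own patterns.

```python
import re

# Site-specific convention from this post: DGDATA0xx -> FGMAIN, DGDATA1xx -> FGAUX
FAILGROUP_BY_DIGIT = {"0": "FGMAIN", "1": "FGAUX"}


def failgroup_for(disk_name):
    """Return the failgroup implied by the disk name, following the convention above."""
    match = re.fullmatch(r"DGDATA(\d)\d{2}", disk_name)
    if match is None or match.group(1) not in FAILGROUP_BY_DIGIT:
        raise ValueError(f"unexpected disk name: {disk_name}")
    return FAILGROUP_BY_DIGIT[match.group(1)]


# Disk labels from the listdisks output above.
listed = ["DGDATA001", "DGDATA002", "DGDATA003", "DGDATA101", "DGDATA102", "DGDATA103"]
fgaux_disks = [name for name in listed if failgroup_for(name) == "FGAUX"]
print(fgaux_disks)
```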

Now, keep it simple… 🙂

SQL> ALTER DISKGROUP DGDATA ADD
FAILGROUP FGAUX
DISK
'ORCL:DGDATA101' name DGDATA101 FORCE,
'ORCL:DGDATA102' name DGDATA102 FORCE,
'ORCL:DGDATA103' name DGDATA103 FORCE;

Diskgroup altered.
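
For longer disk lists, the statement can be generated instead of typed. Here is a sketch that only builds the SQL text in the same shape as the statement above; the function name is hypothetical, it assumes ASMLib paths of the form ORCL:<label>, and the result should be reviewed before running it as sysasm.

```python
def add_disks_statement(diskgroup, failgroup, disk_names, force=False):
    """Build an ALTER DISKGROUP ... ADD FAILGROUP ... DISK statement as text (nothing is executed)."""
    suffix = " FORCE" if force else ""
    disk_lines = ",\n".join(
        f"'ORCL:{name}' name {name}{suffix}" for name in disk_names
    )
    return (
        f"ALTER DISKGROUP {diskgroup} ADD\n"
        f"FAILGROUP {failgroup}\n"
        f"DISK\n"
        f"{disk_lines};"
    )


statement = add_disks_statement(
    "DGDATA", "FGAUX", ["DGDATA101", "DGDATA102", "DGDATA103"], force=True
)
print(statement)
```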

SQL> ALTER DISKGROUP DGDATA rebalance power 8;

Diskgroup altered.

While waiting for the rebalance, let’s see the disks in the DG:

SQL> select * from v$asm_operation where group_number=(select group_number from v$asm_diskgroup where name='DGDATA');

GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES ERROR_CODE
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- ----------- --------------------------------------------
3 REBAL WAIT 8
SQL> select name,failgroup,state from v$asm_disk a where group_number=(select group_number from v$asm_diskgroup where name='DGDATA');

NAME FAILGROUP STATE
------------------------------ ------------------------------ --------
_DROPPED_0000_DGDATA FGAUX FORCING
_DROPPED_0001_DGDATA FGAUX FORCING
_DROPPED_0002_DGDATA FGAUX FORCING
DGDATA101 FGAUX NORMAL
DGDATA102 FGAUX NORMAL
DGDATA103 FGAUX NORMAL
DGDATA001 FGMAIN NORMAL
DGDATA002 FGMAIN NORMAL
DGDATA003 FGMAIN NORMAL

And, when the rebalance ends, the situation will be OK:

SQL> select * from v$asm_operation where group_number=(select group_number from v$asm_diskgroup where name='DGDATA');

GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES ERROR_CODE
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- ----------- --------------------------------------------
3 REBAL RUN 8 8 629 19087 10143 1

SQL> select * from v$asm_operation where group_number=(select group_number from v$asm_diskgroup where name='DGDATA');

no rows selected

SQL> select name,failgroup,state from v$asm_disk a where group_number=(select group_number from v$asm_diskgroup where name='DGDATA');

NAME FAILGROUP STATE
------------------------------ ------------------------------ --------
DGDATA101 FGAUX NORMAL
DGDATA102 FGAUX NORMAL
DGDATA103 FGAUX NORMAL
DGDATA001 FGMAIN NORMAL
DGDATA002 FGMAIN NORMAL
DGDATA003 FGMAIN NORMAL

OK? Easy! 😀

Matheus.