OEM 12c+: Prevent OMS from Sending Old Notifications for Events

Are you receiving old notifications from OEM? Like 2 or 3 days past, mostly already solved or after a blackout?
Yeah, this is annoying, specially when getting floods and floods of notifications.

Ok, so here go a very good tip: You can set grace period for notifications! 🙂
Easy easy, do this way:

cd /bin
emctl set property -name oracle.sysman.core.notification.grace_period -value [provide value in minutes]

The oracle.sysman.core.notification.grace_period OMS parameter has been introduced in 12c and allows the user to configure the grace period within which the notification should be sent. The value is set in minutes.

For example:

emctl set property -name oracle.sysman.core.notification.grace_period -value 1440

With this, OMS sends only those notifications which have been raised in the past 1440 mins (last 24 hours) and ignores all the notifications for events / incidents created prior to this time period.

After this, you’ll need to start OMS:

emctl start oms

The oracle.sysman.core.notification.grace_period parameter applies to all the Notification methods, but if the requirement is to specify the grace period for a particular notification method only, you can use the below parameters accordingly:

oracle.sysman.core.notification.grace_period_connector: For Connectors
oracle.sysman.core.notification.grace_period_email: For email notifications
oracle.sysman.core.notification.grace_period_oscmd: For OS Command notifications
oracle.sysman.core.notification.grace_period_plsql: For PLSQL notifications
oracle.sysman.core.notification.grace_period_snmp: For SNMP Trap notifications
oracle.sysman.core.notification.grace_period_ticket: For ticketing tools

This is weel described as per MOS: 12c Cloud control: How to Prevent OMS from Sending out Old Notifications for Events / Incidents Occurred in the Past? (Doc ID 1605351.1)

Cheers!

EMagent 12c with high number of threads causing “su: cannot set user id: Resource temporarily unavailable”

Hello all,
Yeah, these days I got some errors and when validating the server found the fllowing error:

su: cannot set user id: Resource temporarily unavailable

As you can imagine, in order to fix the issue, I adjusted the /etc/security/limits.conf increasing oracle nprocs to:

oracle           soft    nproc           4047
oracle           hard    nproc           20384

Ok, turns out that after a while I got the same errors again…
After some investigating I find that the EM Agent process was with 5020 threads!
Take a look:

More“EMagent 12c with high number of threads causing “su: cannot set user id: Resource temporarily unavailable””

Metric evaluation error start – Get dynamic property error,Dynamic Category property error,Computation of a critical dynamic property failed. Retries Completed.

Hi all,
Getting this error? OMS 12.1.0.3?
Ok… After an investigation on MOS and also a SR opened with Oracle is was pointed to relation with Bug 17575631 CLUSTER DB HOME PAGE ERRORS- REGIONS FETCH REAL TIME DATA WILL NOT BE DISPLAYED on Database Plugin.
It’s solved in OMS 12.1.0.4 and also in 12.1.0.5.1 EM Database Plugin Bundle patch.

 I’ll show you how to workaround it. 🙂

More“Metric evaluation error start – Get dynamic property error,Dynamic Category property error,Computation of a critical dynamic property failed. Retries Completed.”

EM Repository: ORA-00060: Deadlock detected error in alert log

Hello!
Some time ago I found some ORA-00060: Deadlock detected. errors in a client OEM database… Like this:

Thu Dec 22 09:01:55 2016
ORA-00060: Deadlock detected. More info in file /oracle/oemdb/diag/rdbms/oemdb/oemdb/trace/oemdb_ora_1757.trc.
Thu Dec 22 09:02:07 2016
ORA-00060: Deadlock detected. More info in file /oracle/oemdb/diag/rdbms/oemdb/oemdb/trace/oemdb_ora_1759.trc.
ORA-00060: Deadlock detected. More info in file /oracle/oemdb/diag/rdbms/oemdb/oemdb/trace/oemdb_ora_1759.trc.

In summary, after investigating the trace (as per below), found that the issueis caused by the following command:

More“EM Repository: ORA-00060: Deadlock detected error in alert log”

Infiniband Error: Cable is present on Port “X” but it is polling for peer port

Facing this error? Let me guess: Ports 03, 05, 06, 08, 09 and 12 are alerting? You have a Quarter Rack? Have recently installed Exadata plugin to version 12.1.0.3 or higher?
Don’t panic!

This is probably related to Bug 15937297 : EM 12C HAS ERRORS CABLE IS PRESENT ON PORT ‘N’ BUT IT IS POLLING FOR PEER PORT. The full message might be like “Cable is present on Port 6 but it is polling for peer port. This could happen when the peer port is unplugged/disabled“.

In fact, the bug was closed as not a bug. 🙂
As part of the 12.1.0.3 Exadata plugin, the IB switch ports are now checked for non-terminated cables. So these errors ‘polling for peer port’ are the expected behavior.  Once ‘polling for peer port’ is an enhanced feature of the 12.1.0.3 plugin, this explains why you most likely did not see these errors until you upgraded the OMS to 12.1.0.2 and then updated the plugins.

In Quarter Racks, the following ports 3, 5, 6, 8, 9 and 12 are usually cabled ahead of time, but not terminated. In some racks port 32 may also be unterminated. Checking for incident in OEM you might see something like this image:

newscreenshot-2016-12-26-as-20-03-50

More“Infiniband Error: Cable is present on Port “X” but it is polling for peer port”

Saving database space with ASSM

It’s good way reclaim WASTED space from tables and index using  the Segment Advisor.

To perform an database reclaim procedure using Automatic Segment Space Management (ASSM) it is preferred to create tablespaces with below option:

grepdb> CREATE TABLESPACE HR
DATAFILE '+GREPORADG/'
SIZE 10M EXTENT MANAGEMENT LOCAL
SEGMENT SPACE MANAGEMENT AUTO;

Only tablespaces with segment space auto are eligible to Segment Advisor.

To manually run the Segment Advisor on OEM:

guid-65f07e4f-0482-47df-bdf9-8d34b625093a-default

It will save some database storage area, and make it more effective cause by LHWM/HHWM.

Maiquel.

ASM: Disk Size Imbalance Query

It can be useful if you work frequently with OEM metrics…

# OEM’s Query

SELECT file_num, MAX(extent_count) max_disk_extents, MIN(extent_count)
min_disk_extents
, MAX(extent_count) - MIN(extent_count) disk_extents_imbalance
FROM (SELECT number_kffxp file_num, disk_kffxp disk_num, COUNT(xnum_kffxp)
extent_count
FROM x$kffxp
WHERE group_kffxp = 1
AND disk_kffxp != 65534
GROUP BY number_kffxp, disk_kffxp
ORDER BY number_kffxp, disk_kffxp)
GROUP BY file_num
HAVING MAX(extent_count) - MIN(extent_count) > 5
ORDER BY disk_extents_imbalance DESC;

# Matheus’ Query

select max(free_mb) biggest, min(free_mb) lowest, avg(free_mb) AVG,
trunc(GREATEST ((avg(free_mb)*100/max(free_mb)),(min(free_mb)*100/avg(free_mb))),2)||'%' as balanced,
trunc(100-(GREATEST ((avg(free_mb)*100/max(free_mb)),(min(free_mb)*100/avg(free_mb)))),2)||'%' as inbalanced
from v$asm_disk
where group_number in
(select group_number from v$asm_diskgroup where name = upper('&DG'));

I made my own query for two reasons:
1) I didn’t have the OEM query in the time i made it.
2) My query measures the imbalance with the avg of the disks (if every disk would balanced, how would be the difference), rather than the real/present difference between the disk with the maximum and the minimum usage…

So, you can chose the one you need… 🙂

Matheus.

Service Detected on OEM but not in SRVCTL or SERVICE_NAMES Parameter?

Okey, it happens.
To me, after a database moving from a cluster to another. The service was registered by SRVCTL in the old cluster but is not needed. So, was not registered in the new cluster.
But OEM insists to list, for example, the “service3” as offline. The problem is that you can not remove it by SRVCTL, because you had not registered, right? See the example below:

Listing services:
srvdatabase1:/home/oracle>srvctl status service -d systemdb
Service service1_systemdb is running on nodes: srvdatabase1
Service service2 is running on nodes: srvdatabase1
Service service2_systemdb is running on nodes: srvdatabase1

In the service_name parameter:
srvdatabase1:/home/oracle>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 8 15:21:00 2015
Copyright (c) 1982, 2009, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> show parameters service;
NAME                                 TYPE
------------------------------------ --------------------------------
VALUE
------------------------------
service_names                        string
service2,test,systemdb

And the offline alarm goes to “service3“?
The easiest fix:

SQL> exec dbms_service.DELETE_SERVICE('service3');
PL/SQL procedure successfully completed.

Matheus.