Enterprise Manager

EM Event: Number of failed login attempts exceeds threshold value [Part 1]

By matheusdba | March 15, 2017

Hello!
Just got this error? It is very simple to check. Below is an easy solution using audit:

Enabling auditing of failed logins:

SQL> audit session whenever not successful;

Audit succeeded.

Then wait some time and check:

More: “EM Event: Number of failed login attempts exceeds threshold value [Part 1]”

Database, Enterprise Manager, Errors and Bugs

EM Repository: ORA-00060: Deadlock detected error in alert log

By matheusdba | March 1, 2017 | March 19, 2017

Hello!
Some time ago I found some “ORA-00060: Deadlock detected” errors in a client's OEM database, like this:

Thu Dec 22 09:01:55 2016
ORA-00060: Deadlock detected. More info in file /oracle/oemdb/diag/rdbms/oemdb/oemdb/trace/oemdb_ora_1757.trc.
Thu Dec 22 09:02:07 2016
ORA-00060: Deadlock detected. More info in file /oracle/oemdb/diag/rdbms/oemdb/oemdb/trace/oemdb_ora_1759.trc.
ORA-00060: Deadlock detected. More info in file /oracle/oemdb/diag/rdbms/oemdb/oemdb/trace/oemdb_ora_1759.trc.
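
As a first triage step, it can help to see which trace files the alert log points to and how often each one is reported. This is a minimal sketch, assuming the alert log lines have the same shape as the excerpt above; the function name `deadlock_traces` is hypothetical, and the alert log path is whatever your environment uses:

```shell
# List each distinct trace file referenced by ORA-00060 lines in an alert log,
# prefixed by the number of times it was reported.
# Hypothetical helper; pass the path of your own alert log as the argument.
deadlock_traces() {
  grep 'ORA-00060' "$1" \
    | grep -o '/[^ ]*\.trc' \
    | sort \
    | uniq -c
}
```

Called as `deadlock_traces /path/to/alert.log`, it would list `oemdb_ora_1757.trc` once and `oemdb_ora_1759.trc` twice for the three lines shown above.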

In summary, after investigating the trace (as shown below), I found that the issue is caused by the following command:

More: “EM Repository: ORA-00060: Deadlock detected error in alert log”

Enterprise Manager

Target MGMTLSNR or MGMTDB Down in OEM

By matheusdba | February 8, 2017 | December 31, 2016

Hello!
After discovering your new 12c targets, are you seeing the MGMT targets with status Down, similar to the image below?

[Image: mgmt_targets]

And when you check on the server, everything looks OK, right?

More: “Target MGMTLSNR or MGMTDB Down in OEM”

Cloud Services | Exadata | Linux, Enterprise Manager, Oracle Engineered Systems | Exadata

Infiniband Error: Cable is present on Port “X” but it is polling for peer port

By matheusdba | January 25, 2017 | April 21, 2020

Facing this error? Let me guess: ports 03, 05, 06, 08, 09 and 12 are alerting? You have a Quarter Rack? Have you recently upgraded the Exadata plugin to version 12.1.0.3 or higher?
Don’t panic!

This is probably related to Bug 15937297: EM 12C HAS ERRORS CABLE IS PRESENT ON PORT ‘N’ BUT IT IS POLLING FOR PEER PORT. The full message might look like “Cable is present on Port 6 but it is polling for peer port. This could happen when the peer port is unplugged/disabled”.

In fact, the bug was closed as not a bug. 🙂
As part of the 12.1.0.3 Exadata plugin, the IB switch ports are now checked for non-terminated cables, so these ‘polling for peer port’ errors are expected behavior. Since the ‘polling for peer port’ check is an enhancement introduced in the 12.1.0.3 plugin, you most likely did not see these errors until you upgraded the OMS to 12.1.0.2 and then updated the plugins.

In Quarter Racks, ports 3, 5, 6, 8, 9 and 12 are usually cabled ahead of time but not terminated. In some racks port 32 may also be unterminated. Checking the incidents in OEM, you might see something like this image:

[Image: newscreenshot-2016-12-26-as-20-03-50]

More: “Infiniband Error: Cable is present on Port “X” but it is polling for peer port”

Enterprise Manager, Errors and Bugs

RS-7445 [Serv MS leaking memory] [It will be restarted] [] [] [] [] [] [] [] [] [] []

By matheusdba | January 18, 2017 | April 21, 2020

Hello!
Seeing this error in the cell alerthistory.log? Don’t panic!
Take a look in MOS: Exadata Storage Cell reports error RS-7445 [Serv MS Leaking Memory] (Doc ID 1954357.1). It’s related to Bug – RS-7445 [SERV MS LEAKING MEMORY].

The issue is a memory leak in the Java executable and affects systems running with JDK 7u51 or later versions. This is relevant for all versions in Release 11.2 to 12.1.

What happens is that the MS process consumes too much memory. Normally MS uses around 1GB, but because of the bug the allocated memory can grow up to 2GB. You can check it as in the example below:

[root@exaserver ~]# ps -feal|grep java
0 S root     16493 14737  0  80   0 - 15317 pipe_w 18:34 pts/0    00:00:00 grep java
0 S root     22310 27043  2  80   0 - 267080 futex_ 18:15 ?       00:00:27 /usr/java/default/bin/java -Xms256m -Xmx512m -XX:-UseLargePages -Djava.library.path=/opt/oracle/cell/cellsrv/lib -Ddisable.checkForUpdate=true -jar /opt/oracle/cell/oc4j/ms/j2ee/home/oc4j.jar -out /opt/oracle/cell/cellsrv/deploy/log/ms.lst -err /opt/oracle/cell/cellsrv/deploy/log/ms.err

Note that 267080 pages * 4096 bytes = 1,093,959,680 bytes, which is about 1043MB (roughly 1GB). If your number is higher than this, it indicates the presence of the bug.
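
The conversion above can be sketched as a small shell helper. It assumes the SZ column of `ps -l` is expressed in memory pages and that pages are 4096 bytes (you can confirm the page size with `getconf PAGESIZE`); the function name `sz_pages_to_mib` is hypothetical:

```shell
# Convert a process size given in memory pages (the SZ column of `ps -l`)
# to whole MiB. The page size defaults to 4096 bytes, which is an assumption;
# pass the real value as the second argument if it differs.
sz_pages_to_mib() {
  local pages=$1 page_size=${2:-4096}
  echo $(( pages * page_size / 1024 / 1024 ))
}
```

For the process shown above, `sz_pages_to_mib 267080` prints `1043`.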

More: “RS-7445 [Serv MS leaking memory] [It will be restarted] [] [] [] [] [] [] [] [] [] []”

Enterprise Manager, MySQL, Shell Scripts

Getting today’s Errors and Warnings from MySQL log

By matheusdba | April 15, 2016 | April 21, 2020

Quick one!

# Warnings

grep "$(date +%y%m%d)" /var/log/mysqld.log | grep "\[Warning\]"

# Errors

grep "$(date +%y%m%d)" /var/log/mysqld.log | grep "\[ERROR\]"

And a bonus!
To get entries from X days ago (46 in this example):

grep "$(date --date="46 days ago" +%y%m%d)" /var/log/mysqld.log
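
If you run these often, the three commands can be folded into one small function. This is only a sketch: it assumes GNU `date` and log lines stamped in the same YYMMDD form the commands above match (if your log uses a different timestamp format, adjust the `date` format string). The function name and argument order are hypothetical:

```shell
# Print log entries of a given level from N days ago (0 = today).
# Arguments: level (e.g. Warning, ERROR), days ago (default 0),
# log file (default /var/log/mysqld.log).
# Assumes GNU date and a YYMMDD date on each log line.
mysql_log_entries() {
  local level=$1 days_ago=${2:-0} logfile=${3:-/var/log/mysqld.log}
  local day
  day=$(date --date="${days_ago} days ago" +%y%m%d)
  # First filter by date, then by the literal "[level]" tag.
  grep -- "$day" "$logfile" | grep -F "[${level}]"
}
```

For example, `mysql_log_entries ERROR 46` would show the errors logged 46 days ago.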

Matheus.

Database, Enterprise Manager, MySQL

Monitoring MySQL with Nagios – Quick View

By matheusdba | April 7, 2016 | April 21, 2020

Hi all!
As you know, there are some commercial solutions for monitoring and alerting on MySQL, like MySQL Enterprise Monitor or Oracle Grid/Cloud Control.

But given that we are using MySQL instead of Oracle Database, we can assume the decision was probably based on cost. So, considering open source solutions, we basically have Nagios, Zabbix, OpenNMS…

[Image: MangagedMonitoringConsole]

Thinking of Nagios, in my opinion the best of the best is mysql_health_check.pl.
Below are the white paper, presentation and code:
White Paper
Presentation
Code
A good one by Sheeri Cabral, posted here!

Anyway, with these two we can do lots of magic:

1. check_mysql.pl
– Checks the status of the MySQL server (slow queries, etc.)
– Queries per second graph

2. check_db_query.pl
– Allows running SQL queries and setting thresholds for warning and critical. Example:

check_db_query.pl -d database -q query [-w warn] [-c crit] [-C conn_file] [-p placeholder]

Example Nagios command definition:

define command{
command_name    check_db_entries
command_line    /usr/local/bin/perl $USER1$/check_db_query.pl -d "$ARG1$" -q "$ARG2$" $ARG3$
}
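
To make the warning/critical idea concrete, here is a small sketch of how a numeric query result is typically turned into a Nagios status. It follows the usual Nagios plugin exit-code convention (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). It is an illustration only, not the logic of check_db_query.pl itself, and the function name `nagios_status` is hypothetical:

```shell
# Map a non-negative integer result to a Nagios status line and exit code,
# given warning and critical thresholds (higher values are worse).
# Illustration of the convention only; not taken from check_db_query.pl.
nagios_status() {
  local value=$1 warn=$2 crit=$3
  case $value in
    ''|*[!0-9]*) echo "UNKNOWN - non-numeric result: $value"; return 3 ;;
  esac
  if [ "$value" -ge "$crit" ]; then
    echo "CRITICAL - value $value >= $crit"; return 2
  elif [ "$value" -ge "$warn" ]; then
    echo "WARNING - value $value >= $warn"; return 1
  fi
  echo "OK - value $value"; return 0
}
```

With thresholds of 10 and 20, `nagios_status 15 10 20` prints `WARNING - value 15 >= 10` and returns 1, which is the code Nagios reads to set the service state.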

So now you just need to write your queries and implement your free MySQL monitoring! 🙂
Matheus.

Enterprise Manager, GoldenGate

GoldenGate Plugin for EM13c!

By matheusdba | March 16, 2016

Released!
Take a look here.

Quick post just to tell you that. 🙂

ASM, Database, Enterprise Manager, SQL | PLSQL

ASM: Disk Size Imbalance Query

By matheusdba | July 1, 2015 | April 21, 2020

It can be useful if you work frequently with OEM metrics…

# OEM’s Query

SELECT file_num,
       MAX(extent_count) max_disk_extents,
       MIN(extent_count) min_disk_extents,
       MAX(extent_count) - MIN(extent_count) disk_extents_imbalance
FROM (SELECT number_kffxp file_num,
             disk_kffxp disk_num,
             COUNT(xnum_kffxp) extent_count
      FROM x$kffxp
      WHERE group_kffxp = 1
      AND disk_kffxp != 65534
      GROUP BY number_kffxp, disk_kffxp
      ORDER BY number_kffxp, disk_kffxp)
GROUP BY file_num
HAVING MAX(extent_count) - MIN(extent_count) > 5
ORDER BY disk_extents_imbalance DESC;

# Matheus’ Query

select max(free_mb) biggest, min(free_mb) lowest, avg(free_mb) AVG,
trunc(GREATEST ((avg(free_mb)*100/max(free_mb)),(min(free_mb)*100/avg(free_mb))),2)||'%' as balanced,
trunc(100-(GREATEST ((avg(free_mb)*100/max(free_mb)),(min(free_mb)*100/avg(free_mb)))),2)||'%' as imbalanced
from v$asm_disk
where group_number in
(select group_number from v$asm_diskgroup where name = upper('&DG'));

I made my own query for two reasons:
1) I didn’t have the OEM query at the time I wrote it.
2) My query measures the imbalance against the average free space of the disks (that is, how far the disks are from the state where every disk is balanced), rather than the actual difference between the disks with the maximum and the minimum usage…
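
To see what the second query computes, the same arithmetic can be reproduced outside the database. This is a sketch with made-up free_mb values; the function name `asm_free_balance` is hypothetical, and it assumes at least one disk and non-zero values (otherwise the divisions fail, as they would in the SQL):

```shell
# Reproduce the "balanced" and "imbalanced" percentages of the second query
# for a list of per-disk free_mb values passed as arguments.
# Values are truncated to 2 decimals, like trunc(..., 2) in the SQL.
asm_free_balance() {
  printf '%s\n' "$@" | awk '
    function trunc2(x) { return int(x * 100) / 100 }
    NR == 1 { min = max = $1 }
    { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
    END {
      avg = sum / NR
      a = avg * 100 / max              # average relative to the largest free_mb
      b = min * 100 / avg              # smallest free_mb relative to the average
      balanced = (a > b) ? a : b       # GREATEST(a, b)
      printf "balanced=%.2f%% imbalanced=%.2f%%\n", trunc2(balanced), trunc2(100 - balanced)
    }'
}
```

For three disks with 100, 80 and 60 MB free, `asm_free_balance 100 80 60` prints `balanced=80.00% imbalanced=20.00%`: the average (80) is 80% of the maximum, the minimum is 75% of the average, and the query keeps the greater of the two.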

So, you can choose the one you need… 🙂

Matheus.

Database, Enterprise Manager, Errors and Bugs

Service Detected on OEM but not in SRVCTL or SERVICE_NAMES Parameter?

By matheusdba | June 9, 2015 | January 31, 2017

Okay, it happens.
It happened to me after moving a database from one cluster to another. The service was registered with SRVCTL in the old cluster, but it was no longer needed, so it was not registered in the new cluster.
But OEM insists on listing, for example, “service3” as offline. The problem is that you cannot remove it with SRVCTL, because it was never registered there, right? See the example below:

Listing services:
srvdatabase1:/home/oracle>srvctl status service -d systemdb
Service service1_systemdb is running on nodes: srvdatabase1
Service service2 is running on nodes: srvdatabase1
Service service2_systemdb is running on nodes: srvdatabase1

In the service_names parameter:
srvdatabase1:/home/oracle>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 8 15:21:00 2015
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> show parameters service;
NAME                                 TYPE
------------------------------------ --------------------------------
VALUE
------------------------------
service_names                        string
service2,test,systemdb

And the offline alarm is raised for “service3”?
The easiest fix:

SQL> exec dbms_service.DELETE_SERVICE('service3');
PL/SQL procedure successfully completed.

Matheus.