Adaptive Query Optimization: SQL Plan Directives Causing High CPU after 12c Upgrade

Hello all!

As DBAs we are always being recommended by Oracle and also recommending to clients to update their databases, but we have to be aware about new features and their effects. This is the case of Adaptive Query Optimization and in this particular case on SQL Plan Directives.

SQL Plan Directives are one of the functionalities that compose the Adaptive Query Optimization in Oracle Database 12c. The basic idea is pretty interesting: The SQL Optimizer keeps reviewing bad (“suboptimal”, as they like to say) plans, tipically incorrect cardinality estimations and generates SQL plan directives, like for missing histograms or extended statistics.

In my case, just after the upgrade to 12c (made on Jan 27th), the CPU usage increased for the same report always ran in the database:

12c_upgrade

Ok, how to check it?


Generating an AWR Report on database, was possible to check “latch free” as Top event by DB Time (53%):

Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           Total Wait       Wait   % DB Wait
Event                                Waits Time (sec)    Avg(ms)   time Class
------------------------------ ----------- ---------- ---------- ------ --------
latch free                       4,264,495      38.5K       9.02   53.1 Other
DB CPU                                          30.8K              42.5
enq: TX - row lock contention          332     1144.3    3446.70    1.6 Applicat
control file sequential read     3,337,505      876.5       0.26    1.2 System I
Disk file Mirror Read              944,380      355.6       0.38     .5 User I/O
log file sync                      636,939      292.7       0.46     .4 Commit
reliable message                   471,241        172       0.37     .2 Other
cell single block physical rea     155,890      146.5       0.94     .2 User I/O
PX Deq: Slave Session Stats      1,451,559        102       0.07     .1 Other
log switch/archive                      21       80.5    3834.31     .1 Other

Checking on Latch Missed for this database we can see high activity on “Result Cache: RC Latch”:

Latch Miss Sources             DB/Inst: PRODDB/proddb1  Snaps: 16439-16448
-> only latches with sleeps are shown
-> ordered by name, sleeps desc

                                                     NoWait              Waiter
Latch Name               Where                       Misses     Sleeps   Sleeps
------------------------ -------------------------- ------- ---------- --------
ASM map operation hash t kffmTranslate                    0         13        9
ASM map operation hash t kffmDoDone_1                     0          3        7
KJC message pool free li kjcsmpav: allocate a msg b       0          3        3
OS process allocation    kso_delete_process:1             0          1        1
OS process: request allo kso_new_process                  0          1        1
PC and Classifier lists  No latch                         0          4        0
Result Cache: RC Latch   Result Cache: Serializatio       0  4,266,399 2.19E+06
Result Cache: RC Latch   Result Cache: Serializatio       0        170 2.08E+06
active checkpoint queue  kcbbacq: scan active check       0          4        3
active service list      kswslogon: session logout        0         54       17
active service list      kswslogon: session login         0          4        6
active service list      kswsgetso: get service obj       0          2       29
active service list      kswsgsnp: get service name       0          2        6
active service list      kswsigsn: get service name       0          1        3
active service list      kswssvhid2nam: get service       0          1        0
archive process latch    kcrrasgn                         0          7        0
business card            Delete qc                        0          1        1
cache buffers chains     kcbchg: change complete          0        965        5
cache buffers chains     kcbget: new pin                  0         98       91
cache buffers chains     kclbla                           0         39       52
cache buffers chains     kcbzib: finish processing        0         32        0
cache buffers chains     kclcqchk                         0         16        0
cache buffers chains     kcllkopb_1                       0         15        0
cache buffers chains     kcbgcur: fast path excl          0         10        1
cache buffers chains     kcbgtcr: fast path pin           0          9        6
cache buffers chains     kclwrt                           0          7        1
cache buffers chains     kcb_cleanout: cleanup            0          3        0
cache buffers chains     kcbesc: escalate                 0          3        0
cache buffers chains     kcbget: fast path                0          3       42
cache buffers chains     kcbgtcr: fast path exam          0          3        1

This is the exact scenario described in MOS as per Result Cache: RC Latch’ when RESULT_CACHE_MODE = MANUAL on Oracle 12c (Doc ID 2002089.1). But why it’s using the Result Cache?

As per MOS, when Automatic Dynamic Sampling is used for the SQL statements, it can decide, based upon these statistics, that a better response time could be achieved by using the result cache for those queries. This can cause heavy usage of the result cache leading to the contention on latch free for “Result Cache: RC Latch”.

Doc points as solution to disable result cache usage by the Automatic Dynamic Statistics mechanism, with command:

alter system set "_optimizer_ads_use_result_cache" = FALSE;

Ok, this solved this specific situation, but once your database can be facing some bad effects after upgrading to 12c, specially if related to plan stability which can be a problem for specific workloads, I’d recommend you to disable the whole thing until you be able to test your workload in a non-prod environment, or even stabilize your production as it was in 11g.

In addition, it’s useful to say that this sampling is not influenced by the old parameter optimizer_dynamic_sampling. So, to disable it, I’d suggest:

To disable use and creation of SQL Plan Directives

SQL> ALTER SYSTEM SET "_OPTIMIZER_DSDIR_USAGE_CONTROL"=0 SCOPE=BOTH;
SQL> alter system set "_sql_plan_directive_mgmt_control"=0 scope=both;

To Write last directives from memory to SYSAUX

SQL> exec dbms_spd.flush_sql_plan_directive;

To drop existing directives

begin
for i in (select directive_id from dba_sql_plan_directives where type ='DYNAMIC_SAMPLING')
LOOP
dbms_spd.drop_sql_plan_directive(i.directive_id);
END LOOP;
END;
/

This can be found in MOS Bug 20465582 – High parse time in 12c for multi-table join SQL with SQL plan directives enabled (Doc ID 20465582.8).

And yes, after this my problem was solved…

Hope it helps your strugle with 12c Plan Stability issues. 🙂

See you!

2 Comments

    1. Thank you for the input, David!
      However, by disabling OPTIMIZER_ADAPTIVE_FEATURES you will also disable adaptive plan, automatic re-optimization and adaptive distribution methods.
      In this case I want to disable only SQL plan directives, that is the one is causing bad effects.

      Also, note tha OPTIMIZER_ADAPTIVE_FEATURES is set as obsolete in Oracle 12.2.
      OPTIMIZER_ADAPTIVE_PLANS and OPTIMIZER_ADAPTIVE_STATISTICS will take all functions on this one, exactly to better separate things and avoid changing “more than needed”.
      (some reference here: https://blogs.oracle.com/optimizer/optimizer-adaptive-features-in-oracle-database-12c-release-2)

      However, I agree with you that in case of performance change after upgrade, and if you are not sure on the root cause, maybe change directly OPTIMIZER_ADAPTIVE_FEATURES (in 12.1) would be a lot easier and direct. After this you can isolate the real cause. 🙂

      Thanks!

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.