ORA-04031: unable to allocate 352 bytes of shared memory ("shared pool","unknown object","sga heap(1,0)","krsdicle")

Problem statement

The database was recently upgraded to version 12.1.0.2.
A few days later, it started terminating consistently with ORA-04031 errors.

Excerpts from database alert log
  
 ORA-04031: unable to allocate 384 bytes of shared memory ("shared pool","unknown object","sga heap(1,0)","KGLHD")
 ORA-04031: unable to allocate 768 bytes of shared memory ("shared pool","unknown object","KKSSP^1960","kglss")
 ORA-04031: unable to allocate 352 bytes of shared memory ("shared pool","unknown object","sga heap(1,0)","krsdicle")
 ORA-04031: unable to allocate 384 bytes of shared memory ("shared pool","select o.owner#,o.name,o.nam...","sga heap(1,0)","KGLHD")
 ORA-04031: unable to allocate 1824 bytes of shared memory ("shared pool","update user$ set exptime=DEC...","sga heap(1,0)","KQR X SO")
 ORA-04031: unable to allocate 2504 bytes of shared memory ("shared pool","WITH MONITOR_DATA AS (SELECT...","SQLA^2cef633c","kafco : qkacol")

Diagnosis

We first observed very high shared pool utilization.
Initially, the shared pool size was increased from 1 GB to 4 GB.
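
For reference, a resize of this kind looks like the following (illustrative; the exact syntax depends on spfile usage and the memory management mode in place):

 -- Requires an spfile for SCOPE=BOTH; under ASMM (sga_target > 0)
 -- this value acts as a floor rather than a fixed size.
 ALTER SYSTEM SET shared_pool_size = 4G SCOPE=BOTH SID='*';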

After a couple of days, the database crashed again with similar errors. This time the shared pool size was increased to 7 GB. Despite that change, the database crashed once more.

The shared pool size was then increased further, to 9 GB.

Since increasing the shared pool size multiple times did not help, we started investigating AWR reports.

The AWR reports showed a very high rate of hard parses per second.

Hard Parse Count Per Sec
  
 Load Profile                    Per Second   Per Transaction
 -------------------        ---------------   ----------------
 DB Time(s):                            4.3               0.7
 DB CPU(s):                             4.6               0.7
 User calls:                          636.1              96.8
 Parses (SQL):                      1,377.7             209.7
 Hard parses (SQL):                   906.1             137.9
 Logons:                               28.7               4.4
 Executes (SQL):                    1,503.7             228.9

  
 select BEGIN_TIME,METRIC_NAME,AVERAGE
 from dba_hist_sysmetric_summary
 where METRIC_NAME='Hard Parse Count Per Sec'
 and trunc(BEGIN_TIME) >=trunc(sysdate-60)
 and AVERAGE > 600
 order by 1;
 
 BEGIN_TIME           METRIC_NAME                            AVERAGE
 -------------------- ----------------------------------- -----------
 26-MAR-2018 11:29:50 Hard Parse Count Per Sec            627.073261
 26-MAR-2018 12:29:51 Hard Parse Count Per Sec            601.087453
 28-MAR-2018 11:29:52 Hard Parse Count Per Sec            620.686635
 02-APR-2018 09:29:54 Hard Parse Count Per Sec            622.436011
 02-APR-2018 10:29:54 Hard Parse Count Per Sec            861.500578
 02-APR-2018 11:29:53 Hard Parse Count Per Sec            905.789608
 02-APR-2018 12:29:54 Hard Parse Count Per Sec            835.835878
 02-APR-2018 13:29:54 Hard Parse Count Per Sec            646.823878
 02-APR-2018 14:29:54 Hard Parse Count Per Sec            654.385633
 02-APR-2018 15:29:53 Hard Parse Count Per Sec            645.756677
 03-APR-2018 10:30:14 Hard Parse Count Per Sec            697.683740
 03-APR-2018 11:30:15 Hard Parse Count Per Sec            726.630095
 03-APR-2018 12:29:14 Hard Parse Count Per Sec            640.659254
 04-APR-2018 11:29:15 Hard Parse Count Per Sec            635.936752

The top application queries driving the heavy hard parsing were identified.

Sample Queries performing Hard parsing
  
 col sample_sql_Text for a40
 with d1 as
 (
 select /*+ MATERIALIZE */ force_matching_signature fms, count(distinct(sql_id)) count, sum(executions) executions
 from v$sqlarea
 where force_matching_signature > 0
 group by force_matching_signature
 having count(distinct(sql_id)) >= 5
 )
 select (Select sql_fulltext from v$sqlarea 
         where force_matching_signature=d1.fms 
         and rownum=1) sample_sql_Text, 
         d1.*
 from d1
 order by count desc;
 
 SAMPLE_SQL_TEXT                                           FMS                COUNT           EXECUTIONS
 ---------------------------------------- -------------------- -------------------- --------------------
 select count(*) as rowcount from TAB       595475347031080823                 4937                28552
 where b_filename_type='DP2' and run_da
 te >=trunc(sysdate-3) and b_filename lik
 e trim('% XXXXXXXXXXXXXXXXXXXXX %')
 
 update TABNAME set T4S=229483 where       7308982053163128587                 2811                 2811
 DOC_TYPE='DP3' and STMTDT='31-Mar-20
 18'
 
 update TABNAME set T3S=154728 where        110354220139214304                 2798                 2798
 DOC_TYPE='DP3' and STMTDT='31-Mar-20
 18'

We recommended using bind variables, and the application team agreed to make this change for the affected queries.
However, there were more than 1,500 distinct queries, so converting them all to bind variables within the application code was practically impossible.
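
For illustration, here is roughly how one of the sampled updates looks after conversion (bind names are arbitrary; SQL*Plus syntax):

 -- Literal version: every distinct value forces a fresh hard parse
 --   update TABNAME set T4S=229483 where DOC_TYPE='DP3' and STMTDT='31-Mar-2018'

 variable v_t4s    number
 variable v_stmtdt varchar2(11)
 exec :v_t4s := 229483
 exec :v_stmtdt := '31-Mar-2018'

 -- Bind version: a single shared cursor serves every value combination
 update TABNAME set T4S = :v_t4s
  where DOC_TYPE = 'DP3'
    and STMTDT   = :v_stmtdt;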

We eventually decided to set cursor_sharing=FORCE so that all application queries would use system-generated bind variables.
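
In SQL*Plus, the change itself is a one-liner (the SCOPE and SID values shown are illustrative):

 ALTER SYSTEM SET cursor_sharing = 'FORCE' SCOPE=BOTH SID='*';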
After this change, the system ran fine for almost 10 days, compared with the earlier pattern of a crash every 2-3 days.

We assumed the issue was resolved, but the database crashed again after 10 days.
By now, though, the hard parse count per second was well within an acceptable range.

  
 BEGIN_TIME           METRIC_NAME                            AVERAGE
 -------------------- ----------------------------------- -----------
 15-apr-2018 07:30:19 Hard Parse Count Per Sec            5.45156795
 15-apr-2018 08:30:20 Hard Parse Count Per Sec            8.56563527
 15-apr-2018 09:30:20 Hard Parse Count Per Sec            11.0442935
 15-apr-2018 10:30:20 Hard Parse Count Per Sec            12.9795929
 15-apr-2018 11:29:20 Hard Parse Count Per Sec            14.4070235
 15-apr-2018 12:29:20 Hard Parse Count Per Sec            10.8581472
 15-apr-2018 13:30:20 Hard Parse Count Per Sec            9.08589363
 15-apr-2018 14:30:19 Hard Parse Count Per Sec            8.00785081
 15-apr-2018 15:30:19 Hard Parse Count Per Sec            7.63119506
 15-apr-2018 16:29:20 Hard Parse Count Per Sec            8.80645671
 15-apr-2018 17:29:20 Hard Parse Count Per Sec            8.35716632

We then started analyzing the ORA-04031 trace files to examine shared pool usage statistics.
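The trace directory and the current session's trace file can be located via the standard v$diag_info view:

 select name, value
 from v$diag_info
 where name in ('Diag Trace', 'Default Trace File');
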
The following details were identified in the trace.

Shared Pool Usage statistics from trace file
  
 TOP 20 MAXIMUM MEMORY USES ACROSS SGA HEAP 1 - 4
 -------------------------------------------------
 "krsdicle                  "         12 GB 35%
 "free memory               "         10 GB 31%
 "SQLA                      "       6629 MB 19%
 "KGLH0                     "       2658 MB  8%
 "KGLHD                     "        473 MB  1%
 "PX msg pool               "        269 MB  1%
 "dbktb: trace buffer       "        184 MB  1%
 "KGLDA                     "        141 MB  0%
 "KGLS                      "        111 MB  0%
 
 Memory Utilization of Subpool 1 (SGA HEAP)

 Allocation Name                             Size     Max Size     Chunks
 ------------------------------------------------------------------------
 "free memory                    "     559510696   1418773464       511
 "db_block_hash_buckets          "       4194304      4194304       128
 "parameter value memory         "         37376        37376      2317
 "db_files                       "        808144       808144      1002
 "krsdicle                       "    6296844896    629684489 619189684
 "State object subpools          "          4608         4608        23
 "ksdhng: blkrs cache            "         18176        18176         1

Of all the shared pool components, one stood out, having consumed the largest share of shared pool memory:
"krsdicle"  12 GB  35%
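
The same breakdown can be cross-checked live against v$sgastat, where shared pool allocations (including krsdicle, when present) are listed by name:

 select * from (
   select pool, name, round(bytes/1024/1024) size_mb
   from v$sgastat
   where pool = 'shared pool'
   order by bytes desc
 ) where rownum <= 10;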

Another important observation: an ALTER SYSTEM command appeared in the database alert log over and over, roughly once a minute.

Excerpts from Alert log
  
 ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=MEMORY SID='*'; 
 Wed Apr 25 12:34:22 2018
 ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=MEMORY SID='*'; 
 Wed Apr 25 12:35:22 2018
 ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=MEMORY SID='*'; 
 Wed Apr 25 12:36:22 2018
 ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=MEMORY SID='*'; 

Under normal circumstances, the database should not run such a command continuously.
This pointed to some abnormal behavior on the database side.

So, what led to this?

The 'krsdicle' memory chunk is Data Guard specific and related to redo transport.
This issue was investigated in Bug 23722678: ORA-04031 ON DATAGUARD ENVIRONMENT WITH HIGH ALLOCATION IN 'KRSDICLE', which was closed as not a bug.

The cause of this behavior is an incorrectly configured log_archive_dest_n parameter.
In this particular case, log_archive_dest_2 on the primary database pointed to the same service as log_archive_dest_2 on the standby instance.

This is a configuration issue.
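
A quick way to confirm such a misconfiguration is to compare destination 2 on both databases; the standby's destination must not point back at the standby's own service:

 -- Run on both primary and standby and compare the SERVICE= targets
 select dest_id, status, destination
 from v$archive_dest
 where dest_id = 2;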

Error from Standby DB alert log
  
 ORA-16040: Destination archive log file is locked.
 Wed Apr 25 14:37:39 2018
 TT00: Archive log rejected (T-1.S-56423) at host 'SERVICE_DR' 

Solution Implemented

On the standby database, log_archive_dest_2, which pointed to the same service used by the production database, was deferred:
ALTER SYSTEM SET log_archive_dest_state_2='DEFER';
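
To make the change survive a restart, SCOPE=BOTH can be added. If the destination must be kept for role transitions, a role-aware definition avoids the conflict (the service and db_unique_name below are purely illustrative):

 ALTER SYSTEM SET log_archive_dest_state_2 = 'DEFER' SCOPE=BOTH SID='*';

 -- Hypothetical role-aware definition, active only after a switchover:
 -- log_archive_dest_2 = 'SERVICE=PRIMARY_SVC ASYNC
 --                       VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
 --                       DB_UNIQUE_NAME=primary_db'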

Reference Document
ORA-04031 On Dataguard Environment With High Allocation In 'KRSDICLE' (Doc ID 2233468.1)
