CARDS Application EOD stuck with “library cache pin”

Problem Statement

Application team informed us about the regular EOD jobs not progressing as expected.

Diagnosis

Application team was executing EOD operations. Basis their observation, the job progress had stagnated. We were roped in to troubleshoot this issue and figure out what engendered this issue.

When we initially looked at database, EOD sessions were waiting for “library cache pin”.
Below excerpts from session waits describe the same.

  
    SID EVENT                                           P1  P1RAW            STATE                SQL_ID        BLOCKING_SESSION 
 ------ ------------------------- ------------------------  ---------------- -------------------- ------------- ----------------
   3847 Disk file operations I/O                         8  0000000000000008 WAITED SHORT TIME    c6quqbusxxxxx
        -------------------------
                                1
   5311 db file sequential read                       2373  0000000000000945 WAITING              5nv8stdvxxxxx
   5958                                               3451  0000000000000D7B WAITING              1dckrmkgxxxxx
   8812                                               1757  00000000000006DD WAITED SHORT TIME    a4a4r7cpxxxxx
   2660                                               3771  0000000000000EBB WAITING              5m3t39ppxxxxx
   6761                                               3215  0000000000000C8F WAITED SHORT TIME    2t5nfwhqxxxxx
        -------------------------
                                5
    987 library cache pin              8070450556365212160  700000059D808A00 WAITING              2by0bj2fxxxxx
   3245                               12682136575630453440  B0000005CF712AC0 WAITING              g0bggfqrxxxxx
   4270                               12682136575630453440  B0000005CF712AC0 WAITING              85gnympxxxxxx             7238
        -------------------------
                                3

As evident from above output, we could see that blocking session was “7238”
Application team confirmed that it was their queueing query and can be killed.

Sample Query
  
 SQL_ID        SQL_TEXT                                 PLAN_HASH_VALUE EXECUTIONS AVG_TIME_SEC    AVG_LIO    AVG_ROW 
 ------------- ---------------------------------------- --------------- ---------- ------------ ---------- ----------
 85gnympxxxxxx Begin :fResult := XXXXXX.XXXXXXXXXXXXXXX               0          1   23205.3102  727922023          2
               XX.XXXXXXXXXXXXXXXXXX( 1234,1,'q_xxxxxxx
               xxxxx'); End;

Post killing the session, the expected relief did not come our way and we were baffled to see another HOLDER. This seemed to be a vicious cycle of killing such a session and getting trapped with another HOLDER session.

Later, Application team confirmed that it was an application job which shall keep respawning upon killing.
At this stage we were not aware as to why this query was blocking and holding EOD process over “library cache pin”.

Witnessing a hang like situation, We decided to take HANGANALZE dump to diagnose it further.

oradebug setmypid
oradebug unlimit
oradebug hanganalyze 4
Hang Analysis in /oradumps/dbname/diag/diag/rdbms/dbname/INSTNAME/trace/INSTNAME_ora_29001.trc

Below Chains most likely to have caused the hang:

 
 [a] Chain 1 Signature: 'wait for unread message on broadcast channel'<='library cache pin' 
     Chain 1 Signature Hash: 0xcd5dcbea
 [b] Chain 2 Signature: 'control file sequential read'
     Chain 2 Signature Hash: 0x258411b1
 [c] Chain 3 Signature: 
     Chain 3 Signature Hash: 0x673a0128

We further looked upon more details over Chain 1.

 
 -------------------------------------------------------------------------------
 Chain 1:
 -------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (dbname.instname)
                   os id: 23105
              process id: 264, oracle@hostname
              session id: 5188
        session serial #: 46497
    }
    is waiting for 'library cache pin' with wait info:
    {
                      p1: 'handle address'=0xb0000005cf712ac0
                      p2: 'pin address'=0x40000004541b9180
                      p3: '100*mode+namespace'=0x46eb00020003
            time in wait: 10 min 40 sec
           timeout after: 4 min 19 sec
                 wait id: 21
                blocking: 0 sessions
             current sql: BEGIN dbms_lock.sleep(60); END;
             short stack: ksedsts()+528<-ksdxcb()+960<-sspuser()+216<-__sighndlr()+12<-call_user_handler()+1028<-sigacthandler()+224
             <-_portfs()+8<-port_get()+28<-sskgpwwait()+1076<-skgpwwait()+180<-ksliwat()+1984<-kslwaitctx()+148<-kjusuc()+9160
             <-ksipgetctxia()+9396<-ksixpgetia()+180<-kqlmPin()+2996<-kglpnal()+5016<-kglpin()+1272<-kqlvld()+1308<-kglgob()+2812
             <-kgiinb()+1172<-pfri7_inst_body_common()+248<-pfri3_inst_body()+28<-pfrrun()+2520<-plsql_run()+560<-peicnt()+256
             <-kkxexe()+744<-opiexe()+22648<-kpoal8()+2384<-opiodr()+1100<-ttcpip() 
            wait history:
              * time between current wait and wait #1: 0.000110 sec
              1.       event: 'library cache revalidation'
                 time waited: 1.000049 sec
                     wait id: 20               p1: ' '=0x0
                                               p2: ' '=0x0
                                               p3: ' '=0x0
              * time between wait #1 and #2: 0.000056 sec
              2.       event: 'library cache pin'
                 time waited: 15 min 0 sec
                     wait id: 19               p1: 'handle address'=0xb0000005cf712ac0
                                               p2: 'pin address'=0x30000004541b9300
                                               p3: '100*mode+namespace'=0x46eb00020002
              * time between wait #2 and #3: 0.000062 sec
              3.       event: 'library cache revalidation'
                 time waited: 1.000032 sec
                     wait id: 18               p1: ' '=0x0
                                               p2: ' '=0x0
                                               p3: ' '=0x0
    }
    and is blocked by
 => Oracle session identified by:
    {
                instance: 1 (dbname.instname)
                   os id: 21431
              process id: 1478, oracle@hostname (J007)
              session id: 7238
        session serial #: 40343
    }
    which is waiting for 'wait for unread message on broadcast channel' with wait info:
    {
                      p1: 'channel context'=0xb12fb4d08
                      p2: 'channel handle'=0xb335bbce0
            time in wait: 0.484074 sec
      heur. time in wait: 6.701969 sec
           timeout after: 9.515926 sec
                 wait id: 5792240
                blocking: 1 session
             current sql: Begin :fResult := XXXXXX.XXX_XXXXXXXXXXXXX.XXXXXXXXXXXXXXXXXX( 1234,1,'q_xxxxxxxxxxxx'); End;
             short stack: ksedsts()+528<-ksdxcb()+960<-sspuser()+216<-__sighndlr()+12<-call_user_handler()+1028<-sigacthandler()+224
             <-_portfs()+8<-port_get()+28<-sskgpwwait()+1076<-skgpwwait()+180<-ksliwat()+1984<-kslwaitctx()+148<-ksrbwait()+464
             <-kwqidlisten()+416<-kwqilintl()+340<-kwqilisten()+388<-spefcmpa()+768<-spefmccallstd()+180<-peftrusted()+128<-psdexsp()+244 
             <-rpiswu2()+688<-kxe_push_env_internal_pp_()+300<-kkx_push_env_for_ICD_for_new_session()+64<-psdextp()+316
             <-pefccal()+476<-pefcal()+180<-pevm_FCAL()+144<-pfrinstr_FCAL()+156<-
            wait history:
              * time between current wait and wait #1: 0.000032 sec
              1.       event: 'wait for unread message on broadcast channel'
                 time waited: 0.000078 sec
                     wait id: 5792239          p1: 'channel context'=0xb12fb4d08
                                               p2: 'channel handle'=0xb335bbce0
              * time between wait #1 and #2: 0.000075 sec
              2.       event: 'wait for unread message on broadcast channel'
                 time waited: 0.117271 sec
                     wait id: 5792238          p1: 'channel context'=0xb12fb4d08
                                               p2: 'channel handle'=0xb335bbce0
              * time between wait #2 and #3: 0.000026 sec
              3.       event: 'wait for unread message on broadcast channel'
                 time waited: 0.000755 sec
                     wait id: 5792237          p1: 'channel context'=0xb12fb4d08
                                               p2: 'channel handle'=0xb335bbce0
    }
 
 Chain 1 Signature: 'wait for unread message on broadcast channel'<='library cache pin'
 Chain 1 Signature Hash: 0xcd5dcbea

We could see some interesting facts as highlighted above.

Waiting session(5188) was executing “BEGIN dbms_lock.sleep(60); END;” waiting for “library cache pin” and “library cache revalidation”
Blocking session(7238) was application job which is waiting for “wait for unread message on broadcast channel”

This information gave us an inkling of what might be happening in the background.

We looked at some other facts by executing few queries to understand this locking pattern.

 
 Sessions waiting for a Library Cache Pin

 select sid Waiter, 
 substr(rawtohex(p1),1,30) Handle, 
 substr(rawtohex(p2),1,30) Pin_addr 
 from v$session_wait 
 where wait_time=0 
 and event like 'library cache pin%';

     WAITER HANDLE                         PIN_ADDR
 ---------- ------------------------------ ----------------------- 
       4270 CA0D4516253A391F2E2329         CA031F3B2C03412F14161D
       5188 CA0D4516253A391F2E2329         CA053E1157044713236435
       8270 CA0D4516253A391F2E2329         CA0E54333B085D273A4561

 
 Objects involved to Library Cache Pin we are waiting for

 select to_char(SESSION_ID,'999') sid , 
 substr(LOCK_TYPE,1,30) Type, 
 substr(lock_id1,1,23) Object_Name, 
 substr(mode_held,1,4) HELD, substr(mode_requested,1,4) REQ, 
 lock_id2 Lock_addr 
 from dba_lock_internal
 where mode_requested<>'None' 
 and mode_requested<>mode_held 
 and session_id in ( select sid 
		     from v$session_wait 
		     where wait_time=0 
		     and event like 'library cache pin%'); 

 SID  TYPE                           OBJECT_NAME         HELD REQ  LOCK_ADDR
 ---- ------------------------------ ------------------- ---- ---- ----------------- 
 4270 Body Definition Pin            SYS.DBMS_LOCK       None Shar B0000005CF712AC0
 5188 Body Definition Pin            SYS.DBMS_LOCK       None Shar B0000005CF712AC0
 8270 Body Definition Pin            SYS.DBMS_LOCK       None Excl B0000005CF712AC0

 
 Who are the holders?

 select sid Holder ,KGLPNUSE Sessaddr , KGLPNMOD Held, KGLPNREQ Req 
 from x$kglpn , v$session 
 where KGLPNHDL in (select p1raw 
                    from v$session_wait 
                    where wait_time=0 
                    and event like 'library cache pin%') 
 and KGLPNMOD <> 0 
 and v$session.saddr=x$kglpn.kglpnuse;
 
     HOLDER SESSADDR               HELD        REQ
 ---------- ---------------- ---------- ---------- 
       2166 0000000B42286EE8          2          0
       3028 0000000B623B7CB0          2          0
       3837 0000000B724AC408          2          0
       4594 0000000B125E74C8          2          0
       5298 0000000B126D30A8          2          0
       7238 0000000B62942F60          2          0

 Holder Wait event

 select sid,substr(event,1,30),wait_time 
 from v$session_wait 
 where sid in (select sid 
		from x$kglpn , v$session 
		where KGLPNHDL in (select p1raw 
					from v$session_wait 
					where wait_time=0 
					and event like 'library cache pin%') 
		and KGLPNMOD <> 0 
		and v$session.saddr=x$kglpn.kglpnuse ); 

We could again see some similar facts being highlighted above.

So, what led to this????

All waiting sessions were executing “SYS.DBMS_LOCK” where TYPE was “Body Definition Pin”

We co-related the analysis from the HANGANALYZE and the output of the above queries. Putting this two together, we were sure that our previous inkling was correct.
Something seemed to have gone wrong with the package “DBMS_LOCK” definition which can be inferred by looking at the wait events “library cache revalidation” and TYPE “Body Definition Pin”

We verified the object status from DBA_OBJECTS

 
 select owner, object_name, object_type, status , last_ddl_time 
 from dba_objects  
 where object_name='DBMS_LOCK';

 OWNER           OBJECT_NAME     OBJECT_TYPE             STATUS  LAST_DDL_TIME
 --------------- --------------- ----------------------- ------- -------------------- 
 SYS             DBMS_LOCK       PACKAGE                 VALID   13-jul-2022 18:18:14
 SYS             DBMS_LOCK       PACKAGE BODY            INVALID 22-jun-2021 01:28:07
 PUBLIC          DBMS_LOCK       SYNONYM                 VALID   28-sep-2018 06:58:49

As highlighted, object was compiled on same evening leaving “DBMS_LOCK” package body INVALID.

This check assured us that since package body was INVALID, all the sessions were waiting on “library cache pin”.

Solution Implemented

We tried compiling the package manually.

alter package SYS.DBMS_LOCK compile body;

Our marathon did not end here. Yet another dilemma was waiting for us!

Even our session from which we initiated the COMPILE was waiting for “library cache lock” event.
We kept our command running and identified blocking sessions.

 
 select sid,substr(event,1,30),wait_time 
 from v$session_wait 
 where sid in (select sid 
		from x$kglpn , v$session 
		where KGLPNHDL in (select p1raw 
					from v$session_wait 
					where wait_time=0 
					and event like 'library cache pin%') 
		and KGLPNMOD <> 0 
		and v$session.saddr=x$kglpn.kglpnuse );

     HOLDER SESION                 HELD        REQ
 ---------- ---------------- ---------- ----------
       2166 0000000B42286EE8          2          0
       3028 0000000B623B7CB0          2          0
       3837 0000000B724AC408          2          0
       4594 0000000B125E74C8          2          0
       5298 0000000B126D30A8          2          0
       7238 0000000B62942F60          2          0

It was a similar job holding our operation as they were calling DBMS_LOCK package inside their package body. Application team insisted us to restart the database but we preferred to perform killing rather than entire DB restart.
We killed all blocking sessions (6 holders) and finally our COMPILE command completed successfully.

 
 alter session set nls_date_format='dd-mon-yyyy hh24:mi:ss'
 select owner, object_name, object_type, status , last_ddl_time 
 from dba_objects  
 where object_name='DBMS_LOCK';

 OWNER           OBJECT_NAME                         OBJECT_TYPE             STATUS  LAST_DDL_TIME
 --------------- ----------------------------------- ----------------------- ------- -------------------- 
 SYS             DBMS_LOCK                           PACKAGE                 VALID   13-jul-2022 18:18:14
 SYS             DBMS_LOCK                           PACKAGE BODY            VALID   13-jul-2022 23:05:38
 PUBLIC          DBMS_LOCK                           SYNONYM                 VALID   28-sep-2018 06:58:49

Application team initiated EOD again and it progressed as expected and came to a successful completion.

Well!!!  Well!!!  Well!!!
Wait ???????
It’s not the end yet!!!
What exactly triggered all this?

We started investigating why DBMS_LOCK package body was invalidated.
We looked at database alert log for similar time frame of LAST_DDL_TIME. Below messages were seen.

 
 2022-07-13T18:20:55.173026+05:30
 Errors in file /oradumps/dbname/diag/diag/rdbms/dbname/INSTNAME/trace/INSTNAME_j071_3462.trc: 
 ORA-00604: error occurred at recursive SQL level 2
 ORA-04061: existing state of  has been invalidated
 ORA-04061: existing state of package body "SYS.DBMS_LOCK" has been invalidated
 ORA-04065: not executed, altered or dropped package body "SYS.DBMS_LOCK"
 ORA-06508: PL/SQL: could not find program unit being called: "SYS.DBMS_LOCK"
 ORA-06512: at "SYS.DBMS_PRVTAQIP", line 3025
 ORA-06512: at line 1
 ORA-04061: existing state of  has been invalidated
 ORA-04061: existing state of package body "SYS.DBMS_LOCK" has been invalidated
 ORA-04065: not executed, altered or dropped package body "SYS.DBMS_LOCK"
 ORA-06508: PL/SQL: could not find program unit being called: "SYS.DBMS_LOCK"
 ORA-06512: at "SYS.DBMS_PRVTAQIP", line 3025
 ORA-06512: at line 1

We looked at all objects compiled during that day.

 
 select owner, object_name, object_type, status , last_ddl_time 
 from dba_objects  
 where trunc(last_ddl_time)=trunc(sysdate)
 order by last_ddl_time;

 OWNER                OBJECT_NAME                         OBJECT_TYPE             STATUS  LAST_DDL_TIME
 -------------------- ----------------------------------- ----------------------- ------- -------------------- 
 SYS                  DBA_REGISTRY_SQLPATCH               VIEW                    VALID   13-jul-2022 18:16:52
 SYS                  DBMS_SQLPATCH                       PACKAGE                 VALID   13-jul-2022 18:16:52
 SYS                  DBMS_QOPATCH                        PACKAGE                 VALID   13-jul-2022 18:16:52
 SYS                  REGISTRY$HISTORY                    TABLE                   VALID   13-jul-2022 18:16:52
 SYS                  DBMS_REGISTRY                       PACKAGE                 VALID   13-jul-2022 18:16:52
 .
 .
 .
 .
 SYS                  DBMS_RCVMAN                         PACKAGE                 VALID   13-jul-2022 18:18:11
 SYS                  DBMS_SERVER_ALERT                   PACKAGE                 VALID   13-jul-2022 18:18:11
 SYS                  DBMS_LOCK                           PACKAGE                 VALID   13-jul-2022 18:18:14

Why “DBA_REGISTRY_SQLPATCH” also got compiled?

Taking a cue from the above VIEW NAME, our line of investigation led us to know that “datapatch -verbose” had been executed on this setup which got failed because of similar locking issue and left the DBMS_LOCK package body uncompiled and in INVALID state.

12 thoughts on “CARDS Application EOD stuck with “library cache pin””

    1. Hi Dipak,

      Many of the essential queries are already included in the content.
      The following query can be utilized to monitor session wait events.

      col sid format 99999
      col event format a30
      col state format a25
      col p1 format 99999999999999
      set pages 1500
      set lines 150
      break on event
      compute count of event on event
      select sid,event,p1,p1raw, state, sql_id, final_blocking_session blocking_session
      from v$session
      where wait_class<>‘Idle’
      order by event;

      Please inform me if you require a particular content SQL.

Leave a Reply

Your email address will not be published. Required fields are marked *