Any use of the information presented here is entirely at the reader's own risk. Always back up your system before attempting any
procedure which could cause your VMS system to hang or crash. Though VMS (now known as OpenVMS) is very robust, some of the
techniques presented here involve unusual kernel mode operations which are extremely risky on a production system.
Resource Waits in the OpenVMS Operating System
or
What to do when you R-WASTed by OpenVMS
DECUS Spring '93 Atlanta Symposia VS060
David L. Cathey Montagar Software Concepts
P. O. Box 260772 Plano, TX 75026-0772 (972) 578-5036
David L. Cathey, Montagar Software Concepts, P.O.Box 260772, Plano, TX 75026-0772 Spring'93 DECUS Symposia Slide No. 1
 31                           16 15                           0
+----------------------------+--+------------------------------+
|            MBZ             | 1|         Owner Count          |
+----------------------------+--+------------------------------+
                               |
                               +-- Write-in-progress or Write-pending status bit
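The layout on this slide can be illustrated with a minimal Python sketch that splits a mutex longword into the three fields drawn above. The function name and the sample value are made up for illustration, and the sketch reports the raw 16-bit owner count only; how VMS biases or initializes that count is not stated on the slide.

```python
def decode_mutex(longword: int) -> dict:
    """Split a 32-bit mutex longword into the fields drawn on the slide."""
    longword &= 0xFFFFFFFF
    return {
        "owner_count": longword & 0xFFFF,          # bits 0-15, raw value
        "write_flag": bool(longword & (1 << 16)),  # bit 16, write in progress or pending
        "mbz": longword >> 17,                     # bits 17-31, expected to be zero
    }


# Hypothetical sample value: write bit set, raw owner count of 1.
print(decode_mutex(0x00010001))
```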
; Note that this code must be in Kernel mode,
; in order to access the MUTEX data cell for R/W access.
;
; Grab the Intrusion Queue mutex, so we can
; scan it safely...
        moval   g^CIA$GL_MUTEX,r0
        jsb     g^SCH$LOCKW             ; Lock MUTEX
        movl    g^CIA$GQ_INTRUDER,r3    ; Get first intrusion blk
        moval   g^CIA$GQ_INTRUDER,r4    ; Get listhead address
1$:     cmpl    r3,r4                   ; If r3 is listhead, bail
        beql    5$
        ...                             ; Do lots of neat stuff
        movl    CIA$L_FLINK(r3),r3      ; Get next intrusion blk
        brw     1$
5$:     moval   g^CIA$GL_MUTEX,r0       ; Unlock MUTEX
        jsb     g^SCH$UNLOCK
State  Reason Code     Value  Meaning
RWAST  RSN$_ASTWAIT      1    Wait for AST event
RWMBX  RSN$_MAILBOX      2    Mailbox I/O
RWNPG  RSN$_NPDYNMEM     3    Nonpaged Dynamic Memory
RWPAG  RSN$_PGDYNMEM     5    Paged Dynamic Memory
RWMPE  RSN$_MPLEMPTY    11    Waiting for Modified List to empty
RWMPB  RSN$_MPWBUSY     12    Modified Page Writer Busy
                              (ReallyWantedMyProcessBack - Pat O.)
RWSCS  RSN$_SCS         13    System Communications Services
RWCAP  RSN$_CPUCAP      15    CPU Capability (Vectors, etc)
RWCSV  RSN$_CLUSRV      16    Cluster Server Process Busy
 31                                    12                     0
+-------------------------------------+--+---------------------+
|                                     | 1|                     |
+-------------------------------------+--+---------------------+
;
; Put self in a resource wait if unable to allocate
; the required non-paged pool...
;
; Assume R4 holds the value of the current PCB
1$:     movl    #GOOF$C_LENGTH,r1
        jsb     g^EXE$DEBIT_BYTCNT_ALO  ; Allocate 1000 bytes
        blbs    r0,5$
        movl    #RSN$_NPDYNMEM,r0       ; Can't do it, wait until
        jsb     SCH$RWAIT               ; the system frees some and
        brb     1$                      ; then try it again...
5$:     movl    r1,GOOF$W_SIZE(r2)      ; Play with our new buffer
        ...
SDA> SHOW PROCESS/INDEX=0CF
Process index: 000F   Name: DAVIDC_1   Extended PID: 000000CF
-------------------------------------------------------------
Status : 02040001 res,phdres
Status2: 00000001 quantum_resched

PCB address              805659E0    JIB address                 806D2F80
PHD address              808F9000    Swapfile disk address       00000000
Master internal PID      00020019    Subprocess count                   1
Internal PID             0003000F    Creator internal PID        00020019
Extended PID             000000CF    Creator extended PID        00000099
State                       RWAST    Termination mailbox             002F
Current priority                6    AST's enabled                   KESU
Base priority                   4    AST's active                    NONE
UIC                [00002,000001]    AST's remaining                  148
Mutex count                     0    Buffered I/O count/limit        0/40 <---+
Waiting EF cluster              0    Direct I/O count/limit         40/40     |
Starting wait time       1B001B1B    BUFIO byte count/limit   30800/30800     |
Event flag wait mask     00000001<-+ # open files allowed left        147     |
Local EF cluster 0       E0000000  | Timer entries allowed left        20     |
Local EF cluster 1       00000000  | Active page table count            0     |
Global cluster 2 pointer 00000000  | Process WS page count            161     |
Global cluster 3 pointer 00000000  | Global WS page count              40     |
                                   |                                          |
                                   |                 Zero remaining quota-----+
                                   |
                                   +- Event Flag Mask == 1 == RSN$_ASTWAIT
                                      if 8nnnnnnn, then it would indicate which MUTEX
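The rule noted on this slide (a small value in the event flag wait mask is a resource wait reason code, while 8nnnnnnn identifies a mutex) can be sketched in Python. The table holds only the reason codes listed in this presentation; the function name and the output wording are illustrative assumptions, not part of SDA.

```python
# Reason codes exactly as listed in the presentation's resource wait table.
RESOURCE_WAIT_REASONS = {
    1: ("RWAST", "RSN$_ASTWAIT"),
    2: ("RWMBX", "RSN$_MAILBOX"),
    3: ("RWNPG", "RSN$_NPDYNMEM"),
    5: ("RWPAG", "RSN$_PGDYNMEM"),
    11: ("RWMPE", "RSN$_MPLEMPTY"),
    12: ("RWMPB", "RSN$_MPWBUSY"),
    13: ("RWSCS", "RSN$_SCS"),
    15: ("RWCAP", "RSN$_CPUCAP"),
    16: ("RWCSV", "RSN$_CLUSRV"),
}


def describe_wait_mask(mask_hex: str) -> str:
    """Interpret the 'Event flag wait mask' field of a resource-waiting process."""
    mask = int(mask_hex, 16)
    if mask & 0x80000000:
        # 8nnnnnnn: per the slide, this indicates which mutex is being waited on.
        return f"mutex wait, mutex at {mask:08X}"
    entry = RESOURCE_WAIT_REASONS.get(mask)
    if entry is None:
        return f"reason code {mask} is not in the presentation's table"
    state, symbol = entry
    return f"{state} ({symbol})"


print(describe_wait_mask("00000001"))
```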
SDA> SHOW PROCESS/INDEX=0CF/CHANNEL
Process index: 000F   Name: DAVIDC_1   Extended PID: 000000CF
-------------------------------------------------------------

                    Process active channels
                    -----------------------
Channel  Window    Status   Device/file accessed
-------  ------    ------   --------------------
 0010   00000000            DUA0:
 0020   8071C470            DUA0:[DAVIDC.RWAST]RWAST_BIO.EXE;4
 0030   00000000   Busy     MBA50:   <-+
 0040   00000000            TWA3:      |
 0050   00000000            TWA3:      |
                                       |
     Mailbox I/O incomplete, probably needs flushing-+
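When a channel listing like this one has been captured to a text file, picking out the busy channels can be automated. The following is a rough Python sketch; the regular expression assumes the four-column layout shown on the slide (channel, window, optional "Busy", device), and the function name is hypothetical.

```python
import re

# channel (4 hex digits), window (8 hex digits), optional "Busy", device name
CHANNEL_LINE = re.compile(r"^\s*([0-9A-F]{4})\s+([0-9A-F]{8})\s+(Busy\s+)?(\S+)")


def busy_channels(listing: str) -> list:
    """Return (channel, device) pairs for every channel marked Busy."""
    busy = []
    for line in listing.splitlines():
        match = CHANNEL_LINE.match(line)
        if match and match.group(3):
            busy.append((match.group(1), match.group(4)))
    return busy


listing = """\
 0010   00000000            DUA0:
 0020   8071C470            DUA0:[DAVIDC.RWAST]RWAST_BIO.EXE;4
 0030   00000000   Busy     MBA50:   <-+
 0040   00000000            TWA3:      |
 0050   00000000            TWA3:      |
"""
print(busy_channels(listing))
```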
$ MCR NCP SHOW KNOW LINKS
$! kill the link that seems to be connected to the RWAST'd process
$ MCR NCP DISCONNECT LINK

$ MCR NCP SHOW KNOW LINK

Known Link Volatile Summary as of 6-APR-1993 20:17:47

 Link   Node            PID       Process   Remote link  Remote user
 8193   1.42 (AVATAR)   21600033  REMACP    8445         DAVIDC

$ MCR NCP DISCONNECT LINK 8193
or
$ COPY MBA1284: NLA0:
It's probably a better practice to copy from before copying to...
$ COPY LOGIN.COM MBA1284:
devnam: .ascid  /MUA0:/
chan:   .word   0
        .entry  packack,0
        $ASSIGN_S chan=chan,-
                  devnam=devnam
        $QIOW_S chan=chan,-
                func=#IO$_PACKACK
        ret
        .end    packack
A printer got stuck at the same time the SYMBIONT had a lock on a RIGHTSLIST entry... and was stopped. The SYMBIONT was RWASTed while holding a blocking lock on the RIGHTSLIST, which ended up locking up everyone on the system (600+ angry users) in LOGINOUT, DIR/OWNER, etc.
The solution? Close the door to the 15-year-old-washing-machine-sized LP27 printer :-(
        .title  DISABLE_RW
;++
; DISABLE_RW -- Disable Resource Wait of another process
;
; Author: David L. Cathey
;         Montagar Software Concepts
;         P. O. Box 260772
;         Plano, TX 75026-0772
;         [email protected]
;
        .link   "SYS$SYSTEM:SYS.STB"/SE
        .library /SYS$LIBRARY:LIB/

        $PCBDEF                         ; Process Control Block definitions

asc_pid: .ascid "xxxxxxxx"              ; Save space for PID
bin_pid: .long  0
prompt:  .ascid "Process ID: "          ; Prompt string

        .entry  Main,0
        pushaw  asc_pid
        pushaq  prompt
        pushaq  asc_pid
        calls   #3,g^LIB$GET_FOREIGN    ; Get PID from user
        blbc    r0,999$
        pushal  bin_pid
        pushaq  asc_pid                 ; Convert ascii hex to binary
        calls   #2,g^OTS$CVT_TZ_L
        blbc    r0,999$
        $CMKRNL_S routin=do_it          ; Play with the process...
999$:   ret

        .entry  Do_It,^M<>
        movl    bin_pid,r0
        jsb     g^EXE$EPID_TO_PCB       ; Get PCB from EPID
        tstl    r0                      ; Did we???
        beql    99$                     ; Nope, bail out
        bisl2   #PCB$M_SSRWAIT,PCB$L_STS(r0) ; Set SSRWAIT disable
        bicl2   #PCB$M_DELPEN,PCB$L_STS(r0)  ; Clear delete pending
        $DELPRC_S pidadr=bin_pid        ; And delete again.
        ret                             ; Bye...
99$:    movl    #SS$_NONEXPR,r0         ; Non-existent process!
        ret
        .end    Main
Caveat: I recently noticed this on an Actian support page
( https://communities.actian.com/s/article/Procedure-To-Determine-Why-A-Process-Is-In-A-RWAST-State ), so all credit goes to
them.
p.s. I preserved the contents of their RWAST page for posterity
The SHOW SYSTEM command display indicates that a process is in an RWAST
state and the process seems locked. How can you determine why the process is in this state?
The RWAST is a general purpose "Resource Wait" state. It indicates that the wait is expected to be satisfied by the delivery
and/or enqueueing of an AST to the process.
There are 4 common reasons for a process to go into an RWAST state:
1) It is waiting for an I/O to complete on a channel.
2) It has exhausted an AUTHORIZE or SYSGEN quota.
3) It is waiting for a file system or lock request to complete.
4) It is waiting for a subprocess to terminate.
Processes in the RWAST state can NOT be deleted (e.g., with STOP/ID) until the condition they are waiting for is met. If you
cannot identify what the process is waiting for, you will have to reboot the system in order to eliminate the process.
If the process in an RWAST state is running a user-written program, then it is possible to rewrite the program to arrange to
receive an error status for certain system calls, rather than have VMS put the process into the resource wait state. Usually, the
error status indicates either a quota problem or insufficient pooled memory. This is accomplished by using the SYS$SETRWM system
service call.
Procedure
To find out why a process is in an RWAST state, use the System Dump Analyzer (SDA):
1. Invoke SDA
$ ANALYZE/SYSTEM
VAX/VMS System analyzer
! To find the RWAST process and its INDEX
SDA> SHOW SUMMARY
Current process summary
Extended PID  Indx  Process name  Username  State  Pri  PCB       PHD       Wkset
20200080      0000  NULL                    COM      0  800024A8  80002328      0
20200081      0001  SWAPPER                 HIB     16  80002748  800025C8      0
20201005      0005  KILEY         KILEY     LEF      4  80363C50  82CEEE00    211
20200086      0006  ERRFMT        SYSTEM    HIB      7  8030CA80  80A2FA00     88
20200087      0007  CACHE_SERVER  SYSTEM    HIB     16  80317F70  80C3AE00     62
2020104F      004F  SMITH         SMITH     RWAST    6  8036CE90  82DF4800    200
2. Set Your Default to the RWAST Process Using its INDEX Value
SDA> SET PROCESS/INDEX=4F !Selects the process in RWAST state, in this case, the SMITH process
Note
If you have tried to delete the process, SDA may not permit you to set your process to the RWAST process. In this case, you would
receive the following error:
%SDA-E-NOTINPHYS, xxxxxx: not in physical memory
If you receive this error, you may have to format the PCB and/or JIB to figure out the problem. The address for the PHD and PCB
can be found from the SHOW SUMMARY display. The address of the JIB is stored at offset PCB$L_JIB in the formatted PCB. Keep this in
mind if SDA will not allow you "normal" access to the data structures that follow. If you can get no access to the process data
structures, for example because the process header is outswapped, you may have to reboot the system and wait for the problem to occur
again. If it happens again, you may be able to catch the data structures in memory and analyze the Resource Wait state more
thoroughly.
3. Find the Process Program Counter (PC)
SDA> EXAMINE @PC
and see if it evaluates to one of the following symbols:
EXE$DASSGN+6D, in which case you go to step 4 (below)
EXE$MULTIQUOTA+032, in which case you go to step 5 (below)
EXE$DCLEXH+0A5, in which case you go to step 6 (below)
EXE$DCLEXH+141, in which case you go to step 7 (below)
Other RWAST states are possible but very rare. If the PC does not evaluate to one of the available symbols, you will have to
reboot the system to eliminate the hung process. Take a crash dump of the system so you can later determine why the process was in
RWAST.
Occasionally, the RWAST process will clear itself if the process waits long enough and the AST somehow gets satisfied.
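The branching in step 3 amounts to a small lookup from the symbolized PC to the step that handles it. The sketch below restates it in Python purely as a reading aid; the dictionary and function names are made up, and the offsets are the ones quoted in this article, which may differ between VMS versions.

```python
# PC symbol -> (step number, what the process is waiting for), as listed in step 3.
PC_TO_STEP = {
    "EXE$DASSGN+6D": (4, "an I/O request to complete on a channel"),
    "EXE$MULTIQUOTA+032": (5, "a quota that has been exhausted"),
    "EXE$DCLEXH+0A5": (6, "a file system or lock request"),
    "EXE$DCLEXH+141": (7, "a subprocess to terminate"),
}


def next_step(pc_symbol: str) -> str:
    """Tell the reader which step of the procedure applies to this PC."""
    entry = PC_TO_STEP.get(pc_symbol.strip().upper())
    if entry is None:
        return "PC not recognized: take a crash dump, then reboot to clear the process"
    step, reason = entry
    return f"go to step {step}: waiting for {reason}"


print(next_step("EXE$MULTIQUOTA+032"))
```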
4. PC is at EXE$DASSGN+6D
If the PC is at EXE$DASSGN+6D, the process is waiting for an I/O request to complete. Many times, the device on which it is
waiting for I/O to complete will be shown as "Busy" in the SHOW PROCESS/CHANNEL display:
SDA> SHOW PROCESS/CHANNEL !Look for a status of "Busy"
!and this will determine the
!device that is waiting for the I/O
Process index: 004F Name: SMITH Extended PID: 22E0124D
Process active channels
Channel  Window    Status  Device/file accessed
0010     00000000          LFILNG$DUA14:
00C0     00000000  Busy    LPA0:    <-- This device is blocking the process
00D0     00000000          MBA1:
In the above example, you have only 1 BUSY channel, so this must be the channel causing the process to hang in RWAST. If you have
multiple BUSY channels, you can identify which one is causing the RWAST state with the following commands:
SDA> READ SYS$SYSTEM:SYSDEF.STB !read in system symbols
SDA> EXAMINE @R6+CCB$W_IOC !number of outstanding requests
8041BBBA: 00180001 "...." !in lower word (1)
SDA> DEFINE UCB=@(@R6+CCB$L_UCB)!define UCB address
SDA> EXAMINE UCB+UCB$W_UNIT !low order word is unit number
UCB+054: 00000000 "...." !here it is unit # 0
SDA> EXAMINE @(UCB+UCB$L_DDB)+DDB$T_NAME;8 !device name
8041BBBA: 2041504C "LPA." !device is LPA0:
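The last EXAMINE shows the device name as a longword, which reads right to left because the VAX stores the low-order byte first. A minimal Python sketch of that decoding follows (the function name is an assumption); non-printable bytes are shown as dots, in the spirit of the SDA display.

```python
def longword_to_ascii(hex_longword: str) -> str:
    """Decode an SDA longword dump (hex, most significant byte first) as ASCII text."""
    value = int(hex_longword, 16)
    raw = value.to_bytes(4, "little")  # lowest-addressed byte comes first in memory
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


# 2041504C -> bytes 4C 50 41 20 -> "L", "P", "A", space
print(repr(longword_to_ascii("2041504C")))
```

Combined with the unit number of 0 from UCB$W_UNIT, this gives the device LPA0:.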
Note
If the device is a printer connected to a terminal port and the symbiont is waiting for an XON to be delivered, turning the
printer OFF and back ON again will occasionally cause an XON to be sent back to the VAX. This allows the I/O to complete,
permitting the print symbiont to continue.
If the device is a printer connected to a printer port (LPA0, LCA0...), the VAX thinks the printer is offline.
This may indicate a hardware problem with the printer or controller, if it is really online. Again, turning the printer OFF and
back ON again may help.
5. Value Returned is EXE$MULTIQUOTA+032
If the value returned is EXE$MULTIQUOTA+032, the RWAST state indicates the process has run out of a quota. A SHOW SYSTEM display
will often show the process continuing to accumulate CPU time.
A quick check to help determine which quota the process has exhausted is the following:
SDA> SHOW PROCESS !for the SMITH process
Process index: 004F Name: SMITH Extended PID: 2020104F
Process status: 02040001 RES,PHDRES
PCB address 8036CE90 JIB address 8064F3C0
PHD address 82DF4800 Swapfile disk address 01002821
Master internal PID 0020004F Subprocess count 0
Internal PID 0020004F Creator internal PID 00000000
Extended PID 2020104F Creator extended PID 00000000
State RWAST Termination mailbox 0000
Base priority 4 AST's active NONE
UIC [00022,000016] AST's remaining 16
Mutex count 0 Buffered I/O count/limit 0/18*
Waiting EF cluster 0 Direct I/O count/limit 18/18*
Starting wait time 1B001B1B BUFIO byte count/limit 30478/31936*
Event flag wait mask 00000001 # open files allowed left 75
Local EF cluster 0 E4000000 Timer entries allowed left 10
Local EF cluster 1 00000000 Active page table count 0
Global cluster 2 pointer 00000000 Process WS page count 140
Global cluster 3 pointer 00000000 Global WS page count 60
Look in the lower right hand portion of the display, denoted by asterisk (*), to see if any quotas are down to zero. In the
example above, you can see that Buffered I/O count/limit is zero. The number before the slash (/) is the amount of this quota
left. The number after the slash (/) is the total amount allowed. These fields relate to the following Authorization (UAF) records
and SYSGEN PQL parameters if the process is a detached process:
                                                          RUN/Detached
                              Authorization  SYSGEN PQLs  Limits
AST's remaining            -  ASTLM          PQL_DASTLM   /AST_LIMIT
Buffered I/O count/limit   -  BIOLM          PQL_DBIOLM   /IO_BUFFERED
Direct I/O count/limit     -  DIOLM          PQL_DDIOLM   /IO_DIRECT
BUFIO byte count/limit     -  BYTLM          PQL_DBYTLM   /BUFFER_LIMIT
# open files allowed left  -  FILLM          PQL_DFILLM   /FILE_LIMIT
Timer Entries allowed left -  TQELM          PQL_DTQELM   /QUEUE_LIMIT
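As a rough illustration of the quick check above, the Python sketch below scans captured SHOW PROCESS text for the three count/limit fields and reports any whose remaining count is zero, together with the matching Authorization name from the table. The function and variable names are assumptions for this example; it does not look at the AST, open file or timer entry fields.

```python
import re

# Display label -> Authorization (UAF) quota name, taken from the table above.
UAF_NAME = {"Buffered I/O": "BIOLM", "Direct I/O": "DIOLM", "BUFIO byte": "BYTLM"}

QUOTA_FIELD = re.compile(r"(Buffered I/O|Direct I/O|BUFIO byte) count/limit\s+(\d+)/(\d+)")


def exhausted_quotas(display: str) -> list:
    """Return (label, UAF name) for each count/limit field with nothing left."""
    return [
        (label, UAF_NAME[label])
        for label, left, _limit in QUOTA_FIELD.findall(display)
        if int(left) == 0
    ]


display = """\
Mutex count 0 Buffered I/O count/limit 0/18*
Waiting EF cluster 0 Direct I/O count/limit 18/18*
Starting wait time 1B001B1B BUFIO byte count/limit 30478/31936*
"""
print(exhausted_quotas(display))
```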
Once it has been determined which quota needs to be increased for processes or subprocesses, increase that value within the
User Authorization File, the SYSGEN PQL parameter (if it is a detached process), or on the SYS$CREPRC system service (if a process
is creating the detached process). The new value is then used when a new process is created or when you log out and log back in
again. At this point, reboot the system to eliminate the RWAST process waiting for quota. Hopefully, you have increased the
parameter to a value high enough that you will not see the new process go into RWAST state again.
If the RWAST process is waiting for a quota and the quota does not appear to be any of these, you can format and display the Job
Information Block (JIB), Process Control Block (PCB), and the Process Header (PHD) to locate the quota problem.
R2 contains the address of the insufficient quota. To determine the insufficient quota, do the following:
SDA> READ SYS$SYSTEM:SYSDEF ! read system definitions
SDA> EXAMINE R2
R2: 8036CECA "JNG." ! obtain value contained in R2
Next, locate the addresses of the PCB (Process Control Block) and the JIB (Job Information Block) from the top of the SHOW PROCESS
display. The value found in R2 will be pointing somewhere in one of these two data structures. Identify which data structure would
contain the value in R2 and format that data structure. In this case, R2 would be in the PCB so the PCB needs to be formatted:
SDA> FORMAT 8036CE90 !Formatting PCB of process
8036CE90 PCB$L_SQFL 80002180
8036CE94 PCB$L_SQBL 80002180
........ ........ ........
8036CEC6 PCB$W_PPGCNT 008C
8036CEC8 PCB$W_ASTCNT 0010
8036CECA PCB$W_BIOCNT 0000 <-- This address matches R2
8036CECC PCB$W_BIOLM 0012
8036CECE PCB$W_DIOCNT 0012
...... ....... ......
8036CF0C 00000000
8036CF10 PCB$L_JIB 8064F3C0
The value for PCB$W_BIOCNT is zero, indicating that the quota is depleted. BIOLM needs to be increased beyond 18, or the
application program modified so that not so many outstanding buffered I/O requests are made at once.
If the address is not found in the PCB, format the JIB.
The JIB address can be found from either the SHOW PROCESS display or the PCB$L_JIB value above:
SDA> FORMAT 8064F3C0 ! Formatting JIB of process
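Deciding which structure an address falls in is simple arithmetic on the base addresses from the SHOW PROCESS display. The Python sketch below uses the example's PCB and JIB addresses and the address 8036CECA flagged in the formatted PCB; it simply picks the nearest base at or below the target and does not check structure sizes, so treat the answer as a hint about which structure to format first.

```python
def locate_address(target: int, structures: dict):
    """Return (structure name, byte offset) for the nearest base at or below target."""
    below = {name: base for name, base in structures.items() if base <= target}
    if not below:
        return None
    name = max(below, key=below.get)
    return name, target - below[name]


# Base addresses from the example SHOW PROCESS display.
structures = {"PCB": 0x8036CE90, "JIB": 0x8064F3C0}

name, offset = locate_address(0x8036CECA, structures)
print(name, hex(offset))
```

Here the offset of 3A into the PCB corresponds to the PCB$W_BIOCNT line of the formatted display.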
6. Value Comes Back with EXE$DCLEXH+0A5
If the value comes back with EXE$DCLEXH+0A5, the process may be waiting either for the file system to complete a request, or for a
lock request. EXE$DCLEXH+0A5 is returned if you have tried to delete the process, whose former state was probably LEF.
A quick check to see if the RWAST process is waiting for an XQP file request to complete is to format the PCB and look for a
non-zero value in PCB$B_DPC. If the process is being forced to wait under these circumstances, the SDA "SHOW PROCESS" command
displays the PROCESS status as "DELPEN".
SDA> SHOW PROCESS
Process index: 00DF Name: Mike Mc. Extended PID: 000007DF
Process status: 02040023 RES,DELPEN,RESPEN,PHDRES
PCB address 80339230 JIB address 804FE7E0
PHD address 82DF4800 Swapfile disk address 010065A1
Master internal PID 007000DF Subprocess count 0
Internal PID 007000DF Creator internal PID 00000000
Extended PID 000007DF Creator extended PID 00000000
State RWAST Termination mailbox 0000
Base priority 4 AST's active NONE
UIC [00022,000016] AST's remaining 16
Mutex count 0 Buffered I/O count/limit 0/18*
Waiting EF cluster 0 Direct I/O count/limit 18/18*
Starting wait time 1B001B1B BUFIO byte count/limit 30478/31936*
Event flag wait mask 00000001 # open files allowed left 75
Local EF cluster 0 E4000000 Timer entries allowed left 10
Local EF cluster 1 00000000 Active page table count 0
Global cluster 2 pointer 00000000 Process WS page count 140
Global cluster 3 pointer 00000000 Global WS page count 60
SDA> FORMAT 80339230 ! format the PCB
80339230 PCB$L_SQFL 80002180
80339234 PCB$L_SQBL 8032D5C0
.......
8033925A PCB$B_DPC 01 <----NON-ZERO, waiting for XQP (file
....... ......... .... system) activity to complete
If the value is zero (00), the RWAST process is not waiting for the XQP and you can check for outstanding lock requests using
the command
SDA> SHOW PROCESS/LOCK
You will often find a lock in either "Waiting for" or "Converting to" state.
Two other articles in this database describe how to trace lock requests on both clustered and nonclustered systems.
The process holding the lock this process is waiting for is often in a RWxxx state itself, and solving that process' problem would
clear up this process' RWAST state.
If you are at this address, EXE$DCLEXH+0A5, and the process is not getting CPU time or waiting for XQP or lock operations to
complete, reboot the system to eliminate the process. Take a crash dump for later examination if the problem occurs often.
If the value in PCB$B_DPC is non-zero, then the process is waiting for an XQP file system request. You may use the following
information to analyze further. Note that it is possible that this command itself could cause the SDA process to go into an RWAST
state.
SDA> SHOW PROCESS/CHANNELS
Process index: 00DF Name: Mike Mc. Extended PID: 000007DF
Process active channels
Channel  Window    Status  Device/file accessed
0010     00000000          DUA2:
0040     00000000          VTA52:
0050     00000000          VTA52:
0070     00000000  Busy    DUA2:    <---- Device waiting for XQP
The most common reason for an RWAST process to be waiting for XQP is that the PAGEDYN SYSGEN parameter has been exhausted, or
there is very little left. If this is the case, PAGEDYN will have to be increased and the system rebooted to prevent further
problems.
PAGEDYN should normally be at least 25-30% free, and having it 40% free on a busy system may actually help increase your system
performance.
To determine how much PAGEDYN has been used, issue the following command:
SDA> SHOW POOL/SUMMARY/PAGE
Page dynamic storage pool
Summary of paged pool contents
108 UNKNOWN = 357984 (20%)
2 LOG = 83728 (4%)
......
1 CI = 96 (0%)
1 CLU = 2384 (0%)
Total space used = 1719808 out of 1988608 total bytes, 268800 bytes left.
Total space utilization = 86% <--- indicates that only 14% is left
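The percentages in this display follow directly from the byte counts, and comparing the free share against the 25% guideline mentioned above takes only a few lines. This Python sketch uses the example's numbers; the function name and the threshold constant are illustrative choices.

```python
def pool_usage(used_bytes: int, total_bytes: int):
    """Return (percent used, percent free) for a pool summary."""
    used_pct = 100.0 * used_bytes / total_bytes
    return used_pct, 100.0 - used_pct


MIN_FREE_PCT = 25.0  # lower end of the 25-30% guideline quoted in the text

used_pct, free_pct = pool_usage(1719808, 1988608)
print(f"used {used_pct:.0f}%, free {free_pct:.0f}%")
if free_pct < MIN_FREE_PCT:
    print("free paged pool is below the guideline; consider raising PAGEDYN")
```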
To determine if the RWAST process is waiting for PAGEDYN, you can do the following:
Create the following MACRO program, which defines the symbols needed to format the data structures in the next step. Note that
case matters here:
$ CREATE F11BDEF.MAR
$F11BCDEF GLOBAL
$F11BDEF GLOBAL
.END
$ MACRO /OBJ=F11BDEF.STB F11BDEF.MAR+SYS$LIBRARY:LIB/LIB
SDA> READ F11BDEF !read in data structures created by MACRO above
SDA> SHOW DEVICE DUA2 !DUA2 is the device shown busy from the
!SDA command SHOW PROCESS/CHANNELS
Look for the AQB address in the following information:
I/O data structures
DUA2 RA80 UCB address: 80484B90
Device status: 00021810 online,valid,unload,lcl_valid
Characteristics: 1C4D4108 dir,rct,fod,shr,avl,mnt,elg,idv,odv,rnd
00000221 clu,mscp,nnm
Owner UIC [000001,000001] Operation count 274887 ORB address 80484C96
PID 00000000 Error count 0 DDB address 80BA3D20
Alloc. lock ID 000400AF Reference count 1 DDT address 80550C78
Alloc. class 2 Online count 1 VCB address 8048AD70
........ .................
Press RETURN for more.
I/O data structures
---Volume Control Block (VCB) 8048AD70 ---
Volume: TUBORGPAGE Lock name: TUBORGPAGE
Status: A0 extfid,system
Status2: 05 writethru,mountver
***Here it is
Mount count 1 Rel. volume 0 AQB address 80BA4AA0
Transactions 2 Max. files 29651 RVT address 80484B90
Free blocks 34020 Rsvd. files 9 FCB queue 808A9D10
Window size 7 Cluster size 3 Cache blk. 80768460
Vol. lock ID 000200B1 Def. extend sz. 5
Block.lock ID 001A00A7 Record size 0
SDA> FORMAT 80BA4AA0
80BA4AA0 AQB$L_ACPQFL 80BA4AA0
........ ............
80BA4AB6 AQB$B_CLASS 00
80BA4AB7 00
80BA4AB8 AQB$L_BUFCACHE 80274380 <--- !Look for this value
........ ...........
SDA> FORMAT 80274380 ! formatting the AQB$L_BUFCACHE
80274380 F11BC$L_BUFBASE 80297400
.......
802743C8 F11BC$Q_POOL_WAITQ 802743C8 <--- !Look for this value.
802743CC 802743C8
802743D0 802743D0
802743D4 802743D0
802743D8 802743D8
802743DC 802743D8
802743E0 802743E0
802743E4 802743E0
802743E8 F11BC$L_POOLAVAIL 0000002F
802743EC 0000044A
.
The forward link and the backward link both point to the forward link address, which indicates that the queue is empty. In this
example, the process is not waiting for PAGEDYN. If it were waiting for PAGEDYN, the forward and backward links would hold
addresses other than that of the queue header, indicating that the queue is not empty.
SDA> EXIT
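The empty-queue test applied to F11BC$Q_POOL_WAITQ is the usual one for a doubly linked queue header: the queue is empty when both links point back at the header itself. A minimal Python sketch, taking the header address to be 802743C8 (the value both links hold in the example) and using a hypothetical function name:

```python
def queue_is_empty(header_addr: int, flink: int, blink: int) -> bool:
    """A queue header is empty when both links point back at the header."""
    return flink == header_addr and blink == header_addr


# Values from the example: both links equal the header address.
print(queue_is_empty(0x802743C8, 0x802743C8, 0x802743C8))

# Hypothetical non-empty case: the links point at a waiting entry elsewhere.
print(queue_is_empty(0x802743C8, 0x80A12340, 0x80A12340))
```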
7. Value Comes Back with EXE$DCLEXH+141
If the value comes back with EXE$DCLEXH+141, the process is waiting for a subprocess to terminate before it can terminate. Many
times the subprocess is also in an RWxxx state.
To check for this occurrence, the PCB can be formatted as described above and the PCB$W_PRCCNT field checked. This field contains
the number of subprocesses this process is waiting for.
The following DCL command procedure can be used to check all subprocesses on the system to find which process has the RWAST
parent process. The subprocess Username and Process Name are displayed along with the parent process PID and image name.
$ context = ""
$!
$loop:
$ pid = F$PID( context )
$ if pid .eqs. "" then $ goto done
$ owner = F$GETJPI( pid , "owner" )
$ if owner .eqs. "" then $ goto loop
$ username = F$GETJPI( pid , "username" )
$ prcname = F$GETJPI( pid , "prcnam" )
$ imagname = F$GETJPI( pid , "imagname" )
$ imagname = F$PARSE( imagname ,,, "name" )
$ text = F$FAO( "!8AS !8AS !8AS !15AS !AS" -
, pid , owner , username , prcname , imagname )
$ write sys$output text
$ goto loop
$!
$done:
If no subprocess is found for the parent process, the parent process will wait forever and the system will have to be rebooted
to eliminate this process. It could be that privileged code is altering the subprocess PCB$L_OWNER field so process termination
does not know about the parent process.
To find the subprocess in a crash dump, you need to locate the OWNER field whose PID matches that of the parent process in
RWAST. This can be done by displaying every owner field in every PCB available on the system.
SDA> READ SYS$SYSTEM:SYSDEF
SDA> SHOW SUMMARY ! to get all the PCB addresses
then for each PCB address you can enter:
SDA> EXAMINE
When you find the process whose owner/parent is the PID of the process in RWAST, you can start analyzing why the subprocess is
not terminating.
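Once the PID and owner of each process have been written down, whether from the DCL procedure or from a crash dump, the matching step is a plain filter on the owner field. The Python sketch below shows the idea; the parent PID is the SMITH process from the earlier example, while the other records are invented purely for illustration.

```python
def children_of(parent_pid: str, processes: list) -> list:
    """Return the processes whose owner field matches the RWAST parent's PID."""
    return [p for p in processes if p["owner"] == parent_pid]


# Hypothetical pid/owner pairs; only 2020104F comes from the article's example.
processes = [
    {"pid": "2020104F", "owner": "", "name": "SMITH"},
    {"pid": "20201050", "owner": "2020104F", "name": "SMITH_1"},
    {"pid": "20201051", "owner": "20201005", "name": "KILEY_1"},
]

for child in children_of("2020104F", processes):
    print(child["pid"], child["name"])
```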