[comp.sys.apollo] What makes an unstoppable process that way...

agq@itd1.dsto.oz (Ashley Quick) (10/10/90)

Sorry this is appearing so late... our news feed has been down for 2
weeks.

I have seen the discussion and the sample program which causes a
process to become unstoppable. The answer is not short, but here is
why:

When you use the prf_$ calls, they initiate an RPC exchange with the
print manager and the print server. (If the print server is an SR9
server, the action is different and no RPC interaction takes place).
This is documented in the book "Printing in the Aegis Environment".

When the RPC activities take place, it appears that the system creates
a number of tasks. (TASK_$ calls were newly released in SR10,
although I suspect they had been around a bit longer....) The TASK
mechanism is a simple means of MULTI-THREADING. It effectively creates
a number of execution threads inside a user process. There are calls to
start and stop tasks, set the priority, etc. This is all documented in
the DOMAIN/OS call reference books.

One of the not-so-nice things about tasks is that they break some
system calls: namely, any call that is not re-entrant might (note: might,
not WILL!!) fall over given the right circumstances. There is a table
in the TASK section of the call reference which describes the classes
of system calls which are not re-entrant. All UNIX calls fit this
category.
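
To make "not re-entrant" concrete, here is a minimal sketch in portable C.
It is not Domain/OS code, and the function names label_shared and label_r
are made up for illustration. The first version keeps its result in one
static buffer, so a second caller (for instance another task that gets
scheduled in between) overwrites what the first caller was still using.
The second version lets each caller supply its own storage.

```c
#include <stdio.h>
#include <string.h>

/* Non-reentrant (hypothetical example): every caller shares one buffer. */
const char *label_shared(int id)
{
    static char buf[16];
    snprintf(buf, sizeof buf, "task-%d", id);
    return buf;
}

/* Reentrant variant: the caller owns the storage, so nothing is shared. */
const char *label_r(int id, char *buf, size_t len)
{
    snprintf(buf, len, "task-%d", id);
    return buf;
}
```

If one task calls label_shared(1) and another calls label_shared(2) before
the first has finished with the result, both end up looking at "task-2".
With label_r each task passes its own buffer and the results stay apart.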

Also, some system calls have a changed behaviour when they operate in
a tasking environment. As an example of this: under SR10 the print
system (including print servers) uses NCS/RPC and the print servers have
about 4 tasks running. Try taking all the paper out of your printer,
then print a large job. Then sigp the print server (after it has
started and been held up for lack of paper). You will notice that the
print server has become unstoppable. Also note though, that when you
put paper in the printer, so that IO can take place, the print server
will die nicely, exactly as expected.

The reason for the funnies that you have been seeing is exactly the
same as that described above: When tasking is enabled, and you try to
do some IO, all tasking is STOPPED temporarily (they probably call
PFM_$INHIBIT which blocks ALL async faults, including the stop and
quit faults!). The entire process waits for the IO to complete before
anything more can happen. This means that you cannot stop the process
until there is some IO activity.
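
The same effect can be reproduced on a POSIX system, which may help to
picture it. The sketch below is only an analogy: it uses sigprocmask in
place of PFM_$INHIBIT and SIGUSR1 in place of the stop fault, and the
function name stop_is_deferred_while_inhibited is made up. It blocks the
signal, raises it, checks that the handler has not run, and then unblocks
it, at which point the held-back signal is delivered.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set by the handler once the "stop" request is actually delivered. */
static volatile sig_atomic_t stop_delivered = 0;

static void on_stop(int signo)
{
    (void)signo;
    stop_delivered = 1;
}

/*
 * Returns 1 if a stop request raised while faults were inhibited was held
 * back until they were enabled again, 0 if not, -1 on a setup error.
 */
int stop_is_deferred_while_inhibited(void)
{
    struct sigaction sa;
    sigset_t block, old, pending;
    int held_back;

    stop_delivered = 0;
    sa.sa_handler = on_stop;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* "Inhibit": block the signal, as a library might do around IO. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    raise(SIGUSR1);                 /* someone tries to stop the process */

    if (sigpending(&pending) != 0)
        return -1;
    held_back = (stop_delivered == 0) &&
                (sigismember(&pending, SIGUSR1) == 1);

    /* "Enable": restore the old mask; the pending signal arrives now. */
    if (sigprocmask(SIG_SETMASK, &old, NULL) != 0)
        return -1;

    return (held_back && stop_delivered == 1) ? 1 : 0;
}
```

Between the raise and the second sigprocmask call the process is, in
effect, unstoppable by that signal. If the code in between were a long
wait for IO, you would see the behaviour described above.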

There is also probably a very good reason to do this: When tasking is
enabled, faults are handled differently because there is no way of
telling which task was executing when the fault occurred. Also, it
could be nasty to have (say) two tasks trying to do input or output
at the same time - things would get into an awful mess!

What I am leading to here is that Apollo have not told us that calling
prf_$queue_name starts up tasking. This was news to me, but I should
not be surprised! It explains a weird bug I have seen here, also.

Suggested work around: (which I have not tried)
  It may be possible to call PFM_$ENABLE after your call to
PRF_$QUEUE_NAME, as I seem to remember that these work by
incrementing/decrementing a counter.... This would thus fool the OS
and allow faults during IO. Remember though that tasking is implemented
by delivering regular faults, so tasking would stay enabled... But
the tasks should be waiting on RPC events which would no longer occur, so
no matter...
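
As a rough picture of why an extra enable call might help, here is a toy
model of a counting inhibit/enable pair in plain C. It is an assumption
built on the recollection above that the calls work on a counter; it is
not the Domain/OS implementation, it does not call the real PFM routines,
and all the model_ names are made up.

```c
/* Toy model only: a nesting counter that gates async fault delivery. */
static int inhibit_count = 0;     /* > 0 means faults are held back */
static int fault_pending = 0;     /* a fault arrived while inhibited */
static int faults_delivered = 0;  /* how many faults got through */

static void deliver_if_allowed(void)
{
    if (inhibit_count == 0 && fault_pending) {
        fault_pending = 0;
        faults_delivered++;
    }
}

void model_reset(void)
{
    inhibit_count = 0;
    fault_pending = 0;
    faults_delivered = 0;
}

/* Each inhibit bumps the counter. */
void model_inhibit(void)
{
    inhibit_count++;
}

/* Each enable drops the counter; at zero, a held-back fault is delivered. */
void model_enable(void)
{
    if (inhibit_count > 0)
        inhibit_count--;
    deliver_if_allowed();
}

/* An async fault (stop, quit) arrives. */
void model_post_fault(void)
{
    fault_pending = 1;
    deliver_if_allowed();
}

int model_inhibit_count(void)    { return inhibit_count; }
int model_faults_delivered(void) { return faults_delivered; }
```

In this model, if a library call returns with one inhibit still
outstanding, a stop fault posted afterwards stays pending. One extra
model_enable from the caller brings the counter back to zero and the
fault goes through. Whether the real calls behave this way is exactly
what would need to be tried.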

  It may also be possible to shut down the other tasks. Your program
is called the "distinguished task" - as it started the others
(without your knowledge, though). As all tasks are identified by
handles, which you do not know, this is probably a long shot.


I hope I have answered the WHY of your problem. I cannot actually
suggest how to fix it... some playing with PFM_$ENABLE, etc may help.
I suggest you report this as an APR, as it seems to me that tasking
should be shut down once the call to PRF_$QUEUE_NAME has completed. As
this is the culprit call, I suggest that it should return to the
caller in a known state.

Ashleigh Quick
---------------------------
Ashleigh Quick                             | ACSnet: AGQ@dstos3.dsto.oz
Defence Science and Technology Organisation| Internet: AGQ@dstos3.dsto.oz.au
PO Box 1600                                | Phone: (Intl) (+61 8) 259 6975
Salisbury 5108      AUSTRALIA              |        (Local) (08) 259 6975