[comp.os.minix] sleep

MBECK@ai.ai.mit.edu (Mark E. Becker) (06/04/88)

Hello -

     A long time ago someone posted a patch that fixed a problem
with background processes calling sleep().  I recall something
along the lines of "/etc/update stopped running and was no longer
sync'ing the disk every 30 seconds".  This apparently occurred when a
second background program used sleep() and timed out, clearing some
bit associated with /etc/update's status.

     I didn't save the patch.  Does anyone have a copy on file?  Or can
anyone point me in the "right" direction?

Muchly appreciated -
Mark Becker

csrobe@cs.wm.edu (Chip Roberson) (12/15/88)

There is a bug in Version 1.2 of PC Minix (the version that I have).

You can determine if you have the bug by running the following code:

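#include <stdio.h>

main()	/* K&R style for the Minix 1.2 compiler; fork() and sleep() are implicitly declared */
{
  if (fork() == 0)
      while (1)	/* child: prints 'c' once a second */
        { putchar('c');  fflush(stdout);  sleep(1); }
  else
      while (1)	/* parent: prints 'p' every two seconds */
        { putchar('p');  fflush(stdout);  sleep(2); }
}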

If the ONLY output you see is something like:
	cpc
then you have the same bug.  NB:  This bug will effectively kill
/etc/update by breaking its sleep/wake-up loop!

The problem is in mm/signal.c/check_sig at lines 6824-6842, where
mm searches "through the proc table for processes to signal".

for(...) {
  ...
  if (proc_id > 0 && proc_id != rmp->mp_pid) send_sig = FALSE;
  ...
  /* SIGALARM is a little special.  When a process exits, a clock signal
   * can arrive just as the timer is being turned off.  Also, turn off
   * ALARM_ON bit when timer goes off to keep it accurate.
   */
  if (sig_nr == SIGALRM) {
	if ( (rmp->mp_flags & ALARM_ON) == 0) continue;
	rmp->mp_flags &= ~ALARM_ON;
  }
  if (send_sig == FALSE || rmp->mp_ignore & mask) continue;
  ...
}

The problem occurs when two alarms are pending at the same time
and an alarm goes off for the process deeper/later in the process
table.  The first check determines that the signal is not for this
process (proc_id != rmp->mp_pid) and sets a flag (send_sig = FALSE).
Later on, it checks whether this is an ALARM signal.  If it is,
it makes sure that the process is still there and that it is
still waiting for an alarm.  [BUT, at this point, we already know that
this process is not the correct process.]  If mm sees that this process
has its ALARM_ON bit set (and /etc/update always will, and it will
always sit lower in the table than any other user process!) it turns
off the ALARM_ON bit, thinking that this process is about to be awakened.
Only the last line, above, realizes that the signal really isn't for
this process and goes on to the next iteration of the for-loop.

ERGO:  an ALARM signal for process N will clear all outstanding
ALARMs for any process i,  INIT_PROC_NR < i < N.

The fix is to move

	if (send_sig == FALSE || rmp->mp_ignore & mask) continue;

before

	if (sig_nr == SIGALRM) {

I would have sent diffs, but it was too much trouble to get them
to a networked machine.

cheers,
-c
-------------------------------------------------------------------------
Chip Roberson                ARPANET:  csrobe@cs.wm.edu
Dept of Comp. Sci.                     csrobe@icase.edu
College of William and Mary  BITNET:   #csrobe@wmmvs.bitnet
Williamsburg, VA 23185       UUCP:     ...!uunet!pyrdc!gmu90x!wmcs!csrobe
-------------------------------------------------------------------------