[mod.unix] Unix Technical Digest V1 #12

Ron Heiby (The Moderator) <unix-request@cbosgd.UUCP> (03/12/85)

Unix Technical Digest       Tue, 12 Mar 85       Volume  1 : Issue  12

Today's Topics:
         When does ( alarm(1) == alarm(INFINITY) ) ? (2 msgs)
----------------------------------------------------------------------

Date: 6 Mar 85 07:19:59 GMT
From: jeff@heurikon.UUCP (Jeffrey Mattox)
Subject: When does ( alarm(1) == alarm(INFINITY) ) ?

Answer:  When you least expect it and, thanks to Murphy, when
you least want it.  In fact, there is a small, but non-zero,
probability that:  alarm(ANYTHING) == alarm(INFINITY).

We came across this little kernel bug the other day when trying
to figure out why certain programs were hanging on a simple
"sleep(1);" statement.  We saw it on System V, but I've been
told the problem is common to most non-BSD UNIXes.

   sleep(arg)	/* A simplified view of the sleep subroutine */
   {
	...
	alarm(arg);	/* sets  p_clktim=arg  in proc table */
	...		/* a critical time */
	pause();	/* waits for SIGALRM: --pp->p_clktim == 0 */
   }

   If an alarm(1) is executed *just* prior to a one second time
   tick, and if the time tick occurs before the pause(), then the
   pp->p_clktim value hits zero in clock() before the pause() is
   called.  SIGALRM is delivered while the process is not yet
   waiting, so the pause() that follows has no signal left to wake
   it.  This results in an INFINITE sleep.  If the process is
   suspended for more than one second prior to the pause, then
   alarms longer than one second could hang, too.
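
The window can be stretched on purpose to watch this happen.  What
follows is a minimal sketch, not the System V library source:
naive_sleep(), sleep_hangs() and the stall parameter are names made up
for illustration, and POSIX fork()/waitpid() are assumed.  A child runs
the simplified sleep with a busy loop standing in for the ill-timed
clock tick; the parent reports whether the child is still stuck in
pause() after a grace period.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Empty handler: its only job is to make a waiting pause() return. */
static void on_alarm(int sig)
{
	(void)sig;
}

/* The simplified sleep(), with the critical time stretched to
 * between `stall` and `stall`+1 seconds (hypothetical parameter). */
static void naive_sleep(unsigned secs, unsigned stall)
{
	signal(SIGALRM, on_alarm);
	alarm(secs);
	if (stall > 0) {		/* the critical time */
		time_t end = time(NULL) + stall + 1;
		while (time(NULL) < end)
			;		/* busy: stands in for a late pause() */
	}
	pause();			/* SIGALRM may already have come and gone */
}

/* Run naive_sleep() in a child.  Returns 1 if the child is still
 * asleep after `patience` seconds, 0 if it woke up, -1 on error. */
static int sleep_hangs(unsigned secs, unsigned stall, unsigned patience)
{
	int status;
	unsigned waited;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		naive_sleep(secs, stall);
		_exit(0);
	}
	for (waited = 0; waited <= patience; waited++) {
		if (waitpid(pid, &status, WNOHANG) == pid)
			return 0;	/* child woke up and exited */
		if (waited < patience)
			sleep(1);
	}
	kill(pid, SIGKILL);		/* still in pause(): put it down */
	waitpid(pid, &status, 0);
	return 1;
}
```

With no stall, e.g. sleep_hangs(1, 0, 3), the child should wake
normally; with a two second stall, e.g. sleep_hangs(1, 2, 4), the alarm
goes off inside the busy loop and the child should never return from
pause().  Timings on a heavily loaded machine may differ.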

We'd welcome suggestions on how to fix this problem; the simpler,
the better (although the simpler, the-less-likely-it-will-fix-it).
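
One suggestion, offered as a sketch rather than a tested fix for the
System V library: have the SIGALRM handler jump out of the sleep
routine instead of merely returning, so that it no longer matters
whether the signal lands before or during the pause().  jump_sleep()
and its stall argument are hypothetical names; sigsetjmp() and
siglongjmp() are the POSIX relatives of setjmp()/longjmp() and are
assumed to be available.  Note that this sketch discards any alarm the
caller already had pending.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

static sigjmp_buf wake;

/* Jump straight back into jump_sleep(), wherever it was. */
static void on_alarm(int sig)
{
	(void)sig;
	siglongjmp(wake, 1);
}

/* Sleep for `secs` seconds.  `stall` is a test aid: it spins for
 * up to `stall`+1 seconds in the critical time before pause(). */
static void jump_sleep(unsigned secs, unsigned stall)
{
	void (*old)(int);

	if (secs == 0)
		return;
	alarm(0);			/* cancel any alarm already pending */
	old = signal(SIGALRM, on_alarm);
	if (sigsetjmp(wake, 1) == 0) {
		alarm(secs);
		if (stall > 0) {	/* the critical time */
			time_t end = time(NULL) + stall + 1;
			while (time(NULL) < end)
				;
		}
		for (;;)
			pause();	/* left only via siglongjmp() */
	}
	signal(SIGALRM, old);		/* we arrive here after SIGALRM */
}
```

A call such as jump_sleep(1, 3) spins in the critical time for longer
than the alarm interval, yet should still return about one second
later, because the handler's jump ends the wait whether or not the
pause() had been reached.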

See for yourself.  Here's an adaptive program which encourages a
sleep(1) hang.  You may need to manually adjust the initial loop
count estimation algorithm, or enter an initial value as an argument.
The ease with which this hangs is scary.
------------------------------cut here-------------------------------
						/* slphang.c */
#include <signal.h>
#include <stdio.h>

/* This program is an adaptive sleep(1) tester.  When called
 * without an argument, the logic first runs one loop over a one
 * second period to estimate the initial loop count value.  If there
 * is an argument, it is taken to be a decimal value which is used
 * as the initial delay loop counter.  On our system, a value of
 * 55000 is a good starting point.
 *
 * Sleep(1)'s are done and the loop count is increased or decreased
 * depending on whether or not the delay loop took more or less than
 * one full second to execute.  The target is to loop until *just*
 * prior to a one second clock tick, hoping to set the alarm value
 * and have the clock tick occur before the sleep() does a pause().
 *
 * While running, a SIGINT will print out the current loop delay value.
 * A "+" will print if the delay loop is being increased; a "-" if the
 * delay is being decreased.  The program can be run in the background.
 * Ideally, the system should be "quiet" when executing this so the
 * adaptation will work.  Restart the program if it doesn't adapt
 * (i.e., hang) within one minute.
 *
 * Expect a bunch of "+" marks to start; if it starts with "-", it may
 * not converge.  If you get alternating "+" and "-" you're close, or
 * your system is too busy for this to work.  When you don't get
 * either (+ or -) for more than two seconds, it's hung.
 *
 * The ideal output should look something like this:
 * "++++++---+--+-+--++-+-"
 *
 * A SIGINT will cause the delay value to be printed and also will
 * "unhang" the sleep, so the program will continue printing "+"
 * and "-" until another hang.
 *
 * Use SIGQUIT to exit.
 */

int	delay;
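short	alrmflag;	/* set by onintr() when SIGALRM arrives */
short	dummy;		/* always 0; keeps delay loop cost equal to timing loop */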

main(argc,argv)
int argc;
char *argv[];
{
	int	onintr();
	register int i;
	int	delta;
	long	time1,time2;
	short	lastflag = 0;

	if ( argc > 1 )
		delay = atoi(argv[1]);

	signal(SIGINT,onintr);
	signal(SIGALRM,onintr);
	sleep(1);	/* synchronize to the clock, hope we don't hang here! */
	if ( delay == 0 ) {	/* argument? */
		alarm(1);	/* set alrmflag in one second */
		for (i=0; i<999999 && !alrmflag ; i++);
		delay = i - i/3 - i/7;	/* make a guess */
	}
	delta = delay/12;		/* initial adaptive +- increment */
	time(&time2);			/* init time2 */
	while (1) {
		time1 = time2;		/* last loop time */
		for (i=0; i<delay && !dummy ; i++);/* delay almost one sec */
		sleep(1);		/* sleep and hope for tick real soon */
		time(&time2);		/* time after sleep */
		if ( time2-time1 > 1 ) {	/* using too big a delay */
			if ( lastflag == 1 ) {	/* want two in a row */
				delta = (delta/2) | 1;	/* decrease delta */
				delay -= delta;	/* adapt */
				lastflag = 0;
			} else {
				lastflag = 1;
			}
			putc('-',stderr);
		} else {			/* using too small a delay */
			delay += delta;		/* adapt */
			putc('+',stderr);
			lastflag = 0;
		}
	}
}
onintr(sig)
{
	if ( sig == SIGALRM ) {
		alrmflag = 1;
		return;
	}
	signal(sig,onintr);
	printf("\ndly=%d\n",delay);
}
----
/"""\	Jeffrey Mattox, Heurikon Corp, Madison, WI
|O.O|	{harpo, hao, philabs}!seismo!uwvax!heurikon!jeff  (news & mail)
\_=_/				     ihnp4!heurikon!jeff  (mail - best)

------------------------------

Date: 7 Mar 85 20:34:43 GMT
From: radford@calgary.UUCP (Radford Neal)
Subject: When does ( alarm(1) == alarm(INFINITY) ) ?

I wrote the following fudge routine to handle a similar problem
when using 4.1 BSD.  As written it only works on the VAX, though
adaptation to other machines would probably be possible.  The 4.2
signal stuff allows a cleaner (though less efficient) solution.


/* PAUSE_FUDGE - Fudge routines to fix up pause race problem */

/* This module allows someone to wait until some condition has been
   made true by an interrupt routine without wasting CPU time in a
   polling loop. 

   The Unix 'pause' system call will suspend a process until an
   interrupt (i.e. signal) is received. This is not directly 
   usable in a wait loop, however, since between a check for a
   condition and a call of pause an interrupt may occur which
   would have made the condition true. So the following wait
   loop may hang up:

           while (!condition) pause();

   The following wait loop is to be used instead:

           for (;;)
           { jk_set_up_pause();
             if (condition) break;
             jk_maybe_do_pause();
           }

   The way this works is that jk_set_up_pause creates a routine
   which will perform a pause system call. jk_maybe_do_pause will
   execute this routine. The interrupt routine should be written
   to call the routine jk_disable_pause, which changes the routine
   created by jk_set_up_pause to do nothing instead of a pause.
   This is done by changing a single VAX machine instruction to
   nop's. Since jk_set_up_pause runs before the condition is
   tested, an interrupt that arrives between the test and the call
   of jk_maybe_do_pause still disables the pause, so the wakeup
   is not lost.

*/

/* Bytes 0-1 are the (zero) procedure entry mask; code starts at byte 2 */
static char pause_routine[5];		/* Pause system call or nop's */

/* Set up a routine to do a "pause" system call. */

jk_set_up_pause()
{ register char *p;
  p = &pause_routine[2];
  *p++ = 0274; *p++ = 035;	/* chmk $pause */
  *p++ = 04;			/* ret */
}

/* Execute routine to do pause system call, unless it has been nop'ed out. */

jk_maybe_do_pause()
{ (*(void (*)())pause_routine)();
}

/* Disable pause call by replacing system call with nop's */

jk_disable_pause()
{ register char *p;
  p = &pause_routine[2];
  *p++ = 01; *p++ = 01;		/* nop; nop */
}
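
For comparison, the cleaner signal-mask approach can be sketched as
follows.  This uses the POSIX calls sigprocmask() and sigsuspend(),
assumed here as stand-ins for the 4.2 BSD sigblock() and sigpause();
the names condition, on_signal, watch_signal and wait_for_condition are
made up for the sketch.  The signal is held off while the condition is
tested, and sigsuspend() unblocks it and waits in one atomic step, so
no interrupt can slip in between the test and the wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static volatile sig_atomic_t condition;	/* made true by the handler */

static void on_signal(int sig)
{
	(void)sig;
	condition = 1;
}

/* Install on_signal() as the handler for `sig`; 0 on success. */
static int watch_signal(int sig)
{
	struct sigaction sa;

	sa.sa_handler = on_signal;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	return sigaction(sig, &sa, NULL);
}

/* Race-free version of:  while (!condition) pause();  */
static void wait_for_condition(int sig)
{
	sigset_t block, old, waitmask;

	sigemptyset(&block);
	sigaddset(&block, sig);
	sigprocmask(SIG_BLOCK, &block, &old);	/* hold the signal off */
	waitmask = old;
	sigdelset(&waitmask, sig);		/* mask to wait with */
	while (!condition)
		sigsuspend(&waitmask);		/* unblock + wait, atomically */
	sigprocmask(SIG_SETMASK, &old, NULL);	/* restore caller's mask */
}
```

Typical use would be watch_signal(SIGALRM), then alarm(1), then
wait_for_condition(SIGALRM).  If the signal has already arrived by the
time the wait starts, the loop sees the condition set and returns
without waiting at all.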

------------------------------

End of Unix Technical Digest
******************************
-- 
Ronald W. Heiby / ihnp4!{wnuxa!heiby|wnuxb!netnews}
AT&T Information Systems, Inc.
Lisle, IL  (CU-D21)