[net.unix-wizards] sleep

stuart (08/14/82)

Sleep(3) uses the alarm signal and claims to restore the state of the
signal.
I use a sleep inside my routine which handles the alarm signal.
But can anyone explain why sleep(3) might not fully restore the alarm
to the way things were before?  Both the remaining alarm time and
the 'current' handler routine are available to sleep(3) if it wants
to look at them.

v.wales@Ucla-Security@sri-unix (08/31/82)

From: v.wales at Ucla-Security (Rich Wales)
Date: 25 August 1982 1013-PDT (Wednesday)
If you're using sleep(3) in conjunction with sigset(3), you might be
running into trouble because sleep(3) uses the old "signal" system call
instead of the new Berkeley signal-handling stuff.  I don't think
"signal" and "sigset" were ever designed to coexist peacefully in the
same program.

To fix this problem, put a version of "sleep" in the "libjobs" library
in which the "signal" calls have been replaced by "sigset" calls.

Once you do this, any programs that use "-ljobs" will use the new
"sleep" (with sigset's instead of signal's), while programs that don't
use "-ljobs" will be unaffected.

-- Rich

chris@umcp-cs.UUCP (07/05/83)

I  have a bone to pick with the C library "sleep" routine.  First, some
background:

The  way  sleep  works  is  something like this:  set a signal trap for
SIGALRM, set the alarm, and pause forever.  When the SIGALRM  hits  the
trap function uses longjmp to break out of the for(;;).

Now here's what can go wrong:

While  pause()d, suppose a SIGHUP arrives.  The SIGHUP handler wants to
write something to a file, then clean  up  and  exit.    So  it  begins
processing  (with  SIGHUP  carefully  turned off).  Meanwhile the alarm
clock is still ticking away.  Suddenly the alarm goes  off,  and  (here
comes  the  bug)  the  longjmp() RETURNS FROM THE sleep() CALL.  No one
even notices that a hangup was being processed!

Seems  to  me  the way sleep() ought to work is:  set a signal trap for
SIGALRM, clear an alarmed flag, set the alarm, and  while  the  alarmed
flag is clear, pause.  Using longjmp() was an outright mistake.
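
A minimal sketch of the flag-based scheme proposed above, in present-day C.
The names flag_sleep, note_alarm and alarmed are invented for this example,
and it is only one reading of the description, not the code actually used in
Emacs.  It does not restore a previously pending alarm.

```c
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t alarmed;   /* hypothetical flag, set by the SIGALRM trap */

/* SIGALRM trap: record that the alarm fired and return normally. */
static void note_alarm(int sig)
{
    (void)sig;
    alarmed = 1;
}

/* Sketch of a sleep that waits on a flag instead of jumping out of pause(). */
void flag_sleep(unsigned seconds)
{
    void (*old_handler)(int);

    if (seconds == 0)
        return;

    old_handler = signal(SIGALRM, note_alarm);
    alarmed = 0;
    alarm(seconds);
    while (!alarmed)
        pause();        /* other signals wake us, but we go back to waiting */
    signal(SIGALRM, old_handler);
}
```

Since the trap returns normally, a handler for some other signal that was
interrupted by the alarm simply resumes where it left off.  The sketch still
leaves a window between testing the flag and calling pause().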

Anyone  see  any  problem  with  this?  I've been using it in Emacs for
quite a while now with no trouble.  (I needed to  prevent  the  longjmp
for the echo-keystrokes mod.)

					- Chris
-- 
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs
ARPA:	chris.umcp-cs@UDel-Relay

kwmc@hou5d.UUCP (07/06/83)

Regarding a bug in sleep. There is yet another bug in that on a heavily
loaded system it is possible for the alarm to go off before the pause is ever
entered.  Thus sleep(1) on a heavily loaded system COULD sleep forever.  This
happened here, and we reinstated sleep as a system call in itself to fix
the problem. Sleep used to be a system call in the dim and distant past. I feel
that it still should be.
			Ken Cochran       hou5d!kwmc

tom@rlgvax.UUCP (07/08/83)

The sleep() system call was changed to alarm(), pause() between V6 and PWB
UNIX.   At that time the whole sleep structure was redone.  The new setup
had the bug/limitation that there is a critical period between setting the
alarm and actually getting to sleep.  It also has the unfortunate point
of using and relying on signals, which generally suck.

However, before you jump to put in the old sleep() system call, you should be
aware of its problem, too.  The old way that sleep() was done had a very
inefficient implementation.  If several processes were sleeping, and the time
came for one to wake up, they ALL woke up.  They would ALL swap in (if they
were swapped out), check to see if their time elapsed, and if not, would
go back to sleep again.  In a system I designed once, I had the machine
thrashing like hell because several processes were supposedly sleeping!!!
Installing pause() speeded the machine up greatly.

So, if you are going to put sleep() back, make sure you do it well...

- Tom Beres
CCI, Reston (formerly RLG)
{seismo, allegra, mcnc, brl-bmd}!rlgvax!tom

ka@spanky.UUCP (07/10/83)

The problem with implementing sleep by setting a flag when the alarm
signal is received and testing for this flag in a loop around the call
to pause() is that the alarm signal may come in between the time the
flag is checked and pause is called.  This may be an improbable occurrence,
but I have seen it happen with a version of sleep that works exactly as
Chris proposes.
				Kenneth Almquist

dale@cbosg.UUCP (07/12/83)

Another possibility, which we have used for years, is to make pause()
return immediately with an error condition if there is no alarm
outstanding.  This has the disadvantage that a process cannot pause
for another signal unless an alarm is also set, but that is not an undue
restriction, and it eliminates all race conditions.

guy@rlgvax.UUCP (Guy Harris) (07/17/83)

Having pause() only work if there is an alarm outstanding works if either
1) you have some other way to implement a block/wakeup mechanism (using
pause(), signal(), and kill() for this is a ghastly kludge, but if it's all
you've got...), as USG UNIX 5.0 does, as I believe 4.2BSD does (I don't have
my 4.2BSD System Manual handy), and as I suspect CB-UNIX does; or 2) every
time a process does a block explicitly requested as such by user-mode code,
it wants a timeout.

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

guy@rlgvax.UUCP (Guy Harris) (07/17/83)

It turns out the speedup was due to other causes; pause() sleeps on &u for
all processes where it is called.  Therefore, anybody who gets woken up during
a pause() wakes up everybody else who is pause()ing, so the old problem is
still there.  To remove this problem, make pause() sleep on u.u_procp+1 or
some other unique number (u.u_procp is out because wait() sleeps there).

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

guy@rlgvax.UUCP (Guy Harris) (07/18/83)

In a previous note I mentioned that if several processes are sleeping on
a pause(), and one of them gets an alarm(), all of them get woken up.  It
turns out that the problem is that in vanilla V7 psignal() uses wakeup() to
wake the recipient up.  This means that if more than one process is in a
pause() and sleeping on &u (which is the same address for all processes on
most systems), if one process gets signalled out of the sleep all the other
processes will wake up.  However, on 4.1BSD and System III, it looks like
this isn't a problem because on those systems psignal() doesn't use wakeup()
to make the target process runnable, but moves the process to the run queue
and changes its state directly.  We encountered the problem on a V7-based
kernel.
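
The difference between the two wake-up styles described above can be shown
with a toy user-space model.  This is not kernel source: toy_proc,
toy_wakeup and toy_setrun are invented names that only loosely mirror the
kernel's process table, wakeup() and setrun().

```c
#include <stddef.h>

enum toy_state { TOY_SLEEPING, TOY_RUNNABLE };

struct toy_proc {
    enum toy_state state;
    const void *wchan;      /* address the process is sleeping on, or NULL */
};

/* Channel-based wake-up: every sleeper on chan becomes runnable.
 * Returns the number of processes woken. */
int toy_wakeup(struct toy_proc *tab, size_t n, const void *chan)
{
    int woken = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (tab[i].state == TOY_SLEEPING && tab[i].wchan == chan) {
            tab[i].state = TOY_RUNNABLE;
            tab[i].wchan = NULL;
            woken++;
        }
    }
    return woken;
}

/* Direct wake-up: only the named process becomes runnable. */
void toy_setrun(struct toy_proc *p)
{
    p->state = TOY_RUNNABLE;
    p->wchan = NULL;
}
```

With three processes all sleeping on the same address, toy_wakeup() makes
all three runnable, while toy_setrun() on one of them leaves the other two
asleep.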

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

edhall%rand-unix@sri-unix.UUCP (07/19/83)

I don't think &u is EVER woken up.  A signal wakes up a process
via a direct call to setrun(), and not by calling wakeup() on its
current p_wchan.  Take a look in sig.c at the signal routines, and
in slp.c at setrun(), and you'll see what I'm talking about.

		-Ed

rtc@cca.UUCP (Richard Carling) (08/01/83)

  In 4.1c there should be an fsleep() call available.  This uses
sixtieths of a second as a parameter instead of seconds (so pass it a
60 instead of a 1).  It does not use signals but the kernel's sleep-wakeup
mechanism.  If you're having problems or want something with a little finer
granularity you might want to try it.  I never understood why it wasn't
part of unix from the start.

jmc@root44.UUCP (John Collins) (08/02/83)

I must admit I hate the idea of a built-in 60th-of-a-second sleep or anything
else like it, as I am fed up with going through all sorts of programs changing
60 to 50 for use in Europe, where the power is 50 Hz.

Then of course someone is bound to come out with a machine which offers some
other basic clock frequency.  Unless there is a system call to go with it which
tells me the clock frequency, I'd rather not have it.
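
On later POSIX systems, sysconf(_SC_CLK_TCK) is one call of the kind asked
for here.  The sketch below assumes it is available; note that it reports
the tick rate used by times(), which is not guaranteed to match the
kernel's internal clock interrupt rate.  The helper name seconds_to_ticks
is invented for the example.

```c
#include <unistd.h>

/* Convert a duration in seconds to clock ticks without hard-coding 50 or 60.
 * Returns -1 if the tick rate cannot be determined. */
long seconds_to_ticks(double seconds)
{
    long hz = sysconf(_SC_CLK_TCK);     /* ticks per second as reported by the system */

    if (hz <= 0)
        return -1;
    return (long)(seconds * (double)hz + 0.5);   /* round to the nearest tick */
}
```

A program written this way asks for, say, seconds_to_ticks(0.5) ticks
instead of a literal 30 or 25.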

		John Collins
		Root Computers Ltd
			....!vax135!ukc!root44!jmc

akmal@nosc@syte.UUCP (08/05/83)

You'll find out under a heavy load why the clock handler doesn't 
make those checks every 60th of a second !!

------ Akmal

guy@rlgvax.UUCP (Guy Harris) (08/07/83)

An alternate way to implement short-duration alarms or sleeps would be to use
the "timeout" facility in the kernel.  This way, only the first entry in the
timeout queue would be checked every clock tick, not the entire process table.
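
The reason only the first entry needs checking is that such a queue can be
kept sorted, with each entry storing its delay relative to the entry before
it.  The following user-space sketch shows the idea; it is not the kernel's
code, and toy_callout, toy_timeout, toy_clock_tick and toy_count are names
invented for the example.

```c
#include <stddef.h>

struct toy_callout {
    int delta;                  /* ticks to wait after the previous entry fires */
    void (*func)(void *);       /* routine to call on expiry */
    void *arg;
    struct toy_callout *next;
};

/* Insert c so that it fires `ticks` clock ticks from now. */
void toy_timeout(struct toy_callout **head, struct toy_callout *c, int ticks)
{
    struct toy_callout **pp = head;

    /* Walk past entries that expire no later than c, consuming their deltas. */
    while (*pp != NULL && (*pp)->delta <= ticks) {
        ticks -= (*pp)->delta;
        pp = &(*pp)->next;
    }
    c->delta = ticks;
    c->next = *pp;
    if (*pp != NULL)
        (*pp)->delta -= ticks;  /* the follower is now relative to c */
    *pp = c;
}

/* Called once per clock tick: touches only the head of the queue. */
void toy_clock_tick(struct toy_callout **head)
{
    if (*head == NULL)
        return;
    (*head)->delta--;
    while (*head != NULL && (*head)->delta <= 0) {
        struct toy_callout *c = *head;

        *head = c->next;
        c->func(c->arg);
    }
}

/* Example expiry routine: bump the int counter that arg points to. */
void toy_count(void *arg)
{
    ++*(int *)arg;
}
```

Each tick costs one decrement and one comparison no matter how many
timeouts are pending; the work of keeping the queue ordered is paid at
insertion time instead.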

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

rehmi@umcp-cs.UUCP (08/08/83)

I hacked over worm once to use the ipcmessagewait() thingy for 60th
sleeps, and it is much more fun now... All you need say is
ipcmessagewait(ALLPORTS, x), where ALLPORTS is -1 and x in milliseconds.

By the way, if anybody wants the new worm, let me know...

There was also '#ifdef LUCAS' in the ~4.1c kernel for a syscall called nap()
which'll do 60ths. I thought it so bizarre that I fell out of my chair
laughing.

-- 
	By the fork, spoon, and exec of The Basfour.

Arpa:   rehmi.umcp-cs@udel-relay
Uucp:...{allegra,seismo}!umcp-cs!rehmi

rtc@cca.UUCP (Richard Carling) (08/09/83)

The fsleep() system call does use the "timeout" mechanism in the kernel. It
works fine on VERY loaded systems. It is available in 4.1BSD. I wrote it
several years ago to do real-time graphics and Berkeley has since picked
it up.