[comp.unix.i386] wait

carroll@m.cs.uiuc.edu (08/20/90)

As has been noted by others, Epoch is a little unstable when running
subprocesses under ISC 2.0.2. While part of this was errors on my part, I've
recently tracked at least some of the crashes down to the fact that wait(loc)
sometimes returns negative values in *loc, e.g. 0xFFFFxxyy  where xx and yy
are the values you'd normally expect in the bottom 16 bits of the return value.
TFM specifies the contents of the bottom 16 bits of *loc, but says absolutely
nothing about the top 16 bits. The GNU-Emacs process internals assume that
the value is positive, in particular that the type and mark bits are 0
(generally the top 8 bits). A simple fix is to use 
	w &= 0xFFFF;
immediately after the wait. Is this reasonable? Does Emacs make unwarranted
assumptions about wait(), or is this an ISC bug?

Alan M. Carroll                Barbara/Marilyn in '92 :
carroll@cs.uiuc.edu            + This time, why not choose the better halves?
Epoch Development Team         
CS Grad / U of Ill @ Urbana    ...{ucbvax,pur-ee,convex}!cs.uiuc.edu!carroll

dougm@ico.isc.com (Doug McCallum) (08/20/90)

In article <70400015@m.cs.uiuc.edu> carroll@m.cs.uiuc.edu writes:
...
>subprocesses under ISC 2.0.2. While part of this was errors on my part, I've
>recently tracked at least some of the crashes down to the fact that wait(loc)
>sometimes returns negative values in *loc, e.g. 0xFFFFxxyy  where xx and yy
>are the values you'd normally expect in the bottom 16 bits of the return value.
>TFM specifies the contents of the bottom 16 bits of *loc, but says absolutely
>nothing about the top 16 bits. The GNU-Emacs process internals assume that
>the value is positive, in particular that the type and mark bits are 0
>(generally the top 8 bits). A simple fix is to use 
>	w &= 0xFFFF;
>immediately after the wait. Is this reasonable? Does Emacs make unwarranted
>assumptions about wait(), or is this an ISC bug?

The wait call only specifies the bottom 16 bits.  In the V.3 implementation,
only those 16 bits are ever set so the "w &= 0xFFFF;" is reasonable.  I
checked the source and it looks like a bug in the AT&T code.  The exit()
call has its argument shifted left 8 bits and is stored in a "short" in
the proc structure.  When wait is called, that short is put into an "int"
for return to the caller.  If the exit call had been made with a negative
number, or the value when shifted left sets the sign bit of the short, the
sign gets extended into the top 16 bits of the int.  This looks like pretty
generic AT&T behavior.
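
The mechanism described above can be sketched in a few lines of C. This is
only a model, not the AT&T source: the names stored_status() and
mask_status() are made up for illustration, and the 16-bit field is
emulated with plain arithmetic so the result does not depend on how a given
compiler converts out-of-range values to short.

```c
/* Hypothetical model: the kernel keeps (exit_arg << 8) in a signed
   16-bit field and later widens that field to an int for wait(). */
int stored_status(int exit_arg)
{
    /* Only the low 16 bits of the shifted value fit in the field. */
    unsigned raw = (((unsigned)exit_arg & 0xFFu) << 8) & 0xFFFFu;

    /* Emulate widening a signed 16-bit value: if bit 15 is set,
       the resulting int is negative (0xFFFFxx00 on a 32-bit int). */
    if (raw >= 0x8000u)
        return (int)raw - 0x10000;
    return (int)raw;
}

/* The workaround from the original post: keep only the defined bits. */
int mask_status(int w)
{
    return w & 0xFFFF;
}
```

With this model, stored_status(1) stays positive, while an exit argument
of 128 or more (or a negative one such as -1) comes back negative until
mask_status() is applied.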

Doug McCallum
Interactive Systems Corp.
dougm@ico.isc.com

rvdp@cs.vu.nl (Ronald van der Pol) (08/20/90)

carroll@m.cs.uiuc.edu writes:


>sometimes returns negative values in *loc, e.g. 0xFFFFxxyy  where xx and yy
>are the values you'd normally expect in the bottom 16 bits of the return value.
>TFM specifies the contents of the bottom 16 bits of *loc, but says absolutely
>nothing about the top 16 bits. The GNU-Emacs process internals assume that
wait(loc)
int *loc;

loc = 0xHHHH.LLLL
HHHH: low byte of child exit argument (0 on normal exit)
LLLL: termination status of child process (0 on normal exit)

--
		Ronald van der Pol  <rvdp@cs.vu.nl>

carroll@m.cs.uiuc.edu (08/21/90)

/* Written 10:13 am  Aug 20, 1990 by rvdp@cs.vu.nl in m.cs.uiuc.edu:comp.unix.i386 */
carroll@m.cs.uiuc.edu writes:
>sometimes returns negative values in *loc, e.g. 0xFFFFxxyy  where xx and yy
>are the values you'd normally expect in the bottom 16 bits of the return value.
wait(loc)
int *loc;

loc = 0xHHHH.LLLL
HHHH: low byte of child exit argument (0 on normal exit)
LLLL: termination status of child process (0 on normal exit)
/* End of text from m.cs.uiuc.edu:comp.unix.i386 */
I think that you are confused. You say "byte" but you use 4 hex digits, which
is _two_ bytes. TFM does not say "byte", it specifically says "8 bits", which
is 2 hex digits, so it would be 0xQQQQEETT with Q undefined, T termination
status, and E exit code.
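
A minimal sketch of pulling the two fields out of the 0xQQQQEETT layout,
assuming nothing about the Q bits. The struct and function names are
illustrative only and do not come from any system header.

```c
/* Illustrative names, not from any system header. */
struct wait_fields {
    unsigned exit_code;    /* EE: bits 8..15 of *loc */
    unsigned term_status;  /* TT: bits 0..7 of *loc  */
};

struct wait_fields decode_wait(int w)
{
    struct wait_fields f;
    unsigned u = (unsigned)w & 0xFFFFu;  /* discard the undefined Q bits */

    f.exit_code = (u >> 8) & 0xFFu;
    f.term_status = u & 0xFFu;
    return f;
}
```

Because the Q bits are masked off first, a sign-extended value and its
16-bit equivalent decode to the same pair of fields.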

rvdp@cs.vu.nl (Ronald van der Pol) (08/21/90)

carroll@m.cs.uiuc.edu writes:

|is 2 hex digits, so it would be 0xQQQQEETT with Q undefined, T termination
|status, and E exit code.
	This is correct. When I wrote 'HHHH' I indeed meant 'HH'.
--
		Ronald van der Pol  <rvdp@cs.vu.nl>

walter@mecky.UUCP (Walter Mecky) (08/25/90)

In article <70400015@m.cs.uiuc.edu> carroll@m.cs.uiuc.edu writes:
< 
< I've
< recently tracked at least some of the crashes down to the fact that wait(loc)
< sometimes returns negative values in *loc, e.g. 0xFFFFxxyy  where xx and yy
< are the values you'd normally expect in the bottom 16 bits of the return value.
< TFM specifies the contents of the bottom 16 bits of *loc, but says absolutely
< nothing about the top 16 bits. 

As others noted, only the bottom 16 bits are defined, and it's possible
that the high-order bit is propagated. I learned the hard way some years
ago that wait(2) really does store a full int. I thought "It gives me 16
bits, so a short is enough", declared the "loc" variable as a short,
and had a hard time tracking down the mysterious changes to another variable ...

BTW: X/Open and POSIX define macros in <sys/wait.h> to access the parts
     of "loc". SCO-UNIX has them too.
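
A short sketch of the macro-based approach, assuming a POSIX system that
provides waitpid() along with WIFEXITED and WEXITSTATUS in <sys/wait.h>.
The helper name child_exit_status() is made up for this example. Note that
the status variable is declared as an int, not a short.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` and return
   the exit status the parent sees, or -1 on error or abnormal exit. */
int child_exit_status(int code)
{
    int status;              /* must be an int: wait stores a full int */
    pid_t pid = fork();

    if (pid < 0)
        return -1;           /* fork failed */
    if (pid == 0)
        _exit(code);         /* child: exit at once, no stdio flushing */

    if (waitpid(pid, &status, 0) != pid)
        return -1;           /* wait failed or was interrupted */
    if (WIFEXITED(status))
        return WEXITSTATUS(status);  /* low 8 bits of the exit argument */
    return -1;               /* child was terminated by a signal */
}
```

The macros hide the bit layout, so the caller never has to care whether
the upper bits of the status word were sign-extended.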
-- 
Walter Mecky	[ walter@mecky.uucp	or  ...uunet!unido!mecky!walter ]