kevin@kosman.UUCP (Kevin O'Gorman) (06/06/91)
I snarfed dmake 3.7 by ftp, and I'm working to install it on my 3b1.
The bootstrap make seemed to go fine. The resulting object seems
normal. I did minimal edits: just to startup.h. I did 'make sysvr3'.

I installed the result.

Now I'm trying to use dmake to make itself again: the acid test.

It goes okay for several modules, and then at a certain point, it
reports "Error -- lost a child" and dies. If I repeat the command,
one additional source is compiled, and dmake dies in the same way
on the second. This is no way to make a program.

The natural questions arise: has anyone seen this before? Has anyone
had success on a 3b1? Does anyone have any hints? Does anyone have
any idea what the message means? (I've tried grepping the source,
but don't see that message anywhere).
--
Kevin O'Gorman ( kevin@kosman.UUCP, kevin%kosman.uucp@nrc.com )
voice: 805-984-8042
Vital Computer Systems, 5115 Beachcomber, Oxnard, CA 93035
Non-Disclaimer: my boss is me, and he stands behind everything I say.
lbr@holos0.uucp (Len Reed) (06/06/91)
In article <1362@kosman.UUCP> kevin@kosman.UUCP (Kevin O'Gorman) writes:
=It goes okay for several modules, and then at a certain point, it
=reports "Error -- lost a child" and dies. If I repeat the command,
=one additional source is compiled, and dmake dies in the same way
=on the second.
I've seen this on SCO Xenix 386.
--
Len Reed
Holos Software, Inc.
Voice: (404) 496-1358
UUCP: ...!gatech!holos0!lbr
dvadura@watdragon.waterloo.edu (Dennis Vadura) (06/11/91)
In article <1362@kosman.UUCP> kevin@kosman.UUCP (Kevin O'Gorman) writes:
>It goes okay for several modules, and then at a certain point, it
>reports "Error -- lost a child" and dies. If I repeat the command,
>one additional source is compiled, and dmake dies in the same way
>on the second.

I tried to respond to you by mail but it bounced for some reason.
The error message is printed from the sys_errlist table. I currently
don't have a fix for this situation, and can't reproduce it on anything
that I have tried.

-dennis
--
-------------------------------------------------------------------------------
"It may not be the truth, but in Baghdad,    |Dennis Vadura
 it is the News. --Unknown Gulf Correspondent |dvadura@dragon.uwaterloo.ca
===============================================================================
haug@almira.uucp (Brian R. Haug) (06/13/91)
In article <1362@kosman.UUCP> kevin@kosman.UUCP (Kevin O'Gorman) writes:
>I snarfed dmake 3.7 by ftp, and I'm working to install it on my 3b1.
>The bootstrap make seemed to go fine. The resulting object seems
>normal. I did minimal edits: just to startup.h. I did 'make sysvr3'.
>
>I installed the result.
>
>Now I'm trying to use dmake to make itself again: the acid test.
>
>It goes okay for several modules, and then at a certain point, it
>reports "Error -- lost a child" and dies. If I repeat the command,
>
>The natural questions arise: has anyone seen this before? Has anyone
>had success on a 3b1? Does anyone have any hints? Does anyone have
>any idea what the message means? (I've tried grepping the source,
>but don't see that message anywhere).

I saw this behavior as well. After some examination and code
modifications I found out that the wait call was failing with the error
that there were no children. Additional debug code showed the process
number of forked children and the value returned by wait. When the
error occurred, there had been no waits for the child in question.

I used adb to set a breakpoint at wait, and found that it was being
called from getcwd(3C). It seems that many systems implement this
subroutine as a call to popen(3C) followed by some sort of read and
then a pclose(3C). Pclose has to do a wait, which may get the child
that dmake will later be searching for.

As best I can tell, this cannot be easily fixed in any System V release
(until V.4 when we get waitpid) unless you re-write the getcwd function,
or the dmake function which calls getcwd. Best of luck.
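
The failure described above can be sketched roughly as follows, assuming a POSIX system. Everything here is illustrative: run_helper is a hypothetical stand-in for a getcwd() built on popen()/pclose(), and collect_job plays the part of a make that started a job before the library call. None of these names come from dmake or from any real C library. In the old style, the helper loops on plain wait() until its own child turns up, throwing away the status of any other child it reaps on the way.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a popen/pclose-based library routine.
 * It forks a short-lived helper and then reaps it, either with the
 * old plain-wait() loop or with waitpid(). */
static void run_helper(int use_waitpid)
{
    int status;
    pid_t r;
    pid_t helper = fork();

    if (helper < 0)
        return;
    if (helper == 0) {
        sleep(1);               /* helper outlives the caller's job */
        _exit(0);
    }
    if (use_waitpid) {
        /* Only the helper is reaped; other children are left alone. */
        waitpid(helper, &status, 0);
        return;
    }
    /* Old style: wait() returns ANY child, so the statuses of
     * unrelated children are silently discarded here. */
    while ((r = wait(&status)) != helper && r != -1)
        ;
}

/* Start a job, call the helper while the job is outstanding, then try
 * to collect the job. Returns the job's exit code, -1 if the job was
 * already reaped by someone else (ECHILD), or -2 on other failures. */
int collect_job(int use_waitpid)
{
    int status;
    pid_t job = fork();

    if (job < 0)
        return -2;
    if (job == 0)
        _exit(7);               /* the "compile" finishes quickly */

    run_helper(use_waitpid);    /* e.g. a getcwd() that forks pwd */

    if (waitpid(job, &status, 0) == -1)
        return errno == ECHILD ? -1 : -2;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -2;
}
```

With the plain wait() loop, collect_job(0) should come back as -1, because the job's status was consumed inside the helper; collect_job(1) should return the job's exit code, 7. The sketch relies on the job exiting well before the one-second helper does.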
pcg@aber.ac.uk (Piercarlo Grandi) (06/15/91)
On 13 Jun 91 01:00:35 GMT, haug@almira.uucp (Brian R. Haug) said:

haug> [ ... dmake calls getcwd(3) while it has children outstanding;
haug> since in many systems getcwd(3) just forks pwd(1), this makes
haug> for problems ... ]

haug> As best I can tell, this can not be easily fixed in any System V
haug> release (until V.4 when we get waitpid) unless you re-write the
haug> getcwd function, or the dmake function which calls getcwd.

There is a fairly clever freeware implementation of getcwd(3) going
around, one version of which has been done by Doug Gwyn. This does not
call pwd(1), and solves the problem.
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@aber.ac.uk
dvadura@watdragon.waterloo.edu (Dennis Vadura) (06/20/91)
In article <1991Jun13.010035.16040x@almira.uucp> haug@ColumbiaSC.NCR.COM (Brian R. Haug) writes:
>As best I can tell, this can not be easily fixed in any System V release (until
>V.4 when we get waitpid) unless you re-write the getcwd function, or the dmake
>function which calls getcwd. Best of luck.

Many thanks to Brian for finding this bug. It's really hard for me to
get to a machine that exhibits the above behaviour.

Does anyone have a getcwd for Sys V that doesn't rely on forking and
invoking pwd? I'd like to include the fix in the next patch (which I
have been promising for a while and keep delaying due to this problem).

-dennis
--
-------------------------------------------------------------------------------
Sometimes fate needs a good kick in the |Dennis Vadura
butt to get it going.                   |dvadura@dragon.uwaterloo.ca
===============================================================================
les@chinet.chi.il.us (Leslie Mikesell) (06/21/91)
In article <1991Jun20.133732.1559@watdragon.waterloo.edu> dvadura@watdragon.waterloo.edu (Dennis Vadura) writes:
>Does anyone have a getcwd for Sys V that doesn't rely on forking and invoking
>pwd. I'd like to include the fix in the next patch (which I have been
>promising for a while and keep delaying due to this problem).

The machines that have this problem generally do not have symlinks, so
there should never be any surprises from getcwd(). Can't you just pick
up your starting directory before doing any work and track your own
chdir()s relative to that?

Les Mikesell
les@chinet.chi.il.us
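
One way the suggestion above might look in C, as a rough sketch only. The names tracked_init, tracked_chdir, tracked_getcwd, track_path and the CWD_MAX limit are all made up for illustration and are not part of dmake. The idea is to call getcwd() once, before any children exist, and afterwards update a private copy of the path on every chdir(). The ".." handling is purely lexical, which is only trustworthy under the stated assumption that the system has no symlinks.

```c
#include <string.h>
#include <unistd.h>

#define CWD_MAX 1024            /* assumed limit, for illustration only */

static char tracked_cwd[CWD_MAX];

/* Lexically apply `path` to the absolute, normalized directory in
 * `cwd`. Returns 0 on success, -1 if `cwd` is not absolute or the
 * result would not fit. Correct only where there are no symlinks. */
int track_path(char cwd[CWD_MAX], const char *path)
{
    char out[CWD_MAX];
    size_t len;
    const char *p = path;

    if (*p == '/') {            /* absolute path: start over at root */
        out[0] = '/';
        len = 1;
    } else {
        len = strlen(cwd);
        if (len == 0 || len >= CWD_MAX || cwd[0] != '/')
            return -1;
        memcpy(out, cwd, len);
    }

    while (*p != '\0') {
        size_t n;

        while (*p == '/')       /* skip repeated slashes */
            p++;
        n = strcspn(p, "/");
        if (n == 0)
            break;
        if (n == 1 && p[0] == '.') {
            /* "." leaves the directory unchanged */
        } else if (n == 2 && p[0] == '.' && p[1] == '.') {
            /* ".." drops the last component, never going above "/" */
            while (len > 1 && out[len - 1] != '/')
                len--;
            if (len > 1)
                len--;
        } else {
            size_t slash = (len > 1) ? 1 : 0;   /* no extra slash after "/" */

            if (len + slash + n >= CWD_MAX)
                return -1;
            if (slash)
                out[len++] = '/';
            memcpy(out + len, p, n);
            len += n;
        }
        p += n;
    }
    out[len] = '\0';
    memcpy(cwd, out, len + 1);
    return 0;
}

/* Call once at startup, before forking anything. */
int tracked_init(void)
{
    return getcwd(tracked_cwd, sizeof tracked_cwd) != NULL ? 0 : -1;
}

/* Replacement for chdir() that keeps the private copy in step. */
int tracked_chdir(const char *path)
{
    if (chdir(path) != 0)
        return -1;
    return track_path(tracked_cwd, path);
}

/* Replacement for later getcwd() calls: no fork, no wait. */
const char *tracked_getcwd(void)
{
    return tracked_cwd;
}
```

A caller would run tracked_init() first, route every directory change through tracked_chdir(), and read tracked_getcwd() wherever it previously called getcwd(), so no child is ever forked just to learn the current directory.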