system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) (10/30/90)
In article <9010292020.AA26213@richter.mit.edu> krowitz@RICHTER.MIT.EDU (David Krowitz) writes:
>Well, I'm trying to build the X11 R4 sources on our DN3000's, 3500's, and
>2500's without much luck.
> [ ... notes about hung nested makes deleted ... ]
>After a while, the machine will eventually crash.
>Has anyone seen this sort of behaviour? Do you know how to get around it?

I have seen this sort of behaviour whenever a script/program is run that
creates lots (i.e. hundreds or thousands) of subshells very quickly -
the problem is worse when the shells get more deeply nested.
This problem was insurmountable at SR10.0, surmountable at SR10.1 by
running such tasks immediately after rebooting and rebooting immediately
afterwards (otherwise the system would just hang shortly anyway),
and much better under SR10.2 but not perfect (I still use the same workaround
as for SR10.1, though it is a pain in the ass to have to take down a
DN10020 and lose all the active jobs just because the Apollos can't fork a
shell properly).

We have this problem whenever I run the node protection scripts,
which run several thousand subshells and nest about 3 deep, or when I
compile the NCAR graphics software, which has makes nested 4-5 deep.
--
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094 Fax: (416) 978-8775
oliveria@srvr1 (ROQUE DONIZETE DE OLIVEIRA) (10/30/90)
From article <1990Oct29.223226.9532@alchemy.chem.utoronto.ca>, by system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)):
> In article <9010292020.AA26213@richter.mit.edu> krowitz@RICHTER.MIT.EDU (David Krowitz) writes:
>>Well, I'm trying to build the X11 R4 sources on our DN3000's, 3500's, and
>>2500's without much luck.
>> [ ... notes about hung nested makes deleted ... ]
>>After a while, the machine will eventually crash.
>>Has anyone seen this sort of behaviour? Do you know how to get around it?
>
> I have seen this sort of behaviour whenever a script/program is run that
> creates lots (i.e. hundreds or thousands) of subshells very quickly -
> the problem is worse when the shells get more deeply nested.
> This problem was insurmountable at SR10.0, surmountable at SR10.1 by
> running such tasks immediately after rebooting and rebooting immediately
> afterwards (otherwise the system would just hang shortly anyway),
> and much better under SR10.2 but not perfect (I still use the same workaround
> as for SR10.1, though it is a pain in the ass to have to take down a
> DN10020 and lose all the active jobs just because the Apollos can't fork a
> shell properly).
>
> We have this problem whenever I run the node protection scripts,
> which run several thousand subshells and nest about 3 deep, or when I
> compile the NCAR graphics software, which has makes nested 4-5 deep.
> --
> Mike Peterson, System Administrator, U/Toronto Department of Chemistry
> E-mail: system@alchemy.chem.utoronto.ca
> Tel: (416) 978-7094 Fax: (416) 978-8775

We had the same problem (make crashing because the makes were nested too
deeply, 4 or 5 levels) when installing NCAR graphics. The solution was to
modify some rules by adding a "wait" statement after the recursive make.

Example:

all::
	@for dir in $(SUBDIRS) ; do \
		(cd $$dir; echo "Making $$dir"; \
		$(MAKE) $(MFLAGS) ; wait ); \
	done

Roque Oliveira
oliveria@caen.engin.umich.edu
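
For anyone who wants to try the same loop outside a makefile, here is a rough sketch of it as a plain Bourne shell function. The function name make_subdirs and the status handling are my own assumptions and are not part of the rule Roque posted; MAKE and MFLAGS are read from the environment to mirror the make variables. The post does not say why "wait" helped on the Apollos. In a standard sh, a bare "wait" simply blocks until all background children of the current (sub)shell have exited, and returns at once if there are none.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): run $MAKE in each
# listed subdirectory, one at a time, with a "wait" before each subshell
# exits, as in the makefile rule above.
make_subdirs() {
    for dir in "$@"; do
        (
            cd "$dir" || exit 1
            echo "Making $dir"
            # MAKE and MFLAGS come from the environment; default to plain "make".
            ${MAKE:-make} ${MFLAGS:-}
            status=$?
            # Block until any background children of this subshell are gone.
            wait
            # Assumption: pass the make status back up (the posted rule ignores it).
            exit "$status"
        ) || return 1
    done
}

# Usage (directory names are placeholders):
#   make_subdirs lib util programs
```

Unlike the posted rule, this version stops at the first subdirectory whose make fails; drop the "|| return 1" to get the keep-going behaviour of the original loop.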