[comp.unix.questions] sh script hangs after first external command

sewilco@datapg.MN.ORG (Scot E Wilcoxon) (03/05/89)

A program is doing a system(3) call to execute a shell script.
The script hangs after doing the first command which is not a shell
builtin.  Any ideas what can do that?

This problem is taking place in a System V.2 Honeywell-Bull XPS-100
(X-20, I think), with a single 68020 processor.

A `ps -ef` shows two processes, "sh -c /usr/sewilco/script" and
"/usr/sewilco/script".  The second process burns cpu time steadily,
about one cpu second per elapsed second.

The script sets some environment variables, then does
	date >>/tmp/sew1
	date >>/tmp/sew1

The output of the first date reaches /tmp/sew1, but that of the second
does not.  If "pwd" or "echo" (both shell builtins) is substituted for
the date commands, the output of both reaches /tmp/sew1, and the script
hangs further along at the next external command.

The parent program is fairly active, so I checked the number of open
files and file locking, among other things:

	I'm not running out of file descriptors, as a close(2)
	loop reported only 9 files closed before executing
	system(3), and the script still hung.

	There are no mandatory file locks in use.

	A shared memory segment was detached before executing
	system(3), so that can't be a problem.
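
A rough sketch of the descriptor check, for anyone who wants to repeat
it without actually closing anything.  The helper name count_open_fds
and the 4096 cap are assumptions made for illustration, not part of the
program described here; it probes each descriptor with fcntl(F_GETFD),
which fails on descriptors that are not open.

```c
#include <fcntl.h>
#include <unistd.h>

/* count_open_fds is a hypothetical helper, not from the original program.
 * It returns how many descriptors are currently open in this process. */
int count_open_fds(void)
{
    long limit = sysconf(_SC_OPEN_MAX);   /* per-process descriptor limit */
    int fd, n = 0;

    /* Assumption: cap the scan so an unknown or very large limit stays cheap. */
    if (limit < 0 || limit > 4096)
        limit = 4096;

    for (fd = 0; fd < limit; fd++)
        if (fcntl(fd, F_GETFD) != -1)     /* -1 (EBADF) means fd is not open */
            n++;
    return n;
}
```

Calling it just before system(3) and comparing the result against the
per-process limit would show how much headroom is left.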

Two nephew processes and other terminals continue running, so
the system has not crashed.

When the parent is killed, the two system(3) processes continue to
exist until `kill -9` is used.  If the problem were a process limit,
the hung process should have continued when the parent and nephews
exited.  There are fewer than 10 related processes simultaneously
active, and the process limit seems to be much higher than that.

Any ideas what can cause a shell script to spin, eating cpu?
-- 
Scot E. Wilcoxon  sewilco@DataPg.MN.ORG    {amdahl|hpda}!bungia!datapg!sewilco
Data Progress 	 UNIX masts & rigging  +1 612-825-2607    uunet!datapg!sewilco
	I'm just reversing entropy while waiting for the Big Crunch.

rogol@marob.MASA.COM (Fred Buck) (03/07/89)

In article <3642@datapg.MN.ORG> sewilco@DataPg.MN.ORG (Scot E Wilcoxon) writes:
>A program is doing a system(3) call to execute a shell script.
>The script hangs after doing the first command which is not a shell
>builtin.  Any ideas what can do that?
>
>This problem is taking place in a System V.2 Honeywell-Bull XPS-100

A shell that doesn't get the expected results from SIGCLD will hang
after the first external command, since it can't tell when the child
exits.  Typically this is caused by the parent ignoring SIGCLD before
forking the child; an ignored signal stays ignored across fork and exec,
so the shell inherits it.  Try a signal(SIGCLD,SIG_DFL) before your
system() call.  If this fixes the problem, then make provisions to save
the previous vector for SIGCLD and to restore it after system() returns.
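
The save-and-restore step might look something like the sketch below.
The wrapper name system_default_sigcld is made up for illustration, and
the SIGCLD fallback define is an assumption for systems that only
provide the POSIX name SIGCHLD; treat it as an outline, not a drop-in
fix.

```c
#include <signal.h>
#include <stdlib.h>

#ifndef SIGCLD
#define SIGCLD SIGCHLD   /* assumption: fall back to the POSIX name */
#endif

/* system_default_sigcld is a hypothetical wrapper: it runs cmd through
 * system() with SIGCLD at its default disposition, then puts back
 * whatever disposition the caller had installed before. */
int system_default_sigcld(const char *cmd)
{
    void (*saved)(int);
    int status;

    saved = signal(SIGCLD, SIG_DFL);   /* save old vector, install default */
    status = system(cmd);              /* the shell now inherits SIG_DFL */
    if (saved != SIG_ERR)
        signal(SIGCLD, saved);         /* restore the previous vector */
    return status;
}
```

The parent would then call system_default_sigcld("/usr/sewilco/script")
where it now calls system() directly.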

------------------------------------------------------------------
Fred Buck       { uunet, rutgers }!hombre!marob!rogol
                { uunet, rutgers }!hombre!magpie!lemur!rogol
                { uunet, rutgers }!rogol@marob.masa.com
------------------------------------------------------------------