[comp.unix.questions] Bourne-Shell and fork

martin@mwtech.UUCP (Martin Weitzel) (02/21/90)

In article <720013@hpcljws.HP.COM> jws@hpcljws.HP.COM (John Stafford) writes:
>And beware, if you redirect the input or output of the loop as a whole, you
>won't be able to get the variables out at all (as the loop will be executed
>by a child shell).

It should be noted that this is the regular behaviour of the Bourne
Shell too (not only "ksh"). As a general strategy, the Bourne Shell
seems to avoid forking as long as things don't get "too complicated
without a fork". Forking on redirection is found with some "older"
constructs (like {}-command grouping and loops) but not with some
"newer" ones (like I/O redirection on internally executed commands
and shell functions).

This can be confusing sometimes. Consider the following situations:

( cd somewhere; morestuff ) # sh forks, working directory doesn't change
morecommands                # for process executing morecommands

( cd somewhere; morestuff ) > whatever # same as above, working directory
morecommands                # doesn't change .....

{ cd somewhere; morestuff;} # sh doesn't fork, working directory *changes*
morecommands                # for process executing morecommands

{ cd somewhere; morestuff;} > whatever # sh now forks!! working directory
morecommands                # doesn't change for process executing morecommands
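
A small probe can show which behaviour a particular shell has. This is
only a sketch: the variable names and the choice of target directory are
made up for illustration. It assumes nothing beyond `cd`, `pwd` and
backquotes, and it assumes / and /tmp both exist.

```shell
#!/bin/sh
# Probe: does this shell run a redirected {} group in a child process?
# All names here (start, target, after_*) are illustrative only.
start=`pwd`
target=/
[ "$start" = / ] && target=/tmp     # make sure the cd really moves us

( cd "$target" )                    # subshell: our own cwd must not change
after_subshell=`pwd`

cd "$start"
{ cd "$target"; }                   # plain group: runs in this process
after_group=`pwd`

cd "$start"
{ cd "$target"; } > /dev/null       # redirected group: the case in question
after_redirected=`pwd`

cd "$start"
if [ "$after_redirected" = "$start" ]; then
    echo "redirected {} group ran in a child process (fork)"
else
    echo "redirected {} group ran in this process (no fork)"
fi
```

On the Bourne Shell described here the first message would be expected;
a shell that follows the later POSIX rules should print the second.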

So be careful! Adding redirection to some shell constructs can change
the semantics of these constructs, because the shell forks for
something it would do without a fork if you omit the I/O redirection.
There are some other pitfalls with forking vs. non-forking of a new
process by the shell. Consider the following:

v="initial value"
v="new value" cmd # v is set to "new value" only in the environment of cmd
nextcommand $v	  # and $v expands to "initial value" here

This is well-known behaviour. But what if cmd is executed internally?
I think you guessed it: now $v expands to "new value" when executing
nextcommand. The following fragment of a script makes use of this:

x=external
x=internal pwd >/dev/null
echo "pwd is an $x command in this version of the shell"
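
The two cases can be put side by side in one script. This is a sketch,
not a portable test: `sh -c` merely stands in for "some external
command", and the after_* variables are invented for illustration. Note
the assumption that a POSIX-style shell restores the variable after a
regular builtin such as pwd, so the trick above only tells builtins
apart on shells with the older behaviour.

```shell
#!/bin/sh
# Case 1: assignment in front of an external command.
v="initial value"
v="new value" sh -c 'echo "child sees: $v"'   # only the child gets "new value"
after_external=$v
echo "after external command: $after_external"

# Case 2: assignment in front of a command the shell may run internally.
x=external
x=internal pwd >/dev/null
after_builtin=$x
echo "after pwd: x=$after_builtin"
```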

Finally let's come to shell functions (which are, of course,
executed internally). What do you think will happen in case of:

f() { somestuff $v; }
v="initial"
v="new" f

You may choose among these possibilities for the value of $v
when/after executing f: (1) initial/initial, (2) initial/new,
(3) new/initial and (4) new/new. What is your guess?
If "f" were a true internal command the answer would be (2).
If "f" were an external script the answer would be (3).
IMHO also (4) would make some sense, because functions are
executed internally and "v" may be set to the new value prior
to the execution of "f".
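
Rather than guessing, the four cases can be told apart with a probe
like the one below. It is a sketch only: the function body and the
variables `during` and `after` are illustrative. The result depends on
the shell; the assumption here is that a POSIX-style shell makes the
new value visible inside the function, while what happens afterwards
differs between implementations.

```shell
#!/bin/sh
# f records the value of v that is visible while the function runs.
f() { during=$v; }

v="initial"
v="new" f
after=$v          # value of v once f has returned

echo "during f: $during / after f: $after"
```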

The implementors of shell functions have chosen (1) - don't ask
me why. Of course, problems only occur in rare situations, but I
would have appreciated a more 'predictable' approach. Interestingly
enough, the shell doesn't fork if it executes a function - even
if you redirect I/O for the function. So you can put a loop in a
shell function and redirect I/O for the function call, if you need
to get variables out of the loop body.
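
As a minimal sketch of that workaround (the function name, the
variables and the temporary file are all made up for the example), the
loop lives in a function and the redirection is applied to the call:

```shell
#!/bin/sh
# Hypothetical input file with three lines.
tmp=${TMPDIR:-/tmp}/loopdemo.$$
printf 'one\ntwo\nthree\n' > "$tmp"

# The loop sets n and last; because the function runs in the current
# shell, both are still set after the call returns.
count_lines() {
    n=0
    while read line; do
        n=`expr $n + 1`
        last=$line
    done
}

count_lines < "$tmp"      # redirect the call, not the loop itself
rm -f "$tmp"

echo "read $n lines, last was '$last'"
```

Redirecting the loop directly (`done < "$tmp"`) is the form that loses
n and last on a shell that forks for it.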

BTW: I would not be surprised, if the behavior of {}-command grouping
and loops with redirection will change in the future ... so don't
depend on it.
-- 
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83

maart@cs.vu.nl (Maarten Litmaath) (02/23/90)

In article <642@mwtech.UUCP>,
	martin@mwtech.UUCP (Martin Weitzel) writes:
)...
){ cd somewhere; morestuff;} > whatever # sh now forks!! working directory
)morecommands                # doesn't change for process executing morecommands

Too bad... :-(
Now we need something gross like the following to redirect the output to
another file temporarily:

	exec 3>&1 > file	# remember the original stdout in file
				# descriptor 3, connect stdout to `file'
	some_stuff
	exec 1>&3 3>&-		# reconnect stdout to the original file,
				# close fd 3
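
A runnable version of the same idea, as a sketch: the temporary file
name and the variables are invented for illustration, and `cd /` stands
in for some_stuff to show that it runs in the current process.

```shell
#!/bin/sh
tmp=${TMPDIR:-/tmp}/execdemo.$$

exec 3>&1 > "$tmp"      # save stdout in fd 3, send stdout to the file
cd /                    # stands in for some_stuff; no child shell involved
echo "captured line"
exec 1>&3 3>&-          # restore the original stdout, close fd 3

captured=`cat "$tmp"`
rm -f "$tmp"
now=`pwd`

echo "file held: $captured"
echo "cwd is now: $now"
```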

)...
)v="initial value"
)v="new value" cmd # v is set to "new value" only in the environment of cmd
)nextcommand $v	  # and $v expands to "initial value" here
)
)This is well-known behaviour. But what, if cmd is executed internally?
)I think you guessed it -- now $v expands to "new value" when executing
)nextcommand.  The following fragment of a script makes use of this:
)
)x=external
)x=internal pwd >/dev/null
)echo "pwd is an $x command in this version of the shell"

This behavior is RIDICULOUS!  A DISGUSTING BUG!
When a `normal' command is made into a built-in, redirection mustn't change
its behavior.  Likewise it shouldn't make any difference if a command is a
function or an executable.

)...
)BTW: I would not be surprised, if the behavior of {}-command grouping
)and loops with redirection will change in the future ... so don't
)depend on it.

If the behavior of {}-grouping and loops with redirection were to change,
it would break some scripts.  Considering the gain in elegance I wouldn't
object though.
--
  "Ever since the discovery of domain addresses in the French cave paintings
  [...]"  (Richard Sexton)      |  maart@cs.vu.nl,  uunet!mcsun!botter!maart

chet@cwns1.CWRU.EDU (Chet Ramey) (02/24/90)

In article <5649@star.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
>In article <642@mwtech.UUCP>,
>	martin@mwtech.UUCP (Martin Weitzel) writes:
>)...
>){ cd somewhere; morestuff;} > whatever # sh now forks!! working directory
>)morecommands                # doesn't change for process executing morecommands
>
>Too bad... :-(

Bash does, however, handle this correctly:

cwns1$ { cd /usr/local/bin ; pwd ; } > foo
cwns1$ echo $PWD
/usr/local/bin				<----- CWD changes for process
cwns1$ pwd
/usr/local/bin
cwns1$ cat foo
foo: No such file or directory
cwns1$ cd -
cwns1$ cat foo
/usr/local/bin
cwns1$

ksh does this right, too.

I would think that, given the ability of modern sh's (s5r2 and later) to
redirect builtin commands, this would be easy to do.  4.3 BSD /bin/sh
is probably a lost cause (maybe CSRG will replace it with bash someday).

Perhaps it's because the ability to redirect builtins was added later,
while the ability to redirect group commands has been around since V7 --
the group command execution stuff was not updated.  All in the name of
backwards compatibility. 

>)...
>)v="initial value"
>)v="new value" cmd # v is set to "new value" only in the environment of cmd
>)nextcommand $v	  # and $v expands to "initial value" here
>)
	[But not if `cmd' is a shell builtin]

>This behavior is RIDICULOUS!  A DISGUSTING BUG!

Bash gets this right, too (as does ksh):

cwns1$ v="initial value"
cwns1$ v="new value" builtin type builtin    <----- force execution of builtin
builtin is a shell builtin
cwns1$ echo $v
initial value
cwns1$

I believe this is a reasonably well-known sh bug that is slated to change.

>If the behavior of {}-grouping and loops with redirection were to change,
>it would break some scripts.  Considering the gain in elegance I wouldn't
>object though.

I believe that at least the behavior of {}-grouping *will* change, in
response to POSIX.2.  Bash is supposed to conform to that spec, so we've
tried to implement bash so that it is an example of what will be `standard
shell behavior'. 

Chet Ramey
-- 
Chet Ramey				"Can't you pay a grad student to 
Network Services Group			 read the manual for you?"
Case Western Reserve University			-- Bill Wisner,
chet@ins.CWRU.Edu				   	to Peter Honeyman