[gnu.emacs] Why programs use the shell to start up a program

worley@EDDIE.MIT.EDU (Dale Worley) (03/23/89)

Generally the Un*x convention is that if a program wants to start an
inferior process, it has the shell do it, rather than doing a fork()
and exec() itself.  See, for example, the system() call.  You might
well wonder why this is done, since it costs time, and probably sets
up an additional process (is that really true?).  The reason is that
often the "program name" to run is obtained from the user or an
environment variable, and using the shell to process it gives
additional flexibility.  For instance:

The program may have a text string which is a command to run, complete
with arguments.  Then it can leave the problem of parsing the string
(and expanding wild cards, ~s, etc.) to the shell.

The program name may not be an absolute path, and thus the problem of
searching the path can be left to the shell.

The program name may be an alias (which, remember, the user thinks of
as a "command").  Figuring out aliases can be left to the shell.

The program name may be a real program name with some switches.  For
example, consider setting Emacs's lpr-command to "lpr -Pprinter2".
This will cause all of the Emacs print-* commands to print on
"printer2", but only if lpr-command's value is handed over to a shell
to crack apart into arguments.  A direct exec() of "lpr -Pprinter2"
will fail, since there is no file by that name.  (This trick is really
common, and is often used by users when setting shell variables, csh
aliases, and Emacs program-name variables.  That's why Un*x options go
after the program name but before the file names, rather than, say, at
the end of the command.)
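
To make the lpr-command point concrete, here is a minimal sketch in C.
The helper name exec_and_wait is made up for illustration, and the
return value 127 for a failed exec is only a convention borrowed from
the shell.  It hands argv[0] to execvp() exactly as given, so a string
with an embedded space is looked up as one (nonexistent) file name.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: fork, exec argv[0] (searched for on PATH) with
   the given argument vector, and wait for it.  Returns the child's
   exit status, 127 if the exec itself failed, or -1 on fork/wait
   errors or abnormal termination. */
int exec_and_wait(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);   /* returns only if the exec failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with the single-element vector { "echo -n", NULL } this should
report 127, because no file named "echo -n" exists on the path; called
with { "echo", "-n", NULL } the program runs.  The splitting that makes
the second form possible is what the shell is being asked to do.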

Of course, much of this could be fixed by using the hack that Make
uses:  First check the command line for troublesome things (characters
that are special to the shell).  If there are none, the program is
invoked directly; otherwise, the problem is passed to the shell.
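
A rough sketch of that check in C follows.  It is not Make's actual
code: the names needs_shell and run_command are hypothetical, the set
of "troublesome" characters is an assumption, and the direct path only
splits on spaces and tabs.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Assumed set of characters that are special to the shell. */
static const char SHELL_SPECIAL[] = "#;\"'*?[]&|<>(){}$`^~!\\\n";

/* Returns 1 if cmd contains any character from the set above. */
int needs_shell(const char *cmd)
{
    return strpbrk(cmd, SHELL_SPECIAL) != NULL;
}

/* Run cmd directly when it looks harmless, through /bin/sh otherwise.
   Returns the exit status, 127 if the exec failed, -1 on other errors. */
int run_command(const char *cmd)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (needs_shell(cmd)) {
            execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        } else {
            char *argv[64], *tok;
            int argc = 0;
            size_t len = strlen(cmd);
            char *copy = malloc(len + 1);   /* strtok modifies its input */

            if (copy == NULL)
                _exit(127);
            memcpy(copy, cmd, len + 1);
            for (tok = strtok(copy, " \t"); tok != NULL && argc < 63;
                 tok = strtok(NULL, " \t"))
                argv[argc++] = tok;
            argv[argc] = NULL;
            if (argc > 0)
                execvp(argv[0], argv);
        }
        _exit(127);   /* reached only if an exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this sketch "lpr -Pprinter2" would be run directly, while
"ls *.c" would go to the shell.  One gap to be aware of: a shell
builtin such as "exit 3" contains no special characters, so the direct
path would try (and fail) to exec it.  A real implementation needs a
rule for builtins as well.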

Dale Worley, Compass, Inc.                      worley@compass.com
The War on Drugs -- Prohibition for the '80s.

tmb@wheaties.ai.mit.edu (Thomas M. Breuel) (03/24/89)

In article <8903231435.AA11955@galaxy.compass.com> think!compass!worley@EDDIE.MIT.EDU (Dale Worley) writes:

   Generally the Un*x convention is that if a program wants to start an
   inferior process, it has the shell do it, rather than doing a fork()
   and exec() itself.  See, for example, the system() call.  You might
   well wonder why this is done, since it costs time, and probably sets
   up an additional process (is that really true?).  The reason is that
   often the "program name" to run is obtained from the user or an
   environment variable, and using the shell to process it gives
   additional flexibility.

People often complain that making a subshell is slow. This is mostly
a function of the kind of shell you are using. With the Bourne shell
(at least for subshells), system() calls and subshells start up almost
instantaneously. If you must use the C-shell, you should at least check
in your .cshrc whether the shell is being run interactively, and skip
most of the .cshrc file if it is not.

pinkas@hobbit.intel.com (Israel Pinkas ~) (03/25/89)

In article <8903231435.AA11955@galaxy.compass.com> think!compass!worley@EDDIE.MIT.EDU (Dale Worley) writes:

> Generally the Un*x convention is that if a program wants to start an
> inferior process, it has the shell do it, rather than doing a fork()
> and exec() itself.  See, for example, the system() call.  You might
> well wonder why this is done, since it costs time, and probably sets
> up an additional process (is that really true?).

This is an invalid assertion.  There are a number of differences between
system() and fork()/exec().  The main ones are that system() invokes a
shell to process the command given, and that the call does not return
until the invoked program exits.  fork()/exec() starts the specified
process in parallel with the caller.
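
The difference can be sketched in C roughly as follows.  The function
names (start_in_background, wait_for, run_blocking) are invented for
this illustration, and error handling is kept to a minimum.

```c
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Start argv[0] and return at once, without waiting for it.
   Returns the child's pid, or -1 if the fork failed. */
pid_t start_in_background(char *const argv[])
{
    pid_t pid = fork();

    if (pid == 0) {
        execvp(argv[0], argv);   /* returns only if the exec failed */
        _exit(127);
    }
    return pid;
}

/* Collect a child started above.  Returns its exit status, or -1. */
int wait_for(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* system() by contrast: a shell runs cmd, and the call returns only
   after that shell has finished. */
int run_blocking(const char *cmd)
{
    int status = system(cmd);

    if (status == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

After start_in_background() returns, the parent is free to do other
work and call wait_for() whenever it likes; run_blocking() gives it no
such window.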

If IPC is needed, fork() allows pipes to be used.  The pipes can also be
passed through an exec().  (This is tricky!)
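
As an illustration of the tricky part, here is a small sketch in C that
reads a child's standard output through a pipe.  The name
capture_output and its buffer interface are assumptions made for the
example; only pipe(), fork(), dup2(), execvp(), read() and waitpid()
are standard calls.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Run argv[0] with its stdout connected to a pipe and copy up to
   size-1 bytes of its output into buf, NUL-terminated.
   Returns the number of bytes read, or -1 on error. */
long capture_output(char *const argv[], char *buf, size_t size)
{
    int fds[2];
    pid_t pid;
    size_t total = 0;
    ssize_t n;

    if (size == 0 || pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);                      /* child does not read */
        if (fds[1] != STDOUT_FILENO) {
            dup2(fds[1], STDOUT_FILENO);    /* pipe becomes stdout */
            close(fds[1]);
        }
        execvp(argv[0], argv);              /* fd 1 survives the exec */
        _exit(127);
    }
    close(fds[1]);   /* otherwise read() below never sees end-of-file */
    while (total < size - 1 &&
           (n = read(fds[0], buf + total, size - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (long)total;
}
```

The easy mistake is forgetting to close the unused pipe ends: if the
parent keeps the write end open, its read() will block forever waiting
for an end-of-file that cannot arrive.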

>						  The reason is that
> often the "program name" to run is obtained from the user or an
> environment variable, and using the shell to process it gives
> additional flexibility.

If the program name comes from the user, the program can be invoked
directly.  If it comes from the environment, getenv() is simple enough.

>			For instance:
>
> The program may have a text string which is a command to run, complete
> with arguments.  Then it can leave the problem of parsing the string
> (and expanding wild cards, ~s, etc.) to the shell.

With all the different variations of exec() available, this should not be a
problem.  The only issue I see here is parsing the words of a line into
arguments.  This can almost be done with sscanf().

> The program name may not be an absolute path, and thus the problem of
> searching the path can be left to the shell.

exec?p() does this.

> The program name may be an alias (which, remember, the user thinks of
> as a "command").  Figuring out aliases can be left to the shell.

Since you mention aliases, I assume that you are referring to csh.  Since
system() invokes csh with the -c flag, prompt is not set.  Most users do
not define aliases when prompt is not set.  (I also don't set most shell
variables.)  With this in mind, system() can't do any better than exec().

> The program name may be a real program name with some switches.  For
> example, consider setting Emacs's lpr-command to "lpr -Pprinter2".

Actually, Emacs' lpr commands use a variable called lpr-switches, which is
a list of switches to pass to lpr before the file names.  A better example
is the mh-e package's mh-lpr-command format, which is a string that is
passed to format along with the file name.  The result should be a command
to execute.  (However, I think that both lpr commands use system() to
print.)

As I mentioned before, parsing the arguments out of a string is not
difficult.  The hard part may be determining when a space is part of the
argument list instead of a separator.  

On BSD 4.3 systems there are routines for parsing.  See string(3) for
strspn, strtok.  At the very worst, index/strchr can be used.  The parsing
routine is not difficult, and can be reused.
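
A minimal sketch of such a routine, using strtok(), might look like
this.  The name split_words and its interface are hypothetical.  It
splits on white space only, so it does not address the hard case of a
space that belongs inside an argument.

```c
#include <string.h>

/* Split line in place on spaces, tabs and newlines.  Stores at most
   max-1 word pointers in argv, followed by a NULL, which is the form
   the execv() family expects.  Returns the number of words stored.
   Note that strtok() overwrites the separators in line. */
int split_words(char *line, char *argv[], int max)
{
    int argc = 0;
    char *tok;

    if (max < 1)
        return 0;
    for (tok = strtok(line, " \t\n"); tok != NULL && argc < max - 1;
         tok = strtok(NULL, " \t\n"))
        argv[argc++] = tok;
    argv[argc] = NULL;
    return argc;
}
```

Given a writable copy of "lpr -Pprinter2  file.txt", this yields the
three words lpr, -Pprinter2 and file.txt.  Given a quoted name with a
space in it, it would still split at the space; handling that needs a
small state machine instead of strtok().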

-Israel Pinkas
--
--------------------------------------
Disclaimer: The above are my personal opinions, and in no way represent
the opinions of Intel Corporation.  In no way should the above be taken
to be a statement of Intel.

UUCP:	{amdcad,decwrl,hplabs,oliveb,pur-ee,qantel}!intelca!mipos3!cadev4!pinkas
ARPA:	pinkas%cadev4.intel.com@relay.cs.net
CSNET:	pinkas@cadev4.intel.com