guido@mcvax.UUCP (Guido van Rossum) (02/09/84)
Here are some rules for writing shell scripts in such a way that
they are more readable, robust and still not too slow.
1. If at all possible, use the Bourne shell (/bin/sh), not the C shell
(/bin/csh), even if your login shell is the latter. The Bourne shell
has a better way of treating multi-line control structures (if, for,
case etc.) and better substitution rules. Bourne shell scripts
are also easier to port to other sites than C shell scripts are.
(Exception: the C shell is superior if lots of arithmetic must be done.)
2. Use blank lines, comments and indentation like you would in C or
Pascal programs.
3. Whenever parameter substitution (e.g., $1) or variable substitution
(such as $HOME) are used, decide whether there should be double quotes
("") around it. If it is expected that there may be embedded blanks
in the actual value (theoretically this can even occur in filenames) and
it is passed as a single argument to another program, quote it!
Also note the very useful difference between "$*" and "$@", which both
expand to the concatenation of all arguments. When passed to another
program, "$*" is always one argument; "$@" is as many arguments as
there were originally, if there was at least one. $* (without quotes)
omits empty or blank arguments (created by passing, e.g., "" as argument)
and splits arguments in two when they contain blanks.
(Note that "" passed as a file name means the current directory, which
is almost certainly not what was meant!)
E.g., to print all arguments, by default the standard input:
case $# in
0) print;;
*) print "$@";;
esac
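The argument counts described in this rule can be checked with a small sketch. It assumes a shell that has functions (System V Release 2 or any POSIX sh; the V7 shell has none) and the default IFS. The names count_args and via_* are hypothetical and serve only as illustration.

```sh
# Hypothetical helper: print how many arguments were received.
count_args() {
    echo $#
}

# Hypothetical wrappers: forward the caller's arguments in three ways.
via_star_quoted() { count_args "$*"; }   # always one argument
via_at_quoted()   { count_args "$@"; }   # arguments preserved as given
via_star_bare()   { count_args $*; }     # re-split on blanks, empties dropped

# Three original arguments: "a b", an empty string, and "c d".
via_star_quoted "a b" "" "c d"    # prints 1
via_at_quoted   "a b" "" "c d"    # prints 3
via_star_bare   "a b" "" "c d"    # prints 4 (a, b, c, d)
```

Only the "$@" form hands the three arguments on unchanged, which is why the example above uses it.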
4. Use "case" for string comparisons rather than "if". That is, to see
e.g. whether $1 equals "-a", use:
case "$1" in
-a) then-part;;
*) else-part;;
esac
rather than
if [ "$1" = "-a" ]; then
<then-part>
else
<else-part>
fi
The reason is mainly that "[" "]" executes as a separate process,
while the case is executed by the shell. (I know that some shell
derivatives avoid the extra process in this case; but the vanilla
V7 Bourne shell is the subject of this article.)
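As a sketch of what "case" buys beyond a plain equality test: the patterns may contain wildcards and alternatives, and the comparison itself is done by the shell. The function name classify is hypothetical, and functions assume a shell newer than V7.

```sh
# Hypothetical helper: classify one argument using only "case" patterns.
classify() {
    case "$1" in
    -a)      echo "flag a";;
    -*)      echo "unknown flag";;
    *.c|*.h) echo "C source";;
    *)       echo "something else";;
    esac
}

classify -a        # prints "flag a"
classify -x        # prints "unknown flag"
classify main.c    # prints "C source"
classify notes     # prints "something else"
```

The first matching pattern wins, so the specific "-a" must come before the general "-*".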
5. Avoid the commands "true" and "false". They are implemented through
separate processes. The next paragraph shows an alternative for
"while true":
6. Be careful to design a parameter convention which mimics that of other
well-known Un*x programs; e.g. command [-flag] ... [file] ... .
If files can be given as arguments but none are, the script should read
its standard input instead. Example of how to program this:
while : # ":" is a built-in do-nothing command
do
case "$1" in
-a) <process-a-flag>; shift;;
-b) <process-b-flag>; shift;;
-*) echo "Usage: $0 [-a] [-b] [file] ..." 1>&2; exit 1;;
*) break;; # breaks out of loop
esac
done
And then proceed with the strategy pointed out in paragraph 3.
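Put together, the flag loop and the "case $#" test from paragraph 3 might look like the sketch below. It is wrapped in a function so it can be tried interactively, which assumes a shell with functions; parse_flags, aflag and bflag are hypothetical names, and the function only reports what it would do.

```sh
# Hypothetical wrapper around the flag loop shown above.
parse_flags() {
    aflag=no bflag=no
    while :
    do
        case "$1" in
        -a) aflag=yes; shift;;
        -b) bflag=yes; shift;;
        -*) echo "Usage: parse_flags [-a] [-b] [file] ..." 1>&2; return 1;;
        *)  break;;
        esac
    done

    # What remains in "$@" are the file arguments, if any.
    case $# in
    0) echo "a=$aflag b=$bflag input=stdin";;
    *) echo "a=$aflag b=$bflag files=$#";;
    esac
}

parse_flags -a one "two words"    # prints "a=yes b=no files=2"
parse_flags -b                    # prints "a=no b=yes input=stdin"
```

When no arguments are left, "$1" expands to an empty string, which matches the final "*" pattern and ends the loop.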
I could go on indefinitely with this, but I will try to stop here.
Any comments? Contrary visions? Other hints? (Flames?, I would add...)
Guido van Rossum
Centrum voor Wiskunde en Informatica, Amsterdam
...!{decvax,philabs}!mcvax!guido
guy@rlgvax.UUCP (Guy Harris) (02/11/84)
A couple of minor points:
1) /bin/[ should be linked to /bin/test (on non-USG systems) in order to make
if [ "$1" = "foo" ]
then
...
else
...
fi
work; I have seen systems in which /bin/test (which is documented in the V7
manual) works but /bin/[ (which isn't documented, but works if the link is
made) doesn't.
2) The "#" comment convention is only in the 4.xBSD and USG shells; the
standard V7 shell only implements ":" comments - NOTE that it's not
a real comment, but a command which throws its arguments away and returns
an "exit status" of 0 (which is why "while :" works). You can't say
things like
: This isn't valid
because the shell gets upset at the unbalanced single quote.
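A small sketch of the point that ":" is a command rather than a comment: its arguments go through normal substitution before being discarded, and its exit status is 0. The variable names are hypothetical, and the sketch itself uses "#" comments, so it assumes a 4.xBSD, USG or later shell.

```sh
# ":" is a command: its arguments are expanded, then thrown away.
unset greeting
: ${greeting=hello}      # the substitution assigns a default as a side effect
echo "$greeting"         # prints "hello"

# Its exit status is 0, which is why "while :" keeps looping.
: these words are ignored
colon_status=$?
echo "$colon_status"     # prints 0
```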
Guy Harris
{seismo,ihnp4,allegra}!rlgvax!guy
mce@teldata.UUCP (Brian McElhinney) (02/12/84)
*Sigh* I agree that sh is much more portable, but reading sh scripts is painful... "case" and "esac"??? UNIX supports C, a standard UNIX shell should at least resemble C. I never have understood why the Bourne shell looks like Algol. (Not that I think a change is possible, just that this is one more reason UNIX is not easily accepted)
stan@teltone.UUCP () (02/12/84)
> From: mce@teldata.UUCP (Brian McElhinney)
>
> *Sigh* I agree that sh is much more portable, but reading sh scripts
> is painful... "case" and "esac"??? UNIX supports C, a standard UNIX
> shell should at least resemble C. I never have understood why the Bourne
> shell looks like Algol. (Not that I think a change is possible, just that
> this is one more reason UNIX is not easily accepted)
Instead of case ... in ... esac you can do case X { X) . . }
You can also replace the for .. [ in ... ] do ... done
with for .. [ in ... ] { ... }.
This doesn't work for the while loop though (darn!).
This works on 4.1bsd and Venix Bourne shells, but I don't know how
portable it is to other versions.
wolfe@mprvaxa.UUCP (Peter Wolfe) (02/15/84)
I agree with most of your comments except that they are very specific to
the Bourne shell. You brush off C-shell as being very 'unstructured' -
I beg to differ but I find that C-shell syntax is more obvious to a C
programmer than shell.
In practice I use the Bourne shell for small scripts that don't do a lot
of complicated logic (ie. to avoid another process) because it is faster
than C-shell in execution (yes even if C-shell has -f in command line).
I have written some C-shell scripts which are 5-7 pages long and found
C-shell helpful in doing what I want in terms of filename manipulation,
logic expressions etc.
I feel that yet another shell would be appropriate for the UNIX(tm)
environment. This shell would allow me to do most of the things I
can do in 'C' (eg. subroutines, local variables, file i/o easily) and
also be able to be 'compiled' to execute as fast as possible. It doesn't
need (in my opinion at least) all the user interface stuff of the history
mechanism and event specification of C-shell.
(I guess I am dreaming - but why not)
--
Peter Wolfe
Microtel Pacific Research
..decvax!microsoft!ubc-vision!mprvaxa!wolfe
guy@rlgvax.UUCP (Guy Harris) (02/17/84)
A reply to several articles:
1) the #! /bin/csh -f construct is not only not portable because of the
kernel change to run shell files being a Berkeleyism, but because the C
shell isn't on all UNIX systems. The Bourne shell *is* on all UNIX systems
worth talking about in this day and age.
2) The C shell resembles C about as much as the Bourne shell resembles
Algol 68, so claims that the C shell is better than the Bourne shell
because it looks more like the UNIX implementation language are bogus.
3)
> Instead of case ... in ... esac you can do case X { X) . . }
> You can also replace the for .. [ in ... ] do ... done
> with for .. [ in ... ] { ... }.
> This doesn't work for the while loop though (darn!).
> This works on 4.1bsd and Venix Bourne shells, but I don't know how
> portable it is to other versions.
It works on the System III shell, and probably will work on any V7 or
post-V7 Bourne shell.
4)
> The Bourne shell resembles Algol apparently because S. R. Bourne likes it.
> The source code in C is written in the same style, with #define's for
> IF, ELSE, and so on. I find it difficult to read. E.g:
Yes, Bourne likes Algol 68. He wrote a compiler for Algol 68C, which I
believe was for the Cambridge University CAP machine. I think he may have
written a PDP-11 UNIX Algol 68 compiler, and an associated debugger called -
surprise, surprise - Algol DeBugger, or "adb" for short... he definitely
wrote "adb", as one can tell by the same heavy use of the "let's make C look
like Algol 68" #defines. PDP-11 "adb" does have a "$a" command to print an
"Algol 68 stack trace".
By the way, those #defines make it *very* difficult to find unmatched
IF...FI pairs and the like, as the error messages are confusing due to
weirdities in line numbers and the like. Then again, you have to admire
someone who uses the same technique for growing a process's data space as
UNIX uses for growing a process's stack space...
Guy Harris
{seismo,ihnp4,allegra}!rlgvax!guy
gwyn%brl-vld@sri-unix.UUCP (02/22/84)
From: Doug Gwyn (VLD/VMB) <gwyn@brl-vld>
The Bourne shell is not only much faster than the Cshell, it is more
suitable for programming scripts for several reasons, the main one of
which is the ability to redirect I/O of control loop commands.
There is a form of "subroutine" in the UNIX System V Release 2 Bourne
shell.
The only thing that the Bourne shell still does not have that the Cshell
does that is worth having is a "history" mechanism. Korn put a hack into
the shell to support a generalization of the idea of command history
editing; he wrote the last several commands to a file and then invoked
the editor (your choice via the EDITOR environment variable) to allow you
to edit the history before it was re-executed.
Some people would include "job control" in the Cshell advantages but this
doesn't matter if you have a nice terminal like a Teletype 5620.
gwyn%brl-vld@sri-unix.UUCP (02/26/84)
From: Doug Gwyn (VLD/VMB) <gwyn@brl-vld>
Here are two important considerations for people writing Bourne shell
scripts that may have to be run by other people who perhaps are on a
Berkeley UNIX using the Cshell as a command interpreter:
(1) The first line of every Bourne shell script should be:
	#!/bin/sh
(2) Before invoking ANY system commands, set the expected command
search path. This is usually:
	PATH=/bin:/usr/bin
but on BRL UNIXes, UNIX System V compatible shell scripts must use
the following since /bin and /usr/bin may have incompatible commands
such as "echo" and "pr":
	PATH=/usr/5bin:/bin:/usr/bin
matt%ucla-locus@sri-unix.UUCP (02/26/84)
From: Matthew J. Weinstein <matt@ucla-locus>
Two notes:
I believe that Bourne shell IS the default on a BSD system. The #! may
not be recognized on non-Berkeley systems. Better that it be added
locally if it's wanted...
If PATH varies among Unices, it might be better to define all of the
programs you are going to use at the top of the shell script, as one
does in make scripts; e.g.:
	SORT=/bin/sort		set SORT = /bin/sort
	SED=/bin/sed		set SED = /bin/sed
The owner of the target system can easily tell what has to be changed;
this also makes the script writer think about it too...
Maybe there should be an ``Elements of Shell Programming Style'' ...
- Matt
gwyn%brl-vld@sri-unix.UUCP (02/26/84)
From: Doug Gwyn (VLD/VMB) <gwyn@brl-vld>
You missed the point.
	#!/bin/sh
will be treated as a comment by the Bourne shell, so its presence
cannot possibly hurt. Further, if this is NOT the first line in a
Bourne shell script that is invoked by a Cshell user, the Cshell will
attempt to interpret the commands, which usually results in a portion
of the contents of the script actually being executed before it bombs.
By having the funny first line in your Bourne shell scripts, even if a
Cshell user runs one of them it will be executed correctly since the
Berkeley kernel will exec /bin/sh to handle it.
It is better to set the $PATH than to define explicit path names for
standard UNIX system commands. For example, is "sort" in /bin in your
system? It's in /usr/bin in others, except I want the one in /usr/5bin
which has all the bugs fixed. When the /usr/5bin/sort is moved into
/usr/bin I do not want to have to track down all the shell scripts and
change "SORT=/usr/5bin/sort" to "SORT=/usr/bin/sort". By setting
	PATH=/usr/5bin:/bin:/usr/bin
I have GUARANTEED getting the standard UNIX System V "sort" command
regardless of where it actually lives, which is in different places on
our different UNIX systems.
An "Elements of Shell Programming Style" may be a good idea (in fact
there are good guides already available in Bourne's and Kernighan &
Pike's books), but you're not the one to write it..
matt%ucla-locus@sri-unix.UUCP (02/26/84)
From: Matthew J. Weinstein <matt@ucla-locus>
Firstly:
The Csh manual says that the standard shell will be invoked UNLESS the
file begins with a # character (for command files). From the Csh man
page on command files:
``... The shell opens this file, and saves its name for
possible resubstitution by `$0'. Since many systems use
either the standard version 6 or version 7 shells whose
shell scripts are not compatible with this shell, the shell
will execute such a `standard' shell if the first character
of a script is not a `#', i.e. if the script does not start
with a comment...''
As for #!, EXEC has been hacked on 4.x to look for #! as a special magic
number; I'm not sure that Bell Unix has that (although it may); anyway,
the man page for exec says:
``To aid execution of command files of various programs, if
the first two characters of the executable file are '#!'
then exec attempts to read a pathname from the executable
file and use that program as the command files command
interpreter. For example, the following command file
sequence would be used to begin a csh script:
#! /bin/csh
# This shell script computes the checksum on /dev/foobar
#
...
... The space (or tab) following the '#!' is mandatory, and
the pathname must be explicit (no paths are searched)...''
(By the way, you left out the space after #! in your last message, which is
mandatory).
Finally, there is no mention of # as a comment character in my sh man
page... If it's in yours, it's probably mentioned as a Csh compatibility
hack.
Secondly:
The contention on names of programs stems from a difference in outlook
on name binding.
A few types of name binding are available to the shell programmer:
Static: A qualified pathname (one that contains a slash).
This is sort of a ``you said it, you got it'' kind of execution.
If that program doesn't work or isn't there, your command fails.
There are two flavors of this:
Absolute: This is a name that begins with a slash, and names
a particular object in the file hierarchy. "/bin/sort"
is an example of this. Note that a name of this sort is
not context-sensitive.
Useful if you want to make sure that you get a
PARTICULAR executable.
Relative: A partially qualified pathname. The name is RELATIVE
TO your current working directory, and IS context
sensitive. "./foo" is an example of one of these names.
Useful if the shell script changes working directories,
or if you are executing in a controlled environment.
Note that the execution path mechanism is not used in this case.
Dynamic: An unqualified pathname. This is the kind of command
name most shell files utter. It has no slashes, and the command
is found by searching the execution path until an executable of
the same name is found.
The problem is, of course, that the particular program found may
not be the same kind of program as was originally intended.
The user may have his own `bzork' program in his bin directory.
When you execute what you think is /bin/bzork, what you really get
is the user's program instead.
A possible solution is to alter the ``search path''
to guarantee that the name-program binding is performed using
a specific ordered set of domains (directories). However, this
may lead to unpleasant side-effects (example: a script which invokes
an interactive program. The user forks a shell from that
interactive program. He is, however, unable to reference his
bin directory in the way he assumed he would be able to...).
Clearly, when a shell script may invoke an interactive program,
changing the path (or for that matter the working directory)
without warning is not a good idea.
In any case, this points out the fact that the semantics of commands may
vary in certain circumstances, and that the script writer should
consider his choices carefully.
My suggestions about defining names should thus be considered in the
light of the functionality required.
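The three kinds of name described above can be told apart purely by where the slashes fall. A minimal sketch, assuming a shell with functions; binding_kind is a hypothetical name, and the labels are simply the terms used in this article.

```sh
# Hypothetical helper: report how a command name would be bound.
binding_kind() {
    case "$1" in
    /*)  echo "static absolute";;   # begins with a slash
    */*) echo "static relative";;   # contains a slash, but not at the start
    *)   echo "dynamic";;           # no slash: found via the search path
    esac
}

binding_kind /bin/sort    # prints "static absolute"
binding_kind ./foo        # prints "static relative"
binding_kind sort         # prints "dynamic"
```

Only the last kind is affected by changes to PATH.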
I think that work on the semantics of command language bindings (in Unix)
is overdue. We use a lot of ad-hoc mechanisms, and attempt to make
up for this with ``programming style''.
I am not sure who is doing compilable shell work, but they must have come
across this sort of thing before. Does anyone have any feedback on
this?
- Matt
chris@umcp-cs.UUCP (03/02/84)
One more note. Apparently, someone is saying that "gee I want to
run the user's ``xyzzy'' program but the /bin ``test'' program",
and for that reason can't put
#! /bin/sh
PATH=/usr/bin:/bin
at the beginning of ``sh'' scripts. Well don't despair, there is a
solution. Try
#! /bin/sh
lpath=/usr/bin:/bin # or whatever your shell script needs
xyzzy # use the user's xyzzy program
PATH=$lpath if test ... # don't use the user's test program
I can't say if it works on System III or System V, but it works under
4.1. (I just tested it.)
Aren't "temporary environment variables" wonderful?
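A sketch of the general mechanism, using a hypothetical variable DEMO_VAR instead of PATH so that it is safe to try: an assignment placed in front of a command is passed to that one command only, and the value seen by later commands is unchanged.

```sh
DEMO_VAR=outer; export DEMO_VAR

# The assignment prefix applies to this one command only.
inner=`DEMO_VAR=temporary sh -c 'echo "$DEMO_VAR"'`

# The next command sees the exported value again.
after=`sh -c 'echo "$DEMO_VAR"'`

echo "$inner $after"    # prints "temporary outer"
```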
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci
UUCP: {seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet: chris@umcp-cs ARPA: chris.umcp-cs@CSNet-Relay
john@genrad.UUCP (John Nelson) (03/02/84)
> #!/bin/sh
> will be treated as a comment by the Bourne shell, so its presence
> cannot possibly hurt.
Excuse me, but while it may be true that the Bourne shell will treat
that line as a comment, any csh running on a system that does not have
the "#!" magic number built into the kernel exec will attempt to run
that script as a csh script.
Csh first attempts to exec a command. If that fails, it examines the
first character of the file. If it is a "#", then it runs it as a csh
script, otherwise, it runs it as an sh script.
That line will do precisely the wrong thing on such a system (like my
68000 unisoft system III).
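The fallback rule just described can be mimicked in a few lines. This is only a sketch of the rule as stated in this article, not of csh's actual code; csh_fallback_choice and the temporary file name are hypothetical, and the "read -r" option assumes a POSIX shell.

```sh
# Hypothetical helper: name the shell that csh, as described above, would
# pick for a script once the kernel's exec has refused to run it.
csh_fallback_choice() {
    # Read only the first line; IFS= keeps leading blanks intact.
    IFS= read -r first_line < "$1"
    case "$first_line" in
    '#'*) echo csh;;
    *)    echo sh;;
    esac
}

demo=/tmp/csh_demo.$$
echo '#!/bin/sh' > "$demo"
csh_fallback_choice "$demo"    # prints "csh": the wrong shell for this script
rm -f "$demo"
```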
gwyn%brl-vld@sri-unix.UUCP (03/03/84)
From: Doug Gwyn (VLD/VMB) <gwyn@brl-vld>
The white space after #! is not mandatory, it is optional.
If a shell script is going to run a command such as an interactive
subshell where it is important that the user's original PATH be in
effect, then the script should run that command with PATH temporarily
restored. This is standard shell programming practice.
All you info-unix subscribers who are now totally confused, go back to
my original posting on the subject and ignore the ensuing debate. The
advice I gave is correct and is based on the approach used at BRL to
cope with a variety of different UNIX command environments.
smk@axiom.UUCP (Steven M. Kramer) (03/03/84)
Matt's idea of putting SORT= ... at the beginning of a shell script is
excellent. May I extend that to makefiles? If I want to use a test
utility, and that includes ANY program used in the makefile, I have
to rewrite the makefile (and it usually takes a couple of tries to undo
the quirks). If I have forgotten other types of files related to shell and
make files, I mean for these enhancements to go into them also. An idea
whose time has come.
--
--steve kramer
{allegra,genrad,ihnp4,utzoo,philabs,uw-beaver}!linus!axiom!smk (UUCP)
linus!axiom!smk@mitre-bedford (MIL)
henry@utzoo.UUCP (Henry Spencer) (03/06/84)
Doug Gwyn observes, in part:
Before invoking ANY system commands, set the expected command
search path. This is usually:
PATH=/bin:/usr/bin
Not quite right. The proper incantation, one which we take some pains
to always use hereabouts, is:
PATH=/bin:/usr/bin ; export PATH
Without that magic "export", the user's original PATH is what gets
exported to commands executed from the shell file, which means that
it can reappear without warning.
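The underlying mechanism can be sketched with a fresh variable. POSIX shells keep variables inherited from the environment marked for export, so the PATH behaviour described here cannot be reproduced on them directly; the hypothetical variable DEMO_SETTING stands in for it.

```sh
unset DEMO_SETTING                 # start from a variable the environment lacks
DEMO_SETTING=/bin:/usr/bin         # a plain shell variable, not yet exported

# A child process does not see an unexported variable.
before=`sh -c 'echo "${DEMO_SETTING-unset}"'`

export DEMO_SETTING

# After the export, the child receives the value.
after=`sh -c 'echo "${DEMO_SETTING-unset}"'`

echo "$before"    # prints "unset"
echo "$after"     # prints "/bin:/usr/bin"
```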
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
gwyn%brl-vld@sri-unix.UUCP (03/07/84)
From: Doug Gwyn (VLD/VMB) <gwyn@brl-vld>
As I pointed out in a very lengthy summary of how csh executes shell
scripts, merely not having a # as the first character of the script is
NOT enough to guarantee that it will be run under /bin/sh.
lcc.bob%ucla-locus@sri-unix.UUCP (03/09/84)
From: Bob English <lcc.bob@ucla-locus>
I'll probably get yelled at for this, but if csh doesn't recognize #!,
then the port from BSD (where it would never see such a thing) to
whatever flavor of Unix you run was incomplete.
--bob--