[net.unix] history question-- Bourne

yoda@ittatc.ATC.ITT.UUCP (Todd C. Williams) (08/06/86)

OK, we're from a mixed (SysV, 4bsd, V6, V7) background here, and a
question has come up which NO ONE is clear on:



	How do you put a comment in a shell script?




This is not in the standard documentation!  S.R.Bourne doesn't mention
it in his 1978 paper, except for the following line from an example script:
	: 'colon is the comment command'

Some say that the colon introduces a comment, but that is not entirely true,
since the line still gets evaluated, at least somewhat.  So what is the
purpose of the colon?  It's not a comment, but it is????

Others say that it is the sharp/pound/number sign, "#" <--that thing, but
if you're on a bsd system, you can't START your Bourne shell script with
it, or the C shell will run your script.  Or should you start with
	#!/bin/sh
???  And does the csh fork here, or exec?

What is the historical development of the : and the # ?  I am SURE that
the colon was the character to use in V7, but....

There are several methods to use here.  Which is the one of choice? and why?
Are there advantages/disadvantages with any of these methods {especially
the #!/bin/sh method above}?

-- 
+------------------------------------------------------------------------------+
|  Todd C. Williams			|  "Summer blonds		       |
|  ITT Defense Communications		|   revealing tan lines,	       |
|  Nutley, NJ				|   I'll make more moves than	       |
|  {decvax, et al.}!ittatc!dcdvaxb!tcw	|   ALLIED VAN LINES!"		       |
|  +1 201 284 3305			|     --from: "I wanna be a lifeguard" |
|  I love to receive e-mail!		|			by BLOTTO      |
+------------------------------------------------------------------------------+

chris@umcp-cs.UUCP (Chris Torek) (08/06/86)

In article <1751@ittatc.ATC.ITT.UUCP> yoda@ittatc.ATC.ITT.UUCP
(Todd C. Williams) writes:
>OK, we're from a mixed (SysV, 4bsd, V6, V7) background here, and a
>question has come up which NO ONE is clear on:
>
>	How do you put a comment in a shell script?

The only reliable answer---though hardly satisfactory---is that
you cannot.

>... S.R.Bourne doesn't mention it in his 1978 paper, except for
>the following line from an example script:
>	: 'colon is the comment command'
>Some say that the colon introduces a comment, but that is not entirely true,
>since the line still gets evaluated, at least somewhat.

The colon command is a shell built-in that performs no action and
returns a zero exit code.  Since it is a command, all its arguments
are parsed (though immediately discarded), and the remainder of
the line must be syntactically valid to `sh'.  The easiest way to
ensure that is to enclose it in quotes, as in the example above.
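
A minimal sketch of that behaviour, assuming a Bourne-family `sh'
(the variable names colon_status and greeting are made up for
illustration).  It shows that `:' returns zero, and that an unquoted
argument is still expanded before being thrown away:

```shell
# ':' does nothing and exits with status 0; the quoted text is one inert argument.
: 'colon is the comment command'
colon_status=$?

# Unquoted arguments are still expanded first: this "comment" assigns a variable.
unset greeting
: ${greeting=hello}

echo "status=$colon_status greeting=$greeting"   # prints: status=0 greeting=hello
```

The second line is the reason a colon "comment" is safest inside
single quotes: anything the shell would expand or redirect still
takes effect.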

>Others say that it is the sharp/pound/number sign, "#" <--that thing, but
>if you're on a bsd system, you can't START your Bourne shell script with
>it, or the C shell will run your script.  Or should you start with
>	#!/bin/sh
>???  And does the csh fork here, or exec?

Now the picture becomes murkier.

>What is the historical development of the : and the # ?  I am SURE that
>the colon was the character to use in V7, but....

Indeed, a bit of historical perspective is essential.  I am not
certain I will get this right, for I am relatively new to the, er,
`shell games'.  I will leave out 6th Edition entirely---it had a
different shell---and pick up from V7.

In V7 there was The Shell, and The Shell had no Comment Character.
The Null Command `:' was used and all was well.

Along came Berkeley with a PDP-11.  The '11 was slow, and running
all those Null Commands did not help matters a bit.  So the Hackers
came up with a new Shell, the `C Shell', that was supposed to be
faster, and had a Comment Character as well.  And now there was a
Problem.  For when one ran a Shell Script, which Shell was to be
used?  But there was an easy Solution, for the new Shell had a
Comment Character, and it was `#', not `:'.  So one began a C Shell
Script with a C Shell Comment, and all was well.

But as the scripts grew it became clear that `sh' should have a
true Comment Character.  The character `#' was free, and seemed a
logical choice; and it was done.  And now both Shells had Comments,
though `sh' Scripts could not begin with one.  It became a Practise
to begin such Scripts with `:'.

And then one fateful day, it was observed that Only Shells Run
Scripts.  The Wizards were consulted, and one prescribed a cure:
if the Kernel saw a Comment that began with `#!', it could then
interpret the line itself, and thus Run Scripts.  The Shells would
skip the line, for the line was just a Comment.  And so it was
done.

And 2BSD met 32V, and they begat 3BSD, and 3BSD begat 4BSD.  And
so it is that 4BSD Shells have Comments, and the 4BSD Kernel
interprets `#!'.

And now I must leave the storytelling aside, for though there are
those who know of the True History of System V, I am not among
them.  But I believe that by SysIII at the latest, `sh' supported
`#' comments.

In general, when writing a shell script, if you want maximum
portability, and need neither true comments nor direct execution
by the kernel, use the colon (null) command; any post-V6 system
will run such a script with `sh', and everything will work.  If
you are not so concerned with portability, or prefer comments
wherever possible, include them as you wish.  If you write comments
only at the beginnings of lines, less fortunate sites can always
use

	ed - script << end
	g/^#/d
	w
	q
	end
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

henry@utzoo.UUCP (Henry Spencer) (08/07/86)

> OK, we're from a mixed (SysV, 4bsd, V6, V7) background here, and a
> question has come up which NO ONE is clear on:
> 	How do you put a comment in a shell script?

If you've got a mixture of new, old, and ancient shells, you're out
of luck:  there is no single entirely satisfactory way.

> Some say that the colon introduces a comment, but that is not entirely true,
> since the line still gets evaluated, at least somewhat.  So what is the
> purpose of the colon?  It's not a comment, but it is????

The shell's command parser and preparation phases don't know that colon is
anything special, so they treat it as an ordinary command.  The execution
phase knows that colon is a special command which does nothing.  So you
can put anything you want in colon's arguments, and it will be ignored,
*if* it has no special significance to the earlier phases.  This is
not entirely satisfactory because it means that things like ">" cannot
appear unless they are backslashed or quoted, and things like "'" get
very tricky.
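
A small sketch of the ">" problem, assuming a Bourne-family shell and a
writable temporary directory (the scratch directory name and the variable
names are hypothetical).  The unquoted colon line is parsed as a
redirection and quietly creates a file; the quoted one does not:

```shell
# Work in a scratch directory so the demonstration leaves nothing behind.
start_dir=`pwd`
scratch="${TMPDIR:-/tmp}/colon_demo.$$"
mkdir "$scratch" && cd "$scratch" || exit 1

# Unquoted: the parser sees "> 5" as a redirection and creates a file named 5.
: warn if count > 5
after_unquoted=`ls`

# Quoted: the same text is a single argument that ':' ignores.
rm -f 5
: 'warn if count > 5'
after_quoted=`ls`

echo "unquoted left: '$after_unquoted'  quoted left: '$after_quoted'"

cd "$start_dir" && rm -rf "$scratch"
```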

> Others say that it is the sharp/pound/number sign, "#" <--that thing, but
> if you're on a bsd system, you can't START your Bourne shell script with
> it, or the C shell will run your script.

In modern shells, # at the beginning of a "word" means that the rest of
the line is a comment.  This is the new standard convention.  Unfortunately,
some old shells don't cope properly.  In particular, old broken C shells
think a # at the beginning means "C shell script".  The fix to this is
to get a modern C shell.

> Or should you start with
>	#!/bin/sh
> ???  And does the csh fork here, or exec?

In recent Unix versions, "#!" at the very start of a file is handled
specially by the kernel, and the C shell never gets involved if the
command specified is "/bin/sh".  "#!/bin/sh" tells the kernel itself to
fire up a copy of /bin/sh and run the rest of the file into it.  It's a
general way to specify the interpreter for a command file.
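
As a rough illustration, assuming a kernel that understands "#!", a
/bin/sh at that path, and a temporary directory that permits execution
(the file name and variable names are hypothetical):

```shell
# Write a two-line command file whose first line names its interpreter.
hashbang_script="${TMPDIR:-/tmp}/hashbang_demo.$$"
cat > "$hashbang_script" << 'end'
#!/bin/sh
echo "hello from sh"
end
chmod +x "$hashbang_script"

# Executing the file directly lets the kernel start /bin/sh on it;
# to /bin/sh itself the first line is only a comment.
hashbang_result=`"$hashbang_script"`
echo "$hashbang_result"

rm -f "$hashbang_script"
```

The same file still runs where the kernel knows nothing about "#!",
provided the invoking shell accepts # comments, because that shell
reads the file itself.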

> What is the historical development of the : and the # ?  I am SURE that
> the colon was the character to use in V7, but....

The # arrived in the Bourne shell shortly after V7; I'm not sure of its
history in the C shell.  : was the old way to do things, with # introduced
partly because : was so unsatisfactory.  As an added bonus, by the way,
# is a good deal more efficient, since the shell isn't wasting a lot of
preparation on the do-nothing : command.

> There are several methods to use here.  Which is the one of choice? and why?

If you must cope with old systems, either ones that don't recognize # or
ones with C shells that misinterpret it, you must use : for comments.
Otherwise, you should use # because it's less hassle and more efficient.
You should probably use #!/bin/sh at the start of shell files, since it's
harmless in systems whose kernels don't know about "#!" and useful in
those that do.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

mwm@eris.berkeley.edu (08/07/86)

To add some more background to Chris's comments.

The v7 shell did indeed swallow ':' (with parsed arguments) as a
comment, and did not take '#' as a comment (neither did 2.8BSD
systems). The v6 shell used ':' as a label for branches, so you could
write things like:

	: loop
	who
	sleep 3600
	goto loop

for iterated displays, etc. I wouldn't be surprised if this use of ':'
had something to do with ':''s use in the v7 shell.

I've heard a rumor that Berkeley didn't introduce '#' as a shell
comment character, but picked it up from the Unix 2.0 (PWB 2.0) or
Unix 3.0 shells. On the other hand, csh on v6 & v7 systems definitely
used it to decide whether your script was going to be run by csh or
by sh.

Similarly, hearsay has it that the explanation for the '#!command' was
that it was an extension of what the pascal interpreter was doing at
the time.

Naturally, corrections by people with hard facts are welcome.

	<mike

dgk@ulysses.UUCP (David Korn) (08/13/86)

> > OK, we're from a mixed (SysV, 4bsd, V6, V7) background here, and a
> > question has come up which NO ONE is clear on:
> > 	How do you put a comment in a shell script?
> 
> If you've got a mixture of new, old, and ancient shells, you're out
> of luck:  there is no single entirely satisfactory way.
> 
> > Some say that the colon introduces a comment, but that is not entirely true,
> > since the line still gets evaluated, at least somewhat.  So what is the
> > purpose of the colon?  It's not a comment, but it is????
> 
> The shell's command parser and preparation phases don't know that colon is
> anything special, so they treat it as an ordinary command.  The execution
> phase knows that colon is a special command which does nothing.  So you
> can put anything you want in colon's arguments, and it will be ignored,
> *if* it has no special significance to the earlier phases.  This is
> not entirely satisfactory because it means that things like ">" cannot
> appear unless they are backslashed or quoted, and things like "'" get
> very tricky.
> 
> > Others say that it is the sharp/pound/number sign, "#" <--that thing, but
> > if you're on a bsd system, you can't START your Bourne shell script with
> > it, or the C shell will run your script.
> 
> In modern shells, # at the beginning of a "word" means that the rest of
> the line is a comment.  This is the new standard convention.  Unfortunately,
> some old shells don't cope properly.  In particular, old broken C shells
> think a # at the beginning means "C shell script".  The fix to this is
> to get a modern C shell.
> 
> > Or should you start with
> >	#!/bin/sh
> > ???  And does the csh fork here, or exec?
> 
> In recent Unix versions, "#!" at the very start of a file is handled
> specially by the kernel, and the C shell never gets involved if the
> command specified is "/bin/sh".  "#!/bin/sh" tells the kernel itself to
> fire up a copy of /bin/sh and run the rest of the file into it.  It's a
> general way to specify the interpreter for a command file.
> 
> > What is the historical development of the : and the # ?  I am SURE that
> > the colon was the character to use in V7, but....
> 
> The # arrived in the Bourne shell shortly after V7; I'm not sure of its
> history in the C shell.  : was the old way to do things, with # introduced
> partly because : was so unsatisfactory.  As an added bonus, by the way,
> # is a good deal more efficient, since the shell isn't wasting a lot of
> preparation on the do-nothing : command.
> 
> > There are several methods to use here.  Which is the one of choice? and why?
> 
> If you must cope with old systems, either ones that don't recognize # or
> ones with C shells that misinterpret it, you must use : for comments.
> Otherwise, you should use # because it's less hassle and more efficient.
> You should probably use #!/bin/sh at the start of shell files, since it's
> harmless in systems whose kernels don't know about "#!" and useful in
> those that do.

If you must write shell programs that work on both old and new versions
of the shell, you might use
	: # comments go here
This always works, and on newer versions of the shell, which accept # as
a comment character, the arguments to the : command are not expanded at all.
It is best to put the comment in single quotes as well, to avoid any argument
processing in case the shell doesn't recognize # as a comment character.
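
A short sketch of both halves of that advice, assuming a shell that does
treat # as a comment character (probe is a hypothetical variable name):

```shell
unset probe

# An old shell would run ':' and expand its arguments; a shell that knows '#'
# drops everything after it, so the assignment below never happens.
: # ${probe=expanded}

# Single quotes keep the text inert even where '#' is not recognised.
: # 'comments go here; > and $probe are harmless inside the quotes'

echo "probe is ${probe-still unset}"
```

On a shell without # comments the first colon line would set probe,
which is exactly the argument processing the quoted form avoids.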

The #! /bin/sh on the first line is interpreted by the kernel in BSD 4.2.
This has two unfortunate consequences.  First of all, it execs a new copy
of the shell to run the shell procedure even if you are already running
the Bourne shell.  This is a lot slower than just letting the exec
fail and letting the shell read the procedure directly.
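
The fallback path can be seen with a procedure that has no #! line at
all.  This is only a sketch, assuming a Bourne-family invoking shell and
a temporary directory that permits execution (the file name and variable
names are hypothetical):

```shell
# A shell procedure that starts with the null command instead of "#!".
plain_script="${TMPDIR:-/tmp}/plain_demo.$$"
cat > "$plain_script" << 'end'
: 'no #! line, so the kernel cannot exec this file itself'
echo "run via shell fallback"
end
chmod +x "$plain_script"

# The exec fails, and the invoking shell then reads the commands itself.
fallback_result=`"$plain_script"`
echo "$fallback_result"

rm -f "$plain_script"
```

Whether the shell rereads the file in a forked copy of itself or starts
a fresh /bin/sh for it differs between implementations.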

Secondly, this allows for a very insecure implementation of setuid shell
procedures.  Given any setuid shell procedure written using this mechanism,
anyone can get the power to do anything that the owner of this procedure
can do.  (The mechanism for setuid and execute-only shell procedures that
I use for ksh doesn't rely on kernel changes and doesn't have these
obvious security problems.)

David Korn
ulysses!dgk