[net.bugs.v7] "sed" question - undocumented features like comments

guy@sun.uucp (Guy Harris) (11/17/85)

> 	The System V version of "sed" allows the first line of a
> script to begin with a '#'; the line is treated as a comment.  The
> 4.2 BSD "sed" allows as many such lines as you care to sprinkle
> throughout your script.

Gee, I tried running a script with multiple lines of comments sprinkled
throughout it on the 4.2 and S5 "sed"s and they both worked exactly the
same...  First *line*?  I don't believe anybody'd be so sloppy as to require
all comments to fit into *one* line.

If you check the code, you see that the "fcomp" routine in "sed0.c" (the
routine that parses the command list) checks for a '#' in two places: once
at its beginning, where it looks for "#n" at the front of the script (this
causes "sed" to act as if you'd specified the "-n" flag), and once in the
body of the loop that reads the script, where it checks whether *any* line
in the script begins with a "#", possibly preceded by white space.
This code is essentially the same in both (probably all) versions of "sed",
so their behavior will be identical.

(Oh boy, another undocumented feature.  Will the appropriate people please
pick this up and document it?  I've already mailed it to Berkeley and to
our local documentation people.)

	Guy Harris

guy@sun.uucp (Guy Harris) (11/20/85)

> Gee, I tried running a script with multiple lines of comments sprinkled
> throughout it on the 4.2 and S5 "sed"s and they both worked exactly the
> same...  First *line*?  I don't believe anybody'd be so sloppy as to require
> all comments to fit into *one* line.

I checked up on it and -- good grief, they *were* that sloppy.  It turns out
that the S5 "sed" we're running here has the Berkeley comment code stuck in.
The change is trivial; change line 171 of "sed0.c" (or thereabouts; it's
line 171 in the S5R2V1 VAX distribution) from

	if(*cp == '\0')  continue;

to

	if(*cp == '\0' || *cp == '#')    continue;

Your line numbers may differ on V7, S3, other S5's, etc.

Note that 1) if you continue a command on multiple lines using \, "#" at the
beginning of those continuation lines will NOT be treated as a comment
indicator, and 2) "#" in the middle of a command (i.e., not at the beginning
of a line or separated from the beginning of the line only by whitespace)
will also not be treated as a comment indicator.  This is a feature; it
means that old scripts won't break by virtue of "#" characters which used to
be legitimate parts of commands suddenly becoming comments....
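
The two rules above can be sketched as a small stand-alone C function.  This
is only an illustration of the behavior described, not code from "sed0.c":
the name "is_comment_line" and the "in_continuation" argument are made up
for the example, and the real "sed" makes this decision inside its
script-reading loop rather than in a separate routine.

```c
/*
 * Hypothetical helper (not from sed0.c): returns 1 if a script line
 * would be skipped as a comment under the rules described above,
 * and 0 otherwise.
 *
 * in_continuation is nonzero when the previous line ended with a
 * backslash, i.e. this line is still part of the preceding command.
 */
int is_comment_line(const char *line, int in_continuation)
{
    const char *cp = line;

    /* Rule 1: a continuation line is never treated as a comment. */
    if (in_continuation)
        return 0;

    /* Skip leading white space (blanks and tabs). */
    while (*cp == ' ' || *cp == '\t')
        cp++;

    /* Rule 2: '#' counts only as the first non-blank character. */
    return *cp == '#';
}
```

With this sketch, "# strip blanks" and a tab-indented "# strip blanks" both
count as comments, while "s/#/x/" does not, and neither does any line that
continues a previous command.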

Also note that, even in "sed"s without this code, you shouldn't start a
script with a comment whose text immediately follows the "#" and begins with
a lower-case "n", unless you want the script always to be run as if "sed"
had been invoked with the "-n" flag.  If the first two characters of a
script are "#n", it turns on the "-n" flag, as mentioned before...
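
As a rough illustration of that trap, here is a minimal C sketch of the
two-character test as described above.  The function name
"script_implies_n" is hypothetical and the code is not copied from "sed0.c";
it assumes the whole script is available as one string.

```c
/*
 * Hypothetical helper (not from sed0.c): returns 1 if the script's
 * first two characters are "#n", which, per the description above,
 * makes "sed" behave as if "-n" had been given.
 */
int script_implies_n(const char *script)
{
    /* Only the very start of the script is examined. */
    return script[0] == '#' && script[1] == 'n';
}
```

Under this test a first line such as "#note: print matching lines" turns on
"-n" just as "#n" alone does, while "# note: ..." (with a blank after the
"#") does not, and a "#n" on any later line has no such effect.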

	Guy Harris