[comp.unix.wizards] #!

guy@auspex.auspex.com (Guy Harris) (06/27/91)

>Actually, if I were about to change the semantics of a prominent UNIX
>call, I would probably have given it a new name,

The whole *point* of the change was to make it *transparent* to existing
programs!

If you decide they're different calls and give them different names, all
the programs out there that use the old calls won't be able to run
scripts.  I don't *want* to have to change every single program that
calls one of the exec-family routines just so that it can automatically
run shell/awk/perl/whatever scripts.

>You could also get rid of the ugly hard-coded limits that
>are in kern_exec.c;

E.g., the 1MB hard-coded limit on number of characters passed as
arguments to a program? :-)  (Yes, there *are* systems with limits that
large.  I hope I don't have to spend very much time ever again on
systems with tiny limits such as 20480 characters....)

john@sco.COM (John R. MacMillan) (06/28/91)

This is largely a religious debate (and not even a particularly
important one at that) which I've been through many times, so I won't
say anything more after this unless someone says something new.

Guy Harris <guy@auspex.auspex.com> writes:
|>Actually, if I were about to change the semantics of a prominent UNIX
|>call, I would probably have given it a new name,
|
|The whole *point* of the change was to make it *transparent* to existing
|programs!

You left out the part where I said that *now* I would indeed make it
transparent.  At the time it was done, I don't think the existing
software base was so large that changing those cases that wanted to be
able to run shell scripts would have been unreasonable, and it would
have given programmers a choice of whether they wanted to exec objects
or run programs (see Statement of Religion, below).  This is unrelated
to whether or not it belongs in the kernel.

|>You could also get rid of the ugly hard-coded limits that
|>are in kern_exec.c;
|
|E.g., the 1MB hard-coded limit on number of characters passed as
|arguments to a program? :-)

I didn't say you could fix *all* the ugly hard-coded limits, :-) just
the 29 (32 - 3 for #! and \n) bytes for shell + 1 arg (and the
just-one-arg limit, if you wanted to).  This is also largely unrelated
to whether or
not it belongs in the kernel, although the more complex the
implementation, the less likely you are to want it in the kernel.

Elsewhere, in <19407@rpp386.cactus.org>, John F Haugh II
<jfh@rpp386.cactus.org> writes:

|I'm no fan of bloat either, and I rail against it at every opportunity.

Well, obviously not *every* opportunity. :-)

|The "#!" "hack" is not "bloat".  As Guy (that was Guy Harris, right?)
|pointed out, the change is really very minimal.

I don't think it's big, I just think it's in the wrong place, whether
it's one line or one thousand.

[ Flameproof suit on ]

I wasn't going to say anything on this, but since everyone keeps
quoting the word ``hack'':

I called it a ``hack'' because I felt that it was a feature motivated
by wanting new functionality with minimal code change (i.e. work), not
thought out very clearly (were the security problems with setuid
scripts considered?), and as such the code is certainly not something
of which I would be proud, and is not a complete solution.  I'm
obviously guessing at the motivation and depth of design, but all I
have to judge is the end result (security problems, incomplete
solution, and ugly code).

Why do I not consider it a complete solution?  It only works with
interpreters that ignore the magic line themselves (most do, so it's
convenient), it requires an explicit path be provided in the script
for the interpreter (makes it easy for a 60 line kernel
implementation), and it does not allow flexibility in how the
interpreter is invoked (again makes for easy implementation).
Wouldn't it be nice to have been able to say something like:

/* #! myrexx -v -f !0 -lmylib !*
 */

where /* and */ are the REXX comment delimiters, #! still signifies an
``exec'' like string, and the ! substitutions are csh-like?  I
haven't thought this out completely myself, but it seems possible and
not that difficult.  This is getting _way_ off topic.
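
A minimal sketch, in C, of what such a substitution might look like.
The syntax was only floated above and never specified, so everything
here is an assumption: the function name expand_template is invented,
"!0" is taken to mean the script path, "!*" is taken to mean all of the
caller's arguments, and tokens are split on whitespace with no quoting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Expand a whitespace-separated template into out[].
 * "!0" becomes the script path; "!*" becomes every element of args[].
 * The template buffer is modified in place (strtok), and out[] points
 * into tmpl, script and args rather than copying them.
 * Returns the number of arguments, or -1 if out[] (capacity max,
 * including the terminating NULL) is too small.
 */
static int
expand_template(char *tmpl, const char *script,
                char *const args[], char *out[], int max)
{
    int n = 0;

    assert(max >= 1);
    for (char *tok = strtok(tmpl, " \t\n"); tok != NULL;
         tok = strtok(NULL, " \t\n")) {
        if (strcmp(tok, "!*") == 0) {
            /* Splice in all of the caller's arguments. */
            for (int i = 0; args[i] != NULL; i++) {
                if (n >= max - 1)
                    return -1;
                out[n++] = args[i];
            }
        } else {
            if (n >= max - 1)
                return -1;
            out[n++] = strcmp(tok, "!0") == 0 ? (char *)script : tok;
        }
    }
    out[n] = NULL;
    return n;
}
```

With the template "myrexx -v -f !0 -lmylib !*", a script path of
prog.rexx and arguments "one" and "two", this yields the vector
myrexx -v -f prog.rexx -lmylib one two.  Stripping the comment
delimiters and the #! prefix from the line is left out of the sketch.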
 
As to why I think the code is ugly, consider the 32 char buffer which
limits you to 29 characters for shell plus one arg, unless the byte
following the ex_shell[] array happens to be '\0', in which case you
get 30 characters, but maybe only part of your shell name or argument
with no error.  Besides, it uses a goto. :-) I know there's lots of
ugly code in everybody's kernel but ``everybody's doing it'' is rarely
if ever a good reason for anything.
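
A minimal sketch, in C, of parsing a fixed-size header so that an
over-long interpreter line is reported instead of being silently cut
short.  This is not the kern_exec.c code; parse_shebang, HDR_SIZE and
the SB_* status values are names invented for illustration, and the
32-byte header size is simply the figure quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HDR_SIZE 32   /* header size assumed from the discussion */

enum shebang_status { SB_OK = 0, SB_NOT_SCRIPT = -1, SB_TRUNCATED = -2 };

/*
 * Parse "#!interp [arg]\n" from the first len bytes of a file.
 * interp and arg must each hold at least HDR_SIZE bytes; arg is left
 * empty when the line carries no argument.  A line whose newline does
 * not fall inside the header yields SB_TRUNCATED, never a partial name.
 */
static enum shebang_status
parse_shebang(const char *hdr, size_t len, char *interp, char *arg)
{
    assert(hdr != NULL && interp != NULL && arg != NULL);
    interp[0] = arg[0] = '\0';

    if (len > HDR_SIZE)
        len = HDR_SIZE;
    if (len < 2 || hdr[0] != '#' || hdr[1] != '!')
        return SB_NOT_SCRIPT;

    const char *nl = memchr(hdr + 2, '\n', len - 2);
    if (nl == NULL)
        return SB_TRUNCATED;

    /* Interpreter name: first run of non-blank characters. */
    const char *p = hdr + 2;
    while (p < nl && (*p == ' ' || *p == '\t'))
        p++;
    const char *q = p;
    while (q < nl && *q != ' ' && *q != '\t')
        q++;
    if (q == p)
        return SB_NOT_SCRIPT;           /* "#!" with no interpreter */
    memcpy(interp, p, (size_t)(q - p));
    interp[q - p] = '\0';

    /* Optional single argument: the rest of the line, blanks trimmed. */
    while (q < nl && (*q == ' ' || *q == '\t'))
        q++;
    const char *e = nl;
    while (e > q && (e[-1] == ' ' || e[-1] == '\t'))
        e--;
    memcpy(arg, q, (size_t)(e - q));
    arg[e - q] = '\0';
    return SB_OK;
}
```

Because the newline must be found within the header, the parser never
reads past the buffer and never depends on whatever byte happens to
follow it.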

There are valuable hacks, useful hacks, ugly hacks, even brilliant
hacks, but they're all hacks.  I'm not against ``good'' hacks, but I do
like to recognize them for what they are.

[ Flameproof suit off ]

|It gets made in exactly
|one place (the kernel) instead of many others (every command that might
|include a library module which executes another command) and brings with
|it certain (dubious) advantages (like set-UID scripts ...)

I also think it should be made in exactly one place (the library).
Since, as both of you have noted, it is quite small, it does not
``bloat'' every command that might want to execute another command in
pre-shared library systems, and with shared libraries, takes exactly
as much space as it does in the kernel.
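
A minimal sketch, in C, of what a library-level version could look
like: try execv(), and only if it fails with ENOEXEC read the first
line and re-exec the named interpreter.  The names execv_script and
script_argv are hypothetical, the 256-byte line limit is arbitrary, and
on most current systems the kernel handles #! itself, so the fallback
branch would rarely be reached there.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Build {interp, [arg], script, argv[1..], NULL}; caller frees it. */
static char **
script_argv(const char *interp, const char *arg,
            const char *script, char *const argv[])
{
    size_t argc = 0, n = 0;

    assert(interp != NULL && script != NULL && argv != NULL);
    while (argv[argc] != NULL)
        argc++;
    char **out = malloc((argc + 4) * sizeof *out);
    if (out == NULL)
        return NULL;
    out[n++] = (char *)interp;
    if (arg != NULL && arg[0] != '\0')
        out[n++] = (char *)arg;
    out[n++] = (char *)script;          /* replaces the old argv[0] */
    for (size_t i = 1; i < argc; i++)
        out[n++] = argv[i];
    out[n] = NULL;
    return out;
}

/* Like execv(), but falls back to the #! interpreter on ENOEXEC. */
int
execv_script(const char *path, char *const argv[])
{
    char line[256], interp[256], arg[256] = "";

    execv(path, argv);                  /* returns only on failure */
    if (errno != ENOEXEC)
        return -1;

    FILE *fp = fopen(path, "r");        /* needs read permission */
    if (fp == NULL)
        return -1;
    char *ok = fgets(line, sizeof line, fp);
    fclose(fp);
    if (ok == NULL || line[0] != '#' || line[1] != '!' ||
        sscanf(line + 2, " %255s %255[^\n]", interp, arg) < 1) {
        errno = ENOEXEC;
        return -1;
    }

    char **nargv = script_argv(interp, arg, path, argv);
    if (nargv == NULL)
        return -1;
    execv(interp, nargv);
    int saved = errno;                  /* keep execv's error across free */
    free(nargv);
    errno = saved;
    return -1;
}
```

Unlike a kernel implementation, this version has to open the script
itself, so it cannot run a script the caller has no read permission on.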

Stdio adds much more bulk, including code to format floating point
numbers, into every program that uses printf(3).  I don't think anyone
would suggest that it belongs in the kernel to avoid ``bloating''
applications.

|> [ getting around setuid scripts with an auxiliary program ]
|
|This points directly to why it should be handled in the kernel.  We know
|exactly how to execute shell scripts, it isn't that hard, and we can
|do it right in the kernel with 60 lousy little lines of code.  [ Plus a
|few to close the set-UID holes if you really insist on set-UID scripts ]

Above you noted that this was a ``dubious'' advantage.  Also, to my
knowledge, the holes exist, and there's nothing a sysadmin can do
about it.  That is, not only is it dubious, it's unavoidable.  With an
auxiliary program to run setuid scripts, the situation is under
control of the sysadmin.  Statement of Religion: New features,
particularly those of dubious merit and/or with security concerns,
should be optional.

Actually, I'm not convinced a few lines would close the security
holes.  They would, I presume, fix any problems in the kernel (I've
never really looked at what these might be), but some of the problems
are related to the fact that scripts are difficult to control.  Not
only must you have faith in the interpreter, but in every program it
invokes (this is less of an issue in largely self-contained
interpreters such as awk and perl than in ones such as sh and csh).
This, however, is getting off topic (again :-) ).

I'll give you some more ammunition, though:  with the kernel
implementation you can exec a script that you don't have read
permission on.  This is presumably a fairly rare case, since in order
for the *interpreter* to open and read the script, it would need
appropriate permissions.  If you really wanted to handle this case in
the library, you could presumably use the same auxiliary program that
handles setuid scripts.

To reiterate, I know no *clear* reason why #! should be in the
kernel.  It needn't be there for transparency (put it in the library),
or size considerations (it's not big), or even the far-from-clear
need for setuid scripts.  I'd be happy to hear other new reasons, or
carry on ranting by email if anyone is interested.

gwyn@smoke.brl.mil (Doug Gwyn) (06/28/91)

In article <1991Jun27.170723.10630@sco.COM> john@sco.COM (John R. MacMillan) writes:
>To reiterate, I know no *clear* reason why #! should be in the kernel.

The article was a good summary of arguments against the #! implementation.
While useful, the hack indeed doesn't provide a clean solution to a
general problem; practically by definition, then, it is not properly part
of the system kernel.

guy@auspex.auspex.com (Guy Harris) (06/30/91)

>You left out the part where I said that *now* I would indeed make it
>transparent.

I left it out because it was irrelevant - I would have wanted it done
transparently at *any* time, no matter how few programs would allegedly
have to have been changed.

>At the time it was done, I don't think the existing
>software base was so large that changing those cases that wanted to be
>able to run shell scripts would have been unreasonable, and it would
>have given programmers a choice of whether they wanted to exec objects
>or run programs (see Statement of Religion, below).

I follow a different faith, I guess.  I don't *want* to give programmers
that choice; they might make the wrong choice and, for me, the wrong
choice is always "only run executables".  I don't want some programmer
trying to second-guess me - or even *forgetting* to update their program
to use the new call.  Consider how many bits of SVR3.x came out without
being converted to use the directory library, and fell flat on their
faces when used over NFS; those probably weren't deliberate choices,
somebody just forgot to update the program.

(I also don't like the convention implemented by some shells, wherein
they treat anything that they got an ENOEXEC error from as if it were a
shell script.  Fine back in the days when a file system was unlikely to
have executable images on it other than images for the machine to which
the disk containing the file system was directly attached; not so nice
if you can access VAX and SPARC and i386 and 68K and MIPS and...
binaries via some distributed file system.

Some scheme by which interpreted scripts could be explicitly designated
would have been better.)