guy@auspex.auspex.com (Guy Harris) (06/27/91)
>Actually, if I were about to change the semantics of a prominent UNIX
>call, I would probably have given it a new name,

The whole *point* of the change was to make it *transparent* to existing programs! If you decide they're different calls and give them different names, all the programs out there that use the old calls won't be able to run scripts. I don't *want* to have to change every single program that calls one of the exec-family routines just so that it can automatically run shell/awk/perl/whatever scripts.

>You could also get rid of the ugly hard-coded limits that
>are in kern_exec.c;

E.g., the 1MB hard-coded limit on number of characters passed as arguments to a program? :-)

(Yes, there *are* systems with limits that large. I hope I don't have to spend very much time ever again on systems with tiny limits such as 20480 characters....)
john@sco.COM (John R. MacMillan) (06/28/91)
This is largely a religious debate (and not even a particularly important one at that) which I've been through many times, so I won't say anything more after this unless someone says something new.

Guy Harris <guy@auspex.auspex.com> writes:
|>Actually, if I were about to change the semantics of a prominent UNIX
|>call, I would probably have given it a new name,
|
|The whole *point* of the change was to make it *transparent* to existing
|programs!

You left out the part where I said that *now* I would indeed make it transparent. At the time it was done, I don't think the existing software base was so large that changing those cases that wanted to be able to run shell scripts would have been unreasonable, and it would have given programmers a choice of whether they wanted to exec objects or run programs (see Statement of Religion, below). This is unrelated to whether or not it belongs in the kernel.

|>You could also get rid of the ugly hard-coded limits that
|>are in kern_exec.c;
|
|E.g., the 1MB hard-coded limit on number of characters passed as
|arguments to a program? :-)

I didn't say you could fix *all* the ugly hard-coded limits, :-) just the 29 (32 - 3 for #! and \n) bytes for shell + 1 arg (and the limit of just one arg, if you wanted to). This is also largely unrelated to whether or not it belongs in the kernel, although the more complex the implementation, the less likely you are to want it in the kernel.

Elsewhere, in <19407@rpp386.cactus.org>, John F Haugh II <jfh@rpp386.cactus.org> writes:
|I'm no fan of bloat either, and I rail against it at every opportunity.

Well, obviously not *every* opportunity. :-)

|The "#!" "hack" is not "bloat". As Guy (that was Guy Harris, right?)
|pointed out, the change is really very minimal.

I don't think it's big, I just think it's in the wrong place, whether it's one line or one thousand.
[ Flameproof suit on ]

I wasn't going to say anything on this, but since everyone keeps quoting the word ``hack'': I called it a ``hack'' because I felt that it was a feature motivated by wanting new functionality with minimal code change (i.e., work), not thought out very clearly (were the security problems with setuid scripts considered?), and as such the code is certainly not something of which I would be proud, and is not a complete solution. I'm obviously guessing at the motivation and depth of design, but all I have to judge is the end result (security problems, incomplete solution, and ugly code).

Why do I not consider it a complete solution? It only works with interpreters that ignore the magic line themselves (most do, so it's convenient), it requires that an explicit path be provided in the script for the interpreter (makes it easy for a 60 line kernel implementation), and it does not allow flexibility in how the interpreter is invoked (again makes for easy implementation). Wouldn't it be nice to have been able to say something like:

	/* #! myrexx -v -f !0 -lmylib !* */

where /* and */ are the REXX comment delimiters, #! still signifies an ``exec'' like string, and the ! substitutions are csh-like? I haven't thought this out completely myself, but it seems possible and not that difficult. This is getting _way_ off topic.

As to why I think the code is ugly, consider the 32 char buffer which limits you to 29 characters for shell plus one arg, unless the byte following the ex_shell[] array happens to be '\0', in which case you get 30 characters, but maybe only part of your shell name or argument with no error. Besides, it uses a goto. :-) I know there's lots of ugly code in everybody's kernel but ``everybody's doing it'' is rarely if ever a good reason for anything.

There are valuable hacks, useful hacks, ugly hacks, even brilliant hacks, but they're all hacks. I'm not against ``good'' hacks, but I do like to recognize them for what they are.
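The truncation complaint above can be made concrete with a small user-space sketch. The function below parses a #! line into an interpreter path and at most one argument, and it reports an over-long line as an error instead of silently keeping part of it. This is not the code in kern_exec.c: the names parse_shbang, struct shbang and SHBANG_MAX are hypothetical, the 32-byte limit is taken from the figure quoted in the post, and treating the rest of the line as the single argument is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHBANG_MAX 32   /* assumed header size, after the 32-byte buffer discussed above */

/* Hypothetical result type: interpreter path plus at most one argument. */
struct shbang {
    char interp[SHBANG_MAX];
    char arg[SHBANG_MAX];
    int  has_arg;
};

/*
 * Parse the first len bytes of a script.
 * Returns  0 on success,
 *         -1 if the header is not a usable "#!" line,
 *         -2 if no newline appears within SHBANG_MAX bytes (line too long).
 */
int parse_shbang(const char *hdr, size_t len, struct shbang *out)
{
    assert(hdr != NULL && out != NULL);

    if (len < 2 || hdr[0] != '#' || hdr[1] != '!')
        return -1;
    if (len > SHBANG_MAX)
        len = SHBANG_MAX;

    const char *nl = memchr(hdr, '\n', len);
    if (nl == NULL)
        return -2;                  /* refuse rather than truncate */

    /* Skip blanks after "#!", then take the interpreter path. */
    const char *p = hdr + 2;
    while (p < nl && (*p == ' ' || *p == '\t'))
        p++;
    const char *start = p;
    while (p < nl && *p != ' ' && *p != '\t')
        p++;
    if (p == start)
        return -1;                  /* "#!" with no interpreter named */
    memcpy(out->interp, start, (size_t)(p - start));
    out->interp[p - start] = '\0';

    /* The rest of the line, minus surrounding blanks, is the one argument. */
    while (p < nl && (*p == ' ' || *p == '\t'))
        p++;
    const char *end = nl;
    while (end > p && (end[-1] == ' ' || end[-1] == '\t'))
        end--;
    out->has_arg = (end > p);
    memcpy(out->arg, p, (size_t)(end - p));
    out->arg[end - p] = '\0';
    return 0;
}
```

With this shape a caller can turn the -2 case into a visible error, which is the behaviour the post says the 32-byte buffer lacks.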
[ Flameproof suit off ]

|It gets made in exactly
|one place (the kernel) instead of many others (every command that might
|include a library module which executes another command) and brings with
|it certain (dubious) advantages (like set-UID scripts ...)

I also think it should be made in exactly one place (the library). Since, as both of you have noted, it is quite small, it does not ``bloat'' every command that might want to execute another command in pre-shared library systems, and with shared libraries, takes exactly as much space as it does in the kernel. Stdio adds much more bulk, including code to format floating point numbers, into every program that uses printf(3). I don't think anyone would suggest that it belongs in the kernel to avoid ``bloating'' applications.

|> [ getting around setuid scripts with an auxiliary program ]
|
|This points directly to why it should be handled in the kernel. We know
|exactly how to execute shell scripts, it isn't that hard, and we can
|do it right in the kernel with 60 lousy little lines of code. [ Plus a
|few to close the set-UID holes if you really insist on set-UID scripts ]

Above you noted that this was a ``dubious'' advantage. Also, to my knowledge, the holes exist, and there's nothing a sysadmin can do about it. That is, not only is it dubious, it's unavoidable. With an auxiliary program to run setuid scripts, the situation is under control of the sysadmin.

Statement of Religion: New features, particularly those of dubious merit and/or with security concerns, should be optional.

Actually, I'm not convinced a few lines would close the security holes. They would, I presume, fix any problems in the kernel (I've never really looked at what these might be), but some of the problems are related to the fact that scripts are difficult to control.
Not only must you have faith in the interpreter, but in every program it invokes (this is less of an issue in largely self-contained interpreters such as awk and perl than in ones such as sh and csh). This, however, is getting off topic (again :-) ).

I'll give you some more ammunition, though: with the kernel implementation you can exec a script that you don't have read permission on. This is presumably a fairly rare case, since in order for the *interpreter* to open and read the script, it would need appropriate permissions. If you really wanted to handle this case in the library, you could presumably use the same auxiliary program that handles setuid scripts.

To reiterate, I know no *clear* reason why #! should be in the kernel. It needn't be there for transparency (put it in the library), or size considerations (it's not big), or even the far-from-clear need for setuid scripts. I'd be happy to hear other new reasons, or carry on ranting by email if anyone is interested.
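For readers who want to see what "put it in the library" might look like, here is a minimal sketch in C. It tries execve(2) first and, only if the kernel answers ENOEXEC, reads the first line of the file itself and re-executes the named interpreter. The names lib_execve and script_argv are hypothetical, the 256-byte line limit is an arbitrary assumption, and the sketch ignores setuid scripts and unreadable scripts entirely (the post suggests an auxiliary program for those). On a kernel that already handles #!, the fallback path is simply never reached.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: build the interpreter's argument vector,
 * { interp, [arg], path, argv[1], ..., NULL }. The caller frees the result.
 */
char **script_argv(char *interp, char *arg, char *path, char *const argv[])
{
    size_t n = 0, i = 0;

    assert(interp != NULL && path != NULL && argv != NULL);
    while (argv[n] != NULL)
        n++;
    char **v = malloc((n + 4) * sizeof *v);
    if (v == NULL)
        return NULL;
    v[i++] = interp;
    if (arg != NULL && *arg != '\0')
        v[i++] = arg;               /* the single optional argument */
    v[i++] = path;                  /* script path replaces argv[0] */
    for (size_t k = 1; k < n; k++)
        v[i++] = argv[k];
    v[i] = NULL;
    return v;
}

/* Hypothetical library-level exec: returns -1 with errno set on failure. */
int lib_execve(const char *path, char *const argv[], char *const envp[])
{
    char line[256];                 /* assumed limit on the #! line */

    execve(path, argv, envp);       /* returns only on failure */
    if (errno != ENOEXEC)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, line, sizeof line - 1);
    close(fd);
    if (n < 3 || line[0] != '#' || line[1] != '!') {
        errno = ENOEXEC;            /* not an explicitly marked script */
        return -1;
    }
    line[n] = '\0';
    char *nl = strchr(line, '\n');
    if (nl == NULL) {
        errno = ENOEXEC;            /* line too long: fail, do not truncate */
        return -1;
    }
    *nl = '\0';

    /* Split "interp [arg]" in place. */
    char *interp = line + 2 + strspn(line + 2, " \t");
    char *arg = interp + strcspn(interp, " \t");
    if (*arg != '\0') {
        *arg++ = '\0';
        arg += strspn(arg, " \t");
    }
    if (*interp == '\0') {
        errno = ENOEXEC;
        return -1;
    }

    char **v = script_argv(interp, arg, (char *)path, argv);
    if (v == NULL)
        return -1;
    execve(interp, v, envp);
    int saved = errno;
    free(v);
    errno = saved;
    return -1;
}
```

Note that this version needs read permission on the script, which is exactly the limitation conceded in the post.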
gwyn@smoke.brl.mil (Doug Gwyn) (06/28/91)
In article <1991Jun27.170723.10630@sco.COM> john@sco.COM (John R. MacMillan) writes:
>To reiterate, I know no *clear* reason why #! should be in the kernel.

The article was a good summary of arguments against the #! implementation. While useful, the hack indeed doesn't provide a clean solution to a general problem; practically by definition, then, it is not properly part of the system kernel.
guy@auspex.auspex.com (Guy Harris) (06/30/91)
>You left out the part where I said that *now* I would indeed make it
>transparent.

I left it out because it was irrelevant - I would have wanted it done transparently at *any* time, no matter how few programs would allegedly have to have been changed.

>At the time it was done, I don't think the existing
>software base was so large that changing those cases that wanted to be
>able to run shell scripts would have been unreasonable, and it would
>have given programmers a choice of whether they wanted to exec objects
>or run programs (see Statement of Religion, below).

I follow a different faith, I guess. I don't *want* to give programmers that choice; they might make the wrong choice and, for me, the wrong choice is always "only run executables". I don't want some programmer trying to second-guess me - or even *forgetting* to update their program to use the new call. Consider how many bits of SVR3.x came out without being converted to use the directory library, and fell flat on their faces when used over NFS; those probably weren't deliberate choices, somebody just forgot to update the program.

(I also don't like the convention implemented by some shells, wherein they treat anything that they got an ENOEXEC error from as if it were a shell script. Fine back in the days when a file system was unlikely to have executable images on it other than images for the machine to which the disk containing the file system was directly attached; not so nice if you can access VAX and SPARC and i386 and 68K and MIPS and... binaries via some distributed file system. Some scheme by which interpreted scripts could be explicitly designated would have been better.)
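The difference between the ENOEXEC convention and explicit designation can be illustrated with a short C sketch. A shell following the old convention would hand any file that fails with ENOEXEC to sh; the function below instead distinguishes files that are explicitly marked with #! from everything else. The names classify_header and enum file_kind are hypothetical, and using a NUL byte as the sign of a binary image is an assumed heuristic, not something stated in the posts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a file that execve() rejected with ENOEXEC. */
enum file_kind {
    KIND_MARKED_SCRIPT,   /* begins with "#!": explicitly designated */
    KIND_UNMARKED_TEXT,   /* no marker, no NUL bytes: the old convention runs it as sh */
    KIND_BINARY           /* contains a NUL byte: probably an image for some machine */
};

/* Inspect the first len bytes of the file. */
enum file_kind classify_header(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);

    if (len >= 2 && hdr[0] == '#' && hdr[1] == '!')
        return KIND_MARKED_SCRIPT;
    for (size_t i = 0; i < len; i++)
        if (hdr[i] == 0)
            return KIND_BINARY;     /* assumed heuristic for a foreign image */
    return KIND_UNMARKED_TEXT;
}
```

Under explicit designation only KIND_MARKED_SCRIPT would be interpreted; a foreign-architecture binary reached over a distributed file system falls into KIND_BINARY and is rejected rather than fed to the shell.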