ron@oscvax.UUCP (Ron Janzen) (04/07/86)
I am posting this for a friend who doesn't have net access.  PLEASE reply
to his e-mail address, which is given at the end of the posting.  mcvax
can be reached thru seismo.
-------------------------------------------------------------------------
I'm working with System V Release 2.1 from Motorola, for the 68010.  I've
come across some very poorly documented, or strangely implemented, things
in the kernel and shell, and would like to find out whether these are
problems with our (Motorola) release, or whether they are problems
inherent in System V.  I'd also like to know what the easiest and most
elegant solutions to these problems are.

1) I know that some (Berkeley?) versions of exec(2) understand shell
scripts.  The implementation I've seen is to put "#! interpreter-path" at
the start of the script.  As far as I can see, this is very kludgy, but
good for several reasons.  It means that any "program", either a binary
or a shell script, can be exec(2)'d.  It means that shell scripts can be
setuid (is this possible anyways?  I couldn't see a way).  It allows a
consistent way to handle scripts for several shells.
   Is this mechanism only a Berkeley thing?  I know it's not very
elegant, but is it hard to implement for some non-obvious reason (given
the source to exec(2), I mean)?  I can see that it means the file needs
read perms so that exec can read it, but the interpreter is going to need
them anyways.

2) Does our exec(2) have a bug here?  A file cannot be exec'd (by our
exec(2)) by root unless execute permissions are set for one or more of
user, group, other.  There are several things which make me think that
this is a bug:
   - it is inconsistent with the way read and write perms are given; root
     can read and write a file regardless of the read and write
     permission bits.
   - intro(2) says "... execute permissions on a file are granted to a
     process if one or more of the following are true: The process's
     effective user ID is superuser. ..."
   - access(2) on a file with no execute perms, by a process with real
     uid of root, returns 0, indicating that root can execute the file.

3) Why is there no eaccess(2) which uses the effective uid?  It can be
done with stat(2), but so can access(2), can it not?  The fact that
access(2) uses the real uid makes it not so convenient, and might cause
programs that use it to be incorrect (like maybe the shell? - see below).

4) Is it true that the sh construct "$@" (in a shell script) is supposed
to be identically equal to the command-line arguments?  It seems that it
should, because otherwise there is no way to get at these: $@ = $* gets
reparsed, and "$*" is one word.
   In our sh, "$@" is exactly equivalent to the command-line args if
there are some; otherwise it is equal to "" instead of nothing.  Seems
like a bug to me.  If so, how widespread is it?

5) Shell functions.  Are they such a kludge on all System V sh's?  On
ours the definitions seem to be stored with shell parameters, so if you
do a "set" you get a huge mess of junk.  They are not part of the
environment, though (thank god), but it would be nice if there were some
mechanism for getting them into every sh started up.  They do get into
forked sh's, though, which means they are "exported" to shell scripts and
to commands in ().  The former is a disaster!  It's easy enough to write
a shell script with PATH=/bin:/usr/bin in it, but there is no simple way
to ensure that commands used in the script are not going to be shell
functions.
   Also, executing a shell function gives values to $1, $2, ... in the
executing shell.  This is really messy; these should only be changed
while executing the function.  That is, $1, $2, ... should be auto
variables to a shell function, not external.

6) Is the PATH variable so screwed up on all System V's?  It used to be
that the components were *separated* with :, and a null component was
".", so an initial :, or ::, or a final : meant ".".  Now it seems that
the final : isn't "." anymore.  Bug or feature?
   PATH cannot be unset either, so it can't be removed from the
environment once it's there.  By the way, with previous (say Version 7)
shells, how did you unexport a parameter, i.e. remove it from the
environment of invoked commands?

7) It seems our shell checks file permissions before trying to exec the
file.  It will not execute a command for which the effective uid (gid)
has permissions but the real uid (gid) doesn't.  Looks like it uses
access(2) to me, which is an error, 'cause access only looks at the real
uid (gid), and the effective ones are the ones that count to exec(2).

8) Command hashing.  This could be good, but our shell screws up in some
really stupid ways:
   - if you try to execute a directory (that has execute, i.e. search,
     perms), the shell reports that it can't be executed, then hashes the
     name anyways.
   - executables used in shell functions aren't hashed.
   - if a name is hashed (by executing it) and then removed, the shell
     correctly finds another executable with the same name if there is
     one, but doesn't replace the old hash table entry with the newly
     found path.

9) The new "type" shell builtin makes mistakes, and is horribly designed.
It does the things that Kernighan and Pike's "which" command did, but is
so verbose that it is unusable for anything non-interactive.  It makes
mistakes too, saying that directories are executable, among other things.

10) Shell parameters used on a command line (i.e. "TERM=xx vi file")
don't work for shell builtins and shell functions, on our shell.  The
semantics should be that the value of the parameter is set and exported
to the command, but when the command finishes, the variable should be
back in its old state.  For shell builtins, the parameter isn't restored
to its old state.  For shell functions, the new value of the parameter
isn't used at all.

11) Many of the shell scripts in /bin and /usr/bin are very poorly
written:
   - they don't initialize shell variables (parameters), so any imported
     from the environment don't have a null value.
   - they don't set PATH=/bin:/usr/bin, so they are not at all robust
     about which version of a command they execute.
   Where did they get the people who wrote these?

As you can see, I'm not especially delighted with the system I'm using.
Sorry for the verbosity, but I figured that it was important to state the
problems precisely.  Please mail responses to me, as we are not on the
net yet.

Thanks,
	Alan Rooks - Bruel & Kjaer Copenhagen
	...!mcvax!bk!alan  or  ...!diku!bk!alan
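[Points 6 and 11 above amount to a recipe for a defensive script header.
A minimal sketch in Bourne-style sh -- the variable names below are
purely illustrative, not from any of the scripts Alan mentions:]

```shell
#!/bin/sh
# Defensive prologue: pin PATH rather than trusting whatever the caller
# exported, and give every variable the script uses an explicit initial
# value so nothing leaks in from the environment.
PATH=/bin:/usr/bin
export PATH

verbose=        # initialize; don't inherit a value from the caller
tmpfile=

echo "PATH is $PATH"
```

Run with a hostile environment (e.g. PATH=/nonexistent:... verbose=yes),
the script still sees PATH=/bin:/usr/bin and an empty $verbose.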
ka@hropus.UUCP (Kenneth Almquist) (04/14/86)
> I'm working with System V Release 2.1 from Motorola, for the 68010.
>
> 1) I know that some (berkeley?) versions of exec(2) understand shell
> scripts.  The implementation i've seen is to put "#! interpreter-path"
> at the start of the script.  As far as i can see, this is very kludgy,
> but good for several reasons.  It means that any "program", either a
> binary or a shell script, can be exec(2)'d.  It means that shell
> scripts can be setuid (is this possible anyways?  I couldn't see a
> way).  It allows a consistent way to handle scripts for several shells.
> Is this mechanism only a berkeley thing?  I know it's not very elegant,
> but is it hard to implement for some non-obvious reason?  (given the
> source to exec(2), i mean).  I can see that it means the file needs
> read perms so that exec can read it, but the interpreter is going to
> need them anyways.

Yes, it's a Berkeley feature.  One weakness is that if you are running
/bin/sh and invoke a shell procedure which begins with "#!/bin/sh", the
kernel will exec /bin/sh even though /bin/sh is already running, so there
is a cost to this approach.  (An alternative approach is to have the
shell handle "#!".  I prefer this approach both because it keeps code out
of the kernel and because it allows normal path searches for the
interpreter, as well as solving the aforementioned efficiency problem.)

You can get a setuid shell script by writing a trivial setuid C program
that exec's the shell on a shell script.  But be warned in any case that
setuid shell procedures generally have security holes.

> 2) Does our exec(2) have a bug here?  A file cannot be exec'd (by our
> exec(2)) by root unless execute permissions are set for one or more of
> user, group, other.  There are several things which make me think that
> this is a bug:
>    - it is inconsistent with the way read and write perms are given;
>      root can read and write a file regardless of the read and write
>      permission bits.
>    - intro(2) says "... execute permissions on a file are granted to a
>      process if one or more of the following are true: The process's
>      effective user ID is superuser. ..."
>    - access(2) on a file with no execute perms, by a process with real
>      uid of root, returns 0, indicating that root can execute the file.

It used to be that exec was supposed to require that at least one of the
execute bits on a file be set for exec to succeed.  I think that the
intention was to catch errors by superusers.  (Forgetting to type the
command name and just typing the name of the file I want to cat or
whatever is an error I make occasionally.)  I agree that the System V
documentation says that the superuser should be allowed to exec a file
which has no execute bits set, and I would guess that AT&T's policy is
that the documentation is correct.  Just out of curiosity, though, why do
you care?

> 3) Why is there no eaccess(2) which uses effective uid?  It can be done
> with stat(2), but so can access(2), can it not?  The fact that
> access(2) uses real uid makes it not so convenient, and might cause
> programs that use it to be incorrect (like maybe the shell? - see
> below)

Access cannot be easily done with stat.  For example, stat will fail if
the real uid has permission to search the directories leading to the file
but the effective uid does not.

You are right about access causing programs that use it to be incorrect.
Programs should not use access unless they explicitly need to test the
real uid rather than the effective uid, but they do anyway because access
is faster and simpler to use than stat.  As a result, you cannot
generally assume that programs will behave properly if invoked with
differing real and effective IDs.  This is a good example of a feature
that practically begs people to write their programs wrong.  I believe
that providing an access system call without a corresponding eaccess
system call was a mistake.
> 4) Is it true that the sh construct "$@" (in a shell script) is
> supposed to be identically equal to the command-line arguments?  It
> seems that it should, because otherwise there is no way to get at these
> because $@ = $* gets reparsed, and "$*" is one word.
> In our sh, "$@" is exactly equivalent to the command line args if there
> are some, otherwise it is equal to "" instead of nothing.
> Seems like a bug to me.  If so, how wide-spread is it?

It's on our SVR2 on a VAX here.  The Korn shell gets this right, of
course.

> 5) Shell functions.  Are they such a kludge on all System V sh's?  On
> ours the definitions seem to be stored with shell parameters, so if you
> do a set you get a huge mess of junk.  They are not part of the
> environment though (thank god), but it would be nice if there was some
> mechanism for getting them into every sh started up.  They do get into
> forked sh's though, which means they are "exported" to shell scripts,
> and commands in ().  The former is a disaster!  It's easy enough to
> write a shell script with PATH=/bin:/usr/bin in it, but there is no
> simple way to ensure that commands used in the script are not going to
> be shell functions.
> Also, executing a shell function gives values to $1, $2, ... in the
> executing shell.  This is really messy; these should only be changed
> while executing the function.  That is, $1, $2, ... should be auto
> variables to a shell function, not external.

Shell functions are not stored with shell parameters; the set command
just prints them out (which makes the output of the set command messy).
You are right about functions being passed to shell procedures and about
$1, $2, etc. not being restored when a function completes.  (Again, ksh
doesn't have these problems.)

> 6) Is the PATH variable so screwed up on all System V's?  It used to be
> that the components were *separated* with :, and a null component was
> ".", so an initial :, or ::, or a final : meant ".".  Now it seems that
> the final : isn't "." anymore.  Bug or feature?

Bug.  Simple workaround: end PATH with two colons rather than one.

> PATH cannot be unset either, so it can't be removed from the
> environment once it's there.
> By the way, with previous (say version 7) shells, how did you unexport
> a parameter, i.e. remove it from the environment of invoked commands?

Allowing you to unset PATH would be difficult.  In previous shells you
could not unset variables at all, although env(1) allowed you to remove
*all* environment variables except for a specific list from the
environment.

> 7) It seems our shell checks file permissions before trying to exec the
> file.  It will not execute a command for which the effective uid (gid)
> has permissions, but for which the real uid (gid) doesn't.  Looks like
> it uses access(2) to me, which is an error, 'cause access only looks at
> real uid (gid), and effectives are the ones that count to exec(2).

As I mentioned above, lots of people decided that real and effective IDs
were always the same and started using access rather than stat; the shell
is only one of many culprits here.

> 8) Command hashing.  This could be good, but our shell screws up in
> some really stupid ways:
> - if you try to execute a directory (that has execute, i.e. search,
>   perms), the shell reports that it can't be executed, then hashes the
>   name anyways.

Hashing is done before executing the command, and it doesn't check
whether the file is a directory.  The shell doesn't discover that the
directory cannot be executed until after it has forked, at which point
it's too late to fix the hash table.

> - executables used in shell functions aren't hashed.

They are, though not when the function is defined.

> - if a name is hashed (by executing it) then removed, the shell
>   correctly finds another executable with the same name if there is
>   one, but doesn't replace the old hash table entry with the newly
>   found path.

This is again due to the fact that the shell doesn't discover the problem
until after it has forked, when it is too late to fix the hash table.
The correct program is still executed, and the added cost of imperfect
hashing would not seem to be a real concern in view of the number of
times this must occur.

> 9) The new "type" shell builtin makes mistakes, and is horribly
> designed.  It does the things that Kernighan and Pike's which command
> did, but is so verbose that it is unusable for anything
> non-interactive.  It makes mistakes too, saying that directories are
> executable, among other things.

I don't think it was intended for non-interactive use.  (The bit about
directories being listed as executable is fixed in ksh, of course.)

> 10) Shell parameters used on a command line (i.e. "TERM=xx vi file")
> don't work for shell builtins and shell functions, on our shell.  The
> semantics should be that the value of the parameter is set and exported
> to the command, but when the command finishes, the variable should be
> back to its old state.  For shell builtins, the parameter isn't
> restored to its old state.  For shell functions, the new value of the
> parameter isn't used at all.

The behavior in the case of shell builtins was certainly intended, but it
was never documented, and in fact ksh treats builtins and functions just
as you (and the documentation) suggest.

> 11) Many of the shell scripts in /bin and /usr/bin are very poorly
> written.
> - they don't initialize shell variables (parameters) so any imported
>   from the environment don't have a null value.
> - they don't set PATH=/bin:/usr/bin so they are very unrobust as to
>   which version of a command they execute.
> Where did they get the people who wrote these?

No comment :-)  (I should say, though, that I have never had a problem
with any of these shell procedures, although I'm sure I could break them
if I tried.)

				Kenneth Almquist
				ihnp4!houxm!hropus!ka	(official name)
				ihnp4!opus!ka		(shorter path)
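[The intended semantics of a command-line assignment, as described in
point 10 above, are easy to check.  On a shell that implements them
correctly (ksh then, or any modern sh), a snippet like this shows the
temporary value exported to the child and the parent's value restored:]

```shell
# "TERM=xx cmd" should export the temporary value to cmd only,
# leaving the invoking shell's TERM untouched afterwards.
TERM=vt100
export TERM
child=`TERM=xx sh -c 'echo $TERM'`
echo "child saw: $child"      # the child sees xx
echo "parent kept: $TERM"     # the parent still has vt100
```

On the shells Alan describes, substituting a builtin or a shell function
for the sh -c command is exactly where this breaks down.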
kre@munnari.OZ (Robert Elz) (04/16/86)
In article <412@hropus.UUCP>, ka@hropus.UUCP (Kenneth Almquist) writes:
> An alternative approach is to have the shell handle "#!".

If you do this you lose one of the main advantages of #!: exec() works on
such scripts from any parent, not only from shells.  You could build
handling of #! into execvp() I suppose, and then require every program to
use execvp() rather than execve() (and close relations), but don't you
think that this is one place where the kernel really is the right
solution?

The claim of efficiency loss is largely a red herring.  Most of the costs
have already been born (a bad pun?) by the shell by the time it would
look for the #! anyway - it's done the fork, it's done the path search
for the exec, etc.  The only thing you save is the work of the exec
itself, and for a shell exec'ing itself that's not as much as you might
expect.  The text is going to be shared with the parent, the data space
is pretty small; there just isn't all that much to do.  (For csh of
course this isn't true, but csh was never able to avoid the exec in any
circumstances...)

NB: I'm not claiming that there's no loss, just that the magnitude of the
loss isn't likely to be enough to really trouble anyone.  If you're
writing a set of command procedures, and you know how they are going to
be used (from each other), then you can always just omit the #! and have
the shell operate the old way, and gain all your efficiency back.

And continues...
> Allowing you to unset PATH would be difficult.

I'm not sure why.  If you can exec a shell with no PATH set, the shell
has to do something sensible (which may be defined as using a default
built-in PATH, or simply having no path at all and only exec'ing commands
when given path names that find them).  I can't see why unsetting PATH
should be any different.

				Robert Elz	seismo!munnari!kre
						kre%munnari.oz@seismo.css.gov
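[Robert's point that exec() works on #! scripts "from any parent" is the
heart of the kernel-side argument.  A minimal sketch (the path and
contents here are illustrative):]

```shell
# A minimal "#!" script.  Once it is marked executable, any program --
# not just a shell doing its own interpretation -- can exec(2) it: the
# kernel runs the named interpreter with the script's path as argv[1].
cat > /tmp/hello.sh <<'EOF'
#! /bin/sh
echo "interpreted by $0 with args: $*"
EOF
chmod +x /tmp/hello.sh
/tmp/hello.sh one two   # prints: interpreted by /tmp/hello.sh with args: one two
```

Because the kernel does the work, the same invocation succeeds from C via
execl(), from make, from cron, and so on, with no shell in between.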
david@sun.uucp (David DiGiacomo) (04/16/86)
In article <412@hropus.UUCP> ka@hropus.UUCP (Kenneth Almquist) writes:
>> 4) Is it true that the sh construct "$@" (in a shell script) is
>> supposed to be identically equal to the command-line arguments?  It
>> seems that it should, because otherwise there is no way to get at
>> these because $@ = $* gets reparsed, and "$*" is one word.
>> In our sh, "$@" is exactly equivalent to the command line args if
>> there are some, otherwise it is equal to "" instead of nothing.
>> Seems like a bug to me.  If so, how wide-spread is it?
>
>It's on our SVR2 on a VAX here.  The Korn shell gets this right, of
>course.

This also afflicts SunOS 3.0.  I find it incredibly annoying, but a
simple workaround is to use ${1+"$@"} instead of plain "$@".

[disclaimer]
-- 
David DiGiacomo  {decvax, ihnp4, ucbvax}!sun!david  david@sun.arpa
Sun Microsystems, Mt. View, CA  (415) 960-7495
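[The workaround is easy to exercise.  On a shell with function syntax
(SVR2 sh or later; the helper name here is made up):]

```shell
# ${1+"$@"} expands to "$@" when there is at least one positional
# parameter, and to nothing at all (not to "") when there are none --
# which is exactly the behavior "$@" alone was supposed to have.
count_args() { echo $#; }

set -- a "b c"          # two positional parameters
count_args ${1+"$@"}    # prints 2 (the embedded space survives)
set --                  # no positional parameters
count_args ${1+"$@"}    # prints 0 on a correct shell, not 1
```

On the buggy shells in this thread, plain "$@" in the second call would
pass a single empty argument instead of none.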
chris@umcp-cs.UUCP (Chris Torek) (04/17/86)
In article <412@hropus.UUCP> ka@hropus.UUCP (Kenneth Almquist) writes:
>[`#! /bin/sh' is] a Berkeley feature.  One weakness is that if
>you are running /bin/sh and invoke a shell procedure which begins
>with "#!/bin/sh", the kernel will exec /bin/sh even though /bin/sh
>is already running, so there is a cost to this approach.

The /bin/sh that is already running would have to fork to run the script
anyway, so this `cost' really amounts to the extra code in the kernel
required to perform this indirection.

>... I believe that providing an access system call without a
>corresponding eaccess system call was a mistake.

It seems to me that, rather than providing one new system call, we should
instead generalise `access': have it take a uid.  Only root should be
able to determine access for values other than u.u_ruid and u.u_euid, of
course.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu
mike@whuxl.UUCP (BALDWIN) (04/19/86)
> In article <412@hropus.UUCP> ka@hropus.UUCP (Kenneth Almquist) writes:
> >[`#! /bin/sh' is] a Berkeley feature.  One weakness is that if
> >you are running /bin/sh and invoke a shell procedure which begins
> >with "#!/bin/sh", the kernel will exec /bin/sh even though /bin/sh
> >is already running, so there is a cost to this approach.
>
> The /bin/sh that is already running would have to fork to run
> the script anyway, so this `cost' really amounts to the extra
> code in the kernel required to perform this indirection.
>
> In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)

There is a real extra cost: the exec of /bin/sh.  Of course you always
have to fork, but you don't have to exec.  True, the text space is shared
and the data space may not be too big, but the kernel has to clean up the
memory management and then redo it, plus copy the environment.
-- 
Michael Baldwin	(not the opinions of)	AT&T Bell Laboratories
{at&t}!whuxl!mike