[net.unix-wizards] Problems with System V Release 2.1 on 68010

ron@oscvax.UUCP (Ron Janzen) (04/07/86)

I am posting this for a friend who doesn't have net access. PLEASE reply
to his e-mail address which is given at the end of the posting. mcvax can
be reached thru seismo.

-------------------------------------------------------------------------
I'm working with System V Release 2.1 from Motorola, for the 68010.
I've come across some very poorly documented, or strangely implemented
things in the kernel and shell, and would like to find out whether these
are problems with our (Motorola) release, or whether they are problems
inherent in System V. I'd also like to know what the easiest and most
elegant solutions to these problems are.

1) I know that some (berkeley?) versions of exec(2) understand shell scripts.
   The implementation i've seen is to put "#! interpreter-path" at the
   start of the script. As far as i can see, this is very kludgy, but good
   for several reasons. It means that any "program", either a binary or a shell
   script, can be exec(2)'d. It means that shell scripts can be setuid (is
   this possible anyways? I couldn't see a way). It allows a consistent
   way to handle scripts for several shells.
   Is this mechanism only a berkeley thing? I know it's not very elegant,
   but is it hard to implement for some non-obvious reason? (given the
   source to exec(2), i mean). I can see that it means the file needs read
   perms so that exec can read it, but the interpreter is going to need
   them anyways.
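   A minimal sketch of the mechanism as described above (assuming a
   Bourne-style /bin/sh and mktemp(1), which is not part of the original
   system; on a kernel without #! support the same script still runs,
   just interpreted by the invoking shell instead):

```shell
#!/bin/sh
# Write a throwaway script whose first line names its interpreter,
# then run it like any other program.  Note it needs the read bit as
# well as execute: whatever handles "#!" has to read the first line.
tmp=$(mktemp) || exit 1
cat > "$tmp" <<'EOF'
#!/bin/sh
echo "hello from a #! script"
EOF
chmod +rx "$tmp"
"$tmp"          # the kernel execs /bin/sh with the script path as argument
rm -f "$tmp"
```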

2) Does our exec(2) have a bug here? A file cannot be exec'd (by our exec(2))
   by root unless execute permissions are set for one or more of user, group,
   other. There are several things which make me think that this is a bug:
   - it is inconsistent with the way read and write perms are given;
     root can read and write a file regardless of the read and write permission
     bits.
   - intro(2) says "... execute permissions on a file are granted to a process
     if one or more of the following are true: The process's effective user
     ID is superuser. ..."
   - access(2) on a file with no execute perms, by a process with real uid
     of root, returns 0, indicating that root can execute the file.

3) Why is there no eaccess(2) which uses effective uid? It can be done
   with stat(2), but so can access(2), can it not? The fact that access(2)
   uses real uid makes it not so convenient, and might cause programs that
   use it to be incorrect (like maybe the shell? - see below)

4) Is it true that the sh construct "$@" (in a shell script) is supposed
   to be identically equal to the command-line arguments? It seems that
   it should, because otherwise there is no way to get at these because
   $@ = $* gets reparsed, and "$*" is one word.
   In our sh, "$@" is exactly equivalent to the command line args if there
   are some, otherwise it is equal to "" instead of nothing.
   Seems like a bug to me. If so, how wide-spread is it?

5) Shell functions. Are they such a kludge on all System V sh's? On ours
   the definitions seem to be stored with shell parameters, so if you do a set
   you get a huge mess of junk. They are not part of the environment though
   (thank god), but it would be nice if there was some mechanism for getting
   them into every sh started up. They do get into forked sh's though, which
   means they are "exported" to shell scripts, and commands in (). The former
   is a disaster! It's easy enough to write a shell script with
   PATH=/bin:/usr/bin
   in it, but there is no simple way to ensure that commands used in the
   script are not going to be shell functions.
   Also, executing a shell function gives values to $1, $2, ... in the
   executing shell. This is really messy; these should only be changed
   while executing the function. That is, $1, $2, ... should be auto
   variables to a shell function, not external.
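   For comparison, this sketch shows the behaviour being asked for,
   which is what later POSIX-conforming shells provide; it is not a
   claim about what the Motorola sh does:

```shell
#!/bin/sh
# In a correct sh, $1, $2, ... behave like automatics: the function
# sees its own arguments, and the caller's are restored on return.
set -- outer1 outer2
f() {
    echo "inside the function, \$1 is: $1"
}
f inner
echo "after the call, \$1 is: $1"
```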

6) Is the PATH variable so screwed up on all System V's? It used to
   be that the components were *separated* with :, and a null component was ".",
   so an initial :, or ::, or a final : meant ".". Now it seems that the final
   : isn't "." anymore. Bug or feature?
   PATH cannot be unset either, so it can't be removed from the environment
   once it's there.
   By the way, with previous (say version 7) shells, how did you unexport
   a parameter, i.e. remove it from the environment of invoked commands?

7) It seems our shell checks file permissions before trying to exec the
   file. It will not execute a command for which the effective uid (gid)
   has permissions, but for which the real uid (gid) doesn't. Looks
   like it uses access(2) to me, which is an error, 'cause access only
   looks at real uid (gid), and effectives are the ones that count to exec(2).

8) Command Hashing. This could be good, but our shell screws up in some
   really stupid ways:
   - if you try to execute a directory (that has execute, i.e. search, perms),
     the shell reports that it can't be executed, then hashes the name anyways.
   - executables used in shell functions aren't hashed.
   - if a name is hashed (by executing it) then removed, the shell correctly
     finds another executable with the same name if there is one, but
     doesn't replace the old hash table entry with the newly found path.

9) The new "type" shell builtin makes mistakes, and is horribly designed.
   It does the things that Kernighan and Pike's which command did, but is
   so verbose that it is unusable for anything non-interactive. It makes
   mistakes too, saying that directories are executable, among other things.

10) Shell parameters used on a command line (i.e. "TERM=xx vi file") don't
    work for shell builtins and shell functions, on our shell. The semantics
    should be that the value of the parameter is set and exported to the
    command, but when the command finishes, the variable should be back to its
    old state. For shell builtins, the parameter isn't restored to its old
    state. For shell functions, the new value of the parameter isn't used at
    all.
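    The semantics described above, as they apply to external commands in
    a correct sh (a sketch using "sh -c" as the command so it is
    self-contained):

```shell
#!/bin/sh
# An assignment prefixed to a command is exported to that command
# only; once the command finishes, the variable has its old value.
TERM=dumb
TERM=vt100 sh -c 'echo "the command sees TERM=$TERM"'
echo "the parent still has TERM=$TERM"
```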

11) Many of the shell scripts in /bin and /usr/bin are very poorly written.
    - they don't initialize shell variables (parameters) so any imported from
      the environment don't have a null value.
    - they don't set PATH=/bin:/usr/bin so they are very unrobust as to
      which version of a command they execute.
    Where did they get the people who wrote these?


As you can see, i'm not especially delighted with the system i'm using.
Sorry for the verbosity, but i figured that it was important to state the
problems precisely.
Please mail responses to me, as we are not on the net yet.
Thanks,
Alan Rooks - Bruel & Kjaer Copenhagen - ...!mcvax!bk!alan or ...!diku!bk!alan

ka@hropus.UUCP (Kenneth Almquist) (04/14/86)

> I'm working with System V Release 2.1 from Motorola, for the 68010.

> 1) I know that some (berkeley?) versions of exec(2) understand shell scripts.
>    The implementation i've seen is to put "#! interpreter-path" at the
>    start of the script. As far as i can see, this is very kludgy, but good
>    for several reasons. It means that any "program", either a binary or a shell
>    script, can be exec(2)'d. It means that shell scripts can be setuid (is
>    this possible anyways? I couldn't see a way). It allows a consistent
>    way to handle scripts for several shells.
>    Is this mechanism only a berkeley thing? I know it's not very elegant,
>    but is it hard to implement for some non-obvious reason? (given the
>    source to exec(2), i mean). I can see that it means the file needs read
>    perms so that exec can read it, but the interpreter is going to need
>    them anyways.

Yes, it's a Berkeley feature.  One weakness is that if you are running
/bin/sh and invoke a shell procedure which begins with "#!/bin/sh", the
kernel will exec /bin/sh even though /bin/sh is already running, so there
is a cost to this approach.  (An alternative approach is to have the shell
handle "#!".  I prefer this approach both because it keeps code out of the
kernel and because it allows normal path searches for the interpreter, as
well as solving the aforementioned efficiency problem.)

You can get a setuid shell script by writing a trivial setuid C program
that exec's the shell on a shell script.  But be warned in any case
that setuid shell procedures generally have security holes.

> 2) Does our exec(2) have a bug here? A file cannot be exec'd (by our exec(2))
>    by root unless execute permissions are set for one or more of user, group,
>    other. There are several things which make me think that this is a bug:
>    - it is inconsistent with the way read and write perms are given;
>      root can read and write a file regardless of the read and write
>      permission bits.
>    - intro(2) says "... execute permissions on a file are granted to a process
>      if one or more of the following are true: The process's effective user
>      ID is superuser. ..."
>    - access(2) on a file with no execute perms, by a process with real uid
>      of root, returns 0, indicating that root can execute the file.

It used to be that exec was supposed to require that at least one of
the execute bits on a file be set for exec to succeed.  I think that the
intention was to catch errors by superusers.  (Forgetting to type the
command name and just typing the name of the file I want to cat or what-
ever is an error I make occasionally.)

I agree that the System V documentation says that the superuser should be
allowed to exec a file which has no execute bits set, and I would guess
that AT&T's policy is that the documentation is correct.  Just out of
curiosity, though, why do you care?

> 3) Why is there no eaccess(2) which uses effective uid? It can be done
>    with stat(2), but so can access(2), can it not? The fact that access(2)
>    uses real uid makes it not so convenient, and might cause programs that
>    use it to be incorrect (like maybe the shell? - see below)

access cannot be easily done with stat.  For example, stat will fail if
the real uid has permission to search the directories leading to the file
but the effective uid does not.

You are right about access causing programs that use it to be incorrect.
Programs should not use access unless they explicitly need to test the
real uid rather than the effective uid, but they do anyway because access
is faster and simpler to use than stat.  As a result, you cannot generally
assume that programs will behave properly if invoked with differing real
and effective IDs.

This is a good example of a feature that practically begs people to write
their programs wrong.  I believe that providing an access system call
without a corresponding eaccess system call was a mistake.

> 4) Is it true that the sh construct "$@" (in a shell script) is supposed
>    to be identically equal to the command-line arguments? It seems that
>    it should, because otherwise there is no way to get at these because
>    $@ = $* gets reparsed, and "$*" is one word.
>    In our sh, "$@" is exactly equivalent to the command line args if there
>    are some, otherwise it is equal to "" instead of nothing.
>    Seems like a bug to me. If so, how wide-spread is it?

It's on our SVR2 on a VAX here.  The Korn shell gets this right, of course.

> 5) Shell functions. Are they such a kludge on all System V sh's? On ours
>    the definitions seem to be stored with shell parameters, so if you do a set
>    you get a huge mess of junk. They are not part of the environment though
>    (thank god), but it would be nice if there was some mechanism for getting
>    them into every sh started up. They do get into forked sh's though, which
>    means they are "exported" to shell scripts, and commands in (). The former
>    is a disaster! It's easy enough to write a shell script with
>    PATH=/bin:/usr/bin
>    in it, but there is no simple way to ensure that commands used in the
>    script are not going to be shell functions.
>    Also, executing a shell function gives values to $1, $2, ... in the
>    executing shell. This is really messy; these should only be changed
>    while executing the function. That is, $1, $2, ... should be auto
>    variables to a shell function, not external.

Shell functions are not stored with shell parameters; the set command just
prints them out (which makes the output of the set command messy).  You
are right about functions being passed to shell procedures and $1, $2, etc.
not being restored when a function completes.  (Again, ksh doesn't have
these problems.)

> 6) Is the PATH variable so screwed up on all System V's? It used to
>    be that the components were *separated* with :, and a null component was ".",
>    so an initial :, or ::, or a final : meant ".". Now it seems that the final
>    : isn't "." anymore. Bug or feature?

Bug.  Simple workaround--end PATH with two colons rather than one.
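A quick way to check the workaround on a given shell (the directory and
command names here are throwaways created with mktemp(1), not anything
from the original system):

```shell
#!/bin/sh
# A null PATH component means ".".  Ending PATH with "::" leaves an
# unambiguous null component even on a shell that drops a single
# trailing ":".
dir=$(mktemp -d) || exit 1
printf '#!/bin/sh\necho found via the null component\n' > "$dir/localcmd"
chmod +rx "$dir/localcmd"
cd "$dir"
PATH=/bin:/usr/bin::   # the empty final component is "."
export PATH
localcmd               # only findable through that final component
```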

>    PATH cannot be unset either, so it can't be removed from the environment
>    once it's there.
>    By the way, with previous (say version 7) shells, how did you unexport
>    a parameter, i.e. remove it from the environment of invoked commands?

Allowing you to unset PATH would be difficult.  In previous shells you could
not unset variables at all, although env(1) allowed you to remove *all*
environment variables except for a specific list from the environment.
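The env(1) idiom in question looks like this (modern env spells the
"ignore the inherited environment" flag -i; V7's env used a bare "-"):

```shell
#!/bin/sh
# env -i starts the command with an empty environment; anything the
# command should keep must be listed explicitly on the env line.
SECRET=xyzzy
export SECRET
env -i PATH=/bin:/usr/bin sh -c 'echo "SECRET is ${SECRET-unset}"'
```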

> 7) It seems our shell checks file permissions before trying to exec the
>    file. It will not execute a command for which the effective uid (gid)
>    has permissions, but for which the real uid (gid) doesn't. Looks
>    like it uses access(2) to me, which is an error, 'cause access only
>    looks at real uid (gid), and effectives are the ones that count to exec(2).

As I mentioned above, lots of people decided that real and effective IDs
were always the same and started using access rather than stat; the shell
is only one of many culprits here.

> 8) Command Hashing. This could be good, but our shell screws up in some
>    really stupid ways:
>    - if you try to execute a directory (that has execute, i.e. search, perms),
>      the shell reports that it can't be executed, then hashes the name
>      anyways.

Hashing is done before executing the command, and it doesn't check whether
the file is a directory.  It doesn't discover that the directory cannot
be executed until after it has forked, at which point it's too late to
fix the hash table.

>    - executables used in shell functions aren't hashed.

They are, though not when the function is defined.

>    - if a name is hashed (by executing it) then removed, the shell correctly
>      finds another executable with the same name if there is one, but
>      doesn't replace the old hash table entry with the newly found path.

This is again due to the fact that the shell doesn't discover the problem
until after it has forked, when it is too late to fix the hash table.  The
correct program is still executed, and the added cost of imperfect hashing
would not seem to be a real concern in view of the number of times this
must occur.

> 9) The new "type" shell builtin makes mistakes, and is horribly designed.
>    It does the things that Kernighan and Pike's which command did, but is
>    so verbose that it is unusable for anything non-interactive. It makes
>    mistakes too, saying that directories are executable, among other things.

I don't think it was intended for non-interactive use.  (The bit about
directories being listed as executable is fixed in ksh, of course.)

> 10) Shell parameters used on a command line (i.e. "TERM=xx vi file") don't
>     work for shell builtins and shell functions, on our shell. The semantics
>     should be that the value of the parameter is set and exported to the
>     command, but when the command finishes, the variable should be back to its
>     old state. For shell builtins, the parameter isn't restored to its old
>     state. For shell functions, the new value of the parameter isn't used at
>     all.

The behavior in the case of shell builtins was certainly intended, but it
was never documented, and in fact ksh treats builtins and functions just
like you (and the documentation) suggest.

> 11) Many of the shell scripts in /bin and /usr/bin are very poorly written.
>     - they don't initialize shell variables (parameters) so any imported from
>       the environment don't have a null value.
>     - they don't set PATH=/bin:/usr/bin so they are very unrobust as to
>       which version of a command they execute.
>     Where did they get the people who wrote these?

No comment :-)  (I should say, though, that I have never had a problem with
any of these shell procedures although I'm sure I could break them if I
tried.)
				Kenneth Almquist
				ihnp4!houxm!hropus!ka	(official name)
				ihnp4!opus!ka		(shorter path)

kre@munnari.OZ (Robert Elz) (04/16/86)

In article <412@hropus.UUCP>, ka@hropus.UUCP (Kenneth Almquist) writes:
> An alternative approach is to have the shell handle "#!".

If you do this you lose what is one of the main advantages of #!,
that is that exec() works on them from any parent, not only shells.
You could build handling of #! into execvp() I suppose, and then
require every program to use execvp() rather than execve() (and close
relations), but don't you think that this is one place where the
kernel really is the right solution?

The claim of efficiency loss is largely a red herring.  Most of the
costs have already been born (a bad pun?) by the shell by the time
it would look for the #! anyway - it's done the fork, it's done the
path search for the exec, etc.  The only thing you save is the
work of the exec itself, and for a shell exec'ing itself that's
not as much as you might expect.  The text is going to be shared
with the parent, the data space is pretty small, there just isn't all
that much to do.  (For csh of course this isn't true, but csh was
never able to avoid the exec in any circumstances...)  NB: I'm not
claiming that there's no loss, just that the magnitude of the loss
isn't likely to be enough to really trouble anyone.  If you're writing
a set of command procedures, and you know how they are going to be
used (from each other) then you can always just omit the #! and
have the shell operate the old way, and gain all your efficiency back.

And continues...
> Allowing you to unset PATH would be difficult.

I'm not sure why.  If you can exec a shell with no PATH set,
the shell has to do something sensible (which may be defined
as using a default built in PATH, or simply having no path
at all and only exec'ing commands when given path names that
find them).  I can't see why unsetting PATH should be any different.

Robert Elz		seismo!munnari!kre   kre%munnari.oz@seismo.css.gov

david@sun.uucp (David DiGiacomo) (04/16/86)

In article <412@hropus.UUCP> ka@hropus.UUCP (Kenneth Almquist) writes:
>> 4) Is it true that the sh construct "$@" (in a shell script) is supposed
>>    to be identically equal to the command-line arguments? It seems that
>>    it should, because otherwise there is no way to get at these because
>>    $@ = $* gets reparsed, and "$*" is one word.
>>    In our sh, "$@" is exactly equivalent to the command line args if there
>>    are some, otherwise it is equal to "" instead of nothing.
>>    Seems like a bug to me. If so, how wide-spread is it?
>
>It's on our SVR2 on a VAX here.  The Korn shell gets this right, of course.

This also afflicts SunOS 3.0.  I find it incredibly annoying, but a simple
workaround is to use ${1+"$@"} instead of plain "$@".
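The workaround in action (a self-contained sketch; the script file is a
throwaway made with mktemp(1)):

```shell
#!/bin/sh
# ${1+"$@"} expands to all of the arguments, each as one word, when
# $1 is set, and to nothing at all (not "") when there are none.
tmp=$(mktemp) || exit 1
cat > "$tmp" <<'EOF'
for a in ${1+"$@"}; do printf 'arg: <%s>\n' "$a"; done
EOF
sh "$tmp"                 # no arguments: the loop body never runs
sh "$tmp" 'one two' three # two arguments, embedded space preserved
rm -f "$tmp"
```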

[disclaimer]
-- 
David DiGiacomo  {decvax, ihnp4, ucbvax}!sun!david  david@sun.arpa
Sun Microsystems, Mt. View, CA  (415) 960-7495

chris@umcp-cs.UUCP (Chris Torek) (04/17/86)

In article <412@hropus.UUCP> ka@hropus.UUCP (Kenneth Almquist) writes:
>[`#! /bin/sh' is] a Berkeley feature.  One weakness is that if
>you are running /bin/sh and invoke a shell procedure which begins
>with "#!/bin/sh", the kernel will exec /bin/sh even though /bin/sh
>is already running, so there is a cost to this approach.

The /bin/sh that is already running would have to fork to run
the script anyway, so this `cost' really amounts to the extra
code in the kernel required to perform this indirection.

>... I believe that providing an access system call without a
>corresponding eaccess system call was a mistake.

It seems to me that, rather than providing one new system call, we
should instead generalise `access':  have it take a uid.  Only root
should be able to determine access for values other than u.u_ruid
and u.u_euid, of course.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

mike@whuxl.UUCP (BALDWIN) (04/19/86)

> In article <412@hropus.UUCP> ka@hropus.UUCP (Kenneth Almquist) writes:
> >[`#! /bin/sh' is] a Berkeley feature.  One weakness is that if
> >you are running /bin/sh and invoke a shell procedure which begins
> >with "#!/bin/sh", the kernel will exec /bin/sh even though /bin/sh
> >is already running, so there is a cost to this approach.
> 
> The /bin/sh that is already running would have to fork to run
> the script anyway, so this `cost' really amounts to the extra
> code in the kernel required to perform this indirection.
> 
> In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)

There is a real extra cost: the exec of /bin/sh.  Of course you always
have to fork, but you don't have to exec.  True, the text space is
shared and the data space may not be too big, but the kernel has to
clean up mem mgmt and then redo it, plus copying the environment.
-- 
						Michael Baldwin
			(not the opinions of)	AT&T Bell Laboratories
						{at&t}!whuxl!mike