[comp.std.unix] Shell standardization

C.R.Ritson@newcastle.ac.uk ("C.R. Ritson") (01/17/91)

Submitted-by: C.R.Ritson@newcastle.ac.uk ("C.R. Ritson")

Please  could  you  post  this  to  comp.std.unix, or else reply to it
yourself if that is more appropriate.

                               -------

I  am  unsure  of the state of standardisation of the unix shells, but
hope that the standards committees would consider removing a piece  of
functionality that is becoming both irrelevant and dangerous.

I  am  thinking of the feature where, after the exec system call fails
to be able to execute a file, the shell assumes that  if  it  has  the
execute  bit  set  and is a file, then it must be a shell script.  Now
that the exec  system  call  on  most  systems  understands  the  "#!"
notation  as  a  valid  magic  number and starts the named interpreter
itself, this is no  longer  needed,  provided  this  behaviour  itself
becomes standard.

The  danger  of direct interpretation by the shell is that the file is
quite  likely  to  be  an  executable  object  file  for  some   other
architecture  seen  from the wrong side of an NFS mount.  When this is
the case the shell produces large numbers of "not found" messages  and
often  ends  up  resetting  numerous operating modes.  Our newer users
find this most confusing.
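
A minimal Python sketch of the fallback described above, for illustration
only. The name `exec_with_shell_fallback` and the injectable `execv`
parameter are hypothetical (the parameter exists so the logic can be
exercised without replacing the running process); real shells do this in C
and differ in detail.

```python
import errno
import os


def exec_with_shell_fallback(path, argv, execv=os.execv, shell="/bin/sh"):
    """Try to exec `path`; on ENOEXEC, blindly treat it as a shell script.

    This mirrors the traditional behaviour under discussion: the file's
    contents are never inspected before it is handed to the shell.
    """
    try:
        execv(path, argv)
    except OSError as err:
        if err.errno != errno.ENOEXEC:
            raise  # e.g. EACCES or ENOENT: report the failure as usual
        # The kernel did not recognise the file format. Assume a script
        # and run it as: shell path args...
        execv(shell, [shell, path] + list(argv[1:]))
```

With this logic a foreign-architecture binary that the kernel rejects with
ENOEXEC is fed to the shell line by line, which is the source of the stream
of "not found" messages.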

Chris Ritson
--
PHONE: +44 91 222 8175              Computing Laboratory,
FAX  : +44 91 222 8232              University of Newcastle upon Tyne,
TELEX: uk+53654-UNINEW_G            UK, NE1 7RU

Volume-Number: Volume 22, Number 74

trt@mcnc.org (Tom Truscott) (01/18/91)

Submitted-by: trt@mcnc.org (Tom Truscott)

> Submitted-by: C.R.Ritson@newcastle.ac.uk ("C.R. Ritson")
>
> ... 
> The  danger  of direct interpretation by the shell is that the file is
> quite  likely  to  be  an  executable  object  file  for  some   other
> architecture  seen  from the wrong side of an NFS mount.  When this is
> the case the shell produces large numbers of "not found" messages  and
> often  ends  up  resetting  numerous operating modes.  Our newer users
> find this most confusing.

If the kernel simply returned EACCES (for example) rather than ENOEXEC
when the file is non-ascii, the shells would not attempt interpretation.
(Just check that the first 4 characters have value > 1 and < 128.)
Dropping direct interpretation does make good sense.
But there is the problem of old kernels (e.g. System V.3.2!!) lacking #!,
and I think a surprising number of scripts will stop working
(such as /bin/true on some systems).  Serves them right I suppose.
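
The four-byte check suggested above could be sketched as follows. This is
written in Python for brevity; in the proposal the check would live in the
kernel. The function names are hypothetical, and treating a file shorter
than four bytes (including an empty one) as acceptable is an assumption,
not something the post specifies.

```python
def looks_like_script(head):
    """Return True if each of the first four bytes satisfies 1 < b < 128."""
    return all(1 < b < 128 for b in head[:4])


def file_looks_like_script(path):
    """Apply the same check to the first four bytes of the file at `path`."""
    with open(path, "rb") as f:
        return looks_like_script(f.read(4))
```

This is only a heuristic: a binary format whose magic number consists
entirely of 7-bit bytes would still pass it.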

For Freedomnet, we took this a bit further.  We identify
the binary's type and then execute it on the fastest available system.
This saves a lot of wear and tear in our user environment
with computers from a dozen different vendors.
Wherever I log in my favorite commands still work.
(Hmm, would other people have a dozen different personal "bin" directories?)
Of course this isn't entirely wonderful: when we unplugged the last
Gould machine some of my commands had to be recompiled.
But it goes a long way and surely it is in the right direction.

It is ironic that this NFS comment comes from the home of
the Newcastle Connection, which is the intellectual parent
of Freedomnet.  We kept the faith, and the technical
results have been most gratifying.
You too can get religion: contact Thomas Warren at 1-919-541-6110
or wtw@rti.rti.org.

	Tom Truscott

Volume-Number: Volume 22, Number 76

jack@cwi.nl (Jack Jansen) (01/21/91)

Submitted-by: jack@cwi.nl (Jack Jansen)

>  = trt@mcnc.org (Tom Truscott) writes:
>> = C.R.Ritson@newcastle.ac.uk ("C.R. Ritson")
>> The  danger  of direct interpretation by the shell is that the file is
>> quite  likely  to  be  an  executable  object  file  for  some   other
>> architecture  seen  from the wrong side of an NFS mount.  When this is
>> the case the shell produces large numbers of "not found" messages  and
>> often  ends  up  resetting  numerous operating modes.  Our newer users
>> find this most confusing.

>If the kernel simply returned EACCES (for example) rather than ENOEXEC
>when the file is non-ascii, the shells would not attempt interpretation.
>(Just check that the first 4 characters have value > 1 and < 128.)
>Dropping direct interpretation does make good sense.
>But there is the problem of old kernels (e.g. System V.3.2!!) lacking #!,
>and I think a surprising number of scripts will stop working
>(such as /bin/true on some systems).  Serves them right I suppose.

I think you've missed the point here. The question is not whether the
kernel recognizes #!, but whether the shell recognizes it.

Currently, when the kernel returns ENOEXEC the shell just blindly assumes
that we have a shell script here and starts executing it. I don't see
any problem in the shell reading the first line and checking it for
#!/bin/sh.

The only thing that this would break is that old shell scripts without
a #! first line wouldn't execute anymore, but this is trivial to fix
by adding the #! line.
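
A rough sketch of the first-line check proposed here, in Python for
illustration. The function name is hypothetical, and accepting optional
white space after "#!" and trailing arguments is an assumption; the post
only mentions the literal string #!/bin/sh.

```python
def first_line_names_sh(first_line, shell=b"/bin/sh"):
    """Return True if `first_line` is a '#!' line naming `shell`."""
    if not first_line.startswith(b"#!"):
        return False
    words = first_line[2:].split()
    # The first word after '#!' is the interpreter; anything else is
    # an argument to it and is ignored here.
    return bool(words) and words[0] == shell
```

Under this rule a script without a "#!" line, or one naming any other
interpreter, would be refused rather than run.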
-- 
Een volk dat voor tirannen zwicht	| Oral:     Jack Jansen
zal meer dan lijf en goed verliezen	| Internet: jack@cwi.nl
dan dooft het licht			| Uucp:     hp4nl!cwi.nl!jack


Volume-Number: Volume 22, Number 78

guy@auspex.uucp (Guy Harris) (01/23/91)

Submitted-by: guy@auspex.uucp (Guy Harris)

>Currently, when the kernel returns ENOEXEC the shell just blindly assumes
>that we have a shell script here and starts executing it. I don't see
>any problem in the shell reading the first line and checking it for
>#!/bin/sh.

Or checking for "#!" in general, and doing what is done in the "exec*()"
implementations of many systems (usually in the kernel, but not
necessarily so) - i.e., have the Bourne shell capable of executing C
shell scripts (for those of you who write them :-)) that begin with
"#! /bin/csh", and the C shell capable of executing Bourne shell scripts
that begin with "#! /bin/sh", etc.

I don't strongly care where it's done (although I *do* prefer having
"execl()" AND "execv()" capable of running scripts, even if it's done by
having them be wrappers around kernel traps with the wrappers checking
for the "#!" line if they get ENOEXEC), but it *would* be nice if the
system didn't inappropriately try to run files that happened to have
execute permissions as scripts if, in fact, they aren't scripts. 
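
The wrapper idea might look roughly like the following Python sketch. All
names are hypothetical. Passing everything after the interpreter as a
single argument and reading at most 256 bytes of the first line are
assumptions; systems that implement "#!" differ on both points.

```python
import errno
import os


def parse_interpreter_line(first_line):
    """Return [interpreter] or [interpreter, arg] from a '#!' line, else None."""
    if not first_line.startswith(b"#!"):
        return None
    rest = first_line[2:].split(b"\n", 1)[0].strip()
    if not rest:
        return None
    return [os.fsdecode(part) for part in rest.split(None, 1)]


def execv_script_aware(path, argv, execv=os.execv):
    """exec `path`; on ENOEXEC, honour a '#!' line instead of guessing."""
    try:
        execv(path, argv)
    except OSError as err:
        if err.errno != errno.ENOEXEC:
            raise
        with open(path, "rb") as f:
            interp = parse_interpreter_line(f.readline(256))
        if interp is None:
            raise  # no '#!' line: report ENOEXEC rather than run it
        execv(interp[0], interp + [path] + list(argv[1:]))
```

A file with execute permission but no "#!" line simply fails with ENOEXEC
here, which is the behaviour asked for above.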

I don't know if anything more should be said by any standard than simply
"we do not guarantee that any shell will execute a script that doesn't
begin with '#!'", so that you can remove the "if it gets ENOEXEC, treat
it as a script" stuff and still comply with the appropriate POSIX
standard(s).


Volume-Number: Volume 22, Number 79

gwyn@smoke.brl.mil (Doug Gwyn) (01/24/91)

Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn)

In article <17155@cs.utexas.edu> jack@cwi.nl (Jack Jansen) writes:
>I don't see any problem in the shell reading the first line and
>checking it for #!/bin/sh.

Shells that have done this (e.g. some csh implementations on UNIX System V)
have definitely caused problems.  For example, most of my shell scripts
start with #!/usr/5bin/sh to ensure System V semantics on BSD systems;
however, on a genuine UNIX System V system I was expecting the script to
be interpreted by /bin/sh (even if executed by csh).  We discovered that
some overly-"helpful" implementations of csh on System V, however, went
ahead and tried to find /usr/5bin/sh, which of course made the execution
fail.

Volume-Number: Volume 22, Number 80

rcd@ico.isc.com (Dick Dunn) (01/27/91)

Submitted-by: rcd@ico.isc.com (Dick Dunn)

jack@cwi.nl (Jack Jansen) writes:

> ... I don't see any problem in the shell reading the first line and
> checking it for #!/bin/sh.

> The only thing that this would break is that old shell scripts without
> a #! first line wouldn't execute anymore, but this is trivial to fix
> by adding the #! line.

While I don't really intend to wish evil upon Jansen, I do wish that people
who make statements like this would have to keep a hand in maintaining
large systems.

It is, indeed, trivial to change any given shell script by the addition of
a #! line at the beginning.  But the task of finding (all instances of) all
shell scripts and correcting them all is a most incredible nightmare.  Even
this ignores finding the programs which generate shell scripts (a much
smaller class than the former, but much harder to solve).

The alternate solution which has been suggested--having the shell sanity-
check before it executes a file--is far more practical to carry out in the
real world, and will find virtually all non-shell-script problems.
-- 
Dick Dunn     rcd@ico.isc.com -or- ico!rcd       Boulder, CO   (303)449-2870
   ...Mr. Natural says, "Use the right tool for the job."

Volume-Number: Volume 22, Number 83

karish@mindcraft.com (Chuck Karish) (01/31/91)

Submitted-by: karish@mindcraft.com (Chuck Karish)

In article <17400@cs.utexas.edu> gwyn@smoke.brl.mil (Doug Gwyn) wrote:
>In article <17155@cs.utexas.edu> jack@cwi.nl (Jack Jansen) writes:
>>I don't see any problem in the shell reading the first line and
>>checking it for #!/bin/sh.
>
>Shells that have done this (e.g. some csh implementations on UNIX System V)
>have definitely caused problems.  For example, most of my shell scripts
>start with #!/usr/5bin/sh to ensure System V semantics on BSD systems;
>however, on a genuine UNIX System V system I was expecting the script to
>be interpreted by /bin/sh (even if executed by csh).

It would seem that programmer expectations were more at fault here than
were the SysV shells.  The scripts he described were designed to
exploit a difference in the behavior of shells on different systems.  I
don't understand why it worked, though.  On some older SysV-based
systems I've used, scripts that start with the '#' character are
interpreted as csh scripts no matter what follows the '#'.

Some systems that honor "#! /bin/whatever" do not default to csh if the
file starts with just "#".

The need for a hard-coded path makes scripts that depend on the "#!"
mechanism non-portable. "/usr/5bin/sh" is the right path for a shell
with full SysV functionality on a Sun.  On a DEC system, the
incantation is "/bin/sh5"; on other systems, the correct name might be
"/usr/usg/sh" or "/bin/bsh".  Perhaps what we need is a standard set of
non-filename identifiers for shells.

Creative use of links can help alleviate this problem on existing
systems.

I would be more optimistic that the adoption of the 1003.2 standard
would solve this problem if I hadn't already seen the many different
ways that vendors have isolated 1003.1 behavior in compatibility
modes.  Operating system standards won't live up to their full
potential until they're accepted as the default interfaces.

To change the subject a little, what features of modern sh
implementations cause scripts written for the BSD sh to fail?  I
presume there's a reason that BSD vendors don't install a SysV sh in
/bin.  Followups on this point should probably go to comp.unix.shell.
-- 

	Chuck Karish		karish@mindcraft.com
	Mindcraft, Inc.		(415) 323-9000

Volume-Number: Volume 22, Number 90

gwyn@smoke.brl.mil (Doug Gwyn) (02/01/91)

Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn)

In article <17504@cs.utexas.edu> karish@mindcraft.com (Chuck Karish) writes:
>On some older SysV-based systems I've used, scripts that start with the '#'
>character are interpreted as csh scripts no matter what follows the '#'.

That was a bug in porting a pre-#! era csh feature.  On UNIX System V,
more often than not Bourne shell scripts begin with '#'.  Therefore it
was unwise to leave that vestigial feature in csh when porting it to
such an environment.

The basic answer is that there is no truly portable solution, other
than writing Bourne shell scripts starting with non-# that invoke
whatever other command is actually wanted, using some sort of path
search to find it.
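
The "path search" step could be sketched like this. Python is used only to
keep the examples in one language; in practice the search would be written
in the Bourne shell wrapper itself. The function name and the directory
list in the usage note are hypothetical.

```python
import os


def find_in_path(name, search_dirs):
    """Return the first executable regular file called `name`, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

For example, `find_in_path("sh", ["/usr/5bin", "/bin"])` would prefer a
shell in /usr/5bin where that directory exists and fall back to /bin/sh
elsewhere.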

Of course Plan 9's user-mounted appendable directories make this easier.

Volume-Number: Volume 22, Number 95