[net.unix-wizards] 4.1c sh "not found" errors

bukys@rochester.UUCP (07/05/83)

The command
	echo "see /dev/null; see /dev/null" | sh
usually says
	sh: see: not found

In other words, the first "see" is not found, but the second one is.
The diffs from 4.1 to 4.1c "sh" show only very minor changes, each of
which I have twiddled without very encouraging results.

Anybody out there fixed this problem yet?

Liudvikas Bukys
rochester!bukys (uucp, via {seismo,allegra,brl-bmd,ritcv,rocksvax})
bukys@rochester (arpa)

bukys@rochester.UUCP (07/05/83)

Relay-Version:version B 2.10 5/3/83; site wjh12.UUCP
Posting-Version:version B 2.10 5/3/83; site rochester.UUCP
Path:wjh12!genrad!linus!philabs!seismo!rochester!bukys
Message-ID:<2145@rochester.UUCP>
Date:Tue, 5-Jul-83 11:15:06 EDT
Organization:University of Rochester

jim@mcvax.UUCP (07/08/83)

Yes, we get them too, and an easy fix is to put
	#!/bin/sh
at the beginning of shell scripts; then they always work. But the
problem lies elsewhere: a new process after an `exec' seems to get a
garbage environment. Why, I don't know. It has caused all sorts of
problems here, with uucp and the line-printer daemon to mention two.
We have had to change some programs to use `execve' with a null
environment pointer, because exec's would RETURN with no error set!

If you have a shell script without the `#!/bin/sh' at the front,
then executing it gives apparently random results, depending, it
seems, on whether one of the relevant inodes is in core or not.
I added a line in kern_process.c/execve() to ensure the stuff on the
stack was terminated properly (this was in all previous V7-derived
systems) as follows:

	(void) suword((caddr_t)ap, 0);
#ifdef MCVAX
	/* local addition: store a terminating zero at ucp as well */
	(void) suword((caddr_t)ucp, 0);
#endif /* MCVAX */
	setregs();
bad:

and now the exec's that failed randomly before fail with EBADF,
rather than going on with a crippled process.

The `rmdir' system call also behaves strangely, not giving an error
when it should: a simple program calling rmdir("..") gives no error
the first time (if it does, do a couple of `sync's and `cd's), and
after that it works O.K.

Sam Leffler told me that the code in the `do' loop in
sys_generic.c/rwip() does not handle u.u_error properly, but I have
now changed that to no effect. He also said that this problem does
not happen on 4.2.

So I am interested to know what gives here too. Still, I reckon we
can live with it for a month or so longer.

Jim McKie	....decvax!mcvax!jim
		...philabs!mcvax!jim

shannon@sun.UUCP (Bill Shannon) (07/08/83)

I'm sure I responded to this before but perhaps my response was lost.
The bug is in ufs_bio.c in (I think) biodone().  My previous response
gave the exact fix but I'm too lazy to dig it out a second time.  I
think the problem is that biodone does

	u.u_error = geterror();

where it should do

	if (u.u_error == 0)
		u.u_error = geterror();


					Bill Shannon
					Sun Microsystems Inc.
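
The difference between the two forms above can be shown outside the kernel. What follows is only a user-space sketch, not the 4.1c source: `struct proc_state` and both function names are hypothetical stand-ins for the per-process error slot (u.u_error) and for whatever routine records an I/O result into it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-process error slot (u.u_error). */
struct proc_state {
    int error;              /* 0 means "no error pending" */
};

/* Unconditional store: a later "no error" result wipes out an
 * error that was recorded earlier. */
static void record_overwrite(struct proc_state *p, int err)
{
    assert(p != NULL);
    p->error = err;
}

/* Guarded store, as in the proposed fix: only record a result
 * when no earlier error is still pending. */
static void record_first_wins(struct proc_state *p, int err)
{
    assert(p != NULL);
    if (p->error == 0)
        p->error = err;
}
```

With an error already pending, a later successful operation reported through record_overwrite() resets the slot to 0, while record_first_wins() leaves the original error in place for the caller to see.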

guy@rlgvax.UUCP (07/09/83)

This one was posted before, with a fix: here's a copy of the original article.

>>From: dean@cornell.UUCP
>>Subject: Re: 4.1c Cshell bug? (actually an exec bug)
>>Message-ID: <4585@cornell.UUCP>
>>Organization: Cornell Computer Science

There is a bug in the exec routine in the 4.1c kernel.  One
manifestation appears when the shell tries to execute a command
file that does not have "#! /bin/csh"  (or "#! /bin/sh") as its first
line.  The problem is that an incorrect error code is sometimes
returned, resulting in messages like "Bad file number".
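
For readers wondering why the error code matters here: a shell conventionally tries the exec first and, only if the kernel answers ENOEXEC, runs the file as a script under /bin/sh. The sketch below illustrates that convention in user-space C. It is not taken from the 4.1c sh or csh sources, and the function name exec_with_sh_fallback is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

extern char **environ;

/* Try to exec `path`; if the kernel says ENOEXEC (not a binary and
 * no "#!" line), re-exec it as a script under /bin/sh.
 * Returns -1 with errno set only if every exec attempt failed. */
int exec_with_sh_fallback(const char *path, char *const argv[])
{
    size_t argc = 0, n = 0, i;
    char **shargv;
    int saved;

    assert(path != NULL && argv != NULL);

    execve(path, argv, environ);
    if (errno != ENOEXEC)
        return -1;          /* any other code is reported as-is */

    while (argv[argc] != NULL)
        argc++;

    /* New vector: "sh", path, argv[1..], NULL */
    shargv = malloc((argc + 3) * sizeof *shargv);
    if (shargv == NULL)
        return -1;
    shargv[n++] = (char *)"sh";
    shargv[n++] = (char *)path;
    for (i = 1; i < argc; i++)
        shargv[n++] = argv[i];
    shargv[n] = NULL;

    execve("/bin/sh", shargv, environ);

    saved = errno;          /* keep the exec error across free() */
    free(shargv);
    errno = saved;
    return -1;
}
```

The fallback is taken only on ENOEXEC, so if the kernel hands back some other code (EBADF, say) for a script without a "#!" line, the caller reports a failure instead of running the script.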

There are two ways (at least) to fix this problem.  One is to make sure
that there is always an explicit shell specification at the beginning
of shell command files.  In this case the exec will always succeed, and
the shell program won't have to look at the returned error code.  The
other fix is to install the following code into /sys/sys/kern_process.c
and build and install a new system.  The code replaces that at the end
of the "execve" function from the label "bad:" to the end of the function.
-----------------------------------------------
bad:
        /* CORNELL DEBUG: claim there's a path */
        /* thru iput that clears u.u_error.    */

        { int sav_u_error;
            sav_u_error = u.u_error;
            u.u_error = 0;
            if (bp)
                brelse(bp);
            if (bno)
                rmfree(argmap, (long)ctod(clrnd((int) btoc(NCARGS))), bno);
            iput(ip);
            if( u.u_error == 0 )
                u.u_error = sav_u_error; 
        }
}
-----------------------------------------------
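
The same save, clear, clean up, restore sequence can be tried in user space with errno standing in for u.u_error. This is only an illustrative sketch: noisy_cleanup() is a hypothetical stand-in for a cleanup path that clears the error as a side effect, as the comment above claims iput can do, and the two open_* function names are likewise made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical cleanup step that clears errno as a side effect. */
static void noisy_cleanup(void)
{
    errno = 0;
}

/* Naive version: the cleanup runs after the failure and the
 * original error code is lost. */
int open_naive(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd >= 0)
        return fd;
    noisy_cleanup();
    return -1;
}

/* Same shape as the patch: save the error, clear it, clean up, and
 * restore the saved value unless the cleanup reported a new error. */
int open_preserving(const char *path)
{
    int fd, saved;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd >= 0)
        return fd;
    saved = errno;
    errno = 0;
    noisy_cleanup();
    if (errno == 0)
        errno = saved;
    return -1;
}
```

Called on a path that does not exist, open_naive() returns -1 with errno left at 0, whereas open_preserving() returns -1 with errno still ENOENT.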

Berkeley is aware of the problem, and it will presumably be fixed in
later releases.
                                    Dean Krafft
                                    Cornell Computer Science Dept.
                                    (607) 256-4052
                                    uucp: decvax!cornell!dean
                                    ARPA: dean@cornell