[comp.unix.sysv386] Fixes for a job controlled bash1.07 on ISC2.2

lubkin@cs.rochester.edu (Saul Lubkin) (05/24/91)

I've finally worked out the last problems in using a POSIX-compiled, job
control bash1.07 with ISC Unix 2.2.  Enclosed are all the context diffs
needed to change a pristine copy, extracted from bash1.07.tar.Z on
prep.ai.mit.edu, into one that compiles to a fully usable, safe, very
useful shell under ISC 2.2.

Note:  I've used "gcc -traditional -posix" for my C compiler.  gcc can also
be downloaded from prep.ai.mit.edu.  Or, you can use "cc -Xp" instead, and
make the appropriate modifications in cpp-Makefile, by hand.

Let me note what the problems had been:

ISC2.2 is a dual operating system.  When a file is compiled with "cc -Xp"
(or equivalently "gcc -posix"), a POSIX.1 compliant environment is created
for the binary, which affects the way it executes.  Compiling without the
"-Xp" flag creates ordinary SVR3 compatible binaries.

In a POSIX ("-Xp") binary, you can execute "__setostype(0);" to change
semantics to SVR3, and "__setostype(1);" to change back to POSIX.  THIS
IS NOT USUALLY A GOOD IDEA; some system calls may become confused if you
do this.

If you compile a POSIX shell, like bash with "gcc -posix" or "cc -Xp", then
ANY COMMAND EXECED OR FORK-EXECED from the shell WILL OBEY POSIX SEMANTICS,
not SVR3.  This, in an ideal world, should not be a problem.  Unfortunately,
there are some bugs (that can have serious consequences) in running some
ISC commands with POSIX semantics.

Most notorious is the "namei" bug.  Some binaries -- including, notably,
vpix -- have not been compiled for running in ISC POSIX mode.  As a result,
they dereference a null pointer in "namei", causing a system panic.

Uwe Doering posted a binary patch to /etc/conf/pack.d/kernel/os.o that fixes
that problem, once you build a new kernel.

However, there are other problems, even after you apply Uwe's patch.  Namely,
the ISC "mv" command behaves strangely when you try to rename a directory.
"mv dir1 dir2", executed under a POSIX shell, reports that it "Cannot unlink
.".  You then find that dir1 and dir2 both exist, and with the same inode.
You've created two linked directories!  (Once, perhaps when I was root, even
more mischief was done -- the parent directory of "dir1" was changed from
a directory to an unreferenced file, and all files in that parent directory
-- my home directory! -- were orphaned.  I found them in /usr/lost+found.)

I've compiled a replacement for "mv", using the GNU file utilities 1.4, also
at prep.ai.mit.edu.  When compiled in POSIX mode ("cc -Xp" or "gcc -posix")
it works fine, even under a POSIX bash.  (Interestingly, when compiled in
non-POSIX fashion -- "cc" or "gcc" without the option -- it DOESN'T work
properly.  It reports "not owner", when attempting to rename a directory.
And it refuses to overwrite a plain file, (e.g., "mv file1 file2", when
"file[12]" exist and are ordinary files), whatever option you use.  No such
shenanigans when compiling GNU "mv" in POSIX mode.)

From the above, it's clear that you must be a bit brave (maybe even
reckless) to use a POSIX compiled shell, even if you THINK that you've found
all the bugs in ISC2.2 POSIX.

Here's the safe solution:

The problems all arise because commands execed, or forked and execed, from
a running POSIX-compiled process, run with ISC POSIX.1 semantics.  (As if
the first line in "main" were "__setostype(1);").  The solution is, JUST
BEFORE each such exec, add the line "__setostype(0);".  Then, the execed
process runs with ordinary SVR3 semantics.  Even vpix will run safely under
such a bash, even if you haven't applied Uwe's binary patch.

To be on the safe side (in case the exec call fails, and control returns),
the line JUST AFTER the exec call should be "__setostype(1);", to return
to the normal POSIX.1 semantics of bash.
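
The bracketing described above can be pictured as a small wrapper around
execve.  This is only a sketch under stated assumptions: the real diffs put
the two calls inline rather than in a helper, and the names execve_svr3,
set_ostype, ostype_now and the build flag HAVE_ISC_SETOSTYPE are all
invented here.  Off ISC the mode switch is merely recorded in a variable, so
the sketch can be compiled and tried anywhere.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

#if defined(HAVE_ISC_SETOSTYPE)      /* hypothetical flag for ISC 2.2 builds */
extern int __setostype();            /* prototype is a guess; the diffs rely
                                        on an implicit declaration */
#endif

static int ostype_now = 1;           /* 1 = POSIX, 0 = SVR3, as in the post */

/* Switch semantics on ISC; elsewhere just remember the requested mode. */
static void set_ostype(int type)
{
#if defined(HAVE_ISC_SETOSTYPE)
    __setostype(type);
#endif
    ostype_now = type;
}

/* Switch to SVR3 just before the exec.  If execve returns at all it has
   failed, so switch back to POSIX and hand the original errno back. */
int execve_svr3(const char *path, char *const argv[], char *const envp[])
{
    int saved_errno;

    assert(path != NULL);
    set_ostype(0);
    execve(path, argv, envp);
    saved_errno = errno;
    set_ostype(1);
    errno = saved_errno;
    return -1;
}
```

Saving errno around the second switch is an extra precaution that the diffs
do not take; it only matters if the mode switch itself can disturb errno.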

I've tried this.  It works like a charm.  The resulting bash has job control,
and all the other nice bash features -- in-line editing, filename completion,
variable completion, command completion, etc.  And, best of all, all commands
executed from bash (either on the command line or by using the bash builtin
command "exec") run as ordinary SVR3 processes.  E.g., ISC "mv" behaves
normally.

I believe that the above method is probably the way ISC compiled its own
csh, which supports job control.  The evidence is that, from ISC's csh, a
command like:

"echo foo > asdfghjklqwertyuiop"

will get the POSIX error about too long a filename.  The same thing happens
if you do this from bash, compiled as I've suggested.  But "/bin/sh", and
also the old Microport Korn shell, produce no such error message -- they
simply truncate the filename to 14 characters.
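
The two behaviours can be modelled in a few lines of C.  This is a toy
model, not ISC kernel code: the function name model_component and the
constant SVR3_NAME_MAX are invented for illustration, and ENAMETOOLONG is
the error POSIX.1 specifies for an over-long component when
_POSIX_NO_TRUNC is in effect.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define SVR3_NAME_MAX 14   /* the 14-character filename limit from the post */

/* Model how one pathname component is treated.
   no_trunc != 0: POSIX.1 style, over-long names are rejected.
   no_trunc == 0: SVR3 style, over-long names are silently cut to 14.
   Returns 0 and fills out, or -1 with errno set to ENAMETOOLONG. */
int model_component(const char *name, int no_trunc,
                    char out[SVR3_NAME_MAX + 1])
{
    size_t len;

    assert(name != NULL && out != NULL);
    len = strlen(name);
    if (len > SVR3_NAME_MAX) {
        if (no_trunc) {
            errno = ENAMETOOLONG;
            return -1;
        }
        len = SVR3_NAME_MAX;   /* keep only the first 14 characters */
    }
    memcpy(out, name, len);
    out[len] = '\0';
    return 0;
}
```

With the 19-character name from the example, the SVR3 branch yields
"asdfghjklqwert" and the POSIX branch fails with ENAMETOOLONG.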

Of course, automatic truncation of filename length to 14 chars still occurs
under bash, compiled as suggested, and also under ISC's csh, when running,
or exec'ing, non-builtin commands at the shell prompt or from a shell
script.  It's only a shell builtin, like echo, that runs, under either of
these two shells, with POSIX.1 semantics, with "_POSIX_NO_TRUNC" turned on
(as it is, by default, for ISC2.2 in POSIX.1).  BTW, I don't know how to
turn off "_POSIX_NO_TRUNC" on ISC2.2, or even if it can be.

It turns out that there are only two files in the sources for bash where
an exec is called.  This is always "execve".  The first is execute_cmd.c,
for commands run from the shell prompt (the diffs below patch two execve
calls there).  The second is builtins.c, for the shell builtin exec
command -- for commands that are execed explicitly from the shell prompt.
(One would like those to run in SVR3 mode, as well.)

Following this note is the full set of context diffs to turn bash1.07 into
source ready for gcc under ISC2.2.  These diffs also include a new version
of "trap.c", sent to me by Chet Ramey (one of the two principal maintainers
of bash), fixing a bug that otherwise prevents using bash as a login
shell.  There are also some minor fixes, some for ISC header bugs, one for
a trivial type mismatch in the bash sources.

First, I'm including a repost of Uwe Doering's binary os.o patch, since I've
received a few requests for it.

			Sincerely yours,
			Saul Lubkin

Uwe's patch:
=======================================================================
Article 6891 of comp.unix.sysv386:
From: gemini@geminix.in-berlin.de (Uwe Doering)
Newsgroups: comp.unix.sysv386
Subject: Re: NAMEI panic - trap "E", address and info follows (+ patch)
Message-ID: <KYXPX2E@geminix.in-berlin.de>
Date: 13 Apr 91 00:55:41 GMT
References: <1991Apr10.040146.645@ddsw1.MCS.COM>
Organization: Private UNIX Site

karl@ddsw1.MCS.COM (Karl Denninger) writes:

>Is anyone else having problems with a "namei" panic in ISC 2.2 (with NFS,
>the NFS/lockd patches, and POSIX patches applied)?
>
>I have been getting these nearly daily.  Trap type "E", address is d007962f.
>That's right near the end of "namei"; here's the relevant line from a "nm"
>on the kernel:
>
>namei               |0xd007919c|extern|       *struct( )|0x0608|     |.text
>
>Needless to say, I am most displeased with the crashes!
>
>Near as I can determine, the hardware is fine.  
>
>All pointers or ideas appreciated...

I found this bug a few days ago and was about to send a bug report
to ISC. The problem is "simply" a NULL pointer reference in the
namei() function. The machine I found this on runs ISC 2.21 with
the security fix installed. I fixed this bug with a binary patch. It
is for the module /etc/conf/pack.d/kernel/os.o. I disassembled the
original and then the fixed version of os.o and ran a context diff
over the output. Depending on what version of the kernel config kit
you have the addresses might be off some bytes. You can apply this
patch with any binary file editor.

***************
*** 35349,35364 ****
                      [%al,%al]
  	cf71:  74 1e                  je     0x1e <cf91>
                      [0xcf91]
! 	cf73:  0f b7 07               movzwl (%edi),%eax
                      [%edi,%eax]
! 	cf76:  3d 11 00 00 00         cmpl   $0x11,%eax
                      [$0x11,%eax]
! 	cf7b:  74 14                  je     0x14 <cf91>
                      [0xcf91]
! 	cf7d:  c7 45 e8 00 00 00 00   movl   $0x0,0xe8(%ebp)
!                     [$0x0,-24+%ebp]
! 	cf84:  eb 19                  jmp    0x19 <cf9f>
!                     [0xcf9f]
  	cf86:  90                     nop    
                      []
  	cf87:  90                     nop    
--- 35349,35372 ----
                      [%al,%al]
  	cf71:  74 1e                  je     0x1e <cf91>
                      [0xcf91]
! 	cf73:  85 ff                  testl  %edi,%edi
!                     [%edi,%edi]
! 	cf75:  74 1a                  je     0x1a <cf91>
!                     [0xcf91]
! 	cf77:  0f b7 07               movzwl (%edi),%eax
                      [%edi,%eax]
! 	cf7a:  3d 11 00 00 00         cmpl   $0x11,%eax
                      [$0x11,%eax]
! 	cf7f:  74 10                  je     0x10 <cf91>
                      [0xcf91]
! 	cf81:  eb 15                  jmp    0x15 <cf98>
!                     [0xcf98]
! 	cf83:  90                     nop    
!                     []
! 	cf84:  90                     nop    
!                     []
! 	cf85:  90                     nop    
!                     []
  	cf86:  90                     nop    
                      []
  	cf87:  90                     nop    

I'm not absolutely sure whether the action that is now taken in case of
a NULL pointer is the right one, but I haven't noticed any problems,
and most important, there are no more kernel panics! At least not from
that spot. :-) The action that is taken if the pointer is _not_ NULL
hasn't changed (this is not very obvious from the patch, but look
in the disassembler listing of your own kernel for more details).
I have used this modified kernel for over a week now and it works for
me. Of course, as always, I can't give you any guarantee that this
patch does something useful on your machine. :-)

Hope this helps you.

     Uwe

PS: ISC, if you see this posting, could you drop me a note on whether
you have put this on your to-do list? This would save me the time
needed to file an official bug report.
-- 
Uwe Doering  |  INET : gemini@geminix.in-berlin.de
Berlin       |----------------------------------------------------------------
Germany      |  UUCP : ...!unido!fub!geminix.in-berlin.de!gemini
=======================================================================
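
Read as C, the patched instructions in Uwe's article above amount to adding
a NULL test in front of an existing dereference.  The sketch below models
only the branch decision visible in the disassembly; it is not ISC kernel
source, and the function names and the parameter p (standing for the
pointer held in %edi) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C rendering of the patched test.  Returns 1 when the
   code would jump to cf91, 0 when it would take the other path. */
int jumps_to_cf91(const unsigned short *p)
{
    if (p == NULL)        /* new: testl %edi,%edi ; je cf91 */
        return 1;
    return *p == 0x11;    /* old: movzwl (%edi),%eax ; cmpl $0x11,%eax ; je */
}

/* Small self-check of the three cases. */
void check_cf91(void)
{
    unsigned short v = 0x11;

    assert(jumps_to_cf91(NULL) == 1);   /* no dereference, so no panic */
    assert(jumps_to_cf91(&v) == 1);
    v = 0;
    assert(jumps_to_cf91(&v) == 0);
}
```

Before the patch the dereference happened unconditionally, which is where
the NULL pointer reference came from.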


context diffs for bash1.07 under ISC2.2:


*** builtins.c	Fri May  3 14:06:16 1991
--- ../bash/builtins.c	Wed May 22 12:02:29 1991
***************
*** 1406,1412 ****
--- 1406,1414 ----
  
        signal (SIGINT, SIG_DFL);
        signal (SIGQUIT, SIG_DFL);
+ 	__setostype(0); /* SVR3 semantics Saul */
        execve (command, args, export_env);
+ 	__setostype(1); /* POSIX semantics Saul */
  
        adjust_shell_level (1);
  
*** config.h	Thu May  2 13:11:23 1991
--- ../bash/config.h	Wed May 22 12:45:56 1991
***************
*** 108,112 ****
--- 109,114 ----
  /* Define BREAK_COMPLAINS if you want the incompatible, but useful
     error messages about `break' and `continue' out of context. */
  #define BREAK_COMPLAINS
+ #include <sys/limits.h> /* For PATH_MAX */
  
  #endif	/* _CONFIG_ */
*** cpp-Makefile	Fri May  3 14:17:18 1991
--- ../bash/cpp-Makefile	Wed May  8 22:11:06 1991
***************
*** 40,46 ****
  /* **************************************************************** */
  
  /* Define HAVE_GCC if you have the GNU C compiler. */
! /* #define HAVE_GCC */
  
  /* Define HAVE_FIXED_INCLUDES if you are using GCC with the fixed
     header files. */
--- 40,46 ----
  /* **************************************************************** */
  
  /* Define HAVE_GCC if you have the GNU C compiler. */
! #define HAVE_GCC
  
  /* Define HAVE_FIXED_INCLUDES if you are using GCC with the fixed
     header files. */
***************
*** 102,110 ****
  /* This is guaranteed to work, even if you have the fixed includes!
     (Unless, of course, you have the fixed include files installed in
     /usr/include.  Then it will break. ) */
! CC = gcc -traditional -I/usr/include
  #else
! CC = gcc
  #endif /* !HAVE_FIXED_INCLUDES */
  #else
  CC = CPP_CC
--- 103,113 ----
  /* This is guaranteed to work, even if you have the fixed includes!
     (Unless, of course, you have the fixed include files installed in
     /usr/include.  Then it will break. ) */
! /* CC = gcc -traditional -I/usr/include */
! CC = gcc -traditional -posix -I/usr/include
  #else
! /* CC = gcc /* Saul */
! CC = gcc -posix
  #endif /* !HAVE_FIXED_INCLUDES */
  #else
  CC = CPP_CC
***************
*** 177,183 ****
  
  SYSTEM_FLAGS = $(LINEBUF) $(VPRINTF) $(UNISTD) $(GROUPS) $(RESOURCE) \
         $(SIGHANDLER) $(SYSDEP) $(WAITH) $(GETWD) -D$(MACHINE) -D$(OS)
! DEBUG_FLAGS = $(PROFILE_FLAGS) -g
  LDFLAGS	= $(DEBUG_FLAGS) $(NOSHARE) $(SYSDEP)
  CFLAGS	= $(DEBUG_FLAGS) $(SYSTEM_FLAGS)
  CPPFLAGS= -I$(LIBSRC)
--- 180,187 ----
  
  SYSTEM_FLAGS = $(LINEBUF) $(VPRINTF) $(UNISTD) $(GROUPS) $(RESOURCE) \
         $(SIGHANDLER) $(SYSDEP) $(WAITH) $(GETWD) -D$(MACHINE) -D$(OS)
! /* DEBUG_FLAGS = $(PROFILE_FLAGS) -g /* Saul */
! DEBUG_FLAGS = $(PROFILE_FLAGS) -O
  LDFLAGS	= $(DEBUG_FLAGS) $(NOSHARE) $(SYSDEP)
  CFLAGS	= $(DEBUG_FLAGS) $(SYSTEM_FLAGS)
  CPPFLAGS= -I$(LIBSRC)
*** execute_cmd.c	Sat Apr 27 13:45:51 1991
--- ../bash/execute_cmd.c	Wed May 22 12:12:33 1991
***************
*** 67,73 ****
  extern char *strerror ();
  
  #if defined (USG)
! extern int last_made_pid;
  #endif
  
  extern WORD_LIST *expand_words (), *expand_word ();
--- 67,74 ----
  extern char *strerror ();
  
  #if defined (USG)
! /* extern int last_made_pid; /* original line */
! extern pid_t last_made_pid; /* incompatible types */
  #endif
  
  extern WORD_LIST *expand_words (), *expand_word ();
***************
*** 1366,1372 ****
--- 1367,1375 ----
  	    if (do_redirections (simple_command->redirects, 1, 0, 0) == 0)
  	      {
  		signal (SIGCHLD, SIG_DFL);
+ 	__setostype(0); /* ISC2.2: SVR3 semantics for executed command */
  		execve (command, args, export_env);
+ 	__setostype(1); /* ISC2.2: POSIX semantics if execve fails */
  
  		/* If we get to this point, then start checking out the file.
  		   Maybe it is something we can hack ourselves. */
***************
*** 1472,1478 ****
--- 1475,1483 ----
  			  struct stat finfo;
  			  extern int errno;
   
+ 	__setostype(0); /* ISC2.2: SVR3 semantics for executed command */
  			  execve (shell_name, args, export_env);
+ 	__setostype(1); /* ISC2.2: POSIX semantics if execve fails */
   
  			  /* Oh, no!  We couldn't even exec this! */
  			  if ((stat (shell_name, &finfo) == 0) &&
*** machines.h	Fri May  3 14:05:09 1991
--- ../bash/machines.h	Wed May 22 12:34:44 1991
***************
*** 305,317 ****
  #define M_MACHINE "i386"
  #define M_OS USG
  #undef HAVE_GETWD
! #define SYSDEP_CFLAGS -DUSGr3
  #define USE_GNU_MALLOC
  #define HAVE_VPRINTF
  #define VOID_SIGHANDLER
  #if !defined (HAVE_GCC)
  #  define HAVE_ALLOCA
! #  define REQUIRED_LIBRARIES -lPW
  #endif /* !HAVE_GCC */
  #endif /* i386 */
  
--- 305,321 ----
  #define M_MACHINE "i386"
  #define M_OS USG
  #undef HAVE_GETWD
! /* #define SYSDEP_CFLAGS -DUSGr3 /* original line */
! #define SYSDEP_CFLAGS -D_POSIX_SOURCE -DMAXPATHLEN=PATH_MAX -DUSGr3 /* ISC 2.2 */
  #define USE_GNU_MALLOC
  #define HAVE_VPRINTF
  #define VOID_SIGHANDLER
+ /* #define HAVE_SIGLIST /* ISC2.2 no we don't! */
+ #define HAVE_MULTIPLE_GROUPS /* ISC 2.2 */
  #if !defined (HAVE_GCC)
  #  define HAVE_ALLOCA
! /* #  define REQUIRED_LIBRARIES -lPW /* original line */
! #  define REQUIRED_LIBRARIES -linet -lPW /* for bcopy, etc. */
  #endif /* !HAVE_GCC */
  #endif /* i386 */
  
***************
*** 784,795 ****
  /*			    */
  /*    Cadmus (tested once)  */
  /*			    */
! /* ************************ */
  /* Port by bfox.  I apologize to the rest of the world for Cadmus. */
! #if defined (cadmus)
  #define M_MACHINE "cadmus"
  #define M_OS BrainDeath		/* By Far, the worst yet. */
! #define SYSDEP_CFLAGS -DUSG
  #undef HAVE_GETWD
  #define USE_GNU_MALLOC
  #define HAVE_VPRINTF
--- 788,799 ----
  /*			    */
  /*    Cadmus (tested once)  */
  /*			    */
! /* ************************ */ /* 386 machines think they're a Cadmus! */
  /* Port by bfox.  I apologize to the rest of the world for Cadmus. */
! /* #if defined (cadmus)
  #define M_MACHINE "cadmus"
  #define M_OS BrainDeath		/* By Far, the worst yet. */
! /* #define SYSDEP_CFLAGS -DUSG
  #undef HAVE_GETWD
  #define USE_GNU_MALLOC
  #define HAVE_VPRINTF
*** posixstat.h	Sat Mar  9 17:06:18 1991
--- ../bash/posixstat.h	Wed May  8 22:18:22 1991
***************
*** 25,30 ****
--- 25,32 ----
  #define _POSIXSTAT_H
  
  #include <sys/stat.h>
+ #define     S_IFDIR  0040000  /* directory /* need this for ISC header bug */
+ #define  S_IFMT   0170000     /* type of file /* need this for ISC header bug */
  
  /* This text is taken directly from the Cadmus I was trying to
     compile on:
*** readline/readline.c	Thu Apr 11 07:24:09 1991
--- ../bash/readline/readline.c	Wed May  8 22:25:18 1991
***************
*** 35,40 ****
--- 35,41 ----
  #include <fcntl.h>
  #include <sys/file.h>
  #include <signal.h>
+ #include <posixstat.h> /* for ISC2.2 header bugs */
  
  #ifdef __GNUC__
  #define alloca __builtin_alloca
*** trap.c	Sun Jan 20 12:33:07 1991
--- ../bash/trap.c	Thu May  9 16:43:26 1991
***************
*** 238,244 ****
      }
  
  
!   for (i = 0; i < NSIG; i++)
      {
        trap_list[i] = (char *)DEFAULT_SIG;
        original_signals[i] = (SigHandler *)signal (i, SIG_DFL);
--- 238,244 ----
      }
  
  
!   for (i = 1; i < NSIG; i++)
      {
        trap_list[i] = (char *)DEFAULT_SIG;
        original_signals[i] = (SigHandler *)signal (i, SIG_DFL);
***************
*** 277,285 ****
  	 (stricmp (string, &(signal_names[sig])[3]) == 0))
         return (sig);
  
-   if ((stricmp (string, "SIGNULL") == 0) || (stricmp (string, "NULL") == 0))
-     return (0);
- 
    return (NO_SIG);
  }
  
--- 277,282 ----
***************
*** 387,393 ****
       int sig;
       char *value;
  {
!   if (((int)trap_list[sig]) > 0)
       free (trap_list[sig]);
    trap_list[sig] = value;
  }
--- 384,390 ----
       int sig;
       char *value;
  {
!   if ((((int)trap_list[sig]) > 0) && (trap_list[sig] != (char *)IGNORE_SIG))
       free (trap_list[sig]);
    trap_list[sig] = value;
  }
***************
*** 420,426 ****
  }
  
  /* Handle the calling of "trap 0".  The only sticky situation is when
!    the command to be executed includes an "exit". */
  void
  run_exit_trap ()
  {
--- 417,424 ----
  }
  
  /* Handle the calling of "trap 0".  The only sticky situation is when
!    the command to be executed includes an "exit".  This is why we have
!    to provide our own place for top_level to jump to. */
  void
  run_exit_trap ()
  {
***************
*** 427,439 ****
    if (((int)trap_list[0]) > 0)
      {
        char *trap_command = savestring (trap_list[0]);
  
        change_signal (0, (char *)NULL);
!       parse_and_execute (trap_command, "trap");
      }
  }
        
! /* Reset all trapped signals to their original values. */
  void
  restore_original_signals ()
  {
--- 425,444 ----
    if (((int)trap_list[0]) > 0)
      {
        char *trap_command = savestring (trap_list[0]);
+       int code;
  
        change_signal (0, (char *)NULL);
!       code = setjmp (top_level);
!       if (code == 0)
! 	parse_and_execute (trap_command, "trap");
!       else
!         return;
      }
  }
        
! /* Reset all trapped signals to their original values.  Signals set to be
!    ignored with trap '' signal should be ignored, so we make sure they
!    are. */
  void
  restore_original_signals ()
  {
***************
*** 440,448 ****
    register int i;
  
    for (i = 0; i < NSIG; i++)
!     if ((trap_list[i] != (char *)DEFAULT_SIG) &&
!     	(trap_list[i] != (char *)IGNORE_SIG))
!       restore_default_signal (i);
  }
  
  /* Run a trap set on SIGINT.  This is called from throw_to_top_level (), and
--- 445,459 ----
    register int i;
  
    for (i = 0; i < NSIG; i++)
!     {
!       if (trap_list[i] != (char *)DEFAULT_SIG)
!         {
! 	  if (trap_list[i] == (char *)IGNORE_SIG)
! 	    signal (i, SIG_IGN);
! 	  else
! 	    restore_default_signal (i);
!         }
!     }
  }
  
  /* Run a trap set on SIGINT.  This is called from throw_to_top_level (), and
***************
*** 463,465 ****
--- 474,477 ----
        last_command_exit_value = old_exit_value;
      }
  }
+