lubkin@cs.rochester.edu (Saul Lubkin) (05/24/91)
I've finally worked out the last problems in using a POSIX-compiled,
job-control bash1.07 with ISC Unix 2.2. Enclosed are all the context
diffs needed to change a virgin copy, extracted from bash1.07.tar.Z on
prep.ai.mit.edu, into one that will compile to a 100% usable, safe,
very useful shell under ISC 2.2.

Note: I've used "gcc -traditional -posix" as my C compiler. gcc can
also be downloaded from prep.ai.mit.edu. Or you can use "cc -Xp"
instead, and make the appropriate modifications in cpp-Makefile by hand.

Let me note what the problems had been:

ISC2.2 is a dual operating system. When a file is compiled with
"cc -Xp" (or, equivalently, "gcc -posix"), a POSIX.1-compliant
environment is created for the binary, which affects the way it
executes. Compiling without the "-Xp" flag creates ordinary
SVR3-compatible binaries.

In a POSIX ("-Xp") binary, you can execute "__setostype(0);" to change
semantics to SVR3, and "__setostype(1);" to change back to POSIX. THIS
IS NOT USUALLY A GOOD IDEA; some system calls may become confused if
you do this.

If you compile a POSIX shell, like bash, with "gcc -posix" or "cc -Xp",
then ANY COMMAND EXECED OR FORK-EXECED from the shell WILL OBEY POSIX
SEMANTICS, not SVR3.

This, in an ideal world, should not be a problem. Unfortunately, there
are some bugs (that can have serious consequences) in running some ISC
commands with POSIX semantics.

Most notorious is the "namei" bug. Some binaries -- including, notably,
vpix -- have not been compiled for running in ISC POSIX mode. As a
result, they dereference a null pointer in "namei", causing a system
panic. Uwe Doering posted a binary patch to
/etc/conf/pack.d/kernel/os.o that fixes that problem once you build a
new kernel.

However, there are other problems, even after you apply Uwe's patch.
Namely, the ISC "mv" command behaves strangely when you try to rename a
directory. "mv dir1 dir2", executed under a POSIX shell, reports that
it "Cannot unlink .".
You then find that dir1 and dir2 both exist, with the same inode.
You've created two linked directories! (Once, perhaps when I was root,
even more mischief was done -- the parent directory of "dir1" was
changed from a directory to an unreferenced file, and all files in that
parent directory -- my home directory! -- were orphaned. I found them
in /usr/lost+found.)

I've compiled a replacement for "mv", using the GNU file utilities 1.4,
also at prep.ai.mit.edu. When compiled in POSIX mode ("cc -Xp" or
"gcc -posix") it works fine, even under a POSIX bash. (Interestingly,
when compiled in non-POSIX fashion -- "cc" or "gcc" without the option
-- it DOESN'T work properly. It reports "not owner" when attempting to
rename a directory. And it refuses to overwrite a plain file (e.g.,
"mv file1 file2", when "file[12]" exist and are ordinary files),
whatever option you use. No such shenanigans when compiling GNU "mv" in
POSIX mode.)

From the above, it's clear that you must be a bit brave (maybe even
reckless) to use a POSIX-compiled shell, even if you THINK that you've
found all the bugs in ISC2.2 POSIX.

Here's the safe solution:

The problems all arise because commands execed, or forked and execed,
from a running POSIX-compiled process run with ISC POSIX.1 semantics
(as if the first line in "main" were "__setostype(1);").

The solution is, JUST BEFORE each such exec, to add the line
"__setostype(0);". Then the execed process runs with ordinary SVR3
semantics. Even vpix will run safely under such a bash, even if you
haven't applied Uwe's binary patch.

To be on the safe side (in case the exec call fails, and control
returns), the line JUST AFTER the exec call should be
"__setostype(1);", to return to the normal POSIX.1 semantics of bash.

I've tried this. It works like a charm. The created bash has job
control, and all the other nice bash features -- in-line editing,
filename completion, variable completion, command completion, etc.
And, best of all, all commands executed from bash (either on the
command line or by using the bash builtin command "exec") run as
ordinary SVR3 processes. E.g., ISC "mv" behaves normally.

I believe that the above method is probably the way ISC compiled its
own job-control csh. The proof is that, from ISC's csh, a command like

    echo foo > asdfghjklqwertyuiop

will get the POSIX error about too long a filename. The same thing
happens if you do this from bash, compiled as I've suggested. But
"/bin/sh", and also the old Microport Korn shell, produce no such error
message -- they simply truncate the filename to 14 characters.

Of course, automatic truncation of filename length to 14 chars occurs
under bash, compiled as suggested, and also under ISC's csh, when
running, or exec'ing, non-builtin commands at the shell prompt or from
a shell script. It's only a shell builtin, like echo, that runs, under
either of these two shells, with POSIX.1 semantics, with
"_POSIX_NO_TRUNC" turned on (as it is, by default, for ISC2.2 in
POSIX.1). BTW, I don't know how to turn off "_POSIX_NO_TRUNC" on
ISC2.2, or even if it can be turned off.

It turns out that there are only two source files in bash where an exec
is called. The call is always "execve". The first is execute_cmd.c, for
commands run from the shell prompt. The second is builtins.c, for the
shell builtin exec command -- for commands that are execed from the
shell prompt. (One would like those to run in SVR3 mode, as well.)

Following this note is the full set of context diffs to turn bash1.07
into source ready for gcc under ISC2.2. These diffs also include a new
version of "trap.c", sent to me by Chet Ramey (one of the two principal
maintainers of bash), fixing a bug that otherwise doesn't allow using
bash as a login shell. There are also some minor fixes, some for ISC
header bugs, one for a trivial type mismatch in the bash sources.
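The exec wrapping described above can be sketched in isolation roughly
as follows. This is only an illustration, not code from the diffs: the
helper name execve_svr3, the set_ostype macro and the ISC_POSIX_BUILD
flag are made up here, and on anything other than ISC 2.2 a stub stands
in for __setostype() so the control flow can still be compiled.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#if defined(ISC_POSIX_BUILD)   /* hypothetical build flag, not from the post */
/* On ISC 2.2 the real call is used. As in the diffs, no prototype is
   declared, which relies on traditional C's implicit declaration. */
#define set_ostype(mode) __setostype(mode)
#else
/* Stub for other systems: just remember the requested mode. */
static int current_ostype = 1;            /* 1 = POSIX, 0 = SVR3 */
static void set_ostype(int mode) { current_ostype = mode; }
#endif

/* Hypothetical helper: exec a command with SVR3 semantics. If execve()
   returns at all, it failed, so switch the caller back to POSIX
   semantics before reporting the error. */
static int execve_svr3(const char *path, char *const argv[],
                       char *const envp[])
{
    int rc;

    assert(path != NULL);
    set_ostype(0);                 /* child should run as SVR3 */
    rc = execve(path, argv, envp);
    set_ostype(1);                 /* only reached when execve failed */
    return rc;
}
```

Wrapping the pair of calls in one helper keeps the "switch, exec,
switch back" order in a single place; the diffs below instead insert
the two calls directly around each execve.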
First, I'm including a repost of Uwe Doering's binary os.o patch, since
I've received a few requests for it.

Sincerely yours,

Saul Lubkin

Uwe's patch:
=======================================================================
Article 6891 of comp.unix.sysv386:
From: gemini@geminix.in-berlin.de (Uwe Doering)
Newsgroups: comp.unix.sysv386
Subject: Re: NAMEI panic - trap "E", address and info follows (+ patch)
Message-ID: <KYXPX2E@geminix.in-berlin.de>
Date: 13 Apr 91 00:55:41 GMT
References: <1991Apr10.040146.645@ddsw1.MCS.COM>
Organization: Private UNIX Site

karl@ddsw1.MCS.COM (Karl Denninger) writes:

>Is anyone else having problems with a "namei" panic in ISC 2.2 (with NFS,
>the NFS/lockd patches, and POSIX patches applied)?
>
>I have been getting these nearly daily. Trap type "E", address is d007962f.
>That's right near the end of "namei"; here's the relavent line from a "nm"
>on the kernel:
>
>namei |0xd007919c|extern| *struct( )|0x0608| |.text
>
>Needless to say, I am most displeased with the crashes!
>
>Near as I can determine, the hardware is fine.
>
>All pointers or ideas appreciated...

I found this bug a few days ago and was about to send a bug report to
ISC. The problem is "simply" a NULL pointer reference in the namei()
function. The machine I found this on runs ISC 2.21 with the security
fix installed. I fixed this bug with a binary patch. It is for the
module /etc/conf/pack.d/kernel/os.o. I disassembled the original and
then the fixed version of os.o and ran a context diff over the output.
Depending on what version of the kernel config kit you have the
addresses might be off some bytes. You can apply this patch with every
binary file editor.

***************
*** 35349,35364 ****
                  [%al,%al]
      cf71:  74 1e                  je     0x1e <cf91>
                  [0xcf91]
!     cf73:  0f b7 07               movzwl (%edi),%eax
                  [%edi,%eax]
!     cf76:  3d 11 00 00 00         cmpl   $0x11,%eax
                  [$0x11,%eax]
!     cf7b:  74 14                  je     0x14 <cf91>
                  [0xcf91]
!     cf7d:  c7 45 e8 00 00 00 00   movl   $0x0,0xe8(%ebp)
!                 [$0x0,-24+%ebp]
!     cf84:  eb 19                  jmp    0x19 <cf9f>
!                 [0xcf9f]
      cf86:  90                     nop
                  []
      cf87:  90                     nop
--- 35349,35372 ----
                  [%al,%al]
      cf71:  74 1e                  je     0x1e <cf91>
                  [0xcf91]
!     cf73:  85 ff                  testl  %edi,%edi
!                 [%edi,%edi]
!     cf75:  74 1a                  je     0x1a <cf91>
!                 [0xcf91]
!     cf77:  0f b7 07               movzwl (%edi),%eax
                  [%edi,%eax]
!     cf7a:  3d 11 00 00 00         cmpl   $0x11,%eax
                  [$0x11,%eax]
!     cf7f:  74 10                  je     0x10 <cf91>
                  [0xcf91]
!     cf81:  eb 15                  jmp    0x15 <cf98>
!                 [0xcf98]
!     cf83:  90                     nop
!                 []
!     cf84:  90                     nop
!                 []
!     cf85:  90                     nop
!                 []
      cf86:  90                     nop
                  []
      cf87:  90                     nop

I'm not absolutely sure whether the action that is now taken in case of
a NULL pointer is the right one, but I haven't noticed any problems,
and most important, there are no more kernel panics! At least not from
that spot. :-) The action that is taken if the pointer is _not_ NULL
hasn't changed (this is not very obvious from the patch, but look in
the disassembler listing of your own kernel for more details). I use
this modified kernel for over a week now and it works for me. Of
course, as always, I can't give you any guaranty that this patch does
something useful on your machine. :-)

Hope this helps you.

     Uwe

PS: ISC, if you see this posting, could you drop me a note on whether
you have put this on your to-do list? This would save me the time
needed to file an official bug report.
--
Uwe Doering  |  INET : gemini@geminix.in-berlin.de
Berlin       |----------------------------------------------------------------
Germany      |  UUCP : ...!unido!fub!geminix.in-berlin.de!gemini
=======================================================================

context diffs for bash1.07 under ISC2.2:

*** builtins.c Fri May 3 14:06:16 1991
--- ../bash/builtins.c Wed May 22 12:02:29 1991
***************
*** 1406,1412 ****
--- 1406,1414 ----
        signal (SIGINT, SIG_DFL);
        signal (SIGQUIT, SIG_DFL);
+       __setostype(0);    /* SVR3 semantics Saul */
        execve (command, args, export_env);
+       __setostype(1);    /* POSIX semantics Saul */
        adjust_shell_level (1);

*** config.h Thu May 2 13:11:23 1991
--- ../bash/config.h Wed May 22 12:45:56 1991
***************
*** 108,112 ****
--- 109,114 ----
  /* Define BREAK_COMPLAINS if you want the incompatible, but useful
     error messages about `break' and `continue' out of context. */
  #define BREAK_COMPLAINS
+ #include <sys/limits.h>    /* For PATH_MAX */
  #endif /* _CONFIG_ */

*** cpp-Makefile Fri May 3 14:17:18 1991
--- ../bash/cpp-Makefile Wed May 8 22:11:06 1991
***************
*** 40,46 ****
  /* **************************************************************** */
  /* Define HAVE_GCC if you have the GNU C compiler. */
! /* #define HAVE_GCC */
  /* Define HAVE_FIXED_INCLUDES if you are using GCC with the fixed
     header files. */
--- 40,46 ----
  /* **************************************************************** */
  /* Define HAVE_GCC if you have the GNU C compiler. */
! #define HAVE_GCC
  /* Define HAVE_FIXED_INCLUDES if you are using GCC with the fixed
     header files. */
***************
*** 102,110 ****
  /* This is guaranteed to work, even if you have the fixed includes!
     (Unless, of course, you have the fixed include files installed in
     /usr/include. Then it will break. ) */
! CC = gcc -traditional -I/usr/include
  #else
! CC = gcc
  #endif /* !HAVE_FIXED_INCLUDES */
  #else
  CC = CPP_CC
--- 103,113 ----
  /* This is guaranteed to work, even if you have the fixed includes!
     (Unless, of course, you have the fixed include files installed in
     /usr/include. Then it will break. ) */
! /* CC = gcc -traditional -I/usr/include */
! CC = gcc -traditional -posix -I/usr/include
  #else
! /* CC = gcc /* Saul */
! CC = gcc -posix
  #endif /* !HAVE_FIXED_INCLUDES */
  #else
  CC = CPP_CC
***************
*** 177,183 ****
  SYSTEM_FLAGS = $(LINEBUF) $(VPRINTF) $(UNISTD) $(GROUPS) $(RESOURCE) \
          $(SIGHANDLER) $(SYSDEP) $(WAITH) $(GETWD) -D$(MACHINE) -D$(OS)
! DEBUG_FLAGS = $(PROFILE_FLAGS) -g
  LDFLAGS = $(DEBUG_FLAGS) $(NOSHARE) $(SYSDEP)
  CFLAGS = $(DEBUG_FLAGS) $(SYSTEM_FLAGS)
  CPPFLAGS= -I$(LIBSRC)
--- 180,187 ----
  SYSTEM_FLAGS = $(LINEBUF) $(VPRINTF) $(UNISTD) $(GROUPS) $(RESOURCE) \
          $(SIGHANDLER) $(SYSDEP) $(WAITH) $(GETWD) -D$(MACHINE) -D$(OS)
! /* DEBUG_FLAGS = $(PROFILE_FLAGS) -g /* Saul */
! DEBUG_FLAGS = $(PROFILE_FLAGS) -O
  LDFLAGS = $(DEBUG_FLAGS) $(NOSHARE) $(SYSDEP)
  CFLAGS = $(DEBUG_FLAGS) $(SYSTEM_FLAGS)
  CPPFLAGS= -I$(LIBSRC)

*** execute_cmd.c Sat Apr 27 13:45:51 1991
--- ../bash/execute_cmd.c Wed May 22 12:12:33 1991
***************
*** 67,73 ****
  extern char *strerror ();
  #if defined (USG)
! extern int last_made_pid;
  #endif
  extern WORD_LIST *expand_words (), *expand_word ();
--- 67,74 ----
  extern char *strerror ();
  #if defined (USG)
! /* extern int last_made_pid; /* original line */
! extern pid_t last_made_pid; /* incompatible types */
  #endif
  extern WORD_LIST *expand_words (), *expand_word ();
***************
*** 1366,1372 ****
--- 1367,1375 ----
            if (do_redirections (simple_command->redirects, 1, 0, 0) == 0)
              {
                signal (SIGCHLD, SIG_DFL);
+               __setostype(0); /* ISC2.2: SVR3 semantics for executed command */
                execve (command, args, export_env);
+               __setostype(1); /* ISC2.2: POSIX semantics if execve fails */
                /* If we get to this point, then start checking out the file.
                   Maybe it is something we can hack ourselves. */
***************
*** 1472,1478 ****
--- 1475,1483 ----
    struct stat finfo;
    extern int errno;
+   __setostype(0); /* ISC2.2: SVR3 semantics for executed command */
    execve (shell_name, args, export_env);
+   __setostype(1); /* ISC2.2: POSIX semantics if execve fails */
    /* Oh, no! We couldn't even exec this! */
    if ((stat (shell_name, &finfo) == 0) &&

*** machines.h Fri May 3 14:05:09 1991
--- ../bash/machines.h Wed May 22 12:34:44 1991
***************
*** 305,317 ****
  #define M_MACHINE "i386"
  #define M_OS USG
  #undef HAVE_GETWD
! #define SYSDEP_CFLAGS -DUSGr3
  #define USE_GNU_MALLOC
  #define HAVE_VPRINTF
  #define VOID_SIGHANDLER
  #if !defined (HAVE_GCC)
  # define HAVE_ALLOCA
! # define REQUIRED_LIBRARIES -lPW
  #endif /* !HAVE_GCC */
  #endif /* i386 */
--- 305,321 ----
  #define M_MACHINE "i386"
  #define M_OS USG
  #undef HAVE_GETWD
! /* #define SYSDEP_CFLAGS -DUSGr3 /* original line */
! #define SYSDEP_CFLAGS -D_POSIX_SOURCE -DMAXPATHLEN=PATH_MAX -DUSGr3 /* ISC 2.2 */
  #define USE_GNU_MALLOC
  #define HAVE_VPRINTF
  #define VOID_SIGHANDLER
+ /* #define HAVE_SIGLIST /* ISC2.2 no we don't! */
+ #define HAVE_MULTIPLE_GROUPS /* ISC 2.2 */
  #if !defined (HAVE_GCC)
  # define HAVE_ALLOCA
! /* # define REQUIRED_LIBRARIES -lPW /* original line */
! # define REQUIRED_LIBRARIES -linet -lPW /* for bcopy, etc. */
  #endif /* !HAVE_GCC */
  #endif /* i386 */
***************
*** 784,795 ****
  /* */
  /* Cadmus (tested once) */
  /* */
! /* ************************ */
  /* Port by bfox. I apologize to the rest of the world for Cadmus. */
! #if defined (cadmus)
  #define M_MACHINE "cadmus"
  #define M_OS BrainDeath /* By Far, the worst yet. */
! #define SYSDEP_CFLAGS -DUSG
  #undef HAVE_GETWD
  #define USE_GNU_MALLOC
  #define HAVE_VPRINTF
--- 788,799 ----
  /* */
  /* Cadmus (tested once) */
  /* */
! /* ************************ */ /* 386 machines think they're a Cadmus! */
  /* Port by bfox. I apologize to the rest of the world for Cadmus. */
! /* #if defined (cadmus)
  #define M_MACHINE "cadmus"
  #define M_OS BrainDeath /* By Far, the worst yet. */
! /* #define SYSDEP_CFLAGS -DUSG
  #undef HAVE_GETWD
  #define USE_GNU_MALLOC
  #define HAVE_VPRINTF

*** posixstat.h Sat Mar 9 17:06:18 1991
--- ../bash/posixstat.h Wed May 8 22:18:22 1991
***************
*** 25,30 ****
--- 25,32 ----
  #define _POSIXSTAT_H
  #include <sys/stat.h>
+ #define S_IFDIR 0040000 /* directory /* need this for ISC header bug */
+ #define S_IFMT 0170000 /* type of file /* need this for ISC header bug */
  /* This text is taken directly from the Cadmus I was trying to
     compile on:

*** readline/readline.c Thu Apr 11 07:24:09 1991
--- ../bash/readline/readline.c Wed May 8 22:25:18 1991
***************
*** 35,40 ****
--- 35,41 ----
  #include <fcntl.h>
  #include <sys/file.h>
  #include <signal.h>
+ #include <posixstat.h> /* for ISC2.2 header bugs */
  #ifdef __GNUC__
  #define alloca __builtin_alloca

*** trap.c Sun Jan 20 12:33:07 1991
--- ../bash/trap.c Thu May 9 16:43:26 1991
***************
*** 238,244 ****
  }
!   for (i = 0; i < NSIG; i++)
      {
        trap_list[i] = (char *)DEFAULT_SIG;
        original_signals[i] = (SigHandler *)signal (i, SIG_DFL);
--- 238,244 ----
  }
!   for (i = 1; i < NSIG; i++)
      {
        trap_list[i] = (char *)DEFAULT_SIG;
        original_signals[i] = (SigHandler *)signal (i, SIG_DFL);
***************
*** 277,285 ****
          (stricmp (string, &(signal_names[sig])[3]) == 0))
        return (sig);
-   if ((stricmp (string, "SIGNULL") == 0) || (stricmp (string, "NULL") == 0))
-     return (0);
-
    return (NO_SIG);
  }
--- 277,282 ----
***************
*** 387,393 ****
       int sig;
       char *value;
  {
!   if (((int)trap_list[sig]) > 0)
      free (trap_list[sig]);
    trap_list[sig] = value;
  }
--- 384,390 ----
       int sig;
       char *value;
  {
!   if ((((int)trap_list[sig]) > 0) && (trap_list[sig] != (char *)IGNORE_SIG))
      free (trap_list[sig]);
    trap_list[sig] = value;
  }
***************
*** 420,426 ****
  }
  /* Handle the calling of "trap 0". The only sticky situation is when
!    the command to be executed includes an "exit". */
  void
  run_exit_trap ()
  {
--- 417,424 ----
  }
  /* Handle the calling of "trap 0". The only sticky situation is when
!    the command to be executed includes an "exit". This is why we have
!    to provide our own place for top_level to jump to. */
  void
  run_exit_trap ()
  {
***************
*** 427,439 ****
    if (((int)trap_list[0]) > 0)
      {
        char *trap_command = savestring (trap_list[0]);
        change_signal (0, (char *)NULL);
!       parse_and_execute (trap_command, "trap");
      }
  }
! /* Reset all trapped signals to their original values. */
  void
  restore_original_signals ()
  {
--- 425,444 ----
    if (((int)trap_list[0]) > 0)
      {
        char *trap_command = savestring (trap_list[0]);
+       int code;
        change_signal (0, (char *)NULL);
!       code = setjmp (top_level);
!       if (code == 0)
!         parse_and_execute (trap_command, "trap");
!       else
!         return;
      }
  }
! /* Reset all trapped signals to their original values. Signals set to be
!    ignored with trap '' signal should be ignored, so we make sure they
!    are. */
  void
  restore_original_signals ()
  {
***************
*** 440,448 ****
    register int i;
    for (i = 0; i < NSIG; i++)
!     if ((trap_list[i] != (char *)DEFAULT_SIG) &&
!         (trap_list[i] != (char *)IGNORE_SIG))
!       restore_default_signal (i);
  }
  /* Run a trap set on SIGINT. This is called from throw_to_top_level (), and
--- 445,459 ----
    register int i;
    for (i = 0; i < NSIG; i++)
!     {
!       if (trap_list[i] != (char *)DEFAULT_SIG)
!         {
!           if (trap_list[i] == (char *)IGNORE_SIG)
!             signal (i, SIG_IGN);
!           else
!             restore_default_signal (i);
!         }
!     }
  }
  /* Run a trap set on SIGINT. This is called from throw_to_top_level (), and
***************
*** 463,465 ****
--- 474,477 ----
        last_command_exit_value = old_exit_value;
      }
  }
+
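The last substantive trap.c change (restore_original_signals) can be
illustrated with a small standalone sketch. It is not bash code: the
table, the enum and the names set_trap and restore_for_child are
invented for the example, and trapped signals are simply reset to
SIG_DFL here, whereas bash calls its own restore_default_signal().
The point it shows is the one stated in the diff's comment: a signal
that was trapped with an empty command must be explicitly kept ignored,
while a signal with a trap command goes back to a non-trap disposition.

```c
#include <assert.h>
#include <signal.h>

#define MAX_SIG 32   /* illustrative bound; the diff uses NSIG */

/* Hypothetical stand-in for bash's trap_list[] entries. */
enum trap_state { TRAP_DEFAULT, TRAP_IGNORED, TRAP_COMMAND };

static enum trap_state traps[MAX_SIG];   /* zero-initialized: TRAP_DEFAULT */

/* Record what the user asked for with "trap". */
static void set_trap(int sig, enum trap_state state)
{
    assert(sig > 0 && sig < MAX_SIG);
    traps[sig] = state;
}

/* Called before running a command: ignored signals are made sure to be
   ignored, signals with trap commands lose them, untouched signals are
   left alone. */
static void restore_for_child(void)
{
    int sig;

    for (sig = 1; sig < MAX_SIG; sig++) {
        if (traps[sig] == TRAP_DEFAULT)
            continue;
        if (traps[sig] == TRAP_IGNORED) {
            signal(sig, SIG_IGN);
        } else {
            signal(sig, SIG_DFL);        /* simplification, see lead-in */
            traps[sig] = TRAP_DEFAULT;
        }
    }
}
```

Keeping the ignored case explicit matters because an ignored
disposition is inherited across exec, so a command started after
"trap '' SIGNAL" should still ignore that signal.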