fail@fozzy.UUCP (Dennis Fail) (10/19/89)
Help,

We recently obtained the ksh-88 source from the toolchest and installed it on our network of Sun3's and Sun4's running SunOS 4.0.3. The problem we have been having is that suddenly a window will die because its parent process, ksh, has dumped core. The best that I can determine is that it happens when someone is working at their Sun workstation with ksh as their login shell and then rlogin's to, or physically gets up and goes to, another workstation, logs in, and does some work. When he goes back to his own workstation and tries to use the windows that he was working in, they dump core. This doesn't happen all the time, but with enough frequency to be a problem.

I've isolated the problem to the history function of ksh, as the following dbx dialog will tell, but I am at a loss as to what to do about it. Is it a ksh bug, or is something wrong in the configuration? All the users' HOME directories are NFS mounted on every machine and they all use .sh_history as the history file. I suspect that this sharing of the file has something to do with it, but that's just an uneducated guess.

As a side note, we had had problems with the previous version of ksh corrupting the history file after an rlogin, but it would never dump core. When we tried to recall a line after an rlogin the shell would just beep at you, and doing a history command would display a list of numbers, something like this:

    200
    252
    201
    ...

To fix this we would do an 'exec ksh' and everything would be fine.

Here is the dbx dialog:

    dbx /bin/ksh core
    Reading symbolic information...
    Read 14645 symbols
    (dbx) where
    hist_eof(), line 430 in "isda/fail/sun-src/ksh-i/src/sh/history.c"
    exfile(), line 377 in "isda/fail/sun-src/ksh-i/src/sh/main.c"
    main(c = 1, v = 0xefffc40, 0xefffc48), line 280 in "isda/fail/sun-src/ksh-i/src/sh/main.c"
    (dbx) list
      430           register off_t count = fp->fixcnt;
      431           int skip = 0;
      432           io_seek(fp->fixfd,count,SEEK_SET);
      433   #ifdef INT16
      434           while((c=io_getc(fp->fixfd))!=EOF)
      435           {
      436   #else
      437           /* could use io_getc() but this is faster */
      438           while(1)
      439           {
    (dbx) print fp
    `history`hist_eof`fp = (nil)
    (dbx)

I guess the nil pointer is causing the core dump, but I don't know why it is nil. Any clues, anybody?

Thanks
Dennis Fail
Rockwell Int.
{uunet | attctc}!fozzy!fail
seth@ctr.columbia.edu (Seth Robertson) (10/19/89)
In article <834@fozzy.UUCP> fail@fozzy.UUCP (Dennis Fail) writes:
>goes to another
>workstation, logs in, and does some work and then goes back to his
>workstation and try to use his windows that he was working in, they will
>dump core. This doesn't happen all the time, but with enough frequency
>to be a problem.
>
> I've isolated the problem to be in the history function of ksh
>As a side note, we had had problems with the previous version of ksh
>corrupting the history file after an rlogin, but it would never dump core.

The way I solved the corrupted history file problem was to use a different history file for each pty. I assume that this would solve your problem also.

From my .kshrc (the file that gets read in every time a ksh starts; if you do not have this feature, you could stick it in .profile):
------------------------------------------------------------
##
## Source the people's startup file.
. /public/etc/kshsetup

##
## Set it up so that it prints the contents of ~/.face when I log out
if test "$0" = "su" -o "$0" = "-su"
then
    # WatchOut gets set if there is already another history file existing
    # with the same name (i.e. don't delete it)
    if test "$WatchOut"
    then
        # Keep the history file and don't print a closing screen
        # (Because if you su, you don't want your history file to disappear :-)
        trap 'trap 0; exec ~/.kshout save no; kill -9 0; exit; exit' 0
    else
        # Delete the history file and don't print a closing screen
        trap 'trap 0; exec ~/.kshout kill no; kill -9 0; exit; exit' 0
    fi
else
    if test "$0" = "-ksh"
    then
        # Delete the history file and print a closing screen
        trap 'trap 0; exec ~/.kshout kill yes; kill -9 0; exit; exit' 0
    else
        # Delete the history file but don't print a closing screen
        trap 'trap 0; exec ~/.kshout kill no; kill -9 0; exit; exit' 0
    fi
fi
------------------------------------------------------------

From my .kshout:
------------------------------------------------------------
: ${tty:=`tty`}
: ${pty:=`basename $tty`}
: ${host:=`hostname`}

# If argv[1] is `kill' then we are supposed to get rid of the
# ksh history file
if test "$1" = "kill"
then
    if test "$pty" = "console"
    then
        rm -f .ksh.$host.* ~/core
    else
        : ${HISTFILE:="$HOME/.ksh.$host.$pty.$USER"}
        rm -f "$HISTFILE" ~/core
    fi
fi

# If argv[2] is `yes' then we are supposed to print a closing screen
if test "$2" = "yes"
then
    clear
    cat ~/.face
fi
------------------------------------------------------------

From /public/etc/kshsetup:
------------------------------------------------------------
: ${tty:=`tty`}
: ${pty:=`basename $tty`}
: ${host:=`hostname`}

if test -f "$HOME/.ksh.$host.$pty.$USER"
then
    WatchOut="true"
fi

HISTFILE="$HOME/.ksh.$host.$pty.$USER"

. /public/etc/kshenv    # Set up some special features
------------------------------------------------------------

That should solve the problem by having a separate history file for each invocation of ksh.

I'll include /public/etc/kshenv for those who are interested:
------------------------------------------------------------
: ${tty:=`tty`}
: ${pty:=`basename $tty`}

if test "$pty" = "console"
then
    # 3.4 machines don't run sunview
    if test -d /var
    then
        :
    else
        alias sunview=suntools
    fi
else
    alias sunview="echo 'This is not allowed unless you are on the console.'"
    alias suntools="echo 'This is not allowed unless you are on the console.'"
fi

alias logout="exit"

# Plus some more CTR specific stuff
------------------------------------------------------------

Hope this helps,

-Seth Robertson
seth@ctr.columbia.edu
amos@taux01.UUCP (Amos Shapir) (10/21/89)
It seems all your sessions use the same history file; one of them adds to it, making the other's current pointer into it invalid. It is easy to fix: just define HISTFILE=.hist$HOST$TTY in your .profile (make sure $TTY and $HOST are defined first, of course).

I do not use it, since it's sometimes useful to keep history around from other sessions; just pressing RETURN every now and then makes sure a session re-reads the history file, so it does not fall too far behind.

--
Amos Shapir    amos@taux01.nsc.com or amos@nsc.nsc.com
National Semiconductor (Israel) P.O.B. 3007, Herzlia 46104, Israel
Tel. +972 52 522261  TWX: 33691, fax: +972-52-558322
34 48 E / 32 10 N    (My other cpu is a NS32532)
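
For readers who want to try the per-session HISTFILE suggestion, here is a minimal sketch of what the .profile lines might look like. It is an illustration under assumptions, not something from the posts above: the helper name hist_file_for, the $HOME prefix, the dots between the name parts, and the fallback values are all made up for this example. The basename step is there because tty normally prints a full device path such as /dev/ttyp3, and a slash inside the file name would point into a directory that does not exist.

```shell
# Hypothetical helper (not part of ksh): build a per-host, per-terminal
# history file name from a home directory, a host name and a tty path.
hist_file_for() {
    # Keep only the last path component, so /dev/ttyp3 becomes ttyp3.
    term=`basename "$3"`
    echo "$1/.hist.$2.$term"
}

# ksh does not set HOST or TTY by itself, so define them first.
# The fallbacks are assumptions for shells without a hostname command
# or without a controlling terminal.
HOST=`hostname` || HOST=unknown
TTY=`tty` || TTY=/dev/notty

HISTFILE=`hist_file_for "$HOME" "$HOST" "$TTY"`
export HISTFILE
```

With this in place, a login on host sun1 at /dev/ttyp3 would use $HOME/.hist.sun1.ttyp3, so two sessions no longer append to the same file. The assignment probably has to happen before ksh first opens its history file, which is why .profile (or the ENV file) is the natural place for it. Note that these files accumulate unless something removes them at logout.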