[comp.unix.shell] TCSH exit crashing Sparc

pauln@TIRS.oz.au (Paul Ninnis) (05/24/91)

We're having an occasional problem using tcsh on a SPARCstation 1 under
SunOS 4.1.
It appears to cause a data fault, leading to a kernel panic and a reboot,
when a terminal user logs out (only occasionally, and
possibly only when exit is used rather than ^D).
This information was recorded in the system messages file, along with its PID.

This is the version and the compiled-in options, as reported by 'set':
  version tcsh 5.20.02 (Cornell) 12/07/90 options 8b,nls,el,dl,al,dir

At times, no other users or processes were running when this occurred.
Is this a known bug?
Is there a later version of tcsh that fixes it?
+-------------------------------------------+--------------------------------+
|  ___  __   _  _   _     _    _            | PAUL NINNIS : Software Engineer|
| (  ,)( ,\ ( )( ) / )   ( )  / )           | International Railroad Systems |
|  ) _) )  \ \()/ / (_    ) \/ / @TIRS.oz.au| 209 Wakefield St,Adelaide SA   |
| (_)  (_)\_)(__)(____)  (_/\_/             | PHONE: +61 8 232 1740          |
| The Hidden Flaw NEVER Remains Hidden      | FAX: +61 8 232 2274            |
+-------------------------------------------+--------------------------------+

subbarao@phoenix.Princeton.EDU (Kartik Subbarao) (05/24/91)

In article <1991May24.025110.8586@TIRS.oz.au> pauln@TIRS.oz.au (Paul Ninnis) writes:
>
>We're having the occasional problem using tcsh on a sparc station 1 under
>SunOS 4.1
>It appears to create a data fault causing a panic condition leading to
>rebooting when a terminal user logs out (only occasionally)
>(possibly by the use of exit rather than ^D)
>This info was recorded in the system messages file along with its PID no
>
>This is the version and the compiler set options obtained by a 'set'
>  version tcsh 5.20.02 (Cornell) 12/07/90 options 8b,nls,el,dl,al,dir

Yeah. We had the *exact* same problem. No, getting a new version of tcsh is
not going to fix it (and wouldn't be the way to do it anyway, since many
programs can trigger the same bug). You should get a patch from Sun. That is
the best way to deal with it. Of course, knowing Sun, you might have to
wait a bit...


			-Kartik


--
internet% ypwhich

subbarao@phoenix.Princeton.EDU -| Internet
kartik@silvertone.Princeton.EDU (NeXT mail)  
SUBBARAO@PUCC.BITNET			          - Bitnet

fischer@iesd.auc.dk (Lars P. Fischer) (05/25/91)

>>>>> On 24 May 91 02:51:10 GMT, pauln@TIRS.oz.au (Paul Ninnis) said:

Paul> We're having the occasional problem using tcsh on a sparc station 1 under
Paul> SunOS 4.1
Paul> It appears to create a data fault causing a panic condition leading to
Paul> rebooting when a terminal user logs out (only occasionally)
Paul> (possibly by the use of exit rather than ^D)

Never seen that problem, and we have, umm, 224 users running tcsh
under SunOS 4.1 on SS 1 (all variants) here.

/Lars
--
Lars Fischer,  fischer@iesd.auc.dk   | It takes an uncommon mind to think of
CS Dept., Univ. of Aalborg, DENMARK. | these things.  -- Calvin

thorinn@diku.dk (Lars Henrik Mathiesen) (05/28/91)

pauln@TIRS.oz.au (Paul Ninnis) writes:
>We're having the occasional problem using tcsh on a sparc station 1 under
>SunOS 4.1
>It appears to create a data fault causing a panic condition leading to
>rebooting when a terminal user logs out (only occasionally)
>(possibly by the use of exit rather than ^D)

We had a similar-sounding problem some time ago. (I think it was under
4.1, but I'm not certain.) One user, and only he, could crash the
systems when logging in. At some point in his login sequence, tcsh
would perform a strange sequence of fork()s and exit()s. (The dump
showed that this was where the crash occurred.) But that alone wasn't
enough to crash the system when we tried it, so we looked at his login
some more.

It turned out that he set the coredump resource limit. This somehow
made the kernel botch the fork/exit stuff. (There may be a patch by
now.) I think any limit (zero or not) would provoke the crash.

You might want to check if the crashes you see have anything to do
with the coredump size limit.
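
For anyone who wants to check this on a current Unix-like system, here is a
minimal sketch that reports the core dump size limit of the running process.
It is an illustration only, not something from the original investigation:
it uses Python's standard resource module, and the function names
describe_core_limit and core_limit_is_set are made up for this example. In
csh/tcsh the same limit is what "limit coredumpsize" shows and sets.

```python
import resource


def describe_core_limit():
    """Return a readable description of this process's core dump size limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

    def fmt(value):
        # RLIM_INFINITY means no limit has been imposed.
        return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"

    return f"soft={fmt(soft)} hard={fmt(hard)}"


def core_limit_is_set():
    """True if the soft limit is anything other than unlimited (zero counts as set)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft != resource.RLIM_INFINITY


if __name__ == "__main__":
    print(describe_core_limit())
    print("limit set:", core_limit_is_set())
```

Run from a login shell, it shows what that shell's children inherit, so
comparing the output for an affected user and an unaffected one is one way
to see whether their startup files set the limit differently.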

--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.      thorinn@diku.dk

christos@theory.TC.Cornell.EDU (Christos S. Zoulas) (05/28/91)

In article <1991May27.221004.2841@odin.diku.dk> thorinn@diku.dk (Lars Henrik Mathiesen) writes:
>pauln@TIRS.oz.au (Paul Ninnis) writes:
>>We're having the occasional problem using tcsh on a sparc station 1 under
>>SunOS 4.1
>>It appears to create a data fault causing a panic condition leading to
>>rebooting when a terminal user logs out (only occasionally)
>>(possibly by the use of exit rather than ^D)
>
[stuff deleted]

>It turned out that he set the coredump resource limit. This somehow
>made the kernel botch the fork/exit stuff. (There may be a patch by
>now.) I think any limit (zero or not) would provoke the crash.
>
>You might want to check if the crashes you see have anything to do
>with the coredump size limit.

I had the same problem a long time ago, not with tcsh but with
another program that did lots of fork()/exec() calls. If I remember
correctly, the crash has something to do with spilling the register
windows. Anything that calls fork()/exec() frequently enough can cause it.
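
To make the kind of workload concrete, here is a rough sketch of a process
that creates many short-lived children in a row. It is not the program
mentioned above and is not a confirmed reproducer for the SunOS bug: it only
does fork(), an immediate exit in the child, and a wait in the parent (no
exec), and the name fork_exit_burst and the count of 100 are arbitrary
choices for illustration. It uses Python's standard os module.

```python
import os


def fork_exit_burst(count):
    """Fork `count` short-lived children one at a time; return how many exited cleanly."""
    clean = 0
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: leave immediately, skipping the parent's cleanup handlers.
            os._exit(0)
        # Parent: reap the child so no zombies pile up.
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
            clean += 1
    return clean


if __name__ == "__main__":
    print("children that exited cleanly:", fork_exit_burst(100))
```

The loop is bounded and waits for each child before forking the next, so it
stays polite on a shared machine.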

The site with the most Sun fixes (that I know of) is princeton.edu (thanks
a lot, guys!). The one you want is 1029939 in /pub/sun-fixes/sunos4.1.
We have a copy of the fix on tesla.ee.cornell.edu in
/pub/sun-patches/flush_windows.tar.Z


christos
-- 
Christos Zoulas         | 389 Theory Center, Electrical Engineering,
christos@ee.cornell.edu | Cornell University, Ithaca NY 14853.
christos@crnlee.bitnet  | Phone: (607) 255 0302, Fax: (607) 255 9072