[unix-pc.general] Unix-pc lockup problems - A follow-up

jeff@cjsa.UUCP (C. Jeffery Small) (12/30/87)

Thanks to everyone who sent me reports of their problems with the Unix-pc
locking up.  This appears to be a widespread problem and is being looked
into by AT&T software engineers.  The engineers have asked me a question
which I have never checked during a lockup and I thought that some of you
who are experiencing the problem could check this out the next time the
problem occurs.

During lockup, keyboard input is typically ignored.  The question is: Do
the Caps-Lock and Num-Lock lights work (i.e., turn on and off) when the machine
has crashed?  Mail me the results if you [unfortunately] get the opportunity
to verify this.

I'll keep these newsgroups posted on any future results.
--
Jeffery Small          (203) 776-2000     UUCP:   uunet!---\
C. Jeffery Small and Associates	                  ihnp4!--- hsi!cjsa!jeff
123 York Street, New Haven, CT  06511          hao!noao!---/

allbery@axcess.UUCP (Brandon S. Allbery) (01/04/88)

Just in case anyone's interested:  I've managed to duplicate that lockup.
Notable is that it happened not long after I started rearranging things
around the computer... most notably, that d*mned printer cable.  I still
suspect spurious interrupts; but the printer may not be the only device
capable of sending them.  (Serial ports?  Maybe even bad termination on the
expansion ports?)

I was able to get the window manager to change the current window, but text
output didn't work, and the machine never got to a point in any window where
it was ready for input.  No, I didn't think to check the Caps Lock or Num Lock.

Possibly related?  I've been having a few other oddities:

Using "windy" too many times, or loading/unloading fonts (even the ones that
come with the machine) will cause an "su" in a subwindow to echo the password,
and immediately hang.  It *can* be interrupted without any problems.  I've
noticed that this tends to make the pre-crash sequence happen much sooner...
and this time, the actual crash as well.  (It appears to be based on the
"parent" window; log out and log back in (which closes and re-opens your login
window) and "windy" again works fine... for a while.)

I saw another unusual thing as well:  a program which had worked perfectly up
until just before the crash suddenly started spitting out "calloc returned NULL
in _makenew" errors (yes, it uses curses/terminfo) when run.  The pre-crash
sequence began immediately afterward, when I fired up Emacs to look at the
program source....

Conclusion:  I strongly suspect a problem where the windaemon is somehow
interacting with the page daemon.  Spurious interrupts could be causing the
latter to go into some strange state, font mounting and/or whatever "windy"
does to create new windows could be confusing the former, and the two
apparently decide to get into a fight with each other.  (Maybe windaemon is
causing a massive number of page faults?)  The page daemon's involvement would
also explain the "out of memory" aspect.
-- 
 ___  ________________,	Brandon S. Allbery	       cbosgd \
'   \/  __   __,  __,	aXcess Company		       mandrill|
 __  | /__> <__  <__	6615 Center St. #A1-105		       !ncoast!
/  ` | \__. .__> .__>	Mentor, OH 44060-4101	       necntc  | axcess!allbery
\___/\________________.	Moderator, comp.sources.misc   hoptoad/

lenny@icus.UUCP (Lenny Tropiano) (01/11/88)

Here is what happened this evening.  My machine was unattended all day
as I was out of town.  I came home at 2:00am and pressed a key to wake up
the screen saver.  Lo and behold, I noticed it was talking to one of my
UUCP connections, as indicated by the status line (from a phone daemon I wrote).
Now I know not everyone is running this, and this problem existed *LONG*
before I even started work with phdaemon.  The clock indicated it was
just 9:56pm (even though it was 2:00am).  The keyboard did not echo
characters, but the CAPS/NUM LOCK keys did work (they lit up).

Oh well, had to search for that RESET button at 2am, that was a chore! :-)

The people who are on the war-path at AT&T looking for this problem should
definitely pursue the idea of a phone manager/window manager problem after
uucico dies.  NOTE:  I am running HDB UUCP, so it isn't just inherent in
the generic UUCP.

							-Lenny
-- 
============================ US MAIL:   Lenny Tropiano, ICUS Computer Group
 IIIII   CCC   U   U   SSSS             PO Box 1
   I    C   C  U   U  S                 Islip Terrace, New York  11752
   I    C      U   U   SSS   PHONE:     (516) 968-8576 [H] (516) 582-5525 [W] 
   I    C   C  U   U      S  AT&T MAIL: ...attmail!icus!lenny  TELEX: 154232428
 IIIII   CCC    UUU   SSSS   UUCP:
============================    ...{uunet!godfre, harvard!talcott}!\
                   ...{ihnp4, boulder, mtune, bc-cis, ptsfa, sbcs}! >icus!lenny 
"Usenet the final frontier"        ...{cmcl2!phri, hoptoad}!dasys1!/