[net.unix-wizards] 750 not rebooting after panic

@csnet-relay.arpa,@ucsc.CSNET:conrad@ucsc.CS (conrad) (04/16/85)

We recently started having this problem (i.e., the boot sequence parks
right before the "Automatic reboot ..." message).  The problem only
surfaced after we started running a kernel that uses our eagle as the
primary swap device.  I have heard that the boot procedure can hang at
this point if the primary swap device is not ready.  I've generated a
namelist of the kernel and plan to halt the VAX and find out where it's
stalled the next time this problem manifests itself.

I've been seeing some postings lately describing a problem in which people
have to reset their SI 9900 controller.  I've had this problem too, though
not while the system was up and running; it happens after a power failure,
when the system is rebooting.  I bring it up here only because I have a
vague suspicion that it is related to the above problem.

BTW, we are running 4.2 on a VAX 11/750, rooted on a fuji-160 with SI 9400
controller, and an eagle with SI 9900 controller as the primary swap
device.


Al Conrad
Computer and Information Sciences
University of California at Santa Cruz
CSnet: conrad@ucsc
uucp: ucbvax|ucscc|v:conrad
408-429-2370

roy@phri.UUCP (Roy Smith) (04/20/85)

> We recently started having this problem (i.e., the boot sequence parks
> right before the "Automatic reboot ..." message).

	Well, since I got the answer to the same question when I posted
it a while ago, I suppose I should share it.

	If your console is at a baud rate that requires it to do
XON/XOFF (i.e. faster than 1200 for a LA-120 or LA-100) the following
happens:  During the auto-conf printouts the printer sends an XOFF.
Since the console tty driver isn't dealing with interrupts yet (does it
ever?), the XOFF never gets looked at.  When the printer catches up, it
sends an XON.  Since the receiver buffer register was never emptied of
the XOFF, the XON gets discarded and the overrun error flag is set.
When the system goes to print out the "Automatic reboot" message, you
are running a 'real' console tty driver.  Of course, what's waiting in
the rcvr buffer?  The leftover XOFF.  Hence the system waits for the XON
that it missed.  I hope I got all the details right.
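
	The sequence above can be sketched as a toy model in Python.
This is only an illustration of the description, not the actual VAX
console hardware or the 4.2 driver: the ReceiverBuffer class and the
boot_sequence function are made-up names, and the one-byte register
that drops a new byte while an old one is unread is taken from the
account above.

```python
XON, XOFF = 0x11, 0x13  # ASCII DC1 and DC3


class ReceiverBuffer:
    """Hypothetical one-byte receive register with an overrun flag."""

    def __init__(self):
        self.data = None
        self.overrun = False

    def receive(self, byte):
        # If the previous byte was never read, the new one is discarded
        # and the overrun flag is set (as described in the post).
        if self.data is not None:
            self.overrun = True
            return
        self.data = byte

    def read(self):
        byte, self.data = self.data, None
        return byte


def boot_sequence():
    """Return (output_stopped, overrun) once the real driver takes over."""
    rbuf = ReceiverBuffer()

    # Autoconfiguration phase: output is polled, nobody reads the receiver.
    rbuf.receive(XOFF)  # printer falls behind
    rbuf.receive(XON)   # printer catches up, but this byte is lost

    # The "real" tty driver starts and looks at whatever is waiting.
    output_stopped = False
    byte = rbuf.read()
    if byte == XOFF:
        output_stopped = True
    elif byte == XON:
        output_stopped = False
    return output_stopped, rbuf.overrun
```

	In this model boot_sequence() ends with output stopped and the
overrun flag set, which is the hang described: the driver is waiting
for an XON that has already come and gone.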

	At any rate, the fix is to reduce the console baud rate.  When
we dropped ours from 2400 to 1200 (LA-120), everything started to work
fine.

	BTW, I think the way most devices do XON/XOFF is wrong, or at
least less than optimal.  We have an H/P-7470A plotter (one of the
best designed things I've ever seen for the money) which takes a
slightly different approach to things.  When its buffer starts to get
full, it sends an XOFF, but doesn't assume that it got received and acted
upon.  If, after some interval, stuff is still coming in, it sends
another XOFF.  The same thing works on the other side.  When its buffer
is empty enough, it sends an XON.  If it doesn't get any more data after
some interval, it assumes the XON got lost and sends another one.
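
	Here is a rough sketch of that retry scheme in Python, seen from
the device's side.  It is not the plotter's firmware: the class name,
the high/low water marks, and the retry interval counted in "ticks"
are all assumptions made up for the example.

```python
XON, XOFF = 0x11, 0x13  # ASCII DC1 and DC3


class RepeatingFlowControl:
    """Device-side flow control that repeats XOFF/XON until it takes effect."""

    def __init__(self, high_water=6, low_water=2, retry_ticks=3):
        self.high_water = high_water    # assumed "buffer getting full" mark
        self.low_water = low_water      # assumed "empty enough" mark
        self.retry_ticks = retry_ticks  # assumed retry interval
        self.last_sent = XON            # host is assumed to start out sending
        self.timer = 0

    def _send(self, byte):
        self.last_sent = byte
        self.timer = 0
        return byte

    def tick(self, buffered, data_arrived):
        """Return the control byte to send this tick, or None."""
        if self.last_sent == XON:
            if buffered >= self.high_water:
                return self._send(XOFF)
            if data_arrived:
                self.timer = 0              # host is clearly running
                return None
            self.timer += 1
            if self.timer >= self.retry_ticks:
                return self._send(XON)      # assume the XON got lost
            return None

        # We have asked the host to stop.
        if buffered <= self.low_water:
            return self._send(XON)
        if not data_arrived:
            self.timer = 0                  # host has gone quiet
            return None
        self.timer += 1
        if self.timer >= self.retry_ticks:
            return self._send(XOFF)         # still coming in: ask again
        return None
```

	Note that an idle device in this sketch keeps sending an
occasional XON while no data arrives.  That is harmless as long as
the host treats a repeated XON as a no-op.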

	Since XON and XOFF are not supposed to nest (i.e. a single XON
should restart output after an arbitrary number of XOFFs) this protocol
is far more stable than what most other terminals I've seen use.
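
	The non-nesting rule is easy to see from the host's side if the
stopped state is kept as a single flag rather than a counter.  A
minimal sketch in Python, with a hypothetical HostTransmitter class
that stands in for whatever the real driver does:

```python
XON, XOFF = 0x11, 0x13  # ASCII DC1 and DC3


class HostTransmitter:
    """Host-side output gate: XOFF and XON set a flag, they do not count."""

    def __init__(self):
        self.stopped = False

    def control(self, byte):
        if byte == XOFF:
            self.stopped = True    # repeated XOFFs change nothing further
        elif byte == XON:
            self.stopped = False   # one XON undoes any number of XOFFs


host = HostTransmitter()
for byte in (XOFF, XOFF, XOFF, XON):
    host.control(byte)
# host.stopped is now False: output resumes after a single XON
```

	Because the flag cannot "stack up", a device is free to repeat
either character as often as it likes without confusing the host.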
-- 
allegra!phri!roy (Roy Smith)
System Administrator, Public Health Research Institute