[comp.sys.sun] submission: Explanation of BAD TRAP crashes

sitongia@hao.ucar.edu (Leonard Sitongia) (04/12/90)

System Context: Sun-4/280, SunOS 4.0.3, ALM-II

Once in a while the question arises of what causes panics that report
BAD TRAP with some kind of data or text fault.  While this doesn't qualify
as a frequently asked question, perhaps you would consider posting this.

My site was getting these kinds of panics at a rate of several per week
after installing 4.0.3.  The problem may have been present in 4.0.1 also.
After many false leads, Sun finally escalated the problem to their
corporate level and a specialist in California got involved.  I don't know
the details, but soon afterward we finally got a patch to fix this
problem.  Here is an extract from the patch tape README.

If you have this type of problem, you should contact your Sun software
support to request the bugfix.

-Leonard E. Sitongia    System Manager		 (303) 497-1509
USPS Mail: High Altitude Observatory P.O. Box 3000 Boulder CO  80307
Internet:               sitongia@hao.ucar.edu
SPAN:			NSFGW::"hao.ucar.edu!sitongia"	[NSFGW=9580]

@(#)README 1.1 [limes] 89/09/08 SMI

This archive contains all the changes to the serial drivers and streams
code since SunOS 4.0.3 FCS 2, and a quickie install script as an example
of how to rebuild a kernel with the new drivers.

These files fix the following BugTraq bugs:

[other bugs deleted by sitongia]

1025622	Panic bus error in streams close code

	The panic was caused by a naive fix to #1019499, which
	introduced a race condition in the streams open/close code:
	a stream could be torn down even though someone else was in
	the middle of opening it. The resulting data corruption
	would cause the system to panic at some later time,
	normally after carrier was detected and getty had opened the
	line, called vhangup, and closed the line. Specifically, the
	panic would occur most often during the "close" above, since
	the queue's q_qinfo pointer pointed at something unexpected.
	The fix is to back out the original fix for #1019499, and
	modify the streams code to properly handle the case of
	background processes holding a stream open that has been
	hung up.
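
The race described above can be illustrated with a toy model. The sketch
below is not SunOS code: ToyStream, opens_in_progress, interleave, and the
two close methods are hypothetical names, qinfo only stands in for the
queue's q_qinfo pointer, and the guard shown is just one plausible way to
avoid the premature teardown, not the change Sun actually shipped.

```python
class ToyStream:
    """Toy stand-in for a stream; not the SunOS data structure."""

    def __init__(self):
        self.qinfo = "driver-qinit"   # stands in for the queue's q_qinfo
        self.opens_in_progress = 0    # hypothetical in-flight open counter

    def begin_open(self):
        # A process has started opening the stream but has not finished.
        self.opens_in_progress += 1

    def finish_open(self):
        # The opener resumes and looks at the queue; if the stream was
        # torn down in the meantime, what it sees is no longer valid.
        self.opens_in_progress -= 1
        return self.qinfo

    def close_naive(self):
        # Tears the stream down unconditionally, even mid-open.
        self.qinfo = None

    def close_guarded(self):
        # Refuses to tear down while another process is mid-open.
        if self.opens_in_progress > 0:
            return False
        self.qinfo = None
        return True


def interleave(close_name):
    """Replay the open / close / finish-open interleaving once."""
    s = ToyStream()
    s.begin_open()              # process A starts opening the line
    getattr(s, close_name)()    # process B closes it in the meantime
    return s.finish_open()      # A resumes and inspects qinfo
```

With interleave("close_naive") the opener comes back to a cleared qinfo,
which is the analogue of q_qinfo pointing at something unexpected; with
interleave("close_guarded") the teardown is refused and qinfo is intact.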