[comp.unix.wizards] 4.3BSD crash dump help needed

dhesi@bsu-cs.UUCP (Rahul Dhesi) (01/28/89)

I'm having trouble analyzing a 4.3BSD crash dump.  Using adb on the
dump thus:

     adb -k vmunix.1 vmcore.1 << \EOF
     $c
     EOF

gives this output:

     sbr f868 slr 3070
     p0br e00 p0lr 160 p1br 1600 p1lr ffde
     panic: Segmentation fault
     _boot()	from 8002694b
     _boot(0,0) from	_panic+3a
     _panic(8003ef90) from _trap+ac
     _trap()	from _Xtransflt+1d
     _Xtransflt(800691d8,1) from 8000b5e7
     _newproc(1) from _fork1+b3
     _fork1(1) from _vfork+b
     _vfork() from 80027265
     _syscall() from	_Xsyscall+c
     _Xsyscall(23c80,7fffe340,17948,0,7fffdd34,7fffdd30) from 3a03
     _Syssize(17934,23c80,7fffe340,17948) from 38ac
     _Syssize(17934,27a40) from 46f3
     _Syssize(17934,62) from	1026
     ?(4,7fffeb3c,7fffeb50) from 3d
     ?()

I have not found the Xtransflt symbol anywhere in the source or in any
of the C libraries.  What am I overlooking?  The crash dumps always
give a similar stack trace.  If only I could figure out what Xtransflt
is I might have a clue.

Any suggestions?
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi
                    ARPA:  bsu-cs!dhesi@iuvax.cs.indiana.edu

mike@turing.cs.unm.edu (Michael I. Bushnell) (01/28/89)

In article <5483@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>I'm having trouble analyzing a 4.3BSD crash dump.  Using adb on the
>dump thus:
>
>     adb -k vmunix.1 vmcore.1 << \EOF
>     $c
>     EOF
>
>gives this output:
>
>     sbr f868 slr 3070
>     p0br e00 p0lr 160 p1br 1600 p1lr ffde
>     panic: Segmentation fault
>     _boot()	from 8002694b
>     _boot(0,0) from	_panic+3a
>     _panic(8003ef90) from _trap+ac
>     _trap()	from _Xtransflt+1d
>     _Xtransflt(800691d8,1) from 8000b5e7
>     _newproc(1) from _fork1+b3
>     _fork1(1) from _vfork+b
>     _vfork() from 80027265
>     _syscall() from	_Xsyscall+c
>     _Xsyscall(23c80,7fffe340,17948,0,7fffdd34,7fffdd30) from 3a03
>     _Syssize(17934,23c80,7fffe340,17948) from 38ac
>     _Syssize(17934,27a40) from 46f3
>     _Syssize(17934,62) from	1026
>     ?(4,7fffeb3c,7fffeb50) from 3d
>     ?()
>
>I have not found the Xtransflt symbol anywhere in the source or in any
>of the C libraries.  What am I overlooking?  The crash dumps always
>give a similar stack trace.  If only I could figure out what Xtransflt
>is I might have a clue.


Xtransflt is in locore.s.  Look for the line "SCBVEC(transflt)".  The
macro SCBVEC defines a vector to be called from the system control
block, with an X prepended to the name you give it.  So that interrupt
vector is called Xtransflt.
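
As a rough illustration of that naming rule, here is a small shell sketch
that turns the symbol adb prints into the SCBVEC line to search for.  The
helper name scbvec_pattern and the source path in the comment are
assumptions for illustration, not anything taken from the 4.3BSD tree.

```shell
# Hypothetical helper: map an adb symbol such as _Xtransflt to the
# SCBVEC(...) text that would have generated it in locore.s.
scbvec_pattern() {
    name=${1#_}      # drop the leading underscore shown by adb/nm
    name=${name#X}   # drop the X that SCBVEC prepends
    printf 'SCBVEC(%s)\n' "$name"
}

# Possible use (the path is an assumption; adjust to your source tree):
#   grep -n -F "$(scbvec_pattern _Xtransflt)" /sys/vax/locore.s
```

The same helper should work for the other vectors defined through SCBVEC,
since only the leading "_X" is stripped.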

This was an address translation fault, which was interpreted by the
trap routine as a segmentation fault.  Since the fault happened in kernel
mode, the kernel panicked.

The actual segmentation fault occurred in newproc (the context above the
call to Xtransflt).  The console messages (unfortunately not logged)
contained the actual address within newproc where the fault occurred, and
what the bad address was.  The adb kernel scripts probably contain something
that will format the structure pushed on the stack at fault time and can
get you that information.

  Michael I. Bushnell       \     This above all; to thine own self be true
         GIG!                \    And it must follow, as the night the day,
mike@turing.cs.unm.edu       /\   Thou canst not be false to any man.
  Hmmmm..............       /  \  Farewell:  my blessing season this in thee!

chris@mimsy.UUCP (Chris Torek) (01/28/89)

In article <5483@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>     _Xtransflt(800691d8,1) from 8000b5e7
>     _newproc(1) from _fork1+b3
>
>I have not found the Xtransflt symbol anywhere in the source or in any
>of the C libraries.  What am I overlooking?  The crash dumps always
>give a similar stack trace.  If only I could figure out what Xtransflt
>is I might have a clue.

Xtransflt is in locore.s.  The problem here is that adb is looking at
CALLG/CALLS stack frames, and a translation fault trap pushes a trap
frame instead, confusing it.  The frame shown as Xtransflt() is almost
certainly procdup().  It may be related to this bug in resume():

RCS file: RCS/locore.s,v
retrieving revision 1.5
retrieving revision 1.6
diff -c2 -r1.5 -r1.6
*** /tmp/,RCSt1000249	Sat Jan 28 02:13:57 1989
--- /tmp/,RCSt2000249	Sat Jan 28 02:14:04 1989
***************
*** 1645,1650 ****
  1:
  	movl	r1,sp
! 	movl	(r0),(sp)			# address to return to
! 	movl	$PSL_PRVMOD,4(sp)		# ``cheating'' (jfr)
  	rei
  
--- 1645,1650 ----
  1:
  	movl	r1,sp
! 	pushl	$PSL_PRVMOD			# return psl
! 	pushl	(r0)				# address to return to
  	rei
  
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

aglew@urbana.mcd.mot.com (01/30/89)

>I have not found the Xtransflt symbol anywhere in the source or in any
>of the C libraries.  What am I overlooking?  The crash dumps always
>give a similar stack trace.  If only I could figure out what Xtransflt
>is I might have a clue.
>
>Any suggestions?
>-- 
>Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi
>                    ARPA:  bsu-cs!dhesi@iuvax.cs.indiana.edu

Chris Torek has already shown where Xtransflt comes from - it is
a symbol in locore.s that is generated by a macro - and diagnosed
the problem.

I thought that I might just mention a convenient trick for finding
this sort of generated symbol that does not exist in the source:
just do an "nm -o *.[oa]|grep Xtransflt" in the directory that contains all of
your objects for the kernel build. That'll tell you what file
the symbol is in - then you just have to look at the macros.
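
A sketch of that search wrapped as a filter, assuming the usual "nm -o"
layout of file:value, type letter, symbol name.  The function name
defining_files is made up here; it skips entries of type U (undefined
references) so that only the file that defines the symbol is reported.

```shell
# Hypothetical filter: read `nm -o` output on stdin and print each file
# that defines the given symbol (any type letter other than U).
defining_files() {
    awk -v sym="$1" '
        $NF == sym && $(NF-1) != "U" {
            sub(/:.*/, "", $1)   # keep only the file name before the first colon
            print $1
        }' | sort -u
}

# Possible use, run in the kernel object directory:
#   nm -o *.[oa] | defining_files _Xtransflt
```

For a member of an archive this prints the archive name rather than the
member, since everything after the first colon is dropped.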

BTW, it might be nice to find a way to map these synthetic symbols
into a tags file - although at the moment I'd be happy to have tags
for my assembler files... [everything is simple, just requiring
time and effort]


Andy "Krazy" Glew   aglew@urbana.mcd.mot.com   uunet!uiucdcs!mcdurb!aglew
   Motorola Microcomputer Division, Champaign-Urbana Design Center
	   1101 E. University, Urbana, Illinois 61801, USA.
   
My opinions are my own, and are not the opinions of my employer, or
any other organisation. I indicate my company only so that the reader
may account for any possible bias I may have towards our products.