[net.unix-wizards] traceback on vmcore

jmcg (09/22/82)

An error in one of my drivers was causing occasional syscall panics
when the system went multi-user, so I found it necessary to dig into
this matter.  The articles from Spencer Thomas and Chris Kent were
helpful, but there were still some missing pieces.  I'm enclosing some
m4 macros and some explanatory text that others may find useful.  In
particular, the command line

	echo traceback | m4 adbhelp.m4 - | adb vmunix.XX vmcore.XX

yields a traceback (which is frequently all you need).

		Jim McGinness - UC San Diego  Chemistry
		ucbvax!sdcsvax!jmcg      (714) 452-4016
------------------------------------------------------------------
adbhelp.m4 (26 lines)
------------------------------------------------------------------
define(`upages',8)dnl	# from <sys/param.h>
define(`kstkpgs',2)dnl	# number of contig kernel stack pages
define(`kstkoff',`psiz*(upages-kstkpgs)')dnl	# offset in u struct of kstkpgs
define(`pagemask',1fffff)dnl	# pfn field of PTE
define(`ekstk',7fffffff)dnl	# base of kernel stack
define(`pteperpage',80)dnl	# PTEs per page
define(`psiz',200)dnl		# page size
define(`u_p0br',50)dnl		# from <sys/user.h>
define(`u_szpt',60)dnl		# from <sys/user.h>
define(`saved_fp',`*(rpb+1ec)')dnl	# see `doadump' in locore.s
define(`PCB',`*(rpb+1f8)')dnl		# ditto
define(`uvirt',`((PCB)+80000000)')dnl	# alternate virtual address for u struct
define(`p0br',`(*(uvirt+u_p0br))')dnl	# P0 region base
define(`szpt',`(*(uvirt+u_szpt))')dnl	# size of process page table
define(`usrpte',`(Usrptmap+((($1)-usrpt)%pteperpage))')dnl # needs explanation
define(`pageoff',`psiz*(pagemask&(*($1)))')dnl	# takes a pte as argument
define(`usrpteoff',`pageoff(usrpte($1))')dnl
define(`sho_ppte',`usrpte(p0br),szpt/X')dnl
define(`remap',`/*m $1 ekstk $2')dnl
define(`map_u',`remap(u,PCB)')dnl
define(`map_ppte',`remap(u+(kstkoff),usrpteoff(p0br+(psiz*(szpt-1))))')dnl
define(`map_kstk',dnl
`remap(u+(kstkoff),pageoff((u+((kstkoff)+psiz)-(4*kstkpgs))))')dnl
define(`traceback',dnl
map_ppte
map_kstk
saved_fp`$c')dnl
------------------------------------------------------------------
These macros undoubtedly have some rough edges, but they are much
better than trying to understand the adb expressions that accomplish
the desired action (apparently Rob Gurwitz gets the credit for m4|adb).
(You'll need something like Dan Franklin's suggestions to make m4
do unbuffered I/O if you want a useful interactive tool, though.)

Some explanations are in order.  The goal of the whole mess is to map
the kernel stack so that adb's stack trace command can be applied.  For
things that are not kept in contiguous physical pages, you can map in
at most 1k (this is 4.1BSD) at one time.  For the kernel stack, the
last 1k is usually sufficient.

Some space is reserved after the Restart Parameter Block (rpb) for a
stack to be used by `doadump'.  The PCBB and registers for the active
process are saved there.  Since the Process Context Block (PCB) is the
first part of the _u structure, PCBB (PCB Base) can be taken as the
physical address of _u.  I used p0br and szpt to locate the last page
of the process's page table; I suppose p1br could have been used as
well.

I had to look at `analyze.c' to figure out how to interpret p0br and
actually find the process page table.  What it boils down to is that
you have to subtract `usrpt' from it, then convert that offset into an
index into `Usrptmap'.
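
The arithmetic behind `usrpte' and `pageoff' can be sketched outside adb.
This is only an illustration, not part of adbhelp.m4: the constants are
the ones defined in the macros (adb reads them as hex, and adb's `%' is
integer division), while the USRPT and USRPTMAP addresses and the
function names are hypothetical, chosen just to make the example run.
It assumes the address being looked up is page-aligned, as p0br is
taken to be here.

```python
# Constants taken from adbhelp.m4.
PAGE_SIZE = 0x200                       # psiz
PTES_PER_PAGE = 0x80                    # pteperpage
PTE_SIZE = PAGE_SIZE // PTES_PER_PAGE   # 4 bytes per PTE
PFN_MASK = 0x1fffff                     # pagemask

def usrpte_addr(vaddr, usrpt, usrptmap):
    """Address of the Usrptmap entry mapping the page that holds vaddr.

    The byte offset from usrpt is turned into a page index, then into a
    byte offset into Usrptmap.  For page-aligned vaddr this equals the
    macro's (vaddr - usrpt) % pteperpage, i.e. a division by 0x80.
    """
    page_index = (vaddr - usrpt) // PAGE_SIZE
    return usrptmap + page_index * PTE_SIZE

def page_offset(pte):
    """Physical byte address of the page named by a PTE (pfn * page size)."""
    return PAGE_SIZE * (pte & PFN_MASK)

# Hypothetical addresses, for illustration only.
USRPT = 0x80040000
USRPTMAP = 0x80030000
p0br = USRPT + 5 * PAGE_SIZE            # page table starts in page 5 of usrpt

print(hex(usrpte_addr(p0br, USRPT, USRPTMAP)))
```

Dividing the byte offset by 0x80 in one step works because each
512-byte page of usrpt is described by one 4-byte entry in Usrptmap.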

There's a subtlety involved in mapping the page table in at the address
of the kernel stack: adb does not completely evaluate the three
expressions before changing the map in the /m command (`/*m' in the
`remap' macro).

Once the process page table was located, it was straightforward to
fetch the next-to-last Page Table Entry (pte) for the physical page
number of the last 1K of kernel stack.
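
As a rough check on the offsets used by `map_kstk', the sketch below
redoes the arithmetic with the constants from adbhelp.m4.  The value
used for `u' (0x7ffff000, i.e. upages pages below 0x80000000) is an
assumption made for the example, and the names are hypothetical.

```python
# Constants taken from adbhelp.m4.
PAGE_SIZE = 0x200     # psiz
PTE_SIZE = 4          # bytes per PTE (the literal 4 in map_kstk)
UPAGES = 8            # upages
KSTKPGS = 2           # kstkpgs

# Offset in the u area of the last KSTKPGS pages (the last 1K).
KSTKOFF = PAGE_SIZE * (UPAGES - KSTKPGS)

# Offset, within the last page of the process page table, of the
# next-to-last PTE: the entry for the first of those KSTKPGS pages.
PTE_OFFSET = PAGE_SIZE - PTE_SIZE * KSTKPGS

def kstk_pte_addr(u):
    """Address read by map_kstk: u+((kstkoff)+psiz)-(4*kstkpgs).

    After map_ppte, address u+KSTKOFF refers to the start of the last
    page-table page, so this lands PTE_OFFSET bytes into that page.
    """
    return u + KSTKOFF + PTE_OFFSET

U = 0x7ffff000        # assumed address of the u area, for illustration

print(hex(KSTKOFF), hex(PTE_OFFSET), hex(kstk_pte_addr(U)))
```

Because the two stack pages are contiguous in physical memory, the
single PTE at that offset is enough to map the whole last 1K.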

Aside: once I had the traceback, it was embarrassingly simple to see
what had caused the syscall panic--an uninitialized tty structure
allowed ttstart to call NULL as a function.  Surprisingly, NULL is a
perfectly good function: it's the user process.  It's interesting (but
probably sinful) to contemplate useful applications of the idea of
calling a user-space function from kernel context.