[comp.sys.pyramid] how to force a dump?

era@NIWOT.UCAR.EDU (04/28/89)

We have an occasional situation where our 90x running 4.4 freezes
up.  The system activity light indicates something going on, but
there is no response to attempts to get into the machine thru the
network or console.

Question: is there some way, from one of the COS panels, to coerce
OSx into dropping a core file?

Thanks in advance for your assistance...

Ed Arnold
era@ncar.ucar.edu

steve@polyslo.CalPoly.EDU (Steve DeJarnett) (04/28/89)

In article <8904272154.AA00226@era.ucar.edu.UCAR.EDU> era@NIWOT.UCAR.EDU writes:
>Question: is there some way, from one of the COS panels, to coerce
>OSx into dropping a core file?

	Build a kernel with the Kernel debugger installed.  To do this, edit
your local Kernel configuration file (not conf.c but UCAR or whatever you call
it) and add the line

	debugger

Then, when this happens, from the console, type:

	CTRL-@

This will dump you into the kernel debugger, where you can do things like 
check which process is running, look at its call stack, and just about 
anything else you'd probably ever want to do.  One of the notable things
you can also do is panic the kernel (from the <dbg> prompt, type pa, for
panic, obviously).

	Hope that helps.

-------------------------------------------------------------------------------
| Steve DeJarnett            | Smart Mailers -> steve@polyslo.CalPoly.EDU     |
| Computer Systems Lab       | Dumb Mailers  -> ..!ucbvax!voder!polyslo!steve |
| Cal Poly State Univ.       |------------------------------------------------|
| San Luis Obispo, CA  93407 | BITNET = Because Idiots Type NETwork           |
-------------------------------------------------------------------------------

cml@brachiosaur.cis.ohio-state.edu (Christopher Lott) (04/28/89)

In article <8904272154.AA00226@era.ucar.edu.UCAR.EDU> era@NIWOT.UCAR.EDU writes:
>Question: is there some way, from one of the COS panels, to coerce
>OSx into dropping a core file?

Take this for what it's worth.  These are some very, _very_ old instructions
I once got from RTOC which were supposed to cause a pyramid machine (hung or 
otherwise) to panic.  Takes some diddling, and be warned that it *never*
worked for me.  If this is hopelessly obsolete and wrong, would someone
at Pyramid (hi, carl g?) please correct me?



                          How to Force a Panic


	     1.  If the system is hung, or you have a reason to cause a
	     crash, go to COS frame B and halt it by pushing the Z-key.

	     2.  The system will  stop.   Make  a  note  of  the  Program
	     Counter  in  the  system  status  line  at the bottom of the
	     screen.  It will look like:
		                             FFxxxxxx

	     3.  Alter the memory word following that address.  If
	     pc = FF150808 then change location FF15080C (i.e., add 0x4).
	     Store  a  31000001 there.  This will be the next instruction
	     executed when the machine is restarted.   Type 'M' to modify 
	     memory - you will see a display something like this:
	     FF150800: 00000000 00000000 00000000 00000000
	      address  ^800     ^804     ^808     ^80C
	     The Pyramid has 4-byte long words, so each column is 4 bytes on.

	     4.  Next, alter GR0.  Store the hex number 0000 0001 there.
	     This is General Register 0 -  displayed  in  frame B.  Use
	     command 'A' to modify registers.

	     5.  Restart the CPU  with  the  Z-key in frame B.
	     The stored instruction and register value force the computer
	     to attempt execution of a word instruction on a byte boundary.
	     This forces a trap, and if savecore is enabled (in /etc/rc),
	     core will dump.  All the contents of memory will be written to
	     the swap device.

	     6.  Hit <esc> 0, and watch to see that this happens.  If the
	     panic  was  caused  by a disk error, the core-write may fail
	     also.

	     7.  Reboot.  If savecore is enabled,  the  contents  of  the
	     swap device  will  be  copied  once again into the directory
	     specified; most usually, this is /usr/crash, but the  custo-
	     mers move it around.  There must be enough free space in the
	     file system to hold it.  It will be as large as  the  memory
	     copied,  and in the case of repeated failures there will al-
	     ready be other dumps stored there.
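
The address arithmetic in step 3 can be sketched as follows. This is only an illustration, not part of the RTOC instructions: the function name is made up, and it assumes the 'M' display always starts a row on a 16-byte boundary and shows four 4-byte words per row, as the sample row above suggests.

```python
def panic_patch_location(pc: int) -> dict:
    """Locate the word to overwrite, given the halted program counter.

    Assumption (not confirmed by the instructions): each row of the
    memory display starts on a 16-byte boundary and holds four words.
    """
    target = (pc + 4) & 0xFFFFFFFF   # the word following the PC (step 3)
    row_base = target & ~0xF         # address printed at the start of the row
    column = (target & 0xF) // 4     # 0-based index of the word in that row
    return {"target": target, "row_base": row_base, "column": column}


loc = panic_patch_location(0xFF150808)
print(f"store at {loc['target']:08X}, row {loc['row_base']:08X}, "
      f"column {loc['column']}")
```

For the pc in the example, this gives FF15080C, in the row that starts at FF150800, in the last of the four columns.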

my note: attempted 870817 to no avail.

chris...
-=-
cml@cis.ohio-state.edu        Computer Science Dept, OSU          614-292-1826
 or:  ...!{att,pyramid,killer}!osu-cis!cml		<standard disclaimers>

csg@pyramid.pyramid.com (Carl S. Gutekunst) (04/28/89)

>Take this for what it's worth.  These are some very, _very_ old instructions
>I once got from RTOC which were supposed to cause a pyramid machine (hung or 
>otherwise) to panic.

I remember going through this nonsense. I can't vouch for the exact procedure
any more, but it did work. It was quite enough of a hassle that some kind soul
added the `pa' command to the kernel debugger in OSx 4.0, as Steve described.
True, you do need to build OSx with the kernel debugger in, and then
wait for the *next* time the problem happens. I always build my kernels that
way anyway; partly because I'm usually booting weird and fragile things in my
kernels, and partly because I'm nosy. :-) The 'flags' field from COS Frame 1
can be set to control automatic entry into the debugger; see your System Admin
Guide for details. Or call RTOC, if what you want isn't documented to your
satisfaction.

As far as the NCAR problem, by all means, if you're getting quiet lockups,
build with the kernel debugger and take a core sample next time it happens.

<csg>

generous@daitc.daitc.mil (Curtis Generous) (04/28/89)

In article <8904272154.AA00226@era.ucar.edu.UCAR.EDU> era@NIWOT.UCAR.EDU writes:
>We have an occasional situation where our 90x running 4.4 freezes
>up.   ....
>Question: is there some way, from one of the COS panels, to coerce
>OSx into dropping a core file?
>
>Ed Arnold
>era@ncar.ucar.edu

If you have the kernel debugger compiled in (do a 'strings /vmunix' to check
it out or look at the config file in /sys/kernel{_m}/HOSTNAME), you can
do a ^@ (as in <CTRL>@) at the console.  This will throw you into the debugger.
Just type 'panic' and then 'exit'.
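
A rough way to script the config-file check, as a minimal sketch only. It assumes the directive sits on a line of its own as the bare word "debugger" (as described earlier in the thread); the function name and the sample config text are made up, and real OSx config files may have comment or option syntax this does not handle.

```python
def config_has_debugger(config_text: str) -> bool:
    """Return True if some line of the config is exactly 'debugger'.

    Assumption: the directive appears alone on its line; anything else
    (comments, trailing options) is treated as not matching.
    """
    return any(line.strip() == "debugger"
               for line in config_text.splitlines())


# Made-up sample text, not a real OSx configuration file.
sample = "first line\n\tdebugger\nlast line\n"
print(config_has_debugger(sample))

# On a real system you would read the file instead, e.g. (path hypothetical):
#   from pathlib import Path
#   config_has_debugger(Path("/sys/kernel/HOSTNAME").read_text())
```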

--curtis
-- 
Curtis C. Generous
DTIC Special Projects Office (DTIC-SPO)
ARPA: generous@daitc.mil
UUCP: {uunet,vrdxhq,lll-tis}!daitc!generous

karl@triceratops.cis.ohio-state.edu (Karl Kleinpaste) (04/28/89)

The suggestions for building a kernel w/debugger and then hitting ^@
to get control of it thereby, followed by a `pa' command, are
fine...if the system is listening to the console.  We had some
problems with an early release of 4.4 where heavy network traffic
caused the system to lock up so tight that it ignored the console -
typing anything was rewarded with CPU BUSY in SSL #3, which usually
says nothing more exciting than CPU[0].  It was necessary during these
events to put the sort of operation which Chris Lott described to use.
It worked, fortunately, and we did get usable core dumps out of it.

The bugs causing the lockup were fixed months ago, of course.  We're
going to be upgrading several of our other 4.0 machines to 4.4 Real
Soon Now.

--Karl