emv@ox.com (Ed Vielmetti) (03/07/91)
"poe" is a decstation 3100 running ultrix 4.1. it recently crashed
for no apparent reason. here's a stack trace to show what happened,
and the most useful of the uerf messages.
nothing obvious in here clues me in on what's up; I'm writing it
off for now to a stray gamma ray.
--Ed
********************************* ENTRY 3. *********************************
----- EVENT INFORMATION -----
EVENT CLASS ERROR EVENT
OS EVENT TYPE 200. PANIC
SEQUENCE NUMBER 936.
OPERATING SYSTEM ULTRIX 32
OCCURRED/LOGGED ON Wed Mar 6 15:20:03 1991 EST
OCCURRED ON SYSTEM poe.aa.ox.co
SYSTEM ID x82011601 HW REV: x1
FW REV: x16
CPU TYPE: R2000A/R3000
PROCESSOR TYPE KN01
PANIC MESSAGE unaligned access
********************************* ENTRY 4. *********************************
----- EVENT INFORMATION -----
EVENT CLASS ERROR EVENT
OS EVENT TYPE 117. ERROR & STATUS REGS
SEQUENCE NUMBER 935.
OPERATING SYSTEM ULTRIX 32
OCCURRED/LOGGED ON Wed Mar 6 15:20:03 1991 EST
OCCURRED ON SYSTEM poe.aa.ox.co
SYSTEM ID x82011601 HW REV: x1
FW REV: x16
CPU TYPE: R2000A/R3000
PROCESSOR TYPE KN01
----- ERROR & STATUS REGS -----
CAUSE x10000010
EXCEPTION CODE ADR ERR (LOAD OR INST_FETCH)
EPC x8008D794
STATUS x0000FF34 CURRENT INTERRUPT STATE DISABLED
CURRENT MODE KERNEL
PREVIOUS INTERUPT STATE ENABLED
PREVIOUS MODE KERNEL
OLD INTERUPT STATE ENABLED
OLD MODE USER
SW INTERRUPT 0 ENABLED
SW INTERRUPT 1 ENABLED
HW INTERRUPT 0 ENABLED
HW INTERRUPT 1 ENABLED
HW INTERRUPT 2 ENABLED
HW INTERRUPT 3 ENABLED
HW INTERRUPT 4 ENABLED
HW INTERRUPT 5 ENABLED
CACHE STATE NORMAL
VIRTUAL ADDRESS x0001000B
SP xFFFFDE3C
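The ERROR & STATUS REGS text above is a bit-by-bit decode of the CAUSE and STATUS values, so it can be checked by hand. Below is a minimal Python sketch of that decode. It assumes the usual R2000/R3000 coprocessor 0 layout: ExcCode starting at bit 2 of Cause, interrupt bits at 15..8 of both registers, and the three mode/interrupt-enable pairs in the low six bits of Status. The function and table names are made up for illustration; they are not part of uerf or ULTRIX.

```python
# Sketch only: field positions are assumed from the standard R2000/R3000
# Cause/Status layout; all names here are hypothetical helpers.

EXC_CODES = {
    0: "Int (interrupt)",
    1: "Mod (TLB modification)",
    2: "TLBL (TLB miss, load or fetch)",
    3: "TLBS (TLB miss, store)",
    4: "AdEL (address error, load or fetch)",
    5: "AdES (address error, store)",
    6: "IBE (bus error, fetch)",
    7: "DBE (bus error, data)",
    8: "Sys (syscall)",
    9: "Bp (breakpoint)",
    10: "RI (reserved instruction)",
    11: "CpU (coprocessor unusable)",
    12: "Ov (arithmetic overflow)",
}

# Bits 8..15 of Cause (pending) and Status (mask): two software
# interrupts followed by six hardware interrupts.
IRQ_NAMES = ["SW0", "SW1", "HW0", "HW1", "HW2", "HW3", "HW4", "HW5"]


def irq_bits(reg):
    """Names of the interrupt bits set in bits 8..15 of reg."""
    return [IRQ_NAMES[n] for n in range(8) if (reg >> (8 + n)) & 1]


def decode_cause(cause):
    code = (cause >> 2) & 0x1F  # ExcCode field (assumed to start at bit 2)
    return {
        "exc_code": code,
        "exc_name": EXC_CODES.get(code, "unknown"),
        "branch_delay": bool((cause >> 31) & 1),  # BD bit
        "coprocessor": (cause >> 28) & 0x3,       # CE field
        "pending": irq_bits(cause),
    }


def decode_status(sr):
    def pair(shift):
        # Each pair is (interrupt enable, kernel/user) at bits shift, shift+1.
        return {
            "interrupts": "ENABLED" if (sr >> shift) & 1 else "DISABLED",
            "mode": "USER" if (sr >> (shift + 1)) & 1 else "KERNEL",
        }
    return {
        "current": pair(0),
        "previous": pair(2),
        "old": pair(4),
        "int_mask": irq_bits(sr),
    }


if __name__ == "__main__":
    print(decode_cause(0x10000010))
    print(decode_status(0x0000FF34))
```

With the values from ENTRY 4, decode_cause(0x10000010) yields exception code 4 (address error on load or fetch) and decode_status(0x0000FF34) yields current DISABLED/KERNEL, previous ENABLED/KERNEL, old ENABLED/USER with all eight interrupt mask bits set. That agrees with the uerf text.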
poe% dbx -k ./vmunix.0 ./vmcore.0
(dbx) where
> 0 boot(paniced = 0, arghowto = 0)
["../../machine/mips/machdep.c":568, 0x800f54b4]
1 subr_prf.panic(s = 0x8013cecc = "unaligned access")
["../../sys/subr_prf.c":1165, 0x8009dee0]
2 kn01trap_error(ep = 0x8023b6a8, code = 3241116672,
sr = 2149617064, cause = 268435472, signo = 0xffffdd74)
["../../machine/mips/kn01.c":514, 0x800f1ab0]
3 trap.trap(ep = 0xffffdd98, code = 2149616896, sr = 65332,
cause = 268442744)
["../../machine/mips/trap.c":447, 0x800fdf34]
4 VEC_trap()
["../../machine/mips/locore.s":742, 0x800f3b0c]
5 .block201
["../../sys/kern_exit.c":314, 0x8008d790]
6 kern_exit.exit(rv = 0)
["../../sys/kern_exit.c":314, 0x8008d790]
7 rexit()
["../../sys/kern_exit.c":221, 0x8008d3e0]
8 syscall(ep = 0xffffdf5c, code = 1, sr = 65340, cause = 32)
["../../machine/mips/trap.c":1074, 0x800ff350]
9 VEC_syscall()
["../../machine/mips/locore.s":797, 0x800f3b94]
10 entry.start()
[0x409904]
(dbx)
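dbx prints integer arguments in decimal while uerf prints registers in hex, which makes the two hard to compare by eye. A small Python sketch for converting the decimal arguments from the trace is below; the dictionary keys and the helper name are illustrative labels only.

```python
# Sketch: convert decimal arguments from the dbx `where` output to
# 32-bit hex so they can be compared with the uerf register dump.
# Values are copied from frames 2, 3, and 8 of the trace.

def to_hex32(value):
    """Format value as an 8-digit hex word, wrapping to 32 bits."""
    return f"0x{value & 0xFFFFFFFF:08x}"


args = {
    "kn01trap_error cause": 268435472,
    "trap sr": 65332,
    "syscall sr": 65340,
}

if __name__ == "__main__":
    for name, value in args.items():
        print(f"{name:22s} {value:>10d} -> {to_hex32(value)}")
```

The cause passed to kn01trap_error comes out as 0x10000010 and the sr passed to trap as 0x0000ff34, the same CAUSE and STATUS values uerf logged. The logged EPC x8008D794 is also four bytes past the 0x8008d790 shown for the kern_exit.c line 314 frames, so the fault appears to have been taken inside exit().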
tridge@anu.oz.au (Andrew Tridgell) (03/07/91)
Just yesterday morning I also got a crash for no apparent reason.
According to uerf it was also the result of an unaligned access. This
is the first time in 2 years we've had a random crash, and I'd like to
find out the cause.
We have a DECstation 3100 running Ultrix 4.0. At the time the load
average was below 2 and there were 4 users logged on. The last processes
running were a login and a process spawned by cron that did an rsh on
another machine (a Sun). Neither the rsh nor the login completed.
uerf shows:
********************************* ENTRY 188. *********************************
----- EVENT INFORMATION -----
EVENT CLASS OPERATIONAL EVENT
OS EVENT TYPE 250. ASCII MSG
SEQUENCE NUMBER 47.
OPERATING SYSTEM ULTRIX 32
OCCURRED/LOGGED ON Wed Mar 6 12:32:37 1991 EDT
OCCURRED ON SYSTEM aerodec
SYSTEM ID x82011601 HW REV: x1
FW REV: x16
CPU TYPE: R2000A/R3000
PROCESSOR TYPE KN01
MESSAGE
********************************* ENTRY 189. *********************************
----- EVENT INFORMATION -----
EVENT CLASS ERROR EVENT
OS EVENT TYPE 117. ERROR & STATUS REGS
SEQUENCE NUMBER 46.
OPERATING SYSTEM ULTRIX 32
OCCURRED/LOGGED ON Wed Mar 6 12:32:37 1991 EDT
OCCURRED ON SYSTEM aerodec
SYSTEM ID x82011601 HW REV: x1
FW REV: x16
CPU TYPE: R2000A/R3000
PROCESSOR TYPE KN01
----- ERROR & STATUS REGS -----
CAUSE x30000010
EXCEPTION CODE ADR ERR (LOAD OR INST_FETCH)
EPC x801151B0
STATUS x00000000 CURRENT INTERRUPT STATE DISABLED
CURRENT MODE KERNEL
PREVIOUS INTERUPT STATE DISABLED
PREVIOUS MODE KERNEL
OLD INTERUPT STATE DISABLED
OLD MODE KERNEL
CACHE STATE NORMAL
VIRTUAL ADDRESS x2F320951
SP xFFFFDED4
********************************* ENTRY 190. *********************************
----- EVENT INFORMATION -----
EVENT CLASS ERROR EVENT
OS EVENT TYPE 200. PANIC
SEQUENCE NUMBER 48.
OPERATING SYSTEM ULTRIX 32
OCCURRED/LOGGED ON Wed Mar 6 12:32:37 1991 EDT
OCCURRED ON SYSTEM aerodec
SYSTEM ID x82011601 HW REV: x1
FW REV: x16
CPU TYPE: R2000A/R3000
PROCESSOR TYPE KN01
PANIC MESSAGE unaligned access
********************************* ENTRY 191. *********************************
----- EVENT INFORMATION -----
EVENT CLASS ERROR EVENT
OS EVENT TYPE 117. ERROR & STATUS REGS
SEQUENCE NUMBER 49.
OPERATING SYSTEM ULTRIX 32
OCCURRED/LOGGED ON Wed Mar 6 12:32:37 1991 EDT
OCCURRED ON SYSTEM aerodec
SYSTEM ID x82011601 HW REV: x1
FW REV: x16
CPU TYPE: R2000A/R3000
PROCESSOR TYPE KN01
----- ERROR & STATUS REGS -----
CAUSE x30000C10
EXCEPTION CODE ADR ERR (LOAD OR INST_FETCH)
HW INTERRUPT 0 PENDING
HW INTERRUPT 1 PENDING
EPC x80099558
STATUS x0000C004 CURRENT INTERRUPT STATE DISABLED
CURRENT MODE KERNEL
PREVIOUS INTERUPT STATE ENABLED
PREVIOUS MODE KERNEL
OLD INTERUPT STATE DISABLED
OLD MODE KERNEL
HW INTERRUPT 4 ENABLED
HW INTERRUPT 5 ENABLED
CACHE STATE NORMAL
VIRTUAL ADDRESS x003938D7
SP xFFFFDB24
This isn't particularly useful to me, as I haven't had much experience
reading such things (the machine doesn't go down very often).
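One thing that can be read straight off the dump is the VIRTUAL ADDRESS field, which is the address the faulting load or fetch used. A minimal Python sketch for checking its alignment follows. The helper name is hypothetical, and the 0x80000000 boundary assumes the standard MIPS memory map, in which addresses below that are in the user-mapped kuseg region.

```python
# Sketch: check the alignment of the faulting addresses logged in
# ENTRY 189 and ENTRY 191. Assumes the standard MIPS memory map.

KUSEG_LIMIT = 0x80000000  # addresses below this are in kuseg


def describe_address(addr, width=4):
    """Report alignment of addr for a width-byte access, plus its segment."""
    offset = addr % width
    return {
        "address": f"0x{addr:08X}",
        "offset": offset,
        "aligned": offset == 0,
        "segment": "kuseg" if addr < KUSEG_LIMIT else "kernel segment",
    }


if __name__ == "__main__":
    for addr in (0x2F320951, 0x003938D7):
        print(describe_address(addr))
```

Both addresses are odd, so neither is aligned for a halfword or word access, which fits the "unaligned access" panic. Both also fall in kuseg even though the CPU was in kernel mode. That might point at a corrupted pointer, but that is a guess rather than something the log proves.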
dbx gives:
aero[9:38am]>dbx -k ./vmunix.0 ./vmcore.0
dbx version 2.0
Type 'help' for help.
reading symbolic information ...
[using memory image in ./vmcore.0]
(dbx) where
> 0 swtch.swtch() ["../../machine/mips/swtch.c":275, 0x8011adb0]
1 sleep_unlock(chan = 0x80167f9d = "^A", pri = 0, l = (nil)) ["../../sys/kern_synch.c":519, 0x800aa458]
2 kern_synch.sleep(chan = (nil), pri = -2146891488) ["../../sys/kern_synch.c":385, 0x800a9ffc]
3 .block419 ["../../vm/vm_sched.c":371, 0x800e2f50]
4 sched() ["../../vm/vm_sched.c":371, 0x800e2f50]
5 .block217 ["../../sys/init_main.c":755, 0x800909a4]
6 .block216 ["../../sys/init_main.c":755, 0x800909a4]
7 main() ["../../sys/init_main.c":755, 0x800909a4]
8 entry.start() ["../machine/entry.s":110, 0x80030098]
(dbx)
Any ideas as to what caused this?
Thanks!
---
Andrew Tridgell
tridge@aerodec.anu.edu.au