igb@vlsi1.vlsi.columbia.edu (Isidore G. Bendrihem) (04/07/90)
One of our SparcStations has been crashing once every other day. This is what /usr/adm/messages has to say about it:

Apr 2 17:36:04 eevlsi vmunix: BAD TRAP
Apr 2 17:36:04 eevlsi vmunix: csh: Data fault
Apr 2 17:36:04 eevlsi vmunix: kernel read fault at addr=0xff640000, pme=0x0
Apr 2 17:36:04 eevlsi vmunix: Sync Error Reg 80<INVALID>
Apr 2 17:36:04 eevlsi vmunix: pid=7988, pc=0xf809124c, sp=0xffffecc0, psr=0xc2, context=0
Apr 2 17:36:04 eevlsi vmunix: g1-g7: 4000e3, 4000e3, f80e3fd0, d, 0, f80e4000, f80e3c00
Apr 2 17:36:04 eevlsi vmunix: Begin traceback... sp = ffffecc0
Apr 2 17:36:04 eevlsi vmunix: Called from f802e2bc, fp=ffffed20, args=ffffc000 ff640000 38a0 ffffec18 f8129e50 f8101b20
Apr 2 17:36:04 eevlsi vmunix: Called from f802e0d0, fp=ffffed98, args=f8125e64 2 23 0 ffffffff ff11e6b4
Apr 2 17:36:04 eevlsi vmunix: Called from f802dba8, fp=ffffedf8, args=1 50 14 f8124c78 f812143c f8125e64
Apr 2 17:36:04 eevlsi vmunix: Called from f802d9f8, fp=ffffee60, args=1 f80e3c00 32 ff0f001c f8124c78 f8125e64
Apr 2 17:36:04 eevlsi vmunix: Called from f80aa01c, fp=ffffeec0, args=ffffefe0 f80d2b80 f80d2b80 0 ffffefb4 ffffefe0
Apr 2 17:36:04 eevlsi vmunix: Called from f8005808, fp=ffffef58, args=8000000 42 f7fffcb4 42 42 210
Apr 2 17:36:04 eevlsi vmunix: Called from 1626c, fp=f7fffbe8, args=0 0 f7fffcb4 4124 1 f7800150
Apr 2 17:36:04 eevlsi vmunix: End traceback...
Apr 2 17:36:04 eevlsi vmunix: panic: Data fault
Apr 2 17:36:04 eevlsi vmunix: zs3: silo overflow
Apr 2 17:36:04 eevlsi vmunix: syncing file systems... [11] [11] [8] [5] done
Apr 2 17:36:04 eevlsi vmunix: 00257 low-memory static kernel pages
Apr 2 17:36:04 eevlsi vmunix: 00343 additional static kernel pages
Apr 2 17:36:04 eevlsi vmunix: 00027 dynamic kernel data pages
Apr 2 17:36:04 eevlsi vmunix: 00028 additional user structure pages
Apr 2 17:36:04 eevlsi vmunix: 00000 segmap kernel pages
Apr 2 17:36:04 eevlsi vmunix: 00000 segvn kernel pages
Apr 2 17:36:04 eevlsi vmunix: 00000 current user process pages
Apr 2 17:36:04 eevlsi vmunix: 00024 user stack pages
Apr 2 17:36:04 eevlsi vmunix: 00679 total pages (679 chunks)
Apr 2 17:36:04 eevlsi vmunix:
Apr 2 17:36:04 eevlsi vmunix: dumping to vp ff04bc1c, offset 22684
Apr 2 17:36:04 eevlsi vmunix: 679 total pages, dump succeeded
Apr 2 17:36:04 eevlsi vmunix: rebooting...
Apr 2 17:36:04 eevlsi vmunix: SunOS Release 4.0.3c (GENERIC) #1: Thu May 25 17:17:12 PDT 1989
Apr 2 17:36:04 eevlsi vmunix: Copyright (c) 1983-1989 Sun Microsystems, Inc.
Apr 2 17:36:04 eevlsi vmunix: mem = 8192K (0x800000)
Apr 2 17:36:04 eevlsi vmunix: avail mem = 6987776
Apr 2 17:36:04 eevlsi vmunix: Ethernet address = 8:0:20:7:4a:ab
Apr 2 17:36:04 eevlsi vmunix: zs0 at obio 0xf1000000 pri 12
Apr 2 17:36:04 eevlsi vmunix: zs1 at obio 0xf0000000 pri 12
Apr 2 17:36:04 eevlsi vmunix: fd0 at obio 0xf7200000 pri 11
Apr 2 17:36:04 eevlsi vmunix: audio0 at obio 0xf7201000 pri 13
Apr 2 17:36:04 eevlsi vmunix: sbus0 at SBus slot 0 0x0
Apr 2 17:36:04 eevlsi vmunix: dma0 at SBus slot 0 0x400000
Apr 2 17:36:04 eevlsi vmunix: esp0 at SBus slot 0 0x800000 pri 3
Apr 2 17:36:04 eevlsi vmunix: sd0 at esp0 target 3 lun 0
Apr 2 17:36:04 eevlsi vmunix: sd0: <Quantum ProDrive 105S cyl 974 alt 2 hd 6 sec 35>
Apr 2 17:36:04 eevlsi vmunix: sd1 at esp0 target 1 lun 0
Apr 2 17:36:04 eevlsi vmunix: sd1: <Quantum ProDrive 105S cyl 974 alt 2 hd 6 sec 35>
Apr 2 17:36:04 eevlsi vmunix: le0 at SBus slot 0 0xc00000 pri 5
Apr 2 17:36:04 eevlsi vmunix: le1 at SBus slot 1 0xc00000 pri 5
Apr 2 17:36:04 eevlsi vmunix: cgsix0 at SBus slot 2 0x0 pri 7
Apr 2 17:36:04 eevlsi vmunix: root on sd0a fstype 4.2
Apr 2 17:36:04 eevlsi vmunix: swap on sd0b fstype spec size 14070K
Apr 2 17:36:04 eevlsi vmunix: dump on sd0b fstype spec size 14056K
Apr 3 14:31:09 eevlsi vmunix: BAD TRAP
Apr 3 14:31:09 eevlsi vmunix: csh: Data fault
Apr 3 14:31:09 eevlsi vmunix: kernel read fault at addr=0xff640000, pme=0x0
Apr 3 14:31:09 eevlsi vmunix: Sync Error Reg 80<INVALID>
Apr 3 14:31:09 eevlsi vmunix: pid=1365, pc=0xf809124c, sp=0xffffecc0, psr=0xc3, context=0
Apr 3 14:31:09 eevlsi vmunix: g1-g7: 4000e4, 4000e4, f80e3fd0, d, 0, f80e4000, f80e3c00
Apr 3 14:31:09 eevlsi vmunix: Begin traceback... sp = ffffecc0
Apr 3 14:31:09 eevlsi vmunix: Called from f802e2bc, fp=ffffed20, args=ffffc000 ff640000 38a0 ffffec18 f8129fb0 f8101b20
Apr 3 14:31:09 eevlsi vmunix: Called from f802e0d0, fp=ffffed98, args=f81264c0 2 2e ffffff6c ffffffff ff0ee464
Apr 3 14:31:09 eevlsi vmunix: Called from f802dba8, fp=ffffedf8, args=1 50 14 f81256e0 f8122098 f81264c0
Apr 3 14:31:09 eevlsi vmunix: Called from f802d9f8
Apr 3 14:31:25 eevlsi login: REPEATED LOGIN FAILURES ON ttyb, y
Apr 4 16:05:21 eevlsi vmunix: BAD TRAP
Apr 4 16:05:21 eevlsi vmunix: csh: Data fault
Apr 4 16:05:21 eevlsi vmunix: kernel read fault at addr=0xff640000, pme=0x0
Apr 4 16:05:21 eevlsi vmunix: Sync Error Reg 80<INVALID>
Apr 4 16:05:21 eevlsi vmunix: pid=3078, pc=0xf809124c, sp=0xffffecc0, psr=0xc2, context=0
Apr 4 16:05:21 eevlsi vmunix: g1-g7: 4000e3, 4000e3, a000, c, 0, f80e4000, f80e3c00
Apr 4 16:05:21 eevlsi vmunix: Begin traceback... sp = ffffecc0
Apr 4 16:05:21 eevlsi vmunix: Called from f802e2bc, fp=ffffed20, args=ffffc000 ff640000 38a0 ffffecc0 f8129f50 f80e9e10
Apr 4 16:05:21 eevlsi vmunix: Called from f802e0d0, fp=ffffed98, args=f8126304 2 2b 0 ffffffff ff0f46bc
Apr 4 16:05:21 eevlsi vmunix: Called from f802dba8, fp=ffffedf8, args=1 50 14 f8124c78 f8121e30 f8126304
Apr 4 16:05:21 eevlsi vmunix: Called from f802d9f8, fp=ffffee60, args=1 f80e3c00 32 ff112bcc f8124c78 f8126304
Apr 4 16:05:21 eevlsi vmunix: Called from f80aa01c, fp=ffffeec0, args=ffffefe0 f80d2b80 f80d2b80 0 ffffefb4 ffffefe0
Apr 4 16:05:21 eevlsi vmunix: Called from f8005808, fp=ffffef58, args=8000000 42 f7fffcb4 42 42 210
Apr 4 16:05:21 eevlsi vmunix: Called from 1626c, fp=f7fffbe8, args=0 0 f7fffcb4 4124 1 f7800150
Apr 4 16:05:21 eevlsi vmunix: End traceback...
Apr 4 16:05:21 eevlsi vmunix: panic: Data fault
Apr 4 16:05:21 eevlsi vmunix: syncing file systems... [4] [4] [2] done
Apr 4 16:05:21 eevlsi vmunix: 00257 low-memory static kernel pages
Apr 4 16:05:21 eevlsi vmunix: 00337 additional static kernel pages
Apr 4 16:05:21 eevlsi vmunix: 00039 dynamic kernel data pages
Apr 4 16:05:21 eevlsi vmunix: 00048 additional user structure pages
Apr 4 16:05:21 eevlsi vmunix: 00000 segmap kernel pages
Apr 4 16:05:21 eevlsi vmunix: 00000 segvn kernel pages
Apr 4 16:05:21 eevlsi vmunix: 00000 current user process pages
Apr 4 16:05:21 eevlsi vmunix: 00046 user stack pages
Apr 4 16:05:21 eevlsi vmunix: 00727 total pages (727 chunks)
Apr 4 16:05:21 eevlsi vmunix:
Apr 4 16:05:21 eevlsi vmunix: dumping to vp ff04bc1c, offset 22300
Apr 4 16:05:21 eevlsi vmunix: 727 total pages, dump succeeded
Apr 4 16:05:21 eevlsi vmunix: rebooting...
Apr 4 16:05:21 eevlsi vmunix: SunOS Release 4.0.3c (GENERIC) #1: Thu May 25 17:17:12 PDT 1989
Apr 4 16:05:21 eevlsi vmunix: Copyright (c) 1983-1989 Sun Microsystems, Inc.
Apr 4 16:05:21 eevlsi vmunix: mem = 8192K (0x800000)
Apr 4 16:05:21 eevlsi vmunix: avail mem = 6987776
Apr 4 16:05:21 eevlsi vmunix: Ethernet address = 8:0:20:7:4a:ab
Apr 4 16:05:21 eevlsi vmunix: zs0 at obio 0xf1000000 pri 12
Apr 4 16:05:21 eevlsi vmunix: zs1 at obio 0xf0000000 pri 12
Apr 4 16:05:21 eevlsi vmunix: fd0 at obio 0xf7200000 pri 11
Apr 4 16:05:21 eevlsi vmunix: audio0 at obio 0xf7201000 pri 13
Apr 4 16:05:21 eevlsi vmunix: sbus0 at SBus slot 0 0x0
Apr 4 16:05:21 eevlsi vmunix: dma0 at SBus slot 0 0x400000
Apr 4 16:05:21 eevlsi vmunix: esp0 at SBus slot 0 0x800000 pri 3
Apr 4 16:05:21 eevlsi vmunix: sd0 at esp0 target 3 lun 0
Apr 4 16:05:21 eevlsi vmunix: sd0: <Quantum ProDrive 105S cyl 974 alt 2 hd 6 sec 35>
Apr 4 16:05:21 eevlsi vmunix: sd1 at esp0 target 1 lun 0
Apr 4 16:05:21 eevlsi vmunix: sd1: <Quantum ProDrive 105S cyl 974 alt 2 hd 6 sec 35>
Apr 4 16:05:21 eevlsi vmunix: le0 at SBus slot 0 0xc00000 pri 5
Apr 4 16:05:21 eevlsi vmunix: le1 at SBus slot 1 0xc00000 pri 5
Apr 4 16:05:21 eevlsi vmunix: cgsix0 at SBus slot 2 0x0 pri 7
Apr 4 16:05:21 eevlsi vmunix: root on sd0a fstype 4.2
Apr 4 16:05:21 eevlsi vmunix: swap on sd0b fstype spec size 14070K
Apr 4 16:05:21 eevlsi vmunix: dump on sd0b fstype spec size 14056K

Any idea what's going on?

Thanks

Isidore Bendrihem
VLSI Lab
Columbia University
igb@vlsi.columbia.edu
igb@cunixa.cc.columbia.edu
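For anyone comparing their own /usr/adm/messages against the log above, the following is a minimal Python sketch that groups BAD TRAP records by faulting process, fault address and pc. The names TRAP_RE, summarise_traps and SAMPLE are made up for illustration, the sample is an abbreviated excerpt of the log in this post, and the pattern only assumes the message layout shown here (other SunOS releases may word these lines differently).

```python
import re
from collections import Counter

# Hypothetical pattern: one BAD TRAP record, as laid out in the log above.
# DOTALL lets it work whether the log has one entry per line or is run together.
TRAP_RE = re.compile(
    r"BAD TRAP"
    r".*?vmunix: (?P<proc>\S+): Data fault"
    r".*?kernel (?P<kind>read|write) fault at addr=(?P<addr>0x[0-9a-f]+)"
    r".*?pc=(?P<pc>0x[0-9a-f]+)",
    re.DOTALL,
)

def summarise_traps(text):
    """Count BAD TRAP records by (process, fault kind, fault address, pc)."""
    return Counter(
        (m["proc"], m["kind"], m["addr"], m["pc"])
        for m in TRAP_RE.finditer(text)
    )

# Abbreviated excerpt of the three crashes reported in this post.
SAMPLE = """\
Apr 2 17:36:04 eevlsi vmunix: BAD TRAP
Apr 2 17:36:04 eevlsi vmunix: csh: Data fault
Apr 2 17:36:04 eevlsi vmunix: kernel read fault at addr=0xff640000, pme=0x0
Apr 2 17:36:04 eevlsi vmunix: pid=7988, pc=0xf809124c, sp=0xffffecc0, psr=0xc2, context=0
Apr 2 17:36:04 eevlsi vmunix: panic: Data fault
Apr 3 14:31:09 eevlsi vmunix: BAD TRAP
Apr 3 14:31:09 eevlsi vmunix: csh: Data fault
Apr 3 14:31:09 eevlsi vmunix: kernel read fault at addr=0xff640000, pme=0x0
Apr 3 14:31:09 eevlsi vmunix: pid=1365, pc=0xf809124c, sp=0xffffecc0, psr=0xc3, context=0
Apr 4 16:05:21 eevlsi vmunix: BAD TRAP
Apr 4 16:05:21 eevlsi vmunix: csh: Data fault
Apr 4 16:05:21 eevlsi vmunix: kernel read fault at addr=0xff640000, pme=0x0
Apr 4 16:05:21 eevlsi vmunix: pid=3078, pc=0xf809124c, sp=0xffffecc0, psr=0xc2, context=0
Apr 4 16:05:21 eevlsi vmunix: panic: Data fault
"""

if __name__ == "__main__":
    for signature, count in summarise_traps(SAMPLE).items():
        print(count, signature)
```

On the excerpt this reports a single signature seen three times: csh, a read fault at 0xff640000, pc 0xf809124c. Identical address and pc across crashes suggests one repeatable kernel path rather than random hardware trouble, though that is an inference and not something the log itself states.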
igb@vlsi1.vlsi.columbia.edu (Isidore G. Bendrihem) (04/11/90)
Thanks to all the people who replied to my "bad silo" problem with one of our SparcStations. Many people seem to have this problem. Some of them have had their motherboards and mice replaced without luck. According to a SUN representative, the problem seems to be in SunOS 4.0.3c. This is the response I got from them:

If you have a support contract, contact the USAC via email at hotline@sun.com or by calling 1-800-USA-4SUN.

There is a known bug in SunOS 4.0.3c which can result in data fault panics. The problem deals with an interrupt being received from a serial port while the system is doing a fork(). If the timing is right, you can get a data fault panic. Since any shell does a lot of forking...

The backtrace typically looks like:

_panic(0xf80cb913,0x1e84800,0x1,0xf80cb7b8,0xfffff,0x0) + 7c
_trap(0x9,0xffffec74,0xff640000,0x80,0x1,0xffffffff) + 180
fault(0x1,0xfffffff2,0xffffc000,0xffffed20,0xf0,0x1) + 64
level6(?)
_bcopy(0xffffc000,0xff640000,0x38a0,0xffffe6a0,0xf810dff8,0x0) + c
_procdup(0xf810b144,0x94,0x49,0x0,0xffffffff,0xff108c9c) + 134
_newproc(0x0,0x50,0x14,0xf810ab7c,0xf8108668,0xf810b144) + 494
_fork1(0x0,0xc,0x1,0xff13e8fc,0xf810ab7c,0xf810b144) + 1a0
_fork(0xffffefe0,0xf80be528,0xf80be528,0x0,0xffffefb4,0xffffefe0) + 4
_syscall(0x8000000) + 2d0

Note that you can only get the bug fix if you have a support contract. Considering how many people seem to be affected by this, I think SUN should make the fix freely available by posting it to the Rice U. archives.

Isidore Bendrihem
igb@vlsi.columbia.edu
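One way to check whether a logged panic resembles the backtrace quoted above is to compare the arguments of the innermost "Called from" frame with the leading arguments of the _bcopy frame (0xffffc000, 0xff640000, 0x38a0). Below is a rough Python sketch of that comparison. The names KNOWN_BCOPY_ARGS, FRAME_RE, first_frame_args and matches_known_bug are invented for this example, the sample lines are taken from the crashes in the original post, and a match is only a hint, not proof that a machine is hitting the same bug.

```python
import re

# Leading arguments of the _bcopy frame in the backtrace quoted above.
KNOWN_BCOPY_ARGS = ("ffffc000", "ff640000", "38a0")

# Hypothetical pattern: the first "Called from" frame after "Begin traceback".
FRAME_RE = re.compile(
    r"Begin traceback"
    r".*?Called from [0-9a-f]+, fp=[0-9a-f]+, args=(?P<args>[0-9a-f ]+)",
    re.DOTALL,
)

def first_frame_args(text):
    """Return the argument tuple of the innermost frame of each traceback."""
    return [tuple(m["args"].split()) for m in FRAME_RE.finditer(text)]

def matches_known_bug(text):
    """True if every traceback's innermost frame starts with the _bcopy args."""
    frames = first_frame_args(text)
    return bool(frames) and all(f[:3] == KNOWN_BCOPY_ARGS for f in frames)

# Innermost frames of the three crashes from the original post.
SAMPLE = """\
Apr 2 17:36:04 eevlsi vmunix: Begin traceback... sp = ffffecc0
Apr 2 17:36:04 eevlsi vmunix: Called from f802e2bc, fp=ffffed20, args=ffffc000 ff640000 38a0 ffffec18 f8129e50 f8101b20
Apr 3 14:31:09 eevlsi vmunix: Begin traceback... sp = ffffecc0
Apr 3 14:31:09 eevlsi vmunix: Called from f802e2bc, fp=ffffed20, args=ffffc000 ff640000 38a0 ffffec18 f8129fb0 f8101b20
Apr 4 16:05:21 eevlsi vmunix: Begin traceback... sp = ffffecc0
Apr 4 16:05:21 eevlsi vmunix: Called from f802e2bc, fp=ffffed20, args=ffffc000 ff640000 38a0 ffffecc0 f8129f50 f80e9e10
"""

if __name__ == "__main__":
    print(matches_known_bug(SAMPLE))
```

In all three of our crashes the first three arguments agree with the _bcopy frame, and the second one is also the address reported in the "kernel read fault" line, which is consistent with the fork() explanation above.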