[comp.sys.sgi] SGI crashes during shutdown! Why??

dwatts@ki.UUCP (Dan Watts) (12/20/90)

While attempting to do a reboot today, I ran across a nasty little problem.
Seems my system crashed during shutdown.  From the dbx output, it seems that
the problem has to do with NFS.  My system previously had 3 NFS partitions
mounted R/W. I did a 'umount -t nfs' the day before due to the remote system
being moved to another location.

The following is the output from 'hinv', 'versions' and the 'dbx' traceback.
Has anyone else seen this behavior?  This has happened to me before, most
recently during a shutdown caused by a power outage.  The remote system isn't
on a battery backup, so I had done the "umount -t nfs" and waited for the
umount to timeout before I started the shutdown.

  # hinv
  1 12 MHZ IP6 Processor
  FPU: MIPS R2010A/R3010 VLSI Floating Point Chip Revision: 1.5
  CPU: MIPS R2000A/R3000 Processor Chip Revision: 1.6
  Data cache size: 8 Kbytes
  Instruction cache size: 16 Kbytes
  Main memory size: 8 Mbytes
  Integral Ethernet controller: Version 0
  Graphics board: GR1.2 
  Integral SCSI controller 0: Version WD33C93
  Tape drive: unit 4 on SCSI controller 0: QIC 24
  Tape drive: unit 2 on SCSI controller 0: QIC 150
  Disk drive: unit 3 on SCSI controller 0
  Disk drive: unit 1 on SCSI controller 0

  # versions
  I = Installed, R = Removed

    Name                  Date      Description
  I  dev                  90/08/15  4D1-3.3 Development System
  I  eoe1                 90/08/15  4D1-3.3 Execution Only Environment (part 1)
  I  eoe2                 90/08/15  4D1-3.3 Execution Only Environment (part 2)
  I  maint1               90/10/10  Maint1 Product 4D1-3.3.1
  I  nfs                  90/08/29  4D1-3.3 Network File System
  I  vis                  89/10/05  Personal Visualizer PD, 808-0160-001

# dbx -k unix.5 vmcore.5
dbx version 2.0 8/6/90 14:02
Copyright 1987 Silicon Graphics Inc.
Copyright 1987 MIPS Computer Systems Inc.
Type 'help' for help.

warning: unix.5 is not marked executable!
Reading symbolic information of `unix.5' . . .
SP=ffffdc54 fp=100096d8 pc=8001184c
Current process id is 15499
(dbx) where
> 0 syncreboot(0x0, 0xa7, 0x800cf7ce, 0x0, 0x40, 0x800cf63c)
    ["prf.c":578, 0x80011848]
  1 cmn_err(0x0, 0x800cf6ec, 0x800cf808, 0x8001f1b8, 0xffffdd8c, 0x40)
    ["prf.c":174, 0x8000fd28]
  2 panicregs(0xffffdd8c, 0x40, 0x8ff14, 0x30000008, 0x800cf808, 0x0)
    ["trap.c":127, 0x80014220]
  3 k_trap(0x8001fa2c, 0x800fd9bc, 0x8ff14, 0x30000008, 0x1b, 0x80148ef8)
    ["trap.c":248, 0x800145ec]
  4 .trap.trap(0xffffdd8c, 0x0, 0x8ff14, 0x30000008, 0xff, 0x80130350)
    ["trap.c":275, 0x80014638]
  5 pflushinvaldev(0x8015b060, 0x80051028, 0x80050f68, 0x8001e0f0, 0x800fb170,
    0x80106ad0) ["page.c":594, 0x8001f1b4]
  6 binval(0x800fb170, 0x800e7024, 0x800e7024, 0x8001e0b0, 0x1, 0x93)
    ["fs_bio.c":1141, 0x80051044]
  7 efs_umount(0x80106ae8, 0x19, 0x80038ccc, 0x0, 0x800fc7d4, 0x800159a0)
    ["mount.c":515, 0x800636ec]
  8 sumount(0x80106a08, 0x800335b4, 0x80033158, 0x80015ef8, 0xf801000, 0x8ff3c)
    ["sys3.c":561, 0x80038cf8]
   9 .trap.syscall(0xffffc0f0, 0x16, 0x8ff3c, 0x30000008, 0x0, 0x0)
    ["trap.c":1029, 0x8001599c]
  10 systrap(0xffffc0f0, 0x16, 0x8ff3c, 0x30000008, 0x0, 0x0)
    ["IP6.O/clocore.s":2121, 0x80002a48]
(dbx) 
-- 
################## Skinny Dippers Have Less Stress ##################
# CompuServe: >INTERNET:uunet.UU.NET!ki!dwatts    Dan Watts         #
# UUCP      : ...!{uunet | wgc386}!ki!dwatts      Ki Research, Inc. #
############### New Dimensions In Network Connectivity ##############

brendan@illyria.wpd.sgi.com (Brendan Eich) (12/22/90)

In article <902@ki.UUCP>, dwatts@ki.UUCP (Dan Watts) writes:
> While attempting to do a reboot today, I ran across a nasty little problem.
> Seems my system crashed during shutdown.  From the dbx output, it seems that
> the problem has to do with NFS.

But neither "nfs" nor "NFS" appears in the dbx output.  The "efs" in stack
frame 7 below stands for Extent File System (SGI's local filesystem type).
Is there another reason why you think this might be an NFS bug?
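
For anyone who wants to repeat that check mechanically, here is a minimal
sketch in Python. It is not an SGI or dbx tool. It assumes the frame lines
of the `where` output have been pasted into a string, and the helper names
(frame_functions, frames_mentioning) are made up for illustration. It only
looks at function names, so it says nothing about code that ran earlier and
is no longer on the stack.

```python
import re

# Frame lines copied from the `where` output in the original post
# (the source-file lines and frame 5's continuation line are left out).
TRACEBACK = """\
> 0 syncreboot(0x0, 0xa7, 0x800cf7ce, 0x0, 0x40, 0x800cf63c)
  1 cmn_err(0x0, 0x800cf6ec, 0x800cf808, 0x8001f1b8, 0xffffdd8c, 0x40)
  2 panicregs(0xffffdd8c, 0x40, 0x8ff14, 0x30000008, 0x800cf808, 0x0)
  3 k_trap(0x8001fa2c, 0x800fd9bc, 0x8ff14, 0x30000008, 0x1b, 0x80148ef8)
  4 .trap.trap(0xffffdd8c, 0x0, 0x8ff14, 0x30000008, 0xff, 0x80130350)
  5 pflushinvaldev(0x8015b060, 0x80051028, 0x80050f68, 0x8001e0f0, 0x800fb170,
  6 binval(0x800fb170, 0x800e7024, 0x800e7024, 0x8001e0b0, 0x1, 0x93)
  7 efs_umount(0x80106ae8, 0x19, 0x80038ccc, 0x0, 0x800fc7d4, 0x800159a0)
  8 sumount(0x80106a08, 0x800335b4, 0x80033158, 0x80015ef8, 0xf801000, 0x8ff3c)
  9 .trap.syscall(0xffffc0f0, 0x16, 0x8ff3c, 0x30000008, 0x0, 0x0)
 10 systrap(0xffffc0f0, 0x16, 0x8ff3c, 0x30000008, 0x0, 0x0)
"""

# Optional ">" current-frame marker, frame number, then the function name
# (which may contain dots, as in ".trap.trap") up to the opening paren.
FRAME_RE = re.compile(r"^\s*>?\s*(\d+)\s+([.\w]+)\(", re.MULTILINE)


def frame_functions(text):
    """Return (frame number, function name) pairs found in the traceback."""
    return [(int(num), name) for num, name in FRAME_RE.findall(text)]


def frames_mentioning(text, needle):
    """Return the frames whose function name contains needle (any case)."""
    needle = needle.lower()
    return [(num, name) for num, name in frame_functions(text)
            if needle in name.lower()]


if __name__ == "__main__":
    print("nfs frames:", frames_mentioning(TRACEBACK, "nfs"))
    print("efs frames:", frames_mentioning(TRACEBACK, "efs"))
```

Run against the frames above, the "nfs" search comes back empty and the
"efs" search finds only efs_umount in frame 7.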

> # dbx -k unix.5 vmcore.5
> dbx version 2.0 8/6/90 14:02
> . . .
> Current process id is 15499
> (dbx) where
> > 0 syncreboot(0x0, 0xa7, 0x800cf7ce, 0x0, 0x40, 0x800cf63c)
>   1 cmn_err(0x0, 0x800cf6ec, 0x800cf808, 0x8001f1b8, 0xffffdd8c, 0x40)
>   2 panicregs(0xffffdd8c, 0x40, 0x8ff14, 0x30000008, 0x800cf808, 0x0)
>   3 k_trap(0x8001fa2c, 0x800fd9bc, 0x8ff14, 0x30000008, 0x1b, 0x80148ef8)
>   4 .trap.trap(0xffffdd8c, 0x0, 0x8ff14, 0x30000008, 0xff, 0x80130350)
>   5 pflushinvaldev(0x8015b060, 0x80051028, 0x80050f68, 0x8001e0f0, 0x800fb170,
>   6 binval(0x800fb170, 0x800e7024, 0x800e7024, 0x8001e0b0, 0x1, 0x93)
>   7 efs_umount(0x80106ae8, 0x19, 0x80038ccc, 0x0, 0x800fc7d4, 0x800159a0)
>   8 sumount(0x80106a08, 0x800335b4, 0x80033158, 0x80015ef8, 0xf801000, 0x8ff3c)
>   9 .trap.syscall(0xffffc0f0, 0x16, 0x8ff3c, 0x30000008, 0x0, 0x0)

It looks more like an EFS or VM bug, although the prior NFS umounts you
mentioned could be obscurely related.  We'll have a look.  Thanks,

/be