[comp.sys.ibm.pc.rt] Boot Problem with RISC/6000

lzm@mace.cc.purdue.edu (Chris McCoy) (10/07/90)

I'm desperately trying to get help to figure out why my RISC/6000 fails
during power-up with an error code of '552'.  The '552' error is not
documented in the diagnostic manuals and my CE is stumped.  Of course,
this is the weekend so the *real* programmers down in Austin are not
available.

Has anyone else seen a system fail with a '552' error code?  I'm pushing
a deadline with this system so I need it up fast.

Configuration:
	32MB RISC/6000 model 320
	1 CD-ROM
	1 150 MB Tape Unit
	1 Internal 320 MB Hdisk
	1 8-port async adapter
	1 ETHERNET adapter
	1 Grayscale Graphics adapter

	AIX update (3001) not applied yet.
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Chris McCoy                         : INTERNET:   mccoy@ecn.purdue.edu
Communication Systems Programmer    : UUCP:       ...!ecn-ee!mccoy
Ag. Computer Network, Purdue Univ.  : VOICE:      (317) 494-8339
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

lzm@mace.cc.purdue.edu (Chris McCoy) (10/08/90)

>I'm desperately trying to get help to figure out why my RISC/6000 fails
>during power-up with an error code of '552'.  The '552' error is not
>documented in the diagnostic manuals and my CE is stumped.  Of course,
>this is the weekend so the *real* programmers down in Austin are not
>available.

Something I forgot to mention:  I was able to boot the diagnostic
diskettes and run the full suite of diagnostic routines.  All routines
reported no errors.  I have not been able to boot the diagnostic 
routines from hdisk0.


jfc@athena.mit.edu (John F Carr) (10/08/90)

In article <5711@mace.cc.purdue.edu> lzm@mace.cc.purdue.edu (Chris McCoy) writes:
>I'm desperately trying to get help to figure out why my RISC/6000 fails
>during power-up with an error code of '552'.  The '552' error is not
>documented in the diagnostic manuals and my CE is stumped. 

/etc/rc.boot4 contains these lines on our system:

showled 0x551
# bring up the root volume group
/etc/ipl_varyon -v
if [ $? != 0 ]
then
        while :
        do
                showled 0x552
        done
fi


It looks like either the volume table on disk or the root filesystem on the
boot image is corrupt.  I have no clue what you can do about this.
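
One way to read the fragment above: 0x551 goes up while the root volume
group is being varied on, and 0x552 goes up only if that step returns
non-zero.  Here is a minimal sketch of that control flow.  The showled and
ipl_varyon functions are hypothetical stand-ins (they only echo and return
a chosen status; the real commands are the ones in rc.boot4), and boot_step
returns on failure where the real script loops forever.

```shell
#!/bin/sh
# Stand-in for the real showled, which drives the front-panel LEDs.
showled() { echo "LED $1"; }

# Stand-in for /etc/ipl_varyon: returns whatever VARYON_RC says (default 0).
ipl_varyon() { return "${VARYON_RC:-0}"; }

# Same shape as the rc.boot4 fragment: show 0x551, try the varyon,
# and show 0x552 only if it fails.
boot_step() {
    showled 0x551
    if ! ipl_varyon -v; then
        showled 0x552
        return 1
    fi
    return 0
}
```

Running `VARYON_RC=1 boot_step` prints both LED values; with the default
status only 0x551 is printed.  A display stuck at 552 therefore points at
the varyon step itself, not at anything later in the boot.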


--
    John Carr (jfc@athena.mit.edu)

lzm@mace.cc.purdue.edu (Chris McCoy) (10/09/90)

In <1990Oct8.154314.10429@athena.mit.edu> jfc@athena.mit.edu (John F Carr) writes:

>In article <5711@mace.cc.purdue.edu> lzm@mace.cc.purdue.edu (Chris McCoy) writes:
>>I'm desperately trying to get help to figure out why my RISC/6000 fails
>>during power-up with an error code of '552'.  The '552' error is not
>>documented in the diagnostic manuals and my CE is stumped. 

>/etc/rc.boot4 contains these lines on our system:

>showled 0x551
># bring up the root volume group
>/etc/ipl_varyon -v
>if [ $? != 0 ]
>then
>        while :
>        do
>                showled 0x552
>        done
>fi

>It looks like either the volume table on disk or the root filesystem on the
>boot image is corrupt.  I have no clue what you can do about this.

Muchas gracias to everyone who responded.  Most people had the correct
answer, or at least were very close.  I even got replies from some
*real* programmers in Austin (:-) ).  In the end, the support line
was where I got the answer.  They're good -- use 'em.

The problem turned out to be that the log logical volume hd8 had been
corrupted, hence the failure of the script above.  I had to boot from
diskette, get a shell, and run /etc/continue and then /etc/aix/logform.
(If you have the same problem, call the support line -- it's painless.)

As John Carr showed (script portion above) the startup scripts actually
display LED values upon failure, so there is another place to search
for answers on a system failure.  Of course, it doesn't help much if you
only have the one system.
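
If you do have a second, working system, a search like the following
sketch will find where a given LED value is displayed.  The function name
find_led is made up for illustration, and which files to search is an
assumption; /etc/rc.boot4 is the only one confirmed in this thread.

```shell
#!/bin/sh
# find_led CODE FILE...
# Print file:line:text for every showled call that displays CODE.
# Matches both "showled 0x552" and "showled 552".
find_led() {
    code=$1
    shift
    # /dev/null is added so grep prints the file name even for one file.
    grep -n -E "showled[[:space:]]+(0x)?${code}([^0-9A-Fa-f]|\$)" "$@" /dev/null
}
```

For example, `find_led 552 /etc/rc.boot4` should point at the line inside
the failure loop quoted above.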
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Chris McCoy                         : INTERNET:   mccoy@acn.purdue.edu
Systems Programmer                  : UUCP:       ...!ecn-ee!mccoy
Ag. Computer Network, Purdue Univ.  : VOICE:      (317) 494-8333
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

peter@dbaccess.com (Peter A. Castro) (10/11/90)

in article <5711@mace.cc.purdue.edu>, lzm@mace.cc.purdue.edu (Chris McCoy) says:
> 
+ I'm desperately trying to get help to figure out why my RISC/6000 fails
+ during power-up with an error code of '552'.  The '552' error is not
+ documented in the diagnostic manuals and my CE is stumped.  Of course,
+ this is the weekend so the *real* programmers down in Austin are not
+ available.
+ 
+ Has anyone else seen a system fail with a '552' error code?  I'm pushing
+ a deadline with this system so I need it up fast.

  Yes, this is an undocumented code.  It means the following:
  "The system has completed execution of tasks listed in the /etc/inittab
  file and is now waiting for something to do."
  We ran into this problem with our 320s.  It usually occurs when the system
  was powered down improperly (e.g., we had a power spike).  It trashed
  the machine while it was writing to the inittab file.  To correct it,
  boot from the BOSBOOT disks and get into the maintenance shell.  Type:
      /etc/continue hdisk0
  This will bring hdisk0 online and mount it over /mnt.
  Change directory to /mnt/etc and cat your inittab file.  It'll look
  either empty or almost empty.  Now you need a copy of your inittab
  so you can reenter it from scratch.  Here is ours for your reference:

: @(#)inittab	1.22  com/cfg/etc,3.1,9021 4/6/90 17:18:07
init:2:initdefault:
brc::sysinit:/etc/brc >/dev/console 2>&1 # Phase 2 of system boot
rc:2:wait:/etc/rc > /dev/console 2>&1  # Multi-User checks
srcmstr:2:respawn:/etc/srcmstr		# System Resource Controller
rctcpip:2:wait:/etc/rc.tcpip > /dev/console 2>&1 # Start TCP/IP daemons
rcnfs:2:wait:/etc/rc.nfs > /dev/console 2>&1 # Start NFS daemons
cons:0123456789:respawn:/etc/getty /dev/console
piobe:2:once:/bin/rm -f /usr/lpd/pio/flags/*  # Clean up printer flags files
cron:2:respawn:/etc/cron
qdaemon:2:once:/bin/startsrc -sqdaemon
writesrv:2:once:/bin/startsrc -swritesrv
rcncs:2:wait:sh /etc/rc.ncs

  There are a number of ways to restore the file, e.g., cat this into the
  file, type it in by hand, ed the file...
  Once you have restored this file, exit the shell, exit the BOSBOOT
  shell, and reboot.  Your system should come up at this point.  You should
  check that other files are not damaged before continuing normal usage.
  Hope this helps.
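
  A rough sanity check for a retyped inittab, as a sketch only.  It
  assumes each entry has the form id:runlevels:action:command and that
  lines starting with ":" are comments, as in the sample above.  The
  function name check_inittab is made up, and whether awk is available in
  the maintenance shell is not something this thread confirms, so it may
  have to be run after the system is back up.

```shell
#!/bin/sh
# check_inittab FILE
# Report lines that do not look like "id:runlevels:action:command" and
# return non-zero if any line is malformed or the file has no entries.
check_inittab() {
    awk -F: '
        /^[ \t]*$/ || /^:/ { next }       # skip blank lines and comments
        NF < 4 || $1 == "" || $3 == "" {  # too few fields, or no id/action
            printf "line %d: malformed: %s\n", NR, $0
            bad++
            next
        }
        { ok++ }
        END {
            printf "%d valid entries, %d malformed\n", ok, bad
            exit ((ok == 0 || bad > 0) ? 1 : 0)
        }
    ' "$1"
}
```

  For example, `check_inittab /mnt/etc/inittab` fails on an empty or
  truncated file.  The command field may be empty (the initdefault entry
  has none), so only the id and action fields are required.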

-- 
Peter A. Castro                   INTERNET: peter@dbaccess.com        // //|
c/o DB/Access Inc.                UUCP: {uunet,mips}!troi!peter      // //||
2900 Gordon Avenue, Suite 101     FAX: (408) 735-0328            \\ // //-||-
Santa Clara, CA 95051-0718        TEL: (408) 735-7545             \// //  ||