[comp.sys.ibm.pc.rt] Hard Disk Errors

clp@beartrk.UUCP (Charlie Pilzer) (03/03/89)

In article <550@scifi.UUCP>, njs@scifi.UUCP (Nicholas J. Simicich) writes:
> In article <2854@stpstn.UUCP> aad@stepstone.com writes:
>>I've found that our AIX machines have the annoying tendency to get
>>corrupted, giving an "error reading iodn 16384" when trying to boot
> 
> iodn 16384 is your root filesystem.  An I/O error on the root
> filesystem would tend to keep your system from starting, certainly an
> annoying thing.

I had a 70 MB disk that began to lose blocks.  It happened to be my /usr
disk and every day I would get various I/O errors on it.  At first I
simply fsck'd the disk, telling the program to ignore the bad blocks.  It 
kept getting worse.  After a long session of RTFM, I learned that there are
two ways to reformat a hard disk drive.  Neither is documented very well,
and quite frankly both were scary to run the first time.  But I clenched my
teeth and did it anyway.  It seems to have fixed the problem: I have had no
errors at all since reformatting.

The procedures I followed were:

1) Dump the filesystems (minidisks) on the affected hard disk.  In my case
this included the VRM, so I also needed to run cvid to create a new
set of installation disks.

2) Run the minidisk command to record the current IODN of each minidisk.

3) Run the hardware diagnostics included in the problem determination guide.

4) Run the fixed disk utilities (the only one available formats the disk).

5) Close your eyes and reformat the disk.

6) Install the VRM if necessary.

7) Recreate the minidisks and do the newfs as needed.  This procedure is
   documented fairly well in the Installation and Customization guide.  It
   is covered under the part of the manual that talks about changing the
   page space size.  It has lots of steps but is pretty straightforward.

8) Restore the filesystems.
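
For what it's worth, the ordering above matters more than any single step:
nothing destructive should happen until the dumps and the IODN list exist.
Here is a rough sketch of that idea as a checklist in Python.  It is not
anything that ships with AIX; the step names and the can_run/next_step
helpers are made up for illustration, and nothing in it touches a disk.

```python
# Hypothetical checklist helper: it only tracks progress and does not
# touch AIX, the VRM, or any disk.
STEPS = [
    "dump filesystems",
    "record minidisk IODNs",
    "run hardware diagnostics",
    "run fixed disk utilities",
    "reformat disk",
    "install VRM if necessary",
    "recreate minidisks and newfs",
    "restore filesystems",
]


def can_run(step, done):
    """Return True only if every step listed before `step` is in `done`."""
    index = STEPS.index(step)  # raises ValueError for an unknown step
    return all(earlier in done for earlier in STEPS[:index])


def next_step(done):
    """Return the first step not yet completed, or None when finished."""
    for step in STEPS:
        if step not in done:
            return step
    return None


if __name__ == "__main__":
    done = {"dump filesystems"}
    print(next_step(done))                 # first step still pending
    print(can_run("reformat disk", done))  # earlier steps are missing
```

With only the dump marked as done, the helper reports the IODN step as next
and refuses to allow the reformat.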

In addition to fixing my own machine (a 6151), I've done something similar
on a reconfiguration of a client's machine.  I would have preferred more
information on the diagnostics' format routine as opposed to the format
option available under the VMF.  When I ran the VMF format command, it said
that the disk had been previously formatted and asked whether I really
wanted to reformat, because I would lose all of my bad block information.
What I didn't know (and still don't) is: a) Does the diagnostic format build
a new bad block table?  b) Does the VMF format blow away the bad block table
as it threatens to do?  c) Why are there two format routines anyway?

Charlie Pilzer
Bear Track Computer Company
uunet!beartrk!clp