[comp.sys.novell] Primary FAT trashed on NW386.3.1.A

Ullrich_Fischer@mindlink.UUCP (Ullrich Fischer) (03/20/91)

I've had two similar problems on a CDC WREN VI 650 Mb SCSI hard disk on a
NETWARE 386 3.1 A server.

The drive is one of 3 identical drives sharing a Novell DCB controlled SCSI bus
with two mirrored 40 Mb Conner SCSI drives.

The first incident was about a month ago when the server crashed.  It just hung
up.  When I powered it down and brought it up, it complained about the FAT on
VOL31:.  I tried running the VREPAIR NLM but, judging by a few postings here
over the last few weeks, I was not sufficiently persistent.  When VREPAIR
seemed to be getting nowhere, I gave up, restored the volume from backup, and
bought a new volume in case we had to retire this one.  Before restoring from
tape, I re-ran COMPSURF overnight on VOL31: and found no defects.  At this
point, I installed the new volume as VOL33:, which I am using as a temporary
scratch-pad volume with the understanding that anything put there may have
to be dumped on a moment's notice if VOL31: should need to be replaced.

For about a month, this server performed flawlessly (except for the directory
tree problem described in a separate posting).

Tonight, I had to take our 2 NETWARE 2.15 C SFT servers down to change the
network addresses and swap some hardware.  I also had to take down the 386
3.1 A server to test the revised AUTOEXEC.NCF, which specified the new network
addresses.  I DOWNed the 386 3.1 A server from the system console.  The server
went down in the usual way.  I then typed EXIT to get back to DOS, then SERVER
to try to bring the server back up.   It went through the entire NETWARE 386
boot-up process in the normal way except when it got to mounting VOL31: it said
'mirror copies of FAT do not match.  VOL31 is NOT mounted'.  It mounted VOL32:
and VOL33: with no problems after coming up with the above error.  I loaded the
VREPAIR NLM and got about 15000 errors corrected.  In every case the primary
FAT had a zero sequence or link value where the mirror copy had some other
value.  Eventually, I let VREPAIR run without pausing at each error but made it
write a log file.  When it was done, I told it to write the corrected FAT to
the disk.   I tried mounting the volume from the system console as soon as
VREPAIR finished, but got a message to the effect that there was insufficient
RAM to mount the volume.  I then DOWNed the server, powered it off, and did a
cold boot.
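
To make the error pattern above concrete, here is a minimal Python sketch that
compares two FAT copies and reports entries where the primary is zero but the
mirror is not.  It is only an illustration: the list-of-integers layout and
the function name are assumptions, not the NetWare on-disk format and not how
VREPAIR itself works.

```python
def zeroed_primary_entries(primary, mirror):
    """Return (index, mirror_value) pairs where the primary FAT entry is
    zero but the mirror copy holds a non-zero value.

    Both arguments are hypothetical in-memory lists of integers, one per
    FAT entry; this is not the real NetWare FAT structure.
    """
    if len(primary) != len(mirror):
        raise ValueError("FAT copies differ in length")
    return [
        (index, mirror_value)
        for index, (primary_value, mirror_value) in enumerate(zip(primary, mirror))
        if primary_value == 0 and mirror_value != 0
    ]


# Toy data: entries 1 and 3 are zeroed in the primary copy only.
primary_fat = [5, 0, 7, 0, 0]
mirror_fat = [5, 6, 7, 8, 0]

for index, mirror_value in zeroed_primary_entries(primary_fat, mirror_fat):
    print(f"entry {index}: primary=0 mirror={mirror_value}")
```

A mismatch list in which every entry has this shape would suggest the primary
copy was overwritten with zeros rather than corrupted at random.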

When the server came up, it was able to mount VOL31: with no further problems.
I did several warm boots after that in the course of setting up the other
servers with no further problems.  In these cases, I just pressed the RESET
button on the front of the server (an EVEREX 20 MHz 386 with 8 Mb of RAM).

The drive had about 128 Mb free out of its capacity of 650 Mb at the time of
these incidents.  There were no re-directed blocks in the hot-fix area and the
hot-fix figures looked normal before and after this and the previous incident.
There were about 3600 directory entries available on the drive according to
CHKVOL.

I am booting all our servers from floppies with an AUTOEXEC.BAT that runs
SERVER.EXE for the 386 3.1 A server and NET$OS for the 2.15 SFT 286 servers.

Is there something which can trash the primary FAT as part of the DOWN process?
Is it a mistake to use the EXIT and SERVER commands to re-start a DOWNed
386 3.1 A server?   Should I suspect the soundness of this physical drive?

Has anyone had a similar experience?  Any comments will be greatly appreciated.
--

---    Ullrich Fischer  phone (604) 684 9371  Vancouver, BC, Canada --- 
                 Ullrich_Fischer@mindlink.uucp