[net.news.config] cbosgd comes back from crash

mark@cbosgd.UUCP (Mark Horton) (06/22/85)

cbosgd just came back up from the most horrible crash I've ever seen.

The original cause of the crash was a power failure (the second in a row
between 8 and 9 PM) from which cbosgd could not come back up.

cbosgd was down for 2 1/2 days while DEC worked on the RM05 and RM80
disks and the massbus.  Finally they concluded that several problems
existed, one of which was a bad massbus controller card in the RM05
which could cause random data to be scribbled on any disk or tape
on the massbus.

We got the hardware working at noon on Friday.  During this time we
discovered that anything near an active area on the RM05 or RM80
was trashed, including lots of inode areas and a few superblocks.
Our user filesystems lost only a couple dozen files.  /usr/spool/news and
/usr/lib/news had minor damage.  /tmp was destroyed beyond repair,
but since it was /tmp it didn't matter.  The root filesystem was also
destroyed beyond repair, losing most of /, /dev, and /etc, including
all 3 copies of vmunix in /.  We booted from a backup root partition
on another RM80, which was about 2 months old.  (Remind me to put an
entry in crontab to back it up nightly.)  /usr/spool was also destroyed
beyond reasonable repair, but having no alternative, I repaired it as
best I could.  We lost all of /usr/spool/mail and most of
/usr/spool/uucp.  (This means people lost their incoming unread mail,
and any mail on the way through cbosgd has probably been lost if it
was in our spool queue at the time.  Apologies to the many people that
this will no doubt inconvenience.)

Of course, all this happened during the week that our two system
administrators were at a class learning more about system administration,
so not only were no backups done this week, but nobody who knew where
the most recent backups could be found could be found.

So anyway, I think we're back up, but much was repaired by hand, so
there are probably some things still broken.  Since I will be out of
town the week of the 24th, anything that doesn't get fixed within the
next day may remain broken for a while.  Please tread lightly when
near cbosgd for a while; the eggshells may crunch under your feet.