[gnu.utils.bug] Extreme unhappiness caused by gtar on AT&T 3B1 v3.51

dwex%mtgzfs3@mtgzy.att.com (David E Wexelblat) (04/25/91)

Let me preface this by saying that I am pretty sure that the problem
is specific to the 3B1, and not a gtar bug.  But one never knows.

[gtar v1.09, compiled basically as System V, with 3B1 shared libraries]

I was testing out the multi-volume and verification features on
a medium-sized directory prior to using gtar to back up my hard disk.
Like a bozo, I specified the block device for the floppy disk
instead of the raw device.  Gtar went on dumping stuff to the floppy
(or trying to :->) after the red LED went out on the floppy.
Either there was no end-of-file reported by the driver, or gtar
ignored it.

What's the problem, you ask?  After breaking out of gtar, I tried
to cd to the directory where gtar was living and got 

	/u/dwex/gnu: not a directory

Uh oh, I think to myself.  Time to run fsck.  But wait!  Before
I can su I get:

	panic: dup inode

(or something like that).  This is getting better and better, I
say to myself.  Hit the reset button.  Guess what?  Init asks me
what run-level I want!  Yes, you guessed it.  /etc/inittab is
all gone.  No big deal, I think.  Just run fsck from the floppy,
and then make a new inittab.

Drag out the floppy boot disk.  Boot from floppy, and break out onto 
the floppy filesystem disk.  Type "/mnt/etc/fsck" and I get:

	/mnt/etc/fsck: cannot execute

Not good.  Then I 'cat' /mnt/etc/fsck.  Nice file of nroff text.
Oops again.  At this point many people would be in deep trouble.
The normal floppy file system has no fsck on it.  But fortunately
I was smart a while ago, and made my own floppy file system with
fsck on it.  So I run fsck, and there are about 40 dup inodes.
Fortunately fsck tells me what's nuked.  Lots of good stuff
like /etc/inittab, /etc/getty, /etc/iv.  Well, once my
disk was patched, I created a new inittab, using /usr/lib/uucp/uugetty.
Reboot, pull stuff off of floppy, and live happily ever after.
(Theoretically, according to the documents, uugetty won't work
on the console.  But it worked long enough to get my system back.)

Now, I was running as myself, not as root, when I ran gtar. SO HOW
THE HELL DID MY HARD DISK GET NUKED?  I wrote a test program
to try reading to end-of-file on both the raw and block floppy
devices, and both correctly reported end-of-file and quit.  I 
wasn't about to try a test with writes (once was enough, thank
you).  Any pointers (besides "don't use gtar") would be useful.
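
For reference, the read test described above amounts to something like
the following.  This is a minimal sketch in Python, not the original
test program (which is not reproduced in this post); the function name
read_until_eof and the 512-byte default block size are assumptions made
for illustration.  It only reads, so it says nothing about how the
driver behaves when a write runs past the end of the medium.

```python
import os

def read_until_eof(path, block_size=512):
    """Read `path` block by block until read() returns no data.

    Returns (blocks_read, bytes_read).  A driver that never signals
    end-of-file would keep this loop running forever.
    """
    blocks = 0
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:          # an empty read means end-of-file
                break
            blocks += 1
            total += len(chunk)
    finally:
        os.close(fd)
    return blocks, total
```

Pointed at a device node (for example the /dev/fp021 floppy device that
appears later in this post), the function should return once the driver
reports end-of-file, which matches what the read test showed for both
the raw and block devices.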

Another horror story for you: 
A couple of years ago my hard disk grew a bad block smack dab in
the middle of /unix.  This is not fun.  Note that a copy of
/unix will not fit on the floppy file system, and I haven't
figured out a way to read a cpio archive while maintaining 
access to the floppy file system.  So here's what we did (a
good friend worked through this disaster with me -- this is when
I discovered that cpio just doesn't bother to write anything
when it can't read a disk block :-<):

	1) Go to another 3B1
	2) dd if=/unix of=/tmp/unixa count=200
	3) dd if=/unix of=/tmp/unixb count=200 skip=200
	4) repeat for the rest of /unix
	5) mount /dev/fp021 /mnt (floppy file system disk)
	6) cp /tmp/unixa /mnt
	7) dismount -f
	8) boot my dead 3B1 from floppy
	9) cp /unixa /mnt/unixa (copy from floppy file system to hard disk)
	10) repeat 5-9 for the other parts
	11) mv /mnt/unix /mnt/unix.fubar
	12) cat /mnt/unix? > /mnt/unix
	13) boot off hard disk
	14) make backup
	15) format hard disk
	16) restore foundation set
	17) discover cpio brain-damage
	18) dd each piece of corrupted cpio archive to /tmp
	19) use adb to patch each piece (basically, just fix the
	    length in the header)
	20) dd the files back out to floppy
	21) restore backup
	22) have several beers :->
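
The split-and-reassemble part of that procedure (steps 2-4 and 12) can
be sketched as below.  This is an illustration in Python, not what was
actually run; the names split_file and join_pieces are hypothetical.
It assumes dd's default 512-byte block size, since the dd commands
above give no bs=, and single-letter suffixes, since the "unix?" glob
in step 12 only matches one trailing character.

```python
import os
import string

BLOCK_SIZE = 512  # dd's default block size when bs= is not given

def split_file(src, out_dir, prefix="unix", blocks_per_piece=200):
    """Split `src` into pieces of blocks_per_piece * 512 bytes.

    Pieces are named prefix + 'a', prefix + 'b', ... like /tmp/unixa
    and /tmp/unixb in steps 2-4.  Returns the list of piece paths.
    """
    piece_bytes = blocks_per_piece * BLOCK_SIZE
    pieces = []
    with open(src, "rb") as f:
        for suffix in string.ascii_lowercase:
            data = f.read(piece_bytes)
            if not data:
                break
            path = os.path.join(out_dir, prefix + suffix)
            with open(path, "wb") as out:
                out.write(data)
            pieces.append(path)
        else:
            # All 26 suffixes used; refuse to drop any leftover data.
            if f.read(1):
                raise ValueError("file needs more than 26 pieces")
    return pieces

def join_pieces(pieces, dest):
    """Concatenate pieces in sorted name order, like `cat unix? > unix`."""
    with open(dest, "wb") as out:
        for path in sorted(pieces):
            with open(path, "rb") as f:
                out.write(f.read())
```

With the defaults each piece is 200 * 512 = 102400 bytes, so each one
can be carried across on the floppy file system by itself.  Joining in
sorted order is what makes the reassembly safe: the shell expands
"unix?" in name order, so unixa, unixb, ... come back in the order
they were cut.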

All of this took about 10 hours.  This was about 2 weeks before
the first version of afio was posted to the net (at least the
first one I ever saw).  So now in / on my hard disk is /unix.bk.Z,
and on my floppy file system (in addition to fsck) are afio and
uncompress.  Fool me once, shame on you.  Fool me twice, shame on me.
(I'm not sure if this all deserves a :->, a :-<, or a !@#$%)

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
David Wexelblat             | dwex@mtgzz.att.com    | I asked her her name.
AT&T Bell Laboratories      | ...!att!mtgzz!dwex    |   She said her name was
200 Laurel Ave - 4B-421     |                       |      'Maybe'
Middletown, NJ  07748       | (201) 957-5871        | --Damn Yankees