[comp.unix.wizards] toasted root file system

mrd@sun.soe.clarkson.edu (Michael DeCorte) (10/07/89)

The discussion about 2 identical file names in the same directory
reminded me of a problem that I had (am having) with one of the sites
of mine.

I modemed in to the computer (SV.3, moto VME147) to update some
software.  In doing so I got a funny error message about the
password file.  I went hmmm.  cat /etc/passwd.  cat reports a stat
error.  (I haven't the faintest idea how I logged in.)  More hmmm's.
ls /etc: that works.  ls -l /etc: 8-10 stat errors on lots of
interesting files.  A few more hmmm's.  df: nope, that reports all
sorts of nice errors.  ls -l /usr: that works; other file systems seem
ok.  Root's toasted.  I considered running fsck at this time but the
computer must have perceived this and decided that it didn't want to
be fsck'ed, so it crashed.  The system hasn't booted since.  My guess
is that a bunch of inodes got munged.
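
For anyone who wants to repeat the "ls works, ls -l doesn't" check
without eyeballing the output, here is a minimal sketch in Python (used
only as convenient notation; nothing like it was on the box in
question).  It lists a directory by name, then tries to stat each
entry and collects the ones that fail.  The name find_unstatable and
the stat_fn parameter are made up for illustration.

```python
import os


def find_unstatable(directory, stat_fn=os.lstat):
    """Return [(name, reason)] for entries that list but will not stat.

    Listing a directory only reads names out of the directory itself,
    which is why a plain listing can succeed while anything that has to
    read each entry's inode reports errors.
    """
    failures = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            stat_fn(path)  # hypothetical hook; defaults to os.lstat
        except OSError as exc:
            failures.append((name, exc.strerror or str(exc)))
    return failures


if __name__ == "__main__":
    for name, reason in find_unstatable("/etc"):
        print("%s: %s" % (name, reason))
```

An empty result on /usr and a non-empty one on /etc would match what
I saw, though it says nothing about why the inodes are unreadable.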

I assume someone flipped power at some time just because, well.... it's
just like a PC, right?

Now if the system had bothered to stay up for a little longer and I
were a proper wizard, I could have run fsdb on it, but to do that I
would have to unmount /, wouldn't I?  Otherwise wouldn't going init 6
sort of ruin my work?  Same goes for all the other interesting ways
that I can think of to fix this.  (Something tells me that running
fsck would have made the disks normal but without enough stuff to boot.)

So what do you do if you mangle root?  

(what are we doing for this computer?  2.5 hour drive down, pop in a
new disk that has an os on it, and bring back the old disk) -sigh


--

Michael DeCorte // H215-546-0497 W386-8164 Fax386-8252 // mrd@clutx.bitnet
2300 Naudain St. "H", Phil, PA 19146 // mrd@sun.soe.clarkson.edu
---------------------------------------------------------------------------
Clarkson Archive Server // commands = help, index, send, path
archive-server@sun.soe.clarkson.edu
archive-server%sun.soe.clarkson.edu@omnigate.bitnet
dumb1!dumb2!dumb3!smart!sun.soe.clarkson.edu!archive-server
---------------------------------------------------------------------------

cpcahil@virtech.UUCP (Conor P. Cahill) (10/07/89)

In article <MRD.89Oct6222225@sun.clarkson.edu>, mrd@sun.soe.clarkson.edu (Michael DeCorte) writes:
> 
> So what do you do if you mangle root?  

The only fix for a mangled root is to boot off your backup root partition, or
if no backup root partition exists, a different device.  The other device
can include floppy, tape, etc.

Once you have booted the other root, you can use fsdb and/or fsck to fix
most of the problems.  If you have real bad problems (like /unix got trashed or
lots of directories got trashed and you don't want to spend 3 days going
through lost+found and trying to figure which file goes where) you can read
in the last root backup that you have.
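
The order of operations above can be written down as a small decision
function.  This is only a sketch in Python of the choices described
here, not a tool that touches any disk; the function name, its
parameters, and the step wording are all made up for illustration.

```python
def plan_root_recovery(has_backup_root, other_boot_devices, badly_trashed):
    """Return the recovery steps, in order, as a list of strings.

    has_backup_root    -- True if a backup root partition exists
    other_boot_devices -- other bootable media, e.g. ["floppy", "tape"]
    badly_trashed      -- True if /unix or many directories are gone
    """
    if has_backup_root:
        steps = ["boot from the backup root partition"]
    elif other_boot_devices:
        steps = ["boot from %s" % other_boot_devices[0]]
    else:
        # No alternative way to boot: nothing in this sketch can help.
        return []
    if badly_trashed:
        steps.append("read in the last root backup")
    else:
        steps.append("run fsdb and/or fsck on the damaged root")
    return steps


if __name__ == "__main__":
    for step in plan_root_recovery(False, ["tape"], badly_trashed=True):
        print(step)
```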

You have to have some alternative method of booting the system, otherwise
how did you get the os on there in the first place.
-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil             703-430-9247    |
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+

bph@buengc.BU.EDU (Blair P. Houghton) (10/08/89)

In article <1244@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>In article <MRD.89Oct6222225@sun.clarkson.edu>, mrd@sun.soe.clarkson.edu (Michael DeCorte) writes:
>> 
>> So what do you do if you mangle root?  
>
>You have to have some alternative method of booting the system, otherwise
>how did you get the os on there in the first place.

Monitor-mode, which comes from a prom?

				--Blair
				  "Front-panel switches?"

john@frog.UUCP (John Woods) (10/11/89)

In article <4469@buengc.BU.EDU>, bph@buengc.BU.EDU (Blair P. Houghton) writes:
> In article <1244@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
> >In article <MRD.89Oct6222225@sun.clarkson.edu>, mrd@sun.soe.clarkson.edu (Michael DeCorte) writes:
> >> So what do you do if you mangle root?  
> >You have to have some alternative method of booting the system, otherwise
> >how did you get the os on there in the first place.
> Monitor-mode, which comes from a prom?
> 				--Blair
> 				  "Front-panel switches?"

Ah, front-panel switches.  Back when I ran a PDP-11, our panic-restore method
was quite simple:  a short assembly language program named "omg" (for Oh My
God), punched on several paper tapes and short enough to toggle in by hand
if need be, which would copy a tape image onto the root partition.  Every
couple of weeks, we'd create a root backup tape with a script named "gmo"
(for obvious reasons).
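
The logic of the pair is nothing more than a raw block copy run in
one direction or the other.  A minimal sketch in Python follows; the
real "omg" was PDP-11 assembly talking to the hardware directly, so
this shows only the idea.  The 512-byte block size, the helper
copy_image, and the path arguments are assumptions for illustration.

```python
def copy_image(src_path, dst_path, block_size=512):
    """Copy src to dst one block at a time; return bytes copied."""
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # end of the source image
                break
            dst.write(block)
            total += len(block)
    return total


def gmo(root_device, tape_device):
    """Backup: root partition -> tape image."""
    return copy_image(root_device, tape_device)


def omg(tape_device, root_device):
    """Panic restore: tape image -> root partition."""
    return copy_image(tape_device, root_device)
```

Note that the restore overwrites everything on the target, which is
the whole point: it needs no working file system to run against.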

On these new-fangled systems without switches, and WORSE YET with complicated
controllers requiring gigabytes of code to engage in long philosophical
arguments with them just to get data shuffled back and forth, such a brute-force
solution might not be so elegant.  But it worked.
-- 
John Woods, Charles River Data Systems, Framingham MA 508-626-1101
...!decvax!frog!john, john@frog.UUCP, ...!mit-eddie!jfw, jfw@eddie.mit.edu