[comp.unix.questions] Recurring problem in root filesystem

850181f@aucs.uucp (S. Ferguson-Parker) (08/31/90)

We are having a recurring problem with our root file system.  Whenever we
perform an fsck, it always reports a "FREE INODE COUNT WRONG IN SUPERBLK".
If we fix the problem, it goes away only until 

art@pilikia.pegasus.com (Art Neilson) (09/03/90)

In article <1990Aug31.134539.749@aucs.uucp> 850181f@aucs.uucp (S. Ferguson-Parker) writes:
>We are having a recurring problem with our root file system.  Whenever we
>perform an fsck, it always reports a "FREE INODE COUNT WRONG IN SUPERBLK".
>If we fix the problem, it goes away only until 

Try running fsck on the *unmounted* root filesystem, or
use whatever option your fsck has for remounting root after
fsck completes.
-- 
Arthur W. Neilson III		| ARPA: art@pilikia.pegasus.com
Bank of Hawaii Tech Support	| UUCP: uunet!ucsd!nosc!pegasus!pilikia!art

guy@auspex.auspex.com (Guy Harris) (09/04/90)

>Try running fsck on the *unmounted* root filesystem,

Try unmounting the root filesystem first.  Good luck....

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/04/90)

In article <4010@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
> >Try running fsck on the *unmounted* root filesystem,
> Try unmounting the root filesystem first.  Good luck....

Wait a minute. Can't you chroot() to another filesystem, then remount
the original root below the new one? I haven't tested this but it seems
like it should work on an otherwise unused system.

---Dan

cpcahil@virtech.uucp (Conor P. Cahill) (09/04/90)

In article <15590:Sep402:41:0190@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <4010@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>> >Try running fsck on the *unmounted* root filesystem,
>> Try unmounting the root filesystem first.  Good luck....
>
>Wait a minute. Can't you chroot() to another filesystem, then remount
>the original root below the new one? I haven't tested this but it seems
>like it should work on an otherwise unused system.

You can't remount root because you can't unmount root: your chrooted
environment is below the real root file system and therefore keeps
the root file system busy.  chroot() does not make any file system
changes; it only sets the starting point for pathname resolution in
namei() or lookupname() (the kernel routines that resolve pathnames) when
the first character of the file name is '/'.
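
The rule described above can be pictured with a toy model.  This is only a
sketch in POSIX shell, not kernel code: lookup_start is a made-up name, and
its first two arguments stand in for the per-process root and current
directory that the kernel keeps.

```shell
# Toy model (not kernel code): report where lookup of a pathname starts.
# $1 = the process's root directory (what chroot() changes)
# $2 = the process's current directory
# $3 = the pathname being resolved
lookup_start() {
    proc_root=$1
    proc_cwd=$2
    path=$3
    case $path in
        /*) printf '%s\n' "$proc_root" ;;  # absolute: start at the process's root
        *)  printf '%s\n' "$proc_cwd" ;;   # relative: start at the current directory
    esac
}
```

For example, "lookup_start /mnt/alt /mnt/alt/tmp /etc/passwd" prints
/mnt/alt (a hypothetical chroot directory).  Nothing in the model touches
the mount table, which is why the real root stays mounted and busy.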

The problem with fscking a mounted file system is that fsck may find
things that are wrong when they aren't really wrong, and may fix things
that get written over if the kernel syncs before you reboot.
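
One way to picture such a false alarm: fsck compares the free-inode count
recorded in the superblock with the count it gets by scanning the inodes
itself.  If the kernel allocates an inode on the mounted file system after
the on-disk superblock was last written, the two numbers disagree even
though nothing is damaged.  The sketch below is a toy comparison only;
check_free_inodes is a hypothetical name and the numbers are invented.

```shell
# Toy version of the consistency check (not the real fsck logic).
# $1 = free-inode count recorded in the on-disk superblock
# $2 = free-inode count found by scanning the inode list
check_free_inodes() {
    recorded=$1
    counted=$2
    if [ "$recorded" -eq "$counted" ]; then
        echo "free inode count consistent"
    else
        echo "free inode count mismatch: superblock says $recorded, scan found $counted"
        return 1
    fi
}
```

"check_free_inodes 120 119" reports a mismatch and returns non-zero, which
is the situation a mounted root can produce when a single inode is
allocated between the last superblock write and the scan.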

Fscking root is yet another chicken and egg problem (although some
late System V.3s allow root to be fscked and remounted without rebooting
if there is little or no damage to the root fs).  This problem is usually
resolved by one of the following:

	a: boot off another device (different partition or different media)
	   fsck the real root and then reboot

	b: fsck root, reboot without syncing, fsck root again; if there are
	   still problems, reboot without syncing and try, try again

One of the big reasons for continuing to have problems on your root partition
is that it cannot be unmounted due to some other problem in the system, such
as another file system that is not being unmounted (though it too
would have fsck problems) or a process that gets stuck in never-never land
because of a bug in a device driver.

-- 
Conor P. Cahill            (703)430-9247        Virtual Technologies, Inc.,
uunet!virtech!cpcahil                           46030 Manekin Plaza, Suite 160
                                                Sterling, VA 22170 

art@pilikia.pegasus.com (Art Neilson) (09/05/90)

In article <4010@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>>Try running fsck on the *unmounted* root filesystem,
>
>Try unmounting the root filesystem first.  Good luck....

What do you mean, good luck??  I've run fsck on the unmounted hard disk
root filesystem on my system here, just boot with your disaster disk
and away you go ;^).  It's not impossible.  Besides, the point I was making
was that the message he was getting was *probably* due to his running
fsck on the mounted root file system.  The wrong inode count will always
occur because of the scratch file fsck creates.

guy@auspex.auspex.com (Guy Harris) (09/05/90)

>> >Try running fsck on the *unmounted* root filesystem,
>> Try unmounting the root filesystem first.  Good luck....
>
>Wait a minute. Can't you chroot() to another filesystem, then remount
>the original root below the new one?

Well, for one thing, that doesn't unmount the root file system, it
remounts it.

For another, 4.3BSD, System V Release 3, and probably most if not all
other UNIX systems don't let you mount the same local file system twice.
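
Since a second mount of the same local file system is refused, it can help
to check the mount table before trying.  A minimal sketch, assuming the
table has already been reduced to "device mount-point" pairs on standard
input (the real output format of mount differs between systems);
is_mounted is a hypothetical helper and the device names are examples only.

```shell
# Succeed (exit 0) if device $1 appears in the first column of the
# "device mount-point" table read from standard input.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }'
}
```

For instance, "printf '/dev/dsk/0s1 /\n' | is_mounted /dev/dsk/0s1"
succeeds, while asking about a device that is not listed fails.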

det@cimcor.mn.org (Derek Terveer) (09/05/90)

In article <15590:Sep402:41:0190@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> In article <4010@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
> > >Try running fsck on the *unmounted* root filesystem,
> > Try unmounting the root filesystem first.  Good luck....
> 
> Wait a minute. Can't you chroot() to another filesystem, then remount
> the original root below the new one? I haven't tested this but it seems
> like it should work on an otherwise unused system.

Wait a minute.  This shouldn't be necessary at all -- at least in
system V.  Bring the system down into single user and then run fsck on
the Block Device for root (not the char device).  For example:

fsck /dev/dsk/0s1

Since one can't unmount root while one is running from it, it makes
sense that one should be able to fsck it while one is running on
it.  The alternative is to boot from some other media (another disk, a
tape, etc.) and *then* run fsck on either the block or raw root
partition.
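
The block-versus-character distinction can be checked before fsck is
invoked.  A small sketch in POSIX shell; dev_kind is a made-up helper name,
and it only inspects the file type, it does not run fsck.

```shell
# Print "block", "char", or "other" depending on the type of file $1.
dev_kind() {
    if [ -b "$1" ]; then
        echo block    # block special file
    elif [ -c "$1" ]; then
        echo char     # character (raw) special file
    else
        echo other    # anything else, including a missing path
    fi
}
```

On a system laid out like the example above, "dev_kind /dev/dsk/0s1" would
be expected to print "block"; a wrapper could refuse to go on for any other
answer.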

derek
-- 
temporarily:  			derek@cimcor.MN.ORG
as soon as i get my pc back:	det@hawkmoon.MN.ORG

guy@auspex.auspex.com (Guy Harris) (09/06/90)

>What do you mean, good luck??  I've run fsck on the unmounted hard disk
>root filesystem on my system here, just boot with your disaster disk
>and away you go ;^).  It's not impossible.

But it can be a nuisance, especially if you don't have a "disaster
disk".  You might have to, e.g., read your distribution tape into the
"mini-root" and run from there, if you have a system with the
"mini-root" notion.

>Besides, the point I was making was that the message he was getting was
>*probably* due to his running fsck on the mounted root file system.

But if he gets it all the time, it may not exactly be convenient to run
"fsck" from some other root file system every time....

>The wrong inode count will always occur because of the scratch file fsck
>creates.

Assuming he has an "fsck" that creates a scratch file.  I don't *think*
he's running 4.2BSD or later, as their "fsck" doesn't have the message
he describes.  4.1BSD's might, and it might not bother with a scratch
file.  S5R3's doesn't create one by default, it only does so when run
with the "-t" flag.

jeff@onion.pdx.com (Jeff Beadles) (09/06/90)

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

>Wait a minute. Can't you chroot() to another filesystem, then remount
>the original root below the new one? I haven't tested this but it seems
>like it should work on an otherwise unused system.

You can't mount the same physical partition on a local system twice.
(You can via NFS though, but fsck is useless there :-)

Doesn't the SYSV 3.2 fsck do something like remounting the root filesystem?  I
haven't looked at that area of the code, but I recall some strange looking
message when I had non-fatal problems on the root fs that was fixed...

	-Jeff
-- 
Jeff Beadles  jeff@onion.pdx.com  jeff@quark.wv.tek.com

art@pilikia.pegasus.com (Art Neilson) (09/08/90)

In article <4025@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
> [lengthy discussion on ways to fsck root deleted]
>Assuming he has an "fsck" that creates a scratch file.  I don't *think*
>he's running 4.2BSD or later, as their "fsck" doesn't have the message
>he describes.  4.1BSD's might, and it might not bother with a scratch
>file.  S5R3's doesn't create one by default, it only does so when run
>with the "-t" flag.

Points well taken.  I think we need to hear more from the original poster,
like what OS he's running, and why he is running fsck on root in the first
place.  Thanks to trn, I followed the thread back to the original post.
It is from one S. Ferguson-Parker of Acadia University.  Hopefully he will
see this and reply.

bob@wyse.wyse.com (Bob McGowen x4312 dept208) (09/10/90)

In article <1990Sep6.035057.17079@onion.pdx.com> jeff@onion.pdx.com (Jeff Beadles) writes:
>brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>
>>Wait a minute. Can't you chroot() to another filesystem, then remount
----deleted----
>Doesn't the SYSV 3.2 fsck do something like remounting the root filesystem?  I
>haven't looked at that area of the code, but I recall some strange looking
>message when I had non-fatal problems on the root fs that was fixed...

Yes, do:

	fsck -b /dev/dsk/0s1

The filesystem will be repaired and re-mounted.

Bob McGowan  (standard disclaimer, these are my own ...)
Product Support, Wyse Technology, San Jose, CA
..!uunet!wyse!bob
bob@wyse.com

chris@mcc.oz (Chris Robertson) (09/11/90)

On some SysV/386's (e.g., Bell Tech 3.2), the system habitually
came up with minor problems in root, even after a clean shutdown.
I came to the conclusion that the system was simply lying when
it said root had been re-mounted after an fsck!  Also, the system
could occasionally get the free space in root so confused that no
amount of taking-it-down-to-single-user-doing-fsck-and-rebooting
worked;  fsck passed with flying colours in single-user mode, then
the no-free-space problem returned on going multi-user (so definitely
the root re-mount was not working).  The only fix was to take it
to single user, fsck, then quickly power-fail.  *Then* things came
up clean!  Made me shudder each time I did it, though :-)
-- 
"Down in the dumps?  I TOLD you you'd     |    Chris Robertson
 need two sets..."                        |  chris@mcc.oz