[comp.unix.i386] File system problems with 386/ix

gb7@prism.gatech.EDU (Joe Bradley) (10/13/89)

I've started having problems creating tar archives of certain parts of
my file system. It crashes and dumps core while in the middle of creating
any archive. I'm running 386/ix v2.0.1.

This made me suspicious about the integrity of my file system, so I manually
invoked fsck. Well, it complained about a few things which I let it fix
(see below). However, when I invoke fsck again, it again complains about
the same things that it just supposedly fixed. Does anybody have any ideas?
Shouldn't these problems be fixed after running fsck?


  /dev/root
  File System: root Volume: disk0

  ** Phase 1 - Check Blocks and Sizes
  ** Phase 2 - Check Pathnames
  ** Phase 3 - Check Connectivity
  ** Phase 4 - Check Reference Counts
  FREE INODE COUNT WRONG IN SUPERBLK
  FIX? 
  ** Phase 5 - Check Free List 
  9003 BLK(S) MISSING
  BAD FREE LIST
  SALVAGE? 
  ** Phase 6 - Salvage Free List
  1117 files 22134 blocks 18026 free

  *** ROOT FILE SYSTEM WAS MODIFIED ***

-- 
G.J. (Joe) Bradley, Georgia Tech Research Institute, Atlanta, Georgia, 30332

UUCP:     ...!{allegra,amd,hplabs,ut-ngp}!gatech!prism!gb7
INTERNET: gb7@prism.gatech.edu

cpcahil@virtech.UUCP (Conor P. Cahill) (10/13/89)

In article <2462@hydra.gatech.EDU>, gb7@prism.gatech.EDU (Joe Bradley) writes:
> (see below). However, when I invoke fsck again, it again complains about
> the same things that it just supposedly fixed. Does anybody have any ideas?
> Shouldn't these problems be fixed after running fsck?

That depends upon how you run the fsck.  Fsck should be run on a non-mounted
file system.  Obviously, if you are running on the system the root 
partition will be mounted, but you can run fsck with specific flags which allow
the unmounting and remounting of the root partition.  These are used in the 
/etc/bcheckrc file:

	/etc/fsck -y -s -D -b ${rootfs}

where rootfs=/dev/root.

>   ** Phase 5 - Check Free List 
>   9003 BLK(S) MISSING
>   BAD FREE LIST
>   SALVAGE? 

This will always appear under Interactive (if the file system is mounted),
since they purposely damage the on-disk free list while the file
system is mounted.  This is part of their high-performance disk driver package.

-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	!
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+

brian@hutch.UUCP (Brian R. Eckert) (10/14/89)

In article <2462@hydra.gatech.EDU> gb7@prism.gatech.EDU (Joe Bradley) writes:
>I've started having problems creating tar archives of certain parts of
>my file system. It crashes and dumps core while in the middle of creating
>any archive. I'm running 386/ix v2.0.1.
>
>This made me suspicious about the integrity of my file system, so I manually
>invoked fsck. Well, it complained about a few things which I let it fix
>(see below). However, when I invoke fsck again, it again complains about
>the same things that it just supposedly fixed. Does anybody have any ideas?
>Shouldn't these problems be fixed after running fsck?
>
> [ fsck output deleted ]

I will first ask you a question:  are you running fsck at a multi-user
run level (2 or 3)?

Fsck repairs the filesystem which requires it to wade through the super-block
(the block just after the boot block... it serves as an index to all the parts
of the filesystem:  free list, inode table, etc., hence its name).

The super-block is kept in core and updated in RAM, not on the disk.  When
UNIX is multi-user,  the in-core super-block is periodically written to disk
to keep the two copies in sync.  Therefore,  fsck does its thing repairing the
filesystem and adjusting the super-block on the disk (while a bad super-block
still resides in memory);  shortly after,  UNIX writes the (BAD) in-core
super-block to disk and the super-block is now useless again.
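
To make that sequence concrete, here is a toy Python model of it.  It is only
a sketch: the Disk and Kernel classes, the free-inode numbers and the
remount() step are made up for illustration and are not how any real kernel
or fsck is written.  It just shows the stale in-core copy clobbering the
repair, and the repair sticking once the in-core copy is re-read from disk.

```python
# Toy model (not real kernel code): the kernel keeps its own copy of the
# super-block in memory and periodically writes it back to disk.

class Disk:
    def __init__(self, free_inodes):
        self.superblock = {"free_inodes": free_inodes}

class Kernel:
    def __init__(self, disk):
        self.disk = disk
        # In-core copy, made when the file system is mounted.
        self.incore = dict(disk.superblock)

    def sync(self):
        # Periodic update: the in-core copy overwrites the disk copy.
        self.disk.superblock = dict(self.incore)

    def remount(self):
        # Re-read the super-block from disk, discarding the in-core copy.
        self.incore = dict(self.disk.superblock)

def fsck(disk, true_free_inodes):
    """Repair the on-disk super-block; return True if a fix was needed."""
    if disk.superblock["free_inodes"] == true_free_inodes:
        return False
    disk.superblock["free_inodes"] = true_free_inodes
    return True

TRUE_FREE = 500  # hypothetical correct free-inode count

# Multi-user: fsck fixes the disk, then the periodic sync undoes the fix.
disk = Disk(free_inodes=480)        # wrong count, on disk and in core
kernel = Kernel(disk)
first = fsck(disk, TRUE_FREE)       # problem found and fixed on disk
kernel.sync()                       # stale in-core copy written back
second = fsck(disk, TRUE_FREE)      # same complaint again

# With the in-core copy refreshed after the repair, the fix survives a sync.
disk2 = Disk(free_inodes=480)
kernel2 = Kernel(disk2)
fixed = fsck(disk2, TRUE_FREE)
kernel2.remount()
kernel2.sync()
again = fsck(disk2, TRUE_FREE)      # nothing left to fix

print(first, second, fixed, again)  # True True True False
```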

You should ALWAYS run an fsck in single-user mode (system maintenance mode);
in single-user mode,  no periodic update is done to the disk version of the
super-block.  Note that the root filesystem will automatically be remounted
if the super-block is modified by fsck.  You also should not run fsck on any
mounted filesystem (root being the exception),  as you will invalidate the
in-core super-block if the disk version gets modified.

Some versions of fsck will sync the disk before they go to work (i.e. they tell
UNIX to write the in-core super-block to disk), thus back-to-back fscks report
the same problem (in many cases) if done while the system is multi-user.  In
any event,  you should manually do a sync prior to fsck.  Something like:

	# sync;sync
	# fsck /dev/.......

should be adequate.

As an aside:  not too many versions of UNIX ago,  if fsck modified the
root super-block,  you needed to press reset or power-off / power-on
the system.

dkelly@npiatl.UUCP (Dwight Kelly) (10/14/89)

In article <2462@hydra.gatech.EDU> gb7@prism.gatech.EDU (Joe Bradley) writes:
>
>This made me suspicious about the integrity of my file system, so I manually
>invoked fsck. Well, it complained about a few things which I let it fix
>(see below). However, when I invoke fsck again, it again complains about
>the same things that it just supposedly fixed. Does anybody have any ideas?
>Shouldn't these problems be fixed after running fsck?
>
>
>  FREE INODE COUNT WRONG IN SUPERBLK
>  ** Phase 5 - Check Free List 
>  9003 BLK(S) MISSING
>  BAD FREE LIST


Under 2.0.?, Interactive's fast file system marks almost all inodes as free and
keeps a true map of allocated inodes in memory.  This way a panic or powerdown
will force fsck to rebuild the ENTIRE inode list, producing the very large count
of missing blocks.  This is mentioned in the 2.0 upgrade kit on page 5
of the Release notes.

I quote: 

 The fsck file system check program may sometimes display alarming messages when
 checking a file system that was mounted using FFS at the time of a system failure.
 Note that these messages are no cause for alarm.
 When the FFS mounts a file system and builds its internal bitmap, it deliberately 
 marks the free list to be almost empty, to make sure that fsck will have to
 rebuild it from scratch in the event of a crash. fsck will rebuild
 a perfectly good free list, even though it may complain that thousands
 of blocks are missing, and the FFS can then use this free list to 
 reconstruct its internal bitmap when the file system is mounted again.
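
A rough way to picture what the release notes describe is the toy Python
sketch below.  Everything in it is an assumption made for illustration (the
function name fsck_phase5, the block numbers, and leaving exactly one entry
on the free list); it is not Interactive's code.  It only shows why a
deliberately near-empty free list makes fsck count nearly every free block
as missing, and why the rebuilt list comes out correct anyway.

```python
# Toy model of the behaviour described in the release notes; the names and
# block numbers here are made up for illustration.

def fsck_phase5(all_data_blocks, allocated, free_list):
    """Compare the on-disk free list against what it should be.

    Returns (missing_count, rebuilt_free_list).  A block counts as
    "missing" if it is neither allocated to a file nor on the free list.
    """
    expected_free = set(all_data_blocks) - set(allocated)
    missing = expected_free - set(free_list)
    return len(missing), sorted(expected_free)

all_blocks = range(100, 120)            # 20 data blocks
allocated = {100, 101, 102, 105, 110}   # blocks owned by files
good_free_list = sorted(set(all_blocks) - allocated)  # the 15 free blocks

# At mount time the FFS keeps the real map in memory and leaves an
# almost-empty free list on disk (here: a single entry).
on_disk_free_list = good_free_list[:1]

missing, rebuilt = fsck_phase5(all_blocks, allocated, on_disk_free_list)
print(f"{missing} BLK(S) MISSING")      # 14 BLK(S) MISSING
print(rebuilt == good_free_list)        # True
```

Run against an intact free list, the same check reports zero missing blocks,
which is the result you would expect on a file system that is not mounted.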

Dwight Kelly
Network Publications, Inc.
Director R&D

cpcahil@virtech.UUCP (Conor P. Cahill) (10/16/89)

In article <652@npiatl.UUCP>, dkelly@npiatl.UUCP (Dwight Kelly) writes:
> Under 2.0.?, Interactive's fast file system marks almost all inodes as free
> and keeps a true map of allocated inodes in memory.  This way a panic or
> powerdown will force fsck to rebuild the ENTIRE inode list, giving the
> very large amount of incorrectly free blocks. 


The FFS does not mark inodes as free.  What does happen is that the (block)
free list is purposely damaged so that it appears empty. The messages that
appeared in the original poster's fsck output were probably due to his running
fsck on a mounted file system (or else the file system was really damaged).


-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	!
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+