jjt@wm6r.UUCP (John Thornton) (11/25/89)
The file system used here for /usr/spool/news has become corrupted about
every 6 months, an fsck gives:

    /dev/dsk/1s1
    File System:  Volume:

    ** Phase 1 - Check Blocks and Sizes
    ** Phase 2 - Check Pathnames
    ** Phase 3 - Check Connectivity
    ** Phase 4 - Check Reference Counts
    ** Phase 5 - Check Free List
    -3022 BLK(S) MISSING
    BAD FREE LIST
    SALVAGE? y

    ** Phase 6 - Salvage Free List
    5423 files 33234 blocks 13358 free
    *** FILE SYSTEM WAS MODIFIED ***

Unfortunately, fsck isn't able to fix the file system and I end up
remaking the file system and restoring from backups. I am at a loss as to
why fsck shows a large negative number of blocks missing. The file system
is unmounted during this operation. Successive fscks give the same message.

Anyone know of a work around for this?

--
John Thornton
P.O. Box 59
Berkeley, CA 94701
{ucbvax,uunet}!unisoft!wm6r!jjt
unisoft!wm6r!jjt@ucbvax.berkeley.edu
" Maynard) (11/26/89)
In article <514@wm6r.UUCP> jjt@wm6r.UUCP (John Thornton) writes:
>The file system used here for /usr/spool/news has become corrupted about
>every 6 months, an fsck gives:
>    ** Phase 5 - Check Free List
>    -3022 BLK(S) MISSING
>    BAD FREE LIST
>    SALVAGE? y          <== this is a fatal mistake.
>Unfortunately, fsck isn't able to fix the file system and I end up
>remaking the file system and restoring from backups. I am at a loss as to
>why fsck shows a large negative number of blocks missing. The file system
>is unmounted during this operation. Successive fscks give the same message.
>Anyone know of a work around for this?

Time to trot this one out again, I guess...

This is a well-known bug. Your /dev/dsk/1s1 filesystem is large enough to
require a work file (either via -t or by prompt). The free list info is
built during phase 1, and is clobbered somewhere during phases 2-4. Phase 5
uses the now corrupted data to check the free list, and phase 6 uses it
(still corrupted) to rebuild the free list.

The workaround is to fsck any filesystem large enough to require a working
file twice. The first time, reply yes to any prompt until you are told that
the free list is corrupt; at that time, reply no to the "SALVAGE?" query.
If you get an "excessive duplicate blocks" prompt, you must reply yes to the
"CONTINUE?" question in order to get the "SALVAGE?" question; if you do not
allow it to continue, the last changed buffer will not be rewritten.

After fsck terminates when you've told it not to salvage the free list,
then rerun fsck specifying -f; this will rebuild the free list properly.

Note that this procedure is only necessary if a workfile is needed. fsck
will work properly if all info can be contained in memory.

Hope this helps...

--
Jay Maynard, EMT-P, K5ZC, PP-ASEL   | Never ascribe to malice that which can
jay@splut.conmicro.com       (eieio)| adequately be explained by stupidity.
{attctc,bellcore}!texbell!splut!jay +----------------------------------------
"...when hasn't gibberish been legal C?" -- Tom Horsley, tom@ssd.harris.com
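
For what it's worth, the two passes described in the post above can be written down as a small dry-run script. This is only a sketch under stated assumptions: the function name two_pass_fsck and the RUN variable are made up for illustration, and by default the function prints the commands instead of running them. The -f flag is used the way the post describes it; other fsck implementations may give -f a different meaning, so check the local manual page first.

```shell
#!/bin/sh
# Dry-run sketch of the two-pass workaround from the post above.
# ASSUMPTIONS (not from the thread): the function name, the RUN variable,
# and the choice to print commands rather than execute them by default.

two_pass_fsck() {
    dev=$1
    run=${RUN-echo}     # RUN unset: print only. RUN set but empty: really run.

    # Pass 1: full interactive check. Reply yes to the prompts, but reply
    # NO to "BAD FREE LIST / SALVAGE?" so the clobbered data is not used.
    echo "pass 1: reply 'no' at SALVAGE?" >&2
    $run fsck "$dev"

    # Pass 2: rerun with -f, which per the post rebuilds the free list.
    echo "pass 2: let fsck -f rebuild the free list" >&2
    $run fsck -f "$dev"
}
```

Calling `two_pass_fsck /dev/dsk/1s1` prints the two fsck command lines on standard output and the reminders on standard error. Running it with `RUN=` set to the empty string would execute the commands for real, which should only be done on an unmounted file system.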
sjb@dalek.UUCP (Seth J. Bradley) (11/26/89)
In article <514@wm6r.UUCP> jjt@wm6r.UUCP (John Thornton) writes:
>The file system used here for /usr/spool/news has become corrupted about
>every 6 months, an fsck gives:
>
>    /dev/dsk/1s1
>    File System:  Volume:
>
>    ** Phase 1 - Check Blocks and Sizes
>    ** Phase 2 - Check Pathnames
>    ** Phase 3 - Check Connectivity
>    ** Phase 4 - Check Reference Counts
>    ** Phase 5 - Check Free List
>    -3022 BLK(S) MISSING
>    BAD FREE LIST
>    SALVAGE? y
>
>    ** Phase 6 - Salvage Free List
>    5423 files 33234 blocks 13358 free
>    *** FILE SYSTEM WAS MODIFIED ***
>
>Unfortunately, fsck isn't able to fix the file system and I end up
>remaking the file system and restoring from backups. I am at a loss as to
>why fsck shows a large negative number of blocks missing. The file system
>is unmounted during this operation. Successive fscks give the same message.
>
>Anyone know of a work around for this?

The text below is taken from a list of Microport bugs posted over two
years ago. To my knowledge, the fsck bug was never fixed, but there is
a work around.

605     Priority: 1     Release: 2.2    Found in: fsck    Verified
        fsck can give negative numbers and trash file system when
        you have a very large file system
        Work Around: use fsck -f to repair, and then check with fsck

I got asked about the fsck bug I mentioned before. Here's the analysis that
Steve Nuchia (uunet!nuchat!steve) gave me as to the problem:

If fsck needs to use a work file, it gets confused. During phase 1, it
builds a map of all the used sectors in memory, using the work file as a
virtual memory extension. During phase 2, it clobbers that file. During
phase 5, it uses the (now clobbered) file to check the free list, and, if it
finds a discrepancy (which it is likely to do), then it uses the (still
clobbered) file during phase 6 to rebuild the free list.

I don't know what parameters fsck uses to determine that it needs a work
file, but it's easy enough to determine if your file system is big enough to
require one: fsck it with 'fsck -n /dev/dsk/0s2' (or whichever). If it asks
you for a file name, then it's big enough. In that case, I recommend that you
do the following to avoid having the bug trash your file system:

1) Edit /etc/bcheckrc and /etc/mountall and remove the -y flags from the
   fsck commands. This will make fsck ask before it does anything to your
   file system.

2) If you get fsck invoked, answer 'no' to the question "BAD FREE LIST.
   SALVAGE?". This will cause your file system to still be bad, but does
   not use the clobbered map information to rebuild (==clobber) your free
   list.

3) Rerun fsck manually, as 'fsck -f /dev/dsk/0s2' (or whatever). This only
   invokes passes 1 and 5, and 6 if needed, avoiding the corruption from
   pass 2. Allow it to set the 'file system OK' flag.

This procedure will rebuild the file system, and do so cleanly.

--
Seth J. Bradley
UUCP: uunet!{lll-winken|zorch}!dalek!sjb
Internet: lll-winken.llnl.gov!dalek!sjb
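
Step 1 of the procedure (taking -y off the fsck commands in /etc/bcheckrc and /etc/mountall) could be previewed with something like the following before touching the real files. This is a rough sketch, not a tested fix for any particular release: the function name strip_fsck_y is made up, and the thread does not show what those scripts actually contain, so the output needs to be read by hand before it replaces anything.

```shell
#!/bin/sh
# Sketch: print a startup script with -y removed from its fsck command lines.
# ASSUMPTIONS (not from the thread): the function name, and that -y appears
# as a separate, space-delimited flag on the same line as the word "fsck".

strip_fsck_y() {
    # Only lines mentioning fsck are touched; the first " -y " on such a
    # line becomes a single space. A -y at the very end of a line, or one
    # combined with other flag letters, is left alone for manual editing.
    sed '/fsck/s/ -y / /' "$1"
}
```

One cautious way to use it is to write the result to a scratch file, for example `strip_fsck_y /etc/bcheckrc > /tmp/bcheckrc.new`, compare it with the original using diff, and copy it into place only once the change looks right.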