jim@crom2.uucp (James P. H. Fuller) (04/14/91)
A bad spot has appeared in my /usr2 file system and I'm very unclear
what to do about it. (I'm running ISC release 2.2.) If I try to mount
the offending filesystem,

    # mount /dev/dsk/0s4 /usr2

the disk spins for a while and then mount tells me "free block read
error on Primary AT Hard Disk unit 0 partition 4.... cannot mount
/dev/dsk/0s4." If I try to check/repair the filesystem,

    # fsck /dev/dsk/0s4

then during Phase 5 - Check Free List fsck responds "CAN NOT READ: BLK
106626". If I tell it to continue and salvage the free list it seems to
do so, i.e. fsck goes to completion, tells me the file system was
modified, and gives me back the root # prompt. However, the error is
still there and the file system is still not mountable. If I make a new
filesystem for /usr2 and restore from my backups it will then be
mountable and will run fine -- until the next time I reboot. Then the
same old error reappears and the filesystem isn't mountable.

The thing it *seems* I ought to do is to add information about this bad
disk spot to the list of bad sectors in /etc/partitions and *then* make
a new filesystem, but (HERE'S THE MAIN PROBLEM I'M ASKING ABOUT) the
data in /etc/partitions is in the form of absolute sectors, but what
fsck reports is a block number in a certain file system, and I don't
know how to go from the one to the other.

I tried using ISC's sysadm program to add the bad block info. This
looks promising, since choice 1) under the HARD DISK MANAGEMENT submenu
is

    1) addbadblocks    enter bad sector information

but in fact the prompts you get if you choose addbadblocks don't have
anything to do with blocks -- sysadm wants absolute sector number, and
no backtalk! (Infuriatingly, if you hit ? to get help when it prompts
for the absolute sector number, it responds "This is the absolute
sector number that was printed on the console." although what was
printed to the console was NOT absolute sector number but rather
relative (to a certain f/s) block number.)

In hopes that the bad area might be identified differently if I used
sysadm to check the bad f/s I tried choice 3) on the same menu,
checkhdfsys. It turns out this looks for file systems that are mounted,
unmounts them and checks them. There's no way to tell it to check a f/s
that *isn't* mounted. So it seems that the only info available to me is
what I get by running fsck from the command line.

I read and read in TFM and elsewhere, but not to much purpose. I did
find a sentence in the man pages for fsdb that says "[fsdb] has
conversions to translate block and inode numbers into their
corresponding disk addresses" but the rest of the entry for fsdb is
pretty cryptic (I mean it might as well be written in Hittite, as far
as I'm concerned) and I can't figure out how to get it to do this, or
even if it *can* do the conversion *and report it* to the console,
instead of just doing it internally for its own purposes.

Will some kind and knowledgeable soul please tell me how to calculate
an absolute sector number from a block number, or how to persuade fsdb
to do this?

THANKS VERY MUCH

James Fuller
jim%crom2@nstar.rn.com
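
The arithmetic being asked about can be sketched roughly as follows. This is only an illustration under stated assumptions, not a confirmed procedure for ISC 2.2: it assumes fsck's block number counts filesystem logical blocks from the start of the partition, that the logical block size is a whole multiple of the 512-byte sector, and that the partition's absolute starting sector is known (for example from the partition table). The function name, the 1024-byte block size, and the starting sector 1000 used below are all hypothetical.

```python
SECTOR_SIZE = 512  # bytes per disk sector (assumed)


def block_to_absolute_sectors(block, fs_block_size, partition_start_sector,
                              sector_size=SECTOR_SIZE):
    """Return the absolute sector numbers covered by one filesystem block.

    block                  -- block number relative to the start of the
                              filesystem, as reported by fsck (assumed)
    fs_block_size          -- logical block size in bytes (assumed value)
    partition_start_sector -- absolute sector where the partition begins
                              (hypothetical; read it from the partition table)
    """
    if fs_block_size % sector_size != 0:
        raise ValueError("block size must be a multiple of the sector size")
    sectors_per_block = fs_block_size // sector_size
    first = partition_start_sector + block * sectors_per_block
    # One logical block may span several sectors; any of them could be bad.
    return list(range(first, first + sectors_per_block))


if __name__ == "__main__":
    # Hypothetical numbers: 1024-byte blocks, partition starting at sector 1000.
    print(block_to_absolute_sectors(106626, 1024, 1000))
```

If the block size were 512 bytes instead, the same call would return a single sector, so the result depends entirely on getting those two assumed inputs right.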
john@jwt.UUCP (John Temples) (04/14/91)
In article <1991Apr13.202621.272@crom2.uucp> jim@crom2.uucp (James P. H. Fuller) writes:
>the disk spins for a while and then mount tells me "free block read error on
>Primary AT Hard Disk unit 0 partition 4.... cannot mount /dev/dsk/0s4."

>then during Phase 5 - Check Free List fsck responds "CAN NOT READ: BLK 106626".

If you're getting a physical read error, I would expect the hard disk
driver to be giving you an error message along with mount's or fsck's
message. Whenever I've seen a bad sector pop up, the driver gives you
the physical sector number.

But doesn't ISC do dynamic bad sector mapping? ESIX does; this is a
really nice feature.
--
John W. Temples -- john@jwt.UUCP (uunet!jwt!john)
ed@mtxinu.COM (Ed Gould) (04/15/91)
>then during Phase 5 - Check Free List fsck responds
>"CAN NOT READ: BLK 106626".

If this message appears *without* any indication from the driver that
there was a read error, then the problem is that the super-block thinks
that the filesystem extends past the end of the partition in which it
was created. It's possible to create such a filesystem on some systems
with some versions of mkfs, since creating the filesystem does not
often involve writing into the last several sectors.

It should not be possible to create one of these on a BSD-based
filesystem, since newfs is supposed to attempt to write into the last
sector of the filesystem before doing anything else.
--
Ed Gould                    No longer formally affiliated with,
ed@mtxinu.COM               and certainly not speaking for, mt Xinu.

"I'll fight them as a woman, not a lady. I'll fight them as an engineer."
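
The overrun condition described above comes down to a size comparison, which can be sketched as below. This is a minimal illustration, assuming the block number is counted in logical blocks from the start of the partition and that the partition length in 512-byte sectors is known. The function name, the 1024-byte block size, and the partition length of 200000 sectors are hypothetical values chosen for the example, not figures from the thread.

```python
SECTOR_SIZE = 512  # bytes per disk sector (assumed)


def block_fits_in_partition(block, fs_block_size, partition_sectors,
                            sector_size=SECTOR_SIZE):
    """Return True if the whole of `block` lies inside the partition.

    block             -- filesystem-relative block number (e.g. from fsck)
    fs_block_size     -- logical block size in bytes (assumed value)
    partition_sectors -- partition length in sectors (hypothetical; taken
                         from the partition table)
    """
    if fs_block_size % sector_size != 0:
        raise ValueError("block size must be a multiple of the sector size")
    sectors_per_block = fs_block_size // sector_size
    # Sectors needed to hold blocks 0..block inclusive.
    sectors_needed = (block + 1) * sectors_per_block
    return sectors_needed <= partition_sectors


if __name__ == "__main__":
    # Hypothetical partition of 200000 sectors with 1024-byte blocks.
    if block_fits_in_partition(106626, 1024, 200000):
        print("block is inside the partition")
    else:
        print("block lies past the end of the partition")
```

A False result would point toward a filesystem built larger than its partition rather than a physically bad sector, in which case remaking the filesystem with the correct size would be the thing to try.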