chris@mimsy.umd.edu (Chris Torek) (02/14/90)
In article <709@shodha.dec.com> alan@shodha.dec.com writes:
>Once you've cleared a Forced Error on a replaced block you
>need to determine if the block was important.

I like to do this before clearing the forced error, myself.  This
gives me a chance to poke around with the (known bad) data before
wiping it out.  Be aware that ncheck (and perhaps icheck as well)
works better if you do this afterward, since it reads large chunks
of data from the disk and does not bother to retry individual sectors
after an error.  (I fixed this in ncheck recently when we acquired a
bad sector on an RZ55.)

>[icheck -b ...]

If the block belongs to a file ... or a set of files (by holding the
inode data itself: there are 4 inodes per sector) or is an indirect
block of some file ...

>you can track down the file name by the inode number with:
>
>	ncheck -i inode-number special
>
>This can be slow,

This can be slow because ncheck is buggy!  I got this bug fixed in
4.3BSD-tahoe years ago, but the fix had not got into Ultrix as of the
second-to-last time I needed ncheck on the DECstations (the last time
was due to the bad block mentioned above; at that point we had either
fixed ncheck or switched versions of Ultrix, because that bug was no
longer present).  The bug is obvious if you have the source, and hard
to spot without it except via the fact that ncheck is slow.

The problem is, in essence, that a loop that should be of the form:

        for (offset = 0; offset < directory_size; offset += entry_size) {
                blknum = offset / blksize;
                blkoff = offset % blksize;
                if (blkoff == 0)        /* at block boundary */
                        read block blknum;
                examine entry at blkoff;
        }

is instead of the form:

        for (offset = 0; offset < directory_size; offset += entry_size) {
                blknum = offset / blksize;
                blkoff = offset % blksize;
                if (blknum == 0)        /* oops */
                        read block blknum;
                examine entry at blkoff;
        }

In an 8K/1K file system, this causes some entries in a directory
whose size is > 8192 bytes to be ignored; in a 4K/512 file system,
it causes some to be ignored in those directories > 4096 bytes long.
In all cases it causes one file-system block read (translating to
one physical read of the raw device) per directory *entry*, instead
of one per directory *block*.

If you do not have Ultrix source, you cannot fix this bug.  (If you
do have source, take a look at readdir().  You will need to add a
line or two and use the lblkno macro to calculate the block number.
The block offset calculation is correct, but needs to be done outside
the if().)

(Actually, you could patch this via adb, if you have sufficient skill
and near-omniscient knowledge :-) .)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu
Path: uunet!mimsy!chris
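
The read counts claimed for the two loop forms in the post can be
checked with a small simulation.  This is only a sketch of the
simplified loops as written there, not the ncheck source: the names
scan_directory, scan_stats, and guard are hypothetical, entries are
assumed to be a fixed size (real directory entries vary in length),
and "reading" a block just records which block is in the buffer.

```c
#include <assert.h>

/* Which test guards the block read inside the scan loop. */
enum guard {
    GUARD_BLKOFF,   /* intended: read when blkoff == 0 */
    GUARD_BLKNUM    /* buggy:    read when blknum == 0 */
};

struct scan_stats {
    long reads;     /* simulated block reads issued */
    long examined;  /* entries examined with the right block loaded */
    long missed;    /* entries examined against a stale block */
};

/*
 * Walk a directory of dir_size bytes made of fixed-size entries
 * (an assumption made for illustration) and count what happens.
 */
struct scan_stats
scan_directory(long dir_size, long blksize, long entry_size, enum guard g)
{
    struct scan_stats st = { 0, 0, 0 };
    long loaded = -1;   /* block currently in the buffer; -1 = none */
    long offset;

    assert(blksize > 0 && entry_size > 0 && blksize % entry_size == 0);

    for (offset = 0; offset < dir_size; offset += entry_size) {
        long blknum = offset / blksize;
        long blkoff = offset % blksize;
        int do_read = (g == GUARD_BLKOFF) ? (blkoff == 0) : (blknum == 0);

        if (do_read) {
            loaded = blknum;    /* stands in for the real block read */
            st.reads++;
        }
        if (loaded == blknum)
            st.examined++;
        else
            st.missed++;        /* entry is effectively ignored */
    }
    return st;
}
```

Under these assumptions, calling scan_directory(16384, 8192, 16,
GUARD_BLKOFF) issues one read per block, while the same call with
GUARD_BLKNUM issues one read per entry in the first block and never
loads the second block, so the entries past 8192 bytes are the ones
that get ignored.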