[comp.os.minix] Badblocks

anderson@vms.macc.wisc.edu (Jess Anderson) (12/30/89)

(1.3 AT version)

I ran diskcheck to find bad blocks and marked them with
badblocks.  In making a mistake, I made a discovery.

If you enter a block number of a block that's already
been marked, you get a message about that block already
being in the bitmap.

However, you *also* get a new file .Bad_xxxxxx with 0
length, which you probably shouldn't get.

==Jess Anderson===Academic Computing Center=====Univ. Wisconsin-Madison=====
| Work: Rm. 2160, 1210 West Dayton St., Madison WI 53706, Ph. 608/263-6988 |
| Home: 2838 Stevens St., 53705, 608/238-4833   Bitnet: anderson@wiscmacc  |
==Internet: anderson@macc.wisc.edu====UUCP:{}!uwvax!macc.wisc.edu!anderson==

eyal@cancol.oz (Eyal Lebedinsky) (05/04/90)

Hi there

Some time ago I reported a problem with badblocks. I am now on 1.5.9 and
the problem is still there. The two functions blk_ok() and set_bit() will
access the first inode bit-map block even if a later one is needed. The
bit offset is converted into a block number and a block offset (see lines
429/430) but later the block number is not used again. The lseek() is done
to the first block in all cases.

-- 
Regards
	Eyal

paula@bcsaic.UUCP (Paul Allen) (05/08/90)

In article <368@cancol.oz> eyal@cancol.oz (Eyal Lebedinsky) writes:
>Some time ago I reported a problem with badblocks. I am now on 1.5.9 and
>the problem is still there. The two functions blk_ok() and set_bit() will
>access the first inode bit-map block even if a later one is needed. The
>bit offset is converted into a block number and a block offset (see lines
>429/430) but later the block number is not used again. The lseek() is done
>to the first block in all cases.

From looking at the 1.5.9 badblocks.c, I think Eyal's right.  When it's
calculating how far into the partition to seek, it computes whole-block
and fractional-block components of the offset and then forgets about the
whole-block component.  I fixed this in the 1.3 version in order to get
my disk working, but either I didn't post my diffs or they didn't get
incorporated into subsequent versions.  :-(  I think I see what's wrong,
but I won't be able to test it for a week or more.  One ought to be able
to test this out on any filesystem with more than 8192 zones.  
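
The offset arithmetic described above can be sketched as follows.  This
is only an illustration, not the badblocks.c code: the function name
bitmap_word_offset, the 1K block size and the 16-bit word size are
assumptions, and the caller is assumed to pass the first block of the
bitmap and a bit number that is already relative to the start of it.

```c
#include <assert.h>

#define BLOCK_SIZE     1024L              /* assumed: 1K blocks */
#define BITS_PER_BLOCK (BLOCK_SIZE * 8)   /* 8192 bits per bitmap block */
#define WORD_BYTES     2L                 /* assumed: 16-bit words */
#define WORD_BITS      (WORD_BYTES * 8)

/* Byte offset, from the start of the partition, of the word that holds
 * bit `bit` of a bitmap whose first block is `map_start`.
 * (Hypothetical helper, for illustration only.) */
long bitmap_word_offset(long map_start, long bit)
{
    long block, offset, word;

    assert(map_start >= 0 && bit >= 0);

    block  = bit / BITS_PER_BLOCK;   /* whole-block component */
    offset = bit % BITS_PER_BLOCK;   /* bit within that block */
    word   = offset / WORD_BITS;     /* word within that block */

    /* The whole-block component has to be part of the seek position. */
    return (map_start + block) * BLOCK_SIZE + word * WORD_BYTES;
}
```

With these assumptions, bit 8191 still lands in the first bitmap block
and bit 8192 is the first bit of the second one.  Note that the word
index here is taken from the within-block offset; if it were computed
from the full bit number instead, it would already cover the whole
blocks and the block term would be counted twice.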

By eyeballing the code, I've generated a patch that I think might fix the 
problem, but I have not tested it.  Somebody please try this out and report 
what happens.  (Actually, the first step is to look real closely at this
patch to verify for yourself that it does what I think it will.)   Just try 
to mark a bad block with a number >8192, and then look at the zone bit-map 
to see if the right bit got marked.  (This patch is relative to 1.5.9,
but the crc for badblocks.c hasn't changed since at least 1.5.3.)
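
For the "look at the zone bit-map" step, a small helper like the one
below may be handy once the bitmap blocks have been read into memory.
It is a sketch under stated assumptions: the names bit_is_set and
mark_bit are made up here, the bit number is relative to the start of
the bitmap, and bit 0 is taken to be the least significant bit of
byte 0, as it would be for 16-bit words stored on a PC.

```c
#include <assert.h>
#include <limits.h>

/* Return 1 if bit `n` is set in the in-memory bitmap `map`, else 0.
 * Assumes bit 0 is the low-order bit of byte 0. */
int bit_is_set(const unsigned char *map, long n)
{
    assert(map != 0 && n >= 0);
    return (map[n / CHAR_BIT] >> (n % CHAR_BIT)) & 1;
}

/* Set bit `n` in the in-memory bitmap `map` (same layout assumption). */
void mark_bit(unsigned char *map, long n)
{
    assert(map != 0 && n >= 0);
    map[n / CHAR_BIT] |= (unsigned char) (1u << (n % CHAR_BIT));
}
```

Under that layout, bit 8200 lives in byte 1025 of the bitmap, which is
in the second 1K block.  A bit that shows up at byte 1 instead would
point to the first-block problem discussed here.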

Paul Allen

------------------------ start of patch --------------------------
*** badblocks.c	Sat Mar  3 23:53:24 1990
--- fixed.badblocks.c	Mon May  7 17:53:05 1990
***************
*** 430,438 ****
    offset = z_num - (block << BIT_MAP_SHIFT);	/* offset */
    words = z_num / INT_BITS;	/* which word */
  
!   blk_offset = (long) (2 + sp->s_imap_block);	/* zone map */
!   blk_offset *= (long) BLOCK_SIZE;	/* of course in block */
!   blk_offset += (long) (words * SIZE_OF_INT);	/* offset */
  
  
    lseek(fd, 0L, SEEK_SET);	/* rewind */
--- 430,439 ----
    offset = z_num - (block << BIT_MAP_SHIFT);	/* offset */
    words = z_num / INT_BITS;	/* which word */
  
!   blk_offset = (long) (2 + sp->s_imap_block);	/* zone map start block */
!   blk_offset += (long) block;	/* add in zone-map block number */
!   blk_offset *= (long) BLOCK_SIZE;	/* convert to bytes */
!   blk_offset += (long) (words * SIZE_OF_INT);	/* add in offset within last block*/
  
  
    lseek(fd, 0L, SEEK_SET);	/* rewind */
***************
*** 472,477 ****
--- 473,479 ----
    words = offset / INT_BITS;	/* which word */
  
    blk_offset = (long) (2 + sp->s_imap_block);
+   blk_offset += (long) block;   /* add in zone-map block number */
    blk_offset *= (long) BLOCK_SIZE;
    blk_offset += (long) (words * SIZE_OF_INT);
 
--------------------------- end ---------------------------------- 
-- 
------------------------------------------------------------------------
Paul L. Allen                       | pallen@atc.boeing.com
Boeing Advanced Technology Center   | ...!uw-beaver!bcsaic!pallen

greg@mobius.Viewlogic.COM (Gregory Larkin) (10/04/90)

Hi all,

Does anyone have any documentation in a file that describes the
diskcheck(1) and badblocks(1) programs in detail?  For some reason,
the user manual shipped with V1.3 does not have these programs in
it.

Thanks a lot,
--
Greg Larkin (ASIC Engineer)
Viewlogic Systems, Inc. (The CAE Company)
293 Boston Post Road West       -------------------------------------
Marlboro, MA 01752              |"We've got captains not courageous,|
508 480 0881 x321               |captains tumbling into madness..." |
E-mail: greg@Viewlogic.COM      |Peter Garrett, Midnight Oil        |
                                -------------------------------------

u31b3hs@cip-s02.informatik.rwth-aachen.de (Michael Haardt) (03/13/91)

>When I first made the partition, I ran "readall /dev/hd6" to
>find the bad blocks.  I then ran "badblocks" to get rid of
>them (I think!).
>[trouble deleted]
>doesn't badblocks stop other programs, like "ar" from writing
>to bad areas by allocating the bad blocks to files?
>
>Is it a problem to mv the .BadXXXX files to a subdirectory
>called "BAD".  Does the mv kill the bad blocks information?
>
>Also, when I run fsck after badblocks, I get weird messages
>about duplicate zones and missing bitmaps?  Is this normal?
 
I had a similar problem a year ago.  I was using version 1.3
and badblocks did not work as it should.  In my opinion,
it does not change the zone-table or something like that.
I am using 1.5.10 at the moment (upgrade from plains) and I
have never seen the 1.5.10 manual.  The 1.3 manual explains the
filesystem in great detail and de(1) is one of the best disk
editors I have ever seen.  It was not very difficult to make the
entries by hand:
 
-   First I created as many empty files as I needed for
    my bad blocks.  Eight blocks will fit in one file.
 
-   Then I copied de(1) and fsck(1) to my ram floppy.  I
    unmounted the hard disk and patched the bitmaps.  Look at
    the manual for the structure of the file system.  After
    making an entry, I used fsck to check that it was ok.
 
-   Finally, I mounted the hard disk and set the permissions
    for the .badXXXX files to 000.
 
I saw a lot of patches for badblocks on the net, but I did
not try them, because I had no wish to use badblocks again.
It should be possible to move(!) the .badXXXX files, because
a move only changes a directory entry, not the file itself (as
long as source and target are on the same filesystem).
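 
The point about moving can be checked with a few lines of C on a
modern POSIX system.  This is a sketch only: the helper name
rename_keeps_inode is made up, and it uses rename(2) and stat(2),
which is an assumption about the environment and not a description
of how mv works on Minix 1.3 or 1.5.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Rename `from` to `to` and report whether the file kept its inode.
 * Returns 1 if device and inode number are unchanged, 0 if they
 * differ, and -1 if stat() or rename() fails. */
int rename_keeps_inode(const char *from, const char *to)
{
    struct stat before, after;

    if (stat(from, &before) != 0)
        return -1;
    if (rename(from, to) != 0)   /* fails across filesystems */
        return -1;
    if (stat(to, &after) != 0)
        return -1;

    return before.st_dev == after.st_dev && before.st_ino == after.st_ino;
}
```

If the inode is the same after the move, the zones it holds (bad
blocks included) are still allocated to it.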
 
My rc executes fsck with auto-repair option for the hard disk,
before it is mounted.  After a regular shutdown, there are no
inconsistencies, but there may be some after a power failure.
fsck seems to be the best way to eliminate them.

Hope this helps & Namaskaar

Michael Haardt (u31b3hs%cip-s01.informatik.rwth-aachen.de@unido.bitnet)