[comp.os.minix] bad blocks on hard disk

jim@turbo.UUCP (09/02/87)

Having just managed to get my hard disk set up and operating 
with the 1.2 release of xt_wini.c, I now find that I get 
non-recoverable errors in the area where I have a couple
of known bad tracks. (Surprise, surprise.) Has anyone
tackled the problem of bad track mapping yet? The tracks
should have been marked as bad during low level format
as I entered the manufacturer's bad track table at the
format program's prompt. What exactly does the Western Digital
controller do with the bad track information?
Hope someone can help me! Thanks in advance.

-- 
 -------------------------------------------------
#      Jim Shaw                                    #
#         (617)-460-8232                           #     
 -------------------------------------------------	
 jim@turbo.ray.com		.....!rayssd!turbo!jim

ast@cs.vu.nl (Andy Tanenbaum) (09/03/87)

In article <196@turbo.RAY.COM> jim@turbo.RAY.COM (Jim Shaw x8232) writes:
>...  I have a couple of known bad tracks ...

I believe that the DOS approach to bad tracks is to keep the last couple
of tracks in reserve, and to have a map that tells which of the regular
tracks are bad.  It then substitutes the reserve tracks (or maybe just
sectors, I am not sure) for the bad ones.

I haven't done anything in this area, but I agree it is important.  If
someone who knows how DOS does it could modify the 1.2 xt_wini.c and
1.2 at_wini.c to handle bad tracks the DOS way (for compatibility) that
would no doubt be useful to many people.  Probably what you have to do
is read in the bad track table at the same time the partition table is
read in, and make a mapping between "virtual" tracks and "real tracks".
If DOS does it by sector rather than track, it becomes more complicated,
because keeping a map of 20,000 or more sectors is expensive. Even a
bit map would be expensive.
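
A minimal C sketch of such a "virtual" to "real" track mapping follows.
It assumes the spare tracks sit at the end of the disk and that the table
is small enough to search linearly. The geometry, the table size, and the
names map_out_track and real_track are illustrative assumptions, not taken
from xt_wini.c or at_wini.c.

```c
#include <assert.h>

#define NR_TRACKS   2460                /* assumed: 615 cylinders x 4 heads */
#define NR_RESERVE  8                   /* assumed: spare tracks at end of disk */
#define NR_USABLE   (NR_TRACKS - NR_RESERVE)
#define NO_TRACK    (-1)

struct remap { int bad; int spare; };   /* one entry per mapped-out track */

static struct remap remap_tab[NR_RESERVE];
static int nr_remapped = 0;

/* Map out track 'bad'; return its spare, or NO_TRACK if no spares remain. */
int map_out_track(int bad)
{
    int i;

    assert(bad >= 0 && bad < NR_USABLE);
    for (i = 0; i < nr_remapped; i++)
        if (remap_tab[i].bad == bad)
            return remap_tab[i].spare;  /* already mapped out */
    if (nr_remapped == NR_RESERVE)
        return NO_TRACK;
    remap_tab[nr_remapped].bad = bad;
    remap_tab[nr_remapped].spare = NR_USABLE + nr_remapped;
    return remap_tab[nr_remapped++].spare;
}

/* Translate a virtual track number to the real track to access. */
int real_track(int virt)
{
    int i;

    for (i = 0; i < nr_remapped; i++)
        if (remap_tab[i].bad == virt)
            return remap_tab[i].spare;
    return virt;                        /* good tracks map to themselves */
}
```

The sketch does not cover the case where a spare track is itself bad, and
a real driver would have to load the table from disk when the partition
table is read.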

What might be reasonable is to keep a bit map with one bit per track, with
the bit 0 if the track is perfect, and 1 if there are 1 or more bad sectors.
Whenever a disk access happens, the driver could check the map, and if
the track was perfect, just go ahead.  If it was bad, it would
have to search an explicit list of bad sectors.  This would only require
about 300 bytes for the map for a 20 MB drive, and would not require
much look up time in 99% of the cases (assuming 99% of the tracks were
ok).
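
The two-level lookup described above could look roughly like this in C.
The geometry (615 cylinders, 4 heads, 17 sectors per track) is an assumed
20 MB layout, chosen because it gives a map of 308 bytes, in line with the
"about 300 bytes" estimate. The names track_map, bad_list, mark_bad and
is_bad and the list size are hypothetical.

```c
#include <assert.h>

#define NR_CYLS     615                 /* assumed geometry of a 20 MB drive */
#define NR_HEADS    4
#define NR_SECTORS  17                  /* sectors per track */
#define NR_TRACKS   (NR_CYLS * NR_HEADS)
#define MAP_BYTES   ((NR_TRACKS + 7) / 8)   /* one bit per track: 308 bytes */
#define MAX_BAD     32                  /* assumed size of the explicit list */

static unsigned char track_map[MAP_BYTES];  /* bit set = track has a bad sector */
static long bad_list[MAX_BAD];              /* absolute numbers of bad sectors */
static int nr_bad = 0;

/* Record an absolute sector number as bad; return -1 if the list is full. */
int mark_bad(long sector)
{
    long track = sector / NR_SECTORS;

    assert(sector >= 0 && track < NR_TRACKS);
    if (nr_bad == MAX_BAD)
        return -1;
    bad_list[nr_bad++] = sector;
    track_map[track / 8] |= (unsigned char)(1 << (track % 8));
    return 0;
}

/* Return 1 if the sector is known bad, else 0. */
int is_bad(long sector)
{
    long track = sector / NR_SECTORS;
    int i;

    assert(sector >= 0 && track < NR_TRACKS);
    if ((track_map[track / 8] & (1 << (track % 8))) == 0)
        return 0;                       /* fast path: track is perfect */
    for (i = 0; i < nr_bad; i++)        /* slow path: search explicit list */
        if (bad_list[i] == sector)
            return 1;
    return 0;
}
```

Only accesses to a track with its bit set pay for the list search; every
other access costs one bit test.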

Volunteers with rotten disks and knowledge of how DOS works are definitely
welcome here.

Andy Tanenbaum (ast@cs.vu.nl)

tsp@killer.UUCP (Tom Poindexter) (09/03/87)

In article <196@turbo.RAY.COM>, jim@turbo.RAY.COM (Jim Shaw x8232) writes:
> Having just managed to get my hard disk set up and operating
> with the 1.2 release of xt_wini.c, I now find that I get
> non-recoverable errors in the area where I have a couple
> of known bad tracks. (Surprise, surprise.) Has anyone
> tackled the problem of bad track mapping yet? The tracks
> 
I encountered a bad block on my AT's hard disk during a compile.  Fortunately,
the compile finished, leaving a .s file.  I renamed the file `block3535.bad'
and chmod'ed 000.  My fingers are still crossed!

Tom Poindexter  ihnp4!killer!tsp

wheels@mks.UUCP (Gerry Wheeler) (09/04/87)

In article <489@ast.cs.vu.nl>, ast@cs.vu.nl (Andy Tanenbaum) writes:
> I believe that the DOS approach to bad tracks is to keep the last couple
> of tracks in reserve, and to have a map that tells which of the regular
> tracks are bad.  It then substitutes the reserve tracks (or maybe just
> sectors, I am not sure) for the bad ones.
> 
> Andy Tanenbaum (ast@cs.vu.nl)

I am NOT a DOS expert, but I do seem to recall reading this somewhere.
I think DOS simply writes a magic number in the FAT entry corresponding
to the bad sector so it will never be used. I don't think any replacement
sectors are allocated anywhere. This would explain how chkdsk can report
the number of bad sectors.

Note, however, that sometimes one can map out bad sectors or tracks at
a lower level. Some disk controllers allow bad sectors to be remapped
at the hardware level during the low-level format process. Once this is
done, the disk appears to be perfect to the software, albeit a little
smaller. The Adaptec 4000 series controllers, for example, can do this.
Nevertheless, a scheme for mapping out bad blocks at the OS level is nice
for those occasions when the disk develops new trouble spots. Saves going
through the low-level format again.

Why not simply have a utility which will append blocks to a file called
BAD_BLOCKS? Just don't try reading the file. I guess all the permissions
could be removed from it.
-- 
     ll  // // ,'/~~\' Gerry Wheeler {decvax,ihnp4,seismo}!watmath!mks!wheels
    /ll/// //l' `\\\   Mortice Kern Systems Inc.         (519) 884-2251
   / l //_// ll\___/   43 Bridgeport Rd. E., Waterloo, ON, Can. N2J 2J4
O_/

bing@galbp.LBP.HARRIS.COM (Bing Bang) (09/04/87)

I have an automated bad sector mapping built into my xt_wini.c, though
unfortunately I did my work before the 1.2 release. I fiddled with the driver
to where it works with my WDX-2 controller without the autoconfig ROMs,
and when the 1.2 came out I had no reason to change over and have to redo
the bad sector mapping (if it ain't broke...). I'll post my driver if anybody
cares.



-- 
Bing H. Bang                         +-------------------+
Harris/Lanier                        |MSDOS: just say no.|
Atlanta GA                           +-------------------+

guardian@laidbak.UUCP (Harry Skelton) (09/04/87)

In article <489@ast.cs.vu.nl> ast@cs.vu.nl () writes:
>In article <196@turbo.RAY.COM> jim@turbo.RAY.COM (Jim Shaw x8232) writes:
>>...  I have a couple of known bad tracks ...
>
>I haven't done anything in this area, but I agree it is important.  If
>someone who knows how DOS does it could modify the 1.2 xt_wini.c and
>1.2 at_wini.c to handle bad tracks the DOS way (for compatibility) that
>would no doubt be useful to many people.  Probably what you have to do
>
>Andy Tanenbaum (ast@cs.vu.nl)

As far as I know, there are two methods of bad track mapping that I have
"bumped" into.  The first is from the HD OEM that created the controller
card.  Some are smart enough to have the bad track map somewhere on disk
that the OS doesn't touch (nice).  The old Datamac hard disks do this and
my current Xebec controller will do this.

The other method is to keep track of bad tracks either in a table much like
the file system table or mark the sectors as 'bad/reserved/whatever' so the
OS does not touch them.  This is done under DOS with the program 'scavenger'.
(PD Utility - dos format is pretty dumb)

I believe that SCO makes a device that is a pointer to the bad track table
and the OS has to check the table before it writes to a sector or something
like that.  

The end result is a table or marker to tell the OS that the sector/track is
bad.  Better a sector as a track can chew up good sectors.

?How bout a badtrack program and a mod to minix to have a bad_sector flag?
/dev/hd (you can tell I don't play with minix) can be a list of sectors
that are good and skipping the bad 'blocks' as needed or setup with
'badtrack'  ??? Ideas ???

NU070156%NDSUVM1.BITNET@wiscvm.wisc.edu (Glen Overby) (09/04/87)

In his message of 3 Sep 87, Andy Tanenbaum <ast@cs.vu.nl> writes:
> In article <196@turbo.RAY.COM> jim@turbo.RAY.COM (Jim Shaw x8232) writes:
> >...  I have a couple of known bad tracks ...
>
> I believe that the DOS approach to bad tracks is to keep the last couple
> of tracks in reserve, and to have a map that tells which of the regular
> tracks are bad.  It then substitutes the reserve tracks (or maybe just
> sectors, I am not sure) for the bad ones.

Some hard disk controllers use this technique to "hide" bad tracks from
DOS, but DOS itself has the capability to lock out bad sectors (actually
allocation units). These sectors are what show up as bad sectors when
running 'chkdsk'. The tracks that the controller maps out will be invisible.

The controller mapping should still be invisible to Minix, but an alternate
scheme of locking would also be desirable. Building a 'bad block file'
is an old kludge of Unix's, but it works.

Glen Overby
Bitnet: nu070156@ndsuvm1
from UUCP: ihnp4!psuvax1!ndsuvm1.bitnet!nu070156

mlinar@poisson.usc.edu (Mitch Mlinar) (09/06/87)

In article <297@mks.UUCP> wheels@mks.UUCP (Gerry Wheeler) writes:
>In article <489@ast.cs.vu.nl>, ast@cs.vu.nl (Andy Tanenbaum) writes:
>> I believe that the DOS approach to bad tracks is to keep the last couple
>> of tracks in reserve, and to have a map that tells which of the regular
>> tracks are bad.  It then substitutes the reserve tracks (or maybe just
>> sectors, I am not sure) for the bad ones.
>> 
>> Andy Tanenbaum (ast@cs.vu.nl)
>
>I am NOT a DOS expert, but I do seem to recall reading this somewhere.
>I think DOS simply writes a magic number in the FAT entry corresponding
>to the bad sector so it will never be used. I don't think any replacement
>sectors are allocated anywhere. This would explain how chkdsk can report
>the number of bad sectors.
>

As a matter of fact, the FAT is the only way that DOS knows about bad
sectors.  A FAT entry (a 12-bit number, i.e. 3 hex digits) can be 000
for an unallocated block or 002 to FEF for normal blocks (block numbers 000
and 001 are never allocated, since the first two FAT entries are reserved).
FAT entries between FF0 and FF7 are reserved, and FF8 through FFF indicate
that this is the last block of the file.

IBM designated the reserved FAT entry FF7 to signify a bad block.
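
The value ranges above translate directly into a small classifier. The C
sketch below also unpacks 12-bit entries from a FAT image, using the
standard FAT12 packing of two entries into three bytes; that packing is
not described in this thread. The function and enum names are hypothetical.

```c
#include <assert.h>

enum fat_kind { FAT_FREE, FAT_RESERVED, FAT_NEXT, FAT_BAD, FAT_EOF };

/* Classify a 12-bit FAT entry according to the ranges given above. */
enum fat_kind fat12_classify(unsigned entry)
{
    entry &= 0xFFF;
    if (entry == 0x000) return FAT_FREE;      /* unallocated block */
    if (entry == 0x001) return FAT_RESERVED;
    if (entry <= 0xFEF) return FAT_NEXT;      /* number of next block in file */
    if (entry <= 0xFF6) return FAT_RESERVED;
    if (entry == 0xFF7) return FAT_BAD;       /* bad block marker */
    return FAT_EOF;                           /* FF8..FFF: last block of file */
}

/* Fetch the entry for 'cluster' from a packed FAT12 image. */
unsigned fat12_get(const unsigned char *fat, unsigned cluster)
{
    unsigned off = cluster + cluster / 2;     /* 1.5 bytes per entry */
    unsigned pair = fat[off] | ((unsigned)fat[off + 1] << 8);

    return (cluster & 1) ? (pair >> 4) : (pair & 0xFFF);
}

/* Count blocks marked bad among data blocks 2 .. nclusters + 1. */
unsigned fat12_count_bad(const unsigned char *fat, unsigned nclusters)
{
    unsigned c, n = 0;

    for (c = 2; c < nclusters + 2; c++)
        if (fat12_classify(fat12_get(fat, c)) == FAT_BAD)
            n++;
    return n;
}
```

A count like the one in fat12_count_bad is presumably how a tool such as
chkdsk can report bad sectors without touching the disk surface.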

Of course, the hard disk controller can do its own "remapping", but this is
only efficient for VERY fast hard drives as there is additional seek time as
well as loss of some of the hard drive space.  (Typically, only 1 or 2 sectors
are bad on a track of 17 512-byte sectors.  Why give up all the remaining usable
space while having tracks in "reserve" that could ALSO be used to hold data?
The only rebuttal to this argument is if the bad sector occurs BEFORE block
# 2 (boot tracks, directory, FAT tables).  Then, having a remap to a higher
track may be useful.  However, if the hard drive is "virgin" and has bad
data in this area, you should return it.  Otherwise, SOME systems allow one
to change the track offset to "bypass" the offending track - by offsetting
the entire system - which is better than losing a whole hard drive, but
worse than a bad block map).

As a last note, the current trend for controllers is to write the disk
information onto cyl 0, head 0, sector 0 (or 1) of the hard drive (reserved
for the controller itself).  If this track cannot be formatted, get a new
hard drive as there is no "remapping" that is done and the controller cannot
save any information.

-Mitch

ast@cs.vu.nl (Andy Tanenbaum) (09/08/87)

In article <1135@laidbak.UUCP> guardian@laidbak.UUCP (Harry Skelton(E)) writes:
>?How bout a badtrack program and a mod to minix to have a bad_sector flag?
>/dev/hd (you can tell I don't play with minix) can be a list of sectors
>that are good and skipping the bad 'blocks' as needed or setup with
>'badtrack'  ??? Ideas ???


Does anyone know if the part program (nee format) that comes with MS-DOS
puts some kind of mark on bad sectors (e.g. bad checksum) to force them
to be bad all the time?

If one could guarantee that all bad sectors always gave errors, 100%
of the time, bad block handling would be straightforward.  

1. The driver just reads and writes normally.

2. If a repeated error occurs, the driver looks up an alternate sector
   in a table.  

This scheme requires 0 overhead in the normal case, but it does require
that bad sectors ALWAYS give an error.  If a bad sector is sometimes
ok, this scheme fails.
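
The two steps above might be sketched in C as follows. The retry count,
the table layout, and every name here (rw_sector, find_alt, demo_raw) are
assumptions for illustration. demo_raw is only a stand-in for the real
controller access, simulating a disk on which sectors 4813 and 9234
always fail.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RETRIES 3                   /* assumed number of attempts */
#define NO_ALT      (-1L)

struct alt { long bad; long spare; };   /* bad sector -> alternate sector */

/* Hypothetical raw transfer routine: returns 0 on success, -1 on error. */
typedef int (*rw_fn)(long sector, void *buf);

/* Look up the alternate for 'sector'; NO_ALT if it has none. */
long find_alt(const struct alt *tab, size_t n, long sector)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (tab[i].bad == sector)
            return tab[i].spare;
    return NO_ALT;
}

/* Step 1: transfer normally.  Step 2: on repeated error, use the table. */
int rw_sector(rw_fn raw, const struct alt *tab, size_t n, long sector, void *buf)
{
    long spare;
    int i;

    for (i = 0; i < MAX_RETRIES; i++)
        if (raw(sector, buf) == 0)
            return 0;                   /* normal case: table never consulted */
    spare = find_alt(tab, n, sector);
    if (spare == NO_ALT)
        return -1;                      /* genuine unrecoverable error */
    return raw(spare, buf);
}

int demo_calls = 0;                     /* counts simulated disk accesses */

/* Simulated disk: sectors 4813 and 9234 fail every time. */
int demo_raw(long sector, void *buf)
{
    (void)buf;
    demo_calls++;
    return (sector == 4813 || sector == 9234) ? -1 : 0;
}
```

A good sector costs exactly one access and no table lookup. A mapped-out
sector costs MAX_RETRIES failed attempts plus one access to the alternate,
which is why the scheme depends on bad sectors failing every time.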

Putting bad blocks in a "bad blocks file" is ok, except if the bad
block is the super block, inode block, etc.

Does anyone feel inspired to write a program that you call with
   badblock /dev/hd1 4813 9234
with the result that a file is created on /dev/hd1 with name 4813 and
whose only blocks are 4813 and 9234 (up to a maximum of 7)?  After 
jamming block numbers into the inode, either the program has to update
the bit map, or tell the user to run fsck.  (A version that runs on
MINIX was posted a month or two ago).
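
As a starting point, the argument handling for such a program might look
like the C sketch below. It only validates the block numbers and fills an
array standing in for the inode's zone numbers; writing the inode and
updating the bit map are not shown. The name parse_bad_blocks, the 16-bit
zone type, and the treatment of block numbers as zone numbers are
assumptions.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_DIRECT 7                     /* maximum of 7 blocks, as stated above */

/*
 * Convert n decimal block numbers into zones[].  Unused slots are set to 0.
 * Returns n on success, or -1 if there are too many arguments or one of
 * them is not a valid block number.
 */
int parse_bad_blocks(int n, char *const args[], unsigned short zones[NR_DIRECT])
{
    int i;

    if (n < 1 || n > NR_DIRECT)
        return -1;
    for (i = 0; i < NR_DIRECT; i++)
        zones[i] = 0;
    for (i = 0; i < n; i++) {
        char *end;
        long v = strtol(args[i], &end, 10);

        if (end == args[i] || *end != '\0' || v <= 0 || v > 65535L)
            return -1;                  /* not a number, or out of range */
        zones[i] = (unsigned short)v;
    }
    return n;
}
```

For the command shown above, main would pass argc - 2 and argv + 2, and the
first block number would double as the file name.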

Andy Tanenbaum

las@apr.UUCP (Larry Shurr) (09/08/87)

In article <489@ast.cs.vu.nl> ast@cs.vu.nl () writes:
>In article <196@turbo.RAY.COM> jim@turbo.RAY.COM (Jim Shaw x8232) writes:
>>...  I have a couple of known bad tracks ...
>
>I believe that the DOS approach to bad tracks is to keep the last couple
>of tracks in reserve, and to have a map that tells which of the regular
>tracks are bad.  It then substitutes the reserve tracks (or maybe just
>sectors, I am not sure) for the bad ones.
>
This approach to bad track handling is generally a function of the controller
in conjunction with the formatting software (low-level format).  In this
scheme, the controller provides a "format-bad-track" function which writes
a bad-track header in place of the regular sector headers which tells the
controller "I'm a bad track - this track remapped to xxxx" where xxxx is the
physical address of another track which takes the place of the bad track.
The allocation scheme for replacement tracks is performed by the format
software although there are generally restrictions imposed by the controller
such as the replacement track must be on the same surface as the bad track.
You can get away with this most times because a bad track is generally not
completely unreadable/writable.
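
The allocation side of this scheme, as done by the format software, could
be sketched in C like this. The geometry, the number of spare cylinders,
and the name alloc_replacement are assumptions. The only rule taken from
the description above is that a replacement track must be on the same
surface (head) as the bad track.

```c
#include <assert.h>

#define NR_CYLS        615              /* assumed geometry */
#define NR_HEADS       4
#define NR_SPARE_CYLS  2                /* assumed: last cylinders held back */
#define FIRST_SPARE    (NR_CYLS - NR_SPARE_CYLS)

/* Number of spare tracks already handed out on each surface. */
static int spares_used[NR_HEADS];

/*
 * Pick a replacement for the bad track at (cyl, head).  Returns the
 * cylinder of the replacement track, which has the same head, or -1 if
 * that surface has no spare tracks left.
 */
int alloc_replacement(int cyl, int head)
{
    assert(cyl >= 0 && cyl < FIRST_SPARE);
    assert(head >= 0 && head < NR_HEADS);
    if (spares_used[head] == NR_SPARE_CYLS)
        return -1;
    return FIRST_SPARE + spares_used[head]++;
}
```

The returned address is what the format software would hand to the
controller's "format-bad-track" function; that controller call is
hardware specific and is not shown here.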

Now before you flame me, there's no way I can guarantee that all controllers
support this, nor can I guarantee that all low-level format software supports
this feature if it is available.  I just know that this is the way it worked
with the controller for which I wrote formatting software for the TI Pro.

Now the way DOS "handles" this is that it doesn't handle bad tracks.  The
FORMAT program handles unreadable sectors by marking the clusters which
contain them as "unavailable" in the FAT.  DOS then handles them by not
using the affected clusters.

regards, Larry
-- 
"The only thing worse than being talked about is not being talked about."
- Oscar Wilde, James Whistler or George Bernard Shaw depending on who you ask
Name: Larry A. Shurr
Addr: cbatt!osu-eddie!apr!las (preferred, alternates: {cbosgd,ihnp4}!cbcp1!las)

bdale@winfree.UUCP (Bdale Garbee) (09/09/87)

Hmmm.  I'm under the impression that trying to allocate some set of
reserved sectors as replacements for bad sectors is not a good thing.  The
scheme I had in mind was simply to create a (preferably hidden) file
somewhere on the drive, and allocate all of the bad blocks to it.  The
way to do this would be to write a program that does a non-destructive
test of each sector on the disk, and allocates any baddies found to the
bad blocks file, along with reporting any relevant stats on the block to
the console.  I'd call the program "scrub" if I were writing it  :-)
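
The scanning half of such a program might be sketched in C as below. It
only reads each block, which is the simplest non-destructive test, and
collects the numbers of the blocks that fail. The block size, the callback
type, and the names scrub and demo_read are assumptions. demo_read stands
in for a real device read and simulates failures on blocks 4813 and 9234.

```c
#include <assert.h>

#define BLOCK_SIZE 1024                 /* assumed block size */

/* Hypothetical block reader: returns 0 on success, -1 on a read error. */
typedef int (*read_fn)(long block, char *buf);

/*
 * Read blocks 0 .. nblocks-1 and store the numbers of unreadable ones in
 * bad[], up to max_bad entries.  Returns the total number of bad blocks
 * found, which may be larger than max_bad.
 */
int scrub(long nblocks, read_fn rd, long bad[], int max_bad)
{
    static char buf[BLOCK_SIZE];
    long b;
    int n = 0;

    for (b = 0; b < nblocks; b++) {
        if (rd(b, buf) == 0)
            continue;                   /* block read fine */
        if (n < max_bad)
            bad[n] = b;
        n++;
    }
    return n;
}

/* Simulated device: blocks 4813 and 9234 are unreadable. */
int demo_read(long block, char *buf)
{
    (void)buf;
    return (block == 4813 || block == 9234) ? -1 : 0;
}
```

The collected numbers are what would then be allocated to the bad blocks
file. A read-only pass will miss blocks that fail only on write.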

Advantage of this is that it works, the filesystem code doesn't need to be
changed (I don't think), it's fairly simple to implement, and you always
have use of all the good sectors on your disk... none need be "reserved".

I'll get around to this someday.  If someone needs it sooner... go for it!
-- 
Bdale Garbee, N3EUA		phone: 303/593-9828 h, 303/590-2868 w
uucp: {bellcore,crash,hp-lsd,ncc,pitt,usafa,vixie}!winfree!bdale
arpa: bdale%winfree.uucp@bellcore.com
fido: sysop of 128/19		packet: n3eua @ k0hoa, Colorado Springs

ESC1111%DDAESA10.BITNET@wiscvm.wisc.edu (N.Head) (09/09/87)

Andy asked does the DOS format program mark bad sectors properly ::
Yes and no. The low level format supplied with the AT diagnostics does
write info to the disk (using a controller format option, I believe) to
ensure that bad blocks return a bad read status continually and reliably.

The FORMAT program supplied with DOS doesn't do this - it only marks the
block as bad in the File allocation table so that DOS never allocates it
to a file. Unfortunately MINIX takes no notice of the FAT :-) !!

Neither of these approaches will do anything sensible about
sectors which go bad during use of the system (short of doing a low level
format).

It would be possible to reformat a track 'on the fly' if a bad sector was found
(save the data that could still be read, use the BIOS format track service on
the offending track, restore the data, moving the data from the bad sector
somewhere else). I know I don't have the time to get into that though ...

All this was from the viewpoint of an AT. Anyone know about PCs ??
Nigel.