pete@Octopus.COM (Pete Holzmann) (10/25/89)
Here's an edited copy of an article I wrote in May 1988, describing why
the use of an RLL controller will not cause physical damage to the
drive. Someone who repaired drives for a living had been claiming that
RLL formatting would cause excessive seek errors...

NOTE: If you find typos or errors in this article, please email your
comments to me. I'll keep this file up to date and repost as needed.

>As the number of
>marginal sectors on the disk increases, so does the number of seeks.
>Unfortunately, this heats up the arm assembly until (on some drives,
>at least) it starts to warp. This causes more seeks, which causes more
>heat, which ... Naturally, heat is most likely to be a problem with
>low end drives, which are more likely to have problems running RLL.
>If the problem is allowed to progress far enough, even reformatting the
>drive for MFM won't help, as the head positioning system won't work
>reliably any more.

If you understood how seeks work on a disk drive, you wouldn't believe
this for a minute. Here goes with more details about how disk drives
work!

First, the simple intuitive argument:

Can too many seeks due to bad sectors or poor formatting cause the
heads and/or positioner to fail due to heat problems? NO! A typical
qualification test for a drive is a MILLION random seeks. This may not
be performed on every drive in the production line, but drives should
be able to seek continuously, just about forever, without ANY seek
errors. Too many seeks in a row should never cause trouble!

Now, let's talk about how things happen on a disk drive:

In our last episode ('RLL Technical Details'), you'll recall that the
beginning of a sector is formatted with a bunch of zero bytes and a
thing called an address mark; this helps the disk-read electronics
synchronize with the data on the disk. Now it is time to draw a more
detailed picture of what the beginning of a sector looks like.
I'll draw it top-to-bottom, with the beginning at the top:

The following info is transcribed from the Maxtor OEM manual. Drive
formatting is standard across the industry; any variations from the
following description are very minor in nature (# sync bytes may
change, # bits for head may be different, etc). Note that for MFM/RLL
(non ESDI/SCSI) drives, the format is completely up to the controller;
the drive knows nothing about formats!

If you are skimming, just remember the major field delineations (ID
FIELD and DATA FIELD). Those are what is really important in the
following discussion.

Field               # bytes Value     Comments

Sync                13      0         For syncing the read-electronics

ID FIELD:
ID Address Mark     2       A1 FE     Clock bit missing from 6th bit of
                                      the A1 in 1,3 RLL format (MFM).
                                      I'm not sure what bit is munged
                                      in 2,7 RLL; the key is that the
                                      first byte does NOT follow the
                                      'rules', the second byte flags
                                      that this info is the ID field.
Cylinder Low        1       xx        Low 8 bits of current cylinder #
Cyl Hi/Head         1       0ccchhhh  Hi 3 bits of cylinder, 4 bit head #
Sector #            1       N         Sector # on this track
CRC/ECC             2-n     xxxx      CRC or ECC bytes (depends on format)
ID Trailer          3       0         So write-head-turnoff-glitch
                                      doesn't mess up the CRC/ECC

GAP:
Sync for Data Fld   13      0         For syncing the read-electronics

DATA FIELD:
Data Address Mark   2       A1 F8     Same as ID AM, but different marker
                                      so we know that data is coming.
Sector Data         N       xxxx      Actual sector data (512 bytes or
                                      whatever)
CRC/ECC             2-n     xxxx      CRC or ECC bytes (depends on fmt)
Data Trailer        3       0         So write-head-turnoff-glitch
                                      doesn't mess up the CRC/ECC

GAP:
Extra inter-sector  15      4E        Takes up room so sectors are
  space                               evenly spaced

OK! Now you've got that table. Here's how seeking, reading and writing
occur at the lowest level:

SEEKING

The controller issues some commands (not discussed in detail here) that
cause the heads to move to the (hopefully) correct track. Once the
drive electronics report that the seek is physically complete, the
controller verifies this by looking for an ID field.
If the ID field shows that we are in the right place, we're done!

READING

The controller reads until it finds the ID field for the sector to be
read, or until N revolutions of the disk have gone by and it times out
with an error. Once the correct ID field is found, the Data field is
read. If there is any problem, including if the data field is not found
right away [we don't want to read the data field for a different
sector!], then we go back to looking for the ID field, retrying N
times. If there are any CRC/ECC errors, we retry.

WRITING

The controller reads until it finds the ID field for the sector to be
written, or until N revolutions of the disk have gone by and it times
out with an error. Once the correct ID field is found, the entire data
field is written. The controller simply waits long enough for most of
the ID-to-Data GAP to go by, then starts to write, beginning with zeros
in the sync field, the data address mark, the data itself, the CRC/ECC
bytes, and the trailing zeros. THE ID FIELD IS NEVER TOUCHED.

FORMATTING

[By the way, this is 'low level' formatting for MSDOS people. The
FORMAT command simply changes data in sectors. It does not write any ID
fields.]

The controller waits until the 'Index' signal is seen, which indicates
that the heads are at a particular spot in the disk rotation (the
'Index', amazingly enough!!!) Then it simply writes an entire track of
data, including all the gaps, ID fields and data fields for all sectors
on the track. It doesn't read or sync up with anything except to verify
that the formatting info was correctly written.

On some high-end drives, an entire surface is specially formatted with
information that helps the drive perform its physical seeks. This
formatting is performed at the factory and is not *supposed* to be
changeable afterwards. If the heads become physically misaligned to any
great extent, this extra surface needs to be reformatted, usually at
the factory. LOW END DRIVES DON'T HAVE THIS 'FEATURE'. It is used to
make seeks faster, by the way.
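
For readers who think better in code, the READING and WRITING steps can
be sketched roughly as follows. This is a toy Python model, not
controller firmware. Every name in it (pack_id, unpack_id, Sector,
read_sector, write_sector) is made up for illustration, the data_ok
flag merely stands in for a passing CRC/ECC check, and the byte layout
assumes the ID field exactly as given in the table (cylinder low byte,
then 0ccchhhh, then sector number).

```python
from dataclasses import dataclass


def pack_id(cylinder, head, sector):
    """Build the three address bytes of an ID field (hypothetical helper).

    Layout per the table: cylinder low byte, then 0ccchhhh, then sector #.
    """
    if not (0 <= cylinder < 2048 and 0 <= head < 16 and 0 <= sector < 256):
        raise ValueError("needs 11-bit cylinder, 4-bit head, 8-bit sector")
    cyl_hi_head = (((cylinder >> 8) & 0x07) << 4) | (head & 0x0F)
    return bytes([cylinder & 0xFF, cyl_hi_head, sector])


def unpack_id(raw):
    """Inverse of pack_id: return (cylinder, head, sector)."""
    cyl_low, cyl_hi_head, sector = raw
    cylinder = (((cyl_hi_head >> 4) & 0x07) << 8) | cyl_low
    return cylinder, cyl_hi_head & 0x0F, sector


@dataclass
class Sector:
    id_bytes: bytes        # written only by the low-level format
    data: bytes            # payload of the data field
    data_ok: bool = True   # assumption: stands in for a good CRC/ECC


def read_sector(track, cylinder, head, sector, max_revolutions=3):
    """Scan ID fields; return the data field that follows the wanted ID."""
    wanted = pack_id(cylinder, head, sector)
    for _ in range(max_revolutions):      # the "N revolutions" limit
        for s in track:                   # one pass models one revolution
            if s.id_bytes == wanted and s.data_ok:
                return s.data
    raise OSError("timed out: ID field not found or data field unreadable")


def write_sector(track, cylinder, head, sector, payload, max_revolutions=3):
    """Find the wanted ID field, then rewrite only the data field."""
    wanted = pack_id(cylinder, head, sector)
    for _ in range(max_revolutions):
        for s in track:
            if s.id_bytes == wanted:
                s.data = payload          # the ID field is never touched
                s.data_ok = True          # fresh data, fresh CRC/ECC
                return
    raise OSError("timed out: ID field not found")
```

In this model a sector whose data field has gone bad (data_ok set to
False) fails every read until write_sector is called on it, after which
it reads fine again; the ID bytes are identical before and after.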

Now for some other info:

BAD SECTOR CAUSES:

- Physical defects on the disk surface. These can be avoided by simply
  not using bad areas on the disk.

- Writing unreadable data (e.g. writing RLL format on a drive that
  isn't accurate enough). This can be fixed by writing good data. If a
  drive has not been physically damaged, and if the electronics are ok,
  then what used to work (physically) will still work now, even if you
  garbaged up the format in the meantime.

- Writing when you shouldn't (e.g. controller firmware gone nutso)

- Over time, thermal effects may cause the heads to shift slightly,
  which can cause new physical defects to be seen, and/or the old ID
  fields to be unreadable. Re-formatting will always correct this
  problem.

- When a system is power-cycled, the power to the drive heads is
  'unbalanced' for a moment. A small amount of magnetic effect is
  transmitted to the drive surface. This is generally not enough to
  actually change the data on the drive, but it weakens the signal that
  will be read next time. The accumulation of this effect over time
  causes good data to become marginal; marginal data to become bad.
  Since this only happens at each on/off cycle of your system, it isn't
  a major factor in general. You can avoid it completely by parking
  your disks [moves heads to unused cylinder] before turning the system
  off. Rewriting the low-level format and data on all cylinders will
  completely eliminate any accumulated problems of this type.

- Murphy. [No matter how completely we may understand a system, someday
  it will probably do something completely inexplicable! Always leave
  room for the twilight zone... ;-)]

CONCLUSIONS

Now that you know this, you'll understand a few things better:

- Except for drives with a separate seek-synchronizing surface as
  described above, there is no particular place on a disk surface that
  is non-formattable. If you lose the format on a disk for some reason,
  re-doing the low-level format will make the disk usable again.

If you can't low-level format a drive, then either:

- you have a problem somewhere else (firmware, cables, jumpers, etc)
  (sure, that's a general statement; sorry!)

- you have a drive with a special-surface that has been wiped out (and
  NOT because you used a controller with a different data format!) It
  is possible that operator and/or firmware error caused the controller
  to write over something that should not be writeable.

- you have a controller and/or BIOS that can't handle a non-formatted
  drive or incorrectly-formatted drive. I've heard rumors about this,
  and have seen examples. A different BIOS and/or controller may be
  able to fix the format for you; the drive would then become usable
  again.

  An example of this: DTK BIOS versions through 1/88 won't let you boot
  up DOS from a floppy if the drive isn't formatted. More recent
  versions of the BIOS have this fixed.

  Someone mentioned that some WD controllers can't talk to a drive that
  has been previously formatted RLL. The same drive is fine after
  reformatting with a non-Western-Digital MFM controller. The WD
  controller must have some kind of firmware bug.

- you have a physically damaged drive. If any sectors on a disk surface
  can be written and read under any circumstances, then that particular
  head and associated electronics are ok.

- Murphy again.

Pete
--
Peter Holzmann, Octopus Enterprises   |(if you're a techie Christian & are
19611 La Mar Ct., Cupertino, CA 95014 |interested in helping w/ the Great
UUCP: {hpda,pyramid}!octopus!pete     |Commission, email dsa-contact@octopus)
DSA office ans mach=408/996-7746;Work (SLP) voice=408/985-7400,FAX=408/985-0859