[comp.periphs] Formatting disk theory

hulsebos@philmds.UUCP (Rob Hulsebos) (08/24/88)

In my experience, most people who need to format a disk do not know how
this should properly be done. Most of the programs I have seen just format
the disk with a 'standard' pattern of 0xE5.

As I once read somewhere, this is the pattern for a single-density disk
(which uses the FM-coding mechanism). Single density is, however, something
used in the past: the very first micro that I worked on had single-density
floppies holding 120K (!) each.

But now, most disks (at least the disks that I use) have the MFM-coding
mechanism, also known as "double density". But the 0xE5 pattern is still
used, which is not correct: double-density disks must be formatted with
the 0xD6B6 pattern.

A different pattern is necessary because data is stored differently. The
0xD6B6 is some kind of worst-case pattern for MFM, so flaws on the disks
can usually be found.
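
For illustration, here is a small C sketch (not taken from any real
format program; the 512-byte sector size is only an assumption) of how
a formatter might fill a sector buffer with such a 16-bit pattern
before writing it out:

	#include <stdio.h>

	#define SECSIZE	512			/* assumed sector size */

	/* Fill a buffer with a repeating 16-bit pattern, high byte first. */
	void fillpat(unsigned char *buf, int len, unsigned short pat)
	{
		int i;

		for (i = 0; i + 1 < len; i += 2) {
			buf[i] = pat >> 8;		/* e.g. 0xD6 */
			buf[i + 1] = pat & 0xff;	/* e.g. 0xB6 */
		}
	}

	int main()
	{
		unsigned char sec[SECSIZE];

		fillpat(sec, SECSIZE, 0xD6B6);	/* MFM pattern from above */
		printf("%02x %02x %02x %02x ...\n",
		    sec[0], sec[1], sec[2], sec[3]);
		return 0;
	}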

QUESTION 1: Who knows more about this subject ? Any papers ?

I have also heard that for VAX/VMS, Ultrix, or BSD there exists a special
disk-format program which runs 40 or more passes over the disk, in an
attempt to find all bad spots on it as well as any marginal sectors. It
seems that if you write the same data repeatedly to the same sector, and
the sector is not 100% OK, after a certain time the sector will appear to
be bad. Probably other algorithms are also used in this program to find
more errors.

QUESTION 2: Is this program public domain ? Has there ever been a 
            proper description of the algorithm used ?

Of course, with current SCSI disks it is fairly easy to work around bad
spots on a disk. But what I like about the 40-pass program is that it can
also detect marginal sectors which are still OK for now. And it is always
better to prevent such problems than to wait for them to appear.

------------------------------------------------------------------------------
R.A. Hulsebos                                       ...!mcvax!philmds!hulsebos
Philips I&E Automation Modules                            phone: +31-40-785723
Building TQ-III-1, room 11
Eindhoven, The Netherlands                                # cc -O disclaimer.c
------------------------------------------------------------------------------

chris@mimsy.UUCP (Chris Torek) (08/25/88)

In article <619@philmds.UUCP> hulsebos@philmds.UUCP (Rob Hulsebos) writes:
>In my experience, most people who need to format a disk do not know how
>this should properly be done.

Probably true.

>As I once read somewhere, [0xE5] is the [worst-case media] pattern for
>... FM-coding [aka `single density, which] ... is, however, something
>used in the past ....  now, most disks ... [use] the MFM-coding
>mechanism, also known as "double density". But the 0xE5 pattern is still
>used, which is not correct: double-density disks must be formatted with
>the 0xD6B6 pattern.

0xEC6D is listed as `media worst case' by the 4BSD Eagle formatter,
which expects MFM coding.  I have no idea whether this really *is* the
media worst case.

>QUESTION 1: Who knows more about this subject ? Any papers ?

Write to disk drive and media manufacturers; also look at standard
EDC/ECC references (Hamming codes).

>I have also heard that for VAX/VMS, Ultrix, or BSD there exists a special
>disk-format program which runs 40 or more passes over the disk, in an
>attempt to find all bad spots on it as well as any marginal sectors.

`up to 48'.

>It seems that if you write the same data repeatedly to the same sector,
>and the sector is not 100% OK, after a certain time the sector will
>appear to be bad.

Rather, if you spend a lot of time on it, your chances of noticing that
it is bad are better.  Using different patterns is also wise, as it is
conceivable that somehow the media might get `permanently set' in some
bit pattern.  It is also a good idea to try to exercise the cabling,
and so forth: better to find faults *before* you put real data on the
disk.
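
Something along these lines (a sketch only; write_sector() and
read_sector() here are hypothetical placeholders for whatever raw
device I/O the system provides, not a real driver interface):

	#include <string.h>

	#define SECSIZE	512		/* assumed sector size */
	#define NPASS	48		/* the `up to 48' passes */

	/* Hypothetical raw-device I/O; returns 0 on success. */
	extern int write_sector(long sec, unsigned char *buf);
	extern int read_sector(long sec, unsigned char *buf);

	/* Return how many of NPASS passes failed to verify this sector. */
	int test_sector(long sec, unsigned short *pat, int npat)
	{
		unsigned char wbuf[SECSIZE], rbuf[SECSIZE];
		int pass, i, bad = 0;

		for (pass = 0; pass < NPASS; pass++) {
			unsigned short p = pat[pass % npat];

			for (i = 0; i + 1 < SECSIZE; i += 2) {
				wbuf[i] = p >> 8;
				wbuf[i + 1] = p & 0xff;
			}
			if (write_sector(sec, wbuf) ||
			    read_sector(sec, rbuf) ||
			    memcmp(wbuf, rbuf, SECSIZE))
				bad++;		/* bad or marginal this pass */
		}
		return bad;
	}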

>QUESTION 2: Is this program public domain?

No; it is under the usual Berkeley distribution clause.  It is also
not marked as being free of AT&T code, although I believe it must be.

>Has there ever been a proper description of the algorithm used ?

I am not sure what a `proper description' is, but it consists of
writing a set of test patterns, with ECC correction limited to a small
number of bits, possibly 0; this is easy on a Vax with Eagles, since
the ECC is done in software.  To make it go fast, format writes one
track at a time.  (Writing one cylinder at a time is not feasible since
the Massbus limits transfers to 64KB.)

The test patterns are:

	/*
	 * Purdue/EE severe burnin patterns.
	 */
	unsigned short ppat[] = {
	0xf00f, 0xec6d, 0031463,0070707,0133333,0155555,0161616,0143434,
	0107070,0016161,0034343,0044444,0022222,0111111,0125252, 052525,
	0125252,0125252,0125252,0125252,0125252,0125252,0125252,0125252,
	#ifndef	SHORTPASS
	0125252,0125252,0125252,0125252,0125252,0125252,0125252,0125252,
	 052525, 052525, 052525, 052525, 052525, 052525, 052525, 052525,
	#endif
	 052525, 052525, 052525, 052525, 052525, 052525, 052525, 052525
	 };
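
(Counting the entries, the table holds 48 patterns, or 32 if SHORTPASS
is defined, which is presumably where the `up to 48' figure comes from.)
As a rough illustration only, not the actual format source, one pass
might replicate a single pattern across a track's worth of data and
write it in one transfer; write_track() is a made-up placeholder:

	#define SECSIZE	512	/* assumed sector size */
	#define NSEC	32	/* assumed sectors/track; 32*512 bytes < 64KB */

	/* Hypothetical: write one track's worth of 16-bit words. */
	extern int write_track(int cyl, int trk, unsigned short *buf, int nw);

	void burnin_pass(int cyl, int trk, unsigned short pat)
	{
		static unsigned short tbuf[NSEC * SECSIZE / 2];
		int i;

		for (i = 0; i < NSEC * SECSIZE / 2; i++)
			tbuf[i] = pat;
		(void) write_track(cyl, trk, tbuf, NSEC * SECSIZE / 2);
	}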

I also have no idea what the worst case media pattern might be for
[2,7] RLL encoding.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

les@chinet.UUCP (Leslie Mikesell) (08/25/88)

In article <619@philmds.UUCP> hulsebos@philmds.UUCP (Rob Hulsebos) writes:
>.. But the 0xE5 pattern is still
>used, which is not correct: double-density disks must be formatted with
>the 0xD6B6 pattern.
>
>A different pattern is necessary because data is stored differently. The
>0xD6B6 is some kind of worst-case pattern for MFM, so flaws on the disks
>can usually be found.
>
The worst-case pattern is only needed for testing the disk for flaws.  The
idea is that a pattern which puts identical-polarity magnetic fields next
to each other, surrounded by opposite-polarity fields, is the worst,
because the identical-polarity bits will try to move apart.  The reason the
pattern differs between FM and MFM is that FM records a clock pulse in
front of every data bit, whereas MFM writes a clock pulse only between two
successive 0 data bits (which is what makes the "double density" of data
possible).
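
To make the clock rule concrete, here is a tiny C sketch (purely
illustrative) that expands data bytes into MFM bit cells, printing a 1
wherever a flux transition would be written:

	#include <stdio.h>

	/*
	 * Expand one byte (MSB first) into MFM cells: two cells per data
	 * bit (clock cell, then data cell).  The data cell gets a
	 * transition when the data bit is 1; the clock cell gets a
	 * transition only when both the previous and the current data
	 * bits are 0.  Returns the last data bit, for chaining bytes.
	 */
	int mfm_encode(unsigned char byte, int prev)
	{
		int i, bit;

		for (i = 7; i >= 0; i--) {
			bit = (byte >> i) & 1;
			putchar(prev == 0 && bit == 0 ? '1' : '0');	/* clock */
			putchar(bit ? '1' : '0');			/* data */
			prev = bit;
		}
		return prev;
	}

	int main()
	{
		int prev = 0;

		printf("MFM cells for 0xD6 0xB6: ");
		prev = mfm_encode(0xd6, prev);
		prev = mfm_encode(0xb6, prev);
		putchar('\n');
		return 0;
	}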

Personally, I have always thought that it would be a good idea to go back
after testing and re-write the sectors with a best-case (alternating
fields) pattern, since that would be less likely to degrade over time.  Of
course, stored data overwrites this pattern anyway, but there are often
partial clusters beyond the end of a file (even partial sectors, if the
operating system does a read-before-write) that still contain the original
formatting pattern and will cause trouble if a disk error occurs.

Les Mikesell