[comp.sys.misc] RLL: an intuitive

chris@mimsy.UUCP (Chris Torek) (05/16/88)
[NB: followups redirected to comp.periphs.  It seems to me the only
rational place for this to continue, if anywhere.]

In article <218@octopus.UUCP> pete@octopus.UUCP (Pete Holzmann) writes
a nice long article describing 2,7 RLL encoding.  (Thanks, by the way,
although what I really wanted to see was that table you did not reproduce
:-) .)  Now let me try for the intuitive description, with some nitpicky
technical stuff too.  First the technical nits:

>I. How is data stored on a disk drive?
>As magnetic flux reversals (think of it as + to -). The POLARITY of the
>magnetic flux doesn't mean a thing. It is the TIMING of the flux reversals
>that is used to encode data.

Sort of.  + to - or - to + is not important, but it is not timing, but
rather presence, of reversals that encodes data.  The simplest encoding
---the old `IBM format' or `FM' or how you got a whole 128K :-) on an
eight-inch floppy---is simply clock+data, clock+data, ... where a data
`1' is a reversal and a data `0' is a missing reversal.

>II. What is RLL? What does the '2,7' in '2,7 RLL' mean?
>RLL means Run Length Limited. The [numbers] refer to the
>minimum and maximum time between flux reversals.

(Remember this for the intuitive explanation.)

>RLL codes are 'self clocking'.

All common drive formats are self-clocking: that just means the timing
is stored in among the data somehow, rather than on something
external.  An example of a non-self-clocking medium might be a paper
tape (although paper tape has sprocket holes that can be used for
clocking---this is how the `pull the tape through by hand' readers
work---most paper tape readers relied on a constant rate of tape
motion.)

>... 2,7 RLL is a variable length code (e.g. 0011 maps to 00001000 but
>010 maps to 100100); I don't have a simple formula for the 2,7 RLL code!

This is not variable length: four bits went to eight, and three to six.
In other words, n bits becomes 2n bits.  I think someone just squashed
equivalent table entries to make it look variable length.

>P.S.: If you read all the way to here, congratulations! I don't really expect
>that this stuff would really be interesting enough for people to read through
>250 lines of gobbledy gook... :-)

(You might be surprised.)

Anyway, time for intuition.  None of the following is strictly
accurate, but it should give you a good feel for how these things
work.

When you stick flux changes (data and/or clock) onto a disk surface,
they tend to `wiggle around', somewhat like paper dots dropped from a
bit too high up, in a slight breeze.  The better the drive, the more
accurately the dots land where they were supposed to go, but they
always wind up slightly out of place.

Unlike the paper dots, though, those darned flux changes wiggle *more*
when you put them closer together.  If you put too many of them in a
row, they crawl right out of where they were supposed to be:

    you wanted 1111:
    < wiggle >< wiggle >< wiggle >< wiggle >
	0         1         2         3

    but you got 11?0:
    < wiggle ><   wiggl><     wig><e      w>ggle
	0         1         2         3

The darned things are scared of each other!  We have to make sure the
closest we ever put them is one wiggle-space apart, or they will get
scared and wiggle away.  We will represent this minimum wiggle-space
with | marks below.  (Note that the wiggles can see through the marks
and will get scared if they are less than two | marks apart.  This
should make sense as soon as you get to the MFM diagram.)

On the other hand, if you put them too far apart, the controller gets
forgetful, as if it were counting sheep and the sheep stopped jumping:

    you wanted 10000001:
    | w |   |   |   |   |   |   | w |
      0   1   2   3   4   5   6   7

    but you got 1000..??oops
    | w |   |   |   |   |   |   | w |
      0   1   2   3  ... now where was I?

So the idea is that we have to put the wiggles close enough together so
that the controller does not forget to keep counting, but far enough
apart so that they do not scare each other away.  (In engineer-ese, the
flux reversals must be far enough apart not to interfere, but close
enough together to keep the decoding circuit in sync.)

Well, one way to do this is to use good ol' FM format, and put the
clock and data wiggles fairly far apart.  We always get clock+data, and
clock is always 1, so we get wiggle+blank or wiggle+wiggle.  If we got
a whole series of wiggle+blanks, each wiggle would be two spaces apart,
and if we got a whole series of wiggle+wiggles, they would be one space
apart.  One space has to be `far enough' and two has to be `close
enough'.  This is easy to arrange, but it only gives us a measly 128K
on a whole eight-inch disk.

So what to do?  Well, how about Modified FM?  We will put the clock
wiggle in only if we do not have enough data wiggles to keep the
controller counting.  If we have two data wiggles next to each other,
we can put them one space apart, because there will not be any clock
wiggle to scare them away.  Write it this way, with a `.' marking
half-spaces.  The clock wiggle is the one on the left of its dot.

	you wanted 11001011:
	| .w| .w| . |w. | .w| . | .w| .w|
	  0   1   2   3   4   5   6   7

The wiggles are still not too close---always at least one space apart,
and sometimes one and a half spaces---yet still not *too* far apart.

But maybe we can do better.  And with 2,7 RLL, we do!  Instead of
putting in a wiggle for every `1' data bit, and a special clock wiggle
if we need one, suppose we make up some tables giving whole bunches of
arrangements where we wiggles are at least three dots apart, but at
most eight.  Using Pete's two examples:

>e.g. 0011 maps to 00001000 but 010 maps to 100100

	you want 0011 010 010, so write 00001000 100100 100100:
	| . . | .w. | . .w| . .w| . .w| . .w| . .

The wiggles are still always at least one space (three dots) apart
(we never get something like | . .w| .w. | where the two are only
two-thirds of a minimum wiggle-space apart), and they are never more
than three (really eight-ninths) spaces apart.  Or again in EE-speak,
the minimum flux separation is still one unit, and the maximum 8/9
units, so they will not interfere and the electronics will stay in
sync.

Why, then, might some drives have trouble with 2,7 RLL encoding?
Well, compare an MFM picture with a 2,7 RLL picture:

	MFM: |  . w|  . w|  .  |w .  |  . w|  .  |  . w|
	RLL: | . . | .w. | . .w| . .w| . .w| . . |w. . |

In MFM we only have to be able to tell whether a wiggle (flux change)
is to the left or the right of a whole box, but in RLL we have to tell
whether it is in the first, second, or last third of the box.  If the
if the wiggles are wigglier than usual, or if the controller gets
sloppy, one of the wiggles might wiggle in the middle instead of on the
right.  When that happens, you get a read error.

In other words, Pete is exactly right.  What a 2,7 RLL encoding demands
is not that the wiggles (flux changes) be put closer together, but
rather that their position within each wiggle-space be pinpointed more
accurately.  If you try a drive not rated for this, it might not work:
Some drives just have loose wiggles.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris