karl@sugar.hackercorp.com (Karl Lehenbauer) (03/16/89)
In the Roomers column of the March 1989 issue of Amazing Computing, the
Bandito says that certain game companies are getting 1.5 MB unformatted
on standard 880K floppies.

Can anyone verify if this is true and if so, explain how it works?
-- 
uunet!sugar!karl  | "Time is an illusion.  Lunchtime doubly so."
                  |    -- Ford Prefect
Usenet BBS (713) 438-5018
cmcmanis%pepper@Sun.COM (Chuck McManis) (03/17/89)
In article <3631@sugar.hackercorp.com> (Karl Lehenbauer) writes:
>In the Roomers column of the March 1989 issue of Amazing Computing, the
>Bandito says that certain game companies are getting 1.5 MB unformatted
>on standard 880K floppies.
>Can anyone verify if this is true and if so, explain how it works?

If we consider that the "traditional" unformatted capacity of the standard
3.5" disk is 1 MB, it would seem they are getting an exact 50% increase in
space.  That would suggest they have rewritten trackdisk.device to use RLL
encoding rather than MFM encoding.  This is entirely possible;
unfortunately, I can't think of any way offhand that you could decode the
RLL bits with the blitter.  This may be good or bad depending on how you
look at it: good because the decoded buffer doesn't have to be in chip RAM,
bad because the CPU is doing all of the decoding, and that could impact
your floppy transfer rate.  Check out the Western Digital Disk Controller
Handbook for a description of how the bits lie in RLL mode.

--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis   ARPAnet: cmcmanis@sun.com
These opinions are my own and no one else's, but you knew that, didn't you?
arnet@hpirs.HP.COM (Arne Thormodsen) (03/17/89)
I don't know if it will help or not, but Vol. 1 #6 of Amazing has an
article called "Fun with the Amiga Disc Controller" that might offer some
clues.  I've never tested any of this stuff myself; it looks like pretty
heavy-duty hacking, but....

Arne Thormodsen
hplabs!hpda!arnet
w-colinp@microsoft.UUCP (Colin Plumb) (03/23/89)
cmcmanis@sun.UUCP (Chuck McManis) wrote:
> In article <3631@sugar.hackercorp.com> (Karl Lehenbauer) writes:
>> In the Roomers column of the March 1989 issue of Amazing Computing, the
>> Bandito says that certain game companies are getting 1.5 MB unformatted
>> on standard 880K floppies.
>> Can anyone verify if this is true and if so, explain how it works?
>
> [They seem to use RLL.]

RLL isn't quite the answer.  It's doable, but it's more interesting than
that...

On a disk, bits are encoded as magnetic domains.  If you could see the
north poles of the bits of oxide along a track, they'd be pointing
alternately clockwise and counterclockwise.  A spot where the direction of
magnetisation changes is a 1; no change in magnetic field is a 0.  When
the read head passes over, it detects the 1's as current pulses.  0's have
to be figured out by timing.  Since the standard bit time for 3.5"
(really, 90mm) floppies is 2us, if two pulses are detected 2us apart, that
was 2 consecutive 1's.  If the gap was 4us, that was 101, etc.

If the gap is too long, the controller can miscount and get a wrong number
of 0's.  If the gap is too short, there's a really tiny section on the
disk that's pointing the opposite way to its neighbours.  It can get
swamped and vanish.  Very bad.  At 300 rpm and on the inner tracks (track
79), the minimum time between 1's works out to 4us.  On the outer tracks
(track 0), you can achieve the same linear density by speeding up the bit
clock (a la Commodore PET and C-64), or by slowing down the drive motor
(Macintosh).  Or, like the Amiga, ST, and IBM PC, you can not bother.

Now, I mentioned that the bit clock rate is 2us.  But you can't put 1's
closer than 4us apart.  So we have to make sure we always put at least one
0 between 1's.  Of course, we also have to be careful not to have too long
a string of 0's, either.  (Side note: 1200 bps and up modems have problems
with long strings of 0's, too.
They solve it by XORing in a pseudo-random bit stream which the modems
work out between themselves.  This is referred to as the scrambler
polynomial.)

This is where RLL comes in.  RLL is Run Length Limited.  The first disk
drives used a 4us bit clock, alternating data bits and 1 bits for
synchronisation.  Thus, the longest string of 0's was of length 1; i.e.
runs of 0's were limited to length 1.  But because they could have
adjacent 1's (the lower bound on run lengths was 0), they needed to use a
slow bit clock for this 0,1 RLL scheme.  This is known as Frequency
Modulation.  (It is, in fact, just like FM stereo.  A stream of 1's would
get written 1111111111, each 1 standing for a polarity change.  The signal
would end up looking like a square wave of period 8us.  A stream of 0's
would get written as 010101010101, a square wave of half the frequency.)

Then someone worked out a better scheme, MFM (Modified FM), which is still
the most common today, in which 1's are always separated by at least 1
zero, and by at most 3.  Thus, 1,3 RLL.  It's just like FM (data, clock,
data, clock), except that the clock 1's are only inserted between two
consecutive 0's.  This makes it harder to figure out what's clock and
what's data (in FM, if you find a 0, you know it's data), but there are a
few patterns that are still 1,3 RLL yet aren't generated by this scheme.
The standard sync mark is A1, 10100001, which, with clock pulses inserted
(using i and o for 1 and 0 so you can tell the difference) is
1o0o1o0i0i0i0o1o.  By suppressing the second clock pulse (the i after the
third 0), you get the raw bit pattern 1000100100010010, which cannot
appear in normal data (try it!).  Because the floppy controller (the
trackdisk.device, in this case) eats the clock pulses, it reads as A1.
But it's really magic.

Because you always have a 0 between adjacent 1's, and you can write 1's
4 us apart, you can write raw bits 2 us apart and stay within the
capabilities of the media.
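The clock-insertion rule above is easy to demonstrate.  A minimal sketch
(Python, written for this summary, not anything from trackdisk.device; the
data-bit-first ordering and the index of the suppressed clock follow the
1o0o... notation above):

```python
def mfm_encode(data_bits):
    """MFM: emit each data bit followed by a clock bit (the
    'data, clock, data, clock' order used above); the clock bit
    is 1 only when the data bits on both sides of it are 0."""
    raw = []
    for i, bit in enumerate(data_bits):
        raw.append(bit)
        # Assume a 1 past the end of the buffer; for $A1, which
        # ends in a 1, the trailing clock comes out 0 either way.
        nxt = data_bits[i + 1] if i + 1 < len(data_bits) else 1
        raw.append(1 if bit == 0 and nxt == 0 else 0)
    return raw

a1 = [1, 0, 1, 0, 0, 0, 0, 1]          # the byte $A1 = 10100001
raw = ''.join(map(str, mfm_encode(a1)))

# 1,3 RLL invariants: never two adjacent 1's, never four 0's in a row.
assert '11' not in raw and '0000' not in raw

# Suppressing the middle inserted clock (the i after the third 0)
# yields the sync pattern, which no legal MFM data stream contains.
sync = raw[:9] + '0' + raw[10:]
print(raw, sync)   # 1000100101010010 1000100100010010
```

A brute-force search over short data streams confirms the "cannot appear
in normal data" claim: the suppressed clock sits between two 0 data bits,
exactly where the encoder is obliged to put a 1.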
2us per bit, and 0.2 seconds per revolution (300 rpm = 5 rps), gives you
100,000 bits per revolution.  80 tracks, times two sides, gives
16,000,000 bits per disk.  Since it takes two raw bits to produce one
data bit, that's 8,000,000 data bits, or 1,000,000 (not 1 Meg =
1024x1024) bytes per floppy.

There are, however, more complex schemes.  What people refer to as RLL in
the IBM world uses a 2,7 RLL code.  Because you always have two 0 bits
between 1's, you can use a 1.333 us bit clock and still keep your 1's far
enough apart.  (Actually, it's used for hard drives, which can take 1 bits
much closer than 4 us, but the idea is the same.)  The only loss is that
your read circuitry has to be able to keep time for 8 bit periods as it
skips 7 0's.  (Also, your media has to stop the boundary from "wandering"
as much, since you have less tolerance.  But this is usually pretty easy
compared with making it accept 2.666 us gaps between 1's.)  The 2,7 RLL
code still manages to encode 1 data bit as 2 "raw" bits, although it's not
nearly as simple as data, clock, data.

Even if you wanted to go to all the trouble of making trackdisk.device use
this encoding scheme, it wouldn't help, since you still can't write raw
bits any faster than 2 us.  But you can, however, use a different 1,n RLL
scheme.  If you apply a bit of math, it turns out that 1,3 RLL can give
you 0.55146 data bits per raw bit.  0.5 of that is used by MFM, and a bit
extra is used by the A1 sync pattern.  2,7 RLL gives you 0.51736 data bits
per raw bit.  Again, 0.5 of that is used for data, and some of the rest is
used for a sync marker.  A 1,6 RLL scheme will give you 0.66903 data bits
per raw bit... enough to spend two thirds of a raw bit encoding data and
have a little left over for a unique sync pattern.  However, wringing the
last few drops of efficiency from an RLL code is hard (you have about
0.00237 bits to fit your sync marker in here, i.e.
it's got to be about 1/0.00237, over 400 bits, long), so I'd use a 1,7
RLL code, which gives 0.67928 bits per bit, giving you 0.01262 bits to
fit a sync pulse in, resulting in a more reasonable 8-bit sync pattern to
look for.  I still haven't actually come up with an encoding scheme, but
I expect that just playing with it a bit will produce something.

There's also an out I haven't mentioned: perhaps you can push the
capabilities of the disk a little and write adjacent 1 bits on the outer
tracks.  I think this would put them closer than the disk is rated for,
but high-quality floppies can probably take it.  But playing by the
rules, I'd use 1,7 RLL.

Now, you're wondering how I came up with that figure of 0.67928 bits per
raw bit.  Here's where we get into combinatorics.  It goes like this: the
null string is 1,6 RLL.  No bits, no runs of the wrong length.
Furthermore, any 1,6 RLL string can be followed by any of 10, 100, 1000,
10000, 100000, or 1000000 and produce a 1,6 RLL string.  So, given a
string of length n, we can produce strings of length n+2, n+3, n+4, n+5,
n+6, and n+7.  These are the only strings we can produce directly, so we
can turn it around: we can make a string of length n from any string of
length n-2, n-3, n-4, n-5, n-6, or n-7.  So S(n), the number of strings
of length n, is S(n-2)+S(n-3)+S(n-4)+S(n-5)+S(n-6)+S(n-7).  And we know
that there is one string of length 0, and none of length <0:
S(0) = 1, S(-1) = S(-2) = ... = 0.

From this, you can probably see how to write a program to compute the
number of strings of length n that are 1,6 RLL.  For any p,q RLL,
actually: since the appended blocks (a 1 followed by p to q zeros) have
lengths p+1 through q+1, S(n) = S(n-p-1)+S(n-p-2)+...+S(n-q-1).  All I
did was use really long strings (about a million bits) and take the log
base 2 (log base x of n is log(n)/log(x) for log() in any other base) of
the number of possible strings, which gives the number of bits of
information you can encode in such strings, and divide this by the number
of bits I used (about a million).
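Colin's recurrence is straightforward to check today with
arbitrary-precision integers in place of his homemade floating point.  A
sketch (Python, mine, with the block lengths p+1 through q+1 spelled out,
so the recurrence terms run from S(n-p-1) down to S(n-q-1)):

```python
import math

def rll_capacity(p, q, n=10000):
    """Data bits per raw bit of a p,q RLL code, by counting strings.
    A valid string is built from blocks '1 followed by j zeros' with
    p <= j <= q (block length j+1), so
        S(i) = S(i-p-1) + S(i-p-2) + ... + S(i-q-1),  S(0) = 1."""
    S = [1] + [0] * n
    for i in range(1, n + 1):
        S[i] = sum(S[i - j] for j in range(p + 1, q + 2) if i >= j)
    # log2 of the number of strings = number of encodable data bits;
    # divide by the raw length to get data bits per raw bit.
    return math.log2(S[n]) / n

for p, q in [(0, 1), (1, 3), (1, 7), (2, 7)]:
    print(f"{p},{q} RLL: {rll_capacity(p, q):.5f}")
```

At n = 10000 this reproduces the figures in the table below to roughly
four decimal places; 0,1 RLL comes out as log base 2 of the golden ratio,
as it should, since its counts are the Fibonacci numbers.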
A few other numbers:

0,1 RLL:  0.69424   /* FM gets 0.5 of this - inefficient! */
0,2 RLL:  0.87914
0,3 RLL:  0.94677
0,4 RLL:  0.97522
0,5 RLL:  0.98810
0,6 RLL:  0.99419
0,7 RLL:  0.99713
0,8 RLL:  0.99857
0,9 RLL:  0.99929
0,10 RLL: 0.99964   /* 0,infinity RLL is, of course, 1 */

1,1 RLL:  0         /* I assume you can figure out why */
1,2 RLL:  0.40568
1,3 RLL:  0.55146
1,4 RLL:  0.61744
1,5 RLL:  0.65089
1,6 RLL:  0.66903
1,7 RLL:  0.67928
1,8 RLL:  0.68525
1,9 RLL:  0.68878
1,10 RLL: 0.69091

2,3 RLL:  0.28776
2,4 RLL:  0.40568
2,5 RLL:  0.46495
2,6 RLL:  0.49790
2,7 RLL:  0.51736
2,8 RLL:  0.52933
2,9 RLL:  0.53691
2,10 RLL: 0.54179

When doing this computation, you'll quickly overflow 32 bits and even
double-precision floating point.  (After all, a 1 million bit long string
in 2,7 RLL has over 2^500,000 possibilities.)  I cobbled up my own pseudo
floating point, where if the sum S(n) ever exceeded 1<<24, I divided it
by 2 and incremented an exponent word (32 bits, of course).  Adding two
numbers in this form isn't too tricky, especially since in this
computation you always know which is the larger (S(n) > S(n-1), after a
few initial bobbles), and you can say:

	big += little >> (big_exponent - little_exponent);
	if (big > (2<<24)) {
		big >>= 1;
		big_exponent++;
	}

I just kept a circular buffer of the number of strings of length n-1
through n-q-1, and printed out (exponent + log_2(mantissa))/n every
thousand times through the loop.  (If you do it every time, keep a check
for mantissa == 0, which is true for length = 1 if the minimum run
length > 0.)

If someone else would like to cobble up a routine and verify my numbers,
great.  (If you remember enough combinatorics to get a closed-form
expression for the number of bits, even better.  I tried, but fell back
on numerical approximations.)

So, is everything clear now?  Good.
-- 
	-Colin (uunet!microsoft!w-colinp)

"Don't listen to me.  I never do." - The Doctor
elg@killer.Dallas.TX.US (Eric Green) (03/24/89)
in article <40@microsoft.UUCP>, w-colinp@microsoft.UUCP (Colin Plumb) says:
$ cmcmanis@sun.UUCP (Chuck McManis) wrote:
$> In article <3631@sugar.hackercorp.com> (Karl Lehenbauer) writes:
$>> In the Roomers column of the March 1989 issue of Amazing Computing, the
$>> Bandito says that certain game companies are getting 1.5 MB unformatted
$>> on standard 880K floppies.
$>> Can anyone verify if this is true and if so, explain how it works?
$>
$> [They seem to use RLL.]
$
$ RLL isn't quite the answer.  It's doable, but it's more interesting than
$ that...

[interesting discussion of RLL, and the compression gettable via it]

It may very well be that some of these game companies are staffed by
former C-64 hackers, who remember how the 1541, and all the old CBM 8-bit
drives, stored data: GCR.  So while the Osborne 1 was storing a whole 90K
of data on a single-sided single-density disk using MFM, the Apple II was
storing 120K using GCR, and the CBM 8-bitters were storing 160K using GCR
plus variable bit rates (putting more sectors on the outer tracks, and
fewer sectors on the inner tracks).  It was a great hardware/software
hack.  Going from 880K to 1500K, though, sounds a bit far-fetched...
-- 
| //  Eric Lee Green            P.O. Box 92191, Lafayette, LA 70509 |
| //   ..!{ames,decwrl,mit-eddie,osu-cis}!killer!elg  (318)989-9849 |
| \X/  Amiga. The homestation for the blessed of us.                |
peter@sugar.hackercorp.com (Peter da Silva) (03/26/89)
What about GCR? -- Peter "Have you hugged your wolf today" da Silva `-_-' ...texbell!sugar!peter, or peter@sugar.hackercorp.com 'U`
cmcmanis%pepper@Sun.COM (Chuck McManis) (03/28/89)
>What about GCR?
It writes the same number of bauds as MFM, so there's no net gain.
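A back-of-envelope reading of that claim, under two assumptions taken
from Colin's article upthread (flux transitions no closer than 4 us on
this media, 0.2 s per revolution) and assuming a Commodore-1541-style
4-bits-to-5-bits GCR table; the figures are illustrative, not from
Chuck's post:

```python
# Sketch of "no net gain": compare data bits per track for MFM
# versus a 4-to-5 GCR code on media where 1's (flux transitions)
# must be at least 4 us apart.
REV_TIME = 0.2        # seconds per revolution at 300 rpm
MIN_FLUX_GAP = 4e-6   # minimum spacing between 1's on the media

# MFM guarantees a 0 between adjacent 1's, so raw bits can be
# written every 2 us; half the raw bits are clock, half are data.
mfm_data_bits = (REV_TIME / 2e-6) * 0.5

# GCR codewords allow adjacent 1's, so raw bits can go no faster
# than the flux limit itself; 4 data bits per 5 raw bits.
gcr_data_bits = (REV_TIME / MIN_FLUX_GAP) * 4 / 5

print(round(mfm_data_bits), round(gcr_data_bits))  # 50000 40000
```

So under these constraints a dense GCR table buys nothing, which is
presumably the point; the 0,2 RLL entry in Colin's table (0.87914) is
the theoretical ceiling for GCR-like codes that limit 0-runs to two.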
--Chuck McManis
uucp: {anywhere}!sun!cmcmanis BIX: cmcmanis ARPAnet: cmcmanis@sun.com
These opinions are my own and no one elses, but you knew that didn't you.
"A most excellent barbarian ... Ghengis Kahn!"
peter@sugar.hackercorp.com (Peter da Silva) (03/28/89)
In article <96103@sun.Eng.Sun.COM>, cmcmanis%pepper@Sun.COM (Chuck
McManis) writes:
> >What about GCR?
> It writes the same number of bauds as MFM so no net gain.

The one Apple uses on the ][ does, maybe, but there are denser GCR codes
available, and they're supposed to be more resistant to noise than MFM
and RLL.
-- 
Peter "Have you hugged your wolf today" da Silva      `-_-'
...texbell!sugar!peter, or peter@sugar.hackercorp.com  'U`