[net.micro.amiga] DOS filing system efficiency statistics and discussion

dillon@CORY.BERKELEY.EDU (Matt Dillon) (10/12/86)

	The filing system has a lot of redundancy.  Not only is there a
sector allocation list for each file, but each data sector in a file
contains 12 bytes of header information which is completely redundant.
Doing a little arithmetic (12 bytes x 1760 sectors), this redundant
information takes up about 21K over the entire disk.  No great loss; with
880K on a disk that's only 2.4%.
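
	For anyone who wants to check the arithmetic, here is a rough sketch.
It assumes (as the estimate above seems to) that every one of the 1760
sectors on the disk is a data sector; the constant and function names are
made up for illustration.

```python
# Figures taken from the post: 12 redundant header bytes per data sector,
# 80 cylinders x 2 sides x 11 sectors/track, 512 bytes per sector.
REDUNDANT_BYTES_PER_SECTOR = 12
SECTORS_PER_DISK = 80 * 2 * 11
BYTES_PER_SECTOR = 512


def redundancy_overhead():
    """Return (wasted_bytes, fraction_of_disk).

    Assumption: every sector on the disk is a data sector, which
    slightly overstates the real overhead.
    """
    wasted = REDUNDANT_BYTES_PER_SECTOR * SECTORS_PER_DISK
    total = SECTORS_PER_DISK * BYTES_PER_SECTOR
    return wasted, wasted / total


wasted, fraction = redundancy_overhead()
print(f"{wasted} bytes ({wasted / 1024:.1f}K), {fraction:.1%} of the disk")
```

This comes out a shade under 21K and a shade under 2.4%, so the rounded
figures above hold up.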

	I would expect, since each data block has a pointer to the next,
that DOS would take advantage of this fact.

	When reading a huge file sequentially, I'm sure all of you have
noticed that DOS seeks to some far away land every 3 cylinders,
corresponding to every 33K of information.  Since each extension table
(sector allocation table) has 72 entries, and there are 22 sectors per
cylinder, one table covers 72/22 = 3.27 cylinders.
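
	The same figures as a quick sketch (the names are mine; the numbers
are the ones quoted above):

```python
# Numbers quoted in the post.
ENTRIES_PER_EXTENSION_TABLE = 72   # data-block pointers per extension table
SECTORS_PER_CYLINDER = 22          # 11 sectors/track, 2 tracks/cylinder
BYTES_PER_SECTOR = 512

# How many cylinders of contiguous data one extension table covers.
cylinders_per_table = ENTRIES_PER_EXTENSION_TABLE / SECTORS_PER_CYLINDER

# How much raw sector data sits on 3 cylinders.
bytes_per_3_cylinders = 3 * SECTORS_PER_CYLINDER * BYTES_PER_SECTOR

print(f"{cylinders_per_table:.2f} cylinders per table")
print(f"{bytes_per_3_cylinders // 1024}K per 3 cylinders")
```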

	So obviously, DOS is tracing the file via its sector allocation
table rather than the link pointers.  Q.E.D. DOS isn't taking advantage of 
the link pointers.


----------
NOTE: In the following discussion, 'Extension blocks' refers to the
sector allocation table for a file.  I will use both terms interchangeably.

NOTE: track   : half a cylinder on a floppy (one side).
      cylinder: refers to the entire cylinder (both sides / 2 tracks)

NOTE: Amiga Floppy disk: 80 cylinders, 2 sides, (160 tracks)
      11 sectors/track = 22 sectors/cylinder.  Total of 1760 sectors.
      512 bytes (128 longwords)/sector, total = 880K (901120 bytes).

Maximum theoretical transfer rate:
	300rpm = 5 revs/sec = 5 tracks/sec =
	5*11 sectors/sec = 28160 bytes/sec

Current transfer rate: (loading LC1 on the second try so it's already
	cached the extension blocks): 210 sectors / 10 seconds =
	10752 bytes/sec

Maximum speed improvement: 2.6x
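
The three figures above can be reproduced with a short sketch.  The
function names are hypothetical; the inputs (300rpm, 11 sectors/track,
512 bytes/sector, 210 sectors in 10 seconds) are the ones given above.

```python
RPM = 300
SECTORS_PER_TRACK = 11
BYTES_PER_SECTOR = 512


def max_bytes_per_sec(rpm=RPM):
    """Theoretical peak: one full track read per revolution."""
    revs_per_sec = rpm / 60
    return revs_per_sec * SECTORS_PER_TRACK * BYTES_PER_SECTOR


def measured_bytes_per_sec(sectors, seconds):
    """Observed rate for a timed load of `sectors` sectors."""
    return sectors * BYTES_PER_SECTOR / seconds


peak = max_bytes_per_sec()
measured = measured_bytes_per_sec(210, 10)   # the LC1 load timed above
speedup = peak / measured

print(f"peak {peak:.0f} B/s, measured {measured:.0f} B/s, {speedup:.1f}x")
```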




Copying LC1 to a completely blank disk:

-Header block of LC1 is on the same track as Root
-First data block is on the same track as Root; the next data block begins
 one cylinder over (leaving much of the Root cylinder free)
-Extension blocks are all located on the same track as Root


Effectively, this means that DOS could have cached all the extension
sectors immediately, then loaded the entirety of LC1 without having to
do any extraneous head moves.  I think one of the optimizations C-A put
in between versions 1.1 and 1.2 shows plainly in my second observation:
only one data block was put on the same track as the file header block
(thus, short files could be loaded without a single head move).

You might think, "well, why don't we move the Extension blocks next to
the data sectors they are extending", and indeed, this would make sequential
reading faster.  However, if you're doing a lot of Seek() calls this would
cause problems.  Having DOS trace sequential reads via the data-block links
would solve both problems, though modifying DOS to do that would probably be
a big hack (but a worthwhile big hack).
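
To give a feel for the difference, here is a toy model, not a description
of what DOS actually does internally.  Assumptions, all mine: the data
blocks are laid out contiguously starting one cylinder past Root, the
header and extension blocks all sit on the Root cylinder (taken to be
cylinder 40), nothing is cached, and an extension table is fetched just
before the 72 data blocks it describes.  All function names are made up.

```python
ENTRIES_PER_TABLE = 72      # from the post
SECTORS_PER_CYLINDER = 22   # from the post
ROOT_CYLINDER = 40          # assumption: header/extension blocks live here


def via_tables(n_blocks, first_data_cyl):
    """Cylinder visited per access when tracing the allocation tables."""
    order = []
    for i in range(n_blocks):
        if i % ENTRIES_PER_TABLE == 0:
            order.append(ROOT_CYLINDER)   # file header or next extension block
        order.append(first_data_cyl + i // SECTORS_PER_CYLINDER)
    return order


def via_links(n_blocks, first_data_cyl):
    """Cylinder visited per access when following data-block links."""
    order = [ROOT_CYLINDER]               # file header only
    order += [first_data_cyl + i // SECTORS_PER_CYLINDER
              for i in range(n_blocks)]
    return order


def head_moves(order):
    """Number of accesses that land on a different cylinder than the last."""
    return sum(1 for a, b in zip(order, order[1:]) if a != b)


def seek_distance(order):
    """Total cylinders travelled by the head."""
    return sum(abs(a - b) for a, b in zip(order, order[1:]))


tables = via_tables(210, ROOT_CYLINDER + 1)   # 210 blocks, like LC1
links = via_links(210, ROOT_CYLINDER + 1)
print(head_moves(tables), seek_distance(tables))
print(head_moves(links), seek_distance(links))
```

Under these assumptions the link-following read never goes back to the
Root cylinder after the header, while the table-driven read makes a round
trip there for every extension block.  Random Seek() calls still need the
tables, which is why the links only help the sequential case.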

-I think it would be a good idea NOT to put the first data block on the same
cylinder as the file entry.  You would then have double the number of
sectors on that cylinder for file headers.

-I think it would be a good idea NOT to put extension blocks on the same
cylinder as the file entry for the same reason.  Since the file header
contains the first S.A.T. anyway, putting the others one track
over (but still keeping them grouped together) would not decrease
performance at all (currently, a head move is being done getting to the
first extension block anyway).

-Tracks with directories on them should not be used to store file or
extension information whenever possible.  Implementing this would require
a bit of programming, but would make the filing system fast even on a
disk which creates/deletes a lot of junk.

Add to that a track buffering scheme (so we can DMA the second surface in
while MFM decoding the first), and I think directories will speed up
enough to satisfy most of us (and maybe even solve the workbench problem) as
well as getting double the performance reading sectors.

I think we can take disk I/O to its theoretical maximum.

	We may beat the MAC+ yet.

				-Matt