[comp.sys.atari.st] Caching in GEMDOS

t68@nikhefh.hep.nl (Jos Vermaseren) (04/20/88)

Because of all the noise about the caching in turboDOS it is good to
consider what GEMDOS actually does:
GEMDOS has some limited caching in the form of a few 512-byte buffers. There
are two buffers for FAT usage and two for other use (directory entries and
the heads and tails of files that don't fill an entire sector).
These are not write-through buffers, but they are flushed rather regularly:
1: because there are few buffers a buffer is bumped rather quickly.
2: whenever a file is closed or a program is terminated all buffers are
   flushed.
The second point means that it is very easy to make a sync: open and close a
file and presto, or better yet, run a program that does nothing.

The order of flushing the buffers is also relevant. First the data and directory
entries are flushed, and then the FATs. The larger pieces of data are always
written before the FAT entry buffer is flushed, so if a program crashes the
machine you may be left with one of the following:
a: some clusters with data but no FAT entries pointing to them and no
   directory entry saying the file exists (so it doesn't exist).
b: some data with some FAT entries that the directory doesn't know about
   (orphans).
c: a directory entry with a length for which the FATs haven't been flushed yet.
The last case leaves nonsense. It may be better to reverse the order of the
flushing; that would just leave orphans. Actually I think that was intended,
but due to a mistake it came out wrong. (The literature consistently states the
opposite of the real use of the variables at 4B2 and 4B6; see further on.)
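
To make the crash cases concrete, here is a minimal sketch in C. It is not
GEMDOS code: it models only the order of the three kinds of write and
classifies what a crash would leave behind. The enum names and the function
crash_outcome are made up for this illustration, and the model ignores buffers
that get bumped early because the pool is small.

```c
#include <assert.h>

/* Hypothetical names: the three kinds of write involved in saving a file. */
enum step { DATA, DIRECTORY, FAT };

/* What a later inspection of the disk would find. */
enum outcome { NOTHING_VISIBLE, ORPHANS, NONSENSE, COMPLETE };

/* Classify the disk state if the machine crashes after `done` of the
   three writes, performed in the sequence given by `order`. */
enum outcome crash_outcome(const enum step order[3], int done)
{
    int have[3] = { 0, 0, 0 };
    int i;

    assert(done >= 0 && done <= 3);
    for (i = 0; i < done; i++)
        have[order[i]] = 1;

    if (have[DIRECTORY] && !(have[FAT] && have[DATA]))
        return NONSENSE;         /* a length, but no valid chain or data */
    if (have[DIRECTORY])
        return COMPLETE;         /* all three made it to disk */
    if (have[FAT])
        return ORPHANS;          /* a chain exists, nothing points at it */
    return NOTHING_VISIBLE;      /* at most stray data in free clusters */
}
```

With the order DATA, DIRECTORY, FAT a crash after two writes is classified as
NONSENSE; with DATA, FAT, DIRECTORY the same crash is classified as ORPHANS.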

There are also some bugs in this caching, although it is hard to show them.
When the data being written spans a number of whole sectors, those sectors are
written directly, without ever entering the GEMDOS buffers, after the
buffers have been flushed. This flushing, however, means:
a: if dirty, write and mark as clean.
b: if clean, invalidate (ie mark as free).
So a program like:
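	/* text[9] and otherbuffer[2000] are assumed to be declared elsewhere */
	handle = Fcreate("name", 0);          /* second argument: file attributes */
	Fwrite(handle, 9L, "Old text");       /* 8 characters plus the NUL */
	Fseek(0L, handle, 0);                 /* back to the start of the file */
	Fwrite(handle, 2000L, otherbuffer);   /* whole sectors bypass the buffers */
	Fseek(0L, handle, 0);
	Fread(handle, 9L, text);              /* served from the stale buffer */
obtains the old string in "text" while the disk contains the contents of
otherbuffer.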
This is of course not very common.
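
Since the effect is hard to show on a real machine, here is a small
host-side model in C of the behaviour described above. It is a sketch under
assumptions, not a disassembly: every name in it (cache, flush_all,
get_buffer, model_write, model_read, stale_read) is invented, the "disk" is an
array in memory, and reads always go through the buffers. The `fixed` flag
shows one possible repair (also invalidating buffers that were just written
back), which is not necessarily what JAMDOS does.

```c
#include <assert.h>
#include <string.h>

#define SECTOR 512
#define NSECT  8
#define NBUF   2                       /* the two non-FAT buffers */

static unsigned char disk[NSECT * SECTOR];
static struct { long sector; int dirty; unsigned char data[SECTOR]; } cache[NBUF];

static void reset(void)
{
    int i;
    memset(disk, 0, sizeof disk);
    for (i = 0; i < NBUF; i++) { cache[i].sector = -1; cache[i].dirty = 0; }
}

static void write_back(int i)
{
    memcpy(disk + cache[i].sector * SECTOR, cache[i].data, SECTOR);
    cache[i].dirty = 0;
}

/* Flush rule from the post: dirty buffers are written and kept as clean,
   clean buffers are invalidated. With `fixed` set, written buffers are
   invalidated as well (an assumed repair). */
static void flush_all(int fixed)
{
    int i;
    for (i = 0; i < NBUF; i++) {
        if (cache[i].sector < 0)
            continue;
        if (cache[i].dirty) {
            write_back(i);
            if (fixed)
                cache[i].sector = -1;
        } else {
            cache[i].sector = -1;
        }
    }
}

/* Return the index of a buffer holding sector `s`, loading it if needed. */
static int get_buffer(long s)
{
    static int victim;                 /* round-robin replacement */
    int i;
    for (i = 0; i < NBUF; i++)
        if (cache[i].sector == s)
            return i;                  /* hit, even if the copy is stale */
    for (i = 0; i < NBUF && cache[i].sector >= 0; i++)
        ;                              /* look for a free buffer */
    if (i == NBUF) {
        i = victim;
        victim = (victim + 1) % NBUF;
        if (cache[i].dirty)
            write_back(i);
    }
    memcpy(cache[i].data, disk + s * SECTOR, SECTOR);
    cache[i].sector = s;
    cache[i].dirty = 0;
    return i;
}

static void model_write(long pos, const void *src, long len, int fixed)
{
    const unsigned char *p = src;
    assert(pos >= 0 && pos + len <= (long)sizeof disk);
    while (len > 0) {
        long off = pos % SECTOR, n;
        if (off == 0 && len >= SECTOR) {
            n = len / SECTOR * SECTOR; /* run of whole sectors */
            flush_all(fixed);          /* flush once, then bypass the cache */
            memcpy(disk + pos, p, n);
        } else {
            int i = get_buffer(pos / SECTOR);
            n = SECTOR - off;
            if (n > len) n = len;
            memcpy(cache[i].data + off, p, n);
            cache[i].dirty = 1;
        }
        pos += n; p += n; len -= n;
    }
}

static void model_read(long pos, void *dst, long len)
{
    unsigned char *p = dst;
    while (len > 0) {
        long off = pos % SECTOR, n = SECTOR - off;
        int i = get_buffer(pos / SECTOR);
        if (n > len) n = len;
        memcpy(p, cache[i].data + off, n);
        pos += n; p += n; len -= n;
    }
}

/* Replays the program above; returns 1 if the old string is read back. */
int stale_read(int fixed)
{
    static unsigned char other[2000];
    char text[9];
    memset(other, 'N', sizeof other);
    reset();
    model_write(0, "Old text", 9, fixed);
    model_write(0, other, 2000, fixed);
    model_read(0, text, 9);
    return memcmp(text, "Old text", 9) == 0;
}
```

In this model stale_read(0) returns 1 and stale_read(1) returns 0. Note that
it needs both buffers to show the problem: the 464-byte tail of the 2000-byte
write lands in the second buffer, so the clean but stale copy of sector 0
survives in the first.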

I have repaired this and some other bugs of this kind in a disassembly of GEMDOS.
In this version, searching through the FATs is also much faster due to a
rather small change in the two routines that ask for the next FAT entry.
As these routines know they are searching repeatedly, they jump directly to the
cache buffer once the first read has established that the FAT is in the buffer.
The result is a rather swift response: DiskFree and first writes to a new file
go about six times faster. When some more buffers for the caching are put in
via the official channels (that is what the pointers at the addresses 4B2 and
4B6 are for) these factors go up even higher. DiskFree on a 16 Mbyte partition
went down from > 9 sec to < 1.1 sec! Writing a new file took less than 1.5 sec
even when the partition was nearly completely filled (no tuneup). The gain in
compiler performance is striking.
The main gain lies in the fact that, due to the repair of a couple of errors,
there is no need to invalidate the buffers all the time.
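
The shape of the FAT speedup can be sketched roughly as follows. This is a
guess at the idea, not the patched GEMDOS routines: it assumes 16-bit FAT
entries stored low byte first (as on an MS-DOS style partition), keeps the
whole FAT in memory in place of the disk, and uses invented names
(fat_reader, locate_sector, fat_entry, count_free). The `lookups` counter
stands for the expensive search through the buffer list that the fast path
skips.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR 512
#define ENTRIES_PER_SECTOR (SECTOR / 2)    /* assumes 16-bit FAT entries */

struct fat_reader {
    const uint8_t *fat;        /* whole FAT in memory, standing in for disk */
    const uint8_t *cached;     /* FAT sector located most recently */
    long cached_sector;        /* -1 before the first lookup */
    long lookups;              /* number of full buffer searches done */
};

/* Stand-in for the general "find or load this sector" routine. */
static const uint8_t *locate_sector(struct fat_reader *r, long sector)
{
    r->lookups++;
    return r->fat + sector * SECTOR;
}

/* Fetch one FAT entry, going straight to the remembered buffer when the
   entry lives in the same sector as the previous one. */
static uint16_t fat_entry(struct fat_reader *r, long cluster)
{
    long sector = cluster / ENTRIES_PER_SECTOR;
    long index = cluster % ENTRIES_PER_SECTOR;

    assert(cluster >= 0);
    if (sector != r->cached_sector) {
        r->cached = locate_sector(r, sector);
        r->cached_sector = sector;
    }
    return (uint16_t)(r->cached[2 * index] | (r->cached[2 * index + 1] << 8));
}

/* Count free clusters the way a DiskFree-style scan would. */
long count_free(struct fat_reader *r, long nclusters)
{
    long c, free_count = 0;
    for (c = 2; c < nclusters + 2; c++)    /* data clusters start at 2 */
        if (fat_entry(r, c) == 0)
            free_count++;
    return free_count;
}
```

Scanning 1000 clusters this way costs four full lookups (one per FAT sector
touched) instead of one per entry.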

Now the media change:
When a media change is detected there are two possibilities:
1: a real media change: GEMDOS reacts by invalidating all buffers that concern
   that drive.
2: a possible ("maybe") media change: GEMDOS invalidates the buffer it was
   currently trying to use and reads it in again as if nothing has happened.
   At the moment I forget whether a dirty buffer is written first, but I don't
   think so.
   The above is of course a very dangerous thing. It assumes the current buffer
   to be worthless but all other buffers concerning this drive are left as
   they are.
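
The difference between the two reactions can be sketched like this. The
buffer structure and function names are hypothetical; the numeric values
follow the three results of the BIOS Mediach() call (0, 1, 2). The "maybe"
branch drops the buffer without writing it back, which follows the author's
recollection above and is therefore an assumption too.

```c
#include <assert.h>

/* Values as returned by the BIOS Mediach() call; the names are made up. */
enum media_state { MEDIA_SAME = 0, MEDIA_MAYBE = 1, MEDIA_CHANGED = 2 };

struct buffer { int drive; long sector; int dirty; };   /* drive -1: free */

/* Apply the reaction described above. `current` is the buffer that was
   being used when the state was reported. */
void react_to_media_state(struct buffer *pool, int n, int current,
                          enum media_state state)
{
    int drive, i;

    assert(current >= 0 && current < n);
    drive = pool[current].drive;
    if (state == MEDIA_CHANGED) {
        for (i = 0; i < n; i++)
            if (pool[i].drive == drive)
                pool[i].drive = -1;       /* drop everything for the drive */
    } else if (state == MEDIA_MAYBE) {
        pool[current].drive = -1;         /* only this one; caller re-reads */
    }
}

/* Count the buffers still held for `drive`. */
int buffers_for_drive(const struct buffer *pool, int n, int drive)
{
    int i, count = 0;
    for (i = 0; i < n; i++)
        if (pool[i].drive == drive)
            count++;
    return count;
}
```

After a "maybe" on a pool holding three buffers for drive 0, two of them
(possibly describing the previous disk) are still there, which is exactly the
danger.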

Together with some other bug repairs the whole is a <20K file and it gives a
performance on a hard disk that is slightly closer to the performance of a
RAMdisk than to the hard disk in the old version of GEMDOS. This program
is called JAMDOS and is currently only available for the UK version of the
MegaROMs (there are hard addresses in there!). It is of course not too difficult
to adapt to other versions. The main catch, however, is that >95% of the program
is the original GEMDOS, so it belongs either to Atari or to DRI or whoever.
This means that, unless Atari specifically allows its distribution, it
cannot be posted. I think for the moment we should just have a little patience
and hope that Allan Pratt has his new GEMDOS ready for release pretty soon.

J.Vermaseren
t68@nikhefh.hep.nl

saj@chinet.UUCP (Stephen Jacobs) (04/21/88)

The referenced posting (a detailed explanation of the built-in caching done by
GEMDOS) seems to explain a bug I encountered in a toy program I wrote to
manipulate floppy disk volume labels.  It opened a file marked as a volume
label (a la MS-DOS), then hunted out the directory entry (after closing the
file) and changed trailing spaces to trailing nulls (to match the format of a
desktop-created volume label)(I may have blanks and nulls reversed).  Well,
after the program terminated, the blanks were back! (Yeah, I reread the 
directory entry after changing it).  But 2 separate caches...one flushed on
file closing, the other on program termination---interesting.