[comp.sys.amiga] Making a custom Filing System

dillon@CORY.BERKELEY.EDU (Matt Dillon) (05/15/87)

	I figure all I have to do is write my own driver (ala RAM:) which 
accesses the trackdisk for DFx:, making DOS forget about that particular 
drive (ala however diskcopy/format/disked do it).  I would like to use the
same filing system format for compatibility.

	I've been able to figure out the entire filing system disk format
(apart from some space wastage, it is quite elegant) except for the format of
the BITMAP blocks.  Anybody know what their format is ????  Also, how does
one take over a drive (just point me in the right direction).

	Lastly, I would have to be able to inform DOS when disks are 
mounted/dismounted.. is it possible?  Anybody have any ideas?

-----------------
Waste in the Filing System:

	-Extension blocks use 72 longwords to hold block numbers.  There is
	 a segment in the extension block of 46 longwords that isn't used
	 at all!!!! and it is even contiguous with the 72 longword block
	 ptr segment!!!!  I.E.  right now extension blocks can hold pointers
	 to up to 36K of data, but with almost no modification they can hold
	 pointers to up to 60K

	 Possibly fixable without losing compatibility with the current DOS.

	-The various types of blocks in the filing system use an 
	 almost-uniform format, but it is incredibly wasteful... using 
	 longwords to hold what could be 4 bit type fields.  Specifically,
	 the primary and secondary type fields could be combined.

	 not fixable unless you change the filing system format.

	-Some fields are defined but not used by the filing system.  In a
	 file header, there is a 'data size' field which is defined to be
	 the number of data block slots used, but DOS always sets this to 0.

	 Possibly upward compatible, assuming the current DOS ignores the 
	 field and later versions check the field against 0 and update it
	 to the proper value.
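
The extension-block figures above are easy to check. The sketch below is only arithmetic, not DOS code: the 72 and 46 come from the post, 512 is the raw block size, and 488 is what is left after the 24-byte header discussed later in this thread. The function and constant names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum {
    PTRS_USED     = 72,   /* block-pointer slots currently used (from the post) */
    PTRS_UNUSED   = 46,   /* adjacent unused longwords (from the post) */
    BLOCK_BYTES   = 512,  /* raw block size */
    PAYLOAD_BYTES = 488   /* file data per block after the 24-byte header */
};

/* Bytes reachable through `slots` block pointers of the given block size. */
uint32_t reachable_bytes(uint32_t slots, uint32_t bytes_per_block)
{
    assert(bytes_per_block > 0u);
    return slots * bytes_per_block;
}
```

Counting whole 512-byte blocks, 72 slots reach 36864 bytes (36K) and 72 + 46 = 118 slots reach 60416 bytes (59K), which matches the 36K and roughly 60K quoted above. Counting only the 488 payload bytes gives 35136 and 57584 bytes.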

Inefficiencies due to DOS:
	
	-Why does a directory search go through all the extension blocks for
	 a file???  To count the number of actual blocks used by that file.
	 But, as Bryce pointed out, it doesn't need to do that... simply
	 divide the filesize by 488 to get the number of blocks.

	 And Get this:  Remember the field that was defined and not used???
	 guess what a good use would be... you got it. 

	-Why does DOS bother to read in extension blocks when loading large
	 files sequentially when it can simply use the 'Next Data Block'
	 field in the data blocks????

	-Why doesn't DOS do a top-level sort of the hash table when getting
	 directories?

	-Why does DOS compete with itself when multiple processes read and
	 write to disk simultaneously???  Even if each of two processes is
	 using an 8K block size, DOS will still do disk operations in 512
	 byte sizes (causing it to seek back and forth when it really doesn't
	 need to!).
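
Bryce's shortcut for counting blocks, sketched in C under the assumption that every data block carries 488 bytes of file data and that the file has no holes. The names are made up for this example; nothing here is taken from DOS itself.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_BYTES_PER_BLOCK 488u   /* payload per data block, from the post */

/* Number of data blocks needed for a file of file_size bytes (rounds up). */
uint32_t data_blocks_for(uint32_t file_size)
{
    return file_size / DATA_BYTES_PER_BLOCK
         + (file_size % DATA_BYTES_PER_BLOCK != 0u ? 1u : 0u);
}

/* Bytes of file data held in the final data block of a non-empty file. */
uint32_t bytes_in_last_block(uint32_t file_size)
{
    assert(file_size > 0u);
    uint32_t rest = file_size % DATA_BYTES_PER_BLOCK;
    return rest != 0u ? rest : DATA_BYTES_PER_BLOCK;
}
```

A 489-byte file, for example, needs two data blocks with a single byte in the second one. The result is only as good as the assumption that every block of the file is actually allocated.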


					-Matt

scotty@l5comp.UUCP (05/19/87)

In article <8705151922.AA05483@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>
>	I figure all I have to do is write my own driver (ala RAM:) which 
>accesses the trackdisk for DFx:, making DOS forget about that particular 
>drive (ala however diskcopy/format/disked do it).  I would like to use the
>same filing system format for compatibility.
I've ripped RAM: apart and fooled around with it, adding things like relabel
support for RAM:. But it's no picnic; a better approach is via a PIPE: style
'handler'.

>	I've been able to figure out the entire filing system disk format
>(apart from some space wastage, it is quite elegant) except for the format of
>the BITMAP blocks.  Anybody know what their format is ????  Also, how does
>one take over a drive (just point me in the right direction).
It's real elegant except for those 24 byte headers at the front of each sector.
Without these we could beam data straight from the disk into the user memory
buffer. But instead we must read the sector into a private buffer and check its
checksum, then crack the 488 bytes out of the sector and put them where the user
wanted them in the first place. That isn't too elegant in my view.

Bitmap blocks are very simple, they are a linear array of bits. The first bit
is the first available block on the drive past the RESERVED blocks. On a floppy
there are two reserved blocks so the first bit is block # 2. After that you just
run sequentially up till ya hit the end of disk. The best way I found to get a
feel for them is with a disk zap style program. Multi-tasking works out real
well for this :).

To keep the dog from sniffing the disk when it's inserted you send a dos packet
to the drive's handler process. The packet ID and params are in the AmigaDOS
technical reference manual from Bantam (mine is at work, sorry).

>	Lastly, I would have to be able to inform DOS when disks are 
>mounted/dismounted.. is it possible?  Anybody have any ideas?
You send a diskchange packet to the handler process for the drive involved. I
don't think this one is in the Bantam book. It's in the dos header file with
all the other ACTION_ definitions.

>Waste in the Filing System:
You forgot that idiot checksum at the front of every block on the disk. This is
a real space and time waster. What, they couldn't trust the disk system to
figure out that a block was corrupted?

>Inefficiencies due to DOS:
The 1.2 amigadog also moves things around. While sniffing a hard disk to see how
bitmaps work I noticed that it would MOVE the bitmap blocks as they were
modified!!! Yuck. I've also seen some fun stuff happen with modified blocks of
other sorts. I suspect that the dog releases the previous block used for the
modified data and then hunts it up a new one from the current ALLOCATE ME NEXT
thingie.

>	-Why does a directory search go through all the extension blocks for
>	 a file???  To count the number of actual blocks used by that file.
>	 But, as Bryce pointed out, it doesn't need to do that... simply
>	 divide the filesize by 488 to get the number of blocks.
Well, if they wanted to they could implement SPARSE files which would shoot a
hole in your idea Matt. Both UN*X and Apple DOS 3.X support this concept, why
not us? :-) It would be fun to throw a NIL block number in a slot in an
extension block and see if the dog noticed it as a sparse block...

>	-Why does DOS bother to read in extension blocks when loading large
>	 files sequentially when it can simply use the 'Next Data Block'
>	 field in the data blocks????
See above. I suspect that the also redundant value stored in each block giving
its sequence in the list is there to support sparse files.

>	-Why does DOS compete with itself when multiple processes read and
>	 write to disk simultaneously???  Even if each of two processes is
>	 using an 8K block size, DOS will still do disk operations in 512
>	 byte sizes (causing it to seek back and forth when it really doesn't
>	 need to!).
The real problem here is that C-A only built in one track buffer per drive. The
buffers from addbuffers are nice, but more TRACK buffers would be REAL nice! :)
I did this on a CPM 68K system (added more track buffers) and operations
involving multiple open files sped up 4X. You don't have to have
two processes competing to have this problem, one process reading from one file
and writing to another back and forth causes the same kinda thrashing.

Scott Turner

-- 
L5 Computing, the home of Merlin, Arthur, Excalibur and the CRAM.
GEnie: JST | UUCP: stride!l5comp!scotty | 12311 Maplewood Ave; Edmonds WA 98020
If Motorola had wanted us to use BPTR's they'd have built in shifts on A regs
[ BCPL? Just say *NO*! ] (I don't smoke, send flames to /dev/null)

phillip@cbmvax.UUCP (05/19/87)

in article <8705151922.AA05483@cory.Berkeley.EDU>, dillon@CORY.BERKELEY.EDU (Matt Dillon) says:
> 
> 
> 	I've been able to figure out the entire filing system disk format
> (apart from some space wastage, it is quite elegant) except for the format of
> the BITMAP blocks.  Anybody know what their format is ????  Also, how does
> one take over a drive (just point me in the right direction).
A disk BitMap looks like this:
(LONG) [CheckSum] [127 longwords of bitmap.]
	The high order bits refer to high blocks respectively.
bit representation:
		0 == ALLOCATED (USED)
		1 == FREE

The first bitmap block looks like this:
[CHECKSUM][ blocks: 2 - 33 ][ blocks: 34-65] etc...
            33-18 / 17-2<------------------High order word refers to
						blocks 18-33.
					   High order byte refers to
						blocks 26-33.
					   High order nibble refers to
						....:-)
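
The layout above can be turned into a lookup routine. This is a minimal sketch, assuming the bitmap block has already been read and its longwords converted to host byte order (they are big-endian on disk), and covering only the blocks described by the first bitmap block. The function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_LONGS  128u   /* 512-byte block = 128 longwords */
#define BITMAP_LONGS 127u   /* longword 0 is the checksum */

/*
 * Test whether `block` is free according to the first bitmap block.
 * `reserved` is the number of reserved blocks (2 on a floppy).
 * Bit value 1 == free, 0 == allocated; bit 0 of bm[1] is block `reserved`.
 */
bool block_is_free(const uint32_t bm[BLOCK_LONGS], uint32_t block, uint32_t reserved)
{
    assert(block >= reserved);
    uint32_t index = block - reserved;          /* bit number within the map */
    assert(index < BITMAP_LONGS * 32u);         /* must lie in this bitmap block */
    uint32_t longword = bm[1u + index / 32u];   /* skip the checksum longword */
    return ((longword >> (index % 32u)) & 1u) != 0u;
}

/* Mark `block` as allocated by clearing its bit. The checksum is NOT updated. */
void mark_block_used(uint32_t bm[BLOCK_LONGS], uint32_t block, uint32_t reserved)
{
    assert(block >= reserved);
    uint32_t index = block - reserved;
    assert(index < BITMAP_LONGS * 32u);
    bm[1u + index / 32u] &= ~(UINT32_C(1) << (index % 32u));
}
```

With two reserved blocks, block 2 is the low-order bit of the first bitmap longword, block 33 is its high-order bit, and block 34 starts the next longword, which agrees with the diagram.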

At what level do you want to take over the drive?

-phil
==============================================================================
Phillip (Flip) Lindsay - Commodore Business Machines - Amiga Technical Support
  UUCP: {ihnp4|seismo|caip}!cbmvax!phillip      - Phone: (215) 431-9180
  No warranty is implied or otherwise given in the form of suggestion or 
  example. Any opinions found here are of my making.

cmcmanis@pepper.UUCP (05/20/87)

[ I have been *deep* inside the Amiga DOS file system and there are a few
  inaccuracies here that need to be pointed out...]

In article <132@l5comp.UUCP> scotty@l5comp.UUCP (Scott Turner) writes:
In discussing the Block format Scott says ...
> It's real elegant except for those 24 byte headers at the front of each 
> sector. Without these we could beam data straight from the disk into 
> the user memory buffer. But instead we must read the sector into a 
> private buffer and check its checksum, then crack the 488 bytes out 
> of the sector and put them where the user wanted them in the first 
> place. That isn't too elegant in my view.

Scott is talking to people who want to write drivers here; if you ask for
a block through DOS then DOS will check its checksum for you.

> Bitmap blocks are very simple, they are a linear array of bits. The first 
> bit is the first available block on the drive past the RESERVED blocks. 
> On a floppy there are two reserved blocks so the first bit is block # 2. 
> After that you just run sequentially up till ya hit the end of disk. 
> The best way I found to get a feel for them is with a disk zap style 
> program. Multi-tasking works out real well for this :).

A couple of things: first, the number of reserved blocks is stored in
something called an Environment vector. It is documented in the
libraries/filehandler.h file. Don't hard code two into your routines. Second,
the Bitmap blocks also have a Checksum. Phillip Lindsay's description
puts it as the first longword in the buffer.
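
Neither post spells out how the checksum is computed. A common description of AmigaDOS blocks is that all 128 longwords of a block, checksum included, add up to zero modulo 2^32. The sketch below assumes that rule and assumes, per Phillip's layout, that a bitmap block keeps its checksum in longword 0. Treat both as assumptions to verify against a real disk; the function names are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_LONGS 128u   /* 512-byte block = 128 longwords */

/* Sum of n longwords with wraparound (unsigned arithmetic is modulo 2^32). */
uint32_t longword_sum(const uint32_t *longs, size_t n)
{
    assert(longs != NULL);
    uint32_t sum = 0u;
    for (size_t i = 0; i < n; i++)
        sum += longs[i];
    return sum;
}

/* Value to store in bm[0] so the whole block sums to zero (assumed rule). */
uint32_t bitmap_checksum(const uint32_t bm[BLOCK_LONGS])
{
    return 0u - longword_sum(bm + 1, BLOCK_LONGS - 1u);
}

/* True if the block, checksum included, sums to zero. */
bool bitmap_block_valid(const uint32_t bm[BLOCK_LONGS])
{
    return longword_sum(bm, BLOCK_LONGS) == 0u;
}
```

The longwords are assumed to be in host byte order already, so a little-endian machine reading a disk image would have to swap them first.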

> To keep the dog from sniffing the disk when it's inserted you send a 
> dos packet to the drive's handler process. The packet ID and params 
> are in the AmigaDOS technical reference manual from Bantam (mine is 
> at work, sorry).

First the style comment, referring to various programs you dislike
with slightly permuted names like AmigaDOG, Lettuce, etc appears on
this end to be rather childish and petty. Let's try to be mature
adults and minimize the editorializing shall we? 

Second, the packet Scott mentions is ACTION_INHIBIT. Sending this
packet to a disk device with Arg1 set to TRUE causes its icon to
switch to the Dxx:BUSY icon. Note that if you do this the only
way to read and write data to the disk is with trackdisk calls.
DOS and all other tasks are prevented from accessing your disk.
The feature is of course that when you send it ACTION_INHIBIT with
Arg1 set to FALSE it re-reads the label *and* the Disk.info file
and redisplays the icon on the screen. Therefore if you want your
program to change a disk's icon 'on the fly' you can copy over your
disk.info file, then INHIBIT and de-INHIBIT the drive and the workbench
will pick up the new Icon.
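
The inhibit/de-inhibit sequence can be modelled in portable C. This is only a sketch of the packet contents: on a real Amiga the packet is a struct DosPacket delivered to the handler's message port, and that part is left out here. The struct and function names are hypothetical stand-ins, and using -1 for TRUE is an assumption (the post only says TRUE).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ACTION_INHIBIT 31     /* packet type from the AmigaDOS headers */
#define MODEL_DOSTRUE  (-1)   /* DOS-style TRUE, assumed for Arg1 */
#define MODEL_DOSFALSE 0

/* Hypothetical stand-in for the two DosPacket fields that matter here. */
struct packet_model {
    int32_t type;   /* plays the role of dp_Type */
    int32_t arg1;   /* plays the role of dp_Arg1 */
};

/* Build the packet that inhibits (non-zero) or releases (zero) a drive. */
struct packet_model make_inhibit(int inhibit)
{
    struct packet_model p;
    p.type = ACTION_INHIBIT;
    p.arg1 = inhibit ? MODEL_DOSTRUE : MODEL_DOSFALSE;
    return p;
}

/* The icon-refresh trick: inhibit first, then release, in that order. */
void icon_refresh_sequence(struct packet_model out[2])
{
    assert(out != NULL);
    out[0] = make_inhibit(1);
    out[1] = make_inhibit(0);
}
```

Between the two packets the program would copy the new disk.info file into place with trackdisk calls, since DOS access is blocked while the drive is inhibited.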

> You forgot that idiot checksum at the front of every block on the disk.
> This is a real space and time waster. What, they couldn't trust the disk
> system to figure out that a block was corrupted?

Scott, you presuppose a disk controller. The current Amiga software cannot
detect a corrupt sector. Paula reads in the bits and the blitter decodes
them, then the driver pulls out the sector you want. No automatic CRC
checking is done like on other systems. Also note that on a PC or ST
the sector consists of a start of sector byte, track, and sector number,
start of data, 512 bytes of data, and two bytes of CRC, so they are really
512+2+2+2 (518) bytes long. Yes, it would be possible to make sectors on
the Amiga with a Data Area 512 bytes long but that would needlessly
complicate calculating where in the track buffer to read the data from.
(sector * 536) anyone? The Checksum in the sector provides the same
function as the CRC in everyone else's sectors.

> The 1.2 amigadog also moves things around. While sniffing a hard disk 
> to see how bitmaps work I noticed that it would MOVE the bitmap blocks 
> as they were modified!!! Yuck. I've also seen some fun stuff happen with 
> modified blocks of other sorts. I suspect that the dog releases the 
> previous block used for the modified data and then hunts it up a new 
> one from the current ALLOCATE ME NEXT thingie.

I suspect this is a safety feature. Ideally the new bitmap should be 
created in toto in 'new' blocks on the disk and then the bitmap pages
pointers in the root block written. It is already easy to crash a track
by popping the disk out at the wrong time or have the power go out. If
you crash the bitmap the darn thing Gurus (80000000000B). So the risk
is reduced to a window of one block write.

>    ...I suspect that the also redundant value stored in each block giving
> its sequence in the list is there to support sparse files.

I suspect it is to make rebuilding the file system easier. It certainly
helps in that respect.

>Scott Turner

And of course what posting would be complete without a signature ...
--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis  ARPAnet: cmcmanis@sun.com
These opinions are my own and no one elses, but you knew that didn't you.

dillon@CORY.BERKELEY.EDU (Matt Dillon) (05/21/87)

Chuck McManis writes:
>Scott, you presuppose a disk controller. The current Amiga software cannot
>detect a corrupt sector. Paula reads in the bits and the blitter decodes
>them, then the driver pulls out the sector you want. No automatic CRC
>checking is done like on other systems. Also note that on a PC or ST
>the sector consists of a start of sector byte, track, and sector number,
>start of data, 512 bytes of data, and two bytes of CRC, so they are really
>512+2+2+2 (518) bytes long. Yes, it would be possible to make sectors on
>the Amiga with a Data Area 512 bytes long but that would needlessly
>complicate calculating where in the track buffer to read the data from.
>(sector * 536) anyone? The Checksum in the sector provides the same
>function as the CRC in everyone else's sectors.

	Sorry, Scott is correct.  Although Paula may have no idea what 
it is decoding, the trackdisk.device certainly does.  A sector on the Amiga's
floppy is MFM-formatted as follows (taken from RKM II):

	two bytes of 00 data (MFM = AAAA each)
	two bytes of A1	     (sync bytes.. MFM encoded A1 without clock pulse)
	one byte of format-byte	(Amiga 1.0 format = FF)
	one byte of track number
	one byte of sector number
	one byte of sectors until end of write
	16 bytes OS recovery (the label area)
	4 bytes of header checksum
	4 bytes of data area checksum
	512 bytes of data

	total: 544 bytes (before MFM encoding) per sector, or about 935K
raw data.  I'm neglecting to tell you all how all of this is MFM encoded
as it is beyond the current conversation.
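
The totals above can be checked by adding up the field list. The sketch below does only that; the 11 sectors per track and 160 tracks (80 cylinders, two sides) are assumptions not stated in this post, though they agree with the 1760-block floppy mentioned elsewhere in the thread. Names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field sizes in bytes before MFM encoding, in the order listed above. */
static const uint16_t sector_fields[] = { 2, 2, 1, 1, 1, 1, 16, 4, 4, 512 };

/* Total size of one sector, header fields plus data. */
uint32_t sector_bytes(void)
{
    uint32_t total = 0u;
    for (size_t i = 0; i < sizeof sector_fields / sizeof sector_fields[0]; i++)
        total += sector_fields[i];
    return total;
}

/* Raw (pre-MFM) capacity of a disk with the given geometry. */
uint32_t raw_disk_bytes(uint32_t sectors_per_track, uint32_t tracks)
{
    assert(sectors_per_track > 0u && tracks > 0u);
    return sector_bytes() * sectors_per_track * tracks;
}
```

With the assumed geometry, 544 * 11 * 160 = 957440 bytes, which is exactly 935K.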

	So you see, an Amiga sector contains just as much information as
normal disk sector formats.  The big difference is that there is no 
spacing between sectors, and only a small spacing allowing for write 
DMA overlap.  Note that the sector is fully labeled with track
and sector number, and completely checksummed.  The DOS checksum IN the data
has nothing to do with the sector format and is a DOS thingy.  It *does*
provide a nice double-check on data integrity though.

	The 'sectors until end of write' is used for optimization.  You see,
when the Amiga DMA's a track in, it doesn't really care where in the track
it starts DMAing (thus, there is no lag time waiting for some specific
sector to come around).  This field allows the driver to pick out just where
it actually started writing when it wrote the track out previously.  This
is needed because there *is* a certain amount of overlap when writing out
sectors and it's nice to know where the garbage is expected to be.  The
disk controller can be given a word to search for before starting DMA, so
you at least get the track on a sector boundary.
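
One way a driver might use that field, sketched in C. This assumes the field counts down across the track, so that the sector holding 1 is the last one before the gap; the post does not spell this out, so treat it as an assumption. The function name and the array of extracted field values are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * until_gap[i] is the 'sectors until end of write' byte of the i-th sector
 * in the order it landed in the track buffer.
 * Returns the buffer index of the first sector written after the gap,
 * or n if no sector carries the value 1 (for example, a damaged track).
 */
size_t first_sector_after_gap(const uint8_t *until_gap, size_t n)
{
    assert(until_gap != NULL && n > 0u);
    for (size_t i = 0; i < n; i++) {
        if (until_gap[i] == 1u)
            return (i + 1u) % n;   /* wrap around if the gap is at the end */
    }
    return n;
}
```

If the buffer holds the values 4 3 2 1 11 10 9 8 7 6 5, the gap sits after the fourth sector and the previous write began with the fifth.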

					-Matt

scotty@l5comp.UUCP (Scott Turner) (05/22/87)

In article <19347@sun.uucp> cmcmanis@sun.UUCP (Chuck McManis) writes:
>Scott is talking to people who want to write drivers here; if you ask for
>a block through DOS then DOS will check its checksum for you.
Actually I was talking to people who want to SPEED the file system UP!

>A couple of things: first, the number of reserved blocks is stored in
>something called an Environment vector. It is documented in the
>libraries/filehandler.h file. Don't hard code two into your routines. Second,
>the Bitmap blocks also have a Checksum. Phillip Lindsay's description
>puts it as the first longword in the buffer.
And as we shouldn't assume 2 reserved blocks, we shouldn't assume the disk has
1760 blocks on it either. Whenever I switch disk editors this is the SECOND
thing I have to fix after I change the string used in open device from
'trackdisk.device' to the name of my hard disk driver.

>First the style comment, referring to various programs you dislike
>with slightly permuted names like AmigaDOG, Lettuce, etc appears on
>this end to be rather childish and petty. Let's try to be mature
>adults and minimize the editorializing shall we? 
Not this guy! I'm afraid AmigaDOG is here to stay for me at least. :) Also,
the 'correct' name for the dos is trademarked and hence being good law abiding
people we should make note of this fact when we use it. Much simpler to use
AmigaDOG.

>(sector * 536) anyone? The Checksum in the sector provides the same 
>function as the CRC in everyone elses sectors. 
Chuck, you assume I'm worried about speed on the FLOPPY drive! My hard disk
drive system uses an ECC scheme which not only detects errors it attempts to
FIX them. Hence the AmigaDOG checksum is literally a lead weight around the
neck of hard disk drive systems.

>I suspect this is a safety feature. Ideally the new bitmap should be 
[more text]
>by popping the disk out at the wrong time or have the power go out. If
>you crash the bitmap the darn thing Gurus (80000000000B). So the risk
>is reduced to a window of one block write.
Whenever I get a crashed bitmap I get a repair session with the painfully slow
disk-validator from the L: directory. Haven't had it guru me yet. Besides Chuck,
the FLOPPY is written a TRACK at a time... This 'window of one block write'
only applies on a hard disk system. Unless the dog has been naughty and
scattered the bitmap from one end of the disk to the other. :) (I can't help
it, but calling it dog rather than dos just leads to more effective mental
images at times...)

While I'm on the disk-validator... Anyone out there want to recode this thing?
Two reasons: 1. The current one I think suffers from non-optimized disk seeks
(just like examine_next) and has a VERY slow routine for rebuilding the map
after it scurries all over the disk digging up data. 2. Once re-written it
could be made available for use by people wishing to make PD bootable disks.
Thus getting one more hurdle out of the way. Just think, we could kill two
birds with one stone. Faster rebuilds, and peace of mind for C-A corporate.

>I suspect it is to make rebuilding the file system easier. It certainly
>helps in that respect.
So do the NEXT pointers Chuck. The NEXT pointers give the sequence information
for rebuilding as well. BUT NOT for a "sparse" system since there's no way for
these NEXT pointers to encode a block that isn't there... :)

Scott Turner


hamilton@uxc.cso.uiuc.edu (05/25/87)

scotty@l5comp says:
> >I suspect it is to make rebuilding the file system easier. It certainly
> >helps in that respect.
> So do the NEXT pointers Chuck. The NEXT pointers give the sequence information
> for rebuilding as well. BUT NOT for a "sparse" system since there's no way for
> these NEXT pointers to encode a block that isn't there... :)

    sure there is.  since block 0 can never appear in a file, zero can
be used as a magic cookie, just like i-numbers on a Unix filesystem.

	wayne hamilton
	U of Il and US Army Corps of Engineers CERL
UUCP:	{ihnp4,seismo,pur-ee,convex}!uiucuxc!hamilton
ARPA:	hamilton@uxc.cso.uiuc.edu	USMail:	Box 476, Urbana, IL 61801
CSNET:	hamilton%uxc@uiuc.csnet		Phone:	(217)333-8703
CIS:    [73047,544]			PLink:  w hamilton

scotty@l5comp.UUCP (Scott Turner) (05/27/87)

In article <172200060@uxc.cso.uiuc.edu> hamilton@uxc.cso.uiuc.edu writes:
>    sure there is.  since block 0 can never appear in a file, zero can
>be used as a magic cookie, just like i-numbers on a Unix filesystem.
If I understand the use of these next pointers, 0 is used to indicate end of
the line...

And even if it weren't there's only ONE next field per block. If that field is
used to say "The next block is sparse" how do I get to the next non-sparse
block?

Scott Turner

