w-colinp@microsoft.UUCP (Colin Plumb) (12/12/88)
Having gone through one file system design (Cogent XTM, if you've heard of it; it's appeared in Byte a bit) and seen what great speed improvements can be made by avoiding the Unix-style block allocation, I sat down to design a file system specifically for Amiga floppies. After all, they should be extremely fast - no waiting for the right sector to come around, just 5.5K/rev * 5 revs/sec = 27.5K/sec. But somehow they never quite make it.

This is the result of about a day and a half of thinking. I throw it out here for comment. I particularly need to think more about making ExNext faster. (I think it's a pretty broken way to do directories; the "where you are in a directory" state should be distinct from the lock and file info. Even if this were done, there are performance hits in not allowing a person to get less information about each file in a directory.)

Anyway, the Bat Out Of Hell file system:

Observation: on the Amiga, there are 2 costs to reading a sector:

- seek time, 3 ms/track or thereabouts; 240 ms (1/4 sec) worst-case. There's also head-settling time (15 ms?), but I forget how much.
- read time, 60 sec/min / 300 rpm * 2 = 0.4 sec/track. 400 ms is significantly more expensive than even a worst-case seek.

Thus, while it is a *very* good idea to minimise the number of tracks a file is distributed among, it doesn't matter that much which ones. Note that this suggests that if trackdisk.device only read the surface it had to, things would go significantly faster.

Once you have a track, doing just about anything with its contents is essentially free. After all, decoding it involves a blitter pass. A bunch of data shuffling can't cost more.

This whole thing suggests an extent-based file system, where hopefully extents are whole tracks. The problem with extent-based file systems is fragmentation. A sector-based scheme lets you use any free space for any allocation request, while an extent-based one may not have an extent of the proper size.
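The timing argument above can be checked with a quick sketch (the constants are the post's estimates, not measurements):

```python
# Timing model from the post: ~3 ms per track of seek, so a worst-case
# seek across all 80 cylinders is ~240 ms, while a full track read of
# two revolutions at 300 rpm is 0.4 s. The read dominates any seek.
MS_PER_TRACK_SEEK = 3.0
NUM_TRACKS = 80
REV_S = 60.0 / 300.0            # one revolution at 300 rpm = 0.2 s
TRACK_READ_S = 2 * REV_S        # ~2 revs to read and decode a track

worst_seek_s = MS_PER_TRACK_SEEK * NUM_TRACKS / 1000.0
print(TRACK_READ_S, worst_seek_s)   # -> 0.4 0.24
```

So even the worst seek is cheaper than one track read, which is what justifies caring about *how many* tracks a file spans rather than *which* tracks.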
But, we have observed that accessing any specific track takes approximately 1/2 second, wherever it is, and doing anything with a track is free. This suggests a scheme which uses extents on a track basis (and compacts garbage), but allocates tracks one at a time.

Thus, the scheme proper:

A floppy consists of 80 tracks. A track is 11K of data, and 352 bytes of header-area data. (Also known as "OS recovery info".) All free space on a track is compacted, so the free map is simply a count of free space on a track (16 bits) times 80 tracks - less than the current 220 bytes of bitmap!

A file inode contains pointers to extents, each containing a track number and the amount of data stored there. I'll call it an inode, even though it contains file name, comment, etc, just like a regular AmigaDos file header block. It's also variable-sized, growing or shrinking when a comment is changed, when the file is renamed, or when an extent is added. Note that any integral number of bytes can be stored in an extent - great space efficiency, especially for small (e.g. ENV: entries) files!

Each track, in the header area (and overflowing onto the track proper if necessary), keeps an index of all the extents stored on that track. Given an inode number and extent number, it can give you the offset into the track at which the extent begins. This way, you can compact the contents of a track without having to fix up the inodes which point to it. The track index also has an index of the inodes stored on the track, but mostly they're treated the same as part of a file's data.

Note that, assuming inodes are always larger than 13.75 bytes, you can't fit more than 64K of them on a disk, so you can use 16-bit inode numbers (useful to keep the track indices small).

Now, there are two kinds of extents the file system worries about: fixed-size ones (anything other than the last extent in a file - they're never going to change size) and variable-sized ones (the last extent in a file, and inodes, which can grow).
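A minimal sketch of that free map (all names mine; sizes as given above): 80 sixteen-bit free-byte counts, 160 bytes in all, and track selection is just a scan of the array:

```python
TRACK_BYTES = 11 * 1024      # 11K of data per track
NUM_TRACKS = 80

# The whole free map: one free-byte count per track (160 bytes on disk,
# versus the standard file system's 220-byte block bitmap).
free_map = [TRACK_BYTES] * NUM_TRACKS

def find_track(size, prefer=0):
    """Pick a track with at least `size` free bytes, closest to `prefer`.
    Seek distance matters a little; which track we pick matters little."""
    candidates = [t for t in range(NUM_TRACKS) if free_map[t] >= size]
    if not candidates:
        return None              # disk full (for this request size)
    return min(candidates, key=lambda t: abs(t - prefer))

def allocate(track, size):
    # Since all free space on a track is compacted, a count is enough.
    assert free_map[track] >= size
    free_map[track] -= size
```

Because every track's free space is kept compacted, a single count per track really is all the allocation state needed; there is no per-block bookkeeping at all.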
For the former, you want to pack them into tracks as tightly as possible, while you want to leave some slack on any track that has the latter. So the algorithm for adding data to a file works like this:

- If there is no room in the file's last extent (get the last extent's track number, look at that part of the free map), then allocate an extent of the appropriate size somewhere. You may want to pick the track to minimise seek time, but, as I said, except in extreme cases, this isn't crucial.

- If there is enough room in the file's last track to hold the data, then repack the data on the track so the "hole" is immediately after this extent (e.g. if it used to be aaaaabbcccddddeeeeee----------ff, and you write to extent c, rearrange things so it's packed like aaaaabbccc----------fddddeeeeeef), and write the data to the end of extent c. Note that this moves as much data as keeping the hole in the same place and enlarging the extent just a little bit, but eliminates the move if you subsequently write more data (likely).

- If there isn't enough room on the track, then even the obvious "top off this track and allocate a new extent" will require writes to two tracks, but there are better ways to use them:

-- If possible, reallocate the whole extent on a different track. This saves a track read in future, and avoids growing an inode. Note that, if there is a spare track on the disk, this algorithm will put 11K of a slowly-growing file in it.

-- If not, try to find the largest available extent on a fixed-size track (that is, one with only fixed-size extents on it), copy data into it starting with the beginning of what's already on disk, and recurse for what remains to be stored somewhere. Hopefully, it will fit on the current track. Together with the last rule, this will make sure free tracks are used up by files that can use them.

-- If all else fails, fill up a track with variable-sized extents on it.
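The cascade above might be sketched like this (names are mine; `free` maps track number to free bytes, and the repacking itself is elided):

```python
def plan_append(free, last_track, extent_len, size):
    """Decide how to append `size` bytes to a file whose last extent,
    `extent_len` bytes long, lives on `last_track` (None if no extents).
    Returns a (strategy, track) pair; the caller does the real I/O."""
    # Room on the last extent's own track: repack so the hole follows
    # the extent, then just extend it in place.
    if last_track is not None and free.get(last_track, 0) >= size:
        return ("repack-and-extend", last_track)
    # Otherwise prefer moving the whole extent to a roomier track:
    # this saves a future track read and avoids growing the inode.
    need = extent_len + size
    roomy = [t for t in free if t != last_track and free[t] >= need]
    if roomy:
        return ("reallocate-extent", max(roomy, key=lambda t: free[t]))
    # Last resort: top off the biggest hole and recurse for the rest.
    return ("split", max(free, key=lambda t: free[t]))
```

For example, with `free = {0: 100, 1: 11264}` and a 500-byte last extent on track 0, appending 200 bytes yields `("reallocate-extent", 1)`: the whole extent migrates to the empty track rather than splitting.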
The only other tricky bit lies in enlarging inodes, when an extent is added to them or when they're renamed: since you can't split them, you have to move them, and fix up the pointer in the parent directory. However, note that it's safe to delete the inode from the current track and write it somewhere else, and *then* fix up the parent, since even if the machine crashed before the last stage of this operation, the parent's pointer to the file can be proven bad (the inode isn't in the index).

Nasty point: file deletion or truncation is somewhat expensive, as we can't just mark the blocks as free; we have to go to each track and add the freed space to the hole. Fortunately, we can take a short-cut with tracks on which the deleted file is the only thing there, and simply mark them as empty. Since we hope for a good number of these, it should reduce the work considerably.

Well? For those who have made it to the end of this, I believe this will provide fast access and near-optimal disk usage. But I'll have to write it to find out. Any suggestions before I do that? I do, as I said, need to think about making ExNext faster. The fact that inodes are *much* smaller than they are under OFS should help.

Question: If I write a full track, does trackdisk.device bother to read it first? After all, my FS knows when there's nothing on the track worth keeping.

(Oh, damn... I just realised that, if I can migrate inodes across track boundaries, I'll have to fix up all the references on each extent. I can get around this by assigning each inode an address and a unique i.d. number (16 bits would be enough if I could track them, but I should use 32 if I want blind "i.d.'s will never get reused" to work) which the per-track extent indices would use, but it's ugly and takes up more memory. Any ideas? One possibility is to try to migrate a data extent, or failing that, the inode with the fewest extents to patch up, but that's pretty baroque.)
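The deletion short-cut above, sketched (names are mine; `owners[t]` counts how many extents, from any file, currently live on track `t`):

```python
TRACK_BYTES = 11 * 1024

def delete_file(extents, free, owners):
    """Free a file's extents. `extents` is a list of (track, length)
    pairs; `free` maps track -> free bytes. Returns the tracks that
    actually need a read-repack-write cycle to close up the hole."""
    rewrites = []
    for track, length in extents:
        owners[track] -= 1
        if owners[track] == 0:
            free[track] = TRACK_BYTES   # file was alone there: mark empty
        else:
            free[track] += length       # shared track: must repack it
            rewrites.append(track)
    return rewrites
```

Only the shared tracks incur real I/O; a track wholly owned by the dead file costs nothing beyond a free-map update, which is the "good number of these" the post hopes for.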
For the advanced: figure out how to leave some faint hope for file undeletion with this scheme. I have some ideas, but they're not pretty. Note that for recovering from a trashed file system things are great, as the track index lets you identify all the bits of a file and put them in order.

More questions: should I bother allocating on 1-byte boundaries? 16-byte boundaries will work almost as well, and let me pack track number and # of bytes into 16 bits.
-- 
	-Colin (uunet!microsof!w-colinp)
w-colinp@microsoft.UUCP (Colin Plumb) (12/14/88)
I realise it's pretty obvious, but just in case I confused anybody when describing how adding data to extent "c" in a track packed like so: aaaaabbcccddddeeeeee----------ff would result in its being repacked as aaaaabbccc----------fddddeeeeeef, that was a typo. It should have been aaaaabbccc----------ddddeeeeeeff, with f staying where it was. Sorry.

P.S. I've received a few replies, but so far it seems as if I'm the best FS designer in this newsgroup. Isn't anyone going to prove I'm not? :-) Really, there must be *something* wrong, or the Amiga wouldn't have been designed with sectors in the first place.
-- 
	-Colin (uunet!microsof!w-colinp)
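The corrected repacking can be tried as a toy on strings (a sketch only; real code would work on the track buffer and its extent index):

```python
def repack_after(track, ext):
    """Move the free space so it sits immediately after extent `ext`,
    leaving extents past the hole (like "f") in place. Extents are runs
    of one letter; '-' marks free space."""
    hole = track.count('-')
    packed = track.replace('-', '')              # extents, in disk order
    cut = packed.index(ext) + packed.count(ext)  # end of target extent
    return packed[:cut] + '-' * hole + packed[cut:]

print(repack_after("aaaaabbcccddddeeeeee----------ff", 'c'))
# -> aaaaabbccc----------ddddeeeeeeff
```

Note that only d and e move; f, already past the hole, stays put, which is the point of the correction.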
scotth@harlie.SGI.COM (Scott Henry) (12/16/88)
From article <65@microsoft.UUCP>, by w-colinp@microsoft.UUCP (Colin Plumb):
> I realise it's pretty obvious, but just in case I confused anybody when
> describing how adding data to extent "c" in a track packed like so:
> aaaaabbcccddddeeeeee----------ff would result in its being repacked as
> aaaaabbccc----------fddddeeeeeef, that was a typo. It should have been
> aaaaabbccc----------ddddeeeeeeff, with f staying where it was. Sorry.
>
> P.S. I've received a few replies, but so far it seems as if I'm the best FS
> designer in this newsgroup. Isn't anyone going to prove I'm not? :-)
> Really, there must be *something* wrong, or the Amiga wouldn't have been
> designed with sectors in the first place.
> -- 
> 	-Colin (uunet!microsof!w-colinp)

The biggest problem that I see with your scheme is that it will only work with floppy disks (hard disks need not apply). The hard disks that I am aware of know about sectors (frequently, in the case of SCSI, in the drive itself). OFS works fine on HDs, but your scheme would need a separate file system for HDs. I know of (at least) one extent-based HD file system (Silicon Graphics uses one), but it uses sectors as the level of granularity (this is all I know about it, except that it is very fast).

Scott Henry <scotth@sgi.com>
#include <std/disclaimer.h>
-- 
Scott Henry <scotth@harlie.sgi.com> {or, also on the Internet:} <skywalker@cup.portal.com>
#include <std_disclaimer.h>
steveb@cbmvax.UUCP (Steve Beats) (12/16/88)
In article <65@microsoft.UUCP> w-colinp@microsoft.UUCP (Colin Plumb) writes:
>P.S. I've received a few replies, but so far it seems as if I'm the best FS
>designer in this newsgroup. Isn't anyone going to prove I'm not? :-)
>Really, there must be *something* wrong, or the Amiga wouldn't have been
>designed with sectors in the first place.

I did take a brief scan over your original posting and will grant that it has a few nice features *FOR FLOPPIES*. The point is that the filing system really makes minimal use of the tracks/sectors/heads information. The only thing affected by this is the placement of file headers with respect to the data blocks. If I were to take your filing system and try to hook it up to some remote device driver (one that gets its data across a network for instance) all of the optimisations you mentioned would do nothing but waste time! Even ramdrive.device (although it does present itself as a disk) would not benefit from this scheme. The beauty of the filing system as it stands now is that it can interface to just about any device that takes a logical offset for read and write commands.

This is similar to a recent thread we had about bad block mapping in the file system. I don't think it's appropriate for the filing system to become too rigidly attached to the geometry of the media it's accessing. While local optimisations are important, you have to be really careful about things that change from one drive type to another. I still maintain that the driver (which *HAS* to know about the media it is accessing) should be responsible for bad block mapping. In the same vein, the driver should be responsible for local speed optimisations based on the hardware it is controlling. (I've actually been forced to practice what I preach, since I'm just finishing up a new device driver for a new hard disk controller.)
It's interesting to note that this viewpoint can be applied to lower-level system software too. For instance, many hard disk drivers provide some speedups by cacheing tracks in an LRU manner. While this does speed up the majority of slow devices, it ends up wasting CPU time and memory when a higher-dollar SCSI drive is used. A case in point is the Quantum 40S and 80S drives, which have a 64K read-ahead cache built in. All these drivers end up doing is cacheing the cache; a somewhat pointless exercise.

	Steve
limonce@pilot.njin.net (Tom Limoncelli) (12/16/88)
I really like your idea. If you combine that with full track cacheing you'd have something. The only problem is that:

1. It only works for floppies -- people are moving to hard drives.
2. Getting people to convert will be difficult unless you can convince all the non-hard-drive people to convert. Of course, when they can afford hard drives they'll switch to the FFS.

It does make for a very interesting project. Possibly a good project that could get yourself published in the ACM and get some people thinking about file systems that aren't sector-based.

-Tom
-- 
Tom Limoncelli  Drew University  Madison NJ  201-408-5389
tlimonce@drunivac.Bitnet  limonce@pilot.njin.net
"Fences make good neighbors" -Frost   "I want an MMU" -Me
Standard disclaimer?  No, we're still on the dpANS disclaimer.
w-colinp@microsoft.UUCP (Colin Plumb) (12/16/88)
< P.S. I've received a few replies, but so far it seems as if I'm the best FS
< designer in this newsgroup. Isn't anyone going to prove I'm not? :-)
< Really, there must be *something* wrong, or the Amiga wouldn't have been
< designed with sectors in the first place.

In article <23498@sgi.SGI.COM> scotth@harlie.SGI.COM (Scott Henry) writes:
>The biggest problem that I see with your scheme is that it will only work
>with floppy disks (hard disks need not apply). The hard disks that I am aware
>of know about sectors (frequently, in the case of SCSI, in the drive itself).
>OFS works fine on HD's, but your scheme would need a separate file system
>for HD's. I know of (at least) one extent-based HD file system (Silicon
>Graphics uses one), but it uses sectors as the level of granularity (this
>is all I know about it, except that it is very fast).

True, I was being pretty silly; the rest of the world uses sectors, so dropping them can cause problems. Still, it's an idea to play with. As far as my filesystem design is concerned, I explicitly said it's only for floppies. I have been involved in writing an extent-based file system (yes, we (old we, not Microsoft) also use sector granularity), and it does go plenty fast. I was planning on reimplementing it for the Amiga, but as long as I only have floppy drives to play with, I decided to optimise the hell out of that particular piece of hardware.

Now, come on, people... I'm not entirely ignorant, but my Amiga-specific abilities are pretty low. Aren't there any horrible, insuperable problems with taking the drive away from DOS, or automatically recognising the format of floppies inserted, or handling disk changes, or making WB recognise the disk, or something?

And I said nothing about directory structure... my current idea is simply to let the directory inode contain a variable-sized (up to 1 track) array of struct {short inode; short hashval}, putting a limit of about 2500 files in a directory.
The longer hash value should essentially eliminate collisions, and the fact that all inodes are known should speed up directory searching. Is this fair, or should I do something else? If I allow inodes to migrate across tracks, what does this mean for ExNext() if I want it to proceed in track order?

Also:
- Should it be case-sensitive or not? (I'll at least throw in a compile-time switch.)
- Should I take the filename out of the inode and allow hard links?
- Should it have Unix-like open-file-deletion semantics?
- Should I add something (another datestamp?) to inodes?
- Should it allow sparse files?
- Will FCFS access be good enough, or should I try to optimise accesses in the queue?
- Any special packets I should implement?
- Isn't *anyone* going to flame me for not building in some super-nifty virus-checking algorithm or something? :-) :-) :-) :-)

Finally, a question for C-A: will WB 1.4 support variable-sized lock structs? If so, I might find it easier to use them now, not work with this WB, and just wait for 1.4. It's unlikely to get out of the CLI-bigot community before then, anyway. BTW, thanks for the RDF: device; it's a pointer in the right direction.
-- 
	-Colin (uunet!microsof!w-colinp)
peter@sugar.uu.net (Peter da Silva) (12/16/88)
In article <65@microsoft.UUCP>, w-colinp@microsoft.UUCP (Colin Plumb) writes about his extent-based file system...
> Really, there must be *something* wrong, or the Amiga wouldn't have been
> designed with sectors in the first place.

The problem is that this file system loses badly when you do switch to a device with real sectors. So you have three choices:

1) Make floppies faster (maybe) and hard disks slower.
2) Make hard disks faster and floppies slower.
3) Have completely different file systems on floppies and hard disks.

Choice three adds considerably to the complexity, and opens the door for different hard disk manufacturers to define their own formats. It's the best choice for all-out raw speed. Luckily, it's relatively easy for you to implement a version of it (3b). Choice one is a problem, since it'll hurt sales of higher-end Amigas and make it seem like a game machine... which is a problem anyway.

Choice two is the one Commodore chose. I think it's good enough. You can always switch to 3b any time you want to write the code.
-- 
Peter da Silva  `-_-'  peter@sugar.uu.net
Have you hugged  U  your wolf today?
Disclaimer: My typos are my own damn busines#!rne
w-colinp@microsoft.UUCP (Colin Plumb) (12/16/88)
Thanks for the comments... I'm still looking for someone to disagree on implementation points. I've convinced myself it'll work, but just today failed to convince someone else I'd thought it through properly. And if I can't describe it to a person, I might not do too well describing it to a computer. Did anyone else get totally lost?

In article <Dec.15.14.34.43.1988.10159@pilot.njin.net> limonce@pilot.njin.net (Tom Limoncelli) writes:
>I really like your idea. If you combine that with full track cacheing
>you'd have something. The only problem is that:
>1. it only works for floppies -- people are moving to hard drives

Sigh. I know. I will, as soon as I can find a cheap A1000->Zorro II adapter. I'm not planning on selling this as a commercial product or anything, I just want to mess around inside Amy. (Kinky!) It's for fun. Utility would be a nice side-effect.

>2. Getting people to convert will be difficult unless you can
>convince all the non-hard drive people to convert. Of course, when
>they can afford hard drives they'll switch to the FFS.

I wasn't really planning on taking the world by storm, more providing a nifty hack which the more adventurous might like to use. If it's widely adopted, I'll be most gratified, but I'm not planning that far ahead.

>It does make for a very interesting project. Possibly a good project
>that could get yourself published in the ACM and get some people
>thinking about file systems that aren't sector based.

I say... my hacking, a research paper? I wonder if I could...

In article <5512@cbmvax.UUCP> steveb@cbmvax.UUCP (Steve Beats) writes:
>I did take a brief scan over your original posting and will grant that it
>has a few nice features *FOR FLOPPIES*. The point is that the filing
>system really makes minimal use of the tracks/sectors/heads information.

True, true, this is not at all a general-purpose thing. But, given that I only have floppy drives to play with, I decided to see what I could do with them.
I was rather hoping someone would take issue with *how* I do this, not my decision to restrict myself to floppies. Even if it does not turn out to benefit mankind in any great way, I think it would be a fun thing to play with, and I hope some other people would like to contribute ideas. I'd be delighted if it turned out to be useful, but the hacking is the important part.

>If I were to take your filing system and
>try to hook it up to some remote device driver (one that gets its data
>across a network for instance) all of the optimisations you mentioned
>would do nothing but waste time! Even ramdrive.device (although it does
>present itself as a disk) would not benefit from this scheme. The beauty
>of the filing system as it stands now, is that it can interface to just
>about any device that takes a logical offset for read and write commands.

I know. Over a network, you want to minimise the amount of data sent, and where it is physically located doesn't make much difference; the network is the bottleneck. The ramdrive has constant-time access, so any attempt to cluster data is a waste of cycles.

I have my own ideas about how to write a general-purpose file system, better than the FFS. However, they're somewhat better suited to hard drives, where crossing a track boundary from block n to n+1 isn't particularly expensive; if you set things up right, the one-track seek time is no more than the rotational latency you'd pay staying on the same track. Since I only have a floppy to try things out on, I decided to try to optimise the design for that. Is this decision all that horrible?

>I still maintain that the driver (which *HAS* to know about the media it
>is accessing) should be responsible for bad block mapping. In the same
>vein, the driver should be responsible for local speed optimisations
>based on the hardware it is controlling.

I too like hardware that does this for me.
But if I go to all sorts of trouble to cluster my seeks and find the disk has mapped cylinder 5 to cylinder 1001, I'm pissed. Most of a file system's job is to find space among a bunch of already-allocated blocks. It only takes a little bit of tweaking to consider bad blocks as permanently allocated, so why not use this ability?

You can argue both sides. Your side seems to be winning. It's not really worth losing sleep over. (One in the morning? *I*'m losing sleep over it. G'night, all.)
-- 
	-Colin (uunet!microsof!w-colinp)
andy@cbmvax.UUCP (Andy Finkel) (12/17/88)
In article <85@microsoft.UUCP> w-colinp@microsoft.UUCP (Colin Plumb) writes:
>Finally, question for C-A: will WB 1.4 support variable-sized lock structs?
>If so, I might find it easier to use them now, not work with this WB, and
>just wait for 1.4. It's unlikely to get out of the CLI-bigot community
>before then, anyway.

Workbench 1.0, 1.1, 1.2, and 1.3 support variable-sized locks. Locks are allocated by the filesystem. Programs ask for them, ask the file system to duplicate them, free them, etc. So, yes, variable-sized locks are supported, as long as you maintain the 5 initial documented fields. (6 if you count that your lock should be allocated the way DOS likes to see memory, i.e. with a size at the beginning.) (The ram-handler uses larger-than-standard locks, btw, so I'm fairly sure you can extend them :-) )
-- 
andy finkel		{uunet|rutgers|amiga}!cbmvax!andy
Commodore-Amiga, Inc.
"Possibly this is a new usage of the word 'compatible' with which I was previously unfamiliar"
Any expressed opinions are mine; but feel free to share. I disclaim all responsibilities, all shapes, all sizes, all colors.
sean@ms.uky.edu (Sean Casey) (12/20/88)
Heck, I'd be happy with something that wouldn't thrash when two processes use disk at the same time. Sean -- *** Sean Casey sean@ms.uky.edu, sean@ukma.bitnet *** Who sometimes never learns. {backbone site|rutgers|uunet}!ukma!sean *** U of K, Lexington Kentucky, USA ..where Christian movies are banned. *** ``My name is father. You killed my die. Prepare to Inigo Montoya.''
w-colinp@microsoft.UUCP (Colin Plumb) (12/21/88)
In article <10716@s.ms.uky.edu> sean@ms.uky.edu (Sean Casey) writes:
>Heck, I'd be happy with something that wouldn't thrash when two processes
>use disk at the same time.

I'll see what I can do. This basically boils down to adequate cacheing.

Technical questions: So far, I think the way to auto-recognise my floppies is to patch the RootNode RestartSeg pointer to first go to my validator, which checks to see if it's my disk format, then passes control to the existing validator, which will handle the existing file system (and FFS, in 1.4). Questions:

- Is this the right thing to do? Is there a better way to stick my file system in, or will there be one in 1.4?
- How do I lock the RootNode structure? Since this field gets read rarely and modified even more rarely, I could go for years and never lose a race condition, but I would like to do it right. Forbid()/Permit()?
- What info does the disk validator get called with?
- What does the disk validator do, besides fsck and adding an entry to the device list?
- Is there a way to get the slightly magic df0: and df1: aliases to work? I don't really know how they work.

Thanks for the help!
-- 
	-Colin (uunet!microsof!w-colinp)
deraadt@xenlink.UUCP (Theo A. DeRaadt) (12/21/88)
In article <10716@s.ms.uky.edu>, sean@ms.uky.edu (Sean Casey) writes:
> Heck, I'd be happy with something that wouldn't thrash when two processes
> use disk at the same time.

How about SetFunction() on the vectors of the trackdisk.device, so that somehow the requests to the trackdisk can get sorted? I am still fairly well in the dark about devices/handlers... so I'm just posing a question: is there a way to do something like this?

	<tdr.
w-colinp@microsoft.UUCP (Colin Plumb) (12/21/88)
Since this design I'm working with is completely track-based, it's easy to tell if there is any data I'm interested in on a track. If there isn't, I don't really need to read the track before writing it, saving considerable time. It appears that trackdisk.device always reads in a track before writing it, even if it has a full track's worth of data, so I'd have to use the forbidding-sounding ETD_FORMAT (since I use the header label area).

However: it is, I suppose, possible for a disk drive to mis-seek. I seem to recall the number "1 in 10^6" used in some related connection on a floppy spec sheet. Is it, in fact, a Bad Idea to write without looking first? It's an awfully tempting performance optimisation... If I buffer it properly, most of a copy of a large file can be done this way. (Another problem is that, if I do blind writes, I had better be pretty darn sure the track *is* empty, and not just that I didn't update the free list since writing to it. But this, I know how to handle.)

Question: I thought that trackdisk.device read both sides of a cylinder when doing a track read. But now that I look, I can't find evidence either way. Does it read one or two sides?

Thanks for any hints!
-- 
	-Colin (uunet!microsof!w-colinp)
w-colinp@microsoft.UUCP (Colin Plumb) (01/29/89)
Well, I've settled on a disk format for my faster file system. I use variable-length filename and comment fields, since everything is variable-sized anyway. I'm rather proud of the space savings over standard AmigaDOG, so I'm bragging to the net. (Really, I want *negative* comments.) For those who didn't see my other postings, you may get a bit lost, but the important thing is that this is exclusively for Amiga floppies. That means that I'm willing to do a lot of work in memory to avoid more seeks, and space utilisation is important. Anyway, it goes like this:

- A disk is 160 tracks, with track 0 having 1024 bytes reserved for boot blocks. What's a sector? Never heard of them... :-)

- Data is allocated within tracks on single-byte boundaries.

- A track is a collection of extents, each with a 4-byte header (inode # and size), and possibly a free extent (which is handled in a slightly more complex way, since it can be only 1 byte long). The inode number is the inode number for inodes (gee), or the inode number of the owner of a data extent. An inode number encodes the track it's on, in a slightly tricky way, as it's possible to have 276 inodes on a track, but as there are 160 tracks, we can't just have 9- and 7-bit fields. (Yes, it's possible to fit over 32768 empty files on a floppy. I was rather amazed at this result myself.)

- Timestamps are stored in 1 32-bit word, a la Unix, not in 3, a la AmigaDOG.

- The boot blocks contain a pointer to the free map extent, which has some magic inode field in its header, and contains 4 bytes of volume creation time, 4 bytes of volume modification time, 320 bytes of free map (2 bytes of count of free bytes for each track) and 2 bytes of inode # of the root directory. A count of -1 free bytes on a track signals an invalid free map. Why waste a byte?
:-)

- A directory, in addition to its 4-byte header, contains 4 bytes of timestamp, 4 bytes of protection, 2 bytes of parent inode (a magic value for the root directory), 1 byte of name length, 1 byte of comment length, strlen(name) + strlen(comment) bytes of name and comment, and 3 bytes per entry. The number of entries is computed from the length of the extent stored in the header. An entry is 2 bytes of inode number and 1 byte of hash value. The hash function hasn't been decided yet. The fact that all directory entries are pointed to directly by the directory means that I can sort all the inodes by track when doing a directory listing, and preferentially allocate new directory entries on tracks others are already on.

- A file is the same, except the entries are pointers to extents, and are 1 byte of track # and 2 bytes of length. Files and directories are told apart by the high bit of the name length field, since I don't, to be honest, see any crying need for file names more than 127 bytes long.

- If, by some perverse combination of allocation and deallocation, a file has multiple extents on a single track, they are distinguished by being stored in order, so if extent #15 and #100 of a file are both on track 12, then extent #15 will be stored first. Extents don't have associated sequence numbers. A file's inode is conceptually extent -1 for this purpose. Note that inodes can be recognised from just the header, despite the lack of explicit flags, since the inode number in the header shows the inode is on this track, and the inode is the first extent of all those on this track labeled with that inode number.

So, having said all that, we can compute the space taken up by the formatting of a disk: 1K of boot blocks, 4 bytes of free list header, 330 bytes of free list, 4 bytes of root directory header, and 16 bytes of root directory, plus the length of the name of the disk. Assuming a 1-character name, that's 355 bytes of overhead, 1379 including the boot blocks.
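One workable encoding of those on-disk fields, as a sketch (the layout choices here, like track*276+slot for the "slightly tricky" inode numbering, are my guesses, not necessarily the author's):

```python
import struct

SLOTS_PER_TRACK = 276        # max inodes that fit on one track
NUM_TRACKS = 160

def make_inode_number(track, slot):
    """160 * 276 = 44160 < 65536, so (track, slot) fits in 16 bits
    even though separate 9- and 7-bit fields would not work."""
    assert 0 <= track < NUM_TRACKS and 0 <= slot < SLOTS_PER_TRACK
    return track * SLOTS_PER_TRACK + slot

def inode_track(ino):
    """Recover the track an inode lives on from its number."""
    return ino // SLOTS_PER_TRACK

def pack_dir_entry(ino, hashval):
    """A 3-byte directory entry: 2 bytes of inode #, 1 byte of hash."""
    return struct.pack('>HB', ino, hashval)

def pack_file_entry(track, length):
    """A 3-byte extent pointer: 1 byte of track #, 2 bytes of length."""
    return struct.pack('>BH', track, length)
```

With this numbering, `inode_track(make_inode_number(42, 7))` gives back 42, and both entry types pack to exactly the 3 bytes the format calls for.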
AmigaDOG has 1 block of root directory and 1 block of bitmap, for 1K of overhead, 2K with the boot blocks.

Another figure is the overhead associated with a small file (one that's stored in one extent, almost a certainty for files up to a few K): 3 bytes of entry in the parent directory, 4 bytes of inode header, 16 bytes of inode, assume 1 byte of name, 3 bytes of extent pointer, and 4 bytes of data extent header, for a total of 31 bytes of overhead. AmigaDOS would have a block (512 bytes) of file header, plus the 24 bytes of header in the data block, plus the internal fragmentation (256 bytes average).

The maximum single file that will fit on a disk has 160 extents, one on each track, with 7 bytes of overhead per extent, plus the 24 bytes of per-file overhead computed above, for 1144 bytes of overhead. AmigaDOG would have 1 block of file header per 72 blocks of data, 25 blocks (12800 bytes) total, plus the block header information. In other words, the largest single file you could fit on a floppy would have a total of 1379+1144 = 2523 bytes of file-system overhead. At 880K (= 901120 bytes), that's 898597 bytes, 877.54K.

However, why restrict ourselves to 880K per floppy? Each sector also has 16 bytes of "OS recovery info", 27.5K total on the floppy. It's trivial to use this area as well. This gives us an extra 28160 bytes, for a 926757-byte maximum file size, 905.04K. Mac enthusiasts don't seem to notice the difference between 800K and 880K (heck, it's hard to *see* with slashed zeros). Do you think they'll notice the difference between 800K (less file-system overhead) and 900K (*including* file system overhead)?

Okay, so I'm bragging... Now, the reason I posted all this is to find someone who'll tell me I'm doing something stupid. The main things I'm wondering about are:

- Is it okay to use the "OS recovery info" areas like this, and
- Is it okay to use the 16-byte header areas of the boot blocks?
The former, I think, is okay, even though I'm storing user data there, not quite in accordance with the original plans. The latter I'm not so sure about. It's only 32 bytes, but it's a trifle messier in the logic not to use them.

Other points:
- Even though I can get something like 36,000 empty files on a floppy, I can't get more than 1929 entries in a single directory (i.e. still more files than will fit on a whole floppy under the current AmigaDOG). Does anyone think there's something repugnant about this?
- Can anyone suggest a good hash function? Otherwise, I'll just hunt through Knuth and pick one that strikes my fancy. You want to turn an arbitrary-length filename into 8 random bits. Since this must be case-insensitive and filenames are mostly letters, just masking off the 32 (0x20) bit is probably easiest and doesn't cost any efficiency, but if you have something ingenious in mind, go ahead.

Comments, anyone?
--
-Colin (uunet!microsoft!w-colinp)
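For what it's worth, here is a throwaway sketch of the kind of 8-bit, case-insensitive hash being asked for. The multiplier 37 and the mask-based case fold are my arbitrary choices, not anything Colin proposed.

```c
#include <stdint.h>

/* Hypothetical 8-bit name hash: fold case by masking off the 32 (0x20)
 * bit, then mix with an arbitrary odd multiplier.  The fold is crude --
 * it also remaps digits and some punctuation -- but it's consistent,
 * which is all a hash needs. */
static uint8_t name_hash(const char *name)
{
    uint8_t h = 0;
    while (*name) {
        uint8_t c = (uint8_t)*name++ & (uint8_t)~0x20;  /* 'a' -> 'A', etc. */
        h = (uint8_t)(h * 37u + c);
    }
    return h;
}
```

Any function with these properties would do; the point is only that "foo", "FOO", and "Foo" must hash alike so a case-insensitive lookup can skip non-matching entries on the hash byte alone.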
rminnich@super.ORG (Ronald G Minnich) (02/01/89)
In article <20@microsoft.UUCP> w-colinp@microsoft.uucp (Colin Plumb) writes:
>Comments, anyone?
Well, it seems fragile. Tell ya what: build it, then give it to me, and let me pop the disk in and out as much as I want, in the middle of anything you might do. If it can stand that abuse, I will believe it. But you don't seem to have a lot of redundancy; things look like they could get lost easily.
ron
gsarff@landru.UUCP (Gary Sarff) (02/09/89)
In article <5470@super.ORG>, rminnich@super.ORG (Ronald G Minnich) writes:
>In article <20@microsoft.UUCP> w-colinp@microsoft.uucp (Colin Plumb) writes:
>>Comments, anyone?
>Well, it seems fragile. Tell ya what. build it, then give it to me,
>and let me pop the disk in and out as much as i want, in the middle
>of anything you might do. If it can stand that abuse, I will believe
>it. But you don't seem to have a lot of redundancy, it seems.
>Things look like they could get lost easily.
>ron
Since when could we do this with AmigaDOS's filing system, popping disks in and out as much as we want while AmigaDOS is in the middle of something? I don't think so. Just try swapping a disk between the time the drive light goes out and the time it comes on again a few seconds later, and see what happens: usually "Not a DOS disk". As for not having redundancy, what good is it if the OS doesn't do anything with it? No one seems to be crying over the new FFS doing away with that wonderful redundancy that was talked about so much, say in the Byte issue covering the Amiga: how all the files and their sectors were chained together forwards and backwards so the entire disk could be rebuilt. Sure, after a while we got DiskDoctor and DiskSalv, but the OS as such never tried to fix things when they went bad, DiskDoctor sometimes just made things worse, and we still don't have bad-sector remapping. I don't think the new file system is such a bad idea.
-----------------------------------------------------------------------------
"Do you have any Venezuelan Beaver cheese?" -- The Cheese Shop, Monty Python
I've often wondered why no one makes cheese from cat's milk?
rojas@skippy.berkeley.edu (Joseph Maurice Rojas) (02/24/89)
Dear colleagues,

This is my first posting to a net, so please bear with me... I've just brought my Amiga 500 and Okimate 20 printer up from L.A. but forgot my printer manual. Does anyone out there know the meaning of all the DIP switches (I think that's what you call those tiny switches...) on the plug'n'print cartridge? I would greatly appreciate the info, as I have a lot of letters to write!

Sincerely,
J. Maurice Rojas

P.S.: If my address doesn't go through, it's rojas@math.berkeley.edu.