filbo@gorn.santa-cruz.ca.us (Bela Lubkin) (11/28/89)
[note that I have directed followups to the .tech group]

In article <832@lpami.wimsey.bc.ca> Larry Phillips writes:
>The big advantage of the Amiga filing
                         ^^^^^^^^^^^^^
>system is that a fully known filename is found VERY quickly. The reason for
>this is that filenames are hashed into tables in the directory blocks, and can
>be found within a (relatively) few seeks, simply by looking in the appropriate
>header and offset within that header to find the pointer to the file. Hash
>'collisions' are handled by chaining from one directory entry to the next, but
>even then, there are relatively few reads to do.

Oh, come on. This is and always has been bogus. Even MS-DOS does far better
than AmigaDOS in this area. In the case of a known filename, MS-DOS reads the
directory (which is effectively a single small binary file), searches it for
the named file, and is done. Since each MS-DOS directory entry is 32 bytes and
a sector is 512 bytes, 16 entries fit in a sector; it must read n/16 sectors
(rounded up) to find a file, where n is the position of the desired file
within the directory. In general that means it has to read 1-2 sectors for
MOST files. It would have to read 5 sectors to find ANY file in my Amiga C:
directory (70 files); 2 for my System: directory (31 files). Those sectors
would almost always be physically contiguous. Compare this to AmigaDOS, which
must read an arbitrary number of sectors (often 1, but sometimes more), which
are virtually guaranteed to be all over the disk.

Then compare wildcard operations. MS-DOS must still read the entire
directory, which is still only a few (probably contiguous) sectors. AmigaDOS
must read one sector for each file in the directory, probably scattered all
over the disk. I don't like this tradeoff in the least. This is a case of
overblown data structures. I'd take a monolithic directory any day.
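Bela's sector arithmetic above can be checked with a tiny model. This is only a sketch of his 32-byte-entry / 512-byte-sector numbers; the function name is mine, not anything from MS-DOS:

```python
import math

def msdos_sectors_to_find(position, entries_per_sector=16):
    """Sectors a linear MS-DOS directory scan reads to reach the
    entry at a given 1-based position (32-byte entries in
    512-byte sectors give 16 entries per sector)."""
    return math.ceil(position / entries_per_sector)

# Worst cases from the post: a 70-file C: directory and a
# 31-file System: directory.
print(msdos_sectors_to_find(70))  # -> 5
print(msdos_sectors_to_find(31))  # -> 2
```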
In AmigaDOS' case, this would probably involve keeping the current directory
blocks, but adding a true index for each directory -- a file that has nothing
in it but filenames and the block numbers of their directory blocks. This
could be a file of 34-byte records (30 characters + a longword block number).
It would probably be best to put 15 of these in a sector and use the extra
two bytes for something else, rather than starting a 16th entry there and
carrying it over into the next sector, to make it easier to reconstruct a
damaged file system.

The other argument I've heard for AmigaDOS' directory structure is "unlimited
directory size", which is also bogus. Nothing limits any directory under
MS-DOS, >except< the root directory. All others are just files which are
treated like any other file. So the root directory is limited? Well, that's
the choice MicroSoft made with MS-DOS. It needn't be that way. Just define a
"super-root" directory which is not directly visible to the user and which
has only one entry -- a pointer to the actual root directory, which is then a
normal directory file and can grow to whatever size is needed.

[I speak of MS-DOS for two reasons: I know it well; and it uses SIMPLE
solutions which happen to work pretty well. Perhaps in 6 months I'd be
arguing for some form of UNIX file system or something, but I don't know it
well enough yet. Actually, what I suggest sounds suspiciously like inodes, in
a twisted kind of way.]

Bela Lubkin  * * //   filbo@gorn.santa-cruz.ca.us  CI$: 73047,1112 (slow)
@ * *   //   belal@sco.com  ..ucbvax!ucscc!{gorn!filbo,sco!belal}
R Pentomino  * \X/   Filbo @ Pyrzqxgl +408-476-4633 and XBBS +408-476-4945
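Bela's proposed 34-byte index record packs exactly as he describes. A sketch (Python's struct module used purely for illustration; the record layout is his, the names are mine):

```python
import struct

# Hypothetical on-disk record for the proposed index file: a
# 30-character name plus a longword block number, big-endian as
# on the 68000.
INDEX_RECORD = struct.Struct(">30sI")

def pack_record(name, block):
    """Pack one (name, directory-block) pair into 34 bytes."""
    return INDEX_RECORD.pack(name.encode("ascii").ljust(30, b"\0"), block)

print(INDEX_RECORD.size)         # -> 34 bytes per record
print(512 // INDEX_RECORD.size)  # -> 15 records per sector, 2 bytes spare
```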
limonce@pilot.njin.net (Tom Limoncelli) (11/29/89)
On my MS-DOS system, my path has about 10 entries. On my Amiga system, my
path has about 10 entries. About the same number of files are searched, yet
the Amiga does fewer seeks to the disk (disk buffers flushed, of course).
Why is this? Because the Amiga is doing O(N) seeks with hashing, while MS-DOS
does O(N*M/Z) searches. (N = number of directories; M = number of files per
directory; Z = number of directory entries per sector.) It balances out and
the Amiga wins.

Also, the Amiga's FFS caches side-sector and directory information better
than MS-DOS. I've been told that FFS is modeled after the Berkeley Fast File
System and has been benchmarked as the fastest FS when judged on similar
hardware. Why do you think MicroSoft is pushing their "new" high-performance
file system for OS/2? There is a LOT of room for improvement over MS-DOS.

This has been debated over and over here on comp.sys.amiga. It comes down to
this (IMHO): when searching 1 directory for a couple of files, sure, the
MS-DOS FS is a win. But when you are dealing with lots of files, FFS is the
winner. Remember, a bubble sort can be faster than a quicksort if you have a
very small number of items. A bubble sort rewritten in tight-coded assembler
is still O(N^2) and is still a bubble sort.

[ Insert favorite computer scientist vs. engineer joke here :-) ]

-Tom
--
Tom Limoncelli -- limonce@pilot.njin.net -- tlimonce@drunivac.bitnet
rutgers!njin!tlimonce -- Drew University, Madison, NJ -- 201-408-5389
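Tom's O(N) vs. O(N*M/Z) comparison can be put into a toy seek-count model. This is a sketch under his variable names, not a benchmark; the two-reads-per-directory estimate for the hashed side and both function names are my assumptions:

```python
import math

def amiga_path_seeks(n_dirs):
    """Hashed lookup: roughly one directory-block read plus one
    file-header read per path entry tried -- O(N)."""
    return 2 * n_dirs

def msdos_path_seeks(n_dirs, files_per_dir, per_sector=16):
    """Linear scan: each directory is read in full in the worst
    case -- O(N*M/Z), with Z = directory entries per sector."""
    return n_dirs * math.ceil(files_per_dir / per_sector)

# Ten path entries: sparse directories roughly tie, crowded
# directories favour the hashed scheme.
print(amiga_path_seeks(10))       # -> 20
print(msdos_path_seeks(10, 30))   # -> 20
print(msdos_path_seeks(10, 130))  # -> 90
```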
daveh@cbmvax.UUCP (Dave Haynie) (11/29/89)
in article <88.filbo@gorn.santa-cruz.ca.us>, filbo@gorn.santa-cruz.ca.us
(Bela Lubkin) says:

> Xref: cbmvax comp.sys.amiga.tech:9127 comp.sys.amiga:47011
> X-Claimer: I >am< R Pentomino!
> In article <832@lpami.wimsey.bc.ca> Larry Phillips writes:
>>The big advantage of the Amiga filing
>                         ^^^^^^^^^^^^^
>>system is that a fully known filename is found VERY quickly. The reason for
>>this is that filenames are hashed into tables in the directory blocks, and can
>>be found within a (relatively) few seeks
> Oh, come on. This is and always has been bogus. Even MS-DOS
> does far better than AmigaDOS in this area.

Huh? Have you actually _used_ an MS-DOS system with paths set up similar to
the way an Amiga normally is? I have about 10x as many commands on my Amiga
path as on my MS-DOS path (AT Bridge Card). There's a noticeable wait for any
path search on the MS-DOS side. There isn't on the Amiga. That's what I
consider the bottom line. But onward...

> In the case of a known filename, MS-DOS reads the directory (which is
> effectively a single small binary file),

Actually several binary files, since the directory very often (in fact, I can
think of very few counterexamples in actual use) has far more than 16 files.

> Since each MS-DOS directory entry is 32 bytes, and a sector is 512 bytes, 16
> entries fit in a sector; it must read (n/16) sectors to find a file..

So we'd have to read 9 sectors, worst case, to get an arbitrary file in my C:
directory of 132 files. And perform up to 131 string comparisons. The
midpoint would be 4 sectors and 65 string comparisons. On the Amiga, I'd have
to read at least 2 sectors, the directory and the file block. I'd have to do
one longword comparison to check for a hash chain collision, and a string
compare and possible sector read for each hash chain entry until I find the
file.
For my 132-entry directory, there are likely to be about two entries per hash
chain, given the hash table size of 72, so the average file would be found in
no more than 3 sector reads, a string comparison, and 2 longword comparisons.
If there were 4 hash collisions in any hash table entry, that would grow to a
worst case of 5 sector reads, 4 string comparisons, and 5 longword
comparisons.

> Those sectors would almost always be physically contiguous.

On an empty hard disk, sure. But even then, you can't have it both ways. If
your file is contiguous with the directory header, and you add a bunch of new
stuff to that directory, something's not contiguous anymore. Either you have
to uproot all the files or you're running all over the place to collect those
directory blocks.

> Compare this to AmigaDOS, which must read an arbitrary number of sectors
> (often 1, but sometimes more), which are virtually guaranteed to be all
> over the disk.

No they're not. Have you ever actually looked at how the file system (FFS
especially) allocates things on the disk? I have, extensively. Try running
DiskSalv over your hard disk sometime; things are nicely grouped. And also,
no matter how badly fragged you get, hash collisions are linked in sorted
order, so you never jump all over the place, as you imply, to find a named
file.

> Then compare wildcard operations. MS-DOS must still read the entire
> directory, which is still only a few (probably contiguous) sectors.

Here MS-DOS would read 9 sectors for my C: directory; AmigaDOS would read
133. That's a definite loss to AmigaDOS, even though it took much less time
to load the directory program in the first place (I'm talking real life here;
I refuse to use the built-in Dir command under MS-DOS, so I use a UNIX-alike
LS which must be loaded from disk, same as the Amiga command).

> AmigaDOS must read one sector for each file in the directory, probably
> scattered all over the disk. I don't like this tradeoff in the least.
Again, the Amiga files aren't scattered all over the disk. Do you really
think they just allocate things at random? Sure, the disk would support
blocks scattered all over, just as any filesystem that doesn't choke when
nearing full would have to do. But even on my 70% full 80 meg disk, I don't
see any problems like those you seem to believe in.

The other thing you get from the Amiga FFS (even better under SFS) is ease of
reconstruction. You can completely rebuild the entire disk structure even if
every directory in the system were destroyed. There's no way to lose more
than one file with a single damaged block, and actually you'd only lose part
of that file in many cases (if you destroy the file block header). Under
SFS, there's no way to lose any more than 1 block in a file with a randomly
damaged sector. You've already told me I can put a serious hurtin' on 16
files with one random sector blasted away. What happens if I destroy the
root directory?

> The other argument I've heard for AmigaDOS' directory structure is
> "unlimited directory size", which is also bogus. Nothing limits any
> directory under MS-DOS, >except< the root directory.

Nothing limits the size of an FFS directory except the hard disk. I think we
win here. Microsoft made lots of bad decisions for MS-DOS, but the worst was
that they didn't have the foresight back then to build a system that let
these things be fixed later very compatibly. You can yell all day about What
Could Have Been, but most folks are far more concerned with What Actually Is
and What Yet Will Be. Though they say that the 35 meg, or whatever, limit on
hard disks is going away Real Soon Now.

The good thing about AmigaDOS is that you can pick the filesystem handler you
want. Can't do that under MS-DOS. So you can use the CrossDOS filesystem for
your hard disk instead of FFS. It won't be as fast, but you'll probably be
happy with it...
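Dave's chain arithmetic assumes the standard AmigaDOS directory hash. A sketch of that function as it is usually documented (a 512-byte block gives the 72-slot table; treat the multiply-by-13 fold and the 0x7FF mask as recollection of the published algorithm, not gospel):

```python
def amigados_hash(name, table_size=72):
    """Directory-name hash of the original Amiga filesystem, as
    commonly documented: seeded with the name length, folded with
    a multiply-by-13, case-insensitive."""
    h = len(name)
    for ch in name.upper():
        h = (h * 13 + ord(ch)) & 0x7FF
    return h % table_size

# Every name lands in one of 72 slots; names that collide are
# chained together in the directory, as Dave describes.
print(amigados_hash("c"))                # a slot in 0..71
print(amigados_hash("c") == amigados_hash("C"))  # -> True
```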
--
Dave Haynie Commodore-Amiga (Systems Engineering) "The Crew That Never Rests"
{uunet|pyramid|rutgers}!cbmvax!daveh PLINK: hazy BIX: hazy
Too much of everything is just enough
cmcmanis%pepper@Sun.COM (Chuck McManis) (11/29/89)
In article <87.filbo@gorn.santa-cruz.ca.us> (Bela Lubkin) writes:
>Oh, come on. This is and always has been bogus. Even MS-DOS
>does far better than AmigaDOS in this area. In the case of a known filename,
>MS-DOS reads the directory (which is effectively a single small binary file),
>searches it for the named file, and is done.

Do your homework, Bela. MS-DOS does a linear search of the directory
structure, and several searches when it is looking through nested
directories. Linear structures work fine for small things (floppies come to
mind) but don't scale well at all to big things like hard disks. Fill up a
20Meg hard disk with files in directories and subdirectories and benchmark
MS-DOS against AmigaDOS when finding random files in the name space. Then
fill up a 30Meg disk; if you have DOS 3.3 or higher, try it on a 40Meg disk
and then an 80Meg disk. You will find that MS-DOS gets slower and slower,
whereas the time for AmigaDOS stays roughly constant. Arguing the
superiority of MS-DOS using a floppy is like arguing the superiority of
AmigaDOS using a 300Meg hard drive. The boundary conditions are
uninteresting.

If you have one file system for all disk devices in the system, what do you
optimize for? I would argue that the world is backwards, since you can
hardly find an MS-DOS system that doesn't have a hard drive, and yet lots of
Amigas don't. The MS-DOS filesystem was born of the CP/M file system, and it
was built for floppies. Great; the AmigaDOS file system was born of TriPOS,
which was built for Winchester drives. The heritage shows through.
Fortunately, the Amiga's file systems are loadable, so you can use one file
system for floppies and one for hard disks. Now isn't that special?

Of course the Amiga file system isn't as well suited to wildcarding as a
linear search is. But if you benchmark which is used more, Open() or
ExNext(), you might be surprised to find there is a bigger win in optimizing
Open() than ExNext().
My opinion of course, --Chuck McManis uucp: {anywhere}!sun!cmcmanis BIX: cmcmanis ARPAnet: cmcmanis@Eng.Sun.COM These opinions are my own and no one elses, but you knew that didn't you. "If it didn't have bones in it, it wouldn't be crunchy now would it?!"
andrewt@watsnew.waterloo.edu (Andrew Thomas) (11/30/89)
> > Then compare wildcard operations. MS-DOS must still read the entire
> > directory, which is still only a few (probably contiguous) sectors.
>
> Here MS-DOS would read 9 sectors for my C: directory; AmigaDOS would read
> 133. That's a definite loss to AmigaDOS, even though it took much less time
> to load the directory program in the first place (I'm talking real life
> here; I refuse to use the built-in Dir command under MS-DOS, so I use a
> UNIX-alike LS which must be loaded from disk, same as the Amiga command).

It seems to me that the single most common interactive file system operation
is to get a directory listing. In second place is the use of wildcards in
file names, and running a very close third is the use of filename completion
in any shell that has it (you may disagree with my rankings). Under the
above example, you lose bigtime with the Amiga's file system, as all of these
operations are equivalent to wildcarding. I can't complain about finding my
commands quickly, but any shell that hashes its path will speed that up after
the first time anyway. Why is AmigaDOS castrating its interactive use in
favour of fast command lookup, when the fast command lookup will only affect
the initial use of a command?

Is it possible to have both? Could the file system store a secondary file
containing only a set of file names? Even an ASCII file would be fine. It
would require more work for file creation, deletion, and moving, but it would
massively speed up wildcard activities. It could be there as a secondary
file to the filesystem, so losing it would not be important. A simple
chkdsk-like program could be used to recreate this file if it became corrupt,
by finding each file the hard way. The only other thing you have to do is
keep this file out of reach of programmers.

The disk usage is not really a problem. With 15 files to a sector, a disk
with 500 files on it will only need 34 sectors -- about 17K -- to store this
information. I would happily make this trade for the speed it would provide.
Is there something basically wrong with this scheme? It really sounds like a horrible kluge in some ways, but it's an issue that needs to be addressed. -- Andrew Thomas andrewt@watsnew.waterloo.edu Systems Design Eng. University of Waterloo "If a million people do a stupid thing, it's still a stupid thing." - Opus
barrett@jhunix.HCF.JHU.EDU (Dan Barrett) (12/01/89)
I don't really understand all the concern about the speed of directory listings and wildcards, at least for hard drives. I have a slow (65ms access time) hard drive, with diskperf results like 140Kb/sec, and I find directory listings to be of acceptable speed. A 100-file directory listing is quite quick. (I use "ls" version 3.1.) And finding a particular program in my 8-directory, ~400-file search path is instantaneous. Dan //////////////////////////////////////\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ | Dan Barrett - Systems Administrator, Computer Science Department | | The Johns Hopkins University, 34th and Charles Sts., Baltimore, MD 21218 | | INTERNET: barrett@cs.jhu.edu | UUCP: barrett@jhunix.UUCP | | COMPUSERVE: >internet:barrett@cs.jhu.edu | BITNET: barrett@jhuvms.bitnet | \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\/////////////////////////////////////
usenet@cps3xx.UUCP (Usenet file owner) (12/01/89)
In article <3459@jhunix.HCF.JHU.EDU> barrett@jhunix.UUCP (Dan Barrett) writes: > > I don't really understand all the concern about the speed of >directory listings and wildcards, at least for hard drives. I have a slow Try doing a "dir df0:c" with your workbench disk in the floppy drive.... Now do you understand??? Joe Porkka porkka@frith.egr.msu.edu
morgan@camtwh.UUCP (Morgan W. Jones) (12/02/89)
In article <ANDREWT.89Nov29154933@watsnew.waterloo.edu>
andrewt@watsnew.waterloo.edu (Andrew Thomas) writes:
>with my rankings). Under the above example, you lose bigtime with
>Amiga's file system, as all of these examples are equivalent to
>wildcarding. I can't complain about finding my commands quickly, but
>any shell that hashes its path will speed that up after the first time
>anyway. Why is AmigaDOS castrating its interactive use in favour of
>fast command lookup, when the fast command lookup will only effect the
>initial use of a command?

You see a saving on command lookup for each and every invocation; caching has
nothing to do with this. The purpose of caching is so that you don't have to
scan the whole filesystem (or at least those directories listed in your path)
every time you execute your command. Once the command is found and executed,
the shell remembers the path to the command, but the operating system still
has to search all of the directories in the path to find the inode (or file
handle on the Amiga) for the file.

Where the Amiga gains is in the directory search. MessyDos and Unix have to
do a linear search on the directory (which for large directories may involve
several disk accesses), whereas the Amiga does a hashed lookup (which is
usually one or two disk accesses). If the path to the file is several levels
deep, the number of disk accesses is significantly reduced.

All this aside, though, I do agree that the minor (and on an HD
insignificant) increase in direct access time does not justify the
horrifically slow directory listings.
--
Morgan W. Jones (morgan@camtwh) Orama Incorporated, Toronto, Canada.
(416) 369-5088
new@udel.edu (Darren New) (12/02/89)
In article <5615@cps3xx.UUCP> porkka@frith.UUCP (Joe Porkka) writes: >In article <3459@jhunix.HCF.JHU.EDU> barrett@jhunix.UUCP (Dan Barrett) writes: >> I don't really understand all the concern about the speed of >>directory listings and wildcards, at least for hard drives. I have a slow >Try doing a "dir df0:c" with your workbench disk in the >floppy drive.... >Now do you understand??? > Joe Porkka porkka@frith.egr.msu.edu Something else I don't understand: The disk should be able to read about 5 tracks/second if you look only at the revolutions. Yet when doing something like DiskCopy, the disk takes more than one second/track to read or write. What's the slowdown? Is the blitter and system overhead really four times the read speed? -- Darren
ccplumb@rose.waterloo.edu (Colin Plumb) (12/04/89)
In article <5141@nigel.udel.EDU> new@udel.edu (Darren New) writes: > Something else I don't understand: > The disk should be able to read about 5 tracks/second if you look only > at the revolutions. Yet when doing something like DiskCopy, the disk > takes more than one second/track to read or write. What's the slowdown? > Is the blitter and system overhead really four times the read speed? First of all, you have to read about 1.1 revolutions to be sure to get a complete copy of each sector. Then remember that you need to read both top and bottom surfaces, and write the worst-case (slowest rotation speed) track, which is a bit under 1.1 nominal tracks. That adds up to 4.4 rotations to copy a cylinder, of which a 90mm floppy has 80. 70.4 seconds. The fastest copy program I know takes 69 seconds, so it's pretty close. It, I believe, reads one track while processing the track before so it's always reading or writing. This hides essentially all the blitter overhead. But Diskcopy isn't quite as smart. -- -Colin
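Colin's arithmetic, spelled out with his own numbers (a sketch: 300 rpm, 80 cylinders, ~1.1 revolutions per surface for both the read and the write passes):

```python
REV_PER_SEC = 5            # 300 rpm floppy rotation
CYLINDERS = 80             # 3.5" (90mm) floppy
ROT_PER_CYL = 4 * 1.1      # read 2 surfaces + write 2 surfaces,
                           # ~1.1 revolutions each = 4.4 total

total_seconds = CYLINDERS * ROT_PER_CYL / REV_PER_SEC
print(round(total_seconds, 1))  # -> 70.4
```

Which matches the 69-second best copier he cites almost exactly, supporting his point that the good copiers are already rotation-limited rather than blitter-limited.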
rar@auc.UUCP (Rodney Ricks) (12/05/89)
In article <5615@cps3xx.UUCP> porkka@frith.UUCP (Joe Porkka) writes:
>In article <3459@jhunix.HCF.JHU.EDU> barrett@jhunix.UUCP (Dan Barrett) writes:
>>
>> I don't really understand all the concern about the speed of
>>directory listings and wildcards, at least for hard drives. I have a slow
>
>Try doing a "dir df0:c" with your workbench disk in the
>floppy drive....

Actually, that is not a good test. There is another reason why the Amiga's
dir command seems so slow. If my memory serves me correctly, the Amiga's dir
command reads in a list of all of the files in the directory, then sorts
them, THEN it prints them out. The MS-DOS dir prints out each entry as it
gets to it. So the Amiga's dir command SEEMS much slower because it doesn't
start printing to the screen until it has seen ALL of the files in a
directory.

Use the list command instead (I do). It seems much faster than the dir
command in that, like the MS-DOS dir command, it lists the files as it gets
to them, instead of after it has read and sorted them all.

Rodney Ricks
--
"We may have come over here in different ships, but we're all in the same
boat now." -- Jesse Jackson
// \\ // Rodney Ricks, Morehouse College \/