alexis@ccnysci.UUCP (Alexis Rosen) (10/22/88)
I have a few questions about A/UX. I realize that there may be many sides to
some of these questions, but I'll take whatever I can get...

1) Can the Mac reboot into A/UX automatically? If the machine dies, what
   happens? How often will it crash from high loads? What is a high load on
   a Mac IIx?

2) I know that the distribution comes with lots of source code. *What* source
   code? Can I get source for the stuff that's shipped only as binary?

3) Does A/UX have system accounting, and quotas, in the kernel, just like BSD?
   I know that "A/UX is all of SVR2 plus all of BSD 4.2 except ioctls", but
   what does that mean? Can it support the fast file system? What about stuff
   that comes with a standard BSD, but isn't part of the kernel? Am I even
   asking the right questions?

4) If the Mac crashes, the console will presumably go haywire. All diagnostic
   output will be lost. Can I make the console a serial device on one of the
   built-in serial ports? If not, can I stick a printer on one of those ports
   and redirect or copy ALL console output to the printer?

5) How many interrupts per second can the Mac deal with before getting
   hopelessly bogged down? The obvious answer is "it depends", but can someone
   out there give a rough approximation?

6) "Always balance your disk load." How much difference will there be between
   a system with one 620MB Wren and a system with a 620MB and a 90MB Wren?
   What is your favorite strategy with disks? Is it better to get a 90MB Wren
   or a 140MB Rodime as a second disk?

7) There is a well-known problem with Hayes-compatible modems on sysV
   machines. They often fail to reset properly after a session, so the next
   caller can't connect properly. I believe there is an answer, but I'm not
   sure. With A/UX, can I set up eight modems and be confident that they will
   work right? If Hayes-compatible modems are out, what do you recommend? The
   system would be used primarily for news and mail, so this is critically
   important to us.
8) To repeat an earlier query, is there ANYONE out there who runs more than
   three simultaneous users off of a Mac II or IIx? Any information about
   this would be deeply appreciated.

9) Lastly, I have asked in the past if anyone knew of a SCSI DMA board for
   the Mac II/IIx. Phil Ronzone was kind enough to spend a few moments with
   me explaining that DMA wouldn't be a big win. Nevertheless, all of my
   calculations indicate otherwise. What am I missing, if anything?

Here are my numbers:

Assuming a Wren IV with 16ms access time, capable of transmitting data at a
sustained rate of 1 MByte/sec., and assuming the Mac II can do about 300 KB/sec
sustained, then a hard page fault (4KBytes) will take:

    DMA:           16 ms + 4KB/(1MB/s)   = 16 ms + ~4 ms  = ~20 ms
    Mac w/no DMA:  16 ms + 4KB/(300KB/s) = 16 ms + ~13 ms = ~29 ms

So a page fault will take about 50% more time without DMA. Of course, when
transferring larger amounts of data, DMA wins by a lot more. The more you
transfer, the closer DMA gets to 3.3 times the non-DMA speed (1 MB/s against
300 KB/s). For loading a 500KB program, DMA would be about 3.3 times as fast.

Are there other factors I have overlooked? With a very fragmented drive,
non-DMA loses a little less, but not a lot less. Anything else?

Thanks
----
Alexis Rosen                       alexis@dasys1.UUCP  or  alexis@ccnysci.UUCP
Writing from                       {allegra,philabs,cmcl2}!phri\
The Big Electric Cat                                            uunet!dasys1!alexis
Public UNIX                        {portal,well,sun}!hoptoad/
phil@Apple.COM (Phil Ronzone) (10/27/88)
In article <941@ccnysci.UUCP> alexis@ccnysci.UUCP (Alexis Rosen) writes:
>9) Lastly, I have asked in the past if anyone knew of a SCSI DMA board for
>   the Mac II/IIx. Phil Ronzone was kind enough to spend a few moments with
>   me explaining that DMA wouldn't be a big win. Nevertheless, all of my
>   calculations indicate otherwise. What am I missing, if anything?

Killing this misinformation on Mac II and SCSI hopefully for the last time ...

Assume you are running a typical one user I/O load of 40 to 80 1K blocks
a second. When the Apple HD80 presents the data requested, then A/UX
"yanks" the 1K, 4 bytes at a time, in a very tight loop. There is hardware
assist to make for very quick "yanking". How quick? 3.657 bytes per
microsecond. Or, 280 microseconds to "yank" the 1K block.

NOTE THAT: A DMA chip and the 68020 would both begin the "yanking"
of bytes at the same time. They would both finish around the same time.

A DMA CHIP CAN DO ABSOLUTELY NOTHING (REPEAT NOTHING NOTHING NOTHING ...)
TO MAKE THIS HAPPEN FASTER THAN HAVING THE 68020 TRANSFER THE BYTES. IS
THE UNIVERSE NOW AWARE OF THIS!!! :-) :-) <- tongue-in-cheek screaming.

So DMA can NOT make I/O performance improve. What it CAN do is free up
more cycles. It can save that 280 microseconds per block, or, if you are
doing 80 1K blocks per second, it can save you 22.4 milliseconds every second.
That is 2.2% of your total CPU cycles every second.

If you are doing swapping of a 160K "chunk" every second, then it will
save you 4.9% of your total CPU cycles every second.

At this point, a common objection is that "I/O can happen faster because
the 68020 can start it quicker because the DMA is taking off the load ...".

No - that ain't true either. Until either the DMA or 68020 is DONE
transferring the data, you CAN'T start more I/O.

Does this mean we are against DMA? NO NO NO!
When you start transferring large amounts of data (multi-megabyte images to
a LaserWriter SC) you don't want to burn up your processor that way.

Also unlikable is that interrupts get locked out for large transfers
to slower devices. A/UX doesn't support LocalTalk on the onboard SCC's
because a DataGram takes 21 milliseconds of LOCK-OUT-ALL-INTERRUPTS time.
With an 8 DataGram per ATP transaction, 170 milliseconds loses too many
interrupts (Ethernet, keystrokes, mouse movements, incoming serial data etc.).

SUMMARY - A DMA chip on the Mac II can NOT increase I/O throughput. It can
free up more I/O cycles, although only 4% (predicted) / 8% (measured) for
typical heavy UNIX I/O loads. DMA buys the most in reducing interrupt
timing sensitivity, and in supporting the "very large data transfers"
peripherals such as LaserWriter II SC.

O.K.?

P.S. My very first reaction after examining the Macintosh II prototype
design was "What!? No DMA!!??".

Just the facts, Ma'am.
+------------------------+-----------------------+----------------------------+
| Philip K. Ronzone      | A/UX System Architect | APPLELINK: RONZONE1        |
| Apple Computer MS 27AJ +-----------------------+----------------------------+
| 10500 N. DeAnza Blvd.  | If you post a bug to the net, and the manufacturer |
| Cupertino CA 95014     | doesn't read it,does that mean it doesn't exist?   |
+------------------------+----------------------------------------------------+
|{amdahl,decwrl,sun,voder,nsc,mtxinu,dual,unisoft}!apple!phil                 |
+-----------------------------------------------------------------------------+
alexis@ccnysci.UUCP (Alexis Rosen) (10/30/88)
In article <19528@apple.Apple.COM> phil@Apple.COM (Phil Ronzone) writes:
>In article <941@ccnysci.UUCP> alexis@ccnysci.UUCP (Alexis Rosen) writes:
>>9) Lastly, I have asked in the past if anyone knew of a SCSI DMA board for
>>   the Mac II/IIx. Phil Ronzone was kind enough to spend a few moments with
>>   me explaining that DMA wouldn't be a big win. Nevertheless, all of my
>>   calculations indicate otherwise. What am I missing, if anything?
>
>Killing this misinformation on Mac II and SCSI hopefully for the last time ...
>
>Assume you are running a typical one user I/O load of 40 to 80 1K blocks
>a second. When the Apple HD80 presents the data requested, then A/UX
>"yanks" the 1K, 4 bytes at a time, in a very tight loop. There is hardware
>assist to make for very quick "yanking". How quick? 3.657 bytes per
>microsecond. Or, 280 microseconds to "yank" the 1K block.
>
>NOTE THAT: Both a DMA chip and the 68020 would both begin the "yanking"
>of bytes at the same time. They would both finish around the same time.
>
>A DMA CHIP CAN DO ABSOLUTELY NOTHING (REPEAT NOTHING NOTHING NOTHING ...)
>TO MAKE THIS HAPPEN FASTER THAN HAVING THE 68020 TRANSFER THE BYTES. IS
>THE UNIVERSE NOW AWARE OF THIS!!! :-) :-) <- tongue-in-cheek screaming.
>[etc.]
>SUMMARY - A DMA chip on the Mac II can NOT increase I/O throughput. It can
>free up more I/O cycles, although only 4% (predicted) / 8% (measured) for
>typical heavy UNIX I/O loads. DMA buys the most in reducing interrupt
>timing sensitivity, and in support the "very large data transfers" peripherals
>such as LaserWriter II SC.

First of all, thanks to Phil for speaking out on this. My previous comment
about him was sincere; I do appreciate the time he's willing to spend answering
these questions.

That said, I still have some questions. The Mac can transfer a 1KByte block in
280 usecs. That's fine, but it's not the whole story. If it were, it could do
about 3.5 MBytes/sec. In fact, it can do less than one tenth of that speed.
So what causes that discrepancy? My guess (uneducated, so please correct me if
I'm wrong) is that it's the overhead for transferring that block. Can the Mac
transfer 10 or 100 blocks in 2800 or 28000 microseconds? I don't think so.

I don't know what the overhead for DMA is, but it seems to be a lot less. The
Golden Triangle folks say they are getting about 1 MByte/sec from their board,
using CDC Wrens. Since the Wrens are capable of just over 1 MByte/sec, G.T.
might be able to do even better with a faster controller (then again, maybe
not; I didn't ask).

As I think my numbers show (read my original posting for them), there is a BIG
difference between 300KB/sec and 1MB/sec, even for single-user stuff.

Maybe the question is whether there is something about the Mac that makes
these faster transfers difficult (or impossible). I can't think of anything
though. The very thought seems silly.

So again, what am I missing?

P.S. I am leaving for a week on Monday, which is why I won't be answering
anything that needs a response until around Election Day. Sorry.
phil@Apple.COM (Phil Ronzone) (11/02/88)
In article <964@ccnysci.UUCP> alexis@ccnysci.UUCP (Alexis Rosen) writes:
>That said, I still have some questions. The Mac can transfer a 1KByte block in
>280 usecs. That's fine, but it's not the whole story. If it were, it could do
>about 3.5 MBytes/sec. In fact, it can do less than one tenth of that speed. So
>what causes that discrepancy? My guess (uneducated, so please correct me if I'm
>wrong) is that it's the overhead for transferring that block. Can the Mac
>transfer 10 or 100 blocks in 2800 or 28000 microseconds? I don't think so.
>So again, what am I missing?

DMA is merely a hardware assist for transferring data from the (typically and
hopefully) buffered SCSI device, such as a hard disk, into the Mac system
memory.

To do this, the hard disk is instructed to read/write "N" blocks of
data starting at a certain address. For example, opening a thousand
1 block files, each of which is located near the end of a slower hard disk,
could, in the ABSOLUTE worst case, take as follows:

    1000 seeks to front of the disk to read the inode
    and then go back to near the end of the disk to read one 1K block
    = 60ms * 1000 = 60 secs = about 16KB per second!!!!

Remember that the hard disk is getting more or less 40 - 100 requests for
blocks per second under load - and each block has a more or less random
block address in UNIX (to oversimplify it).

On the other hand, a SCSI device that is actually 16M (16 megabytes!) of
cache in the SCSI adapter (to ESDI) gave up to 400K/s data transfer rates.
You could write at even higher rates, but every 1 or 2 seconds it hit
an internal count of "writes pending" and accepted no more I/O until ALL
the writes were flushed. So you had 2-3 seconds of immense data transfer,
no matter how much "seeking" was implied, then 2 - 3 seconds of NO data
transfer while the device flushed.

So DMA (or "PIO") is but a small part of the equation for disk throughput.
Consider interleave, disk seek, average bytes per transfer, "randomness"
and so on.
dave@onfcanim.UUCP (Dave Martindale) (11/03/88)
In article <19528@apple.Apple.COM> phil@Apple.COM (Phil Ronzone) writes:
>
>Assume you are running a typical one user I/O load of 40 to 80 1K blocks
>a second. When the Apple HD80 presents the data requested, then A/UX
>"yanks" the 1K, 4 bytes at a time, in a very tight loop. There is hardware
>assist to make for very quick "yanking". How quick? 3.657 bytes per
>microsecond. Or, 280 microseconds to "yank" the 1K block.
>
>If you are doing swapping of a 160K "chunk" every second, then it will
>save you 4.9% of your total CPU cycles every second.

From my perspective, this is an argument that using the CPU to copy data
is fine as long as you are using the System V 1K-block filesystem, since
the filesystem so thoroughly throttles the disk. But if you ever switch
to a filesystem with more throughput, you'll be in trouble.

For comparison, our old, slow, vax 780 running 4.3BSD always reads 8K blocks
on the filesystems that store images, and it manages to get about 60 blocks
per second through the filesystem. That's about 500 Kb/sec, an order of
magnitude larger than Phil's figures. And this is on old Eagle disks, where
the average user data rate coming off the head is about 1.6 Mb/sec. More
recent disks, even small Winchesters, are considerably faster.

Our old Silicon Graphics workstation with a 70 Mb Vertex disk still manages
about 200 Kb/sec, using SGI's proprietary "extent filesystem".

So, if A/UX ever switches to a filesystem that allows access to some
reasonable fraction of the disk's real bandwidth (say 500 Kb/sec to 1 Mb/sec),
like other workstation manufacturers provide, having a DMA controller will
suddenly become essential. Remember that once the data is in kernel memory,
UNIX has to copy it to user memory, so the 68020 is going to be really busy
just handling that.

I hope Apple switches to a better filesystem soon....

	Dave Martindale
alexis@ccnysci.UUCP (Alexis Rosen) (11/20/88)
Sorry I took so long to respond to this...

In article <19816@apple.Apple.COM> phil@Apple.COM (Phil Ronzone) writes:
"In article <964@ccnysci.UUCP> alexis@ccnysci.UUCP (Alexis Rosen) writes:
">That said, I still have some questions. The Mac can transfer a 1KByte block
">in 280 usecs. That's fine, but it's not the whole story. If it were, it could
">do about 3.5 MBytes/sec. In fact, it can do less than one tenth of that
">speed. So what causes that discrepancy? My guess (uneducated, so please
">correct me if I'm wrong) is that it's the overhead for transferring that
">block. Can the Mac transfer 10 or 100 blocks in 2800 or 28000 microseconds? I
">don't think so. So again, what am I missing?
"
"DMA is merely a hardware assist for transferring data from the (typically and
"hopefully) buffered SCSI device, such as a hard disk, into the Mac system
"memory.
"
"To do this, the hard disk is instructed to read/write "N" blocks of
"data starting at a certain address. For example, opening a thousand
"1 block files, each of which is located near the end of a slower hard disk,
"could, in the ABSOLUTE worst case, take as follows:
"    1000 seeks to front of the disk to read the inode
"    and then go back to near the end of the disk to read one 1K block
"    = 60ms * 1000 = 60 secs = about 16KB per second!!!!
"
"Remember that the hard disk is getting more or less 40 - 100 requests for
"blocks per second under load - and each block has a more or less random
"block address in UNIX (to oversimplify it).
"
"On the other hand, a SCSI device that is actually 16M (16 megabytes!) of
"cache in the SCSI adapter (to ESDI) gave up to 400K/s data transfer rates.
"You could write at even higher rates, but every 1 or 2 seconds it hit
"an internal count of "writes pending" and accepted no more I/O until ALL
"the writes were flushed. So you had 2-3 seconds of immense data transfer,
"no matter how much "seeking" was implied, then 2 - 3 seconds of NO data
"transfer while the device flushed.
"
"So DMA (or "PIO") is but a small part of the equation for disk throughput.
"Consider interleave, disk seek, average bytes per transfer, "randomness"
"and so on.

This is useful information to have, but it really doesn't answer the question
at all. Writing to a buffered SCSI disk can go up to 400K/s. That's great...
but not too great. The Wrens can sustain a throughput of 1MB/s. That's really
great. So why is the Mac's "great" != the Wren's "great"?

In other words, if the lack of DMA isn't slowing down the Mac, what is???
It is clearly not going as fast as it should.
phil@Apple.COM (Phil Ronzone) (11/23/88)
In article <1011@ccnysci.UUCP> alexis@ccnysci.UUCP (Alexis Rosen) writes:
>.... Writing to a buffered SCSI disk can go up to 400K/s. That's great...
>but not too great. The Wrens can sustain a throughput of 1MB/s. That's really
>great. So why is the Mac's "great" != the Wren's "great"?
>
>In other words, if the lack of DMA isn't slowing down the Mac, what is???
>It is clearly not going as fast as it should.

One last last shot -- then I'm forgetting the matter.

Typical UNIX SV filesystem, 50 reads a second, each read to essentially a
random block. Assuming fast disks with 16 ms average seek times.

Each read/write takes maybe 200 to 900 microseconds, depending on hardware.

This gives us a time to ACQUIRE IN MEMORY, for each block, of
16 ms + ~500 microseconds.

Now if we had an INFINITELY FAST DMA/transfer mechanism, we could cut that
figure down to 16 ms + ~0 microseconds.

Notice the blazing increase in throughput!!! :-)

OPEN QUESTION - why do you think the "Mac is not going as fast as it should"?
If this is on a comparison basis, tell me the equivalent machine that runs
faster. Equivalent means 68020, ~16MHz, memory management, SCSI disk I/O,
SVR2, etc. We really do want to know of equivalent hardware that runs better
because of software. When we find it, we want to make ours run faster too.

WARNING -- "fast & faster" can be exceptionally subjective.
dave@onfcanim.UUCP (Dave Martindale) (11/24/88)
In article <21057@apple.Apple.COM> phil@Apple.COM (Phil Ronzone) writes:
>
>OPEN QUESTION - why do you think the "Mac is not going as fast as it should"?
>If this is on a comparison basis, tell me the equivalent machine that runs
>faster. Equivalent means 68020, ~16MHz, memory management, SCSI disk I/O,
>SVR2, etc. We really do want to know of equivalent hardware that runs better
>because of software. When we find it, we want to make ours run faster too.

The Silicon Graphics IRIS 3000 series uses a 16 MHz 68020, with memory
management. The old 70 Mb disks use an ST506 interface - SCSI should
do better. The kernel is basically system V release something, and they
get several times the disk throughput of A/UX.

Why? Basically because they don't use the system V filesystem - they
replaced it with an extent-based filesystem that reads and writes
much larger data blocks at a time. I believe that the only way A/UX
will get decent performance out of the disk is by switching to a
different filesystem.

Note that using the Bell filesystem cripples NFS performance too, since
all reads and writes are done 1 Kb at a time, instead of the 4 or 8 Kb
that other workstations use. So it matters even when you aren't using
the local disk.

If you change filesystems and quadruple disk throughput, DMA may become
important for disk I/O. Or it might not. But for the moment, the
filesystem software seems to be the problem.

	Dave Martindale
fnf@fishpond.UUCP (Fred Fish) (11/26/88)
In article <16775@onfcanim.UUCP> dave@onfcanim.UUCP (Dave Martindale) writes:
>The Silicon Graphics IRIS 3000 series uses a 16 MHz 68020, with memory
>management. The old 70 Mb disks use an ST506 interface - SCSI should
>do better. The kernel is basically system V release something, and they
>get several times the disk throughput of A/UX.
>
>Why? Basically because they don't use the system V filesystem - they
>replaced it with an extent-based filesystem that reads and writes
>much larger data blocks at a time. I believe that the only way A/UX
>will get decent performance out of the disk is by switching to a
>different filesystem.

Maybe it's just a rumor, but I once heard from someone close to the A/UX
project that the BSD filesystem was tried with A/UX, and it turned out
to be even slower than the System V filesystem on the Mac-II hardware.
There was an explanation, but I confess I didn't listen too closely to
it. I hope that this is wrong, and that we will someday see a BSD
filesystem with A/UX, because there are lots of things about A/UX that
I like.

I decided to retry the disk performance benchmark that I ran in Feb '88
and posted the results for. This posting contains a copy of that benchmark
at the end. Here are the current results for an Amiga 2000 and the Mac-II,
along with some old results for a Sun-3/50.

Performance timings using Rick Spanbauer's diskperf.c program.

                                        Amiga     Sun      A/UX     A/UX
                                        2000      3/50     Mac-II   Mac-II
                                        ST277N             HD80SC   HD80SC
                                        Nov 88    ?????    Feb 88   Nov 88

File creations (files/sec)                  14        6         6        6
File deletions (files/sec)                  41       11         8        8
Directory scan (entries/sec)                92      350       371      397
Seek+read (seek+read/sec)                   85      298       110       93
Read speed, 512 buffer (byte/sec)        67216   240499     55168    25593
Read speed, 4096 buffer (byte/sec)      109226   234057     53708    25323
Read speed, 8192 buffer (byte/sec)      187245   233189     54013    25183
Read speed, 32768 buffer (byte/sec)     374491   236343     53644    25123
Write speed, 512 buffer (byte/sec)       28187   215166     44181    43855
Write speed, 4096 buffer (byte/sec)     137970   182466     47211    46287
Write speed, 8192 buffer (byte/sec)     154202   179755     46832    46445
Write speed, 32768 buffer (byte/sec)    218453   187580     46930    46707

Notes:

(1) Sun-3/50 timings by Rick Spanbauer.

(2) All Amiga and Mac-II timings done by Fred Fish.

(3) The Amiga 2000 uses an A2090 DMA controller, Workbench 1.3,
    and a Seagate ST277N (40 ms average access time).

(4) The Mac-II uses an HD80SC (30 ms average access time).

Comments:

(1) I included both the Feb 88 and current Mac-II timings because
    the read figures were significantly different. I have no
    explanation for the discrepancy other than to note that the
    disk is probably now significantly fragmented, and I have since
    increased the number of I/O buffers to about 1000.

(2) The Amiga timings I get for the relatively slow ST277N are about
    half of what have been reported by other people for faster
    drives (about 400-800 Kb per second maximum transfer rates).

(3) Considering that the Amiga is a stock 68000 running at less than
    8 MHz, using a 30% slower drive than the Mac-II, it seems obvious
    that disk I/O is not the Mac's strongest feature... :-)

======================================================================

/*
** Disk performance benchmark. If your Amiga configuration is substantially
** different from the ones mentioned here, please run the benchmark and
** report the results to either: ..!philabs!sbcs!rick or posting to
** comp.sys.amiga. Thanks!
**
** To compile benchmark for Unix 4.2/4.3 SUN 3.0/3.2:
**
**	cc -o diskperf -O diskperf.c
**
** Amiga version was cross compiled from a SUN, so you'll have to figure out
** how to compile diskperf under your favorite compiler system. A uuencoded
** Amiga binary version of diskperfa is included with the shar file that
** contained this source listing.
**
** To run diskperf, simply type:
**
**	diskperf [location], e.g. (on Amiga) diskperf ram:
**
** On the Amiga, you will need at least 256K bytes of "disk" wherever you
** choose to run. Unix systems will need about 3 mBytes free (larger size
** test files to delete buffer caching effect).
**
** Disclaimer:
**
**	This benchmark is provided only for the purpose of seeing how fast
**	_your_ system runs the program. No claims are made on my part
**	as to what conclusions may be drawn from the statistics gathered.
**	Just consider this program the "Sieve of Eratosthenes" of disk
**	benchmarks - haggle over the numbers with friends, etc, but
**	don't base purchasing decisions solely on the numbers produced
**	by this program.
**
** Amiga timings gathered thus far:
**
-----------------------------------------------------------------------------
Amiga A-1000, ~7mHz 68000, RAM:

File create/delete:     create 5 files/sec, delete 10 files/sec
Directory scan:         5 entries/sec
Seek/read test:         51 seek/reads per second
r/w speed:              buf 512 bytes, rd 201469 byte/sec, wr 154202 byte/sec
r/w speed:              buf 4096 bytes, rd 655360 byte/sec, wr 374491 byte/sec
r/w speed:              buf 8192 bytes, rd 873813 byte/sec, wr 374491 byte/sec
r/w speed:              buf 32768 bytes, rd 873813 byte/sec, wr 436906 byte/sec
-----------------------------------------------------------------------------
Amiga A-1000, ~7mHz 68000, DF1:

File create/delete:     create [0..1] files/sec, delete 1 files/sec
Directory scan:         43 entries/sec
Seek/read test:         18 seek/reads per second
r/w speed:              buf 512 bytes, rd 11861 byte/sec, wr 5050 byte/sec
r/w speed:              buf 4096 bytes, rd 12542 byte/sec, wr 5180 byte/sec
r/w speed:              buf 8192 bytes, rd 12542 byte/sec, wr 5130 byte/sec
r/w speed:              buf 32768 bytes, rd 12542 byte/sec, wr 5160 byte/sec
-----------------------------------------------------------------------------
Amiga A-1000/CSA Turbo board, ~14 mHz 68020, no 32 bit ram installed, RAM:

File create/delete:     create 7 files/sec, delete 15 files/sec
Directory scan:         8 entries/sec
Seek/read test:         84 seek/reads per second
r/w speed:              buf 512 bytes, rd 187245 byte/sec, wr 145625 byte/sec
r/w speed:              buf 4096 bytes, rd 655360 byte/sec, wr 327680 byte/sec
r/w speed:              buf 8192 bytes, rd 873813 byte/sec, wr 374491 byte/sec
r/w speed:              buf 32768 bytes, rd 873813 byte/sec, wr 436906 byte/sec
-----------------------------------------------------------------------------
Amiga A-1000, ~7 mHz 68000, Ameristar NFS -> SUN-3/50, Micropolis 1325 disk:

File create/delete:     create 3 files/sec, delete 7 files/sec
Directory scan:         10 entries/sec
Seek/read test:         35 seek/reads per second
r/w speed:              buf 512 bytes, rd 30481 byte/sec, wr 3481 byte/sec
r/w speed:              buf 4096 bytes, rd 113975 byte/sec, wr 21664 byte/sec
r/w speed:              buf 8192 bytes, rd 145635 byte/sec, wr 38550 byte/sec
r/w speed:              buf 32768 bytes, rd 145365 byte/sec, wr 37449 byte/sec
-----------------------------------------------------------------------------
SUN-3/50, Adaptec SCSI<->ST-506, Micropolis 1325 drive (5.25", 5 mBit/sec):

File create/delete:     create 6 files/sec, delete 11 files/sec
Directory scan:         350 entries/sec
Seek/read test:         298 seek/reads per second
r/w speed:              buf 512 bytes, rd 240499 byte/sec, wr 215166 byte/sec
r/w speed:              buf 4096 bytes, rd 234057 byte/sec, wr 182466 byte/sec
r/w speed:              buf 8192 bytes, rd 233189 byte/sec, wr 179755 byte/sec
r/w speed:              buf 32768 bytes, rd 236343 byte/sec, wr 187580 byte/sec
-----------------------------------------------------------------------------
**
** Some sample figures from "large" systems:
**
-----------------------------------------------------------------------------
SUN-3/160, Fujitsu SuperEagle, Interphase VSMD-3200 controller:

File create/delete:     create 15 files/sec, delete 18 files/sec
Directory scan:         722 entries/sec
Seek/read test:         465 seek/reads per second
r/w speed:              buf 512 bytes, rd 361162 byte/sec, wr 307200 byte/sec
r/w speed:              buf 4096 bytes, rd 419430 byte/sec, wr 315519 byte/sec
r/w speed:              buf 8192 bytes, rd 409067 byte/sec, wr 314887 byte/sec
r/w speed:              buf 32768 bytes, rd 409600 byte/sec, wr 328021 byte/sec
-----------------------------------------------------------------------------
SUN-3/75, NFS filesystem, full 8192 byte transactions:

File create/delete:     create 9 files/sec, delete 12 files/sec
Directory scan:         88 entries/sec
Seek/read test:         282 seek/reads per second
r/w speed:              buf 512 bytes, rd 238674 byte/sec, wr 52012 byte/sec
r/w speed:              buf 4096 bytes, rd 259334 byte/sec, wr 54956 byte/sec
r/w speed:              buf 8192 bytes, rd 228116 byte/sec, wr 26483 byte/sec
r/w speed:              buf 32768 bytes, rd 243477 byte/sec, wr 36174 byte/sec
-----------------------------------------------------------------------------
DEC VAX 780, RP07:

File create/delete:     create 12 files/sec, delete 12 files/sec
Directory scan:         509 entries/sec
Seek/read test:         245 seek/reads per second
r/w speed:              buf 512 bytes, rd 168041 byte/sec, wr 141064 byte/sec
r/w speed:              buf 4096 bytes, rd 210135 byte/sec, wr 239765 byte/sec
r/w speed:              buf 8192 bytes, rd 206277 byte/sec, wr 239948 byte/sec
r/w speed:              buf 32768 bytes, rd 199222 byte/sec, wr 232328 byte/sec
-----------------------------------------------------------------------------
DEC VAX 750, RA81:

File create/delete:     create 12 files/sec, delete 15 files/sec
Directory scan:         208 entries/sec
Seek/read test:         153 seek/reads per second
r/w speed:              buf 512 bytes, rd 99864 byte/sec, wr 72549 byte/sec
r/w speed:              buf 4096 bytes, rd 142663 byte/sec, wr 166882 byte/sec
r/w speed:              buf 8192 bytes, rd 147340 byte/sec, wr 153525 byte/sec
r/w speed:              buf 32768 bytes, rd 142340 byte/sec, wr 141571 byte/sec
-----------------------------------------------------------------------------
*/

#ifdef unix

#include <sys/types.h>
#include <sys/file.h>
#include <sys/time.h>
#include <sys/stat.h>

#define SCAN_ITER       10
#define RW_ITER         3
#define RW_SIZE         (3*1024*1024)
#define SEEK_TEST_FSIZE (1024*1024)
#define OPEN_TEST_FILES 200
#define TIMER_RATE      100

/*
** Amiga compatibility library for Unix. These are NOT full or correct
** emulations of the Amiga I/F routines - they are intended only to
** run this benchmark.
*/
#define MODE_OLDFILE    1005
#define MODE_NEWFILE    1006
#define ERROR_NO_MORE_ENTRIES
#define OFFSET_BEGINNING -1
#define OFFSET_CURRENT  0

Open(name, accessMode)
char *name;
long accessMode;
{
    int flags, file;

    flags = O_RDWR;
    if(accessMode == MODE_NEWFILE)
        flags |= O_TRUNC|O_CREAT;
    if((file = open(name, flags, 0644)) < 0)
        file = 0;
    return(file);
}

/*
** To be fair, write should be followed by fsync(file) to flush cache. But
** since when are benchmarks fair??
*/
#define Write(file, buffer, length)     write(file, buffer, length)
#define Read(file, buffer, length)      read(file, buffer, length)
#define Close(file)                     close(file)
#define CreateDir(name)                 mkdir(name, 0755)
#define Seek(file, position, mode)      lseek(file, position, \
        (mode==OFFSET_BEGINNING ? 0 : (mode==OFFSET_CURRENT?1:2)))
#define AllocMem(size, constraints)     malloc(size)
#define FreeMem(p, size)                free(p)   /* free() takes only the pointer */
#define DeleteFile(filename)            unlink(filename)

timer_init()
{
    return(1);
}

timer_quit()
{
}

/*
** timer(0) records a reference time; timer(&val) stores the time elapsed
** since that reference in val, in units of 1/TIMER_RATE seconds.
*/
timer(valp)
long *valp;
{
    static struct timeval ref;
    struct timeval current;

    if(valp == (long *)0){
        gettimeofday(&ref, 0);
        return;
    }
    gettimeofday(&current, 0);
    *valp = (current.tv_usec - ref.tv_usec)/(1000000/TIMER_RATE);
    if(*valp < 0){
        current.tv_sec--;
        *valp += TIMER_RATE;
    }
    *valp += (current.tv_sec - ref.tv_sec)*TIMER_RATE;
}

OpenStat(filename)
char *filename;
{
    int fd, result;
    struct stat statb;

    if((fd = open(filename, 0)) < 0)
        return(0);
    result = fstat(fd, &statb);
    close(fd);
    return(result == 0);
}

#else

/*
** Iteration/size definitions smaller for Amiga so benchmark doesn't take
** as long and fits on empty floppy.
*/

#include <exec/types.h>
#include <libraries/dos.h>
#include <devices/timer.h>
#ifdef MANX
#include <functions.h>          /* For Manx only */
#endif

#define SCAN_ITER           5
#define RW_ITER             3
#define RW_SIZE             (256*1024)
#define SEEK_TEST_FSIZE     (256*1024)
#define OPEN_TEST_FILES     100
#define TIMER_RATE          10      /* misnomer, should be resolution */

struct MsgPort *timerport, *CreatePort();
struct timerequest *timermsg, *CreateExtIO();
long TimerBase;

timer_init()
{
    timerport = CreatePort(0, 0);
    if(timerport == (struct MsgPort *)0)
        return(0);
    timermsg = CreateExtIO(timerport, sizeof(struct timerequest));
    if(timermsg == (struct timerequest *)0){
        DeletePort(timerport);
        return(0);
    }
    if(OpenDevice(TIMERNAME, UNIT_VBLANK, timermsg, 0) != 0){
        DeletePort(timerport);
        DeleteExtIO(timermsg, sizeof(struct timerequest));
        return(0);
    }
    TimerBase = (long)timermsg->tr_node.io_Device;  /* Hack */
    return(1);
}

timer_quit()
{
    CloseDevice(timermsg);
    DeleteExtIO(timermsg, sizeof(struct timerequest));
    DeletePort(timerport);
}

timer(valp)
long *valp;
{
    static struct timeval ref;
    long t;

    timermsg->tr_node.io_Command = TR_GETSYSTIME;
    DoIO(timermsg);
    t = timermsg->tr_time.tv_secs;
    if(valp == (long *)0)
        ref = timermsg->tr_time;
    else {
        SubTime(&timermsg->tr_time, &ref);
        *valp = timermsg->tr_time.tv_secs*TIMER_RATE +
            (timermsg->tr_time.tv_micro/(1000000/TIMER_RATE));
    }
}

OpenStat(filename)
char *filename;
{
    long lock, result;
    static struct FileInfoBlock fib;    /* must be on &fib mod 4 == 0 */

    if((lock = Lock(filename, MODE_OLDFILE)) == 0)
        return(0);
    result = Examine(lock, &fib);
    UnLock(lock);
    return(result);
}
#endif

/*
** Benchmarks performed:
**
**  1)  Raw file read/write rates.  Tested for operation sizes of
**      512/4096/8192/32768 bytes.  Return read/write figures for each
**      transfer size in bytes/sec.
**
**  2)  Directory create/delete rates.  Return create/delete entries
**      per second.
**
**  3)  Directory lookup rate.  Create files in directory, and
**      then measure time to lookup, open & stat entire directory contents.
**      Return entries/second.
**
**  4)  Seek speed test - create large file, then seek to various
**      positions in file & read one byte.  Seek distances intentionally
**      chosen large to reduce caching effectiveness - want basic
**      speed of disk format here.  Return seeks/second.
*/

char *prepend = "";     /* prepend this path to all filenames created */
char scratch[8192];     /* scratch buffer used in various tests */

/*
** Our `C' library for the Amiga is a bit different than Unix's, so this
** routine will look a bit obtuse to most of you.  Trying to avoid using
** sprintf()..
*/
maketemp(buf, pref)
char *buf;
{
    char *p, *q;
    int fnum;
    static int cnt;

    fnum = cnt++;
    q = buf;
    if(pref)
        for(p = prepend; *p; )
            *q++ = *p++;
    for(p = "diskperf"; *p; )
        *q++ = *p++;
    *q++ = 'A' + ((fnum>>8)&0xf);
    *q++ = 'A' + ((fnum>>4)&0xf);
    *q++ = 'A' + (fnum&0xf);
    *q++ = 0;
}

long sptest[] = {512, 4096, 8192, 32768, 0};

void rw_test()
{
    long i, j, k, maxsize, file, RDaccTime, WRaccTime, Dt;
    struct timeval t0, t1;
    char *p, filename[64];

    maxsize = -1;
    for(k = 0; sptest[k] != 0; k++)
        if(sptest[k] > maxsize)
            maxsize = sptest[k];
    if((p = (char *)AllocMem(maxsize, 0)) == (char *)0){
        printf("Could not get %d bytes of memory\n", maxsize);
        return;
    }
    for(k = 0; sptest[k] != 0; k++){
        RDaccTime = WRaccTime = 0;
        for(j = 0; j < RW_ITER; j++){
            maketemp(filename, 1);
            if((file = (long) Open(filename, MODE_NEWFILE)) == 0){
                printf("Could not create %s\n", filename);
                return;
            }
            timer(0);
            for(i = RW_SIZE/sptest[k]; i > 0; i--)
                Write(file, p, sptest[k]);
            timer(&Dt);
            WRaccTime += Dt;
            Close(file);
            if((file = (long) Open(filename, MODE_OLDFILE)) == 0){
                printf("Could not open %s\n", filename);
                return;
            }
            timer(0);
            for(i = RW_SIZE/sptest[k]; i > 0; i--)
                Read(file, p, sptest[k]);
            timer(&Dt);
            RDaccTime += Dt;
            Close(file);
            DeleteFile(filename);
        }
        printf("r/w speed:\t\tbuf %d bytes, rd %d byte/sec, wr %d byte/sec\n",
            sptest[k],
            (TIMER_RATE*RW_SIZE)/(RDaccTime/RW_ITER),
            (TIMER_RATE*RW_SIZE)/(WRaccTime/RW_ITER));
    }
    FreeMem(p, maxsize);
}

seek_test()
{
    char fname[64];
    long i, fd, Dt, cnt, pos, dist;

    maketemp(fname, 1);
    if((fd = (long) Open(fname, MODE_NEWFILE)) == 0){
        printf("Could not create %s\n", fname);
        return;
    }
    for(i = SEEK_TEST_FSIZE/sizeof(scratch); i > 0; i--)
        if(Write(fd, scratch, sizeof(scratch)) != sizeof(scratch))
            break;
    if(i == 0){
        cnt = 0;
        timer(0);
        for(dist = 256; dist <= 65536; dist <<= 2)
            for(pos = 0; pos < SEEK_TEST_FSIZE; pos += dist){
                cnt++;
                Seek(fd, pos, OFFSET_BEGINNING);
                Read(fd, scratch, 1);
            }
        timer(&Dt);
        printf("Seek/read test:\t\t%d seek/reads per second\n",
            (TIMER_RATE*cnt)/Dt);
    }
    Close(fd);
    DeleteFile(fname);
}

char tempname[OPEN_TEST_FILES][16];

open_scan_test()
{
    char dirname[64];
    long lock, oldlock, cDt, dDt, sDt, i, j, fd, numRead;
    struct FileInfoBlock *fib;

    maketemp(dirname, 1);
    lock = CreateDir(dirname);
#ifdef unix
    chdir(dirname);
#else
    oldlock = CurrentDir(lock);
#endif
    for(i = 0; i < OPEN_TEST_FILES; i++)
        maketemp(tempname[i], 0);
    /*
    ** Time Open of files.
    */
    timer(0);
    for(i = 0; i < OPEN_TEST_FILES; i++){
        if((fd = Open(tempname[i], MODE_NEWFILE)) == 0){
            printf("Could not open %s/%s\n", dirname, tempname[i]);
            break;
        }
        Close(fd);
    }
    timer(&cDt);
    /*
    ** Time open scan of directory.
    */
    timer(0);
    numRead = 1;
    for(i = 0; i < SCAN_ITER; i++)
        for(j = 0; j < OPEN_TEST_FILES; j++)
            if(OpenStat(tempname[j]) != 0)
                numRead++;
    timer(&sDt);
    /*
    ** Time deletion of files.
    */
    timer(0);
    for(i = 0; i < OPEN_TEST_FILES; i++)
        DeleteFile(tempname[i]);
    timer(&dDt);
    printf("File create/delete:\tcreate %d files/sec, delete %d files/sec\n",
        (TIMER_RATE*OPEN_TEST_FILES)/cDt,
        (TIMER_RATE*OPEN_TEST_FILES)/dDt);
    printf("Directory scan:\t\t%d entries/sec\n", (TIMER_RATE*numRead)/sDt);
#ifdef unix
    chdir("..");
    rmdir(dirname);
#else
    CurrentDir(oldlock);
    DeleteFile(dirname);
#endif
}

main(argc, argv)
int argc;
char **argv;
{
    if(!timer_init()){
        printf("Could not init timer\n");
        return(0);  /* Exit in most systems, but not ours!
*/
    }
    if(argc > 1)
        prepend = argv[1];
    open_scan_test();
    seek_test();
    rw_test();
}

--
# Fred Fish, 1346 West 10th Place, Tempe, AZ 85281, USA
# noao!nud!fishpond!fnf (602) 921-1113
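
The comment in the Unix compatibility section of diskperf.c notes that, to be fair, a write should be followed by fsync() so that data still sitting in the buffer cache is counted. A minimal sketch of such a timed write follows, assuming a POSIX system. The name timed_write_sync and its parameters are hypothetical and not part of diskperf.c, and the result is in microseconds rather than TIMER_RATE ticks.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of diskperf.c): write total/bufsize
 * chunks of bufsize bytes to path, fsync() the file, and return the
 * elapsed wall-clock time in microseconds, or -1 on any error.  The file
 * is left in place so the caller can read it back or unlink() it.
 */
long timed_write_sync(const char *path, const char *buf,
                      size_t bufsize, size_t total)
{
    struct timeval t0, t1;
    size_t left;
    int fd;

    if (bufsize == 0)
        return -1;
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    gettimeofday(&t0, NULL);
    for (left = total; left >= bufsize; left -= bufsize) {
        if (write(fd, buf, bufsize) != (ssize_t)bufsize) {
            close(fd);
            return -1;
        }
    }
    /* Keep the flush inside the timed interval so cached writes count. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    gettimeofday(&t1, NULL);
    close(fd);

    return (long)(t1.tv_sec - t0.tv_sec) * 1000000L
         + (long)(t1.tv_usec - t0.tv_usec);
}
```

Figures taken this way are not comparable with the posted results, which stop the clock before any flush.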
lalonde@nicmad.UUCP (John Lalonde) (11/28/88)
In article <165@fishpond.UUCP> fnf@fishpond.UUCP (Fred Fish) writes:
>I decided to retry the disk performance benchmark that I ran in Feb '88 and
>posted the results for.
>Performance timings using Rick Spanbauer's diskperf.c program.
>
>                                         Amiga      Sun     A/UX     A/UX
>                                          2000     3/50   Mac-II   Mac-II
>                                        ST277N            HD80SC   HD80SC
>                                        Nov 88    ?????   Feb 88   Nov 88
>
>File creations (files/sec)                  14        6        6        6
>File deletions (files/sec)                  41       11        8        8
>Directory scan (entries/sec)                92      350      371      397
>Seek+read (seek+read/sec)                   85      298      110       93
>Read speed, 512 buffer (byte/sec)        67216   240499    55168    25593
>Read speed, 4096 buffer (byte/sec)      109226   234057    53708    25323
>Read speed, 8192 buffer (byte/sec)      187245   233189    54013    25183
>Read speed, 32768 buffer (byte/sec)     374491   236343    53644    25123
>Write speed, 512 buffer (byte/sec)       28187   215166    44181    43855
>Write speed, 4096 buffer (byte/sec)     137970   182466    47211    46287
>Write speed, 8192 buffer (byte/sec)     154202   179755    46832    46445
>Write speed, 32768 buffer (byte/sec)    218453   187580    46930    46707

I compiled diskperf.c and ran it on two different Sun 386i workstations.
I made no modifications to the benchmark and ran it on a stock version
of SunOS 4.0.0.

Sun 386i (model 150/8x)

File creations (files/sec)                  18
File deletions (files/sec)                  54
Directory scan (entries/sec)               976
Seek+read (seek+read/sec)                 1976
Read speed, 512 buffer (byte/sec)       776722
Read speed, 4096 buffer (byte/sec)     2995931
Read speed, 8192 buffer (byte/sec)     3534525
Read speed, 32768 buffer (byte/sec)    3700856
Write speed, 512 buffer (byte/sec)      280618
Write speed, 4096 buffer (byte/sec)    1728421
Write speed, 8192 buffer (byte/sec)    2069557
Write speed, 32768 buffer (byte/sec)   1787345

Notes:
(1) Sun-386i model 150/8x uses a 20MHz 80386, has 8 MB of memory, and 32KB of cache.
(2) Sun 386i configured with a CDC Wren IV (94171-300) disk
(3) Sun 386i uses the Intel 82380 DMA Controller
(4) Sun 386i uses the Western Digital 33c93 SCSI Controller IC

--
John LaLonde
Nicolet Instrument Corporation
uucp: {ucbvax,rutgers,harvard}!uwvax!astroatc!nicmad!lalonde
ggere@csi.3Com.Com (Gary M. Gere) (12/01/88)
I thought it would be interesting to throw in the disk performance
measurements from a DEC VAX-11/750 (~= .75 MIPS) with 3 MB of memory, an
Emulex controller with DMA, and a Fujitsu M2351A "Eagle" 474 MB disk with
18 ms seek time.

                                Amiga      Sun     A/UX     A/UX   4.3BSD          Sun
                                 2000     3/50   Mac-II   Mac-II   VAX750   386i-150/8
                               ST277N            HD80SC   HD80SC  Fujitsu  CDC Wren-IV
                               Nov 88    ?????   Feb 88   Nov 88  "EAGLE"

File creations (files/s)           14        6        6        6       13           18
File deletions (files/s)           41       11        8        8       14           54
Directory scan (entries/s)         92      350      371      397      306          976
Seek+read (seek+read/s)            85      298      110       93      333         1976
Read 512 buffer (byte/s)        67216   240499    55168    25593   180997       776722
Read 4096 buffer (byte/s)      109226   234057    53708    25323   400729      2995931
Read 8192 buffer (byte/s)      187245   233189    54013    25183   440578      3534525
Read 32768 buffer (byte/s)     374491   236343    53644    25123   499321      3700856
Write 512 buffer (byte/s)       28187   215166    44181    43855   123847       280618
Write 4096 buffer (byte/s)     137970   182466    47211    46287   402782      1728421
Write 8192 buffer (byte/s)     154202   179755    46832    46445   489226      2069557
Write 32768 buffer (byte/s)    218453   187580    46930    46707   503316      1787345

--
===============================================================================
CSI Division, 3Com Corporation, 2125 Hamilton Avenue San Jose, Calif 95125-5905
Gary M. Gere  {3comvax,epimass}!csi!ggere or ggere@csi.3Com.Com  +1 408/559-1118
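
Each byte/sec figure in these results comes from the benchmark's integer expression (TIMER_RATE*RW_SIZE)/(accTime/RW_ITER), so the reported rates are quantized by the timer resolution. The sketch below reproduces that arithmetic and adds a guard for a zero-tick interval, which the posted code does not have. The name rate_per_sec is hypothetical and not part of diskperf.c. With the Unix settings (TIMER_RATE 100, RW_SIZE 3*1024*1024), an average of 625 ticks yields 503316 byte/sec and 85 ticks yields 3700856 byte/sec, which is consistent with the VAX750 32768-byte write figure and the 386i 32768-byte read figure respectively.

```c
/*
 * Hypothetical helper (not in diskperf.c): convert a count of items or
 * bytes and an elapsed time in timer ticks into a per-second rate, using
 * the same integer arithmetic as the benchmark.  Returns -1 when the
 * interval is too short to measure (zero or negative ticks).
 */
long rate_per_sec(long count, long ticks, long ticks_per_sec)
{
    if (ticks <= 0)
        return -1;
    return (ticks_per_sec * count) / ticks;
}
```

At 85 ticks, a single tick of timer error moves the result by a little over 1%, so small differences between the fastest figures should be read with that in mind.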