mrm@sceard.Sceard.COM (M.R.Murphy) (09/23/89)
In article <2406@ucsfcca.ucsf.edu> root@cca.ucsf.edu (Systems Staff) writes:
>In article <24550@louie.udel.EDU>, mcd@ccsitn.att.com writes:
>>
>> Actually the discussion about DMA for PC's has been one of the MOST
>> informative discussion/thread/topics out of all the myriad things I
>> have seen on the list
>
>Yes, too bad some of it is just plain wrong.

Yup.

>For example, some of the discussion compared a single bus cycle of DMA
>to a single bus cycle for CPU transfer where the CPU transfer requires
>a bus cycle to fetch and another one to store. Although the clock
>count per cycle may be higher, DMA just requires one bus cycle for a
>transfer.
>
>Then there is the issue of interrupt handling time blocking other
>interrupts.
>
>Etc. etc.

A big part of the etc., etc. is the question of DMA usefulness. The
point of DMA is to allow the CPU to go off and do "good stuff" whilst
the DMA transfer is in progress. In a single-user, single-tasking
environment, it may be useless, slower, and silly to use DMA. Why
bother having the code do

    start DMA
    loop while DMA not done
    cleanup from DMA

in a single-thread environment? Doesn't make much sense unless one
wishes to test DMA hardware.

If the CPU can be off doing USEFUL other work, like running some other
task, DMA can really improve system throughput -- even if the DMA
transfer is slower than the CPU instructions might be. This of course
presumes that the DMA isn't blessed with some other pathological design
flaw like keeping the CPU from getting at memory. It also means that
the software that uses the DMA can't keep the CPU from doing other
"good stuff". Having code do

    lock valuable and critical resource
    do DMA process that will unlock valuable and critical resource when done
    wait on valuable and critical resource before doing good stuff

is also a bit misguided.
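To put rough numbers on the throughput argument, here is a back-of-the-envelope sketch in Python. It is a toy timing model, not a measurement of any real machine: the function names and the time units are made up for illustration, and it assumes double buffering and a DMA that does not keep the CPU from getting at memory.

```python
# Toy model: n blocks, each needs one transfer and one chunk of CPU work.
# All times are in arbitrary units; every name here is hypothetical.

def cpu_copy_time(n, copy, work):
    """CPU moves each block itself, then works on it. Nothing overlaps."""
    return n * (copy + work)

def dma_busy_wait_time(n, dma, work):
    """Start DMA, loop while DMA not done, then work. Nothing overlaps."""
    return n * (dma + work)

def dma_overlapped_time(n, dma, work):
    """While the CPU works on block k, DMA brings in block k+1."""
    if n == 0:
        return 0
    # First transfer has nothing to overlap with, and neither does the
    # last chunk of work; each step in between costs the slower of the two.
    return dma + (n - 1) * max(dma, work) + work

if __name__ == "__main__":
    n, copy, dma, work = 10, 2, 3, 4   # DMA is slower than the CPU copy
    print(cpu_copy_time(n, copy, work))       # 60
    print(dma_busy_wait_time(n, dma, work))   # 70
    print(dma_overlapped_time(n, dma, work))  # 43
```

With these invented numbers the busy-wait DMA loses to the plain CPU copy, while the overlapped DMA beats both even though each DMA transfer is slower than the CPU doing the move itself.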
There is lots of code that does the equivalent of that lock-and-wait
sequence hiding in the murky depths of operating systems that are
licensed without source :-) The code may even be dependent on hardware
configurations that weren't expected.

Long ago, there was a timesharing system that would run in 8K 12-bit
words. The system required 4K and the user got 4K. One user, fine. With
two users, the story goes

    execute user 1 for a slice
    DMA swap user 1 out, memory marked unavailable, no user runs
    DMA swap user 2 in, memory marked unavailable, no user runs
    execute user 2 for a slice
    DMA swap user 2 out, memory marked unavailable, no user runs
    DMA swap user 1 in, memory marked unavailable, no user runs
    execute user 1 for a slice
    ...

Not particularly efficient, but it works. If the system has 12K words,
then there can be 2 user "partitions" and the story goes like this with
2 users

    execute user 1 for a slice
    execute user 2 for a slice
    execute user 1 for a slice
    ...

With 12K and 3 or more users, the story can still go like

    execute user 1 for a slice
    DMA swap user 1 out of partition 1, execute user 2 in partition 2
    DMA swap user 3 into partition 1, execute user 2 in partition 2
    execute user 2 for a slice (or switch to user 3)
    DMA swap user 2 out of partition 2, execute user 3 in partition 1
    DMA swap user 1 into partition 2, execute user 3 in partition 1
    execute user 1 for a slice
    ...

The key is that the swap time is overlapped with execute time and the
system does "good stuff" work while the DMA chugs along. Even if it's
slower than the CPU. Take away the DMA and have the CPU do the swapping
itself, and the throughput goes down.

Gee, let's let the CPU handle the I/O transfer for swapping and for
paging, too. That way no user needs to get in the way and bother the
system when it's doing its own important work.

Of course, no modern systems designers would ever forget the lessons
learned when entire machines including their rotating storage had less
memory than a single chip has today?
Would they :-)

--
Mike Murphy  Sceard Systems, Inc.  544 South Pacific St.  San Marcos, CA 92069
mrm@Sceard.COM  {hp-sdd,nosc,ucsd,uunet}!sceard!mrm  +1 619 471 0655