daveh@cbmvax.commodore.com (Dave Haynie) (06/12/90)
In article <524@cvbnetPrime.COM> jshekhel@feds19.UUCP (Jerry Shekhel ) writes:
>In article <10373@batcomputer.tn.cornell.edu> (Daniel S. Riley) writes:
>>That wasn't mindless ravings. Seriously, that 1/3 number confuses me--it
>>should be lots higher. Going from a 16 bit to 32 bit machine and doubling
>>the clock speed (8 MHz to 16 MHz) should give you at least a factor of 4,
>>and then some because the '30 takes fewer clocks to do some things.
>
>Not really, because at 16 MHz, an uncached system is no longer zero-wait,
>unless you use 60ns DRAMs, which I doubt these machines will use. Am I
>wrong about this? I'd like to know for sure.

Well, you're right about wait-states, but that doesn't matter. In rough
terms, you get quite a few "factor of two" optimizations.

First there's bus width. The 32 bit vs. 16 bit bus does help considerably,
mainly because of the way the 68030 fetches and the way its instruction
pipeline works. Much plain 68000 code is 32 bit code/data, and even that
which isn't is going to get a boost from the pipeline.

Next come the clock speeds, which are typically 2x-3x that of a 68000's
clock.

Then you get the memory cycle, which is twice as fast at the same clock
speed, without wait states: the '030 fetches 32 bits in 2 clocks, the 68000
fetches 16 bits in 4 clocks.

Next come the caches. A cache hit really takes 2 clocks, but it also has
the effect of freeing up the bus unit. So if the CPU needs an instruction
and a datum, and one is in the cache, fetching one from the bus and one from
the cache is a big win. In fact, in the case of cache hits for both
instruction and data, the 68030's Harvard architecture permits both to be
used at the same time -- it's basically 64 bits internally.

If you add up all these possible "factor of 2" improvements, you conclude
that a 68030 at 16MHz might run roughly 16 times faster than a 68000 at
8MHz.
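
The arithmetic behind that 16x figure can be sketched in a few lines of Python. This is only a back-of-the-envelope model, not a benchmark: the function name `bus_bandwidth`, the `wait_states` parameter, and the `CACHE_OVERLAP` constant are illustrative names invented here, and the model assumes back-to-back memory cycles with nothing else limiting the CPU.

```python
def bus_bandwidth(clock_hz, bus_bytes, clocks_per_cycle, wait_states=0):
    """Peak bytes per second, assuming back-to-back memory cycles."""
    return clock_hz * bus_bytes / (clocks_per_cycle + wait_states)

# Figures from the post: the 68000 moves 16 bits in 4 clocks at 8 MHz,
# the 68030 moves 32 bits in 2 clocks at 16 MHz (zero wait states).
m68000 = bus_bandwidth(8_000_000, 2, 4)
m68030 = bus_bandwidth(16_000_000, 4, 2)

# Bus width x clock speed x memory cycle, rolled into one ratio.
bus_ratio = m68030 / m68000

# Assumption: simultaneous instruction and data cache hits count as one
# more factor of two, following the "64 bits internally" remark.
CACHE_OVERLAP = 2
peak_ratio = bus_ratio * CACHE_OVERLAP

print(f"68000 @ 8 MHz : {m68000 / 1e6:.0f} MB/s")
print(f"68030 @ 16 MHz: {m68030 / 1e6:.0f} MB/s")
print(f"bus ratio {bus_ratio:.0f}x, peak ratio {peak_ratio:.0f}x")

# One wait state on the 68030 stretches its cycle from 2 to 3 clocks.
one_wait = bus_bandwidth(16_000_000, 4, 2, wait_states=1) / m68000
print(f"bus ratio with one wait state: {one_wait:.2f}x")
```

Under these assumptions the three bus-related factors give 8x by themselves, and the cache overlap supplies the last factor of two.
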
That factor of 16 is really a peak value, but considering the optimizations
done on many of the standard instructions, the peak performance might be
even higher. Now, in reality, you're not always running simultaneous cache
hits, you may not have a full pipeline, and you may not be running with 0
wait state memory. But it's quite reasonable to see a factor of 4-8 speedup
in common, ordinary CPU bound applications.

>-- JJS

--
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
        "I have been given the freedom to do as I see fit" -REM