RWilson@acorn.co.uk (11/14/89)
Some data points for you on big CISC vs big RISC.

I have in front of me Intel's "EVOLUTIONARY" picture of the 1.2 million transistor '486. This picture measures 53 cm by 33 cm including the pad ring. There are 3 large pieces of control logic: one across the top, 34 cm by 8 cm (above what I take to be a datapath of similar size), another on the right, 9 cm by 14 cm, and a last on the bottom right, 6 cm by 8 cm. The cache (8KByte, mixed data and instruction) on the bottom left is 30 cm by 11 cm.

So out of 1749 square cm, the control logic uses 446 sq cm (a quarter of the area) and the cache 330 sq cm (less than a fifth of the area). But clearly the cache has more transistors (at least 393,216 [one third of the chip] if they used a 6 transistor cell, for the data store alone).

[I have left out the microcode ROMs and PLAs from the control logic discussion: the big ROM is 12 cm by 5 cm and there are lots of smaller ones. Any CISC device wanting to do TAN etc will do it by microcode ROM (RISC would do it by software presently, but the on chip ROM isn't that big.....), which could be omitted in a RISCier CISC.]

I also have a picture of ARM3, Acorn's 309,000 transistor ARM-with-4K-cache. This is 18 cm by 17 cm, again including the pad ring. The ARM instruction control logic is 5.5 cm by 3.5 cm; the cache control logic is 3 cm by 6 cm. (I suspect the cache control logic on the '486 is the bit to the right of the cache: not counted above.) The cache has an 8 cm by 12 cm data store and four 4.5 cm by 2 cm associative tag stores. [yes, ARM3's cache really is 64 way set associative]

So out of 306 square cm, the control logic of ARM3 plus cache uses 37.25 sq cm (around 1/8 of the chip) and the cache uses 132 square cm (around half of the chip). [the rest of ARM3 is datapath, wiring between blocks and pad ring]

So what's bad about CISC is not the number of transistors the control logic for decoding the complex instructions needs, but the sheer irregularity (aka complexity) of it.
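As a sanity check on the arithmetic above, here is a minimal Python sketch that recomputes the areas and fractions from the quoted picture measurements. The names (`area`, `I486`, `ARM3`, `share`) are purely illustrative and not from the post; the 6 transistor cell is the post's own assumption, and the centimetres are picture dimensions, not die dimensions.

```python
# Recompute the area figures quoted in the post (all values in cm on the pictures).

def area(width_cm, height_cm):
    """Area of a rectangular block on the picture, in square cm."""
    return width_cm * height_cm

I486 = {
    "total": area(53, 33),
    # Three large blocks of control logic: top, right, bottom right.
    "control": area(34, 8) + area(9, 14) + area(6, 8),
    "cache": area(30, 11),
}

ARM3 = {
    "total": area(18, 17),
    # ARM instruction control logic plus cache control logic.
    "control": area(5.5, 3.5) + area(3, 6),
    # Data store plus four associative tag stores.
    "cache": area(8, 12) + 4 * area(4.5, 2),
}

def share(chip, block):
    """Fraction of the whole picture occupied by the named block."""
    return chip[block] / chip["total"]

# Lower bound on '486 cache transistors: 8 KByte data store, 6 transistors per bit.
CACHE_BITS = 8 * 1024 * 8
CACHE_TRANSISTORS = CACHE_BITS * 6

if __name__ == "__main__":
    for name, chip in (("'486", I486), ("ARM3", ARM3)):
        print(name,
              "control: %.3f" % share(chip, "control"),
              "cache: %.3f" % share(chip, "cache"))
    print("'486 cache transistors (6T cell):", CACHE_TRANSISTORS,
          "= %.3f of 1.2 million" % (CACHE_TRANSISTORS / 1.2e6))
```

Run as a script, this prints the control and cache shares for each chip. Note that the ARM3 cache share works out to a little over 0.4, which the post rounds to "around half".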
Random logic just can't be packed as closely as, say, RAM on a chip design (and is getting relatively less dense as the geometries get smaller). This impacts the design time and chip area in an adverse manner - and thus the performance. The i860 (identical process to '486) with a RISC design manages to put the same basic functionality, a little faster, with 12KBytes of cache into the same (approximately) sized chip.

In the future, with cache size such a dominant factor for single chip performance, one should expect processor units built on technology optimised towards making the RAM, thus making the logic parts relatively less dense again.

[note that I'm not saying the '486 or ARM3 are smaller/better/faster/more correct/cheaper than each other :-)]

--Roger Wilson (RWilson@Acorn.co.uk) (VLSI designers do it regularly) (sorry it's all relative)