mark@mips.UUCP (Mark G. Johnson) (11/10/87)
A question from jpdres10@usl-pc.UUCP (Green Eric Lee):
> ... how much does the adder in the register path slow things down
> in the AMD29000? ... And, of course, Plain Old Registers have no
> problem at all with something out there in the register addressing
> path, except, of course, the decode tree ...

and a response from bcase@apple.UUCP (Brian Case):
> Ok, so to address future speed advantages, yes there might be some
> speed advantages for those with simple register files. However, for
> the Am29000, the critical paths were quite balanced ... with,
> I believe, the TLB and/or instruction cache being the limiting factor.
> Next came the ALU, and then the register file. Unless you want to do
> things like spread the ALU cost over two pipestages (possible to do),
> I don't think the register file is going to be the limiting factor.
> ... what do other people have to say?

At least in MOS implementations, I'd agree with Brian that register file access will not be the speed-limiting path in future RISC chips, for both "windowed" and "flat" register file architectures. {I dunno about Bipolar or MESFET implementations}.

A major reason: fast floating-point coprocessors. If RISCs stick to their current preference for synchronous instruction-stream co-interpreters, then the list of potential critical paths now includes all paths on the CPU *and* coprocessor(s), plus the generation/reception of the coprocessor handshake signals.

For example, the double-precision fp ADD operation (52 bit mantissa, 11 bit exponent) is required to complete in two cycles in the MIPS fp chip. Doing all of the normalization shifting, exponent adjusting, mantissa addition, exception detection, and the *%#$_@ IEEE rounding operations is "intuitively" :-) :-) more than twice as bad as register file access, windows or not.

Ignoring coprocessors for a moment, I think it's likely that on-chip TLB's will continue to be slower than register files.
Usually the TLB contains many more bits than the register file, so its memory-array time constants are longer. TLB accesses also include some logical operations not found in the register file: hit/miss detection, plus output selection {for set-associative or fully-associative TLB's}.

Finally, the "circular definition" argument: If it's suspected that register file access will be THE critical path, then the RISC design team will begin by designing and optimizing (and re-optimizing) the register file until it's as fast as that team of engineers can possibly make it. Now they know THE lower bound on the cycle time. So they use this cycle time definition in designing the rest of the chip, taking advantage of every last nanosecond wherever possible. In some places, they'll make good use of these extra nanoseconds, thus creating new "critical paths". Meaning that the register file is no longer THE critical path.

Regards,
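
A minimal sketch of the "circular definition" argument above, for anyone who prefers it spelled out. Everything here is an assumption made for illustration: the path names, the function names `critical_paths` and `spend_slack`, and especially the delay numbers, which are made up and are not measurements of the Am29000 or any MIPS part. The only idea it models is that the cycle time is set by the slowest path, and that other paths designed against that cycle time tend to grow until they tie with it.

```python
# Hypothetical illustration only: delays are invented numbers in nanoseconds.

def critical_paths(delays_ns):
    """Return (cycle_time, sorted names of the paths that set it)."""
    cycle = max(delays_ns.values())
    limiting = sorted(name for name, d in delays_ns.items() if d == cycle)
    return cycle, limiting


def spend_slack(delays_ns, floor_path, spenders):
    """Model the rest of the chip using up its slack.

    The delay of `floor_path` is taken as the fixed cycle time. Each path
    named in `spenders` that is currently faster than that is stretched
    until it just meets the cycle time (e.g. by adding function or saving area).
    """
    cycle = delays_ns[floor_path]
    out = dict(delays_ns)
    for name in spenders:
        if out[name] < cycle:
            out[name] = cycle
    return out


if __name__ == "__main__":
    # Step 1: the register file is suspected to be THE critical path.
    before = {"register_file": 20, "alu": 17, "tlb": 15}
    print(critical_paths(before))

    # Step 2: the other blocks are designed against that cycle time.
    after = spend_slack(before, "register_file", ["alu", "tlb"])
    print(critical_paths(after))
```

With these made-up numbers the register file is the sole limiting path before the slack is spent, and one of three tied paths afterwards, which is all the argument claims.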