marc@oahu.cs.ucla.edu (Marc Tremblay) (04/13/90)
Recent chip designs have taken advantage of wide instruction paths to fetch, decode and issue more than one instruction per cycle. These chips need a multi-ported register file to sustain the bandwidth necessary to provide several operands per cycle. For example, the Intel 80960CA is advertised as having a 6-port register file (though some ports are probably time-multiplexed).

I have previously designed a standard dual-port static register file (two simultaneous reads, one write per cycle), and I was wondering whether the circuits normally used for a simple register file are also used for a multi-ported one. For example, is a design with cross-coupled inverters driving the data lines through access transistors still a valid choice?

Another interesting aspect is the design of the decoders required to access the ports. Some of these can be time-multiplexed, but the remaining decoders still require a lot of area.

There are other important factors, such as the forwarding unit and the load interlock circuitry associated with a multi-ported register file. In brief, there is a lot to discuss, and besides, it is not a RISC vs. CISC topic!

Marc Tremblay
marc@CS.UCLA.EDU
upton@badger.cs.washington.edu (Michael Upton) (04/13/90)
In regard to the design of multiported register files:

About two read ports are as many as a standard cross-coupled-inverter RAM cell can support. Simultaneous reads on three or more ports to the same address corrupt the stored data. The standard fix is to add another inverter to the read side of the cross-coupled inverters, thus decoupling the read from the internal nodes of the RAM cell.

Mike Upton
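
The hazard described above can be sketched as a per-cycle check. This is only a behavioral illustration in Python, not a circuit model: the function name `overloaded_addresses` and the parameter `max_safe_reads` are made up for this sketch, and the default limit of 2 is taken from the "about 2 read ports" figure in the post. Passing `None` stands in, as an assumption, for a cell whose read path is buffered by the extra inverter.

```python
from collections import Counter


def overloaded_addresses(read_addrs, max_safe_reads=2):
    """Return the addresses read by more ports than the cell tolerates.

    read_addrs: the register address presented on each read port this cycle.
    max_safe_reads: hypothetical limit on same-address reads for an
        unbuffered cell; None models a buffered read path with no limit.
    """
    if max_safe_reads is None:
        return []
    counts = Counter(read_addrs)
    return sorted(addr for addr, n in counts.items() if n > max_safe_reads)


# Three ports read register 5 in the same cycle, one port reads register 9.
print(overloaded_addresses([5, 5, 5, 9]))  # prints [5]
```

A check like this could sit in a simulator to flag cycles that an unbuffered cell would not survive; it says nothing about how close to the limit a real cell actually is.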
bron@bronze.wpd.sgi.com (Bron Campbell Nelson) (04/15/90)
In article <11426@june.cs.washington.edu>, upton@badger.cs.washington.edu (Michael Upton) writes:
> In regard to the design of multiported register files:

One thing I've wondered: how much extra chip area does it take to build a multi-port register file? The late lamented MultiFlow VLIW machine, and the new crop of "super-scalar" chips that issue several instructions per clock, must be able to read and write large numbers of registers simultaneously (something on the order of 10 reads and 5 writes per clock). How much extra hardware is needed to do this? How many more levels of logic are required over the "2 read 1 write" case?

--
Bron Campbell Nelson
bron@sgi.com or possibly ..!ames!sgi!bron
These statements are my own, not those of Silicon Graphics.
mark@mips.COM (Mark G. Johnson) (04/15/90)
In article <56847@sgi.sgi.com> bron@bronze.wpd.sgi.com (Bron Campbell Nelson) writes:
>> In regard to the design of multiported register files:
>
>One thing I've wondered: how much extra chip area does it take to
>build a multi-port register file? The late lamented MultiFlow VLIW
>machine, and the new crop of "super-scalar" chips that issue several
>instructions per clock, must be able to read and write large numbers of
>registers simultaneously (something on the order of 10 reads and 5
>writes per clock). How much extra hardware is needed to do this?

Consider, for a moment, the _hypothesis_ that superscalar CPUs require many-many-ported register files, *and* that physical implementation of these additionally-ported files requires more hardware than the (2R,1W) register files of olden (nonsuperscalar) days. Just a hypothesis; it may or may not be true in real life.

Wouldn't it be unpleasant if you had to add this extra hardware to a Large register file, for example one that had 7 or 8 windows of 16 regs per window? A penalty multiplied by a penalty, it might seem.

:-) :-) of course, gate arrays ARE getting denser all the time ... :-) :-)

--
Mark Johnson
MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
(408) 991-0208
mark@mips.com {or ...!decwrl!mips!mark}
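
The "penalty multiplied by a penalty" can be put into rough numbers. The Python sketch below rests on a first-order assumption that is not stated anywhere in the thread: each port adds one wordline and one bitline to every cell, so cell area grows with the square of the port count. The function `relative_area`, the 32-register flat baseline, and the 32-bit width are hypothetical; the 10-read/5-write figure and the 8 windows of 16 registers come from the posts above. Decoders, forwarding and interlock logic are ignored.

```python
def relative_area(num_regs, read_ports, write_ports, bits=32):
    """Unitless storage-array area under the assumed quadratic port scaling."""
    ports = read_ports + write_ports
    # Assumption: cell width and cell height each grow linearly with ports.
    return num_regs * bits * ports ** 2


base = relative_area(32, 2, 1)           # hypothetical flat (2R,1W) file
windowed = relative_area(8 * 16, 2, 1)   # 8 windows of 16 regs, (2R,1W)
wide = relative_area(32, 10, 5)          # flat file, 10 reads and 5 writes
both = relative_area(8 * 16, 10, 5)      # windowed and many-ported

print(windowed / base)  # prints 4.0
print(wide / base)      # prints 25.0
print(both / base)      # prints 100.0
```

Under these assumptions the two penalties simply multiply (4 x 25 = 100). Real layouts may do better or worse, for example by time-multiplexing ports as mentioned at the start of the thread.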