lindsay@gandalf.cs.cmu.edu (Donald Lindsay) (01/29/91)
In article <1991Jan23.154727.26972@mozart.amd.com> tim@amd.com (Tim Olson) writes:
>Register file access is not on the most critical path in the Am29000
>(and we have 192 3-ported registers!). Usually cache lookups and TLB
>matching tend to be more critical, because they involve an immediate
>comparison of tags after the access, and the arrays are usually larger
>than register file arrays.

Does this mean that a machine with Tomasulo-style tag matching would
have no cycle time penalty? How sensitive would that be to the tag
width, when there are, say, 50 or 80 destinations?

[I am referring to a machine where tagged values are broadcast over an
internal bus, and destinations select themselves via tag matching. The
360/91 FPU used this in place of traditional register address decoders.]
--
Don		D.C.Lindsay .. temporarily at Carnegie Mellon Robotics
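The broadcast-and-match scheme described in the bracketed note above can be sketched roughly as follows. This is only a software illustration, not the 360/91 design: the names `Destination`, `snoop`, `broadcast`, and `tag_bits` are hypothetical, and `tag_bits` assumes one distinct tag per producer, which is a simplification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Destination:
    """A register or operand slot that may be waiting on a tagged result."""
    name: str
    waiting_tag: Optional[int] = None  # None means the value is already present
    value: Optional[float] = None

    def snoop(self, tag: int, value: float) -> bool:
        """Compare the broadcast tag with our own and capture the value on a match."""
        if self.waiting_tag is not None and self.waiting_tag == tag:
            self.value = value
            self.waiting_tag = None
            return True
        return False


def broadcast(destinations: List[Destination], tag: int, value: float) -> List[str]:
    """Model one result-bus cycle: every destination sees the same (tag, value).

    Hardware would do all comparisons in parallel; the loop is only a
    sequential stand-in for that.
    """
    matched = []
    for dest in destinations:
        if dest.snoop(tag, value):
            matched.append(dest.name)
    return matched


def tag_bits(num_tags: int) -> int:
    """Minimum comparator width needed to distinguish num_tags tags (assumption)."""
    return max(1, (num_tags - 1).bit_length())
```

Under that one-tag-per-producer assumption, `tag_bits(50)` and `tag_bits(80)` give 6 and 7 bits, so each destination would need a comparator only a few bits wide. The sketch says nothing about wire delay on the broadcast bus.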
tim@proton.amd.com (Tim Olson) (01/31/91)
In article <11705@pt.cs.cmu.edu> lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
|
| In article <1991Jan23.154727.26972@mozart.amd.com>
| tim@amd.com (I) write:
| >Register file access is not on the most critical path in the Am29000
| >(and we have 192 3-ported registers!). Usually cache lookups and TLB
| >matching tend to be more critical, because they involve an immediate
| >comparison of tags after the access, and the arrays are usually larger
| >than register file arrays.
|
| Does this mean that a machine with Tomasulo-style tag matching would
| have no cycle time penalty? How sensitive would that be to the tag
| width, when there are, say, 50 or 80 destinations?

Probably. Instruction and data caches are typically 1-, 2-, or 4-way
set-associative, so they still involve a significant access time for
each "way" before the tag compare (a 2-way set-associative, 64KB cache
with a 4-word line size has to look up one of 2048 entries).
Tomasulo-style result tagging, by contrast, is fully associative: all
of the reservation station entries are compared to the result tag(s)
in parallel, so it doesn't have the large access-time component that
the caches have.

There may be some time penalty involved in distributing the result
tags to all reservation station entries, however, which may cause it
to become one of the critical paths.

--
	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)
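The 2048-entry figure in the reply above can be checked with a little arithmetic. The sketch below assumes 4-byte words, which the post does not state; the function names `cache_sets` and `index_bits` are hypothetical.

```python
def cache_sets(capacity_bytes: int, ways: int, line_words: int,
               word_bytes: int = 4) -> int:
    """Number of sets (entries per way) in a set-associative cache.

    word_bytes=4 is an assumption (32-bit words), not stated in the post.
    """
    line_bytes = line_words * word_bytes
    total_lines = capacity_bytes // line_bytes
    return total_lines // ways


def index_bits(sets: int) -> int:
    """Address bits that must be decoded to select one set before the tag compare."""
    return (sets - 1).bit_length()


sets = cache_sets(64 * 1024, ways=2, line_words=4)
print(sets, index_bits(sets))
```

With these assumptions the cache has 64KB / 16 bytes = 4096 lines, or 2048 sets of 2 lines each, so an 11-bit index has to be decoded and the array read before the two tag comparisons can start. A fully associative match has no such indexed array read in front of its comparators.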