davec@nucleus.amd.com (Dave Christie) (07/26/90)
In article <37269@shemp.CS.UCLA.EDU> marc@oahu.cs.ucla.edu (Marc Tremblay) writes:
>
>A full cycle seems to be allocated for register renaming. That's plenty
>of time to access the map table and manage the tag lists. I suspect that
>they may even do it twice per cycle to reduce the number of ports
>of the map table. Indeed if two instructions can be renamed per cycle,
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>and if both are FMA (Fused multiply and add) which require 3 source
>tags and one destination tag, that's 8 ports/cycle for the mapping table.

Are you implying that two FP instructions can be issued in one cycle?!
I don't believe this is the case.

----------------------------------
Dave Christie    My opinions only.
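The 8 ports/cycle figure in the quoted paragraph is simple arithmetic, and
it can be written out as a small Python sketch. The function names are made
up for illustration. The halving for a twice-per-cycle access is only the
possibility the quoted post speculates about, not a confirmed detail of the
RS/6000.

```python
import math

def map_table_accesses(instructions_per_cycle, source_tags, dest_tags):
    """Map-table accesses per cycle: one per source tag and destination tag."""
    return instructions_per_cycle * (source_tags + dest_tags)

def physical_ports(accesses_per_cycle, passes_per_cycle):
    """Ports needed if the table is accessed passes_per_cycle times per cycle."""
    return math.ceil(accesses_per_cycle / passes_per_cycle)

# Two FMAs renamed per cycle, each with 3 source tags and 1 destination tag.
fma_accesses = map_table_accesses(2, 3, 1)

single_pass = physical_ports(fma_accesses, 1)  # table accessed once per cycle
double_pass = physical_ports(fma_accesses, 2)  # hypothetical: accessed twice per cycle

print(fma_accesses, single_pass, double_pass)  # prints: 8 8 4
```

Under these assumptions, accessing the table twice per cycle would cut the
physical port count from 8 to 4.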
mark@hubcap.clemson.edu (Mark Smotherman) (07/27/90)
From davec@nucleus.amd.com (Dave Christie):
>
> Are you implying that two FP instructions can be issued in one cycle?!
> I don't believe this is the case.

Yes, the instruction dispatch on RS/6000 does not require a matched
pair of an integer instruction and a flt.pt. instruction in order to
dispatch (issue?) multiple instructions per cycle (as does the i860
in DIM and the i960CA).

From H. Bakoglu and T. Whiteside, "RISC System/6000 Hardware Overview,"
in IBM RISC System/6000 Technology, 1990, order no. SA23-2619, p. 11:

     Four instructions per cycle can be fetched from the I-cache
     arrays to the instruction buffers and dispatch unit, which
     can dispatch up to four instructions per cycle. Two of these
     are internal dispatches to the ICU (branches and condition
     register instructions) and two are external dispatches to
-->  the FXU and FPU. There is no restriction on the combination
-->  of instructions that are dispatched to the FXU and FPU. They
-->  can be a fixed- and a floating-point instruction, or two
-->  fixed-point instructions, or two floating-point instructions.
     Because the fixed- and floating-point instructions are not mated
     together, instruction dispatch bandwidth or code space is not
     wasted. [The FXU and FPU both have instruction buffers with
     12 entries each to even out the dispatching patterns.]

Very nice.

--
Mark Smotherman, Comp. Sci. Dept., Clemson University, Clemson, SC 29634
INTERNET: mark@hubcap.clemson.edu    UUCP: gatech!hubcap!mark
billms@caen.engin.umich.edu (Bill Mangione-Smith) (07/27/90)
In article <9871@hubcap.clemson.edu> mark@hubcap.clemson.edu (Mark Smotherman) writes:
>From davec@nucleus.amd.com (Dave Christie):
>>
>> Are you implying that two FP instructions can be issued in one cycle?!
>> I don't believe this is the case.
>
>Yes, the instruction dispatch on RS/6000 does not require a matched
>pair of an integer instruction and a flt.pt. instruction in order to
>dispatch (issue?)

Nope, this is not quite true. One FP instruction can be issued by the
FPU each clock, though it can be a mult-add. Two can be sent (i.e. dispatched)
to the FPU each clock, but at least one of them sits there in a queue. This
removes one more worry from the compiler about matching up instructions
for dispatching from the I cache unit.

>Mark Smotherman, Comp. Sci. Dept., Clemson University, Clemson, SC 29634

Bill Mangione-Smith
billms@eecs.umich.edu
marc@oahu.cs.ucla.edu (Marc Tremblay) (07/27/90)
>billms@caen.engin.umich.edu (Bill Mangione-Smith) writes:
>> mark@hubcap.clemson.edu (Mark Smotherman) writes:
>>From davec@nucleus.amd.com (Dave Christie):
>>>
>>> Are you implying that two FP instructions can be issued in one cycle?!
>>> I don't believe this is the case.
>>
>>Yes, the instruction dispatch on RS/6000 does not require a matched
>>pair of an integer instruction and a flt.pt. instruction in order to
>>dispatch (issue?)
>
>Nope, this is not quite true. One FP instruction can be issued by the
>FPU each clock, though it can be a mult-add. Two can be sent (i.e. dispatched)
>to the FPU each clock, but at least one of them sits there in a queue. This
>removes one more worry from the compiler about matching up instructions
>for dispatching from the I cache unit.

The original discussion was on register renaming.
Yes, the RS/6000 can *rename* two instructions per cycle.
Yes, the RS/6000 can execute some combinations of two FP instructions
in 1 cycle. For example a floating-point load and a floating-point
mult-add can be executed in parallel (the fixed-point unit does
most of the work anyway!).

If we talk about floating-point arithmetic instructions, no, the RS/6000
cannot execute two of them per cycle (most of them can be pipelined though).
Since the FPU can rename more instructions than it can execute, a buffer
must be inserted between the rename logic and the execution logic.
That's accomplished by the decode buffer.

_________________________________________________
Marc Tremblay
internet: marc@CS.UCLA.EDU
UUCP: ...!{uunet,ucbvax,rutgers}!cs.ucla.edu!marc
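The need for a buffer between a two-per-cycle front end and a one-per-cycle
execution stage can be seen in a toy Python model. This is only a sketch
under stated assumptions: the function and parameter names are hypothetical,
an instruction is allowed to issue in the same cycle it enters the queue,
and no real RS/6000 timing or buffer capacity is modeled.

```python
from collections import deque

def simulate(arrivals, dispatch_width=2, issue_width=1):
    """Return the queue occupancy at the end of each cycle.

    arrivals[i] is the number of FP arithmetic instructions offered in
    cycle i; at most dispatch_width of them enter the queue per cycle
    (any excess is ignored in this toy model). The loop keeps running
    after the arrivals end, until the queue has drained.
    """
    queue = deque()
    occupancy = []
    next_id = 0
    cycle = 0
    while cycle < len(arrivals) or queue:
        offered = arrivals[cycle] if cycle < len(arrivals) else 0
        # Dispatch: up to dispatch_width instructions enter the queue.
        for _ in range(min(offered, dispatch_width)):
            queue.append(next_id)
            next_id += 1
        # Issue: up to issue_width instructions leave the queue, oldest first.
        for _ in range(min(issue_width, len(queue))):
            queue.popleft()
        occupancy.append(len(queue))
        cycle += 1
    return occupancy

# Three cycles with two FP instructions each, then nothing.
occupancy = simulate([2, 2, 2, 0, 0, 0])
print(occupancy)  # prints: [1, 2, 3, 2, 1, 0]
```

The queue grows by one entry per cycle while two instructions arrive and
only one issues, then drains once the burst ends, which is the evening-out
role the posts above attribute to the buffer.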
davec@nucleus.amd.com (Dave Christie) (07/27/90)
In <9871@hubcap.clemson.edu> mark@hubcap.clemson.edu (Mark Smotherman) writes:
>From davec@nucleus.amd.com (Dave Christie):
>>
>> Are you implying that two FP instructions can be issued in one cycle?!
>> I don't believe this is the case.
>
>Yes, the instruction dispatch on RS/6000 does not require a matched
>pair of an integer instruction and a flt.pt. instruction in order to
>dispatch (issue?) multiple instructions per cycle (as does the i860
>in DIM and the i960CA).
>
>From H. Bakoglu and T. Whiteside, "RISC System/6000 Hardware Overview,"
>in IBM RISC System/6000 Technology, 1990, order no. SA23-2619, p. 11:
>
>     Four instructions per cycle can be fetched from the I-cache
>     arrays to the instruction buffers and dispatch unit, which
>     can dispatch up to four instructions per cycle. Two of these
>     are internal dispatches to the ICU (branches and condition
>     register instructions) and two are external dispatches to
>-->  the FXU and FPU. There is no restriction on the combination
>-->  of instructions that are dispatched to the FXU and FPU. They
>-->  can be a fixed- and a floating-point instruction, or two
>-->  fixed-point instructions, or two floating-point instructions.
>     Because the fixed- and floating-point instructions are not mated
>     together, instruction dispatch bandwidth or code space is not
>     wasted. [The FXU and FPU both have instruction buffers with
>     12 entries each to even out the dispatching patterns.]

I had assumed that there were just 32-bit paths from the ICU to the
FXU and FPU - looks like it's 64. In any case, these feed the
instruction queues; all you've told me so far is that two instructions
can be placed in the FPU queue at once. This relieves the compiler of
having to do load levelling for dispatch (don't know what it has to do
with code space though...). What I asked about was issue - can two be
issued from the FPU queue at once? This would not just require double
the mapping file ports, but double the register file ports as well.

One more question, assuming the answer to the previous one is "no": is
the mapping file referenced when instructions are placed in the queue,
or upon issue? I don't see much point in the former, considering the
extra ports required.

Again assuming "no", I wonder if one could take their spending a
non-trivial number of pins (64) as an indication of how well they
thought the compilers could do load balancing, or an acknowledgement
that a lot of interesting codes just aren't amenable to load balancing
(I don't imagine it works in harmony with other optimizations).
Then again, maybe they were there for the taking anyway.

------------------------
Dave Christie    My opinions only.
usenet@mozart.amd.com (Usenet News) (07/27/90)
From: davec@nucleus.amd.com (Dave Christie)
Organization: Advanced Micro Devices, Inc., Austin, Texas

In <1990Jul27.054936.18973@mozart.amd.com> davec@nucleus.amd.com I write:
>
>One more question, assuming the answer to the previous one is "no": is
>the mapping file referenced when instructions are placed in the queue,
>or upon issue? I don't see much point in the former, considering the
>extra ports required.

Brain damage alert: I've realized the renaming must be done at dispatch
in the ICU to coordinate the renaming of an FPU register by an FXU
instruction. Yes, the mapping file would obviously have double the ports.
Never mind.

------------------------
Dave Christie    My misconceptions only.
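Renaming at dispatch comes down to one map-table read per source register
and one read plus one write for the destination. The Python sketch below is
a generic, textbook-style map table with a free list. It is not a
description of the actual RS/6000 mechanism, and the class name, method
names, and register counts are all made up for illustration.

```python
class RenameMap:
    """Toy register-rename map: architectural register -> physical tag."""

    def __init__(self, num_arch, num_phys):
        # Initially each architectural register maps to the same-numbered tag.
        self.table = list(range(num_arch))
        # The remaining physical tags are free for allocation.
        self.free = list(range(num_arch, num_phys))

    def rename(self, dest, sources):
        """Rename one instruction; returns (new_dest_tag, source_tags, old_dest_tag)."""
        # Sources are looked up first, so an instruction that reads its own
        # destination register sees the previous mapping.
        source_tags = [self.table[s] for s in sources]
        if not self.free:
            raise RuntimeError("no free physical register; dispatch must stall")
        new_tag = self.free.pop(0)
        old_tag = self.table[dest]
        self.table[dest] = new_tag
        return new_tag, source_tags, old_tag

    def release(self, tag):
        """Return a physical tag to the free list once it is no longer needed."""
        self.free.append(tag)

# 4 architectural and 6 physical registers (arbitrary illustrative sizes).
rmap = RenameMap(num_arch=4, num_phys=6)
first = rmap.rename(dest=1, sources=[2, 3, 1])   # f1 = f2 * f3 + f1
second = rmap.rename(dest=2, sources=[1, 1, 3])  # f2 = f1 * f1 + f3
print(first)   # prints: (4, [2, 3, 1], 1)
print(second)  # prints: (5, [4, 4, 3], 2)
```

Each renamed FMA here makes three source lookups and one destination
update, which is where the port counts discussed in this thread come from.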