rtrauben@cortex.Eng.Sun.COM (Richard Trauben) (02/14/91)
In article <1991Feb12.225634.13757@m.cs.uiuc.edu> (Don Gillies) writes:
>I read an article today in "The Microprocessor Report" that said the
>increase from 32 to 64 bits added approximately 10-15% to the size of
>the chip.  This is quite surprising.  You'd think that going from 32
>to 64 bits would double the size of the ALU and all the data paths and
>registers.  Does this mean that the datapath and ALU and all the
>registers accounted for only 10-15% of the microprocessor to begin
>with?  What is the baseline design for this 10-15% increase in area?

At first blush, the projected 15% area overhead for increasing the
datapath width from 32 bits to 64 bits DOES sound very low.  However,
consider the following 'typical' area budget for a highly integrated
single-chip processor:

Let 1/3 of the area budget be on-chip cache, 1/3 floating-point unit,
and 1/3 integer unit.  A move to 64 bits will have virtually no impact
on the FP engine.  While the area for tags doubles, large data block
caches (i.e. 32-64 bytes/tag) make this increase in tag RAM area
negligible.  Finally, assume the integer unit is 1/2 datapath and 1/2
control.  The exact percentage will depend on your favorite processor
architecture and implementation.  A first-order approximation of the
area impact is that the control section area stays constant while the
datapath area doubles.

   uP               Fraction of Total     Effect of 64 bits
   Sub-Block        32-bit Area Budget    on 32-bit Area
   ------------     ------------------    -----------------
   FPU                    .33             1.0x =>  .33
   $ (cache)              .33             1.1x =>  .36
   IU => .5 dpath         .16             2.0x =>  .32
      => .5 cntl          .16             1.0x =>  .16
                         -----                     -----
                         1.00A                     1.17A

This back-of-the-envelope calculation says area would increase by 17%
by pasting a 64-bit IU core onto a generic design.  Obviously the
exact mileage will vary somewhat.
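[Moderator's note: the area budget above can be reproduced with a few lines of Python.  The fractions and scaling factors are Trauben's rounded assumptions from the table, not measured data.]

```python
# Back-of-the-envelope area model from the post above.  The per-block
# fractions (rounded, so the 32-bit budget sums to ~1.00) and the
# 64-bit growth factors are the poster's assumptions.
budget_32bit = {            # fraction of the 32-bit die
    "fpu":      0.33,
    "cache":    0.33,
    "iu_dpath": 0.16,       # half of the 1/3 integer unit
    "iu_cntl":  0.16,
}
scale_64bit = {             # growth factor when widened to 64 bits
    "fpu":      1.0,        # FP engine essentially unchanged
    "cache":    1.1,        # tags double, but large blocks keep it small
    "iu_dpath": 2.0,        # datapath roughly doubles
    "iu_cntl":  1.0,        # control area stays constant
}
area_64bit = sum(budget_32bit[k] * scale_64bit[k] for k in budget_32bit)
print(f"64-bit design is about {area_64bit:.2f}x the 32-bit die area")
```

Running this gives roughly 1.17x, matching the 17% figure in the table.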
** OPINION **

It also suggests, for a fixed manufacturable die size with all other
things being equal, that the quantitative cost to the customer of going
to 64-bit addressing in advance of truly needing it is analogous to
forgoing half the on-chip cache capacity they might have had otherwise.
A common rule of thumb states that doubling the cache size will halve
the overall cache miss penalty.  Similar arguments could be made about
the opportunity costs of an integrated FP engine on a highly integrated
processor.  There are opportunity costs in every engineering tradeoff.
You only get what you pay for... sometimes less :-).

Regards,
-Richard Trauben
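[Moderator's note: the rule of thumb above translates into CPI terms like so.  The miss rate, miss cost, and base CPI below are illustrative assumptions, not figures from the post.]

```python
# Sketch of the opportunity-cost argument, with assumed numbers:
# if the die area spent on 64-bit support could instead have doubled
# the on-chip cache, and doubling the cache halves the aggregate
# miss penalty, the performance given up looks like this.
miss_rate = 0.10        # assumed misses per instruction, small cache
miss_cost = 20          # assumed cycles per miss
cpi_base  = 1.0         # assumed CPI with a perfect cache

cpi_small_cache = cpi_base + miss_rate * miss_cost          # 3.0
cpi_big_cache   = cpi_base + (miss_rate / 2) * miss_cost    # 2.0
print(cpi_small_cache, cpi_big_cache)
```

Under these (made-up) numbers, the forgone cache doubling costs a full cycle per instruction, which is the sense in which 64-bit-in-advance is not free.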
carters@ajpo.sei.cmu.edu (Scott Carter) (02/20/91)
In article <7967@exodus.Eng.Sun.COM> rtrauben@cortex.Eng.Sun.COM
(Richard Trauben) writes:
>In article <1991Feb12.225634.13757@m.cs.uiuc.edu> (Don Gillies) writes:
>>[ 15% chip size increase for the R4000 to fully support 64-bit integers ]
>[reasonable assuming only 1/6 of the chip is integer datapath, and that
>integer datapath roughly doubles in size]

Yeah, it's not so much the size [hmm, I'd still kill for an additional
10% size budget], but I worry a bit about cycle-time effects: the ALU
carry path only increases by one gate, but in a cache-superpipelined(tm)
design like the R4000 I fear this may often be "the" critical path.
Datapath mux width doubles, which increases the fanout on all the
controls which drive said muxes.  With MIPSCo's super designers, I
don't doubt that they'll be able to put mega-drivers on all the
appropriate points, but for architectures which need to be
implementable in a less optimized design environment (e.g. our
military/space designs, which cannot afford full-custom design on the
random logic) I worry about this.  Probably the right answer is design
tools which can better handle the highly variable fanout problem at a
higher level.

Also ...
>Let 1/3 of the area budget be on-chip cache, 1/3 floating point unit and
>1/3 be integer unit. A move to 64bits will have virtually no impact on
>the fp engine. While the area for tags double, large data block caches
>(i.e. 32-64bytes/tag) make this increase in tag ram area neglectable.

Tag size only doubles if you have virtual-address caches; otherwise tag
size depends on physical address size (I doubt the R4000 has a 64-bit
physical address :-).  Still, I worry about this for those design points
where you want/need a virtual-address cache, where the design point
would otherwise be a shorter line (a branch target cache would be an
excellent example, if someone starts wanting 64-bit instruction
addresses for some reason (e.g. shared library tricks) - small on-chip
caches in lower-density processes like GaAs constitute a less contrived
example).

>-Richard Trauben

Scott Carter - McDonnell Douglas Electronic Systems Company
carter%csvax.decnet@mdcgwy.mdc.com (preferred and faster) - or -
carters@ajpo.sei.cmu.edu
(714)-896-3097
The opinions expressed herein are solely those of the author, and are
not necessarily those of McDonnell Douglas.
mash@mips.COM (John Mashey) (02/21/91)
In article <7967@exodus.Eng.Sun.COM> rtrauben@cortex.Eng.Sun.COM
(Richard Trauben) writes:
>In article <1991Feb12.225634.13757@m.cs.uiuc.edu> (Don Gillies) writes:
>>I read an article today in "The Microprocessor Report" that said the
>>increase from 32 to 64 bits added approximately 10-15% to the size of
>>the chip.  This is quite surprising.  You'd think that going from 32
>At first blush, the projected 15% area overhead for increasing the datapath
>width from 32bits to 64bits DOES sound very low. However, consider the
 ...
>This back of the envelope calculation says area would increase by 17%
>by pasting a 64-bit IU core onto a generic design. Obviously the exact
>mileage will vary somewhat.
>It also suggests, for a fixed manufacturable die size with all other things
>being equal, that the quantitative cost to the customer for going to 64bit
>addressing in advance of truly needing it is analogous to not providing 1/2
>the capacity of on-chip cache that they might have had otherwise. A common
>rule of thumb states that doubling the cache size will half the overall cache
>miss penalty.

This is a pretty good estimate.  I have some slightly better numbers,
although gathered informally (using a ruler on a chip plot, and rounding
everything).  I also split it up a little differently, and my earlier
estimates (which is where some of the published numbers come from, i.e.,
some offhand comments) were actually a little high.  Take all of the
following within +/- 2%.

The integer datapath (+ some of its control; I wasn't too fussy) is
about 14%, and the MMU/CP0 chunk is about 5% (it has to be wider, also).

*** Altogether, I expect this all means that the die-space cost was
around 7-8% to get a 64-bit integer datapath and addressing. ***

Now, the caches together are about 14% also, so maybe we could have
doubled one of the caches, which would have been nice.
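[Moderator's note: one way to see how Mashey's figures hang together.  The "everything doubles" bound below is an assumption of this sketch, not something stated in the post.]

```python
# Rough reconciliation of the ruler-on-a-plot numbers (all +/- 2%).
# If the 14% integer datapath and 5% MMU/CP0 are fractions of the
# final 64-bit die, and widening had doubled ALL of that area, the
# 32-bit versions would have been half as big, bounding the cost:
int_dpath_64 = 0.14     # integer datapath + some control
mmu_cp0_64   = 0.05     # MMU/CP0, also widened
upper_bound = (int_dpath_64 + mmu_cp0_64) / 2
print(f"upper bound on die-space cost: {upper_bound:.3f}")   # 0.095
```

Since the control portions do not actually double, the real cost lands below this ~9.5% bound, consistent with the 7-8% estimate, and with the observation that it is roughly the area of one of the two on-chip caches (14% / 2 = 7%).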
On the other hand, I think there would have been some awkward layout
issues, i.e., it might have been possible, but would have been hard,
especially looking forward to a design in which one can rapidly expand
the cache sizes without re-laying-out everything in sight.  (Note that
caches like certain shapes more than others, and are nontrivial to
squeeze into weird-shaped holes :-)
--
-john mashey    DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP:   mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:    408-524-7015, 524-8253 or (main number) 408-720-1700
USPS:   MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
cprice@mips.COM (Charlie Price) (02/24/91)
In article <757@ajpo.sei.cmu.edu> carter%csvax.decnet@mdcgwy.mdc.com writes:
>In article <7967@exodus.Eng.Sun.COM> rtrauben@cortex.Eng.Sun.COM
>(Richard Trauben) writes:
>Also ...
>>Let 1/3 of the area budget be on-chip cache, 1/3 floating point unit and
>>1/3 be integer unit. A move to 64bits will have virtually no impact on
>>the fp engine. While the area for tags double, large data block caches
>>(i.e. 32-64bytes/tag) make this increase in tag ram area neglectable.
>Tag size only doubles if you have virtual-address caches, otherwise tag size
>depends on physical address size (I doubt the R4000 has a 64-bit physical
>address :)
>Still, I worry about this for those design points where you want/need a
>virtual address cache, where the design point would otherwise be a shorter
>line (a branch target cache would be an excellent example, if someone starts
>wanting 64-bit instruction addresses for some reason (e.g. shared library
>tricks) - small on-chip caches in lower density processes like GaAs
>constitue a less contrived example).

What you say is right.  Just to keep the confusion factor down here:
the on-chip primary caches in the R4000 are virtual-index (access is
based on the virtual address) but physical-tag (deciding on a hit/miss
is based on the physical address).  You overlap the virtual-to-physical
translation with the cache access.  The secondary cache is
physical/physical.
--
Charlie Price    cprice@mips.mips.com    (408) 720-1700
MIPS Computer Systems / 928 Arques Ave. / Sunnyvale, CA 94086-23650
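[Moderator's note: the virtual-index/physical-tag scheme Price describes can be sketched in a few lines.  The page, line, and index sizes and the identity "TLB" below are illustrative assumptions, not R4000 parameters.]

```python
# Sketch of a virtually-indexed, physically-tagged (VIPT) lookup.
# Hardware does the TLB translation and the cache array read in
# parallel; this sequential model only shows which address bits
# feed which step.
LINE_BITS  = 5      # assumed 32-byte cache lines
INDEX_BITS = 7      # assumed 128 sets (direct-mapped, 4 KB)

def vipt_lookup(vaddr, tlb, cache):
    # The set index comes from the VIRTUAL address, so the array
    # access can begin before translation finishes.
    index = (vaddr >> LINE_BITS) & ((1 << INDEX_BITS) - 1)
    line = cache[index]                 # (tag, data) tuple, or None
    # Translation happens concurrently in hardware.
    paddr = tlb(vaddr)
    ptag = paddr >> (LINE_BITS + INDEX_BITS)
    # The hit/miss decision compares against the PHYSICAL tag.
    return line is not None and line[0] == ptag

# Usage with an identity mapping standing in for the TLB:
cache = [None] * (1 << INDEX_BITS)
identity_tlb = lambda v: v
cache[(0x1234 >> LINE_BITS) & 127] = (0x1234 >> 12, "some data")
print(vipt_lookup(0x1234, identity_tlb, cache))   # True  (tag matches)
print(vipt_lookup(0x5234, identity_tlb, cache))   # False (same index, wrong tag)
```

The second lookup lands on the same set but fails the physical-tag compare, which is exactly the case the physical tags exist to catch.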