[comp.arch] 64-bit addresses & multiprocessor cache request

pegram@uvm-gen.UUCP (pegram r) (02/22/90)

From article <7971@pt.cs.cmu.edu>, by lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay):
> In article <8840009@hpfcso.HP.COM> dgr@hpfcso.HP.COM (Dave Roberts) writes:
>>All those signals coming out of a chip will take
>>a lot of chip area around the edge of a die, hence driving physical die
>>sizes up (and chip yields down).
 
> The "small" system of the future would map a 48-bit or 64-bit virtual
> address to a 32-bit physical address. With an on-chip MMU, the pin
> count would be the same as it is now.
> -- 
> Don		D.C.Lindsay 	Carnegie Mellon Computer Science
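
To make the quoted scheme concrete, here is a rough C sketch of what such an
on-chip MMU does: the wide virtual address gets translated through a page
table and only the 32-bit physical result ever reaches the pins.  The page
size, table layout and field widths below are my own invention, not anybody's
actual design.

    /* Hedged sketch, not any real MMU: assume 4K pages, a flat toy
       page table, and "unsigned long long" of at least 64 bits.     */

    #include <stdio.h>

    #define PAGE_SHIFT 12
    #define PAGE_MASK  0xfffUL
    #define PT_SIZE    1024                    /* toy table size        */

    static unsigned long page_table[PT_SIZE];  /* VPN -> 32-bit frame   */

    /* Map a wide (48- or 64-bit) virtual address to a 32-bit physical
       address: index the table with the virtual page number, then put
       the page offset back on.                                         */
    static unsigned long translate(unsigned long long vaddr)
    {
        unsigned long vpn = (unsigned long)((vaddr >> PAGE_SHIFT) % PT_SIZE);
        return page_table[vpn] | (unsigned long)(vaddr & PAGE_MASK);
    }

    int main(void)
    {
        page_table[5] = 0x00123000UL;          /* pretend mapping        */
        printf("%08lx\n",
               translate((unsigned long long)5 << PAGE_SHIFT | 0x2a));
        return 0;
    }

The point is simply that the 48 or 64 bits live entirely inside the chip;
the package only ever sees the 32-bit physical address.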

I can possibly see the small system of the future operating that way, but
for really small systems, make the MMU optional, and slow the chip
slightly by multiplexing the data and address lines.  That requires no
more lines than are used currently and lets the external latches
handle the drive problems.
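
To sketch what I mean (in C rather than hardware): the CPU puts the address
on the shared pins for one phase, an external latch grabs it, and the same
pins carry data in the next phase, much as the 8086 family does with ALE and
a '373 latch.  The signal names below are made up.

    /* Toy model of a multiplexed address/data bus with an external
       latch -- illustrative only.                                    */

    #include <stdio.h>

    static unsigned long latched_addr;      /* held by the external latch  */
    static unsigned char memory[65536];     /* the "system" behind the bus */

    /* Phase 1: address on the shared pins, ALE strobes the latch.    */
    static void bus_address_phase(unsigned long ad_pins)
    {
        latched_addr = ad_pins;             /* latch captures the address  */
    }

    /* Phase 2: the very same pins now carry write data.              */
    static void bus_write_phase(unsigned char ad_pins)
    {
        memory[latched_addr & 0xffff] = ad_pins;
    }

    /* Phase 2 (read direction).                                      */
    static unsigned char bus_read_phase(void)
    {
        return memory[latched_addr & 0xffff];
    }

    int main(void)
    {
        bus_address_phase(0x1234);          /* one set of pins, two uses:  */
        bus_write_phase(0x5a);              /* address first, data second  */
        bus_address_phase(0x1234);
        printf("%02x\n", bus_read_phase());
        return 0;
    }

Each bus cycle gains a phase, which is the small slowdown I am prepared to
accept, but the pin count stays put and it is the latch, not the CPU, that
drives the heavily loaded address lines.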

On another subject, does anyone have any recent references on cache
coherency in multiprocessor systems (with good bibliographies)?
I have a paper to write and don't want to miss current work.  Since
medium- to fine-grain multiprocessing systems use micros, I have
been reading papers about recent microprocessor (cache) designs that
allow for parallel processing, such as the cache descriptions in
articles on the 68040.  The bus-snooping/write-back option of that
design seems rather crude, though; maybe off-chip caches can be cleverer.
Email responses would be nice; I will post a summary only if there is
interest.
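
In case it helps anyone point me at the right papers, here is the sort of
thing I mean, boiled down to a toy write-back snooping protocol in C.  This
is a much-simplified MSI-style state machine of my own, not the 68040's
actual protocol.

    /* Toy MSI-style snoop for a single cache line -- my own
       simplification, not any shipping design.                      */

    #include <stdio.h>

    enum state { INVALID, SHARED, MODIFIED };

    struct line {
        enum state st;
        int        data;
    };

    static int main_memory = 0;

    /* Another cache's read appears on the bus: if we hold the line
       MODIFIED, write it back and drop to SHARED.                   */
    static void snoop_bus_read(struct line *l)
    {
        if (l->st == MODIFIED) {
            main_memory = l->data;          /* write back dirty data  */
            l->st = SHARED;
        }
    }

    /* Another cache wants to write: write back if dirty, then give
       the line up entirely.                                          */
    static void snoop_bus_write(struct line *l)
    {
        if (l->st == MODIFIED)
            main_memory = l->data;
        l->st = INVALID;
    }

    /* Our own processor writes: claim the line and dirty it (the
       other caches see this as a bus write and invalidate).          */
    static void cpu_write(struct line *l, int value)
    {
        l->data = value;
        l->st   = MODIFIED;
    }

    int main(void)
    {
        struct line a = { INVALID, 0 };
        cpu_write(&a, 42);                  /* cache A dirties the line     */
        snoop_bus_read(&a);                 /* cache B reads: A writes back */
        printf("memory now holds %d, A is %s\n",
               main_memory, a.st == SHARED ? "SHARED" : "other");
        snoop_bus_write(&a);                /* cache B writes: A drops it   */
        printf("A is now %s\n", a.st == INVALID ? "INVALID" : "other");
        return 0;
    }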

Bob Pegram (Internet: pegram@griffin.uvm.edu) 301B, Votey Bldg.
					      University of Vermont
					      Burlington, Vt. 05405

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (02/23/90)

In article <1400@uvm-gen.UUCP> pegram@uvm-gen.UUCP (pegram r) writes:

| I can possibly see the small system of the future operating that way, but
| for really small systems, make the MMU optional, and slow the chip
| slightly by multiplexing the data and address lines.  That requires no
| more lines than are used currently and lets the external latches
| handle the drive problems.

  I really doubt that future systems will operate without an MMU,
because the low-end chips will have one on-chip and the fastest RISCs
will run an O/S which needs one.  I don't think multiplexing lines is
going to be common, because of the market demand for power.

  Has anyone looked into an inline CPU package? Consider an extended
SIPP concept, with the package being as long as needed, or even being
several chips on a substrate. At 10 stations per inch, two sides, two
levels, this gives 40 connections/inch. Obviously this could be pushed
to at least double this, but by having a great long chip the congestion
of traces at the CPU would be less, address and data could be fed
separately, and power and ground lines could be run between signal traces to
lower crosstalk.

  The thought of a 3-4 inch long CPU is unconventional, but it uses no
more board area than a square package, and clever parts placement could keep lead
length *between actual gates* low by allowing more support chips to be
close to the CPU.
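
Just to put numbers on it (trivial arithmetic, no particular package
implied):

    /* Connections available from the inline package described above. */
    #include <stdio.h>

    int main(void)
    {
        int per_inch = 10 * 2 * 2;   /* 10 stations/inch, 2 sides, 2 levels */
        int low  = 3 * per_inch;     /* 3 inch package: 120 connections     */
        int high = 4 * per_inch * 2; /* 4 inches at double density: 320     */

        printf("%d connections/inch, %d to %d per package\n",
               per_inch, low, high);
        return 0;
    }

That is plenty for separate address and data buses with grounds interleaved
between the signal traces.
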
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
            "Stupidity, like virtue, is its own reward" -me

davidb@braa.inmos.co.uk (David Boreham) (02/26/90)

In article <2134@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>  Has anyone looked into an inline CPU package? Consider an extended
>SIPP concept, with the package being as long as needed, or even being
>several chips on a substrate. At 10 stations per inch, two sides, two
>levels, this gives 40 connections/inch. Obviously this could be pushed
>to at least double this, but by having a great long chip the congestion
>

High density IC packaging technologies are under development at all of the
major (and many minor) semiconductor manufacturers.  What you are describing
is basically a two-dimensional interconnect scheme.  Texas Instruments have
demonstrated this with actual DIE!  That is, they have dice mounted vertically
on their edges on a substrate.  The devices I've seen this demonstrated on were
memories, but there's no reason why CPU-type dice couldn't be used.

In the UK, a company called Dowty developed a weird thing called ``Chiprack''
which allowed stacks of little PCBs to be built up.  The connections ran up the
outside of the stacks.  This has been used to build demonstration Z80 and
transputer systems.

There are many more examples.  In short, everyone would like interconnection
technologies which reduce the wiring delays in their systems and make things
cheaper and smaller.

However, the major pain with these schemes (and one of the reasons why they are
not in common use) is that it is difficult to get heat out of the assembly.
Most of the promising developments in this area use some kind of metal slug or
conduction path to get heat away from the dice.
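
To give a feel for why heat dominates, the usual first-order estimate is
Tj = Ta + P * (theta_JC + theta_CA); the case-to-ambient term is the one a
dense stack makes worse and a metal slug or conduction path improves.  The
numbers in this little calculation are invented purely for illustration.

    /* First-order thermal estimate: Tj = Ta + P * (theta_jc + theta_ca).
       All values below are made up for illustration only.              */
    #include <stdio.h>

    int main(void)
    {
        double ambient  = 40.0;   /* degrees C inside the box            */
        double power    = 5.0;    /* watts dissipated by the die         */
        double theta_jc = 2.0;    /* junction-to-case, C per watt        */
        double theta_ca = 15.0;   /* case-to-ambient, C per watt; this   */
                                  /* is the term that packing parts      */
                                  /* tightly together drives up          */
        double junction = ambient + power * (theta_jc + theta_ca);

        printf("junction temperature about %.0f C\n", junction);
        return 0;
    }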


I hope this sets your mind at rest that people are thinking about such things.

David Boreham, INMOS Limited | mail(uk): davidb@inmos.co.uk or ukc!inmos!davidb
Bristol,  England            |     (us): uunet!inmos.com!davidb
+44 454 616616 ex 547        | Internet: davidb@inmos.com