Donald.Lindsay@MATHOM.GANDALF.CS.CMU.EDU (11/30/89)
No one has mentioned what I consider to be the fatal flaw of toruses
(tori?). Specifically, they can't be timeshared ("spaceshared").
Anyone who invests in a massively parallel machine would like to be
able to run multiple small problems on it - program development, for
instance. After all, you wouldn't buy two: better to put all the
money into a single machine.

A hypercube is nice because it can be treated as two independent
subcubes, and recursively one can have many subcubes. The message
traffic within each subcube has no effect on the other subcubes - a
wonderful property. A mesh is similar: you can have independent
rectangular regions. A torus just doesn't work this way. (Neither
does a mesh of buses.)

Another design issue is the placement of the IO. A hypercube has
n = log2(N) channels per node, so you implement n+1 channels. (For
n > 10, n and n+1 are indistinguishable.) Each node now has a direct
connection to an IO system. For example, NCUBE dedicates one 16-node
IO cube to each 128 nodes of the main cube. With a mesh, there are
fewer channels per node, with more wires each, so adding one more is
relatively more painful. But if one doesn't, and (for example) places
the IO down the left-hand side, then the ability to subdivide may be
compromised.

Don		D.C.Lindsay 	Carnegie Mellon Computer Science
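[Moderator's note: the recursive-subdivision property described above can
be sketched in a few lines of Python. This is an illustration, not any
vendor's actual allocator: fixing one address bit splits an n-cube into
two independent (n-1)-subcubes, and no link internal to one subcube
touches the other.]

```python
# Recursive subcube partitioning of a hypercube (illustrative sketch).
# Nodes of an n-cube are the integers 0 .. 2^n - 1; two nodes are
# neighbors iff their addresses differ in exactly one bit.

def split(nodes, bit):
    """Split a subcube into two halves by fixing address bit `bit`."""
    lo = [v for v in nodes if not v & (1 << bit)]
    hi = [v for v in nodes if v & (1 << bit)]
    return lo, hi

def internal_links(nodes, dim):
    """All directed hypercube links whose endpoints both lie in `nodes`."""
    s = set(nodes)
    return {(v, v ^ (1 << b)) for v in s for b in range(dim)
            if v ^ (1 << b) in s}

n = 4
cube = list(range(1 << n))
a, b = split(cube, n - 1)          # two independent 3-subcubes
# No link internal to one subcube appears in the other:
assert internal_links(a, n).isdisjoint(internal_links(b, n))
# Each 3-subcube keeps 8 nodes of degree 3 (24 directed links):
assert len(internal_links(a, n)) == 8 * 3
```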
wen-king@csvax.caltech.edu (Wen-King Su) (12/01/89)
>From: Donald.Lindsay@MATHOM.GANDALF.CS.CMU.EDU
>No one has mentioned what I consider to be the fatal flaw of toruses
>(tori?). Specifically, they can't be timeshared ("spaceshared").
>Anyone who invests in a massively parallel machine would like to be
>able to run multiple small problems on it. Program development, for
>instance. After all, you wouldn't buy two: better to put all the
>money into a single machine.
>
>A hypercube is nice because it can be treated as two independent
>subcubes, and recursively one can have many subcubes. The message
>traffic within each subcube has no effect on the other subcubes - a
>wonderful property. A mesh is similar: you can have independent
>rectangular regions. A torus just doesn't work this way. (Neither
>does a mesh of buses.)

My first project involving space sharing started with the Cosmic
Environment (1984). The experiments eventually led Intel to provide
space sharing on their iPSC multicomputers. The experiments at the
time were all conducted on cubes, and space sharing was done by
partitioning a big cube into smaller subcubes.

What we discovered when wormhole routing was developed is that process
placement has become much less important. For medium-grain
multicomputers, message latency between two adjacent nodes is not
measurably different from message latency between the two nodes that
are farthest apart. Hence one doesn't always have to preserve the
topology when forming sub-machines, and a sub-machine can be just a
random collection of nodes that satisfies the requirements of the
program. Placement of special nodes (IO nodes, FPA nodes, disk nodes,
etc.) is likewise much less important. This dissociation of
programming from architecture helps to make multicomputer programming
just a little bit easier. This concept of allocation is demonstrated
on the Symult 2010 multicomputers.
One thing that favors grids over cubes is that when a cube is
subdivided, the total throughput of each node is reduced, because some
channels in each node become disused. If we subdivide a grid into
smaller grids, the interior nodes retain full message throughput. One
thing that makes grids look very good compared to tori is that if you
subdivide a torus, you get grids. So why not just use a grid to start
with?
/*------------------------------------------------------------------------*\
| Wen-King Su wen-king@vlsi.caltech.edu Caltech Corp of Cosmic Engineers |
\*------------------------------------------------------------------------*/
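[Moderator's note: the throughput observation above can be made concrete
by counting, for each node, how many channels still connect to nodes of
the same partition. A small sketch, with the grid degrees computed from
first principles:]

```python
# Per-node throughput after space-sharing = number of channels that
# still lead to nodes inside the same partition (illustrative sketch).

def cube_degree_in_subcube(d, k):
    """Degree of a node of a d-cube restricted to a k-subcube.
    A k-subcube fixes d - k address bits, so only the k free
    dimensions still lead inside the partition; d - k channels idle."""
    return k

def grid_degree_in_subgrid(rows, cols):
    """Degrees of nodes in a rows x cols rectangular subgrid:
    interior nodes keep all 4 channels, only the boundary loses links."""
    deg = {}
    for r in range(rows):
        for c in range(cols):
            deg[(r, c)] = sum([r > 0, r < rows - 1, c > 0, c < cols - 1])
    return deg

# Splitting a 10-cube into 7-subcubes idles 3 of 10 channels per node:
assert cube_degree_in_subcube(10, 7) == 7
# But every interior node of a subgrid retains full throughput:
d = grid_degree_in_subgrid(4, 4)
assert d[(1, 1)] == 4
assert d[(0, 0)] == 2   # only the subgrid boundary is diminished
```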
js7a+@andrew.cmu.edu (James Price Salsman) (12/04/89)
A four-by-four torus is isomorphic to a 2x2x2x2 hypercube. On the
other hand, a 2^8 hypercube is isomorphic to a 4x4x4x4 hypertoroid.

I think the best solution for MIMD configurations is dynamic
reconfigurability, such as provided by a big software-controlled
switch. The INMOS C004 does this for transputer networks, without
much loss of link bandwidth.

:James "I'm *SO* sick of 3D" Salsman
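[Moderator's note: the 4x4-torus / 4-cube isomorphism can be verified
directly. Decode each 2-bit field of a hypercube address through a Gray
code to get a torus coordinate; single-bit flips then correspond exactly
to steps of +-1 around a ring of 4. A minimal check:]

```python
# Verify that the 4x4 torus and the 4-dimensional hypercube have
# identical edge sets under a Gray-code coordinate mapping.

GRAY = [0b00, 0b01, 0b11, 0b10]   # 2-bit Gray code: consecutive
                                  # entries (mod 4) differ in one bit

def torus_edges(n):
    """Edges of an n x n torus with wraparound; nodes are (x, y)."""
    e = set()
    for x in range(n):
        for y in range(n):
            e.add(frozenset([(x, y), ((x + 1) % n, y)]))
            e.add(frozenset([(x, y), (x, (y + 1) % n)]))
    return e

def hypercube_edges_as_torus(dim=4):
    """Edges of the dim-cube, with each address split into 2-bit
    fields and decoded through the Gray code into (x, y)."""
    def coords(v):
        return (GRAY.index(v >> 2), GRAY.index(v & 0b11))
    e = set()
    for v in range(1 << dim):
        for b in range(dim):
            e.add(frozenset([coords(v), coords(v ^ (1 << b))]))
    return e

assert torus_edges(4) == hypercube_edges_as_torus()
```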
raja@twilight.osc.edu (Raja Daoud) (12/05/89)
In article <7299@hubcap.clemson.edu> js7a+@andrew.cmu.edu (James Price Salsman) writes:
>I think the best solution for MIMD configurations is
>dynamic reconfigurability, such as provided by a big
>software controlled switch. The INMOS C004 does this
>for transputer networks, without much loss of link
>bandwidth.

A practical rule of thumb (i.e. measured on real hardware): a message
passing through 3 C004 chips loses 1/2 of its bandwidth due to bit
re-synchronization within each C004. That limits (on the practical
side) the scalability of a "Big Switch" made out of many C004 chips.

There is still the question of "who" controls the switch
(centralized/decentralized) and how many CPU cycles are wasted making
the reconfiguration (the decision, plus letting each CPU know where it
stands (routing ...)). Users of MIMD machines want "application speed"
at the end of the day, and if dynamic reconfiguration takes more time
than it gains, it isn't practical (not with the current hardware, at
least). On the other hand, software solutions that require little or
no topology information are IMHO more attractive in the long run and
easier for hardware to support.

>:James "I'm *SO* sick of 3D" Salsman

--Raja
-=-
-----------------------------------------------------------------
Raja Daoud                      raja@tbag.osc.edu
Trollius Operating System       (614) 292-4122
The Ohio State University       Ohio Supercomputer Center
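[Moderator's note: extrapolating Raja's rule of thumb (half bandwidth
per 3 C004 chips) as an exponential attenuation gives a feel for how
fast a cascaded "Big Switch" degrades. The 20 Mbit/s raw link figure
and the smooth extrapolation between measured points are assumptions
for illustration only:]

```python
# Effective link bandwidth through a cascade of C004 crossbar chips,
# extrapolating the measured rule of thumb (1/2 bandwidth per 3 chips)
# as exponential attenuation.  All numbers are illustrative.

LINK_MBIT = 20.0          # nominal raw transputer link bandwidth (assumed)

def effective_bandwidth(n_switches, raw=LINK_MBIT):
    """Bandwidth remaining after passing through n_switches C004s."""
    return raw * 0.5 ** (n_switches / 3.0)

for n in (0, 3, 6, 9):
    print(n, effective_bandwidth(n))   # 20.0, 10.0, 5.0, 2.5 Mbit/s
```

With these numbers a switch deep enough to need nine chips in the path
would deliver only an eighth of the raw link rate, which is the
scalability limit Raja describes.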
birenboi%girtab.usc.edu@usc.edu (Aaron Birenboim) (12/14/89)
In article <7254@hubcap.clemson.edu> Donald.Lindsay@MATHOM.GANDALF.CS.CMU.EDU writes:
>No one has mentioned what I consider to be the fatal flaw of toruses
>(tori?). Specifically, they can't be timeshared ("spaceshared").

Why not? I believe the NCUBE sets up a kind of virtual sub-cube
topology. You refer to a sub-cube's nodes by a "pointer" index; the
O.S. decodes this into the actual node location, so in a sub-cube of
size 2^r, a node that the programmer addresses and thinks of as a
nearest neighbor may not actually be a nearest neighbor. Why not have
the torus's or grid's O.S. parcel out "virtual sub-tori" which are not
physical tori, but look like it to programmers? The O.S. might have
some trouble assigning nodes which communicate a lot close to each
other, but this may not be a large problem.

Is the discussion on grids and tori considering flit routing a la
NCUBE? By sending out a flit and establishing a communication path all
the way from source node to receiving node, farthest-neighbor
communication is only 10% longer than nearest neighbor (an NCUBE
claim). I realize that the connection-setup time depends linearly on
the number of nodes traversed, but the actual message transmission
does not.

Were people keeping the flit routing in mind when they made the claim
that a return to hypercubes would be a good decision for optical
interconnection networks? If so, why? The flit routing will take a
similar amount of time; the message transmission is dependent on
simple inter-node bandwidth, and the number of hops does not have such
an extreme effect.
--
=============================================================================
Aaron Birenboim    | aaronb@eedsp.gatech.edu
GT Box 30735       |
Atlanta, GA 30332  | (only temporarily at birenboi@girtab.usc.edu)
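[Moderator's note: Aaron's point can be seen from the usual wormhole
(flit) routing latency model, latency = hops * setup + bytes/bandwidth:
only the head flit pays per-hop cost, so for long messages distance
barely matters. The setup time and bandwidth below are made-up numbers
for illustration, not NCUBE's actual timings:]

```python
# Wormhole ("flit") routing latency model (illustrative numbers only):
# the head flit pays a small per-hop path-setup cost, after which the
# whole message streams at channel bandwidth regardless of distance.

T_HOP = 0.5e-6        # per-hop path-setup time, seconds (assumed)
BANDWIDTH = 2.0e6     # channel bandwidth, bytes/second (assumed)

def latency(hops, message_bytes):
    return hops * T_HOP + message_bytes / BANDWIDTH

near = latency(1, 1024)     # nearest neighbor
far = latency(10, 1024)     # ten hops away
print(far / near)           # about 1.01: only ~1% slower here
```

The hop term grows linearly with distance, but it is dwarfed by the
streaming term for any message of reasonable size, which is consistent
with NCUBE's "only 10% longer" claim.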
grunwald@ncar.UCAR.EDU (Dirk Grunwald) (12/19/89)
>>>>> On 14 Dec 89 15:46:07 GMT, birenboi%girtab.usc.edu@usc.edu (Aaron Birenboim) said:
Aaron> Were people keeping the flit routing in mind when they made the
Aaron> claim that a return to hypercubes would be a good decision for
Aaron> optical interconnection networks? If so, why? The flit routing
Aaron> will take a similar amount of time; the message transmission
Aaron> is dependent on simple inter-node bandwidth, and the number of
Aaron> hops does not have such an extreme effect.
Aaron> --
-----
The reason for moving to tori/cubes is bandwidth: if you have more
bandwidth than you know what to do with (i.e., optics), you'd go back
to a lower number of hops (i.e., hypercubes).