prism::warner.DECNET@hubcap.UUCP (07/14/87)
The answer to David Pase's question on hypercube routing is that quite a bit of research has been done on optimal communication algorithms. Probably the most important paper is "Spanning Graphs for Optimum Broadcasting and Persnalized Communication in Hypercubes" by S. Lennart Johnsson and Chieng- Tien Ho, Department of Computer Science Tech Report #500, Yale University, November 1986. The bottom line is that while there are a number of very simple routing algorithms, optimal algorithms are quite subtle.
Wen-King@hubcap.UUCP (07/14/87)
In article <278@hubcap.UUCP> pase%oregon-grad.csnet@RELAY.CS.NET (Douglas M. Pase) writes: > 000 -> 001 < 000 -> 010 > 000 -> 001 -> 011 ... etc <The channel between nodes 0 and 1 is used in four different cases, between 0 >and 2 in two cases, and 0 and 4 in only one case. This exponential piling <of messages into the lower numbered channels (which is compounded if one >considers similar usage of channels by other nodes as well) means that in <heavy message traffic there will be a lot of contention for the lower numbered >channels where the channels corresponding to the higher bits go unused. 000 -(0)-> 001 000 -(1)-> 010 000 -(0)-> 001 -(1)-> 011 000 -(2)-> 100 000 -(0)-> 001 -(2)-> 101 000 -(1)-> 010 -(2)-> 110 000 -(0)-> 001 -(1)-> 011 -(2)-> 111 On the countrary, I counted that channels 0, 1, and 2 are each used exactly 4 times, though they are not all channels coming out of node 0. The composite usage, assuming all nodes in the cube are doing similar things, will be the same for all of the individual channels. +--------------------------------------------------------------------------+ | Wen-King Su wen-king@vlsi.caltech.edu Caltech Corp of Cosmic Engineers | +--------------------------------------------------------------------------+
Wen-King@hubcap.UUCP (07/14/87)
In article <285@hubcap.UUCP> ravi@CS.UCLA.EDU writes: <>The obvious solution to this problem is to randomly choose the direction <in which the message should be routed, i.e. randomly choose one of the >bits that differ instead of scanning from right to left or left to right. <Another possibility is that one could start the left to right scanning for >each message at a randomly chosen bit for that message. If you do this, you may have problems with deadlock due to lack of queue space along the path of the message. One reason for using the routing algorithm described by pase in his article is that the algorithm is proven to be deadlock free provided that the destination node will eventually consumes all messages it receives. This particular routing algorithm is call the e-cube routing algorithm. +--------------------------------------------------------------------------+ | Wen-King Su wen-king@vlsi.caltech.edu Caltech Corp of Cosmic Engineers | +--------------------------------------------------------------------------+
ravi@hubcap.UUCP (07/14/87)
In article <291@hubcap.UUCP> wen-king@VLSI.CALTECH.EDU (Wen-King Su) writes: > >In article <285@hubcap.UUCP> ravi@CS.UCLA.EDU writes: ><>The obvious solution to this problem is to randomly choose the direction > <in which the message should be routed, i.e. randomly choose one of the > >bits that differ instead of scanning from right to left or left to right. > <Another possibility is that one could start the left to right scanning for > >each message at a randomly chosen bit for that message. > >If you do this, you may have problems with deadlock due to lack of queue >space along the path of the message. One reason for using the routing >algorithm described by pase in his article is that the algorithm is proven >to be deadlock free provided that the destination node will eventually >consumes all messages it receives. This particular routing algorithm is >call the e-cube routing algorithm. I can see the possibility that a message will never reach its destination when it is routed in an arbitrary direction (no longer shortest path routing) instead of a direction towards the destination. A case can be made for doing this in a congested situation when buffers in the chosen direction are full. However when one chooses a direction toward the destination I see no reason why it should deadlock. I would appreciate if you can give us the argument as to why it deadlocks or any references pointing to the deadlock proof. ravi ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ T. M. Ravi UCLA Computer Science Department, 3680 Boelter Hall, UCLA Los Angeles, CA 90024 Phone: (213) 825-2266 ARPA : ravi@CS.UCLA.EDU UUCP : {...sdcrdcf, ihnp4, trwspp, ucbvax}!ucla-cs!ravi ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
grunwald@M.CS.UIUC.EDU (07/16/87)
If two nodes are K hops apart, there are K bits on in their routing mask (the bit-mask generated by the from ^ to computation). By always looking at the L.S.B of the routing mask, you're choosing a single fixed path for communication between each pair of nodes. If you choose one of the on bits at random, or if you consider every 'on bit' and send the message out the channel with the smallest message queue, you'll have a better link utilization. From simulation data for a network switch which allows all links to be utilized all the time, this packet routing method will buy you some improvement (about 10%) when queues form. However, the Intel & Ametek cubes allow a single (or in the case of Intel, maybe two) concurrent transmissions. In this case, it doesn't matter which routing method you use. Link utilization is not the question: it's memory traffic within a node. I'm not certain how many channels Ncube will fire up at once.
bradley@M.CS.UIUC.EDU (07/16/87)
According to "A High Performance Operating System for the NCUBE", pp. 90 of _Hypercube Multiprocessors 1987_", "The broadcast instruction [in the NCUBE instruction set] allows a message to be sent on any subset of the output channels of a node in a single DMA action." In other words, only a single message can be sent at a time, but it can be sent out over several channels at once. David Bradley bradley@a.cs.uiuc.edu Department of Computer Science University of Illinois at Urbana-Champaign "Picasso: Official hypercube operating system of the 1988 Olympic Games"
agn@UNH.CS.CMU.EDU (Andreas Nowatzyk) (07/17/87)
Contrary to David Bradley's interpretation of the broadcast instruction on the Ncube-machine (BPTR and BCNT), it is possible to use each of the 22 DMA-channels (10 pairs of receiver/transmitter for communication inside the cube and 1 pair for I/O) independently: To start a transfer, you should check the ready-flag to see that the channel is idle (you may use the interrupt facility to do so instead). The LPTR instruction loads the address of the data to be transfered into the address register of an specific channel and the LCNT instruction loads the byte-count and initiates the transfer - no sweat. This is much easier than programming the ethernet-chip in the iPSC. There is however a bandwidth limitation: 22 active DMA channels would cause about 28 Mbyte/sec peak memory traffic in addition to the CPU. They all share the same data path to the off-chip memory. This would exceed the memory I/O bandwidth by a factor of 3. So a node cannot have more than 8 active, independent channels at a time. Anyway, this is only of interest to bare-metal system programmer: The typical user sees generalized send- and receive-calls that hide the topology, routing issues and the queueing of unsolicited messages. -- -- Andreas Nowatzyk (DC5ZV) Carnegie-Mellon University Arpa-net: agn@unh.cs.cmu.edu Computer Science Department Usenet: ...!seismo!unh.cs.cmu.edu!agn
Irving@hubcap.UUCP (07/18/87)
In article <278@hubcap.UUCP> pase%oregon-grad.csnet@RELAY.CS.NET (Douglas M. Pase) writes: >Message routing, it would seem, is a simple problem on a hypercube [...] > *** The Question *** > >Does anyone know of a routing algorithm which utilizes more evenly the channels >available for *all* combinations of node sources and destinations? >-- >Doug Pase -- ...ucbvax!tektronix!ogcvax!pase or pase@Oregon-Grad.csnet Well, it just so happens that my supervisor and I have a paper submitted for publication on just this problem... First, to apologise for the long delay; turnaround time out here is pretty bad. I have seen most of the other responses (they all arrived at the same time). Anyhow, what we have done is to compare fixed-order (the one you outlined), random order routing (at each router, a direction is chosen at random from the directions in which the message must move), and two "adaptive" schemes where the dimensions chosen depend on the all the messages passing through the node at the same time. The locally optimal assignment of messages to outgoing wires (that is, moving the most messages forward at once) is exactly the Bipartite Matching problem, for which the best know solution is a very tricky O(n^2.5) algorithm that would be almost impossible to implement effectively in silicon. We used this as one of our adaptive schemes; the other was a simple "first-fit" method. We simulated all of these under random message loads at a variety of load levels, and compared queue lengths, total message transit time (in cube steps) and a few other things. Basically, the "fixed order" methods perform dismally compared to the "adaptive" methods. The second part of the paper investigates what happens with limited queues at each corner of the cube. Messages are assigned moves away from their destination if necessary to prevent queue overflows. What we found was that very little queue space is needed (ie, 2 or 4 total (not per dimension) spaces for an 8-cube) to get the best throughput/delay under heavy loads. Under light loads it doesn't make any difference, since messages don't have to wait in the queues much. Also, our first-fit method comes very close to Bipartite Matching without the complex algorithm. The paper is available as a tech report from the University of Saskatchewan; e-mail me (reid@sask.UUCP) your postal address if you would like a copy. I have also done a fairly complete design of a router chip for an 8-cube using the first-fit algorithm; it maps quite well into VLSI. The design can be extended to attach an arbitrary number (limited by pin count and chip size) of PEs to each corner of the cube to make better use of the high load capacity of the first-fit routers. -- - irving - (reid@sask.uucp or {alberta, ihnp4, utcsri}!sask!reid) Whose idea was this, anyway?
bradley@seri.UUCP (07/20/87)
According to "A High Performance Operating System for the NCUBE", pp. 90 of _Hypercube Multiprocessors 1987_", "The broadcast instruction [in the NCUBE instruction set] allows a message to be sent on any subset of the output channels of a node in a single DMA action." In other words, only a single message cal be sent at a time, but it can be sent out over several channels at once. David @radley bradley@a.cs.uiuc.edu Department of Computer Science University of Illinois at Urbana-Champaign "Picasso: Official hypercube operating system of the 1988 Olympic Games"