conor@inmos.co.uk (Conor O'Neill) (01/26/89)
I'm interested to know if anyone has used Sigma networks for transputers. I first heard of them at the Occam User Group at Southampton in Sep 88, but I've forgotten who mentioned them.

Basically, they give you a network of 2-to-the-n processors, with 4 links each, and a maximum distance between processors of n. This compares with a hypercube's requirement of n links per processor. Unfortunately you don't get this for free, because the *average* message distance is higher, and routing is not so simple. As a bonus, though, the connections leave some links unused, so that you can connect something else to the network! As you can see, these could be very useful for transputer systems.

I will briefly describe how to create these. Suppose we have n=4 (i.e. 16 processors). Then each processor is numbered in binary with n bits: ABCD. This processor is connected to four other processors, whose numbers are found by shifting ABCD either right or left, and shifting in a 0 or a 1. So ABCD is connected to 0ABC, 1ABC, BCD0 and BCD1. You notice that if ABCD = 0000, then it connects two links back to itself, which is where the spare links come from.

If you draw out one of these networks, they tend to look pretty irregular, but here's the n=3 picture:

        001---------011
       / | \       / | \
    000  |  010=101  |  111     (both 000 and 111 have their other 2 links
       \ | /       \ | /         connected together)
        100---------110

I haven't attempted to draw a 64-processor network, but you can use the fact that there's a lot of symmetry in there (vertical, horizontal, and rotational) to help draw them.

A simplistic way to route a message is to slowly shift the bits of the destination address into the current address. This will almost always take n steps, so it is not very efficient. E.g. to get from 000 to 101: shift the 000 right and shift in the rightmost bit of the destination to get to 100; then shift in the middle bit of the destination to get to 010; finally you get to 101, in 3 steps.
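The construction above is easy to check by machine. Here is a minimal Python sketch (the function names are my own, not from any INMOS software) that builds the link set of each node and confirms by breadth-first search that the maximum distance really is n:

```python
from collections import deque

def neighbours(v, n):
    """Links of n-bit node v: shift right or left and shift in a 0
    or a 1, i.e. ABCD connects to 0ABC, 1ABC, BCD0 and BCD1.
    Duplicates (e.g. at node 000...0) collapse into spare links."""
    mask = (1 << n) - 1
    left = (v << 1) & mask           # BCD_ : shifted left
    right = v >> 1                   # _ABC : shifted right
    return {left, left | 1, right, right | (1 << (n - 1))}

def diameter(n):
    """Worst-case hop count over all node pairs, by BFS from each node."""
    worst = 0
    for s in range(1 << n):
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for w in neighbours(u, n):
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        worst = max(worst, max(dist.values()))
    return worst

print(diameter(3), diameter(4))   # maximum distance equals n: 3 4
```

Note that `neighbours(0, n)` comes back with only three distinct entries (including a self-loop), which is exactly where the spare links at node 000...0 come from.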
Hence it is easy to see that you can always get from any processor to any other in n steps. The trick is to make use of the higher connectivity to reduce the number of hops. Also, this simple method has the disadvantage that it doesn't fully use all of the links; for example, all messages leaving node ABC will go to either 0AB or 1AB, without using BC0 or BC1.

There would be a lot of scope for recognising sub-strings within the addresses and using that to reduce the number of hops, but I haven't been able to give this a lot of thought. It would become very important as you increased the number of nodes. This might also make better use of all four links leaving each processor. For example, with n=10, sending a message from 1010110101 to 1010111111 would take 10 steps if you did it that simple way, or you could shift in 11111 from the right instead, in 5 steps. Alternatively, you could use a mixture of shifts, if the common pattern was in the middle of the bit-pattern. So you could get from 0111111110 to 1111111111 by going to: 111111110x, 1111111110, 1111111111. (where x is any value)

Anyway, has anyone actually used one of these?

Incidentally, they are not restricted to 4-way nodes. If you had 8 links, simply number the nodes in base 4, i.e. ABCD is connected to 0ABC, 1ABC, 2ABC, 3ABC, BCD0, BCD1, BCD2, BCD3. Similarly for any *even* number of links.

The person at the Occam User Group was actually discussing his work on finding other networks with better values for maximum distance versus number of processors and valency (number of links). I vaguely remember him reporting a network of distance 10, with over 10,000 processors, but I can't remember whether that was valency 4 or 10. Does anyone have any similar graph-theoretic results?
-- 
Conor O'Neill, Software Group, INMOS Ltd.        conor@inmos.co.uk
Disclaimer: All views are my own, not those of INMOS.
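The sub-string idea above can be sketched in Python: the number of hops using shifts in one direction only is the smallest k for which the two addresses share an (n-k)-bit overlap. This is my own illustration, not a tested routing algorithm, and it deliberately ignores mixed-direction routes like the 0111111110 example:

```python
def hops_shifting_left(src, dst, n):
    """Hops if we only shift left (destination bits enter at the right):
    the smallest k such that the last n-k bits of src already equal
    the first n-k bits of dst; k more shifts then complete the route."""
    for k in range(n + 1):
        if (src & ((1 << (n - k)) - 1)) == (dst >> k):
            return k
    return n

def hops_shifting_right(src, dst, n):
    """Mirror image: destination bits enter at the left."""
    for k in range(n + 1):
        if (src >> k) == (dst & ((1 << (n - k)) - 1)):
            return k
    return n

# The n=10 example from the text: the naive method takes 10 steps,
# but shifting in 11111 from the right takes only 5.
src, dst = 0b1010110101, 0b1010111111
print(min(hops_shifting_left(src, dst, 10),
          hops_shifting_right(src, dst, 10)))   # 5, not 10
```

For the 000-to-101 example earlier in the thread, both directions need the full n=3 steps, which matches the observation that the naive method almost always takes n steps.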
CBKM23@ujvax.ulster.ac.uk (02/02/89)
In response to the enquiry by C. O'Neill, 30-1-89: I think that these are a variant of the perfect shuffle interconnection first presented by H.S. Stone in 1971. This pattern looks a lot more familiar than the graph shown in the message, but is topologically equivalent.

I am using this interconnection at present, with 16 transputers mounted on four B012 motherboards, and have found it quite tricky to wire up, because the motherboards have hardwired connections for some of the links, and I have had to find four disjoint 'runs' of processors in the interconnection pattern.

As to routing algorithms, I had a rummage through the literature a few months ago, but apart from the obvious algorithm came up with nothing. However, there is a vast North American research effort on network topologies, so there may well be a paper buried in there somewhere.

Another way of viewing the topology is as two bi-directional binary trees, meshed together in a slightly tricky way. The two root nodes are the processors one down from the top and one up from the bottom of the standard diagram. (This diagram is: split the processors into 'fronts' and 'backs', then draw a column of 'fronts', each with two wires emanating from them, connected to a column of 'backs'. Number the processors from top to bottom.) This view suggests clever ways of routing the data, but I haven't actually managed to produce an appropriate algorithm.

Your question about optimal graphs for fixed valencies is also of interest. I would also love to know if there is an answer. The topic appeared to be an unsolved question in graph theory in the late 1970s, see Bollobas, 'Extremal Graph Theory' (Chapter 5, I think...), but perhaps an answer has been found by now.
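The topological equivalence with Stone's perfect shuffle can be made concrete: the shuffle is a cyclic left rotation of the address, and the two "shift left" links of the network are exactly the shuffle edge plus the shuffle-then-exchange edge (where "exchange" flips the last bit). A quick check, with my own function names:

```python
def shuffle(v, n):
    """Stone's perfect shuffle: cyclic left rotation of the n-bit address."""
    mask = (1 << n) - 1
    return ((v << 1) & mask) | (v >> (n - 1))

n = 4
for v in range(1 << n):
    left = (v << 1) & ((1 << n) - 1)     # shift left, new bit enters at right
    # The two left-shift links {BCD0, BCD1} are the shuffle edge and
    # the shuffle-followed-by-exchange edge.
    assert {left, left | 1} == {shuffle(v, n), shuffle(v, n) ^ 1}
print("left-shift links = shuffle and shuffle+exchange edges")
```

The right-shift links are the same pair seen from the other endpoint, which is why the familiar shuffle-exchange picture and the graph in the earlier message describe the same wiring.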
By the way, the problem is cast in Bollobas as finding the best pattern of direct flights between a set of airports, in such a way as to minimise the average flight time between airports, where indirect flights are defined as a concatenation of direct flights, and the number of flights leaving or ending at any one airport is limited. Obviously, the airports correspond to processors, and direct flights to links!

Mary Shapcott
Department of Computing Science,
University of Ulster at Jordanstown, Co. Antrim BT37 0QB
tel: (0232) 365131 ext. 2671

References:
Stone H.S., 1971. "Parallel Processing with the Perfect Shuffle",
  IEEE Transactions on Computers, Vol 20, No. 2, pp 153-161.
Parker D.S., 1980. "Notes on Shuffle/exchange-type Switching Networks",
  IEEE Transactions on Computers, Vol C-29, pp 213-222.
JRA@vax1.esprit.southampton.ac.uk (02/07/89)
I think I was the person who Conor O'Neill heard talking about sigma networks at the Southampton Occam User Group. I believe they are more widely known as delta networks (I chose the wrong Greek symbol at some point). The generalization of these to any even valency can safely be referred to as de Bruijn graphs.

The largest known 4-valent graph of diameter 10 does indeed have over 10,000 nodes. A table of the current largest known graphs of small valency and diameter is maintained by J.-C. Bermond of Universite de Paris-Sud. Here is the most up-to-date version of this table that I have for 3, 4 and 5-valent graphs:

                3-valent     4-valent     5-valent
Diameter  2     10 nodes     15 nodes     24 nodes
    ''    3     20   ''      40   ''      70   ''
    ''    4     38   ''      95   ''     182   ''
    ''    5     70   ''     364   ''     532   ''
    ''    6    128   ''     734   ''    2742   ''
    ''    7    184   ''    1081   ''    4368   ''
    ''    8    320   ''    2943   ''   11200   ''
    ''    9    540   ''    7439   ''   33600   ''
    ''   10    938   ''   15657   ''  123120   ''

James Allwright
Dept. of Electronics & Computer Science
Southampton University
JRA@UK.AC.SOTON.ESP.V1  or  JRA@UK.AC.SOTON.CM
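The table entries can be sanity-checked against the Moore bound, the standard graph-theoretic ceiling on the number of nodes a graph of maximum degree d and diameter k can have (a node, its at most d neighbours, their at most d-1 further neighbours, and so on). A small sketch:

```python
def moore_bound(d, k):
    """Maximum possible node count for maximum degree d, diameter k:
    1 + d + d(d-1) + d(d-1)^2 + ... + d(d-1)^(k-1)."""
    if d == 2:
        return 2 * k + 1
    return 1 + d * ((d - 1) ** k - 1) // (d - 2)

# The 3-valent diameter-2 entry (10 nodes) actually meets the bound:
# it is the Petersen graph. The larger entries fall well below it.
print(moore_bound(3, 2))    # 10
print(moore_bound(4, 10))   # 118097, versus 15657 in the table
```

So the 15,657-node 4-valent graph of diameter 10, large as it is, still leaves a wide gap below the theoretical ceiling, which is why the degree/diameter question was (and is) a live research problem.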