HALLAM@physics.oxford.ac.uk ("Phillip M. Hallam-Baker") (03/16/90)
At Southampton there is a big(ish) array of 1260 T212's without external RAM (called `Deep Thought' as each board has 42 nodes...). There was talk of upgrading the RAM to 64K per node at one time; maybe someone garbled the message somewhere? - either from that machine or a similar T2 engine.

Quite what can be done with a 64K NODE machine I don't know - surely the link speed would start to be a problem? If not that, how about the physical cooling/mounting engineering type problems? Where to get the 256 Gigabytes of RAM to make it worth while? Sounds like a fun project! - Anybody out there want to write me a blank cheque to build one?

Phillip Hallam-Baker

Oxford University Nuclear Physics
ZEUS group

"You don't have to write in FORTRAN to work here - but it helps."
zenith-steven@cs.yale.edu (Steven Ericsson Zenith) (03/16/90)
In article <1828.9003152120@prg.oxford.ac.uk>, HALLAM@physics.oxford.ac.uk ("Phillip M. Hallam-Baker") writes:
>
> At Southampton there is a big(ish) array of 1260 T212's without external
> RAM (called `Deep Thought' as each board has 42 nodes...). There was talk
> of upgrading the RAM to 64K per node at one time maybe someone garbled the
> message somewhere? - either from that machine or a similar T2 engine.

An interesting machine, born of a fortuitous error. Someone bonded a large number of devices incorrectly in their packages (rotated by either 90 or 180 degrees). Guy Harriman had special boards made to accommodate the error.

> Quite
> what can be done with a 64K NODE machine I don't know - surely the link
> speed would start to be a problem? If not that how about the physical cooling
> /mounting engineering type problems? Where to get the 256 Gigabytes of RAM to
> make it worth while? Sounds like a fun project!

Now here's an interesting question. What would the characteristics of such a machine be? Remember, there is only 2 Kbytes per node, fixed configuration. I figure the only useful way to program such a machine would be of the "load problem, compute, unload solution" variety, perhaps with an exchange with nearest neighbour in there somewhere. But then I'm sure for most problems 2K is just not going to be enough. Anyone know better?

>- Anybody out there want to
> write me a blank cheque to build one?

Make that cheque out to the two of us!

>
> Phillip Hallam-Baker
>
>
> Oxford University Nuclear Physics
> ZEUS group
>
> "You don't have to write in FORTRAN to work here - but it helps."

"You don't have to write Occam to work here - and they'd rather you didn't."

sez.
--
Steven Ericsson Zenith         * email: zenith@cs.yale.edu
Department of Computer Science | voice: (203) 432 1278
Yale University 51 Prospect Street New Haven CT 06520 USA.
"All can know beauty as beauty only because there is ugliness"
roger@wraxall.inmos.co.uk (Roger Shepherd) (03/20/90)
In article <19312@cs.yale.edu> zenith-steven@cs.yale.edu (Steven Ericsson Zenith) writes:
>In article <1828.9003152120@prg.oxford.ac.uk>,
>HALLAM@physics.oxford.ac.uk ("Phillip M. Hallam-Baker") writes:
>>
>> At Southampton there is a big(ish) array of 1260 T212's without external
>> RAM (called `Deep Thought' as each board has 42 nodes...).....
>
>> what can be done with a 64K NODE machine I don't know - surely the link
>> speed would start to be a problem? If not that how about the physical cooling
>> /mounting engineering type problems? Where to get the 256 Gigabytes of RAM to
>> make it worth while? Sounds like a fun project!
>
> Now here's an interesting question. What would the characteristics of
> such a machine be? Remember, there is only 2kbytes per node, fixed
> configuration. I figure the only useful way to program such a machine
> would be of the "load problem, compute, unload solution" variety,
> perhaps with an exchange with nearest neighbour in there somewhere. But
> then I'm sure for most problems 2K is just not going to be enough.
> Anyone know better?

This is a very interesting question that Steve has raised. The first thing that I would note about such a machine is that 2K really is very little store. This much store very rapidly gets filled up with program. When the 1260-transputer machine was residing at INMOS, one of our staffers, Graham Cramp, programmed the machine to do primality testing of Mersenne numbers. This proved to be a difficult problem due to the lack of memory. Programs had to be written so as to minimise code size: roll up all your loops, and try to build programs that deal with the general case even if it is faster to separate the problem into disjoint special cases.

From my experience working with Graham on this problem I would suggest the following (in addition to SEZ's suggestion) as plausible ideas.

1) Data base retrieval. The machine has a reasonable amount of memory (128 Mbyte).
Give half of this to program (generous); this leaves 64 Mbyte. The search problem is compiled, either into t-code, or into something which will be interpreted by a (small) program sitting on each processor. This should work pretty well. The balance between search time and problem distribution seems to be reasonable; the processors can be organised into a 16 level binary tree so a problem needs to be distributed to only 16 nodes (say 0.2 ms?). The search time for a trivial linear search of 1 Kbyte is about 1 ms, and you need to get the answer back (0.2 ms). As you make the lookup more complex the communication cost should become proportionately smaller.

2) Functional decomposition. In one application run on the Southampton machine they have used 2 processors as a compute node; this allows them to run larger programs. I suspect that for some applications this would work very well.

When considering machines like this you really do need to rethink your tradeoffs. For example, in an even more exotic case, we looked at using silicon compilation to build a machine for generating (small, less than 16-bit) prime numbers. We had an architecture which used a number of very simple processors which could perform division and which were handed out work by a controlling processor. The controller passed out odd numbers which were then tested for primality by the workers. The question is ``Should you store previously computed primes so that the workers divide candidates only by primes rather than by all odd numbers?''. For small primes the answer is NO; the density of the primes in the odd numbers is quite large for numbers of less than 16 bits. This has two implications: firstly, there is not that much computation saved by dividing only by prime numbers, and secondly, you need a large RAM to store those primes. It turns out that it is much more efficient to use that silicon area to build worker processors than to build a RAM. Of course, not many people see that sort of economics!
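The prime-table question above can be tried out in software. What follows is only a rough sketch, not the silicon design described in the post: it assumes each worker trial-divides an odd candidate by successive divisors up to its square root, and it counts divisions rather than time or silicon area. The names (trial_divide, tally, odd_primes_up_to) are invented for this illustration.

```python
from math import isqrt

LIMIT = 1 << 16  # candidates are below 2**16, as in the post


def odd_primes_up_to(n):
    """Sieve of Eratosthenes; returns the odd primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            # Clear every multiple of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(3, n + 1, 2) if sieve[p]]


def trial_divide(candidate, divisors):
    """Return (divisions performed, is_prime) for one odd candidate.

    Assumption: a worker stops at the first exact divisor, or once
    the next divisor exceeds the square root of the candidate.
    """
    count = 0
    for d in divisors:
        if d * d > candidate:
            break
        count += 1
        if candidate % d == 0:
            return count, False
    return count, True


def tally(divisors):
    """Total divisions and primes found over all odd candidates < LIMIT."""
    total = primes = 0
    for c in range(3, LIMIT, 2):
        n, is_prime = trial_divide(c, divisors)
        total += n
        primes += is_prime
    return total, primes


max_divisor = isqrt(LIMIT - 1)                 # largest divisor ever needed
all_odd = list(range(3, max_divisor + 1, 2))   # no table: every odd number
stored_primes = odd_primes_up_to(max_divisor)  # table of odd primes only

work_all_odd, found_all_odd = tally(all_odd)
work_primes, found_primes = tally(stored_primes)

print("divisors without table:", len(all_odd))
print("divisors with table:   ", len(stored_primes))
print("total divisions, all odd divisors:", work_all_odd)
print("total divisions, primes only:     ", work_primes)
print("ratio:", work_all_odd / work_primes)
```

The ratio printed at the end is the computation a prime table would save under these assumptions; it says nothing by itself about the area of a RAM versus extra workers, which is the other half of the tradeoff.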
Roger Shepherd, INMOS Ltd   JANET:    roger@uk.co.inmos
1000 Aztec West             UUCP:     ukc!inmos!roger or uunet!inmos-c!roger
Almondsbury                 INTERNET: roger@inmos.com
+44 454 616616              ROW:      roger@inmos.com OR roger@inmos.co.uk
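The back-of-envelope figures in suggestion 1) above can be checked with a few lines of arithmetic. This is a sketch using only the numbers quoted in the thread (64K nodes, 2 Kbytes per node, 0.2 ms to distribute, 1 ms to search 1 Kbyte, 0.2 ms to return the answer). The assumption that search time scales linearly with lookup cost, and the function name search_budget, are illustrative and not from the post.

```python
NODES = 64 * 1024        # 64K nodes
RAM_PER_NODE = 2 * 1024  # 2 Kbytes of on-chip RAM per node

total_ram = NODES * RAM_PER_NODE     # whole-machine store, in bytes
data_per_node = RAM_PER_NODE // 2    # half given to program, half to data
total_data = NODES * data_per_node   # store left for the data base
tree_levels = NODES.bit_length() - 1 # log2(64K): depth of the binary tree


def search_budget(search_ms, distribute_ms=0.2, collect_ms=0.2):
    """Return (total time in ms, fraction of it spent communicating).

    The 0.2 ms defaults are the estimates quoted in the post.
    """
    comm_ms = distribute_ms + collect_ms
    total_ms = comm_ms + search_ms
    return total_ms, comm_ms / total_ms


print("total store (Mbyte):", total_ram // 2**20)
print("data store (Mbyte): ", total_data // 2**20)
print("tree levels:        ", tree_levels)

# Trivial linear search (1 ms), then a hypothetical lookup ten times dearer.
for search_ms in (1.0, 10.0):
    total_ms, comm_fraction = search_budget(search_ms)
    print(f"search {search_ms} ms: total {total_ms:.1f} ms, "
          f"communication {comm_fraction:.0%}")
```

The second loop iteration illustrates the closing remark of suggestion 1): with a fixed communication cost, a more expensive lookup spends a proportionately smaller share of its time on the links.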