JOHANSSON%FTS121@ICDC.LLNL.GOV (Erik) (03/09/88)
After my tirade regarding mailing list additions/deletions, I received a couple of requests for information regarding SPRINT, our Transputer-based parallel processor. So here it is...

SPRINT is the Systolic Processor with a Reconfigurable Interconnection Network of Transputers. It is a 64-processor multiprocessor developed at Lawrence Livermore National Laboratory for experimentally evaluating systolic algorithms and architectures. SPRINT was designed and built by Tony De Groot and myself in 1986, and has been up and running since December of that year.

Architecture:

SPRINT consists of 64 T414 Transputers (soon to be T800s), each with 128 KB of local memory. The Transputers are interconnected as a six-dimensional hypercube. However, this poses a problem, since a six-dimensional hypercube requires that each processing node have six communication links, while the Transputer has only four. So we designed a 4x8 crossbar switch which maps any of the four Transputer links to any of the six hypercube links. The remaining two link outputs on the crossbar switch can be used for I/O or left unused.

The crossbar switch is implemented using a programmable gate array from Xilinx. The mapping of the switch is determined by a control register in the switch, which may be written or read by the Transputer. These crossbar switches allow us to select any network configuration which is a subset of a six-dimensional hypercube (many desired network configurations fall into this category). In addition, each control register has initial conditions built in so that at power up, the network is programmed to map to an 8x8 grid. We have successfully mapped the network into a grid, a triangle, a binary tree, and a trapezoid.

Also, since the Xilinx programmable gate arrays are programmed at power up, we can discard the switch program and reprogram the logic with direct link-to-link connections to get basically any network configuration we desire.
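To make the "subset of a hypercube" idea concrete, here is a rough sketch of one standard way an 8x8 grid fits inside a six-dimensional hypercube while using no more than four links per node. It assumes a reflected Gray code numbering of rows and columns; the post does not say which numbering SPRINT actually uses, and the function names and the list standing in for the control register contents are hypothetical.

```python
NUM_LINKS = 4   # links available on one Transputer
GRID_SIZE = 8   # 8x8 grid = 64 nodes
BITS = 3        # address bits per grid axis (2 * 3 = 6 hypercube dimensions)


def gray(n):
    """Reflected binary Gray code: consecutive values differ in one bit."""
    return n ^ (n >> 1)


def grid_to_node(row, col):
    """Hypercube address of grid position (row, col), assuming Gray coding."""
    return (gray(row) << BITS) | gray(col)


def grid_link_dimensions(row, col):
    """Hypercube dimensions needed to reach the grid neighbours of (row, col)."""
    here = grid_to_node(row, col)
    dims = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < GRID_SIZE and 0 <= c < GRID_SIZE:
            diff = here ^ grid_to_node(r, c)
            if diff & (diff - 1):
                raise ValueError("grid neighbours are not hypercube neighbours")
            dims.append(diff.bit_length() - 1)  # index of the single set bit
    return dims


def switch_setting(row, col):
    """Hypothetical crossbar setting: entry i is the hypercube dimension that
    Transputer link i is routed to, or None if the link is left unused."""
    dims = grid_link_dimensions(row, col)
    return dims + [None] * (NUM_LINKS - len(dims))


if __name__ == "__main__":
    print(switch_setting(0, 0))  # corner node: [3, 0, None, None]
    print(switch_setting(3, 3))  # interior node: [3, 5, 0, 2]
```

With this numbering, grid neighbours always differ in exactly one address bit, so every grid edge is also a hypercube edge, and an interior node needs four distinct dimensions, which is exactly what the four Transputer links can supply.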
The host is a MicroVAX II GPX workstation, running TDS and OPS (for those of you without VAXes, OPS is a development system which allows you to write OCCAM code that runs under VMS). We can also do program development on an AT with a B004 and download to the network. The host-to-network connection is achieved via a DRV11-WA DMA board in the VAX and a network I/O board of our own design. I posted a detailed description of the I/O board earlier, which I will summarize here. The I/O board is basically two 4Kx16 FIFOs to buffer the data flow. The FIFOs are interfaced to the DRV11-WA on one end, and to an I/O Transputer on the other end. The I/O Transputer has 4 MB of DRAM for buffer use. The actual connection to the network is made through the four links on the I/O Transputer.

Algorithms:

Some of the algorithms we developed on SPRINT include linear algebra operations (matrix-matrix multiplication, the Faddeev algorithm, matrix inversion, QR decomposition, singular value decomposition, etc.), image processing algorithms (e.g., high pass, low pass, and median filters, correlation, bilinear expansion), a computed axial tomography algorithm, a finite difference time domain electromagnetic simulation algorithm, and OF COURSE a Mandelbrot algorithm.

All these algorithms exhibited linear speedup with respect to the number of processors (at least up to 64). For comparison, the algorithms executed on the order of 200 times faster on SPRINT than comparable FORTRAN code running on a VAX 11/780 (well, at least the integer algorithms did). Also, the matrix multiply algorithm executed faster than a Cray X-MP running vectorized FORTRAN for a multiplication of two 512 x 512 integer matrices.

That's all for now. If you have any questions, please drop us a line or post a message!

Erik Johansson    johansson@icdc.llnl.gov
Tony De Groot     degroot@icdc.llnl.gov
Lawrence Livermore National Laboratory
Home of SPRINT, the Systolic Processor with ...