neray@Alliant.COM (Phil Neray) (03/16/90)
In article <8261@hubcap.clemson.edu> boulder!foobar!grunwald@ncar.UCAR.EDU (Dirk Grunwald) writes:
>I've heard that Alliant has announced an i860 based system with up to
>28 processors. Anyone have any more information? What's the system
>architecture? FX/80-like but with higher bandwidth bus?

Glad you asked. Alliant announced the FX/2800 in January, with shipments to begin in March. The FX/2800 consists of:

- Up to 28, 64-bit Intel i860 processors (40 MHz).
- Up to 1 GB of main memory and 4 MB of cache memory (connected to the processors via a 1.28 GB/s crossbar switch).
- Multiple 25 MB/s VME channels with disk striping, UltraNet, etc.
- Extensively multi-threaded Concentrix operating system.
- Parallel FX/Fortran, FX/C and FX/ADA compilers.
- Real-time FX/RT executive (priority-driven, pre-emptive scheduling), co-resident with UNIX.
- Optional tightly-coupled visualization capability (X11R3, PHIGS/PHIGS+, visualization toolkits).
- Source-code compatible with the FX/80 Series.
- $500K to $2M price range. Entry-level is 8-processor system.

What's unique about this system? It's the first general-purpose, shared memory, parallel supercomputer that uses standard VLSI processors rather than a proprietary processor architecture. It's the first truly open supercomputer because it is standard at the processor/instruction-set level AS WELL AS at the usual UNIX level (UNIX, NFS, NQS, compilers, etc.).

The idea is to bring the benefits of binary standards to the supercomputer world, so that users can benefit from a much broader applications base than has traditionally been available. The FX/2800 is compatible with the PAX (Parallel Architecture eXtension) ABI jointly defined by Intel and Alliant.
ANY vendor who wants to build a binary-compatible system can go to Intel and buy the i860 processors, plus Alliant's concurrency control architecture and parallelizing compilers (which Alliant has licensed to Intel)...AND any software vendor can produce a single binary version of his application that runs on a variety of machines, from workstations to parallel supercomputers like Alliant's.

Using standard processors (the i860 is a "Cray-on-a chip", as described by John Rollwagen) combined with parallelism and high-speed shared memory, we've built a system that is rated at 720 MFLOPS on the 1000x1000 LINPACK (in comparison, a single processor of the Cray Y-MP/832 is rated at 308 MFLOPS, the C240 at 166 MFLOPS, and the VAX 9000/440VP at 312 MFLOPS). Other performance metrics: over 1.12 peak GFLOPS (DP), 1148 VAX MIPS (aggregate, based on Dhrystone V1.1) and 672 Whetstone MIPS (non-inlined, aggregate).

The processors in the FX/2800 can be used as parallel processors or as independent multiprocessors. Up to six parallel clusters are supported. The scheduler automatically breaks up a cluster into independent multiprocessors if there are no parallel jobs waiting to execute, or automatically breaks clusters up in user-defined time-slices. Each cluster consists of up to 14 processors controlled via hardware-based concurrency control instructions that are automatically generated by the compilers. The compilers detect opportunities for fine-grained parallelism, typically at the loop level. (Up to 28 processors in a cluster are supported in certain situations, such as the 1000x1000 LINPACK.) Explicit parallelism via compiler directives or UNIX tasking is also supported. (Note that UNIX itself runs directly on the i860 processors in an SMP implementation. There is no "front-end".)

The i860 also has some interesting instruction-level parallelism features. It supports "superscalar" operations (up to three instructions per clock cycle - RISC integer/control, FP MUL and FP ADD).
This requires sophisticated instruction scheduling in the compiler. The chip also supports pipelined floating-point operations, which allow our compilers to produce code that has been optimized for both vectorization and concurrency.

So - the FX/2800 supports parallelism at multiple levels - instruction-level, loop-level, and task-level - in a truly open supercomputing environment.

Thank you for your support.

--
Phil Neray                  Domain: neray@alliant.com
Alliant Computer Systems    UUCP: {mit-eddie|linus}!alliant!neray
Littleton, MA 01460         Phone: (508) 486-1429
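
[Editorial aside] The performance figures in the posting can be cross-checked with a little arithmetic. The sketch below is only that: it assumes each i860 contributes 40 MFLOPS of double-precision peak, a per-processor figure that does not appear in the posting and is simply what the quoted 1.12 GFLOPS aggregate implies when spread over 28 processors. The function and constant names are made up for illustration.

```python
# Figures quoted in the posting.
PROCESSORS = 28
LINPACK_MFLOPS = 720.0        # 1000x1000 LINPACK rating
VAX_MIPS_AGGREGATE = 1148.0   # Dhrystone V1.1, aggregate

# ASSUMPTION (not stated in the posting): double-precision peak per processor,
# inferred from "over 1.12 peak GFLOPS (DP)" divided across 28 processors.
PEAK_MFLOPS_PER_CPU = 40.0


def peak_mflops(processors, per_cpu=PEAK_MFLOPS_PER_CPU):
    """Aggregate peak, assuming peak scales linearly with processor count."""
    return processors * per_cpu


def linpack_efficiency():
    """Fraction of the assumed aggregate peak reached on 1000x1000 LINPACK."""
    return LINPACK_MFLOPS / peak_mflops(PROCESSORS)


def vax_mips_per_cpu():
    """Quoted aggregate VAX MIPS divided evenly across all processors."""
    return VAX_MIPS_AGGREGATE / PROCESSORS


if __name__ == "__main__":
    print(f"aggregate peak:     {peak_mflops(PROCESSORS):.0f} MFLOPS")  # 1120
    print(f"LINPACK efficiency: {linpack_efficiency():.1%}")            # 64.3%
    print(f"VAX MIPS per CPU:   {vax_mips_per_cpu():.0f}")              # 41
```

Under that assumption the 720 MFLOPS LINPACK rating is roughly 64% of peak, and the aggregate Dhrystone figure works out to 41 VAX MIPS per processor.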
dbradley@gibson.ncsa.uiuc.edu (David Bradley) (03/17/90)
So what are the architectural differences between an FX/2800 and a "conventional" shared memory MIMD system like an Encore or Sequent? Based on the posting by Phil Neray, they appear to be the following:

- Faster processors
- Bigger memory and cache
- Processors connected to memory via crossbar rather than bus. (Or does the crossbar connect the processors and cache? This was ambiguous in Neray's posting.)
- Special "hardware-based concurrency control instructions".

So from a hardware standpoint the machine is just like a really fast Encore or Sequent, right? Or am I missing something?

Of course the software sounds pretty cool, especially the cluster scheduling.

--
David Bradley
University of Illinois at Urbana Champaign
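
[Editorial aside] For readers weighing the "concurrency control" item in the list above, here is a rough software analogy for loop-level parallelism: the iterations of one loop are split into contiguous chunks, one per processor, and the partial results are combined at the end. This is an illustration only. It says nothing about how Alliant's hardware instructions or compilers actually dispatch iterations, and every name in it (`chunk_bounds`, `partial_dot`, `parallel_dot`) is hypothetical. The default of 14 workers is borrowed from the cluster size given in Neray's posting.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, workers):
    """Split range(n) into `workers` contiguous (start, stop) chunks.

    Chunk sizes differ by at most one; chunks may be empty if n < workers.
    """
    base, extra = divmod(n, workers)
    bounds = []
    start = 0
    for w in range(workers):
        stop = start + base + (1 if w < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_dot(x, y, start, stop):
    """Loop body for one worker: dot product over iterations [start, stop)."""
    return sum(x[i] * y[i] for i in range(start, stop))


def parallel_dot(x, y, workers=14):
    """Dot product with the loop iterations divided among `workers` threads.

    Python threads only illustrate the structure here; the interpreter lock
    prevents a real speedup for pure-Python arithmetic.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(partial_dot, x, y, a, b)
                   for a, b in chunk_bounds(len(x), workers)]
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    x = list(range(100))
    y = [2] * 100
    print(parallel_dot(x, y))  # 9900
```

The loop qualifies for this treatment because its iterations are independent apart from the final sum, which is the kind of property a parallelizing compiler has to establish before splitting a loop.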
carroll@beaver.cs.washington.edu (Jeff Carroll) (03/19/90)
In article <8389@hubcap.clemson.edu> neray@Alliant.COM (Phil Neray) writes:
>Using standard processors (the i860 is a "Cray-on-a chip", as
>described by John Rollwagen) combined with parallelism and ...
>

Did Rollwagen really say this? I've heard plenty of people at Intel say it - in fact, I believe I saw it in the early press releases.

If Rollwagen *did* say this, I'd be very appreciative to anyone who can give me the publication reference.

Jeff Carroll
carroll@atc.boeing.com
schumach%convex@uunet.UU.NET (Richard A. Schumacher) (03/22/90)
>In article <8389@hubcap.clemson.edu> neray@Alliant.COM (Phil Neray) writes:
>>Using standard processors (the i860 is a "Cray-on-a chip", as
>>described by John Rollwagen) combined with parallelism and ...

! Please post a reference for this quote!