johnston@lbl-csam.ARPA (07/22/85)
(Doing this from memory and hoping not to misrepresent the machine....)

The Sequent machine is a symmetric multiprocessor architecture with 2 to 12 CPUs, one memory (2 to 28 Meg), an I/O processor, etc. on a single backplane bus. The system is characterized as a parallel processor with dynamic load balancing and efficient use of the processors (i.e., little overhead for parallel operation).

The dynamic load balancing on the Sequent is accomplished in a very straightforward way. As each process comes to the head of the run queue it is assigned to the "least busy" processor. (This results in processes migrating across processors frequently, e.g. potentially (and actually) at every interrupt.) The key to the success of this scheme is that "least busy" is a condition which is maintained in a sort of status register associated with each CPU. This is part of the hardware support which includes interrupt handling and memory locking. This hardware support is, in my estimation, the key to making the Sequent machine very "smooth" running. The hardware in question is essentially a high speed message bus which connects each processor independently of the data bus. The message transmission, content, interpretation, etc. is all done in a VLSI chip designed by Sequent.

While at USENIX I had the opportunity to work on the ten processor machine (12 Meg memory, and 4 or 5 Eagles, as I recall) which is the development machine at the Sequent plant. With fifty users on the machine doing the usual vi, mail, cc, etc. and with me trying to drive the load average up with troff and greps of /usr/dict/words, my general impression was that this machine would run rings around two or three 780's. It was easily the most responsive machine that I have ever worked on, including single user 780's.

Not all is peaches and cream, however. The machine uses the National 32032 CPU and associated co-processors.
The CPU chips are to be compared with a VAX 750 (according to Sequent), which seems reasonable based on my experience, but the floating point is very slow. I ran Jack Dongarra's Linpack benchmark and the result was something slower than a 730. Demonstrating the efficiency of their parallel processor implementation, however, 10 copies of the code running simultaneously finished within 5% of the time taken by one copy. Sequent claims to be working on an alternative for the floating point processor.

One of the exciting things about this machine is that Sequent seems to have devised and demonstrated an architecture for the efficient use of (at least) large grained parallel processing. The architecture of this machine seems to be nearly ideal for a timesharing environment.

Bill Johnston (WEJohnston@lbl.arpa)
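
[Editor's sketch] The "least busy" dispatch described in the post above can be illustrated in a few lines of Python. This is only a software model under stated assumptions, not Sequent's implementation: the per-CPU status register is modeled as a plain integer load value, the process costs are invented, and the names `dispatch`, `run_queue`, and `cpu_load` are hypothetical. The post does not say how the hardware encodes "least busy" or how ties are broken.

```python
from collections import deque


def dispatch(run_queue, cpu_load):
    """Assign each queued process to the currently least busy CPU.

    run_queue: deque of (name, cost) pairs, head of the queue at the left.
    cpu_load:  list of ints, one hypothetical "busy" value per CPU
               (a stand-in for the per-CPU status register; an assumption).
    Returns a list of (name, cpu_index) pairs in dispatch order.
    """
    assignments = []
    while run_queue:
        name, cost = run_queue.popleft()
        # min() returns the first index with the smallest load,
        # so ties go to the lowest-numbered CPU (an arbitrary choice).
        cpu = min(range(len(cpu_load)), key=cpu_load.__getitem__)
        cpu_load[cpu] += cost
        assignments.append((name, cpu))
    return assignments


if __name__ == "__main__":
    # Made-up workload: (process name, relative cost).
    queue = deque([("vi", 1), ("cc", 4), ("troff", 3), ("grep", 2), ("mail", 1)])
    load = [0, 0, 0]
    for name, cpu in dispatch(queue, load):
        print(f"{name} -> CPU {cpu}")
    print("final load:", load)
```

Because the choice is remade every time a process reaches the head of the queue, the same process can land on a different CPU each time it is rescheduled, which matches the frequent migration the post mentions.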
carpenter@nbs-vms (CARPENTER, BOB) (07/22/85)
I think you will find that this inter-processor task assignment bus is essentially that contained in the IEEE P-896 proposed standard. Sequent apparently saw the IEEE work in progress and picked up some of the more interesting parts.

Bob Carpenter <carpenter@nbs-vms.arpa>

This message in no way indicates any evaluation or endorsement of anything mentioned herein; it surely doesn't represent any official opinion, bias, prejudice, policy, or fact.
jsq@im4u.UUCP (07/25/85)
We have one (12 CPUs, 16 Mbytes memory, 1 Eagle, 1 SCSI Eaglet, 1 SCSI cartridge tape drive, 1 Cipher 1/2" tape drive). We like it.

Our experience bears out the linearity of increase in throughput with increasing numbers of CPUs, for fixed point CPU bound processes. For floating point, the throughput is even somewhat better (the individual floating point units may not scream, but there's one per processor).

Installation took less than two hours, even after confusion over who was to supply the Ethernet transceiver and cable (us). Reliability has been phenomenal: the worst problem has been an overheating tape drive, fixed by turning the drive off when not in use.

The OS is 4.2BSD, right down to the bugs (the tset bug, the rcp bug, the rwhod bug...), though naturally there are proprietary mods to the kernel to support the multiprocessing (including shared memory and semaphores).

Minuses: Disk I/O through the Multibus is a bottleneck, though not enough to be a big problem. The C compiler is slow. The Pascal compiler is buggy and will not compile some things which will compile on most other known Pascal compilers. It also won't compile TeX. (-: It's fast, though. :-) System sources are difficult to come by, though most VAX 4.2BSD user program sources will compile and run.

The company is very responsive. Both disk I/O and the C compiler are quite noticeably faster than when we got the machine in April, due to a software update. The bugs diminish.

There are several papers in the Portland USENIX Proceedings about the machine, and also in IEEE Computer, etc.

Though other companies are working on multiprocessor systems of the same general type as the Balance 8000 (i.e., multiprocessors with memory shared among all processors, not dual processors or networked single processors), Sequent appears to be the only one with an actual product so far.
--->>> DISCLAIMER <<<---
This article is my personal opinion, and does not necessarily represent that of my employers, or the University of Texas, or Sequent Computer Corporation, or Opus the Penguin, not to mention reality.

PS: Is anybody interested in an INFO-SEQUENT mailing list? (If so, reply by *MAIL* to me.)

--
John Quarterman, UUCP: {ihnp4,seismo,harvard,gatech}!ut-sally!jsq
ARPA Internet and CSNET: jsq@ut-sally.ARPA, soon to be jsq@sally.UTEXAS.EDU
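
[Editor's sketch] The scaling figures quoted in this thread can be turned into a rough efficiency number. The calculation below takes the one measurement given earlier (10 simultaneous Linpack copies finishing within 5% of the single-copy time) and assumes, as the thread does not state explicitly, that each copy had a CPU to itself. The function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(copies, time_ratio):
    """Return (throughput speedup, efficiency) for independent copies.

    copies:     number of identical jobs run simultaneously.
    time_ratio: elapsed time for all copies divided by the time for one
                copy run alone (1.05 means "within 5%").
    Assumes one CPU per copy, which the thread does not confirm.
    """
    speedup = copies / time_ratio  # jobs completed per single-job time
    efficiency = speedup / copies  # fraction of ideal linear scaling
    return speedup, efficiency


if __name__ == "__main__":
    speedup, eff = parallel_efficiency(10, 1.05)
    print(f"throughput speedup >= {speedup:.2f}x, efficiency >= {eff:.1%}")
```

Since "within 5%" is an upper bound on the slowdown, the resulting figures are lower bounds: at least about 9.5 times the single-CPU throughput, or roughly 95% of ideal linear scaling.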