[net.unix-wizards] Sequent talk at LBL

johnston@lbl-csam.ARPA (07/22/85)

(Doing this from memory and hoping not to misrepresent the machine....)

The Sequent machine is a symmetric multiprocessor architecture with 2
to 12 CPUs, one memory (2 to 28 Meg), an I/O processor, etc. on a
single backplane bus. The system is characterized as a parallel
processor with dynamic load balancing and efficient use of the
processors (i.e., little overhead for parallel operation).

The dynamic load balancing on the Sequent is accomplished in a very
straightforward way. As each process comes to the head of the run queue
it is assigned to the "least busy" processor. (This results in processes
migrating across processors frequently, i.e. potentially (and actually)
at every interrupt.) The key to the success of this scheme is that "least
busy" is a condition which is maintained in a sort of status register
associated with each CPU.  This is part of the hardware support which
includes interrupt handling and memory locking.  This hardware support
is, in my estimation, the key to making the Sequent machine very
"smooth" running. The hardware in question is essentially a high-speed
message bus which connects the processors independently of the data
bus. The message transmission, content, interpretation, etc. are all
handled in a VLSI chip designed by Sequent.
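
As a rough illustration of the "least busy" assignment described above,
here is a minimal software sketch in Python. It is only a model: on the
real machine the busy status is kept by hardware, and the names Cpu,
busy, least_busy and dispatch, as well as the process costs, are
hypothetical and not taken from Sequent.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Cpu:
    cpu_id: int
    busy: int = 0  # hypothetical stand-in for the per-CPU status register


def least_busy(cpus):
    # min() returns the first minimum, so ties go to the lowest-numbered CPU.
    return min(cpus, key=lambda c: c.busy)


def dispatch(run_queue, cpus):
    """Hand each process at the head of the run queue to the least busy CPU."""
    placements = []
    while run_queue:
        name, cost = run_queue.popleft()
        cpu = least_busy(cpus)
        cpu.busy += cost  # assumed: load grows by the (made-up) process cost
        placements.append((name, cpu.cpu_id))
    return placements


cpus = [Cpu(i) for i in range(3)]
queue = deque([("vi", 1), ("troff", 5), ("cc", 3), ("grep", 2), ("mail", 1)])
placements = dispatch(queue, cpus)
print(placements)
```

Because the choice is made again each time a process reaches the head of
the queue, the same process may well land on a different CPU the next
time it runs, which is the migration mentioned above.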

While at USENIX I had the opportunity to work on the ten-processor
machine (12 Meg memory, and 4 or 5 Eagles, as I recall) which is the
development machine at the Sequent plant. With fifty users on the
machine doing the usual vi, mail, cc, etc. and with me trying to drive
the load average up with troff and greps of /usr/dict/words, my general
impression was that this machine would run rings around two or three
780s.  It was easily the most responsive machine that I have ever
worked on, including single-user 780s.

Not all is peaches and cream, however. The machine uses the National
32032 CPU and associated co-processors. The CPU chips are comparable
to a VAX 750 (according to Sequent), which seems reasonable based on
my experience, but the floating point is very slow. I ran Jack
Dongarra's Linpack benchmark and the result was somewhat slower than a
730. As a demonstration of the efficiency of their parallel processor
implementation, however, 10 copies of the code running simultaneously
finished within 5% of the time taken by one copy.  Sequent claims to be
working on an alternative for the floating point processor.
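
To put a number on the Linpack observation: if 10 copies finish within
5% of the single-copy time, aggregate throughput is at least about 9.5
times that of one copy. The short Python calculation below works this
out. It assumes the 10 copies ran on the ten-processor machine with one
copy per CPU, which I am inferring rather than stating as measured; the
function names are hypothetical.

```python
def throughput_ratio(copies, slowdown):
    """Jobs finished per unit time, relative to one copy running alone.

    slowdown is the fractional increase in elapsed time (0.05 for 5%).
    """
    return copies / (1.0 + slowdown)


def efficiency(copies, slowdown, cpus):
    """Fraction of ideal linear speedup achieved on the given number of CPUs."""
    return throughput_ratio(copies, slowdown) / cpus


ratio = throughput_ratio(10, 0.05)  # worst case allowed by "within 5%"
eff = efficiency(10, 0.05, 10)      # assumed: one copy per CPU on 10 CPUs
print(round(ratio, 2), round(eff, 3))
```

Since "within 5%" is an upper bound on the slowdown, these figures are
lower bounds on throughput and efficiency.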

One of the exciting things about this machine is that Sequent seems
to have devised and demonstrated an architecture for the efficient
use of (at least) large grained parallel processing. The architecture
of this machine seems to be nearly ideal for a timesharing environment.

	Bill Johnston (WEJohnston@lbl.arpa)

carpenter@nbs-vms (CARPENTER, BOB) (07/22/85)

I think you will find that this inter-processor task assignment bus
is essentially that contained in the IEEE P-896 proposed standard.

Sequent apparently saw the IEEE work in progress and picked up some
of the more interesting parts.

Bob Carpenter <carpenter@nbs-vms.arpa>

This message in no way indicates any evaluation or endorsement of anything
mentioned herein; it surely doesn't represent any official opinion, bias,
prejudice, policy, or fact.

------

jsq@im4u.UUCP (07/25/85)

We have one (12 CPUs, 16 Mbytes memory, 1 Eagle, 1 SCSI Eaglet, 1 SCSI
cartridge tape drive, 1 Cipher 1/2" tape drive).  We like it.

Our experience bears out the linearity of increase in throughput with
increasing numbers of CPUs, for fixed point CPU bound processes.  For
floating point, the throughput is even somewhat better (the individual
floating point units may not scream, but there's one per processor).
Installation took less than two hours, even after confusion over who
was to supply the Ethernet transceiver and cable (us).  Reliability has
been phenomenal:  the worst problem has been an overheating tape drive,
fixed by turning the drive off when not in use.  The OS is 4.2BSD,
right down to the bugs (the tset bug, the rcp bug, the rwhod bug...),
though naturally there are proprietary mods to the kernel to support
the multiprocessing (including shared memory and semaphores).

Minuses:  Disk I/O through the Multibus is a bottleneck, though not
enough to be a big problem.  The C compiler is slow.  The Pascal compiler
is buggy and rejects some programs which compile on most other known
Pascal compilers.  It also won't compile TeX.  (-: It's fast, though. :-)
System sources are difficult to come by, though most VAX 4.2BSD user program
sources will compile and run.

The company is very responsive.  Both disk I/O and the C compiler are
quite noticeably faster than when we got the machine in April, due to
a software update.  The bugs diminish.

There are several papers in the Portland USENIX Proceedings about the
machine, and also in IEEE Computer, etc.

Though other companies are working on multiprocessor systems of the
same general type as the Balance 8000 (i.e., multiprocessors with
memory shared among all processors, not dual processors or networked
single processors), Sequent appears to be the only one with an actual
product so far.

--->>> DISCLAIMER <<<--- This article is my personal opinion, and
does not necessarily represent that of my employers, or the University
of Texas, or Sequent Computer Corporation, or Opus the Penguin,
not to mention reality.

PS:  Is anybody interested in an INFO-SEQUENT mailing list?
(If so, reply by *MAIL* to me.)
-- 
John Quarterman,   UUCP:  {ihnp4,seismo,harvard,gatech}!ut-sally!jsq
ARPA Internet and CSNET:  jsq@ut-sally.ARPA, soon to be jsq@sally.UTEXAS.EDU