[comp.unix.wizards] Information on BSD 4.[23] on two/multiple processor systems

adb@elrond.CalComp.COM (Alan D. Brunelle) (08/05/87)

  We are seeking any information on BSD 4.2 or 4.3 support of multiple CPU
  systems. (More specifically a two CPU system such as a VAX 11/782 or 
  VAX 8800.) The types of information we are most interested in are:
    o Does the standard BSD 4.[23] release support the dual processor
      systems?
    o To what extent is the 2nd processor utilized? (Is it a true 
      multi-processor port in that the kernel and user processes can
      run on either CPU, or is it the case that the kernel runs on one
      and only one of the CPUs?) 
    o Are there any performance measures which compare a two processor
      system vs. the single processor version? (Using the same kernel
      source on a single processor system vs. a two processor system,
      does the two processor version equal 1.3? 1.5? 1.7? 2.0!? times
      a single processor system?)

  If a Berkeley standard release does not contain dual processor
  support, does anyone else out there have information on:
    o Which systems DO have a dual (or multiple) CPU UN*X?
    o How to 'coerce' a BSD port to work on a dual cpu architecture?
    o Information on how the two processors are utilized in that
      scheme.
    o What kind of theoretical/practical performance improvement benchmarks
      does the 2 CPU system have over the single CPU version?
      
  Comments about some of the issues involved with multiple processor 
  systems would be appreciated, as would the methods that could handle
  them correctly.  Information on (say) AT&T SysV.3 multiple CPU systems 
  or any other UN*X type boxes out there would also be useful.

  Any answers/comments would be most appreciated. Any replies to me
  would be summarized to the net later (if so desired). 

  thanks,
    al 

    "Elen sila lumenn' omentielvo."

/-----------------------------------------------------------------------\
| Alan D. Brunelle   (adb@durin.CalComp.COM)  (603) 885-8145            |
| uucp: ...{decvax,harvard,savax,wanginst}!elrond!durin!adb             |
| Calcomp (A Lockheed Company)         Display Products Division        |
| PTP2-2D02 Hudson NH 03051-0908     "We draw on your imagination" (tm) |
\-----------------------------------------------------------------------/

chris@mimsy.UUCP (Chris Torek) (08/06/87)

In article <1112@elrond.CalComp.COM> adb@elrond.CalComp.COM (Alan D.
Brunelle) writes:
>   We are seeking any information on BSD 4.2 or 4.3 support of multiple CPU
>   systems. (More specifically a two CPU system such as a VAX 11/782 or 
>   VAX 8800.) The types of information we are most interested in are:
>     o Does the standard BSD 4.[23] release support the dual processor
>       systems?

No; and in any case, 4.3BSD does not run on BI machines like the
8800 (except here at the University of Maryland!).  Ultrix 2.0
appears to support multiple CPUs, and it is not terribly difficult
to add similar support to 4.3BSD.  Most of the hooks are there;
this is what `masterpaddr' in Swtch() is all about.  George Goble
at Purdue has had a master/slave system running for years, using
an inexpensive version of the 782 (replace the SBI terminator with
a second KA780 and away you go).

>     o To what extent is the 2nd processor utilized? (Is it a true 
>       multi-processor port in that the kernel and user processes can
>       run on either CPU, or is it the case that the kernel runs on one
>       and only one of the CPUs?) 

The obvious changes for 4.3 result in a master/slave system.
Allowing any CPU to make syscalls and field interrupts requires
real locking, which would require fairly substantial kernel changes.
Just ask Sequent: they did it.  I would guess that Ultrix 2.0 uses
master/slave.

>     o Are there any performance measures which compare a two processor
>       system vs. the single processor version? (Using the same kernel
>       source on a single processor system vs. a two processor system,
>       does the two processor version equal 1.3? 1.5? 1.7? 2.0!? times
>       a single processor system?)

The answer is `it depends'.  Master/slave systems run anywhere from
<1.0 to 1.9+ times the rate of single-cpu systems, depending on
exactly what you do.  Consult any good operating systems book for
details.
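
As a rough illustration of that range (a minimal sketch, not a measurement):
assume a fraction `kernel_fraction` of the single-CPU run time is kernel work
that only the master may execute, the rest is user work that any CPU can run,
and `overhead` inflates kernel time to cover inter-processor coordination.
The function name, its parameters, and the model itself are assumptions made
for this example.

```python
def master_slave_speedup(kernel_fraction, cpus=2, overhead=0.0):
    """Best-case throughput speedup of a master/slave system (toy model).

    kernel_fraction: share of single-CPU time spent in the kernel (0..1).
    cpus:            total number of processors, master included.
    overhead:        extra kernel time, as a fraction of kernel time,
                     assumed to be added by multiprocessor coordination.
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    if overhead < 0.0:
        raise ValueError("overhead must not be negative")

    # Single-CPU run time is normalized to 1.0.
    kernel_time = kernel_fraction * (1.0 + overhead)  # master only
    user_time = 1.0 - kernel_fraction                 # any CPU
    total_work = kernel_time + user_time

    # The run cannot finish before the master has done all kernel work,
    # nor before the total work has been spread over every CPU.
    elapsed = max(kernel_time, total_work / cpus)
    return 1.0 / elapsed


if __name__ == "__main__":
    for k in (0.05, 0.3, 0.8, 1.0):
        print(k, round(master_slave_speedup(k, cpus=2, overhead=0.1), 2))
```

With these assumptions a compute-bound mix approaches 2.0 on two CPUs, while
a purely kernel-bound mix with any coordination overhead drops below 1.0.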

>	"Elen sila lumenn' omentielvo."

Be careful, friends!  Speak no secrets!  :-)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

edler@cmcl2.NYU.EDU (Jan Edler) (08/07/87)

By now, quite a few decent quality multiprocessor UNIX systems have
been produced.  I'm sure my list is pretty out of date by now, but I'm
sticking a few references at the end of this message.  I'll let the
various vendors of such systems speak for themselves about details.  If
you want good performance with fairly little work, you should go with
one of them.

If you want to roll your own, it isn't too hard to get something simple
going (like a master/slave system), provided you are already pretty
familiar with the UNIX kernel and with various issues of parallel
programming.  But getting high performance is a lot more work.

Here at NYU we have developed three multiprocessor UNIX kernels, and
are working on a fourth.  The first was a master/slave system based on
v7, the second was a symmetric system also based primarily on v7, the
third was a master/slave system based on 4.2bsd, and we're now working
on a more aggressive symmetric system based on 4.3bsd.  As an
intermediate stage of this new version, we're also doing a "floating
master/slave" system, where most of the kernel is only executed by one
processor at a time, but it isn't always the same processor.

Schemes like master/slave are relatively easy because they work by
trying to preserve, to a great extent, the uniprocessor semantics that
are assumed throughout a standard UNIX kernel.  The better you do that,
the easier the job, but when you serialize things you sacrifice
performance, especially as the number of processors increases.  If your
application mix is not kernel-intensive, it can still give quite good
performance.
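
A minimal sketch of the serialization idea, assuming a single lock around all
kernel entry (roughly the "floating master" arrangement: any processor may
enter the kernel, but only one at a time).  Python threads stand in for CPUs;
the class and function names are hypothetical and are not taken from any of
the kernels mentioned here.

```python
import threading


class FloatingMasterKernel:
    """Toy kernel: whichever thread holds kernel_lock is the master."""

    def __init__(self):
        self.kernel_lock = threading.Lock()
        self.in_kernel = 0        # CPUs currently inside the kernel
        self.max_in_kernel = 0    # highest value in_kernel ever reached
        self.syscalls_done = 0    # shared state written with no finer locks

    def syscall(self):
        # Everything inside the lock can keep its uniprocessor assumptions,
        # because no other CPU runs kernel code at the same time.
        with self.kernel_lock:
            self.in_kernel += 1
            self.max_in_kernel = max(self.max_in_kernel, self.in_kernel)
            self.syscalls_done += 1
            self.in_kernel -= 1


def run_cpus(cpus=4, syscalls_per_cpu=1000):
    """Run `cpus` threads that each make `syscalls_per_cpu` syscalls."""
    kernel = FloatingMasterKernel()

    def cpu_loop():
        for _ in range(syscalls_per_cpu):
            # User-mode work would go here and may overlap freely.
            kernel.syscall()

    threads = [threading.Thread(target=cpu_loop) for _ in range(cpus)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return kernel


if __name__ == "__main__":
    k = run_cpus()
    print(k.syscalls_done, k.max_in_kernel)
```

The price of this simplicity is that every syscall queues on the one lock, so
a kernel-intensive load gains little from additional processors.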

A more parallel kernel requires extensive modifications to eliminate
those uniprocessor assumptions.  Some of the references below already
do a reasonable job of explaining this.

In many cases the job of producing a multiprocessor UNIX kernel is also
tied up with the mundane issues of porting to a different machine,
possibly with a different MMU, etc.; these factors should not be
ignored.  There are also a host of additional issues that arise if you
are interested in larger numbers of processors; critical sections that
are entirely acceptable on smaller machines can easily become
bottlenecks.

Here are some references:

%T The U\s-2NIX\s0 System: Multiprocessor U\s-2NIX\s0 Systems
%A M. J. Bach
%A S. J. Buroff
%J AT&T Bell Laboratories Tech. J.
%V 63
%N 8
%D Oct. 1984
%P 1733-1750
%K unix bltj 3b20
%X Good description of basic approach to a symmetric UNIX kernel

%T M\s-2UNIX\s0, A Multiprocessing Version of U\s-2NIX\s0
%A John A. Hawley,\0III
%A Walter B. Meyer
%R M.S. Thesis
%I Naval Postgraduate School
%C Monterey, California
%D June 1975
%K hawley munix unix pdp11
%X Possibly the earliest project to put UNIX on a multiprocessor.

%T A Dual Processor \s-2VAX\s0 11/780
%A George H. Goble
%A Michael H. Marsh
%R Tech. Report TR-EE 81-31
%D Sept. 1981
%I Purdue University
%K master slave unix vax multiprocessor
%X An early master/slave UNIX system that attracted a lot of attention.

%T VLSI Assist in Building a Multiprocessor U\s-2NIX\s0 System
%A Bob Beck
%A Bob Kasten
%J Proc. USENIX Conf.
%D Summer, 1985
%P 255-275
%K sequent balance 8000 dynix unix

%T A Multiple CPU Version of the U\s-2NIX\s0 Kernel
%A Eric J. Finger
%A Michael M. Krueger
%A Al Nugent
%J Proc. USENIX Conf.
%D Winter, 1985
%P 11-22
%K masscomp unix

%T The Design of the U\s-2NIX\s0 Operating System
%A Maurice J. Bach
%I Prentice-Hall
%C Englewood Cliffs, New Jersey
%D 1986
%K unix internals book

%T Considerations for Massively Parallel U\s-2NIX\s0 Systems
on the NYU Ultracomputer and IBM RP3
%A Jan Edler
%A Allan Gottlieb
%A Jim Lipkis
%J Proc. USENIX Conf.
%D Winter, 1986

%T An Overview of the NYU Ultracomputer Project
%A Allan Gottlieb
%R NYU Ultracomputer Note #100
%I New York University
%C New York
%D 1986
%K ucn100

----------------------

Jan Edler
NYU Ultracomputer Project
edler@nyu
...!cmcl2!edler
(212) 998-3353

csg@pyramid.pyramid.com (Carl S. Gutekunst) (08/08/87)

>    o Which systems DO have a dual (or multiple) CPU UN*X?

All commercially available UNIX ports that run on multiprocessor systems are
proprietary -- they only support the vendor's own hardware. The UNIX
multiprocessor implementations that I am familiar with include, alphabetically:
_______________________________________________________________________________

AT&T has a twin-processor version of the 3B20, but I believe it runs a special
real-time version of UNIX. Someone from AT&T will certainly elucidate....

Celerity 1260D: a pair of proprietary RISC processors, very roughly 4 VAX MIPS
each but with exceptional floating point for the price. Master/Slave, 4.3BSD.

Computer Consoles Inc. Power 6/32MP: a pair of proprietary CPUs, 7 MIPS or so
each, master/slave. Choice of 4.3BSD or System VR2.

Counterpoint: 16MHz 68020 workstation, up to four CPUs. Symmetric, I think,
but this was never clear to me. System VR3 with Berkeley networking.

DEC: I don't have the numbers handy, but DEC has at least one dual processor
system that is supported by ULTRIX, master/slave. And of course there are a
number of university BSD multiprocessor ports for VAXen. 

Encore Multimax: up to ten 32332 microprocessors, symmetric, choice of 4.2BSD
or System VR3. This machine seems to currently be popular with universities,
an honor that changes annually. :-)

Elxsi 6400: Up to 10 monster ECL processors, an honest 10 VAX 780 MIPS each.
Symmetric. Very usable proprietary operating system called EMBOS, with a dual
port UNIX implemented on top. This is not a multiprocessor UNIX per se,
but a very interesting solution to the problem. 

Gould NPL series: up to 10 12-MIPS proprietary ECL processors, Symmetric, dual
port UNIX. I don't know if these are shipping yet.

Pyramid 98x and 9000 series: up to two 3.2 MIPS or four 7 MIPS proprietary
processors, symmetric, dual port UNIX (System Vr3/4.2BSD + some 4.3BSD).

Sequent Balance 8000, Balance 21000, and Symmetry: up to 12 or 30 NSC 32032
microprocessors, and up to 30 Intel 80386 microprocessors, respectively. Fully
symmetric, obviously. Partial dual port UNIX (4.2BSD + some System V). The
Symmetry has been announced; volume shipments are expected in April '88.
_______________________________________________________________________________

An interesting observation is the dominance of *BIG* boxes in the UNIX multi-
processor arena; only the Celerity and the Counterpoint would even remotely
qualify as "personal" or "workstation" systems. The others are marketed as
multiuser systems for a large user base (from 48 to 512 users), where the main
purpose of the multiple processors is to support a large number of different
tasks, as opposed to subdividing single large tasks. (I don't mean the sub-
division of large tasks isn't possible, just that none of these are marketed
that way.)

My apologies for any omissions or errors; I'm sure the appropriate people will
flame me appropriately. :-) You can contact the vendors for more info.

<csg>

gwyn@brl-smoke.ARPA (Doug Gwyn) (08/08/87)

In article <4514@pyramid.pyramid.com> csg@pyramid.UUCP (Carl S. Gutekunst) writes:
>All commercially available UNIX ports that run on multiprocessor systems are
>proprietary -- they only support the vendor's own hardware.

Hardly surprising..

>AT&T has a twin-processor version of the 3B20, but I believe it runs a special
>real-time version of UNIX. Someone from AT&T will certainly elucidate....

You're describing the 3B20D.  There is also a 3B20A that runs a symmetric
(not master/slave) version of UNIX System V.  That version makes a nice
starting point for other multi-processor UNIX implementations, since all
the critical regions have already been identified.

Besides those on your list, Alliant, Arete, Convergent, and Cray also have
made multi-CPU versions of UNIX for their specific systems.
(I omit Denelcor because I don't want to claim that theirs ever worked!)

bzs@bu-cs.bu.EDU (Barry Shein) (08/08/87)

Just some errata on the Encore info (we now have four of them):

>Encore Multimax: up to ten 32332 microprocessors, symmetric, choice of 4.2BSD
>or System VR3.

Up to 20 processors, the confusion probably arises from 10 boards max
each with 2 CPUs.

>The others are marketed as
>multiuser systems for a large user base (from 48 to 512 users), where the main
>purpose of the multiple processors is to support a large number of different
>tasks, as opposed to subdividing single large tasks. (I don't mean the sub-
>division of large tasks isn't possible, just that none of these are marketed
>that way.)

Not sure about the word "marketed" but the Encore (and I'm sure others)
has extensions to their Unix (some new syscalls) to support multiple
CPUs in a process. A lot of this derives from Mach's threads (which
also runs on the Encore.)

Standardizing programming interfaces to exploit parallel CPUs within
the context of Unix remains a frontier.

	-Barry Shein, Boston University

loverso@encore.UUCP (08/11/87)

In article <4514@pyramid.pyramid.com> csg@pyramid.UUCP (Carl S. Gutekunst) writes:
> Computer Consoles Inc. Power 6/32MP: a pair of proprietary CPUs, 7 MIPS or so
> each, master/slave. Choice of 4.3BSD or System VR2.

Having worked with the tahoe (6/32) for 2 years at my previous place of employ,
I can say that it's a fast machine, 6 MIPS single processor - hard MIPS (i.e.,
6 times a 780 - for real).  CCI rates the tahoe at 8 MIPS, and the MP
is at 15.  We never got an upgrade to an MP because BSD does NOT run on it.
CCI supports SysVr2 on all their machines, and a very POOR 4.2 port on
the 6/32 *only*.  4.3 is in the works for the tahoe, but not from CCI but
rather from Berkeley, as announced at previous Usenix conventions.  I was
involved in the beta test and the machine I ran is still up on the internet.
CSRG has plans to use the tahoe as their next processor base.

The tahoe is also OEM'd and sold by Unisys (Sperry 7000/40) and Harris (HCX/7).
All earlier products featured a versabus design, which made 3rd party
controllers hard to come by.  While CCI is releasing a new series of
versabus controllers, Harris is now manufacturing the HCX/9, a tahoe
but with a VMEbus for I/O.  Something to watch for!

John LoVerso, Encore Computer Corp
loverso@multimax.arpa, encore!loverso

[The above opinions are my own and shouldn't be associated with
 my current employer!]

kermit@BRL.ARPA (Chuck Kennedy) (08/11/87)

One other machine of interest is the Alliant FX/8.  It can have up to
8 "computational element" processors (approx. 5 MIPS ea.) plus up to 12
"interactive processors" (68020 cpus).

rich@oxtrap.UUCP (K. Richard Magill) (08/12/87)

Sequent Computer Systems of Beaverton Oregon makes a 32032 based box,
(2-30), based on 4.2 (+ some 4.3 hacks), with SysV calls hacked in if
you want them, (dual universe), shared memory, memory mapped disk
files, atomic lock memory, etc, etc.  This system is load balanced by
time slice.  And the system works slic (pun intended!).  I've been on
more versions of unix than I can remember right now and I usually find
2-10 bugs in the first week on a new system.  I've been on DYNIX now
for ~2 months with exactly one major complaint.  (tell you later, or
watch real soon on comp.sys.sequent).

Their new 4.3'ish os release is due out in days and they are in beta
test on a multiple '386 based system, (2-30).

My answers with regards to sequent DYNIX 2.1.1.

In article <1112@elrond.CalComp.COM> adb@elrond.UUCP writes:
>
>  ... any information on BSD 4.2 or 4.3 support of multiple CPU ...
>  systems. (More specifically a two CPU system such as a VAX 11/782 or 
>  VAX 8800.) The types of information we are most interested in are:
>    o To what extent is the 2nd processor utilized? (Is it a true 
>      multi-processor port in that the kernel and user processes can
>      run on either CPU, or is it the case that the kernel runs on one
>      and only one of the CPUs?) 

Symmetric.  ie, balanced by time slice.

>    o Are there any performance measures which compare a two processor
>      system vs. the single processor version? (Using the same kernel
>      source on a single processor system vs. a two processor system,
>      does the two processor version equal 1.3? 1.5? 1.7? 2.0!? times
>      a single processor system?)

All that I have seen and done are nearly linear, up to about 27/30 on
a 30 cpu balance.
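
For a sense of scale, a figure like 27/30 can be turned into a parallel
efficiency and, assuming Amdahl's law is an adequate model (an assumption
made here, not something the measurements establish), an implied serial
fraction.  The function names are illustrative only.

```python
def parallel_efficiency(speedup, cpus):
    """Fraction of ideal linear speedup actually achieved."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return speedup / cpus


def implied_serial_fraction(speedup, cpus):
    """Serial fraction f that Amdahl's law needs to explain `speedup`.

    Amdahl's law:  speedup = 1 / (f + (1 - f) / cpus)
    Solved for f:  f = (1/speedup - 1/cpus) / (1 - 1/cpus)
    """
    if cpus < 2:
        raise ValueError("need at least 2 CPUs to infer a serial fraction")
    if not 0.0 < speedup <= cpus:
        raise ValueError("speedup must be in (0, cpus]")
    return (1.0 / speedup - 1.0 / cpus) / (1.0 - 1.0 / cpus)


if __name__ == "__main__":
    print(parallel_efficiency(27, 30))
    print(implied_serial_fraction(27, 30))
```

Under that assumption, 27 on 30 CPUs is 90% efficiency and implies that well
under 1% of the work is serialized.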

>    o Which systems DO have a dual (or multiple) CPU UN*X?

I know that encore offers a multiple 32032 based box but I don't know
much about it except that it is SysV'ish.  The new NCR broadway is a
bunch of 68020's that do load balance by process, (basically network
in a box, SysV + proprietary ipc), and the Convergent Technologies
systems use a proprietary operating system (CTOS), with a "unix
emulator" on top.  And then there is "the-network-is-the-computer"
SUN...

>    o What kind of theoretical/practical performance improvement benchmarks
>      does the 2 CPU system have over the single CPU version?

With the exception of locking which is seldom a problem, lock
overhead, which is noticeable (check iocall results for sequent) but a
necessary evil, and process coordination, it is linear.

rich.

ps, oxtrap is a sequent B8000(6)
"lies keep people happy" - I94 west of detroit.