[comp.arch] Architecture specific OS features

aglew@crhc.uiuc.edu (Andy Glew) (09/19/90)

>                                        [T]he UMAX4.3 kernel, which, in
>turn, should not be confused in any way with a parallelized version of
>the 4.3BSD kernel.  UMAX4.3 only provides 4.3BSD-like kernel features
>over a base UMAX"4.2" kernel (VM, process control, etc) - it is not
>a "port" of the 4.3BSD  kernel.  [The kernel network code is a major
>exception to this].  That base kernel is still a reflection of the
>original (1985) design for UMAX.  This is not to say that design is
>wrong (which would be wrong to say, because it mostly works!), but
>rather to point out that some continuing problems - such as VM -
>are because of the original design.
>
>John Robert LoVerso			Home: john@loverso.loem.ma.us

I'm jumping onto this quote because it is evidence of a problem I have
seen several times in OS development that takes advantage of
specialized features of a particular computer architecture (in this
case, parallelism).

    I'm posting to comp.arch and comp.os.research because this is an
OS/architecture issue; I'm posting to comp.sys.encore because the
original post was there.


In short: if you take advantage of specific architectural features in
your OS, you may make it difficult to update the OS.

Many computer companies nowadays do not really develop their OS from
scratch - rather, they take BSD, or MACH, or System V, and port it to
their hardware.  Thus, these computer companies are caught in the
middle: they would like to pass on to their customers updates of the
OS they are based on, as well as the computer company's own "value
added" for the OS.

If the changes to the generic OS to support the computer company's
architecture are well isolated and encapsulated, there's little
problem - the changes are re-done, and the customer receives BSD4.3
just a little after the company received it.
    Examples of such "controlled" architectural dependencies in the OS 
are different virtual memory structures, device drivers, etc.

However, if the changes to the generic OS to support the computer
company's architecture are widespread, then a lot of code needs to be
changed on each update of the base OS.
    "Parallelizing" the UNIX kernel seems to be an example of such a
change, to take advantage of a specific architectural feature, that
produces widespread changes to the kernel source.
    The initial steps of parallelization are small: typically a global
kernel semaphore, then a few finer grain semaphores to allow
concurrent file activity, etc. Eventually, there is a lot of parallel
synchronization spread in different places throughout the code - and
it becomes rather a pain to update the base version of the OS.

I believe that we have seen this occur with both of the vendors of
shared memory parallel machines, Sequent and Encore.  Both had, I
believe, a BSD4.2 porting base for their parallel versions.  Both took
a long time getting up to BSD4.3 (I remember watching from a
competitor that was BSD4.3 almost before (:-{) 4.3 was official,
wondering what was taking Sequent so long), and when they announced
BSD4.3 it was really just an update of user level programs and a few
kernel things (networking especially), around the underlying BSD4.2
kernel.
    Sequent may have a "true" BSD4.3 kernel by now (or their kernel
may be so different that it can no longer be called BSDish), but the
post that started me off seems to indicate that Encore is not yet
"true" BSD4.3.

"Parallelizing" the UNIX kernel really isn't all that hard.  It's been
done many times, at many different companies.  Some have taken the
gradual approach of applying first coarse grain locks, and then
refining them; some have "totally redesigned" the underlying kernel.
The reason why we do not see a great variety of parallel UNIXes is not
that it's difficult to do, but that parallelizing UNIX means that
you're in for a lot of grungy, expensive, software maintenance work,
treading water trying to keep up with updates from BSD or AT&T or SUN.

How to avoid this problem?

    (1) Become very good at the *process* of parallelizing your
kernel.  Develop tools, etc. so that you can easily parallelize the
updates from your kernel OS supplier, even if the underlying OS
changes greatly.

    (2) Give your OS changes back to your OS supplier.  
    Really specific architectural changes may not be interesting to
the supplier; but parallelism is of pretty general interest and
utility.  A reasonably portable shared memory parallel kernel is
probably possible.
    There are several gotchas here: a) persuading your company that
you should give all of this proprietary code away may be difficult -
isn't that giving away all of your competitive advantage?  Well, yes
it is - if everybody can use the same parallel OS, then all you have
to compete on is your hardware performance.
    b) even if your company is willing to give your parallel OS back
to the OS supplier, the OS supplier may not want to take it.  They may
not trust you - they may be afraid that your parallel OS will only run
well on your hardware.  They may have NIH syndrome (AT&T suffers that
greatly). Or they may just not want to bother - after all, they don't
have a need for the features you've added.

Eventually, something like the features you want will become part of
the standard BSD or AT&T UNIX; so maybe the best plan is just to admit
that whatever you're building is a stopgap, until your OS supplier
starts supplying an OS with the feature you want.


--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]