[mod.os] SIGOPS workshop report

darrell@sdcsvax.UUCP (04/03/87)

The January 1987 issue of Operating Systems Review carried a report on
the Second European SIGOPS Workshop "Making Distributed Systems Work."
I found the report very interesting and would recommend it to anyone
working on distributed systems, particularly distributed operating
systems.

Two questions raised during the workshop deserve wider discussion:

1. There seems to be a division between SIGOPS "insiders" and "outsiders."
   Some insiders want to bring "outsiders" (e.g., Dave Clark) into
   SIGOPS functions.  I wonder if the readers of mod.os share this
   perception of SIGOPS isolation.

2. The existence of "really distributed applications" was questioned
   several times.  If you have a distributed (operating) system, what
   do you run on it?

		-Calton-

darrell@sdcsvax.UUCP (04/09/87)

[This is somewhat commercial, but it describes an interesting system. -DL]

In article <2950@sdcsvax.UCSD.EDU> Calton Pu writes:
>The January 1987 issue of Operating Systems Review carried a report on
>the Second European SIGOPS Workshop "Making Distributed Systems Work."
>...
>
>Two questions raised during the workshop deserve wider discussion:
>
>1. ...
>
>2. The existence of "really distributed applications" was questioned
>   several times.  If you have a distributed (operating) system, what
>   do you run on it?
>
>		-Calton-


I don't know if the average purist would qualify the FileNet system as a
truly distributed application, but it has many of the characteristics.
Just remember that our industry is driven by price/performance.  
No one (except a few researchers) wants distribution for its own
sake.  Furthermore, when it is forced upon a customer, he doesn't
want to know about it.

At FileNet we build a "turnkey" distributed system.  Our system
uses optical disk storage (in a "jukebox") to store images of forms,
checks, and such.  The system permits you to model paper flows in
an office.  You can read incoming documents in via a scanner (like
a copier).  The documents are cataloged in a database and stored
on optical disk.  They can be retrieved directly via the database
or passed around via the WorkFlo(tm) software package, which is the 
means by which paper flows are modelled.  The entire system is shipped 
"closed" to customers.  They can build their own (WorkFlo) applications,
but they cannot write programs directly, nor can they access basic 
operating system functions directly.

So why is this a distributed application?  Inherently, it's not.  But
we build it out of 68000-series boxes for reasons of cost/performance.
So separate machines are (usually) used for user-oriented work (the
workstation offers window manipulation, etc.), for the database
server, and for controlling the optical storage jukebox (which uses
magnetic disk for caching).  Because some workstations do not have
magnetic disk, another machine provides them with "operating system
services", such as code loading and paging space. The whole system 
is based on Unix and is held together with Ethernet.  All the 
machines are adequately loaded -- in fact we have seen some 
machines loaded well over 90%.

The customer does not see the distribution during normal operation.
The system presents a clean illusion that it is unified/centralized.
But remember the duck -- so serene above the water, paddling like fury 
underneath.  The operating system has a Locus-like file system.  A
system of abstract data types (object-like) provides distribution of
processing.  The database and applications, for example, are represented
this way.

Currently, the system has several "single points of failure" that are
a result of (at least partially) inadequate distribution.  Initially,
these were considered acceptable.  For example, there is a "root"
machine.  If this machine has to be rebooted, the whole system must
be rebooted.
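
To make the single-point-of-failure problem concrete, here is a minimal
Python sketch.  The box names, the binding table, and the rule that every
service depends on the root machine are assumptions for illustration
only; they are not FileNet's actual configuration.

```python
# Hypothetical static binding: one logical service per box.
# All names here are illustrative, not FileNet's real layout.
STATIC_BINDING = {
    "database": "db-box",
    "optical-storage": "jukebox-box",
    "os-services": "os-box",  # code loading and paging for diskless workstations
}

# Assumption: every service depends on the "root" machine, since rebooting
# it is said to force a reboot of the whole system.
ROOT_BOX = "root-box"


def services_lost(failed_box, binding=STATIC_BINDING, root=ROOT_BOX):
    """Return the sorted services that become unavailable if failed_box goes down."""
    if failed_box == root:
        return sorted(binding)
    return sorted(s for s, box in binding.items() if box == failed_box)


def single_points_of_failure(binding=STATIC_BINDING, root=ROOT_BOX):
    """Return the boxes whose failure takes out every service."""
    boxes = set(binding.values()) | {root}
    return sorted(
        b for b in boxes
        if len(services_lost(b, binding, root)) == len(binding)
    )
```

In this toy model, losing db-box costs only the database, while losing
root-box costs everything, which is the situation described above.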

However, due to FileNet's amazing success (:-), but true), our
customers are now demanding systems so large that we can no
longer a) permit the current single points of failure, or b)
use the same paradigms of distributing the system components (i.e.,
one logical component per box will not cut it anymore).  Furthermore,
the current static binding of services to boxes is probably
too limiting.  In future, we want entire service organizations,
with several hundred on-line users, to depend on the system.
Overall, it must never fail, yet any single component can fail
(the usual story).
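
As a generic illustration of what relaxing static binding can mean (this
is not a description of FileNet's plans, which are not given here), the
Python sketch below lets each service name several candidate boxes and
binds it to the first one that is up.  The candidate table, the box
names, and the bind() helper are all hypothetical.

```python
# Hypothetical candidate lists: each service may run on more than one box.
CANDIDATES = {
    "database": ["box-a", "box-b"],
    "optical-storage": ["box-b", "box-c"],
}


def bind(service, candidates, up):
    """Return the first candidate box for service that is in the set `up`.

    Returns None when no candidate is available, i.e. the service is down.
    """
    for box in candidates[service]:
        if box in up:
            return box
    return None
```

With every box up, bind("database", CANDIDATES, {"box-a", "box-b", "box-c"})
picks box-a; if box-a fails, the same call with {"box-b", "box-c"} falls
back to box-b.  The service is lost only when all of its candidates are
down.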

How do we plan to achieve this?  You think I'm going to tell the
entire net?  I've got stock to protect.  If you really want to find 
out the scoop: well, see, we have this vacancy in the 
operating system group....


Some background on FileNet: FileNet is in Orange County, CA.
The company is 4+ years old.  We have been profitable for 5 quarters.
We have 300 employees.  Our revenues grew some 280% last year, and will
be up some 80% this year.  Maybe we will go public one day.  We sell
to bureaucracies (love that paper).  Banks, insurance companies,
mortgage companies, government departments, etc.  Our basic systems
go for around $0.5 M (sorry, no evaluation boards).

And YOU could work with us! Send resume by E-Mail to me or to:

	FileNet Corp
	3530 Hyland Avenue
	Costa Mesa, CA 92626.


	Martin S. McKendry
	FileNet Corp
	{hplabs,trwrb}!felix!martin


Disclaimer: The facts and figures above are the author's opinions, and do
not represent official views of FileNet.

-- 
Darrell Long
Department of Computer Science & Engineering, UC San Diego, La Jolla CA 92093
ARPA: Darrell@Beowulf.UCSD.EDU  UUCP: darrell@sdcsvax.uucp
Operating Systems submissions to: mod-os@sdcsvax.uucp