[comp.os.research] The Sprite Network Operating System

douglis@ginger.Berkeley.EDU (Fred Douglis) (01/25/88)

[ I would really like to see more articles like this one.  --DL ]

The following is a copy of an article I recently sent to the IEEE
Technical Committee on Operating Systems newsletter.  To format it,
use "ditroff -ms" or the equivalent.  There's nothing tricky in it, so
if you don't have *roff, just change things around for your favorite
formatter.

Note: the February issue of _Computer_ will contain a much more
extensive article on Sprite, including references.

Fred Douglis	(douglis@ginger.berkeley.edu)

---cut here---
.nr PS 11
.ps 11
.nr VS 13
.vs 13
.DS C
.LG
.LG
\fBAn Overview of the Sprite Project\fP
.NL
.sp 1c
John Ousterhout
Andrew Cherenson
Adam de Boor
Fred Douglis
Michael Nelson
Brent Welch
.sp 1c
Computer Science Division
Department of Electrical Engineering and Computer Sciences
University of California
Berkeley, CA  94720

spriters@ginger.berkeley.edu
.DE
.sp 2c
.NH 1
Introduction
.PP
Sprite is a new operating system that we have been designing and
implementing at U.C. Berkeley over the last three years.  Sprite aims
to make the best use of three emerging technologies: local-area networks,
workstations with large physical memories, and multiprocessor
workstations.  While Sprite's
facilities are similar in appearance to those of 4.3 BSD UNIX,
the kernel has been completely re-implemented to provide a
high-performance network file system, process migration, and
shared address spaces.
.PP
Sprite is intended to provide high-quality support for a
relatively small internet or local-area network serving
from a few dozen to a few hundred users, and to combine the
advantages of timesharing with the high performance of personal
workstations.
.PP
One of the innovative features of the Sprite kernel implementation
is its remote procedure call (RPC) facility, which the kernels
on different nodes (workstations) use to request services from
each other.  The basic round-trip time for a simple RPC with
no parameters is about 2.5 ms on Sun-3 workstations.
Fragmentation of large packets allows bulk data transfers to occur
at speeds upward of 700 kbytes/sec on Sun-3 workstations.
Sprite's kernel is also multi-threaded, which enables the system to support
multiprocessors.  Explicit locks on modules and data structures allow
more than one process to execute kernel code safely at the same time.
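.PP
The fragmentation of large packets mentioned above can be pictured with
a minimal sketch.  It is only an illustration: the fragment size, the
(index, total, payload) record layout, and the function names are all
assumptions made here, not Sprite's actual RPC packet format.
.DS L
```python
# Hypothetical sketch: the fragment size and record layout are assumptions,
# not Sprite's actual RPC packet format.
FRAGMENT_SIZE = 1024  # assumed payload bytes per network packet


def fragment(message, size=FRAGMENT_SIZE):
    """Split one large RPC message into (index, total, payload) fragments."""
    total = max(1, (len(message) + size - 1) // size)
    return [(i, total, message[i * size:(i + 1) * size]) for i in range(total)]


def reassemble(fragments):
    """Rebuild the message; fragments may arrive in any order."""
    total = fragments[0][1]
    slots = [None] * total
    for index, count, payload in fragments:
        if count != total:
            raise ValueError("fragments belong to different messages")
        slots[index] = payload
    if any(slot is None for slot in slots):
        raise ValueError("a fragment is missing; the sender must retransmit")
    return b"".join(slots)


message = bytes(range(256)) * 16          # a 4096-byte bulk transfer
packets = fragment(message)
restored = reassemble(list(reversed(packets)))
```
.DE
.PP
Sending one logical request as several fragments lets a bulk transfer
pay the per-call overhead once rather than once per packet, which is
presumably where the higher throughput comes from.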
.PP
The network has been a high priority in the Sprite project,
and much effort has focused on the network file system and the process
migration facility. 
.PP
The network file system appears to users as a single shared hierarchy,
even though the file storage is distributed among several server
machines.  File operations behave exactly the same as if they were
executed on a single BSD timesharing system.
To achieve high performance in the network file system,
both clients and servers keep main-memory caches of recently-used 
disk blocks.  The Sprite file system negotiates with the virtual
memory system over usage of physical memory, so the file caches
can grow very large when virtual memory demands are low.
The caches improve performance by reducing
disk traffic and eliminating much of the network traffic between servers and
clients.  To coordinate shared access to files, Sprite
disables client caching for any file that
is open for writing on one node (workstation) while open for
either reading or writing on another.  The servers carry out these
consistency actions when files are opened or closed.
The resulting performance degradation is judged \(em
based on measurements on \s-2UNIX\s+2 systems \(em to be minimal,
because files are open only for very short intervals and write-sharing
of files is infrequent.  Preliminary performance measurements show
that the Sprite caching mechanism allows most programs to execute only
1-5% slower on diskless workstations than on workstations with disks.
Many programs run 10-40% faster under Sprite's network file system
than under other network file systems, such as Sun's NFS.
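.PP
The caching rule described above can be sketched as a small piece of
server-side bookkeeping.  The class and method names below are
hypothetical and are not taken from the Sprite sources.  The sketch also
assumes that caching becomes permissible again as soon as the conflicting
open goes away, which the text above does not specify.
.DS L
```python
# Hypothetical sketch of the server-side rule; class and method names are
# illustrative and are not taken from the Sprite sources.
class FileServerEntry:
    """Tracks which client nodes have one file open, and how."""

    def __init__(self):
        self.readers = {}   # node name -> count of read-only opens
        self.writers = {}   # node name -> count of opens that allow writing

    def _cacheable(self):
        # Caching is unsafe when some node writes while a different node
        # has the file open for reading or writing.
        nodes = set(self.readers) | set(self.writers)
        return not any(nodes - {w} for w in self.writers)

    def open(self, node, writing):
        table = self.writers if writing else self.readers
        table[node] = table.get(node, 0) + 1
        return self._cacheable()    # True: clients may cache this file

    def close(self, node, writing):
        table = self.writers if writing else self.readers
        table[node] -= 1
        if table[node] == 0:
            del table[node]
        return self._cacheable()    # assumption: re-enable once conflict ends


entry = FileServerEntry()
two_readers = [entry.open("A", writing=False), entry.open("B", writing=False)]
writer_joins = entry.open("C", writing=True)
writer_leaves = entry.close("C", writing=True)
```
.DE
.PP
Because the check runs only at open and close time, ordinary reads and
writes on a file that is not write-shared involve no consistency
traffic in this model.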
.PP
Sprite provides a process migration facility that allows processes
to be moved between nodes with compatible instruction sets.  This
allows users to take advantage of idle workstations around the
network by offloading large jobs.  For example, Sprite contains
a new version of the ``make'' utility that uses process
migration to run independent steps of a build concurrently
on different workstations.
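.PP
As a rough illustration of the idea, the sketch below runs independent
build targets concurrently, assigning them to hosts in turn.  It is not
Sprite's ``make'': the host names and the build function are made up,
and threads stand in for processes migrated to idle workstations.
.DS L
```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch: host names and the build function are made up, and
# threads stand in for processes migrated to idle workstations.
IDLE_HOSTS = ["hostA", "hostB", "hostC"]


def build(target, host):
    """Pretend to compile one independent target on the given host."""
    return f"{target} built on {host}"


def parallel_make(targets, hosts=IDLE_HOSTS):
    """Run independent targets concurrently, one worker per idle host."""
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        jobs = [pool.submit(build, target, hosts[i % len(hosts)])
                for i, target in enumerate(targets)]
        return [job.result() for job in jobs]


results = parallel_make(["a.o", "b.o", "c.o", "d.o"])
```
.DE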
.PP
The network file system plays a large role in process migration by
allowing any file to be accessed by any workstation, so it
isn't necessary to move files when migrating processes.  In
addition, Sprite uses ordinary files for backing storage in the
virtual memory system.  Transferring a process's address space
therefore requires only paging it out from its old workstation
to the backing file and paging it in again on the new workstation.
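.PP
The transfer just described can be modelled in a few lines.  This is a
toy under stated assumptions: a temporary local file stands in for a
backing file on the shared network file system, the page size is
arbitrary, and the class and method names are invented for the example.
.DS L
```python
import os
import tempfile

# Hypothetical model: a "workstation" holds pages in a dict, and the shared
# network file system is stood in for by one ordinary local file.
PAGE_SIZE = 4096  # assumed page size


class Workstation:
    def __init__(self):
        self.resident = {}   # page number -> bytes held in local memory

    def page_out_all(self, backing_path):
        """Flush every resident page to the backing file, then drop it."""
        with open(backing_path, "r+b") as f:
            for number, data in self.resident.items():
                f.seek(number * PAGE_SIZE)
                f.write(data)
        self.resident.clear()

    def page_in(self, backing_path, number):
        """Demand-load one page from the backing file."""
        with open(backing_path, "rb") as f:
            f.seek(number * PAGE_SIZE)
            self.resident[number] = f.read(PAGE_SIZE)
        return self.resident[number]


fd, path = tempfile.mkstemp()
os.close(fd)

old, new = Workstation(), Workstation()
old.resident = {0: b"A" * PAGE_SIZE, 2: b"C" * PAGE_SIZE}
old.page_out_all(path)            # the process leaves the old node
page = new.page_in(path, 2)       # and faults its pages in on the new node
os.remove(path)
```
.DE
.PP
No page travels directly between the two workstations in this model;
the backing file is the only channel, which is why a file system visible
from every node makes the scheme workable.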
.PP
To support multiprocessors, Sprite's implementation of
virtual memory provides a simple
mechanism for shared address spaces:
multiple processes can share the
heap segment of memory using a new form of the ``fork'' system
call.  Sprite's sharing mechanism is all-or-nothing:  a process
cannot share one portion of its heap with one process and another
portion with other processes.
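.PP
The all-or-nothing rule can be illustrated with a toy model.  The
article does not show Sprite's actual system call or segment layout, so
the class, the share_heap parameter, and the code and stack segments
below are assumptions made for the example.
.DS L
```python
# Hypothetical model of the sharing rule; Sprite's real system call and
# segment layout are not shown in the article.
class Process:
    def __init__(self, code, heap, stack):
        self.code = code      # immutable here, so sharing it is harmless
        self.heap = heap      # one mutable segment, shared whole or not at all
        self.stack = stack    # always private in this model

    def fork(self, share_heap=False):
        """Create a child; with share_heap the child uses the very same heap."""
        heap = self.heap if share_heap else bytearray(self.heap)
        return Process(self.code, heap, bytearray(self.stack))


parent = Process(code=b"text", heap=bytearray(8), stack=bytearray(4))
sharer = parent.fork(share_heap=True)
copier = parent.fork()

sharer.heap[0] = 42      # visible to the parent: same heap object
copier.heap[1] = 7       # private copy: the parent never sees this
```
.DE
.PP
The single flag mirrors the restriction in the text: the heap is handed
over as one object, so there is no way to share only part of it.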
.PP
The Sprite team was formed in the fall of 1984, and coding began in
1985.  At present, all of the major pieces of the system except for
process migration are operational; migration is temporarily disabled,
pending a revision of the file system.  All Sprite development
is done on Sprite, and members of the project use UNIX systems only
for those few tools that have not yet been ported to Sprite
(generally, programs that do not run under X11).
.PP
Current work includes studying the interaction between virtual memory
and the file system, adding user-level processes to implement parts of
the file system, and reimplementing the file system to use an
append-only log format (suitable for use on write-once optical disks).