douglis@ginger.Berkeley.EDU (Fred Douglis) (01/25/88)
[ I would really like to see more articles like this one. --DL ]

The following is a copy of an article I recently sent to the IEEE Technical Committee on Operating Systems newsletter. To format it, use "ditroff -ms" or the equivalent. There's nothing tricky in it, so if you don't have *roff, just change things around for your favorite formatter.

Note: the February issue of _Computer_ will contain a much more extensive article on Sprite, including references.

Fred Douglis (douglis@ginger.berkeley.edu)

---cut here---
.nr PS 11
.ps 11
.nr VS 13
.vs 13
.DS C
.LG
.LG
\fBAn Overview of the Sprite Project\fP
.NL
.sp 1c
John Ousterhout
Andrew Cherenson
Adam de Boor
Fred Douglis
Michael Nelson
Brent Welch
.sp 1c
Computer Science Division
Department of Electrical Engineering and Computer Sciences
University of California
Berkeley, CA 94720
spriters@ginger.berkeley.edu
.DE
.sp 2c
.NH 1
Introduction
.PP
Sprite is a new operating system that we have been designing and implementing at U.C. Berkeley over the last three years.
Sprite aims to optimize use of the emerging technologies of local-area networks, workstations with large physical memories, and multiprocessor workstations.
While Sprite's facilities are similar in appearance to those of 4.3 BSD UNIX, the kernel has been completely re-implemented to provide a high-performance network file system, process migration, and shared address spaces.
.PP
Sprite is intended to provide high-quality support to a relatively small internet or local-area network, ranging from a few dozen to a few hundred users, and to combine the advantages of timesharing with the high performance of personal workstations.
.PP
One of the innovative features of the Sprite kernel implementation is its remote-procedure call (RPC) facility, which kernels of different nodes (workstations) use to request services from each other.
The basic round-trip time for a simple RPC with no parameters is about 2.5 ms on Sun-3 workstations.
Fragmentation of large packets allows bulk data transfers to occur at speeds upward of 700 kbytes/sec on Sun-3 workstations.
Also, Sprite's kernel is multi-threaded, enabling the system to support multiprocessors.
To allow more than one process to execute kernel code safely at one time, there are explicit locks on modules and data structures.
.PP
The network has been a high priority in the Sprite project, and much effort has focused on the network file system and the process migration facility.
.PP
The network file system appears to users as a single shared hierarchy, even though the file storage is distributed among several server machines.
File operations behave exactly the same as if they were executed on a single BSD timesharing system.
To achieve high performance in the network file system, both clients and servers keep main-memory caches of recently-used disk blocks.
The Sprite file system negotiates with the virtual memory system over usage of physical memory, so the file caches can grow very large when virtual memory demands are low.
The caches improve performance by reducing disk traffic and eliminating network traffic between servers and clients.
To coordinate shared access to files, Sprite disables client caching for any file that is open for writing on one node (workstation) while open for either reading or writing on another.
Consistency actions are implemented by the servers when files are opened or closed.
The resulting performance degradation is judged \(em based on measurements on \s-2UNIX\s+2 systems \(em to be minimal, because files are open only for very short intervals and write-sharing of files is infrequent.
Preliminary performance measurements show that the Sprite caching mechanism allows most programs to execute only 1-5% slower on diskless workstations than on workstations with disks.
Many programs run 10-40% faster under Sprite's network file system than under other network file systems, such as Sun's NFS.
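.PP
As an illustration only, the open/close-time consistency rule described above might be modeled as follows.
This is a toy Python sketch, not Sprite kernel code: the FileServer class, its method names, and the per-node open counts are all assumptions made for the example.
Each call returns whether clients may cache the file after the open or close has been processed.
.DS L
```python
from collections import defaultdict


class FileServer:
    """Toy server that tracks opens per node and decides cacheability."""

    def __init__(self):
        # path -> node -> [open-for-read count, open-for-write count]
        self.opens = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def cacheable(self, path):
        """Return False if the file is write-shared across nodes."""
        nodes = self.opens[path]
        users = {n for n, (r, w) in nodes.items() if r + w > 0}
        writers = {n for n, (r, w) in nodes.items() if w > 0}
        # Caching is disabled when some node has the file open for writing
        # while a different node has it open for reading or writing.
        return not any(users - {w} for w in writers)

    def open(self, path, node, writing):
        """Record an open and report whether client caching is allowed."""
        self.opens[path][node][1 if writing else 0] += 1
        return self.cacheable(path)

    def close(self, path, node, writing):
        """Record a close and report whether client caching is allowed."""
        self.opens[path][node][1 if writing else 0] -= 1
        return self.cacheable(path)
```
.DE
In this model the decision is re-evaluated only at open and close time, mirroring the statement that consistency actions are taken by the servers when files are opened or closed.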
.PP
Sprite provides a process migration facility that allows processes to be moved between nodes with compatible instruction sets.
This allows users to take advantage of idle workstations around the network by offloading large jobs.
For example, Sprite contains a new version of the ``make'' utility that uses process migration to run independent portions of the make concurrently on different workstations.
.PP
The network file system plays a large role in process migration by allowing any file to be accessed by any workstation.
Thus it isn't necessary to move files when migrating processes.
In addition, Sprite uses ordinary files for backing storage in the virtual memory system.
Thus, all that needs to be done to transfer a process's address space is to page it out from its old workstation to the backing file and page it in again on the new workstation.
.PP
In order to support multiprocessors, Sprite's implementation of virtual memory provides a simple mechanism for shared address spaces.
It allows multiple processes to share the heap segment of memory using a new form of the ``fork'' system call.
Sprite's sharing mechanism is all-or-nothing: a process cannot share one portion of its heap with one process and another portion with other processes.
.PP
The Sprite team was formed in the fall of 1984, and coding began in 1985.
At present all of the major pieces of the system except for process migration are operational; migration is temporarily disabled, pending a revision of the file system.
All Sprite development is done on Sprite, and members of the project use Unix systems only for those few tools that have not yet been ported to Sprite (generally, programs that do not run under X11).
.PP
Current work includes studying the interaction between virtual memory and the file system, adding user-level processes to implement parts of the file system, and reimplementing the file system to use an append-only log format (suitable for use on write-once optical disks).
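.PP
To make the earlier description of address-space transfer during migration concrete, here is a minimal Python sketch of the page-out and page-in steps.
It is a toy model under stated assumptions, not the Sprite implementation: BackingFile, Workstation, and migrate are hypothetical names, pages are plain byte strings, and paging in on demand (rather than all at once) is an assumption the text does not confirm.
.DS L
```python
class BackingFile:
    """Stand-in for an ordinary file in the shared network file system."""

    def __init__(self):
        self.pages = {}  # page number -> bytes


class Workstation:
    """Hypothetical node holding the resident pages of its processes."""

    def __init__(self, name):
        self.name = name
        self.resident = {}  # process id -> {page number: bytes}

    def page_out(self, pid, backing):
        """Write every resident page of pid to the backing file, then drop it."""
        backing.pages.update(self.resident.pop(pid, {}))

    def page_in(self, pid, page_no, backing):
        """Fetch one page from the backing file if it is not yet resident."""
        pages = self.resident.setdefault(pid, {})
        if page_no not in pages:
            pages[page_no] = backing.pages[page_no]
        return pages[page_no]


def migrate(pid, source, backing):
    """Transfer an address space: only the page-out step moves any data."""
    source.page_out(pid, backing)
    # The target needs no copy from the source: it can reach the same
    # backing file through the network file system and page in from it.
```
.DE
Because any workstation can reach the backing file, the source and target never exchange pages directly in this sketch; the target simply calls page_in as the migrated process touches its memory.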