[mod.os] Who needs files. Really "Sprite" -DL

brent@sloth.Berkeley.EDU (Brent Welch) (03/22/87)

I've been meaning to introduce Sprite to mod.os, and the discussion on
mapped files has spurred me to action.

I'm from the filesystem camp.  I'm one of a few students developing
the Sprite operating system here at Berkeley under Prof. Ousterhout.
Sprite appears much like UNIX on the outside, but we've spent the last
couple years designing and implementing our own kernel.  The kernel
interface is close enough to UNIX that we can re-link UNIX programs
with a compatibility library and run them just fine: csh, make, cc, etc.
Sprite has been up and running for some time, although it is still
under development.  (Process migration, processes behind files, and
the IP/TCP interface to the outside world are currently being finished up.)

A central notion of Sprite is that we're providing a nice distributed
filesystem that will solve the world's problems. :-)  Even though we
have a zillion diskless workstations, we'd like the ease of sharing
and administration of the old timesharing systems.  We assert that the
filesystem can provide enough functionality at high performance to do this.

In particular, there are some things that are more easily done with
a filesystem approach than with a mapped file approach.  One, you still
need naming.  The Sprite filesystem has a network global namespace that
includes files as well as other things like peripheral devices and server
processes.  The filesystem takes care of the distribution of the name space,
and it provides a uniform interface to all things kept there.

As was said by someone else, how do you map a terminal into your memory
so you can have device independent I/O?

Also, what does a mapped file approach do for your server processes that
want a name?  In Sprite, a server process can attach itself to a name in
the filesystem.  After that, clients can operate on the name just like it
was a file.  Of course, the filesystem operations may not map to what the
server wants to implement, so a more general operation, RPC essentially,
is also available on these special kinds of files.  This may sound exotic,
but it provides an RPC interface to a server process, something that is
well-known to be a good idea, plus it places the server in the regular
name space.  You can browse for available services with "cd" and "ls".
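
To make the idea concrete, here is a toy sketch in Python of a server
attaching itself to a name and clients then treating that name like a file.
This is not Sprite's interface: NameSpace, Stream, TimeServer, and the
"set_zone" operation are all hypothetical names made up for illustration,
and everything runs in one process instead of across a network.

```python
class NameSpace:
    """Toy stand-in for a global name space (hypothetical, not Sprite's API)."""

    def __init__(self):
        self._servers = {}  # path -> server object attached at that name

    def attach(self, path, server):
        """A server process attaches itself to a name."""
        self._servers[path] = server

    def ls(self, directory):
        """List the names directly under a directory, like "ls"."""
        prefix = directory.rstrip("/") + "/"
        names = set()
        for path in self._servers:
            if path.startswith(prefix):
                names.add(path[len(prefix):].split("/")[0])
        return sorted(names)

    def open(self, path):
        """Clients open the name just as they would open a file."""
        return Stream(self._servers[path])


class Stream:
    """What a client gets back from open(); it looks like a file."""

    def __init__(self, server):
        self._server = server

    def read(self, nbytes):
        return self._server.handle("read", nbytes)

    def write(self, data):
        return self._server.handle("write", data)

    def call(self, op, *args):
        """The more general, RPC-like operation for requests that
        do not fit read or write."""
        return self._server.handle(op, *args)


class TimeServer:
    """Hypothetical example service."""

    def __init__(self):
        self.zone = "UTC"

    def handle(self, op, *args):
        if op == "read":
            return ("12:00 " + self.zone)[:args[0]]
        if op == "set_zone":
            self.zone = args[0]
            return "ok"
        raise ValueError("unsupported operation: " + op)
```

A client would do ns.attach("/services/time", TimeServer()), find the
service with ns.ls("/services"), then ns.open("/services/time") and use
read() for the file-like case or call("set_zone", "PST") for the request
that has no file equivalent.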

What about caching?  In Sprite, diskless nodes keep a regular read-ahead,
delayed-write cache for filesystem blocks.  In order to keep things consistent,
the fileservers track this caching.  To simplify things, they will inhibit
caching for a file, and force write backs and flushes, when different clients
are simultaneously writing a file.  Servers also let clients keep dirty
blocks after a file is closed, i.e., write-back-on-close is not done.  Again,
they'll do the right thing if client B reads a file shortly after client A
writes it.  However, this "sequential" sharing usually happens on the same
client, and the file may get deleted before being written out of the cache.
For example, running on the same hardware (an 8 Meg Sun 3/75), Sprite runs
compilations faster than Sun UNIX 3.2 using NFS.  The slowness of NFS is mainly
due to its use of write-through-to-server's-disk.  To recompile the Sprite
filesystem itself (28,000 lines of C...) takes a Sun UNIX diskless client
15 min 52 sec, while a Sprite diskless client, with the same size cache,
completes the compile in 9 min 54 sec.  If we ramp up the max size of
the Sprite client's cache (the cache size varies dynamically in response
to demands by the VM system and FS) it can complete in 9 min 31 sec.
(The compile runs faster on Sprite locally too: 10:51 on Sun/UNIX 3.2
vs. 9:12 locally on Sprite, both runs with the same size caches.)
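
The server-side bookkeeping described above can be sketched roughly as
follows.  This is a toy model in Python, not the Sprite code: FileServer
and the "write_back" and "disable_cache" messages are hypothetical names,
and I am assuming that any open-for-write that overlaps an open on a
different client counts as concurrent write sharing.

```python
class FileServer:
    """Toy model of the server's cache tracking (names are hypothetical)."""

    def __init__(self):
        self.opens = {}        # file -> list of (client, mode) for current opens
        self.last_writer = {}  # file -> client that may still hold dirty blocks
        self.log = []          # messages the server would send to clients

    def open(self, client, name, mode):
        """Record an open; return True if the client may cache the file."""
        # Sequential sharing: another client wrote the file earlier and may
        # still hold dirty blocks, so ask it to write them back first.
        writer = self.last_writer.get(name)
        if writer is not None and writer != client:
            self.log.append(("write_back", writer, name))
            del self.last_writer[name]

        entries = self.opens.setdefault(name, [])
        entries.append((client, mode))

        clients = {c for c, _ in entries}
        writing = any(m == "w" for _, m in entries)
        cacheable = not (writing and len(clients) > 1)
        if not cacheable:
            # Concurrent write sharing: every client goes straight to the server.
            for c in sorted(clients):
                self.log.append(("disable_cache", c, name))
        return cacheable

    def close(self, client, name, mode):
        """Record a close.  Nothing is flushed here: there is no
        write-back-on-close, so a writer keeps its dirty blocks."""
        self.opens[name].remove((client, mode))
        if mode == "w":
            self.last_writer[name] = client
```

If client A writes and closes a file and client B later opens it, the
only traffic is one "write_back" to A.  If A never hands the file to
anyone else, or deletes it first, the dirty blocks need never reach the
server at all, which is where the win over write-through comes from.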

(Back to mapped files...)  If you map files, then it isn't possible to
"disable caching" when concurrent write sharing occurs.  You either legislate
against this kind of sharing, or implement more complex schemes.  Typically
you pass tokens around the network that allow you to modify pages.
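
As a rough illustration of the token idea, here is a minimal Python
sketch in which at most one node holds the write token for a page at any
time.  PageTokenManager and its methods are hypothetical, and a real
scheme would also have to handle read copies, flushing, and node
failures, all of which are left out here.

```python
class PageTokenManager:
    """Hypothetical manager: at most one node holds the write token per page."""

    def __init__(self):
        self.holder = {}    # page number -> node currently holding the token
        self.transfers = 0  # how many times a token crossed the network

    def acquire(self, node, page):
        """Give `node` the right to modify `page`, taking it from any other
        node.  Returns True if the token had to move."""
        if self.holder.get(page) == node:
            return False          # already held; no network traffic
        self.holder[page] = node  # the old holder would flush and give up the page
        self.transfers += 1
        return True

    def can_write(self, node, page):
        return self.holder.get(page) == node
```

Two nodes that take turns writing the same page force a token transfer
on every turn, which is the extra cost that simply disabling caching
avoids in the filesystem approach.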

One last objection.  Suppose you have a multi-processor architecture,
the kind with a handful of processors each with a large local cache
(instructions and data, forget the filesystem here).  If it's a virtual
address cache, then virtual synonyms (different virtual addresses that
refer to the same physical address) complicate the multi-processor's cache
consistency problem.  This means you may not be able to map the same file
into different processes' address spaces at different locations.  Actually,
I don't even think you need the complication of a multiprocessor, just
the virtual address tagged cache.  Anyhow, this is a more baroque objection
to mapped files, but operating systems must run on real machines, after all.
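
The synonym problem shows up even in a very small model.  The Python
sketch below is a toy write-back cache tagged by virtual address, with
one word per cache line; VirtualCache and the addresses used with it are
made up for illustration and do not describe any particular machine.

```python
class VirtualCache:
    """Toy write-back cache tagged by virtual address (one word per line)."""

    def __init__(self, page_table, memory):
        self.page_table = page_table  # virtual address -> physical address
        self.memory = memory          # physical address -> value
        self.lines = {}               # virtual address -> cached value

    def read(self, vaddr):
        if vaddr not in self.lines:   # miss: fill from physical memory
            self.lines[vaddr] = self.memory[self.page_table[vaddr]]
        return self.lines[vaddr]

    def write(self, vaddr, value):
        self.lines[vaddr] = value     # write-back: memory is not updated yet
```

Map two virtual addresses, say 0x1000 and 0x2000, to the same physical
address, read through 0x2000, then write through 0x1000.  A later read
through 0x2000 still hits its own line and returns the old value, because
the cache has no way to tell that the two tags name the same physical
word.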

So, a return address that our whole group will read is
	...ucbvax!ucbginger!spriters
	spriters@ginger.Berkeley.EDU
and mine in particular is
	...ucbvax!ucbginger!brent
	brent@ginger.Berkeley.EDU

	Brent Welch