[mod.os] User-state file servers versus kernel implementation

mod-os@sdcsvax.uucp (01/14/87)

--

> Alternatively, does it matter?  It's all very well to say that your file
> system is in a user-state server rather than in your kernel, but your users
> are probably just as dependent on every little detail being right.  To make
> user-state servers more than an irrelevant implementation detail, the user
> must be offered a choice of servers AND IT MUST BE EASY TO WRITE A SERVER!
> Or at least, not impossibly hard.  If writing a server is as touchy a job
> as writing the equivalent inside the kernel, the wonderful flexibility will
> get little use. 

I agree.  For my master's project I wrote a distributed file
system for 4.2BSD VAX/UNIX.  All of the important code for
the distributed file system was done in the kernel.
Unfortunately, I was only able to bring the file system up
on four VAXen.  If I had designed the distributed file
system so that a user-level server could have been written
for *any* BSD-like UNIX system, I could have had my project
on all the machines in the College of Engineering -- the
VAXen plus over 30 SUNs!

Also, debugging the kernel had to be done late at night when
nobody in their right mind should have been awake!  If I had
had user-level servers to debug, I could have done it during
the day.  Ugh.  And a crash of my file system meant a reboot
of the machine, whereas a crash of a user-level server would
simply mean remote file access was down for a while.

However, the performance of my distributed file system was
extremely good.  I think it had about twice the throughput
of SUN's NFS.

I guess the main reason I designed my distributed file
system in the kernel was to ensure that existing binaries
did not have to be recompiled to take advantage of remote
access.  I don't think this is possible with user-level
servers, is it?  If it is, could you please direct me to
technical information about how this is/was/could be done?
I would be very interested.
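
The only way I can picture it working is something like the sketch
below: the kernel (or something below the system-call interface) turns
each operation on a remote file into a message for a server process,
so the binary never knows the difference.  Everything in this sketch
-- the message format, the /dev/fsreq request queue -- is pure
invention on my part:

    /*
     * Hypothetical sketch only: the message layout and the
     * /dev/fsreq request queue are invented, not a real
     * interface.  The kernel would queue one fsmsg per remote
     * file operation; this user-level server loops picking
     * them up, so existing binaries run unmodified.
     */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    struct fsmsg {              /* one forwarded file operation */
        int  op;                /* FS_OPEN, FS_READ, ... */
        int  handle;            /* server-side file handle */
        long offset, count;
        char path[256];
    };
    #define FS_OPEN 1
    #define FS_READ 2

    int main(void)
    {
        struct fsmsg m;
        int kfd = open("/dev/fsreq", O_RDWR);   /* hypothetical */

        if (kfd < 0) {
            perror("/dev/fsreq");
            return 1;
        }
        for (;;) {              /* server dispatch loop */
            if (read(kfd, &m, sizeof m) <= 0)
                break;
            switch (m.op) {
            case FS_OPEN:
                /* contact remote host, reply to kfd with a handle */
                break;
            case FS_READ:
                /* fetch m.count bytes at m.offset, reply to kfd */
                break;
            }
        }
        return 0;
    }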

[Conceptually it is easy to do, take a look at the MACH papers. -DL]

Anthony Discolo
-----
Anthony Discolo
+---+---+---+---+---+---+---+
| d | i | g | i | t | a | l |
+---+---+---+---+---+---+---+
Database Systems Group
301 Rockrimmon Blvd. South
Mailstop CX01-2/N23
Colorado Springs, CO  80919
UUCP: ucbvax!decwrl!fastdb.DEC!discolo
ARPA: fastdb.DEC!discolo@decwrl.DEC.COM

--

mod-os@sdcsvax.uucp (01/14/87)

--

  In a distributed system that I am working on at the moment, I built the
servers to run in user state. These servers ran as 'high priority' user
processes - they went through the 'real time scheduler' and were guaranteed
to run before 'normal' user processes. Overall, I would say the experiment
was a failure, and not for the obvious reasons: it came down to performance.

  I later modified my servers to be autonomous entities that ran inside the
kernel's address space, and when I did, performance picked right up. The
slowdown in the original implementation was due mostly to virtual-memory
cache and TLB flushing and to page faulting. When I made the servers run
inside the kernel address space, I didn't need to flush the VM cache or the
TLB, and I didn't get page faults (at least not as frequently).
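
  To put rough numbers on that overhead, here is the kind of measurement
involved (my own illustration, not code from the system): time N one-byte
round trips through a pipe between two processes. Each round trip forces
the context switches - and with them the TLB and VM-cache traffic just
described - and on most virtual-memory machines costs orders of magnitude
more than a procedure call within one address space.

    /* Timing sketch: each pipe round trip between parent and
     * child forces at least two context switches, with the VM
     * cache and TLB costs noted above. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/time.h>

    #define N 10000

    int main(void)
    {
        int up[2], down[2], i;
        char c = 'x';
        struct timeval t0, t1;
        long us;

        pipe(up);
        pipe(down);
        if (fork() == 0) {              /* child: trivial echo "server" */
            for (i = 0; i < N; i++) {
                read(up[0], &c, 1);
                write(down[1], &c, 1);
            }
            _exit(0);
        }
        gettimeofday(&t0, 0);
        for (i = 0; i < N; i++) {       /* parent: one "request" per trip */
            write(up[1], &c, 1);
            read(down[0], &c, 1);
        }
        gettimeofday(&t1, 0);
        us = (t1.tv_sec - t0.tv_sec) * 1000000L
           + (t1.tv_usec - t0.tv_usec);
        printf("%d round trips: %ld usec (%ld per trip)\n", N, us, us / N);
        return 0;
    }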

  While I still believe that a system should be written as a small kernel
surrounded by servers, it is worth pointing out that some servers need
special handling. The current system I have supports two types of servers -
normal and high-performance - and treats them differently according to
type; this makes the system only slightly more complicated. Note that all
this applies only to systems with virtual memory; on systems without
virtual-memory support I would expect no difference in performance.
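
  For concreteness, here is a sketch of what such a two-class server
interface might look like. The names and the srv_register() stub are
invented for illustration; this is not the actual interface of my system:

    /* Hypothetical two-class server interface; names invented. */
    #include <stdio.h>

    enum srv_class {
        SRV_NORMAL,     /* ordinary pageable user process */
        SRV_FAST        /* run in the kernel address space: no VM
                           cache/TLB flush on entry, pages wired */
    };

    struct server {
        char           *name;
        enum srv_class  class;
        void          (*dispatch)(void *msg);  /* message entry point */
    };

    /* Real registration would wire the server's pages and map it
     * into kernel space for SRV_FAST; this stub just reports. */
    static int srv_register(struct server *s)
    {
        printf("register %s as %s server\n", s->name,
               s->class == SRV_FAST ? "high-performance" : "normal");
        return 0;
    }

    static void fs_dispatch(void *msg) { (void)msg; }

    int main(void)
    {
        struct server fs = { "file server", SRV_FAST, fs_dispatch };
        return srv_register(&fs);
    }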

  The two biggest reasons for constructing the system as a small kernel
surrounded by servers were portability and maintenance. For portability,
the only part of the system that needs to be adapted to a new environment
is the kernel - the servers stay the same. Maintenance was also important -
the system can change without my having to worry about weird (and
forgotten) side effects: the interface is straightforward.

  A note of interest is that the file server I have written is machine
independent - a disk created and built on one machine can be read and
accessed on another (the endianness of the machine doesn't matter). What I
have found is that performance is not hindered much (< 0.5% of the
processor time goes to maintaining the rigid data format). Thus, assuming
the physical media are the same, I have taken disks formatted on an IBM PC
and used them in the disk drive of a National Semiconductor box without
problem.
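
  The trick behind such a rigid, byte-order-independent format is the
standard one: fix the on-disk byte order once and pack and unpack every
field explicitly, rather than writing raw structs to disk. A minimal
sketch of the technique (illustrative code, not lifted from my server):

    /* Pack and unpack 32-bit fields low byte first, whatever the
     * host byte order; disks written this way read back the same
     * on either kind of machine. */
    #include <stdio.h>

    typedef unsigned long u32;

    void put32(unsigned char *p, u32 x)
    {
        p[0] = x & 0xff;
        p[1] = (x >> 8) & 0xff;
        p[2] = (x >> 16) & 0xff;
        p[3] = (x >> 24) & 0xff;
    }

    u32 get32(unsigned char *p)
    {
        return (u32)p[0] | ((u32)p[1] << 8)
             | ((u32)p[2] << 16) | ((u32)p[3] << 24);
    }

    int main(void)
    {
        unsigned char buf[4];

        put32(buf, 0x12345678L);
        printf("0x%lx\n", get32(buf));  /* 0x12345678 on any machine */
        return 0;
    }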

  Another interesting point is that I have separated the directory from the
file system. The directory is nothing more than a database server with a
few extensions (it is a B+ tree). This lets me store not only file names
(and file-server information) but also things about processes, etc. It
turned out to be useful - I keep .login-style things in the database, for
example.
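
  To give the flavor of the records this permits, here is an illustrative
sketch - the real directory is a B+ tree server; a sorted array and
bsearch() stand in for it here:

    /* Illustrative record format: any string key, a type tag
     * saying what the value means.  Entries are made up. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define T_FILE 1    /* value is file-server information */
    #define T_ENV  2    /* value is a .login-style setting */

    struct dbrec {
        char *key;
        int   type;
        char *value;
    };

    static struct dbrec db[] = {        /* kept sorted by key */
        { "/etc/passwd",    T_FILE, "fs0: inode 14"  },
        { "/usr/ast/paper", T_FILE, "fs1: inode 203" },
        { "env/TERM",       T_ENV,  "vt100" },
    };

    static int cmp(const void *a, const void *b)
    {
        const struct dbrec *x = a, *y = b;
        return strcmp(x->key, y->key);
    }

    int main(void)
    {
        struct dbrec want = { "env/TERM", 0, 0 }, *r;

        r = bsearch(&want, db, sizeof db / sizeof db[0],
                    sizeof db[0], cmp);
        if (r != 0)
            printf("%s -> %s\n", r->key, r->value);
        return 0;
    }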

--