[mod.os] kernel servers vs. user servers.

mod-os@sdcsvax.uucp (01/12/87)

--

In article <2438@sdcsvax.UCSD.EDU> henry@utzoo.uucp writes:
>
>Alternatively, does it matter?  It's all very well to say that your file
>system is in a user-state server rather than in your kernel, but your users
>are probably just as dependent on every little detail being right.  To make
>user-state servers more than an irrelevant implementation detail, the user
>must be offered a choice of servers AND IT MUST BE EASY TO WRITE A SERVER!
>Or at least, not impossibly hard.  If writing a server is as touchy a job
>as writing the equivalent inside the kernel, the wonderful flexibility will
>get little use. 
>
I think that user-state servers have a lot of advantages over
kernel-state servers. Of course, it shouldn't matter to the user
directly, but some of the advantages of user-state servers
I see are:

- Testing and debugging a new, improved, better and faster (and,
therefore, completely unreliable:-) server can be done during
normal production time, and all normal debugging tools are available.

- Backward compatibility or emulation of a different OS is much
easier.

- By splitting a service into smaller services these services become
much easier to understand (and, hopefully, robust).

Of course, performance is still better for kernel-state servers,
but the gap seems to be closing quite fast.
-- 
	Jack Jansen, jack@cwi.nl (or jack@mcvax.uucp)
	The shell is my oyster.

--

darrell@sdcsvax.UUCP (01/17/87)

--

> File servers are not something that every user can write, but I certainly
> want the flexibility of having various gurus and researchers being able to
> write new file servers whenever they want to.    Writing a file server is
> a fair amount of work, but is not unreasonable for a Master's thesis.

I agree.  My point is that, apart from some significant-but-secondary
issues of debugging, this is independent of whether file servers are part
of the kernel or not.  Kernels with multiple file systems are not uncommon
now, and there is no intrinsic reason why an experimental file system cannot
coexist with a production one in such a kernel.  (It is shameful that UCB
did not do this with 4.2, in fact, since the 4.1->4.2 filesystem conversion
was an ideal application for it.)

It is certainly true that changing and debugging file systems is easier and
(on a multi-user machine) safer when they are not part of the kernel.

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry


--

> - Testing and debugging a new, improved, better and faster (and,
> therefore, completely unreliable:-) server can be done during
> normal production time, and all normal debugging tools are available.

True.  On the other hand, presumably one does not test/debug a file server
all that often...

> - Backward compatibility or emulation of a different OS is much easier.

Not really; there is no fundamental difference between kernel and user
file servers in this regard that I can see.  (Remember that kernels can
have more than one kind of file system, with kernel code implementing all
of them.)  Can you elaborate?

> - By splitting a service into smaller services these services become
> much easier to understand (and, hopefully, robust).

Yes, but this is largely orthogonal to the issue.  It is perfectly possible
to have a kernel file system which is split cleanly from the rest of the
kernel (although admittedly a separate address space adds to the firmness
of the split).  Kernels which support multiple different file systems go
in for this to a considerable extent, of necessity.  Equally, it is possible
to have user-level file servers whose interface to the kernel is messy and
complex and whose innards are complicated and incomprehensible.  Moving a
service across the kernel-user boundary is neither necessary nor sufficient
for clean partitioning, although it may encourage it.


Understand, I am in favor of user-level file servers, subject to some
reservations about performance.  But I see them as a modest win if done
right, rather than an automatic huge win.


				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry
--

darrell@sdcsvax.UUCP (01/17/87)

--


---------------------Reply to mail dated 13-JAN-1987 05:03---------------------

> Alternatively, does it matter?  It's all very well to say that your file
> system is in a user-state server rather than in your kernel, but your users
> are probably just as dependent on every little detail being right.  To make
> user-state servers more than an irrelevant implementation detail, the user
> must be offered a choice of servers AND IT MUST BE EASY TO WRITE A SERVER!
> Or at least, not impossibly hard.  If writing a server is as touchy a job
> as writing the equivalent inside the kernel, the wonderful flexibility will
> get little use. 

I agree.  For my master's project I wrote a distributed file
system for 4.2BSD VAX/UNIX.  All of the important code for
the distributed file system was done in the kernel.
Unfortunately, I was only able to bring the file system up
on four VAXen.  If I had designed the distributed file
system so that a user-level server could have been written
for *any* BSD-like UNIX system, I could have had my project
on all the machines in the College of Engineering -- the
VAXen plus over 30 SUNs!

Also, debugging the kernel had to be done late at night when
nobody in their right mind should have been awake!  If I had
user-level servers to debug, I could have done it during the
day.  Ugh.  Also, crashes of my file system meant a reboot
of the machine, whereas crashes of a user-level server would
simply have meant that remote file access was down for a while.

However, the performance of my distributed file system was
extremely good.  I think it had about twice the throughput
of SUN's NFS.

I guess the main reason I designed my distributed file
system in the kernel was to ensure that existing binaries
did not have to be recompiled to take advantage of remote
access.  I don't think this is possible with user-level
servers, is it?  If it is, could you please direct me to
technical information about how this is/was/could be done?
I would be very interested.

Anthony Discolo
-----
Anthony Discolo
+---+---+---+---+---+---+---+
| d | i | g | i | t | a | l |
+---+---+---+---+---+---+---+
Database Systems Group
301 Rockrimmon Blvd. South
Mailstop CX01-2/N23
Colorado Springs, CO  80919
UUCP: ucbvax!decwrl!fastdb.DEC!discolo
ARPA: fastdb.DEC!discolo@decwrl.DEC.COM

--

jbn@glacier.stanford.edu (John B. Nagle) (01/18/87)

--

      It's a sad commentary on the operating systems we use today that we have 
to build things into the kernel "to make them go fast".  Operating system
internals technology seems to have regressed since the days of Multics or
even the Michigan Terminal System.  Part of the problem is that the construction
of a layered operating system requires suitable hardware for efficient 
interaction between the layers.  Such hardware is not rare; most superminis
and mainframes have it, but most microprocessors do not, partly because this
issue cuts across the line between the processor and the memory management
hardware.  For most microprocessors these are on separate chips, and the
chips are designed so that operation without an MMU is possible.
      Much of the portability of the UNIX kernel comes from its assumption 
that the hardware offers minimal support for structuring the operating system.	
We pay a price for this, in that as services are added to the system, the kernel
becomes huge.  And it never seems to have everything that everybody wants.
      What is needed is hardware support for interprocess and interdomain
communication.  Ideally, one should be able to do the following:

      - Write an application which offers a set of services callable by 
 	other programs, but has enough control over its own operation that
	it can protect the integrity of its own data structures.  Consider,
	for example, a database system which offers services to other
	programs but should be able to maintain the integrity of the database
	in the face of foulups by the applications.

      - Call other applications with little if any more overhead than that
	required to call a subroutine in one's own space.

      - Find out from an application the identity of your caller in a
	protection sense.

      - Communicate with multiple callers from the same application.

      - Keep applications outside the system protection boundary, i.e.
        applications should not "run as root".

MULTICS, for example, provided all these capabilities.  To a lesser extent,
so does VMS.  UNIX does not.  This results in a need for terrible kludges
whenever a shared service is implemented, most of which only sort of work.
Some UNIX systems need special devices to make mail work.  Anyone who has ever
used the INGRES database has run into the "INGRES locking device".  And the
line printer spooling systems have always had locking problems.  Almost all of
the shared services in UNIX have a "set UID to root" program somewhere, and
thus are potential ways to breach security.

      Sockets and streams are steps in the right direction, as is the SUN RPC
mechanism.  What one really wants is something with the arm's length 
relationship of an RPC but the efficiency of an ordinary call.  This is not
impossible; it was achieved before 1970.  But one has to assume hardware 
support to get that efficiency.

      An interesting approach would be to define an RPC facility which, when
both entities were on the same machine and the machine sported suitable
hardware, turned into a more efficient direct call across a protection 
boundary.  With such an approach, one could have efficiency, protection,
distribution, and portability, although versions on machines without suitable
hardware would be slow.
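
      As a rough sketch of that idea -- and only a sketch, with every name
in it invented for illustration -- a client-side stub in C might look like
the following.  The "direct call across a protection boundary" is stood in
for by an ordinary function pointer, since the hardware-assisted gate call
is exactly the part portable C cannot express; a real facility would also
consult a name server at bind time and marshal arguments for the remote case.

	#include <stdio.h>
	#include <string.h>

	/* Hypothetical RPC stub: at bind time, discover whether the service
	 * lives on the same machine.  If so, record a direct entry point
	 * (standing in for a hardware protected-call gate); if not, fall
	 * back to sending a message.  All names here are made up.          */

	typedef int (*service_entry_t)(const char *request,
	                               char *reply, int replylen);

	struct rpc_binding {
	    int local;              /* nonzero if callee is on this machine */
	    service_entry_t entry;  /* direct entry point when local        */
	    char remote_host[64];   /* where to send the message otherwise  */
	};

	/* A sample service; in a real system it would sit behind a
	 * protection boundary, here it is just a local function.           */
	static int date_service(const char *request, char *reply, int replylen)
	{
	    (void)request;
	    strncpy(reply, "Sun Jan 18 1987", replylen - 1);
	    reply[replylen - 1] = '\0';
	    return 0;
	}

	/* Bind to a named service.  This sketch hard-wires one local
	 * service instead of consulting a name server.                     */
	static int rpc_bind(const char *name, struct rpc_binding *b)
	{
	    if (strcmp(name, "date") == 0) {
	        b->local = 1;
	        b->entry = date_service;
	        return 0;
	    }
	    b->local = 0;
	    strncpy(b->remote_host, "otherhost", sizeof(b->remote_host) - 1);
	    b->remote_host[sizeof(b->remote_host) - 1] = '\0';
	    return 0;
	}

	/* The call stub: cheap direct call when local, message otherwise.  */
	static int rpc_call(struct rpc_binding *b, const char *req,
	                    char *reply, int replylen)
	{
	    if (b->local)
	        return b->entry(req, reply, replylen);   /* "gate" call     */
	    /* Remote case: marshal req, send to b->remote_host, await the
	     * reply; omitted here.                                          */
	    snprintf(reply, replylen, "(would send to %s)", b->remote_host);
	    return 0;
	}

	int main(void)
	{
	    struct rpc_binding b;
	    char reply[64];

	    rpc_bind("date", &b);
	    rpc_call(&b, "what is the date?", reply, sizeof(reply));
	    printf("%s\n", reply);
	    return 0;
	}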

						John Nagle


--

darrell@sdcsvax.UUCP (01/18/87)

--

In article <2480@sdcsvax.UCSD.EDU>, John B. Nagle writes:
>                                  Part of the problem is that the construction
> of a layered operating system requires suitable hardware for efficient 
> interaction between the layers.  Such hardware is not rare; most superminis
> and mainframes have it; but most microprocessors do not, partly because this
> issue cuts across the line between the processor and the memory management
> hardware, and for most microprocessors, these are on separate chips, and the
> chips are so designed that operation without an MMU is possible.

The 68020 added "CALLM" and "RTM" instructions that implement the same
kind of ring-crossing calls as the DG MV series (and Multics, though I
haven't worked with Multics so I could be misrepresenting their
hardware).  There is support in the 68020's MMU for "call gates" and
all the associated paraphernalia to make it work.  The MMU checks the
top 3 bits of all accesses to ensure that low-privileged rings cannot
access the memory addresses reserved for high-privileged rings, and 
the CPU runs special cycles to tell the MMU when to change its privilege
level.

I have heard that this feature is being dropped from the 68030, which
integrates the CPU and MMU on one chip, because NOBODY USES IT.  Also,
the microcode to do it is complicated and it takes the chip designers
weeks to understand what it's supposed to do, then do it -- and they
can't afford the time in the Race to Market.

Part of the problem is that the stack format used is incompatible with
the original JSR and RTS instructions, so nobody compiles subroutines for
the new stack format, so nobody can compile calls that use them either.
Somebody who really wanted to use an expensive call instruction rather
than a simple JSR could change their compiler, but so far nobody has
wanted to :^).


--

dan@sdcsvax.UUCP (01/19/87)

--

>I guess the main reason I designed my distributed file
>system in the kernel was to ensure that existing binaries
>did not have to be recompiled to take advantage of remote
>access.  I don't think this is possible with user-level
>servers, is it?  If it is, could you please direct me to
>technical information about how this is/was/could be done?
>I would be very interested.

   There is a nice little distributed, message passing operating system for
the IBM PC and AT, called QNX.  It has a small kernel for message services
and the clock interrupt, and relies on administrator tasks for everything
else, including file service.  There are various hacks in the default
administrators to optimize things like program loading, but the apparent
behaviour is as described.

   There are various other user-written file system administrators available, 
including one for MS-DOS file systems.  In order to avoid message forwarding
through the default f.s. admin, the library routines that communicate with
it maintain, for each file, the task ID of the administrator handling that
file.  The user administrator, on startup, sends an `adopt disk' message to
the default admin, telling it which disk name (it can be an arbitrary hex
digit between 1 and F) it wants to serve.  When a task first tries to open a 
file on that disk, the default administrator sends the library routine what
amounts to a redirect message, saying, "From now on communicate with task
so-and-so for this file".

   The redirection takes place completely under the covers, so any compiled
program can be used with any file system, as long as it provides services
adequate to that program's needs.
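
   In schematic form (this is not QNX's actual message layout or calls --
the structures, task IDs, and disk number below are made up to illustrate
the idea), the redirection works roughly like this:

	#include <stdio.h>

	typedef int task_id_t;

	#define DEFAULT_FS_ADMIN  1      /* hypothetical task IDs           */
	#define MSDOS_FS_ADMIN    7

	enum reply_kind { REPLY_OK, REPLY_REDIRECT };

	struct open_reply {
	    enum reply_kind kind;
	    task_id_t admin;             /* who to talk to from now on      */
	    int handle;
	};

	struct file {
	    task_id_t admin;             /* administrator serving this file */
	    int handle;
	};

	/* Stand-in for the message-passing primitive: ask a task to open a
	 * path.  Here the default admin "knows" disk 3 was adopted by the
	 * MS-DOS administrator.                                            */
	static struct open_reply send_open(task_id_t to, int disk,
	                                   const char *path)
	{
	    struct open_reply r = { REPLY_OK, to, 42 };
	    (void)path;
	    if (to == DEFAULT_FS_ADMIN && disk == 3) {
	        r.kind = REPLY_REDIRECT;  /* "talk to task so-and-so"       */
	        r.admin = MSDOS_FS_ADMIN;
	    }
	    return r;
	}

	/* Library open routine: follows at most one redirect, then
	 * remembers the administrator so later requests for this file go
	 * straight to it, with no forwarding through the default admin.    */
	static int lib_open(struct file *f, int disk, const char *path)
	{
	    struct open_reply r = send_open(DEFAULT_FS_ADMIN, disk, path);
	    if (r.kind == REPLY_REDIRECT)
	        r = send_open(r.admin, disk, path);
	    f->admin = r.admin;
	    f->handle = r.handle;
	    return 0;
	}

	int main(void)
	{
	    struct file f;
	    lib_open(&f, 3, "report.txt");
	    printf("file served by task %d, handle %d\n", f.admin, f.handle);
	    return 0;
	}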


-- 
    Dan Frank
    uucp: ... uwvax!prairie!dan
    arpa: dan%caseus@spool.wisc.edu
--

darrell@sdcsvax.UUCP (01/21/87)

--

  Some people have commented that it is a sorry state of affairs that things
must reside in the "kernel" to be efficient. Others have stated that
"user servers" are the way to go. Well, in the system that I am working on, the
only difference between a "kernel" server and a "user" server is that in a 
kernel server no cache or TLB flushing occurs. I can still run the debugger
and test the kernel process, just as I would for a user process. Thus, short
of a hardware performance improvement there is no difference.

  As for why the kernel server runs faster, it comes down to two reasons:
first, as mentioned, no cache or TLB flushing occurs, and second, the
scheduling of "kernel" processes (AKA servers) is realtime and priority-based.
User processes are scheduled according to a timeslicing scheme, which
doesn't work well with "server" type things.
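
  For what it's worth, the scheduling half of that distinction can be
approximated on systems that offer POSIX realtime scheduling (which is not
the system described above): a user-level server can ask to be moved out of
the timeslicing class into a fixed-priority, run-until-block class.  A
minimal sketch, which needs privilege to succeed on most systems:

	#include <sched.h>
	#include <stdio.h>

	int main(void)
	{
	    struct sched_param sp;

	    /* Request the highest fixed priority in the FIFO (realtime)
	     * class for this process (pid 0 = the caller).                 */
	    sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
	    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
	        perror("sched_setscheduler");
	        return 1;
	    }
	    printf("server now runs at fixed realtime priority %d\n",
	           sp.sched_priority);
	    /* ... server loop: receive request, handle it, reply, block ... */
	    return 0;
	}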



                     Chuck Wegrzyn, not a Codex employee.

--