[comp.unix.wizards] How to write a new Unix-like kernel

rsalz@bbn.com (Rich Salz) (10/20/89)

In <17166@rpp386.cactus.org> jfh@rpp386.cactus.org (John F. Haugh II) writes:
> ... discouraging paging
> the kernel is kinda wasteful the way kernels keep bloating.

Take this sentence backwards, and it becomes a feature:  since the kernel
can't page, you can't puff lots of stuff into it.  This has forced a
certain economy of design (phrase lifted from one of the Unix papers, read
them all and find out which one -- it'll be good for you) that has
resulted in the initial success of Unix lo these many years ago.

I don't think this bloat is necessary, and as Dick Dunn has implied in
<1989Oct19.220105.10185@ico.isc.com>, if you make it possible to have the
kernel page, then all you do is make it possible to have every
semi-competent bozo put everything they want in the kernel.  Goodbye
tasteful and understandable set of features, hello [VM][MV]S.

On the other hand, if you let the kernel page, then you can take all the
stuff that doesn't page and call that the "real" kernel.  As long as it's
paging the other parts, put them in user space, and give users the
opportunity to put their own code in for their programs.

I don't expect someone whose .signature says that Mach stands for
messages are crufty hacks will like this design very much, but I'd
rather avoid bloat, myself.  (Do I have to say that this is intended
to be a mild tweak and not one of the famous "Usenet ad hom.
attacks"?)

	/r$
-- 
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.
Use a domain-based address or give alternate paths, or you may lose out.

jfh@rpp386.cactus.org (John F. Haugh II) (10/22/89)

In article <2046@prune.bbn.com> rsalz@bbn.com (Rich Salz) writes:
>In <17166@rpp386.cactus.org> jfh@rpp386.cactus.org (John F. Haugh II) writes:
>> ... discouraging paging
>> the kernel is kinda wasteful the way kernels keep bloating.
>
>Take this sentence backwards, and it becomes a feature:  since the kernel
>can't page, you can't puff lots of stuff into it.  This has forced a
>certain economy of design (phrase lifted from one of the Unix papers, read
>them all and find out which one -- it'll be good for you) that has
>resulted in the initial success of Unix lo these many years ago.

Don't shoot the messenger!  I haven't encouraged kernel bloat, I'm just
reporting the facts.  I've frequently professed my admiration for the
7th Edition kernel in this newsgroup.  And I have read the UNIX papers,
including the one which says that UNIX really isn't an operating system.

>I don't think this bloat is necessary, and as Dick Dunn has implied in
><1989Oct19.220105.10185@ico.isc.com>, if you make it possible to have the
>kernel page, then all you do is make it possible to have every
>semi-competent bozo put everything they want in the kernel.  Goodbye
>tasteful and understandable set of features, hello [VM][MV]S.

I agree.  However, if we are going to be adding features to a minimal
kernel [ such as networking, graphics, security ] we are either going to
have to cleverly redefine what exactly is =the= kernel or find more
efficient methods of managing the memory we are consuming.

In my mind it is the customer who is driving this software obesity.
You can either argue that Sun is successful because the customers
like the software features and buy more Suns, or you can argue the
Sun programmers keep adding features because the sales staff keeps
doing a better job of fooling Sun's customers.  I'd love to see an
analysis of SunOS size versus Sun annual sales.  If you care to point
out that every company is growing their OS, I might point out that the
ones that didn't stay on the leading edge of creeping featurism are
now, in the main, out of business.

Besides, who says VMS or MVS or VM/SP are not useful operating systems?
If the question is "How do I put 2000 users on one machine?", my answer
is probably going to be MVS.  The question may be kinda stupid and
anyone who really wants a DASD farm deserves MVS, but it is a solution.

>On the other hand, if you let the kernel page, then you can take all the
>stuff that doesn't page and call that the "real" kernel.  As long as it's
>paging the other parts, put them in user space, and give users the
>opportunity to put their own code in for their programs.

Here I disagree.  Almost everything in a kernel is pageable.  If you
remove everything that can be paged or replaced you get CP.  Taking
a module out of the kernel and cleverly calling it the "real" kernel will
only populate the memory with "user" processes which are now paging.

The user -should- be able to select which file system they want bound
into their kernel.  If you want big and fat, the Berkeley Fast File
System is available.  If you want small and stupid, RT-11 comes to
mind ;-)

Create a standard interface model and code two or three file systems
to fit that model.  Then do the same with network interfaces, graphics
interfaces, etc.  I really should be able to have X in the kernel or
not.  I may need 16MB just to boot, but I should be able to do it.
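
A minimal sketch of such a standard interface, assuming a table of
function pointers per file system.  Every name here (fs_ops, bigfs,
smallfs, fs_lookup) is hypothetical and the block sizes are only
illustrative; this is not taken from any real kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface: each file system supplies one table like this. */
struct fs_ops {
    const char *name;                   /* key used to select the file system */
    int  (*mount)(const char *device);  /* returns 0 on success, -1 on failure */
    long (*block_size)(void);           /* bytes per block */
};

/* "Big and fat" stand-in; the numbers are illustrative only. */
static int  bigfs_mount(const char *device) { return device != NULL ? 0 : -1; }
static long bigfs_block_size(void)          { return 8192; }

/* "Small and stupid" stand-in. */
static int  smallfs_mount(const char *device) { return device != NULL ? 0 : -1; }
static long smallfs_block_size(void)          { return 512; }

/* The set of file systems bound into this (imaginary) kernel. */
static const struct fs_ops fs_table[] = {
    { "bigfs",   bigfs_mount,   bigfs_block_size   },
    { "smallfs", smallfs_mount, smallfs_block_size },
};

/* Return the table for the named file system, or NULL if it is not bound in. */
const struct fs_ops *fs_lookup(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof fs_table / sizeof fs_table[0]; i++) {
        if (strcmp(fs_table[i].name, name) == 0)
            return &fs_table[i];
    }
    return NULL;
}
```

The rest of the kernel would call only through the table, so binding in
a different file system means changing fs_table and nothing else.  The
same shape would presumably serve for network or graphics interfaces.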

>I don't expect someone whose .signature says that Mach stands for
>messages are crufty hacks will like this design very much, but I'd
>rather avoid bloat, myself.  (Do I have to say that this is intended
>to be a mild tweak and not one of the famous "Usenet ad hom.
>attacks"?)

Probably ;-)  Do you like the new .signature better?

I feel strongly against message passing schemes for the same reason
I'm not totally sold on lightweight kernel processes.  jsr's are cheaper
than messages or context switches.  You haven't guaranteed my MACH
processes aren't going to be paged out, so you've gained nothing more
than this warm fuzzy feeling that MACH's 55KLOC kernel is more
understandable than SunOS 4.0 or AT&T's latest overweight offering.

In fact, I'll even argue MACH is dangerous because it now gives
everyone an entirely new level to populate with crap.  I feel very
confident in stating that MACH will be big and crufty in even less
time than it took UNIX.  Everyone is so much better at adding cruft.
-- 
John F. Haugh II                        +-Things you didn't want to know:------
VoiceNet: (512) 832-8832   Data: -8835  | The real meaning of EMACS is ...
InterNet: jfh@rpp386.cactus.org         |   ... EMACS makes a computer slow.
UUCPNet:  {texbell|bigtex}!rpp386!jfh   +--<><--<><--<><--<><--<><--<><--<><---

chip@ateng.com (Chip Salzenberg) (10/26/89)

I have to agree with almost everything JFH and r$ have written about kernel
bloat and paging policies.  We all know that SunOS is the definition of
bloat, and that Plan 9 should take over the world.  (Oh, they didn't mention
Plan 9?  An oversight, I'm sure... :-))

However, let me comment on the speed of message passing...

According to jfh@rpp386.cactus.org (John F. Haugh II):
>In article <2046@prune.bbn.com> rsalz@bbn.com (Rich Salz) writes:
>>I don't expect someone whose .signature says that Mach stands for
>>messages are crufty hacks will like this design very much, but I'd
>>rather avoid bloat, myself.  (Do I have to say that this is intended
>>to be a mild tweak and not one of the famous "Usenet ad hom.
>>attacks"?)
>
>Probably ;-)  Do you like the new .signature better?
>
>I feel strongly against message passing schemes for the same reason
>I'm not totally sold on lightweight kernel processes.  jsr's are cheaper
>than messages or context switches.

This statement may be true of Mach messages.  However, it's a matter of
priorities.  If the message designer likes speed, you'll get speed.

For example, I once worked on the design and implementation of a message-
based real-time OS for the Z-80.  Its message executive was implemented as an
array of function pointers.  Unassigned message ports pointed to a common
"you can't get there from here" routine, so minimal error checking was
needed.  It was _very_ fast.  (It had to be.)

A process that wanted to respond instantly could simply attach a subroutine
of its choice to the message port.  If immediate response wasn't required,
messages could be dropped in mailboxes for later perusal.  Mailboxes didn't
require special casing since they were code, too -- three bytes of "CALL",
followed by the message data.  Okay, it was a hack, but user code didn't
have to deal with it; at least it was a localized hack.
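
The dispatch scheme described above might be sketched in C roughly as
follows.  This is an illustration under stated assumptions, not the
original Z-80 code: the names (ports, port_attach, port_send,
mailbox_deposit) are hypothetical, the port number is masked rather than
range-checked, and the mailbox is an ordinary handler that copies the
message, since the "mailbox is code" trick does not carry over to
portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NPORTS     4u              /* must be a power of two for the mask below */
#define PORT_MASK  (NPORTS - 1u)
#define MSG_OK      0
#define MSG_NOPORT (-1)
#define MSG_TOOBIG (-2)

typedef int (*port_handler)(const void *msg, size_t len);

/* Common "you can't get there from here" routine for unassigned ports. */
static int no_such_port(const void *msg, size_t len)
{
    (void)msg;
    (void)len;
    return MSG_NOPORT;
}

/* Every slot always holds a valid function, so sending never tests for NULL. */
static port_handler ports[NPORTS] = {
    no_such_port, no_such_port, no_such_port, no_such_port
};

/* A one-message mailbox for receivers that do not need to respond at once. */
char   mailbox[64];
size_t mailbox_len;

int mailbox_deposit(const void *msg, size_t len)
{
    if (len > sizeof mailbox)
        return MSG_TOOBIG;
    memcpy(mailbox, msg, len);
    mailbox_len = len;
    return MSG_OK;
}

/* Attach a handler (an immediate subroutine or a mailbox) to a port. */
void port_attach(unsigned port, port_handler h)
{
    assert(h != NULL);
    ports[port & PORT_MASK] = h;
}

/* Sending a message is one masked index and one indirect call. */
int port_send(unsigned port, const void *msg, size_t len)
{
    return ports[port & PORT_MASK](msg, len);
}
```

With this layout, port_send(2, "hi", 2) returns MSG_NOPORT until
something calls port_attach(2, mailbox_deposit); after that the same
send returns MSG_OK and leaves the two bytes in the mailbox.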

Before you say "what about multiprocessing," note that the back end of a
message port could very well be a network transmission function.

Now you may be asking, why is this in a Unix newsgroup?  The answer:  The
message system I just described served once as a practical base for a very
fast Mach-like kernel.  It could do so again.
-- 
You may redistribute this article only to those who may freely do likewise.
Chip Salzenberg at A T Engineering;  <chip@ateng.com> or <uunet!ateng!chip>
"'Why do we post to Usenet?'  Naturally, the answer is, 'To get a response.'"
                        -- Brad "Flame Me" Templeton

pcg@rupert.cs.aber.ac.uk (Piercarlo Grandi) (10/29/89)

In article <2546462F.5156@ateng.com> chip@ateng.com (Chip Salzenberg) writes:

   For example, I once worked on the design and implementation of a message-
   based real-time OS for the Z-80.  Its message executive was implemented as an
   array of function pointers.  Unassigned message ports pointed to a common
   "you can't get there from here" routine, so minimal error checking was
   needed.  It was _very_ fast.  (It had to be.)

   A process that wanted to respond instantly could simply attach a subroutine
   of its choice to the message port.  If immediate response wasn't required,
   messages could be dropped in mailboxes for later perusal.  Mailboxes didn't
   require special casing since they were code, too -- three bytes of "CALL",
   followed by the message data.  Okay, it was a hack, but user code didn't
   have to deal with it; at least it was a localized hack.

MUSS (from Manchester University) has a similar scheme; you can
send entire virtual memory segments from one process to another,
and this only passes the handle to the segment's paging tables
from the source process to the target process.

The filesystem will, when you want to open a file, send you a
pointer to paging tables that map the disk pages of the
segment/file; as you use them they are faulted in. When two
processes ask to open the same file, they are given paging tables
to the same physical pages, thus creating shared memory. On the
other hand, sending a message does not require any core-to-core
copy, nor copy-on-write and its complexities, only a pure process
switch.
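
A rough C sketch of the idea that a send moves only a handle, never the
data.  This is not MUSS code: the structures and seg_send are
hypothetical, and it assumes the sender gives up its handle on a send
(sharing, as in the two-openers case, would hand out the same pointer
twice instead).

```c
#include <assert.h>
#include <stddef.h>

#define MAXSEGS  4
#define MAXPAGES 4

/* Stand-in for a segment's paging tables. */
struct page_table {
    unsigned long frames[MAXPAGES];   /* physical or disk page numbers */
    size_t        npages;
};

/* A process holds handles (pointers) to the segments it can address. */
struct process {
    struct page_table *segs[MAXSEGS];
};

/*
 * Send segment `from_slot` of `src` to `dst`.  Only the handle moves; the
 * pages themselves are neither copied nor marked copy-on-write.
 * Returns the slot used in `dst`, or -1 if the send cannot be done.
 */
int seg_send(struct process *src, int from_slot, struct process *dst)
{
    int i;

    assert(src != NULL && dst != NULL);
    if (from_slot < 0 || from_slot >= MAXSEGS || src->segs[from_slot] == NULL)
        return -1;

    for (i = 0; i < MAXSEGS; i++) {
        if (dst->segs[i] == NULL) {
            dst->segs[i] = src->segs[from_slot];
            src->segs[from_slot] = NULL;    /* assumption: sender loses access */
            return i;
        }
    }
    return -1;                              /* receiver has no free slot */
}
```

The cost of a send in this sketch does not depend on the size of the
segment, which is the property the scheme relies on.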

Devices are seen as processes, so for example to print a file you
just send the handle to its paging tables to the printer process.
A tty is seen as a process; when you type a line it is wrapped in
a segment and sent to the process you have indicated to the tty
driver via an escape (yes, there is provision for efficient
handling of very small segments, less than a page worth). Each
process by default replies to the sender of the latest segment.

You can thus have multiple ttys interacting with a process, and
multiple processes waiting on a tty (doing away entirely with the
ridiculous, hackish job control idea from Bill Joy). The
processes need not be on the same machine as the tty, the
filesystem can be on another machine, and the printer on another
still (process names are network wide).

The entire kernel, which is substantially more powerful than the
Unix kernel, is a few dozen kilobytes on a VAX; this is because a
lot of things are not necessary, from the buffer cache to file
handling to job control, as they are subsumed in the virtual
memory plus segment passing mechanism.

MUSS ran on a large cluster of very different machines (a home-made
supercomputer, a mainframe, some minis, a few micros)
connected by something akin to a bus local area network, in the
early seventies, with remote and transparent login, filesystem,
print service, IPC...
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk