[comp.unix.wizards] What SHOULD go in the kernel

jwr@ektools.UUCP (Jim Reid) (10/17/89)

Unix wizards,

I have some general questions concerning writing Unix device drivers:

Is there a rule of thumb of what should and should not be put in the
kernel?  To be more specific, is it better to make a device driver 
'lean-and-mean' or sophisticated, so that the user interface (the read(),
write(), ioctl()) is simpler?

What are the pros and cons of each?

Thanks for your time.

-- 

Jim Reid
Eastman Kodak Company
UUCP:   ...!rochester!kodak!ektools!jwr

news@bbn.COM (News system owner ID) (10/18/89)

jwr@ektools.UUCP (Jim Reid) writes:
< Subject: What SHOULD go in the kernel
< 
< Is there a rule of thumb of what should and should not be put in the
< kernel?  To be more specific, is it better to make a device driver 
< 'lean-and-mean' or sophisticated, so that the user interface (the read(),
< write(), ioctl()) is simpler?

As little as possible?. (both a statement and a question)

First off, kernels are generally harder to debug than user programs,
so the less stuff you add there the better off you will be.  Also,
most kernels won't do VM on themselves (for several _good_ reasons
:-) ), so any extra code you put in the kernel will be sitting there
taking up space even if you don't need it right now.

On the other hand, it's much harder to do real-time-ish things in a
user program than in the kernel on most UNIXes.

Personally, I'd go for lean and mean just 'cause.  Very seldom is fat
and featureful better than lean and mean, especially in a kernel.
Compare, for instance, v9 and SunOS 4.  (yes, it _is_ an often
repeated cheap shot at Sun, but it's also _true_.)

	-- Paul Placeway <PPlaceway@bbn.com>
	   Am I a wizard?  Are you qualified to judge?  Does it really
	   matter in the end?  "What I am is what I am, are you what
	   you are or what?" -- E.B.

eeg@frame.UUCP (Eric Griswold) (10/18/89)

In article <2186@ektools.UUCP> jwr@ektools.UUCP (Jim Reid) writes:
]Unix wizards,
]
]I have some general questions concerning writing Unix device drivers:
]
]Is there a rule of thumb of what should and should not be put in the
]kernel?  

Rule of thumb that I heard from years back:

  If you have determined that there is ABSOLUTELY no way that something 
can be implemented as a user process (as opposed to part of the kernel),
then you need a second opinion.
  If the second opinion agrees with you, then you need to think about it
some more.
  If, after this much thought, there still is no way to implement it as 
a user process, consider putting it in the kernel.

------
  +---   Eric Griswold	| "Eat right. Sleep tight. Get goodly exercise, and
 .+--    eeg@frame.com	| life's full splendor will poke you in the eyes."
( +---)	 sun!frame!eeg	|----------------------------------------------------
 `---'	 ames!frame!eeg	| I just want to disclaim that last disclaimer...

chris@mimsy.umd.edu (Chris Torek) (10/18/89)

In article <2186@ektools.UUCP> jwr@ektools.UUCP (Jim Reid) writes:
>Is there a rule of thumb of what should and should not be put in the
>kernel?  To be more specific, is it better to make a device driver 
>'lean-and-mean' or sophisticated, so that the user interface (the read(),
>write(), ioctl()) is simpler?

There are two conflicting goals in device drivers, and right now the
standard Unix kernels (BSD and [as far as I know] Sys5) do not always
help.

The first goal is indeed the `lean-and-mean' approach.  By writing only
what is absolutely necessary for reliability, you can get the driver
debugged quickly, and you do not wire bad assumptions into it.  You
can then tune it for performance, if necessary and appropriate.

The second goal is device independence.  If you wrote a DH driver so
that its read() and write() routines simply did I/O, and user code
had to do things like

	int lp = 024;	/* 8 bits, even parity */
	ioctl(fd, DHIOC_SET_LINE_PARAM_REGISTER, &lp);

it would work, and it would make the driver simpler.  But you would
discover that you needed many different versions of `stty' in order to
run on DHs, DZs, DMFs, DMZs, DHUs, DHVs, etc.  Instead of being quite
so `lean', it is better to establish a common abstraction, such as
a `tty device', and make the same (abstracted) ioctls work on each
device to the best of that device's ability.
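
By way of contrast, here is a minimal sketch of the abstracted way to
ask for even parity, assuming the 4BSD sgtty ioctls (illustrative only;
a real stty does rather more than this):

	/*
	 * The same two ioctls work unchanged on a DH, DZ, DHU, or
	 * anything else whose driver implements the tty abstraction;
	 * each driver maps the request onto its own line parameter
	 * register.  A sketch, not a complete program.
	 */
	#include <sgtty.h>

	int
	set_even_parity(fd)
		int fd;
	{
		struct sgttyb sg;

		if (ioctl(fd, TIOCGETP, &sg) < 0)	/* fetch current modes */
			return (-1);
		sg.sg_flags |= EVENP;			/* ask for even parity... */
		sg.sg_flags &= ~ODDP;			/* ...and not odd */
		return (ioctl(fd, TIOCSETP, &sg));	/* driver does the rest */
	}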

Current Unix kernels offer a limited set of abstractions, and limited
support for each.  There are `block' (file system) devices and
`character' (other) devices, and these are further divided into `disk'
and `tape' block devices, and `tty' and `other' character devices.
(4BSD also has, to a limited extent, `tape' and `disk' devices, and it
has `network' devices that are neither block nor character [which is
another problem entirely].  Other kernels may have other classes of
character devices, e.g., Sun's `frame buffer' devices.)

I have rattling about in my head an improved system for grouping
devices based on abstractions, and if I ever sit down and write it, it
may appear in a future BSD release.  (But here I am reading news
instead of getting work done.)

Chris
-- 
`They were supposed to be green.'
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

jfh@rpp386.cactus.org (John F. Haugh II) (10/19/89)

In article <47040@bbn.COM> pplacewa@antares.bbn.com (Paul W. Placeway) writes:
>First off, kernels are generally harder to debug than user programs,
>so the less stuff you add there the better off you will be.  Also,
>most kernels won't do VM on themselves (for several _good_ reasons
>:-) ), so any extra code you put in the kernel will be sitting there
>taking up space even if you don't need it right now.

I don't have trouble with the first part, but discouraging paging
the kernel is kinda wasteful the way kernels keep bloating.

The primary restriction against a paging kernel is keeping the
paging code from being paged ;-).  After that, satisfying real-time
constraints, etc. will yield a collection of pages which must
be locked in memory as well.  What should be left in the list
of locked pages should only be lower halves of device drivers,
the VM manager, the pager, and the global data required by those.

Massive tables, seldom used device drivers, and one-time 
initialization code should all be candidates for the pager.
I paid for the memory, and damnit, I want to use it.

>On the other hand, it's much harder to do real-time-ish things in a
>user program than in the kernel on most UNIXes.

Granted.  However, much of what is in a UNIX kernel has no
real time requirements and should be paged out when not required.

Dynamically loadable device drivers are wonderful.  When can
we see dynamically unloadable device drivers ;-)
-- 
John F. Haugh II                        +-Things you didn't want to know:------
VoiceNet: (512) 832-8832   Data: -8835  | The real meaning of MACH is ...
InterNet: jfh@rpp386.cactus.org         |    ... Messages Are Crufty Hacks.
UUCPNet:  {texbell|bigtex}!rpp386!jfh   +--<><--<><--<><--<><--<><--<><--<><---

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/19/89)

In article <3596@frame.UUCP> eeg@frame.UUCP (Eric Griswold) writes:
>  If, after this much thought, there still is no way to implement it as 
>a user process, consider putting it in the kernel.

But first, think carefully about what service it is in GENERAL terms
that the kernel is not providing, and design an elegant implementation
that provides the GENERAL service, rather than just the specific need
that prompted the issue.

Of course, we were talking originally about device drivers.
My advice there is to keep them as simple as possible so long as
all the legitimate uses of the device are supported.

nagle@well.UUCP (John Nagle) (10/19/89)

       One could argue that device drivers don't belong in the kernel
at all.  With reasonable hardware support, no loss in protection is
implied by this.   The operating system must provide the mechanisms
to map one peripheral's I/O space into the space of the driver process,
and the memory management hardware must mediate accesses initiated by
the device itself (whether "DMA", "bus master operation", or "channel
program execution").

       Systems with these capabilities include IBM mainframes under VM, and
Apollo machines under Domain when equipped with the add-on box for user
supported peripherals.

       One also needs something like named pipes for communication between
applications and device drivers.  This intercommunication mechanism must
include 1) bidirectional I/O 2) out-of-band control messages ("IOCTLs"),
3) the capability of one end to verify the identity and security status
of the other end, and 4) the ability of one end to detect termination
of the other end.
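
A minimal sketch of what one message on such a channel might look like
(the names are hypothetical; no existing UNIX defines them, this only
illustrates the four requirements above):

	struct drv_msg {
		int	dm_type;	/* DM_DATA, DM_CONTROL ("ioctl"), or
					   DM_HANGUP (other end terminated):
					   requirements 1, 2, and 4 */
		int	dm_uid;		/* sender's identity, filled in by the
					   OS, not by the sender: requirement 3 */
		int	dm_len;		/* bytes of payload that follow */
		char	dm_data[512];	/* data or control payload */
	};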

       With capabilities like this, you can kick the device drivers,
terminal handling, networking, and file system out of the kernel.

       Unfortunately, UNIX isn't designed to work this way, and the
success of UNIX has resulted in the decline of hardware support for
this sort of thing.  The result is the bloated kernels of today.

					John Nagle

rcd@ico.isc.com (Dick Dunn) (10/20/89)

jfh@rpp386.cactus.org (John F. Haugh II) writes:
> ...pplacewa@antares.bbn.com (Paul W. Placeway) writes:
> >...Also,
> >most kernels won't do VM on themselves (for several _good_ reasons
> >:-) ),...
> ... discouraging paging
> the kernel is kinda wasteful the way kernels keep bloating.

Gack.  John is right that kernels keep expanding (rapidly!), and that it
does waste memory (real memory which costs real $) to keep all that dreck
resident.  The kernels of the Brave New Open International World are going
to be FAT!  More accurately, they won't be kernels even in any reasonable
stretch (!) of the term.

But do we have to accept that?  At least the problems caused by bloating
kernels might be slowing the growth a bit.  I'm afraid that if you let the
"kernel" page, you really open it up
wide--there's no reason to think twice about putting EVERYthing in the
kernel and turning it into another MVS.

Is there any way to induce a change in viewpoint?  Why not change the
perception of the problem from "we need a way to handle an ever-expanding
kernel" to "we need to stop the expansion of the kernel."  (Yes, I know, it
doesn't work quite that way--you need to restructure it dramatically and
throw large pieces OUT of the kernel.)

Also, since as John pointed out there's only part of the kernel that could
be pageable, why not call the non-pageable part "the kernel" and put the
pageable parts in something called "user-level code"?  The only loss I see
in doing this is that there will be people who won't be able to stroke
their egos by calling themselves "kernel programmers".  (Of course, they're
just the folks *I* don't want messing around in the kernel.)
-- 
Dick Dunn     rcd@ico.isc.com    uucp: {ncar,nbires}!ico!rcd     (303)449-2870
   ...No DOS.  UNIX.

peter@ficc.uu.net (Peter da Silva) (10/20/89)

In article <17166@rpp386.cactus.org> jfh@rpp386.cactus.org (John F. Haugh II) writes:
> I don't have trouble with the first part, but discouraging paging
> the kernel is kinda wasteful the way kernels keep bloating.

So redefine what the "kernel" is, like Mach does.
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"You can tell when a USENET discussion is getting old when one of the      'U`
 participants drags out Hitler and the Nazis" -- Richard Sexton

trt@rti.UUCP (Thomas Truscott) (10/20/89)

>        One could argue that device drivers don't belong in the kernel
> at all.

As device drivers continue to bloat in number and size,
and as hardware becomes more sophisticated,
this argument gains strength.
The NeXT Mach 1.0 operating system supports loadable device drivers.
The MIDI interface, and other things like a SLIP (RS-232 TCP/IP) driver,
are done that way.  The driver is dynamically linked to the kernel,
at which point it functions like an ordinary driver.
It can later be dynamically unlinked.  Pretty slick.
This makes kernel relinking unnecessary (indeed it is not supported).

Unfortunately, vanilla NeXT 1.0 does not have documentation
(or, it seems, the needed tools) for writing one's own loadable driver.
We dearly need this feature to add our favorite device driver (Freedomnet)
to the NeXT box.  Perhaps this will be fixed in a newer release?
	Tom Truscott

matt@oddjob.uchicago.edu (Matt Crawford) (10/20/89)

Dick Dunn:
)   I'm afraid that if you let the "kernel" page, you really open it up
) wide--there's no reason to think twice about putting EVERYthing in the
) kernel and turning it into another MVS.

Under Elxsi's OS, "embos", even the page tables can be paged.  Even the
page tables for the paged-out page table entries can be paged out.  (But
it does stop at the third level of page table entries.)
________________________________________________________
Matt Crawford	     		matt@oddjob.uchicago.edu

news@bbn.COM (News system owner ID) (10/21/89)

jfh@rpp386.cactus.org (John F. Haugh II) writes:
< The primary restriction against a paging kernel is keeping the
< paging code from being paged ;-).  After that, satisfying real-time
< constraints, etc. will yield a collection of pages which must
< be locked in memory as well.  What should be left in the list
< of locked pages should only be lower halves of device drivers,
< the VM manager, the pager, and the global data required by those.
< 
< Massive tables, seldom used device drivers, and one-time 
< initialization code should all be candidates for the pager.
< I paid for the memory, and damnit, I want to use it.

Actually, I guess there are two ways of looking at this.  The first is
you want your kernel to stay up even if your swap device fails.  If
true, then there is obviously no way to let part of your kernel swap
out.

On the other hand, if you consider the machine dead when a swap device
dies, then swapping out the kernel is fair enough.  If your kernel
could do this, then demand-loadable device drivers would be less
needed (although still nice to have for other good reasons).  As John
indicates, a kernel that ran this way would probably be much more
memory efficient (not a bad thing, considering some of today's kernels).

A sort of work-around for this is to have a bunch of user-level kernel
processes that do most of the work (like tty processing), and let them
get paged and swapped out when not in use.  This is even a performance
win for some things (tty drivers, among others).

		-- Paul <PPlaceway@bbn.com>

guy@auspex.auspex.com (Guy Harris) (10/22/89)

 >>        One could argue that device drivers don't belong in the kernel
 >> at all.
 >
 >As device drivers continue to bloat in number and size,
 >and as hardware becomes more sophisticated,
 >this argument gains strength.

Yes, but...

 >The NeXT Mach 1.0 operating system supports loadable device drivers.
 >The MIDI interface, and other things like a SLIP (RS-232 TCP/IP) driver,
 >are done that way.  The driver is dynamically linked to the kernel,
 >at which point it functions like an ordinary driver.

...down to being a part of the kernel.

Sorry, just making drivers loadable into and, possibly, unloadable from
the kernel doesn't keep them from being in the kernel - it just makes it
easier to control which ones you have in your particular kernel.

peter@ficc.uu.net (Peter da Silva) (10/23/89)

The distinction between whether something goes in the kernel or runs as
a separate process comes down to two considerations:

	1. Protection: does this thing have to violate normal process
		protection mechanisms?
	2. Performance: this comes down in turn to two considerations:
		a. Realtime activities: for example, doing a streaming tape
			driver is a pretty hard realtime problem.
		b. Throughput considerations: excessive context switches
			lowering system performance to an unacceptable
			amount.

As time goes on, the protection mechanisms get more complex and capable.
Shared memory, PTYs, and so on allow stuff that used to live in the kernel
(console drivers) to perform their jobs within the protection mechanism.
Look at XTERM, for example.

In Mach, with user pagers and the like, this requirement is about dead.

As processors become faster, throughput questions become less meaningful.
Once upon a time canonical tty processing was one of those things that had
to be in the kernel. Again, XTERM is an example of an activity that has
moved outside the kernel... because processors can afford it, as well as because
PTYs exist.

The final barrier is realtime activities. UNIX is not a realtime system. To
some extent this can be glossed over as non-real-time performance becomes
fast enough. Still, real realtime support is needed before the kernel can
be completely flushed of alien material... a 20 megahertz 80386 is not fast
enough to handle XON/XOFF processing in a user process (a problem I'm currently
trying to deal with), let alone hard problems like tape drives, disk drives,
and networks.
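
(Roughly the sort of per-character work involved, as a sketch only --
in the simplest form, one read() and one comparison per byte coming off
the line.  'mfd' is assumed to be the master side of a pty feeding the
real line:)

	#define	XOFF	0023		/* ^S: stop output */
	#define	XON	0021		/* ^Q: resume output */

	void
	watch_flow(mfd, stopped)
		int mfd, *stopped;
	{
		unsigned char c;

		while (read(mfd, &c, 1) == 1) {
			if (c == XOFF)
				*stopped = 1;	/* the writer checks this flag */
			else if (c == XON)
				*stopped = 0;
		}
	}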

One of these days, though...
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"I feared that the committee would decide to go with their previous        'U`
 decision unless I credibly pulled a full tantrum." -- dmr@alice.UUCP

smb@ulysses.homer.nj.att.com (Steven M. Bellovin) (10/23/89)

In article <14163@well.UUCP>, nagle@well.UUCP (John Nagle) writes:
> 
>        One could argue that device drivers don't belong in the kernel
> at all.  With reasonable hardware support, no loss in protection is
> implied by this.   The operating system must provide the mechanisms
> to map one peripheral's I/O space into the space of the driver process,
> and the memory management hardware must mediate accesses initiated by
> the device itself (whether "DMA", "bus master operation", or "channel
> program execution").
> 
>        Systems with these capabilities include IBM mainframes under VM...

In fact, VM/370 does a rather poor job at it.  More precisely, given the
System/370 I/O architecture -- which was not designed for virtualization,
or even (originally) for virtual memory -- analyzing the channel programs
for safety and correctness, and constructing the proper emulation is
hard and expensive.  There have been microcode assists added over the
years to aid VM, but I'm not sure if any of them help with arbitrary
I/O requests.  Fielding the interrupts is even harder.

This doesn't invalidate Nagle's basic point, but allowing
user-level programs raw access to I/O space is quite complex architecturally.

samlb@pioneer.arc.nasa.gov (Sam Bassett RCD) (10/24/89)

	Nahhh -- let's call what's not pageable "the kernel", what is
pageable "the supervisor", and everything else "user level" code --
that will give the headhunters yet another category of people to look
for.


Sam'l Bassett, Sterling Software @ NASA Ames Research Center, 
Moffett Field CA 94035 Work: (415) 694-4792;  Home: (415) 969-2644
samlb@well.sf.ca.us                     samlb@ames.arc.nasa.gov 
<Disclaimer> := 'Sterling doesn't _have_ opinions -- much less NASA!'

dtynan@altos86.Altos.COM (Dermot Tynan) (10/27/89)

In article <17166@rpp386.cactus.org>, jfh@rpp386.cactus.org (John F. Haugh II) writes:
> The primary restriction against a paging kernel is keeping the
> paging code from being paged ;-).
> [...] seldom used device drivers, [...] should all be candidates for
> the pager.

This is not entirely accurate.
Another reason for not paging the kernel is instruction restart within a
device driver.  A classic example is a UART with a FIFO.  Allowing
instruction restart after a page fault, while the driver is reading from the
UART and writing to (pageable) memory, will create havoc.  Intel products
are insulated from this, because they have a separate I/O bus, which means
that I/O can only be done to an on-chip register.  However, memory-mapped
I/O will fail horribly.
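
(A sketch of the hazard, with a made-up memory-mapped UART whose data
register pops one byte off its FIFO on every read; the address and names
are invented for illustration:)

	volatile unsigned char *uart_data = (unsigned char *) 0xfff01000;

	void
	drain_fifo(dst, n)
		unsigned char *dst;	/* points into pageable memory */
		int n;
	{
		while (n-- > 0)
			*dst++ = *uart_data;	/* if the store faults and the
						   whole instruction is re-run,
						   the second read of *uart_data
						   silently drops a byte */
	}
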
It would probably make a big difference if people who had kernel link-kits
would remove all the junk they didn't need.  On top of that, if marketing
types would take a big deep breath, and decide on ONE topology, the kernel
wouldn't need to be so big.  I mean, why does S5R4 (or 4.4BSD for that
matter) need TCP/IP *and* ISO, RFS *and* NFS, etc, etc.
						- Der
-- 
	dtynan@altos86.Altos.COM		(408) 946-6700 x4237
	Dermot Tynan,  Altos Computer Systems,  San Jose, CA   95134

    "Far and few, far and few, are the lands where the Jumblies live..."

jfh@rpp386.cactus.org (John F. Haugh II) (10/27/89)

In article <3718@altos86.Altos.COM> dtynan@altos86.Altos.COM (Dermot Tynan) writes:
>This is not entirely accurate.

No, because you didn't bring all of the context with the quotation.
An early response stated that interrupt handler lower halves needed
to be locked in memory.

>Another reason for not paging the kernel, is instruction restart within a
>device driver.  A classic example is a UART with a FIFO.  Allowing
>instruction restart after a page-fault, when the driver is reading from the
>UART, and writing to (pageable) memory will create havoc.  Intel products
>are insulated from this, because they have a separate I/O bus, which means
>that I/O can only be done to an on-chip register.  However, memory-mapped
>I/O will fail horribly.

This may or may not be true depending on the CPU and how instruction
restart is handled.  If no memory cycles are duplicated on restart
[ i.e., a read cycle early in the instruction will not be re-executed
during page fault processing ] you probably won't lose.  Not all CPUs
get this wrong!  It is conceivable that enough state information is
stacked on the exception to permit the instruction to resume from the
same state [ this requires dumping gobs of internal state onto the
stack, including microcode registers, etc ].

The difference is whether the instruction is re-run, or re-started.
I've never written page fault handling code for the MC68020, but I
understand it dumps 40 or 50 words of information on the stack and
after the return from the page fault handler picks up at the same
exact microinstruction [ more or less, no doubt ].  The implication is
that no steps are repeated [ and you can run all of your programs in one
page of memory ;-) ].

>It would probably make a big difference, if people who had kernel link-kits,
>would remove all the junk they didn't need.  On top of that, if marketing
>types would take a big deep breath, and decide on ONE topology, the kernel
>wouldn't need to be so big.  I mean, why does S5R4 (or 4.4BSD for that
>matter) need TCP/IP *and* ISO, RFS *and* NFS, etc, etc.

Yes.  Better still is dynamic configuration at boot time with a
minimal configuration set up by default during installation.
-- 
John F. Haugh II                        +-Things you didn't want to know:------
VoiceNet: (512) 832-8832   Data: -8835  | The real meaning of EMACS is ...
InterNet: jfh@rpp386.cactus.org         |   ... EMACS makes a computer slow.
UUCPNet:  {texbell|bigtex}!rpp386!jfh   +--<><--<><--<><--<><--<><--<><--<><---

jc@minya.UUCP (John Chambers) (10/29/89)

In article <47040@bbn.COM>, news@bbn.COM (News system owner ID) writes:
> jwr@ektools.UUCP (Jim Reid) writes:
> < Is there a rule of thumb of what should and should not be put in the
> < kernel?  To be more specific, is it better to make a device driver 
> < 'lean-and-mean' or sophisticated, so that the user interface (the read(),
> < write(), ioctl()) is simpler?
> 
> As little as possible?. (both a statement and a question)
> 
> First off, kernels are generally harder to debug than user programs,
> so the less stuff you add there the better off you will be.  Also,
> most kernels won't do VM on themselves (for several _good_ reasons
> :-) ), so any extra code you put in the kernel will be sitting there
> taking up space even if you don't need it right now.

[This is too good to pass up. ;-]  I'd like to observe that, though
this may be correct from a software-engineering point of view, it is
incorrect from a career-interest point of view.  I've made the mistake
of minimizing the kernel work on quite a few projects.  Now when I go
to interviews, it is clear what the result is.  Such design isn't ever
taken as evidence of practicality, good engineering practice, or any such
thing.  It is merely evidence that I am a Unix kernel lightweight.  At
a time when Unix-kernel/device-driver experts are getting roughly 50%
more bucks than those who work at the "application" level, it is clear
what a fool I've been.

Lately, I've been following people's advice here, and writing programs
that grovel around in a filesystem's raw device.  This could turn out 
to be a bad idea.  Once again, I've done it outside the kernel.  But
I have an excuse:  This is an object-only system.  If I had the source,
you bet I'd do it in the kernel (especially since in this case, it'd 
be easier).

> Personally, I'd go for lean and mean just 'cause.  Very seldom is fat
> and featureful better than lean and mean, especially in a kernel.
> Compare, for instance, v9 and SunOS 4.  (yes, it _is_ an often
> repeated cheap shot at Sun, but it's also _true_.)

In particular, there's a good demand for people with significant Sun
internals experience.  It can be hard to get this experience, due to
the shortage of source code at most Sun customer sites.  If you have
it available, the sensible advice is "Go for it."  If you have doubts
about whether it's for the best of your current employers, you might
try asking them whether they'd pay more for a Sun internals expert than
for a Sun applications programmer.  Then go with their answer.  

> 	-- Paul Placeway <PPlaceway@bbn.com>
> 	   Am I a wizard?  Are you qualified to judge?  Does it really
> 	   matter in the end?  "What I am is what I am, are you what
> 	   you are or what?" -- E.B.

In the current economy, much judging is done by those unqualified to
judge.  That's why there's so much dependence on credentials.  It's
hard to see how it could be any other way in a field with such rapid
technology change.

In response to the expected flames, I'll just pre-ask a pertinent
question:  If you want a job done right, shouldn't you be rewarding
those who do it right, rather than those that do it the hard way?

(I might also observe that Mach may be bad news for kernel experts; it
seems they've moved a lot of stuff out of the kernel.... ;-)

-- 
#echo 'Opinions Copyright 1989 by John Chambers; for licensing information contact:'
echo '	John Chambers <{adelie,ima,mit-eddie}!minya!{jc,root}> (617/484-6393)'
echo ''
saying

gil@banyan.UUCP (Gil Pilz@Eng@Banyan) (10/31/89)

In article <3718@altos86.Altos.COM> dtynan@altos86.Altos.COM (Dermot Tynan) writes:
>Another reason for not paging the kernel, is instruction restart within a
>device driver.  A classic example is a UART with a FIFO.  Allowing
>instruction restart after a page-fault, when the driver is reading from the
>UART, and writing to (pageable) memory will create havoc.  Intel products
>are insulated from this, because they have a separate I/O bus, which means
>that I/O can only be done to an on-chip register.  However, memory-mapped
>I/O will fail horribly.

So page in (if necessary) and lock down the target page(s) *before*
starting the I/O, then unlock them on I/O completion.  What's the
problem ?
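
In outline (lock_pages(), unlock_pages(), start_io(), and await_io() are
hypothetical stand-ins for whatever primitives the kernel provides --
4.3BSD's physio() does essentially this with vslock() and vsunlock()):

	int
	raw_transfer(dev, base, count)
		int dev, count;
		char *base;
	{
		int error;

		lock_pages(base, count);	/* fault in and wire the buffer */
		error = start_io(dev, base, count); /* device can now touch it
						       without taking page faults */
		if (error == 0)
			error = await_io(dev);
		unlock_pages(base, count);	/* let the pages float again */
		return (error);
	}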

"forty orange cookies, some are black some are white
 what would it take to make 'em all turn *just* *right* ?
 forty orange cookies sitting on the bed
 one took off, the others followed
 went straight for my head"
	- house of large sizes

Gilbert W. Pilz Jr.       gil@banyan.com