[comp.arch] ZFOD before COW

aglew@urbsdc.Urbana.Gould.COM (04/27/88)

This is more of an OS question than an architecture question
(although it does have implications for large sparse memory
applications), but I'm going to cross-post:

Does anyone know of systems that implement zero-fill-on-demand
copy-on-write between pages belonging to the same process?

Background: ZFOD pages are pages that, when referenced, are 
allocated and then filled with zeroes. UNIX BSS is implemented
this way.
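
A minimal sketch of the behaviour, in present-day C for illustration
only: the uninitialized array below lives in BSS, contributes nothing
to the size of the executable, and a physical page is allocated and
zero-filled only when first touched.

    #include <stdio.h>

    static char huge[16 * 1024 * 1024];     /* BSS: no pages exist yet */

    int main(void)
    {
        /* First touch of this page takes a zero-fill fault; the rest
           of the array still has no physical pages behind it. */
        printf("%d\n", huge[1234567]);      /* prints 0 */
        return 0;
    }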

COW pages are logically distinct pages that initially contain the
same data; they are set up to point to the same physical page,
write-protected. The first write to either one faults, and a private,
non-shared copy of the page is created for the writer. COW is used in
some UNIXes to implement the address-space copy done by fork(), which
creates two identical processes.
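
For concreteness, a small sketch of the semantics fork() must
preserve; whether the kernel copies the address space eagerly or
lazily via COW is invisible to the program.

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int value = 1;                  /* shared page until someone writes */

    int main(void)
    {
        pid_t pid = fork();
        if (pid == 0) {
            value = 2;              /* write fault: child gets its own copy */
            _exit(0);
        }
        wait(NULL);
        printf("%d\n", value);      /* prints 1: parent's page untouched */
        return 0;
    }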

(While we are at it: copy-on-reference pages in virtual memory can be
used to improve locality as processes migrate through a loosely
coupled distributed system, or across a system of memory busses in a
multiprocessor where accessing a page on another bus is more
expensive than accessing one on your own; they migrate memory close
to where it is being used without copying the entire address space.)

Reason for the question: I'm interested in an application that uses
a very large, very sparse address space. The non-sparse regions are
clumpy, so there will be large contiguous regions of zeroes.
However, the probe density is high - many of those unused regions
will be read, but not written to.
    Simple ZFOD isn't satisfactory, since real pages will be allocated
and zero-filled as I read from them. What I really want is to map
all the pages in my address space to a single page of zeroes,
COWing if I write to a particular page.
    Since I do not know how much address space I need (i.e. how
many PTEs need to be allocated), I would like to first take a fault
on reference, allocate the PTEs pointing at the page of zeroes,
be able to read from those pages without further faults, and then
COW on a write fault. Thus: zero-point-on-read, before COW.
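
In kernel terms, the behaviour being asked for looks roughly like
the sketch below. All of the names (map_page, shared_zero_page,
alloc_zeroed_page, and so on) are hypothetical; this is not any
particular system's fault handler.

    #define PROT_READ   0x1
    #define PROT_WRITE  0x2

    struct pte { int valid; void *frame; int prot; };

    extern void *shared_zero_page;             /* the one page of zeroes */
    extern void *alloc_zeroed_page(void);      /* a fresh zeroed page */

    static void map_page(struct pte *pte, void *frame, int prot)
    {
        pte->frame = frame;
        pte->prot  = prot;
        pte->valid = 1;
    }

    void handle_fault(struct pte *pte, int is_write)
    {
        if (!pte->valid) {
            if (!is_write) {
                /* Read of an untouched page: point it at the shared
                   page of zeroes, read-only, so later reads of this
                   page take no further faults. */
                map_page(pte, shared_zero_page, PROT_READ);
            } else {
                /* First touch is a write: skip the zero-page step. */
                map_page(pte, alloc_zeroed_page(), PROT_READ | PROT_WRITE);
            }
        } else if (is_write && pte->frame == shared_zero_page) {
            /* COW off the zero page: a fresh zeroed page will do;
               nothing actually has to be copied. */
            map_page(pte, alloc_zeroed_page(), PROT_READ | PROT_WRITE);
        }
    }
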
    Obviously, I am interested in OS support for this, since it's an
application. Can anybody point me to systems that already do it?
(And boy, won't I be embarrassed if one of the systems that I have
source access to already does. But anyway...)

aglew@gould.com

mwyoung@f.gp.cs.cmu.edu (Michael Young) (05/05/88)

The original Mach implementation of zero-fill-on-demand memory
worked exactly as "aglew@gould.com" asked.  One "zero fill" memory
object, containing one zero-filled page, was used to back all
read-only requests.  Two performance issues led us to change:

	Very few zero-fill pages are read but not written.  The
	extra memory fault (to copy-on-write a zero-fill page)
	cost too much for almost all applications.  [Furthermore,
	the page was copied from the original zero-fill page
	(just like any other copy-on-write operation), rather
	than zero-filling.  This added cost could be optimized
	away, but we didn't.]

	Many architectures make sharing a single physical page
	at several virtual addresses undesirable.  For example,
	the IBM RT/PC only allows page sharing at the segment level.
	To use one physical page would cause a fault on each
	new virtual address.

So, each zero-fill fault gets a fresh page.  However, these pages
are not "dirty" and are easily reclaimed during pageout; no backing
storage is wasted.  Still, the cost of zero-filling at all is high.
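
A sketch of why such pages are cheap to reclaim (with hypothetical
names, not Mach's actual pageout code): a zero-fill page that was
never dirtied has no contents worth writing out, so reclaiming it is
just an unmap and free, and the next reference simply takes another
zero-fill fault.

    struct page { int zero_filled; int dirty; };

    extern void unmap_and_free(struct page *p);
    extern void write_to_backing_store(struct page *p);

    void reclaim(struct page *p)
    {
        if (p->zero_filled && !p->dirty) {
            unmap_and_free(p);          /* nothing to save; no paging I/O */
            return;
        }
        if (p->dirty)
            write_to_backing_store(p);  /* ordinary dirty page */
        unmap_and_free(p);
    }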

Several simple changes to Mach could be made to accommodate this
application:

	Mark zero-fill pages as such until they're dirtied, and possibly
	keep them on a separate free page list.  This makes it unlikely
	that it's necessary to zero-fill a new page.  This may be all you
	can do on an architecture with the virtual address aliasing problem.

	Restore the old "zero fill object", and use it if the architecture
	allows.  This involves hackery in the fault handler to reduce
	all pages in that object to the single "zero fill page".

	Manage a "user-defined" zero-fill object using the Mach external
	memory management interface.  If built into the kernel, the manager
	could cheat to use the same physical page for all virtual addresses.
	[This cheating is similar in form to how drivers for memory-mapped
	devices fit into the Mach VM system.]
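
A conceptual sketch of that last suggestion. The names below are
invented for illustration; they are not the real Mach external
memory management interface, which has its own calls and
conventions.

    typedef struct memory_object memory_object;   /* opaque stand-in */

    extern void *shared_zero_page;                 /* one page of zeroes */
    extern void supply_page(memory_object *obj, unsigned long offset,
                            void *page, int writable);

    /* Called (conceptually) when the kernel needs data for a page of
       the object that is not yet resident.  Every page of a zero-fill
       object "contains" zeroes, so hand back the shared zero page,
       read-only; the ordinary copy-on-write machinery handles the
       first write. */
    void zero_pager_data_request(memory_object *obj, unsigned long offset)
    {
        supply_page(obj, offset, shared_zero_page, 0 /* read-only */);
    }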

For information on obtaining Mach sources, send mail to "mach@wb1.cs.cmu.edu".
Technical comments or questions can be directed to "info-mach@wb1.cs.cmu.edu".

			Michael

aglew@urbsdc.Urbana.Gould.COM (05/10/88)

Thanks for the lengthy posting about Mach
wrt ZFOD before COW.