[comp.unix.wizards] ZFOD before COW

aglew@urbsdc.Urbana.Gould.COM (04/27/88)

/* Written  6:37 pm  Apr 26, 1988 by aglew@urbsdc.Urbana.Gould.COM in urbsdc:comp.arch */
/* ---------- "ZFOD before COW" ---------- */
This is more of an OS question than an architecture question
(although it does have implications for large sparse memory
applications), but I'm going to cross-post:

Does anyone know of systems that implement zero-fill-on-demand
copy-on-write between pages belonging to the same process?

Background: ZFOD pages are pages that, when referenced, are 
allocated and then filled with zeroes. UNIX BSS is implemented
this way.

COW pages are two pages that are logically distinct, yet contain
the same data initially; so, they are created to point to the
same page, but write-protected; if there is ever a write that
causes them to become distinct, it faults, and a non-shared copy
of the page is created. COW is used in some UNIXes to implement
the address space copy of a fork() that creates two identical 
processes.
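
The COW behaviour described above can be sketched as a small user-space simulation. This is only an illustrative model, not any particular kernel's implementation: the names `Frame`, `AddressSpace`, `map_page`, and the per-frame reference count are hypothetical, and the "fault" is simulated by an ordinary `if`.

```python
class Frame:
    """A physical page frame with a reference count (hypothetical model)."""
    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1


class AddressSpace:
    """Toy page table: virtual page number -> [frame, writable flag]."""
    def __init__(self):
        self.pages = {}

    def map_page(self, vpn, data):
        self.pages[vpn] = [Frame(data), True]

    def fork(self):
        # Share every frame with the child and write-protect both mappings.
        child = AddressSpace()
        for vpn, entry in self.pages.items():
            frame = entry[0]
            entry[1] = False
            frame.refs += 1
            child.pages[vpn] = [frame, False]
        return child

    def read(self, vpn, offset):
        return self.pages[vpn][0].data[offset]

    def write(self, vpn, offset, value):
        entry = self.pages[vpn]
        frame, writable = entry
        if not writable:                    # simulated write-protection fault
            if frame.refs > 1:              # still shared: make a private copy
                frame.refs -= 1
                frame = Frame(frame.data)
                entry[0] = frame
            entry[1] = True                 # sole owner may now write in place
        frame.data[offset] = value


parent = AddressSpace()
parent.map_page(0, b"hello")
child = parent.fork()
shared_before = parent.pages[0][0] is child.pages[0][0]
child.write(0, 0, ord("j"))                 # triggers the copy
shared_after = parent.pages[0][0] is child.pages[0][0]
```

In this model the copy happens only on the first write to a shared page; the last remaining owner of a frame simply regains write permission without copying.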

(While we are at it, copy-on-reference pages in virtual memory 
can be used to improve locality as processes migrate through 
a loosely connected distributed system, or even across a system
of memory busses in a multiprocessor (where accessing a page on
another bus is more expensive than on your own): memory migrates
close to where it is being used, without copying the entire
address space.)

Reason for the question: I'm interested in an application that uses
a very large very sparse address space. The non-sparse regions are
clumpy, so there will be large contiguous regions of zeroes.
However, the probe density is high - many of those unused regions
will be read, but not written to.
    Simple ZFOD isn't satisfactory, since real pages will be allocated
and zero-filled as I read from them. What I really want is to 
allocate all the pages in my address space pointing to a single
page of zeroes, COWing if I write to a particular page.
    Since I do not know how much address space I need (i.e., how
many PTEs need to be allocated), I would like to first take a fault
on reference, allocate the PTEs, pointing to the page of zeroes,
be able to read from those pages without further faults, and then
COW on a write fault. Thus, zero-point-on-read, before COW.
    Obviously, I am interested in OS support for this, since it's
an application. Can anybody point me to systems that already do it?
(And boy, won't I be embarrassed if one of the systems that I have
source access to does! But anyway...)
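
The "zero-point-on-read, before COW" scheme described above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not an OS interface: `SparseSpace`, `PAGE_SIZE`, the fault counter, and the dictionary standing in for the PTEs are all hypothetical names chosen for illustration.

```python
PAGE_SIZE = 4096                 # assumed page size, not taken from the post
ZERO_PAGE = bytes(PAGE_SIZE)     # one shared, immutable page of zeroes


class SparseSpace:
    """Toy address space: vpn -> ZERO_PAGE or a private bytearray."""
    def __init__(self):
        self.pte = {}
        self.frames_allocated = 0    # private frames actually allocated
        self.faults = 0              # simulated page faults taken

    def read(self, addr):
        vpn, off = divmod(addr, PAGE_SIZE)
        page = self.pte.get(vpn)
        if page is None:             # first reference: point PTE at zero page
            self.faults += 1
            page = self.pte[vpn] = ZERO_PAGE
        return page[off]             # later reads take no further fault

    def write(self, addr, value):
        vpn, off = divmod(addr, PAGE_SIZE)
        page = self.pte.get(vpn)
        if page is None or page is ZERO_PAGE:    # write fault: COW
            self.faults += 1
            page = self.pte[vpn] = bytearray(ZERO_PAGE)  # private copy
            self.frames_allocated += 1
        page[off] = value


space = SparseSpace()
for vpn in range(1000):              # probe 1000 pages by reading only
    space.read(vpn * PAGE_SIZE)
space.write(5 * PAGE_SIZE + 3, 7)    # a single write allocates a single frame
```

In this model the 1000 read probes cost 1000 faults but no private frames; only the one written page gets real memory, whereas simple ZFOD would have allocated a frame for every page probed.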

aglew@gould.com


/* End of text from urbsdc:comp.arch */

jc@minya.UUCP (John Chambers) (05/02/88)

In article <57900012@urbsdc>, aglew@urbsdc.Urbana.Gould.COM writes:
> Does anyone know of systems that implement zero-fill-on-demand
> copy-on-write between pages belonging to the same process?
> 
> Background: ZFOD pages are pages that, when referenced, are 
> allocated and then filled with zeroes. 

Yeah, Burroughs large systems do this.  Their hardware addresses
arrays through a pointer (with base-address and size fields).  
In fact, when a program starts up, such pointers (or descriptors,
in Burroughs parlance) are initialized to have an address of zero,
and the 'presence' flag bit is off, causing a page fault when you
attempt to reference the array.  The paging system notes that the
array hasn't been allocated, allocates it, fills it with zeroes,
modifies the descriptor, and returns to repeat the operation.

One nice effect of this is that arrays don't even exist until you
reference them.  For multiply-dimensioned arrays, each row is a
separately-allocated block.  Thus you might declare a 1000-by-1000
array, and not have the memory.  But if you only used rows 1, 3, 
and 17, then only those three rows (plus a row of descriptors for
the first subscript) would be allocated.
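
The row-at-a-time allocation described above can be imitated in software. This is a rough sketch only: `LazyMatrix` and its methods are hypothetical names, and a `None` entry stands in for a descriptor whose presence bit is off; it does not reproduce the actual Burroughs descriptor format.

```python
class LazyMatrix:
    """2-D array whose rows are allocated and zero-filled on first touch."""
    def __init__(self, rows, cols):
        self.cols = cols
        # One descriptor per row; None models "presence bit off".
        self.row_desc = [None] * rows

    def _row(self, i):
        row = self.row_desc[i]
        if row is None:                  # simulated presence-bit fault
            row = self.row_desc[i] = [0] * self.cols
        return row

    def get(self, i, j):
        return self._row(i)[j]

    def set(self, i, j, value):
        self._row(i)[j] = value

    def rows_allocated(self):
        return sum(r is not None for r in self.row_desc)


m = LazyMatrix(1000, 1000)
for r in (1, 3, 17):
    m.set(r, 0, 42)
```

Note that in this sketch, as in the description above, a read of an untouched row also allocates it; nothing here shares a common page of zeroes.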

This technique isn't usable on most machines, as it sort of requires
some special hardware to make it work.  Well, you actually could
use their array representation on most machines with hardware MMUs,
but I've never heard of anyone doing it.  I'd be interested in being
proved ignorant....


-- 
John Chambers <{adelie,ima,maynard,mit-eddie}!minya!{jc,root}> (617/484-6393)

You can't make a turtle come out.
	-- Malvina Reynolds

aglew%fang@xenurus.gould.com (Andy Glew) (05/05/88)

>In fact, when a program starts up, such pointers (or descriptors,
>in Burroughs parlance) are initialized to have an address of zero,
>and the 'presence' flag bit is off, causing a page fault when you
>attempt to reference the array.  The paging system notes that the
>array hasn't been allocated, allocates it, fills it with zeroes,
>modifies the descriptor, and returns to repeat the operation.

This is zero-fill-on-demand, which many UNIX machines implement.

What I was asking about was ZFOD followed by COW.

I.e., initialize all pages nonallocated, nonpresent.  Take a page fault
on reference.  If the reference was a read, allocate the PTEs,
pointing to a single globally read-only page of zeroes.  On a write
access, actually allocate the page.

I have received much mail from people who misunderstood what I was
asking for. Obviously, I did not describe the problem clearly enough.
The most useful suggestion was that user-writable page fault handlers
in Mach can do this for you.