[comp.os.research] ZFOD before COW

aglew@urbsdc.Urbana.Gould.COM (Aglew) (04/27/88)

This is more of an OS question than an architecture question
(although it does have implications for large sparse memory
applications), but I'm going to cross-post:

Does anyone know of systems that implement zero-fill-on-demand
copy-on-write between pages belonging to the same process?

Background: ZFOD pages are pages that, when referenced, are 
allocated and then filled with zeroes. UNIX BSS is implemented
this way.

COW pages are pages that are logically distinct, yet initially
contain the same data. They are set up to map the same physical
page, write-protected; the first write that would make them differ
faults, and a non-shared copy of the page is created for the
writer. COW is used in some UNIXes to implement the address space
copy of a fork() that creates two identical processes.
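
To make the COW bookkeeping concrete, here is a rough toy model in
Python. It is only a sketch, not any particular kernel's code: the
names Frame and Process and the reference-count scheme are
assumptions made up for illustration.

```python
class Frame:
    """A physical page frame plus a count of the mappings sharing it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1


class Process:
    """Toy address space: page number -> Frame (hypothetical model)."""

    def __init__(self, pages=None):
        self.pages = pages if pages is not None else {}
        self.cow_faults = 0

    def fork(self):
        # Share every frame with the child instead of copying it.
        for frame in self.pages.values():
            frame.refs += 1
        return Process(dict(self.pages))

    def read(self, page, offset):
        # Reads never fault on a shared frame.
        return self.pages[page].data[offset]

    def write(self, page, offset, value):
        frame = self.pages[page]
        if frame.refs > 1:
            # Write to a shared frame: "fault" and make a private copy.
            self.cow_faults += 1
            frame.refs -= 1
            frame = Frame(frame.data)
            self.pages[page] = frame
        frame.data[offset] = value
```

After a fork, parent and child hold the same Frame objects; only the
first write to a shared page pays for a copy, and untouched pages
stay shared.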

(While we are at it: copy-on-reference pages in virtual memory
can be used to improve locality as processes migrate through
a loosely connected distributed system, or even across the memory
busses of a multiprocessor, where accessing a page on another bus
is more expensive than on your own. They migrate memory close to
where it is being used without copying the entire address space.)

Reason for the question: I'm interested in an application that uses
a very large, very sparse address space. The non-sparse regions are
clumpy, so there will be large contiguous regions of zeroes.
However, the probe density is high - many of those unused regions
will be read, but not written to.
    Simple ZFOD isn't satisfactory, since real pages will be allocated
and zero-filled as I read from them. What I really want is to 
allocate all the pages in my address space pointing to a single
page of zeroes, COWing if I write to a particular page.
    Since I do not know how much address space I need (i.e., how
many PTEs need to be allocated), I would like to first take a fault
on reference, allocate the PTEs, pointing to the page of zeroes,
be able to read from those pages without further faults, and then
COW on a write fault. Thus, zero-point-on-read, before COW.
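
    The behaviour I am after can be sketched as a toy model in
Python. This is an illustration under assumptions, not an existing
OS interface: SparseSpace, the 4096-byte page size, and the fault
counters are all hypothetical names and values.

```python
PAGE_SIZE = 4096                 # assumed page size, for illustration only
ZERO_PAGE = bytes(PAGE_SIZE)     # the single shared, read-only page of zeroes


class SparseSpace:
    """Toy address space: zero-point-on-read first, COW on write."""

    def __init__(self):
        self.ptes = {}           # page number -> backing page; absent = no PTE
        self.read_faults = 0
        self.write_faults = 0

    def read(self, addr):
        page, offset = divmod(addr, PAGE_SIZE)
        if page not in self.ptes:
            # First reference: allocate the PTE, but point it at the
            # shared zero page rather than allocating a real page.
            self.read_faults += 1
            self.ptes[page] = ZERO_PAGE
        return self.ptes[page][offset]

    def write(self, addr, value):
        page, offset = divmod(addr, PAGE_SIZE)
        if self.ptes.get(page, ZERO_PAGE) is ZERO_PAGE:
            # Write fault: no PTE yet, or the PTE still points at the
            # shared zero page. Only now does a real page get allocated.
            self.write_faults += 1
            self.ptes[page] = bytearray(ZERO_PAGE)
        self.ptes[page][offset] = value

    def private_pages(self):
        # Number of real (non-shared) pages actually allocated.
        return sum(1 for p in self.ptes.values() if p is not ZERO_PAGE)
```

Probing a thousand untouched pages in this model costs a thousand
read faults and a thousand PTEs, but no real pages; a real page
appears only for a page that is actually written.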
    Obviously, I am interested in OS support for this, since it's
an application. Can anybody point me to systems that already do it?
(And boy, won't I be embarrassed if one of the systems that I have
source access to does! But anyway...)

aglew@gould.com