[comp.os.mach] External Pager Example

francis@cs.ua.oz.au (Francis Vaughan) (05/28/91)
I mentioned last week that I have a Mickey Mouse
External Pager that might form a useful stub program
for those who wanted to write one. My mailbox runneth over.

So here it is. I have cleaned it up a bit. 

There is really very little to it. It is a pity that
the standard Mach distribution does not include something
like this as it is mostly stuff typed in from documentation
and how-tos gleaned from reading source code.

The authors of Mach really do themselves a disservice by making
it obscure just how it fits together. It is not as if there is
anything hard about writing an external pager from the point of
view of just getting one going. However, the documentation
simply lists the calls and does not explain the glue. So what
I am really supplying is a small amount of glue.

I am sure there are bits that the more knowledgeable may like to
correct. I would welcome such comments (as, I am sure, would many
others).


					Francis Vaughan.

-----------------------------------------------------------------
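
To make the "glue" a little more concrete before the real program: the pager
boils down to a loop that receives a message and hands it to a dispatcher,
which calls whichever of your routines matches. What follows is a minimal
sketch of that shape in plain C, not Mach code. Everything named sketch_*,
on_* or SK_* is invented for illustration; in the real program the messages
arrive via msg_receive() and the dispatcher is the MIG-generated
memory_object_server().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message ids; none of these names come from the Mach headers. */
enum { SK_STOP = -1, SK_INIT = 0, SK_DATA_REQUEST, SK_DATA_UNLOCK,
       SK_TERMINATE, SK_ID_COUNT };

/* A stand-in for the message the kernel would send to the pager. */
struct sketch_msg {
    int           id;      /* which pager routine is wanted */
    unsigned long offset;  /* offset within the memory object */
};

typedef void (*sketch_handler)(const struct sketch_msg *);

/* Counts how often each user routine was called, so the effect is visible. */
static int sketch_calls[SK_ID_COUNT];

/* The "user supplied routines": here they only count their invocations. */
static void on_init(const struct sketch_msg *m)         { (void)m; sketch_calls[SK_INIT]++; }
static void on_data_request(const struct sketch_msg *m) { (void)m; sketch_calls[SK_DATA_REQUEST]++; }
static void on_data_unlock(const struct sketch_msg *m)  { (void)m; sketch_calls[SK_DATA_UNLOCK]++; }
static void on_terminate(const struct sketch_msg *m)    { (void)m; sketch_calls[SK_TERMINATE]++; }

static const sketch_handler sketch_table[SK_ID_COUNT] = {
    on_init, on_data_request, on_data_unlock, on_terminate
};

/* Plays the part of memory_object_server(): pick the routine by message id.
   Returns 1 if the message was dispatched, 0 if the id was not recognised. */
int sketch_server(const struct sketch_msg *m)
{
    assert(m != NULL);
    if (m->id < 0 || m->id >= SK_ID_COUNT)
        return 0;
    sketch_table[m->id](m);
    return 1;
}

/* Plays the part of the receive loop. The array stands in for the port set;
   SK_STOP stands in for a message on the termination port.
   Returns the number of messages actually dispatched. */
int sketch_loop(const struct sketch_msg *queue, int n)
{
    int handled = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (queue[i].id == SK_STOP)
            break;                      /* terminate request: leave the loop */
        if (sketch_server(&queue[i]))
            handled++;                  /* unknown ids are simply skipped here */
    }
    return handled;
}
```

Feeding sketch_loop() a queue of init, data-request, an unknown id, data-unlock
and then SK_STOP dispatches three messages and stops; anything queued after the
stop marker is never looked at. The real loop further down differs in one
respect: it exits on a message memory_object_server() does not recognise,
rather than skipping it.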


/*
  This is an example of how to build an external pager under Mach.
  It was written and runs on a Sun 3/60 running MSD 2.6.

  Some parts are still a little messy. Sorry about the wide format.
  I can't be bothered reformatting it. If you still use 80x24 I
  am sorry for you.

  Francis Vaughan. May 1991. 

  Caveat Emptor.

  built with:

  cc -g -o external-pager-example external-pager-example.c -lsys -lthreads -lmach

*/

/*
  Introduction.

  The Mach external pager interface is built in two parts.
  System calls are provided which allow user code to control the
  contents of the memory object that contains the pages used by
  the client address spaces.

  The user supplies a set of routines that are called by the kernel
  via a MIG generated interface to handle such things as page faults
  and memory protection violations.

  This program serves two purposes. You can take it pretty much as
  a shell to build a real external pager (I have). It also demonstrates
  a very mickey mouse pager in action. One page is supplied to a client
  which reads a value out of it and then writes to it. Stunning stuff.

  The pager is perhaps unusual in that the pager and its client are in
  the same task. This is not a problem; it should be trivial to pull
  the two apart into client and server programs. Remember that if the
  pager ever touches a page in the part of the address space that it is
  serving, it may deadlock.

  A simple routine region_scan() is included that prints out the map of the
  task's address space by regions. It is interesting and occasionally useful.

*/





#include <mach.h>
#include <stdio.h>
#include <cthreads.h>
#include <servers/netname.h>
#include <mach_error.h>
#include <mach/message.h>
#include <mig_errors.h>

mutex_t  server_lock;
condition_t  server_cond;
port_name_t server_terminate_port;
extern int cthread_debug;

extern kern_return_t memory_object_server();

/* The following routine is taken directly from
   "A Programmer's Guide to the Mach User Environment".
   The documentation implies that it is a library routine; however,
   neither I nor ld can find it.
   As it is, you can get away with a greatly cut down version for the
   external pager, so I have just left the code here commented out.
   Someone might find it saves a few minutes someday.
*/
/*

kern_return_t
mig_server( service, function )
     port_name_t service;
     boolean_t (*function)();
{
  int requestbuf[MSG_SIZE_MAX/sizeof(int)];
  int replybuf[MSG_SIZE_MAX/sizeof(int)];
  msg_header_t *request = (msg_header_t *) requestbuf;
  death_pill_t *reply = (death_pill_t *) replybuf;
  msg_return_t mr;
  for (;;) {
    request->msg_size = sizeof(requestbuf);
    request->msg_local_port = service;

    mr = msg_receive(request, MSG_OPTION_NONE, 0 );

    if (request->msg_local_port == task_notify())
      cthread_exit();

    (void) (*function)(request, &reply->Head);
    if ((reply->Head.msg_remote_port != PORT_NULL ) &&
        (reply->RetCode != MIG_NO_REPLY))
      (void)msg_send(&reply->Head,SEND_TIMEOUT,0);
  }
}
*/

/* This is the body of the routine that will be instantiated as the external pager. */

/* This routine creates the port that will represent the memory object, and then
   builds a port set to listen on. I have added a termination port such that when
   any message is received on it, it kills the pager nicely. This is possibly a waste of time,
   but does allow me to kill the pager easily and have a chance to clean up.

   Once the ports are built, the thread signals that it is running, and loops waiting
   for messages. Messages from the kernel through the memory object port "Server_Port"
   are passed on to the library function "memory_object_server".

   memory_object_server() has been built by MIG and calls the appropriate user supplied routine
   with the appropriate parameters. Unlike the usual MIG RPC calls, none of these routines
   supplies a return value (they could, but the kernel is not waiting for it). Therefore
   I have cut down the mig_server code to delete the return message stuff. You still need
   to pass in a return message even though it will not be used.
*/

External_Pager_Proc(arg)
int arg;
{
  port_name_t Server_Port;
  port_name_t Port_Set;
  kern_return_t rs;
  int requestbuf[MSG_SIZE_MAX/sizeof(int)];
  msg_header_t *request = (msg_header_t *) requestbuf;
  int replybuf[MSG_SIZE_MAX/sizeof(int)];
  death_pill_t *reply = (death_pill_t *) replybuf;
  msg_return_t mr;


/* build the ports */
  if (( rs = port_allocate( task_self(), &Server_Port)) != KERN_SUCCESS) {
    mach_error("  Port_allocate for server port", rs ); exit (1); }

  if (( rs = port_set_allocate( task_self(), &Port_Set)) != KERN_SUCCESS) {
    mach_error("  Port_set_allocate for server", rs ); exit (1); }

  if (( rs = port_set_add( task_self(), Port_Set, Server_Port)) != KERN_SUCCESS) {
    mach_error("  Port_set_add for server", rs ); exit (1); }

  if (( rs = port_set_add( task_self(), Port_Set, task_notify())) != KERN_SUCCESS) {
    mach_error("  Port_set_add task_notify for server", rs ); exit (1); }

  if (( rs = port_set_add( task_self(), Port_Set, server_terminate_port)) != KERN_SUCCESS) {
    mach_error("  Port_set_add server_terminate_port for server", rs ); exit (1); }

  if (( rs = netname_check_in(name_server_port, "ExternalPagerPort", PORT_NULL, Server_Port)) != KERN_SUCCESS) {
    mach_error("  netname_check_in for server", rs ); exit (1); }

  mutex_lock(server_lock);
  condition_signal(server_cond);  /* let the main proc know that the server is running */
  mutex_unlock(server_lock);

  /* this is a cut down version of mig_server that does just what is needed for the external pager */
  /* it is dumped inline as there is no reason to make it a function */
  
  for (;;) {
    request->msg_size = sizeof(requestbuf);
    request->msg_local_port = Port_Set;
    
    mr = msg_receive(request, MSG_OPTION_NONE, 0 );
    
    if (request->msg_local_port == task_notify())
      continue;
    if (request->msg_local_port == server_terminate_port)
      break;
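    /* memory_object_server() wants the reply message header, as mig_server passes it */
    if ( FALSE == memory_object_server(request, &reply->Head)) {
      mach_error("  memory_object_server failed ", reply->RetCode ); exit (1); }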
  }
  
  /*    One could replace the above for loop with this (but then there is no terminate message).
    if (( rs = mig_server( Port_Set, memory_object_server )) != KERN_SUCCESS) {
    mach_error("  mig_server for server", rs ); exit (1); }
    */

}



/* This is a useful little routine that displays the layout of VM.
   Since it will often be used in a multi-threaded program I use the routine
   trace() to print, rather than printf directly. This assumes the other threads use trace().
   (Which they do here.) Probably better here would be to grab io_lock at the beginning
   and then printf, but I haven't bothered. (I don't like holding locks for long.)
*/

#include <stdio.h>
#include <varargs.h>

struct mutex io_lock;

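/* VARARGS */
void
trace( va_alist )
     va_dcl
{
  va_list ap;
  char *fmt;

  mutex_lock( &io_lock );
  va_start( ap );
  fmt = va_arg( ap, char* );   /* the first argument is the format string */
  vprintf( fmt, ap );          /* the remaining arguments are its operands */
  va_end( ap );
  mutex_unlock( &io_lock );
}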



void region_scan()
{
/* this routine scans through VM printing the attributes of each allocated region */
  kern_return_t rs;
  vm_address_t last_address;
  vm_address_t start_address;
  vm_size_t    region_size;
  vm_prot_t    region_prot;
  vm_prot_t    region_max_prot;
  vm_inherit_t region_inheritance;
  boolean_t    region_shared;
  port_t       region_object_name;
  vm_offset_t  region_offset;

  start_address = 0;
  last_address = 0;


  while ( KERN_SUCCESS == vm_region(task_self(), &start_address, &region_size,
				    &region_prot, &region_max_prot, &region_inheritance,
				    &region_shared, &region_object_name, &region_offset) ) {


    if ( start_address != last_address ) /* there is a gap */ 
      trace("unallocated: start- %7x, size- %7x\n",
	    last_address, start_address - last_address );

    trace("Region:      start- %7x, size- %7x, ",start_address,region_size);
    if ( region_prot & VM_PROT_READ )  trace("R"); else trace("-");
    if ( region_prot & VM_PROT_WRITE ) trace("W"); else trace("-");
    if ( region_prot & VM_PROT_EXECUTE )  trace("X"); else trace("-");
    trace("  ");
    if ( region_max_prot & VM_PROT_READ )  trace("R"); else trace("-");
    if ( region_max_prot & VM_PROT_WRITE ) trace("W"); else trace("-");
    if ( region_max_prot & VM_PROT_EXECUTE )  trace("X"); else trace("-");
    trace("  ");
    if ( region_inheritance == VM_INHERIT_SHARE )  trace("share");
    if ( region_inheritance == VM_INHERIT_COPY )   trace("copy ");
    if ( region_inheritance == VM_INHERIT_NONE )   trace("---- ");
    trace("  ");
    if ( region_shared )   trace("Shared "); else trace("       ");
    trace("   object: %2x ",region_object_name);
    trace(" offset: %x\n",region_offset);

    start_address += region_size;
    last_address = start_address;
    }
  
}


void Client_Proc(arg)
int arg;

{

  port_t Server_Port;
  kern_return_t rs;
  unsigned int *buffer;
  unsigned int  aux;
  vm_offset_t  offset;
  trace(" Client running\n");


/* In the general case you will need a lookup for the appropriate port. Here we don't really
   need it since the server and client live in the same task and the port could just as well
   be a shared variable. I have left it in for the sake of completeness.
*/

  if (( rs = netname_look_up( name_server_port,"","ExternalPagerPort", &Server_Port)) != KERN_SUCCESS) {
    mach_error(" netname_look_up for client", rs ); 
    trace("Check that server is running\n");
    exit (1); 
  }
  trace("Client: Server Port = %x\n",Server_Port);

  /* Now we actually build the memory */
  /* just one page in this mickey mouse example */

  offset = 0;
  /* You can place the slab of memory you map anywhere in the memory object with "offset".
     A neat trick, if you know where you want the memory in your address space (pass in a
     value for "address" and set "anywhere" to FALSE), is to make the offsets in the memory
     object equal the addresses in the address space. Just make offset = address.
     Here I don't bother.
     */

  
  buffer = 0;  /* with anywhere == TRUE the kernel chooses the address */

  /*                 task         address                  length        mask anywhere */
  if (( rs = vm_map( task_self(), (vm_address_t *)&buffer, vm_page_size, 0,   TRUE,
  /*                 mem-obj      offset  copy  */
		     Server_Port, offset, FALSE,
  /*                 curr_prot,   max_prot,    inheritance  */
		     VM_PROT_ALL, VM_PROT_ALL, VM_INHERIT_NONE )) != KERN_SUCCESS) {
    mach_error("  Vm_Map for client", rs ); exit (1); }


/* now to prove that it works try some mickey mouse operations on some pages */

  trace("Client: mapped page: %x, port: %x\n",buffer,Server_Port);
  /* let's touch it and see what happens */

  trace("Client: touch page with read\n");
  aux = buffer[2];
  trace("Client: page contains %u\n", aux);

  trace("Client: write page\n");

  buffer[1] = 23;

/* now that we have tried out the page server, show the vm map */
/* note the area served by the external pager */

  region_scan();

  cthread_exit(0);
}



/* Now for the guts of the external pager */


memory_object_t           Served_Object;
memory_object_control_t   Control_Port;
memory_object_name_t      Object_Name;
vm_size_t                 PageSize;

/* we must provide the following routines, these will be called by memory_object_server() */

/* mostly these are just shells of routines. It does save typing them in! */


/*  These are the routines :
memory_object_init
memory_object_data_write
memory_object_data_request
memory_object_data_unlock
memory_object_lock_completed
memory_object_copy
memory_object_terminate
*/

kern_return_t
memory_object_init(memory_object, 
		   memory_control,
		   memory_object_name,
		   memory_object_page_size
)
memory_object_t          memory_object;
memory_object_control_t  memory_control;
memory_object_name_t     memory_object_name;
vm_size_t                memory_object_page_size;
{
kern_return_t rs;

/* This routine will be called when a vm_map call is made against this memory object. */

/* You may wish to build appropriate data structures etc. to cope with the new reference to
   the memory object. Maybe do some sanity checking too. */

  Served_Object = memory_object;
  Control_Port  = memory_control;
  Object_Name   = memory_object_name;
  PageSize      = memory_object_page_size;


/* When you are ready you must call memory_object_set_attributes(). Until you do, the external
   pager interface will not run, and threads that fault against the memory will be stalled. */

  if (( rs = memory_object_set_attributes( memory_control, TRUE, TRUE, MEMORY_OBJECT_COPY_NONE)) != KERN_SUCCESS) {
    mach_error(" memory_object_init: memory_object_set_attributes failed ", rs );
  }
  
  trace(" Serviced memory_object_init request for object %x\n",memory_object);

  return(KERN_SUCCESS);
}

kern_return_t
memory_object_data_write(memory_object,
			 memory_control,
			 offset,
			 data,
			 data_count
)
memory_object_t           memory_object;
memory_object_control_t   memory_control;
vm_offset_t               offset;
pointer_t                 data;
unsigned int              data_count;

{

/* This routine will be called when the kernel returns data to us. This will be in response
   to action of the swapper, or after we ask for the data with memory_object_lock_request()
   with the parameter should_clean set.
*/

/* The data is in a buffer allocated by the kernel with the equivalent of a vm_allocate somewhere
   in our address space. When we have dealt with it (typically putting it somewhere safe and
   non-volatile) we should deallocate the buffer. */

  vm_deallocate( task_self(), data, data_count );

  return(KERN_SUCCESS);
}

kern_return_t
memory_object_data_request(memory_object,
			   memory_control,
			   offset,
			   length,
			   desired_access
)
memory_object_t           memory_object;
memory_object_control_t   memory_control;
vm_offset_t               offset;
vm_size_t                 length;
vm_prot_t                 desired_access;
{

  static  unsigned int *buffer;
  kern_return_t rs;

  trace(" got memory_object_data_request msg, object %x, control: %x, offset %x, length %x, access %x \n",
	memory_object,memory_control,offset,length,desired_access);


  /* This routine is called in response to a page fault for which there is no data in the cache.
     The faulting thread will be blocked until we place the needed data into the cache by calling
     memory_object_data_provided().

     The type of access that caused the fault is delivered in desired_access.
     */

  /* Here we just build a page, put some junk in it and send it off. */
  /* I have turned off all access so that I get the access faults back when it gets touched. */

  buffer = (unsigned int *)malloc(vm_page_size);
  buffer[2] = 2323;

  if (( rs = memory_object_data_provided(memory_control,offset,buffer,vm_page_size,VM_PROT_ALL )) != KERN_SUCCESS) {
    mach_error("  memory_object_data_request: memory_object_data_provided failed: ", rs ); exit (1); }

/* Call this if the data is unavailable and the kernel should provide the pages itself.
   How the kernel provides the data is dependent upon the manner in which the object was created.
   See the manual page. */
/*
  if (( rs = memory_object_data_unavailable(memory_control,offset,length)) != KERN_SUCCESS) {
    mach_error("  memory_object_data_request: memory_object_data_unavailable failed: ", rs ); exit (1); }
*/

  return(KERN_SUCCESS);
}

kern_return_t
memory_object_data_unlock(memory_object,
			  memory_control,
			  offset,
			  length,
			  desired_access
)
memory_object_t           memory_object;
memory_object_control_t   memory_control;
vm_offset_t               offset;
vm_size_t                 length;
vm_prot_t                 desired_access;
{
  kern_return_t rs;

  /* This routine is called if a thread attempts an access to part of the memory object that is
     currently disallowed. You must call memory_object_lock_request() to provide the appropriate
     access. Remember that the lock value passed to memory_object_lock_request() is the access
     that is to be _denied_.

     If it is necessary to synchronize the unlocking of the data with the external pager program's
     progress (which can be important for cache coherency algorithms, for instance) you can get a
     completion message from the kernel by providing a port in the last parameter of
     memory_object_lock_request(). This can be either the memory object port, in which case the
     acknowledgment comes back as a call to memory_object_lock_completed(), or you can provide a
     different port, in which case the message reception can be considered the acknowledgment.
     If you are only expecting one message at a time on that port you can get away with it;
     however, if you need to know which address range was unlocked, you must use
     memory_object_lock_completed() and appropriate signals.
     */


  /* Here we simply provide the appropriate access and log that we got the fault. */
  /* Typically you will do much more (or maybe trash the program). */

  trace(" got memory_object_data_unlock msg, object %x, control: %x, offset %x, length %x :",
	memory_object,memory_control,offset,length);

  if ( desired_access == VM_PROT_READ ) {
    if (( rs = memory_object_lock_request(memory_control,offset,length,FALSE,FALSE,VM_PROT_WRITE,PORT_NULL )) != KERN_SUCCESS) {
      mach_error("  memory_object_data_unlock: memory_object_lock_request failed: ", rs ); exit (1); }
    trace(" read access fault\n");
  }
  if ( desired_access & (VM_PROT_WRITE) ) {
    if (( rs = memory_object_lock_request(memory_control,offset,length,FALSE,FALSE,VM_PROT_NONE,PORT_NULL )) != KERN_SUCCESS) {
      mach_error("  memory_object_data_unlock: memory_object_lock_request failed: ", rs ); exit (1); }
    trace(" write access fault\n");
  }
  if ( desired_access == VM_PROT_ALL ) {
    if (( rs = memory_object_lock_request(memory_control,offset,length,FALSE,FALSE,VM_PROT_NONE,PORT_NULL )) != KERN_SUCCESS) {
      mach_error("  memory_object_data_unlock: memory_object_lock_request failed: ", rs ); exit (1); }
    trace(" request for full access\n");
  }
  return(KERN_SUCCESS);
}


/* This routine is not actually listed in the documentation although it is referenced by
   the manual pages for the other calls. You must provide this routine, otherwise
   the program will not link. */

/* If you need synchronisation of a lock_request completing before proceeding you
   can ask for this routine to be invoked by providing the memory_object port as the
   reply_to port in memory_object_lock_request().
   */

kern_return_t
memory_object_lock_completed(memory_object,
			     memory_control,
			     offset,
			     length
)
memory_object_t           memory_object;
memory_object_control_t   memory_control;
vm_offset_t               offset;
vm_size_t                 length;

{



  return(KERN_SUCCESS);
}


kern_return_t
memory_object_copy(memory_object,
		   memory_control,
		   old_memory_control,
		   offset,
		   length,
		   new_memory_object
)
memory_object_t           memory_object;
memory_object_control_t   memory_control;
memory_object_control_t   old_memory_control;
vm_offset_t               offset;
vm_size_t                 length;
memory_object_t           new_memory_object;

{
/* This is only called if we wish to be notified when a memory object is copied by the kernel
   on the request of a user process, and the copy occurs.
   We need to set the MEMORY_OBJECT_COPY_CALL attribute in the memory object if we wish to be
   so notified.

   If you need to be notified you will know why. Mostly you either won't care or will never
   have a copy happen.
*/

  return (KERN_SUCCESS);
 
}

kern_return_t
memory_object_terminate(memory_object,
			memory_control,
			memory_object_name
)
memory_object_t           memory_object;
memory_object_control_t   memory_control;
memory_object_name_t      memory_object_name;
{
  /* Called when the memory object is done with. The last reference to the memory object
     has been deleted. Time to clean up. */


  kern_return_t rs;

  /* Here we just deallocate the ports and stuff. You may need to do much more. */

  if (( rs = port_deallocate( task_self(), memory_object )) != KERN_SUCCESS) {
    mach_error("  memory_object_terminate: port_deallocate for object port failed: ", rs ); exit (1); }
  if (( rs = port_deallocate( task_self(), memory_control )) != KERN_SUCCESS) {
    mach_error("  memory_object_terminate: port_deallocate for control port failed: ", rs ); exit (1); }
  if (( rs = port_deallocate( task_self(), memory_object_name )) != KERN_SUCCESS) {
    mach_error("  memory_object_terminate: port_deallocate for name port failed: ", rs ); exit (1); }

  return(KERN_SUCCESS);
}


/* Here as an example we just set two threads running, one as the external pager, the other
   as a client of the memory object the pager provides. Once the client dies we kill the pager
   and exit.
*/


void main()

{

  kern_return_t rs;
  cthread_t Client, Server;
  msg_header_t requestbuf;
  msg_header_t *request =  &requestbuf;

  /* so I at least get my trace writes */  
  setbuf(stdout,NULL);
  setbuf(stderr,NULL);

  cthread_debug = FALSE;  /* set to TRUE to turn cthreads debugging on, for the hell of it */

  server_lock = mutex_alloc();
  server_cond = condition_alloc();
  mutex_set_name(server_lock, "Server Lock");
  condition_set_name(server_cond,"Server Running");


  if (( rs = port_allocate( task_self(), &server_terminate_port)) != KERN_SUCCESS) {
    mach_error("  Port_allocate for server_terminate_port", rs ); exit (1); }


/* fork off the server */

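  /* Take the lock before forking. The server must acquire it before it can signal,
     so it cannot signal until condition_wait() has released the lock and we are waiting.
     Otherwise the signal could be sent before we wait, and we would wait forever. */
  mutex_lock(server_lock);

  Server = cthread_fork(External_Pager_Proc, NULL );
  cthread_set_name(Server,"Server"); 
  cthread_detach(Server);

  condition_wait( server_cond, server_lock );
  mutex_unlock(server_lock);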

/* fork off the client */
  Client = cthread_fork(Client_Proc, NULL );
  cthread_set_name(Client,"Client");
/* now wait for it to finish */
  cthread_join(Client);

  trace("killing server\n");
/* now kill the server, not the prettiest way, but one way */

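  /* fill in the whole header; the server only looks at which port the message arrived on */
  request->msg_simple = TRUE;
  request->msg_size = sizeof(msg_header_t);
  request->msg_type = MSG_TYPE_NORMAL;
  request->msg_local_port = PORT_NULL;   /* no reply wanted */
  request->msg_remote_port = server_terminate_port;
  request->msg_id = 0;
  msg_send(request,MSG_OPTION_NONE,0);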

}
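
A footnote on the one point in memory_object_data_unlock() that is easy to get
backwards: the lock value names the access to deny, not the access to grant.
Here is a minimal sketch of that rule in plain C, assuming the usual bit values
of 1, 2 and 4 for read, write and execute. The SK_* names and both functions
are made up for illustration and are not part of Mach.

```c
#include <assert.h>

/* Stand-in protection bits, assumed to mirror VM_PROT_READ/WRITE/EXECUTE. */
#define SK_PROT_NONE    0x0u
#define SK_PROT_READ    0x1u
#define SK_PROT_WRITE   0x2u
#define SK_PROT_EXECUTE 0x4u
#define SK_PROT_ALL     (SK_PROT_READ | SK_PROT_WRITE | SK_PROT_EXECUTE)

/* Given the access a faulting thread asked for, return the lock value
   (the set of accesses to DENY) that the example pager would request.
   Any fault that wants write access unlocks everything; anything else
   keeps write locked so that a later write still faults.
   Simplification: the example pager only reacts to read-only, write and
   full-access faults; this sketch treats every non-write fault alike. */
unsigned int sketch_lock_value(unsigned int desired_access)
{
    assert((desired_access & ~SK_PROT_ALL) == 0);
    if (desired_access & SK_PROT_WRITE)
        return SK_PROT_NONE;
    return SK_PROT_WRITE;
}

/* The access that remains permitted once a given lock value is in force. */
unsigned int sketch_allowed(unsigned int lock_value)
{
    return SK_PROT_ALL & ~lock_value;
}
```

So a read fault yields a lock value of SK_PROT_WRITE, which leaves read and
execute permitted, and the page supplied with a lock value of SK_PROT_ALL in
memory_object_data_request() permits nothing at all, which is why the very
first touch comes back as an unlock request.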