francis@cs.ua.oz.au (Francis Vaughan) (05/28/91)
I mentioned last week that I have a Mickey Mouse External Pager that might form a useful stub program for those who wanted to write one. My mailbox runneth over. So here it is. I have cleaned it up a bit. There is really very little to it.

It is a pity that the standard Mach distribution does not include something like this, as it is mostly stuff typed in from documentation and how-tos gleaned from reading source code. The authors of Mach really do themselves a disservice by making it obscure just how it fits together. It is not as if there is anything hard about writing an external pager from the point of view of just getting one going. However, the documentation simply lists the calls and does not explain the glue. So what I am really supplying is a small amount of glue.

I am sure there are bits that the more knowledgeable may like to correct. I would welcome such comments (as I am sure would many others).

Francis Vaughan.

-----------------------------------------------------------------
/*
  This is an example of how to build an external pager under Mach.
  It was written and runs on a Sun 3/60 running MSD 2.6.
  Some parts are still a little messy.

  Sorry about the wide format. I can't be bothered reformatting it.
  If you still use 80x24 I am sorry for you.

  Francis Vaughan. May 1991.
  Caveat Emptor.

  built with:
      cc -g -o external-pager-example external-pager-example.c -lsys -lthreads -lmach
*/

/*
  Introduction.

  The Mach external pager interface is built in two parts. System calls are provided
  which allow user code to control the contents of the memory object that contains
  the pages used by the client address spaces. The user supplies a set of routines
  that are called by the kernel via a MIG generated interface to handle such things
  as page faults and memory protection violations.

  This program serves two purposes. You can take it pretty much as a shell to build
  a real external pager (I have). It also demonstrates a very mickey mouse pager in action.
  One page is supplied to a client which reads a value out of it and then writes to it.
  Stunning stuff.

  The pager is perhaps unusual in that the pager and its client are in the same task.
  This is not a problem; it should be trivial to pull the two apart into client and
  server programs. Remember that if the pager ever touches a page in part of the
  address space that it is serving it may deadlock.

  A simple routine region_scan() is included that prints out the map of the task's
  address space by regions. It is interesting and occasionally useful.
*/

#include <mach.h>
#include <stdio.h>
#include <cthreads.h>
#include <servers/netname.h>
#include <mach_error.h>
#include <mach/message.h>
#include <mig_errors.h>

mutex_t      server_lock;
condition_t  server_cond;
port_name_t  server_terminate_port;

extern int cthread_debug;
extern kern_return_t memory_object_server();

/*
  The following routine is taken directly from "A Programmer's Guide to the Mach
  User Environment". The documentation implies that it is a library routine,
  however neither I nor ld can find it. As it is, you can get away with a greatly
  cut down version for the external pager, so I have just left the code here
  commented out. Someone might find it saves a few minutes someday.
*/
/*
kern_return_t mig_server( service, function )
port_name_t service;
boolean_t (*function)();
{
    int requestbuf[MSG_SIZE_MAX/sizeof(int)];
    int replybuf[MSG_SIZE_MAX/sizeof(int)];
    msg_header_t *request = (msg_header_t *) requestbuf;
    death_pill_t *reply = (death_pill_t *) replybuf;
    msg_return_t mr;

    for (;;) {
        request->msg_size = sizeof(requestbuf);
        request->msg_local_port = service;
        mr = msg_receive(request, MSG_OPTION_NONE, 0 );
        if (request->msg_local_port == task_notify())
            cthread_exit();
        (void) (*function)(request, &reply->Head);
        if ((reply->Head.msg_remote_port != PORT_NULL ) &&
            (reply->RetCode != MIG_NO_REPLY))
            (void)msg_send(&reply->Head,SEND_TIMEOUT,0);
    }
}
*/

/* This is the body of the routine that will be instantiated as the external pager */

/*
  This routine creates the port that will represent the memory object, and then
  builds a port set to listen on. I have added a termination port such that when
  any message is received on it, it kills the pager nicely. This is possibly a
  waste of time, but does allow me to kill the pager easily and have a chance to
  clean up.

  Once the ports are built, the thread signals that it is running, and loops
  waiting for messages. Messages from the kernel through the memory object port
  "Server_Port" are passed on to the library function "memory_object_server".
  memory_object_server() has been built by MIG and calls the appropriate user
  supplied routine with the appropriate parameters.

  Unlike the usual MIG RPC calls, none of these routines supplies a return value
  (they could, but the kernel is not waiting for it). Therefore I have cut down
  the mig_server code to delete the return message stuff. You still need to pass
  in a return message even though it will not be used.
*/

External_Pager_Proc(arg)
int arg;
{
    port_name_t   Server_Port;
    port_name_t   Port_Set;
    kern_return_t rs;

    int requestbuf[MSG_SIZE_MAX/sizeof(int)];
    msg_header_t *request = (msg_header_t *) requestbuf;
    int replybuf[MSG_SIZE_MAX/sizeof(int)];
    death_pill_t *reply = (death_pill_t *) replybuf;
    msg_return_t mr;

    /* build the ports */
    if (( rs = port_allocate( task_self(), &Server_Port)) != KERN_SUCCESS) {
        mach_error(" Port_allocate for server port", rs );
        exit (1);
    }
    if (( rs = port_set_allocate( task_self(), &Port_Set)) != KERN_SUCCESS) {
        mach_error(" Port_set_allocate for server", rs );
        exit (1);
    }
    if (( rs = port_set_add( task_self(), Port_Set, Server_Port)) != KERN_SUCCESS) {
        mach_error(" Port_set_add for server", rs );
        exit (1);
    }
    if (( rs = port_set_add( task_self(), Port_Set, task_notify())) != KERN_SUCCESS) {
        mach_error(" Port_set_add task_notify for server", rs );
        exit (1);
    }
    if (( rs = port_set_add( task_self(), Port_Set, server_terminate_port)) != KERN_SUCCESS) {
        mach_error(" Port_set_add server_terminate_port for server", rs );
        exit (1);
    }

    /* publish the memory object port so that clients can look it up by name */
    if (( rs = netname_check_in(name_server_port, "ExternalPagerPort", PORT_NULL, Server_Port)) != KERN_SUCCESS) {
        mach_error(" netname_check_in for server", rs );
        exit (1);
    }

    mutex_lock(server_lock);
    condition_signal(server_cond);   /* let the main proc know that the server is running */
    mutex_unlock(server_lock);

    /* this is a cut down version of mig_server that does just what is needed for the external pager */
    /* it is dumped inline as there is no reason to make it a function */
    for (;;) {
        request->msg_size = sizeof(requestbuf);
        request->msg_local_port = Port_Set;
        mr = msg_receive(request, MSG_OPTION_NONE, 0 );
        if (request->msg_local_port == task_notify())
            continue;                /* ignore kernel notifications */
        if (request->msg_local_port == server_terminate_port)
            break;                   /* any message here means shut down */
        if ( FALSE == memory_object_server(request,reply)){
            mach_error(" memory_object_server failed ", FALSE );
            exit (1);
        }
    }

    /* One could replace the above for loop with this (but then there is no terminate
       message).

    if (( rs = mig_server( Port_Set, memory_object_server )) != KERN_SUCCESS) {
        mach_error(" mig_server for server", rs );
        exit (1);
    }
    */
}

/*
  This is a useful little routine that displays the layout of VM. Since it will
  often be used in a multi threaded program I use the routine trace to print,
  rather than printf directly. This assumes the other threads use trace().
  (Which they do here.) Probably better here would be to grab io_lock at the
  beginning and then printf, but I haven't bothered. (I don't like holding locks
  for long.)
*/

#include <stdio.h>
#include <varargs.h>

struct mutex io_lock;

/* VARARGS */
void trace( va_alist )
va_dcl
{
    va_list ap;
    char   *fmt;

    mutex_lock( &io_lock );
    va_start( ap );
    fmt = va_arg( ap, char * );   /* the first argument is the format string */
    vprintf( fmt, ap );           /* the remaining arguments feed the format */
    va_end( ap );
    mutex_unlock( &io_lock );
}

void region_scan()
{
    /* this routine scans through VM printing the attributes of each allocated region */
    kern_return_t rs;
    vm_address_t  last_address;
    vm_address_t  start_address;
    vm_size_t     region_size;
    vm_prot_t     region_prot;
    vm_prot_t     region_max_prot;
    vm_inherit_t  region_inheritance;
    boolean_t     region_shared;
    port_t        region_object_name;
    vm_offset_t   region_offset;

    start_address = 0;
    last_address = 0;

    while ( KERN_SUCCESS == vm_region(task_self(), &start_address, &region_size,
                                      &region_prot, &region_max_prot,
                                      &region_inheritance, &region_shared,
                                      &region_object_name, &region_offset) ) {

        if ( start_address != last_address )   /* there is a gap */
            trace("unallocated: start- %7x, size- %7x\n", last_address, start_address - last_address );

        trace("Region: start- %7x, size- %7x, ",start_address,region_size);

        if ( region_prot & VM_PROT_READ )    trace("R"); else trace("-");
        if ( region_prot & VM_PROT_WRITE )   trace("W"); else trace("-");
        if ( region_prot & VM_PROT_EXECUTE ) trace("X"); else trace("-");
        trace(" ");

        if ( region_max_prot & VM_PROT_READ )    trace("R"); else trace("-");
        if ( region_max_prot & VM_PROT_WRITE )   trace("W"); else trace("-");
        if ( region_max_prot & VM_PROT_EXECUTE ) trace("X"); else trace("-");
        trace(" ");

        if (
             region_inheritance == VM_INHERIT_SHARE ) trace("share");
        if ( region_inheritance == VM_INHERIT_COPY )  trace("copy ");
        if ( region_inheritance == VM_INHERIT_NONE )  trace("---- ");
        trace(" ");

        if ( region_shared ) trace("Shared "); else trace(" ");

        trace(" object: %2x ",region_object_name);
        trace(" offset: %x\n",region_offset);

        start_address += region_size;
        last_address = start_address;
    }
}

void Client_Proc(arg)
int arg;
{
    port_t        Server_Port;
    kern_return_t rs;
    unsigned int *buffer;
    unsigned int  aux;
    vm_offset_t   offset;

    trace(" Client running\n");

    /* In the general case you will need a lookup for the appropriate port. Here we
       don't really need it since the server and client live in the same task and
       the port could just as well be a shared variable. I have left it in for the
       sake of completeness. */
    if (( rs = netname_look_up( name_server_port,"","ExternalPagerPort", &Server_Port)) != KERN_SUCCESS) {
        mach_error(" netname_look_up for client", rs );
        trace("Check that server is running\n");
        exit (1);
    }
    trace("Client: Server Port = %x\n",Server_Port);

    /* Now we actually build the memory */
    /* just one page in this mickey mouse example */
    offset = 0;

    /* You can place the slab of memory you map anywhere in the memory object with
       "offset". A neat trick, if you know where you want the memory in your address
       space (pass in a value for "address" and set "anywhere" to FALSE), is to make
       the offsets in the memory object equal the addresses in the address space.
       Just make offset = address. Here I don't bother.
    */
    if (( rs = vm_map( task_self(),                /* task */
                       (vm_address_t *) &buffer,   /* address (returned) */
                       vm_page_size,               /* length */
                       0,                          /* mask */
                       TRUE,                       /* anywhere */
                       Server_Port,                /* mem-obj */
                       offset,                     /* offset */
                       FALSE,                      /* copy */
                       VM_PROT_ALL,                /* curr_prot */
                       VM_PROT_ALL,                /* max_prot */
                       VM_INHERIT_NONE             /* inheritance */
                     )) != KERN_SUCCESS) {
        mach_error(" Vm_Map for client", rs );
        exit (1);
    }

    /* now to prove that it works try some mickey mouse operations on some pages */
    trace("Client: mapped page: %x, port: %x\n",buffer,Server_Port);

    /* let's touch it and see what happens */
    trace("Client: touch page with read\n");
    aux = buffer[2];
    trace("Client: page contains %u\n", aux);

    trace("Client: write page\n");
    buffer[1] = 23;

    /* now that we have tried out the page server, show the vm map */
    /* note the area served by the external pager */
    region_scan();

    cthread_exit();
}

/* Now for the guts of the external pager */

memory_object_t         Served_Object;
memory_object_control_t Control_Port;
memory_object_name_t    Object_Name;
vm_size_t               PageSize;

/* we must provide the following routines, these will be called by memory_object_server() */
/* mostly these are just shells of routines. It does save typing them in! */

/* These are the routines :

     memory_object_init
     memory_object_data_write
     memory_object_data_request
     memory_object_data_unlock
     memory_object_lock_completed
     memory_object_copy
     memory_object_terminate
*/

kern_return_t memory_object_init(memory_object, memory_control, memory_object_name, memory_object_page_size )
memory_object_t         memory_object;
memory_object_control_t memory_control;
memory_object_name_t    memory_object_name;
vm_size_t               memory_object_page_size;
{
    kern_return_t rs;

    /* This routine will be called each time a vm_map call is made against this memory object */

    /* You may wish to build appropriate data structures etc, to cope with the new
       reference to the memory object. Maybe do some sanity checking too.
    */
    Served_Object = memory_object;
    Control_Port  = memory_control;
    Object_Name   = memory_object_name;
    PageSize      = memory_object_page_size;

    /* When you are ready you must call memory_object_set_attributes. Until you do,
       the external pager interface will not run, and threads that fault against
       the memory will be stalled. */
    if (( rs = memory_object_set_attributes( memory_control, TRUE, TRUE, MEMORY_OBJECT_COPY_NONE)) != KERN_SUCCESS) {
        mach_error(" memory_object_init: memory_object_set_attributes failed ", rs );
    }

    trace(" Serviced memory_object_init request for object %x\n",memory_object);
    return(KERN_SUCCESS);
}

kern_return_t memory_object_data_write(memory_object, memory_control, offset, data, data_count )
memory_object_t         memory_object;
memory_object_control_t memory_control;
vm_offset_t             offset;
pointer_t               data;
unsigned int            data_count;
{
    /* This routine will be called when the kernel returns data to us. This will be
       in response to action of the swapper, or after we ask for the data with
       memory_object_lock_request() with the parameter should_clean set. */

    /* The data is in a buffer allocated by the kernel with the equivalent of a
       vm_allocate somewhere in our address space.
       When we have dealt with it (typically putting it somewhere safe and
       non-volatile) we should deallocate the buffer */
    vm_deallocate( task_self(), data, data_count );
    return(KERN_SUCCESS);
}

kern_return_t memory_object_data_request(memory_object, memory_control, offset, length, desired_access )
memory_object_t         memory_object;
memory_object_control_t memory_control;
vm_offset_t             offset;
vm_size_t               length;
vm_prot_t               desired_access;
{
    static unsigned int *buffer;
    kern_return_t rs;

    trace(" got memory_object_data_request msg, object %x, control: %x, offset %x, length %x, access %x \n",
          memory_object,memory_control,offset,length,desired_access);

    /* This routine is called in response to a page fault for which there is no
       data in the cache. The faulting thread will be blocked until we place the
       needed data into the cache by calling memory_object_data_provided(). The
       type of access that caused the exception is delivered in desired_access. */

    /* Here we just build a page, put some junk in it and send it off */
    /* I have turned off all access (the last argument is the access to deny) so
       that I get the access faults back when it gets touched */
    buffer = (unsigned int *)malloc(vm_page_size);
    buffer[2] = 2323;
    if (( rs = memory_object_data_provided(memory_control,offset,buffer,vm_page_size,VM_PROT_ALL )) != KERN_SUCCESS) {
        mach_error(" memory_object_data_request: memory_object_data_provided failed: ", rs );
        exit (1);
    }

    /* Call this if the data is unavailable and the kernel should provide the pages
       itself. How the kernel provides the data is dependent upon the manner in
       which the object was created.
       See the manual page. */
    /*
    if (( rs = memory_object_data_unavailable(memory_control,offset,length)) != KERN_SUCCESS) {
        mach_error(" memory_object_data_request: memory_object_data_unavailable failed: ", rs );
        exit (1);
    }
    */
    return(KERN_SUCCESS);
}

kern_return_t memory_object_data_unlock(memory_object, memory_control, offset, length, desired_access )
memory_object_t         memory_object;
memory_object_control_t memory_control;
vm_offset_t             offset;
vm_size_t               length;
vm_prot_t               desired_access;
{
    kern_return_t rs;

    /* This routine is called if a thread attempts an access to part of the memory
       object that is currently disallowed. You must call
       memory_object_lock_request() to provide the appropriate access. Remember
       that memory_object_lock_request() takes the access that is to be _denied_.

       If it is necessary to synchronize the unlocking of the data with the
       external pager program progress (which can be important for cache coherency
       algorithms, for instance) you can get a completion message from the kernel
       by providing a port in the last parameter of memory_object_lock_request().
       This can be either the memory object port, in which case the
       acknowledgment comes back as a call to memory_object_lock_completed, or you
       can provide a different port, in which case the message reception can be
       considered the acknowledgment. If you are only expecting one message at a
       time on that port you can get away with it; however, if you need to know
       which address range was unlocked, you must use
       memory_object_lock_completed() and appropriate signals. */

    /* Here we simply provide the appropriate access and log that we got the fault. */
    /* Typically you will do much more (or maybe trash the program).
    */
    trace(" got memory_object_data_unlock msg, object %x, control: %x, offset %x, length %x :",
          memory_object,memory_control,offset,length);

    if ( desired_access == VM_PROT_READ ) {
        /* deny only write access */
        if (( rs = memory_object_lock_request(memory_control,offset,length,FALSE,FALSE,VM_PROT_WRITE,PORT_NULL )) != KERN_SUCCESS) {
            mach_error(" memory_object_data_unlock: memory_object_lock_request failed: ", rs );
            exit (1);
        }
        trace(" read access fault\n");
    }

    if ( desired_access & (VM_PROT_WRITE) ) {
        /* deny nothing */
        if (( rs = memory_object_lock_request(memory_control,offset,length,FALSE,FALSE,VM_PROT_NONE,PORT_NULL )) != KERN_SUCCESS) {
            mach_error(" memory_object_data_unlock: memory_object_lock_request failed: ", rs );
            exit (1);
        }
        trace(" write access fault\n");
    }

    if ( desired_access == VM_PROT_ALL ) {
        /* deny nothing */
        if (( rs = memory_object_lock_request(memory_control,offset,length,FALSE,FALSE,VM_PROT_NONE,PORT_NULL )) != KERN_SUCCESS) {
            mach_error(" memory_object_data_unlock: memory_object_lock_request failed: ", rs );
            exit (1);
        }
        trace(" request for full access\n");
    }

    return(KERN_SUCCESS);
}

/* This routine is not actually listed in the documentation although it is
   referenced by the manual pages for the other calls.
   You must provide this routine, otherwise the program will not link */

/* If you need synchronisation of a lock_request completing before proceeding you
   can ask for this routine to be invoked by providing the memory_object port as
   the reply_to port in memory_object_lock_request */

kern_return_t memory_object_lock_completed(memory_object, memory_control, offset, length )
memory_object_t         memory_object;
memory_object_control_t memory_control;
vm_offset_t             offset;
vm_size_t               length;
{
    return(KERN_SUCCESS);
}

kern_return_t memory_object_copy(memory_object, memory_control, old_memory_control, offset, length, new_memory_object )
memory_object_t         memory_object;
memory_object_control_t memory_control;
memory_object_control_t old_memory_control;
vm_offset_t             offset;
vm_size_t               length;
memory_object_t         new_memory_object;
{
    /* This is only called if we wish to be notified when a memory object is copied
       by the kernel on the request of a user process and the copy occurs. We need
       to set the MEMORY_OBJECT_COPY_CALL attribute in the memory object if we
       wish to be so notified. If you need to be notified you will know why.
       Mostly you either couldn't care or will never have a copy happen. */
    return (KERN_SUCCESS);
}

kern_return_t memory_object_terminate(memory_object, memory_control, memory_object_name )
memory_object_t         memory_object;
memory_object_control_t memory_control;
memory_object_name_t    memory_object_name;
{
    /* Called when the memory object is done with. The last reference to the memory
       object has been deleted. Time to clean up. */
    kern_return_t rs;

    /* Here we just deallocate the ports and stuff.
       You may need to do much more */
    /* port_deallocate() takes the owning task as its first argument */
    if (( rs = port_deallocate(task_self(), memory_object )) != KERN_SUCCESS) {
        mach_error(" memory_object_terminate: port_deallocate for object port failed: ", rs );
        exit (1);
    }
    if (( rs = port_deallocate(task_self(), memory_control )) != KERN_SUCCESS) {
        mach_error(" memory_object_terminate: port_deallocate for control port failed: ", rs );
        exit (1);
    }
    if (( rs = port_deallocate(task_self(), memory_object_name )) != KERN_SUCCESS) {
        mach_error(" memory_object_terminate: port_deallocate for name port failed: ", rs );
        exit (1);
    }
    return(KERN_SUCCESS);
}

/* Here as an example we just set two threads running, one as the external pager,
   the other as a client of the memory object the pager provides. Once the client
   dies we kill the pager and exit. */

void main()
{
    kern_return_t rs;
    cthread_t     Client, Server;
    msg_header_t  requestbuf;
    msg_header_t *request = &requestbuf;

    /* so I at least get my trace writes */
    setbuf(stdout,NULL);
    setbuf(stderr,NULL);

    cthread_debug = FALSE;   /* set to TRUE to turn it on for the hell of it */

    server_lock = mutex_alloc();
    server_cond = condition_alloc();
    mutex_set_name(server_lock, "Server Lock");
    condition_set_name(server_cond,"Server Running");

    if (( rs = port_allocate( task_self(), &server_terminate_port)) != KERN_SUCCESS) {
        mach_error(" Port_allocate for server_terminate_port", rs );
        exit (1);
    }

    /* fork off the server */
    Server = cthread_fork(External_Pager_Proc, NULL );
    cthread_set_name(Server,"Server");
    cthread_detach(Server);

    /* wait for the server to signal that it is running */
    mutex_lock(server_lock);
    condition_wait( server_cond, server_lock );
    mutex_unlock(server_lock);

    /* fork off the client */
    Client = cthread_fork(Client_Proc, NULL );
    cthread_set_name(Client,"Client");

    /* now wait for it to finish */
    cthread_join(Client);

    trace("killing server\n");

    /* now kill the server, not the prettiest way, but one way */
    /* an empty message is enough; fill in the whole header so nothing is left as stack garbage */
    request->msg_simple      = TRUE;
    request->msg_size        = sizeof(msg_header_t);
    request->msg_type        = MSG_TYPE_NORMAL;
    request->msg_local_port  = PORT_NULL;
    request->msg_remote_port = server_terminate_port;
    request->msg_id          = 0;
    msg_send(request,MSG_OPTION_NONE,0);
}
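
-----------------------------------------------------------------
A note on the lock value passed to memory_object_lock_request() in memory_object_data_unlock() above. The argument names the access to be denied, not the access to be granted, which is easy to get backwards. The following is only a minimal sketch of that inversion in portable C, so that it builds without the Mach headers. The PROT_* constants are local stand-ins for the VM_PROT_* bits, and lock_value_for() and lock_value_after_fault() are hypothetical helpers, not part of any Mach library.

```c
/* Local stand-ins for the Mach protection bits, so the sketch builds anywhere.
   They are assumed to mirror VM_PROT_READ, VM_PROT_WRITE and VM_PROT_EXECUTE. */
typedef int prot_t;

#define PROT_NONE    0x00
#define PROT_READ    0x01
#define PROT_WRITE   0x02
#define PROT_EXECUTE 0x04
#define PROT_ALL     (PROT_READ | PROT_WRITE | PROT_EXECUTE)

/* Hypothetical helper: given the access the pager is willing to allow,
   return the lock value, i.e. the set of accesses that stay denied. */
prot_t lock_value_for(prot_t allowed)
{
    return PROT_ALL & ~allowed;
}

/* Hypothetical policy modelled on the example pager: a fault that involves
   writing opens the page completely, any other fault leaves only write
   denied. The pager above handles only the pure read case in its first
   branch; treating execute faults the same way is an assumption made here. */
prot_t lock_value_after_fault(prot_t desired_access)
{
    if (desired_access & PROT_WRITE)
        return lock_value_for(PROT_ALL);               /* deny nothing */
    return lock_value_for(PROT_READ | PROT_EXECUTE);   /* deny write only */
}
```

With these definitions lock_value_after_fault(PROT_READ) yields PROT_WRITE and lock_value_after_fault(PROT_WRITE) yields PROT_NONE, which are the values the pager passes for its read and write faults. Computing the lock value from the allowed set keeps the "deny" sense in one place instead of spreading it over each branch.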