schmitz@fas.ri.cmu.edu (Donald Schmitz) (09/21/89)
A month or so ago, I asked for help with a SUN OS 3.X device driver problem. Thanks to a few kind souls on the net and at SUN, and lots of hacking, we finally got things working. Since what we did seems like a reasonable thing to do, I thought I'd share the solution here.

We have developed a VxWorks or VRTX-like package that lets us write code for single board CPUs sharing a SUN-3 (150/160/260) VME bus. This includes a SUN device driver that supports message based communication with the add-on CPUs via shared memory and mailbox interrupts, and a SUN interface program that downloads these CPUs by mmap()ing the CPU memory and then dumping the right parts of an a.out file into it.

Everything worked fine until we installed a 2nd add-on CPU. Then the SUN would die in mysterious ways, like when starting SunTools, or exiting vi, or even during mount. Exactly where the machine died seemed to depend on the size of the UNIX kernel. After lots of experimentation, it seemed the problems occurred when we exceeded 8M of external devices (which interestingly is 1K of pages...). This was later verified by Peter Corke at CSIRO, Australia (thanks a lot), who claims this limit is actually documented in Sec 8.2.3 of the "Sun Catalyst Porting Reference Guide". The limit in 3.X is around 8-9M, and supposedly shrinks to 6M in 4.0.X.

At first, this didn't make much sense, since there are lots of well known applications with more than 8M of device memory, e.g. systems with 3 color frame buffers at 4M each, or Mercury FPU boards at up to 10M each. The important thing is that none of these applications maps the entire device memory _at probe time_. The memory that gets mapped in at probe time (by the xxprobe() driver routine) is all that is available to the other driver routines, e.g. xxread(), xxwrite() and xxioctl(). The rest of the device memory is mapped only into user space by xxmmap(), when the user program calls mmap() on the device's file descriptor.
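To make the arithmetic behind the limit concrete, here is a rough sketch in plain user-space C (not driver code). The 8K page size is what "8M is 1K of pages" implies; treating the limit as a flat 1024-page budget is a simplifying assumption, and the function names and device sizes in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE        8192UL  /* implied by "8M ... is 1K of pages" */
#define KERNEL_MAP_PAGES 1024UL  /* assumed budget: the observed ~8M limit */

/* Pages needed to cover `bytes` of device memory, rounding up. */
unsigned long pages_needed(unsigned long bytes)
{
    return (bytes + PAGE_SIZE - 1) / PAGE_SIZE;
}

/* Total pages consumed if device i maps probe_bytes[i] at probe time. */
unsigned long probe_pages_total(const unsigned long *probe_bytes, size_t n)
{
    unsigned long total = 0;
    size_t i;

    assert(probe_bytes != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += pages_needed(probe_bytes[i]);
    return total;
}

/* Nonzero if the probe-time mappings stay within the assumed budget. */
int fits_kernel_map(const unsigned long *probe_bytes, size_t n)
{
    return probe_pages_total(probe_bytes, n) <= KERNEL_MAP_PAGES;
}
```

With hypothetical boards of 4M, 4M and 1M all mapped whole at probe time, probe_pages_total() gives 512 + 512 + 128 = 1152 pages and fits_kernel_map() returns 0. If each driver maps only 1K of registers, the total is 3 pages.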
Usually only the device control registers or IO buffers get mapped in at probe time. The SUN driver manual barely hints that this is possible, and never mentions any sort of limit.

Following is an outline of how to do this, for a hypothetical 1M device named "foo", with 1K of control registers at the low end of its memory that the driver needs access to. (This works for 3.X systems. The documentation seems to say 4.0.X drivers remain source code compatible, but we haven't tried this yet.)

1. In the mb_driver struct required for each device, declare the device size (7th field in the struct) as 1024 (0x400).

2. Write a xxmmap() routine that looks something like:

    foommap(dev, off, protection)
        dev_t dev;
        off_t off;
        int protection;
    {
        caddr_t dev_addr;
        int page;

        if (off >= 0x100000)    /* Check for valid device address to map, */
            return -1;          /* return error flag if out of range. */

        /*
         * Get kernel space ptr to device memory, there are usually
         * easier ways to find it (the value we want is the 'reg'
         * argument passed to xxprobe), but this should always work.
         */
        dev_addr = (foodriver.mdr_info[unit(dev)])->md_addr;

        if (off < 0x400) {      /* usual case, memory already mapped in by probe */
            page = getkpgmap(dev_addr + off) & PG_PFNUM;    /* get page number */
            vac_disable_kpage(dev_addr + off);              /* disable caching this memory */
        } else {                /* memory not mapped in by probe */
            page = getkpgmap(dev_addr) & PG_PFNUM;          /* page number of device base */
            page += btop(off);                              /* plus offset in pages */
        }
        return page;
    }

If you want to size the device, either to make sure the right device is there or to determine which of several possible devices it might be, use the mapin() and mapout() routines to access small segments of the device while probing it. Mapin() is also useful for devices which have two noncontiguous address spaces that must both be accessed by the driver. This is actually explained in (so-so) detail in the SUN "Writing a Device Driver" manual (the OS 4.0 versions are a big improvement over the 3.X doc).
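For the user side of the foo outline, here is a minimal sketch of a program mapping device memory and copying a download image into it. It is written against present-day POSIX mmap() rather than the 3.X headers, and the helper names map_device() and download_image() are made up for illustration. It is not our actual interface program.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `len` bytes of a device (or any mmap-able file) starting at
 * byte offset `off`, which must be page aligned. Each page touched
 * ends up as a call to the driver's xxmmap() routine.
 * Returns NULL on failure.
 */
void *map_device(const char *path, off_t off, size_t len)
{
    int fd;
    void *p;

    assert(path != NULL && len > 0);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, off);
    close(fd);                  /* the mapping survives the close */
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Copy `image_len` bytes into mapped device memory at `dev_off`.
 * Returns 0 on success, -1 if the image would run past the mapping.
 */
int download_image(void *dev_mem, size_t dev_len, size_t dev_off,
                   const void *image, size_t image_len)
{
    if (dev_off > dev_len || image_len > dev_len - dev_off)
        return -1;
    memcpy((char *)dev_mem + dev_off, image, image_len);
    return 0;
}
```

For the 1M foo device this would be called as map_device("/dev/foo0", 0, 0x100000), where /dev/foo0 is a hypothetical device node, followed by one download_image() call per a.out segment.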
Finally, we received a message that all versions of SUN OS prior to 4.0.3 have had a bug in the boot program that will cause problems similar to those we encountered should the UNIX kernel grow "too big", where "too big" is vaguely defined as somewhere between 500K and 750K bytes. This can happen if your device drivers declare lots of global variables (like status structures or buffers). There is a fixed boot program available from SUN that is supposed to work with kernels of at least 750K.

Don Schmitz (schmitz@fas.ri.cmu.edu)