[comp.sys.sun] Real time control in device drivers

trainoff@voodoo.ucsb.edu (10/31/89)

I know that Unix is not a real-time OS, but I have a real-time problem in a
driver I have been working on.  The hardware is a custom printer
controller on a Sun-4 running SunOS 4.0.  The relevant part of the
controller is a FIFO that the host loads via DMA, and that the printer
prints out of.  Once the host initiates the transfer, the controller
handshakes the DMA to fill the FIFO.  When the FIFO is full the controller
will not ask for any more DMA.  If the buffer completely empties, you are
in trouble since the printer will just keep on reading the empty FIFO.  The
host is much faster than the printer so most of the time the FIFO is full
and everything works fine.  The way that DMA on a Sun-4 works is that
there is a special address space at the top of the VME address space that
the Sun will map its local memory into.  Sun calls this DVMA.  It is one
megabyte wide.  If the data you want to transfer is bigger than 1M (or
bigger than your device's DMA counter allows), you have to do it in
several chunks.  To facilitate this breakup there is the following kernel
support.  Your xxread routine calls physio with six arguments

physio(strategy, buf, dev, rw_flag, minphys, uio)
        void (*strategy)();
        struct buf *buf;
        dev_t dev;
        int rw_flag;
        void (*minphys)();
        struct uio *uio;

where in this case the strategy routine begins the real transfer.  buf is
a pointer to the buffer containing the data, dev is the device number,
rw_flag sets the direction of the transfer, minphys is a routine to
determine the sizes of the chunks, and uio is a pointer to the user I/O
structure.

physio makes sure that each chunk is physically locked in core, checks the
buffer flags to ensure that the device is not busy, does some bookkeeping
in the uio struct and then calls strategy to do the transfer.
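
As a rough illustration of the chunking described above, here is a small
user-space sketch.  It is not the kernel's physio: every model_* name is made
up for this example, the 1 MB limit is simply the DVMA window size mentioned
earlier, and the locking, busy check and uio bookkeeping are left out.  All it
shows is a request being handed to a strategy routine one chunk at a time.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these names are SunOS symbols. */
#define MODEL_DVMA_WINDOW (1024UL * 1024UL)  /* the 1 MB DVMA window */

typedef void (*model_strategy_fn)(size_t offset, size_t count, void *ctx);

/* Stand-in for minphys: clamp one chunk to what fits in the window. */
static size_t model_minphys(size_t count)
{
    return count > MODEL_DVMA_WINDOW ? MODEL_DVMA_WINDOW : count;
}

/*
 * Stand-in for the physio loop: call strategy once per chunk until the
 * whole request has been handed over.  Returns the number of chunks.
 */
static unsigned model_physio(model_strategy_fn strategy, size_t total,
                             void *ctx)
{
    size_t offset = 0;
    unsigned chunks = 0;

    while (offset < total) {
        size_t count = model_minphys(total - offset);

        /* A real strategy routine would start the DMA and sleep here. */
        strategy(offset, count, ctx);
        offset += count;
        chunks++;
    }
    return chunks;
}

/* Example strategy: just add up the bytes it was asked to move. */
static void model_count_bytes(size_t offset, size_t count, void *ctx)
{
    (void)offset;
    *(size_t *)ctx += count;
}
```

Under these assumptions a 2.5 MB request goes through the strategy routine
three times: two full 1 MB chunks and one 512 KB remainder.  The gap between
one call and the next is where the delay I describe below can creep in.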

xxstrategy does some semaphore stuff to make sure that it really has the
device to itself.  Then it calls a Sun routine, mbsetup, to map the buffer
into the DVMA space.  xxstrategy then starts the DMA.  It sleeps until the
I/O is done or it times out.

In the bottom half of the driver the xxintr() routine services the
interrupt which the board generates to signal that the DMA has finished and
then wakes up the sleeping xxstrategy().  xxstrategy finishes and then
physio sets up the next transfer.  The cycle repeats until all of the
data has been transferred.

This works fine most of the time.  The problem is that when there is heavy
disk or ethernet traffic, both of which have higher interrupt priority,
there is sometimes a delay between one DMA transfer and the next.
Presumably it is because some other driver is busy servicing its own
interrupt.  What I want to know is how I can ensure that my driver gets to
set up its next DMA transfer before anyone else.  I tried playing with
splx() levels at various parts of the driver to prevent the other
interrupts from coming through.  This didn't work.

This led me to something I don't understand.  I decided to raise the
hardware interrupt priority of my device above the disk and the ethernet.
I decided on VMEbus interrupt level 4.  I figure that if the disk interrupt
isn't serviced on time I just lose a disk rotation.  If the ethernet isn't
serviced, I lose a packet and need a retransmit.  If the FIFO on my
printer controller empties, I lose the page and, what's worse, I don't know
that I have lost it.  I changed the interrupt levels on the board and in
the GENERIC file.  I remade the kernel and rebooted.  The system hung
during the fsck of / and /usr.  Then just for fun I tried to reboot the
old kernel (with the priority 2 in the GENERIC file) with the board still
at level 4.  It came up fine and the problem was much reduced but not
eliminated.  I am confused by this.  I thought that the hardware and the
GENERIC file had to agree!?  What is going on here?

Any help or insight into this problem would be appreciated.

..Steve Trainoff

perw@holtec.se (Per Westerlund) (11/21/89)

trainoff@voodoo.ucsb.edu writes:
>This led me to something I don't understand.  I decided to raise the
>hardware interrupt priority of my device above the disk and the ethernet.
>I decided on VMEbus interrupt level 4.  I figure that if the disk interrupt
>isn't serviced on time I just lose a disk rotation.  If the ethernet isn't
>serviced, I lose a packet and need a retransmit.  If the FIFO on my
>printer controller empties, I lose the page and, what's worse, I don't know
>that I have lost it.  I changed the interrupt levels on the board and in
>the GENERIC file.  I remade the kernel and rebooted.  The system hung
>during the fsck of / and /usr.  Then just for fun I tried to reboot the
>old kernel (with the priority 2 in the GENERIC file) with the board still
>at level 4.  It came up fine and the problem was much reduced but not
>eliminated.  I am confused by this.  I thought that the hardware and the
>GENERIC file had to agree!?  What is going on here?

SPARC has 16 interrupt levels, VME 7.  The intended mapping is: SPARC =
VME*2.  The hardware mapping is done by the VME interface on the
CPU-board.

We then must compensate for this in software, which is done by the macro
pritospl().  It multiplies the given value by two: it shifts left by 9 bits
where a shift of 8 is all that is required.

Unfortunately, this mapping is already done by config!  In your config
file you entered 'priority 4'.  When xxattach() gets called the priority
level is set to 8 (try it).  (On one of the boards I have tried, this
resulted in trying to set the board's interrupt level to 8, which became 0
when only three bits were used -> "interrupts off"!)

It also means that the routine trying to protect a critical region with
splx(pritospl(XXX)) fails, because 8*2 = 16 -> 0 by masking!  This can
explain your problems during fsck.
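
To make the arithmetic concrete, here is a small user-space sketch.  The
model_* names are made up for illustration, and the shift and mask are
assumptions taken from the description above (on SPARC the processor keeps
the interrupt level in a 4-bit field at bits 8 to 11 of the PSR).  This is
not the actual SunOS macro, only a model of the doubling and masking.

```c
/* Hypothetical model of the arithmetic; not SunOS definitions. */
#define MODEL_PIL_SHIFT 8        /* interrupt level field starts at bit 8 */
#define MODEL_PIL_MASK  0xF00u   /* 4-bit field: levels 0..15 */

/* What config is described as doing with "priority N": VME -> SPARC. */
static unsigned model_config_priority(unsigned vme_level)
{
    return vme_level * 2;
}

/* pritospl as described: shift by 9 rather than 8, which doubles again. */
static unsigned model_pritospl(unsigned level)
{
    return level << (MODEL_PIL_SHIFT + 1);
}

/* The level that survives once only the 4-bit field is kept. */
static unsigned model_effective_level(unsigned psr_bits)
{
    return (psr_bits & MODEL_PIL_MASK) >> MODEL_PIL_SHIFT;
}

/* A board register that keeps only three bits of interrupt level. */
static unsigned model_board_level(unsigned level)
{
    return level & 0x7u;
}
```

With these assumptions, model_config_priority(4) gives 8,
model_board_level(8) gives 0, and
model_effective_level(model_pritospl(8)) also gives 0, so the critical
region ends up protected at level 0.  For 'priority 2' the same chain
gives 4 after config and 8 after pritospl().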

			Per Westerlund