nelson@sun.soe.clarkson.edu (Russ Nelson) (09/30/88)
I notice that the section of the Runtime System manual that deals with Writing Device Drivers and interrupts says that interrupts can be lost. Is this true? If so, does Microport consider it a bug (i.e. will it be fixed?)
--
--russ (nelson@clutx [.bitnet | .clarkson.edu])
To surrender is to remain in the hands of barbarians for the rest of my life.
To fight is to leave my bones exposed in the desert waste.
dave@micropen (David F. Carlson) (10/01/88)
In article <NELSON.88Sep29160014@sun.soe.clarkson.edu>, nelson@sun.soe.clarkson.edu (Russ Nelson) writes:
> I notice that the section of the Runtime System manual that deals with
> Writing Device Drivers and interrupts says that interrupts can be
> lost. Is this true? If so, does Microport consider it a bug (i.e.
> will it be fixed?)
> --russ (nelson@clutx [.bitnet | .clarkson.edu])

The problem is not Microport's: it's the d*mn IBM PC/AT interrupt controller (aka Intel 8259). The problem is not solvable in software alone, thus Microport is not to blame. It was nice of them to tell you that it is a problem, though, so you won't pull your hair out trying to figure out why.

It is good device driver design to *assume* you will lose a critical interrupt, so that your design can cover its ass with polling. If the "next" interrupt time is known, a callout can be done to "simulate" the missing interrupt. The rule for device drivers anywhere is that there is no such thing as reliable interrupts.
--
David F. Carlson, Micropen, Inc.
micropen!dave@ee.rochester.edu
"The faster I go, the behinder I get." --Lewis Carroll
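The callout that "simulates" a missing interrupt can be sketched in plain C. What follows is a minimal user-space model under stated assumptions, not code from any Microport or System V driver: `struct dev`, `dev_init`, `dev_intr`, `dev_watchdog` and the tick counter are hypothetical names, and `data_ready` stands in for whatever status-register poll the real hardware offers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; none of these fields come from a real kernel. */
struct dev {
    bool     data_ready;     /* what a status-register poll would report    */
    unsigned next_deadline;  /* tick by which the next interrupt is due     */
    unsigned period;         /* expected ticks between interrupts           */
    unsigned serviced;       /* total number of times the device was served */
    unsigned recovered;      /* services started by the watchdog, not IRQ   */
};

void dev_init(struct dev *d, unsigned now, unsigned period)
{
    assert(period > 0);
    d->data_ready = false;
    d->period = period;
    d->next_deadline = now + period;
    d->serviced = 0;
    d->recovered = 0;
}

/* Common service path, shared by the real interrupt and the watchdog. */
static void dev_service(struct dev *d, unsigned now)
{
    d->data_ready = false;               /* "read" the device              */
    d->serviced++;
    d->next_deadline = now + d->period;  /* re-arm the expected time       */
}

/* What the real interrupt handler would do when the interrupt arrives. */
void dev_intr(struct dev *d, unsigned now)
{
    if (d->data_ready)
        dev_service(d, now);
}

/* Called periodically from a callout. If the deadline has passed and the
 * device still has data, assume the interrupt was lost and service it. */
void dev_watchdog(struct dev *d, unsigned now)
{
    /* Signed difference keeps the comparison valid across tick wraparound. */
    if (d->data_ready && (int)(now - d->next_deadline) >= 0) {
        d->recovered++;
        dev_service(d, now);
    }
}
```

The point of sharing `dev_service` is that a late real interrupt and the watchdog cannot both process the same data: whichever runs first clears `data_ready`. A real driver would also have to make that check atomic with respect to the interrupt, which this model does not attempt.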
sl@van-bc.UUCP (pri=-10 Stuart Lynne) (10/01/88)
In article <553@micropen> dave@micropen (David F. Carlson) writes:
>In article <NELSON.88Sep29160014@sun.soe.clarkson.edu>, nelson@sun.soe.clarkson.edu (Russ Nelson) writes:
>> I notice that the section of the Runtime System manual that deals with
>> Writing Device Drivers and interrupts says that interrupts can be
>> lost. Is this true? If so, does Microport consider it a bug (i.e.
>> will it be fixed?)
>> --russ (nelson@clutx [.bitnet | .clarkson.edu])
>
>The problem is not Microport's: it's the d*mn IBM PC/AT interrupt
>controller (aka Intel 8259.) The problem is not solvable in software
>alone, thus Microport is not to blame. It was nice of them to tell
>you that it is a problem though so you won't pull your hair out trying
>to figure out why. It is good device driver design to *assume* you
>will lose a critical interrupt so your design can cover its ass with
>polling. If the "next" interrupt time is known, a callout can be done
>to "simulate" the missing interrupt. The rule for device drivers anywhere
>is that there is no such thing as reliable interrupts.
>

You're right, that problem is not Microport's or, on the 386 more generically, the System V port's.

However we take note that SCO mysteriously loses far less than Microport. The reason of course is that Microport spends a *lot* more time at spl7 than SCO does. This exacerbates the problem that you mention.

In my experience system interrupt overhead and time lost through use of spl7 is the primary cause of lost interrupts. At least when they are lost due to a large influx of them.

SCO apparently has spent much time and effort in finding all places where spl7 is needed and *not* needed and has reduced the amount of time when they lock them out.

For example with the tty drivers, try the following:

    stty 19200 -ixoff -echo; cat > /tmp/test

then from your terminal emulator program dump about 100kb of data to the system.

Even on an idle 386 system with System V you will see very few lines in the destination file which are correct. On the same 386 running SCO you will lose very few characters.

SCO has also cleaned up interrupt servicing a bit. My rough guesstimate for servicing a serial tx interrupt on a 386 is 300-350 microseconds for SCO versus 400-450 microseconds for System V. Of those figures the actual overhead for the interrupt servicing is probably about 150 vs 200, with the balance being spent in the actual serial driver interrupt routine.

Let's take a poll. Will anyone using a Trailblazer on a dumb serial port under any type of Unix system send me a message on how successfully it runs uucp at 9600 or 19200? What are your normal operating parameters? I'll summarize if enough people reply.

As a further example: here on van-bc I can run two Trailblazers fairly successfully at 9600 on dumb ports (van-bc is a 10 MHz 68010-based system). I cannot, however, run one at 19200. Unfortunately somewhere in the kernel someone is raising spl7 at odd intervals, causing the uucp connection to drop a character and time out.

Running two at 9600 works because even though the net throughput is the same as one at 19200, and the system overhead is actually higher, the time to fill the three-character buffer in the 8274 is twice as long at 9600: about three milliseconds versus one and a half. I can pull the stuff out very quickly, but someone is getting spl7 for something right around the one and a half millisecond range, and sometimes the driver just can't quite get the data out before that next character arrives. But there is plenty of time to get it out when running at 9600; the 8274's buffers haven't even filled up yet.

It's real close too: uucp will generally run for about five to fifteen seconds before losing that character.
--
Stuart.Lynne@wimsey.bc.ca {ubc-cs,uunet}!van-bc!sl Vancouver,BC,604-937-7532
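The "three milliseconds versus one and a half" figures can be checked with a few lines of C. This is only a back-of-the-envelope sketch: it assumes 10 bits on the wire per character (one start bit, eight data bits, one stop bit), which the post does not state, and `fill_time_us` and `slack_us` are names made up for the illustration. The three-character buffer depth is the figure given for the 8274 above.

```c
#include <assert.h>

/* Microseconds for `nchars` characters to arrive at `baud` bits per second.
 * Assumes 10 bits per character on the wire (start + 8 data + stop). */
unsigned long fill_time_us(unsigned long baud, unsigned nchars)
{
    assert(baud > 0);
    return (1000000UL * 10UL * nchars) / baud;   /* truncates toward zero */
}

/* Time left over if interrupts are locked out for `lockout_us` while the
 * receive buffer is filling. A negative result means a character is lost. */
long slack_us(unsigned long baud, unsigned nchars, unsigned long lockout_us)
{
    return (long)fill_time_us(baud, nchars) - (long)lockout_us;
}
```

With these assumptions, `fill_time_us(9600, 3)` is 3125 and `fill_time_us(19200, 3)` is 1562. Taking 1500 microseconds as an illustrative lockout ("right around the one and a half millisecond range"), `slack_us(19200, 3, 1500)` leaves only 62 microseconds, while `slack_us(9600, 3, 1500)` leaves 1625, which fits the observation that 19200 is "real close" and 9600 is comfortable.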
hedrick@athos.rutgers.edu (Charles Hedrick) (10/04/88)
But SCO is based on Xenix. I don't know how much traditional Xenix code is present now compared with code from ATT's System V, but at least the developers had Xenix available to steal from. So they may not have had to go through the System V kernel from scratch, "cleaning up" interrupt handling. Rather, they may simply have adopted Xenix methodology for dealing with them. Xenix was designed from the beginning for relatively slow machines and support of funny devices, so it is reasonable to think that it might have better interrupt latency than SV. If the problems are in the base SV/286 or 386 kernel, i.e. the part from ATT/Intel, it may not be practical for Microport to fix it. I've recently been working on Minix a lot to get serial I/O to work there. The fixes were in general not to the RS232 driver, but throughout the kernel to keep down the size of locked code segments. Also some adjustment of buffer sizes, again not at the driver level. I would expect something similar in Unix. Microport may be unable/unwilling to make changes throughout the ATT-maintained portion of the kernel. Everybody keeps yelling about the serial device drivers as if the problem could be fixed there. I really doubt it.
mike@cimcor.mn.org (Michael Grenier) (10/04/88)
From article <Oct.3.14.16.01.1988.28689@athos.rutgers.edu>, by hedrick@athos.rutgers.edu (Charles Hedrick):
> driver level. I would expect something similar in Unix. Microport
> may be unable/unwilling to make changes throughout the ATT-maintained
> portion of the kernel. Everybody keeps yelling about the serial
> device drivers as if the problem could be fixed there. I really doubt
> it.

Actually, I believe the problems with the lost interrupts could be fixed in the serial device drivers. Right now, as each character forces an interrupt, the interrupt routine looks at the Interrupt Identification register and decides if a character is available to be read or not. If it is, the character is NOT simply put into a buffer but is also passed through the line discipline routines (clists and stuff) while running at spl7 (all other interrupts are turned off). This takes a considerable amount of time and should not be done within an interrupt routine. You can read a better discussion about clists in real time device drivers in the book "Writing UNIX Device Drivers".

A better solution would be to simply put the character into a buffer and return out of the interrupt routine. Then the trick becomes "How do we get the characters through the line discipline routines?". One method might be to steal an idea that was presented in the book "The UNIX Papers" where polling was used in the device drivers. A working example of this is Bell Tech's ICC card or Digiboard's Intelligent card, where a separate process is running handling the details of the card.

The idea I have is to have a separate process waiting on a wait() in the kernel where it would wake up every 1/60th or 1/30th of a second to read the characters out of the buffer and pass them through the line discipline routines. In this way, most of the character processing would be handled with interrupts turned on. To improve processing time, one could allow reads and writes in raw mode to be passed directly to the buffers, bypassing the line discipline routines altogether. (This assumes programs like UUCP and ZMODEM run with the serial line in RAW mode.)

I'm no device driver expert but I think the process can be made to wait at a priority less than PZERO so it will be the next process to run every tick or so...we don't want an undue latency time for people running terminals on the serial lines.

I would be happy to write the above mentioned driver (to include support for the 16550 UARTS sitting here) if someone could explain to me what all of the fields in the linesw structure (in sys/conf.h) and tty structure (in sys/tty.h) are.

-Mike Grenier
mike@cimcor.mn.org
...uunet!rosevax!cimcor!mike
...amdahl!bungia!cimcor!mike
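If the buffer is only emptied every 1/60th or 1/30th of a second, it has to hold everything that can arrive between two wakeups. The sketch below works out that size. It assumes 10 bits per character on the wire and a single line running flat out; `chars_per_poll` and `buffer_size` are hypothetical helper names, and the `safety` multiplier (how many poll intervals of data the buffer should survive) is an illustrative parameter, not something taken from the post.

```c
#include <assert.h>

/* Worst-case characters arriving on one line during one poll interval,
 * rounded up. Assumes 10 bits per character on the wire. */
unsigned chars_per_poll(unsigned long baud, unsigned polls_per_sec)
{
    unsigned long cps = baud / 10;   /* characters per second */

    assert(polls_per_sec > 0);
    return (unsigned)((cps + polls_per_sec - 1) / polls_per_sec);
}

/* Smallest power-of-two buffer that holds `safety` poll intervals of
 * input, so that a late wakeup or two does not overflow it. */
unsigned buffer_size(unsigned long baud, unsigned polls_per_sec,
                     unsigned safety)
{
    unsigned need = chars_per_poll(baud, polls_per_sec) * safety;
    unsigned size = 1;

    while (size < need)
        size <<= 1;
    return size;
}
```

Under these assumptions a 19200 baud line delivers at most 32 characters per 1/60th of a second and 64 per 1/30th, so even with a safety factor of two the per-line buffer is only 64 or 128 bytes. A power-of-two size is chosen here because it lets a circular buffer wrap its index with a mask instead of a division.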
herder@myab.se (Jan Herder) (10/04/88)
In article <1900@van-bc.UUCP> sl@van-bc.UUCP (pri=-10 Stuart Lynne) writes:
<However we take note that SCO mysteriously loses far less than Microport.
<The reason of course is that Microport spends a *lot* more time at spl7 than
<SCO does. This exacerbates the problem that you mention.
<
<In my experience system interrupt overhead and time lost through use of spl7
<is the primary cause of lost interrupts. At least when they are lost due to
<a large influx of them.
<
<SCO apparently has spent much time and effort in finding all places where
<spl7 is needed and *not* needed and has reduced the amount of time when they
<lock them out.
<
<For example with the tty drivers, try the following:
<
< stty 19200 -ixoff -echo; cat > /tmp/test
<
<then from your terminal emulator program dump about 100kb of data to the
<system.
There are ways of dealing with this problem; one is called pseudo-DMA.
If you have a dumb serial port, you write a very small interrupt routine
which reads the UART and puts the characters in a big circular list, which
can be read at a later time. If you make sure never to lock out this small
interrupt routine, you don't lose any characters.
This technique has been used with DZ ports on VAXen and serial ports on SUNs.
The *RIGHT* way to do it is of course to get a better serial card.
--
Jan Herder, MYAB Sweden | Phone: +46 31 18 75 12
Internet: herder@myab.se | Fax: +46 31 18 28 42
UUCP: uunet!enea!chalmers!myab!herder | Address: Dr. Forseliusg 21
ARPA: herder%myab.se@uunet.uu.net | 413 26 Gothenburg
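The "big circular list" behind pseudo-DMA can be sketched as a single-producer, single-consumer ring buffer. This is a minimal user-space illustration, not the DZ or Sun driver code: `struct ring`, `ring_init`, `ring_put`, `ring_drain` and the size of 256 are all assumptions made for the example. The interrupt routine would call `ring_put` with the byte it just read from the UART; the later, interruptible reader would call `ring_drain`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 256u   /* hypothetical size; must be a power of two */

struct ring {
    volatile unsigned head;        /* advanced only by the interrupt side */
    volatile unsigned tail;        /* advanced only by the deferred side  */
    unsigned overruns;             /* characters dropped because full     */
    unsigned char data[RING_SIZE];
};

void ring_init(struct ring *r)
{
    /* Index masking below only works for power-of-two sizes. */
    assert((RING_SIZE & (RING_SIZE - 1u)) == 0);
    r->head = 0;
    r->tail = 0;
    r->overruns = 0;
}

/* Interrupt side: store one character and return immediately.
 * Returns false (and counts an overrun) if the buffer is full. */
bool ring_put(struct ring *r, unsigned char c)
{
    if (r->head - r->tail == RING_SIZE) {
        r->overruns++;
        return false;
    }
    r->data[r->head & (RING_SIZE - 1u)] = c;
    r->head++;                      /* publish only after the store */
    return true;
}

/* Deferred side: copy up to `max` buffered characters into `out`,
 * oldest first, and return how many were copied. */
size_t ring_drain(struct ring *r, unsigned char *out, size_t max)
{
    size_t n = 0;

    while (n < max && r->tail != r->head) {
        out[n++] = r->data[r->tail & (RING_SIZE - 1u)];
        r->tail++;
    }
    return n;
}
```

Because each index is written by only one side, the interrupt routine never has to wait for the reader, which is what allows it to stay short. The `volatile` qualifiers are a simplification; a real kernel would rely on its own rules for memory ordering and interrupt masking.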
vandys@hpcupt1.HP.COM (Andrew Valencia(Seattle)) (10/04/88)
/ hpcupt1:comp.unix.microport / hedrick@athos.rutgers.edu (Charles Hedrick) / 11:16 am Oct 3, 1988 /
>But SCO is based on Xenix. I don't know how much traditional Xenix
>code is present now compared with code from ATT's System V, but at
>least the developers had Xenix available to steal from.

God, here we go again. Listen *very carefully*:

1. Old SCO XENIX was weird
2. Current SCO XENIX is a port of System V
3. Current SCO XENIX is still somewhat weird in the name of compatibility
4. Current SCO XENIX will become less weird when the merged port comes out

'Nuff said.

Andy
sl@van-bc.UUCP (pri=-10 Stuart Lynne) (10/05/88)
In article <Oct.3.14.16.01.1988.28689@athos.rutgers.edu> hedrick@athos.rutgers.edu (Charles Hedrick) writes:
>Microport to fix it. I've recently been working on Minix a lot to get
>serial I/O to work there. The fixes were in general not to the RS232
>driver, but throughout the kernel to keep down the size of locked code
>segments. Also some adjustment of buffer sizes, again not at the
>driver level. I would expect something similar in Unix. Microport
>may be unable/unwilling to make changes throughout the ATT-maintained
>portion of the kernel. Everybody keeps yelling about the serial
>device drivers as if the problem could be fixed there. I really doubt
>it.

I apologize for not making my original comments a little more clear. Yes, this is exactly the problem. You can't just stick a better serial driver in without changing other things in the kernel as well.

For example one of the basic differences between SCO 386 and the SysV 386 products is the priority of the interrupts.

    SCO             SysV
    SPL7  Serial    SPL7    Clock
    SPL6  Clock     SPLTTY  Serial

SysV allows the clock interrupt to take over the machine at a higher priority level than (for example) the serial interrupts. SCO places the serial interrupts at the top, allowing them to take priority over virtually everything else in the system. Which one do you think will lose more serial interrupts (i.e. they both do but the numbers vary greatly)?

SCO also has some other tricks in the serial driver interrupt handler, such as not doing the standard input processing there, but doing it from a poll routine at the clock interrupt priority level; again allowing receiving chars to take precedence over processing them.
--
Stuart.Lynne@wimsey.bc.ca {ubc-cs,uunet}!van-bc!sl Vancouver,BC,604-937-7532
dyer@spdcc.COM (Steve Dyer) (10/05/88)
In article <10770002@hpcupt1.HP.COM> vandys@hpcupt1.HP.COM (Andrew Valencia(Seattle)) writes:
>>But SCO is based on Xenix. I don't know how much traditional Xenix
>>code is present now compared with code from ATT's System V, but at
>>least the developers had Xenix available to steal from.
> God, here we go again. Listen *very carefully*:
>1. Old SCO XENIX was weird
>2. Current SCO XENIX is a port of System V
>3. Current SCO XENIX is still somewhat weird in the name of compatibility
>4. Current SCO XENIX will become less weird when the merged port comes out

Well, you're both right. I think Chuck's point is well taken that Microsoft had had a lot more experience on what NOT to do to get decent performance on a PC-type machine. A lot of this is just plain old kernel expertise. If you've ever looked at Sys V.3 kernel sources, you will find spln()'s all over the place, in places where there's no possibility that an interrupt could affect a particular flag or data structure. This is not inherently bad, but is a clue that some of the people who worked on it weren't quite on the ball (now, there's a lot of code which is correct, too!) Add to that the need for an OEM like Microport to provide its own device drivers and this has a greater possibility of occurring (calling the line discipline input routine at spl7(), if it's true, is a good example of this.)
--
Steve Dyer
dyer@harvard.harvard.edu
dyer@spdcc.COM aka {harvard,husc6,linus,ima,bbn,m2c,mipseast}!spdcc!dyer
sl@van-bc.UUCP (pri=-10 Stuart Lynne) (10/06/88)
In article <591@cimcor.mn.org> mike@cimcor.mn.org (Michael Grenier) writes:
>From article <Oct.3.14.16.01.1988.28689@athos.rutgers.edu>, by hedrick@athos.rutgers.edu (Charles Hedrick):
>A better solution would be to simply put the character into a buffer
>and return out of the interrupt routine. Then the trick becomes "How
>do we get the characters through the line discipline routines?". One
>method might be to steal an idea that was presented in the book
>"The UNIX Papers" where polling was used in the device drivers. A
>with interrupts turned on. To improve processing time, one could
>allow reads and writes in raw mode to be passed directly to the buffers
>bypassing the line discipline routines altogether. (This assumes

Not really required, the line disciplines are not too bad when it comes to raw I/O.

>programs like UUCP and ZMODEM run with the serial line in RAW mode).

Uucp does, zmodem doesn't.

This *is* essentially what SCO is already doing. They have built in support for a poll routine in a driver which is called every clock tick. Their interrupt routines for the serial driver are at SPL7 and the clock tick is SPL6. The serial interrupts operate out of buffers which are filled/emptied by the poll routine.

>I'm no device driver expert but I think the process can be made to
>wait at a priority less than PZERO so it will be the next process to run
>every tick or so...we don't want an undue latency time for people
>running terminals on the serial lines.

By using the poll routine (or equivalent using timeout() with other Unixes) you don't have to worry about running as a user process with the other attendant issues you mention. You are effectively still running as an interrupt routine; the trick is to get the clock running at a lower spl level than the serial interrupt. We are not worried about general overhead as much as we are worried about leaving lots of cpu cycles available at SPL7. In other words, when a serial interrupt arrives, there is never a period of more than one or two hundred microseconds before we run the interrupt service routine.

Of course there are some bugs to be worked out but it does work fairly well. One problem under SCO 386 is that all of the line discipline routines (e.g. canon()) protect the tty structure at SPL5. Unfortunately the poll routines come in at SPL6! There are a couple of small windows where some important information can get lost and the port will stop functioning until closed and re-opened. SCO has extra code in their poll routines to compensate for this problem.

>I would be happy to write the above mentioned driver (to include
>support for the 16550 UARTS sitting here) if someone could explain
>to me what all of the fields in the linesw structure (in sys/conf.h)
>and tty structure (in sys/tty.h) are.

Already done. I'm finishing up the non-polling version today for both SCO and SysV on the 386. Hope to have polling versions tested by next week... it's working but the SPL5 problem is a bitch. I've got to ensure that I've found all of the problem areas.

Actually with the 16550's you don't quite need to go to a polling scheme, but with the 16450's it's the only way to guarantee you don't lose interrupts.
--
Stuart.Lynne@wimsey.bc.ca {ubc-cs,uunet}!van-bc!sl Vancouver,BC,604-937-7532
mike@cimcor.mn.org (Michael Grenier) (10/06/88)
From article <1905@van-bc.UUCP>, by sl@van-bc.UUCP (pri=-10 Stuart Lynne):
! For example one of the basic differences between SCO 386 and the SysV 386
! products is the priority of the interrupts.
!
!     SCO             SysV
!     SPL7  Serial    SPL7    Clock
!     SPL6  Clock     SPLTTY  Serial
!
! SysV allows the clock interrupt to take over the machine at a higher
! priority level than (for example) the serial interrupts.

I don't think so. Microport has the serial interrupts at SPL7 (the highest) and the clock at the lowest (which is probably why the clock loses time!). In fact, I doubt Microport is losing that many interrupts on the serial lines until the entire system gets too loaded, which doesn't take that much with the overhead being incurred.

-Mike Grenier
mike@cimcor.mn.org
uunet!rosevax!cimcor!mike
sl@van-bc.UUCP (pri=-10 Stuart Lynne) (10/07/88)
In article <592@cimcor.mn.org> mike@cimcor.mn.org (Michael Grenier) writes:
>From article <1905@van-bc.UUCP>, by sl@van-bc.UUCP (pri=-10 Stuart Lynne):
>! For example one of the basic differences between SCO 386 and the SysV 386
>! products is the priority of the interrupts.
>!     SCO             SysV
>!     SPL7  Serial    SPL7    Clock
>!     SPL6  Clock     SPLTTY  Serial
>! SysV allows the clock interrupt to take over the machine at a higher
>! priority level than (for example) the serial interrupts.
>I don't think so. Microport has the serial interrupts at SPL7 (the
>highest) and the clock at the lowest (which is probably why the

Can't speak to Microport 286, but I just spent an hour and a half pulling in Microport's 386 atconf directories off tape and they match the standard System V / 386 stuff pretty closely. The clock is at SPL7 and serial is at SPLTTY.

For inquiring minds, SPL6 < SPLTTY < SPL7. In other words SPL7 is actually priority level 8!

In any event I'm not sure it will be possible to distribute a polling serial driver which needs the clock to be at a lower SPL level; the standard release has a check for what SPL level it is running at and panics with a polite message if not at SPL7.
--
Stuart.Lynne@wimsey.bc.ca {ubc-cs,uunet}!van-bc!sl Vancouver,BC,604-937-7532