[comp.windows.ms] Communication Error CE_OVERRUN under Windows

chrisr@cognos.UUCP (Christine Roine) (08/09/89)

Communication Problem
---------------------

One of our applications accesses and displays information retrieved 
from a host computer via a serial communication line.  We have a
communications bug that occurs fairly reliably, but cannot be
reproduced with the same sequence of steps every time.  (Programmer's
nightmare!)  The application runs perfectly for as long as 10 to 15
minutes.  Then, suddenly, we get a CE_OVERRUN error returned by 
GetCommError().  The documentation explains this as "A character
is not read from the hardware before the next character arrives.
The character is lost."  Our communication software asks for a 
retransmission of the data that was lost, but CE_OVERRUNs keep 
occurring in the retransmitted data.  
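
A small sketch of how the word returned by GetCommError() can be
decoded may help separate CE_OVERRUN (the UART lost a character) from
CE_RXOVER (the driver's receive queue filled up), since the two point
to different causes.  The flag values below are the ones found in later
windows.h headers and are only assumed to match the Windows 2.03 SDK;
decode_comm_error is a hypothetical helper, written in Python purely
for illustration.

```python
# Communication error flags as defined in windows.h.
# ASSUMPTION: the values match the Windows 2.03 SDK headers; check your copy.
CE_FLAGS = {
    0x0001: "CE_RXOVER (receive queue overflow in the driver)",
    0x0002: "CE_OVERRUN (hardware overrun in the UART)",
    0x0004: "CE_RXPARITY (parity error)",
    0x0008: "CE_FRAME (framing error)",
    0x0010: "CE_BREAK (break detected)",
    0x0100: "CE_TXFULL (transmit queue full)",
}


def decode_comm_error(status):
    """Return a description of every known flag that is set in status."""
    return [name for bit, name in sorted(CE_FLAGS.items()) if status & bit]


if __name__ == "__main__":
    # Example: an overrun and a framing error reported together.
    for line in decode_comm_error(0x0002 | 0x0008):
        print(line)
```

Logging the decoded flags with each error would show whether the
overruns arrive alone or together with framing or parity errors.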

At first we thought that perhaps we were writing over the interrupt vectors 
in low memory.  However, a closer look at the data we were receiving
revealed that we get some correct data after a CE_OVERRUN occurs.
In some cases, for example, we are just dropping a couple of characters
in the middle of a message.

Our second idea was that perhaps something else we were doing was
disabling interrupts, causing us to miss data.  Apparently,
interrupts are turned off during disk I/O.  We don't do any
disk I/O during the time we are receiving data.  However, we are
guessing that Windows may be swapping segments between memory
and disk, and that this could be causing the problem.

Our third idea was to drop the baud rate (from 9600 to 2400).  (We
are grasping at straws!)  The CE_OVERRUNs still occurred.
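
A rough calculation suggests why the lower baud rate may not have
helped.  Assuming 8N1 framing (10 bits per character) and a UART that
holds only one received character at a time, the driver has roughly one
character time to service each receive interrupt.  The function names
and the framing are assumptions made for this sketch, not details of
our setup.

```python
def char_time_ms(baud, bits_per_char=10):
    """Time for one character to arrive, in milliseconds.

    ASSUMPTION: 10 bits per character (1 start, 8 data, 1 stop).
    """
    return 1000.0 * bits_per_char / baud


def would_overrun(baud, blocked_ms, bits_per_char=10):
    """True if interrupts blocked for blocked_ms exceed one character time.

    ASSUMPTION: the UART buffers a single received character, so a
    character not read within one character time is overwritten.
    """
    return blocked_ms > char_time_ms(baud, bits_per_char)


if __name__ == "__main__":
    for baud in (9600, 2400):
        print(baud, "baud:", round(char_time_ms(baud), 2), "ms per character")
```

Under these assumptions the budget is about 1.04 ms at 9600 baud and
about 4.17 ms at 2400 baud, so anything that keeps interrupts off for
longer than about 4 ms would still lose characters at the lower rate.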

Our fourth idea was to try to recover from a CE_OVERRUN.  We tried
shutting and re-opening the port, and resetting the UART with
an Escape call.  Nothing has helped.  

We are running Windows 2.03.  Microsoft supplied us with an advance
version of the communications driver, but this didn't fix the
problem.

Has anyone run into this or a similar problem?


-- 
Christine Roine          Cognos Incorporated     S-mail: P.O. Box 9707
Voice: (613) 738-1440 x6111                              3755 Riverside Drive
  FAX: (613) 738-0002                                    Ottawa, Ontario
 UUCP: decvax!utzoo!dciem!nrcaer!cognos!chrisr           CANADA  K1G 3Z4

robert@sysint.UUCP (Robert Nelson) (08/09/89)

chrisr@cognos.UUCP (Christine Roine) writes:

>Communication Problem
>---------------------

>One of our applications accesses and displays information retrieved 
>from a host computer via a serial communication line.  We have a
>communications bug that occurs fairly reliably, but cannot be
>reproduced with the same sequence of steps every time.  (Programmer's
>nightmare!)  The application runs perfectly for as long as 10 to 15
>minutes.  Then, suddenly, we get a CE_OVERRUN error returned by 
>GetCommError().  The documentation explains this as "A character
>is not read from the hardware before the next character arrives.
>The character is lost."  Our communication software asks for a 
>retransmission of the data that was lost, but CE_OVERRUNs keep 
>occurring in the retransmitted data.  

>At first we thought that perhaps we were writing over the interrupt vectors 
>in low memory.  However, a closer look at the data we were receiving
>revealed that we get some correct data after a CE_OVERRUN occurs.
>In some cases, for example, we are just dropping a couple of characters
>in the middle of a message.

[ Other descriptions of supposed causes and attempted solutions removed ... ]

>Has anyone run into this or a similar problem?

I believe that interrupts being disabled too long is your problem.  I don't
think that disk interrupts are the cause unless one or more of the following
are true:

	1)	your system is really slow

		(Nobody would use Windows on an XT, would they? :-)

	2)	you have a disk cache program which disables interrupts
		while transferring whole tracks

	3)	your disk controller is really slow

	4)	your disk sector interleave is wrong.

I have seen problems with the Windows async driver when used with EMS
drivers.  If you are using one, try running Windows with the /n option;
if that doesn't help, try disabling the EMS driver in your config.sys file.

If the problem is with the EMS driver, check with the vendor for a later
version or a fix.

-- 
Robert B. Nelson                               Systems Interface Inc.
Phone: (613) 727-5001                          223 Colonnade Road South
UUCP: uunet!mitel!sce!cognos!sysint!robert     Nepean, Ontario, CANADA, K2E 7K3