[comp.windows.x] X11 lossage on Sun

dsj@GOLDHILL.COM (Dan Jacobs) (10/21/88)

Jim, 

We're having a very hard problem using X11 on Sun3's.  In short, it
seems that if a client accumulates a lot of generated events which it
doesn't read, the server just hangs up on its end of the socket
connection.  I haven't verified that this is precisely what's happening
but it's as close as I can get.

The problem is rather a serious one for us, since we are running from 
within Lisp (and are not using CLX), and do not try to read and process
events while in the debugger or in other circumstances.

I don't know if my description of the problem is adequate, but what I'm
looking for in the way of a solution is either a way to prevent the
connection from being closed, or at the very least a way to detect the
situation and then a way to re-establish the connection.

Please let us know what you think.  We're rather stumped by this one.

Thanks, 

Dan Jacobs

keith@EXPO.LCS.MIT.EDU (Keith Packard) (10/21/88)

> In short, it
> seems that if a client accumulates a lot of generated events which it
> doesn't read, the server just hangs up on its end of the socket
> connection.

We've fixed this in R3 -- the server continues to buffer events
for deaf clients until the client reads them or the server runs
out of memory.  It used to hang on the write -- stopping the server
dead in its tracks.

						Keith Packard
						MIT X Consortium
						(617) 253-1428
						keith@EXPO.LCS.MIT.EDU
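
A minimal sketch of the buffering behaviour Keith describes, assuming a
single-threaded server that uses non-blocking sockets.  This is not the R3
server code; the class name ClientOutput and its methods are hypothetical
and only illustrate the idea: queue events for a client that is not
reading instead of blocking in write().

```python
import collections
import socket


class ClientOutput:
    """Hypothetical per-client output queue for a single-threaded server."""

    def __init__(self, sock):
        self.sock = sock
        self.sock.setblocking(False)      # a full socket must never stall us
        self.queue = collections.deque()  # events the client has not taken yet
        self.queued_bytes = 0

    def send_event(self, event):
        """Queue one event and push out whatever the kernel will accept."""
        self.queue.append(bytes(event))
        self.queued_bytes += len(event)
        self.flush()

    def flush(self):
        """Write queued data without blocking; keep the rest for later."""
        while self.queue:
            head = self.queue[0]
            try:
                sent = self.sock.send(head)
            except BlockingIOError:
                return                    # client is deaf right now
            self.queued_bytes -= sent
            if sent < len(head):
                self.queue[0] = head[sent:]   # partial write: keep the tail
            else:
                self.queue.popleft()
```

The queue here is unbounded, which matches "until the server runs out of
memory"; a real server would probably want a limit.  flush() would be
called again whenever select() reports the client socket as writable.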

jdi@sparky.UUCP (John Irwin) (10/21/88)

Your message:

    
    Jim, 

    We're having a very hard problem using X11 on Sun3's.  In short, it
    seems that if a client accumulates a lot of generated events which it
    doesn't read, the server just hangs up on its end of the socket
    connection.  I haven't verified that this is precisely what's happening
    but it's as close as I can get.

    The problem is rather a serious one for us, since we are running from 
    within Lisp (and are not using CLX), and do not try to read and process
    events while in the debugger or in other circumstances.

    I don't know if my description of the problem is adequate, but what I'm
    looking for in the way of a solution is either a way to prevent the
    connection from being closed, or at the very least a way to detect the
    situation and then a way to re-establish the connection.

    Please let us know what you think.  We're rather stumped by this one.

    Thanks, 

    Dan Jacobs

--------

We have also discovered this problem.  Consider another situation
using Lisp on stock hardware as a client of X11.  Suppose Lisp sends a request
such as QueryFont to the server and then, before reading back all of the reply,
enters a garbage collection.  If that garbage collection doesn't finish
within a few seconds (6, is it?), the server will hang up.  CLX has no built-in
way of dealing with either this or Dan's problem.  This is a serious
difficulty that has no easy or elegant workaround on the Lisp side.

As Dan hints at, the best solution is to change the X11 server so that it
"multiprocesses," and thus can afford to time out much more slowly since
the window system is not frozen during the time it's waiting for a write or
read to/from the client to finish.  Another solution is some sort of
extension to ask the server not to time out (or to have a long time out)
on a certain connection, although this doesn't prevent window system icing.

Another "solution" is to set the closedown mode to RetainPermanent, and try to
recreate the server connection and state whenever it hangs up.  Is this even
possible?  It would require checking the server connection for liveness
on every request and being able to fully recreate the server state and update
the Lisp state at any time.  This seems fairly difficult.
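
The liveness check by itself is the easy part.  A rough sketch, assuming
the client holds the raw connection as an ordinary socket; the function
name connection_is_alive is hypothetical, and the check can only report a
hang-up that the local kernel has already seen.  It does nothing about
recreating server state.

```python
import select
import socket


def connection_is_alive(sock):
    """Return False once the peer has closed its end of the connection.

    Peeks at the socket without consuming data, so pending replies and
    events are left in place for the normal reader.
    """
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return True                  # nothing pending and no EOF seen
    try:
        data = sock.recv(1, socket.MSG_PEEK)
    except (ConnectionResetError, BrokenPipeError):
        return False                 # connection was torn down
    return len(data) > 0             # an empty read means end of file
```

Calling this before every request costs one select() and possibly one
recv() system call per request.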

Possible workarounds on the Lisp side depend greatly on the particular Lisp
implementation.  One we have considered is using SIGIO and a signal handler
on the X socket.  When SIGIO goes off, and when the Lisp is not trying to
read from the socket, have the SIGIO signal handler read the available bytes
into a temporary buffer.  Then when the Lisp gets around to reading the data
or events it first reads any available data out of the temporary buffer.
This solution would seem to require a Berkeley Unix and fairly tight coupling
of Lisp to Unix, as well as an easy way to determine from a C interrupt
routine whether or not Lisp is actively listening to the X socket.
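
For concreteness, here is a sketch of the SIGIO idea in Python rather than
Lisp and C, assuming a BSD-style Unix with SIGIO, F_SETOWN and O_ASYNC.
The class DrainedSocket and its method names are hypothetical.  The
"reading" flag stands in for the check of whether Lisp is actively
listening to the X socket.

```python
import fcntl
import os
import select
import signal
import socket


class DrainedSocket:
    """Hypothetical wrapper that empties the socket whenever SIGIO fires."""

    def __init__(self, sock):
        self.sock = sock
        self.sock.setblocking(False)
        self.pending = bytearray()   # the temporary buffer
        self.reading = False         # True while the main program is reading
        self.closed = False

    def drain(self):
        """Move every byte currently available into the temporary buffer."""
        while not self.closed:
            try:
                chunk = self.sock.recv(65536)
            except BlockingIOError:
                return               # nothing more available right now
            if chunk:
                self.pending += chunk
            else:
                self.closed = True   # the server hung up

    def on_sigio(self, signum, frame):
        # Leave the socket alone if the main program is already reading it.
        if not self.reading:
            self.drain()

    def install(self):
        """Ask the kernel to deliver SIGIO to this process for the socket."""
        signal.signal(signal.SIGIO, self.on_sigio)
        fd = self.sock.fileno()
        fcntl.fcntl(fd, fcntl.F_SETOWN, os.getpid())
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_ASYNC)

    def read_exact(self, n):
        """Return n bytes, taking them from the temporary buffer first."""
        self.reading = True
        try:
            while len(self.pending) < n:
                if self.closed:
                    raise ConnectionError("server closed the connection")
                select.select([self.sock], [], [])   # wait for more data
                self.drain()
            data = bytes(self.pending[:n])
            del self.pending[:n]
            return data
        finally:
            self.reading = False
```

This only helps if the handler can actually run while the client is busy.
A garbage collector that blocks signals for its whole duration would
defeat it.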

Anyone else thought about this?

	-- John Irwin, Franz Inc.  jdi%franz.UUCP@ucbarpa.Berkeley.EDU