[comp.windows.x] xinit scheduling bug

Dale.Moore@DALE.FAC.CS.CMU.EDU (08/30/90)

			  X Window System Bug Report
			    xbugs@expo.lcs.mit.edu


VERSION:
    R4

CLIENT MACHINE and OPERATING SYSTEM:
    n/a or SUN 3/60 running Mach 2.5 (BSD)

DISPLAY TYPE:
    n/a or CGfour

WINDOW MANAGER:
    n/a or mwm or twm

AREA:
    xinit

SYNOPSIS:

There is a timing bug in xinit.  It is noticeable with Xsun.
The shell script .xserverrc should exec the server rather
than start it as a forked child process.
    

DESCRIPTION:

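At startup time, the Xsun server does a "fcntl(2, F_SETFL, flags | FNDELAY)".
This puts file descriptor 2 (standard error, normally the controlling
terminal or console) into No-Delay mode.  Descriptor 2 is inherited from
the shell that started xinit, so the flag is shared with that shell.
The comments in the source read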

    /*
     *  Writes to /dev/console can block - causing an
     *  excess of error messages to hang the server in
     *  deadlock. 
     */

The server also defines an exit handler that restores (or, more
precisely, clears) the FNDELAY bit upon exit.  The problem is that xinit
does not always wait for the server to finish.  Instead, it waits for
the shell under which the server is running to exit.
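
To illustrate why a flag set by the server can affect the login shell at
all, here is a minimal sketch, not taken from the Xsun source.  It assumes
a POSIX system, uses a pipe as a stand-in for the terminal, and uses
O_NONBLOCK as the modern counterpart of FNDELAY.  The function name
child_flag_leaks_to_parent is hypothetical.  The child sets the flag and
exits without clearing it, and the parent then finds the flag still set
on its own copy of the descriptor.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a non-blocking flag set by a child process is still
 * visible in the parent after the child has exited, 0 if it is not,
 * and -1 on error.  A pipe stands in for the terminal (assumption).
 */
int child_flag_leaks_to_parent(void)
{
    int fds[2];
    int flags, status;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: the same kind of call the server makes, but with no
         * cleanup before exit. */
        flags = fcntl(fds[0], F_GETFL, 0);
        if (flags == -1 ||
            fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait for the child, then inspect our own descriptor. */
    if (waitpid(pid, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    flags = fcntl(fds[0], F_GETFL, 0);
    close(fds[0]);
    close(fds[1]);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}
```

Called from a small main, this should return 1 on a POSIX system,
because the file status flags belong to the open file shared by parent
and child rather than to either process.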

Normal shell command and program execution might be diagrammed
like this.  The shell forks, and the child execs the program.
The shell waits for the child to exit, then continues.

        |
       Shell 
 |      |
        |\   <----- Fork
 T      | \     
 i      |  \
 m    Wait  \
 e    Fork   exec program
      pid      |
 |      |      |
 V      ~      ~
        |      |
        |     exit
        |
       cont
      

The flow for xinit would look something like this.


        |
       Shell 
 |      |
        |\
 T      | \     
 i      |  \
 m    Wait  \
 e    xinit  exec xinit
      pid      |
 |      |      |
 V      ~      \
        |      |\----------------\
        |      |                 |
        |      |                exec sh X
        |      |                 |
        |      |                 \
        |      |                 |\-------\
        |      |                 |        |
        |      |                wait    exec Xsun
        |      |                Xsun      |
        |      |                pid       |
        |      |                 |        |
        |      \                 |        |
        |      |\----\           |        |
        |      |     |           |        |
        |    wait   sh xinitrc   |        |
        |  xinitrc   |           |        |
        |      |     |           |        |
        ~      ~     ~           ~        ~
        |      |     |           |        |
        |      |    exit         |        |
        |      |                 |        |
        |     cont               |        |
        |      |                 |        |
        |     killpg             |        |
        |     server             |        |
        |     SIGTERM            |        |
        |      |                 |        |
        |     wait             receive  receive
        |     sh X             SIGTERM  SIGTERM
        |      |                 |        |
        |      |                exit      |
        |      |                         cleanup
        |     cont                       FNDELAY
        |      |                          |
        |     exit                       exit
      cont
        |
        |

The problem is scheduling.  We don't know whether xinit or Xsun will
finish first.  If xinit finishes before Xsun, then the shell might
start reading from the terminal or console before Xsun has had a chance
to restore (or clear) the FNDELAY bit.  The shell's read would
return 0 bytes immediately.  The shell would interpret that as
EOF and log the user out.

If xinit was exec'd (rather than forked) from .login, then getty might
try to do a read before Xsun has a chance to restore the FNDELAY bit.

In any case, xinit should really wait on the server, and not the
shell that the server is executing in.  The default behaviour can
be really confusing.


REPEAT BY:
    running default environment on Sun.

SAMPLE FIX:
    In .xserverrc put the line
	exec X
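
The effect of the "exec" in the fix can be checked with a small sketch.
This is not xinit's code.  It assumes a POSIX system with /bin/sh, and
the function name forked_pid_is_final_program is hypothetical.  The
parent plays the part of xinit: it forks a child that runs a script under
sh, and compares the pid it would wait on with the pid that the final
program prints.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Runs `script` under /bin/sh and reads back one process id that the
 * script prints.  Returns 1 if that id equals the pid of the child we
 * forked (so waiting on the child means waiting on the final program),
 * 0 if it differs, and -1 on error.
 */
int forked_pid_is_final_program(const char *script)
{
    int out[2];
    char buf[64];
    ssize_t n;
    int status;
    pid_t child;

    if (pipe(out) == -1)
        return -1;

    child = fork();
    if (child == -1) {
        close(out[0]);
        close(out[1]);
        return -1;
    }

    if (child == 0) {
        /* Child: send stdout to the pipe, then become the shell. */
        close(out[0]);
        if (out[1] != STDOUT_FILENO) {
            if (dup2(out[1], STDOUT_FILENO) == -1)
                _exit(127);
            close(out[1]);
        }
        execl("/bin/sh", "sh", "-c", script, (char *)NULL);
        _exit(127);
    }

    /* Parent: read the pid printed by the script, then reap the child. */
    close(out[1]);
    n = read(out[0], buf, sizeof buf - 1);
    close(out[0]);
    if (waitpid(child, &status, 0) == -1 || n <= 0)
        return -1;
    buf[n] = '\0';

    return strtol(buf, NULL, 10) == (long)child ? 1 : 0;
}
```

With the script "exec sh -c 'echo $$'" the function should return 1,
since the exec'd program takes over the pid that the parent forked.  With
"sh -c 'echo $$'; true" it should return 0, since the outer shell has to
start the inner program as a separate child, which is the situation the
report describes for a .xserverrc that does not exec the server.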