[comp.windows.x] xdm problem on Sun 4's -- unknown status 2817

medmunds@Verity.COM (Mike Edmunds x7645) (02/24/90)

I can't seem to get xdm to work on our Sun 4's unless I run it in
debug mode *and* explicitly specify the server on the command line.
If I don't do both of these things, then xdm starts up the server,
*doesn't* display the login widget and exits immediately.

If I look at the debug output from xdm, things start acting strangely
right after
  ManageSession :0
At this point in a successful run, resources are loaded and the login
widget appears.  In the unsuccessful runs, xdm falls right into
  Manager wait returns pid: 11535
  Display exited with unknown status 2817
The unknown status is always 2817.

xinit works fine, as does starting the server by hand.  The problem
does not appear on our Sun 3/60's.  Specifics below.

Any ideas?

- Mike Edmunds
  Verity, Inc.		(medmunds@verity.com)


------- Specifics --------

Environment:
  X11R4 patches 1-2
  Shared libraries
  gcc 1.37
  Sun 4, monochrome (tried two different machines)
  SunOS Release 4.0.3  or  SunOS Release 4.0 (.1?)

My /usr/lib/X11/xdm/Xservers file contains:
  :0 Local local /usr/bin/X11/X :0

I bind control-D to the abort-display action in Xresources. My
xdm-config file is straight out of mit/clients/xdm/config.


This command works:
[1]  xdm -udpPort 0 -debug 10 -server ":0 Local local /usr/bin/X11/X :0"

These don't work:
[2]  xdm -udpPort 0 -debug 10
     xdm -udpPort 0 -nodaemon
     xdm -udpPort 0
     xdm


The diff below compares the debugging output of the first two commands
above.  I used abort-display to terminate the first command; the
second one exited on its own.

% diff -c2 output_of_[1] output_of_[2]
*** debug-1	Fri Feb 23 18:56:56 1990
--- debug-2	Fri Feb 23 17:28:47 1990
***************
*** 34,41 ****
  DisplayManager._0.userAuthDir/DisplayManager.Local.UserAuthDir value /tmp
  StartServer for :0
! Server Started 11694
  '/usr/bin/X11/X' ':0' 
  display manager paused til SIGUSR1
! pid: 11695
  WaitForSomething
  signals blocked, mask was 0x0
--- 34,41 ----
  DisplayManager._0.userAuthDir/DisplayManager.Local.UserAuthDir value /tmp
  StartServer for :0
! Server Started 11534
  '/usr/bin/X11/X' ':0' 
  display manager paused til SIGUSR1
! pid: 11535
  WaitForSomething
  signals blocked, mask was 0x0
***************
*** 43,64 ****
  After XOpenDisplay(:0)
  ManageSession :0
! Loading resource file: /usr/bin/X11/xrdb -display :0 -load /usr/lib/X11/xdm/Xresources
! name now :0
! greet :0
! SecureDisplay :0
! Before XGrabServer :0
! XGrabKeyboard succeeded :0
! pseudoReset screen 0
! before XSync
! pseudoReset done
! done secure :0
! dispatching :0
! GreetDone: , (password is 0 long)
! UNMANAGE_DISPLAY
! Manager wait returns pid: 11695
! Display exited with UNMANAGE_DISPLAY
  WaitForSomething
  signals blocked, mask was 0x0
! Manager wait returns pid: 11694
  Zombie server reaped, removing display :0
  Nothing left to do, exiting
--- 43,51 ----
  After XOpenDisplay(:0)
  ManageSession :0
! Manager wait returns pid: 11535
! Display exited with unknown status 2817
  WaitForSomething
  signals blocked, mask was 0x0
! Manager wait returns pid: 11534
  Zombie server reaped, removing display :0
  Nothing left to do, exiting


After an unsuccessful run, /usr/lib/X11/xdm/xdm-errors will contain
only a single line.
  error (pid 11563): Unknown session exit code 2817 from process 11565

earle@POSEUR.JPL.NASA.GOV (Greg Earle - Sun JPL on-site Software Support) (02/25/90)

In comp.windows.x article <11008@zodiac.ADS.COM> you write:
>Display exited with unknown status 2817
>
>The unknown status is always 2817.
>
>xinit works fine, as does starting the server by hand.  The problem
>does not appear on our Sun 3/60's.  Specifics below.
>
>Any ideas?

Look at /usr/include/sys/wait.h, and RTFM wait(2):

WAIT(2)                   SYSTEM CALLS                    WAIT(2)
...
          +  If the low-order 8 bits of w_status are non-zero and
             are  not equal to 0177, the child process terminated
             due to a signal; the low-order 7  bits  of  w_status
             contain the number of the signal that terminated the
             process.  In addition, if the low-order seventh  bit
             of  w_status  (that  is,  bit 0200) is set, a ``core
             image'' of the process was produced; see sigvec(2).

`2817' is hex 0xB01: the high byte is 0x0B (11) and the low byte is 0x01.
Translation: process died with signal 11 (SIGSEGV).  It did not leave a core
dump (unfortunately).

I suppose one could say that it is a `bug' in xdm that it doesn't try to
interpret the value returned by wait(2) a little better (like putting out the
lower 8 bits as the status value, and checking bit 8 to see if the process
stopped or terminated abnormally, and outputting the higher 8 bits as the
signal number if abnormal termination occurred).  RFE for xdm's debug mode, I
say ...
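
A minimal sketch of the kind of interpretation suggested above, in Python
rather than xdm's own C.  It uses the standard os module's wait-status
helpers instead of picking the bits apart by hand.  The function names
describe_wait_status and run_child_and_describe are hypothetical and are not
part of xdm; the sketch assumes a Unix system where os.fork is available.

```python
import os
import signal


def describe_wait_status(status):
    """Return a readable description of a raw wait(2) status word."""
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        # WCOREDUMP reports whether a core image was produced.
        core = " (core dumped)" if os.WCOREDUMP(status) else ""
        return "killed by signal %d%s" % (os.WTERMSIG(status), core)
    if os.WIFSTOPPED(status):
        return "stopped by signal %d" % os.WSTOPSIG(status)
    return "unrecognized status %d" % status


def run_child_and_describe(sig):
    """Fork a child that sends itself `sig`, then describe how it ended."""
    pid = os.fork()
    if pid == 0:
        # Child: deliver the signal to ourselves.
        os.kill(os.getpid(), sig)
        # Only reached if the signal did not terminate the process.
        os._exit(0)
    # Parent: collect the raw status word for that child.
    _, status = os.waitpid(pid, 0)
    return describe_wait_status(status)


if __name__ == "__main__":
    print(run_child_and_describe(signal.SIGKILL))
```

A debug message built this way would name the signal directly instead of
printing an opaque number.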

-- 
	Greg Earle
	Sun Microsystems, Inc. - JPL on-site Software Support
	earle@poseur.JPL.NASA.GOV	(direct)
	earle@Sun.COM			(indirect)

keith@EXPO.LCS.MIT.EDU (Keith Packard) (02/26/90)

> If I look at the debug output from xdm, things start acting strangely
> right after
>   ManageSession :0
> At this point in a successful run, resources are loaded and the login
> widget appears.  In the unsuccessful runs, xdm falls right into
>   Manager wait returns pid: 11535
>   Display exited with unknown status 2817
> The unknown status is always 2817.

When the child process exits because of a signal, xdm computes
the effective status as

	(256 * signal_number) + 1

This makes the error message meaningful, because 2817 is 0x0b01 or
11 * 256 + 1 -- your child process is receiving signal 11 (SIGSEGV).  This
means that it is dumping core somewhere and you should be able to get 
at least a stack trace from the wayward process which will narrow the search
for a cause quite a bit.  xdm doesn't chdir around, so you should find the
core file in the directory you started xdm from.
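
The arithmetic above can be checked with a short sketch.  This is Python, not
xdm source; the helper names are made up for illustration, and the
signal-name lookup assumes a platform where signal 11 is SIGSEGV (true on
most Unix systems, but not guaranteed).

```python
import signal


def xdm_effective_status(signal_number):
    """Apply the formula quoted above: (256 * signal_number) + 1."""
    return 256 * signal_number + 1


def decode_effective_status(status):
    """Return the signal number if status has the form 256*n + 1, else None."""
    high, low = divmod(status, 256)
    if low == 1 and high > 0:
        return high
    return None


def signal_name(signal_number):
    """Look up the platform's name for a signal, falling back to the number."""
    try:
        return signal.Signals(signal_number).name
    except ValueError:
        return "signal %d" % signal_number


if __name__ == "__main__":
    status = 2817
    sig = decode_effective_status(status)
    # Prints the hex form, the signal number, and the platform's signal name.
    print(hex(status), sig, signal_name(sig))
```

Values that do not end in a low byte of 1 are returned as None, so the
sketch does not guess at statuses that were not produced by this formula.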

As usual, the debug information from xdm is opaque to almost everyone.
Someday I should spend several hours making sense of all of the various debug
statements and make them more understandable.

Keith Packard
MIT X Consortium