peterson@SW.MCC.COM (James Peterson) (11/09/88)
VERSION: R3 CLIENT MACHINE and OPERATING SYSTEM: Sun 3/260 running SunOS 3.5 DISPLAY: Sun CG2 AREA: server SYNOPSIS: Server code core dumps when more than 5 screens are used. DESCRIPTION: The server code is ill-designed to handle multiple screens. These problems exist at many levels, both in the Sun-specific and dix code. The first problem is with include/misc.h. This file defines a symbolic constant MAXSCREENS with a value 3; this constant controls the maximum number of screens that the server can handle. It is used to control the size of arrays in dixstruct.h, dixfontstr.h, cursorstr.h, window.c, globals.c, and extension.c plus being used in the ddx code for cfb, mfb, mi, ndx, plx, and apollo, and should be used instead of NUMSCREENS for the sunFbs array in sun/sunInit.c. No code checks to see if the number of screens exceeds MAXSCREENS -- AddScreen in server/dix/main.c would the be appropriate place for this check. Looking at AddScreen, however, we find that the actual array of screen descriptions is more flexible. The screenInfo record has two variables: screenInfo.numScreens and screenInfo.arraySize. The arraySize variable indicates the size of the actual array for holding screen descriptions; numScreens says how many we actually have. If we invoke AddScreen and find that we have no more free space in the screen descrption array, the following code is executed: if (screenInfo.numScreens == screenInfo.arraySize) { screenInfo.arraySize += 5; screenInfo.screen = (ScreenPtr)xrealloc( screenInfo.screen, screenInfo.arraySize * sizeof(ScreenRec)); } This code allocates room for 5 more screens, and then re-allocates memory for the screen array and copies the previous screen array into the new memory -- fairly standard for Unix C code. This is, however, anomolous since the number of screens cannot exceed MAXSCREENS anyway, so simply preallocating screenInfo.screen and screenInfo.arraySize to MAXSCREENS eliminates the need for this code. screenInfo.arraySize is currently initialized to 0, so in the current code this dynamic array allocation is normally executed once at the very beginning. In a system with MAXSCREENS reset to a higher value (like 10), and more than 5 screens, it will be executed again later. The expectation would then be that space would be reallocated, values copied and the server would be able to support the additional screens. However, AddScreen later executes the following to complete the initialization of the new screen: if ((*pfnInit)(i, &screenInfo.screen[i], argc, argv)) { screenInfo.numScreens++; CreateGCperDepthArray(i); CreateDefaultStipple(i); } For a Sun, the pfnInit will be sunCG2CInit or sunCG2MInit or ... but all these are similar. In particular these routines call SunScreenInit which calls CreateScratchGC which stores the address of the screen descriptor (the &screenInfo.screen[i] parameter above called pScreen in the routines) into a GC, or by calling mfbScreenInit or cfbScreenInit, into the pScreen field of a drawable for the basic pPixmap. You can trace the code, but the basic idea is that the &screenInfo.screen[i] parameter under the name pScreen gets stored into various structures. This is not limited to Sun servers -- AddScreen itself calls CreateGCperDepthArray which calls CreateScratchGC which stores away a set of &screenInfo.screen[i] pointers. When the number of screens exceeds 5, however, the dynamic reallocation code in AddScreen is executed. This moves the screenInfo.screen array, and ALL THOSE STORED POINTERS ARE WRONG. The basic rule of reallocating space is that you can only do it if all pointers to that space can be found and changed. REPEAT BY: First run on any system with more than 3 screens to trash the server because MAXSCREENS is set to 3 and arrays allocated against it are not checked for exceeding their size. Second, change MAXSCREENS to a larger number (like 10) and run with more than 5 screens to see the dangling pointer problem. SAMPLE FIX: A simple hack: Set MAXSCREENS to 10 and change the dynamic allocation code to use MAXSCREENS instead of 5 and to recognize a fatal error if the number of screens exceed MAXSCREENS. A fix: redesign the entire screen handling code to either be fixed size static (ala MAXSCREENS) or truly dynamic. Also, all other reallocations may need to be examined to see that they do not leave dangling pointers.