[comp.windows.news] Race condition in NeWS?

dennis@dennis.colorado.edu (05/04/89)

I seem to have encountered some obscure race conditions in
one of my programs.

The first problem arose when I would invoke the redisplay option in the frame
menu for a special class of window
(the controlwindow that I posted some time ago).
If I did this, it caused the window to hang and not accept any more
commands (keyboard or mouse).
My rather complicated window class appeared to be in some sort
of infinite loop.

To further complicate things, this only happened on Color Suns;
black and white Suns worked ok.

After some effort (NeWS debugging facilities leave much to be desired),
I traced the problem to a 'pause' command in my
PaintFrameControls procedure.
If I removed the pause, the window worked ok; if I put it back,
it failed.

As near as I can tell, what happens is that the refresh command
invokes paint to repaint the window.
Soon afterwards, the Menu code invokes paint on its client window
to repair the damage caused by the menu overwriting part of the window.
Both of these run as processes because the litewindow class
forks the paint processes.
Thus, in theory, we have two processes simultaneously trying to
repaint the window.

My hypothesis is that if the pause is not included, then the
two paint processes operate in serial order.  When the pause
is included, it allows them to interleave, and this causes
some sort of error whose exact nature I cannot yet determine.
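
The hypothesis above can be illustrated with a toy model. This is only a
sketch in Python, not NeWS code: generators stand in for NeWS lightweight
processes, `yield` stands in for the pause, and the names (paint,
run_round_robin, simulate, the "owner" field) are all hypothetical. The
shared "owner" field is just one assumed way that interleaved painters
could disturb each other; the actual error in the window class is not
known.

```python
# Toy model only: generators stand in for NeWS lightweight processes and
# `yield` stands in for `pause`. None of these names are NeWS APIs.

def paint(name, canvas, log, with_pause):
    """One paint pass: claim shared state, optionally yield, then draw."""
    canvas["owner"] = name             # shared state that both painters touch
    log.append((name, "begin"))
    if with_pause:
        yield                          # like `pause`: lets the other process run
    log.append((name, "draw as " + canvas["owner"]))
    log.append((name, "end"))


def run_round_robin(processes):
    """Run generator-based processes in turn until all have finished."""
    pending = list(processes)
    while pending:
        proc = pending.pop(0)
        try:
            next(proc)
            pending.append(proc)       # it yielded: move to the back of the queue
        except StopIteration:
            pass                       # it finished


def simulate(with_pause):
    """Run a 'refresh' painter and a 'menu' painter against one canvas."""
    canvas, log = {"owner": None}, []
    run_round_robin([
        paint("refresh", canvas, log, with_pause),
        paint("menu", canvas, log, with_pause),
    ])
    return log
```

In this model, simulate(False) runs the two painters strictly one after the
other, while simulate(True) lets the menu painter overwrite the shared field
before the refresh painter has drawn, so the refresh painter draws with state
it did not set up.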

To test this hypothesis, I set ForkPaintClient? to false.
When I did this, the window again worked correctly.

I do not know exactly why the code works on a B&W Sun,
but not on a Color Sun.
Again, I hypothesize that re-painting on a Color Sun takes
longer, which might affect the timing of the race condition.
Alternatively, there is some bug in the way that NeWS handles color
that is causing the problem.

So, be careful about sticking pauses into your code.
You might want to run with ForkPaintClient? false in all your windows
when on a Color Sun (but see below).

The second problem occurred when I would invoke zap from the frame
menu. What appears to be happening is that the destroy code
wipes out major portions of the window, and then the menu code
subsequently tries to repaint the window.
Again, this is a race condition.
My solution was to add a 'destroyed' field to my window,
and add a test in PaintClient and PaintFrame.  If the window
is destroyed, then the paint routines do nothing.
This is admittedly a kludge, but I suspect that
the whole window/menu system needs re-thinking to rid itself of
a whole class of race conditions like these.
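
The shape of that guard, sketched in Python rather than NeWS PostScript.
The Window class and its destroy and paint_client methods are hypothetical
stand-ins, not the litewindow API; only the 'destroyed' field and the
early-return test mirror what I described above.

```python
# Hypothetical sketch; Window, destroy, and paint_client are illustrative
# names, not the litewindow API.

class Window:
    def __init__(self):
        self.destroyed = False
        self.canvas = ["frame", "client"]   # stand-in for window resources
        self.paint_count = 0

    def destroy(self):
        self.destroyed = True               # set the flag before tearing down
        self.canvas = None                  # resources are gone after this

    def paint_client(self):
        if self.destroyed:
            return False                    # late paint request: do nothing
        for _part in self.canvas:
            pass                            # real drawing would happen here
        self.paint_count += 1
        return True
```

A paint request that arrives after destroy returns immediately instead of
touching the torn-down canvas. The flag is set first in destroy so that a
painter scheduled partway through teardown also sees it.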

Note that another way to fix this is to make ForkPaintClient?
true and put a pause in the /destroy code.
This should allow the menu process to complete and the window
to quiet before the /destroy code is executed.
I did not try this solution to see if it worked.
Notice that this solution is in direct opposition to the solution
proposed for problem one above.
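
The intended effect of a pause in the /destroy code can be sketched with
another toy model in Python. Again this is not NeWS code: generators stand
in for processes, `yield` for the pause, and every name here is
hypothetical. It assumes the menu repaint is already queued when destroy
starts and that a single pause is enough for it to finish, neither of
which I have verified.

```python
# Toy model only: generators stand in for NeWS processes, `yield` for `pause`.

def destroy(window, pause_first):
    """Tear the window down, optionally yielding once beforehand."""
    if pause_first:
        yield                         # let already-queued processes run first
    window["alive"] = False           # tear the window down


def menu_repaint(window, log):
    """Record whether the window still existed when the repaint ran."""
    log.append("painted live window" if window["alive"] else "painted dead window")
    return
    yield                             # unreachable; only makes this a generator


def run(pause_first):
    """Schedule destroy first and the menu repaint second, round-robin."""
    window, log = {"alive": True}, []
    pending = [destroy(window, pause_first), menu_repaint(window, log)]
    while pending:
        proc = pending.pop(0)
        try:
            next(proc)
            pending.append(proc)      # it yielded: move to the back of the queue
        except StopIteration:
            pass                      # it finished
    return log
```

In this model, run(False) repaints a window that is already gone, while
run(True) lets the repaint finish before the teardown happens.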

-Dennis Heimbigner