[comp.windows.x] Question about XCopyArea

anneb@ai.etl.army.mil (Anne Brink) (02/15/90)

Hi there, fellow X-types.  I need your help.

I'm working on a graphics application in X, and I need a fast copy routine for
pixmaps. I'm calling it several zillion times in the space of oh, a few
minutes, and it takes over the entire machine.  Right now, I have the code
looping forever,  with 5-7 XCopyArea() calls followed by a mouse click check 
in each iteration. The area copied is either 512x512 or 256x256. I cannot use
something like bcopy(), since I need to perform logical operations with the
copy. The response is ugly: After I click on the mouse, it takes over 20
iterations for the program to acknowledge the click! (I lost count somewhere
around 20, but it kept going for a while.)  It's also amusing to move the mouse
into another window and see how long it takes for the window manager to
register the move.  I've tried to slow things down with some sleep calls to let
things catch up, but it hasn't made a noticeable difference.  Maybe I'm not
sleeping long enough.  I had tried up to a half second under R3. 
We are now using X11R4 pretty much straight from MIT, on a Sun3/260 with a
color monitor. (We don't have OpenWindows yet.) 

Has anyone run up against the same problem?  Could you solve it? How?!? Is
there an alternative solution? I've read the Xlib manuals, and nothing springs
out of the pages at me.
Does OpenWindows have good bit-blitting support? When I programmed in SunView
way back when, the pixrect copy routines in SunView didn't seem to cause me as
much trouble.  

I'd prefer it if you e-mailed me, since right now, our net connection is a bit
flakey.  Even hints that I'm in a dream world will be helpful, 'cause then I
can try to find some other work around. I'll summarize anything I get if
there's enough interest.    

Thanks very much for reading this and for your help!

				Anne Brink
				anneb@etl.army.mil (Internet)
				...!uunet!etl.army.mil!anneb (UUCP, sort of)
-- 
#################################||############################################
  Anne Brink: anneb@etl.army.mil || Open the Spring Training Camps!
    ...!uunet!etl.army.mil!anneb || 	  Orioles in '90
#################################||############################################

keith@EXPO.LCS.MIT.EDU (Keith Packard) (02/16/90)

> I'm working on a graphics application in X, and I need a fast copy routine for
> pixmaps.

> The area copied is either 512x512 or 256x256. I cannot use something like
> bcopy(), since I need to perform logical operations with the copy.

> We are now using X11R4 pretty much straight from MIT, on a Sun3/260
> with a color monitor.

The MIT R4 server is heavily tuned for copy-mode bitblt.  All other rasterops
go through a very slow convoluted path which runs about 10 times slower.
It's not that the other rasterops are naturally that slow, but the code
checks the rasterop for every longword copied instead of expanding the
code 16 times, once per rasterop.  The old R3 server actually did expand
the code 16 times, but that broke many compilers as the resulting function
was rather large.

One potential solution would be to optimize the particular rasterop your code
needs by duplicating the copy-mode code and modifying it to perform the
required rasterop.
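
As a rough illustration of the difference described above (a sketch only,
not the actual MIT server code), the general path re-checks the rasterop
for every longword, while a specialized routine is the copy loop duplicated
and changed to perform one rasterop.  The enum and the function names
blit_general and blit_xor are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of raster operations, for illustration only. */
enum rop { ROP_COPY, ROP_AND, ROP_XOR, ROP_OR };

/* General path: the rasterop is examined again for every longword. */
void blit_general(enum rop op, const uint32_t *src, uint32_t *dst, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        switch (op) {
        case ROP_COPY: dst[i] = src[i];  break;
        case ROP_AND:  dst[i] &= src[i]; break;
        case ROP_XOR:  dst[i] ^= src[i]; break;
        case ROP_OR:   dst[i] |= src[i]; break;
        }
    }
}

/* Specialized path: the copy loop duplicated and changed to do XOR only,
   so nothing is tested inside the loop. */
void blit_xor(const uint32_t *src, uint32_t *dst, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] ^= src[i];
}
```

Both routines produce the same result for ROP_XOR; the second simply has
less work per longword.  How much that matters on a given machine would
need measuring.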

Keith Packard
MIT X Consortium

bjaspan@athena.mit.edu (Barr3y Jaspan) (02/16/90)

In article <401@ai.etl.army.mil>, anneb@ai.etl.army.mil (Anne Brink) writes:
> Hi there, fellow X-types.  I need your help.
> 
> I'm working on a graphics application in X, and I need a fast copy routine for
> pixmaps. I'm calling it several zillion times in the space of oh, a few
> minutes, and it takes over the entire machine.  Right now, I have the code
> looping forever,  with 5-7 XCopyArea() calls followed by a mouse click check 
> in each iteration....

It sounds like the problem you are having is that the XCopyArea requests
get to the server faster than it can possibly deal with them, so they get
queued.  When you click a mouse button, it also goes on the queue and isn't
handled until all the XCopyAreas are.

I am finishing up an Xlib video game that does some very similar hosing of the
server.. basically, each "tick" every object in the game is told to move, and
every object that actually moves does two XCopyAreas (one to erase itself, 
one to draw itself in the new position.)  This results in zillions of calls
to XCopyArea.

There are three things that I am doing to prevent the problems you are having
(most of these are based on ideas I stole from someone else here.. :-)

1)  A "tick" is a certain, defined length of time (currently 14000 microseconds
in my game).  At the beginning of each tick (at the top of each move loop) I
call MarkTime which uses gettimeofday() and stores the tv_usec field.  At
the end of each loop, I call WaitRemainerOfTick which calls gettimeofday()
again and sleeps for 14000 - (now - start_time) microseconds (where start_time
is the tv_usec field saved earlier).  This makes the program run smoothly
without the speed fluctuations that would otherwise be caused by forcing the
X server to run at full tilt.
(I just ignore the overhead of all these calls to gettimeofday().  An X
program such as this is so I/O bound that computrons come for free.)

2) At the end of *EVERY* move loop (at the end of each tick) I call XFlush..
this makes sure all of the requests I've generated actually get to the server
(since requests aren't guaranteed to get there until XFlush or XNextEvent or a
similar function is called).

3) Every 20 ticks I call XSync(dpy, True).  The "true" means my program
actually waits until the X server has processed *all* the events I've sent
it.

Basically, these three things make sure the program stays more or less
synchronized with the X server, which prevents the problem of mouse events
taking 20 seconds to get processed.  
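
The tick pacing in (1) might look roughly like the sketch below.  This is
not the game's actual code: mark_time, wait_remainder_of_tick, and
remaining_usec are hypothetical names standing in for the helpers described
above, and the sketch assumes tv_sec is compared along with tv_usec so that
a tick straddling a second boundary is still timed correctly.  It sleeps
with select(), which is one common way to get a sub-second delay.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

#define TICK_USEC 14000L   /* tick length quoted in the post */

/* Microseconds left in the tick, given its start and the current time.
   Returns 0 if the tick has already overrun. */
long remaining_usec(const struct timeval *start, const struct timeval *now,
                    long tick_usec)
{
    long elapsed = (long)(now->tv_sec - start->tv_sec) * 1000000L
                 + (long)(now->tv_usec - start->tv_usec);

    if (elapsed < 0)
        elapsed = 0;       /* the clock was stepped backwards */
    return elapsed >= tick_usec ? 0 : tick_usec - elapsed;
}

/* Record the start of a tick. */
void mark_time(struct timeval *start)
{
    gettimeofday(start, NULL);
}

/* Sleep for whatever is left of the tick that began at *start. */
void wait_remainder_of_tick(const struct timeval *start)
{
    struct timeval now, delay;
    long left;

    gettimeofday(&now, NULL);
    left = remaining_usec(start, &now, TICK_USEC);
    if (left > 0) {
        delay.tv_sec = 0;
        delay.tv_usec = left;
        select(0, NULL, NULL, NULL, &delay);   /* sub-second sleep */
    }
}

/* Intended shape of the move loop (Xlib calls shown as comments only):
 *
 *     mark_time(&start);
 *     ... XCopyArea() calls for every object that moved ...
 *     XFlush(dpy);                      every tick
 *     if (++count % 20 == 0)
 *         XSync(dpy, False);            waits without discarding events
 *     wait_remainder_of_tick(&start);
 */
```

With a tick that starts at 10.995000 s and a current time of 11.002000 s,
7000 microseconds have elapsed, so remaining_usec() returns 7000 for a
14000 microsecond tick.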

Hope this helps..

Barry Jaspan, MIT-Project Athena
bjaspan@athena.mit.edu

mouse@LARRY.MCRCIM.MCGILL.EDU (der Mouse) (02/17/90)

> 3) Every 20 ticks I call XSync(dpy, True).  The "true" means my
>    program actually waits until the X server has processed *all* the
>    events I've sent it.

No, the True means you're throwing events away.  XSync(dpy,False) is
enough to wait for the server to catch up.

From the Xlib doc:

	To flush the output buffer and then wait until all requests
	have been processed, use XSync.
	
	XSync(display, discard)
	      Display *display;
	      Bool discard;
	
	display   Specifies the connection to the X server.
	
	discard   Specifies a Boolean value that indicates whether
	          XSync discards all events on the event queue.
	
	The XSync function flushes the output buffer and then waits
	until all requests have been received and processed by the X
	server.  [...stuff about error events...]
	          Any events generated by the server are enqueued
	into the library's event queue.
	
	Finally, if you passed False, XSync does not discard the
	events in the queue.  If you passed True, XSync discards all
	events in the queue, including those events that were on the
	queue before XSync was called.

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu

mouse@LARRY.MCRCIM.MCGILL.EDU (der Mouse) (02/20/90)

> The MIT R4 server is heavily tuned for copy-mode bitblt.  All other
> rasterops go through a very slow convoluted path which runs about 10
> times slower.  It's not that the other rasterops are naturally that
> slow, but the code checks the rasterop for every longword copied
> instead of expanding the code 16 times, once per rasterop.  The old
> R3 server actually did expand the code 16 times, but that broke many
> compilers as the resulting function was rather large.

Why not control it with a configuration parameter?  I see no reason why
sites with compilers capable of handling the expanded version shouldn't
be able to get it....

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu

(Or split it off into 16 different functions, and eat the call overhead
if the area being blitted is big enough....)
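
A minimal sketch of the "16 different functions" idea, assuming a simple
word-at-a-time blit with no alignment or masking (a real server's blit code
is considerably more involved).  The macro DEFINE_ROP and the names rop_table
and blit are hypothetical; the table order follows the X11 GC function
codes, GXclear (0) through GXset (15).

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*blit_fn)(const uint32_t *src, uint32_t *dst, size_t n);

/* Generate one tight loop per rasterop: s is the source word, d the
   destination word, EXPR the value stored back into the destination. */
#define DEFINE_ROP(NAME, EXPR)                                   \
    void NAME(const uint32_t *src, uint32_t *dst, size_t n)      \
    {                                                            \
        size_t i;                                                \
        for (i = 0; i < n; i++) {                                \
            uint32_t s = src[i], d = dst[i];                     \
            (void)s; (void)d; /* some ops ignore one or both */  \
            dst[i] = (uint32_t)(EXPR);                           \
        }                                                        \
    }

DEFINE_ROP(rop_clear,         0u)
DEFINE_ROP(rop_and,           s & d)
DEFINE_ROP(rop_and_reverse,   s & ~d)
DEFINE_ROP(rop_copy,          s)
DEFINE_ROP(rop_and_inverted,  ~s & d)
DEFINE_ROP(rop_noop,          d)
DEFINE_ROP(rop_xor,           s ^ d)
DEFINE_ROP(rop_or,            s | d)
DEFINE_ROP(rop_nor,           ~(s | d))
DEFINE_ROP(rop_equiv,         ~s ^ d)
DEFINE_ROP(rop_invert,        ~d)
DEFINE_ROP(rop_or_reverse,    s | ~d)
DEFINE_ROP(rop_copy_inverted, ~s)
DEFINE_ROP(rop_or_inverted,   ~s | d)
DEFINE_ROP(rop_nand,          ~(s & d))
DEFINE_ROP(rop_set,           ~0u)

/* Indexed by rasterop code, 0..15. */
static const blit_fn rop_table[16] = {
    rop_clear, rop_and, rop_and_reverse, rop_copy,
    rop_and_inverted, rop_noop, rop_xor, rop_or,
    rop_nor, rop_equiv, rop_invert, rop_or_reverse,
    rop_copy_inverted, rop_or_inverted, rop_nand, rop_set
};

/* One indirect call per blit; the rasterop is never tested per word. */
void blit(unsigned rop, const uint32_t *src, uint32_t *dst, size_t n)
{
    rop_table[rop & 15u](src, dst, n);
}
```

Each function is small, so no single function grows to the size that
reportedly broke compilers, and the call overhead is paid once per blit
rather than once per longword.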

hvr@kimba.Sun.COM (Heather Rose) (02/27/90)

In article <1990Feb16.010419.2462@athena.mit.edu> bjaspan@athena.mit.edu (Barr3y Jaspan) writes:
>In article <401@ai.etl.army.mil>, anneb@ai.etl.army.mil (Anne Brink) writes:
>> I'm working on a graphics application in X, and I need a fast copy routine for
>> pixmaps. I'm calling it several zillion times in the space of oh, a few
>> minutes, and it takes over the entire machine.  Right now, I have the code
>> looping forever,  with 5-7 XCopyArea() calls followed by a mouse click check 
>> in each iteration....
>
>It sounds like the problem you are having is that the XCopyArea requests
>get to the server faster than it can possibly deal with them, so they get
>queued.  When you click a mouse button, it also goes on the queue and isn't
>handled until all the XCopyAreas are.
>
>I am finishing up an Xlib video game that does some very similar hosing of the
>server.. basically, each "tick" every object in the game is told to move, and
>every object that actually moves does two XCopyAreas (one to erase itself, 
>one to draw itself in the new position.)  This results in zillions of calls
>to XCopyArea.

This sounds like good advice to me.  Since the original poster mentioned she
was using SunView, I'll mention that if you are using XView, you can easily
implement the "tick" with the notify_set_itimer_func().  The flushing and
syncing is done by the toolkit, but you can add more if you like.

There is an example of using a notify timer for animation in the online
O'Reilly examples.  I have included the relevant code snippets below...

FYI:  clock interrupt on sun3's and sun4's is about 10ms.  So resolution
as seen from a UNIX process is about 20ms.  About 2 to 2.5 images each second 
might be reasonable X11 animation--depends on how long each image takes to 
display.  I would suggest experimenting.  

Although a realtime video system (using special display hardware) would be 
more along the line of about 30 frames each second.  Of course, you want the 
window system to politely move out of its way ;-)

Regards,

Heather

------------------

/* from "animate.c" */

struct itimerval timer;
...

turn it on or adjust time:

	timer.it_value.tv_usec = some value;
	timer.it_interval.tv_usec = some other value;

	notify_set_itimer_func(frame, animate,
		ITIMER_REAL, &timer, NULL);

turn it off:

	notify_set_itimer_func(frame, NOTIFY_FUNC_NULL, 
		ITIMER_REAL, NULL, NULL);

then the proc that is called each "tick":

Notify_value
animate(/* optional parameters */)
{
	draw your stuff or do whatever

	return NOTIFY_DONE;
}