[comp.windows.x] bug in XCopyArea

dave@romano.cs.wisc.edu (Dave Cohrs) (04/22/89)

Environment:
	Sun4/110
	SunOS 4.0.1
	bw2 monitor (1 bit display)
	X11R3 w/-DPURDUE speedups (compiled with cc)

Description:
	XCopyArea() in the above environment fails when copying a
	pixmap on top of itself, when the destination is a positive
	multiple of 32 bits *to the right* of the source.

Repeat-By:
	compile the following program:
		cc -o test test.c -lX11
	and run it.  With no args, it copies the pixmap 64 bits to
	the right on each mouse click.  On a broken implementation,
	this causes every other 32 bit group to be copied backward
	over the bits already copied.  With a single numeric ARG,
	it copies the pixmap ARG bits to the right on each click.
	It works fine for all values when ARG%32 != 0, or whenever ARG
	is negative.

Any hints???  Is this a bug in the -DPURDUE code, or something else?

------------------------------------Cut Here--------------------------------
#include <X11/Xos.h>
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <X11/cursorfont.h>
#include <stdio.h>

#define spool_width 48
#define spool_height 48
static char spool_bits[] = {
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0xf8, 0x3f, 0x00, 0x00, 0x00, 0x80, 0x07, 0xc0, 0x03, 0x00,
   0x00, 0x70, 0x80, 0x07, 0x1c, 0x00, 0x00, 0x0c, 0xf0, 0x3f, 0x60, 0x00,
   0x00, 0x02, 0xf8, 0x7f, 0x80, 0x00, 0x00, 0x02, 0xf0, 0x3f, 0x80, 0x00,
   0x00, 0x0e, 0x80, 0x07, 0xe0, 0x00, 0x00, 0x7c, 0x00, 0x00, 0x7c, 0x00,
   0x00, 0xf0, 0x07, 0xc0, 0x1f, 0x00, 0x00, 0x80, 0xff, 0xff, 0x03, 0x00,
   0x00, 0x70, 0xfc, 0x7f, 0x00, 0x00, 0x00, 0xd0, 0xff, 0x40, 0x00, 0x00,
   0x00, 0x1c, 0x04, 0xff, 0x00, 0x00, 0x00, 0x0f, 0xfe, 0x40, 0x00, 0x00,
   0x00, 0x00, 0x04, 0xff, 0x00, 0x00, 0x00, 0x00, 0xfe, 0x40, 0x00, 0x00,
   0x00, 0x00, 0x04, 0xff, 0x00, 0x00, 0x00, 0x00, 0xfe, 0x40, 0x00, 0x00,
   0x00, 0x00, 0x04, 0xff, 0x00, 0x00, 0x00, 0x00, 0xfe, 0x40, 0x00, 0x00,
   0x00, 0x00, 0x04, 0xff, 0x00, 0x00, 0x00, 0x00, 0xfe, 0x40, 0x00, 0x00,
   0x00, 0x00, 0x04, 0xff, 0x00, 0x00, 0x00, 0x00, 0xfe, 0x40, 0x00, 0x00,
   0x00, 0x00, 0x04, 0xff, 0x70, 0x00, 0x00, 0x00, 0xfe, 0x40, 0x08, 0x00,
   0x00, 0x80, 0x07, 0xff, 0x0b, 0x00, 0x00, 0x70, 0xfe, 0x40, 0x1c, 0x00,
   0x00, 0x0c, 0x04, 0xff, 0x6e, 0x00, 0x00, 0x02, 0xfe, 0x40, 0x82, 0x00,
   0x00, 0x02, 0x0c, 0xff, 0x83, 0x00, 0x00, 0x0e, 0x30, 0x18, 0xe0, 0x00,
   0x00, 0x7c, 0xc0, 0x07, 0x7c, 0x00, 0x00, 0xf0, 0x07, 0xc0, 0x1f, 0x00,
   0x00, 0x80, 0xff, 0xff, 0x03, 0x00, 0x00, 0x00, 0xf8, 0x3f, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};

GC gc;
Window show_window;
Pixmap pixmap;
Pixmap spool_pixmap;
Pixmap bg_pixmap;
Display *display;
Window root;
int screen;
unsigned long fg, bg;
int border_wid = 2;
XGCValues gcv;
XSetWindowAttributes attrs;
XSizeHints hints;
XEvent event;

main(argc, argv)
    int argc;
    char *argv[];
{
    int bg_width = 64;
    int i, j;
    unsigned short buf[256] ;  /* For background pattern. */
    int movewid = 64;

    if(argc > 1 && (i = atoi(argv[1])) != 0)
	movewid = i;

    /* initialization */
    if (!(display = XOpenDisplay(NULL))) {
	fprintf(stderr, "unable to open display\n");
	exit (1);
    }
    screen = DefaultScreen(display);
    root = RootWindow(display, screen);

    bg = WhitePixel (display, screen);
    fg = BlackPixel (display, screen);

    /* allocate pixmaps:
       pixmap is the "backing store"
       spool_pixmap holds the picture
       bg_pixmap holds a gray pattern
     */
    pixmap = XCreatePixmap(display, root,
		bg_width*4, spool_height, DefaultDepth(display, screen));

    spool_pixmap = XCreatePixmapFromBitmapData(display, root,
			spool_bits, spool_width, spool_height,
			fg, bg, DefaultDepth(display, screen));

    for (i = 0; i < 16; i++) {    /* Load default gray background. */
	for (j = 0; j < 4; j++) buf[i*16+j] = 0x7777 ;
	for (j = 0; j < 4; j++) buf[i*16+4+j] = 0xdddd ;
	for (j = 0; j < 4; j++) buf[i*16+8+j] = 0xbbbb ;
	for (j = 0; j < 4; j++) buf[i*16+12+j] = 0xeeee ;
    }
    bg_pixmap = XCreatePixmapFromBitmapData(display, root,
			buf, 64, 64,
			fg, bg, DefaultDepth(display, screen));

    /* finish initialization */
    gcv.function = GXcopy;
    gcv.background = bg;
    gcv.foreground = fg;
    gcv.graphics_exposures = False;
    gc = XCreateGC(display, root,
	GCFunction | GCForeground | GCBackground | GCGraphicsExposures, &gcv);

    hints.x = 0;
    hints.y = 0;
    hints.width = bg_width*4;
    hints.height = spool_height;
    hints.min_width  = bg_width*4;
    hints.min_height = spool_height;
    hints.flags      = PMinSize|PSize;

    attrs.background_pixel = bg;
    attrs.border_pixel = fg;
    attrs.cursor = XCreateFontCursor(display, XC_top_left_arrow);

    /* create a window big enough to hold 4 copies of spool_bits (plus slop) */
    show_window = XCreateWindow(
	display, root,
	hints.x, hints.y, hints.width, hints.height,
	border_wid, CopyFromParent, CopyFromParent, CopyFromParent,
	CWBorderPixel | CWBackPixel | CWCursor,
	&attrs);

    XSetStandardProperties (display, show_window,
			    "test", "test",
			    None, argv, argc, &hints);

    XSelectInput(display, show_window, ExposureMask|ButtonPressMask);

    XMapWindow(display, show_window);

    /* initially fill the window */
    for(i=0; i<4; i++)
	XCopyArea(display, bg_pixmap, pixmap, gc,
		  0, 0, bg_width, spool_height, bg_width*i, 0);

    /* process events:
       expose: copy backing store to screen
       buttonPress: move pixmap 64bits right, copy a new spool into leftmost
		    position, then expose the pixmap
     */
    for(;;) {
	XNextEvent(display, &event);
	if (event.type == Expose) {
	refresh:
	    XCopyArea(display, pixmap, show_window, gc,
		       0, 0, bg_width*4, spool_height, 0, 0);
	    XFlush(display);
	} else if (event.type == ButtonPress) {
	    if(movewid > 0) {
		XCopyArea(display, pixmap, pixmap, gc, 0, 0,
			  bg_width*4-movewid, spool_height, movewid, 0) ;
		XCopyArea(display, spool_pixmap, pixmap, gc, 0, 0,
			  movewid, spool_height, 0, 0) ;
	    } else {
		XCopyArea(display, pixmap, pixmap, gc, -movewid, 0,
			  bg_width*4+movewid, spool_height, 0, 0) ;
		XCopyArea(display, spool_pixmap, pixmap, gc, 0, 0,
			  -movewid, spool_height, bg_width*4+movewid, 0) ;
	    }
	    goto refresh;
	}
    }
}
--
Dave Cohrs
+1 608 262-6617                        UW-Madison Computer Sciences Department
dave@cs.wisc.edu                       ...!{harvard,rutgers,ucbvax}!uwvax!dave

rws@EXPO.LCS.MIT.EDU (04/24/89)

	XCopyArea() in the above environment fails when copying a
	pixmap on top of itself, when the destination is a positive
	multiple of 32 bits *to the right* of the source.

The code you supplied works fine on my R3+ server with Purdue speedups compiled
with gcc 1.34 on a Sun 3/60 running SunOS 3.4.  It also works on two different
product servers I tried.

richard@torch.UUCP (Richard Nuttall) (05/10/89)

dave@romano.cs.wisc.edu (Dave Cohrs) writes:

>Environment:
>	Sun4/110
>	bw2 monitor (1 bit display)

>Description:
>	XCopyArea() in the above environment fails when copying a
>	pixmap on top of itself, when the destination is a positive
>	multiple of 32 bits *to the right* of the source.

This is a terrific bug, I can see loads of mail being generated about the 
pros and cons of the answer to this.

The problem is caused by the sparc processor.

XCopyArea eventually uses getbits and putbits. They both use SCRLEFT and 
SCRRIGHT.

These both use the >> and << operators respectively.

BUT!!!!!...... wait for it.....

On the sparc, x >> y == x >> (y % 32)  !!!! (think about it)
Thus 0xFFFFFFFF >> 32 == 0xFFFFFFFF.

This means that when the masks are generated and you need to shift by
 >= 32 bits, you don't get what you expected.

Other machines do the expected (to the people who wrote mfbbitblt.c) 
and put 0x0 in the result.

I don't particularly want to create a religious war as to whether a strictly 
out of range value such as 32 should have the results defined or not,
 (I think it should be nice and zero the word).

Here is the fix I put in. It appears only to be needed in mfbbitblt.c, so the
MFBBITBLT define is only defined in mfbbitblt.c

Unfortunately  we don't have the original source before we put all our 
speedups in, so I can't give a proper patch file.
Note that it is written so that only the sparc is affected.

File : maskbits.h

Add :
#ifdef MFBBITBLT
#ifdef sparc
#define ASR(val,shift) (((shift) > 31)? 0 : (val) >> (shift))
#else
#define ASR(val,shift) ((val) >> (shift))
#endif

#ifdef sparc
#define ASL(val,shift) (((shift) > 31)? 0 : (val) << (shift))
#else
#define ASL(val,shift) ((val) << (shift))
#endif

#else
#define ASR(val,shift) ((val) >> (shift))
#define ASL(val,shift) ((val) << (shift))
#endif

Change :
#if (BITMAP_BIT_ORDER == MSBFirst)      /* pc/rt, 680x0, sparc */
#define SCRLEFT(lw, n)  ASL((unsigned int)(lw),(n))
#define SCRRIGHT(lw, n) ASR((unsigned int)(lw),(n))
#else                                   /* vax, intel */
#define SCRLEFT(lw, n)  ((lw) >> (n))
#define SCRRIGHT(lw, n) ((lw) << (n))
#endif

Below is a little test program I wrote to demonstrate the problem. I would be
interested to know if the same problem occurs with any other processor chips.

/*
 * Test program to discover how << >> operations are handled for
 * a shift of >= 32
 *
 * written by Richard  Nuttall 10-May-1989
 *
 * Expected results:
 * On most machines: for shift of 0-31, correct shift.
 *                   for shift >= 32  , 0.
 * On sparc (SUN4) : for shift of 0-31, correct shift.
 *                   for shift >= 32  , result of shift % 32.
 */

#ifdef sparc
#define ASR(val,shift) (((shift) > 31)? 0 : val >> (shift))
#else
#define ASR(val,shift) ((val) >> (shift))
#endif

#ifdef sparc
#define ASL(val,shift) (((shift) > 31)? 0 : val << (shift))
#else
#define ASL(val,shift) ((val) << (shift))
#endif



main()
{
   unsigned int umask = 0xFFFFFFFF;
   int i;

   printf("i  ->    <<       >>       ASL      ASR\n");

   for(i = 0;i < 37;i++) /* use any number >= 32 */
   {
      unsigned int uleft  = umask << i;
      unsigned int uright = umask >> i;
      unsigned int sleft  = ASL(umask,i);
      unsigned int sright = ASR(umask,i);

      printf("%2d -> %8x %8x %8x %8x\n",i,uleft,uright,sleft,sright);
   }
}


Final note:
I guess this should really be logged as a bug in the mfbbitblt code.
 
-- 
Richard Nuttall               |    ukc!stc!datlog!torch!richard
Torch Technology Ltd.         |    0223 841000 X 309
Cambridge England             |    

DAN@YKTVMV.BITNET (05/17/89)

Reply to        Richard Nuttall

Richard suggested putting in ifdef code for sparc's
to fix a design bug in the sparc architecture.
In fact he should just complain to his compiler supplier
for not supporting ANSI C. ANSI C is quite clear
that the result of a shift of 32 is 0.

I would not want to see people defensively coding
all their shifts in this fashion because one
machine got it wrong. It will penalize all the
machines that got it right.

Walt Daniels, IBM Research (RISC land)

rws@EXPO.LCS.MIT.EDU (05/17/89)

    I would not want to see people defensively coding
    all their shifts in this fashion because one
    machine got it wrong. It will penalize all the
    machines that got it right.

I only wish we could take this attitude and get away with it!  We spend
half our lives mangling X code to work with each vendor's particular set
of bugs in their compiler/OS/utilities.  (And IBM has been one of those
offenders :-)

faulkner@jmullins.harvard.edu (Don Faulkner) (05/18/89)

all this talk about sun's compiler loosing on shifts is 
a good argument for "gcc" -- it does it right.
Amazing - the difference between two compilers  :-(
--

 Don Faulkner                                       
 Building 1, Room 803
 Harvard University, School of Public Health
 665 Huntington Avenue
 Boston, MA  02115

 ARPA:      faulkner%jmullins@harvard.harvard.edu                
 BITNET:    faulkner@harvard
 Telephone: (617) 732-2297

jdi@franz.UUCP (John Irwin) (05/18/89)

Your message:

    all this talk about sun's compiler loosing on shifts is 
    a good argument for "gcc" -- it does it right.
    Amazing - the difference between two compilers  :-(
    --

     Don Faulkner                                       
     ...
--------

I'm not sure what you're basing your praise of gcc on.  Here at Franz, we
use an X11 server compiled with gcc.  Before I applied Richard's fix,
if you opaque moved a window around the screen eventually it would be
competely trashed.  Now it works great!

	-- John
	   jdi@franz.com

erg@ulysses.homer.nj.att.com (Emden Gansner[drew]) (05/19/89)

Technically, there is no compiler bug. Old C (K&R) and new C
(K&R, 2 ed.; ANSI draft, Dec. 88) both specify that the shift
expression is undefined if the value of the right operand is
greater than or equal to the number of bit's in the left
operand's type.

guy@auspex.auspex.com (Guy Harris) (05/19/89)

>On the sparc, x >> y == x >> (y % 32)  !!!! (think about it)
>Thus 0xFFFFFFFF >> 32 == 0xFFFFFFFF.
>
>This means that when the masks are generated and you need to shift by
> >= 32 bits, you don't get what you expected.
>
>Other machines do the expected (to the people who wrote mfbbitblt.c) 
>and put 0x0 in the result.

A*hem*.  *Some* other machines do the expected, others don't.  SPARC is
*not* unique here.  The Intel 80386, as well as, I think, AT&T's 3B20
and their WE32K microprocessors, take the shift count modulo 32.

>I don't particularly want to create a religious war as to whether a strictly 
>out of range value such as 32 should have the results defined or not,

Good idea, since one very common processor - the 386 - does it one way
(the way you don't want it done), and other very common processors do it
differently; that's why the C standard leaves it undefined.

>Unfortunately  we don't have the original source before we put all our 
>speedups in, so I can't give a proper patch file.
>Note that it is written so that only the sparc is affected.

Bad idea, unless 80386 C compilers always generate code to do the extra
stuff to ensure that a shift count > 32 gives a result of 0.  I suspect
few, if any, do.

>I guess this should really be logged as a bug in the mfbbitblt code.

Good idea; if the code does, in fact, depend on a shift count > 32
giving a zero result, then it's depending on something that is not
guaranteed by the C language.

guy@auspex.auspex.com (Guy Harris) (05/19/89)

>Richard suggested putting in ifdef code for sparc's
>to fix a design bug in the sparc architecture.
>In fact he should just complain to his compiler supplier
>for not supporting ANSI C. ANSI C is quite clear
>that the result of a shift of 32 is 0.

The December 7, 1988 ANSI C draft, at least, is quite clear that the
result of a shift of 32 is *undefined*; if you want, I can cite chapter
and verse.  I tend to doubt that this changed in the version of the
standard proposed to X3J11. 

>I would not want to see people defensively coding
>all their shifts in this fashion because one
>machine got it wrong. It will penalize all the
>machines that got it right.

More than one machine "got it wrong"; the 80386 - you know, the one used
by YOUR COMPANY's high-end PS/2 machines - also "got it wrong" (check
out the 80386 Programmer's Reference Manual).  Unless the proposed ANSI
C standard differs from the December 7, 1988 draft, people had damn well
better code defensively. 

BTW, if you use "#if"s, it only penalizes people porting the code, not
the machines that got it right.

jm36+@andrew.cmu.edu (John Gardiner Myers) (05/19/89)

DAN@YKTVMV.BITNET writes:
> Richard suggested putting in ifdef code for sparc's
> to fix a design bug in the sparc architecture.
> In fact he should just complain to his compiler supplier
> for not supporting ANSI C. ANSI C is quite clear
> that the result of a shift of 32 is 0.

Please quote chapter and verse.  All of the C reference manuals I have
seen state that the result is undefined.  For example, K&R 2nd ed.,
section A7.8 says: (emphasis mine)

"The result is undefined if the right operand is negative, or grater
than or EQUAL TO the number of bits in the left expression's type."

> I would not want to see people defensively coding
> all their shifts in this fashion because one
> machine got it wrong. It will penalize all the
> machines that got it right.

I would like to see people coding all their shifts in order to stay
within the defined semantics of the language.

-- 
_.John G. Myers		Internet: jm36+@andrew.cmu.edu
(412) 268-2984		LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up


> Walt Daniels, IBM Research (RISC land)


				_.John