[comp.windows.x] Speedups for X11.2 Xsun

spaf@cs.purdue.EDU (Gene Spafford) (05/27/88)

Sam Kimery (kimery@helicon.math.purdue.edu) and I have been looking
at ways to speed up the Sun 3 X11 server.  We've gotten some
interesting preliminary results that we'd like to share so that others
can begin to make use of them, and possibly be inspired to find
further optimizations.

Suggestion 1 may not be possible for everyone, but it makes a
significant difference.  Suggestion 2 is simple, but doesn't help
much unless you have lots of applications running.  Suggestion 3 provides
a very meaningful speedup and should work for servers other than Sun 3
systems, although we haven't yet tried it on any others.

Feedback & comments are welcomed!


Speed-up #1
-----------
Compile the Xsun server with the GNU gcc compiler.  We've built the
server with version 1.22 of the gcc compiler and found it to be
observably faster (no formal benchmarks yet).  Other benchmarks of
CPU-intensive code have shown GNU code to be up to twice as fast as
code from the Sun compiler.  Plus, as an added benefit, the
GNU-compiled Xsun object is almost 90K smaller than the Sun-cc version,
making 10 more 8K pages free on a Sun 3/50.  (Aside: raises some
interesting questions about the compiler technology used, eh?)

There are a few important things to do when compiling with gcc:
   1) Don't use the "-finline-functions" option unless you later want
   to go back and recompile selected modules without it.  The
   inline-function optimizer code is not yet fully debugged and
   sometimes messes up and dumps core, or produces faulty modules.
   2) Use the "-traditional" compiler option.  A good set of options to
   put in Sun.macros is "-O -traditional -fcombine-regs".  You may also
   need "-msoft-float" if you don't have the 68881 chip.
   3) You need to compile server/os/4.2bsd/oscolor.c with the Sun-cc
   compiler, or else you need to recompile and link the dbm library
   into the server (remember to remove the "-ldbm" from the Makefile in
   the server directory, too).  The GNU compiler handles functions
   returning structures differently than pcc-based compilers, and
   "oscolor.c" uses the DBM "fetch" function to get items from a
   database.  The distinctive symptom of forgetting this is that, on
   monochrome displays, your xterm windows may come up all black or
   all white (depending on your normal preference), and color monitors
   will show only straight B/W.
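
To illustrate item 3, here is a minimal sketch of one way to keep a
struct-returning call confined to a single source file, so that only
that file has to be built with the same compiler as the library.  The
names "item", "lookup" and "lookup_into" are hypothetical stand-ins
(for dbm's datum and fetch); nothing here is taken from oscolor.c.

```c
#include <string.h>

/* Hypothetical stand-in for dbm's datum; the real type is in <dbm.h>. */
typedef struct {
    char *dptr;
    int   dsize;
} item;

/* Hypothetical stand-in for a library routine that, like fetch(),
 * returns a structure by value.  It knows exactly one key. */
item lookup(item key)
{
    static char value[] = "0 0 0";
    item result = { NULL, 0 };

    if (key.dsize == 5 && memcmp(key.dptr, "black", 5) == 0) {
        result.dptr = value;
        result.dsize = (int)sizeof value - 1;
    }
    return result;
}

/* Wrapper: the by-value return never leaves this file.  Callers pass
 * and receive pointers only, so they do not depend on how the
 * compiler hands back structures.  Returns 1 if the key was found. */
int lookup_into(const item *key, item *out)
{
    *out = lookup(*key);
    return out->dptr != NULL;
}
```

Only the file holding the wrapper would then need the vendor
compiler; whether that is less work than compiling oscolor.c with
Sun cc as described above is a judgment call.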

If you compile any of the clients, you may also need to use
"-fwriteable-strings" to get around some sloppy coding, notably in
xterm.  Within a few weeks we may post a complete set of diffs for
compiling all the libraries and clients using all the gcc
optimizations.  Mostly these have required cleaning up some constant
placement and code ordering.  In most cases, however, compiling them
with gcc doesn't make a major difference.


Speed-up #2
-----------
Raise the priority of your server relative to other processes.  Put in
the following patch, and then install the server setuid to root.  Note
that as soon as the server has changed its priority, it switches back
to the invoking user's uid for normal operation.  This higher priority
helps to keep the screen up-to-date.  Beware of boosting it too much,
however, because it may lock you out of your window manager and xterms
if some screen-intensive application is running.

Also note that by making the Xsun server setuid to root, you can make
proper use of the log file defined in ADMPATH if that happens to point
to a restricted directory.
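
For reference, the boost-then-drop sequence can be sketched on its own
with error checking added.  This is an illustration only: the function
name "boost_then_drop" is hypothetical, and PRIO_PROCESS is the
symbolic name for the 0 that the patch passes as the first argument.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical wrapper, not part of the posted patch: raise the
 * scheduling priority of the calling process, then permanently give
 * up root.  Returns 0 if the process ends up running with its real
 * uid, -1 otherwise. */
int boost_then_drop(int niceval)
{
    /* A "who" of 0 means the calling process.  Failure is not fatal:
     * without root the process simply keeps its old priority. */
    if (setpriority(PRIO_PROCESS, 0, niceval) == -1)
        perror("setpriority");

    /* Drop privileges unconditionally, even if the boost failed. */
    if (setuid(getuid()) == -1) {
        perror("setuid");
        return -1;
    }
    return geteuid() == getuid() ? 0 : -1;
}
```

Calling boost_then_drop(-5) mirrors the values used in the patch.
Checking the setuid result matters more than checking setpriority,
since a failed drop would leave the server running as root.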

The patch goes in server/os/4.2bsd/osinit.c

*** osinit.c.orig	Mon Feb 22 10:43:02 1988
--- osinit.c	Sun Mar 27 19:55:44 1988
***************
*** 60,65 ****
--- 60,67 ----
  	if (getpgrp (0) == 0)
  	    setpgrp (0, getpid ());
  
+ 	setpriority(0, 0, -5);
+ 	setuid(getuid());
  	been_here = TRUE;
      }
  


Speed-up #3
-----------
The following changes to the code in server/ddx/mfb make use of a known
property of shifting unsigned quantities (zeros are shifted in), and
of a much larger precomputed table of constants to avoid
repeated loads, masks & stores.  The result is an incredible drop in
the amount of time spent doing bitblt operations, and the resulting
server is easily twice as fast as before (we're still running benchmarks
to come up with more definite figures).

Drawback: well, you have to define 1024 longwords of table instead of
64, but the extra storage obviates at least one AND, one ADD and one
memory reference for EVERY putbit mask operation.  Since there are
hundreds or thousands of those for even simple movements and exposures,
the tradeoff is worth it.
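
As a rough sketch of what the big table holds, the MSB-first entries
can be computed rather than typed in.  The function names below are
hypothetical and are not part of the patch.  "two_table_mask" is our
reading of the old startpartial/endpartial scheme (with the entries
for 0 being all ones), so treat it as an assumption rather than a copy
of the original tables.

```c
#include <assert.h>
#include <stdint.h>

#define ALL_ONES ((uint32_t)0xFFFFFFFFu)

/* Value of partmasks[x][w], MSB first: w one-bits starting x bits in
 * from the most significant end of a 32-bit word. */
uint32_t partial_mask(unsigned x, unsigned w)
{
    x &= 0x1f;
    w &= 0x1f;
    if (w == 0)                 /* w == 32 wraps to 0: only x == 0 fits */
        return x == 0 ? ALL_ONES : 0u;
    if (x + w > 32)             /* span leaves the word: entry is 0 */
        return 0u;
    return (ALL_ONES >> x) & (ALL_ONES << (32 - x - w));
}

/* Assumed shape of the old scheme: two lookups ANDed together. */
uint32_t two_table_mask(unsigned x, unsigned w)
{
    unsigned n = (x + w) & 0x1f;
    uint32_t start = ALL_ONES >> (x & 0x1f);
    uint32_t end = (n == 0) ? ALL_ONES : ~(ALL_ONES >> n);
    return start & end;
}

/* Returns 1 if both schemes agree for every span that fits in one
 * word, which is the only case maskpartialbits is used for. */
int masks_agree(void)
{
    unsigned x, w;

    for (x = 0; x < 32; x++)
        for (w = 1; x + w <= 32; w++)
            if (partial_mask(x, w) != two_table_mask(x, w))
                return 0;
    return 1;
}
```

For example, partial_mask(3, 2) gives 0x18000000, matching the third
entry of row 3 in the MSB-first table in the patch.  The single
two-dimensional lookup replaces the two lookups and the AND in
two_table_mask.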

Note that we have ifdef'd these changes with PURDUE.  You either need
to change these to a local define, or else set -DPURDUE in
util/imake.includes/sun.macros in the definition of ServerDefines.  Then
do a "make Makefiles; make depend" in the top-level server directory.

There is no more text beyond this point -- just patches to the code.

=====================================================
*** maskbits.h.orig	Wed May 25 22:38:33 1988
--- maskbits.h	Thu May 26 22:58:07 1988
***************
*** 28,35 ****
--- 28,39 ----
  
  extern int starttab[];
  extern int endtab[];
+ #ifndef PURDUE
  extern int startpartial[];
  extern int endpartial[];
+ #else PURDUE
+ extern unsigned partmasks[32][32];
+ #endif PURDUE
  extern int rmask[32];
  extern int mask[32];
  
***************
*** 206,213 ****
--- 210,222 ----
      else \
  	nlw = (w) >> 5;
  
+ #ifndef PURDUE
  #define maskpartialbits(x, w, mask) \
      mask = startpartial[(x) & 0x1f] & endpartial[((x) + (w)) & 0x1f];
+ #else PURDUE
+ #define maskpartialbits(x, w, mask) \
+     mask = partmasks[(x)&0x1f][(w)&0x1f];
+ #endif PURDUE
  
  #define mask32bits(x, w, startmask, endmask) \
      startmask = starttab[(x)&0x1f]; \
***************
*** 214,219 ****
--- 223,230 ----
      endmask = endtab[((x)+(w)) & 0x1f];
  
  
+ #ifndef PURDUE
+ 
  #define getbits(psrc, x, w, dst) \
  if ( ((x) + (w)) <= 32) \
  { \
***************
*** 244,249 ****
--- 255,288 ----
      *(pdst) = (*(pdst) & endtab[x]) | (SCRRIGHT(src, x) & starttab[x]); \
      *((pdst)+1) = (*((pdst)+1) & starttab[n]) | (SCRLEFT(src, m) & endtab[n]); \
  }
+ 
+ #else PURDUE
+ #define getbits(psrc, x, w, dst) \
+ if ( ((x) + (w)) <= 32) \
+ { \
+     dst = SCRLEFT(*(psrc), (x)); \
+ } \
+ else \
+ { \
+     dst = (SCRLEFT(*(psrc), (x))) | \
+ 	  (SCRRIGHT((unsigned) *((psrc)+1), 32-(x))); \
+ }
+ 
+ #define putbits(src, x, w, pdst) \
+ if ( ((x)+(w)) <= 32) \
+ { \
+     unsigned tmpmask; \
+     maskpartialbits((x), (w), tmpmask); \
+     *(pdst) = (*(pdst) & ~tmpmask) | (SCRRIGHT(src, x) & tmpmask); \
+ } \
+ else \
+ { \
+     unsigned int *ptmp_ = (pdst); \
+     int m = 32-(x); \
+     *ptmp_++ = (*(pdst) & endtab[x]) | (SCRRIGHT((unsigned) (src), x)); \
+     *ptmp_ = (*ptmp_ & starttab[(w)-m]) | SCRLEFT(src, m); \
+ }
+ #endif PURDUE
  
  #define putbitsrop(src, x, w, pdst, rop) \
  if ( ((x)+(w)) <= 32) \

*** maskbits.c.orig	Thu May 26 22:30:47 1988
--- maskbits.c	Thu May 26 22:34:51 1988
***************
*** 116,121 ****
--- 116,122 ----
  	0xFFFFFFFE
      };
  
+ #ifndef PURDUE
  /* a hack, for now, since the entries for 0 need to be all
     1 bits, not all zeros.
     this means the code DOES NOT WORK for segments of length
***************
*** 192,197 ****
--- 193,458 ----
  	0xFFFFFFFC,
  	0xFFFFFFFE
      };
+ #else PURDUE
+ unsigned int partmasks[32][32] = {
+      {0xFFFFFFFF, 0x80000000, 0xC0000000, 0xE0000000,
+       0xF0000000, 0xF8000000, 0xFC000000, 0xFE000000,
+       0xFF000000, 0xFF800000, 0xFFC00000, 0xFFE00000,
+       0xFFF00000, 0xFFF80000, 0xFFFC0000, 0xFFFE0000,
+       0xFFFF0000, 0xFFFF8000, 0xFFFFC000, 0xFFFFE000,
+       0xFFFFF000, 0xFFFFF800, 0xFFFFFC00, 0xFFFFFE00,
+       0xFFFFFF00, 0xFFFFFF80, 0xFFFFFFC0, 0xFFFFFFE0,
+       0xFFFFFFF0, 0xFFFFFFF8, 0xFFFFFFFC, 0xFFFFFFFE},
+      {0x00000000, 0x40000000, 0x60000000, 0x70000000,
+       0x78000000, 0x7C000000, 0x7E000000, 0x7F000000,
+       0x7F800000, 0x7FC00000, 0x7FE00000, 0x7FF00000,
+       0x7FF80000, 0x7FFC0000, 0x7FFE0000, 0x7FFF0000,
+       0x7FFF8000, 0x7FFFC000, 0x7FFFE000, 0x7FFFF000,
+       0x7FFFF800, 0x7FFFFC00, 0x7FFFFE00, 0x7FFFFF00,
+       0x7FFFFF80, 0x7FFFFFC0, 0x7FFFFFE0, 0x7FFFFFF0,
+       0x7FFFFFF8, 0x7FFFFFFC, 0x7FFFFFFE, 0x7FFFFFFF},
+      {0x00000000, 0x20000000, 0x30000000, 0x38000000,
+       0x3C000000, 0x3E000000, 0x3F000000, 0x3F800000,
+       0x3FC00000, 0x3FE00000, 0x3FF00000, 0x3FF80000,
+       0x3FFC0000, 0x3FFE0000, 0x3FFF0000, 0x3FFF8000,
+       0x3FFFC000, 0x3FFFE000, 0x3FFFF000, 0x3FFFF800,
+       0x3FFFFC00, 0x3FFFFE00, 0x3FFFFF00, 0x3FFFFF80,
+       0x3FFFFFC0, 0x3FFFFFE0, 0x3FFFFFF0, 0x3FFFFFF8,
+       0x3FFFFFFC, 0x3FFFFFFE, 0x3FFFFFFF, 0x00000000},
+      {0x00000000, 0x10000000, 0x18000000, 0x1C000000,
+       0x1E000000, 0x1F000000, 0x1F800000, 0x1FC00000,
+       0x1FE00000, 0x1FF00000, 0x1FF80000, 0x1FFC0000,
+       0x1FFE0000, 0x1FFF0000, 0x1FFF8000, 0x1FFFC000,
+       0x1FFFE000, 0x1FFFF000, 0x1FFFF800, 0x1FFFFC00,
+       0x1FFFFE00, 0x1FFFFF00, 0x1FFFFF80, 0x1FFFFFC0,
+       0x1FFFFFE0, 0x1FFFFFF0, 0x1FFFFFF8, 0x1FFFFFFC,
+       0x1FFFFFFE, 0x1FFFFFFF, 0x00000000, 0x00000000},
+      {0x00000000, 0x08000000, 0x0C000000, 0x0E000000,
+       0x0F000000, 0x0F800000, 0x0FC00000, 0x0FE00000,
+       0x0FF00000, 0x0FF80000, 0x0FFC0000, 0x0FFE0000,
+       0x0FFF0000, 0x0FFF8000, 0x0FFFC000, 0x0FFFE000,
+       0x0FFFF000, 0x0FFFF800, 0x0FFFFC00, 0x0FFFFE00,
+       0x0FFFFF00, 0x0FFFFF80, 0x0FFFFFC0, 0x0FFFFFE0,
+       0x0FFFFFF0, 0x0FFFFFF8, 0x0FFFFFFC, 0x0FFFFFFE,
+       0x0FFFFFFF, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x04000000, 0x06000000, 0x07000000,
+       0x07800000, 0x07C00000, 0x07E00000, 0x07F00000,
+       0x07F80000, 0x07FC0000, 0x07FE0000, 0x07FF0000,
+       0x07FF8000, 0x07FFC000, 0x07FFE000, 0x07FFF000,
+       0x07FFF800, 0x07FFFC00, 0x07FFFE00, 0x07FFFF00,
+       0x07FFFF80, 0x07FFFFC0, 0x07FFFFE0, 0x07FFFFF0,
+       0x07FFFFF8, 0x07FFFFFC, 0x07FFFFFE, 0x07FFFFFF,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x02000000, 0x03000000, 0x03800000,
+       0x03C00000, 0x03E00000, 0x03F00000, 0x03F80000,
+       0x03FC0000, 0x03FE0000, 0x03FF0000, 0x03FF8000,
+       0x03FFC000, 0x03FFE000, 0x03FFF000, 0x03FFF800,
+       0x03FFFC00, 0x03FFFE00, 0x03FFFF00, 0x03FFFF80,
+       0x03FFFFC0, 0x03FFFFE0, 0x03FFFFF0, 0x03FFFFF8,
+       0x03FFFFFC, 0x03FFFFFE, 0x03FFFFFF, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x01000000, 0x01800000, 0x01C00000,
+       0x01E00000, 0x01F00000, 0x01F80000, 0x01FC0000,
+       0x01FE0000, 0x01FF0000, 0x01FF8000, 0x01FFC000,
+       0x01FFE000, 0x01FFF000, 0x01FFF800, 0x01FFFC00,
+       0x01FFFE00, 0x01FFFF00, 0x01FFFF80, 0x01FFFFC0,
+       0x01FFFFE0, 0x01FFFFF0, 0x01FFFFF8, 0x01FFFFFC,
+       0x01FFFFFE, 0x01FFFFFF, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00800000, 0x00C00000, 0x00E00000,
+       0x00F00000, 0x00F80000, 0x00FC0000, 0x00FE0000,
+       0x00FF0000, 0x00FF8000, 0x00FFC000, 0x00FFE000,
+       0x00FFF000, 0x00FFF800, 0x00FFFC00, 0x00FFFE00,
+       0x00FFFF00, 0x00FFFF80, 0x00FFFFC0, 0x00FFFFE0,
+       0x00FFFFF0, 0x00FFFFF8, 0x00FFFFFC, 0x00FFFFFE,
+       0x00FFFFFF, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00400000, 0x00600000, 0x00700000,
+       0x00780000, 0x007C0000, 0x007E0000, 0x007F0000,
+       0x007F8000, 0x007FC000, 0x007FE000, 0x007FF000,
+       0x007FF800, 0x007FFC00, 0x007FFE00, 0x007FFF00,
+       0x007FFF80, 0x007FFFC0, 0x007FFFE0, 0x007FFFF0,
+       0x007FFFF8, 0x007FFFFC, 0x007FFFFE, 0x007FFFFF,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00200000, 0x00300000, 0x00380000,
+       0x003C0000, 0x003E0000, 0x003F0000, 0x003F8000,
+       0x003FC000, 0x003FE000, 0x003FF000, 0x003FF800,
+       0x003FFC00, 0x003FFE00, 0x003FFF00, 0x003FFF80,
+       0x003FFFC0, 0x003FFFE0, 0x003FFFF0, 0x003FFFF8,
+       0x003FFFFC, 0x003FFFFE, 0x003FFFFF, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00100000, 0x00180000, 0x001C0000,
+       0x001E0000, 0x001F0000, 0x001F8000, 0x001FC000,
+       0x001FE000, 0x001FF000, 0x001FF800, 0x001FFC00,
+       0x001FFE00, 0x001FFF00, 0x001FFF80, 0x001FFFC0,
+       0x001FFFE0, 0x001FFFF0, 0x001FFFF8, 0x001FFFFC,
+       0x001FFFFE, 0x001FFFFF, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00080000, 0x000C0000, 0x000E0000,
+       0x000F0000, 0x000F8000, 0x000FC000, 0x000FE000,
+       0x000FF000, 0x000FF800, 0x000FFC00, 0x000FFE00,
+       0x000FFF00, 0x000FFF80, 0x000FFFC0, 0x000FFFE0,
+       0x000FFFF0, 0x000FFFF8, 0x000FFFFC, 0x000FFFFE,
+       0x000FFFFF, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00040000, 0x00060000, 0x00070000,
+       0x00078000, 0x0007C000, 0x0007E000, 0x0007F000,
+       0x0007F800, 0x0007FC00, 0x0007FE00, 0x0007FF00,
+       0x0007FF80, 0x0007FFC0, 0x0007FFE0, 0x0007FFF0,
+       0x0007FFF8, 0x0007FFFC, 0x0007FFFE, 0x0007FFFF,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00020000, 0x00030000, 0x00038000,
+       0x0003C000, 0x0003E000, 0x0003F000, 0x0003F800,
+       0x0003FC00, 0x0003FE00, 0x0003FF00, 0x0003FF80,
+       0x0003FFC0, 0x0003FFE0, 0x0003FFF0, 0x0003FFF8,
+       0x0003FFFC, 0x0003FFFE, 0x0003FFFF, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00010000, 0x00018000, 0x0001C000,
+       0x0001E000, 0x0001F000, 0x0001F800, 0x0001FC00,
+       0x0001FE00, 0x0001FF00, 0x0001FF80, 0x0001FFC0,
+       0x0001FFE0, 0x0001FFF0, 0x0001FFF8, 0x0001FFFC,
+       0x0001FFFE, 0x0001FFFF, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00008000, 0x0000C000, 0x0000E000,
+       0x0000F000, 0x0000F800, 0x0000FC00, 0x0000FE00,
+       0x0000FF00, 0x0000FF80, 0x0000FFC0, 0x0000FFE0,
+       0x0000FFF0, 0x0000FFF8, 0x0000FFFC, 0x0000FFFE,
+       0x0000FFFF, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00004000, 0x00006000, 0x00007000,
+       0x00007800, 0x00007C00, 0x00007E00, 0x00007F00,
+       0x00007F80, 0x00007FC0, 0x00007FE0, 0x00007FF0,
+       0x00007FF8, 0x00007FFC, 0x00007FFE, 0x00007FFF,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00002000, 0x00003000, 0x00003800,
+       0x00003C00, 0x00003E00, 0x00003F00, 0x00003F80,
+       0x00003FC0, 0x00003FE0, 0x00003FF0, 0x00003FF8,
+       0x00003FFC, 0x00003FFE, 0x00003FFF, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00001000, 0x00001800, 0x00001C00,
+       0x00001E00, 0x00001F00, 0x00001F80, 0x00001FC0,
+       0x00001FE0, 0x00001FF0, 0x00001FF8, 0x00001FFC,
+       0x00001FFE, 0x00001FFF, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000800, 0x00000C00, 0x00000E00,
+       0x00000F00, 0x00000F80, 0x00000FC0, 0x00000FE0,
+       0x00000FF0, 0x00000FF8, 0x00000FFC, 0x00000FFE,
+       0x00000FFF, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000400, 0x00000600, 0x00000700,
+       0x00000780, 0x000007C0, 0x000007E0, 0x000007F0,
+       0x000007F8, 0x000007FC, 0x000007FE, 0x000007FF,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000200, 0x00000300, 0x00000380,
+       0x000003C0, 0x000003E0, 0x000003F0, 0x000003F8,
+       0x000003FC, 0x000003FE, 0x000003FF, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000100, 0x00000180, 0x000001C0,
+       0x000001E0, 0x000001F0, 0x000001F8, 0x000001FC,
+       0x000001FE, 0x000001FF, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000080, 0x000000C0, 0x000000E0,
+       0x000000F0, 0x000000F8, 0x000000FC, 0x000000FE,
+       0x000000FF, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000040, 0x00000060, 0x00000070,
+       0x00000078, 0x0000007C, 0x0000007E, 0x0000007F,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000020, 0x00000030, 0x00000038,
+       0x0000003C, 0x0000003E, 0x0000003F, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000010, 0x00000018, 0x0000001C,
+       0x0000001E, 0x0000001F, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000008, 0x0000000C, 0x0000000E,
+       0x0000000F, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000004, 0x00000006, 0x00000007,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000002, 0x00000003, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000001, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+ };
+ #endif PURDUE
  #else		/* LSBFirst */
  /* NOTE:
  the first element in starttab could be 0xffffffff.  making it 0
***************
*** 271,276 ****
--- 532,538 ----
  	0x7FFFFFFF
  	};
  
+ #ifndef PURDUE
  /* a hack, for now, since the entries for 0 need to be all
     1 bits, not all zeros.
     this means the code DOES NOT WORK for segments of length
***************
*** 347,352 ****
--- 609,874 ----
  	0x3FFFFFFF,
  	0x7FFFFFFF
  	};
+ #else PURDUE
+ unsigned int partmasks[32][32] = {
+      {0xFFFFFFFF, 0x00000001, 0x00000003, 0x00000007,
+       0x0000000F, 0x0000001F, 0x0000003F, 0x0000007F,
+       0x000000FF, 0x000001FF, 0x000003FF, 0x000007FF,
+       0x00000FFF, 0x00001FFF, 0x00003FFF, 0x00007FFF,
+       0x0000FFFF, 0x0001FFFF, 0x0003FFFF, 0x0007FFFF,
+       0x000FFFFF, 0x001FFFFF, 0x003FFFFF, 0x007FFFFF,
+       0x00FFFFFF, 0x01FFFFFF, 0x03FFFFFF, 0x07FFFFFF,
+       0x0FFFFFFF, 0x1FFFFFFF, 0x3FFFFFFF, 0x7FFFFFFF},
+      {0x00000000, 0x00000002, 0x00000006, 0x0000000E,
+       0x0000001E, 0x0000003E, 0x0000007E, 0x000000FE,
+       0x000001FE, 0x000003FE, 0x000007FE, 0x00000FFE,
+       0x00001FFE, 0x00003FFE, 0x00007FFE, 0x0000FFFE,
+       0x0001FFFE, 0x0003FFFE, 0x0007FFFE, 0x000FFFFE,
+       0x001FFFFE, 0x003FFFFE, 0x007FFFFE, 0x00FFFFFE,
+       0x01FFFFFE, 0x03FFFFFE, 0x07FFFFFE, 0x0FFFFFFE,
+       0x1FFFFFFE, 0x3FFFFFFE, 0x7FFFFFFE, 0xFFFFFFFE},
+      {0x00000000, 0x00000004, 0x0000000C, 0x0000001C,
+       0x0000003C, 0x0000007C, 0x000000FC, 0x000001FC,
+       0x000003FC, 0x000007FC, 0x00000FFC, 0x00001FFC,
+       0x00003FFC, 0x00007FFC, 0x0000FFFC, 0x0001FFFC,
+       0x0003FFFC, 0x0007FFFC, 0x000FFFFC, 0x001FFFFC,
+       0x003FFFFC, 0x007FFFFC, 0x00FFFFFC, 0x01FFFFFC,
+       0x03FFFFFC, 0x07FFFFFC, 0x0FFFFFFC, 0x1FFFFFFC,
+       0x3FFFFFFC, 0x7FFFFFFC, 0xFFFFFFFC, 0x00000000},
+      {0x00000000, 0x00000008, 0x00000018, 0x00000038,
+       0x00000078, 0x000000F8, 0x000001F8, 0x000003F8,
+       0x000007F8, 0x00000FF8, 0x00001FF8, 0x00003FF8,
+       0x00007FF8, 0x0000FFF8, 0x0001FFF8, 0x0003FFF8,
+       0x0007FFF8, 0x000FFFF8, 0x001FFFF8, 0x003FFFF8,
+       0x007FFFF8, 0x00FFFFF8, 0x01FFFFF8, 0x03FFFFF8,
+       0x07FFFFF8, 0x0FFFFFF8, 0x1FFFFFF8, 0x3FFFFFF8,
+       0x7FFFFFF8, 0xFFFFFFF8, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000010, 0x00000030, 0x00000070,
+       0x000000F0, 0x000001F0, 0x000003F0, 0x000007F0,
+       0x00000FF0, 0x00001FF0, 0x00003FF0, 0x00007FF0,
+       0x0000FFF0, 0x0001FFF0, 0x0003FFF0, 0x0007FFF0,
+       0x000FFFF0, 0x001FFFF0, 0x003FFFF0, 0x007FFFF0,
+       0x00FFFFF0, 0x01FFFFF0, 0x03FFFFF0, 0x07FFFFF0,
+       0x0FFFFFF0, 0x1FFFFFF0, 0x3FFFFFF0, 0x7FFFFFF0,
+       0xFFFFFFF0, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000020, 0x00000060, 0x000000E0,
+       0x000001E0, 0x000003E0, 0x000007E0, 0x00000FE0,
+       0x00001FE0, 0x00003FE0, 0x00007FE0, 0x0000FFE0,
+       0x0001FFE0, 0x0003FFE0, 0x0007FFE0, 0x000FFFE0,
+       0x001FFFE0, 0x003FFFE0, 0x007FFFE0, 0x00FFFFE0,
+       0x01FFFFE0, 0x03FFFFE0, 0x07FFFFE0, 0x0FFFFFE0,
+       0x1FFFFFE0, 0x3FFFFFE0, 0x7FFFFFE0, 0xFFFFFFE0,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000040, 0x000000C0, 0x000001C0,
+       0x000003C0, 0x000007C0, 0x00000FC0, 0x00001FC0,
+       0x00003FC0, 0x00007FC0, 0x0000FFC0, 0x0001FFC0,
+       0x0003FFC0, 0x0007FFC0, 0x000FFFC0, 0x001FFFC0,
+       0x003FFFC0, 0x007FFFC0, 0x00FFFFC0, 0x01FFFFC0,
+       0x03FFFFC0, 0x07FFFFC0, 0x0FFFFFC0, 0x1FFFFFC0,
+       0x3FFFFFC0, 0x7FFFFFC0, 0xFFFFFFC0, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000080, 0x00000180, 0x00000380,
+       0x00000780, 0x00000F80, 0x00001F80, 0x00003F80,
+       0x00007F80, 0x0000FF80, 0x0001FF80, 0x0003FF80,
+       0x0007FF80, 0x000FFF80, 0x001FFF80, 0x003FFF80,
+       0x007FFF80, 0x00FFFF80, 0x01FFFF80, 0x03FFFF80,
+       0x07FFFF80, 0x0FFFFF80, 0x1FFFFF80, 0x3FFFFF80,
+       0x7FFFFF80, 0xFFFFFF80, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000100, 0x00000300, 0x00000700,
+       0x00000F00, 0x00001F00, 0x00003F00, 0x00007F00,
+       0x0000FF00, 0x0001FF00, 0x0003FF00, 0x0007FF00,
+       0x000FFF00, 0x001FFF00, 0x003FFF00, 0x007FFF00,
+       0x00FFFF00, 0x01FFFF00, 0x03FFFF00, 0x07FFFF00,
+       0x0FFFFF00, 0x1FFFFF00, 0x3FFFFF00, 0x7FFFFF00,
+       0xFFFFFF00, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000200, 0x00000600, 0x00000E00,
+       0x00001E00, 0x00003E00, 0x00007E00, 0x0000FE00,
+       0x0001FE00, 0x0003FE00, 0x0007FE00, 0x000FFE00,
+       0x001FFE00, 0x003FFE00, 0x007FFE00, 0x00FFFE00,
+       0x01FFFE00, 0x03FFFE00, 0x07FFFE00, 0x0FFFFE00,
+       0x1FFFFE00, 0x3FFFFE00, 0x7FFFFE00, 0xFFFFFE00,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000400, 0x00000C00, 0x00001C00,
+       0x00003C00, 0x00007C00, 0x0000FC00, 0x0001FC00,
+       0x0003FC00, 0x0007FC00, 0x000FFC00, 0x001FFC00,
+       0x003FFC00, 0x007FFC00, 0x00FFFC00, 0x01FFFC00,
+       0x03FFFC00, 0x07FFFC00, 0x0FFFFC00, 0x1FFFFC00,
+       0x3FFFFC00, 0x7FFFFC00, 0xFFFFFC00, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00000800, 0x00001800, 0x00003800,
+       0x00007800, 0x0000F800, 0x0001F800, 0x0003F800,
+       0x0007F800, 0x000FF800, 0x001FF800, 0x003FF800,
+       0x007FF800, 0x00FFF800, 0x01FFF800, 0x03FFF800,
+       0x07FFF800, 0x0FFFF800, 0x1FFFF800, 0x3FFFF800,
+       0x7FFFF800, 0xFFFFF800, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00001000, 0x00003000, 0x00007000,
+       0x0000F000, 0x0001F000, 0x0003F000, 0x0007F000,
+       0x000FF000, 0x001FF000, 0x003FF000, 0x007FF000,
+       0x00FFF000, 0x01FFF000, 0x03FFF000, 0x07FFF000,
+       0x0FFFF000, 0x1FFFF000, 0x3FFFF000, 0x7FFFF000,
+       0xFFFFF000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00002000, 0x00006000, 0x0000E000,
+       0x0001E000, 0x0003E000, 0x0007E000, 0x000FE000,
+       0x001FE000, 0x003FE000, 0x007FE000, 0x00FFE000,
+       0x01FFE000, 0x03FFE000, 0x07FFE000, 0x0FFFE000,
+       0x1FFFE000, 0x3FFFE000, 0x7FFFE000, 0xFFFFE000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00004000, 0x0000C000, 0x0001C000,
+       0x0003C000, 0x0007C000, 0x000FC000, 0x001FC000,
+       0x003FC000, 0x007FC000, 0x00FFC000, 0x01FFC000,
+       0x03FFC000, 0x07FFC000, 0x0FFFC000, 0x1FFFC000,
+       0x3FFFC000, 0x7FFFC000, 0xFFFFC000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00008000, 0x00018000, 0x00038000,
+       0x00078000, 0x000F8000, 0x001F8000, 0x003F8000,
+       0x007F8000, 0x00FF8000, 0x01FF8000, 0x03FF8000,
+       0x07FF8000, 0x0FFF8000, 0x1FFF8000, 0x3FFF8000,
+       0x7FFF8000, 0xFFFF8000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00010000, 0x00030000, 0x00070000,
+       0x000F0000, 0x001F0000, 0x003F0000, 0x007F0000,
+       0x00FF0000, 0x01FF0000, 0x03FF0000, 0x07FF0000,
+       0x0FFF0000, 0x1FFF0000, 0x3FFF0000, 0x7FFF0000,
+       0xFFFF0000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00020000, 0x00060000, 0x000E0000,
+       0x001E0000, 0x003E0000, 0x007E0000, 0x00FE0000,
+       0x01FE0000, 0x03FE0000, 0x07FE0000, 0x0FFE0000,
+       0x1FFE0000, 0x3FFE0000, 0x7FFE0000, 0xFFFE0000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00040000, 0x000C0000, 0x001C0000,
+       0x003C0000, 0x007C0000, 0x00FC0000, 0x01FC0000,
+       0x03FC0000, 0x07FC0000, 0x0FFC0000, 0x1FFC0000,
+       0x3FFC0000, 0x7FFC0000, 0xFFFC0000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00080000, 0x00180000, 0x00380000,
+       0x00780000, 0x00F80000, 0x01F80000, 0x03F80000,
+       0x07F80000, 0x0FF80000, 0x1FF80000, 0x3FF80000,
+       0x7FF80000, 0xFFF80000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00100000, 0x00300000, 0x00700000,
+       0x00F00000, 0x01F00000, 0x03F00000, 0x07F00000,
+       0x0FF00000, 0x1FF00000, 0x3FF00000, 0x7FF00000,
+       0xFFF00000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00200000, 0x00600000, 0x00E00000,
+       0x01E00000, 0x03E00000, 0x07E00000, 0x0FE00000,
+       0x1FE00000, 0x3FE00000, 0x7FE00000, 0xFFE00000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00400000, 0x00C00000, 0x01C00000,
+       0x03C00000, 0x07C00000, 0x0FC00000, 0x1FC00000,
+       0x3FC00000, 0x7FC00000, 0xFFC00000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x00800000, 0x01800000, 0x03800000,
+       0x07800000, 0x0F800000, 0x1F800000, 0x3F800000,
+       0x7F800000, 0xFF800000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x01000000, 0x03000000, 0x07000000,
+       0x0F000000, 0x1F000000, 0x3F000000, 0x7F000000,
+       0xFF000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x02000000, 0x06000000, 0x0E000000,
+       0x1E000000, 0x3E000000, 0x7E000000, 0xFE000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x04000000, 0x0C000000, 0x1C000000,
+       0x3C000000, 0x7C000000, 0xFC000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x08000000, 0x18000000, 0x38000000,
+       0x78000000, 0xF8000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x10000000, 0x30000000, 0x70000000,
+       0xF0000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x20000000, 0x60000000, 0xE0000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x40000000, 0xC0000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+      {0x00000000, 0x80000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000,
+       0x00000000, 0x00000000, 0x00000000, 0x00000000},
+ };
+ #endif PURDUE
  #endif
  
  
-- 
Gene Spafford
NSF/Purdue/U of Florida  Software Engineering Research Center,
Dept. of Computer Sciences, Purdue University, W. Lafayette IN 47907-2004
Internet:  spaf@cs.purdue.edu	uucp:	...!{decwrl,gatech,ucbvax}!purdue!spaf
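
A note for anyone checking the tail of the partmasks table by eye: the
rows visible at the end of the patch above appear to follow a simple
rule.  Entry [x][w] has w one-bits starting at bit x (counting from the
least-significant end), and is zero when w is 0 or the span would run
past bit 31.  The sketch below regenerates entries under that reading.
The helper names are hypothetical and not part of the patch, and the
rule has only been checked against the rows shown here (x = 17 to 31),
so it makes no claim about row 0 or the other bit order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of the patch: mask for w pixels
   starting at pixel x, where pixel x is bit x counting from the
   least-significant end.  Spans that would run past bit 31 yield 0,
   as in the rows shown.  Intended for 1 <= x <= 31, 0 <= w <= 31. */
static uint32_t partmask_row_entry(unsigned x, unsigned w)
{
    if (w == 0 || x + w > 32)
        return 0;
    return ((UINT32_C(1) << w) - 1) << x;
}

/* Print one 32-entry row, four values per line, for eyeballing
   against the patch. */
static void print_row(unsigned x)
{
    unsigned w;

    for (w = 0; w < 32; w++) {
        printf(" 0x%08lX,", (unsigned long) partmask_row_entry(x, w));
        if (w % 4 == 3)
            printf("\n");
    }
}

/* Spot-check a few entries copied from the rows in the patch. */
static void check_visible_rows(void)
{
    assert(partmask_row_entry(17, 4)  == 0x001E0000u);
    assert(partmask_row_entry(18, 14) == 0xFFFC0000u);
    assert(partmask_row_entry(18, 15) == 0x00000000u);
    assert(partmask_row_entry(24, 8)  == 0xFF000000u);
    assert(partmask_row_entry(31, 1)  == 0x80000000u);
    assert(partmask_row_entry(31, 2)  == 0x00000000u);
}
```

Calling check_visible_rows() and then print_row(18) from a small main()
gives a row to compare against the one in the patch that begins
0x00000000, 0x00040000, 0x000C0000.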

spaf@cs.purdue.EDU (Gene Spafford) (05/28/88)

Adam de Boor (deboor@nutmeg.berkeley.edu) has pointed out to us that
some braindamaged compilers may not like some of the code we changed in
maskbits.h (the compiler on the IBM RT was the example he gave).  Since
we're using well-behaved compilers, we'll take his word on that.  He
suggested some restatements of the code that avoid those compiler bugs.
They have the same effect as our code and add no extra operations, so
we present them here so that our enhancements work on everyone's
machines.

Adam also pointed out a small flaw in what we did in putbits (which we
never triggered in our testing, it appears), and a fix for that is
included as well.
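
As a rough illustration of what getbits and putbits are meant to do,
here is a function-form sketch that can be compiled and tested on its
own.  It is not the patch.  It assumes the most-significant-bit-first
order (pixel 0 is the top bit of the word, so SCRLEFT is "<<" and
SCRRIGHT is ">>"), it computes the masks arithmetically instead of
looking them up in partmasks, starttab and endtab, and all of the
function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering w pixels starting at pixel x, pixel 0 being the most
   significant bit.  Requires 1 <= w and x + w <= 32. */
static uint32_t span_mask(unsigned x, unsigned w)
{
    uint32_t m = (w == 32) ? 0xFFFFFFFFu
                           : (uint32_t) ~(0xFFFFFFFFu >> w);
    return m >> x;
}

/* Fetch w pixels starting at pixel x of psrc[0], left-justified in the
   result.  Bits below the top w are not meaningful. */
static uint32_t getbits_fn(const uint32_t *psrc, unsigned x, unsigned w)
{
    if (x + w <= 32)
        return psrc[0] << x;
    /* x > 0 here, so the shift count 32 - x is in 1..31 */
    return (psrc[0] << x) | (psrc[1] >> (32 - x));
}

/* Store the top w pixels of src at pixel x of pdst[0], spilling into
   pdst[1] when the span crosses the word boundary. */
static void putbits_fn(uint32_t src, unsigned x, unsigned w, uint32_t *pdst)
{
    if (x + w <= 32) {
        uint32_t m = span_mask(x, w);
        pdst[0] = (pdst[0] & ~m) | ((src >> x) & m);
    } else {
        unsigned n = x + w - 32;   /* pixels landing in the second word */
        uint32_t keep0 = (uint32_t) ~(0xFFFFFFFFu >> x); /* pixels 0..x-1 */
        uint32_t new1  = (uint32_t) ~(0xFFFFFFFFu >> n); /* pixels 0..n-1 */
        pdst[0] = (pdst[0] & keep0) | (src >> x);
        pdst[1] = (pdst[1] & ~new1) | ((src << (32 - x)) & new1);
    }
}

/* Write a span, read it back, and confirm the neighbours survived. */
static void roundtrip_check(void)
{
    uint32_t one[1] = { 0xFFFFFFFFu };
    uint32_t two[2] = { 0xAAAAAAAAu, 0x55555555u };

    putbits_fn(0x12345678u, 4, 8, one);         /* fits in one word */
    assert(one[0] == 0xF12FFFFFu);
    assert((getbits_fn(one, 4, 8) & span_mask(0, 8)) == 0x12000000u);

    putbits_fn(0xDEADBEEFu, 28, 8, two);        /* crosses the boundary */
    assert(two[0] == 0xAAAAAAADu);
    assert(two[1] == 0xE5555555u);
    assert((getbits_fn(two, 28, 8) & span_mask(0, 8)) == 0xDE000000u);
}
```

The unsigned types matter for the same reason the patch casts to
unsigned: a right shift of a signed value may drag copies of the sign
bit into the word, which would corrupt the pixels to the left of the
span.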

Thus, use the following as the patch to maskbits.h rather than
the one we posted yesterday.


*** maskbits.h.orig	Wed May 25 22:38:33 1988
--- maskbits.h	Sat May 28 01:28:45 1988
***************
*** 28,35 ****
--- 28,39 ----
  
  extern int starttab[];
  extern int endtab[];
+ #ifndef PURDUE
  extern int startpartial[];
  extern int endpartial[];
+ #else PURDUE
+ extern unsigned partmasks[32][32];
+ #endif PURDUE
  extern int rmask[32];
  extern int mask[32];
  
***************
*** 206,213 ****
--- 210,222 ----
      else \
  	nlw = (w) >> 5;
  
+ #ifndef PURDUE
  #define maskpartialbits(x, w, mask) \
      mask = startpartial[(x) & 0x1f] & endpartial[((x) + (w)) & 0x1f];
+ #else PURDUE
+ #define maskpartialbits(x, w, mask) \
+     mask = partmasks[(x)&0x1f][(w)&0x1f];
+ #endif PURDUE
  
  #define mask32bits(x, w, startmask, endmask) \
      startmask = starttab[(x)&0x1f]; \
***************
*** 214,219 ****
--- 223,230 ----
      endmask = endtab[((x)+(w)) & 0x1f];
  
  
+ #ifndef PURDUE
+ 
  #define getbits(psrc, x, w, dst) \
  if ( ((x) + (w)) <= 32) \
  { \
***************
*** 244,249 ****
--- 255,293 ----
      *(pdst) = (*(pdst) & endtab[x]) | (SCRRIGHT(src, x) & starttab[x]); \
      *((pdst)+1) = (*((pdst)+1) & starttab[n]) | (SCRLEFT(src, m) & endtab[n]); \
  }
+ 
+ #else PURDUE
+ #define getbits(psrc, x, w, dst) \
+ if ( ((x) + (w)) <= 32) \
+ { \
+     dst = SCRLEFT((unsigned) *(psrc), (x)); \
+ } \
+ else \
+ { \
+     dst = (SCRLEFT((unsigned) *(psrc), (x))) | \
+ 	  (SCRRIGHT((unsigned) *((psrc)+1), 32-(x))); \
+ }
+ 
+ #define putbits(src, x, w, pdst) \
+ { \
+     int n = (x)+(w)-32; \
+     if (n <= 0) \
+     { \
+ 	unsigned tmpmask; \
+ 	maskpartialbits((x), (w), tmpmask); \
+ 	*(pdst) = (*(pdst) & ~tmpmask) | \
+ 		(SCRRIGHT((unsigned) (src), x) & tmpmask); \
+     } \
+     else \
+     { \
+ 	unsigned int *ptmp_ = (pdst)+1; \
+ 	int m = 32-(x); \
+ 	*(pdst) = (*(pdst) & endtab[x]) | (SCRRIGHT((unsigned) (src), x)); \
+ 	*ptmp_ = (*ptmp_ & starttab[n]) | \
+ 		(SCRLEFT((unsigned) (src), m) & endtab[n]); \
+     } \
+ }
+ #endif PURDUE
  
  #define putbitsrop(src, x, w, pdst, rop) \
  if ( ((x)+(w)) <= 32) \