[comp.windows.x] SunOS malloc vs. GNU malloc: GNU loses

kaleb@THYME.JPL.NASA.GOV (Kaleb Keithley) (08/29/90)

As the result of a thread in comp.windows.x.motif, I conducted the following
test:

I built three MIT Xsun sample servers: server 1 linked with the default
SunOS 4.1 malloc, server 2 linked with GNU Emacs malloc.o, and server 3
linked with GNU's alpha/pre-alpha malloc library (from prep.ai.mit.edu).
I then ran the x11perf benchmarking client against each server.  The link
lines used for each are included below.

BTW, alpha/pre-alpha is not my designation; that's what's in the README file
that came with GNU's malloc.tar.Z file.

The results are rather dramatic.  Higher numbers represent better performance,
i.e. more "ops" per second.  Sun's malloc is a runaway winner.  Performance on
some operations was as much as a factor of two better.  Perhaps inexplicably,
GNU Emacs malloc is sometimes faster, sometimes slower than the GNU
alpha/pre-alpha malloc.

(May I be so bold as to throw down the gauntlet: c'mon GNU, we know you can
do better.)

SunOS malloc link:
cc -g -o Xsun ddx/sun/sunInit.o .././fonts/mkfontdir/fontdir.o \
    ddx/sun/libsun.a ddx/cfb/libcfb.a dix/libdix.a os/4.2bsd/libos.a \
    .././lib/Xau/libXau.a .././lib/Xdmcp/libXdmcp.a ddx/snf/libsnf.a \
    ddx/mfb/libmfb.a ddx/mi/libmi.a .././extensions/server/libext.a \
    -lm -ldbm

gnu emacs malloc link:
cc -o Xsun ddx/sun/sunInit.o malloc.o .././fonts/mkfontdir/fontdir.o \
    ddx/sun/libsun.a ddx/cfb/libcfb.a dix/libdix.a os/4.2bsd/libos.a \
    .././lib/libXau.a .././lib/libXdmcp.a ddx/snf/libsnf.a \
    ddx/mfb/libmfb.a ddx/mi/libmi.a .././extensions/server/libext.a \
    -lm -ldbm

gnu alpha/pre-alpha release malloc link:
cc -o Xsun ddx/sun/sunInit.o malloc.o free.o mcheck.o mstats.o realloc.o \
    unix.o valloc.o .././fonts/mkfontdir/fontdir.o \
    ddx/sun/libsun.a ddx/cfb/libcfb.a dix/libdix.a os/4.2bsd/libos.a \
    .././lib/libXau.a .././lib/libXdmcp.a ddx/snf/libsnf.a \
    ddx/mfb/libmfb.a ddx/mi/libmi.a .././extensions/server/libext.a \
    -lm -ldbm

Tests were compiled and run on a Sun SS1 under SunOS 4.1, with a SunCG3 frame
buffer, the Sun cc compiler, and eight megabytes of RAM.  Alpha/pre-alpha
malloc was compiled with -O (the default in the makefile).  GNU Emacs malloc
was compiled with -g (again, the default(?) in the Emacs makefile).  The
server code was not touched between links.

1: SunOS malloc
2: gnu emacs malloc
3: gnu alpha/pre-alpha malloc

     1         2         3    Operation
--------  --------  --------  ---------
152000.0  142000.0  132000.0  Dot
 49600.0   30000.0   31500.0  1x1 rectangle
 20000.0   12900.0   13200.0  10x10 rectangle
   681.0     488.0     488.0  100x100 rectangle
    30.1      21.8      21.7  500x500 rectangle
 19100.0   11200.0   10900.0  1x1 stippled rectangle
  3290.0    1890.0    1870.0  10x10 stippled rectangle
   221.0     140.0     139.0  100x100 stippled rectangle
    16.3      11.6      11.6  500x500 stippled rectangle
 23900.0   13200.0   13600.0  1x1 opaque stippled rectangle
  5070.0    2330.0    2380.0  10x10 opaque stippled rectangle
   343.0     179.0     183.0  100x100 opaque stippled rectangle
    24.7      13.6      15.6  500x500 opaque stippled rectangle
 39400.0   23600.0   25100.0  1x1 4x4 tiled rectangle
 17100.0   11000.0   11200.0  10x10 4x4 tiled rectangle
   661.0     466.0     466.0  100x100 4x4 tiled rectangle
    29.7      21.5      21.6  500x500 4x4 tiled rectangle
 22600.0   17300.0   18200.0  1x1 161x145 tiled rectangle
  7200.0    4520.0    4570.0  10x10 161x145 tiled rectangle
   226.0     127.0     123.0  100x100 161x145 tiled rectangle
    10.3       5.8       5.7  500x500 161x145 tiled rectangle
 45900.0   37200.0   36200.0  1-pixel line segment
 33200.0   26900.0   26800.0  10-pixel line segment
 13400.0    9820.0    9820.0  100-pixel line segment
  3650.0    2580.0    2580.0  500-pixel line segment
  9820.0    5980.0    6740.0  100-pixel line segment (1 kid)
  7780.0    4350.0    5130.0  100-pixel line segment (2 kids)
  6530.0    3550.0    4170.0  100-pixel line segment (3 kids)
 21700.0   14500.0   14000.0  10-pixel dashed segment
  5490.0    2950.0    2930.0  100-pixel dashed segment
  4960.0    2660.0    2650.0  100-pixel double-dashed segment
 53900.0   41600.0   41700.0  1-pixel line
 37600.0   30100.0   29700.0  10-pixel line
 14000.0   10100.0   10100.0  100-pixel line
  3690.0    2610.0    2610.0  500-pixel line
 23900.0   15300.0   15100.0  10-pixel dashed line
  5550.0    2980.0    2980.0  100-pixel dashed line
  5030.0    2690.0    2690.0  100-pixel double-dashed line
  2570.0    2520.0    2470.0  10x1 wide line
   696.0     529.0     531.0  100x10 wide line
   110.0      79.4      79.9  500x50 wide line
   266.0     229.0     234.0  100x10 wide dashed line
   261.0     212.0     227.0  100x10 wide double-dashed line
 19700.0   16700.0   16000.0  1-pixel circle
 14200.0   12500.0   12100.0  10-pixel circle
  4520.0    4070.0    4000.0  100-pixel circle
  1180.0    1060.0    1050.0  500-pixel circle
   368.0     227.0     230.0  100-pixel dashed circle
   273.0     170.0     173.0  100-pixel double-dashed circle
   275.0     192.0     198.0  10-pixel wide circle
    68.5      56.5      54.7  100-pixel wide circle
    14.2      11.9      11.3  500-pixel wide circle
    18.2      14.1      14.0  100-pixel wide dashed circle
    15.2      11.9      10.8  100-pixel wide double-dashed circle
  7250.0    5810.0    6070.0  10-pixel partial circle
  2310.0    2210.0    2240.0  100-pixel partial circle
 91500.0   90500.0   81100.0  1-pixel solid circle
  8090.0    6620.0    7190.0  10-pixel solid circle
   605.0     437.0     461.0  100-pixel solid circle
    34.5      25.5      26.0  500-pixel solid circle
  3420.0    3040.0    3180.0  10-pixel fill chord partial circle
   600.0     480.0     492.0  100-pixel fill chord partial circle
  3300.0    2890.0    3000.0  10-pixel fill slice partial circle
   611.0     476.0     480.0  100-pixel fill slice partial circle
 12300.0   11200.0   10900.0  10-pixel ellipse
  3940.0    3730.0    3690.0  100-pixel ellipse
   976.0     948.0     945.0  500-pixel ellipse
   433.0     277.0     280.0  100-pixel dashed ellipse
   323.0     207.0     211.0  100-pixel double-dashed ellipse
   118.0      96.3      97.0  10-pixel wide ellipse
    19.1      17.0      16.9  100-pixel wide ellipse
     3.9       3.7       3.6  500-pixel wide ellipse
     7.0       6.3       6.2  100-pixel wide dashed ellipse
     4.2       3.9       3.8  100-pixel wide double-dashed ellipse
  6830.0    5570.0    5530.0  10-pixel partial ellipse
  2680.0    2490.0    2480.0  100-pixel partial ellipse
 10400.0    7520.0    8000.0  10-pixel filled ellipse
  1010.0     684.0     728.0  100-pixel filled ellipse
    51.8      47.8      48.8  500-pixel filled ellipse
  3720.0    3200.0    3340.0  10-pixel fill chord partial ellipse
  1090.0     802.0     810.0  100-pixel fill chord ellipse
  3700.0    3140.0    3170.0  10-pixel fill slice partial ellipse
  1070.0     778.0     776.0  100-pixel fill slice ellipse
  6220.0    3680.0    3790.0  Fill 1-pixel/side triangle
  3390.0    2080.0    2120.0  Fill 10-pixel/side triangle
   340.0     234.0     235.0  Fill 100-pixel/side triangle
  4000.0    2520.0    2570.0  Fill 10x10 trapezoid
   402.0     278.0     279.0  Fill 100x100 trapezoid
  1340.0     762.0     814.0  Fill 10x10 stippled trapezoid
    25.7      12.9      14.3  Fill 100x100 stippled trapezoid
  1440.0     804.0     863.0  Fill 10x10 opaque stippled trapezoid
    28.4      13.9      15.5  Fill 100x100 opaque stippled trapezoid
  1290.0     699.0     777.0  Fill 10x10 tiled trapezoid
    22.3      10.9      12.6  Fill 100x100 tiled trapezoid
  2320.0    1530.0    1660.0  Fill 10-pixel/side complex polygon
   274.0     203.0     208.0  Fill 100-pixel/side complex polygons
 26900.0   16800.0   17200.0  Char in 80-char line (6x13)
 38100.0   24000.0   24300.0  Char in 80-char line (TR 10)
 12300.0    8510.0    8530.0  Char in 30-char line (TR 24)
 28700.0   17300.0   17600.0  Char in 20/40/20 line (6x13, TR 10)
 48900.0   34600.0   31800.0  Char in 80-char image line (6x13)
 25200.0   16200.0   16500.0  Char in 80-char image line (TR 10)
  7500.0    5230.0    5270.0  Char in 30-char image line (TR 24)
  5470.0    3610.0    3400.0  Scroll 10x10 pixels
   390.0     355.0     353.0  Scroll 100x100 pixels
    18.6      18.2      18.2  Scroll 500x500 pixels
  5070.0    3360.0    3170.0  Copy 10x10 from window to window
   342.0     303.0     303.0  Copy 100x100 from window to window
    15.9      15.2      15.3  Copy 500x500 from window to window
  5120.0    3670.0    3620.0  Copy 10x10 from pixmap to window
   398.0     373.0     374.0  Copy 100x100 from pixmap to window
    18.7      18.8      19.0  Copy 500x500 from pixmap to window
  5410.0    3170.0    3160.0  Copy 10x10 from window to pixmap
   353.0     291.0     311.0  Copy 100x100 from window to pixmap
    16.7      15.4      15.9  Copy 500x500 from window to pixmap
  6330.0    3760.0    3570.0  Copy 10x10 from pixmap to pixmap
   477.0     389.0     389.0  Copy 100x100 from pixmap to pixmap
    21.2      19.8      19.8  Copy 500x500 from pixmap to pixmap
  5170.0    3450.0    3420.0  Copy 10x10 1-bit deep plane
   420.0     354.0     352.0  Copy 100x100 1-bit deep plane
    22.8      19.7      19.6  Copy 500x500 1-bit deep plane
  2600.0    2160.0    2040.0  PutImage 10x10 square
    74.5      64.0      72.9  PutImage 100x100 square
     3.2       3.2       3.4  PutImage 500x500 square
   366.0     335.0     341.0  GetImage 10x10 square
    97.7      87.8      91.8  GetImage 100x100 square
     5.1       4.9       4.9  GetImage 500x500 square
 61100.0   45100.0   44700.0  X protocol NoOperation
   454.0     453.0     426.0  GetAtomName
   445.0     434.0     414.0  GetProperty
  7920.0    5000.0    5000.0  Change graphics context
  1420.0    1700.0    1030.0  Create and map subwindows (4 kids)
  1620.0    1250.0    1200.0  Create and map subwindows (16 kids)
  1620.0    1260.0    1210.0  Create and map subwindows (25 kids)
  1550.0    1190.0    1150.0  Create and map subwindows (50 kids)
  1460.0    1130.0    1090.0  Create and map subwindows (75 kids)
  1420.0    1090.0    1060.0  Create and map subwindows (100 kids)
  1180.0    2320.0    2240.0  Create and map subwindows (200 kids)
  3720.0    2960.0    2410.0  Create unmapped window (4 kids)
  3720.0    2990.0    2680.0  Create unmapped window (16 kids)
  3780.0    3010.0    2690.0  Create unmapped window (25 kids)
  3690.0    2990.0    2680.0  Create unmapped window (50 kids)
  3720.0    2950.0    2760.0  Create unmapped window (75 kids)
  3730.0    3000.0    2760.0  Create unmapped window (100 kids)
  3720.0    3010.0    2750.0  Create unmapped window (200 kids)
  1890.0    1430.0    1500.0  Map window via parent (4 kids)
  2710.0    2080.0    2130.0  Map window via parent (16 kids)
  2900.0    2190.0    2310.0  Map window via parent (25 kids)
  2900.0    2230.0    2260.0  Map window via parent (50 kids)
  2950.0    2270.0    2270.0  Map window via parent (75 kids)
  2930.0    2270.0    2300.0  Map window via parent (100 kids)
  2950.0    2320.0    2330.0  Map window via parent (200 kids)
  8280.0    5720.0    5680.0  Unmap window via parent (4 kids)
 18300.0   12700.0   12100.0  Unmap window via parent (16 kids)
 21500.0   14900.0   15200.0  Unmap window via parent (25 kids)
 24900.0   17500.0   17800.0  Unmap window via parent (50 kids)
 26400.0   18700.0   17600.0  Unmap window via parent (75 kids)
 27200.0   19300.0   19000.0  Unmap window via parent (100 kids)
 28400.0   20200.0   19200.0  Unmap window via parent (200 kids)
  1840.0    1500.0    2440.0  Destroy window via parent (4 kids)
  4860.0    3670.0    3760.0  Destroy window via parent (16 kids)
  5680.0    4060.0    4090.0  Destroy window via parent (25 kids)
  6140.0    4360.0    4350.0  Destroy window via parent (50 kids)
  6180.0    4490.0    4430.0  Destroy window via parent (75 kids)
  6370.0    4520.0    4530.0  Destroy window via parent (100 kids)
  6440.0    4610.0    4610.0  Destroy window via parent (200 kids)
   761.0     581.0     698.0  Hide/expose window via popup (4 kids)
  1340.0    1030.0    1190.0  Hide/expose window via popup (16 kids)
  1490.0    1050.0    1250.0  Hide/expose window via popup (25 kids)
  1400.0    1110.0    1320.0  Hide/expose window via popup (50 kids)
  1460.0    1190.0    1350.0  Hide/expose window via popup (75 kids)
  1620.0    1230.0    1380.0  Hide/expose window via popup (100 kids)
  1720.0    1270.0    1390.0  Hide/expose window via popup (200 kids)
   706.0     493.0     542.0  Move window (4 kids)
   513.0     350.0     381.0  Move window (16 kids)
   447.0     285.0     318.0  Move window (25 kids)
   318.0     216.0     228.0  Move window (50 kids)
   245.0     178.0     175.0  Move window (75 kids)
   207.0     148.0     145.0  Move window (100 kids)
   122.0      88.6      85.1  Move window (200 kids)
  9730.0    6290.0    6510.0  Moved unmapped window (4 kids)
  9810.0    6300.0    6350.0  Moved unmapped window (16 kids)
  9830.0    6280.0    6350.0  Moved unmapped window (25 kids)
  9700.0    6210.0    6360.0  Moved unmapped window (50 kids)
  9630.0    6190.0    6360.0  Moved unmapped window (75 kids)
  9590.0    6180.0    6340.0  Moved unmapped window (100 kids)
  9350.0    6080.0    6270.0  Moved unmapped window (200 kids)
  2240.0    1540.0    1730.0  Move window via parent (4 kids)
  4600.0    3450.0    3850.0  Move window via parent (16 kids)
  5280.0    3990.0    4460.0  Move window via parent (25 kids)
  5930.0    4560.0    5080.0  Move window via parent (50 kids)
  6390.0    4980.0    5500.0  Move window via parent (75 kids)
  6770.0    5330.0    5730.0  Move window via parent (100 kids)
  7110.0    5670.0    6020.0  Move window via parent (200 kids)
   612.0     486.0     572.0  Resize window (4 kids)
   519.0     385.0     400.0  Resize window (16 kids)
   469.0     318.0     360.0  Resize window (25 kids)
   346.0     239.0     269.0  Resize window (50 kids)
   274.0     204.0     212.0  Resize window (75 kids)
   233.0     177.0     178.0  Resize window (100 kids)
   150.0     106.0     106.0  Resize window (200 kids)
  8530.0    5740.0    6100.0  Resize unmapped window (4 kids)
  8390.0    5730.0    5980.0  Resize unmapped window (16 kids)
  8400.0    5720.0    5960.0  Resize unmapped window (25 kids)
  8410.0    5670.0    5950.0  Resize unmapped window (50 kids)
  8290.0    5650.0    5960.0  Resize unmapped window (75 kids)
  8180.0    5640.0    5930.0  Resize unmapped window (100 kids)
  8080.0    5560.0    5880.0  Resize unmapped window (200 kids)
   299.0     215.0     257.0  Circulate window (4 kids)
   177.0     149.0     172.0  Circulate window (16 kids)
   171.0     133.0     163.0  Circulate window (25 kids)
   159.0     123.0     148.0  Circulate window (50 kids)
   157.0     116.0     137.0  Circulate window (75 kids)
   150.0     109.0     127.0  Circulate window (100 kids)
   117.0      85.9     101.0  Circulate window (200 kids)
 31700.0   22600.0   22400.0  Circulate Unmapped window (4 kids)
 26200.0   18800.0   17800.0  Circulate Unmapped window (16 kids)
 23000.0   15900.0   14900.0  Circulate Unmapped window (25 kids)
 17500.0   10800.0   11500.0  Circulate Unmapped window (50 kids)
 13200.0    8650.0    9190.0  Circulate Unmapped window (75 kids)
 10200.0    7280.0    7560.0  Circulate Unmapped window (100 kids)
  6420.0    4470.0    4580.0  Circulate Unmapped window (200 kids)
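
As a rough check on the "factor of two" remark, the ratio of column 1 to
columns 2 and 3 can be computed for a few rows copied from the table above.
This is only a minimal Python sketch and was not part of the original test;
the choice of rows and the helper name `speedup` are illustrative assumptions.

```python
# Rows copied verbatim from the table above:
# (SunOS, GNU emacs, GNU alpha/pre-alpha, operation), all in ops per second.
ROWS = [
    (5070.0, 2330.0, 2380.0, "10x10 opaque stippled rectangle"),
    (25.7, 12.9, 14.3, "Fill 100x100 stippled trapezoid"),
    (22.3, 10.9, 12.6, "Fill 100x100 tiled trapezoid"),
    (18.7, 18.8, 19.0, "Copy 500x500 from pixmap to window"),
    (1180.0, 2320.0, 2240.0, "Create and map subwindows (200 kids)"),
]


def speedup(sunos, other):
    """Ratio of SunOS ops/sec to another server's ops/sec.

    A value above 1 means the SunOS-malloc server was faster on that row.
    """
    return sunos / other


# Print the ratio against each GNU malloc server, then the operation name.
for sunos, emacs, alpha, name in ROWS:
    print(f"{speedup(sunos, emacs):5.2f}  {speedup(sunos, alpha):5.2f}  {name}")
```

The first three rows come out at or a little above 2 against the GNU emacs
malloc server, which matches the "factor of two" remark.  The last two rows
go the other way (ratio below 1), so the SunOS advantage holds for most rows
in the table but not all of them.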
-- 
Kaleb Keithley                      Jet Propeller Labs
kaleb@thyme.jpl.nasa.gov

"So that's what an invisible barrier looks like!"


keith@EXPO.LCS.MIT.EDU (Keith Packard) (08/30/90)

These results are not reasonable, unless something *really* strange is going on
inside that machine.  For example:

    30.1      21.8      21.7  500x500 rectangle

Processing this request should cause *no* allocations to occur; even if one
allocation per rectangle were allowed, the 30% performance degradation is not
understandable; the overwhelming majority of the time is spent filling the
screen full of bits.
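
To put numbers on that row, the rates can be converted to time per request.
This is a minimal Python sketch added for illustration, not something from
the original exchange; the helper name `per_op_ms` is an assumption.

```python
def per_op_ms(ops_per_sec):
    """Convert a rate in operations per second to milliseconds per operation."""
    return 1000.0 / ops_per_sec


# 500x500 rectangle row: SunOS malloc vs. GNU emacs malloc, in ops per second.
sunos, emacs = 30.1, 21.8

# Fractional drop in throughput relative to the SunOS-malloc server.
drop = (sunos - emacs) / sunos

# Extra wall-clock time spent on each rectangle by the slower server.
extra_ms = per_op_ms(emacs) - per_op_ms(sunos)

print(f"degradation: {drop:.1%}")
print(f"extra time per rectangle: {extra_ms:.1f} ms")
```

The drop works out to a little under 28%, and the slower server spends
roughly 12 to 13 ms more on each rectangle.  That seems far more than a
single allocator call would plausibly cost, which is why the difference is
hard to attribute to malloc alone.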

I'd suggest some serious profiling in order to make sense of these results.

No, I am not saying that these results are not interesting, but I'm far from
certain that the memory allocator is completely to blame for the differences.

Keith Packard
MIT X Consortium