[comp.lang.modula2] BITSET Benchmarks

nagler@olsen.UUCP.UUCP (08/28/87)

After reading yet another message about inefficient operations in
Modula-2 versus Turbo (or C), I decided to do some benchmarks to
see if they were really true. This message contains the source 
to a C and a Modula-2 program.  The programs were compiled and run on 
an unloaded diskless Sun 3/50.  The compile times are included for 
the sake of completeness.  The compilers were both supplied by Sun,
but I should like to say that the Sun C compiler is much more mature
than its sibling M2 compiler.  The M2 compiler was derrived from the
original Lilith four-pass version and was retrofitted to Sun's standard
code generator.

The programs execute all the Modula-2 set operations in a
loop for 10,000,000 iterations.  Note that Modula-2 was used as the 
base, because there are more native set operations in Modula-2 
than there are in C.  C "register" declarations weren't used, 
because the Sun M2 compiler does not do register allocation which 
is the equivalent to "register" in C.  I assume that the register
allocation mechanism is the job of the code generator or optimizer.
As I found out later, the optimizer doesn't seem to do register
allocation/colouring at all.

The 2.0 FCS version of the Sun M2 compiler and the 3.2 SunOS/C 
compiler were used.  Note that people seem to think that the Sun
C compiler generates good code.

Here are the results:
    % time m2c -nobounds -norange -O x.mod -e x -o x
    1.9u 2.9s 0:14 33% 5+4k 158+86io 24pf+0w
    % time cc -O -o y y.c
    1.6u 1.3s 0:06 42% 5+5k 83+44io 5pf+0w
    % time x
    197.2u 0.2s 3:18 99% 0+0k 2+0io 0pf+0w
    % time y
    190.3u 0.2s 3:11 99% 0+0k 2+0io 0pf+0w
    %

If you don't know how to read Unix gobbly-gook, the C program took
190 seconds of CPU to run and the M2 program took 197 seconds.
So it turns out that C is 3% faster.  The -O flag is the optimizer,
but it doesn't appear to do too much for the Modula-2 or C.  Here are
the results of an unoptimized sequence:

    % time m2c -nobounds -norange x.mod -e x -o x
    1.5u 2.6s 0:14 29% 5+3k 154+82io 25pf+0w
    % time cc -o y y.c
    1.2u 1.2s 0:05 41% 5+4k 69+31io 5pf+0w
    % time x
    212.5u 0.2s 3:34 99% 0+0k 2+0io 0pf+0w
    % time y
    201.0u 0.0s 3:21 100% 0+0k 0+0io 0pf+0w

The results tell you that the optimizer doesn't do much.  The C
program consumed 201 seconds of CPU (5% speed up) and the M2 program
took 212 seconds (7% speed up).  The optimizer didn't reduce the
size of the code by an appreciable amount either.

The last test I performed was to change the C program to use register
variables.  The results are obviously stupendous.

    % time cc -DUseRegisters -o y y.c
    1.1u 1.2s 0:05 46% 5+4k 69+38io 5pf+0w
    % time y
    80.5u 0.0s 1:20 99% 0+0k 1+0io 0pf+0w
    % time cc -O -DUseRegisters -o y y.c
    1.4u 1.5s 0:06 45% 5+5k 97+43io 5pf+0w
    % time y
    58.6u 0.0s 0:58 99% 0+0k 2+0io 0pf+0w

The unoptimized version took 81 seconds and with the optimizer the
program speed up by 27% to 59 seconds.  The absolute speed up was
twice as much as the "non-register" C version.  The marinal change
was 5 1/2 times that of the "non-register" version.

I'll let you draw your own conclusions from all of this.  I am still
using Modula-2 in the hopes that some day, someone will take a couple
of months off to write an optimizer for Modula-2.  I have included the
test programs at the end of this note.  If nothing else, they server
as a crib sheet for translating between C and Modula-2.

Rob
-----------------------------------------------------------
MODULE x;
VAR
    a : BITSET;
    b : BITSET;
    c : BITSET;
    t : BOOLEAN;
    i : CARDINAL;
BEGIN
    
    FOR i := 0 TO 9999999 DO
	b := BITSET( CARDINAL( a ) * 2 );	(* b = a << 1; *)
	b := BITSET( CARDINAL( a ) DIV 2 );	(* b = a >> 1; *)
	INCL( a, 3 );				(* a |= ( 1 << 3 ); *)
	EXCL( a, 3 );				(* a &= ~( 1 << 3 ); *)
	c := a + b;				(* c = a | b; *)
	c := a - b;				(* c = a & ~b; *)
	c := a * b;				(* c = a & b; *)
	c := a / b;				(* c = a ^ b; *)

	t := 3 IN a;				(* t = a & ~( 1 << 3 ); *)
	t := a # b;				(* t = a != b; *)
	t := a <= b;				(* t = !( ( a | b ) & ~ a ); *)
	(* The < and > operators are not defined for sets *)
    END;

END x.

main()
  {
#ifdef UseRegisters
    register unsigned int a, b, c, t, i;
#else
    unsigned int a, b, c, t, i;
#endif UseRegisters
    
    for ( i = 0; i <= 9999999; i++ ) 
      {
	/* b := BITSET( CARDINAL( a ) * 2 );*/ 		b = a << 1;
	/* b := BITSET( CARDINAL( a ) DIV 2 ); */	b = a >> 1;
	/* INCL( a, 3 ); */			a |= ( 1 << 3 );
	/* EXCL( a, 3 ); */			a &= ~( 1 << 3 );
	/* c := a + b; */			c = a | b;
	/* c := a - b; */			c = a & ~b;
	/* c := a * b; */			c = a & b;
	/* c := a / b; */			c = a ^ b;

	/* t := 3 IN a; */			t = a & ~( 1 << 3 );
	/* t := a # b; */			t = a != b;
	/* t := a <= b; */			t = !( ( a | b ) & ~ a );
      }

  } /* main */

michael@corona.UUCP (09/02/87)

Neulich schrieb nagler@olsen.UUCP (Robert Nagler):
  After reading yet another message about inefficient operations in
  Modula-2 versus Turbo (or C), I decided to do some benchmarks to
  see if they were really true.

Perhaps this is the right time to ask the net for all the
benchmarks for Modula-2, which are available. If you got one and
you are willing to share, please mail it to me.

    Thanks in advance
    Michael Schmidt
-- 
UUCP:  ...!seismo!unido!pbinfo!michael    |  Michael Schmidt
       or michael@pbinfo.UUCP             |  Universitaet-GH Paderborn, FB 17
CSNET: michael%pbinfo.uucp@Germany.CSNET  |  Warburger Str. 100
ARPA:  michael%pbinfo.uucp@seismo.css.gov |  D-4790 Paderborn, West Germany