nagler@olsen.UUCP (08/28/87)
After reading yet another message about inefficient operations in Modula-2 versus Turbo (or C), I decided to do some benchmarks to see if the claims were really true. This message contains the source to a C and a Modula-2 program. The programs were compiled and run on an unloaded diskless Sun 3/50. The compile times are included for the sake of completeness.

The compilers were both supplied by Sun, but I should point out that the Sun C compiler is much more mature than its sibling M2 compiler. The M2 compiler was derived from the original Lilith four-pass version and was retrofitted to Sun's standard code generator.

The programs execute all the Modula-2 set operations in a loop for 10,000,000 iterations. Note that Modula-2 was used as the base, because there are more native set operations in Modula-2 than there are in C. C "register" declarations weren't used, because the Sun M2 compiler does not do register allocation, which is the equivalent of "register" in C. I assume that register allocation is the job of the code generator or optimizer. As I found out later, the optimizer doesn't seem to do register allocation/colouring at all.

The 2.0 FCS version of the Sun M2 compiler and the 3.2 SunOS/C compiler were used. Note that people seem to think that the Sun C compiler generates good code. Here are the results:

    % time m2c -nobounds -norange -O x.mod -e x -o x
    1.9u 2.9s 0:14 33% 5+4k 158+86io 24pf+0w
    % time cc -O -o y y.c
    1.6u 1.3s 0:06 42% 5+5k 83+44io 5pf+0w
    % time x
    197.2u 0.2s 3:18 99% 0+0k 2+0io 0pf+0w
    % time y
    190.3u 0.2s 3:11 99% 0+0k 2+0io 0pf+0w
    %

If you don't know how to read Unix gobbly-gook (the figure before the "u" is user CPU seconds), the C program took 190 seconds of CPU to run and the M2 program took 197 seconds. So it turns out that C is about 3.5% faster. The -O flag enables the optimizer, but it doesn't appear to do much for either Modula-2 or C.
Here are the results of an unoptimized sequence:

    % time m2c -nobounds -norange x.mod -e x -o x
    1.5u 2.6s 0:14 29% 5+3k 154+82io 25pf+0w
    % time cc -o y y.c
    1.2u 1.2s 0:05 41% 5+4k 69+31io 5pf+0w
    % time x
    212.5u 0.2s 3:34 99% 0+0k 2+0io 0pf+0w
    % time y
    201.0u 0.0s 3:21 100% 0+0k 0+0io 0pf+0w

The results tell you that the optimizer doesn't do much. Without it the C program consumed 201 seconds of CPU and the M2 program 212 seconds, so -O buys a speedup of about 5% for C and 7% for M2. The optimizer didn't reduce the size of the code by an appreciable amount either.

The last test I performed was to change the C program to use register variables. The results are obviously stupendous.

    % time cc -DUseRegisters -o y y.c
    1.1u 1.2s 0:05 46% 5+4k 69+38io 5pf+0w
    % time y
    80.5u 0.0s 1:20 99% 0+0k 1+0io 0pf+0w
    % time cc -O -DUseRegisters -o y y.c
    1.4u 1.5s 0:06 45% 5+5k 97+43io 5pf+0w
    % time y
    58.6u 0.0s 0:58 99% 0+0k 2+0io 0pf+0w

The unoptimized version took 81 seconds, and with the optimizer the program sped up by 27% to 59 seconds. The absolute speedup (22 seconds) was twice that of the "non-register" C version (11 seconds). The relative improvement was roughly five times that of the "non-register" version (27% versus 5%).

I'll let you draw your own conclusions from all of this. I am still using Modula-2 in the hope that some day, someone will take a couple of months off to write an optimizer for Modula-2.

I have included the test programs at the end of this note. If nothing else, they serve as a crib sheet for translating between C and Modula-2.
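For anyone who wants to check the percentages, the arithmetic can be sketched in C roughly as follows. This is only an illustration: `percent_saved` and `print_summary` are names made up here and are not part of the benchmark, and the inputs are simply the user CPU times (the figure before the "u") copied from the timings above.

```c
#include <stdio.h>

/* Percentage of `before` that is saved by going to `after`.
   (Hypothetical helper, not part of the benchmark programs.) */
double percent_saved(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* Print the comparisons discussed in the post. */
void print_summary(void)
{
    /* User CPU seconds copied from the `time` output in the post. */
    const double m2_plain  = 212.5, m2_opt  = 197.2;  /* Modula-2           */
    const double c_plain   = 201.0, c_opt   = 190.3;  /* C, no "register"   */
    const double reg_plain =  80.5, reg_opt =  58.6;  /* C, -DUseRegisters  */

    printf("C vs M2, both -O:      %.1f%% less CPU\n",
           percent_saved(m2_opt, c_opt));
    printf("M2 optimizer gain:     %.1f%%\n",
           percent_saved(m2_plain, m2_opt));
    printf("C optimizer gain:      %.1f%%\n",
           percent_saved(c_plain, c_opt));
    printf("C+register -O gain:    %.1f%%\n",
           percent_saved(reg_plain, reg_opt));

    /* Seconds saved by -O with registers, relative to seconds saved without. */
    printf("absolute gain ratio:   %.1f\n",
           (reg_plain - reg_opt) / (c_plain - c_opt));
}
```

Calling `print_summary()` from a `main` of your own prints the five figures. The percentages are taken relative to the slower time in each pair, which is one of several reasonable conventions.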
Rob
-----------------------------------------------------------
MODULE x;
VAR
    a : BITSET;
    b : BITSET;
    c : BITSET;
    t : BOOLEAN;
    i : CARDINAL;
BEGIN
    FOR i := 0 TO 9999999 DO
        b := BITSET( CARDINAL( a ) * 2 );    (* b = a << 1; *)
        b := BITSET( CARDINAL( a ) DIV 2 );  (* b = a >> 1; *)
        INCL( a, 3 );                        (* a |= ( 1 << 3 ); *)
        EXCL( a, 3 );                        (* a &= ~( 1 << 3 ); *)
        c := a + b;                          (* c = a | b; *)
        c := a - b;                          (* c = a & ~b; *)
        c := a * b;                          (* c = a & b; *)
        c := a / b;                          (* c = a ^ b; *)
        t := 3 IN a;                         (* t = a & ~( 1 << 3 ); *)
        t := a # b;                          (* t = a != b; *)
        t := a <= b;                         (* t = !( ( a | b ) & ~ a ); *)
        (* The < and > operators are not defined for sets *)
    END;
END x.

main()
{
#ifdef UseRegisters
    register unsigned int a, b, c, t, i;
#else
    unsigned int a, b, c, t, i;
#endif /* UseRegisters */

    for ( i = 0; i <= 9999999; i++ ) {
        /* b := BITSET( CARDINAL( a ) * 2 ); */
        b = a << 1;
        /* b := BITSET( CARDINAL( a ) DIV 2 ); */
        b = a >> 1;
        /* INCL( a, 3 ); */
        a |= ( 1 << 3 );
        /* EXCL( a, 3 ); */
        a &= ~( 1 << 3 );
        /* c := a + b; */
        c = a | b;
        /* c := a - b; */
        c = a & ~b;
        /* c := a * b; */
        c = a & b;
        /* c := a / b; */
        c = a ^ b;
        /* t := 3 IN a; */
        /* NB: as timed; a strict membership test would be ( a & ( 1 << 3 ) ) != 0 */
        t = a & ~( 1 << 3 );
        /* t := a # b; */
        t = a != b;
        /* t := a <= b; */
        /* NB: as timed; strict set inclusion (a within b) would be !( a & ~b ) */
        t = !( ( a | b ) & ~ a );
    }
} /* main */
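As a companion to the crib sheet, here is a minimal sketch of the same Modula-2 set operations written as small C functions. Everything in it is an assumption made for illustration: the `bitset` typedef and the `set_*` names are invented here, element `e` is assumed to map to bit `1 << e` (as in the test programs) and to be smaller than the width of `unsigned int`. The membership and inclusion tests use the strict forms `( s & bit ) != 0` and `( a & ~b ) == 0`, which differ from the expressions timed in the benchmark.

```c
#include <stdio.h>

typedef unsigned int bitset;   /* stands in for Modula-2 BITSET (assumption) */

/* INCL( s, e ) and EXCL( s, e ), written as pure functions */
bitset set_incl(bitset s, unsigned e)    { return s | (1u << e); }
bitset set_excl(bitset s, unsigned e)    { return s & ~(1u << e); }

/* a + b, a - b, a * b, a / b */
bitset set_union(bitset a, bitset b)     { return a | b; }
bitset set_diff(bitset a, bitset b)      { return a & ~b; }
bitset set_inter(bitset a, bitset b)     { return a & b; }
bitset set_symdiff(bitset a, bitset b)   { return a ^ b; }

/* e IN s: 1 if bit e is set, else 0 */
int set_in(unsigned e, bitset s)         { return (s & (1u << e)) != 0; }

/* a # b: 1 if the sets differ */
int set_neq(bitset a, bitset b)          { return a != b; }

/* a <= b: 1 if every element of a is also in b */
int set_subset(bitset a, bitset b)       { return (a & ~b) == 0; }

/* Small demonstration; prints the results of a few operations. */
void set_demo(void)
{
    bitset a = set_incl(0u, 3);          /* the set {3} */
    bitset b = set_incl(a, 1);           /* the set {1, 3} */

    printf("3 IN a  : %d\n", set_in(3, a));
    printf("1 IN a  : %d\n", set_in(1, a));
    printf("a <= b  : %d\n", set_subset(a, b));
    printf("b <= a  : %d\n", set_subset(b, a));
    printf("a / b   : 0x%x\n", set_symdiff(a, b));
}
```

Writing the operations as functions makes the mapping easy to read, but it is not how you would want to time them: the benchmark deliberately uses inline expressions so that no call overhead is measured.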
michael@corona.UUCP (09/02/87)
Recently nagler@olsen.UUCP (Robert Nagler) wrote:
> After reading yet another message about inefficient operations in
> Modula-2 versus Turbo (or C), I decided to do some benchmarks to see
> if they were really true.

Perhaps this is the right time to ask the net for all the available Modula-2 benchmarks. If you have one and are willing to share it, please mail it to me.

Thanks in advance

Michael Schmidt
--
UUCP:  ...!seismo!unido!pbinfo!michael      | Michael Schmidt
    or michael@pbinfo.UUCP                  | Universitaet-GH Paderborn, FB 17
CSNET: michael%pbinfo.uucp@Germany.CSNET    | Warburger Str. 100
ARPA:  michael%pbinfo.uucp@seismo.css.gov   | D-4790 Paderborn, West Germany