[comp.sys.transputer] C compiler efficiency - request and source

jeremy@cs.adelaide.edu.au (Jeremy Webber) (12/12/90)

To check out the code produced by a C compiler I wrote a little test program,
which appears below.  It contains a few simple code sequences to check the
quality of code generated by a C compiler.  I don't claim it is anything like a
complete benchmark, but I do submit that if a compiler can't deal with these
code sequences well, it doesn't stand a chance optimising REAL code.
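
As a rough illustration of what "dealing with these sequences well" means,
here is a hand-written sketch of the transformation one would hope to see for
the common_sub_expression test in the program below.  It is not output from any
of the compilers mentioned.  The names naive_loop, hoisted_loop and same_result
are made up for this illustration, the array is passed in as a parameter rather
than declared locally, and unsigned char replaces the test's plain char so that
the wraparound is well defined.

```c
#include <string.h>

/* The loop as the test writes it: a+2 and s[a+2] are recomputed each pass. */
void naive_loop(unsigned char s[100])
{
  int a = 3, b = 0;

  while (b++ < 100) {
    s[a+2] = s[a+2] + 10;
  }
}

/* Hand-transformed version (hypothetical, for comparison only):
 * a+2 is folded to the constant 5, and the array element is loaded once,
 * kept in a temporary inside the loop, and stored once afterwards.
 */
void hoisted_loop(unsigned char s[100])
{
  unsigned char t = s[5];
  int n;

  for (n = 0; n < 100; n++) {
    t = t + 10;
  }
  s[5] = t;
}

/* Returns 1 if both versions leave the array in the same state. */
int same_result(void)
{
  unsigned char x[100], y[100];

  memset(x, 7, sizeof x);
  memset(y, 7, sizeof y);
  naive_loop(x);
  hoisted_loop(y);
  return memcmp(x, y, sizeof x) == 0;
}
```

Comparing a compiler's output for common_sub_expression against the shape of
hoisted_loop shows which of these steps (constant folding, subexpression
elimination, hoisting the load and store) it actually performed.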

If you have access to a Transputer C compiler, can you try compiling this
program, and either mail me the generated assembly code (preferably symbolic),
or tell me how each of the tests went.  I've tried it on the 3L Parallel C
compiler, and as a check, on the MIPS C compiler distributed by Silicon
Graphics for a Personal Iris.  I'm interested in comparing compilers.  I'm
particularly interested in Inmos C, YARC C and Par.C, but I'm also interested
in any others.

Can you also tell me how your compiler implements Transputer concurrency
features, such as message passing and process starting/stopping?  E.g. 3L
compiles them to procedure calls, while YARC C generates inline code for
message passing and ALT constructs.

I'll post the results.

Also, please mail me if you have any other comments on compiler quality for
Transputers.  Again, I'll summarise comments to the net.

Thanks,

		Jeremy

----- cut here and remove my signature from the end ---

/* Compiler Optimization Checkout Code
 *
 * These short sequences don't pretend to fully check out compiler
 * optimiser capability, but do determine whether a compiler performs
 * certain basic optimisations.
 * If a compiler fails on these, it doesn't stand a chance on real code...
 *
 * Compilers tested:
 *   MIPS C as supplied by Silicon Graphics with Irix 3.2 (as a control)
 *     Examined assembler output from compiler, which is after compiler
 *     optimisations, but before assembler peephole optimisations
 *     The MIPS compiler is generally regarded as very good at optimising.
 *     The MIPS processor is a conventional register architecture, so
 *     expression evaluation is not directly comparable to the way
 *     transputers do it.
 *     Expression elimination and simplification, however, are.
 *   3L Parallel C Version 2.1
 *     Examined output from the decode utility, which I think is after all
 *     optimisations.
 *     Additional comment: All Transputer channel operations are performed
 *     by procedure calls.
 */

#include <stdlib.h>  /* abs() is declared here, not in <math.h> */

int deterministic_conditionals(void)
/* MIPS C optimises refs to a out of existence, and the while to a simple
 *        loop.  It also warns about the infinite loop.
 * 3L C doesn't simplify the while loop conditional or anything else.
 * YARC C does 
 */
{
  int a;

  while (1) {
    a = a + 1;
  }
}

int common_sub_expression(void)
/* Also loop invariant code
 * MIPS C allocates space for all variables, but doesn't refer to the unused
 *        ones.  It moves the loop invariant code, eliminates the common
 *        subexpression, and treats the "a+2" expression as a constant expr.
 *        It also optimises order of execution of the while conditional.
 * 3L C doesn't eliminate the unnecessary expressions in this
 */
{
  int a, b, c;
  char s[100];

  a = 3;
  b = 0;
  while (b++ < 100) {
    s[a+2] = s[a+2] + 10;
  }
}

int byte_arithmetic(void)
/* MIPS C compiles this to 1 load immediate instruction (the first c=a+b)
 * 3L C represents the char variables in integers to economise on byte refs
 */
{
  char a, b, c;
  int i;

  a = 10;
  b = 20;

  c = a + b;
  i = a + 2;
  c = a + b;
}

int byte_arithmetic2(void)
/* Also fixed struct addressing...
 * MIPS C compiles this exactly as above.
 * 3L C does optimise the struct element refs.  It doesn't use temp. vars
 *      to overcome multiple refs to byte elements.
 */
{
  struct {
    char a, b, c;
    int i;
  } x;

  x.a = 20;
  x.b = 10;
  x.c = x.a + x.b;
  x.i = x.a + 2;
  x.c = x.a + x.b;
}

int unreachable_code(void)
/* ... and unused variables
 * MIPS C optimises this out of existence (really - just to a return instr.)
 * 3L C doesn't simplify any of this
 */
{
  int a, b, c;

  c = 200;
  while (0) {
    b = c - 10;
  }
}

int complex_expression(void)
/* MIPS does entirely register arithmetic.  It eliminates the last statement.
 * 3L C rearranges expressions for the tptr stack well for the first 2 exprs.
 *      It creates an unnecessary temp. for the third expression.
 */
{
  int a, b, c, d, e;

  a = b + c * (d - e);
  a = (d - e) - (a + a * b);
  a = b * ((c - d * (a + b)) * (a - b * (a + c)) + 2);
}

int constant_expressions(void)
/* MIPS C eliminates the first, does a function call for the second
 * 3L C evaluates the first at compile time, the second at run time, using
 *      a function call
 */
{
  int a, b;

  a = 4 + 7 * 3;
  b = abs(-2);
}

int floating_point_instr(void)
/* Check scheduling of floating point on a T8 - taken from
 * "The Transputer Instruction Set - A Compiler Writer's Guide", p75.
 * MIPS C uses FP and integer regs (comparison to transputer not relevant)
 * 3L C generates "naive" code (as defined in the book) for this.
 */
{
  double a[20][20], b[20][20];
  double c, d;
  int i, j;

  a[i][j] = b[i][j]*c + d;
}

--
Jeremy Webber			   ACSnet: jeremy@chook.ua.oz
Digital Arts Film and Television,  Internet: jeremy@chook.ua.oz.au
3 Milner St, Hindmarsh, SA 5007,   Voicenet: +61 8 346 4534
Australia			   Papernet: +61 8 346 4537 (FAX)