alexande@cs.unc.edu (Geoffrey D. Alexander) (01/13/90)
I have encountered a strange performance anomaly using constructor expressions.
Consider the following program.

====test4a.c===================================================================
typedef struct {
  double real;
  double imaginary;
} complex;

#define ADD_COMPLEX(x, y) \
  (complex){(x).real+(y).real, (x).imaginary+(y).imaginary}

#define MULT_COMPLEX(x, y) \
  (complex){((x).real*(y).real)-((x).imaginary*(y).imaginary), \
            ((x).real*(y).imaginary)+((y).real*(x).imaginary)}

main()
{
  complex c;
  int i;
  complex x;

  c=(complex){1.0,1.0};
  x=(complex){0.0,0.0};
  for (i=1;i<=10000;i++) {
    x=ADD_COMPLEX(MULT_COMPLEX(x,x),c);
  }
  exit(0);
}
===============================================================================

Now, modify this program so that the value of MULT_COMPLEX(x,x) is saved in
a temporary variable.

====test4b.c===================================================================
typedef struct {
  double real;
  double imaginary;
} complex;

#define ADD_COMPLEX(x, y) \
  (complex){(x).real+(y).real, (x).imaginary+(y).imaginary}

#define MULT_COMPLEX(x, y) \
  (complex){((x).real*(y).real)-((x).imaginary*(y).imaginary), \
            ((x).real*(y).imaginary)+((y).real*(x).imaginary)}

main()
{
  complex c;
  int i;
  complex x;
  complex y;

  c=(complex){1.0,1.0};
  x=(complex){0.0,0.0};
  for (i=1;i<=10000;i++) {
    y=MULT_COMPLEX(x,x);
    x=ADD_COMPLEX(y,c);
  }
  exit(0);
}
===============================================================================

Now, compile the programs as follows:

  gcc test4a.c -O -o test4a
  gcc test4b.c -O -o test4b

Running on a Sun3-60M (w/o floating point chip) under SunOS Release 4.0.3,
test4a takes 4.1 seconds, while test4b takes only 2.3 seconds. Anyone care
to explain why? Note that I am using gcc version 1.36.

Geoff Alexander
rfg@ics.uci.edu (Ron Guilmette) (01/20/90)
In article <9001121726.AA22371@dopey.cs.unc.edu> alexande@cs.unc.edu (Geoffrey D. Alexander) writes:
>I have encountered a strange performance anomaly using constructor expressions.
...
>
>Now, modify this program so that the value of MULT_COMPLEX(x,x) is saved in
>a temporary variable.
...
>
>Now, compile the programs as follows:
>
>  gcc test4a.c -O -o test4a
>  gcc test4b.c -O -o test4b
>
>Running on a Sun3-60M (w/o floating point chip) under SunOS Release 4.0.3,
>test4a takes 4.1 seconds, while test4b takes only 2.3 seconds. Anyone care
>to explain why?

I believe it is because the compiler treats constructor expressions as
"executable" (regardless of whether or not their "components" are
compile-time constants). Thus, each time you "execute" a constructor
expression, code must run to copy each of the components (again) into some
area, often a temporary one.

If you "construct" only once, copy the value into your own "temporary"
structure, and use that structure from then on, you pay for the actual
"construction" operation only once.

// rfg