[comp.sys.ncr] Informix 4gl and C optimizer

jalsop@seachg.uucp (John Alsop) (12/14/90)

I ran across an interesting problem today which might be of interest to
other Tower users.

A client has an application written in Informix 4GL.  It consists of 25
source modules, and takes about an hour to compile on another Unix box
comparable to the Tower. The compilation process involves generating C code
from the 4GL source, and then compiling the C into an executable.

The first try at compiling the application on a Tower 600 with 3.00.01 ran
for over 7 hours without finishing.  No-one was too impressed.

A brief analysis of the compilation of one of the modules showed that the
actual compilation from 4GL to C to assembler took about 45 seconds.  The
optimizer then started up and ran for about an hour!

Changing the default optimization level from 2 to 0 allowed all 25 modules to
be compiled in 40 minutes.

I'm curious as to why the code generated by Informix' 4GL should cause such
pathological behaviour of the optimizer.  Ideas anyone?

-- 
John Alsop

Sea Change Corporation
6695 Millcreek Drive, Unit 8
Mississauga, Ontario, Canada L5N 5R8
Tel: 416-542-9484 Fax: 416-542-9479
UUCP: ...!uunet!attcan!seachg!jalsop

lee@ebbtide.ifs.umich.edu (Lee Pearson) (12/19/90)

In article <1990Dec14.035146.18882@seachg.uucp> jalsop@seachg.UUCP (John Alsop) writes:
>I ran across an interesting problem today which might be of interest to
>other Tower users.
>
>A client has an application written in Informix 4GL.  It consists of 25
>source modules, and takes about an hour to compile on another Unix box
>comparable to the Tower. The compilation process involves generating C code
>from the 4GL source, and then compiling the C into an executable.
>
>The first try at compiling the application on a Tower 600 with 3.00.01 ran
>for over 7 hours without finishing.  No-one was too impressed.
>
>A brief analysis of the compilation of one of the modules showed that the
>actual compilation from 4GL to C to assembler took about 45 seconds.  The
>optimizer then started up and ran for about an hour!
>
>Changing the default optimization level from 2 to 0 allowed all 25 modules to
>be compiled in 40 minutes.
>
>I'm curious as to why the code generated by Informix' 4GL should cause such
>pathological behaviour of the optimizer.  Ideas anyone?


I participated in writing a large application in Informix 4GL.  What we found
was that input statements from forms with many before and after field clauses
created huge C switch statements.  When these routines were optimized, a slow
process but never 7 hours, the resulting executable code usually contained bugs
that caused the program to crash (I suspect from a bad return address on a
corrupted stack).  We typically turned off the optimizer for all our compiles.
I should note that this was on a Tower 32/400 and 32/450 running 1.00 and 2.00
operating systems.
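
To make the shape of such code concrete, here is a minimal sketch of how an
input statement with before and after field clauses might end up as a single
C function built around one switch.  This is not actual Informix output: the
event names, next_event() and run_input() are all hypothetical, and a real
form with dozens of fields would have dozens of cases in the same function.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical event codes, one pair per form field.  Real generated
   code would use its own numbering; these are for illustration only. */
enum {
    EV_BEFORE_NAME, EV_AFTER_NAME,
    EV_BEFORE_CITY, EV_AFTER_CITY,
    EV_DONE
};

/* Stand-in for the runtime call that reports the next form event.
   Here it simply walks through the events in order. */
static int next_event(int ev)
{
    return ev + 1;
}

/* The whole input statement lives in one function: every before/after
   field clause becomes one case of a single switch inside a loop.
   Returns the number of clauses that were executed. */
int run_input(void)
{
    int ev = EV_BEFORE_NAME;
    int clauses_run = 0;

    while (ev != EV_DONE) {
        switch (ev) {
        case EV_BEFORE_NAME: puts("before field name"); clauses_run++; break;
        case EV_AFTER_NAME:  puts("after field name");  clauses_run++; break;
        case EV_BEFORE_CITY: puts("before field city"); clauses_run++; break;
        case EV_AFTER_CITY:  puts("after field city");  clauses_run++; break;
        default: assert(0 && "unknown form event"); break;
        }
        ev = next_event(ev);
    }
    return clauses_run;
}
```

Calling run_input() here executes the four clauses in order.  The point is
only that the function grows by one case per clause, so a large form yields
one very long function for the optimizer to chew on.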

Lee Pearson
Institutional File System project
The University of Michigan
313-763-0606

haug@almira.uucp (Brian R Haug) (12/19/90)

In article <1990Dec14.035146.18882@seachg.uucp> jalsop@seachg.UUCP (John Alsop) writes:
>I'm curious as to why the code generated by Informix' 4GL should cause such
>pathological behaviour of the optimizer.  Ideas anyone?

The reason for this pathological behaviour, taking a wild guess, is that the
code has several constants, and the procedure is fairly long.  At optimization
level 2, one of the optimizations runs in time proportional to the product of
these two values (length of assembler code and number of constants).  You may
also see this type of behaviour with lex output.
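
As a rough illustration of that product, here is a toy model of a pass that
rescans every instruction once per constant.  It is a sketch of the growth
rate only, not the algorithm optim actually uses; count_constant_uses() and
its parameters are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: for each constant, scan every instruction operand and count
   the matches.  *work receives the number of comparisons performed, which
   is always n_insns * n_constants.  Returns the number of matches. */
long count_constant_uses(const int *insn_operand, size_t n_insns,
                         const int *constants, size_t n_constants,
                         long *work)
{
    size_t c, i;
    long uses = 0;

    assert(work != NULL);
    *work = 0;

    for (c = 0; c < n_constants; c++) {
        for (i = 0; i < n_insns; i++) {
            (*work)++;                      /* one comparison per pair */
            if (insn_operand[i] == constants[c])
                uses++;
        }
    }
    return uses;
}
```

With 6 instructions and 2 constants the loop does 12 comparisons.  Doubling
both the function length and the number of constants quadruples the work,
which is the kind of growth that would hurt most on long generated functions.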

jalsop@seachg.uucp (John Alsop) (12/20/90)

In article <1990Dec19.000453.5631@almira.uucp> haug@almira.UUCP (Brian Haug) writes:
:In article <1990Dec14.035146.18882@seachg.uucp> jalsop@seachg.UUCP (John Alsop) writes:
:>I'm curious as to why the code generated by Informix' 4GL should cause such
:>pathological behaviour of the optimizer.  Ideas anyone?
:
:The reason for this pathological behaviour, taking a wild guess, is that the
:code has several constants, and the procedure is fairly long.  At optimization
:level 2, one of the optimizations runs in time proportional to the product of
:these two values (length of assembler code and number of constants).  You may
:also see this type of behaviour with lex output.

This sounds like it must be the problem.  The C code which was generated
by Informix in this case consisted of a large number of constant definitions 
and some long procedures which contained a large number of function calls 
and little else.

Unfortunately this kind of code can't really be optimized anyway, so all
the work is for nothing!


-- 
John Alsop

Sea Change Corporation
6695 Millcreek Drive, Unit 8
Mississauga, Ontario, Canada L5N 5R8
Tel: 416-542-9484 Fax: 416-542-9479
UUCP: ...!uunet!attcan!seachg!jalsop

ra@intsys.no (Robert Andersson) (12/20/90)

jalsop@seachg.uucp (John Alsop) writes:

>The first try at compiling the application on a Tower 600 with 3.00.01 ran
>for over 7 hours without finishing.  No-one was too impressed.
>A brief analysis of the compilation of one of the modules showed that the
>actual compilation from 4GL to C to assembler took about 45 seconds.  The
>optimizer then started up and ran for about an hour!

The C compiler optimizer in the 3.00.01 release seems to have trouble with
very big functions.  Handwritten code normally is split into small functions,
so that works ok.  But code output from yacc, lex and obviously also from
Informix 4GL typically is one *huge* function, and hence steps on the bug.
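
For contrast, here is a sketch of the "small functions" shape that
handwritten code tends to have: each case body is its own function, and a
table dispatches to it.  The handler names and dispatch() are hypothetical,
and whether restructuring like this would avoid the slowdown with the
3.00.01 compiler is an assumption, not something tested here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-clause handlers.  Each is a small function, so the
   optimizer only ever sees a few lines at a time. */
static int before_name(int state) { return state + 1; }
static int after_name(int state)  { return state * 2; }
static int before_city(int state) { return state + 3; }

typedef int (*handler_fn)(int);

/* Table indexed by event number, replacing one large switch. */
static const handler_fn handlers[] = { before_name, after_name, before_city };

#define N_HANDLERS (sizeof handlers / sizeof handlers[0])

/* Look up the handler for an event and run it on the current state. */
int dispatch(size_t event, int state)
{
    assert(event < N_HANDLERS);
    return handlers[event](state);
}
```

Generated code cannot easily be rearranged this way by hand, which is why
the compiler-side workarounds are the practical route.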

I have had the same problem you describe happen with a lot of freely
distributable sources.  Both perl and kermit spring to mind.

I normally do one of:
a) Use the 2.01.01 compiler.
b) Specify -O1.
c) Use gcc instead.  This is what I recommend.

Regards, Robert.
-- 
Robert Andersson       Voice +47 2 371055       International Systems A/S
ra@intsys.no           Fax   +47 2 356448       P.O. Box 3356
..!{uunet,mcsun,nuug}!intsys.no!ra             0405 Oslo 4, NORWAY

wescott@ncrcae.Columbia.NCR.COM (Mike Wescott) (12/22/90)

In article <1990Dec20.140514.6995@seachg.uucp> jalsop@seachg.UUCP (John Alsop) writes:
> Unfortunately this kind of code can't really be optimized anyway, so all
> the work is for nothing!

It is, however, easy to avoid this behavior and still get the benefits of
the optimizer.  Use the following command line:

	cc -O2 -W2,-N ....

The -N passed to optim turns off constant analysis.

--
	-Mike Wescott
	 mike.wescott@ncrcae.Columbia.NCR.COM