[comp.compilers] Encripted source as an ANDF

rfg@mcc.com (Ron Guilmette) (05/20/89)

worley@compass.com (Dale Worley) writes:
>The most interesting ANDF-type work I've seen is the concept of
>"shrouded source code", that is, compilable source that has been
>deliberately uglified by removing comments, renaming variables, etc.

I think that this is the best and most portable solution that anybody
is ever going to find, but it just won't catch on for at least three 
reasons.

First, when describing this approach to potential buyers (i.e. OSF and
friends) people will have to use the word "source" somewhere in the
description (even though the type of "source" we are talking about is
*not* usable in the same way as "normal" source is).  Just using this
word will scare off a lot of people who have emotional rather than
rational responses to the word (i.e. managers and other people in
positions of authority and ignorance).

Second, this solution is just not "flashy" enough.  It would be hard for
OSF to come forward before the trade press and hold a press conference
and give out press releases saying how they are the good guys because they
are seeking out and applying advanced state-of-the-art technologies, newly
developed as a product of painstaking cutting-edge research if, in the
final analysis, all they were announcing was "uglified" source code.
Let's face it... anybody could have thought of that.

Third, you cannot forget the "Beltway Bandit" syndrome.  Obviously, some
company is going to get a lot of money from OSF (not to mention the PR
value) by having the winning proposal.  Whatever company that is will have
thousands (perhaps millions) of reasons for insisting, at every opportunity
and on all occasions that the emperor *does* in fact have clothes.  (This
part ought to be fun to watch from the outside, especially as the arguments
get more and more ridiculous).

------

Regarding "uglified" source code, it should be noted that "uglification"
could really be a very intense process, going far beyond anything that
you might at first think of.  I have great confidence that we in this
industry can think up innumerable ways in which to "uglify" source code.
Many of the fundamental techniques of "uglification" could be garnered
directly from student programs (and, unfortunately, many programs written
by "professionals"... and I use the term loosely... as does everyone else).
Yes, we already have a large repertoire of ugliness to draw from.  We can
easily make code ugly; we just have trouble making it (actually) pretty
(rather than just well indented).

Some uglification transformations were noted by Dale Worley in the original
posting (i.e. removing comments, changing identifiers).  I would like
to mention just a few more.

First, in the case of C (or C++) one of the best techniques for rapidly
removing the maintainability from a piece of source code (that is, assuming
that there was any to begin with) is to preprocess it with the C preprocessor.
This causes all sorts of essentially one way transformations and the output
tends heavily towards ugly.

Another very effective uglifying transformation is to perform comprehensive
de-structuring, i.e. the conversion of all loops and well-structured if and
switch/case statements into the most degenerate possible equivalent forms
using GOTO statements.  This transformation alone should reasonably
prevent any theft of ideas (or code) given that it is guaranteed to make
the resulting code totally incomprehensible.  Given the current state of
the art in the area of restructuring tools, this is also (effectively) a
one way transformation for many languages.  This transformation is also
a particularly good candidate for use in an ANDF generator because the
semantics of the original source code may be strictly maintained, and
the transformation itself should have little or no effect on the quality
of the *final* generated (machine-dependent) code (assuming that even
modest optimizations take place in the ANDF compiler(s)).

For C, other good "semantics preserving" one way transformations include
the elimination of all typedef statements, and the elimination of all
enum type declarations/definitions and enum type constants.  In each of
these cases, the construct in question is just a convenient shorthand
notation which can be done away with via a sort of macro substitution.
Again, there should be no effect on final code quality.

The suggestion of yet further uglification techniques is left as an exercise
for the reader.  If you can't think of any then look at somebody else's
code.  If you look long enough, you're bound to see something which makes
code harder to maintain (although the original author is likely to disagree).

As a free service to the OSF, I will start collecting a list of uglifications.
If they arrive in sufficient number, I will summarize to the net.

[The author sent this followup shortly afterwards. -John]

Sorry.  But I just had to post a follow up note and mention one
more important source code uglification technique which could be
used for an ANDF generator.

The particular uglification technique I'm thinking of only works
for C++, but it is quite simple.  All you have to do to get a
really ugly (but semantically equivalent) version of a hunk of C++
source code is to run it through AT&T's cfront translator (which
generates totally obscure C code).  If you have never looked at
the output generated by cfront, you are really missing something!
Now we're really talking ugly!  Fortunately (as already well
demonstrated) we are also talking *portable* ugly.

// Ron Guilmette  -  MCC  -  Experimental Systems Kit Project
// 3500 West Balcones Center Drive,  Austin, TX  78759  -  (512)338-3740
// ARPA: rfg@mcc.com
// UUCP: {rutgers,uunet,gatech,ames,pyramid}!cs.utexas.edu!pp!rfg
--
Send compilers articles to compilers@ima.isc.com or, perhaps, Levine@YALE.EDU
Plausible paths are { decvax | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

albaugh@dms.UUCP (Mike Albaugh) (05/24/89)

>From article <3949@ima.ima.isc.com>, by rfg@mcc.com (Ron Guilmette):
[ much serious discussion of the _politics_ of uglified source ]
> 
> Another very effective uglifying transformation is to perform comprehensive
> de-structuring, i.e. the conversion of all loops and well-structured if and
> switch/case statements into the most degenerate possible equivalent forms
> using GOTO statements.  This transformation alone should reasonably
> prevent any theft of ideas (or code) given that it is guaranteed to make
> the resulting code totally incomprehensible.  Given the current state of
> the art in the area of restructuring tools, this is also (effectively) a
> one way transformation for many languages.  This transformation is also
> a particularly good candidate for use in an ANDF generator because the
> semantics of the original source code may be strictly maintained, and
> the transformation itself should have little or no effect on the quality
> of the *final* generated (machine-dependent) code (assuming that even
> modest optimizations take place in the ANDF compiler(s)).

	I beg to differ. The sort of transformation suggested here is likely
to cripple the optimization effort, for much the same reason as cited against
RTL and the like. If the code is going to be "optimized" by the original
source->andf translation, assumptions have to be made about the eventual
target. These assumptions are no better than the RTL ones. If the code is
supposed to be optimized by the andf->machine_code translation, then the
control structures and variable scoping need to be preserved so, for example,
register allocation can be done "intelligently".

	For example, we have a locally developed "silliness reducer" which
we use on the output of the GreenHills compiler. We hesitate to call it
an "optimizer", because the resulting code is far from optimal, but it does
a fair job of deleting dead code, extraneous reg-reg moves, etc. One
headache was that the compiler uses D0 as the return register as well as
its most_favored_temp. If one only has the assembly code to inspect, it
is impossible to tell whether a seemingly dead calculation is really a
return value. Similar problems would crop up in an uglification that re-used
variables, expecting a specific number to occupy registers.

| Mike Albaugh (albaugh@dms.UUCP || {...decwrl!turtlevax!}weitek!dms!albaugh)
| Atari Games Corp (Arcade Games, no relation to the makers of the ST)
| 675 Sycamore Dr. Milpitas, CA 95035		voice: (408)434-1709
| The opinions expressed are my own (Boy, are they ever)

henry@zoo.toronto.edu (Henry Spencer) (05/24/89)

Upon reflection, it occurs to me that there is a minor problem with using
obfuscated source as an ANDF.  This problem turns major for most other
ANDF concepts I can think of.  Consider:  at some level, one must leave the
insides of libraries to the target system.  At the very least, how to do
a system call is system-specific.  Moreover, different systems often have
good reason to fiddle with the insides of, say, printf, to adapt it to
the facilities and characteristics of the particular system.  One really
wants an ANDF-distributed program to use the target's library, not one
that the program hauls along.

PROBLEM:  what to do about things like putc(), which are macros -- that 
is, are normally expanded before one generates obfuscated source, or
tokenized source, or compiler intermediate form -- but must interface
correctly with the local library?  This is a hassle even today on systems
where some code is distributed as object modules to be linked on the
target system:  one cannot improve stdio, for example, without risk of
breaking such code.  The insides of those macros are really part of the
target system's library and should be expanded on the target system.

Worse, it's not just function-like macros that are affected, but even
plain old numeric constants, which can appear in places like array
dimensions.  That means you can't just pretend that putc() is really a
function and have the ANDF translator on the target machine do the
code expansion -- that doesn't work for BUFSIZ.  To make BUFSIZ match
that of the target machine (which it must if you want the target's
setbuf() to work right), you have to do the preprocessing on the target.

One can imagine programming practices that would avoid some specific
cases of these problems, but it's a nasty problem in general.  Especially
if you want to be able to apply it to existing portable programs.

It seems to me that this kills any ANDF scheme which is not essentially
based on obfuscated (but non-preprocessed) source.

                                     Henry Spencer at U of Toronto Zoology
                                 uunet!attcan!utzoo!henry henry@zoo.toronto.edu

bpendlet@esunix.uucp (Bob Pendleton) (05/24/89)

rfg@mcc.com (Ron Guilmette):
> worley@compass.com (Dale Worley) writes:
>>The most interesting ANDF-type work I've seen is the concept of
>>"shrouded source code", that is, compilable source that has been
>>deliberately uglified by removing comments, renaming variables, etc.
> 
> Regarding "uglified" source code, it should be noted that "uglification"
> could really be a very intense process, going far beyond anything that
> you might at first think of.

> First, in the case of C (or C++) one of the best techniques for rapidly
> removing the maintainability from a piece of source code (that is, assuming
> that there was any to begin with) is to preprocess it with the C 
> preprocessor.

> Another very effective uglifying transformation is to perform comprehensive
> de-structuring, i.e. the conversion of all loops and well-structured if and
> switch/case statements into the most degenerate possible equivalent forms
> using GOTO statements.

> For C, other good "semantics preserving" one way transformations include
> the elimination of all typedef statements, and the elimination of all
> enum type declarations/definitions and enum type constants.  

So as a general rule all "uglifications" must preserve semantics.

Ok, now let's make all temporaries explicit and break up expressions
into sequences of statements containing only binary or ternary
operations like "x op= y," "x = y op z," and "if (x) goto y."

Let's make all address arithmetic and dereferencing operations
explicit.

Now let's do some loop folding, invariant code motion, and common
subexpression elimination.

Is that ugly enough for you?

Doesn't this process sound familiar? By the time you have the "source"
code in this uglified form you have done most of the work of compiling
it. If you are careful your uglifying transformations can not only
preserve the semantics of the original program, they can be machine
independent.

Mapping the partially compiled intermediate code, er... I must mean
uglified code, into machine code for a specific architecture without
messing up the semantics is hard. But then that has always been the
hard part of implementing standards compliant compilers.

			Bob P.
[From bpendlet@esunix.uucp (Bob Pendleton)]

jeffb@grace.cs.washington.edu (Jeff Bowden) (05/25/89)

In article <3963@ima.ima.isc.com> henry@zoo.toronto.edu (Henry Spencer) writes:
>
>PROBLEM:  what to do about things like putc(), which are macros

I was going to post this objection.  Since Mr. Spencer beat me to it, I will
offer a solution (so someone else can tear *it* apart).  

1) Write a preprocessor which, in addition to obfuscating, 
replaces #include<foo.h> with some goop that cpp will leave alone but will
indicate the name (e.g. extern int _foo_h();)

2) Run the result through cpp.

3) Write a post processor which replaces the goop with the original #include

4) Ship product to customers.
.........
One of the earlier posters mentioned transforming the source by changing
control structures into gotos.  This might work but I would guess that some
optimizing compilers might lobotomize themselves temporarily when they find
code with gotos in it (perhaps because it's not worth the effort to optimize
code with little-used language features, especially when they have all sorts
of hairy implications.) [Direct "goto" flames to /dev/tty.  Amuse yourself.]
--
"It has been discovered that C++ provides a remarkable facility for concealing
the trivial details of a program - such as where its bugs are."
[Rediscovering the structure of code written with gotos isn't all that hard,
and is quite common in Fortran compilers where you don't have much choice.
But it's true, it makes the ANDF back end harder than it has to be.  -John]
[From jeffb@grace.cs.washington.edu (Jeff Bowden)]

rfg@mcc.com (Ron Guilmette) (05/28/89)

Recently, albaugh@dms.UUCP (Mike Albaugh) writes:
> >From article <3949@ima.ima.isc.com>, by rfg@mcc.com (Ron Guilmette):
> [ much serious discussion of the _politics_ of uglified source ]
> > 
> > Another very effective uglifying transformation is to perform comprehensive
> > de-structuring, i.e. the conversion of all loops and well-structured if and
> > switch/case statements into the most degenerate possible equivalent forms
> > using GOTO statements...

> > ...  This transformation is also
> > a particularly good candidate for use in an ANDF generator because the
> > semantics of the original source code may be strictly maintained, and
> > the transformation itself should have little or no effect on the quality
> > of the *final* generated (machine-dependent) code (assuming that even
> > modest optimizations take place in the ANDF compiler(s)).
> 
> 	I beg to differ. The sort of transformation suggested here is likely
> to cripple the optimization effort, for much the same reason as cited against
> RTL and the like. If the code is going to be "optimized" by the original
> source->andf translation, assumptions have to be made about the eventual
> target. These assumptions are no better than the RTL ones. If the code is
> supposed to be optimized by the andf->machine_code translation, then the
> control structures and variable scoping need to be preserved so, for example,
> register allocation can be done "intelligently".

Mike talks about two types of "optimizers" here, i.e. SOURCE => ANDF and
ANDF => MACHINE_CODE.  One of these possibilities is totally silly, in the
current context.

The real beauty of the simple idea I proposed was that almost everybody
already has a C compiler.  In the scheme I suggested, this compiler would
also serve (without major modifications) as the ANDF compiler.

Given this assumption, it should be obvious that there would be no need
whatsoever for a SOURCE => ANDF "optimizer" since the ANDF => MACHINE_CODE
transformation (i.e. "normal" compilation) would (presumably) already
have a good optimizer.

Mike says that even for an ANDF => MACHINE_CODE optimizer, "control
structures and variable scoping need to be preserved so, for example,
register allocation can be done 'intelligently'".  Well, he may have gotten
it half right.  Scoping information may be useful in this regard, but I
never suggested that any scoping information be destroyed.  Consider the
destructuring of:

	if (<expression>)
	{
		<local-variable-declarations>
		...
	}

into:

	if (!(<expression>))
		goto around_999;
	{
		<local-variable-declarations>
		...
	}
	around_999:

This destructuring transformation obviously *does not* have any effect
on scoping information.

Regarding the other half of Mike's argument (i.e. that "control structures"
must be preserved to do good optimization), I believe that this is also
patently false.  I personally know of no reason why this should be the case,
and I challenge Mike to produce some evidence or proof that such information
improves the ability of an optimizer to do its work (either with respect to
register allocation, or with respect to any other type of commonly used
optimization mechanism).

In fact, quite to the contrary, I believe that the vast majority of modern
optimizers begin their analysis by reducing "higher-level" control constructs
down to their simpler "GOTO" equivalents.  Thus, if this transformation is
done at the source level, it should have absolutely no effect on the quality
of optimization for most well-written modern optimizers.

Mike seems to be saying that there are some optimizers which perform
specialized optimizations *only* on control-flow graphs derived from
"higher-level" control constructs (e.g. if-then-else, while-do, repeat-while,
for, etc.) and *not* on identical control flow graphs which happen to be
derived from some GOTO-filled programs.  I believe that this is wrong, and
that all "good" optimizers look for *all* optimization opportunities wherever
they might be found.

> 	For example, we have a locally developed "silliness reducer" which
> we use on the output of the GreenHills compiler...
> ... [ description of their post-processor which fixes up lousy GreenHills
> output code  ] ...

What does this have to do with anything (other than to demonstrate that
GreenHills compilers need more work)?

> ... Similar problems would crop up in an uglification that re-used
> variables, expecting a specific number to occupy registers.

I never suggested this as a "proper" uglification-step for an ANDF generator
(and I probably never will)!  We *were* talking about de-structuring.

// Ron Guilmette  -  MCC  -  Experimental Systems Kit Project
// 3500 West Balcones Center Drive,  Austin, TX  78759  -  (512)338-3740
// ARPA: rfg@mcc.com
// UUCP: {rutgers,uunet,gatech,ames,pyramid}!cs.utexas.edu!pp!rfg

rfg@mcc.com (Ron Guilmette) (05/28/89)

Recently, henry@zoo.toronto.edu (Henry Spencer) writes:
> Upon reflection, it occurs to me that there is a minor problem with using
> obfuscated source as an ANDF.  This problem turns major for most other
> ANDF concepts I can think of.  Consider:  at some level, one must leave the
> insides of libraries to the target system.  At the very least, how to do
> a system call is system-specific.  Moreover, different systems often have
> good reason to fiddle with the insides of, say, printf, to adapt it to
> the facilities and characteristics of the particular system.  One really
> wants an ANDF-distributed program to use the target's library, not one
> that the program hauls along.

This is perhaps *the* best argument IN FAVOR of using obfuscated source
code as ANDF, i.e. that it is *not* necessary to transmit some fully pre-
linked thing to the final destination system.  Rather, if you used some form
of "source", then, by definition, you would do the final "installation"
of a piece of ANDF stuff by compiling, and then linking with the local
libraries on the target system itself.  This solves lots of otherwise
UGLY problems that you would have with any system where the link-step
is done *before* distribution.

> PROBLEM:  what to do about things like putc(), which are macros -- that 
> is, are normally expanded before one generates obfuscated source, or
> tokenized source, or compiler intermediate form -- but must interface
> correctly with the local library?  This is a hassle even today on systems
> where some code is distributed as object modules to be linked on the
> target system:  one cannot improve stdio, for example, without risk of
> breaking such code.  The insides of those macros are really part of the
> target system's library and should be expanded on the target system.

A fair question.  I believe that a simple solution is available.

First, assume that we can build (or buy) a slightly modified C preprocessor
which will have the following minor extension.  It will have a "-u"
option, which is kinda like -U except that the given symbol *remains*
undefined even if a #define is subsequently seen for the given symbol.

Require the "manufacturer" of a given piece of software to identify
(before the obfuscation step, which includes preprocessing) the entire
list of "system-dependent" macros which are used within his code.  For
this list of macros (which will probably be quite small) the manufacturer
will perform the obfuscation step (with preprocessing) while using "-u"
options for each of the "system-dependent" macros which need to be
expanded on the "destination" system rather than on the original
"development" system.  This will cause any calls of the given macros
to be left "un-expanded" during obfuscation.

During "installation" on the "destination" system, appropriate "local"
definitions for the "system-dependent" macros could be supplied via
another (standard) preprocessing step and via -D command line options.


> Worse, it's not just function-like macros that are affected, but even
> plain old numeric constants, which can appear in places like array
> dimensions.  That means you can't just pretend that putc() is really a
> function and have the ANDF translator on the target machine do the
> code expansion -- that doesn't work for BUFSIZ.  To make BUFSIZ match
> that of the target machine (which it must if you want the target's
> setbuf() to work right), you have to do the preprocessing on the target.

Exactly right, Henry.  As you correctly pointed out, some processing will
always have to be done on the "target" system.  Obviously, if you (as a
software author) make the choice to write your code such that it
depends on "system-dependent" macros, then you have left yourself no
alternative except to do (at least some) macro-preprocessing on your
intended "target" system(s).

I don't see any problem with doing preprocessing (for both function and
non-function macro expansion) on the "target" system(s) in an ANDF
distribution scheme.

> It seems to me that this kills any ANDF scheme which is not essentially
> based on obfuscated (but non-preprocessed) source.

In case you missed my point, I disagree very strongly.  You should be
able to do a (partial) preprocessing step on the "development" system,
and another (partial) preprocessing step on the "destination" system.
This will ensure maximum portability and also maximum obfuscation.

// Ron Guilmette  -  MCC  -  Experimental Systems Kit Project
// 3500 West Balcones Center Drive,  Austin, TX  78759  -  (512)338-3740
// ARPA: rfg@mcc.com
// UUCP: {rutgers,uunet,gatech,ames,pyramid}!cs.utexas.edu!pp!rfg

les@chinet.chi.il.us (Leslie Mikesell) (05/29/89)

In article <3978@ima.ima.isc.com> jeffb@grace.cs.washington.edu (Jeff Bowden) writes:

>>PROBLEM:  what to do about things like putc(), which are macros

>I was going to post this objection.  Since Mr. Spencer beat me to it, I will
>offer a solution (so someone else can tear *it* apart).  
>
>1) Write a preprocessor which, in addition to obfuscating, 
>replaces #include<foo.h> with some goop that cpp will leave alone but will
>indicate the name (e.g. extern int _foo_h();)

How about a better approach altogether?  Instead of requiring all manufacturers
to produce an obfuscated-code compiler, and all end users to maintain the
resources to store and run the code and compiler, why not suggest that the
manufacturers that want applications to be available for their machines provide
a code-generator and libraries for a standard compiler with no restrictions on
distribution.

That way the program writer does not have to have access to the target machine
or spend a fortune to buy a cross-compiler that may act differently from the
standard version. This relieves the burden from the end user and places it
where it belongs.  It would prevent the end user from compiling a copy of
a previously purchased program for a new environment by himself but I suspect
that the program suppliers would not see that as a real problem.  The need for
testing still applies, of course, but the steps required to do so are not
increased by this approach.

Can anyone comment on how far GNU C is from being a suitable platform for
acting as a multi-machine cross compiler?

Les Mikesell
[It seems to me that this proposal misses the point -- my understanding of
ANDF is to permit a single version of a program to exist that, probably as
part of the installation process, is turned into something executable on
whatever the local machine is without the vendor having to create N
different versions for N machines.  -John]
[From les@chinet.chi.il.us (Leslie Mikesell)]

kbierman@sun.com (05/31/89)

In article <3990@ima.ima.isc.com> it is written:
>.... talks about two types of "optimizers" here, i.e. SOURCE => ANDF and
>ANDF => MACHINE_CODE.  One of these possibilities is totally silly, in the
>current context.
>
>The real beauty of the simple idea I proposed was that almost everybody
>already has a C compiler.  In the scheme I suggested, this compiler would
>also serve (without major modifications) as the ANDF compiler.
>
>Given this assumption, it should be obvious that there would be no need
>whatsoever for a SOURCE => ANDF "optimizer" since the ANDF => MACHINE_CODE
>transformation (i.e. "normal" compilation) would (presumably) already
>have a good optimizer.

.... more stuff about "good optimizing compilers"

The real fly in the ointment is that C semantics sabotage many of the
best optimizations currently available to other languages (like most
good fortran compilers). 

A _few_ are

1) unconstrained pointers, 
2) "for" vs "do"
   (while the author of the proposal may think it best to reduce
   directly to GOTO's, DO loops ensure that the control variable may
   not be modified by anything in its body ... thus allowing many
   interesting optimizations ... w/o having to do extensive analysis)
   
3) lack of knowledge about rules of complex arithmetic
4) trig functions (built into fortran) and many other items of interest in
   the real "go fast" community. 

Try compiling C on your local supercomputer (or many RISC machines)
vs. fortran ... typically the fortran optimizer generates much tighter
code. 

This is not to say Fortran is "better" than C; it simply has features well
suited to numerical computation and automatic optimization thereof.
-- 
Keith H. Bierman      |*My thoughts are my own. Only my work belongs to Sun*
It's Not My Fault     |	Marketing Technical Specialist    ! kbierman@sun.com
I Voted for Bill &    |   Languages and Performance Tools. 
Opus  (* strange as it may seem, I do more engineering now     *)

henry@zoo.toronto.edu (05/31/89)

>> It seems to me that this kills any ANDF scheme which is not essentially
>> based on obfuscated (but non-preprocessed) source.
>
>In case you missed my point, I disagree very strongly.  You should be
>able to do a (partial) preprocessing step on the "development" system...

I should have been clearer.  Note the word "essentially", though.  I admit
I didn't think of partial preprocessing, which could be useful.

Do remember, though, that ANSI C in particular encourages rather more use
of macros with potentially implementation-specific bodies than older C
practice.  Not to say that partial preprocessing won't work, but a rather
larger number of identifiers are going to have to be marked "hands off".
[From henry@zoo.toronto.edu]

diomidis@ecrcvax.UUCP (Diomidis Spinellis) (06/10/89)

In article <4001@ima.ima.isc.com> kbierman@sun.com writes:
>The real fly in the ointment is that C semantics sabotage many of the
>best optimizations currently available to other languages (like most
>good fortran compilers). 
>
>[...]
>2) "for" vs "do"
>   (while the author of the proposal may think it best to reduce
>   directly to GOTO's, DO loops ensure that the control variable may
>   not be modified by anything in its body ... thus allowing many
>   interesting optimizations ... w/o having to do extensive analysis)

The use of a ``register'' variable as a loop control variable ensures
that no alias is used for that variable.  Thus the compiler only has
to look at the code inside the loop body in order to determine if the
variable is modified or not.  In most cases it will not be modified 
and the interesting optimizations can be performed.

>[...]
>4) trig functions (built into fortran) and many other items of interest in
>   the "real "go fast"" community. 

ANSI C defines the semantics of many functions including the trig functions.
Thus an ANSI C compiler is free to use inline code or known properties of 
these functions.  Many compilers that are conformant to ANSI drafts already
produce such code and Matthew Self (self@bayes.arc.nasa.gov) has posted
an almost ANSI conforming inline math library in gnu.gcc for the GNU C 
compiler (message-ID <8903230250.AA00243@bayes.arc.nasa.gov>).
--
Diomidis Spinellis                   European Computer-Industry Research Centre
Arabellastrasse 17, D-8000 Muenchen 81, West Germany          +49 (89) 92699199
USA: diomidis%ecrcvax.uucp@pyramid.pyramid.com     ...!pyramid!ecrcvax!diomidis
Europe: diomidis@ecrcvax.uucp                        ...!unido!ecrcvax!diomidis