[comp.lang.misc] Quickpick construct

nevin@igloo.scum.com (Nevin Liber) (10/04/90)

In article <18718:Oct120:03:0090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <7950@scolex.sco.COM> seanf (Sean Fagan) writes:

>Yes, I'm aware of that; programs are almost always recompiled for new
>hardware. The point is that you don't need the original programmer
>slobbering over each new chip to figure out whether it's faster with
>algorithm X or algorithm Y. You let the compiler do the job for you.

But this solution would be a maintenance nightmare!  Let me give you a
few scenarios to illustrate my point:

Suppose Joe Programmer writes two routines to perform the same task:
one of them is portable but slower and the other is quick but machine-
dependent.  When he switches machines, the machine-dependent code will
still be picked (the compiler chooses on speed alone, so correctness
never enters into it) even though it is no longer correct.
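
To make this concrete, here is one plausible pair (the example is mine,
and bit counting is just a stand-in for any such task): a portable bit
counter next to a quick one that silently assumes 32-bit ints.

/* Portable: correct for any width of unsigned int. */
unsigned bitcount_portable(unsigned x)
{
    unsigned n = 0;

    while (x) {
        n += x & 1;
        x >>= 1;
    }
    return n;
}

/* Quick but machine-dependent: this trick assumes unsigned int is
 * exactly 32 bits.  Move to a machine where it is not, and the
 * routine is still fast -- and silently wrong. */
unsigned bitcount_quick(unsigned x)
{
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0f0f0f0f;
    return (x * 0x01010101) >> 24;
}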

There is no way to "prove" that two different routines perform the same
task.  What happens if you find a bug?  You then have to figure out
which alternative was compiled, fix it, and hope that the same
alternative is picked again (after all, it may not be as efficient
after you put your fix in).  From a reliability point of view, you
would have to compile and test all the different combinations of code.
Unless the product is very small, this gets to be very boring and
time-consuming.

Thirdly, I suspect that there are very few routines which would benefit
from this type of hand-optimisation, because it assumes that the compiler
can tell how long it will take to execute a given loop.  Most loops are
not iterated a fixed number of times (by which I mean a number known at
compile time).  Given anything bigger than a trivial loop, the compiler
would be doing no better (and possibly worse) than without this
redundant information.
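
For instance (my own trivial illustration): the first routine below has
a trip count the compiler can see; the second depends entirely on the
data, and no amount of extra source-level information tells the
compiler how long it will run.

/* Trip count known at compile time: exactly 16 iterations. */
int sum16(const int a[16])
{
    int i, sum = 0;

    for (i = 0; i < 16; i++)
        sum += a[i];
    return sum;
}

/* Trip count data-dependent: nothing in the source says how long
 * this will run, so no cost estimate is possible at compile time. */
int my_strlen(const char *s)
{
    int n = 0;

    while (*s++)
        n++;
    return n;
}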


Rather than adding constructs to facilitate inline hand-optimisation,
I would like to see people writing simpler, easier-to-understand code
so that the compilers have a better chance at optimising, since
compilers tend to do it better than humans anyway.  Making
hand-optimisation easier now will probably get in the way of better
optimisers later.
-- 
	NEVIN ":-)" LIBER
	nevin@igloo.Scum.com  or  ..!gargoyle!igloo!nevin
	(708) 831-FLYS
California, here I come!	Public Service Announcement: Say NO to Rugs!

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (10/08/90)

In article <2882@igloo.scum.com> nevin@igloo.UUCP (Nevin Liber) writes:
> In article <18718:Oct120:03:0090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >Yes, I'm aware of that; programs are almost always recompiled for new
> >hardware. The point is that you don't need the original programmer
> >slobbering over each new chip to figure out whether it's faster with
> >algorithm X or algorithm Y. You let the compiler do the job for you.
> But this solution would be a maintenance nightmare!

Maybe it would be. However, it's infinitely better than the situation
you get with ifdefs. (All your complaints about maintenance apply
equally well to ifdefs.)

> Suppose Joe Programmer writes two routines to perform the same task:
> one of them is portable but slower and the other is quick but machine-
> dependent.

No! I'm not talking about extending the language in any way other than
quickpick. I can't magically solve the problem of people writing
non-portable code, but I'm also not saying that the compiler should
provide non-portable constructs.

Quite the opposite: quickpick itself is perfectly portable. In fact, it
should *encourage* the development of portable code. Most of the time
that I use machine-dependent constructs or assembly, it's because I'm
unable to get the compiler to recognize any portable idiom for a fast
assembly construct. Part of the quickpick idea is that every compiler
should provide *some* way to do *everything* (within reason).
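
Rotation is my stock example (the function below is mine, not something
from earlier in the thread): C has no rotate operator, but there is a
portable idiom, and a compiler that recognizes it can emit the
machine's rotate instruction directly.

/* Portable 32-bit left rotate (assumes unsigned is exactly 32 bits).
 * The `-n & 31' form keeps the right shift in range even when n == 0;
 * a compiler that knows the idiom can compile the whole body down to
 * a single rotate instruction. */
unsigned rot32(unsigned x, int n)
{
    n &= 31;
    return (x << n) | (x >> (-n & 31));
}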

> There is no way to "prove" that two different routines perform the same
> task.  What happens if you find a bug?

It's certainly no worse than the ifdef situation. I think it's quite a
bit better, since the optimization decisions are entirely local. There's
no global HAVE_BITCOUNT symbol to look out for, no possibility of typos
in an ifdef; instead, there's just a small section of code that does one
well-defined thing in several different ways.
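
Compare, concretely (the quickpick syntax below is only my sketch of
the idea, not a fixed proposal):

/* The ifdef style: a global configuration symbol, defined somewhere
 * far from this code, and one misspelling away from silently always
 * taking the fallback branch. */
unsigned count_ifdef(unsigned x)
{
    unsigned n;
#ifdef HAVE_BITCOUNT
    n = bitcount(x);            /* machine-specific library routine */
#else
    for (n = 0; x; x >>= 1)     /* portable fallback */
        n += x & 1;
#endif
    return n;
}

/* The quickpick style (sketch syntax only): both alternatives are
 * portable, sit side by side, and the compiler picks whichever comes
 * out faster on the target. */
unsigned count_quickpick(unsigned x)
{
    unsigned n;

    quickpick {
        for (n = 0; x; x >>= 1)     /* shift and mask */
            n += x & 1;
    } or {
        for (n = 0; x; x &= x - 1)  /* clear lowest set bit */
            n++;
    }
    return n;
}

No global symbol, no action at a distance: fix a bug in one branch and
the other branch is right there next to it.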

> Thirdly, I suspect that there are very few routines which would benefit
> from this type of hand-optimisation, because it assumes that the compiler
> can tell how long it will take to execute a given loop.

It's unfortunate that people are so blind to the benefits of hand
optimization. Fine, you stay away from quickpick; just let me write my
code efficiently. Piercarlo, do you want to argue this one again?

Sure, I see examples of code that my compiler can't understand, let
alone optimize. But I also see lots of examples that it can understand.
Compilers are getting smarter; some of them can even use profiling
information to figure out how often a loop is executed.

> Rather than adding constructs to facilitate inline hand-optimisation,
> I would like to see people writing simpler, easier-to-understand code
> so that the compilers have a better chance at optimising, since
> compilers tend to do it better than humans anyway.  Making
> hand-optimisation easier now will probably get in the way of better
> optimisers later.

Obviously you program in APL.

---Dan

pcg@cs.aber.ac.uk (Piercarlo Grandi) (10/12/90)

On 8 Oct 90 01:13:27 GMT, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) said:

brnstnd> It's unfortunate that people are so blind to the benefits of
brnstnd> hand optimization. Fine, you stay away from quickpick; just let
brnstnd> me write my code efficiently. Piercarlo, do you want to argue
brnstnd> this one again?

Yup! Of course! :-). But I am a little busy now, so inflicting on the
tired masses another one of my treatises will have to wait a bit.

However, just to remind you of the gist of the argument: I object to
compilers rewriting a program's logic behind my back -- I think that
this should be done at the source level, either by the programmer alone
or by the programmer plus a programmatic assistant.

I do not object to:

* compilers doing their best to generate efficient code that _faithfully
reflects the logic_ in my source, if the source is expressed in a low
level language,

* compilers working even harder to generate efficient code that
_achieves the effects prescribed_ by my source, if the source is
expressed in a high level language.

I object to compilers that try to generate the best code that _achieves
the same effect as the logic_ of my source, when the source is expressed
in a low level language.

Just to give an obvious example: if I write three nested loops to
multiply matrices 'm' and 'n', I want the compiler to generate code for
those three nested loops as I have written them -- if I write 'm * n',
the compiler can implement the '*' operator as it wants.

	For example, on vector machines it is advantageous to write
	the three nested loops in a different order than on scalar
	machines, but if I write three loops their order should not be
	changed -- if I write three nested loops it must be assumed that
	I know what I am doing, and it then falls on me to provide
	the optimal order for the architecture I am using (see the
	sketch below).

	On the other hand, if I write '*' the compiler can decide
	whether to generate the three nested loops for matrix multiply,
	and in which order, and whether or not to generate vector
	instructions.
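
Here is a sketch in C of what I mean (row-major arrays assumed; the
size N is only illustrative):

#define N 100

/* The textbook i-j-k order: the inner loop walks b by column, a
 * stride of N doubles -- tolerable on a scalar machine, poor for a
 * vector unit. */
void matmul_ijk(double c[N][N], double a[N][N], double b[N][N])
{
    int i, j, k;

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            c[i][j] = 0.0;
            for (k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
}

/* The i-k-j order: the inner loop walks b and c row by row, at unit
 * stride, which is what a vector machine wants.  Same result,
 * different loops -- and it is my job, not the compiler's, to pick
 * the right one for my machine. */
void matmul_ikj(double c[N][N], double a[N][N], double b[N][N])
{
    int i, j, k;

    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++)
            c[i][j] = 0.0;
        for (k = 0; k < N; k++)
            for (j = 0; j < N; j++)
                c[i][j] += a[i][k] * b[k][j];
    }
}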

What I do not want is the compiler trying to 'understand' that the
effect of my three nested loops is matrix multiplication and substitute
its implementation of '*' instead, behind my back. This is inefficient,
dangerous, and plain stupid overall (superoptimizers are just an amusing
joke).

Except maybe for the dusty deck/hapless programmer problem. But then I
believe that the solution to the dusty deck problem is not the compiler
doing hazardous transformations on a low level representation of the
source of the dusty deck, but direct source level transformations
(dust off that deck!).

	Continuing the example above, the right thing would be to use a
	source transformer, hopefully under programmer control, to change
	the three nested loops into an invocation of '*'.

In the dirty commercial world there are lots of companies that earn a
living dusting off decks. I wish that in the research world
user-assisted source-level transformation algorithms would become more
popular outside the Lisp/AI community. It is a fascinating research
subject, more so I think than debugging aggressive optimizers.

I would suggest rereading Knuth on this specific subject, and Denning on
SDRAWKCAB for the general idea.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk