dsimon@eniac.seas.upenn.edu (Derron Simon) (05/09/91)
One utility I was looking for about a year ago is an optimizer for C code
that can parse and optimize the original source.  I don't know why this
hasn't been attempted by any PD writers.  It is definitely non-trivial,
but I'm surprised no one has ever tried it.  Think about it: a global or
local optimizer that emits fully portable C source... doesn't it make you
smile?  Well, has anyone heard of anything along this line, or does
anyone have opinions on the matter?
-- 
/* Derron Simon                * Internet: dsimon@eniac.seas.upenn.edu    */
/*                             * Phone: (215)573-5650   GEnie: D.SIMON    */
/* University of Pennsylvania  * FidoNet: 273/702   CS: 72571,1524        */
/* Moore School of Engineering * Box 0938/3700 Spruce St/Phila./PA 19104  */
steve@taumet.com (Stephen Clamage) (05/09/91)
dsimon@eniac.seas.upenn.edu (Derron Simon) writes:

>One utility I was looking for about a year ago is an optimizer for C code
>that can parse and optimize the original source.  I don't know why this
>hasn't been attempted by any PD writers.  It is definitely non-trivial,
>but I'm surprised no one has ever tried it.

I'm not surprised.  It is of very little use.  Gains in performance at
the source code level are best achieved by choosing better algorithms.
Reliable optimizations achievable by source code manipulation as you
describe are done anyway by quality compilers.

Many of the best optimizations depend critically on how the compiler
generates code, and on the characteristics of the machine.  For example,
an "obvious" optimization is to jump around unnecessary code.  Yet on
some pipelined machines, the penalty for a jump is so high that it is
actually faster to execute the useless instructions.  Another "obvious"
optimization is to convert certain kinds of multiplication into shifts
and adds.  The right way to do this depends critically on the machine --
some shifts on some machines are slower than some multiplies.  The
compiler can do this best, and quality compilers do it.

Code which is fiddled to run faster with a given compiler/machine
combination may run much slower with another combination.  The result of
the fiddling is generally not very readable, and is hard to maintain.

If you can't get a quality compiler and you have positively identified a
bottleneck in your code, that is the time to look for a way to improve
that stretch of code.  Keep the original straightforward code around and
document your fiddling, so that later maintainers can understand what has
happened and why.
-- 
Steve Clamage, TauMetric Corp, steve@taumet.com
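As a concrete illustration of the multiplication-to-shift rewrite
discussed above, here is a minimal sketch (the function names are
hypothetical).  Whether the "fiddled" form is any faster depends entirely
on the machine, which is exactly why the transformation is best left to
the compiler:

	/* Straightforward version: say what you mean and let the
	 * compiler pick the instructions.
	 */
	long scale(long x)
	{
		return x * 10;
	}

	/* Hand-"optimized" version: 10*x computed as 8*x + 2*x with
	 * shifts and adds.  Faster on some machines, slower on others,
	 * and the intent of the code is now obscured.
	 */
	long scale_fiddled(long x)
	{
		return (x << 3) + (x << 1);
	}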
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (05/10/91)
In article <721@taumet.com> steve@taumet.com (Stephen Clamage) writes:
> dsimon@eniac.seas.upenn.edu (Derron Simon) writes:

  [ wants a source-code optimizer ]

> I'm not surprised.  It is of very little use.

That's what I mean by counterproductive.  But rather than trying to
convince you that any competent human can easily outdo the best available
automated optimizers, let me show you an unarguable advantage of
source-code optimization.

A typical math library takes hours, often even days, to compile on each
machine.  Installing several libraries can start taking a huge chunk out
of the CPU time available for users.  Why is compiling so slow?  Because
the optimizer spends so much time searching for optimizations that could
have been expressed at the source level.

Guess what?  If someone includes a source-optimized version along with
the original code, you won't have to waste nearly as much time doing the
same optimizations again for your machine.

This points to a productive area of research: how to make languages
sufficiently expressive that all interesting optimizations *can* be
expressed portably at the source-code level.

Now doesn't this seem more useful than answering every optimization
question with ``It is of very little use''?

---Dan
rh@smds.UUCP (Richard Harter) (05/10/91)
In article <721@taumet.com>, steve@taumet.com (Stephen Clamage) writes:
> I'm not surprised.  It is of very little use.  Gains in performance at
> the source code level are best achieved by choosing better algorithms.
> Reliable optimizations achievable by source code manipulation as you
> describe are done anyway by quality compilers....

Stephen's advice is good and generally sound.  I just want to add some
qualifiers.  One is that there are quite a few local optimizations that
lots of compilers don't seem to catch.  A key point is that quite often
you know more about the constraints than the compiler can assume.

A more general point is that the advice to "choose better algorithms" is
a bit misleading, although true if we take "algorithm" in its most
general sense.  When one thinks of algorithms one tends to think of
algorithms in the abstract, e.g. sort algorithms, search algorithms, etc.
Stephen's advice can lead one to try to pick optimal algorithms which are
then glued together.  However, a major cost in many programs is the glue,
i.e. the data management and the interfaces.  For example,

	fooby(...)
	{
		...
		p = malloc(n);
		...
		free(p);
	}

calls malloc and free on each invocation.  If fooby is not recursive (so
that there is never a stack of outstanding malloc calls) we gain with

	static char *p = 0;		/* buffer kept across calls */
	static int sizeof_p = 0;	/* its current size */

	fooby(...)
	{
		/* grow the buffer only when the request outruns it */
		if (n > sizeof_p) {
			if (p)
				free(p);
			p = malloc(n);
			sizeof_p = n;
		}
		...
	}

This is a space/time tradeoff.  We keep a permanent array (using space
even when we are not in fooby) and eliminate the malloc/free calls in all
but a few of the calls to fooby.

There are lots of instances where well-chosen data structures let you
avoid the cost of copying by setting pointers.  The point is that
optimization of programs frequently involves better design of the data
management process.
-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh    Phone: 508-369-7398
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb.  This sentence short.  This signature done.
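To see what "setting pointers" instead of copying looks like, here is a
minimal sketch (the names and the double-buffering scenario are
hypothetical, not from the post above):

	#include <string.h>

	#define FRAMELEN 4096

	static char buf_a[FRAMELEN], buf_b[FRAMELEN];
	static char *cur = buf_a;	/* data currently in use */
	static char *next = buf_b;	/* data being prepared */

	/* Costly way: move FRAMELEN bytes on every update. */
	void update_by_copy(void)
	{
		memcpy(cur, next, FRAMELEN);
	}

	/* Cheap way: exchange the pointers; no bytes move at all. */
	void update_by_swap(void)
	{
		char *tmp = cur;
		cur = next;
		next = tmp;
	}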
gwyn@smoke.brl.mil (Doug Gwyn) (05/11/91)
In article <10550:May918:36:1291@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>This points to a productive area of research: how to make languages
>sufficiently expressive that all interesting optimizations *can* be
>expressed portably at the source-code level.

I think that would be heading in exactly the wrong direction.  Producing
good code is the task of the language translator, not the programmer.
The programmer should be able to concentrate on solving the problem in
abstract terms, letting the translator deal with low-level details.
torek@elf.ee.lbl.gov (Chris Torek) (05/11/91)
In article <453@smds.UUCP> rh@smds.UUCP (Richard Harter) writes:
>The point is that optimization of programs frequently involves better
>design of the data management process.

Indeed.  It has been observed (sorry, no references, but this is not only
popularly known but also true :-) ) that many computers spend most of
their time copying data, rather than `computing'.  This is largely why a
fast bcopy/memcpy/memmove is important.  A fast block copy is a wonderful
thing, but better yet is to eliminate the copies altogether.

If the copies are unavoidable, it may pay to compute while copying,
rather than having the sequence:

	memmove(saved_data, data, len);
	... work over the data ...

---particularly if `working over the data' involves a linear pass over
all the items.  (TCP is a good example of this: it must copy outbound
data somewhere for retransmission, and it must also compute a checksum.
These operations can be done at the same time.)

Of course, when optimizing, the place to start is with a profiler.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain: torek@ee.lbl.gov
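A minimal sketch of the compute-while-copying idea (a simple additive
checksum stands in here for TCP's real one's-complement sum): the copy
loop and the checksum loop fuse into a single pass over the data.

	#include <stddef.h>

	/* Copy len bytes from src to dst, and return a simple additive
	 * checksum computed during the same pass over the data.
	 */
	unsigned long copy_and_sum(char *dst, const char *src, size_t len)
	{
		unsigned long sum = 0;
		size_t i;

		for (i = 0; i < len; i++) {
			dst[i] = src[i];		/* the copy... */
			sum += (unsigned char) src[i];	/* ...and the sum */
		}
		return sum;
	}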
rh@smds.UUCP (Richard Harter) (05/12/91)
In article <13089@dog.ee.lbl.gov>, torek@elf.ee.lbl.gov (Chris Torek) writes:
> Indeed.  It has been observed (sorry, no references, but this is not
> only popularly known but also true :-) ) that many computers spend most
> of their time copying data, rather than `computing'.  This is largely
> why a fast bcopy/memcpy/memmove is important.  A fast block copy is
> a wonderful thing, but better yet is to eliminate the copies altogether.

True enough.  One of the traps in the construction of large software is
that when one breaks a program up into components, there is a real
tendency to copy data from component to component as a matter of
convenience.

For example, suppose we have components A, B, and C with data flowing
from A to B to C.  The easy way to deal with this is to make each
component its own source and sink.  I.e., A creates a data set, passes it
to B, and then disposes of it after B has completed.  B copies its input
into its own area, passes its processed data to C, and then disposes of
it; C does the same thing.  Stated this baldly, it is clear that A should
be the source, C the sink, and nobody should copy.  However, in the
general case A, B, C, etc. will be defined and implemented independently,
i.e. modularly.  If each module has responsibility for creating and
disposing of its own data, then inter-module coupling is minimized and
there are no nasty questions about who disposes of what.

Which, of course, is why it pays to use object-oriented techniques even
in non-OO languages.  In this case it would pay to package the data set
with a reference count and a module of data-set methods.  Each user
increments the count upon access and decrements it upon dispose.  When
the count goes to 0, the actual physical disposition is done.
[Operations hidden behind macro or function calls, of course.]
-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh    Phone: 508-369-7398
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb.  This sentence short.  This signature done.
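A minimal sketch of such a reference-counted data set (all names here are
hypothetical): each module calls ds_hold when it takes an interest in the
data and ds_release when it is done, and only the final release actually
frees the storage, so A can hand the set to B, and B to C, with no copies
at all.

	#include <stdlib.h>

	/* The data set packaged with its reference count. */
	struct dataset {
		int refs;	/* number of modules still using the data */
		char *data;
		size_t len;
	};

	struct dataset *ds_create(size_t len)
	{
		struct dataset *ds = malloc(sizeof *ds);

		if (ds == NULL)
			return NULL;
		ds->data = malloc(len);
		if (ds->data == NULL) {
			free(ds);
			return NULL;
		}
		ds->refs = 1;	/* the creator holds the first reference */
		ds->len = len;
		return ds;
	}

	/* A module takes an interest in the data. */
	void ds_hold(struct dataset *ds)
	{
		ds->refs++;
	}

	/* A module is done; the last one out does the real dispose. */
	void ds_release(struct dataset *ds)
	{
		if (--ds->refs == 0) {
			free(ds->data);
			free(ds);
		}
	}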