ram+@cs.cmu.edu (Rob MacLachlan) (02/16/91)
So what about the size of Common Lisp?  Is Lisp a VAX?  Consider the similarities:
 -- Big (too big to run on a VAX)
 -- Old (older than a VAX)
 -- Slow
 -- Sold by DEC

What lessons (if any) can language designers draw from the victory of RISC?
 -- Choose primitive primitives and migrate complexity upward.
 -- At any given level, implement only those features that are
    cost-effective.
 -- Optimize operations with a high dynamic frequency in favor of rare
    operations.

Note that the lesson of RISC is *not* "choose simple systems."  Considering the additional complexity of the compiler (due to the optimization a RISC processor demands), a RISC-based programming system is more complicated overall.  A solution should be as simple as possible, and no simpler.

Consider the minimalist-maximalist dimension of the system design space:
 -- A minimalist design is as simple as possible (in hopes of low
    production costs).
 -- A maximalist design is as complex as is useful (in hopes of
    maximizing flexibility and ease of use).

RISC was initially publicized using largely minimalist arguments: "Our chip is faster because it has fewer instructions."  Although the "RISC philosophy" has won over almost all hardware architects, the minimalist argument has been largely rejected as simplistic.  This is why you see RISC machines with hundreds of instructions.  The key point is that although a simple CPU design is inherently cheaper to design and manufacture, it is not true that the simplest CPU chip invariably produces the most cost-effective computer *system*.

A minimalist programming system is based on a small set of cheap operations.  Whenever you need another operation, you implement it using these cheap operations.  At least for simple problems, this results in programs that use about as little hardware as possible.  There are two major problems with minimalist programming systems:
 -- The minimalist design goals totally ignore the cost of programming.
 -- The hardware efficiency advantage of minimalist systems is only
    clear-cut with minimalist programming problems.

Today there is no such thing as a minimalist programming system.  All programming systems (assembler, C, whatever) have accepted some reduction in hardware efficiency in exchange for an increase in ease of programming.  This is widely accepted as true, and as a "good thing".  [But programming systems still do differ in their position on the minimalist-maximalist continuum.  More later...]

It is less widely acknowledged that minimalist systems aren't that much more efficient for solving complex problems.  Could you write X much more efficiently in assembler than its current C implementation?  Before you answer "Hell yes!", I point out that I mean X11, xlib, resources, widgets, xtk and motif, with total software pin-for-pin compatibility with the C version.

Of course, nobody would attempt such a thing.  If someone wanted a maximally efficient window system, they would start by throwing out huge amounts of "unnecessary" functionality.  Only after they had finished crafting a minimal window system for their application would they consider anything drastic like assembly coding.

Everybody agrees that X contains lots of unnecessary stuff.  But given the evident truth of this, it is amazingly difficult to find features that almost everyone agrees are unnecessary.  This is "design by committee" in a nutshell.  If you were writing a window system for just you to use, then you really could throw out all the junk.  But if you want your office-mate to help you, then you may find he has his own ideas.  Pretty soon you have keymaps and environment variables.  And he may want to draw splines.

The bottom line is that today's programming systems are used by many people.  This means that a programming system must please many people's tastes and cater to many people's needs.
This means there will be some mismatch between your needs and the available offerings, but there is *no better way*.  If you want to buy a car, but you hate arm-rests, it is cheapest to buy a Ford off the show-room floor and cut off the arm-rests.  Any system over a certain small complexity has to be a multi-person effort.  At this point, the minimalists are no longer offering an alternative -- they are merely bitching that systems that complex *should not be*.  There are some things man was not meant to know...  God's will, Dijkstra's will, whatever.

But suppose you did decide to reimplement on top of a minimalist system.  If so, there is a real risk that reimplementing "just what you need" will result in a less efficient program.  To be sure, if you were as wizardly as the X implementors and spent as much time tuning as they did, then you could get the same or better efficiency for your operation subset.  But if you say, "2-d trees are too hairy, I'll use a linear search", then you may be in for a surprise.

I think that in reality, the biggest efficiency advantage of minimalist programming systems is not any inherent advantage of reimplementing all the time, but rather that:

    **** Minimalist operations provide a good Efficiency Model ****

An efficiency model is a tool the programmer uses to tell how expensive a particular code fragment will be.  In the current state of the art, application of efficiency models is purely manual: you stare at the code.  If you see a lot of operations or a lot of repetitions, then you know the code is expensive.  In C, the efficiency model is so straightforward that Jon Bentley can publish an article telling you how to predict the performance of your C program with 80% accuracy.  In contrast, most of today's Lisp programmers are working with internal efficiency models having perhaps 5% accuracy.  And every time somebody writes a Lisp program 20x slower than a C program, people conclude "Lisp is terribly inefficient."
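To make the "efficiency model" idea concrete, here is a minimal sketch in C, in the Bentley back-of-envelope spirit.  The per-operation cycle costs are invented for illustration (they come from no real machine, and not from Bentley's article); the point is that in C the mapping from source operations to machine operations is direct enough for this kind of counting to be usefully accurate:

```c
#include <stddef.h>

/* Hypothetical per-operation cycle costs -- illustrative numbers only,
   not measurements of any particular processor. */
enum { COST_LOAD = 2, COST_ADD = 1, COST_CMP_BRANCH = 2 };

/* The code fragment being modeled: a simple summation loop. */
long sum(const long *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];          /* one load, one add, one compare-and-branch */
    return s;
}

/* The efficiency model: each iteration costs one load, one add, and one
   compare-and-branch, so the loop costs roughly n times their sum.  In
   Lisp, hidden type dispatch and consing can make the same-looking loop
   cost far more than any such count predicts. */
long estimated_cycles(size_t n)
{
    return (long)n * (COST_LOAD + COST_ADD + COST_CMP_BRANCH);
}
```

The manual "stare at the code" step is exactly this counting exercise, done in the programmer's head.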
O.k., now I've stopped baffling you with bullshit about window systems, and we're back to the Common Lisp vs. Scheme vs. C wars.  And RISC vs. CISC too!  Way to go...  I'll restate my RISC lessons here so that you don't have to page back:
 -- Choose primitive primitives and migrate complexity upward.
 -- At any given level, implement only those features that are
    cost-effective.
 -- Optimize operations with a high dynamic frequency in favor of rare
    operations.

Remember those 100-instruction multi-million transistor RISCs?  Well, the reason that new instructions creep into RISC processors is one of the reasons that new functions creep into Lisp.  Integer multiplication was added to the SPARC because some people use it a lot, and it is a pain for each of them to design their own hardware multiplier and hook it into the motherboard.  Similarly, Common Lisp has hash-tables because some people use them a lot, and it is a pain for everyone to have to design their own and hook them into the garbage collector.  With either integer multiplication or hash-tables it is possible to come up with high-level software solutions that don't require a soldering iron, but you will pay a big efficiency hit.

The *other* reason why functions creep into Common Lisp is a strictly maximalist one: to get economies of scale by standardizing the interfaces to useful functionality.  In a word, reuse.  In fact, this is a lot of the reason that CISCs got so complicated: not because the hairy instructions were much faster, but because they were easier to use in assembly language.  RISC has to some degree punted on the interface standardization issue by observing that only compilers and assemblers need to know the interface to the hardware.  In other words, they pushed the interface up a level.  Programming systems can do this too, but at some point, there has to be a top level that programmers can recognize.
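For the curious, the "software solution that doesn't require a soldering iron" for integer multiplication is the classic shift-and-add loop.  A sketch in C (sticking to one language for examples; the same structure applies whether a compiler emits it or a programmer writes it by hand):

```c
#include <stdint.h>

/* Multiply without a multiply instruction: shift-and-add.  Correct, but
   it takes up to 32 iterations of shifting and adding where a hardware
   multiplier takes a single instruction -- the "big efficiency hit" of
   the purely software solution. */
uint32_t soft_mul(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    while (b != 0) {
        if (b & 1)          /* low bit of multiplier set: */
            result += a;    /*   add in the shifted multiplicand */
        a <<= 1;            /* shift multiplicand left one place */
        b >>= 1;            /* consume one bit of the multiplier */
    }
    return result;
}
```

The hash-table analogy is the same: you can always build one out of cheaper operations, but the built-in, GC-integrated version is what makes it cheap to use.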
Pushing interfaces up a level this way strongly encourages standardization of the intermediate layers as well (to avoid duplication of effort).

Common Lisp is definitely a maximalist programming system.  Does this mean that it is a total loss for efficiency?  I think not, but there are some major efficiency challenges:
 1] Make better efficiency models for Common Lisp.  Add automatic
    compiler efficiency notes that flag simple problems, and document
    advanced compiler capabilities so that programmers can exploit them.
 2] Eliminate unnecessary inefficiencies in Common Lisp by adding
    complexity to the implementation:
     - Compiler optimizations provide graceful performance degradation
       for less-tuned programs.
     - Generational GC makes consing cheap and eliminates offensive GC
       pauses.
 3] Make simple Common Lisp programs comparably efficient to simple C
    programs.  This is the "delivery" problem.

The Python compiler for CMU Common Lisp has already addressed 1 and 2.  A case study is Scott Fahlman's "export" connectionist AI simulator.  This is a medium-size float-intensive application written by someone with a good Lisp efficiency model.  It was written well before Python's number-crunching capabilities were designed, and was tuned to perform well in the then-available Lisp systems.  When first compiled with Python, there were more than 100 efficiency notes.  Even with all these problems, the program ran 2x faster than the commercial Lisp on the same machine (I won't say whether it was Franz or Lucid.)  After fixing the type declaration problems revealed by these notes, it was 5x faster.

Interestingly, a version of this program had been recoded in C for speed.  The Python version took only 1.6 times as long as the version built with the MIPS C compiler.  Floating point?  Who cares?  That's "unnecessary"...  The point is that with sufficient effort and complexity in the Lisp implementation, users can fairly easily come within a factor of 2 of C, even for problems that seem "un-Lispy".
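To see why type declarations buy so much in a float-intensive program, consider what an undeclared number costs at runtime.  The sketch below is in C and entirely invented for illustration -- the struct, tag, and function names correspond to no real Lisp implementation, and freeing is ignored for brevity -- but it shows the shape of the problem: generic arithmetic must dispatch on type tags and heap-allocate a fresh box for every result, while declared arithmetic compiles down to a bare machine add:

```c
#include <assert.h>
#include <stdlib.h>

/* A caricature of an undeclared Lisp float: a tagged, heap-allocated
   box.  (Invented representation, for illustration only.) */
#define TAG_FLOAT 1
typedef struct { int tag; double value; } boxed;

static boxed *box(double x)
{
    boxed *b = malloc(sizeof *b);   /* every result costs an allocation */
    b->tag = TAG_FLOAT;
    b->value = x;
    return b;
}

/* Generic add: check tags, unbox, add, box the result.  A type
   dispatch plus a heap allocation per addition. */
static boxed *generic_add(const boxed *a, const boxed *b)
{
    assert(a->tag == TAG_FLOAT && b->tag == TAG_FLOAT);
    return box(a->value + b->value);
}

/* What the compiler can emit once declarations pin the types down:
   a raw machine add -- no tags, no allocation. */
static double declared_add(double a, double b)
{
    return a + b;
}
```

Fixing an efficiency note typically means moving an operation from the first form to the second, which is where the 2x-to-5x jumps in the case study come from.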
In effect, the efficiency model has been improved from 5% to 50% accuracy.

As for 3 above (the delivery problem), there is still a lot of work to be done there.  Both Lucid and Franz are working on it, and I have some ideas too.  The problem is how to get a relatively small binary for a delivery system.  The difficulty is in figuring out how to do this without throwing the baby out with the bath-water.  People don't use Lisp instead of C because they like its funky syntax.  They use Lisp instead of C because it has lots of useful stuff built in, and is easy to extend.  Unfortunately, those are the exact same reasons that "ld" doesn't work well on Lisp programs.

What I find more interesting than the prospect of making FORTRAN-in-Lisp as fast as FORTRAN or C-in-Lisp as small as C is the prospect of making Lisp-in-Lisp as fast as possible.  Now we have FORTRAN efficiency in Lisp, but how about making efficient use of all the stuff that's in Common Lisp but not FORTRAN?  Although that's lots of stuff, I assure you that there are few useless features.  I recently read through the index of functions, macros and special forms in CLtL, and it was rare to find an operation I had never used (and there was none whose purpose I didn't know).

The real answer to "Why is Common Lisp so big?" is "Compared to what?"  If your system is very small compared to Common Lisp, then maybe you should attack a more complex problem.  Big systems easily exceed the size of Common Lisp, making the penalty for having all that stuff around relatively small.  As systems become more complex, the original CLtL I Common Lisp will come to seem less and less complex.  Remember ALGOL with no I/O?  Fortunately, Common Lisp is growing to keep up with progress in computer science.

Robert A. MacLachlan (ram@cs.cmu.edu)