[comp.lang.smalltalk] Translating Smalltalk

budd@orstcs.cs.ORST.EDU (06/08/87)

Somebody asked about translating Smalltalk into other code, and how message
passing is done.  Note that the dynamic message passing portion of
smalltalk is not the problem in translating Smalltalk to machine code;
I will use C as my target language, as it is more readable; We can easily
translate

	object send: arg1 message: arg2 with: arg3 arguments: arg4

into some C code that looks like:

*(FindMethod(object,'send:message:with:arguments))(arg1,arg2,arg3,arg4)

That is, we write a little routine that does our message lookup for us,
returns a method (function) and then executes it.  This is how object
oriented programming is done in Scheme, for example.  The need for memory
management complicates this only slightly.
If no method exists FindMethod returns a pointer to an error reporting
routine.

The real problem in translating Smalltalk is the notion of Blocks.  Blocks
must carry with them the context in which they were defined, and there is
no way to easily simulate this dynamic binding is a staticly scoped
language like C.  (This is a classic funarg problem).

Which leads one to wonder if a language similar to Smalltalk, but without
blocks, could be defined.  I think yes.  And if this were done, could you
build interpreters/compilers and generate efficient machine code without
sacrificing the nice features of Smalltalk; i.e., experimental programming,
rapid prototyping, reusability.  I think yes.  Finally, if you did this,
could you use optional declarations of object class to speed things up by 
eliminating the method lookup; (the idea being that you produce and debug 
your code, then insert declarations to speed it up); I think yes.  

Watch this space for future announcements.

--tim budd, oregon state university,  budd@oregon-state.csnet

susser@parcvax.UUCP (06/10/87)

[burp]

In article <245100008@orstcs> budd@orstcs.cs.ORST.EDU (Tim Budd) writes:
>
>The real problem in translating Smalltalk is the notion of Blocks.  Blocks
>must carry with them the context in which they were defined, and there is
>no way to easily simulate this dynamic binding is a staticly scoped
>language like C.  (This is a classic funarg problem).
>
>Which leads one to wonder if a language similar to Smalltalk, but without
>blocks, could be defined.  I think yes.  And if this were done, could you
>build interpreters/compilers and generate efficient machine code without
>sacrificing the nice features of Smalltalk; i.e., experimental programming,
>rapid prototyping, reusability.  I think yes.  Finally, if you did this,
>could you use optional declarations of object class to speed things up by 
>eliminating the method lookup; (the idea being that you produce and debug 
>your code, then insert declarations to speed it up); I think yes.  
>
>Watch this space for future announcements.
>
>--tim budd, oregon state university,  budd@oregon-state.csnet

(Apply flame retardant here...)

All of us who have experience with Smalltalk acknowledge the need for better performance from Smalltalk systems. But I think there is an undesirable trade-off of functionality for performance with the strategy you describe.

Sure, you can build Smalltalk without blocks, but the ability to abstract code as data is an extremely powerful tool that makes all sorts of things possible, and Smalltalk would not be nearly as wonderful without it. Try thinking about Lisp without function variables.

Likewise for static-typed early-bound method lookup. This strategy will indeed give you some performance increase, but it destroys the ideologically pure polymorphic nature of Smalltalk. With a completely late bound system, the level of interface is message protocol. A method doesn't care what kind of object it is sending a message to, as long as it can respond properly. I could build a ByteArray subclass LargeFloat that acts like an infinite-(or nearly)-precision Float, and it would work everywhere that Float would. Once you have early-binding, all bets are off. Somebody is going to early bind some methods that refer to Floats, and my new class won't work. There is a huge difference between a polymorhic system and a mostly polymorphic system. I think we can do better that what you propose. 

Much of Smalltalk's apparently bad performance has to do with processor technology. A 68000 has a great deal of support for running procedural languages, but none for late-bound ones. Even so, a 68000 runs Smalltalk fairly well. Imagine the performance one could get from a processor designed to run a language like Smalltalk. Fortunately, there are a number of people working on building machines to do just that.

It used to be that it was important to have code that ran blindingly fast, even if somebody had to spend two years writing it, because processor power wasn't that great. Now, as processors become more powerful, it is becoming more important to write code quickly even if it runs a little slower. As processor speed approaches the unbelievable, programming speed will become more important. If I can write something in Smalltalk in a month that runs at speed X and you can write something in C in a year that runs at speed 2X then I win because everyone is going to be buying my program for 11 months before yours is even done.

At least that's how it would work in an equitable world.

Flames gladly accepted.

--Josh Susser
  Susser.pasa@Xerox.com
  susser@parcvax.xerox.com

"My people are the people of the dessert,"
said T.E.Lawrence, picking up his fork.

johnson@uiucdcsp.UUCP (06/12/87)

Blocks are not what makes Smalltalk slow.  Evaluating a block is fairly
fast, and creating it is not all that bad, either.  Blocks slow down
Smalltalk in two ways: contexts have to be allocated on the heap instead
of on a stack and it is hard for the compiler to detect and optimize
loops.  (The compiler does a good job of optimizing IF statements.)
The first problem can be reduced by allocating contexts on a stack and
converting them into heap-based objects only when required.  PS does this.

It is certainly feasible to eliminate blocks, since Trellis/Owl shows
how it can be done.  However, blocks are nice, and I like the ability
to invent my own control structures.  

Method lookup is not what makes Smalltalk slow, either.  Most Smalltalk
systems use various caching schemes to eliminate method lookup.  PS uses
in-line caching to achieve a very high hit rate (95%) at a low cost,
i.e. only a few 68000 instructions per message send.

Dave Ungar's analysis of SOAR revealed that 75% of the time was spent in
the primitives.  Thus, one could argue that the reason why Smalltalk is
slow is that primitives are too slow.  My explanation of this is that
each primitive has to check its arguments and to do some little calculations
that the next primitive will probably redo.  What is really needed is
an optimizing compiler that optimizes over a group of primitives.

My students and I are in the process of writing such a compiler here at
Illinois.  The language is not 100% Smalltalk, because we have had to
introduce types.  If the compiler could perform type inference then the
compiler would accept normal Smalltalk programs, but as it is, the programmer
must enter the declarations.  However, we are trying hard to maintain the
interactive programming and rapid-prototyping features of Smalltalk, so
we try to support most normal Smalltalk programming practices.  I am
confident that we will be able to generate efficient code, but I am less
confident that we will not damage the programming environment.

Since primitives are what take the time, it might make sense to build
hardware that executes them fast.  On the other hand, special purpose
hardware is always more expensive than general purpose hardware, and
people who already have computers will prefer not to have buy another
just to run a particular programming language.  Thus, a software solution
has many advantages over a hardware solution.  (And in any case, I'm not
a hardware person!)

PS is described in a paper by Deutsch and Schiffman in the 1984 POPL.
Ungar's thesis is coming out as one of the Distinguished Theses series
from MIT press.  If you write to me, I can send you some tech reports
on my work.  My e-mail address is johnson@p.cs.uiuc.edu (internet) or
{ihnp4, ...}!uiucdcs!johnson.  My U.S. mail address is

	Ralph Johnson
	Department of Computer Science
	240 Digital Computer Lab
	1304 West Springfield Ave.
	Urbana IL 61801

willc@tekchips.UUCP (06/13/87)

>In article <245100008@orstcs> budd@orstcs.cs.ORST.EDU (Tim Budd) writes:
>
>The real problem in translating Smalltalk is the notion of Blocks.  Blocks
>must carry with them the context in which they were defined, and there is
>no way to easily simulate this dynamic binding is a staticly scoped
>language like C.  (This is a classic funarg problem).
>
>Which leads one to wonder if a language similar to Smalltalk, but without
>blocks, could be defined...

I don't see the incongruity between Smalltalk blocks and static scope rules.
Scheme is a statically scoped language with two independent features that,
taken together, dominate Smalltalk blocks.  (Anything a Smalltalk block can
do can be done by some simple combination of Scheme's first class procedures
and first class continuations.)  Scheme can also be made to run significantly
faster than Smalltalk.  So why are blocks an excuse for poor performance, and
what has any of this to do with static scoping?

In fairness to Dr Budd, he was answering a question about translation from
Smalltalk into C, which supports neither first class procedures nor first
class continuations.  My point is that the lack of such support should be
distinguished from a language's scope rules, though one can easily argue
that first class procedures (both "upward" and "downward" "funargs")
_require_ static scope rules.

Dr Budd might also point out that blocks are neither well-defined nor defined
well in Smalltalk-80, and he would be right.

In article <346@parcvax.Xerox.COM> susser@parcvax.xerox.com.UUCP (Josh Susser) writes:
>Likewise for static-typed early-bound method lookup. This strategy will
>indeed give you some performance increase, but it destroys the ideologically
>pure polymorphic nature of Smalltalk.

If I could have a purely polymorphic, statically typed, and early bound
language with good performance, I think I could live without the ideology.
Mr Sasser's argument that followed the sentence quoted above assumed that we
must choose between compile-time ("early") binding and run-time binding.
What's wrong with getting the best of both worlds via link-time binding?
Not that there aren't other ways to win, too, but I don't see why you need
run-time binding unless you like to write programs that create and install
new methods on the fly.

I realize that some people like this sort of thing, but I also understand
why run-time binding is an excuse for poor performance.

>Flames gladly accepted.

And gladly offered.  I'm not anti-Smalltalk by any means.  I'm just trying
to look at things objectively, taking the good and leaving the not-so-good.
Peace,

William Clinger
Tektronix Computer Research Lab