[comp.lang.scheme] Byte code speed

NETWORK@FRSAC11.BITNET (12/18/87)

Date: 18 December 1987, 15:38:49 GMT
From: NETWORK  at FRSAC11
To:   SCHEME at MC.LCS

A while ago there was some mail about benchmarking Scheme, what's
new about that ?
Using the very simplistic (fib 20) test I was surprised to get very
good results on my M24 running TI PC Scheme 2.0, and quite good one
for MacScheme. This compared to straight interpreters. (Lisp interpreters,
vs Byte code interpreters).

My question is: Where can I get description of the byte code used by
TI PC scheme ? How about the compiler and the byte code interpreter ?

(I was quite surprised to see that my 8Mz 8086 run at the same speed than
the IBM 3090-150 cpu, the difference beeing TI PC Scheme on the tiny
box, and CScheme 5.3 on the big blue one) (around 5 sec for (fib 20)
on each).

Happy new year.

Jean-Pierre H. Dumas

network@frsac11 (bitnet)
network%frsac11.bitnet@mitvma.mit.edu (arpanet)
dumas@sumex-aim.stanford.edu (arpanet)

bartley@home.csc.ti.COM (David Bartley) (12/19/87)

> From: NETWORK%FRSAC11.BITNET@mitvma.mit.edu
> A while ago there was some mail about benchmarking Scheme, what's
> new about that ?
> Using the very simplistic (fib 20) test I was surprised to get very
> good results on my M24 running TI PC Scheme 2.0, and quite good one
> for MacScheme. This compared to straight interpreters. (Lisp interpreters,
> vs Byte code interpreters).
> 
> My question is: Where can I get description of the byte code used by
> TI PC scheme ? How about the compiler and the byte code interpreter ?
>    [...]
> Jean-Pierre H. Dumas

The only description of our work in the open literature is the paper
entitled ``The Implementation of PC Scheme'' in the proceedings of the
1986 ACM Conference on Lisp and Functional Programming, pp 86-93.
Although it doesn't disclose nitty-gritty details, the paper explains
the general nature of our byte-threaded-coded virtual machine and
gives some examples of its instruction set.  It also has an overview
of the compiler and a listing of the 8088 code for the 5-instruction
``next'' operation of the emulator.

> As far as I have observed PC Scheme GC somewhat less than CSCheme,
> From old popular wisdom (here, in France) it is said that the speed
> of an interpreter is strongly related to the use of "CONS" for the
> own use of the interpreter itself, and I have the feeling that
> PC Scheme "conses" less than CScheme too. (I am surely wrong, it
> is just a feeling. But the popular wisdom seems solid.)

Our emulator does no consing on its own except to extend the stack
when it overflows.  Until release 3.0, PC Scheme had no interpreter at
all.  It now mixes interpretive and compiled execution in the
following way: An S-expression given to EVAL will be interpreted until
a subproblem is encountered which involves variable binding (e.g., a
LAMBDA, LET, or LETREC).  The subproblem is then compiled and executed.
As a result, simple expressions like (DEFINE X 3) and X are interpret-
ed, but the procedure in (DEFINE F (LAMBDA (X) ...)) is always compiled.

Regards,
David Bartley

jameson@calgary.UUCP (Kevin Jameson) (12/20/87)

Can anyone tell me anything about public domain Scheme implementations 
which will run on a PC?  GNU Scheme, I am told, is too big.
Thanks.

...{ihnp4,ubcvision}!alberta!calgary!vaxb!jameson

willc@tekchips.tek.COM (Will Clinger) (12/22/87)

David Bartley of TI wrote concerning PC Scheme:

    Our emulator does no consing on its own except to extend the stack
    when it overflows.

In the sense intended by David this is true of MacScheme 1.5 also, but
it depends on what is meant by "on its own".  Besides the byte codes that
obviously allocate storage, such as cons and make-vector, there are byte
codes such as + that allocate storage on occasion.  Of more interest are
the byte codes that create closures, that allocate lists for rest arguments,
and that allocate heap storage for variables that are closed over by lambda
expressions (unless this is done by the byte codes that create closures).
Clearly both PC Scheme and MacScheme have such byte codes.

I think the French wisdom is that fast interpreters do not allocate
garbage-collected storage for argument lists, implicit temporaries,
or continuations.  This is true of PC Scheme and MacScheme 1.5, but older
versions of MacScheme were significantly slower than Version 1.5 in part
because they allocated continuations on the heap.  It's interesting that
the original design back in 1983 called for stack-allocated continuations,
but the heap-allocated continuations that were used as a temporary
measure during the bootstrap ran fast enough (because of a fairly fast
garbage collector) that there wasn't much pressure to change it until the
native code compiler was added this spring (by Eric Ost and Anne Hartheimer).

Peace, Will Clinger