[comp.lang.forth] JSR-threading

wmb@sun.uucp (Mitch Bradley) (12/14/86)
There seems to be a lot of interest right now in JSR threading.  I would
like to play devil's advocate for a minute and talk about its downside.
I do not dislike jsr-threading, but I have been hearing a lot of claims
about it which are perhaps a bit over-optimistic.

1) JSR-threading by itself is not necessarily faster than threaded code.
   It depends strongly on the processor architecture, but on the 68000,
   a JSR/RTS pair is SLOWER than the obvious direct-threaded NEXT routine
   (which is MOVE (IP)+,A0   JMP (A0) ).  It is slower because JSR/RTS
   has to PUSH/POP a longword return address on the stack.  Entering
   a colon definition is somewhat faster, but measurements show that
   NEXT overhead at the leaves of the call tree (the CODE words) dominates
   the time.

2) On the other hand, JSR-threading does allow code words to be compiled
   in-line, thus eliminating the NEXT overhead entirely.  This is how
   products like MACH and JFORTH get their speed.

3) In-line compilation has its drawbacks though.  In particular, it makes
   decompilation very hard.  Threaded code can be decompiled to a very
   close approximation of the original source code.  A good decompiler can
   be a very addictive tool.

4) Contrary to popular opinion, JSR-threading doesn't suddenly "open up
   the possibility" of compatibility with other languages.  C subroutines,
   for example, can be called from threaded code implementations with
   approximately the same degree of difficulty as with JSR-threaded Forths.
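
To make points 1 and 2 concrete, here is a toy sketch in portable C, not
68000 code and not taken from any real Forth.  Every name in it (cell,
run_thread, square_threaded, and so on) is made up for illustration.  The
threaded version pays one fetch-and-dispatch (NEXT) per word; the in-line
version does the same arithmetic with no dispatch at all.  It says nothing
about actual 68000 cycle counts.

```c
#include <assert.h>

/* Hypothetical toy VM: a thread is an array of cells, each holding either
   a pointer to a code word or an in-line literal. */
typedef void (*code_fn)(void);
typedef union cell { code_fn fn; long n; } cell;

enum { STACK_CELLS = 32 };
static long stack[STACK_CELLS];
static int sp;                 /* data stack depth */
static const cell *ip;         /* instruction pointer into the thread */
static int running;

static void push(long v) { assert(sp < STACK_CELLS); stack[sp++] = v; }
static long pop(void)    { assert(sp > 0); return stack[--sp]; }

/* Code words (the leaves of the call tree). */
static void lit(void)  { push((ip++)->n); }            /* push in-line literal */
static void dup_(void) { long a = pop(); push(a); push(a); }
static void mul(void)  { long b = pop(), a = pop(); push(a * b); }
static void bye(void)  { running = 0; }                 /* stop the interpreter */

/* Inner interpreter: each trip around the loop is one NEXT. */
static long run_thread(const cell *thread)
{
    sp = 0;
    ip = thread;
    running = 1;
    while (running) {
        code_fn w = (ip++)->fn;   /* fetch the next word ...  */
        w();                      /* ... and jump to its code */
    }
    return pop();
}

/* Threaded equivalent of  : SQUARE  DUP * ;  applied to x. */
static long square_threaded(long x)
{
    cell thread[5];
    thread[0].fn = lit;  thread[1].n = x;
    thread[2].fn = dup_;
    thread[3].fn = mul;
    thread[4].fn = bye;
    return run_thread(thread);
}

/* What in-line compilation amounts to: the same work, no NEXT. */
static long square_inline(long x) { return x * x; }
```

Calling square_threaded(7) runs four dispatches to do what
square_inline(7) does in one multiply.  That gap is what in-line
compilation recovers, and it is also why the in-lined code no longer
contains a list of word addresses for a decompiler to walk.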

I did a JSR-threaded, in-line code implementation a while ago.  I used it
a little, then went back to direct-threaded code.  JSR-threading benchmarked
better, but in practice, it didn't feel much faster in most cases, because

a) I rarely write computationally-intensive code
b) Most annoying waits are due to i/o
c) The old adage that the algorithm is more important than the compiler
   is absolutely right.

I found the lack of good decompilation so frustrating that the small
improvement in speed paled in comparison.

Please do not flame me if your tradeoff criteria are different from mine.

		Appendix: Calling C from Forth

In case anyone is interested, here is a list of issues that have to be
resolved when calling C routines from Forth and vice versa.  Note that
none of these have anything to do with JSR-threading.

These problems are by no means insurmountable; my Forth system for the
Sun can dynamically load Unix .o files quite nicely.

a) C and Forth use registers differently.  This requires that certain
   registers be saved when making the transition between C and Forth,
   and other registers reloaded.  The most fundamental difference
   is that Forth has 2 stacks whereas C has one.
b) Forth and C have different data types, so arguments often have to
   be converted.  Example: Forth uses either packed (count byte) or
   "address length" strings, whereas C uses null-terminated strings.
   You also have to worry about Forth's 16-bit/32-bit/double-number
   nonsense versus C's rather weak specification of the meanings of
   int/short/long.
c) Most systems do not provide convenient tools for creating and
   incrementally linking object code files.  In some cases, the object
   file format is not even documented.  (Unix System V with its Common
   Object File Format seems to be a step in the right direction, except
   that COFF is so horribly complicated).
d) Forth and C have fundamentally different notions of memory allocation.
   Forth wants to think that it has a huge contiguous chunk of memory
   that it can do anything at all with, whereas C has a (possibly write-
   protected) text segment, a data segment, a bss segment, and a stack
   segment.  Furthermore, C can get more memory with malloc() or calloc()
   or something similar.  (Bill Sebok has solved this problem in all its
   generality, although his solution is rather involved.  See his paper
   in "The Journal of Forth Application and Research", Vol. 3, No. 2, 1985
   for details.)
e) (This one is really nasty) The interface to "modules" of C code is
   specified in terms of C preprocessor "#include" files.  These files
   can contain defined constants (no problem), structure definitions
   (troublesome but not serious), and fragments of C code (aargh!!!).
   In order to teach Forth the interface to the C code, you have to
   convert all these things into something Forth can understand.
   In individual cases, it isn't too bad to do this manually, but it
   soon gets tedious, and it is hard to automate in the general case.
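
As a small illustration of item (b), here is roughly what the string
conversion looks like on the C side.  This is only a sketch under
assumptions: the function names forth_to_cstr and counted_to_cstr are
invented for this example, and it assumes the caller supplies a
destination buffer rather than allocating one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy an "address length" Forth string into dst as a null-terminated
   C string.  Returns 0 on success, -1 if dst cannot hold len bytes plus
   the terminator.  (Hypothetical helper, not from any real system.) */
static int forth_to_cstr(const char *addr, size_t len,
                         char *dst, size_t dst_size)
{
    assert(addr != NULL && dst != NULL);
    if (len >= dst_size)
        return -1;
    memcpy(dst, addr, len);
    dst[len] = '\0';
    return 0;
}

/* Same conversion for a packed string, whose first byte is the count. */
static int counted_to_cstr(const unsigned char *counted,
                           char *dst, size_t dst_size)
{
    assert(counted != NULL);
    return forth_to_cstr((const char *)counted + 1, counted[0],
                         dst, dst_size);
}
```

Note that the copy is unavoidable in general, because the Forth string
has no room reserved for the terminating null, and that a Forth string
containing a zero byte will look truncated to the C routine.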