wmb@sun.uucp (Mitch Bradley) (12/14/86)
There seems to be a lot of interest right now in JSR threading. I would like to play devil's advocate for a minute and talk about its down side. I do not dislike JSR-threading, but I have been hearing a lot of claims about it which are perhaps a bit over-optimistic.

1) JSR-threading by itself is not necessarily faster than threaded code. It depends strongly on the processor architecture, but on the 68000, a JSR/RTS pair is SLOWER than the obvious direct-threaded NEXT routine, which is

        MOVE (IP)+,A0
        JMP (A0)

   It is slower because JSR/RTS has to push and pop a longword on the stack. Entering a colon definition is somewhat faster with JSR-threading, but measurements show that NEXT overhead at the leaves of the call tree (the CODE words) dominates the time.

2) On the other hand, JSR-threading does allow code words to be compiled in-line, thus eliminating the NEXT overhead entirely. This is how products like MACH and JFORTH get their speed.

3) In-line compilation has its drawbacks though. In particular, it makes decompilation very hard. Threaded code can be decompiled to a very close approximation of the original source code. A good decompiler can be a very addictive tool.

4) Contrary to popular opinion, JSR-threading doesn't suddenly "open up the possibility" of compatibility with other languages. C subroutines, for example, can be called from threaded code implementations with approximately the same degree of difficulty as with JSR-threaded Forths.

I did a JSR-threaded, in-line code implementation a while ago. I used it a little, then went back to direct-threaded code. JSR-threading benchmarked better, but in practice it didn't feel much faster in most cases, because

a) I rarely write computationally-intensive code.

b) Most annoying waits are due to i/o.

c) The old adage that the algorithm is more important than the compiler is absolutely right.

I found the lack of good decompilation so frustrating that the small improvement in speed paled in comparison.
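As a rough illustration of points 1) through 3), here is a minimal sketch in C rather than 68000 assembly. It is an analogy, not anyone's actual Forth kernel: every name in it (run_thread, f_thread, f_jsr, f_inline, name_of, decompile) is made up for the example, the loop in run_thread only stands in for NEXT, and because a C call through a function pointer is itself a call and return, the sketch says nothing about the relative timings on a real processor. What it does show is the shape of the three forms of the same word, and why the list-of-addresses form is the easy one to decompile.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*prim_t)(void);      /* a primitive (a "CODE word") */

long data_stack[16];               /* hypothetical tiny data stack */
int  depth = 0;

void push(long v) { assert(depth < 16); data_stack[depth++] = v; }
long pop(void)    { assert(depth > 0);  return data_stack[--depth]; }

void do_dup(void)  { long a = pop(); push(a); push(a); }
void do_plus(void) { long b = pop(); long a = pop(); push(a + b); }
void do_star(void) { long b = pop(); long a = pop(); push(a * b); }

/* Threaded form of  : F  DUP * DUP + ;  as a list of addresses.
   The NULL terminator is an assumption standing in for EXIT. */
prim_t f_thread[] = { do_dup, do_star, do_dup, do_plus, NULL };

/* Stand-in for NEXT: fetch the next address, advance, execute it. */
void run_thread(const prim_t *ip)
{
    while (*ip != NULL)
        (*ip++)();
}

/* JSR-threaded form: the body of the word is a run of subroutine calls. */
void f_jsr(void) { do_dup(); do_star(); do_dup(); do_plus(); }

/* In-line form: the primitive bodies are merged, so there is no
   per-word dispatch left, and no word boundaries either. */
void f_inline(void) { long x = pop(); x = x * x; push(x + x); }

/* Decompiling the threaded form: map each address back to a name. */
const char *name_of(prim_t p)
{
    if (p == do_dup)  return "DUP";
    if (p == do_plus) return "+";
    if (p == do_star) return "*";
    return "?";
}

/* Write the space-separated names of a thread into buf. */
void decompile(const prim_t *ip, char *buf, size_t size)
{
    assert(size > 0);
    buf[0] = '\0';
    for (; *ip != NULL; ip++) {
        const char *name = name_of(*ip);
        if (strlen(buf) + strlen(name) + 2 > size)
            return;                /* out of room: stop early */
        if (buf[0] != '\0')
            strcat(buf, " ");
        strcat(buf, name);
    }
}
```

With 3 on the stack, run_thread(f_thread), f_jsr() and f_inline() all leave 18. Only f_thread can be handed to decompile, which recovers the text DUP * DUP + from it; nothing comparable can be recovered from f_inline without disassembling and pattern-matching machine code.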
Please do not flame me if your tradeoff criteria are different from mine.

Appendix: Calling C from Forth

In case anyone is interested, here is a list of issues that have to be resolved when calling C routines from Forth and vice versa. Note that none of these have anything to do with JSR-threading. These problems are by no means insurmountable; my Forth system for the Sun can dynamically load Unix .o files quite nicely.

a) C and Forth use registers differently. This requires that certain registers be saved when making the transition between C and Forth, and other registers reloaded. The most fundamental difference is that Forth has 2 stacks whereas C has one.

b) Forth and C have different data types, so arguments often have to be converted. Example: Forth uses either packed (count byte) or "address length" strings, whereas C uses null-terminated strings. You also have to worry about Forth's 16-bit/32-bit/double-number nonsense versus C's rather weak specification of the meanings of int/short/long.

c) Most systems do not provide convenient tools for creating and incrementally linking object code files. In some cases, the object file format is not even documented. (Unix System V with its Common Object File Format seems to be a step in the right direction, except that COFF is so horribly complicated.)

d) Forth and C have fundamentally different notions of memory allocation. Forth wants to think that it has a huge contiguous chunk of memory that it can do anything at all with, whereas C has a (possibly write-protected) text segment, a data segment, a bss segment, and a stack segment. Furthermore, C can get more memory with malloc() or calloc() or something similar. (Bill Sebok has solved this problem in all its generality, although his solution is rather involved. See his paper in "The Journal of Forth Application and Research", Vol. 3, No. 2, 1985 for details.)

e) (This one is really nasty) The interface to "modules" of C code is specified in terms of C preprocessor "#include" files. These files can contain defined constants (no problem), structure definitions (troublesome but not serious), and fragments of C code (aargh!!!). In order to teach Forth the interface to the C code, you have to convert all these things into something Forth can understand. In individual cases, it isn't too bad to do this manually, but it soon gets tedious, and it is hard to automate in the general case.
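The string mismatch in item b) is the easiest of these to make concrete. The following is a minimal sketch in C of the conversions a Forth-to-C glue layer might perform. The function names are hypothetical, the caller is assumed to supply the destination buffer, and the packed string is assumed to be a single count byte followed by the characters, as item b) describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Packed (count byte) Forth string -> null-terminated C string.
   Returns 0 on success, -1 if dst is too small. */
int counted_to_cstr(const unsigned char *counted, char *dst, size_t dst_size)
{
    size_t len;

    assert(counted != NULL && dst != NULL);
    len = counted[0];                  /* first byte holds the length */
    if (len + 1 > dst_size)            /* need room for the '\0' too */
        return -1;
    memcpy(dst, counted + 1, len);
    dst[len] = '\0';
    return 0;
}

/* "address length" Forth string -> null-terminated C string.
   Returns 0 on success, -1 if dst is too small. */
int addr_len_to_cstr(const char *addr, size_t len, char *dst, size_t dst_size)
{
    assert(addr != NULL && dst != NULL);
    if (len >= dst_size)
        return -1;
    memcpy(dst, addr, len);
    dst[len] = '\0';
    return 0;
}

/* Null-terminated C string -> packed Forth string.
   Returns 0 on success, -1 if the string is longer than a count byte
   can describe (255) or does not fit in dst. */
int cstr_to_counted(const char *src, unsigned char *dst, size_t dst_size)
{
    size_t len;

    assert(src != NULL && dst != NULL);
    len = strlen(src);
    if (len > 255 || len + 1 > dst_size)
        return -1;
    dst[0] = (unsigned char)len;
    memcpy(dst + 1, src, len);         /* no terminator in the packed form */
    return 0;
}
```

Every call that crosses the boundary with a string argument pays for a copy like this in one direction or the other, and a string coming back from C has to be rejected or truncated if it is longer than 255 characters and the Forth side wants the packed form.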