Mitch.Bradley@ENG.SUN.COM (04/20/91)
> Forth code is usually compiled as a threaded but you can quite
> easily convert it to subroutine threaded and even pure machine code.

On most processors, subroutine-threaded code without in-line machine code
expansion is SLOWER than direct-threaded code. This is because typical
programs thread from code word to code word 8 times more frequently than
they nest and unnest colon definitions. The "jsr/rts" pair usually has to
push a return address on a stack and pop it again, whereas typical
direct-threaded implementations, with "NEXT" compiled in-line, keep "IP"
in a register.

However, subroutine threading opens the door to in-line machine code
expansion.

The tradeoffs in a nutshell:

* If you don't plan to use in-line expansion of code words, don't use
  subroutine threading.

* If you really must have the ultimate speed, then use subroutine
  threading with in-line code expansion and peephole optimization.
  (Be honest about this; most applications bottleneck on I/O, and most
  compute-bound applications spend nearly all their time in a very few
  inner loops. It is often cost-effective to use threaded code for most
  of the application and hand-code a few critical words.)

* Threaded code is easier to debug. It is possible to decompile in-line
  expanded code, but it is not easy, especially if peephole optimization
  has been performed.

Mitch Bradley, wmb@Eng.Sun.COM
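
To make the difference between threading through code words and nesting into colon definitions concrete, here is a toy model in Python. It is only a sketch and not taken from any real Forth system: the names run, lit, SQUARE, SUM_SQUARES and MAIN are made up for illustration, a code word is modelled as a Python function, and a colon definition as a list of words. Python cannot show the actual cost of a jsr/rts pair versus a register-held IP; the model only counts how often each kind of dispatch happens.

```python
# Toy model of a threaded-code inner interpreter (illustrative only).
# Code word        = Python function taking the data stack.
# Colon definition = Python list of words (code words or other lists).

def run(word, stack):
    """Execute `word`; return how many code words ran and how many nests occurred."""
    counts = {"code": 0, "nest": 0}
    rstack = []                  # return stack of saved (thread, ip) pairs
    thread, ip = [word], 0       # "IP" kept in locals, standing in for a register
    while True:
        if ip == len(thread):    # end of a definition: unnest (EXIT)
            if not rstack:
                return counts
            thread, ip = rstack.pop()
            continue
        w = thread[ip]           # NEXT: fetch the word and advance IP
        ip += 1
        if callable(w):          # code word: run it, nothing saved
            counts["code"] += 1
            w(stack)
        else:                    # colon definition: nest, saving IP
            counts["nest"] += 1
            rstack.append((thread, ip))
            thread, ip = w, 0

def dup(s):
    s.append(s[-1])

def swap(s):
    s[-1], s[-2] = s[-2], s[-1]

def mul(s):
    s.append(s.pop() * s.pop())

def add(s):
    s.append(s.pop() + s.pop())

def lit(n):
    """Build a code word that pushes the literal n."""
    def push(s):
        s.append(n)
    return push

SQUARE = [dup, mul]                       # : SQUARE       dup * ;
SUM_SQUARES = [SQUARE, swap, SQUARE, add] # : SUM-SQUARES  SQUARE swap SQUARE + ;
MAIN = [lit(3), lit(4), SUM_SQUARES]      # : MAIN         3 4 SUM-SQUARES ;

stack = []
counts = run(MAIN, stack)
print(stack, counts)   # [25] {'code': 8, 'nest': 4}
```

In this tiny example code words run only twice as often as definitions nest, because the definitions are very short; the 8-to-1 figure above describes typical programs, not this toy. The point is the structure: in a threaded system only the "nest" branch touches the return stack, whereas with plain subroutine threading every one of the "code" dispatches would also be a call and return.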