[net.lang.forth] Writing FORTH in C

grt@twitch.UUCP ( G.R.Tomasevich) (04/02/86)

I just replied by mail to a comment, but this might be of general interest.
Several years ago I wrote an extensible dictionary compiler (EDC), similar in
philosophy to FORTH, in C.  It does not obey all of the ideas that Chuck
Moore wrote in a draft of something or other that I read while still at NASA.
Rather, it looks sort of like reverse Polish notation C.  Because of having
to use separate I and D on a PDP-11/45, I had to do all primitives as C
functions.  The words are in a linked list in a large array which is
malloc'ed when the program starts.  The header always includes a pointer
to a C function to be executed, a backward list pointer, the length of
the word's name, and the name itself.  Headers for macros are followed
by pointers to words, literal constants, etc, always ending with the
pointer to the 'end' word (which is ';' in FORTH, I think).  Variables and
arrays always come with their storage, so there is less indirection than
in FORTH.  Also, array bound checking is implemented, as well as stack
underflow/overflow on every pop/push.

Because of the pure text restriction, there is the concept of the permanent
vocabulary, which is not deletable, as with 'forget'.  If the program is
invoked as 'a.out', it constructs the linked list of the permanent vocabulary,
writes the whole malloc'ed area to a file, then quits.  If invoked with a real
name, it malloc's, reads the file, does a crude test whether the rollout file
is current, then goes to the interpreter.

The UNIX shell command line can include files of EDC commands, and one
word is 'xeq' to open a file and read it as commands.  Upon hitting EOF,
the word parser longjmp's back to where the file was opened, such as input
from the user terminal or a previously opened file.

I tested the speed by writing a pure C and an EDC program to read all of
the C source files and find definitions of C functions.  The pure C is 6 times
as fast.  One reason is that I simulated the stack rather than trying to
muck around with machine registers, which would have required rewriting csave
and cret and not using register variables or any C library routines at all.

The program ported to VAXen, including spastic and twitch, and in fact to
all operating system changes up to UNIX 5.2.  It may not port easily to
a machine in which pointers are not all the same size.

It originally served as the interpreter of an 8086 instruction simulator, but
now it is part of an interactive graphics program.  I use it regularly as
a desk calculator and for doing throw-away programs, not necessarily involving
plotting.  I made separate floating-point and integer stacks, which fragments
memory but allows interleaving subscripting and floating-point operations
on arrays.  That also means separate operators, e.g. '+' to add integers,
while 'f+' to add doubles.

It would be nice to do real FORTH.  There was mention that someone is/was
doing a C/UNIX implementation of FORTH.  Any replies?

A small example: everybody's favorite example of a recursive program.  This
one returns 0 for 0.

def fact		/* factorial - FORTH would say ':' I think */
	dup 1 > if
		dup 1 - fact *
			/* we do not need an 'else' branch */
	then		/* I forgot how FORTH ends an 'if' */
end			/* FORTH would say ';' I think */
-- 
	George Tomasevich, ihnp4!twitch!grt
	AT&T Bell Laboratories, Holmdel, NJ