[comp.lang.forth] Where does FORTH belong in CS and Academia?

hawley@adobe.COM (Steve Hawley) (03/07/91)

I have been following the FORTH in CS debate with some degree of interest,
although I have found the discussion of "FORTH is more suited to CS than C"
and vice versa to be somewhat reminiscent of a Little Rascals episode that
featured two of the rascals traying desperately to get an authority figure
to decide who was better Flash Gordon or Tarzan.

I'm going to tell you how FORTH has fit into CS for me, but I refuse to be
an authority figure and will not answer the question of "which is better".
The question of "which is better (in CS)" is not answerable.  You have no
clear cut definition of CS (nor will you), nor do you have a clear cut
definition of "better" with relation to CS>.

When I picked up FORTH in my senior year of college, it was my 14th programming
language (I'm now going on my 17th -- I'll let you know when my brain gets
full).  I'm not a really a FORTH programmer.  I don't pretend to be.  I used
FORTH as a tool for learning how to implement a system independent language.

That's not actually what I intended to do at the outset.

I had taken a keen interest in the implementation of various programming
languages and had recently completed a course that dealt with the implementation
and compilation of a functional programming language using graph reduction
and lazy evaluation.  It was cool, but it didn't use a machine that could be
put into silicon, nor did it have much in the way of real world application.

I thought it's be cool (read: a valuable academic endeavor) to write a compiler
for a stack-based language that used a real target machine and gave you nice
things like garbage collection, functions as first class objects (ie, a
function could return, as its value, another function that was dynamically
created within an environment or with bound values).  I thought FORTH would be
my basis for a language.

When I got into the project, I found that FORTH was not appropriate as it is
defined for this task.  The main problem was that FORTH is too close to the
machine to do this and for the language to still be FORTH.  Words like HERE
 , CREATE and DOES> made little or no sense in such an environment, and
faking them would break a fair amount of code that depends on their semantics.

Sadly, the design work alone of *my* language and how I wanted to implement it
could've taken the semester alone, and I really wanted something that worked.
So what I ended up doing instead was building a 68000-based-subroutine-threaded-
binary-portable FORTH.

As I got into the details of implementing the language, I started making some
cool discoveried about the nature of FORTH.  The most enlightening one was
that FORTH has no syntax from a certain point of view except that tokens
start with non white space and are terminated by whitespace, and a token.
Tokens come from an input stream, the first token is looked up.  If it exists
in the dictionary, or we are in compile mode and the word is immediate, it gets
executed, else if we're in compile mode, it gets stuffed into the current
word being compiled, else if it doesn't exist in the dictionary it is parsed
as a number and pushed on the stack.  If it fails the number test, an error is
generated.  E-Z!

It is purely by convention that colon is a word that when executed, pulls the
next token from the input stream, creates a dictionary header and turns on the
compile flag.

It was the discovery of these conventions that made me look at the language
in a whole new light.  I had never really considered what the "real" meaning
of "if" (FORTH "if", C "if", Pascal "if" etc) was and how it related to the
rest of the language and the capability of your target machine.  My advisor
and I met weekly to discuss progress and issues, and while neither of us knew
FORTH to start, we did when we were through and picked up a lot of cool stuff
along the way.

For example, I came up with the following definition for IF and THEN
HEX
: IF ( -- address of forward branch ) 4a5e6700 2, HERE 0 , ; IMMEDIATE
: THEN ( address of forward branch --- ) DUP HERE SWAP SUB SWAP ! ; IMMEDIATE

So what happens is that IF spits out the following code in the dictionary
        tst.w   (a6)+  ; test and throw away top of stack
        beq     xxxx(a3) ; branch to some section of the dictionary

and leaves xxxx on the stack.

THEN goes takes this address, calculates the difference between the branch and
HERE and stores this into xxxx, self modifying the current word and completing
the IF.

My advisor noted ELSE marks the termination of IF and effectively should
complete the pervious branch and start a forward branch to the next THEN.
In effect, ELSE is nothing more than an unconditional IF, so ELSE could be
defined like this:

: ELSE 0 COMPILE IF HERE 0 , SWAP COMPILE THEN ; IMMEDIATE

In fact, this isn't how it was defined.  It instead looked like this:

ELSE	move.w #$6000,(a2)+ ; emit bra xxxx instruction
	jsr    HERE     ; get HERE
	clr.w  (a2)+	; write a place holder for the branch
	jsr    SWAP
	bra    THEN

Even funkier was WHILE and REPEAT:

WHILE	bra IF

REPEAT	move.w #$6000,(a2)+  ; look familiar?
	jsr    SWAP          ; swap back branch location
	jsr    BACK          ; complete backwards branch 
	bra    THEN

BACK was a special word to optimize backwards branches for the 68000.

I thought that solving how to handle CREATE and DOES> was the neatest part of
the project.  So much so that I presented a talk at the end of the semester
entitled "Just what does DOES> do and other FORTH dos and don'ts".  I had found
that CREATE and DOES> provide a mechanism of creating functions that bears an
incredible resemblance to the idea of functions as first class objects in the
language Scheme (which is a LISP dialect).  My talk presented FORTH in terms
of Scheme so that the audience (the CS department) could get a grasp of FORTH,
then I showed how CREATE and DOES> was implemented, which is really mind
bogglingly easy but non-obvious from simply looking at the syntax.

In Scheme land, I could write the function:

(define make-bx+c (lambda (b c)
	(lambda (x) (+ (* b x) c))
))

Such that if I do this:
(define 3x+7 (make-bx+c 3 7))

I can now do this:
(3x+7 4)
=> 19

In FORTH:
: make-bx+c ( b c -- ) CREATE , , DOES> DUP @ SWAP 2 + @ ROT * + ;
3 7 make-bx+c 3x+7 ( x -- 3*x+7 )
4 3x+7 .
=> 19

This was one of the most important topics in the CS class that used Scheme:
that functions are really objects, no different from integers or structs.
And here is FORTH, which encompasses a large part of this idea and implements
it in what amounts to a handful of bytes.  The whole idea of functions as
first class objects was an abstract and nebulous entity for me.  I knew how
it was done (in fact, I wrote a version of Scheme in C a year earlier), but
not how it was done efficiently or in a practical way.  Forth took the
abstractness and put a handle on it.

When I did a wrap up of the talk, I presented a side-by-side comparison of
what Scheme embodies and what FORTH embodies.  At the bottom of my slide,
in the Scheme column, I had "Conclusion: Scheme is Abstract" with a Brodie
style cartoon of a person with both eyes on one side of their head and other
deformities that Picasso would've used.  In the FORTH column, I had "Conclusion:
FORTH is concrete." with a bag of cement.  It went over well.

Given the choice, I would use this project as the basis for course curriculum.
FORTH is small enough that it can be done in a semester from scratch and can
be understood in its entirety by an average student.  In addition, the end
product is something that can be used in a real world application (unlike
most of the products that come out of compiler courses).  The amount of
information that gets passed on about language design, machine architecture,
semantics, etc is invaluable.

But how do I use FORTH to further the efforts of CS?  Good question.  The
trouble is that FORTH lacks a lot of the conveniences that other languages
have built in.  For example, suppose you decide to view a program as graph
instead of a linear sequence of instructions.  The way to "run" the program
is to reduce the graph to a single node.  So you build your graph, and
start reducing it.  Hey!  It can be reduced in parallel too!  Without any
of those grungy semaphore things like mutex-begin and mutex-end.  Spiffy.
Parallel processing without having to explicitly declare the parallelism.
Can you do parallel graph reduction in FORTH?  Sure.  FORTH and Scheme (for
example) are Turing complete, so you know it can be done, but it sure is nice
to have the luxury of built-in garbage collection, and car, cdr, and cons for
building/reducing the graphs, you know?

Realize that a programming language is merely a vehicle for research into
computer science, not the science itself and that FORTH, like Fortran, ratfor
basic, pascal, lisp, scheme, modula, C, Cobol, Apl, PL/1, euclid, sparks,
miranda, etc, has its place within the framework of computer science.

Steve Hawley
hawley@adobe.com
-- 
"Did you know that a cow was *MURDERED* to make that jacket?"
"Yes.    I didn't think there were any witnesses, so I guess I'll have to kill
 you too." -Jake Johansen