[comp.lang.forth] Parsing Forth

wmb@MITCH.ENG.SUN.COM (03/15/91)

> They are *DEAD WRONG* if they think that there's no interest in LALR
> parsing with respect to Forth in general!   The venerable Dr. Ting
> published an article in Dr. Dobbs years back that was a BNF description
> for the fundamental syntax of Forth.   It was *quite* simple and consisted
> of about two (!) typed pages of text;

Unfortunately, Dr. Ting's Forth "railroad diagrams" failed to capture
all of the possibilities.

> You could, concievably, feed this definition to any LALR parser generator
> and get a program that parsed Forth properly

Won't work.  The possibility of user-defined immediate words ruins it.
For example:

        : ignore-next-word  ( -- )  bl word drop  ;  immediate

        : foo
           if
              ignore-next-word else
           then
        ;

In general, immediate words can introduce context dependencies that
cannot be statically analyzed.  For example, ignore-next-word could
have been written to depend upon a run-time value:

        : ignore-next-words  ( n -- )
           ." Ignore how many? " get-number-from-user
           0  do  bl word drop  loop
        ;

Another way of thinking about this is that the set of Forth "keywords"
can be extended, so any fixed parser cannot know the complete set of
tokens it has to deal with.

> (As a matter of fact, I think somebody
> did something like this for one of the C-based Forths in the past...).

You could do a restricted subset of Forth this way, but not a complete
Forth.

John Hayes and I tried to make some railroad diagrams for ANS Forth
a couple of meetings ago.  We succeeded in capturing the "flavor" of
common control structure usage, but not in defining the complete range
of possibilities.  With the addition of SO and STILL to complete the
control structure primitives, it becomes even more difficult to describe
Forth in the traditional CS fashion.

Mitch Bradley, wmb@Eng.Sun.COM