[comp.lang.forth] Forth from scratch

ir230@sdcc6.ucsd.edu (john wavrik) (10/30/89)

Subject: Forth from scratch
Newsgroups: comp.lang.forth


 rs0@beach.cis.ufl.edu (Bob Slaughter) writes:
> I have a friend who wants to write a 6502 based forth, and he needs
> the source listing and such for this.  Where can he go to find it?

  bouma@cs.purdue.EDU (William J. Bouma) writes:
>   Uh, if he wants to start from scratch, why does he need a listing?
>   I am sure you can get a listing from the Forth Interest Group. I 
>   don't have the address available, but can get it for you if you 
>   drop me a note. It will probably cost ~$10.
>
>   I am not sure why anyone would want to implement such a minimal
>   machine, at least in software. It is bound to run relatively slow
>   as there will be more threading to get to the few primitives. It
>   is an interesting thing to think about, but why implement? Especially
>   why write it in assembler? Since it is going to be slow anyway,
>   use a friendly language.


ZMLEB@SCFVM.BITNET (Lee Brotzman) writes:
>    As to why anyone would actually want to implement the "Forth Minimal
>Machine", that should be obvious:  TO MAKE SURE THE DAMN THING WORKS.
>The only way to be sure all the code that has been presented works is
>to try it.  This is the Forth philosophy.  You get an idea; you try the
>idea; if the idea works, great; if the idea doesn't work, you find the
>mistake and learn from it.
>    The weak point of the Forth minimal machine that I have noticed is that
>there is no input or output in any form.  Assume the dumbest of serial
>interfaces (put a character in CHAROUT and get a character from CHARIN) and
>go for it.

Missing in this discussion is the fact that most of Forth is written in Forth.
The Forth Interest Group still has assembly language listings available for 
$15 each, which may be of some interest -- but not directly applicable to the 
task of bringing Forth up from scratch.

              FORTH INTEREST GROUP
              P.O. BOX 1105
              SAN CARLOS, CA  94070

These listings will disclose that they consist, for the most part, of assembly 
language code for putting colon definitions in place. Unfortunately FIG took 
advantage of the fact that an assembler makes several passes, thus early words 
refer to later ones in the listings -- so they cannot be used to bootstrap 
Forth. There was a book published in 1981 by BYTE Books called "Threaded 
Interpretive Languages" (by R.G. Loeliger) that does describe the process of 
bringing up a Forth-like language from scratch. 

There is a topological sorting problem involved in bootstrapping a Forth 
system: If word A depends on word B then B must be defined first. Figuring out 
which words can be defined in terms of which others is actually a very good 
and interesting way to learn Forth. When Forth was new on the scene, many 
people did bring up Forth by bootstrapping. As I recall, no more than about 
200 bytes of Z-80 assembly language code were needed to produce enough of a 
system for new definitions to be typed in by hand -- and from that point a 
few more definitions allowed source code to be read from cassette (as was 
common at that time).
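
The ordering problem described above can be sketched in a few lines. This is 
only an illustration, written in Python rather than Forth, and the dependency 
table is made up for the example (it is not taken from the FIG listings or 
from F-83, and DISTANCE is a hypothetical word). It assumes Python 3.9 or 
later, where graphlib is part of the standard library.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency table: word -> words used in its definition.
# Words mapped to an empty set stand for assembly-coded primitives.
DEPENDS_ON = {
    "OVER": set(),
    "+": set(),
    "NEGATE": set(),
    "ABS": set(),
    "2DUP": {"OVER"},          # : 2DUP      OVER OVER ;
    "-": {"NEGATE", "+"},      # : -         NEGATE + ;
    "DISTANCE": {"-", "ABS"},  # : DISTANCE  - ABS ;   (made-up word)
}


def definition_order(depends_on):
    """Return the words in an order where each word follows the words it uses."""
    return list(TopologicalSorter(depends_on).static_order())


order = definition_order(DEPENDS_ON)
print(" ".join(order))
```

Any order produced this way can be typed in from top to bottom without a 
forward reference. If two words were defined in terms of each other, 
static_order() would raise graphlib.CycleError instead of returning an order.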

The idea that this results in systems which are slow and inefficient is not 
correct. Most Forth systems include the same level of nesting as if they were 
brought up this way. The fact that certain of the most frequently used words 
can be coded directly in assembly language provides as much speed for these 
primitives as one can get.
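
What is being counted in this argument can be made concrete with a small 
sketch. It is in Python, the word definitions are assumptions invented for 
the illustration (DIFF-OF-PAIR is a hypothetical word), and it measures only 
how many primitives run and how many colon definitions are entered, not real 
execution time.

```python
# Hypothetical colon definitions: word -> list of words in its body.
# Any word without an entry is treated as an assembly-coded primitive.
MINIMAL = {
    "-": ["NEGATE", "+"],
    "2DUP": ["OVER", "OVER"],
    "DIFF-OF-PAIR": ["2DUP", "-"],   # made-up word for the example
}

# The same system with "-" and "2DUP" promoted to code words.
TUNED = {
    "DIFF-OF-PAIR": ["2DUP", "-"],
}


def cost(word, definitions):
    """Return (primitives executed, colon definitions entered) for one call."""
    body = definitions.get(word)
    if body is None:
        return (1, 0)          # a primitive: one step, no nesting
    primitives, nests = 0, 1   # entering this colon definition is one nest
    for inner in body:
        p, n = cost(inner, definitions)
        primitives += p
        nests += n
    return (primitives, nests)


print(cost("DIFF-OF-PAIR", MINIMAL))  # (4, 3)
print(cost("DIFF-OF-PAIR", TUNED))    # (2, 1)
```

Coding the two frequently used words directly removes the extra nesting 
without changing the definition of the word built on top of them.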

An indication of how much of a modern Forth system consists of "code" words is 
provided by F-83. Out of about 600 words in the Forth vocabulary, the 
following are the ones with code definitions:
 
OLDCINT   GETCINT   NEWCINT   C-FIX   D*S   GETTIME   P!   PC!   P@    
PC@   MULTI   RESTART   (PAUSE)   UNBUG   (FIND)   HASH   TRAVERSE     
SCAN   SKIP   DIGIT   BDOS   CAPS-COMP   COMP   UPPER   UPC   LENGTH   
COUNT   FILL   D2/   D2*   DABS   S>D   DNEGATE   D+   2OVER   2SWAP   
2DUP   2DROP   2!   2@   >   <   U>   U<   =   0<>   0>   0<   0=      
UM/MOD   UM*   2-   1-   2+   1+   8*   U2/   2/   2*   +!   ABS   -   
NEGATE   +   OFF   ON   CTOGGLE   CRESET   CSET   NOT   XOR   OR   AND 
PICK   R@   >R   R>   FLIP   -ROT   ROT   NIP   TUCK   OVER   SWAP     
DUP   DROP   RP!   RP@   SP!   SP@   CMOVE>   CMOVE   C!   C@   !   @  
(?LEAVE)   (LEAVE)   J   I   PAUSE   NOOP   GO   PERFORM   EXECUTE     
(?DO)   (DO)   (+LOOP)   (LOOP)   ?BRANCH   BRANCH   (LIT)   EXIT      
total 113                                                              
 
Of these 113 words, fewer than half are essential (and a few at the top of the 
list are my own additions). Moreover, the assembly language definitions of the 
most basic words in this set are so simple and short that one hardly needs to 
be a specialist to understand them:

    \ 16 bit Arithmetic Operations                        11OCT83HHL 
    CODE +   (S n1 n2 -- sum )                                       
       BX POP   AX POP   BX AX ADD   1PUSH END-CODE                  
    CODE NEGATE   (S n -- n' )                                       
       AX POP   AX NEG   1PUSH END-CODE                              
    CODE -       (S n1 n2 -- n1-n2 )                                 
       BX POP   AX POP   BX AX SUB   1PUSH END-CODE                  
    CODE ABS   (S n -- n )                                           
      AX POP   AX AX OR   0< IF   AX NEG   THEN   1PUSH END-CODE     
 

>   is an interesting thing to think about, but why implement? Especially
>   why write it in assembler? Since it is going to be slow anyway,
>   use a friendly language.

Bringing up Forth using a wee bit of assembly language and a lot of Forth is 
quite natural and leads to a system which is fast and can be easily modified. 
A common approach to a powerful system is to use a primitive Forth system 
built this way to write a more extensive system.

Forth should be considered (in part) as the assembly language for an abstract 
machine that is realized on a particular real machine -- thus questions like 
"how are the Forth stacks implemented?", "where is the instruction pointer?", etc. 
should have simple answers in terms of the architecture of the underlying 
processor. Conversely, it should also be easy for Forth to tap what it needs 
of the underlying processor or host operating system (it is interesting that 
almost all of the F-83 words that access the MS-DOS operating system -- 
including KEY, EMIT, and the disk file words -- are high level Forth words 
which use only one "code" primitive, BDOS). 
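
A toy model of this abstract machine may make the questions concrete. The 
sketch below is in Python and is an illustration under stated assumptions, 
not the design of F-83 or of any real Forth: the Machine class and its method 
names are invented here, words are looked up by name at run time instead of 
being compiled to addresses, and the return stack holds (thread, position) 
pairs where a real system would hold a saved instruction pointer.

```python
class Machine:
    """Toy Forth-like machine: two stacks, a dictionary, an instruction pointer."""

    def __init__(self):
        self.data = []    # data (parameter) stack
        self.ret = []     # return stack of saved (thread, ip) pairs
        self.words = {}   # name -> Python function (primitive) or list of names

    def primitive(self, name, fn):
        self.words[name] = fn

    def colon(self, name, body):
        self.words[name] = body.split()

    def execute(self, name):
        thread, ip = [name], 0              # ip indexes into the current thread
        while True:
            if ip == len(thread):           # end of thread behaves like EXIT
                if not self.ret:
                    return
                thread, ip = self.ret.pop() # unnest
                continue
            word = self.words[thread[ip]]
            ip += 1
            if callable(word):
                word(self)                  # primitive: run it directly
            else:
                self.ret.append((thread, ip))  # nest into a colon definition
                thread, ip = word, 0


m = Machine()
m.primitive("OVER", lambda mc: mc.data.append(mc.data[-2]))
m.primitive("+", lambda mc: mc.data.append(mc.data.pop() + mc.data.pop()))
m.primitive("NEGATE", lambda mc: mc.data.append(-mc.data.pop()))
m.colon("-", "NEGATE +")
m.colon("2DUP", "OVER OVER")

m.data = [7, 3]
m.execute("-")
print(m.data)   # [4]
```

Only the primitives touch the host language directly. Everything defined 
with colon() is a list of other words, which is the sense in which most of 
the system can be written in Forth itself.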

With regard to implementing Forth using another language: I recently had a 
chance to examine a Forth system which is written in 'C' (which some think is 
a "friendly language"). As it turns out, there are some surprising defects in 
this approach. This particular Forth system happened to be missing some 
commands and had a few others incorrectly implemented. This would be no 
problem in a traditional Forth implementation -- the words that needed to be 
added had very short assembly language definitions. No assembler was provided 
in this system because to provide one the author would have had to know 
details about how the user's 'C' compiler allocates registers and manages its 
stack. Thus rather than producing a friendly and flexible Forth system, 
embedding Forth in another language resulted in an inflexible system -- no 
longer a Forth which can be modified and extended at will. It is quite likely 
that attempts to bring up Forth using a "friendly language" (other than Forth 
itself with a small amount of assembly language) will result in the same 
defect: the implementor relinquishes control over the use of the system to the 
peculiarities of a compiler. 

An interesting thing has happened to Forth. Ten years ago, when people first 
heard about the language, no commercial implementations were available -- so 
lots of people did what Bob Slaughter's friend wants to do. Now it is more 
common to use existing Forth implementations to write new and more powerful 
systems -- the fact that Forth can be implemented from scratch seems to have 
been forgotten. We have come to regard the availability of good Forth 
implementations (like F-83) as a great boon and expect our students and 
successors to be able to begin from where we left off. We need Lee Brotzman to 
remind us that we understand Forth because we have seen how to bring it up 
from scratch -- and that the new generation can benefit from this exercise. 


                                                  John J Wavrik 
             jjwavrik@ucsd.edu                    Dept of Math  C-012 
                                                  Univ of Calif - San Diego 
                                                  La Jolla, CA  92093 
 

bouma@cs.purdue.EDU (William J. Bouma) (10/31/89)

In article <4826@sdcc6.ucsd.edu> ir230@sdcc6.ucsd.edu (john wavrik) writes:
>
>Missing in this discussion is the fact that most of Forth is written in Forth.

     No, that is what the entire discussion has been about! I suggested
     that it is probably not a good idea to bring forth up from the
     minimal set of words from which this is possible.

>The idea that this results in systems which are slow and inefficient is not 
>correct. Most Forth systems include the same level of nesting as if they were 

     The point is that the minimalist implementation will be slower and
     less efficient. Also there may be words that are easier coded in the
     host assembly language than in the minimal forth set. Thus from both
     a user and an implementor viewpoint, I see no practical reason to
     implement such a system. If you are doing it for fun or for
     educational purposes, fine.

>With regard to implementing Forth using another language: I recently had a 
>chance to examine a Forth system which is written in 'C' (which some think is 
>a "friendly language"). As it turns out, there are some surprising defects in 
>this approach. This particular Forth system happened to be missing some 
>commands and had a few others incorrectly implemented. This would be no 

     You are drawing some large unfounded conclusions here. You have one
     example of forth written poorly in C and from that you determine
     that forth cannot be written well in C! I have written forth in 6502
     assembly, and in C. I think my C version is quite nice, and it is
     portable.  But if I were to write forth for a particular architecture,
     I would do it just as you suggest, in assembler. That way one can 
     optimize the crap out of it. 

>successors to be able to begin from where we left off. We need Lee Brotzman to 
>remind us that we understand Forth because we have seen how to bring it up 
>from scratch -- and that the new generation can benefit from this exercise. 

     Yes. I do not wish to discourage anyone from learning by doing. But
     I still see no reason to implement from a minimal set of primitives.
     Well, perhaps if there were a RISC-FORTH chip.
-- 
Bill <bouma@cs.purdue.edu>  ||  ...!purdue!bouma