[comp.lang.forth] Some Code

ir230@sdcc6.ucsd.edu (john wavrik) (11/17/89)

John Wavrik writes:

>     I don't understand how this newsgroup has become split into two 
>     subgroups. On the one hand we have the people who want to know how many 
>     pins will be on the RTX-3000 (and Phil Koopman who replies by supplying 
>     the phone number of Harris, Inc). On the other, we have 'C' programmers 
>     who are intent on modifying a language they haven't bothered to 
>     understand.
>     
>     Apparently the people who are doing interesting and portable things in 
>     Forth are too busy to take the time to tell us about them and post some 
>     source code. 
 
While I feel that these words are unnecessarily harsh, I agree that it would 
be nice to see some Forth in this newsgroup.

                                     -----
 
Here is a solution to a problem which arose in the San Diego Chapter of FIG.
Forth allows words to be passed to other words as parameters (and also, 
incidentally, to be returned). The usual mechanism for passing WORD1 to WORD2 
is to use  ' WORD1 WORD2. This requires WORD1 to be named. The problem is to 
find a way to pass an unnamed code fragment to WORD2. (This arises, for 
example, in writing a general sorting program where an array, bounds, and a 
comparison function must be passed.)
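
For comparison, the named case can be sketched in a few lines. SQUARE and 
APPLY-TWICE are made-up names for illustration only; the sketch assumes 
nothing beyond tick leaving an address that EXECUTE accepts.

```forth
: SQUARE       ( n -- n*n )   DUP * ;

\ Run the word whose address is on top of the stack, twice in a row.
: APPLY-TWICE  ( n addr -- n' )   DUP >R EXECUTE  R> EXECUTE ;

3 ' SQUARE APPLY-TWICE .     \ 3 squared is 9, 9 squared is 81
```

The inconvenience is that SQUARE must be given a name and a dictionary header 
even if it is used exactly once.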

This solution is for F-83:

VARIABLE OSTATE      ' : @ CONSTANT DOCOL                 
: <[:>   R> DUP @ >R  2+  ;                               
                                                          
: [:   STATE @  DUP OSTATE !                              
          IF    COMPILE <[:>  ?>MARK  DOCOL ,             
          ELSE  HERE DOCOL ,  !CSP ]    THEN ; IMMEDIATE  
                                                          
: ;]   OSTATE @                                           
          IF    COMPILE EXIT ?>RESOLVE                    
          ELSE  ?CSP  COMPILE EXIT                        
                [COMPILE] [             THEN ; IMMEDIATE  

The new words are used like this:
   [:  4 5 +  ;]   puts an address on the stack which, when executed,
                   puts 9 on the stack -- the address is the address
                   of the code fragment in brackets.

  : Ex1  [: 4 5 + ;]  ;
                   when Ex1 is executed, the address of the code fragment
                   is put on the stack. 
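
Assuming the F-83 definitions above have been loaded, passing an unnamed 
fragment to another word might look like the sketch below. APPLY-TO-3 and 
Ex1S are hypothetical names introduced only for this illustration.

```forth
\ Hypothetical consumer: run the fragment at addr with 3 as its argument.
: APPLY-TO-3  ( addr -- n )   3 SWAP EXECUTE ;

[: DUP * ;] APPLY-TO-3 .      \ interpreted use: squares 3, prints 9

: Ex1S  [: DUP * ;] ;         \ compiled use: Ex1S leaves the fragment address
Ex1S APPLY-TO-3 .             \ prints 9
```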

To understand these words (and make them more portable) we remove the compiler 
protection and special branching words: 
                                                          
VARIABLE OSTATE      ' <any colon word> @ CONSTANT DOCOL                 
: <[:>   R> DUP @ >R  cell +  ;                               
                                                          
: [:   STATE @  DUP OSTATE !                              
          IF    COMPILE <[:>  here 0 , DOCOL ,             
          ELSE  HERE DOCOL ,   ]    THEN ; IMMEDIATE  
                                                          
: ;]   OSTATE @                                           
          IF    COMPILE EXIT  here swap !                    
          ELSE  COMPILE EXIT                        
                [COMPILE] [             THEN ; IMMEDIATE  
                                                         
The simplest case is that in which STATE is zero (the words are used in the 
interpretive mode). [: puts the current dictionary position on the stack,
compiles the address of the inner interpreter used by all colon words, and 
switches the system to the compiling mode. ;] compiles the EXIT which ends 
colon definitions and switches the system back to the interpretive mode. We 
are left with the stuff between the brackets compiled into the dictionary 
(with the appropriate code field) and the address of the fragment on the 
stack. 
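
The interpretive case can be checked from the keyboard. This is only a 
sketch: it assumes the definitions above are loaded on an indirect threaded 
system such as F-83, where a true flag prints as -1.

```forth
HERE  [: 4 5 + ;]        ( here-before fragment-addr )
2DUP = .                 \ the fragment starts where HERE was: prints -1
NIP  DUP @ DOCOL = .     \ its first cell is the colon code field: prints -1
EXECUTE .                \ running the fragment prints 9
```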

When STATE is non-zero, a "code literal handler" <[:> is compiled followed by 
an address (to be filled in later) followed by the address of the inner 
interpreter, followed by the code fragment and any other code. ;] compiles 
an EXIT (after the code fragment) and fills in the address where the remaining 
code starts. When this word executes, the "handler" gets the "return address" 
from the return stack -- knowing that it points to what would normally be the 
next instruction to be executed. (Here the cell it points to holds the 
address of the code after the end of our code fragment.) The handler puts that 
stored address on the return stack (thus fooling the system into jumping over 
the code fragment) and it leaves the address of the code fragment, which is 
one cell past the original return address, on the stack. 
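
As an illustration of the compiled case, here is a sketch of what ends up in 
the body of a word that uses a fragment and then carries on. Ex2 is a 
hypothetical example word, the layout in the comments assumes one cell per 
compiled address, and the definitions above are assumed to be loaded.

```forth
: Ex2  [: 4 5 + ;]  EXECUTE  1+ ;

\ Body of Ex2 as laid down by the compile-time branches of [: and ;]
\   <[:>       the code literal handler
\   fwd        filled in by ;] with the address of the EXECUTE cell below
\   DOCOL      code field of the fragment; <[:> leaves this cell's address
\   4 5 +      the fragment itself (each literal takes extra cells)
\   EXIT       compiled by ;]
\   EXECUTE    fwd points here; <[:> makes this the return address
\   1+
\   (exit)     end of Ex2, compiled by ;

Ex2 .        \ the fragment leaves 9, then 1+ makes it 10
```

The fragment is skipped when Ex2 runs straight through and is entered only 
through the address left by the handler.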

These words are interesting because they bring up a point: they do not involve 
assembly language, but they involve a knowledge of how traditional Forth is 
implemented. They involve knowing what is put in the return stack, how a 
dictionary entry is compiled, what a dictionary entry looks like, etc. (The 
amazing thing is that the definitions above will work on any traditionally 
implemented Forth regardless of processor! It's like having a totally portable 
assembly language.)  An aspect of Forth which hasn't been touched on is a 
Forth programmer's ability to use his or her knowledge of the implementation. 
(Which, again, accounts for a small fraction of code but some of the most 
powerful results.) 

I should note that these definitions even work on my friend's XFORTH written 
in 'C' (which is, in many respects, a conventional Forth) but they do not work 
in F-PC (which is a non-traditional Forth even though it is written in Forth 
plus assembler). 

 
                                                  John J Wavrik 
             jjwavrik@ucsd.edu                    Dept of Math  C-012 
                                                  Univ of Calif - San Diego 
                                                  La Jolla, CA  92093 

toma@tekgvs.LABS.TEK.COM (Tom Almy) (11/21/89)

In article <5187@sdcc6.ucsd.edu> ir230@sdcc6.ucsd.edu (john wavrik) writes:
...
>This solution is for F-83:
>VARIABLE OSTATE      ' : @ CONSTANT DOCOL                 
... (lots of code)
>These words are interesting because they bring up a point: they do not involve 
>assembly language, but they involve a knowledge of how traditional Forth is 
>implemented. They involve knowing what is put in the return stack, how a 
>dictionary entry is compiled, what a dictionary entry looks like, etc. (The 
>amazing thing is that the definitions above will work on any traditionally 
>implemented Forth regardless of processor! 

Which brings up a sore point about any of these. This code, and most of
the "interesting" stuff that gets published, is very non-portable. What
is a "traditionally implemented Forth" anyway? And how many implementations
are traditional?

The 83 standard specifically says you cannot use tick to reference any
data that you didn't explicitly compile. This example fetches from the
compilation address of a word. I use a number of different Forth 
implementations, but on *none* of the ones I use will this work as intended.

That's not to say that this posting is "bad". Far from it. The problem, if
indeed there is one, is that this sort of program plays around with the
internals of Forth, and the internals are decidedly non-standard. 
Unfortunately, if these details were to become part of the standard then
it would inhibit better implementations. I've been running direct threaded
Forths for about seven years, and wouldn't want to be forced back to
indirect threaded to get this code to work!

It's sad but true -- to get the best results with Forth you have to
throw portability out the window. I keep the implementation dependent code
in my programs localized (even at the expense of performance) so I minimize
my porting struggles. But it would be nice if I could assume running on the
same machine/environment forever!

>It's like having a totally portable 
>assembly language.)  An aspect of Forth which hasn't been touched on is a 
>Forth programmer's ability to use his or her knowledge of the implementation. 
>(Which, again, accounts for a small fraction of code but some of the most 
>powerful results.) 

It's more like being a C programmer and having sources to your compiler. You
can add features to make your C better, but no one else can use your code
unless they have the same compiler sources.

Tom Almy
toma@tekgvs.labs.tek.com
Standard Disclaimers Apply

ir230@sdcc6.ucsd.edu (john wavrik) (11/22/89)

Tom Almy writes,
# In article <5187@sdcc6.ucsd.edu> ir230@sdcc6.ucsd.edu (john wavrik) writes:
# ...
# >This solution is for F-83:
# >VARIABLE OSTATE      ' : @ CONSTANT DOCOL                 
# ... (lots of code)
# >These words are interesting because they bring up a point: they do not
# >involve assembly language, but they involve a knowledge of how traditional
# >Forth is implemented.

> Which brings up a sore point about any of these. This code, and most of
> the "interesting" stuff that gets published, is very non-portable. What
> is a "traditionally implemented Forth" anyway? And how many implementations
> are traditional?

I suspect the code is just "non-portable" not "very non-portable". It is so 
short and simple, it might be interesting to see how this must be changed for 
direct threaded, jsr threaded, etc. (with some explanation of what changes 
were made and why they are needed). ANYONE INTERESTED, PLEASE POST. 

> It's sad but true -- to get the best results with Forth you have to
> throw portability out the window.

That's only because we've made some bad decisions about what constitutes
the language -- but fortunately we have a chance to rectify these mistakes.

I think that the use of implementation details in Forth is the subject of 
dishonest controversy. Those who feel that knowledge of the implementation 
should not be part of the Standard are usually the first to pull things off 
the return stack, relink the dictionary, and do all sorts of other "illegal" 
things as soon as you ask them how they would add local variables (or some 
other such thing) to a system. The problem of introducing unnamed code 
fragments was posed at the San Diego Chapter of FIG by Bob La Quey and all the 
solutions presented were of the character of the one I presented in my article.
It would be interesting to see someone come up with a way of doing this that 
is entirely "standard".

I guess we should mention that most of us write very little code which 
directly uses the implementation -- but that (1) it does account for some
fairly powerful results and (2) it is portable in that it does not depend on 
the host processor. [The code I submitted will run on all but one of the Forth 
implementations I have, on everything from an Apple II to a VAX (if I allow 
for the changed meaning of tick).]

> That's not to say that this posting is "bad". Far from it. The problem, if 
> indeed there is one, is that this sort of program plays around with the 
> internals of Forth, and the internals are decidedly non-standard. 
> Unfortunately, if these details were to become part of the standard then it 
> would inhibit better implementations.

Please let's distinguish things like the use of assembly language or features 
of the host processor from "the internals of Forth". A Standard COULD be 
proposed in which implementation details are specified (the FIG model was of 
this type) and then these details WOULD be standard (and portable). A 
decision to declare such programming "non-standard" is made by a committee of 
human beings, not forced on us by our computers. It's quite a reasonable idea 
to take a look at that decision to see if the reduction in power of the 
language is a price worth paying for baptizing a variety of deviant 
implementations. 

If you give people a "compiler" that they can modify, guess what they will do.
There is no problem with someone producing a private version of Forth that 
they feel is "better" for them -- and many people do just that. It is quite 
likely that some of the most important applications of Forth in the past 10 
years have been written using "non-standard" versions of Forth. I don't think 
that Standards inhibit people from experimenting with the language. The issue 
is more what can be done powerfully and portably. You do not add power to a 
language by removing some of its features and declaring them to be "decidedly 
non-standard". 

Forth has been in existence for about 20 years -- readily available for about 
10. During that period of time, we may have learned some things about how to 
improve its implementation. (I think a good case can be made that as much time 
and effort have been spent in experimenting with the language's implementation 
as in actually doing something with the language.) At this point we should 
take stock of what we have learned and embody that in a new Standard. If it is 
decided that direct threaded code actually produces major performance 
advantages over indirect threaded code, then the new Standard should specify 
direct threaded code. If we decide that floored division is preferable, then 
the new Standard should specify that. Instead we have a situation in which a 
vocal group says we should drive on the left side of the road, an equally 
vocal group says we should drive on the right -- and a Standards team 
which declares the issue to be "implementation dependent". 

                                                  John J Wavrik 
             jjwavrik@ucsd.edu                    Dept of Math  C-012 
                                                  Univ of Calif - San Diego 
                                                  La Jolla, CA  92093