rob@idacom.uucp (Rob Chapman) (02/23/90)
I tried several attempts to email the following info to you but they either
bounced back or got swallowed. So as a last resort, I'll post it through the
news net.
68k-Forth
Overview
The source code is comprised of 3 parts: the metacompiler, the mini-assembler
and of course, the kernel. Each part is described briefly below. This
kernel is still in the testing stages but the kernel, as it is, will
metacompile itself which will allow for easy correction of bugs. The easiest
thing to compile 68k-Forth on is another port of bot-Forth. However since
most people don't have this, there are alternatives. I originally
metacompiled bot-Forth on LMI's PCForth but the metacompiler needs to be
modified to do that (ie. T! and T@ need to do byte swapping). I've tried to
keep the number of words that make up the metacompiler small so that it
shouldn't be to hard to port and the defintions for the words are in the
kernel itself. If somebody wants, I could probably send out a binary image
of some sort (ie. ascii coded binary), but I think the best way is to
metacompile off a Forth somewhere.
Metacompiler
The metacompiler was presented at the 1989 Rochester Forth conference and a
paper describing it was published in the proceedings. That one was more
general to convey its basic concepts. The one in the source code is
specific for the 68k and works in conjunction with the Mini-assembler.
The metacompiler compiles the kernel source code into a buffer named tspace.
This buffer is about 12 kbytes. The kernel source code has been arranged so
that there is no forward referencing. This allows the code to be compiled
in a single pass without having to resolve forward references and simplifies
the work that the metacompiler has to do.
In normal compiling, defining words and immediate words do all the compiling
( : ; IF ENDIF etc), while the interpreter lays down calls to non-immediate
words. In the metacompiler world, ] takes over the interpreter's job. ] is
invoked by : or an occurance in the input stream. ] looks up each word in
the input stream in two dictionaries: meta dictionary (META-FIND) and the
target dictionary (TFIND). The meta-dictionary is all the words in the host
dictionary from latest @ to meta-start @. The target dictionary are all the
words that are compiled in the target kernel in the tspace buffer. The
reason that the meta-dictionary is searched first, is to allow definition of
immediate words. The immediate words in the target kernel aren't executable
and their action must be provided by similar words in the meta-dictionary.
Most of these meta-compilers are SMUDGEed so that they will not interfere
with normal compiling. The ones which aren't smudged, check a flag to see
if they should compile into the host or target dictionary. If a word cannot
be found in either dictionary, then it is converted to a number and
compiled (TLITERAL).
Throughout the kernel, you will find metacompilers. They are preceded by a
switch back to the host space (META) and followed by a switch back to the
target space (TARGET). The reason that they are sprinkled throughout the
kernel, is because they compile words from the target kernel into subsequent
definitions (ie. ." needs (.") to be defined first).
There are some special metacompilers which are actually phrase optimizers.
These metacompilers look back at the previously layed down instruction and
try to lay down more efficient code (ie. 3 + becomes one opcode instead of
5). At present, the optimizers are few in number and only optimize 2 word
sequences.
Mini-assembler
The mini-assembler compiles opcodes into target space or host space as
dictated by the target flag. The syntax is very post-fix:
source destination instruction post-operands
Some of the 68k addressing modes require post-operands to complete the
instruction. These must be layed down after the instruction has been
compiled. For example, to put a value in top, the assembler would be:
lit top move [ n T, ] ( put n into top of stack register )
The source and destination operands are immediate constants and the
instruction is an immediate word which combines the source and destination
operands with a value and lays it down in memory (A,).
Kernel
Word structure:
The structure of each definition is: | length | link | name | code |
length - this field is really there to make life easier for the
compiler. It contains the length of the code in words
excluding the rts layed down by ;. If this field is
6 or less, then the compiler will inline the definition
instead of compiling a call to it. If the most significant
bit is set, then the word cannot be inlined. There are two
reasons for the word not to be inlined: if the word has
any bsr's (as opposed to jsr's) or if ; has converted the
last subroutine call to jump to avoid coming back just for
an rts.
link - this is a pointer back to the link field of the previous
definition. The first word in the dictionary has this
field set to 0. So it follows that no word in the
dictionary should have its link field at the address 0.
The variable latest points to the most recent definition.
name - this field is a count prefixed string which contains the
name of the definition. The count occupies the lower 5
bits of the first byte. The most significant bit of the
first byte is always one to allow for reverse traversal of
the header (C>LINK). The second msb of the first byte is
the immediate flag which is set by IMMEDIATE and checked by
the interpreter. The third msb is the smudge bit which is
set by SMUDGE and cleared by RECURSE. This is checked by
the dictionary search word COMPARE.
code - this variable length field contains 68k assembler opcodes.
There is no difference between a constant, a variable or a
word definition. Since constants and variable's are 4 or 5
words long, they are inlined as literals.
Virtual Forth model on the 68k:
,-------.
d7 |__top__|
,-------.
,-------. d6 |_next__| ,-------.
a6 |__sp___|--> ,------------. a7 |__rp___|--> ,--------------.
| | | |
| data stack | | return stack |
| | | |
,-------. ,-------. ,-------. ,-------.
d0 |__alu__| d5 |_index_| a0 |__ptr__| a5 |_page__|
top - top of data stack
next - second data stack entry
sp - pointer to the rest of data stack in memory (16 bit entries)
rp - pointer to the return stack (32 bit entries)
alu - general purpose data register
index - index for FOR NEXT loops
ptr - general purpose address register
page - 32 bit address of base of kernel. This is set by RAM which
is executed by QUIT upon startup. This register is used for
any access to memory or by subroutine calls. Also anything
which is pushed to the return stack (>R) has this added to
it so that an address may be pushed to the return stack to
modify control flow.
Memory Model:
origin# ,--------------.
| data stack |
| return stack |
| dp |
| latest |
|___kernel_____|
origin# - start of everything. This is set in the metacompiler
to 400H which is the end of the 68k exception vector
table.
data stack - starts at origin# 100 + and grows toward origin#
return stack - starts at origin# 200 + and grows toword data stack
dp - dictionary pointer points to start of free space at
the end of kernel. dp and dp @ enclose the entire
kernel which would need to be saved as a binary.
latest - points to the link field of the latest definition
kernel - definitions that comprise the kernel
It should be pointed out that the kernel is limited to a maximum size of 64
kbytes since it is a 16 bit Forth, but a funny thing will happen at the 32k
limit. The 68k sign extends all 16 bit addresses to 32 bits. So in our
local memory map, what is perceived as linear from 0 to FFFF (using unsigned
addresses, since signed addresses would straighten everthing out but it
would seem wierd), will actually run from 8000 to FFFF and then 0 to 7FFF in
global memory space. The reference point of 0 is what is in the page
register. The 64k limit may of course be expanded by changing page or using
the other 4 address registers as other page registers.
Large differences from other Forths:
COMPILE ( cfa -- )
this word is completely different from what most people expect and
it is this way for clarity. The definition of INTERPRET gets
the cfa (code field address) and if the word is to be executed it
calls EXECUTE. If it is to be compiled, it calls COMPILE. A word
may now be 'ed and passed onto EXECUTE or COMPILE in the same
manner. For example compare the definitions of OF:
(a) : OF COMPILE OVER COMPILE = [COMPILE] IF COMPILE DROP ;
(b) : OF ' OVER COMPILE ' = COMPILE ' IF EXECUTE ' DROP COMPILE ;
I've found that (b) is much more easily taught and there
is no problem remembering what it does at a later date. This
definition for COMPILE was presented at the '89 FORML conference
and generally thought to be much clearer.
FOR ( n -- ) NEXT ( -- )
This looping construct was first proposed by Mr. Forth himself
(Chuck Moore) as a realization of a better way to do loops. A lot
of code does 0 DO ... LOOP and makes no use of the index. This
is slow since two parameters must be kept around. Also in the
cases where the index is used as an address, the operators @+ !+
make better use of the data stack instead of OVER + SWAP DO.
I've used this looping construct exclusively for the past year and
found that I don't need DO LOOP's any more. The one difference
from Chuck's definition of this construct, is the number of times
that it executes the loop. Chuck's FOR...NEXT will loop n 1 +
times whereas bot-Forth's FOR...NEXT will loop n times and not
loop if it is passed a 0.