[comp.lang.forth] attention Steve Sheppard part1

rob@idacom.uucp (Rob Chapman) (02/23/90)

I tried several attempts to email the following info to you but they either
bounced back or got swallowed.  So as a last resort, I'll post it through the
news net.

				68k-Forth

Overview

The source code is comprised of 3 parts: the metacompiler, the mini-assembler
and of course, the kernel.  Each part is described briefly below.  This
kernel is still in the testing stages but the kernel, as it is, will
metacompile itself which will allow for easy correction of bugs.  The easiest
thing to compile 68k-Forth on is another port of bot-Forth.  However since
most people don't have this, there are alternatives.  I originally
metacompiled bot-Forth on LMI's PCForth but the metacompiler needs to be
modified to do that (ie. T! and T@ need to do byte swapping).  I've tried to
keep the number of words that make up the metacompiler small so that it
shouldn't be to hard to port and the defintions for the words are in the
kernel itself.  If somebody wants, I could probably send out a binary image
of some sort (ie. ascii coded binary), but I think the best way is to
metacompile off a Forth somewhere.

Metacompiler

The metacompiler was presented at the 1989 Rochester Forth conference and a
paper describing it was published in the proceedings.  That one was more
general to convey its basic concepts.  The one in the source code is
specific for the 68k and works in conjunction with the Mini-assembler.

The metacompiler compiles the kernel source code into a buffer named tspace.
This buffer is about 12 kbytes.  The kernel source code has been arranged so
that there is no forward referencing.  This allows the code to be compiled
in a single pass without having to resolve forward references and simplifies
the work that the metacompiler has to do.

In normal compiling, defining words and immediate words do all the compiling
( : ; IF ENDIF etc), while the interpreter lays down calls to non-immediate
words.  In the metacompiler world, ] takes over the interpreter's job.  ] is
invoked by : or an occurance in the input stream.  ] looks up each word in
the input stream in two dictionaries: meta dictionary (META-FIND) and the
target dictionary (TFIND).  The meta-dictionary is all the words in the host
dictionary from latest @ to meta-start @.  The target dictionary are all the
words that are compiled in the target kernel in the tspace buffer.  The
reason that the meta-dictionary is searched first, is to allow definition of
immediate words.  The immediate words in the target kernel aren't executable
and their action must be provided by similar words in the meta-dictionary. 
Most of these meta-compilers are SMUDGEed so that they will not interfere
with normal compiling.  The ones which aren't smudged, check a flag to see
if they should compile into the host or target dictionary.  If a word cannot
be found in either dictionary, then it is converted to a number and
compiled (TLITERAL).

Throughout the kernel, you will find metacompilers.  They are preceded by a
switch back to the host space (META) and followed by a switch back to the
target space (TARGET).  The reason that they are sprinkled throughout the
kernel, is because they compile words from the target kernel into subsequent
definitions (ie. ." needs (.") to be defined first).

There are some special metacompilers which are actually phrase optimizers. 
These metacompilers look back at the previously layed down instruction and
try to lay down more efficient code (ie. 3 + becomes one opcode instead of
5).  At present, the optimizers are few in number and only optimize 2 word
sequences.

Mini-assembler

The mini-assembler compiles opcodes into target space or host space as
dictated by the target flag.  The syntax is very post-fix:

	source  destination  instruction  post-operands

Some of the 68k addressing modes require post-operands to complete the
instruction.  These must be layed down after the instruction has been
compiled.  For example, to put a value in top, the assembler would be:

	lit top move [ n T, ]	( put n into top of stack register )

The source and destination operands are immediate constants and the
instruction is an immediate word which combines the source and destination
operands with a value and lays it down in memory (A,).

Kernel

Word structure:
The structure of each definition is:  | length | link | name | code |

	length - this field is really there to make life easier for the
		 compiler.  It contains the length of the code in words
		 excluding the rts layed down by ;.  If this field is
		 6 or less, then the compiler will inline the definition
		 instead of compiling a call to it.  If the most significant
		 bit is set, then the word cannot be inlined.  There are two
		 reasons for the word not to be inlined: if the word has
		 any bsr's (as opposed to jsr's) or if ; has converted the
		 last subroutine call to jump to avoid coming back just for
		 an rts.
	link   - this is a pointer back to the link field of the previous
		 definition.  The first word in the dictionary has this
		 field set to 0.  So it follows that no word in the
		 dictionary should have its link field at the address 0.
		 The variable latest points to the most recent definition.
	name   - this field is a count prefixed string which contains the
		 name of the definition.  The count occupies the lower 5
		 bits of the first byte.  The most significant bit of the
		 first byte is always one to allow for reverse traversal of
		 the header (C>LINK).  The second msb of the first byte is
		 the immediate flag which is set by IMMEDIATE and checked by
		 the interpreter.  The third msb is the smudge bit which is
		 set by SMUDGE and cleared by RECURSE.  This is checked by
		 the dictionary search word COMPARE.
	code   - this variable length field contains 68k assembler opcodes.
		 There is no difference between a constant, a variable or a
		 word definition.  Since constants and variable's are 4 or 5
		 words long, they are inlined as literals.

Virtual Forth model on the 68k:
                        ,-------.
                     d7 |__top__|
                        ,-------.
      ,-------.      d6 |_next__|        ,-------.
   a6 |__sp___|--> ,------------.     a7 |__rp___|--> ,--------------.
                   |            |                     |              |
                   | data stack |                     | return stack |
                   |            |                     |              |

      ,-------.       ,-------.       ,-------.       ,-------.
   d0 |__alu__|    d5 |_index_|    a0 |__ptr__|    a5 |_page__|

	top   - top of data stack
	next  - second data stack entry
	sp    - pointer to the rest of data stack in memory (16 bit entries)
	rp    - pointer to the return stack (32 bit entries)
	alu   - general purpose data register
	index - index for FOR NEXT loops
	ptr   - general purpose address register
	page  - 32 bit address of base of kernel.  This is set by RAM which
		is executed by QUIT upon startup.  This register is used for
		any access to memory or by subroutine calls.  Also anything
		which is pushed to the return stack (>R) has this added to
		it so that an address may be pushed to the return stack to
		modify control flow.

Memory Model:
	origin# ,--------------.
		|  data stack  |
		| return stack |
		|     dp       |
		|   latest     |
		|___kernel_____|

	origin#      - start of everything.  This is set in the metacompiler
		       to 400H which is the end of the 68k exception vector
		       table.
	data stack   - starts at origin# 100 + and grows toward origin#
	return stack - starts at origin# 200 + and grows toword data stack
	dp	     - dictionary pointer points to start of free space at
		       the end of kernel.  dp and dp @ enclose the entire
		       kernel which would need to be saved as a binary.
	latest       - points to the link field of the latest definition
	kernel       - definitions that comprise the kernel

It should be pointed out that the kernel is limited to a maximum size of 64
kbytes since it is a 16 bit Forth, but a funny thing will happen at the 32k
limit.  The 68k sign extends all 16 bit addresses to 32 bits.  So in our
local memory map, what is perceived as linear from 0 to FFFF (using unsigned
addresses, since signed addresses would straighten everthing out but it
would seem wierd), will actually run from 8000 to FFFF and then 0 to 7FFF in
global memory space.  The reference point of 0 is what is in the page
register.  The 64k limit may of course be expanded by changing page or using
the other 4 address registers as other page registers.

Large differences from other Forths:

	COMPILE  ( cfa -- )
	  this word is completely different from what most people expect and
	  it is this way for clarity.  The definition of INTERPRET gets
	  the cfa (code field address) and if the word is to be executed it
	  calls EXECUTE.  If it is to be compiled, it calls COMPILE.  A word
	  may now be 'ed and passed onto EXECUTE or COMPILE in the same
	  manner.  For example compare the definitions of OF:

	  (a) : OF  COMPILE OVER  COMPILE =  [COMPILE] IF  COMPILE DROP ;

	  (b) : OF  ' OVER COMPILE  ' = COMPILE  ' IF EXECUTE  ' DROP COMPILE ;

	  I've found that (b) is much more easily taught and there
	  is no problem remembering what it does at a later date.  This
	  definition for COMPILE was presented at the '89 FORML conference
	  and generally thought to be much clearer.

	FOR  ( n -- )  NEXT  ( -- )
	  This looping construct was first proposed by Mr. Forth himself
	  (Chuck Moore) as a realization of a better way to do loops.  A lot
	  of code does  0 DO ... LOOP and makes no use of the index.  This
	  is slow since two parameters must be kept around.  Also in the
	  cases where the index is used as an address, the operators @+ !+
	  make better use of the data stack instead of  OVER + SWAP DO.
	  I've used this looping construct exclusively for the past year and
	  found that I don't need DO LOOP's any more.  The one difference
	  from Chuck's definition of this construct, is the number of times
	  that it executes the loop.  Chuck's FOR...NEXT will loop n 1 +
	  times whereas bot-Forth's FOR...NEXT will loop n times and not
	  loop if it is passed a 0.