rob@idacom.uucp (Rob Chapman) (02/23/90)
I tried several attempts to email the following info to you but they either bounced back or got swallowed. So as a last resort, I'll post it through the news net. 68k-Forth Overview The source code is comprised of 3 parts: the metacompiler, the mini-assembler and of course, the kernel. Each part is described briefly below. This kernel is still in the testing stages but the kernel, as it is, will metacompile itself which will allow for easy correction of bugs. The easiest thing to compile 68k-Forth on is another port of bot-Forth. However since most people don't have this, there are alternatives. I originally metacompiled bot-Forth on LMI's PCForth but the metacompiler needs to be modified to do that (ie. T! and T@ need to do byte swapping). I've tried to keep the number of words that make up the metacompiler small so that it shouldn't be to hard to port and the defintions for the words are in the kernel itself. If somebody wants, I could probably send out a binary image of some sort (ie. ascii coded binary), but I think the best way is to metacompile off a Forth somewhere. Metacompiler The metacompiler was presented at the 1989 Rochester Forth conference and a paper describing it was published in the proceedings. That one was more general to convey its basic concepts. The one in the source code is specific for the 68k and works in conjunction with the Mini-assembler. The metacompiler compiles the kernel source code into a buffer named tspace. This buffer is about 12 kbytes. The kernel source code has been arranged so that there is no forward referencing. This allows the code to be compiled in a single pass without having to resolve forward references and simplifies the work that the metacompiler has to do. In normal compiling, defining words and immediate words do all the compiling ( : ; IF ENDIF etc), while the interpreter lays down calls to non-immediate words. In the metacompiler world, ] takes over the interpreter's job. ] is invoked by : or an occurance in the input stream. ] looks up each word in the input stream in two dictionaries: meta dictionary (META-FIND) and the target dictionary (TFIND). The meta-dictionary is all the words in the host dictionary from latest @ to meta-start @. The target dictionary are all the words that are compiled in the target kernel in the tspace buffer. The reason that the meta-dictionary is searched first, is to allow definition of immediate words. The immediate words in the target kernel aren't executable and their action must be provided by similar words in the meta-dictionary. Most of these meta-compilers are SMUDGEed so that they will not interfere with normal compiling. The ones which aren't smudged, check a flag to see if they should compile into the host or target dictionary. If a word cannot be found in either dictionary, then it is converted to a number and compiled (TLITERAL). Throughout the kernel, you will find metacompilers. They are preceded by a switch back to the host space (META) and followed by a switch back to the target space (TARGET). The reason that they are sprinkled throughout the kernel, is because they compile words from the target kernel into subsequent definitions (ie. ." needs (.") to be defined first). There are some special metacompilers which are actually phrase optimizers. These metacompilers look back at the previously layed down instruction and try to lay down more efficient code (ie. 3 + becomes one opcode instead of 5). At present, the optimizers are few in number and only optimize 2 word sequences. Mini-assembler The mini-assembler compiles opcodes into target space or host space as dictated by the target flag. The syntax is very post-fix: source destination instruction post-operands Some of the 68k addressing modes require post-operands to complete the instruction. These must be layed down after the instruction has been compiled. For example, to put a value in top, the assembler would be: lit top move [ n T, ] ( put n into top of stack register ) The source and destination operands are immediate constants and the instruction is an immediate word which combines the source and destination operands with a value and lays it down in memory (A,). Kernel Word structure: The structure of each definition is: | length | link | name | code | length - this field is really there to make life easier for the compiler. It contains the length of the code in words excluding the rts layed down by ;. If this field is 6 or less, then the compiler will inline the definition instead of compiling a call to it. If the most significant bit is set, then the word cannot be inlined. There are two reasons for the word not to be inlined: if the word has any bsr's (as opposed to jsr's) or if ; has converted the last subroutine call to jump to avoid coming back just for an rts. link - this is a pointer back to the link field of the previous definition. The first word in the dictionary has this field set to 0. So it follows that no word in the dictionary should have its link field at the address 0. The variable latest points to the most recent definition. name - this field is a count prefixed string which contains the name of the definition. The count occupies the lower 5 bits of the first byte. The most significant bit of the first byte is always one to allow for reverse traversal of the header (C>LINK). The second msb of the first byte is the immediate flag which is set by IMMEDIATE and checked by the interpreter. The third msb is the smudge bit which is set by SMUDGE and cleared by RECURSE. This is checked by the dictionary search word COMPARE. code - this variable length field contains 68k assembler opcodes. There is no difference between a constant, a variable or a word definition. Since constants and variable's are 4 or 5 words long, they are inlined as literals. Virtual Forth model on the 68k: ,-------. d7 |__top__| ,-------. ,-------. d6 |_next__| ,-------. a6 |__sp___|--> ,------------. a7 |__rp___|--> ,--------------. | | | | | data stack | | return stack | | | | | ,-------. ,-------. ,-------. ,-------. d0 |__alu__| d5 |_index_| a0 |__ptr__| a5 |_page__| top - top of data stack next - second data stack entry sp - pointer to the rest of data stack in memory (16 bit entries) rp - pointer to the return stack (32 bit entries) alu - general purpose data register index - index for FOR NEXT loops ptr - general purpose address register page - 32 bit address of base of kernel. This is set by RAM which is executed by QUIT upon startup. This register is used for any access to memory or by subroutine calls. Also anything which is pushed to the return stack (>R) has this added to it so that an address may be pushed to the return stack to modify control flow. Memory Model: origin# ,--------------. | data stack | | return stack | | dp | | latest | |___kernel_____| origin# - start of everything. This is set in the metacompiler to 400H which is the end of the 68k exception vector table. data stack - starts at origin# 100 + and grows toward origin# return stack - starts at origin# 200 + and grows toword data stack dp - dictionary pointer points to start of free space at the end of kernel. dp and dp @ enclose the entire kernel which would need to be saved as a binary. latest - points to the link field of the latest definition kernel - definitions that comprise the kernel It should be pointed out that the kernel is limited to a maximum size of 64 kbytes since it is a 16 bit Forth, but a funny thing will happen at the 32k limit. The 68k sign extends all 16 bit addresses to 32 bits. So in our local memory map, what is perceived as linear from 0 to FFFF (using unsigned addresses, since signed addresses would straighten everthing out but it would seem wierd), will actually run from 8000 to FFFF and then 0 to 7FFF in global memory space. The reference point of 0 is what is in the page register. The 64k limit may of course be expanded by changing page or using the other 4 address registers as other page registers. Large differences from other Forths: COMPILE ( cfa -- ) this word is completely different from what most people expect and it is this way for clarity. The definition of INTERPRET gets the cfa (code field address) and if the word is to be executed it calls EXECUTE. If it is to be compiled, it calls COMPILE. A word may now be 'ed and passed onto EXECUTE or COMPILE in the same manner. For example compare the definitions of OF: (a) : OF COMPILE OVER COMPILE = [COMPILE] IF COMPILE DROP ; (b) : OF ' OVER COMPILE ' = COMPILE ' IF EXECUTE ' DROP COMPILE ; I've found that (b) is much more easily taught and there is no problem remembering what it does at a later date. This definition for COMPILE was presented at the '89 FORML conference and generally thought to be much clearer. FOR ( n -- ) NEXT ( -- ) This looping construct was first proposed by Mr. Forth himself (Chuck Moore) as a realization of a better way to do loops. A lot of code does 0 DO ... LOOP and makes no use of the index. This is slow since two parameters must be kept around. Also in the cases where the index is used as an address, the operators @+ !+ make better use of the data stack instead of OVER + SWAP DO. I've used this looping construct exclusively for the past year and found that I don't need DO LOOP's any more. The one difference from Chuck's definition of this construct, is the number of times that it executes the loop. Chuck's FOR...NEXT will loop n 1 + times whereas bot-Forth's FOR...NEXT will loop n times and not loop if it is passed a 0.