[net.micro] FORTH w/dynamic memory allocation

idhopper (12/02/82)

A couple of people have requested info on my FORTH with dynamic memory
allocation, so here it is.  It is just a preliminary design, not even
close to being implemented, and so I welcome any comments and
suggestions you may have.

			   Memory management
			   -----------------

DynaForth requires a storage allocation scheme that supports the
following three data types: 

	(1)	Small Integers.  The value is folded into the object
		pointer so they would be to +/- 16384.
	(2)	Array.  This is a vector of object pointers, which can
		be up to some specified limit in length (probably up
		to 16384 pointers would match well with SmallInts.
	(3)	ByteArray.  This is just that, a vector of bytes, up to
		some reasonable limit (probably 65K bytes).  It is used
		for strings, long integers, bitmaps, etc.

This memory manager supports the following operation upon these objects:

	Create:	With appropriate parameters, returns an object of the
		right size, or says it can't be done (no memory, etc.)
	Put:	Put the specified value at the specified location of
		a (Byte)Array.
	Get:	Get the Nth field of a (Byte)Array.

There may be other things too, like block moves, that are implemented
for speed, but these are the basics.  Note that objects automatically
disappear when they can't be accessed any more; this can be done by
reference counting or incremental garbage collection, or something like
that.  If this system sounds a lot like Smalltalk's, it's because that's
what I based it on.

			      Kernel Stuff
			      ------------

The threaded code interpreter follows the object pointers in an
Array-type object, either doing an automatic return when it reaches the
end, or else wedging (the former is far preferable and just as easy).
Each of these object pointers points to an Array that contains
information like you would find in a normal FORTH variable block:

	Code Field:	A ByteArray that contains the machine language
			code that is executed.
	Data Field:	An object that is the value of this variable
	Flag Field:	Contains flags like "Execute on compiling?"
	Name Field:	A ByteArray that contains the name of this
			variable.
	Dictionary Fld:	A pointer to the dictionary containing this var.

	{other stuff that it might be nice to have; I haven't decided:}
	Decompile Fld:	A routine to execute when decompiling this word.
	Compile Field:	A routine to execute when compiling this word;
			would make a "Flag field" less necessary.

A dictionary would be an Array of name Arrays, and could be organized
into a fairly complex tree structure.  This also eliminates the need to
organize entries into a linked list, as you can redefine old words quite
easily; furthermore, you could use efficient binary search algorithms to
look up a word in the dictionary.

What the primitives would look like:

	@	object offset --- contents
	!	contents object offset ---
	<var>	--- pointerToVarBlock offsetToDataField
	.	object ---
		(if it's a SmallInt or a ByteArray, then print it as a
		number in the current radix, else complain)
	
It is rather a pain having to have two values on the stack to store or
fetch a variable, but I can't see any other nice ways of doing it.
Usually, you would just reference an object anyways, without worrying
about an offset.  Arithmetic words should overflow and underflow back
and forth from ByteArrays to SmallInts.

			     Fancier Stuff
			     -------------

It would be quite easy to allow multiple processes, each with its own
userArea object, which contains the stack, the return stack, and other
sorts of necessary stuff.  

One very nice feature is the ability to pass procedures on the stack,
with something like the Smalltalk block construct:

	[ ...do all sorts of stuff... ] foo !
	.
	.
	.
	foo @ exec

This is in fact quite simple, as "[" simply creates an Array and
redirects the compiler into it.  "]" directs the compiler back whence it
came.  So, to define a word you would go:

	" foobar" makeVariable
	[ 2 3 * + ]
	  foobar !Exec

Where { " foobar" makeVariable } adds a variable named foobar to the
current dictionary, and { !Exec } stores a value and modifies the
variable's code field to execute its value instead of just leaving an
address on the stack.

The disadvantage of this is that it means that all your control
structures will also be reverse-Polish.  For example,

	[ [ thenPart ]
	  [ elsePart ]
	  isTheMoonFull
	    ifThenElse ]
	[ hellFreezesOver ]
	  repeatUntil

Pretty grody.  However, you can always write a nice, simple infix
compiler for stuff like this without any enormous problem.

etc...

There are many other details that need to be resolved.  This is just a
first cut at the problem, but I hope that you find it interesting.

	--ravi