[comp.lang.forth] Multi-line custom compilers

wmb@MITCH.ENG.SUN.COM (08/30/90)

> The point is that the solutions offered [to the problem that Dr. Wavrik
> posed about building your own compiling words] are only valid if the
> source code is on one line, such as the specific formulation proposed
> as the problem.

One of my A-1, non-negotiable, burning-issue-class goals with the file
input stream stuff was the ability to write a version of WORD (call it
GETWORD for the sake of discussion) that will read past line boundaries.

I believe that this goal was achieved, but I guess I'd better prove it by
writing up the solution.  Stay tuned.  I'll post it when I finish.

If I can't do it, it's burning issue time.

Mitch Bradley, wmb@Eng.Sun.COM

toma@tekgvs.LABS.TEK.COM (Tom Almy) (08/31/90)

In article <9008310127.AA29178@ucbvax.Berkeley.EDU> wmb%MITCH.ENG.SUN.COM@SCFVM.GSFC.NASA.GOV writes:
>> The point is that the solutions offered [to the problem that Dr. Wavrik
>> posed about building your own compiling words] are only valid if the
>> source code is on one line, such as the specific formulation proposed
>> as the problem.
>
>One of my A-1, non-negotiable, burning-issue-class goals with the file
>input stream stuff was the ability to write a version of WORD (call it
>GETWORD for the sake of discussion) that will read past line boundaries.

I needed this for my Forth compilers, to handle conditional compilation,
among other things. It is quite simple; you end up with something on
the order of:

	: GETWORD   ( delim -- string )
		BEGIN
			DUP WORD	( parse next word )
		DUP C@ 0= WHILE		( while at end of line )
			DROP		( toss the null word parsed )
			QUERY		( read the next line )
		REPEAT
		SWAP DROP		( toss the delimiter )
	;

That's as close as I can get to a generic 83 Standard solution.

Tom Almy
toma@tekgvs.labs.tek.com
Standard Disclaimers Apply

wmb@MITCH.ENG.SUN.COM (09/01/90)

> > a version of WORD (call it GETWORD for the sake of discussion) that
> > will read past line boundaries.

>	: GETWORD   ( delim -- string )
>		BEGIN
>			DUP WORD	( parse next word )
>		DUP C@ 0= WHILE		( while at end of line )
>			DROP		( toss the null word parsed )
>			QUERY		( read the next line )
>		REPEAT
>		SWAP DROP		( toss the delimiter )
>	;

With the Basis12 version of QUERY, which works not only for keyboard input
but also for other input stream sources, this is pretty close.

There is a problem with this solution: the "hard end of input stream"
case.  QUERY has no way to report that it did not succeed in acquiring
another chunk of input, so the loop cannot tell when to stop.


   Input Source         End Conditions

   Keyboard             No end condition; user is assumed to supply an
                        endless series of input lines.

   Block                Questionable.  Should GETWORD read past the end of
                        a block?  Should that depend on whether the block was
                        loaded with LOAD or with THRU ?  Should there be
                        a "splice" word equivalent to FIG's "-->" ?

   Block inside file    Presumably, GETWORD could read the next block, up to
                        the last block in the file, but what about shadow
                        screens?

   Text file            End-of-file prevents QUERY from acquiring another
                        input line.

   String (EVALUATE)    End-of-string is end of input stream.  QUERY cannot
                        acquire more from this input source.


Personally, I feel that the traditional Forth "dual-mode" input stream
mechanism (BLK, BLOCK, TIB, #TIB, >IN) is a horrible disgusting mess.
I prefer (and supply in my products) a stream abstraction where all
input sources are uniformly accessed via a GETWORD procedure which
ignores line boundaries.  Refilling of input buffers from whatever source
is handled transparently in a lower layer.

I surveyed a lot of users and vendors, and nearly everybody prefers this
model, but the weight of "common practice" is squarely on the side of the
traditional mechanism, so that is what is in ANS Basis.  The stream model
would break everybody's code.

I have the dubious distinction of being the author of a part of Basis
(the file input mechanism) that I don't really like.  But at least it works,
assuming that we can fix the "end condition" problem.  I have submitted
a proposal to fix it.  The gist of the proposal is to add a new word REFILL
which is like the generalized (works on all input sources) QUERY .  REFILL
returns a flag telling whether or not it succeeded in refilling the input
buffer.
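
As a rough sketch only, here is how GETWORD might look on top of such a
REFILL.  It assumes REFILL has the stack effect ( -- flag ), with true
meaning the input buffer was refilled.  Returning an empty counted string
at hard end of input is a choice made for this sketch, not part of the
proposal, and SHOW-NEXT is a hypothetical word added only to show usage.

```forth
\ Sketch only.  Assumes REFILL ( -- flag ) as described in the proposal:
\ flag is true if another chunk of input was acquired.
: GETWORD   ( delim -- string )
   BEGIN
      DUP WORD            ( delim string )  \ parse next word
      DUP C@ 0=           ( delim string empty? )
   WHILE                  \ while at end of the current buffer
      REFILL 0= IF        \ hard end of input stream?
         SWAP DROP EXIT   \ give up; return the empty string
      THEN
      DROP                \ toss the null word parsed
   REPEAT
   SWAP DROP              \ toss the delimiter
;

\ Hypothetical usage: type the next blank-delimited word, even if it
\ starts on a later line.  Prints nothing at hard end of input.
: SHOW-NEXT   ( -- )   BL GETWORD COUNT TYPE ;
```

A caller can detect the end condition with C@ 0= on the returned string,
which is the same test the loop itself uses to notice an exhausted buffer.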

Such are the trials and tribulations of codifying existing practice, rather
than inventing a new language.

This traditional input stream mess is one of the reasons that I prefer a
functional definition of the language, rather than a virtual machine model
where you get to torque on all the visible nuts and bolts.

Mitch Bradley, wmb@Eng.Sun.COM