DAVID@PENNDRLS.BITNET (08/15/90)
How about this for a more postfix FORTH syntax: " SomeWord" Is 20 Constant " *2" Is : 2 * ; 'Is' factors out Create without changing the compiler significantly. The Name field still gets laid down first. Defining words then lay down a Code field at the current location, which is conveniently right where it should go. Another idea I've been fooling with arose from a previous discussion of Vocabularies as string recognizers. The argument was to enlarge the scope of Number by implementing Vocabularies as generalized string recognizers, with most of them doing it through table lookup in the dictionary, but Number and similar routines doing it alogorithmically. For instance, you could allow "x" to be recognized as a character constant just as 5 is recognized as a numerical constant. So, how about a recognizer that is a mixture? Suppose we have a word that recognizes tokens of the form 'Something' as being a reference to the CFA of the word Something? The Interpreter would leave the CFA on the stack, the Compiler would compile it as a litteral. We could then say Defer FooBar 'Fred' 'FooBar' Is : FooFred 'Fred' 'FooBar' Is ; (to use Is in its more common meaning). I think this is much prettier. And more postfix. Once started down this path, though, you start to get into a lot of delimited strings. For instance, "Something" could be Something as a string constant, <Something> could be the Body address of Something, and [Something] could Compile Something even if Something is Immediate. I'm not sure I want to go that far, but . . . "Fred" Is Immediate : Fred's Action ; "CompFred" Is : Some Extra [Fred] Actions ; "FooBar" Is Defered "FooFred" Is : 'Fred' <FooBar> ! ; Now for the Big Question. Would this still be FORTH? I think it would be. But it would *not* be Chuck Moore Forth, or ANS FORTH. -- R. David Murray (DAVID@PENNDRLS.BITNET, DAVID@PENNDRLS.UPENN.EDU)
dwp@willett.pgh.pa.us (Doug Philips) (08/16/90)
In <9008152018.AA23733@ucbvax.Berkeley.EDU>, DAVID@PENNDRLS.BITNET writes: > How about this for a more postfix FORTH syntax: > > " SomeWord" Is 20 Constant > " *2" Is : 2 * ; > > 'Is' factors out Create without changing the compiler significantly. > The Name field still gets laid down first. Defining words then > lay down a Code field at the current location, which is > conveniently right where it should go. I was going to reply that I thought your 'Is' was actually infix and not postfix. However, I think I'm too confused to say that with confidence. The point of postfix is that parameters are passed to functions and that the functions don't/shouldn't care how the parameters are created. I think your 'Is' example shows that the physically adjacency of the dictionary is not necessary are pure factoring. I agree with: 20 Constant ( and ) : 2 * ; being words that lay down code somewhere. I think that for 'Is' to be strictly postfix, the address of the code to be associated with the word should be on the stack along with the name for the word. What I think you have in your 'Is' is a form of deferred word that is immediately resolved. > Another idea I've been fooling with arose from a previous discussion > of Vocabularies as string recognizers. The argument was to enlarge > the scope of Number by implementing Vocabularies as generalized string > recognizers, with most of them doing it through table lookup in the > dictionary, but Number and similar routines doing it alogorithmically. > For instance, you could allow "x" to be recognized as a character > constant just as 5 is recognized as a numerical constant. In March 90, on comp.lang.misc Darren (Last name forgotten) "new@udel.edu" proposed a language he was working on called '2ol' which was just what you are talking about. The parser is a sequence of functions that each get a crack at the current token until one of them claims to have processed it. Of course, you could always pervert NUMBER to do something like that. The interpreter doesn't "know" that NUMBER only parses numbers, it just passes tokens to it. NUMBER could be a jump off point for what you are talking about (but then perhaps a better name would be less confusing.) > Once started down this path, though, you start to get into a lot of > delimited strings. For instance, "Something" could be Something as a > string constant, <Something> could be the Body address of Something, > and [Something] could Compile Something even if Something is Immediate. > I'm not sure I want to go that far, but . . . > > Now for the Big Question. Would this still be FORTH? I think it > would be. But it would *not* be Chuck Moore Forth, or ANS FORTH. I think it would be '2ol'. Whether or not that is Forth, I'm not sure. I think you could do something like that *in* Forth... -Doug --- Preferred: ( dwp@willett.pgh.pa.us OR ...!{sei,pitt}!willett!dwp ) Daily: ...!{uunet,nfsun}!willett!dwp [last resort: dwp@vega.fac.cs.cmu.edu]
DAVID@PENNDRLS.BITNET (08/18/90)
>> How about this for a more postfix FORTH syntax: >> >> " SomeWord" Is 20 Constant >> " *2" Is : 2 * ; >> >> 'Is' factors out Create without changing the compiler significantly. >> The Name field still gets laid down first. Defining words then >> lay down a Code field at the current location, which is >> conveniently right where it should go. >I was going to reply that I thought your 'Is' was actually infix and >not postfix. However, I think I'm too confused to say that with confidenc >The point of postfix is that parameters are passed to functions and that >the functions don't/shouldn't care how the parameters are created. >I think your 'Is' example shows that the physically adjacency of the >dictionary is not necessary are pure factoring. I agree with: > 20 Constant ( and ) > : 2 * ; >being words that lay down code somewhere. I think that for 'Is' to be >strictly postfix, the address of the code to be associated with the >word should be on the stack along with the name for the word. What I >think you have in your 'Is' is a form of deferred word that is >immediately resolved. Actually my logic was as follows: Is is a word that takes a string and builds an index entry that points to Here. Constant and : are 'defining words' that lay down a Code Field and Body. Remember that the index is not required for FORTH execution. The compiler looks up words in the Index and compiles the address to which the index entry points. Like data types in FORTH, it is the programmer's responsability to see to it that Is was called only to label an address where he or she subsequently put a Code Field. Is doesn't care, and neither does the compiler. In fact, if one were to define the system such that the index was located in a different area than the dictionary, one could use Is to label an arbitrary memory location, such as defining a label in an assembler program: " 2Push" Is Code: BX Push, " 1Push" Is AX Push, " Next" Is ..... ;Code (to crib an example from my memory of F83). But I agree with you: 'postfix' vs 'prefix' vs 'infix' seems to be a much subtler problem than I ever imagined! -- R. David Murray (DAVID@PENNDRLS.BITNET, DAVID@PENNDRLS.UPENN.EDU)
dwp@willett.pgh.pa.us (Doug Philips) (08/20/90)
In <9008180333.AA11956@ucbvax.Berkeley.EDU>, DAVID@PENNDRLS.BITNET writes: > Actually my logic was as follows: Is is a word that takes a string and > builds an index entry that points to Here. Constant and : are 'defining > words' that lay down a Code Field and Body. Remember that the index is > not required for FORTH execution. The compiler looks up words in the > Index and compiles the address to which the index entry points. Like > data types in FORTH, it is the programmer's responsability to see to it > that Is was called only to label an address where he or she subsequently > put a Code Field. Is doesn't care, and neither does the compiler. Ok, I think I understand a bit better what you're trying to do. I think that you have removed the prefix nature of CREATE, because the name comes first. Unfortunately the binding for the name still comes after. There is an implicit parameter, HERE, which is what binds the name to the code that defines it. First, the implicit "HERE" parameter means that if the code which is defining the word contains immediate words that invoke Is, you're out of luck. I'm not quite sure if the use of global variables is or isn't a deciding factor for post-fix-ness. I would be inclined to say no, because *all* the parameters should be on the stack. This has the nice side-effect that the words are reentrant. > In > fact, if one were to define the system such that the index was located > in a different area than the dictionary, one could use Is to label an > arbitrary memory location, such as defining a label in an assembler > program: > > " 2Push" Is Code: BX Push, " 1Push" Is AX Push, > " Next" Is ..... ;Code Second, I think that the requirement for separate dictionary and code spaces is overly restrictive and unnecessary. As for the specifics of your labelled code example, a question: Does Is rip out the previous string from the definition, or is " an IMMEDIATE word in the assembler vocabulary (assuming you use vocabularies)? Aside from allowing arbitrary labelling of words in code, I'm don't get the advantage to using Is to label an arbitrary memory location. If Is binds its name to HERE, I don't see quite how the location could be truly arbitrary. If Is could define a truly arbitrary memory location, wouldn't that mean the location's address would have to be on the stack? Seems like that is a more cleanly post-fix solution. How about this example of a purely postfix version of CODE: ( vocabulary manipulations as necessary go here...) : CODE: ( -- cfa-able-address ) ... ; IMMEDIATE CODE: BX PUSH CODE: AX PUSH CODE: ( Machinations for NEXT... ) ;Code " Next" Is " 1Push" Is " 2Push" Is My claim is that if you have more stack-ed addresses than you can keep track of, you are doing something wrong. It maybe that with CODE: you are justified in performing arbitrary complex operations and that if normal Forth factoring were an issue you wouldn't be using CODE: in the first place. Of course, in my example, I assume that CODE: doesn't do anything obnoxious like create a code prolog or do any alignment, etc. Of course those restrictions would apply equally to your example. -Doug --- Preferred: ( dwp@willett.pgh.pa.us OR ...!{sei,pitt}!willett!dwp ) Daily: ...!{uunet,nfsun}!willett!dwp [last resort: dwp@vega.fac.cs.cmu.edu]
DAVID@PENNDRLS.BITNET (08/22/90)
>> in a different area than the dictionary, one could use Is to label an >> arbitrary memory location, such as defining a label in an assembler >> program: >> >> " 2Push" Is Code: BX Push, " 1Push" Is AX Push, >> " Next" Is ..... ;Code > >Second, I think that the requirement for separate dictionary and code >spaces is overly restrictive and unnecessary. As for the specifics of your >labelled code example, a question: > > Does Is rip out the previous string from the definition, or is " an > IMMEDIATE word in the assembler vocabulary (assuming you use > vocabularies)? I thought the assembler was Interpreted code in all systems? Is this not true? So '"' is interactive and leaves on the stack the address of a copy of the string up to the next '"', which 'Is' eats and lays down the index entry, and then 'BX Push,' assembles the next instruction into the dictionary, and so on. >Aside from allowing arbitrary labelling of words in code, I'm don't get >the advantage to using Is to label an arbitrary memory location. >If Is binds its name to HERE, I don't see quite how the location >could be truly arbitrary. If Is could define a truly arbitrary memory >location, wouldn't that mean the location's address would have to be >on the stack? Seems like that is a more cleanly post-fix solution. You are absolutely right about this. I take back my implication that my Is construct is in any way superior to { code . . . } " name" Def. It is in fact inferior (but easier, I think, to implament). >How about this example of a purely postfix version of CODE: > > ( vocabulary manipulations as necessary go here...) > : CODE: ( -- cfa-able-address ) ... ; IMMEDIATE > > CODE: BX PUSH > CODE: AX PUSH > CODE: ( Machinations for NEXT... ) ;Code > " Next" Is > " 1Push" Is > " 2Push" Is > >My claim is that if you have more stack-ed addresses than you can keep >track of, you are doing something wrong. It maybe that with CODE: you are No problem, if we are talking about separate code and index spaces: CODE: " 2Push" Is BX PUSH CODE: " Next" Is AX PUSH CODE: " 1Push" Is ( Machinations for NEXT... ) ;Code This rewrite should make it obvious that we have factored out the HERE dependence only from Is/Def, and that CODE: (etc) still have it. This seems inherent in a state based system like FORTH (as opposed to a purely functional system). >place. Of course, in my example, I assume that CODE: doesn't do anything >obnoxious like create a code prolog or do any alignment, etc. Of course >those restrictions would apply equally to your example. I'm not sure why you say this. I am assuming Is does not compile anything into the dictionary, but my Code: is free to lay down a CFA and whatever code prefix it may find necessary, since it is called only once, at the beginning of the Word. Only "2Push" could be referenced as a FORTH word, the other two could only be used as assembly labels. For that reason I don't really consider my example valid code. Anyway, I concede that { some FORTH words } " Name" Def is more postfix than my Is. -- R. David Murray (DAVID@PENNDRLS.BITNET, DAVID@PENNDRLS.UPENN.EDU)
peter@ficc.ferranti.com (Peter da Silva) (08/22/90)
In article <1560.UUL1.3#5129@willett.pgh.pa.us> dwp@willett.pgh.pa.us (Doug Philips) writes: > ( vocabulary manipulations as necessary go here...) > : CODE: ( -- cfa-able-address ) ... ; IMMEDIATE > CODE: BX PUSH > CODE: AX PUSH > CODE: ( Machinations for NEXT... ) ;Code > " Next" Is > " 1Push" Is > " 2Push" Is That's pretty good. I was about to point out you had invented PostScript, but that language doesn't allow multiple entry points. Still, it's pretty close to: { ... } /Next def Still, it does demonstrate that at least at this level the PostScript full postfix form has its advantages... -- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com
peter@ficc.ferranti.com (Peter da Silva) (08/23/90)
The following is in FIG-forth. I still fail to see the point of dumping state. It's also untested and written by a rusty Forth coder. " xxx" is assumed to leave the address of a counted string on top of the stack. : { state @ if compile (branch) then compile (docol) here state @ 34 ; : } ?comp 34 ?match state ! compile (;s) state @ if dup here over - swap ! literal then ; : def tib @ in @ count drop tib ! 0 in ! [compile] : in ! tib ! [ , ] [compile] ; ; : {ifelse} if swap then drop execute ; ( use as { ... } { ... } flag {ifelse} ) : {while} >r begin i execute until r> drop ; ( use as { ... flag } {while} ) : {do} rot >r do i j execute loop r> drop ; ( use as { ... } hi lo {do} ) ... -- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com
dwp@willett.pgh.pa.us (Doug Philips) (08/24/90)
In <9008230629.AA29489@ucbvax.Berkeley.EDU>, DAVID@PENNDRLS.BITNET writes: > I thought the assembler was Interpreted code in all systems? Is this > not true? So '"' is interactive and leaves on the stack the address of > a copy of the string up to the next '"', which 'Is' eats and lays down > the index entry, and then 'BX Push,' assembles the next instruction into > the dictionary, and so on. Oops. I forgot that. I guess the only difficulty might be if you were using a subroutine threaded system with an optimizer in it, such as Andrew Scott describes in the article "Extensible Optimizing Compiler" (Forth Dimensions, Volume XII, Number 2). I'm not sure what would happen to a label on an instruction that "disappeared". Maybe the label would just inhibit optimization "across it"? Probably just an academic question. > You are absolutely right about this. I take back my implication that > my Is construct is in any way superior to { code . . . } " name" Def. > It is in fact inferior (but easier, I think, to implament). You could use it in something perverted like: : Long-Word ( stuff ) [ " Short-Word" Is ] ( more stuff ) ; Of course that would require the dictionary to be out of the way of the code, a system dependant thing. On the other hand if you really need to tail merge two definitions you are doing non-portable stuff anyway. (I can't easily justify why you'd do something like the above instead of going to assembly code though.) > CODE: " 2Push" Is BX PUSH > CODE: " Next" Is AX PUSH > CODE: " 1Push" Is ( Machinations for NEXT... ) ;Code > > >place. Of course, in my example, I assume that CODE: doesn't do anything > >obnoxious like create a code prolog or do any alignment, etc. Of course > >those restrictions would apply equally to your example. > > I'm not sure why you say this. I am assuming Is does not compile > anything into the dictionary, but my Code: is free to lay down a CFA > and whatever code prefix it may find necessary, since it is called only > once, at the beginning of the Word. Only "2Push" could be referenced > as a FORTH word, the other two could only be used as assembly labels. > For that reason I don't really consider my example valid code. Ok. I guess I was still stuck on: > From: DAVID@PENNDRLS.BITNET > Message-ID: <9008152018.AA23733@ucbvax.Berkeley.EDU> > Date: 15 Aug 90 15:13:41 GMT > Reply-To: DAVID%PENNDRLS.BITNET@SCFVM.GSFC.NASA.GOV > > How about this for a more postfix FORTH syntax: > > " SomeWord" Is 20 Constant > " *2" Is : 2 * ; > > 'Is' factors out Create without changing the compiler significantly. > The Name field still gets laid down first. Defining words then > lay down a Code field at the current location, which is > conveniently right where it should go. -Doug --- Preferred: ( dwp@willett.pgh.pa.us OR ...!{sei,pitt}!willett!dwp ) Daily: ...!{uunet,nfsun}!willett!dwp [last resort: dwp@vega.fac.cs.cmu.edu] --- Preferred: ( dwp@willett.pgh.pa.us OR ...!{sei,pitt}!willett!dwp ) Daily: ...!{uunet,nfsun}!willett!dwp [last resort: dwp@vega.fac.cs.cmu.edu]
dwp@willett.pgh.pa.us (Doug Philips) (08/24/90)
In <IRD5U21@xds13.ferranti.com>, peter@ficc.ferranti.com (Peter da Silva) writes: > I was about to point out you had invented PostScript, > but that language doesn't allow multiple entry points. Still, it's pretty > close to: > > { ... } /Next def Aside from some twisted cases (Duff's device, tail-merged code) I don't see much need for multiple entry points. Usually that is done by factoring. Perhaps something like: : CREATE WORD ENTRY: ( rest of CREATE ) ; " $CREATE" Is But that still doesn't look as clean to me as: : CREATE WORD $CREATE ; > Still, it does demonstrate that at least at this level the PostScript > full postfix form has its advantages... Now if only ANSI will let me have the cake and eat it too: Where "The cake" is clean post-fix way to do ugly prefix stuff now ( and ) "eat it too" means use those clean words to make the existing words for backward compatibily and for user-at-the-keyboard convenience. I think I go involved with Forth too late for this standards go around. -Doug --- Preferred: ( dwp@willett.pgh.pa.us OR ...!{sei,pitt}!willett!dwp ) Daily: ...!{uunet,nfsun}!willett!dwp [last resort: dwp@vega.fac.cs.cmu.edu] --- Preferred: ( dwp@willett.pgh.pa.us OR ...!{sei,pitt}!willett!dwp ) Daily: ...!{uunet,nfsun}!willett!dwp [last resort: dwp@vega.fac.cs.cmu.edu]
andrew@idacom.uucp (Andrew Scott) (08/25/90)
Doug Philips writes: > Oops. I forgot that. I guess the only difficulty might be if you were > using a subroutine threaded system with an optimizer in it, such as > Andrew Scott describes in the article "Extensible Optimizing Compiler" > (Forth Dimensions, Volume XII, Number 2). I'm not sure what would happen > to a label on an instruction that "disappeared". Maybe the label would > just inhibit optimization "across it"? Probably just an academic question. I hope I understand the issue, as I'm jumping into this discussion thread a bit late, but you *did* refer to my article. :-) The question of preventing optimizing across labels is not so academic. If you think of a branch destination like a label, the optimizer queue must be "flushed" before the branch is resolved. For example, suppose the phrase DUP >R was optimized to a single instruction. The following code illustrates the problem: : FOO ( x \ y -- ) HORIZ? IF OVER ELSE DUP THEN ( x coord for horiz., y for vertical ) >R ... ; THEN doesn't really compile any code - it only resolves a branch. There is no code between DUP and >R, but they must not be combined. My solution was to compile a "do nothing" word that has an optimization rule that "does nothing" (i.e. doesn't actually lay down code): : NOTHING ; SEQ: NOTHING IS: ( nothing ) ; The optimization rule does in fact complete any pending sequences, as NOTHING never appears in any other rule. The definition of THEN becomes: : THEN ' NOTHING COMPILE-TOKEN >RESOLVE ; IMMEDIATE (Our tick is state smart. Here's a vote for the proposal for COMPILE-TOKEN !) I force a "compilation" of NOTHING whenever I have to stop optimization. In your discussion, the same method could be used whenever internal labels need to be generated. I hope this all made some sense... -- Andrew Scott | mail: andrew@idacom.uucp | - or - {att, watmath, ubc-cs}!alberta!idacom!andrew | - or - uunet!myrias!aunro!idacom!andrew
dwp@willett.pgh.pa.us (Doug Philips) (08/29/90)
In <1990Aug24.221345.14475@idacom.uucp>, andrew@idacom.uucp (Andrew Scott) writes: > I hope I understand the issue, as I'm jumping into this discussion thread a > bit late, but you *did* refer to my article. :-) (I was quite surprised when you actually showed up on the net. Another lurker out in the open!) > The question of preventing optimizing across labels is not so academic. If > you think of a branch destination like a label, the optimizer queue must be > "flushed" before the branch is resolved. Just as I thought. Thanks for the example. The code in question was using multiple entry points, but the issues are the same (I think). (It matters not where you COME-FROM, but where you are GOING-TO). I do have a question about your system. For example: IF ( code-series-1 ) ELSE ( code-series-2 ) THEN ( code-series-3 ) would it be reasonable, possible, sane, to be able to push copies of code-series-3 into each branch and see if the optimizer can do better that way? If the ELSE was missing you could pretend it was there but empty. You can decide when to stop the code motion/duplication based on the length of the longest of the two optimizer sequences which start with either code-series-1 or code-series-2. > I hope this all made some sense... It did to me, if that is any consolation. -Doug --- Preferred: ( dwp@willett.pgh.pa.us OR ...!{sei,pitt}!willett!dwp ) Daily: ...!{uunet,nfsun}!willett!dwp [last resort: dwp@vega.fac.cs.cmu.edu]