[comp.lang.forth] Random comments

aj-mberg@dasys1.UUCP (Micha Berger) (08/01/89)

First, my apologies for starting the "real programmer" thread. It was
written tongue-in-cheek, and fit in with the general tone of
talk.religion.computers at the time.

As someone new (less than 1 month, only have one working program - life)
to FORTH, 2 things really bug me. Why is it that we have 2!, 2DUP, 2CONSTANT,
but D. ? It seems inconsistent. You can't use 2 as a prefix across the board
(2+ and 2* are taken) but why not use D? I also mind that DO loops up to,
not including the limit - it's not intuitive.

I appreciate all the suggestions and responses to my last post. I'm writing
words to define a "parameter list" without the rest of the definition. It
should let me implement a vectored execution scheme without wasting a lot of
space on headers. Any comments??

-- 
					Micha Berger

"Always should [the child of] Adam have awe of G-d in secret and in public,
admit the truth, and speak truth in his heart."

sdh@wind.bellcore.com (Stephen D Hawley) (08/03/89)

In article <10425@dasys1.UUCP> aj-mberg@dasys1.UUCP (Micha Berger) writes:
> I also mind that DO loops up to,
>not including the limit - it's not intuitive.

Depends on what your definition of intuitive is.  :')

Consider the pascal loop:
	for x := 1 to 20 do ... ;

This runs x from 1 to 20 inclusive, just what you expect.
But what about the c loop:
	for (x=0; x < 20; x++) ... ;

This also iterates 20 times, but goes from 0 to 19.
I find this more intuitive.  Really.  I've made more boundary errors with
pascal than with any other language (iterate once too many or once too few).
I think this is because of the tendency to start at 1 in pascal.  Most of the
programs I've seen in pascal do this (as well as declaring arrays as [1..n]).

Fine.  So what does this have to do with forth?  Well, not much really,
except that 'intuitive' is subjective, not universal.

Steve Hawley
sdh@flash.bellcore.com
"Up is where you hang your hat."
	--Jim Blandy, computer scientist

toma@tekgvs.LABS.TEK.COM (Tom Almy) (08/03/89)

In article <10425@dasys1.UUCP> aj-mberg@dasys1.UUCP (Micha Berger) writes:
>As someone new (less than 1 month, only have one working program - life)
>to FORTH, 2 things really bug me. Why is it that we have 2!, 2DUP, 2CONSTANT,
>but D. ? It seems inconsistent. You can't use 2 as a prefix across the board
>(2+ and 2* are taken) but why not use D? 

The prefix "D" refers to double integer (typically 32 bit in a 16 bit FORTH
implementation), while the prefix "2" refers to a pair of integers.  A double
integer is a subclass of a pair of integers -- as a practical example, a pair
of integers can be used to express a ratio used in scaling with */:
	2VARIABLE Ratio
	: setRatio   ( numerator denominator -- )  Ratio 2! ;
	: scaleValue  ( value -- scaledValue )  Ratio 2@ */ ;

In this application, use of D! and D@ would be confusing.  

There is a practical reason to have both D@ and 2@ (etc).  LMI UR/FORTH 386
implements both operations, and they are not the same!  The 83 standard 
defines 2@ such that the lower address integer becomes top of stack, and the
next integer becomes second on stack.  In the 80386 (and other Intel 
architectures), the most significant portions of numbers are stored at the
highest addresses.  This means that 2! stores a double integer into
memory in the wrong order, causing problems with the OS interface or
interfacing with other language programs.  D@ and D! access memory in
Intel standard order.

>I also mind that DO loops up to,
>not including the limit - it's not intuitive.

Every new standard seems to have a new way of handling DO loops (I don't
know if the ANSI standard will be different).  It may not be very intuitive,
but the reason is that indexing is 0 based, so that loops typically have
an initial value of 0; to loop n times you execute "n 0 DO".
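
As a rough illustration of the zero-based convention, the sketch below
counts the passes of "n 0 DO" and prints the index values.  PASSES and
SHOW-INDICES are made-up names, not standard words, and n is assumed to
be at least 1 (in Forth-83 a limit equal to the starting value does not
give zero passes).

```forth
\ Sketch only: PASSES and SHOW-INDICES are hypothetical names.
: PASSES  ( n -- count )      \ count how many times "n 0 DO" runs its body
   0 SWAP  0 DO  1+  LOOP ;

: SHOW-INDICES  ( n -- )      \ print each value I takes, 0 up to n-1
   0 DO  I .  LOOP ;

5 PASSES .          \ five passes
5 SHOW-INDICES      \ 0 1 2 3 4 : the limit 5 itself is never an index
```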

Tom Almy
toma@tekgvs.labs.tek.com
I have a commercial connection with Laboratory Microsystems.

ZMLEB@SCFVM.BITNET (Lee Brotzman) (08/03/89)

>
>First, my apologies for starting the "real programmer" thread. It was
>written tongue-in-cheek, and fit in with the general tone of
>talk.religion.computers at the time.

   Apology accepted.  At least it generated a good side-effect -- more
traffic in a very slow newsgroup!

>
>As someone new (less than 1 month, only have one working program - life)
>to FORTH, 2 things really bug me. Why is it that we have 2!, 2DUP, 2CONSTANT,
>but D. ? It seems inconsistent. You can't use 2 as a prefix across the board
>(2+ and 2* are taken) but why not use D? I also mind that DO loops up to,
>not including the limit - it's not intuitive.

   I believe the '2' prefix for double numbers came about because a double
number does not always imply a double-length integer.  The data item might
be a pair of single-length integers, e.g. X-Y coordinates on a grid.  While
'2' is somewhat appropriate for both situations, 'D' clearly is not.  I
often use 2DUP to duplicate two single-length values on the stack.  DDUP
wouldn't "feel" right.
   But remember, this is Forth.  If you don't like a name, then you can
always do the following:

   : D! 2! ;
   : DCONSTANT 2CONSTANT ;
   : DO  COMPILE SWAP  COMPILE 1+  COMPILE SWAP  [COMPILE] DO ; IMMEDIATE
         ( the limit adjustment is compiled, so it happens at run time )

   Unless you are distributing source code to the public (in which case
the renaming and redefinition of standard words renders the code nearly
useless) then I would suggest you put all the redefinitions in a file
(or set of blocks depending on your situation) and INCLUDE them whenever
you compile.  Better yet, you can compile the redefinitions and do a
SAVE-SYSTEM and they will be there always.

>I appreciate all the suggestions and responses to my last post. I'm writing
>words to define a "parameter list" without the rest of the definition. It
>should let me implement a vectored execution scheme without wasting a lot of
>space on headers. Any comments??

    An off the cuff, thirty-second design phase vectored execution scheme:

    : EXEC-VECTOR  ( compile time: n -- ...number of vectors to allocate )
                   ( execution:    n -- ...number of vector to execute )
        CREATE 2* ALLOT
        DOES>  SWAP 2* + @ EXECUTE ;

    : STORE-VECTOR  ( n -- ...cell number to store execution vector into)
                    ( this word is used like:  n STORE-VECTOR <to> <from> )
                    ( where: n is the cell number, <to> is the vector array )
                    ( and <from> is the word to put in the array )
      ' >BODY   ( n pfa-to -- ...get pfa of <to> )
      SWAP 2* + ( cell-adr -- ...calculate vector cell address )
      '         ( cell-adr cfa-from -- ...get cfa of <from> )
      SWAP ! ;  IMMEDIATE   ( store the execution vector )

    10 EXEC-VECTOR ADDS
    0 STORE-VECTOR ADDS +
    1 STORE-VECTOR ADDS D+
    2 STORE-VECTOR ADDS F+

    So, '0 ADDS' will result in '+' being executed, and '2 ADDS' will execute
'F+'.  Actually the example isn't very practical.  I have used a very similar
scheme, however, in a text editor.  I have an execution vector array with 128
cells in it.  Each cell points to a different editing function.  The character
entered on the keyboard is the index into the array that specifies which
function is being called.  So my word EDIT is just:

    : EDIT   GET-FILE    ( Open, create, or re-establish link to the file )
             INITIALIZE  ( set up some stuff to get started )
      BEGIN  KEY DO-EDIT  AGAIN ;  ( DO-EDIT is the name of the vector array )

    Cells which correspond to regular characters, like 'A', execute the word
CHARACTER , which stores the character into the file.  The cell which
corresponds to control-A (decimal 1) executes the word ADDLINE, i.e. when
I press control-A, a new line is added to the file being edited.  This makes
it real easy to reconfigure the editor.  Plus I can use forth as an editor
macro language, just define a new editing function and tie the CFA to the
vector array.
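
    For comparison, roughly the same kind of table can be filled by
ticking the target word outside the storing word.  This is only a sketch:
VECTORS: and SET-VECTOR are made-up names, CELLS is assumed to be available
(use 2* on a 16-bit Forth-83 system), and nothing protects against
executing a slot that was never set.

```forth
\ Sketch only: VECTORS: and SET-VECTOR are hypothetical names.
: VECTORS:  ( n "name" -- )       \ define a table with room for n vectors
   CREATE  CELLS ALLOT
   DOES>   ( i -- )  SWAP CELLS + @ EXECUTE ;

: SET-VECTOR  ( xt i "table" -- ) \ store xt in slot i of the named table
   ' >BODY  SWAP CELLS +  ! ;

3 VECTORS: OPS
' + 0 SET-VECTOR OPS
' - 1 SET-VECTOR OPS
' * 2 SET-VECTOR OPS

6 7 2 OPS .     \ slot 2 holds * , so this multiplies 6 by 7
```

    Because the target word is ticked at the call site, SET-VECTOR can
also take an address computed elsewhere, e.g. from a table of key codes.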

>                                        Micha Berger
>

-- Lee Brotzman (FIGI-L Moderator)
-- BITNET:   ZMLEB@SCFVM
-- Internet: zmleb@scfvm.gsfc.nasa.gov
-- The government and my company don't know what I'm saying.
-- Let's keep it that way.

eaker@sunbelt.crd.ge.com (Charles E Eaker) (08/04/89)

In article <10425@dasys1.UUCP> aj-mberg@dasys1.UUCP (Micha Berger) writes:
>As someone new (less than 1 month, only have one working program - life)
>to FORTH, 2 things really bug me. Why is it that we have 2!, 2DUP, 2CONSTANT,
>but D. ? It seems inconsistent. You can't use 2 as a prefix across the board
>(2+ and 2* are taken) but why not use D?

Here's why. The prefix 2 generally means operate on 2 things of the
usual size.  The "usual size" in Forth has traditionally been a 16-bit
"cell".  Note that the operations 2!, 2DUP, and 2CONSTANT couldn't care
less what the 2 cells are interpreted to be. They could be two single
numbers or 1 double number or 4 characters or one or more bit strings
representing a set or representing a packed record structure or
representing a date.  But the dot operation which prints its operand
must interpret the operand's contents.  D. interprets two cells as
being one double number. Another operation, say .TIME might interpret
the two cells as representing seconds since some specified moment. I
suppose a reasonable action for 2. would be to print the two cells as
two consecutive single numbers.
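
A minimal sketch of such a 2. might look like the following.  It is not
a standard word, and on systems that read a number with a trailing dot
as a double-number literal, defining it would hide the literal 2. from
then on.

```forth
\ Sketch only: 2. is the hypothetical word discussed above.
: 2.  ( n1 n2 -- )    \ print a pair as two single numbers, n1 first
   SWAP . . ;

3 4 2.      \ prints 3 then 4, in stack-diagram order
1 0 D.      \ the same kind of pair read as one double number: prints 1
```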

   [What bugs ME is the traditional asymmetry between D. and ." and the
    inclination to use the dot as a prefix in other print words such as
    .TIME . Instead, I think that the dot should uniformly be a suffix
    so that to print something right adjusted in a field whose width is
    on the stack we would use D.R and ".R and TIME.R but I'm afraid
    that using dot as a prefix to mean print is now too deeply
    ingrained.]

>                                         I also mind that DO loops up to,
>not including the limit - it's not intuitive.
>

On the contrary, I find DO loop limits in Forth to be quite intuitive.
I always get it right in Forth, I never get it right in Pascal, I
often get it right in C. A limit, after all, is easily thought of
as something which is never quite reached.
If you want to initialize SIZE cells of an array, for example, 
starting at address ARRAY with the pattern PAT, you would use:
     ARRAY SIZE + ARRAY DO PAT I ! LOOP
otherwise you would constantly have to subtract one:
     ARRAY SIZE + 1- ARRAY DO PAT I ! LOOP
which strikes me as being an unnecessary and counter-intuitive nuisance
in a piece of code that becomes a cliche in any array processing
application. (Or would you rather the value of SIZE be one less than
the actual size of the array? That's even more counter-intuitive.)
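
The loops above advance one address unit per pass.  When each element
is a full cell, the same limit arithmetic works with a cell-sized step.
The sketch below assumes CELLS is available (substitute 2* and a step
of 2 on a 16-bit Forth-83 system), uses the made-up name FILL-CELLS,
and expects n to be at least 1.

```forth
\ Sketch only: FILL-CELLS and GRID are hypothetical names.
: FILL-CELLS  ( addr n pat -- )   \ store pat into n consecutive cells
   ROT ROT                        \ pat addr n
   CELLS OVER + SWAP              \ pat limit start
   DO  DUP I !  1 CELLS +LOOP
   DROP ;

CREATE GRID  8 CELLS ALLOT
GRID 8 -1 FILL-CELLS      \ the limit, GRID plus 8 cells, is never written
```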

For being new to Forth, you're asking good questions.
Chuck Eaker                                |  eaker@sunbelt.crd.ge.com
Software Engineering Program               |  eaker@crdgw1.crd.ge.com
GE Corporate Research & Development Center |  eaker@crdgw1.UUCP
P.O. Box 8, K-1 3C12 Schenectady, NY 12301 |  518 387-5964

marc@noe.UUCP (Marc de Groot) (08/05/89)

In article <10425@dasys1.UUCP> aj-mberg@dasys1.UUCP (Micha Berger) writes:
>Why is it that we have 2!, 2DUP, 2CONSTANT,
>but D. ? It seems inconsistent. You can't use 2 as a prefix across the board
>(2+ and 2* are taken) but why not use D?

2@ and 2! were intended to fetch and store pairs of single-precision numbers.
D. prints one double-precision number.


-- 
Marc de Groot (KG6KF)                   These ARE my employer's opinions!
Noe Systems, San Francisco
UUCP: uunet!hoptoad!noe!marc
Internet: marc@kg6kf.AMPR.ORG

ir230@sdcc6.ucsd.EDU (john wavrik) (08/06/89)

    Tom Almy (toma@tekgvs.labs.tek.com) writes:
> The prefix "D" refers to double integer (typically 32 bit in a 16 bit FORTH 
> implementation), while the prefix "2" refers to a pair of integers.  A 
> double integer is a subclass of a pair of integers ...
 
> There is a practical reason to have both D@ and 2@ (etc).  LMI UR/FORTH 386 
> implements both operations, and they are not the same!  The 83 standard 
> defines 2@ such that the lower address integer becomes top of stack, and the 
> next integer becomes second on stack.  In the 80386 (and other Intel 
> architectures), the most significant portions of numbers are stored at the 
> highest addresses.  This means that 2! stores a double integer into 
> memory in the wrong order, causing problems with the OS interface or 
> interfacing with other language programs.
-----------------------------------------------------------------------------
                        IMPLEMENTATION DEPENDENCE
 
    There is at least a conceptual difference between a double-precision 
number and an ordered pair of single-precision numbers. The number pair 
operations (with prefix 2) should, as Tom Almy points out, always refer to 
ordered pairs of integers of whatever size. On the other hand, 32-bits seems 
to be the current standard (among other languages) for "long" integers. If 
this were to be the accepted definition of the Forth double-numbers then "D" 
and "2" operations would be different on 32-bit machines.  The ANSI standards 
team, however, seems to lean toward making D-numbers twice the cell size. 
Their BASIS document says:
      "The order in memory of ordered pairs, being structured data, 
      must be predictable. On the other hand experience has 
      indicated that there may be significant advantages to choosing 
      a hardware compatible representation of double numbers in 
      memory". 
An argument could be made for the reverse: As long as 2@ and 2! are 
compatible, the order in memory doesn't matter. On the other hand, the 
component words of a double precision number have arithmetic significance. 
[When used as the basis of multi-precision arithmetic, for example, the high 
word is the "carry" to the next position -- and it must be in a predictable 
place.]
 
   In any case, when dealing with compound data, one can either provide a 
programmer with information about how the data is represented, or one can 
provide selector and manipulator functions to deal with an unknown 
representation. (It is interesting that the ANSI team is proposing that the 
representation of double numbers be implementation-dependent but has not 
suggested selectors for the high and low words!) 
 
    Almy also gives a justification for making data representations 
implementation-dependent. The fact that the Intel chips (as well as others) 
store an integer with the low BYTE lower in memory has no bearing on 2@ vs D@. 
On the Intel 80286, 8086, 8088 there is no architecture-preferred order for 
the 16-bit words of a 32-bit number. 32-bit arithmetic operations on the 80286 
use pairs of registers and these can be pushed or popped in any order. Either 
order is equally fast and neither is "wrong". When a 16-bit Forth is 
implemented on a machine with 32-bit registers there is a machine preferred 
order -- but, in this case, it seems best to use a 32-bit Forth stack. No 
problems have been found with OS interfacing on an 80286-based system (and 
double-precision numbers are not needed for such interfacing!). 
 
   Forth lacks the ability of compiled languages to trap illegal operations on 
mixed data. Some of its amazing robustness and portability are due to an 
historical consistency in implementation: sequences like 2! followed by an 
eventual @ lead to predictable results (counter-intuitive, but predictable!). 
If I were on the Standards team, I'd think long and hard before giving up 
consistency in implementation. Not only will it mean an increase in 
complexity, but also an increase in subtle bugs when applications are 
transported. 
 
 
                                                  John J Wavrik
             jjwavrik@ucsd.edu                    Dept of Math  C-012
                                                  Univ of Calif - San Diego
                                                  La Jolla, CA  92093

toma@tekgvs.LABS.TEK.COM (Tom Almy) (08/07/89)

In article <4605@sdcc6.ucsd.EDU> ir230@sdcc6.ucsd.EDU (john wavrik) writes:
[concerning my previous comment about D@ and D! being machine dependent and
 2@ and 2! memory order being a language standard]
>An argument could be made for the reverse: As long as 2@ and 2! are 
>compatible, the order in memory doesn't matter. 

Not necessarily so, and the reason LMI added D@ and D! was that they changed
2@ and 2! from an earlier version so as to match the Intel preferred order
(opposite from the Forth standard).  This broke my code.  I have a *large*
application which used 2VARIABLEs, and 2@ and 2!, to hold x,y coordinate
points.  I rely on "@" getting the Y value and "WSIZE + @" getting the X
value.  So the order is important!

>   In any case, when dealing with compound data, one can either provide a 
>programmer with information about how the data is represented, or one can 
>provide selector and manipulator functions to deal with an unknown 
>representation. (It is interesting that the ANSI team is proposing that the 
>representation of double numbers be implementation-dependent but has not 
>suggested selectors for the high and low words!) 

Good point.  My problem would have been solved had there been something
like 2A@ and 2B@ for the "A" and "B" elements of the pair!
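
A minimal sketch of what such selectors might look like, assuming the
Forth-83 layout for 2! (top-of-stack item at the lower address).  2A@
and 2B@ are only the hypothetical names from the paragraph above, POINT
is made up, and CELL+ is assumed to exist (use WSIZE + or 2+ as
appropriate on your system).

```forth
\ Sketch only: 2A@ , 2B@ and POINT are hypothetical names.
: 2B@  ( addr -- b )  @ ;          \ the item that was on top for 2!
: 2A@  ( addr -- a )  CELL+ @ ;    \ the item that was second for 2!

2VARIABLE POINT
10 20 POINT 2!      \ x = 10 , y = 20
POINT 2A@ .         \ fetches x
POINT 2B@ .         \ fetches y
```

If an implementation changed the memory order, only these two
definitions would need to change, not every place that picks a pair
apart.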

> 
>    Almy also gives a justification for making data representations 
>implementation-dependent. The fact that the Intel chips (as well as others) 
>store an integer with the low BYTE lower in memory has no bearing on 2@ vs D@. 
>On the Intel 80286, 8086, 8088 there is no architecture-preferred order for 
>the 16-bit words of a 32-bit number. 32-bit arithmetic operations on the 80286 
>use pairs of registers and these can be pushed or popped in any order. Either 
>order is equally fast and neither is "wrong". 

Of course you are right from an architectural point of view.  The preferred
order must be inferred from the "little-endian" architecture.  But when 
interfacing with other languages, 32 bit numbers had better be stored with
least significance in the lower addresses.  This is also important if you
intend to access the data in the 80386 -- I have been "burned" by several
Forth programs that store 32 bit data via 2!, and then my 80386 native
program has to swizzle the data to read it.  Ugh!

>No
>problems have been found with OS interfacing on an 80286-based system (and 
>double-precision numbers are not needed for such interfacing!). 

You are absolutely correct.  But there is confusion over ordering of 
segment/offset pairs, if you consider the segment to be of greater numeric
significance.

Tom Almy
toma@tekgvs.labs.tek.com
Standard Disclaimers Apply