[comp.lang.forth] Standard follies

koopman@a.gp.cs.cmu.edu (Philip Koopman) (07/07/89)

OK, here's a stack processor architect's $.02 about the
proposed Forth Standard.  Normally I don't like getting
into the fray, but the recent discussion, and the fact
that someone might actually listen to me have convinced
me to participate.

1) 1's complement. -- not required.
   1's complement machines still exist.  I believe that
the AN/UYK-7 uses 1's complement, and it is still installed
on most of the US Navy fleet (1960's technology -- scary,
isn't it?)  But, the point is, who cares?  Does the
ANSI C standard worry about integer representation?
Does the FORTRAN standard?  Why must it be specified?
People who want to write portable code should be careful
not to exploit specific machine behavior, in any language.
   If people are worried about writing code that depends on the
2-ness of the complement, then they have four choices:
   - Don't exploit the integer representation explicitly
   - Let the code break on 1's complement machines
   - Use run-time tests to execute correct code
   - Use conditional compilation to compile correct code
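
   To illustrate the run-time test option, here is a minimal
Python sketch on a simulated 16-bit cell.  The names encode and
detect are hypothetical, and range checking is left out; the
idea is only that the stored pattern of -1 identifies the
representation.

```python
BITS = 16
MASK = (1 << BITS) - 1  # 0xFFFF


def encode(n, scheme):
    """Return the 16-bit pattern that stores signed integer n."""
    if n >= 0:
        return n & MASK
    if scheme == "twos":
        return (n + (1 << BITS)) & MASK
    if scheme == "ones":
        return ~(-n) & MASK              # flip every bit of the magnitude
    if scheme == "sign-magnitude":
        return (1 << (BITS - 1)) | -n    # sign bit plus magnitude
    raise ValueError(scheme)


def detect(pattern_of_minus_one):
    """Run-time test: classify a machine by how it stores -1."""
    table = {0xFFFF: "twos", 0xFFFE: "ones", 0x8001: "sign-magnitude"}
    return table[pattern_of_minus_one]


for scheme in ("twos", "ones", "sign-magnitude"):
    pattern = encode(-1, scheme)
    print(f"{scheme:15s} -1 is stored as {pattern:04X}, "
          f"detected as {detect(pattern)}")
```

   A program could run such a check once at start-up and select
the matching code path.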

2) Floored division -- make it optional
  I agree that there are some instances where floored
division is nice to have.  However, it is inherently slower
and more difficult to perform.  Hardware has usually been
built to support truncated division, because the hardware
is simpler (and therefore faster, and less prone to obscure
bugs).  I do not have any plans to support floored division
in hardware.  Why should innocent users pay performance
penalties for division that doesn't work like the stuff
they learned in grade school?
  Anyone who needs floored division knows it.  They can
easily define an alias to make it the default.  Floored
division should be available as an optional, software-
supported extension, as Mitch suggested.
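
  For readers who want to see the difference, a small Python
sketch follows.  The function names are made up for the
illustration; Python's built-in divmod happens to be floored,
so it stands in for the floored operator, while trunc_divmod
models what typical divide hardware returns.

```python
def trunc_divmod(a, b):
    """Quotient rounds toward zero; remainder has the dividend's sign."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - q * b


def floor_divmod(a, b):
    """Quotient rounds toward minus infinity; remainder has the divisor's sign."""
    return divmod(a, b)  # Python integer division is floored


print(" a   b   truncated   floored")
for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(f"{a:2d} {b:3d}   {trunc_divmod(a, b)!s:10s}  {floor_divmod(a, b)}")
```

  The two agree whenever the operands have the same sign or the
division is exact; they differ only for mixed signs with a
nonzero remainder.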

3) Separate floating point stack -- too expensive
  I have tried to talk to members of the committee about
this, notably Martin Tracy, but they have turned a
deaf ear.  A separate floating point hardware stack
is too expensive to implement in hardware.  I have
no plans to implement such a stack.
  It seems to me that the primary appeal of such a stack
is that floating point numbers used to be a different
size than integers.  However, this problem goes away
with 32-bit processors.  Single integers are the same
size as single floats, and the same with doubles.
Should folks who are serious about floating point
performance pay penalties to remain compatible with
16-bit machines?  Let's face it, processors with
16-bit ALUs will never be very good at IEEE-format
floating point math!  Again, conditional compilation
can solve the transportability issue between 16-bit
and 32-bit systems for folks who care about such things.
I ported a floating point math package written for
a 16-bit processor (with absolutely *no* thought given
to portability) to a 32-bit processor in about 2 hours.
And, I used a common stack for integers and floats. 
  I think the separate floating point stack was spawned
by the fact that the 8087 had one that clever programmers
exploited.  But, there is more to the world than the 80x86/7!
  If standard Forth requires a separate floating point
stack, I do not plan on building standard Forth hardware.

Summary:
  Don't make hardware and compiler designers' lives more
difficult, and in particular, don't make hardware more
expensive/slower and software more expensive/slower
by specifying behavior that usually doesn't matter.
Things such as 1's complement and floored division
should be options, not requirements.
  I think you will find other hardware designers in
agreement.  I know Chuck Moore agrees with the floating
point stack issue.  How about a comment from Marty or
John at Johns Hopkins/APL?

  Phil Koopman                koopman@greyhound.ece.cmu.edu   Arpanet
  5551 Beacon St.
  Pittsburgh, PA  15217    
This is completely my own opinion, and in no way represents
anything having to do with Harris Semiconductor
(don't tell them I posted this -- they might shoot me).

wmb@SUN.COM (07/18/89)

Funny, I usually agree with Phil, but not this time...


> 1) 1's complement. -- not required.
>    1's complement machines still exist.  I believe that
> the AN/UYK-7 uses 1's complement, and it is still installed
> on most of the US Navy fleet (1960's technology -- scary,
> isn't it?)  But, the point is, who cares?  Does the
> ANSI C standard worry about integer representation?
> Does the FORTRAN standard?  Why must it be specified?
> People who want to write portable code should be careful
> not to exploit specific machine behavior, in any language.

The problem is that Forth tends to "overload" bitwise logical operators
and boolean operators, especially AND, OR, and NOT, whereas e.g. C and
Fortran don't.  (Okay, Forth does indeed distinguish between 0= and NOT,
but it is common practice, although wrong, to use NOT to complement a flag)

Here is the hard problem (assuming a 16-bit machine for the sake of
discussion):

   Suppose you have a bit mask whose value is (hex) 8000.  You want to
   test a value to see if that bit is clear.  So we try this:

       ( value )  8000 and  0=  IF  <whatever>  THEN

   This code works correctly on a 2's complement machine.  However, it
   fails on a 1's complement machine, because 8000 is negative zero, so
   "8000 0=" is true!

What is needed is an unsigned 0= operator!  On a 2's complement
machine, U0=  and  0=  would be equivalent.  On a 1's complement machine,
0=  is true if its operand is either +0 or -0, and U0= is true if its
operand is +0, i.e. all bits are 0.
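
A minimal sketch of that distinction, in Python on a simulated 16-bit
one's complement cell.  It assumes negative zero is the all-ones pattern
FFFF; zero_equals and u_zero_equals are hypothetical stand-ins for 0= and
the suggested U0=, and they return Python booleans rather than Forth flags.

```python
POS_ZERO = 0x0000
NEG_ZERO = 0xFFFF  # one's complement: -0 is the bitwise complement of +0


def zero_equals(cell):
    """Arithmetic 0= on one's complement: true for +0 and for -0."""
    return cell in (POS_ZERO, NEG_ZERO)


def u_zero_equals(cell):
    """Hypothetical U0=: true only when every bit is clear."""
    return cell == POS_ZERO


for cell in (0x0000, 0x8000, 0xFFFF):
    print(f"{cell:04X}  0= {zero_equals(cell)!s:5s}  U0= {u_zero_equals(cell)}")
```

The two operators disagree only for FFFF, which has every bit set yet
compares equal to zero arithmetically.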

In C, you can get this effect either by declaring the data type of the
value to be tested to be unsigned, or by using bit field operations.


It irks me that it is *not possible* to write clean portable code in
standard Forth (conditional compilation is not clean, in my book), due to
too few operators, thus forcing the existing operators to do double
duty and thus to have "surprise" implementation dependencies.


The problem is even worse with address arithmetic, since there are
many popular machines with strange addressing (where "strange" is
defined as "not linearly byte-addressed").  The Forth standard
implicitly assumes linear byte addressing, because it provides
only the word "+" for both number arithmetic and for address arithmetic.
(The proposed "CELL+" is a step in the right direction, but it goes
only about 20% of the way).
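
As a rough illustration in Python (the Machine class and the three example
machines are hypothetical, not taken from any standard), here is how a
hard-coded "4 +" goes wrong where a CELL+ style operation does not:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str
    address_units_per_cell: int  # how far an address moves per cell

    def cell_plus(self, addr):
        """Address of the next cell, in the spirit of the proposed CELL+."""
        return addr + self.address_units_per_cell


byte_16 = Machine("byte-addressed, 16-bit cells", 2)
byte_32 = Machine("byte-addressed, 32-bit cells", 4)
word_16 = Machine("word-addressed, 16-bit cells", 1)


def third_cell_hardcoded(addr):
    """Models '4 +': assumes byte addressing and 2-byte cells."""
    return addr + 4


def third_cell_portable(machine, addr):
    """Models 'CELL+ CELL+': asks the machine how big a step is."""
    return machine.cell_plus(machine.cell_plus(addr))


for m in (byte_16, byte_32, word_16):
    print(f"{m.name:30s} hardcoded {third_cell_hardcoded(100)}"
          f"  portable {third_cell_portable(m, 100)}")
```

Only the first machine gets the same answer both ways.  Character
addresses, alignment and segmented addressing are not modelled here.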


> in hardware.  Why should innocent users pay performance
> penalties for division that doesn't work like the stuff
> they learned in grade school?

Playing devil's advocate, I know of no computer hardware that
works like the stuff they teach in grade school.  In grade school,
they teach infinite-precision arithmetic.


> Floored division should be available as an optional, software-
> supported extension, as Mitch suggested.

I would actually hope that Forth vendors would not make it optional.
The standard really ought to define required words for both options.
An implementation could choose which one of them is implemented
most efficiently, and then implement the other in software.

That way, if I want to write a portable program, I don't have to
worry about whether or not any particular machine has one or the
other of the words.
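
A sketch of how cheap the software half can be, written in Python with
made-up function names.  Each "primitive" stands in for whatever the
hardware does natively; the other flavour is derived from it with one
test and one adjustment.

```python
def truncated_primitive(a, b):
    """Stand-in for divide hardware that rounds toward zero."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - q * b


def floored_primitive(a, b):
    """Stand-in for divide hardware that rounds toward minus infinity."""
    return divmod(a, b)


def floored_in_software(a, b):
    """Floored result built on top of a truncating primitive."""
    q, r = truncated_primitive(a, b)
    if r != 0 and (r < 0) != (b < 0):   # remainder has the wrong sign
        q, r = q - 1, r + b
    return q, r


def truncated_in_software(a, b):
    """Truncated result built on top of a flooring primitive."""
    q, r = floored_primitive(a, b)
    if r != 0 and (r < 0) != (a < 0):   # remainder has the wrong sign
        q, r = q + 1, r - b
    return q, r


print(floored_in_software(-7, 2), truncated_in_software(-7, 2))
```

Either way the invariant a = q*b + r is preserved; only the rounding
direction of q changes.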


> 3) Separate floating point stack -- too expensive
> ...
>  It seems to me that the primary appeal of such a stack
>  is that floating point numbers used to be a different
>  size than integers.  However, this problem goes away
>  with 32-bit processors.  Single integers are the same
>  size as single floats, and the same with doubles.

I think it's shortsighted to assume that the word size is going to
stick at 32 bits.

> Should folks who are serious about floating point
> performance pay penalties to remain compatible with
> 16-bit machines?

Most people who are really serious about floating point performance
don't want to deal with stacks at all, regardless of whether it
is a separate floating point stack or the regular parameter stack.

They want to convolve vectors, or to perform pipelined transformations,
or to evaluate n-dimensional neighborhoods.


> I ported a floating point math package written for
> a 16-bit processor (with absolutely *no* thought given
> to portability) to a 32-bit processor in about 2 hours.

That is not the point.  A very good programmer can easily do such things,
but real significant software progress will only be made when
people don't have to tweak and port everything that comes their way.


>   I think the separate floating point stack was spawned
> by the fact that the 8087 had one that clever programmers exploited.

I think it's because the combinatorics of 16/32 integer size X single/
double/extended floating point size makes people's brains hurt
when they think about the stack manipulations.

I have implemented floating point packages for several different
fp chips and boards, none of which has a hardware fp stack.  I used
a separate floating point stack because it simplifies a lot of
things, especially when users would like to select the precision
without having to code the problem differently.
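
To make the combinatorics concrete, here is a small Python sketch.  It
assumes 32-, 64- and 80-bit float formats and models a float on a common
stack as a run of whole cells; all names are hypothetical.  It computes
how far down the stack an integer sits once one float has been pushed on
top of it.

```python
CELL_BITS = (16, 32)
FLOAT_BITS = {"single": 32, "double": 64, "extended": 80}


def cells_per_float(float_bits, cell_bits):
    """Whole cells needed to hold one float (rounded up)."""
    return -(-float_bits // cell_bits)


def int_depth_on_common_stack(float_bits, cell_bits):
    """Push an integer, then a float; return the integer's depth (0 = top)."""
    stack = ["int"] + ["float part"] * cells_per_float(float_bits, cell_bits)
    return stack[::-1].index("int")


def int_depth_with_separate_stacks(float_bits, cell_bits):
    """Same sequence, but the float goes on its own stack."""
    data_stack, float_stack = ["int"], ["float"]
    return data_stack[::-1].index("int")


for cell in CELL_BITS:
    for name, bits in FLOAT_BITS.items():
        print(f"{cell}-bit cells, {name:8s}: common stack depth "
              f"{int_depth_on_common_stack(bits, cell)}, separate stacks depth "
              f"{int_depth_with_separate_stacks(bits, cell)}")
```

On the common stack the depth takes a different value for nearly every
combination, so any word that reaches past a float must change with the
precision.  With a separate stack the integer is always on top.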


>   I think you will find other hardware designers in
> agreement.  I know Chuck Moore agrees with the floating
> point stack issue.

Forth isn't getting creamed in the marketplace because of hardware
problems.  It's getting creamed because (a) nobody is training
Forth programmers to any significant extent, and  (b) the lack
of standardized libraries and usage across different Forth
implementations causes continuous reinvention of wheels; forward
progress is slow in the community sense.


The purpose of a standard ought to be to allow the creation of
portable programs.  Specific hardware doesn't have to directly implement
the standard.  It can implement whatever the designer can sell.
However, the software that runs on that hardware should implement
the standard.  Providing additional features or faster modes is fine,
so long as the standard semantics are also available.


Cheers,
Mitch

wmb@SUN.COM (Mitch Bradley) (07/18/89)

Many thanks to Rainer Woitok for reminding me that:

> on a 1's complement machine, negative zero is FFFF
> (16 bit  words assumed). 8000  is the smallest ("most  negative") number

Sorry for the error; what I said was correct for sign/magnitude representation
rather than one's complement.

In any case, the principle still holds; "0=" returns true for two distinct
bit patterns, one of which contains nonzero bits.  Worse yet, the standard
defines "true" (i.e. that number which is returned by comparison operators)
as "-1" (it used to be "1" in Forth 79) so that all the bits will be set
(which is not correct for one's complement, where "-1" is hex fffe).

The reason for having all the bits set was supposed to be to allow flags
to participate in AND and OR expressions with less chance for error.

For example, suppose I have the following code:

 ( value ) 40 and    num1 num2 <   and  if

This code would work in Forth-83 but not in Forth-79.  The intention is
obviously to test whether the 40 bit in value is set and num1 is less than
num2.  In Forth-79, < returns 1 , but AND'ing 40 with 1 gives 0.
In Forth-83, < returns -1 (= ffff on 2's complement machines), so you can
AND a flag with any non-false value and still get non-false.  This code
happens to work on one's complement machines too, but it wouldn't work
if you were masking with 1 instead of 40.  ( "1 -1 and"  is 0 in
1's-complement)
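
The cases above can be checked with a short Python sketch on simulated
16-bit cells.  The constant and function names are hypothetical, and the
comparison is assumed to have come out true in every case.

```python
MASK16 = 0xFFFF
TRUE_FORTH79 = 0x0001       # Forth-79 true flag
TRUE_FORTH83_TWOS = 0xFFFF  # -1 on a two's complement machine
TRUE_FORTH83_ONES = 0xFFFE  # -1 on a one's complement machine


def masked_and_flag(value, mask, true_flag):
    """Model '( value ) mask and  num1 num2 <  and' with a true comparison."""
    return (value & mask) & true_flag & MASK16


value = 0x0041  # both the 40 bit and the 1 bit are set
for mask in (0x40, 0x01):
    for label, flag in (("Forth-79", TRUE_FORTH79),
                        ("Forth-83, 2's complement", TRUE_FORTH83_TWOS),
                        ("Forth-83, 1's complement", TRUE_FORTH83_ONES)):
        result = masked_and_flag(value, mask, flag)
        verdict = "taken" if result != 0 else "NOT taken"
        print(f"mask {mask:02X}  {label:26s} result {result:04X}  IF {verdict}")
```

Every combination ought to take the IF branch, since the bit is set and
the comparison is true.  Only the two's complement Forth-83 flag does so
for both masks.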

The point is, if you want to write portable code, make sure that you
follow bit-masking operations with a test for nonzero, to turn the
bit mask into a real, honest-to-goodness, true flag.

But wait!
There is no operator to test for nonzero!  The closest is  0= 0= ,
and as we've seen, that doesn't work on one's-complement machines.

The bottom line here is that Forth has a problem.  Other languages
have bitten the bullet and have distinguished between boolean operators
and bitwise logical operators (C: && vs &   Fortran: .AND. vs AND() ).
Forth uses the same operator for different purposes, just because it
happens to work on most machines in most cases.

Mitch (give me precise operators!) Bradley

koopman@a.gp.cs.cmu.edu (Philip Koopman) (07/19/89)

In article <8907181614.AA06989@jade.berkeley.edu>, wmb@SUN.COM writes:
>...
> What is needed is an unsigned 0= operator!  On a 2's complement
> ...
> It irks me that it is *not possible* to write clean portable code in
> standard Forth (conditional compilation is not clean, in my book), due to
> too few operators, thus forcing the existing operators to do double
> duty and thus to have "surprise" implementation dependencies.

You still haven't explained which 1's complement machines are
about to flood the marketplace....

But, it appears that your argument is that Forth itself is broken.
Do you propose changing the language, or making implementations in
hardware and software bite the compatibility/speed loss bullet?

> > 3) Separate floating point stack -- too expensive
> > ...
> I think it's shortsighted to assume that the word size is going to
> stick at 32 bits.
> ...
> I think it's because the combinatorics of 16/32 integer size X single/
> double/extended floating point size makes people's brains hurt
> when they think about the stack manipulations.

But, when basic integer sizes reach 64 bits, I think it likely
that "single precision" floating point will also go to 64 bits.
The number of bits itself isn't interesting.  What I'm speculating
is that we will reach a balance, where single-precision integers
and floats will be the same size, and that they will both increase
in size together.

> The purpose of a standard ought to be to allow the creation of
> portable programs.  Specific hardware doesn't have to directly implement
> the standard.  It can implement whatever the designer can sell.
> However, the software that runs on that hardware should implement
> the standard.  Providing additional features or faster modes is fine,
> so long as the standard semantics are also available.

So, it's OK to sell a very fast Forth with stack hardware, as
long as a compatibility mode with the standard (that does
floored division, and emulates a floating point stack in
program memory) is available?  If so, I agree with you.
What I'm arguing against is mandating significant inefficiencies
in the base package, so that the user is *stuck* with them.

  Phil Koopman                koopman@greyhound.ece.cmu.edu   Arpanet
  2525A Wexford Run Rd.
  Wexford, PA  15090
Senior Scientist at Harris Semiconductor.
I don't speak for them, and they don't speak for me.