[comp.lang.prolog] Floats and the SubStandard

ok@cs.mu.oz.au (Richard O'Keefe) (10/22/89)
Having become interested in the topic of floating-point arithmetic in
Prolog, I decided that it would be a good idea to convert the Fortran
subroutine MACHAR (which I got from NETLIB a couple of years ago, but
you can find it in CALGO -- see Algorithm 665 in ACM TOMS) to Prolog.
The point of this is that MACHAR determines some of the parameters of
the floating-point operations on your system:
	ibeta	= base of the floating point system (2, 8, 16, 10, whatever)
	it	= number of base-ibeta digits
	xmin	= smallest normalised positive number
	xmax	= largest finite representable number
and several others.  It doesn't tell you everything that you might want
for writing portable floating-point code, but it's a rather good start.
Since the BSI Prolog substandard includes such things as cos(-) {one of
the things I find amusing about the substandard, when I'm not sickened,
is that the text plainly and unambiguously says that [10.1.12.1] "cos(X)
evaluates the operand X with value VX and produces as result the cosine
of VX (measured in radians)."  That is, no rounding is permitted at all
so a conforming Prolog system must be capable of representing
transcendental numbers} it would be a Good Thing if it were possible to
write test code in Prolog to tell you how well these operations were
implemented.  There is already a suite of programs to check the Fortran
elementary transcendental functions, called ELEFUNT and accessible from
NETLIB or look in Cody & Waite, and MACHAR provides the parameters that
those test programs need.  As I said, it's a good start.

It turned out that the easiest way to get MACHAR running was to convert
the Fortran code to a data structure and write an interpreter.  I tried
the code under the following Prologs:
	Quintus Prolog 2.5Beta (Sun)		\  No change needed
	SICStus Prolog 0.6 (Sun)		 | for these three
	NU Prolog 1.4 (ELXSI) and 1.5 (Sun)	/  "Common Prologs"
	Stony Brook Prolog 3.0 (Sun & ELXSI)    |  minor change needed
	XP Prolog (IBM PC/AT)			\  These three are
	XM Prolog (1M Macintosh Plus)		 | pseudonyms
	YY Prolog (1M Macintosh Plus)		/

YY Prolog dropped out because the version I had turned out not to support
floats at all (it was a beta-version, as the FCS that I had been given
destroyed itself).  XP Prolog and YP Prolog come from the same vendor,
but are not compatible with each other.  Fortunately, the only relevant
difference for this test was the name of the truncate-to-integer function.

The most obvious difference between the Prologs was the name of the
truncate-a-float-towards-zero-yielding-an-integer function.  This is
needed to emulate the Fortran INT(X) or IFIX(X) function, which is used
by MACHAR.

	Quintus Prolog, SICStus Prolog, NU Prolog:	I is integer(X).
	SB Prolog (this is not a mistake):      	floor(X, I)
	XP Prolog:					I is ip(X)
	XM Prolog:					I is int(X)
	BSI substandard Prolog:				not available

The entry for SB Prolog is not a mistake; the name of the operation is
floor/2, but that's not what it _does_.  What it _does_ is truncate
towards zero.  The entry for BSI substandard Prolog is not a mistake
either.  There _is_ an operation called fix/1 in the substandard;
[10.1.8.1] "fix(X) evaluates the operand X with value VX and produces
as result the value of the integer obtained by **ROUNDING** VX".  For
this particular application, we want an operation which does what the
Fortran function IFIX does, but that's not what the BSI substandard's
fix(-) does.  Instead, fix(-) is defined as *rounding*, which is what
the Fortran function NINT (nearest integer) does.

I have recommended elsewhere that all four conversions (floor, ceiling,
round (to nearest) and truncate (towards zero)) should be provided under
those names.  This exercise gives me no reason to change that opinion.

Another porting problem was that this program incorporates the Fortran
code as a data structure, and SB Prolog could neither compile nor consult
a clause containing a data structure that big.  It was straightforward to
convert the data structure to a predicate which put the thing together at
run time from pieces.

Getting SB Prolog running at all on an ELXSI was not easy, as SB Prolog
makes use of 'syscall'.  Unfortunately, even UNIX systems claiming to
be 4.3 BSD (as the ELXSI does) might not have that system call (and the
ELXSI does not).  The 4.3 BSD manual explicitly labels syscall() as a
"(VAX-11)" function.  I was able to hack around this for the one case
that SB Prolog really needs.  (No, I can't give you the patches.  SB
Prolog is covered by a GNU-style copyleft, so I can't give you any
modifications without giving you the whole lot of the sources.  Yes,
this does mean that the copyleft strongly discourages software sharing.)
There appears to be a mistake in the ELXSI implementation of frexp(),
which required another patch.  (No, I can't give you the patches.)

Another problem is that both XM Prolog and SB Prolog write floating-point
numbers out in C "%f" format.  This means that some of the tiny numbers
that MACHAR computes (such as xmin) print as 0.000000, which is Not Good.
It was easy to fix this in SB Prolog (No, I can't give you the patches.
See above) but not possible in XM Prolog.  I worked around the problem
in XM Prolog by guessing the order of magnitude of the results and having
it print 1.0E<appropriate>*<the interesting value> instead.  Ugh.

Amusingly/sickeningly enough, the substandard does not prohibit this.
The only indication I can find in the substandard of how floating-point
numbers are to be printed is in the definition of writeq/1:
[9.12.7.5 b)]
"	Floating-point numbers are printed in either the fixed point
	or exponential notation, whichever is shorter; non-significant
	zeros are not printed.  The default number of significant
	digits printed is implementation-dependent.	"

As far as I can see, this means that it is ok to print 1.0e-9 as "0"
or possibly "0.", but that "0.0" .. "0.0000" are not allowed (because
they contain "non-significant zeros") and "0.00000" and up are not
allowed either (because they are longer than "1.0e-9").  It is noteworthy
that writeq is required to quote strings and atoms "where necessary to
make the result acceptable as input to read/1" but that there is no
requirement that _numbers_ be written in a way that is acceptable to read/1.
Nor can I find any definition of what "the fixed point or exponential
notations" are.  This is important:  Fortran and C disagree!  What is
really sickening is that there is no requirement whatsoever that the
output be an approximation of the number's value.  That is, if your
default number of significant digits is 3 and you choose to write the
number 1.2345e6 as "1.21e6" there is nothing in the substandard to stop you.

    This deficiency in the substandard would be excusable in a first draft;
but it is not the first draft, and the committee have "working" on it for
five years.  (And the unacceptable vagueness of the writeq/1 definition
was pointed out to them two or three years ago.)

    I haven't finished!  As Cody explains in the ACM TOMS article,
"the program deliberately provokes underflow two or three times (depending
on the system) and will malfunction on systems that treat underflow as a
fatal error."  The program is set up to handle IEEE gradual underflow,
flush-to-zero, and a couple of others.  At first sight, there is no
problem:
	Quintus Prolog			truncated "native" single format
	SICStus Prolog, NU Prolog	full "native" double format
	Stony Brook Prolog		own 28-bit format but ok
	XM Prolog (Mac)			uses Apple's "SANE" package
	XP Prolog (PC)			said to use "IEEE formats"
However, the vendor of XP Prolog claims _only_ to use IEEE *formats*;
the X company make no claim whatsoever to conform to the IEEE standard
in any other way.  (That's one of the reasons why it is interesting to
run MACHAR in XP Prolog; the difference between their arithmetic and
IEEE arithmetic shows up quite clearly.)  XP Prolog, in fact, reports
underflow as an exception.  It turns out to be possible to hack around
this.  You have to ensure that arithmetic operations are performed only
in is/2, not in comparisons, and that each is/2 contains at most one
operation, and then you install a GLOBAL error handler (now that's
obscene) which says
	"if the <UNDERFLOW> error occurs in a goal matching (X is _)
	 then unify X with 0.0 and continue".

    Ok, so what does the BSI substandard say about underflow?  NOTHING.
In the absence of an index, I may have missed it, but I checked all the
obvious places.  The standard nowhere speaks of floating-point numbers;
on the contrary it explicitly demands "real" numbers, and there is no
permission given anywhere to deliver inexact results.  For example, if
you track down "Z is X*Y" you find that "X, Y, and Z represent REALS
such that Z = X*Y".  There's no way around it; reporting underflow,
flushing to zero, and gradual underflow are alike forbidden.  (Section
10 mentions a category of exceptions called "Calculation Error", but
section 6, which defines what these categories mean, does not include
this category.  The category "range error" does not refer to the result
of a computation being outside the representable range, but to the INPUT
of a computation not being in a permitted *domain*.)

    Ideally, a Prolog standard including floating-point should provide
you with a simple way of saying that you want underflow flushed to 0.
In fact, you shouldn't have to say anything at all.
The IEEE 854 standard, for example, says
	The default response to an exception SHALL be to proceed
	without a trap ... [for underflow] the delivered result might
	be zero, subnormal, or +/- [radix]**[most negative exponent].
The IEEE 754 standard is similar:  any system whose default behaviour
for underflow is to signal an error does not conform to either of the
IEEE standards.  (I repeat that the "X" company make no claim to conform
to either of the IEEE standards.)  However, there are lots of machines
around which don't conform to the IEEE standards.  It might not be easy
for a Prolog implementor to implement underflow->tiny-or-zero.  At the
very least, a Prolog standard including floating-point should give you
a way of finding out that underflow doesn't behave the IEEE way.

    A final problem was that even with cuts all over the place (as MACHAR
is interpreting a Fortran program, it is completely determinate) both XP
Prolog and XM Prolog put all their stacks in a 64kbyte region, which was
not enough space, even with frequent garbage collection.  I redesigned a
data structure and produced output earlier, and managed to squeak through.
Part of the problem here is that the "X" Prologs use a rather bulky
representation for compound terms.


To summarise, the difficulties I had were
    -	treatment of underflow (not addressed by BSI)
    -	truncation to zero (operation missing or inaccurately described in BSI)
    -	incorrect output (left hopelessly vague by BSI)
    -	implementation limits (a standard _can't_ do anything about this)

What _does_ the BSI substandard buy me?  Well, they spell their new syntax
out in considerable detail.  The ultimately comical thing about this is
that I had no *syntax* problems whatsoever.  For some of this I can take
credit:  in 1984 I told the BSI committee that the most effective way of
standardising Prolog was to do what Wirth did with Pascal, and I proceeded
to broadcast to the net Prolog definitions of read/1, write/1 (and family),
setof/3 (and family), and several other things.  Quintus Prolog uses
debugged and enhanced versions of these public domain files.  So does SICStus
Prolog.  Stony Brook Prolog 3.0 uses the public domain read/1 with a tokeniser
written in C, but I gave them that too.  However, I can't take credit for all
of the compatibility.  The "X" company did their own "Edinburgh-compatible"
syntax; it isn't perfectly compatible but the few minor differences were not
a problem for this program.  NU Prolog has its own tokeniser and parser, and
extends Edinburgh syntax considerably:  it has fxx, fxy, fyx, and fyy
operators so that you can write e.g. "all X p(X)".  You have to put spaces
around "+" and "-" signs, but that is trivially easy to do.

    Oh yes, aside from the sheer convenience of implementing MACHAR via
an interpreter, it is advisable to interpret at least the arithmetic
expressions.  The reason is that the expressions MUST NOT be rearranged
by a compiler.  This is entirely typical of floating-point work; floats
satisfy almost none of the axioms satisfied by the REALs, and compilers
which rearrange float expressions as if they were real expressions give
numeric programmers real pain.  The BSI substandard does not address
this issue.  (You knew that was coming, didn't you?)

    Just how big a program am I talking about?  318 lines, 93 clauses,
11 predicates.  The part which actually does arithmetic is 30 lines,
8 clauses, 1 predicate.  This is not a large program.  The Fortran
version is portable to most Fortrans; the ACM TOMS article lists 15,
and I know that it works on at least three more.  What's the good of a
standard if it doesn't help me write a *small* portable program?

    After FIVE YEARS the BSI/ISO substandard committees have not
addressed the actual portability problems relevant to this program.
And this is the easy case, as all of the problems have been known for
years and are addressed by the Fortran (77), Ada, and C standards, so
that no innovation is required.

    Remember that three of the Prolog systems (Quintus, SICStus, and
NU Prolog) could use the same source file, and the SB Prolog's main
problem was that a data structure was too big?  The BSI substandard
doesn't fix the porting problems but it does _introduce_ one:  it is
not compatible with the "Common Prologs".  Even integer arithmetic!

    Remember also that you don't have to be an expert on floating-point
to find out what the experts reckon ought to be done about it.  One has
only to read ACM TOMS, or SP&E (Farnum's paper in July '88 is a good one).
The BSI/ISO work is being done as an ISO standard; it should be easy for
one standards committee to find out what the others are doing (Fortran 8X,
and particularly the Ada Numerics group).  WHY IN THE NAME OF ALL THAT IS
HOLY HAVE THE BSI/ISO COMMITTEES GOT THIS FAR IN STANDARDISING A LANGUAGE
WITH FLOATING-POINT WITHOUT TAKING THE TROUBLE TO LEARN ANYTHING ABOUT IT?