[comp.std.misc] Questions about NCEG

kingdon@pogo.ai.mit.edu (Jim Kingdon) (05/30/90)

In comp.std.c, O'Keefe writes:

    (1) If you stick to the letter of the IEEE 754 and IEEE 854
    standards, conversion of numeric literals from decimal to binary
    (or possibly, in the case of 854, to internal decimal) is a *run*
    *time* operation,

Looking at 754-1985 I don't see how that interpretation makes sense.
Section 4.2 says "An implementation shall also [in addition to round
to nearest] provide three user-selectable directed rounding modes".
"User", as defined in this standard, can be the compiler, rather than
the program you are writing.  This is made explicit in section 4.3,
"the user, which may be a high-level language compiler".  So when
section 5.6 says "the procedures used for binary <-> decimal
conversion should give the same results regardless of whether the
conversion is performed during language translation (interpretation,
compilation, or assembly) or during program execution (run-time and
interactive input/output)" it doesn't say what rounding mode the
compiler has to use.  754 just says that the compiler needs to be able
to choose a rounding mode.  Presumably the compiler will either just
pick one (and if you're lucky document which one it is), or it will
provide some sort of directive to control it ("#pragma rounding_mode",
"__rounding(round_toward_infinity)", etc).

This is based on 754; I have no idea whether 854 is similar.

eggert@twinsun.com (Paul Eggert) (05/31/90)

In response to O'Keefe's assertion that the letter of IEEE 754 requires
decimal-to-binary conversion always to be done at run-time, because the
rounding mode at run-time may differ from the one at compile-time, Jim
Kingdon writes a tricky defense of compile-time conversion, but overlooks
a key word in IEEE 754 that permits a much simpler defense:

	... the procedures used for binary <-> decimal conversion should give
	the same results regardless of whether the conversion is performed
	during language translation ... or during program execution ...
	(section 5.6 of IEEE 754)

The key word is "should".  That means it's optional, or as IEEE 754 defines it,

	strongly recommended as being in keeping with the intent of the
	standard, although architectural or other constraints beyond the scope
	of this standard may on occasion render the recommendations
	impractical.  (section 2)

Surely this is such an occasion.

eggert@twinsun.com (Paul Eggert) (05/31/90)

I wrote:

	There are ways to say the IEEE numbers in ANSI C + IEEE 754:
		#define minus_0 (-0.0)
		#define infinity (1e300*1e300)
		...

Doug Gwyn replied:

	These don't work.  -0.0 is NOT a "minus zero"; it's identical to 0.0.

Surely ANSI C doesn't require this; it is inconsistent with IEEE 754.
For example, given the following strictly conforming ANSI C program

	#include <stdio.h>
	int main() { printf("%g\n", -0.0); return 0; }

an implementation that conforms to both ANSI C and IEEE 754 must print "-0",
not "0".  IEEE 754 doesn't specify the output format, but it does require that
+0 and -0 be output differently, and "-0" is the only real choice here.
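
One way to check which zero you actually have, without trusting printf:
the two zeros compare equal, but under IEEE 754's default non-stop
arithmetic, dividing into them yields infinities of opposite sign.  A
sketch, assuming IEEE division by zero does not trap:

	#include <stdio.h>

	int main(void)
	{
		double z = -0.0;
		/* -0.0 == 0.0 compares equal, but 1/-0 is -infinity,
		   so the sign of the zero survives the division. */
		if (z == 0.0 && 1.0 / z < 0.0)
			printf("minus zero\n");
		else
			printf("plus zero\n");
		return 0;
	}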

This is not purely academic.  Of the two C compilers on my IEEE 754-based
machine, one prints "-0" and the other "0" when given the above program.  I've
sent a bug report to the latter's implementer, and fully expect it to get fixed.

Finally, Gwyn wrote:

	Assuming that 1.0e+300*1.0e+300 is not representable as a
	(floating-point) number, the behavior is explicitly undefined...

(Steve Clamage made a similar point.)

True, if you assume only conformance to ANSI C.
But I explicitly assumed conformance to both ANSI C and IEEE 754.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/31/90)

In article <KINGDON.90May30012449@pogo.ai.mit.edu>,
 kingdon@pogo.ai.mit.edu (Jim Kingdon) writes:
> In comp.std.c, O'Keefe writes:
>     (1) If you stick to the letter of the IEEE 754 and IEEE 854
>     standards, conversion of numeric literals from decimal to binary
>     (or possibly, in the case of 854, to internal decimal) is a *run*
>     *time* operation,

Jim Kingdon then gives a clear explanation of why he thinks this doesn't
follow.  I am obliged to confess that I was taking this "on authority"
from an article in Software Practice & Experience a couple of years ago.

There are some subtleties in 754-1985.  For example, a C compiler that
implements "long double" as IEEE extended double runs into section 1.3,
"This standard does not specify ... (3) binary <-> decimal conversions
to and from extended formats", so long double literals are governed only
by the C standard.

One important question is whether a compiler counts as a "user",
within the meaning of the standard.  The definition is
	"user.  Any person, hardware, or program not itself specified
	by this standard, having access to and controlling those
	operations of the programming environment specified in this
	standard."
Now the most natural interpretation of this _to me_ is that a compiler
counts as a "user" with respect to its _own_ use of floating point,
but that with respect to the floating point operations expressed in my
program, it counts as part of the implementation.  This does get rather
blurred when you have an interpreter or incremental compiler.

If we regard the compiler as a "user" then it might have selected
_any_ of the four rounding modes, and we have no right to expect,
say, round to nearest.  So we have two cases:
(a) The compiler is part of the implementation:
    floating-point literal -> binary conversion is notionally run-time;
    it ought to be affected by the rounding mode and exception mode;
    therefore, to be certain of what you're getting, hex floats are useful.
(b) The compiler is a user, not part of the implementation:
    floating-point literal -> binary conversion is compile-time;
    it is affected by whatever rounding mode the compiler writer chose;
    therefore, to be certain of what you're getting, hex floats are useful.

For example, the method suggested for getting an IEEE infinity, namely
	static double infinity = 1.0e300*1.0e300;
is _not_ guaranteed to produce an infinity value; the compiler, if
thought of as a "user", may have said "never mind this Infinity business,
I want *exceptions*" and might quite reasonably reject the program
(perhaps by dumping core or some such traditional method).
Or failing that, the compiler-writer might have selected "round toward 0"
which (7.3(2)) "carries all overflows to the format's largest finite
number with the sign of the intermediate result".  Is there any reason
why an ANSI C compiler doing this would not be conforming to IEEE 754,
_if_ we regard the compiler as "user" rather than "implementation"?
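
A sketch of how to tell which choice a given compiler made, assuming
IEEE doubles and <float.h>:

	#include <stdio.h>
	#include <float.h>

	int main(void)
	{
		double z = 1.0e300 * 1.0e300;
		if (z > DBL_MAX)		/* only +Infinity exceeds DBL_MAX */
			printf("overflow went to +Infinity\n");
		else if (z == DBL_MAX)		/* round toward 0 clamps here */
			printf("overflow clamped to DBL_MAX\n");
		else
			printf("something else entirely\n");
		return 0;
	}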

The key point in the standard is that binary<->decimal conversion is
listed as an OPERATION in section 5.  It's one of the things that can
signal exceptions (if you have EXPLICITLY asked for exceptions, by
means outside the scope of the standard).  So if I write
	double boojum = 1.0e999;
I have specified in my program an >operation< which might return an
Infinity or might signal Overflow, depending on the run-time state.
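
The run-time character of that operation can be made visible with the
exception-flag interface that later entered C99 as <fenv.h>; whether a
given strtod raises the IEEE Overflow flag is implementation-dependent.
A sketch:

	#include <stdio.h>
	#include <stdlib.h>
	#include <fenv.h>
	#pragma STDC FENV_ACCESS ON

	int main(void)
	{
		feclearexcept(FE_ALL_EXCEPT);
		double boojum = strtod("1.0e999", NULL);	/* run-time conversion */
		if (fetestexcept(FE_OVERFLOW))
			printf("conversion signalled Overflow; got %g\n", boojum);
		else
			printf("no Overflow flag; got %g\n", boojum);
		return 0;
	}
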
Consider
	/* fragment 1; clearly run-time */
	static double x = 1.1e300;
	static double y = 1.2e300;
	void foo() { double z = x*y; ... }

	/* fragment 2; is this compile-time or what? */
	#define x (1.1e300)
	#define y (1.2e300)
	void foo() { double z = x*y; ... }

	/* fragment 3; how about this one? */
	static const double x = 1.1e300;
	static const double y = 1.2e300;
	void foo() { double z = x*y; ... }

In each case I have specified in my program that a certain operation
be performed involving certain operands; the same operation and operands
in each case.  If I get an Infinity value in z when I had asked for
exceptions, I may be in trouble...

What's the relevance of this to the conversion of literals?  Just that
conversion of 1.2e300 from decimal to binary is in every respect just
as much an "operation" as multiplication is.  It is influenced by the
rounding mode and exception mode in just the same ways that multiplication
is.

Now, I don't myself know of _any_ compiler that does decimal->binary
conversion at run time, and that interpretation would obviously cause
difficulties for static initialisers.  But the other interpretation
means that we cannot express IEEE special values portably, so either
way hex floats are useful.
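
For the record, hexadecimal floating constants did eventually enter the
language (in C99, long after this thread).  The notation gives the bits
exactly, so no decimal -> binary rounding is involved; a sketch:

	#include <stdio.h>

	int main(void)
	{
		double three = 0x1.8p+1;	/* 1.5 * 2^1 == 3.0, exactly */
		double dmax = 0x1.fffffffffffffp+1023;	/* largest finite IEEE double */
		printf("%g %g\n", three, dmax);
		return 0;
	}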

IEEE 854 is similar to 754.  They mostly stuck to the same wording.
The main point of 854 was to generalise to other word lengths than
32, 64, &c, and to allow decimal as well (so you can have an 854-
conformant pocket calculator).


There is a proposed standard for the floating-point aspects of
programming language standards.  I just borrowed a copy of the
SigPlan Notices issue containing it, and don't know what it says
yet.  No doubt that will make life even more interesting: so many
standards to choose from!
-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."

gwyn@smoke.BRL.MIL (Doug Gwyn) (05/31/90)

In article <1990May30.205436.11534@twinsun.com> eggert@twinsun.com (Paul Eggert) writes:
>Doug Gwyn replied:
>	These don't work.  -0.0 is NOT a "minus zero"; it's identical to 0.0.
>Surely ANSI C doesn't require this; it is inconsistent with IEEE 754.

I sometimes wonder why I waste my time responding.
IEEE 754 does not apply to C source code!
The C standard says simply that the value of "-x" is the "negative of x";
in the case of integer operations on a ones' complement architecture,
which does have a representation for "negative zero" integers, one can
deduce from a combination of specifications in the standard that "-0"
must have the same representation as "0", not all-one-bits.  While I
doubt that there are tight enough constraints to deduce that the
analogous situation is mandated for floating-point, I'm sure that it is
allowed.  Since no special requirements are made for IEEE FP environments
in the C standard, it would then be allowed by the C standard for such
environments also.

Note that "-0.0" consists of the TWO tokens "-" and "0.0"; it is NOT a
single floating constant, and thus is covered by the ordinary rules for
negation in expressions, not treated as some special code that must be
represented by the IEEE bit pattern for "minus zero".
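
The same question arises with no literal in sight; a sketch assuming
nothing beyond ANSI C, whose output is exactly what is in dispute:

	#include <stdio.h>

	int main(void)
	{
		double x = 0.0;
		/* "-0.0" in source is unary minus applied to 0.0, so it
		   stands or falls with ordinary negation of a zero value. */
		printf("%g\n", -x);
		return 0;
	}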

>IEEE 754 doesn't specify the output format, but it does require that
>+0 and -0 be output differently, ...

However, it does not require that writing "-0.0" in C source code be
a way of producing a "minus zero".  IF you are somehow able to cobble
up a minus zero result, perhaps through bitwise operations or by
dividing 1 by -infinity (produced as an overflow or something), THEN
printing that result would have to print something distinguishable from
(plus) zero in order to satisfy IEEE 754.
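
Two ways one might cobble up such a minus zero, sidestepping the
tokenization question entirely (copysign is assumed from the BSD/C99
math library, and the division assumes IEEE non-stop arithmetic):

	#include <stdio.h>
	#include <math.h>

	int main(void)
	{
		double a = copysign(0.0, -1.0);	/* force the sign bit on */
		double b = 0.0 / -1.0;		/* IEEE: (+0)/(-1) == -0 */
		printf("%g %g\n", a, b);	/* "-0 -0" where the library honors the sign */
		return 0;
	}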

>This is not purely academic.  Of the two C compilers on my IEEE 754-based
>machine, one prints "-0" and the other "0" when given the above program.  I've
>sent a bug report to the latter's implementer, and fully expect it to get fixed.

I would have sent a bug report to the former, since although the former's
behavior may technically be allowed, it is not what I would expect for
negation of an expression that happens to have the value 0.  Mathematically
it is nonsense to say -0 is not identical to 0.  (Yes, I've heard all the
arguments for the IEEE 754 extended number system, but I don't buy them.
There are much more useful extensions that would also have been
mathematically valid, but the real number represented by -0 is identically
the real number represented by 0.)

>But I explicitly assumed conformance to both ANSI C and IEEE 754.

So did I.