[comp.lang.c] MSC C arithmetic

jdickson@zook.Jpl.Nasa.GOV (Jeff Dickson) (08/14/87)

	I recently wrote a program for the PC that performed some 
arithmetic. I compiled the code using MSC 'C' 4.0. The problem I
had was that simple integer and floating point arithmetic was not
being carried out properly unless I explicitly cast the operands
to the type of the result.

	One of the expressions involved multiplying two integers
and placing the result in a long integer. The code looked something
like this:

	unsigned int repeat_cnt;
	unsigned int units;
	unsigned long n;

	for (n = repeat_cnt * units; n > 0; n--)

	If repeat_cnt equaled 10,000 and units equaled 10, n did
not get set to 100,000! Rather n got set to 34,464 - which is what
the result would be if n were an int not a long (100,000 - 65,536).
However, if I explicitly cast the values as longs then n got assigned
correctly.

	for (n = (long)repeat_cnt * (long)units; n > 0; n--)

	I had a similar problem with a floating point expression. Not
explicitly casting the operands to float caused a completely bogus
result. In this case, the expression looked something like this:

	double total_bits;
	unsigned int repeat_cnt;
	unsigned int units;
	unsigned int blksiz;

	total_bits = repeat_cnt * units * blksiz * 16.0

	If I interpret K & R correctly, explicit casting of the above
operands and the operands in the first example is not necessary. What
gives? The Sun will run the above code without modification and yield
the expected results. 

	Is this a bug? Or a feature? Or within the spec of the language
that leaves it up to the compiler writer to generate code that works or
breaks? 

Jeff S. Dickson
jdickson@zook.jpl.nasa.gov

chris@mimsy.UUCP (Chris Torek) (08/14/87)

In article <8792@brl-adm.ARPA> jdickson@zook.Jpl.Nasa.GOV (Jeff
Dickson) writes:
>	unsigned int repeat_cnt;
>	unsigned int units;
>	unsigned long n;

>	for (n = repeat_cnt * units; n > 0; n--)

[and]

>	double total_bits;
>	unsigned int repeat_cnt;
>	unsigned int units;
>	unsigned int blksiz;

>	total_bits = repeat_cnt * units * blksiz * 16.0

[produce unexpected results].

Microsoft C is compiling the first example correctly.  The extension
rules require only that `repeat_cnt * units' be done in unsigned
arithmetic; at least one of the two multiplicands must be of type
long or unsigned long for the operation to be done in long arithmetic.

The second result is not so defensible; the 16.0 should modify the
entire operation so that it is done in double.  This follows from
the type matching rules: typeof(repeat_cnt * (expr0)) is
typemax(typeof(repeat_cnt), typeof(expr0)); typeof(expr0) is
typemax(typeof(units), typeof(expr1)); typeof(expr1) is
typemax(typeof(blksiz), typeof(16.0)) which is typemax(type_u_int,
type_double) which is just type_double; this propagates all the
way back to the top.

The first example works on a Sun because there sizeof(u_int) ==
sizeof(u_long), so unsigned int arithmetic is the same as
unsigned long arithmetic.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris
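
The first example can be reproduced on a machine with wider ints by
modelling the PC's 16-bit unsigned int explicitly. The sketch below is
only an illustration: uint16_t and uint32_t stand in for MSC's unsigned
int and unsigned long, and the function names are made up for this
example.

```c
#include <stdint.h>

/* Models `n = repeat_cnt * units` where unsigned int is 16 bits:
 * the product is reduced modulo 65536 before it is widened for n. */
uint32_t product_in_16_bits(uint16_t repeat_cnt, uint16_t units)
{
    /* Widen first so the multiply itself cannot overflow a promoted int,
     * then truncate to 16 bits to imitate the narrow multiply. */
    uint16_t narrow = (uint16_t)((uint32_t)repeat_cnt * units);
    return narrow;
}

/* Models `n = (long)repeat_cnt * (long)units`: the operands are widened
 * before the multiply, so the full product survives. */
uint32_t product_in_32_bits(uint16_t repeat_cnt, uint16_t units)
{
    return (uint32_t)repeat_cnt * (uint32_t)units;
}
```

With repeat_cnt = 10000 and units = 10, the first function returns 34464
and the second 100000, matching the values reported in the original post.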

ark@alice.UUCP (08/14/87)

In article <7976@mimsy.UUCP>, chris@mimsy.UUCP writes:
> 
> >	total_bits = repeat_cnt * units * blksiz * 16.0
> 
> [produce unexpected results].
> 
> The second result is not so defensible; the 16.0 should modify the
> entire operation so that it is done in double.

Nope.  This example should be parsed as

	total_bits = (((repeat_cnt * units) * blksiz) * 16.0);

Thus the first two multiplications should be done in integer,
and the result converted to double for the third multiplication.
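
A sketch of that grouping, using uint16_t as a stand-in for a 16-bit
unsigned int (an assumption made for illustration; the helper names are
hypothetical). Each integer multiply wraps before the single conversion
to double.

```c
#include <stdint.h>

/* Multiply two values as a 16-bit unsigned int would: modulo 65536. */
static uint16_t mul_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * b);
}

/* ((repeat_cnt * units) * blksiz) * 16.0 with 16-bit unsigned ints:
 * two wrapping integer multiplies, then one double multiply. */
double total_bits_as_parsed(uint16_t repeat_cnt, uint16_t units, uint16_t blksiz)
{
    uint16_t first = mul_u16(repeat_cnt, units);
    uint16_t second = mul_u16(first, blksiz);
    return second * 16.0;
}

/* Reference value: every multiply carried out in double. */
double total_bits_exact(uint16_t repeat_cnt, uint16_t units, uint16_t blksiz)
{
    return (double)repeat_cnt * units * blksiz * 16.0;
}
```

For repeat_cnt = 10000, units = 10 and blksiz = 512 (blksiz is not given
in the thread; 512 is an arbitrary choice), the grouped version yields
262144.0 while the exact value is 819200000.0.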

devine@vianet.UUCP (Bob Devine) (08/15/87)

In article <8792@brl-adm.ARPA>, jdickson@zook.Jpl.Nasa.GOV (Jeff Dickson) writes:
> 	I recently wrote a program for the PC that performed some 
> arithmetic. I compiled the code using MSC 'C' 4.0. The problem I
> had was that simple integer and floating point arithmetic was not
> being carried out properly unless I explicitly cast the operands
> to the type of the result.

  But the operation was done properly; it was just unexpected to you.
According to the "usual arithmetic conversion rules" 2 ints will produce
an int.  If it so happens that overflow occurs because of the operation,
the resultant int value will simply not be the expected value.

  And as you discovered, according to the "usual arithmetic conversion
rules" where one operand is a long and the other an int, the operand
that is an int is converted to a long and the result is a long.  This
gave you what you wanted.

  Look up "arithmetic conversions" in K&R or "the usual binary conversions"
in Harbison/Steele.  Sam Harbison posted a modified conversion list
to net.lang.c in '85.

  On PCs, an 'int' is 16 bits.  You have to use a long to hold the value
for the example code you posted.  Yeah, there are times I want a VAX too.

Bob Devine
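
One way to see in advance whether a product fits is to test it against
the largest value the type can hold. The helper below is a hypothetical
sketch (product_exceeds is not from any library); the limit is passed in
so the same check can model a 16-bit or a 32-bit unsigned int.

```c
#include <stdbool.h>

/* Returns true when a * b would be larger than max, without ever
 * computing the product.  Pass 65535UL for a 16-bit unsigned int
 * or 4294967295UL for a 32-bit one. */
bool product_exceeds(unsigned long a, unsigned long b, unsigned long max)
{
    if (a == 0 || b == 0)
        return false;          /* the product is 0, which always fits */
    return a > max / b;        /* a * b > max  <=>  a > floor(max / b) */
}
```

For the values in the original post, product_exceeds(10000, 10, 65535UL)
is true (the PC case) and product_exceeds(10000, 10, 4294967295UL) is
false (the 32-bit case).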

chris@mimsy.UUCP (Chris Torek) (08/17/87)

>In article <7976@mimsy.UUCP> I wrote:
>> The second result is not so defensible; the 16.0 should modify the
>> entire operation so that it is done in double.

In article <7178@alice.UUCP> ark@alice.UUCP writes:
>Nope.  This example should be parsed as
>
>	total_bits = (((repeat_cnt * units) * blksiz) * 16.0);

ark is right (as usual); I was at home, working without a K&R.  I
had originally hedged on this, then took out the hedging.  Sorry
about that.

(And this posting was delayed two days by hardware problems ... sigh.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

am@cam-cl.UUCP (08/17/87)

In article <7976@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
[discussing:]
>>	double total_bits;
>>	unsigned int repeat_cnt, units, blksiz;
>>	total_bits = repeat_cnt * units * blksiz * 16.0
>[produce unexpected results].
>
>The second result is not so defensible; the 16.0 should modify the
>entire operation so that it is done in double.
> [...] type_double; this propagates all the way back to the top.
>
I believe this to be a mis-reading of the rules.
The expression 'repeat_cnt * units * blksiz * 16.0' parses as
'((repeat_cnt * units) * blksiz) * 16.0' as * is defined
to be left associative.  The fact that '((repeat_cnt * units) * blksiz)'
is evaluated in 'double' context is not relevant.
The fix is to write
  '(double)repeat_cnt * units * blksiz * 16.0' or
  'repeat_cnt * (units * (blksiz * 16.0))'
to force more multiplications to be done in double mode.
  '(unsigned long)repeat_cnt * units * blksiz * 16.0' probably gives
exact correspondence with 32-bit int machines.
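
The three rewrites can be compared side by side. This is a sketch under
a modelling assumption: uint16_t plays the PC's 16-bit unsigned int and
uint32_t its 32-bit unsigned long. The function names are invented for
the example.

```c
#include <stdint.h>

/* (double)repeat_cnt * units * blksiz * 16.0
 * The cast makes the leftmost operand double, and left-to-right
 * grouping then carries double through every later multiply. */
double fix_cast_to_double(uint16_t repeat_cnt, uint16_t units, uint16_t blksiz)
{
    return (double)repeat_cnt * units * blksiz * 16.0;
}

/* repeat_cnt * (units * (blksiz * 16.0))
 * The parentheses put 16.0 into the innermost multiply instead. */
double fix_regroup(uint16_t repeat_cnt, uint16_t units, uint16_t blksiz)
{
    return repeat_cnt * (units * (blksiz * 16.0));
}

/* (unsigned long)repeat_cnt * units * blksiz * 16.0
 * The integer multiplies are kept but done modulo 2^32. */
double fix_cast_to_ulong(uint16_t repeat_cnt, uint16_t units, uint16_t blksiz)
{
    uint32_t product = (uint32_t)repeat_cnt * units;
    product *= blksiz;
    return product * 16.0;
}
```

All three return 819200000.0 for 10000, 10 and 512 (an arbitrary block
size). They part ways only when the true product exceeds 2^32 - 1: the
unsigned long version then wraps, as it would on a machine with 32-bit
ints.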

dg@wrs.UUCP (David Goodenough) (08/17/87)

In article <8792@brl-adm.ARPA> jdickson@zook.Jpl.Nasa.GOV (Jeff Dickson) writes:
> ..... stuff deleted .....
>	One of the expressions involved multiplying two integers
>and placing the result in a long integer. The code looked something
>like this:
>
>	unsigned int repeat_cnt;
>	unsigned int units;
>	unsigned long n;
>
>	for (n = repeat_cnt * units; n > 0; n--)
>
>	If repeat_cnt equaled 10,000 and units equaled 10, n did
>not get set to 100,000! Rather n got set to 34,464 - which is what
>the result would be if n were an int not a long (100,000 - 65,536).

Your MSC C compiler is behaving correctly. If you read page 41 of the
gospel according to Kernighan and Ritchie :-) it says:

	char & short are converted to int ......
	.
	.
	.
	Otherwise if either operand is long, the other is converted to long,
	and the result is long.
	.
	.
	.
	Otherwise the operands must be int, and the result is int.

The practical upshot of this is that the type of an expression is determined
from the operands, NOT the result, i.e. without a cast the multiplication is
done as int, not long. It then goes on to say that conversions take place
across assignments: the value of the RHS is converted to the type of the
LHS, which is the type of the result. BUT SINCE '*' has a higher priority
than '=', the mult is still done as int:

		     long assignment
			=
		       / \
		long  /   \ int converted to long
		     /	   \
		    n	    * int multiplication
			   / \
		      int /   \ int
			 /     \
		 repeat_cnt   units

The reason this works on the Sun is that on most 68K machines ints and longs
are the same size: 32 bits.

The second example fails because '*' groups left to right (see page 49 of the
gospel according to Kernighan and Ritchie :-) and it does the two integer
multiplications first, and then the float (i.e. double). This should work
correctly if you wrap the last multiplication in parentheses:

>	total_bits = repeat_cnt * (units * (blksiz * 16.0));

as this then forces all multiplications to have one float operand, hence
getting the desired result.
--
		dg@wrs.UUCP - David Goodenough

					+---+
					| +-+-+
					+-+-+ |
					  +---+
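
The rule that the operands, not the destination, fix the type of an
expression can be checked directly on a C11 compiler. The sketch below
uses _Generic, which postdates this thread; KIND_OF and the enum are
names made up for the illustration. It reports types on whatever machine
compiles it, so it shows the conversion rules rather than the 16-bit
overflow.

```c
#include <stdio.h>

enum kind { KIND_INT, KIND_UINT, KIND_LONG, KIND_ULONG, KIND_DOUBLE, KIND_OTHER };

/* Maps the type of an expression to a tag at compile time.
 * The expression itself is never evaluated. */
#define KIND_OF(expr) _Generic((expr), \
    int: KIND_INT,                     \
    unsigned int: KIND_UINT,           \
    long: KIND_LONG,                   \
    unsigned long: KIND_ULONG,         \
    double: KIND_DOUBLE,               \
    default: KIND_OTHER)

static const char *kind_name(enum kind k)
{
    static const char *const names[] = {
        "int", "unsigned int", "long", "unsigned long", "double", "other"
    };
    return names[k];
}

/* Prints the type in which each expression from the thread is evaluated. */
void print_expression_types(void)
{
    unsigned int repeat_cnt = 10000, units = 10, blksiz = 512;

    printf("repeat_cnt * units                 : %s\n",
           kind_name(KIND_OF(repeat_cnt * units)));
    printf("(unsigned long)repeat_cnt * units  : %s\n",
           kind_name(KIND_OF((unsigned long)repeat_cnt * units)));
    printf("repeat_cnt * units * blksiz * 16.0 : %s\n",
           kind_name(KIND_OF(repeat_cnt * units * blksiz * 16.0)));
}
```

On a conforming compiler the first line reports unsigned int, the second
unsigned long and the third double, whatever the type of the variable
that receives the result.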

msb@sq.uucp (Mark Brader) (08/18/87)

To the original question about:

> >	unsigned int repeat_cnt;
> >	unsigned int units;
> >	unsigned long n;
> >	for (n = repeat_cnt * units; n > 0; n--) ...

> >	double total_bits;
> >	unsigned int repeat_cnt;
> >	unsigned int units;
> >	unsigned int blksiz;
> >	total_bits = repeat_cnt * units * blksiz * 16.0

Chris Torek writes:

> Microsoft C is compiling the first example correctly.  The extension
> rules require only that `repeat_cnt * units' be done in unsigned
> arithmetic; at least one of the two multiplicands must be of type
> long or unsigned long for the operation to be done in long arithmetic.

which is true, but he continues:

> The second result is not so defensible; the 16.0 should modify the
> entire operation so that it is done in double.  This follows from
> the type matching rules ...

which I think is wrong.  I quote K&R appendix A section 7.3: "The
multiplicative operators *, /, and % group left-to-right."  In the
ANSI draft, the same statement is concealed behind the grammar
in 3.3.5 that defines a multiplicative expression as being,
among other things, "multiplicative-expression * cast-expression".

It seems clear to me that both sources are saying that the assignment
must be considered as:

	total_bits = ((repeat_cnt * units) * blksiz) * 16.0

whereupon the two left-hand multiplies are done, as the MS C compiler
compiled, in unsigned ints.

I must admit that it is NOT clear to me whether the regrouping rules
allow the compiler to alternatively compile this as:

	total_bits = repeat_cnt * (units * (blksiz * 16.0))

whereupon the arithmetic would indeed be done in double.  I think they do.
Perhaps Doug or someone who has studied this closely could comment on it.
That is, if the removal (to which I probably object) of the regrouping
rules has not already happened.

Mark Brader			"The language should match the users,
utzoo!sq!msb			 not vice versa"  -- Brian W. Kernighan

randy@umn-cs.UUCP (Randy Orrison) (08/20/87)

In article <305@wrs.UUCP> dg@wrs.UUCP (David Goodenough) writes:
>The second example fails because '*' groups left to right (see page 49 of the
>gospel according to Kernighan and Ritchie :-) and it does the two integer
>multiplications first, and then the float (i.e. double). This should work
>correctly if you wrap the last multiplication in parentheses:
>
>>	total_bits = repeat_cnt * (units * (blksiz * 16.0));
>
>as this then forces all multiplications to have one float operand, hence
>getting the desired result.

Interesting.  Page 49 of my K&R states that "expressions involving one of the
associative and commutative operators (*,...) can be rearranged even when
parenthesized."  Which means that your parentheses should do nothing, and that
the original expression is equivalent to:

	16.0 * repeat_cnt * units * blksiz

Which would mean that the whole expression should be evaluated in double.

Since the compiler is free to evaluate the expression in whatever order it
wants, does that mean that it is not determined what types it will use?
It seems to me that both of the following are legal:

	repeat_cnt * units * blksiz * 16.0
		--int--	       --double--
		     --double--
and
	16.0 * blksiz * units * repeat_cnt
	 --double--
		 --double--
			--double--

Which we know can give different results.  Is this right? (under K&R and ANSI)

(Side note to ANSI people:  I like to use parentheses to make my meaning clear
 to me.  If the compiler can think of a better order to do things, LET IT!!!
 I LOVE unary plus, though temporary variables are just fine.)

-- 
Randy Orrison, University of Minnesota School of Mathematics
UUCP:	{ihnp4, seismo!rutgers!umnd-cs, sun}!umn-cs!randy
ARPA:	randy@ux.acss.umn.edu		 (Yes, these are three
BITNET:	randy@umnacvx			 different machines)

chris@mimsy.UUCP (Chris Torek) (08/20/87)

In article <2042@umn-cs.UUCP> randy@umn-cs.UUCP (Randy Orrison) writes:
>... Page 49 of my K&R states that "expressions involving one of the
>associative and commutative operators (*,...) can be rearranged even when
>parenthesized."

This quote refers to *evaluation order*, not *result type*.  (Today I
have my K&R handy!)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

tim@amdcad.AMD.COM (Tim Olson) (08/20/87)

In article <2042@umn-cs.UUCP> randy@umn-cs.UUCP (Randy Orrison) writes:
+-----
|Since the compiler is free to evaluate the expression in whatever order it
|wants, does that mean that it is not determined what types it will use?
|It seems to me that both of the following are legal:
|
|	repeat_cnt * units * blksiz * 16.0
|		--int--	       --double--
|		     --double--
|and
|	16.0 * blksiz * units * repeat_cnt
|	 --double--
|		 --double--
|			--double--
|
|Which we know can give different results.  Is this right? (under K&R and ANSI)
+-----
Yup.  To guarantee that the calculation is done correctly, you should
write it:

	16.0 * (double)blksiz * (double)units * (double)repeat_cnt;


	-- Tim Olson
	Advanced Micro Devices
	(tim@amdcad.amd.com)

peter@sugar.UUCP (Peter da Silva) (08/21/87)

>	long = int * int. Should it become long = long * long.

Depends on how you break up the expression. I would expect the compiler
to parse like this:

		expr
		/|\
	       / | \
	    expr = expr		promote to long
		   /|\
		  / | \
	       expr * expr	leave integer

This is more efficient, for one thing. It also means the compiler doesn't
have to worry about propagating promotion rules into subexpressions. The
expression ends up looking like:

	long = (long)(int)
	     = (long)(int * int).
-- 
-- Peter da Silva `-_-' ...!seismo!soma!uhnix1!sugar!peter (I said, NO PHOTOS!)