[net.lang.c] weird C behavior

nather@utastro.UUCP (Ed Nather) (03/21/86)

Here is a short C program which gives correct output according to K & R.
While I can't argue it's wrong, it is not transparently right, either.
The following was extracted directly from the display screen of my
IBM PC.  Try it on *your* 16-bit computer.  Then explain it to a friend.
-----------------------------------------------------------------------
C% print weird.c

/* weird.c - demonstrate weird C behavior */
/* Reference: Kernighan and Ritchie pp. 40-41 */

#include <stdio.h>

#define BIG 36864

main()
{
int i;

i = BIG;
if(i == BIG)
    printf("Equality found: i = %d, BIG = %d\n", i, BIG);
else
    printf("Equality NOT found: i = %d, BIG = %d\n", i, BIG);
}

C% weird

Equality NOT found: i = -28672, BIG = -28672

-----------------------------------------------------------------------
-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.UTEXAS.EDU

roy@gitpyr.UUCP (Roy Mongiovi) (03/22/86)

In article <557@utastro.UUCP>, nather@utastro.UUCP (Ed Nather) writes:
> Equality NOT found: i = -28672, BIG = -28672

Ok.  The #defined constant is greater than 32767, so it's a long on a
16 bit machine.  Top 16 bits are zero, bottom are 0x9000.  It gets
assigned into a SIGNED integer, so that gets the bottom 16 bits.
Then to perform the comparison, C promotes the int to a long, and
since it's signed that extends the sign bit.  One value now has
the top 16 bits zero, the other has 0xFFFF.  Unfortunately, the
program ignores the fact that the constant is a long, and printf's
it through a %d.  Since words and bytes are stored in least significant,
most significant order in memory, the first 16 bits that %d picks up
from the pushed long are its bottom 16 bits, which equal i.  Therefore
the printed output seems to contradict the comparison.

The real question is:  who made the mistake?  The compiler by assuming
the constant is a long, or the programmer by printing with a %d?  Maybe
you ought to have to say L to get a long constant....
-- 
Roy J. Mongiovi.	Office of Computing Services.		User Services.
Georgia Institute of Technology.	Atlanta GA  30332.	(404) 894-6163
 ...!{akgua, allegra, amd, hplabs, ihnp4, masscomp, ut-ngp}!gatech!gitpyr!roy

jsdy@hadron.UUCP (Joseph S. D. Yao) (03/22/86)

In article <557@utastro.UUCP> nather@utastro.UUCP (Ed Nather) writes:
>/* weird.c - demonstrate weird C behavior */
>/* Reference: Kernighan and Ritchie pp. 40-41 */
>#include <stdio.h>
>#define BIG 36864
>main()
>{
>int i;
>
>i = BIG;
>if(i == BIG)
>    printf("Equality found: i = %d, BIG = %d\n", i, BIG);
>else
>    printf("Equality NOT found: i = %d, BIG = %d\n", i, BIG);
>}
>Equality NOT found: i = -28672, BIG = -28672

Check the code produced by your compiler.  Could you be mixing
16- and 32-bit operations?  I.e., could the compiler be moving
the BIG word into 16-bit i (making it a negative number), then
(finding that BIG is > 32767) upgrading both to (long) before
the compare?  Check and see.  You don't say what machine or C
you used, so we have no way of telling.
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

chris@umcp-cs.UUCP (Chris Torek) (03/22/86)

In article <557@utastro.UUCP> nather@utastro.UUCP (Ed Nather) writes:

>Here is a short C program which gives correct output according to K & R.
>[...]  Try it on *your* 16-bit computer.

The code contains the following [paraphrased]:

	printf("%d\n", 36864);

On a 16 bit machine, this should read

	printf("%ld\n", 36864);

One alternative is to change Ed's original program to read

	#define BIG ((int) 36864)

Too bad lint does not catch the printf() type bug.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

dube@csd2.UUCP (Tom Dube) (03/22/86)

>Here is a short C program which gives correct output according to K & R.
>While I can't argue it's wrong, it is not transparently right, either.
>#include <stdio.h>
>#define BIG 36864
>int i;
>i = BIG;
>if (i != BIG)
>   printf("Equality NOT found: i = %d, BIG = %d\n", i, BIG);

 Sure it's correct. i is not 36864 because it is forced into a 16
bit integer.  The numbers look the same when you print them because
you are also forcing BIG into an integer with the format %d.  Try
printing both numbers with %ld.
                                    Tom Dube

jsdy@hadron.UUCP (Joseph S. D. Yao) (03/23/86)

In article <436@umcp-cs.UUCP> chris@umcp-cs.UUCP (Chris Torek) writes:
>In article <557@utastro.UUCP> nather@utastro.UUCP (Ed Nather) writes:
>>Here is a short C program which gives correct output according to K & R.
>>[...]  Try it on *your* 16-bit computer.
>The code contains the following [paraphrased]:
>	printf("%d\n", 36864);
>On a 16 bit machine, this should read
>	printf("%ld\n", 36864);
>One alternative is to change Ed's original program to read
>	#define BIG ((int) 36864)

Passed arguments should always be passed as an "int", I do believe.
Changing the printf specification will  n e v e r  change what the
C compiler does with the rest of the arguments!!  Nather's original
posting led me to believe he was using some kind of a 16/32 bit
machine, with a C compiler that had not quite been consistent.  I.e.,
on the comparison, all I could think was that a 16-bit int had sign-
extended to compare with an IMPLICIT long constant (look at it: it's
not an explicit long constant!).  This is inconsistent.  However, an
arg has to be explicitly declared; so the int default almost has to
be honoured.

Sorry to have to publicly disagree, Chris.
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

gwyn@BRL.ARPA (VLD/VMB) (03/24/86)

That's easy; BIG is a (long), i is an (int),
and the code blatantly ignores the difference.

kwh@bentley.UUCP (KW Heuer) (03/24/86)

In article <3090013@csd2.UUCP> csd2!dube (Tom Dube) writes:
>The numbers look the same when you print them because
>you are also forcing BIG into an integer with the format %d.  Try
>printing both numbers with %ld.

But if you do, remember to cast the corresponding argument into (long).

greg@utcsri.UUCP (Gregory Smith) (03/25/86)

In article <3090013@csd2.UUCP> dube@csd2.UUCP (Tom Dube) writes:
>>Here is a short C program which gives correct output according to K & R.
>>While I can't argue it's wrong, it is not transparently right, either.
>>#include <stdio.h>
>>#define BIG 36864
>>int i;
>>i = BIG;
>>if (i != BIG)
>>   printf("Equality NOT found: i = %d, BIG = %d\n", i, BIG);
>
> Sure it's correct. i is not 36864 because it is forced into a 16
>bit integer.  The numbers look the same when you print them because
>you are also forcing BIG into an integer with the format %d.  Try
>printing both numbers with %ld.
>                                    Tom Dube

Lots of people noticed that 36864 is a (long) while i is an (int).
(We must be dealing with 16-bit ints, of course.)  A much more
serious problem exists in the printf call, which reduces to

	printf("..%d..%d..", i, 36864 );

Since 36864 is a long (32 bits), and assuming 16-bit addresses (e.g. PDP-11),
the parameter list in the stack for printf looks like this:

	< address of string >
	< value of i        >
	< low word of 36864 >
	< hi word of 36864  >

[ Note: I am assuming that the low word of a long is stored first. ]
So printf sees that two (2) integers are required
(_by_looking_at_the_string_) and prints i and the lo word of the constant.
You see the problem?

	printf("...%d...%d...%s", i, 36864, "holy lint, batman!");

The stack is now as above, with the addition of one more address at the
end of the list. printf sees that two integers and one string are required,
so it prints the same 2 numbers as before, and _then_tries_to_use_the_high_
_word_of_36864_as_a_string_address_! Gotcha!

The arguments to printf MUST AGREE with the types specified in the format
string. There is no way for printf to tell the types _or_the_data_sizes_
by looking at the stack. Thus simply changing the "%d"'s in the string to
"%ld"'s, as suggested by Mr. Dube, will most probably result in strangeness.
We need to convert 'i' to a long before that will work.
What is needed is:

 printf("Equality NOT found: i = %ld, BIG = %ld\n", (long) i, (long) BIG);

Now printf expects two longs, and is handed two longs. Note that the
second (long) [ before BIG ] is redundant in this case, but it is what we
call a Very Good Idea :-).

Moral of the story: Be *sure* that the types of parameters to printf are
compatible with the format specifiers given in the format string.
Or: 'Caveat Printor' (groan...).  Us VAXers can be more careless because
our ints and longs are both 32 bits!

Note: I am 97.23% certain that LINT will not catch any error of this type.

-- 
"No eternal reward will forgive us now for wasting the dawn" -J. Morrison
----------------------------------------------------------------------
Greg Smith     University of Toronto       ..!decvax!utzoo!utcsri!greg

broehl@watdcsu.UUCP (Bernie Roehl) (03/25/86)

In article <436@umcp-cs.UUCP> chris@umcp-cs.UUCP (Chris Torek) writes:
>
>The code contains the following [paraphrased]:
>
>	printf("%d\n", 36864);
>
>On a 16 bit machine, this should read
>
>	printf("%ld\n", 36864);
>

Both are wrong on a 16 bit machine, and both will produce garbage (though
probably different garbage).  What you want is

        printf("%ld\n", 36864L);

Note the 'L' suffix; this is how you tell C that you mean a Long constant.

chris@umcp-cs.UUCP (Chris Torek) (03/27/86)

In article <330@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:

>>On a 16 bit machine, this should read
>>	printf("%ld\n", 36864);		[me]

>Passed arguments should always be passed as an "int", I do believe.
>...  Sorry to have to publicly disagree, Chris.

Unfortunately for you, I checked my sources first.  The clue was
there in Ed's original posting.  According to K&R, a constant that
is too big to be an `int' is automatically considered a `long'; on
a 16 bit machine,

	printf("%ld\n", 36864);

and

	printf("%ld\n", 36864L);

build the exact same stack.

For you ANSI C buffs, I quote from a year-old draft (the latest I
could find), section C.1.3.2, `Integer constants':

	If the value of an unsuffixed decimal constant (base 10) is no
	greater than that of the largest signed int, the constant has
	type signed int; otherwise it has type signed long int.

Now, if you were using ANSI C and had written, say,

	void myfunc(int);

	myfunc(36864);

you would get something closer to what you expected. . . .
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

ark@alice.UucP (Andrew Koenig) (03/27/86)

>>The code contains the following [paraphrased]:
>>	printf("%d\n", 36864);
>>On a 16 bit machine, this should read
>>	printf("%ld\n", 36864);
>>One alternative is to change Ed's original program to read
>>	#define BIG ((int) 36864)

> Passed arguments should always be passed as an "int", I do believe.

Absolutely not!  Are you really suggesting there's no way to pass
a long integer to a function?  The following are correct:

	printf("%d\n", 1);
	printf("%ld\n", 1L);		/* long constant */
	printf("%ld\n", (long) 1);	/* long constant expression */

The following are incorrect:

	printf("%d\n", 1L);
	printf("%ld\n", 1);
	printf("%d\n", (long) 1);

This one is correct on some machines but not on others:

	printf("%d\n", 36864);

because 36864 is an int on some machines and a long on others.

jsdy@hadron.UUCP (Joseph S. D. Yao) (03/28/86)

In article <559@umcp-cs.UUCP> chris@umcp-cs.UUCP (Chris Torek) writes:
>In article <330@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>>>On a 16 bit machine, this should read
>>>	printf("%ld\n", 36864);		[me]
>>Passed arguments should always be passed as an "int", I do believe.
>                            ...  According to K&R, a constant that
>is too big to be an `int' is automatically considered a `long'; ...
>For you ANSI C buffs, I quote from a year-old draft (the latest I
>could find), section C.1.3.2, `Integer constants':

Moral:  check your sources (no pun intended?).  We analysed this
"weird behaviour" the same way; but I was wrong in calling it a
compiler error.

End of discussion.
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

chris@umcp-cs.UUCP (Chris Torek) (03/30/86)

In article <2194@watdcsu.UUCP> broehl@watdcsu.UUCP (Bernie Roehl) writes:
>In article <436@umcp-cs.UUCP> chris@umcp-cs.UUCP (Chris Torek) writes:
>>On a 16 bit machine, this should read
>>
>>	printf("%ld\n", 36864);		[me]
>
>Both are wrong on a 16 bit machine, and both will produce garbage (though
>probably different garbage).

False.  I have already explained why, with references to K&R and an
old X3J11 draft.

>What you want is
>
>        printf("%ld\n", 36864L);	[Bernie]

I agree that this is considerably better in practice.  In fact, I
would not at all mind a compiler that gave warnings whenever an
unsuffixed constant did not fit in an integer and was promoted to
long (or unsigned long for `U' suffixed and hex and octal constants,
per X3J11).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

jsdy@hadron.UUCP (03/31/86)

In article <5190@alice.uUCp> ark@alice.UucP (Andrew Koenig) writes:
>> Passed arguments should always be passed as an "int", I do believe.
>Absolutely not!

Folks, I'm not going to spend the rest of my life apologising for
those two stupid articles I wrote last week.  I hope.  Let's accept
that my brain was tied up with something else and get on with it.

What I  m e a n t  to say above was "Passed integer constants ... ".
As we all know, we can force a constant to be anything at all.  What
caught me was the surprise promotion of a constant to a long, with
no request from the user, if the constant is > the largest integer.
This, though a valid part of the language for years, I think is a
problem.  Then again, writing code that uses that kind of a constant
may be seen to be a problem also.  I have spent some effort in diverse
programs avoiding this.
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

bc@cyb-eng.UUCP (Bill Crews) (04/03/86)

> /* weird.c - demonstrate weird C behavior */
> /* Reference: Kernighan and Ritchie pp. 40-41 */
. . .
> #define BIG 36864
> 
> main()
> {
> int i;
> 
> i = BIG;
> if(i == BIG)
>     printf("Equality found: i = %d, BIG = %d\n", i, BIG);
> else
>     printf("Equality NOT found: i = %d, BIG = %d\n", i, BIG);
> }
> 
> Equality NOT found: i = -28672, BIG = -28672
> 
> -----------------------------------------------------------------------
> Ed Nather
> Astronomy Dept, U of Texas @ Austin

It seems to me that your compiler has done the right thing.  Since 36864
cannot be represented as an int on your machine, it generated a positive
long rather than a negative int.  When i is extended to a long, the sign
bit is extended, so the comparison fails, as it should.  The only reason
you got -28672 for BIG instead of nulls is because your machine has backwards
byte order.
-- 
	- bc -

..!{seismo,topaz,gatech,nbires,ihnp4}!ut-sally!cyb-eng!bc  (512) 835-2266

rbj%icst-cmr@smoke.UUCP (04/05/86)

Bill Crews writes:

	The only reason you got -28672 for BIG instead of nulls is
	because your machine has backwards byte order.

Sorry Bill, *you're* the one that's got backwards byte order. Little
Endian is `correct', even though it bucks historical convention.

My reasoning is this: The original byte ordering was done the obvious
way, Big Endian. If this was so perfect, why would a sane man design
anything Little Endian? For compelling mathematical reasons!
You wouldn't number your bits backwards (within a register) would you?
Admittedly, some people do, but they must not know any better.

I admit this causes some headaches because of our historical biases.
Unfortunately, this means I side with Intel and against Motorola on
this, but it just goes to show a company can't be all right or all wrong.
Like National goofed & called `longs' `doubles' & vice versa!

	(Root Boy) Jim Cottrell		<rbj@cmr>