[comp.lang.c] Is this a bug in some C compilers?

greim@sbsvax.UUCP (Michael Greim) (07/19/89)

Hello,
I tried the following program on several machines. IMHO the program
should not compile yet there are some compilers which compile it:

cc 43bsd
cc 43bsd-tahoe
cc ultrix 2.0
cc SunOS 3.5

And there are some that give an error message:

gcc 1.34
cc sinix 2.1 ("McClure")

Here is the program:

---------- cut here ----------------------------
/*
 * What does the compiler do with this?
 */
# include <stdio.h>

struct link {
	struct link * next;
	int count;
};

int fix0;
struct link * t;
int fix1;

main ()
{
	printf ("t lies at %1d, brk is at %1d\n", (int)(&t), (int)sbrk(0));
	printf ("Addresses of %1d byte objects around t : %1d and %1d\n",
		sizeof(int), (int)(&fix0), (int)(&fix1));
	t = NULL;
	t.count = 5;
	if (t.count == 5)
		printf ("Assignment worked.\n");
	else
		printf ("Assignment did not work\n");
	printf ("Value 5 has been put at address %1d\n", &(t.count));
}
---------- cut here ----------------------------

Output from VAX running 43bsd:

Script started on Wed Jul 19 11:44:22 1989
% cc test78.c
"test78.c", line 21: warning: struct/union or struct/union pointer required
"test78.c", line 22: warning: struct/union or struct/union pointer required
"test78.c", line 26: warning: struct/union or struct/union pointer required
% a.out
t lies at 6008, brk is at 6176
Addresses of 4 byte objects around t : 6004 and 6012
Assignment worked.
Value 5 has been put at address 6012
% exit
% 
script done on Wed Jul 19 11:44:47 1989

Am I correct to say that the program is not correct C, and that all compilers
which compile it are wrong?

	-mg

-- 
Michael Greim    Email : greim@sbsvax.informatik.uni-saarland.dbp.de
                 or    : ...!uunet!unido!sbsvax!greim
[.signature removed by the board of censors for electronic mail's main
executive computer because it contained a four letter word ("word")]

maddog@cbnews.ATT.COM (john.j.tupper) (07/20/89)

In article <800@sbsvax.UUCP> greim@sbsvax.UUCP (Michael Greim) writes:
>	printf ("Value 5 has been put at address %1d\n", &(t.count));
> ....
>Am I correct to say that the program is not correct C, and that all compilers
>which compile it are wrong?

Michael is complaining about the fact he got away with writing t.count when
t was a pointer (the compiler should have insisted on t->count).

K&R C treats member names as offsets. I.e. fred.foo turns into
*(&fred + offset_to_foo). What's more (this is the braindamaged part) member
names are global across all structures. This means the following is correct:

struct a {
	int a_mem;
};

struct b {
	int b_mem;
} one_b;

one_b.a_mem = 0;

As illustrated by Michael's example, the variable you take a member of doesn't
even have to be a structure at all!

K&R don't actually come out and say this, but they do allude to it. On page 197,
paragraph 5, it says that two structures may have the same member name
if the member is the same type and at the same offset in both structs.

Some C compilers retain this bogosity; some have fixed it.
--------------------------------------------------------------
sdfl sdf sldkf sdlfoi			My real signature is illegible too.

diamond@csl.sony.JUNET (Norman Diamond) (07/20/89)

In article <800@sbsvax.UUCP> greim@sbsvax.UUCP (Michael Greim)
asks why the following program compiled and executed:

-> # include <stdio.h>
-> struct link {
-> 	struct link * next;
-> 	int count;
-> };
-> int fix0;
-> struct link * t;
-> int fix1;
-> main ()
-> {
-> 	printf ("t lies at %1d, brk is at %1d\n", (int)(&t), (int)sbrk(0));
-> 	printf ("Addresses of %1d byte objects around t : %1d and %1d\n",
-> 		sizeof(int), (int)(&fix0), (int)(&fix1));
-> 	t = NULL;
-> 	t.count = 5;
-> 	if (t.count == 5)
-> 		printf ("Assignment worked.\n");
-> 	else
-> 		printf ("Assignment did not work\n");
-> 	printf ("Value 5 has been put at address %1d\n", &(t.count));
-> }
-> 
-> Output from VAX running 43bsd:
-> % cc test78.c
-> "test78.c", line 21: warning: struct/union or struct/union pointer required
-> "test78.c", line 22: warning: struct/union or struct/union pointer required
-> "test78.c", line 26: warning: struct/union or struct/union pointer required
-> % a.out
-> t lies at 6008, brk is at 6176
-> Addresses of 4 byte objects around t : 6004 and 6012
-> Assignment worked.
-> Value 5 has been put at address 6012

In fact, the answer was discussed very recently, and was described as
"weard" [sic].  Your instantiation of this obscure feature is, perhaps,
even "wearder."

t is a variable, so it may be used as an lvalue.  Once upon a time it
was legal, almost reasonable and expected (since "union" wasn't invented
yet), and only a little bit "weard," to put any lvalue on the left of a
dot.  The offset on the right of the dot is 4 (the distance from the
beginning of a structure containing "count" to the "count" field itself)
so t.count refers to an integer 4 bytes after t.

Coincidentally a (struct link *) requires 4 bytes so t.count refers to
fix1.

After the invention of "union," the old practice of arbitrary lvalues
on the left of "." was discouraged.  The left side now should be a
struct or union, should actually contain a field with the name given
on the right side, and it might be an rvalue in an expression context.

Some compilers still accept the old syntax.  Yours did, after giving
warnings.

-- 
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.jp@relay.cs.net)
  The above opinions are inherited by your machine's init process (pid 1),
  after being disowned and orphaned.  However, if you see this at Waterloo or
  Anterior, then their administrators must have approved of these opinions.

ark@alice.UUCP (Andrew Koenig) (07/20/89)

In article <10561@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:

> A standard-conforming compiler is required to diagnose such misusage.

I don't think so.  I'm under the impression that a standard-conforming
compiler is one that accepts standard-conforming programs.
-- 
				--Andrew Koenig
				  ark@europa.att.com

chris@mimsy.UUCP (Chris Torek) (07/20/89)

In article <10561@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>Certainly [misusing structure members is] not correct C. ...
>A standard-conforming compiler is required to diagnose such misusage.

I thought the wording was something like `the compiler is required
to produce at least one warning for any program that violates one
or more syntax or constraints requirements'---which seems to say that
the compiler is required to print *some* warning, but not necessarily
one about this in particular (it could say `warning, this code smells
musty' :-) ).

Of course, one would hope that compilers that print deliberately
misleading error messages are rare.  (`Warning: illegal combination
of pointer and integer, op =' anyone? :-/ )
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

suhas@nlgvax.UUCP (Suhas Joshi) (07/20/89)

>Am I correct to say that the program is not correct C, and that all compilers
>which compile it are wrong?

The usage a.b when a is not a structure is allowed, but it is not portable.
It is allowed so that strict typing is not enforced. So, as long as the
compiler issues a warning, it is acceptable. It is definitely not a bug.
As long as a in a.b is an lvalue, it is assumed to be a structure even if it
is not. See sections 7.1 and 14.1 of the C reference manual in the K&R book
for details.

But I decided to post for a reason other than pointing this out. The
implicit assumption that fix0 and fix1 are allocated around t is wrong,
and programs that rely on it are non-portable. The assumption shows in
the printf calls that report the addresses of fix0, fix1 and t.

[ Stuff deleted ]
>int fix0;
>struct link * t;
>int fix1;
>
>main ()
>{
>	printf ("t lies at %1d, brk is at %1d\n", (int)(&t), (int)sbrk(0));
>	printf ("Addresses of %1d byte objects around t : %1d and %1d\n",
>		sizeof(int), (int)(&fix0), (int)(&fix1));

[stuff deleted]

See the following output, generated on a Sun 3 running SunOS 4.0. Of the
12 bytes allocated for fix0, fix1 and t, fix0 has the first 4, fix1 has the
middle 4, and t has the last 4. So fix0 and fix1 are not around t in address
order but before it.

t lies at 131400, brk is at 131408
Addresses of 4 byte objects around t : 131392 and 131396
Assignment worked.
Value 5 has been put at address 131404

This happens because the compiler generates the following assembler
directives, in the same order as the definitions of the variables in the
source:
LL0:
	.data
	.comm	_fix0,4
	.comm	_t,4
	.comm	_fix1,4
	.text
	.proc
[Assembly code deleted]

But .comm tells the assembler not to allocate storage but to leave it to the
linker. So the linker reserves 12 bytes and then allocates 4 each to t, fix1
and fix0, in that order, starting from the last address in the 12-byte area.
The allocation therefore depends on how the linker walks the 12 bytes (low to
high or high to low) and on the order in which t, fix1 and fix0 are
referenced, because that is the order in which the linker allocates them.
The first printf refers to t and the next refers to fix0 and fix1, but the
references occur in the order fix1, fix0 because of the order in which
parameters are pushed on the stack.

So beware of depending on the order in which data are defined, and of
assuming that they are allocated in that same order.

Suhas M. Joshi.					E-Mail: suhas@pcg.philips.nl
Philips Research Labs.,				Tel: +31 40 892336
Project Centre Geldrop, Building XP
Willem Alexanderlaan 7B, 5664 AN Geldrop	The Netherlands

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/21/89)

In article <9645@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
>In article <10561@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>> A standard-conforming compiler is required to diagnose such misusage.
>I don't think so.  I'm under the impression that a standard-conforming
>compiler is one that accepts standard-conforming programs.

No, you're thinking about it the wrong way around.  A (non-strictly)
conforming program is simply one that is acceptable to SOME conforming
implementation (one wonders, why bother to define such a critter; it
was essentially a political decision).  However, conforming
implementations must meet numerous requirements other than simply
accepting all strictly standard-conforming programs.  Section 2.1.1.3
states:  "A conforming implementation shall produce at least one
diagnostic message ... for every translation unit that contains a
violation of any syntax rule or constraint."

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/21/89)

In article <18648@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <10561@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>>A standard-conforming compiler is required to diagnose such misusage.
>I thought the wording was ... which seems to say that the compiler is
>required to print *some* warning, but not necessarily one about this
>in particular (it could say `warning, this code smells musty' :-) ).

Yes, the Standard does not attempt to be too specific about what
diagnostic messages must look like and other such environmental matters.
It does require that such misusage produce a diagnostic that is clearly
recognizable as a diagnostic (well, the "clearly recognizable" is more
the intent than the specification).  A conforming compiler COULD simply
report only: "foo.c: at least one error detected".  (It must NOT report
the same form of diagnostic for a strictly-conforming program, although
it is free to generate additional, distinguishable, diagnostics.)  We
generally label such things a matter of "quality of implementation",
leaving it up to market pressure to get the vendors to do a good job.

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/22/89)

In article <268@nlgvax.UUCP> suhas@nlgvax.UUCP () writes:
>The usage a.b when a is not a structure [or union] is allowed.

Not any more it isn't.

diamond@csl.sony.JUNET (Norman Diamond) (07/24/89)

In article <268@nlgvax.UUCP> suhas@nlgvax.UUCP () writes:

>>The usage a.b when a is not a structure [or union] is allowed.

In article <10583@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:

>Not any more it isn't.

As Doug Gwyn has pointed out many times, compilers are permitted to
accept additional kinds of programs besides those that conform to
ANSI C.  The "weard" usage a.b (where a is not a structure or union)
is non-conforming, but might still be allowed by the vast majority
of compilers.

-- 
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.jp@relay.cs.net)
  The above opinions are inherited by your machine's init process (pid 1),
  after being disowned and orphaned.  However, if you see this at Waterloo or
  Anterior, then their administrators must have approved of these opinions.

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/15/89)

In article <800@sbsvax.UUCP> greim@sbsvax.UUCP (Michael Greim) writes:
>Am I correct to say that the program is not correct C, and that all compilers
>which compile it are wrong?

Certainly it's not correct C.

For reasons that we've recently discussed, older compilers were rather
cavalier about structure members.  If the programmer wanted to insist
on treating non-structs (or the wrong kind of structs) as having certain
members, the compiler would attempt to comply.

A standard-conforming compiler is required to diagnose such misusage.