[net.lang.c] forward declared structures

wws@siemens.UUCP (William W Smith) (07/23/84)

I have a question on forward declarations of structures in C.  Kernighan and
Ritchie does not answer the question and to me the behavior of the Berkeley
Unix C compiler is without question wrong.  To wit:

struct s *ps;

main() {
    struct s { 
	int i1, i2;
	};
}
foo() {
    ps->i1 = 33;
}

My question is whether field i1 should be known in foo.  In the Berkeley
Compiler, it is known if and only if there is a different struct with a
field i1 declared before main is declared !?  If there is such a field,
the type and offset of i1 are correct, otherwise you get an undefined field
reference error.

I discovered this while implementing a C compiler for a class and wanted to 
follow real C instead of the simplification of structures that our professor
recommended.  My solution was to say that the declaration of the structure was
exported to the block level of the first use of the forward declaration and
no further.   Any better ideas?

Bill Smith
princeton!siemens!wws

gwyn@BRL-VLD.ARPA (07/26/84)

From:      Doug Gwyn (VLD/VMB) <gwyn@BRL-VLD.ARPA>

Technically it is an error to declare

struct s *ps;

without having defined what a "struct s" is first.
Because of the forward-reference problem, most C
compilers allow declaring a pointer to a struct
before the struct is defined, so that

struct a {
	struct b *bp;
	};
struct b {
	struct a *ap;
	};

is supported.  They get away with this only because
all pointers-to-struct happen to have the same run-time form.

The declaration of `struct s' inside a function is local
to that function, which is why it is unknown later when
you attempt to use the pointer `ps'.  This is perfectly
correct behavior, which supports information hiding.
To make the meaning of `struct s' globally known, you need
to declare it in a global context.
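
A minimal sketch of the global-context fix described above, assuming the
struct definition is simply moved to file scope. The `storage` object and
the `demo_global_struct` helper are hypothetical names added for
illustration; they are not part of the original example.

```c
#include <assert.h>

/* File-scope definition: the tag `s` and its members are visible to
   every function that follows in this translation unit. */
struct s {
    int i1, i2;
};

struct s storage;            /* hypothetical object for ps to point at */
struct s *ps = &storage;

/* Compiles because struct s is fully defined at file scope. */
void foo(void)
{
    ps->i1 = 33;
}

/* Hypothetical helper: runs foo() and reports what it stored. */
int demo_global_struct(void)
{
    foo();
    assert(ps->i1 == 33);
    return ps->i1;
}
```

With the definition left inside main() instead, the tag would go out of
scope at the closing brace of main and the reference in foo would not
compile.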

keesan@bbncca.ARPA (Morris Keesan) (08/02/84)

>I have a question on forward declarations of structures in C.  Kernighan and
>Ritchie does not answer the question and to me the behavior of the Berkeley
>Unix C compiler is without question wrong.  To wit:
>
>struct s *ps;
>
>main() {
>    struct s { 
>	 int i1, i2;
>	 };
>}
>foo() {
>    ps->i1 = 33;
>}
>
>My question is whether field i1 should be known in foo.  In the Berkeley
>Compiler, it is known if and only if there is a different struct with a
>field i1 declared before main is declared !?  If there is such a field,
>the type and offset of i1 are correct, otherwise you get an undefined field
>reference error.
>
>I discovered this while implementing a C compiler for a class and wanted to 
>follow real C instead of the simplification of structures that our professor
>recommended.  My solution was to say that the declaration of the structure was
>exported to the block level of the first use of the forward declaration and
>no further.   Any better ideas?
>
>Bill Smith
>princeton!siemens!wws

    Indeed, the Berkeley compiler is behaving incorrectly here, but it's
easy to see how the bug crept in.  The problem is not with knowing about
field i1.  That is getting treated correctly.  In the example, element i1 is
not known in the scope of foo, and if another structure is declared with
structure member i1, then i1 is known inside foo.  See section 14.1 of the
C Reference Manual, which says, "the expression before a -> is required only
to be a pointer or an integer.  If a pointer, it is assumed to point to a
structure of which the name on the right is a member."  So if i1 is defined
in a scope which foo inherits, then it doesn't matter what the type of ps is.
    The error here is in allowing the declaration of ps as a pointer to an
undeclared structure.  Playing with my compiler, descended from the Ritchie
PDP-11 C compiler, I discover that I can declare a pointer to a structure
whether or not that structure is ever declared, either in the scope of the
pointer declaration, in an inner scope, forward, or backward.  This is
clearly illegal:  C Ref. Man. section 8.5:

	A structure or union specifier of the second form, that is, one of
	    struct identifier { struct-decl-list }
	    union identifier { struct-decl-list }
	declares the identifier to be the structure tag (or union tag) of
        the structure specified by the list.  A SUBSEQUENT declaration may
	then use the third form of specifier, one of
	    struct identifier
	    union identifier
	Structure tags allow definition of self-referential structures; . . .
	a structure or union may contain a pointer to an instance of itself.

Capitalization of SUBSEQUENT is mine.  Implicitly within this "subsequent"
means subsequent to the "struct identifier {" rather than subsequent to the
entire declaration, otherwise structures containing pointers to themselves
would not be allowed.  It looks like it was too much work to keep track of
whether the compiler was in the middle of declaring "struct s", and so in
order to allow structures to point to themselves, the compiler writer(s)
decided to allow any declaration of "pointer to structure", regardless of
whether the structure was yet declared.
-- 
			    Morris M. Keesan
			    {decvax,linus,ihnp4,wivax,wjh12,ima}!bbncca!keesan
			    keesan @ BBN-UNIX.ARPA

mmr@utmbvax.UUCP (Mike Rubenstein) (08/03/84)

>	A structure or union specifier of the second form, that is, one of
>	    struct identifier { struct-decl-list }
>	    union identifier { struct-decl-list }
>	declares the identifier to be the structure tag (or union tag) of
>       the structure specified by the list.  A SUBSEQUENT declaration may
>	then use the third form of specifier, one of
>	    struct identifier
>	    union identifier
>	Structure tags allow definition of self-referential structures; . . .
>	a structure or union may contain a pointer to an instance of itself.

>Capitalization of SUBSEQUENT is mine.  Implicitly within this "subsequent"
>means subsequent to the "struct identifier {" rather than subsequent to the
>entire declaration, otherwise structures containing pointers to themselves
>would not be allowed.

On the other hand, a few months ago I was working with a compiler that
handles this "correctly."  Unfortunately, I wanted to declare something
like

	struct foo     { struct bar *b; };
	struct bar     { struct foo *f; };

Frustrating.
-- 

	Mike Rubenstein, OACB, UT Medical Branch, Galveston TX 77550

miller@saturn.UUCP (Terrence C. Miller) (08/03/84)

K&R may say that the short form of the declaration may be used only
for subsequent occurrences of the tag but those of us who write code
which looks like:

     struct a { struct b *pb;
		....
	      };

     struct b { struct a *pa;
		....
	      };

would be very upset if the compiler enforced that restriction.

mrm@datagen.UUCP (08/04/84)

The current ANSI draft does allow forward declaration of pointers to
structures, since that is the only way you can build tree and list
structures out of differently typed structures.  Even Pascal, which is
much stricter about things, had to allow the forward declaration of pointers.

However, I believe that when the structure is declared, it must be in the same
block as the forward declaration.  Also, this forward declaration requires
that all structure/union pointers have the same format (or at least occupy
the same amount of storage) -- this means on a word-addressing machine (like
the DG stuff, DEC-20, UNIVAC, etc.) an implementor cannot make a pointer
to a structure which only contains chars be a byte pointer.
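
A minimal sketch of the same-scope forward declaration, assuming the form
`struct b;` as it was later standardized in ANSI C (the draft wording may
have differed). The `id` members and the `cross_link_demo` helper are
hypothetical, added only so the example has something to check.

```c
#include <stddef.h>

struct b;                    /* tag only: an incomplete type for now */

struct a {
    struct b *pb;            /* pointer to the incomplete type is fine */
    int id;
};

struct b {                   /* completes the type in the same scope */
    struct a *pa;
    int id;
};

/* Hypothetical helper: links one of each and follows both pointers. */
int cross_link_demo(void)
{
    struct a x = { NULL, 1 };
    struct b y = { NULL, 2 };

    x.pb = &y;
    y.pa = &x;

    /* x.pb->pa is &x, y.pa->pb is &y */
    return x.pb->pa->id * 10 + y.pa->pb->id;
}
```

The two definitions may be separated by other declarations; what matters
in this sketch is that the completing definition appears in the same scope
as the tag-only declaration.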

With regard to ANSI committees requiring large IBM-ish staffs to produce an
ANSI compiler, I believe that most of us are on rather small teams.  I
certainly wouldn't agree to anything that required a massive staff (since I
only have myself and an employee supporting two compilers).

	Michael Meissner
	Data General Corporation
	...{ ihnp4, allegra, rocky2 }!datagen!mrm

jwp@sdchema.UUCP (08/04/84)

In article <2263@saturn.UUCP> miller@saturn.UUCP (Terrence C. Miller) writes:
>K&R may say that the short form of the declaration may be used only
>for subsequent occurrences of the tag but those of us who write code
>which looks like:
>
>     struct a { struct b *pb;
>		....
>	      };
>
>     struct b { struct a *pa;
>		....
>	      };
>
>would be very upset if the compiler enforced that restriction.

*Lots* of things would be upset if that wasn't legal code.  I guess the
question now is:  What is the new standard going to say about this?

howard@byucsb.UUCP (Johnson Howard Reed) (08/06/84)

The problem with:
	struct foo { struct bar *b; };
	struct bar { struct foo *f; };
is that it allows a procedure to appear between them (at the global level).
If this is rewritten as:
	struct foo { struct bar { struct foo *f; } *b; };
then any "forward reference" refers to a partially-declared struct/union
and makes it easier for the compiler to detect such typos as:
	struct foo { struct bar { struct foo f; } *b; };
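
A minimal sketch of the correct nested form (pointers on both sides),
assuming C scoping rules under which a tag declared inside another struct
remains visible in the enclosing scope. The `nested_demo` helper is
hypothetical.

```c
#include <stddef.h>

struct foo {
    struct bar {
        struct foo *f;       /* struct foo is incomplete here; a pointer is allowed */
    } *b;
};

/* Hypothetical helper: both tags are usable after the declaration. */
int nested_demo(void)
{
    struct bar inner;
    struct foo outer;

    outer.b = &inner;
    inner.f = &outer;

    return inner.f->b == &inner;   /* the pointers refer back to each other */
}
```

Replacing `struct foo *f` with `struct foo f` would ask for a member of a
type that is not yet complete, which is the typo the compiler can reject.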

Howard Johnson
harpo!utah-cs!beesvax!byucsa!byucsb!howard

mmr@utmbvax.UUCP (Mike Rubenstein) (08/08/84)

> The problem with:
>	struct foo { struct bar *b; };
>	struct bar { struct foo *f; };
> is that it allows a procedure to appear between them (at the global level).
> If this is rewritten as:
>	struct foo { struct bar { struct foo *f; } *b; };
> then any "forward reference" refers to a partially-declared struct/union
> and makes it easier for the compiler to detect such typos as:
>	struct foo { struct bar { struct foo f; } *b; };

Unfortunately, at the cost of obscuring the code, at least in many cases.
In the case I ran into the structures (which, of course, actually had other
members) were for two tables which needed cross references into each other.
It would, I think, have been misleading to make the code look like one was
subordinate to the other.  I don't think it's worth it to make the compiler's
job a bit easier.
-- 

	Mike Rubenstein, OACB, UT Medical Branch, Galveston TX 77550

gnu@sun.uucp (John Gilmore) (08/09/84)

Another way I've used forward declared (actually undeclared) structures
is when I have a large structure full of "global" variables, many of
which are pointers to other structures.  If the compiler required a
definition of every structure which I declare a pointer to, I'd have
to include every include file in every program -- and edit every program
each time I added another struct * to the global structure.  The way
it is now, (4.2BSD cc, Unisoft V7, MIT cc68, etc) I only need to declare
a structure if I'm going to dereference the pointer (use it to point
to a member) in this module.  If I don't touch it, or if I just assign
it or pass it as a parameter, the compiler doesn't need to know what's
inside the struct it points to.

I'd hate to lose this feature.  Does ANSI C break it?
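
A minimal sketch of the usage described above, assuming the rules later
standardized in ANSI C, under which a pointer to a never-defined struct
can be stored, assigned, and passed but not dereferenced. The names
`symtab`, `iobuf`, `globals`, and the helper functions are hypothetical.

```c
#include <stdlib.h>

/* Neither struct is defined anywhere in this file. */
struct symtab;
struct iobuf;

/* The "global variables" structure only holds pointers. */
struct globals {
    struct symtab *symbols;
    struct iobuf  *input;
    int verbose;
};

/* Assigning and passing the pointer needs no definition of struct symtab. */
void set_symbols(struct globals *g, struct symtab *st)
{
    g->symbols = st;
}

int has_symbols(const struct globals *g)
{
    return g->symbols != NULL;
}

/* Hypothetical helper: returns before * 10 + after. */
int opaque_demo(void)
{
    struct globals g = { NULL, NULL, 0 };
    void *raw = malloc(16);          /* stands in for a table built elsewhere */
    int before, after;

    if (raw == NULL)
        return -1;

    before = has_symbols(&g);
    set_symbols(&g, raw);            /* void * converts to struct symtab * */
    after = has_symbols(&g);

    free(raw);
    return before * 10 + after;
}
```

Writing `g.symbols->anything` in this file would fail to compile, since
only a module that includes the real definition can look inside the struct.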