george@giza.cis.ohio-state.edu (George M. Jones) (07/20/89)
I have noticed what I consider anomolous behavior when using forward references in structs & typedefs and would would like to know whether things behave as they do because (a) that's just the way the compilers choose to implement a fuzzy point or (b) there is some definition (ansi, K&R) that say this is [in]correct behavior. The following code compiles (Sun, GCC, Green Hills) just fine as expected: struct one { struct two_t *foo; } one_t; struct two { struct one_t *bar; } two_t; However all the compilers I tried appear to be choaking on the forward reference to two_t in the following chunk of code typedef struct one { two_t *foo; } one_t; typedef struct two { one_t *bar; } two_t; 4:35pm> cc -c bad.c "bad.c", line 3: syntax error at or near variable name "two_t" "bad.c", line 4: zero sized structure Any definitive answers from the knowledgable clientel of this newsgroup ? Thanks, ---George Jones -=- OSU Computer & Inf. Science 2036 Neil Ave.,Columbus,Ohio 43210. 614-292-7325 george@cis.ohio-state.edu or ...!osu-cis!george What would Christmas be without friends like you !
henry@utzoo.uucp (Henry Spencer) (07/20/89)
In article <55480@tut.cis.ohio-state.edu> George M. Jones <george@cis.ohio-state.edu> writes: >I have noticed what I consider anomolous behavior when using forward references >in structs & typedefs and would would like to know whether things behave as >they do because (a) that's just the way the compilers choose to implement >a fuzzy point or (b) there is some definition (ansi, K&R) that say this is >[in]correct behavior. The latter. The language permits forward references with structure tags (specifically, "struct foo *x;" when foo has not yet been seen) but not the equivalent with typedefs. That's just the way it is. (Oh, you want to know *why*? Basically, because typedefs are a royal pain to parse at the best of times, and forward references just complicate the issue hopelessly. That keyword "struct" on the front simplifies things enormously. It also constrains the forward reference to be to a struct, which makes things easier because although different types of pointers sometimes have different representations [notably, the representation of "char *" can differ from other pointers], it's very unusual for different struct pointers to have different representations, so the compiler knows how big a pointer it's got even if it's not quite sure what it points to.) -- $10 million equals 18 PM | Henry Spencer at U of Toronto Zoology (Pentagon-Minutes). -Tom Neff | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
sritacco@hpdml93.HP.COM (Steve Ritacco) (07/20/89)
I've fought with this many times myself. It appears that this is a fact of life when dealing with K&R C. Every compiler I've ever used complains about that code. You simply need to define the structs and then typedef them, at least that is the way I have always gotten around this.
walter@hpclwjm.HP.COM (Walter Murray) (07/21/89)
George M. Jones writes: > typedef struct one > { > two_t *foo; > } one_t; > typedef struct two > { > one_t *bar; > } two_t; This code is incorrect by the syntax rules of ANSI C. The identifier 'two_t' by itself is not a valid type-specifier in the declaration of 'foo', unless it has previously been defined as a typedef name. This is probably what you want: typedef struct one one_t; typedef struct two two_t; struct one { two_t *foo; } ; struct two { one_t *bar; } ; Walter Murray --- Not speaking for Hewlett-Packard or X3J11 -------------
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/21/89)
In article <55480@tut.cis.ohio-state.edu> George M. Jones <george@cis.ohio-state.edu> writes: >The following code compiles (Sun, GCC, Green Hills) just fine as expected: > struct one > { > struct two_t *foo; > } one_t; > struct two > { > struct one_t *bar; > } two_t; If you were to change the member types to point to "struct two" and "struct one" types, it would be correct. The lack of a diagnostic at the end of the translation unit ("struct two_t and struct one_t missing declarations") can be considered a compiler deficiency. C deliberately allows forward struct references in such contexts. It may not have been completely clear in K&R 1st Edition, but it's officially specified in the proposed Standard. >However all the compilers I tried appear to be choaking on the forward >reference to two_t in the following chunk of code > typedef struct one > { > two_t *foo; > } one_t; Sure; "two_t" is neither a C keyword nor a defined type at the point that it is first used. The compiler cannot know that it is eventually going to be a structure type, unless it were to scan ahead (which is contrary to the intent of C). You might consider writing something like: typedef struct two two_t; /* incomplete type */ typedef struct one { two_t *foo; ) one_t; struct two { /* completes the type */ one_t *bar; };
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/21/89)
In article <1989Jul20.152935.14872@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: >... it's very unusual for different >struct pointers to have different representations, ... Some of us think that the language rules effectively force all struct pointers to be conformable.
dfp@cbnewsl.ATT.COM (david.f.prosser) (07/21/89)
In article <55480@tut.cis.ohio-state.edu> George M. Jones <george@cis.ohio-state.edu> writes: >I have noticed what I consider anomolous behavior when using forward references >in structs & typedefs and would would like to know whether things behave as >they do because (a) that's just the way the compilers choose to implement >a fuzzy point or (b) there is some definition (ansi, K&R) that say this is >[in]correct behavior. > > >The following code compiles (Sun, GCC, Green Hills) just fine as expected: > > struct one > { > struct two_t *foo; > } one_t; > > struct two > { > struct one_t *bar; > } two_t; > There is either a pair of typos or a misunderstanding in the above. Structure tags are in a different namespace from regular identifiers. The structures "struct one_t" and "struct two_t" have never been defined. Moreover, since a "_t" ending is usually used for typedef names, my guess is that typedefs were intended for these, instead of regular objects. typedef struct one { struct two *foo; } one_t; typedef struct two { struct one *bar; } two_t; > >However all the compilers I tried appear to be choaking on the forward >reference to two_t in the following chunk of code > > typedef struct one > { > two_t *foo; > } one_t; > > typedef struct two > { > one_t *bar; > } two_t; > 4:35pm> cc -c bad.c > "bad.c", line 3: syntax error at or near variable name "two_t" > "bad.c", line 4: zero sized structure > >Any definitive answers from the knowledgable clientel of this newsgroup ? Since "two_t" in the above is completely unknown, the compiler has no idea what is intended. At least the "shape" of the type being described by the typedef name must be declared prior to use of a typedef name. If I were writing this sort of code, I'd probably use typedef names for what ANSI C calls "incomplete types": typedef struct one one_t; typedef struct two two_t; At this point, sizeof(one_t) and sizeof(two_t) are not known, but sizeof(one_t *) and sizeof(two_t *) are known. Note that this implies that all structure (and union) pointers must be the same size. As this pointer, the following declarations are valid: struct one { two_t *foo; }; struct two { one_t *bar; }; The main advantage gained by separating the typedef declaration from the structure is that a single common header file can contain the typedefs while headers for particular modules can contain the complete structure. Only those parts of the program that need to know the members of the structure must include the detailed header. Dave Prosser dfp@attunix.att.com
wen-king@cit-vax.Caltech.Edu (King Su) (07/23/89)
In article <1196@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes: >The main advantage gained by separating the typedef declaration from the <structure is that a single common header file can contain the typedefs >while headers for particular modules can contain the complete structure. <Only those parts of the program that need to know the members of the >structure must include the detailed header. I have often wondered about this. Is it legal? Since pointers for different data objects are not required to have the same binary format or even storage size, how would a compiler allocate space and initialize a pointer of an unknown type? Consider: master_header.h: typedef struct s1 S1; typedef struct s2 S2; typedef struct s3 S3; one_program_file.c: #include "master_header.h" struct s3 { S1 *pointer_s1; S2 *pointer_s2; int a; .... } s; When compiling "one_program_file.c", how does the compiler figure out the size of 's' and the offset of 'a', given that 'pointer_s1' and 'pointer_s1' are of unknown size. How does the compiler initialize 's' (assuming static variable) if the format for the pointers are unknown. Or is it true that pointers to structures are always of the same size and format? -- /*------------------------------------------------------------------------*\ | Wen-King Su wen-king@vlsi.caltech.edu Caltech Corp of Cosmic Engineers | \*------------------------------------------------------------------------*/
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/23/89)
In article <11341@cit-vax.Caltech.Edu> wen-king@cit-vax.UUCP (Wen-King Su) writes:
-I have often wondered about this. Is it legal? Since pointers for
-different data objects are not required to have the same binary format
-or even storage size, how would a compiler allocate space and initialize
-a pointer of an unknown type?
All pointers-to-structures have the same size, that's how.
cowan@marob.masa.com (John Cowan) (07/25/89)
In article <1989Jul20.152935.14872@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: >...it's very unusual for different >struct pointers to have different representations, so the compiler knows >how big a pointer it's got even if it's not quite sure what it points to.) For "very unusual" read "illegal". As someone (Chris Torek?) said earlier in this group, "all 'struct' pointers must 'smell' the same." If this behavior is not guaranteed, forward references to structs would be unimplementable in one pass. Ufcawss, if you talk to the people who wrote VMS C, they'll tell you that all one-pass implementations of C are unacceptable! (This has something to do with generating good code for the 'switch' statement.) -- Internet/Smail: cowan@marob.masa.com Dumb: uunet!hombre!marob!cowan Fidonet: JOHN COWAN of 1:107/711 Magpie: JOHN COWAN, (212) 420-0527 Charles li reis, nostre emperesdre magnes Set anz toz pleins at estet in Espagne.
chris@mimsy.UUCP (Chris Torek) (07/25/89)
In article <24CB9E07.9547@marob.masa.com> cowan@marob.masa.com (John Cowan) writes: >... As someone (Chris Torek?) said earlier in this group, "all 'struct' >pointers must 'smell' the same." If this behavior is not guaranteed, >forward references to structs would be unimplementable in one pass. It was Doug Gwyn, but this is otherwise correct. >Ufcawss, if you talk to the people who wrote VMS C, they'll tell you that >all one-pass implementations of C are unacceptable! (This has something to >do with generating good code for the 'switch' statement.) They are both right and wrong. One-pass code generation means there are few opportunities for optimsiation; but generating good switches is easy. Simply emit a branch at the top of the switch, code at each of the labels, a branch at the bottom, and then generate the code that actually implements the switch. E.g., given switch (foo) { case 1: case1(); break; case 2: case2(); break; case 3: case3(); break; case 999: case999(); break; } onward(); one generates code of the form: jump L1 L3: call case1_ jump L2 L4: call case2_ jump L2 L5: call case3_ jump L2 L6: call case999_ jump L2 L1: move foo,r0 compare r0,#3 jeq L5 jgt L7 compare r0,#1 jeq L3 jgt L4 # here is the clever part jump L2 L7: compare r0,#999 jeq L6 L2: call onward_ This sort of code is exactly what one gets from a one-pass compiler, except that they tend not to have a clever part. :-) -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163) Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
wjr@ftp.COM (Bill Rust) (07/25/89)
In article <18729@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes: >In article <24CB9E07.9547@marob.masa.com> cowan@marob.masa.com (John Cowan) >writes: ||... As someone (Chris Torek?) said earlier in this group, "all 'struct' ||pointers must 'smell' the same." If this behavior is not guaranteed, ||forward references to structs would be unimplementable in one pass. | |It was Doug Gwyn, but this is otherwise correct. | ||Ufcawss, if you talk to the people who wrote VMS C, they'll tell you that ||all one-pass implementations of C are unacceptable! (This has something to ||do with generating good code for the 'switch' statement.) | |They are both right and wrong. One-pass code generation means there |are few opportunities for optimsiation; but generating good switches |is easy. Simply emit a branch at the top of the switch, code at each |of the labels, a branch at the bottom, and then generate the code |that actually implements the switch. E.g., given | | [Clever example deleted] Excuse me. I am not a compiler writer but even I know that there are two ways to implement switches: jump tables and compare and execute when equal. Jump tables (ie jmp switch_list[ix] where ix is some easy transformation of the switch variable) are much more efficient when the range of values that the switch variable can assume is limited (ie switch variable in range of 1 to 20 and it assumes 75% of values). The only way to tell if a jump table is better than compare and jump is to see what the range of the switch variable is and how many values it actually assumes. This is very difficult to do in a one pass compiler. The previous correspondent completely ignored this in his response. Bill Rust (wjr@ftp.com)
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/26/89)
In article <686@ftp.COM> wjr@ftp.UUCP (Bill Rust) writes: >The only way to tell if a jump table is better than compare and jump is to >see what the range of the switch variable is and how many values it >actually assumes. This is very difficult to do in a one pass compiler. I don't see that it's particularly hard to do. The compiler generates a preamble that branches to a postamble to be generated later (after the entire switch has been seen), and for each case as it is encountered the case value is stashed in a compiler internal table (only 257 labels need be accommodated) and an entry point label output followed by the case code then a jump to a second deferred label which will follow the postamble. The postamble can be a jump table, or a sequence of comparisons (hopefully ordered as a binary tree instead of linear), or any other algorithm the compiler decided was best for the particular set of case labels.
karzes@mfci.UUCP (Tom Karzes) (07/27/89)
In article <686@ftp.COM> wjr@ftp.UUCP (Bill Rust) writes: >Excuse me. I am not a compiler writer but even I know that there are >two ways to implement switches: jump tables and compare and execute >when equal. There are more ways to implement switch statements than this. For example, if you have a sparse but large set of case values, it may make sense to use an optimized hash table to look up the branch targets. Even for the "compare and execute when equal" case there are several possibilities. If there are only a couple of values, it makes sense to simply check them individually and execute the default case when there's no match. For a larger set of values, it might make sense to do a binary search (which of course is log(n)), assuming you didn't want to use an optimized hash table. I'm sure there are other possibilites as well, although these are the ones you're most likely to actually see implemented.
dg@lakart.UUCP (David Goodenough) (07/27/89)
henry@utzoo.uucp (Henry Spencer) sez: > The latter. The language permits forward references with structure tags > (specifically, "struct foo *x;" when foo has not yet been seen) but not > the equivalent with typedefs. That's just the way it is. > > (Oh, you want to know *why*? Basically, because typedefs are a royal pain > to parse at the best of times, and forward references just complicate the > issue hopelessly. Also, have you considered the following: typedef a *b; typedef b *a; There, now your compiler is _REALLY_ confused :-) -- dg@lakart.UUCP - David Goodenough +---+ IHS | +-+-+ ....... !harvard!xait!lakart!dg +-+-+ | AKA: dg%lakart.uucp@xait.xerox.com +---+
dg@lakart.UUCP (David Goodenough) (07/28/89)
wjr@ftp.COM (Bill Rust) sez: > Excuse me. I am not a compiler writer but even I know that there are > two ways to implement switches: jump tables and compare and execute > when equal. That makes three - see below, plus the hashing technique used in the V6 unix cc, on the PDP11 I used at university makes 4. Plus the concept (when switching on a character value) of generating a string of recognised characters, calling index, turning the pointer returned back into and index and using _THAT_ to index into a table (which is in fact a variation on my method #3 below) Of course, compiler purists will say that I have to do the comparisons as integer values, but don't knock it, because it does work. > Jump tables (ie jmp switch_list[ix] where ix is some easy > transformation of the switch variable) are much more efficient when > the range of values that the switch variable can assume is limited (ie > switch variable in range of 1 to 20 and it assumes 75% of values). The > only way to tell if a jump table is better than compare and jump is to > see what the range of the switch variable is and how many values it > actually assumes. This is very difficult to do in a one pass compiler. > The previous correspondent completely ignored this in his response. Hold on a minute. Doing the necessary work is very _EASY_ in a one pass compiler. Just like Chris Torek said, you jump at the top, build your switch code with labels, creating the table of values as you go. Now, at the bottom you have a table full of values. You can do anything you like with it - simple index lookup (jump table[ix]), tabled compare (method 3): dw label1, value1 dw label2, value2 with a search comparing value1 to the expression, or anything else that takes your fancy. All in one pass. -- dg@lakart.UUCP - David Goodenough +---+ IHS | +-+-+ ....... !harvard!xait!lakart!dg +-+-+ | AKA: dg%lakart.uucp@xait.xerox.com +---+
mccaugh@s.cs.uiuc.edu (08/10/89)
I couldn't help but side with the suggestion of "quadratic" even if it was misguided, since: * a 'quadrille' is a "square" dance, but consists of 5 or 6 rhythms * a 'quadrangle' and a 'quadrilateral' are both 4-sided * a 'quadrennium' is a period of 4 years Need I go on? THe point being that 'quadratic' means 'square', and it is an inference that the latter means "to the power 2".