gupta@asgb.UUCP (Yogesh K Gupta) (11/18/85)
The following seems to be a bug in the SYSV.2 C-compiler (loader really):
You can DEFINE the same global variable in two files and the loader does
not even issue a warning about it. In fact, you can DEFINE the same global
variable in two files as two DIFFERENT data types and the loader does not
issue a warning as long as the size of the two globals is the same. Also,
it really treats them as one global variable. Thus, the effect is the same
as declaring the global as the union of the two declarations (as an example,
try the program at the end of this message).

This behavior of the compiler is contrary to what is mentioned in K&R.
To quote, (pp 77, italicized words from the text are shown in upper case):
    " There must be only one DEFINITION of an external variable
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      among all the files that make up the source program; others
      may contain 'extern' declarations to access it."
Also, on page 206 (Appendix A):
    "Thus in a multi-file program, an external data definition without
     the 'extern' specifier must appear in exactly one of the files."
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

System: Vax-780.
Unix: USG V.2

/* Begin tst1.c */
struct {
    short x;
    short y;
} a;

f1()
{
    printf("f1: &a: %d, a.x: %d, a.y: %d\n",&a,a.x,a.y);
}

main()
{
    f2();
    f1();
    printf("main: &a: %d, a.x: %d, a.y: %d\n",&a,a.x,a.y);
}
/* End tst1.c */

/* Begin tst2.c */
int a;

f2()
{
    a = 1 + (1 << 16);
    printf("f2: a: %d &a: %d\n",a, &a);
}
/* End tst2.c */

---- Begin output of the compilation of the above program ----
f2: a: 65537 &a: 73136
f1: &a: 73136, a.x: 1, a.y: 1
main: &a: 73136, a.x: 1, a.y: 1
---- End output ----

-- 
Yogesh Gupta                           Advanced Systems Group,
{sdcrdcf, sdcsvax}!bmcg!asgb!gupta     Burroughs Corp., Boulder, CO.
--------------------------------------------------------------------
All opinions contained in this message are my own and do not reflect
those of my employer or the plant on my desk.
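The "union of the two declarations" reading in the post above can be
reproduced inside a single file, without relying on the loader at all.
What follows is only a minimal sketch: the names `overlay` and
`store_and_split` are made up for illustration, and the fixed-width types
from <stdint.h> are an assumption standing in for the VAX's 16-bit short
and 32-bit int.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two conflicting definitions of `a`:
 * tst1.c's struct { short x; short y; } and tst2.c's int, sharing
 * one piece of storage the way the loader arranged it. */
union overlay {
    struct { int16_t x; int16_t y; } s;  /* view used by f1() and main() */
    int32_t i;                           /* view used by f2() */
};

/* Store through the int view, then read back through the struct view. */
static void store_and_split(int32_t value, int16_t *x, int16_t *y)
{
    union overlay a;

    a.i = value;   /* what f2() does: a = 1 + (1 << 16) */
    *x = a.s.x;    /* what f1() and main() then observe */
    *y = a.s.y;
}
```

Calling store_and_split(1 + (1 << 16), &x, &y) leaves both x and y equal
to 1, matching the a.x and a.y values in the output above. Since 65537 is
0x00010001, both 16-bit halves hold the same value, so the result does not
depend on byte order.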
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (11/19/85)
> The following seems to be a bug in the SYSV.2 C-compiler (loader really):
> You can DEFINE the same global variable in two files and the loader does
> not even issue a warning about it. In fact, you can DEFINE the same global
> variable in two files as two DIFFERENT data types and the loader does not
> issue a warning as long as the size of the two globals is the same. Also,
> it really treats them as one global variable. Thus, the effect is the same
> as declaring the global as the union of the two declarations (as an example,
> try the program at the end of this message).
>
> This behavior of the compiler is contrary to what is mentioned in K&R.
> To quote, (pp 77, italicized words from the text are shown in upper case):
>     " There must be only one DEFINITION of an external variable
>             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>       among all the files that make up the source program; others
>       may contain 'extern' declarations to access it."
> Also, on page 206 (Appendix A):
>     "Thus in a multi-file program, an external data definition without
>      the 'extern' specifier must appear in exactly one of the files."
>                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is not a bug. Although Ritchie preferred the COMMON model
for extern data, which is what most UNIX PCCs implement, there
are some systems supporting C that have unacceptable restrictions
on COMMON data (such as, minimum space 4Kb each, limited quantity
available, etc.). Therefore the official language description
is in terms of the DEF/REF model of external data linkage, which
has no such limitations on those systems. If you write your code
according to the (DEF/REF) rules, it will also work correctly
on systems using COMMON for extern data, but not vice-versa.

There is a story that some early release of USG UNIX (5.0?) had
been changed to enforce DEF/REF semantics, and so much code broke
that they had to back out the change (by providing "mcc", or "cc -m",
or some such temporary scheme) to still the outcries.
I heard that some AT&T internal sites refused to install the changed
SGS until extern data linkage was put back the way it used to be.
If there is an interesting story here, perhaps someone involved
will tell us.

Note that if you had tried initializing the multiple definitions
of extern data with different values, the loader would have issued
a warning. But there is a special kludge in "ld" that allows
multiple definitions so long as they are all (uninitialized)
common except for at most one (initialized) .data definition.
(The sizes also are maxed together, as I recall.)
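The merging rule described in the last paragraph can be written down as a
toy model. This is a sketch of the rule as stated in the post, not code
from any real "ld": the `symdef` structure and the `merge_defs` function
are hypothetical, and the model assumes that any second initialized
definition counts as a conflict, whether or not the values differ.

```c
#include <stddef.h>

/* One definition of a symbol as seen in one object file
 * (hypothetical model, not a real object-file format). */
struct symdef {
    size_t size;       /* size in bytes of this definition */
    int initialized;   /* nonzero: initialized (.data); zero: common */
};

/* Merge all definitions of a single symbol.
 * Returns 0 and stores the merged size (the maximum of all sizes)
 * in *out_size, or returns -1 if more than one definition is
 * initialized. */
static int merge_defs(const struct symdef *defs, size_t n, size_t *out_size)
{
    size_t max = 0;
    int ninit = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (defs[i].initialized)
            ninit++;
        if (defs[i].size > max)
            max = defs[i].size;
    }
    if (ninit > 1)
        return -1;     /* at most one initialized definition is allowed */
    *out_size = max;   /* "the sizes also are maxed together" */
    return 0;
}
```

Under this model the tst1.c/tst2.c pair (two uninitialized 4-byte
definitions) merges quietly into one 4-byte object, which is the behavior
the original post reports.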
marc@petrus.UUCP (Marc Pucci) (11/21/85)
> > The following seems to be a bug in the SYSV.2 C-compiler (loader really):
> > You can DEFINE the same global variable in two files and the loader does
> > not even issue a warning about it.
...
> This is not a bug. Although Ritchie preferred the COMMON model
> for extern data, which is what most UNIX PCCs implement, there
> are some systems supporting C that have unacceptable restrictions
...
> There is a story that some early release of USG UNIX (5.0?) had
> been changed to enforce DEF/REF semantics, and so much code broke
> that they had to back out the change (by providing "mcc", or "cc -m",
> or some such temporary scheme) to still the outcries. I heard that
> some AT&T internal sites refused to install the changed SGS until
> extern data linkage was put back the way it used to be. If there
> is an interesting story here, perhaps someone involved will tell us.

- - - -

Doug Gwyn is right about the internal differences in some flavors of USG
C compilers (actually SGS's). Forgive me if I start to ramble on. Most
of this took place 4 or 5 years ago, so some details are sketchy.

Back when I did the original UNIX port for the 3B20S, the SGS systems had
just appeared. I believe they were based on the 3B20 D (for duplex)
development systems from BTL Indian Hill. These were cross generation
systems that ran on pdp-11's running either RT or UNIX and produced code
to be later down-loaded into the 3B20-D through various nifty hardware.
These adhered to the spec that made multiple externs illegal.

The timing now (~1980-82?) is also about that of the UNIX shop's real
growth period - a transition when it grew from a few groups to a few
departments. Support would be provided for several processor types:
3B20, 3B5, VAX, PDP-11 and so adopting a common SGS structure for the
language maintainers was a reasonable thing to do.

We were the first internal customers for the new SGS and had to port UNIX
for the 3B20S.
However, even though the C spec claimed no multiple externs, the
UNIX system code made extensive use of this. Just look at sys/systm.h.
It contained things like "char runrun;" which had to be changed to
"extern char runrun;" with a single instance of "char runrun;" appearing
in some other file. Hence, header files were changed and variables added
to related source files. At the time, we were too excited about doing
the UNIX port to complain about the externs. (Another problem that did
get fixed was that the loader would complain and exit when it hit its
first conflict. Couple this with the fact that it was a slow
cross-compiler with many temporary files to support software demand
paging of data structures and you can imagine what a joy it was hitting
conflict after conflict - ah but we got over-time in those days).

Anyway, the SGS's were released in internal UNIX release 4.something or
other. People were upset about the change concerning multiple externs,
but it was decided to go with the new SGS. This decision was subsequently
reversed and multiple externs were added to the SGS.

During this time you could find cc, mcc (or was it Mcc), and occ on
various internal BTL machines. Some had installed the new and kept the
old as occ; some had reinstalled the old and moved the new to mcc, and
some had an Mcc that was one flavor or the other (my memory is very
sketchy here - but confusion certainly had set in. I think that on at
least one system there was a cc (pre-SGS), an mcc (SGS - no multiple
externs), and an Mcc (SGS with multiple externs)).

I think by internal release 4.3, the multiple externs were put in to stay.

	Marc Pucci
	Bell Communications Research, Morristown, NJ
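The header change described in the post above (from "char runrun;" to
"extern char runrun;" plus one real definition elsewhere) looks roughly
like this. It is a sketch collapsed into a single file so that it
compiles on its own; the post does not say which source file received the
definition, and the `set_runrun` helper is hypothetical.

```c
/* Before the port, the header itself held a definition, so every
 * file that included it defined the variable again:
 *
 *     char runrun;
 */

/* Header after the change: a declaration (REF) only. Including it
 * in any number of files creates no storage. */
extern char runrun;

/* Exactly one source file supplies the definition (DEF). */
char runrun;

/* Hypothetical helper standing in for code in some other file that
 * includes the header and uses the variable. */
static void set_runrun(char v)
{
    runrun = v;
}
```

Code written this way satisfies the DEF/REF rules, so it links both under
a strict loader and under one that merges common definitions.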