[comp.lang.c] local extern : MS vs. the world

carroll@s.cs.uiuc.edu (10/26/88)

Small note - MS C actually found a bug that a number of other compilers
didn't. In porting code from UNIX to my AT, the compiler choked on code
that compiled fine on UNIX. The problem was that it had declared
	extern long int mem_size
inside several functions, but used it in a function in which it had not
been declared. We had previously compiled this on Sun3.4, 4.3BSD on RT's,
SysV on 3b2's and 3b20's, and a Sequent. Not one of these caught the error.
Apparently they take 'extern' declarations to be global to the file, even
if the declaration is local to a function. I agree with MS on this one -
local 'extern' declarations *should* be local. Any comments?
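
For anyone who wants to see the shape of it, here is a minimal sketch of the corrected pattern. The function names set_size and get_size are made up for illustration (the real code is not reproduced here), and the definition of mem_size is placed at the bottom of the same file only so the sketch is self-contained.

```c
/* Sketch: every function that touches mem_size declares it locally. */

void set_size(long int n)
{
    extern long int mem_size;   /* in scope only until the closing brace */
    mem_size = n;
}

long int get_size(void)
{
    /* Leaving out the next line is the bug MS C caught: under block
       scope rules mem_size is undeclared here, no matter how many
       earlier functions declared it. */
    extern long int mem_size;
    return mem_size;
}

/* The definition. In the original program it would sit in another
   source file; it comes last so nothing above depends on it. */
long int mem_size = 0;
```

With the declaration in get_size removed, a compiler that applies block scope to local externs should reject the file, while the compilers listed above would presumably still accept it.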

Alan M. Carroll          "How many danger signs did you ignore?
carroll@s.cs.uiuc.edu     How many times had you heard it all before?" - AP&EW
CS Grad / U of Ill @ Urbana    ...{ucbvax,pur-ee,convex}!s.cs.uiuc.edu!carroll

tim@crackle.amd.com (Tim Olson) (10/27/88)

In article <207600006@s.cs.uiuc.edu> carroll@s.cs.uiuc.edu writes:
| 
| Small note - MS C actually found a bug that a number of other compilers
| didn't. In porting code from UNIX to my AT, the compiler choked on code
| that compiled fine on UNIX. The problem was that it had declared
| 	extern long int mem_size
| inside several functions, but used it in a function in which it had not
| been declared. We had previously compiled this on Sun3.4, 4.3BSD on RT's,
| SysV on 3b2's and 3b20's, and a Sequent. Not one of these caught the error.
| Apparently they take 'extern' declarations to be global to the file, even
| if the declaration is local to a function. I agree with MS on this one -
| local 'extern' declarations *should* be local. Any comments?

K&R is very vague on this subject.  Section 11.1 says

	"Because all references to the same external identifier refer to
	the same object... the compiler checks all declarations of the
	same external identifier for compatibility; in effect their
	scope is increased to the whole file in which they appear."

But the X3J11 Rationale document says:

	"One source of dispute was whether identifiers with external
	linkage should have file scope even when introduced within a
	block.  The Base Document is vague on this point, and has been
	interpreted differently by different implementations... While it
	was generally agreed that it is poor practice to take advantage
	of an external declaration once it had gone out of scope, some
	argued that a translator had to remember the declaration for
	checking, anyway, so why not acknowledge this?  The compromise
	adopted was to decree essentially that block scope rules apply,
	but that a conforming implementation need not diagnose a
	failure to redeclare an external identifier that had gone out of
	scope (undefined behavior)."
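
Given that compromise, one way to stay clear of the undefined behavior is to hoist the declaration to file scope, so that it never goes out of scope before the end of the file. A minimal sketch, assuming hypothetical functions grow and current_size, with the definition included in the same file only to keep the example self-contained:

```c
/* File-scope declaration: visible from here to the end of the file. */
extern long int mem_size;

void grow(long int n)
{
    mem_size += n;      /* no local redeclaration needed */
}

long int current_size(void)
{
    return mem_size;    /* still in scope */
}

/* Definition; it stands in for the one that would normally live in
   another source file. */
long int mem_size = 100;
```

In practice the file-scope declaration would usually go in a header shared by every file that uses the variable.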

	-- Tim Olson
	Advanced Micro Devices
	(tim@crackle.amd.com)

ka@june.cs.washington.edu (Kenneth Almquist) (10/28/88)

carroll@s.cs.uiuc.edu writes:
>> Small note - MS C actually found a bug that a number of other compilers
>> didn't. In porting code from UNIX to my AT, the compiler choked on code
>> that compiled fine on UNIX. The problem was that it had declared
>> 	extern long int mem_size
>> inside several functions, but used it in a function in which it had not
>> been declared. We had previously compiled this on Sun3.4, 4.3BSD on RT's,
>> SysV on 3b2's and 3b20's, and a Sequent. Not one of these caught the error.
>> Apparently they take 'extern' declarations to be global to the file, even
>> if the declaration is local to a function. I agree with MS on this one -
>> local 'extern' declarations *should* be local. Any comments?

Agreed.  The behavior of the SysV and BSD compilers is a throwback to the
days before C had block structure.  Before the introduction of block
structure, the scope of a global variable extended from its declaration
to the end of the file and the scope of a local variable extended from
its declaration to the end of the current procedure.  Declaring a local
variable with the same name as a previously defined global variable was
illegal since there was no provision for one definition to hide another.
Even if a global variable was declared within the body of a function, the
declaration was still visible throughout the rest of the file.

This scheme was very easy to implement--all the compiler had to do was
to scan the symbol table at the end of each procedure and discard all
entries for local variables.  However, it was rather counterintuitive,
which is probably why Ritchie decided to switch to block structure.

When Ritchie converted to block structure, he made the compiler remember
extern declarations even after the end of the scope of the declaration.
This violation of block structure allowed the compiler to process all
programs written before the conversion to block structure.
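
The bookkeeping described above can be sketched with a toy symbol table. This is only an illustration of the idea, not the layout or code of any real compiler; the names struct sym, declare, lookup and end_procedure are all hypothetical.

```c
#include <string.h>

#define MAX_SYMS 64

/* Hypothetical symbol-table entry. */
struct sym {
    char name[32];
    int  is_global;   /* 1 for globals, including an extern declared
                         inside a function; 0 for locals */
};

static struct sym table[MAX_SYMS];
static int nsyms = 0;

/* Record a declaration; returns 0 on success, -1 if the table is full. */
int declare(const char *name, int is_global)
{
    if (nsyms >= MAX_SYMS)
        return -1;
    strncpy(table[nsyms].name, name, sizeof table[nsyms].name - 1);
    table[nsyms].name[sizeof table[nsyms].name - 1] = '\0';
    table[nsyms].is_global = is_global;
    nsyms++;
    return 0;
}

/* Returns 1 if name is currently visible, 0 otherwise. */
int lookup(const char *name)
{
    int i;
    for (i = 0; i < nsyms; i++)
        if (strcmp(table[i].name, name) == 0)
            return 1;
    return 0;
}

/* At the end of a procedure, discard the locals and keep every global. */
void end_procedure(void)
{
    int i, kept = 0;
    for (i = 0; i < nsyms; i++)
        if (table[i].is_global)
            table[kept++] = table[i];
    nsyms = kept;
}
```

Declaring "mem_size" as a global and "i" as a local, then calling end_procedure(), leaves "mem_size" visible and "i" gone, which is the behavior that lets a later function use an extern it never declared.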

tim@crackle.amd.com (Tim Olson) comments:
> K&R is very vague on this subject.  Section 11.1 says
> 
> 	"Because all references to the same external identifier refer to
> 	the same object... the compiler checks all declarations of the
> 	same external identifier for compatibility; in effect their
> 	scope is increased to the whole file in which they appear."

This is talking about the compiler, not necessarily the language.  The
paragraph preceding the one quoted here states that, "The lexical
scope of identifiers declared at the head of blocks persists until the
end of the block."  This implies that the following code is legal, though
machine dependent:

	init() {
		extern int x;
		x = 1;
	}

	little_endian() {
		extern char x;
		return x;
	}

However, the Ritchie compiler rejects this.  In order to be backward
compatible, the compiler makes the first declaration of "x" visible
within the function "little_endian."  When the compiler encounters the
second declaration of "x", it notes the conflict between the declarations and
generates an error message.  My reading of K&R is that it acknowledges
this behavior of the compiler without making it a feature of C.
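
As an aside, the byte-order test itself does not need two conflicting declarations. A minimal sketch of one common alternative (not what the code above does) inspects the first byte of an int through an unsigned char pointer:

```c
/* Returns 1 if the lowest-addressed byte of an int holds its
   low-order bits (little-endian), 0 otherwise. */
int little_endian(void)
{
    int x = 1;

    /* Reading an object's bytes through an unsigned char pointer is
       well defined, so no mismatched extern declarations are needed. */
    return *(unsigned char *)&x == 1;
}
```

The result is still machine dependent, but every compiler sees a single consistent declaration of x.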

In any case, X3J11 decided to leave this up to the implementers, which
does seem to reflect the intent of K&R pretty well, although most of the
code written in the pre-block-structure days is long gone.
				Kenneth Almquist