[comp.lang.c] Does extern "const" allocate storage?

lenoil@Apple.COM (Robert Lenoil) (03/17/88)

I'm having some trouble understanding the const type qualifier.  If I declare

const int foo = 3;

then I hope that the compiler would actually use the constant 3 whenever foo
is used, instead of allocating storage and generating a reference to foo.
One exception, however, would be if I declare

const int *bar = foo;

which would have to allocate storage for foo so that a pointer to foo could be
placed in bar.  My question is: what happens when foo is defined in
another module?  Does

extern const int foo;

expect to be linking to an external constant, or is it linking to an external
integer, and semantically prohibiting assignment to that integer?  If the
latter is true, then every file with an extern reference to foo would have to
reference foo wherever it is used, instead of using a constant.  In this case,
using const would be less efficient than #define.  What about

extern const char foo[];

I suppose that's illegal, because sizeof(foo), which should be a constant,
won't work.  The ANSI C draft of 1/11/88 mentions the const type qualifier
in section 3.5.3 but doesn't actually define its meaning.  And while we're
discussing that section, can someone translate their definition of noalias
into English for me?  (If you thought reference manuals are hard to follow,
just try the ANSI draft on for size.)


Robert Lenoil

henry@utzoo.uucp (Henry Spencer) (03/18/88)

> const int foo = 3;
> 
> then I hope that the compiler would actually use the constant 3 whenever foo
> is used, instead of allocating storage and generating a reference to foo.

Conceptually that declaration allocates space for an int that cannot be
assigned into, and initializes it to 3.  A halfway-intelligent compiler
will notice that the value isn't going to change, and use the constant
instead of a variable access when foo appears *in that file*.  References
from other files will have to access the variable, since the compiler has
no way of knowing the value when compiling them.
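
A minimal sketch of that distinction, squeezed into one file for brevity.
The names local_foo, shared_foo, use_local, use_shared and check_foo are
illustrative assumptions, not from the original posts; in a real program
the definition of shared_foo would sit in a different file, and whether
any load is actually folded away is up to the compiler.

```c
#include <assert.h>

/* Value visible at compile time: uses may be folded into the literal 3. */
static const int local_foo = 3;

/* Declaration only: the type is known here, the value is not, so a use
   is normally compiled as a load from storage. */
extern const int shared_foo;

int use_local(void)  { return local_foo + 1; }   /* may become "return 4" */
int use_shared(void) { return shared_foo + 1; }  /* typically loads shared_foo */

/* Stand-in for the definition that would normally live in another module. */
const int shared_foo = 3;

/* Either way the observable result is the same. */
void check_foo(void)
{
    assert(use_local() == use_shared());
}
```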

> One exception, however, would be if I declare
> 
> const int *bar = foo;
> 
> which would have to allocate storage for foo...

No, *you* have to allocate storage for whatever bar points at.  The compiler
will not do it for you.  All the above declaration asks for is that the
value foo be placed into the constant pointer bar.
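
For anyone sorting out where the const applies in such a declaration, a
small sketch (value, other and pointer_kinds are made-up names).  Read
literally, "const int *bar" declares a pointer to a const int: bar itself
may be re-pointed, but nothing may be stored through it.  A pointer that
cannot itself be changed is spelled "int *const".

```c
#include <assert.h>

static int value = 3;
static int other = 7;

int pointer_kinds(void)
{
    const int *to_const = &value;   /* pointer to const int */
    int *const fixed    = &value;   /* const pointer to (non-const) int */

    to_const = &other;              /* allowed: the pointer itself is not const */
    *fixed = 4;                     /* allowed: the int it points at is not const */

    /* *to_const = 1;    rejected: store through a pointer to const */
    /* fixed = &other;   rejected: assignment to a const pointer    */

    assert(value == 4);
    return *to_const + *fixed;      /* 7 + 4 */
}
```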

> ... The ANSI C draft of 1/11/88 mentions the const type qualifier
> in section 3.5.3 but doesn't actually define its meaning...

I think you will find that the meaning is well-defined but you're going
to have to read quite a bit of the standard to catch all the nuances;
it's not all in one place.

> And while we're
> discussing that section, can someone translate their definition of noalias
> into English for me? ...

An idiomatic translation is easy:  "We want to do something about this, but
we don't understand the problem or the solutions well enough to do it right.
We'll try anyway."  :-(

More literally, ignoring fine points, the basic idea is that in the presence
of noalias, the compiler is allowed to cache variables without worrying about
whether the variables pointed at by two separate pointers really are the
same variable.  After it's all over, any altered cache entries get written
back to real memory, and if two cache entries get written back to the same
place (i.e. they *were* the same variable), the result is anyone's guess.
The idea is fine, but nobody really understands the implications well enough
to be sure it's being done right.  X3J11 has violated its own rules about
not adopting untried and poorly-understood inventions.
-- 
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry

karl@haddock.ISC.COM (Karl Heuer) (03/18/88)

In article <7712@apple.Apple.Com> lenoil@apple.UUCP (Robert Lenoil) writes:
>I'm having some trouble understanding the const type qualifier.  If I declare
>  const int foo = 3;
>then I hope that the compiler would actually use the constant 3 whenever foo
>is used, instead of allocating storage and generating a reference to foo.

The compiler is certainly free to do so.  (But if this declaration has file
scope, you'd better declare it "static" as a hint to the compiler that no
other module uses it.)

>One exception, however, would be if I declare
>  const int *bar = &foo;   ["&" added  --kwzh]
>which would have to allocate storage for foo so that a pointer to foo could be
>placed in bar.  My question involves what happens when foo is defined in
>another module?  Does
>  extern const int foo;
>expect to be linking to an external constant, or is it linking to an external
>integer, and semantically prohibiting assignment to that integer?

What do you mean by "linking to an external constant"?

>What about
>  extern const char foo[];
>I suppose that's illegal, because sizeof(foo), which should be a constant,
>won't work.

No, it's quite legal, just as for a non-const array.  You can't apply sizeof
to an incomplete type.
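
A short sketch of how that plays out, assuming a hypothetical array named
greeting.  While only the incomplete declaration is visible the array can
be read but not measured with sizeof; once a definition with an
initializer has been seen, the type is complete and sizeof works.

```c
#include <assert.h>
#include <string.h>

extern const char greeting[];        /* incomplete array type: size unknown here */

size_t greeting_length(void)
{
    /* "sizeof greeting" would not compile at this point; strlen is fine. */
    return strlen(greeting);
}

const char greeting[] = "hello";     /* the definition completes the type */

size_t greeting_size(void)
{
    /* Now legal: five characters plus the terminating '\0'. */
    assert(sizeof greeting == strlen(greeting) + 1);
    return sizeof greeting;
}
```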

>The ANSI C draft of 1/11/88 mentions the const type qualifier in section
>3.5.3 but doesn't actually define its meaning.

It doesn't describe the implementation details, if that's what you mean.  The
dpANS defines the valid operations and their semantics; whether storage is
actually allocated is a "quality" issue, beyond the scope of the standard.

You seem to be asking whether const is at least as efficient as #define.
Let's assume we're talking about a header file that is shared among multiple
modules.  If you use "static int const foo=3;" and never take its address, a
good compiler ought to be able to inline it wherever it's used.  (If you do
take its address, you couldn't have used a #define anyway, so the comparison
is invalid.)  If you use "extern int const foo;" and somewhere else you have
the definition "int const foo=3;", then the compiler will probably allocate
storage, unless it has some way to intuit the value.  (It could still be more
efficient than a non-const reference, though.)

(Btw, I intentionally wrote "int const" rather than "const int".  "const" is a
qualifier, not a storage class; it modifies "foo", not "int".)
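
The header-file comparison above might look like the following sketch.
FOO_MACRO, foo_object, foo_same and sum_foos are hypothetical names.  It
shows the two spellings of the qualifier side by side and the one thing
the macro cannot do, which is to have an address.

```c
#include <assert.h>

#define FOO_MACRO 3                  /* textual substitution, no object at all */
static int const foo_object = 3;     /* a read-only int ...                    */
static const int foo_same   = 3;     /* ... and the same type, spelled the other way */

int sum_foos(void)
{
    int const *p = &foo_object;      /* an object has an address; the macro does not */

    assert(*p == FOO_MACRO);
    return FOO_MACRO + foo_object + foo_same + *p;
}
```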

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

jejones@mcrware.UUCP (James Jones) (03/19/88)

In article <3034@haddock.ISC.COM>, karl@haddock.ISC.COM (Karl Heuer) writes:
> Let's assume we're talking about a header file that is shared among multiple
> modules.  If you use "static int const foo=3;" and never take its address, a
> good compiler ought to be able to inline it wherever it's used.

Is this really true?  As nearly as I can tell, "const" doesn't *really* mean
"constant," it means "readonly" (which makes me wonder why the Committee
chose such a highly misleading name for it).  An example the Draft gives is
of a variable that might represent a memory-mapped input port that cannot
be assigned to (though, to be sure, that is declared "const volatile.")

		James Jones

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/20/88)

In article <7712@apple.Apple.Com> lenoil@apple.UUCP (Robert Lenoil) writes:
>const int foo = 3;

In general, "const" in ANSI C does not mean the same as in Pascal etc.
(C's equivalent to that is "manifest constant" preprocessor macros.)
Unless the compiler can ascertain that it is impossible for any code
to need the variable `foo' to really exist, it is obliged to generate
storage for `foo'.  Think of "const" as "readonly" and you'll have a
good idea of its intended properties.

>... can someone translate their definition of noalias into English for me?

There are definitely some problems with the current specification of
type qualifiers, which will probably be fixed one way or another at
the April X3J11 meeting.  (The possibilities range from minor tweaks
to the current specs to fix technical errors, through removing
"noalias" altogether and possibly adding some other support for
certain types of optimization, such as Peter Darnell's [] proposal.)

Basically, "noalias" is a hint that, although as usual in C there
can be multiple "handles", i.e. access paths to an object's content,
the first noalias "handle" used is the master version, which is
guaranteed to always reference the official object content, and access
via other handles need not always go fetch the current object content
for safety (as is required if "noalias" is not used), but instead can
use cached copies of the object.  The only purpose of all this is to
allow just that sort of caching, which is a form of optimization that
some people consider important.  For example, it allows vectorizing
machines to employ vector operations where normally C would say that
they could not be safely used.

I think I got that substantially right, although I may have messed
up a few details.  Hey, I don't plan to use this stuff!

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/20/88)

In article <1988Mar17.175448.521@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>The idea is fine, but nobody really understands the implications well enough
>to be sure it's being done right.  X3J11 has violated its own rules about
>not adopting untried and poorly-understood inventions.

The first sentence is erroneous; several people fully understand the
issue, its implications for the implementation, and its implications
for the programmer.  (Their evaluations of the desirability of the
feature vary widely, however.)  I don't claim to be one of them, but
I think I understand the issue in general.

There are known technical errors in the current wording that have to
be fixed before the final standard if "noalias" is to remain; these
have nothing to do with the desirability of the feature, though.

As to your second sentence, there is some merit to it.  Many committee
members were uneasy at introducing such a major invention at this late
date, but it was done to resolve an issue that had remained unsolved
for several meetings.  For a while it looked like there would never be
a solution that enough people would accept; "noalias" was by far the
most generally acceptable proposal for dealing with the aliasing vs.
optimization issue.  You and I probably would agree what the solution
"should" have been (namely, to simply disallow unsafe optimization),
but there were many people who didn't want to accept that solution.

The above is anecdotal and should not be considered official X3J11
history.

karl@haddock.ISC.COM (Karl Heuer) (03/23/88)

In article <613@mcrware.UUCP> jejones@mcrware.UUCP (James Jones) writes:
>In article <3034@haddock.ISC.COM>, karl@haddock.ISC.COM (Karl Heuer) writes:
>>If you use "static int const foo=3;" and never take its address, a good
>>compiler ought to be able to inline it wherever it's used.
>
>Is this really true [since "const" really means "readonly", not "constant"]?

An object which is declared const but not volatile can never be modified by a
correct program.  A conforming implementation is allowed to take advantage of
this knowledge, by putting it in read-only memory and/or by inlining it.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

dag@chinet.UUCP (Daniel A. Glasser) (03/23/88)

In article <3117@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>In article <613@mcrware.UUCP> jejones@mcrware.UUCP (James Jones) writes:
[original stuff deleted]
>>Is this really true [since "const" really means "readonly", not "constant"]?
>
>An object which is declared const but not volatile can never be modified by a
>correct program.  A conforming implementation is allowed to take advantage of
>this knowledge, by putting it in read-only memory and/or by inlining it.
>
>Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

I disagree with Karl.  I'm looking at the draft standard right now, and I
believe that const does not give the compiler the license to 'inline' it.
The January 11, 1988 draft standard says in section 3.5.3, footnote 50,
page 65:

	"The implementation may place a const object that is not volatile
	in a read-only region of storage."

The rationale of the same date is out of sync with the standard in section
numbers, and discusses this in section 3.5.2.4.  It states that the compiler
can cache this value, only reading it from storage once, though this does
not guarantee that the value will never change during the running of the
program.  The only time that a const value is known (by the compiler) never
to change is when it is declared as 'const noalias'.  (Note that if
dmr, and those of us who agree with him, have our way, noalias will not make
it into the final standard.)  The rationale goes on to say that they could
just as easily have created a 'nonconst' qualifier instead.  The purpose of the
const qualifier is to allow diagnostics when the programmer inadvertently
does an assignment to one, and to allow the compiler to make some assumptions
about the integrity of the data between sequence points.  The actual hardware
protection of the storage allocated to const objects is an extension, and
not directly addressed in the body of the draft standard.  (just a footnote)

What Mr. Heuer is describing is more of a 'literal constant'.  These exist
in some other languages, but the closest thing to one in C is an enumeration
constant.  Maybe the next round of standards for C should consider the
BLISS style global literal as a new feature for the language.
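
A sketch of the enumeration-constant point, with hypothetical names FOO,
table, classify and table_len.  An enumeration constant is an integer
constant expression, so it can size a file-scope array or label a case;
in C an object declared "const int" cannot be used in either place.

```c
#include <assert.h>

enum { FOO = 3 };           /* enumeration constant: no storage, known value */

int table[FOO];             /* fine: the size is an integer constant expression */

int classify(int n)
{
    switch (n) {
    case FOO:               /* fine: case labels need integer constant expressions */
        return 1;
    /* with "const int foo = 3;" instead, "case foo:" would be rejected in C */
    default:
        return 0;
    }
}

int table_len(void)
{
    assert(sizeof table == FOO * sizeof table[0]);
    return (int)(sizeof table / sizeof table[0]);
}
```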

Daniel A. Glasser		I've read the draft standard from cover to
Mark Williams Co.		cover, and also the rationale.  It leaves me
1430 W. Wrightwood Ave.		with a philosophical, not technical, question:
Chicago, IL 60660		
(312) 472-6659			If you dereference a NULL pointer to void,
...!ihnp4!mwc!dag				does anything happen?  (-;
				... *((void *)0)
				? Error -- Pointer is NULL and void

	(void *) wars, coming to a newsgroup near you!

PS to KH:  Please don't flame me so violently, huh?  I'm not posting out
	of spite.  And yes, our compiler converts i++ to ++i where it makes
	no difference.  I've encountered those that do not, and have learned
	not to depend on its being done.
-- 
		Daniel A. Glasser	dag@chinet.UUCP
    One of those things that goes "BUMP!!! (ouch!)" in the night.
 ...!att-ih!chinet!dag | ...!ihnp4!mwc!dag | ...!ihnp4!mwc!gorgon!dag

henry@utzoo.uucp (Henry Spencer) (03/24/88)

> >... nobody really understands the implications well enough
> >to be sure it's being done right...
> 
> The first sentence is erroneous; several people fully understand the
> issue, its implications for the implementation, and its implications
> for the programmer.  (Their evaluations of the desirability of the
> feature vary widely, however.)

The parenthesized addendum there is, um, interesting.  Might one suspect
that the construct is, in fact, *not* well-understood in a more global
sense?  Dissent about whether it's a good idea is hardly a sign of "being
sure it's done right"!

The correct "solution" to this problem at this time is DO NOT ATTEMPT
TO SOLVE IT.
-- 
"Noalias must go.  This is           |  Henry Spencer @ U of Toronto Zoology
non-negotiable."  --DMR              | {allegra,ihnp4,decvax,utai}!utzoo!henry

tanner@ki4pv.uucp (Dr. T. Andrews) (03/25/88)

In article <7485@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
) ...  Think of "const" as "readonly" and you'll have a
) good idea of its intended properties. ...

So, allow me to propose that it be called "readonly" instead of
"const".  As that would accurately describe it ("const" *is*
misleading), it would seem a better choice.
-- 
{allegra clyde!codas decvax!ucf-cs ihnp4!codas killer}!ki4pv!tanner

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/26/88)

In article <6985@ki4pv.uucp> tanner@ki4pv.uucp (Dr. T. Andrews) writes:
>So, allow me to propose that it be called "readonly" instead of "const".

"const" was the name that C++ used for this type qualifier, which is
the main reason we decided to also call it "const".  Note, however,
that at the December meeting this qualifier was very nearly changed
to take on more of the meaning currently assigned to "noalias".  If
the AT&T representative had not vociferously objected to the conflict
with C++ usage of the keyword "const", it is possible that its name
would have been even LESS indicative of its function.

This subject is by no means closed; type qualifiers are likely to be
the most-discussed issue at the April meeting.  Who knows what will
happen?

pablo@polygen.uucp (Pablo Halpern) (03/30/88)

From article <3117@haddock.ISC.COM>, by karl@haddock.ISC.COM (Karl Heuer):
> An object which is declared const but not volatile can never be modified by a
> correct program.  A conforming implementation is allowed to take advantage of
> this knowledge, by putting it in read-only memory and/or by inlining it.

I disagree.  An object that is declared const but not volatile can not
be modified within the scope in which the declaration holds.  Nowhere
does the standard say that such an object is "really constant."  The
lack of the volatile modifier simply indicates that the object will
not be modified "behind the function's back" and therefore can take part
in certain optimizations.  This does bring up a problem, though.
In the standard, a clock register is used as an example of a volatile const.
The declaration looked something like:

	extern const volatile int clock;

This was explained as meaning that the clock register may not be
written to by the program and that, if it is read once, the value may
be different when it is read again.  The question is, doesn't this imply
that somewhere there is a function that might look as follows:

	volatile int clock;

	void update_clock()	/* called on clock interrupt */
	{
		clock++;
	}

Here, clock is defined with two different types (const volatile int versus
volatile int) in two different scopes.  Wouldn't a compiler (or lint) that
"sees" both scopes be allowed to complain about this?  If so, how could
you ever update a const volatile object?  (Don't tell me you should do it
in assembly language.  Avoiding the compiler's type checking is not the
issue.  The meaning of type modifiers in C is the issue.)  In general,
are you allowed to declare the same object with different type qualifiers
provided that the scopes of the declarations do not overlap?  If so,
please direct me to the section in the standard that says so.

Pablo Halpern		|	mit-eddie \
Polygen Corp.		|	princeton  \ !polygen!pablo  (UUCP)
200 Fifth Ave.		|	bu-cs      /
Waltham, MA 02254	|	stellar   /

karl@haddock.ISC.COM (Karl Heuer) (03/31/88)

In article <3938@chinet.UUCP> dag@chinet.UUCP (Daniel A. Glasser) writes:
>In article <3117@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>In article <613@mcrware.UUCP> jejones@mcrware.UUCP (James Jones) writes:
>[original stuff deleted]
>>>Is this really true [since "const" really means "readonly", not "constant"]?
>>
>>An object which is declared const but not volatile can never be modified by a
>>correct program.  A conforming implementation is allowed to take advantage of
>>this knowledge, by putting it in read-only memory and/or by inlining it.
>
>I disagree with Karl.  I'm looking at the draft standard right now, and I
>believe that const does not give the compiler the license to 'inline' it.
>[The dpANS does say in a footnote:] "The implementation may place a const
>object that is not volatile in a read-only region of storage."

I think it's covered by the as-if rule.  If I say "int const x=5;", then since
we seem to agree that the value can never change (otherwise the implementation
would not have the liberty to enROM it), a compiler ought to be able to
convert "return x;" into "return 5;".  To disprove this, you'd have to come up
with a strictly conforming program in which the optimized version produces a
different result than the non-optimized.

>[Section 3.5.2.4 in the Rationale] states that the compiler can cache this
>value, only reading it from storage once, though this does not guarantee that
>the value will not ever change in the running of the program.

I don't see that.

>The only time that a const value is known never to change (by the compiler)
>is when it is declared as 'const noalias'.

That paragraph is talking about a pointer.  It's true that the referent of a
(non-noaliased) "int const *" is not cacheable, because there could be a
non-const path to the same object (i.e. we may actually be dealing with an
"int *" that has been cast to "int const *").  However, I maintain that the
object declared by "int const x=5;" is cacheable.
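
A minimal sketch of the difference, assuming hypothetical names counter,
fixed and read_twice.  Inside read_twice the compiler cannot tell whether
view points at a genuinely const object or at an ordinary int reached
through a const-qualified pointer, so the second read has to happen.

```c
#include <assert.h>

int counter = 1;              /* an ordinary, modifiable int */
int const fixed = 5;          /* an object actually declared const */

int read_twice(int const *view)
{
    int first, second;

    counter = 1;
    first = *view;
    counter = 2;              /* non-const path to what may be the same object */
    second = *view;           /* must be re-read: view may alias counter */
    return first * 10 + second;
}
```

Called as read_twice(&counter) the two reads differ; called as
read_twice(&fixed) they cannot, since fixed itself is never modified.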

>PS to KH: Please don't flame me so violently, huh?  I'm not posting out of
>spite.  And yes, our compiler converts i++ to ++i where it makes no
>difference.  I've encountered those that do not, and have learned not to
>depend on its being done.

Neither this nor my previous posting was a flame; I'm sorry if you took it
that way.  In fact, I always write "++i" rather than "i++" for essentially the
same reason (and for the related reason that "++i" is the more fundamental
operation, so I'm writing what I mean), even though I've never used such a
compiler.  But I still believe that any compiler that doesn't make such a
simple optimization is probably so lousy that it's not worth hand-optimizing;
you end up doing a lot of work for a small benefit.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

dag@chinet.UUCP (Daniel A. Glasser) (03/31/88)

In article <3241@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
[discussion and excerpts about const/storage and the ANSI docs. removed]
>I think it's covered by the as-if rule.  If I say "int const x=5;", then since
>we seem to agree that the value can never change (otherwise the implementation
>would not have the liberty to enROM it), a compiler ought to be able to
>convert "return x;" into "return 5;".  To disprove this, you'd have to come up
>with a strictly conforming program in which the optimized version produces a
>different result than the non-optimized.

---- begin file 1 -----

const int x = 5;

int foo()
{
	return x;
}

---- begin file 2 -----

extern const int x;
extern int foo();

int fie()
{
	return x == foo();
}
---- end example -----

Maybe the compiler is at liberty to substitute the value 5 for the
use of x in the function foo, though I consider this rather unsavory
behavior, but it is not at liberty to omit storage for it entirely.

I cannot remember if ANSI C has const functions.  A const function
is one which has exactly one (possibly non-unique) value for each possible
argument value; thus the expression
	extern int foo(const int);
	int i, a;

	i = foo(a)*foo(a);

can be converted by the compiler to (for a hypothetical machine)
(function return in reg0)
	push a_
	call foo_
	pop
	mult reg0,reg0
	store reg0, i_

In some cases, this optimization is a REAL win!
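
For reference, ANSI C as eventually standardized has no such qualifier
for functions, though some compilers offer an extension along these lines
(GCC's __attribute__((const)), for instance).  The portable way to get
the same effect is to hoist the call by hand.  A sketch, with pure_foo,
twice_called, once_called and check_hoist as made-up names:

```c
#include <assert.h>

/* Result depends only on the argument; no side effects. */
int pure_foo(int a)
{
    return a * a + 1;
}

/* As written in the source: two calls. */
int twice_called(int a)
{
    return pure_foo(a) * pure_foo(a);
}

/* What the optimization amounts to: call once, reuse the result. */
int once_called(int a)
{
    int t = pure_foo(a);
    return t * t;
}

/* The rewrite is only valid because both forms give the same answer. */
void check_hoist(int a)
{
    assert(twice_called(a) == once_called(a));
}
```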
-- 
		Daniel A. Glasser	dag@chinet.UUCP
    One of those things that goes "BUMP!!! (ouch!)" in the night.
 ...!att-ih!chinet!dag | ...!ihnp4!mwc!dag | ...!ihnp4!mwc!gorgon!dag

karl@haddock.ISC.COM (Karl Heuer) (04/02/88)

In article <4381@chinet.UUCP> dag@chinet.UUCP (Daniel A. Glasser) writes:
>---- begin file 1 -----
>  const int x = 5;
>  int foo() { return x; }
>---- begin file 2 -----
>  extern const int x;
>  extern int foo();
>  int fie() { return x == foo(); }
>---- end example -----
>Maybe the compiler is at liberty to substitute the value 5 for the use of x
>in the function foo [but it isn't allowed to omit the storage entirely].

I think we're in agreement now.  What I claim is that (a) the substitution is
valid if the compiler knows the value, and (b) if *all* possible references
are so optimized, then the compiler need not allocate the space.  This is
normally possible only if the object has non-external linkage.

Actually, it's theoretically possible even in your example, if the compiler is
willing to do cross-file optimization.  (It would also have to "know" that no
library routines depend on x.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

karl@haddock.ISC.COM (Karl Heuer) (04/02/88)

In article <136@polygen.UUCP> pablo@polygen.uucp (Pablo Halpern) writes:
>From article <3117@haddock.ISC.COM>, by karl@haddock.ISC.COM (Karl Heuer):
>>An object which is declared const but not volatile can never be modified by
>>a correct program.  A conforming implementation is allowed to take advantage
>>of this knowledge, by putting it in read-only memory and/or by inlining it.
>
>I disagree.  An object that is declared const but not volatile can not
>be modified within the scope in which the declaration holds.  Nowhere
>does the standard say that such an object is "really constant."

Footnote 50 certainly implies it.  If the body of the standard doesn't say
that modifying a const object through a non-const handle is undefined, I'm
sure the Committee would like to correct the oversight.

(Note: I'm talking about actual objects declared const.  Your statement is
true for a dereferenced pointer-to-const.)

>In the standard, a clock register is used as an example of a volatile const.
>The declaration looked something like:
>	extern const volatile int clock;
>The meaning of this is explained that the clock register may not be written to
>by the program and that if it is read once, the value may be different if it
>is read again.  The question is, doesn't this imply that somewhere there is a
>function that might [modify it through a non-const handle]

No, the dpANS said that this may be "modifiable by hardware".  A memory-mapped
address could be so declared.  (Presumably the linker will bind the unresolved
extern to the appropriate magic location.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

djones@megatest.UUCP (Dave Jones) (04/02/88)

in article <3288@haddock.ISC.COM>, karl@haddock.ISC.COM (Karl Heuer) says:
> 
> In article <136@polygen.UUCP> pablo@polygen.uucp (Pablo Halpern) writes:

...

>>In the standard, a clock register is used as an example of a volatile const.
>>The declaration looked something like:
>>	extern const volatile int clock;
>>The meaning of this is explained that the clock register may not be written to
>>by the program and that if it is read once, the value may be different if it
>>is read again.  The question is, doesn't this imply that somewhere there is a
>>function that might [modify it through a non-const handle]
> 
> No, the dpANS said that this may be "modifiable by hardware".  
> A memory-mapped address could be so declared.  (Presumably the linker 
> will bind the unresolved extern to the appropriate magic location.)
> 
> Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), 
>   The Walking Lint
    

Why should my program know whether the "clock" variable is being
"modified by hardware", or modified by an interrupt-handler?  Why
shouldn't one implementation use a hardware register, and another
use an interrupt handler?  Without breaking any application code?

We used to make up jokes about "write-only" memory. But "read-only
volatile memory" is a pretty screwy idea too.   It's oxymoronic.

It's really a matter of information-hiding.  The consumer of the clock
variable needs to know that consumers are not to write to it. 
Consumers do not need to know the specifics of how the variable gets
written to.
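
One way to get that kind of hiding without declaring the same object with
two different types is to keep the writable variable private and hand
consumers a pointer to const volatile.  This is only a sketch under
assumed names (tick_count, clock_view, update_clock, read_clock,
tick_demo); it is not what the draft's example does.

```c
#include <assert.h>

static volatile int tick_count = 0;     /* writable only within this file */

/* Read-only view for consumers: they can read through it, not assign. */
const volatile int *const clock_view = &tick_count;

void update_clock(void)                 /* would be called on a clock interrupt */
{
    tick_count++;
}

int read_clock(void)
{
    return *clock_view;                 /* volatile: re-read from storage each time */
}

void tick_demo(void)
{
    int before = read_clock();

    update_clock();
    assert(read_clock() == before + 1);
    /* *clock_view = 0;   rejected: the view is const-qualified */
}
```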

I say, "const" should apply only to the file in which the declaration
occurs.


 -- Dave (If I could walk like lint, I wouldn't need the talcum) Jones