[comp.lang.c++] identifiers enter scope: language lawyers

jar@io.UUCP (Jim Roskind x5570) (01/15/90)

In article by ark@alice.UUCP (Andrew Koenig)

>In article <1794@thumper.bellcore.com>, clayton@thumper.bellcore.com (R. Clayton) writes:
>>   ...

>One might think the following might work too:

>	extern char *home_dir, *path, *file;
>	Pathname home = home_dir;
>	{
>		Pathname file = home + path + file;
>		FILE *fp = file->open();
>		// ...
>	}

>with `file' in `home + path + file' referring to the `file' in
>the enclosing scope.  Unfortunately, the ANSI C standard decrees
>that a variable is defined from the instant its name is uttered
>in its declaration and C++ goes along with that.  

The language of the dpANSI C Standard is very subtle, and (as I recall)
the precise statement is that the identifier is defined from the instant
the "declarator is complete".  In C, this point is nicely marked by
either a ';' (end of declaration), '=' (start of initializer), or ','
(start of next declarator).  In C++, this location is unclear to a
typical LR1 parser, when a parenthesized initializer is used.
Although this detail has not (to my knowledge) been addressed in the
C++ Reference Manual, I believe there should be a subtle distinction
for C++ in this area.  Specifically, I believe that in C++ the
identifier should be placed into the scope at the end of the parenthesized
initializer (if any), and otherwise at the end of the declarator (as
in ANSI C).  Assuming this resolution is adopted, the following slightly
obscure code should work:


	extern char *home_dir, *path, *file;
	Pathname home = home_dir;
	{
		Pathname file ( home + path + file ) ;
		FILE *fp = file->open();
		// ...
	}

I am only guessing, but I assume some C++ parsers would accept this
already with the semantics that are desired.  To be specific:

const int t=5;
void main()
	{
	int t(t+1);  // should initialize this local t to 6
	}

By the way, I am NOT advocating use of such cryptic code, only
standardization of its meaning.

The following very cryptic code can also be nicely disambiguated using
the above rule:

typedef int * T1;
int *pi1;
void main()
	{
	int (*T1)(T1); //redeclares T1 "pointer to function taking int *"
	void * pv = &pv; // valid ANSI C code
	int *pi1(pi1);  //redeclare local pi1 with initial value ::pi1
	}
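
A minimal sketch of the "declarator is complete" point, using only forms
that do not depend on how the parenthesized-initializer question is settled.
The function names and the name `n` are made up for illustration.  The array
case assumes the bound inside `[]` is part of the declarator, so it is read
before the new name enters scope.

```cpp
#include <cassert>
#include <cstddef>

const int n = 2;                  // outer n (hypothetical name)

// The declarator `n[n]` is not complete until the closing `]`,
// so the bound refers to the outer n and the local n is an array of 2 ints.
std::size_t local_array_length()
{
    int n[n];
    return sizeof(n) / sizeof(n[0]);
}

// Here the declarator `*pv` is complete before the `=`,
// so `&pv` in the initializer already names the local pv.
bool points_to_itself()
{
    void *pv = &pv;
    return pv == static_cast<void *>(&pv);
}
```

Both cases turn on where the declarator ends, not on anything in the
initializer itself.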

>-- 
>				--Andrew Koenig
>				  ark@europa.att.com

Jim Roskind
Independent consultant
(407)729-4348
jar@ileaf.com

comeau@utoday.UUCP (Greg Comeau) (01/16/90)

In article <1372@io.UUCP> jar@io.UUCP (Jim Roskind x5570) writes:
>In article by ark@alice.UUCP (Andrew Koenig)
>>In article <1794@thumper.bellcore.com>, clayton@thumper.bellcore.com (R. Clayton) writes:
>>One might think the following might work too:
>>	extern char *home_dir, *path, *file;
>>	Pathname home = home_dir;
>>	{ Pathname file = home + path + file; }
>>with `file' in `home + path + file' referring to the `file' in
>>the enclosing scope.  Unfortunately, the ANSI C standard decrees
>>that a variable is defined from the instant its name is uttered
>>in its declaration and C++ goes along with that.  
>The language of the dpANSI C Standard is very subtle, and (as I recall)
>the precise statement is that the identifier is defined from the instant
>the "declarator is complete".

It may well be subtle, but in this situation both ANSI C and the C++ 2.0 DRM
explicitly state that the identifier is in scope as soon as its declarator
is complete.  I don't see any ambiguity or unnecessary terseness here, and
that seems to be very much in tune with Andrew's somewhat slangy "a variable
is defined from the instant its name is uttered in its declaration and C++
goes along with that".  Of course, '{ Pathname file = home + path + ::file; }'
should work with no problems.
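
To illustrate the `::file` spelling, here is a rough sketch with std::string
standing in for the Pathname class.  Pathname itself is not shown in this
thread, so the substitution, the function name, and the sample strings are
all assumptions.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the externs in the quoted example; the values are made up.
const char *home_dir = "/usr/ark/";
const char *path     = "src/";
const char *file     = "main.c";

std::string full_name()
{
    std::string home = home_dir;
    // An unqualified `file` on the right would be the local being declared;
    // `::file` names the one at file scope.
    std::string file = home + path + ::file;
    return file;
}
```

Note that `::` only reaches names at file scope.  If the outer `file` were
itself a local in an enclosing block, this spelling would not find it.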

>In C, this point is nicely marked by
>either a ';' (end of declaration), '=' (start of initializer), or ','
>(start of next declarator).  In C++, this location is unclear to a
>typical LR1 parser, when a parenthesized initializer is used.

Hmm, I'm not thinking too hard about this, but why should a typical
LR1 C parser do it better than a typical LR1 C++ parser?  Granted
there are differences, but I don't think they enter here.

>I am only guessing, but I assume some C++ parsers would accept this
>already with the semantics that are desired.  To be specific:
>const int t=5;
>void main()
>	{
>	 int t(t+1);  // should initialize this local t to 6
>	}
>By the way, I am NOT advocating use of such cryptic code, only
>standardization of its meaning.

Well, that won't work because neither t is based on a class, so
the () affair is no good.  But even if there were a class,
its meaning is standardized, at least according to the DRM, and it still
follows the declarator constraint above as Andrew put it.

>The following very cryptic code can also be nicely disambiguated using
>the above rule:
>typedef int * T1;
>int *pi1;
>void main()
>	{
>	int (*T1)(T1); //redeclares T1 "pointer to function taking int *"
>	void * pv = &pv; // valid ANSI C code
>	int *pi1(pi1);  //redeclare local pi1 with initial value ::pi1
>	}
>
>Jim Roskind
>jar@ileaf.com

Yeah, but again, like above, () is NOT an initializer, and therefore the
local pi1 is missing a type name in its function prototype, rather than
acting like a constructor reference.  Unless there's an extension somewhere
that I've been ignorant of, such a thing is not allowed on a base type (which
has nothing to do with a base class).  (Perhaps "fundamental type" is the
better term?)  'pv' is obviously ok.

The inner T1 should be ok with both C compilers and C++ compilers, as
should a situation like:

typedef int x;

main()
{
    x x;
}

However, is this (not the x x case, the T1 case) one of the cases where C++
differs from C?  I can't see it stated anywhere, but it must be.  Anything
to do with typedef vs classes?
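
For the `x x` case, a small check along these lines shows the typedef name
serving as the type up to the end of the declarator and the variable taking
over afterwards.  The function name and the value 7 are mine, purely for
illustration.

```cpp
#include <cassert>

typedef int x;

int shadow_typedef()
{
    x x = 7;          // first x: the typedef; second x: the new variable
    return x + 1;     // from here on, x is the variable, not the type
}
```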

At first glance I thought the T1 declaration would work, but running it
through cfront just gave a ghastly "error: syntax error" message!  To make
matters worse (in my eyes), adding a T2 typedef and changing to
'int (*T1)(T2);' still didn't make it happy, nor did it like
'int (*T1)(int);', so the 'declarator is still in progress' question was not
the issue.  It looks more like a name space pollution nuance I never bothered
to look at.  Then again, I somehow feel that is not the case either, and that
it is instead trying to apply the type info from within the T1 typedef.  But
that doesn't explain why a local declaration of 'int *T1;' *is* all right,
nor why it also barks at something like 'int (*T1)[30];' but not at
'int *T1[30];'.  And even though it accepts 'int *T1;', it won't take
'int (*T1);', which is supposed to mean the same thing with or without the
parentheses.
-- 
Greg, Comeau Computing, 91-34 120th Street, Richmond Hill, NY, 11418
Producers of CC C++, SysAdm columnist for UNIX Today!, Microsoft Systems Journal
(C programming), + others. Also, BIX c.language & c.plus.plus conf. moderator.
Here:attmail!csanta!greg / BIX:comeau / CIS:72331, 3421 / voice:718-849-2355