[comp.lang.c++] C++ syntactic ambiguity

david@lpi.UUCP (David Michaels) (11/10/89)

-----------------------------------------------
 The following is probably only of interest
 to C++ language-lawyer/compiler-writer types.
-----------------------------------------------

Besides the well known and well documented declaration-statement
vs. expression-statement ambiguity (which requires arbitrary compiler
look-ahead), there is another related function-declarator vs.
object-initializer ambiguity.  Assuming that "T" is a class type,
consider the following declaration:

        T a (long (x));

Here, "a" could conceivably be either of the following:

1. A function returning type "T" and taking one argument of
   type "long"; in this case the "x" is a dummy parameter name
   enclosed in (redundant) parentheses.

2. An object of type "T" being initialized (via a class constructor)
   with the value of the (function-style cast) expression "long (x)".
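
The two readings can be spelled out unambiguously and compared with the
ambiguous form.  What follows is only a sketch: the class T, its member v,
the global x, and the helper check_readings are made-up stand-ins, and the
static_asserts reflect how ISO C++ as later standardized resolves the
construct (as a function declaration).  They say nothing about what
AT&T C++ 2.0 does.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the class type "T" in the post.
struct T {
    long v;
    T(long init) : v(init) {}
};

long x = 42;  // hypothetical object, so "long (x)" can also be an expression

// Reading 1, spelled unambiguously: a function taking a long, returning T.
T f(long x);

// Reading 2, spelled unambiguously: an object initialized from a cast of x.
T obj((long) x);

// The ambiguous form from the post.
T a(long (x));

// Standard C++ picks reading 1: decltype(a) is the function type T(long).
static_assert(std::is_function<decltype(a)>::value, "a is a function");
static_assert(std::is_same<decltype(a), decltype(f)>::value, "same type as f");
static_assert(!std::is_function<decltype(obj)>::value, "obj is an object");

// Run-time check that the object reading really copied the value of x.
void check_readings()
{
    assert(obj.v == 42);
}
```

Neither f nor a is ever defined; they are only named inside decltype, so
nothing needs to be linked.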

AT&T C++ (2.0) seems to choose #2.  This choice seems to be based on a
new and incompatible (with old C and ANSI C) prohibition of redundant
parentheses around declarators; the simple declaration "int (x);" (inside
a function definition) has been rendered illegal.  In "The C++ Answer Book"
by Tony Hansen (on page 522 in appendix A), there is a statement that
superfluous parentheses in declarations, while legal in ANSI C, are *not*
permitted in C++.  But in "C++: From Research to Practice" by S.B. Lippman
and B.E. Moo (in the 1988 USENIX C++ Conference Proceedings) there is an
indication that C++ *does* allow extraneous parentheses in declarations.
In addition, in "The Evolution of C++: 1985 to 1987" by Bjarne Stroustrup,
there is an assertion that redundant parentheses are *illegal* in declarations,
but in the follow-up "The Evolution of C++: 1985 to 1989", that assertion
seems to have been removed.  I spoke very briefly with Andrew Koenig about
this at the recent "C++ at Work-'89" Conference; he said that he wasn't at
that moment entirely sure what was currently implemented, but he thought that
#1 in the example above should be chosen, and seemed to be fairly certain that
it was *not* intended that redundant parentheses in declarators be disallowed.
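
Whether a given compiler accepts redundant parentheses around declarators
is easy to probe.  A minimal sketch, assuming a compiler that follows
ANSI C on this point (as ISO C++ compilers do today); the function name
redundant_parens is invented for the illustration, and some compilers
will warn about the unnecessary parentheses:

```cpp
#include <cassert>

// Compiles only if redundant parentheses around declarators are accepted.
int redundant_parens()
{
    int (x);          // same as: int x;
    x = 3;
    int (*(p)) = &x;  // same as: int *p = &x;
    int ((y)) = 4;    // same as: int y = 4;
    return *p + y;    // 3 + 4
}
```

A compiler with the prohibition described above would reject the first
declaration in the function body.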

If redundant parentheses in declarators should indeed be permitted in C++
(I think they should), and if the disambiguation rule is indeed to choose
a function-declarator over class-initializer in an ambiguous construct
(i.e.  similar to the way in which a declaration-statement is chosen over
an expression-statement), then I have the following questions/comments.

1. This rule seems fine, except that it doesn't yield the least
   surprising behavior.  Just looking at the example above, you would
   probably pick interpretation #2, since people don't normally put
   redundant parentheses around declarators.  Under the rule, if you
   really wanted interpretation #2 you would have to do something special,
   like surrounding "long (x)" in parentheses or using the old-style cast
   construct.  Perhaps redundant parentheses in *parameter* declarators
   should be disallowed after all?

2. I assume that (as with the declaration-statement vs. expression-statement
   ambiguity) the disambiguation is purely syntactic; that is, the meaning
   of identifiers (beyond whether they are type-names or not) is not to be
   considered during disambiguation.  In particular, the disambiguation will
   not consider whether or not the declaration occurs within function scope,
   whether or not the type has a constructor, or whether or not the type is
   even a class type.

3. How smart/thorough should the disambiguation be?  Consider this case:

        T a (long (x), long (x+1), long (x));

   If we just look-ahead at the *first* parameter-declaration/constructor-
   argument then since it *could* be a parameter-declaration we would assume
   we are looking at a function declarator and we would (begin to) parse "a"
   as a function returning type "T" etc. and get a syntax error when looking
   at the second parameter-declaration.  If however we look at *all* of the
   parameter-declarations/constructor-arguments, we would interpret "a" as
   an object of type "T" being initialized by three arguments (all of which
   are function-style cast expressions); this is more desirable I think.
   Just as a quality-of-implementation issue, the look-ahead process should
   probably terminate as soon as an unambiguous parameter-declaration is
   found (in which case we'd disambiguate to a function declarator) or as
   soon as an unambiguous expression (or illegal declaration) is found (in
   which case we'd disambiguate to a class initializer).
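
The three points above can be checked against a present-day compiler.
This is a sketch under assumptions: the types WithCtor, NoCtor and Three,
the global x, and the helper local_is_function are invented for
illustration, and the asserted outcomes are those of ISO C++ as later
standardized, not necessarily those of AT&T C++ 2.0.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types standing in for "T".
struct WithCtor { long v; WithCtor(long init) : v(init) {} };
struct NoCtor {};
struct Three {
    long sum;
    Three(long p, long q, long r) : sum(p + q + r) {}
};

long x = 10;  // hypothetical object named by the cast expressions

// Point 1: getting the object reading takes "something special".
WithCtor w1((long (x)));  // extra parentheses: cannot be a parameter
WithCtor w2((long) x);    // old-style cast: cannot be a parameter
static_assert(!std::is_function<decltype(w1)>::value, "w1 is an object");
static_assert(!std::is_function<decltype(w2)>::value, "w2 is an object");

// Point 2: the choice is syntactic.  Whether the type has a constructor,
// or is a class at all, makes no difference: all three are functions.
WithCtor d1(long (x));
NoCtor   d2(long (x));
int      d3(long (x));
static_assert(std::is_function<decltype(d1)>::value, "d1 is a function");
static_assert(std::is_function<decltype(d2)>::value, "d2 is a function");
static_assert(std::is_function<decltype(d3)>::value, "d3 is a function");

// Nor does function scope matter: this declares a local function d4.
bool local_is_function()
{
    WithCtor d4(long (x));
    return std::is_function<decltype(d4)>::value;
}

// Point 3: "long (x + 1)" cannot be a parameter declaration, so the whole
// construct is an object initialized with three cast expressions.
Three t(long (x), long (x + 1), long (x));
static_assert(!std::is_function<decltype(t)>::value, "t is an object");

// When every argument could be a parameter, it is a function declaration.
Three g(long (x), long (y), long (z));
static_assert(std::is_function<decltype(g)>::value, "g is a function");
```

The functions d1, d2, d3, d4 and g are declared but never defined; they
are only named inside decltype, so no definitions are needed.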

Phew, maybe I got a little carried away, but we'd like to get this right.
Can anyone clear this up further?  Thanks.

                                - David Michaels (david@lpi.uucp)
                                  Language Processors, Inc. (LPI)
                                  Framingham, MA 01701-4613
                                  (508) 626-0006

hansen@pegasus.ATT.COM (Tony L. Hansen) (11/13/89)

< AT&T C++ (2.0) seems to choose #2.  This choice seems to be based on a
< new and incompatible (with old C and ANSI C) prohibition of redundant
< parentheses around declarators; the simple declaration "int (x);" (inside
< a function definition) has been rendered illegal.  In "The C++ Answer
< Book" by Tony Hansen (on page 522 in appendix A), there is a statement
< that superfluous parentheses in declarations, while legal in ANSI C, are
< *not* permitted in C++.  But in "C++: From Research to Practice" by S.B.
< Lippman and B.E. Moo (in the 1988 USENIX C++ Conference Proceedings) there
< is an indication that C++ *does* allow extraneous parentheses in
< declarations. In addition, in "The Evolution of C++: 1985 to 1987" by
< Bjarne Stroustrup, there is an assertion that redundant parentheses are
< *illegal* in declarations, but in the follow up "The Evolution of C++:
< 1985 to 1989", that assertion seems to have been removed.

When I wrote that paragraph, the restriction was very definitely present.
Unfortunately, I didn't catch the fact that the restriction had later been
removed while I was reviewing the 2.0 reference manual. That's right, the
restriction is no longer present.

Isn't language evolution wonderful? :-)

The other statements within that section in Appendix A are correct.

					Tony Hansen
				att!pegasus!hansen, attmail!tony
				    hansen@pegasus.att.com