[gnu.g++.help] Incompatible changes in C++

rms@AI.MIT.EDU (Richard Stallman) (11/18/90)

    Ah, don't you just love language design by committee?  Especially
    when the committee throws out the "prior art" rule?  X3J11 showed
    commendable restraint in what they did to C, but there are storm
    warnings out for C++...

This spirit of careless incompatibility can be traced back to
Stroustrup.  Didn't he change the behavior of struct and union,
breaking almost all C programs?  Make an ambiguous grammar?

bs@alice.att.com (Bjarne Stroustrup) (11/18/90)

 rms@AI.MIT.EDU (Richard Stallman) writes
(in reply to a message from Henry Spencer):

 >    Ah, don't you just love language design by committee?  Especially
 >    when the committee throws out the "prior art" rule?  X3J11 showed
 >    commendable restraint in what they did to C, but there are storm
 >    warnings out for C++...
 > 
 > This spirit of careless incompatibility can be traced back to
 > Stroustrup.  Didn't he change the behavior of struct and union,
 > breaking almost all C programs?  Make an ambiguous grammar?

Before this degenerates further it might be useful to mention a few things.

X3J11 did indeed show restraint in what they did to C and I hope X3J16
will show similar restraint. It should be remembered that X3J11 did do
a few interesting things to C. Notably it added `const,' prototypes
(both adapted from C++), locale in libraries, wchar_t, the rules for
unsigned arithmetic, etc. (in such cases C++ has followed ANSI C).
Like C, C++ was at the point of standardization not in a state where
simple rubber stamping would make an acceptable standard.

In particular, the proposal to standardize C++ (from HP) explicitly
mentions the need to add templates and exception handling; both of
which have now been voted in. In both cases SOME prior art existed
both in C++ and in similar languages. In the case of templates X3J16
found 3 independent implementations and well over half a million
lines of code in use.

I don't see any signs of a ``spirit of careless incompatibility'' in
X3J16 nor could I see how such a thing could originate from me.

I am not aware of having ``change(d) the behavior of struct and union,
breaking almost all C programs.'' In fact, nothing C++ does to structs
or unions breaks any C programs. I suspect rms is misinformed.

The C++ grammar is indeed not LALR(1). The language extensions that
caused that are from 1982/1983. That is, before X3J11 was convened and
before an LALR(1) grammar for C was known. That the C++ grammar is not
LALR(1) is an inconvenience and X3J16 is looking into what can be done
to make the definition of C++ more amenable to formal parser techniques.
It should be remembered though that retrofitting a simple LALR(1)
grammar would break C++ code and X3J16 would be sensitive to accusations
of having broken code. Note that this grammar problem does not imply
a C/C++ incompatibility and that it stems directly from the desire
to make user-defined types notationally equal to built-in types in
C++ (rather than making do with the simple `struct s' and `union u'
notation from C). I think that my design choices in this area traded
implementor convenience for user convenience. It would have been nice
if I had found a way to inconvenience neither group, but I didn't.

I would consider it naive to expect a work of the magnitude of C++ to
be 100% compatible with C. C++ is a separate language from C. 100%
compatibility was not a C++ design aim (remember C didn't even have
prototypes when C++ was first put into use), ``no gratuitous
incompatibilities'' was the aim. I consider that aim met with more
success than most users expect or are aware of. For details see
chapter 18 (``Compatibility'') of the ARM or the paper I wrote for
``The C++ Report'' with Andrew Koenig: ``C++ - As close to C but
no closer.'' If you want 100% C compatibility use a C compiler.
If you are willing to remove anachronisms from your code (typically
just adding function prototypes and removing C++ keywords used as
identifiers will do the job) you can try C++. If you have a C++
compiler that carefully implements the language as defined by
the ARM I suspect you'll be pleasantly surprised with the degree
of compatibility.

Finally, we have yet to see an example of ``design by committee''
in X3J16. Templates and exception handling were designed by me
(helped by Andrew Koenig in the case of exception handling).
Naturally, this does not guarantee good design, but it is not
design by committee.

rms@AI.MIT.EDU (Richard Stallman) (11/19/90)

    I am not aware of having ``change(d) the behavior of struct and union,
    breaking almost all C programs.'' In fact, nothing C++ does to structs
    or unions breaks any C programs. I suspect rms is misinformed.

My understanding is that the following declaration

   struct foo { int a, b; };

is handled by C++ in a way incompatible with C.  C++ defines `foo' as
a typedef, but C does not.

Is this not so?  I would be glad if it is not.  But if it is, the
incompatibility breaks many C programs because they use the same
symbol both as a structure tag and as a variable (often a variable
whose type is or points to that structure).

The grammar ambiguity shows up in

   int (x);

which could either declare `x' as an integer or convert it to one.

    I think that my design choices in this area traded
    implementor convenience for user convenience.

I don't see that they help the implementors enough to compensate for
the inconvenience to the users.

Users switching from C to C++ wish that C++ were a superset of C.
There is no fundamental or important reason it is not.  The real
features of C++ are upward compatible with C.  The problems are all
superficial.

These problems are insoluble today, but would have been so easy to
avoid at the start.  For example: make `class' do a typedef
automatically but not `struct' or `union'.  Don't use the syntax TYPE
(EXP)--use something like (TYPE) [ EXP ] instead.

An incompatibility that could have been avoided at such small cost
must be considered gratuitous.  It seems that Stroustrup decided
arbitrarily to reject upward compatibility as a goal, and thus
accepted incompatibilities in the absence of any need.

bs@alice.att.com (Bjarne Stroustrup) (11/20/90)

rms@AI.MIT.EDU (Richard Stallman @ Gatewayed from the GNU Project mailing
list help-g++@prep.ai.mit.edu) writes in response to a note by me:

 >     I am not aware of having ``change(d) the behavior of struct and union,
 >     breaking almost all C programs.'' In fact, nothing C++ does to structs
 >     or unions breaks any C programs. I suspect rms is misinformed.
 > 
 > My understanding is that the following declaration
 > 
 >    struct foo { int a, b; };
 > 
 > is handled by C++ in a way incompatible with C.  C++ defines `foo' as
 > a typedef, but C does not.
 > 
 > Is this not so?  I would be glad if it is not.  But if it is, the
 > incompatibility breaks many C programs because they use the same
 > symbol both as a structure tag and as a variable (often a variable
 > whose type is or points to that structure).

You can be happy. What you described is not the whole story. C++ does
indeed allow you to use the name of a user-defined type (structure tag)
without a prefix. However, for C compatibility names can be declared to
refer to both a type and an object and the compiler can resolve this
overloading.

For example:

	struct s { int m; } s;

in both C and C++ declares a struct and an object both named `s' and
in both languages you can use the prefix `struct' to disambiguate:

	s a;		/* error `s' is an object of struct `s' */
	struct s b;	/* ok */
	s.m = 7;	/* ok */

Similarly, the `struct stat' `stat()' ambiguity is handled compatibly.

I consider this very ugly, it does complicate compilers, and most
experienced C++ users would have preferred this ugly compatibility hack
not to be required of their compilers. My original C++ compiler,
cfront, always supported this.
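
The `struct stat' / `stat()' case can be sketched in the same way. This
is only an illustration, assuming any standard C++ compiler; the names
`record', `fill_and_read' and the field `size' are hypothetical stand-ins
for the POSIX pair, not anything from cfront or the ARM:

```cpp
#include <cassert>

// `record` is both a struct tag and a function name in the same scope,
// mirroring the `struct stat` / `stat()` pair.
struct record { int size; };

// Once this function is declared, the plain name `record` refers to it.
int record(struct record* r) {
    r->size = 42;
    return 0;
}

int fill_and_read() {
    struct record r;          // the `struct` prefix selects the type
    int status = record(&r);  // the plain name selects the function
    assert(status == 0);
    return r.size;
}
```

Writing `record r;' inside `fill_and_read' would be rejected, because the
unprefixed name now refers to the function, just as it would in C.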

There is one rare case where C++ is not C compatible. If the name of a
function or an object is hidden by a local struct name the global name
must be explicitly qualified to be accessible. By default, the local name
is used:

	int s;

	void f() {
		struct s { int m; };
		struct s a;	/* ok */
		s++;		/* ok in C, error in C++ */
		::s++;
	}

I consider this a small price to pay for being able to use type names
without prefixes.

 > The grammar ambiguity shows up in
 > 
 >    int (x);
 > 
 > which could either declare `x' as an integer or convert it to one.

Not in C++. Again, C++ is defined to follow C in such cases. `int (x);'
declares `x' to be an `int.'

A quick look in Ellis & Stroustrup: ``The Annotated C++ Reference Manual''
(Addison Wesley 1990) would have told you that. The reference manual
proper from that book is the base document for the ANSI standardization
of C++.

 >     I think that my design choices in this area traded
 >     implementor convenience for user convenience.
 > 
 > I don't see that they help the implementors enough to compensate for
 > the inconvenience to the users.

I think you misread. The design gives users convenience and the
implementors more work; not the other way around. Ideally things
should be designed for the convenience of both users and implementors,
but where a choice must be made I favor users.

 > Users switching from C to C++ wish that C++ were a superset of C.
 > There is no fundamental or important reason it is not.  The real
 > features of C++ are upward compatible with C.  The problems are all
 > superficial.

SOME users switching from C to C++ wish that C++ were a superset of C.

MOST users with even a minor amount of C++ experience realize that the
inconveniences caused by the few remaining incompatibilities are minor
and that full ANSI C compatibility would not only invalidate the large
existing body of C++ code but would seriously damage C++'s type system.
For example, C++ requires function prototypes rather than having them
optional like ANSI C.

Please realize that for many C++ users even the current degree of C
compatibility is a burden on everyday use and learning (not all C++
programmers used to be C programmers).

I consider the number of C++/C incompatibilities quite amazingly small
given the difference in aims of the two languages and the number of
concepts supported by C++ that have no counterpart in C. I think that
C++'s stated goal of `no gratuitous incompatibilities' or `as close
to C as possible - but no closer' has been met to a larger degree
than anyone would have had the right to expect.

It is a fact though, that some compiler writers have taken liberties
with the language definition and in particular have been uninterested
in the details of C compatibility. These problems are not very common
any more and with the ANSI C++ committee working there is less excuse
for divergent implementations. I'd like to encourage users to encourage
their compiler suppliers to implement the language as currently defined
(including these C compatibility hacks). These days users do have a
choice of C++ suppliers and need not be satisfied with half-measures.

 > These problems are insoluble today, but would have been so easy to
 > avoid at the start.  For example: make `class' do a typedef
 > automatically but not `struct' or `union'.  Don't use the syntax TYPE
 > (EXP)--use something like (TYPE) [ EXP ] instead.

Yes, it would indeed have been nice if I had possessed 20/20 foresight,
but I didn't and the problem is not completely trivial. Consider:

	complex z2 = z1+complex(PI,2.8);	// C++

	complex z2 = z1+(complex)[PI,2.8];	// suggestion

I suspect I would have had a hard time convincing users to adopt the
latter syntax. C's syntax is already so contorted that additions
are harder to make than one would like.
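
To make the first form concrete, here is a minimal sketch of why the
TYPE(EXP, EXP) notation is useful: it constructs a value from several
arguments, which a C-style cast cannot express. The `complex' type,
`shifted' and `demo' below are hypothetical illustrations written for
this purpose, not the library class of the time:

```cpp
#include <cassert>

// Hypothetical minimal complex type, for illustration only.
struct complex {
    double re, im;
    complex(double r, double i) : re(r), im(i) {}
};

complex operator+(const complex& a, const complex& b) {
    return complex(a.re + b.re, a.im + b.im);
}

const double PI = 3.14159265358979;

complex shifted(const complex& z1) {
    // Function-style notation: builds a temporary from two arguments.
    complex z2 = z1 + complex(PI, 2.8);
    return z2;
}

void demo() {
    complex z = shifted(complex(0.0, 0.0));
    assert(z.re == PI);
    assert(z.im == 2.8);
}
```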

There have been many suggestions of how to make C++ an exact superset
of C by segregating the ++ features from the C features by syntactic
means. I consider such notions short-sighted. There are too many languages
that do not possess a single syntax or a single type system and are
thus simply a collection of semi-random features without even the
semblance of unity. C++ does not belong to that class of languages.
Its differences from C are not arbitrary but follow directly from
that view. 

 > An incompatibility that could have been avoided at such small cost
 > must be considered gratuitous.  It seems that Stroustrup decided
 > arbitrarily to reject upward compatibility as a goal, and thus
 > accepted incompatibilities in the absence of any need.

I beg to differ.

I also think you ought to read the ARM or at least consult with a C++
expert before you start making statements with phrases such as `must
be considered gratuitous,' `in the absence of any need,' and `spirit
of careless incompatibility' in the discussion of my work. As someone
who has made himself into a public figure you have the obligation
to refrain from random flaming.

I don't see how you could possibly have the prerogative of telling me
what my aims ought to be, and especially not what they ought to have
been 10 years ago. If you nevertheless do, please at least try to get
your facts straight.

Note also that many C++ programmers resent being told by non-C++
programmers what their language ought to be. I understand that
sentiment well.

henry@zoo.toronto.edu (Henry Spencer) (11/20/90)

In article <11635@alice.att.com> bs@alice.att.com (Bjarne Stroustrup) writes:
>X3J11 did indeed show restraint in what they did to C and I hope X3J16
>will show similar restraint. It should be remembered that X3J11 did do
>a few interesting things to C. Notably it added `const,' prototypes
>(both adapted from C++), locale in libraries, wchar_t, the rules for
>unsigned arithmetic, etc. (in such cases C++ has followed ANSI C).

My point was not so much that X3J11 didn't do anything to C, but that they
didn't do very much that *some* existing implementation hadn't already done.
The unsigned-arithmetic rules, for example, existed in production compilers
well before X3J11 adopted them.  (There is a common syndrome in which X3J11
gets blamed for "inventing" anything that wasn't in the Unix C compilers,
but there have been other C compilers in the world for over a decade now.)
Prototypes and `const' were a borderline case, but early versions of C++
were pretty close to C and I think they were right in deciding that the
experience could be read across.  I believe the locale stuff existed at
HP.  wchar_t I am not sure about.  But overall, almost everything in ANSI C
is based on prior experience with C or C derivatives *somewhere*, and the
few things that were invented out of whole cloth -- notably `noalias' and
trigraphs -- were conspicuously poor ideas (in the case of `noalias',
sufficiently so to get it removed).

Although I'm not as happy with C++ as I used to be, I have little quarrel
with the idea of standardizing things that have been implemented and are
used successfully **in C++ or a derivative**.  I might think that feature
X is a dumb idea, but that's another question.  The numbers Bjarne cites
make it clear, in particular, that templates are well understood and are
appropriate for consideration by X3J16.

What concerns me is exception handling.  The article I was originally
responding to had a distinct air of "we're going to standardize something
like Bjarne's proposal; if we don't like it, we'll tinker with it until
we do".  This is why I hoisted storm warnings. :-)  I'm not convinced
that this feature has had enough implementation and use in C++ to be a
legitimate candidate for standardization at all -- confidence in the people
who designed it is not a substitute for real experience -- and committee
modifications to it are an invitation to disaster.  Storm warnings don't
mean that your house *will* blow down; they just denote cause for concern.
-- 
"I don't *want* to be normal!"         | Henry Spencer at U of Toronto Zoology
"Not to worry."                        |  henry@zoo.toronto.edu   utzoo!henry

rwk@CS.UTAH.EDU (Richard W. Kreutzer) (11/20/90)

I am not sure, but I think the handling of enum tags may be related to the
problem you (rms) describe concerning struct tags.  The following code will
not compile using g++-1.37.1.

class foo {
private:
  enum Number {ONE = 1, TWO};
  static const NUM = ONE;
public:
  int test();
  };

typedef enum {THREE = 3, FOUR} Number;

int foo::test() {
  Number num = ONE;
  printf("test: num=%d\n", num);
  printf("test: NUM=%d\n", NUM);
}

const NUM = 2;

main() {
  Number num = THREE;
  foo a;
  a.test();
}  


G++ says the typedef for Number is a redefinition.
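
For comparison, here is a sketch of how the same names resolve under the
scoping rules of later standard C++, where an enum declared inside a
class is local to that class. It is an adaptation of the code above and
not a statement about what g++-1.37.1 should have done: `int' is written
explicitly, the printf calls are replaced by return values, and
`global_view' and `demo' are hypothetical helpers:

```cpp
#include <cassert>

class foo {
private:
    enum Number { ONE = 1, TWO };   // foo::Number, local to the class
    static const int NUM = ONE;     // foo::NUM
public:
    int test() const;
};

typedef enum { THREE = 3, FOUR } Number;   // ::Number, a distinct type

const int NUM = 2;                          // ::NUM

int foo::test() const {
    Number num = ONE;        // member scope: foo::Number
    return num * 10 + NUM;   // foo::NUM is 1, so this is 11
}

int global_view() {
    Number num = THREE;      // namespace scope: ::Number
    return num * 10 + NUM;   // ::NUM is 2, so this is 32
}

void demo() {
    foo a;
    assert(a.test() == 11);
    assert(global_view() == 32);
}
```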

--

        Richard (Dick) W. Kreutzer
UUCP:   cs.utah.edu!olyis!rwk
Mail:   Olympus Software, Inc.; 1333 E 9400 S; Sandy, UT 84093 (USA)
Phone:  +1 801 572 1610

grunwald@foobar.colorado.edu (Dirk Grunwald) (11/20/90)

>>>>> On 19 Nov 90 19:51:28 GMT, henry@zoo.toronto.edu (Henry Spencer) said:
	...
HS> What concerns me is exception handling.  The article I was originally
HS> responding to had a distinct air of "we're going to standardize something
HS> like Bjarne's proposal; if we don't like it, we'll tinker with it until
HS> we do".  This is why I hoisted storm warnings. :-)  I'm not convinced
HS> that this feature has had enough implementation and use in C++ to be a
HS> legitimate candidate for standardization at all -- confidence in the people
HS> who designed it is not a substitute for real experience -- and committee
HS> modifications to it are an invitation to disaster.  Storm warnings don't
HS> mean that your house *will* blow down; they just denote cause for concern.
HS> -- 
--

However, one would assume that other languages (Clu, Ada, Lisp, Mesa,
Modula-2+ and Modula-3) have had exception handling, some of them for
years. Certainly the design rationale of those languages could be applied
to C++?

ark@alice.att.com (Andrew Koenig) (11/20/90)

In article <9011190326.AA13113@mole.ai.mit.edu> rms@AI.MIT.EDU (Richard Stallman) writes:

>     I am not aware of having ``change(d) the behavior of struct and union,
>     breaking almost all C programs.'' In fact, nothing C++ does to structs
>     or unions breaks any C programs. I suspect rms is misinformed.

> My understanding is that the following declaration

>    struct foo { int a, b; };

> is handled by C++ in a way incompatible with C.  C++ defines `foo' as
> a typedef, but C does not.

> Is this not so?  I would be glad if it is not.  But if it is, the
> incompatibility breaks many C programs because they use the same
> symbol both as a structure tag and as a variable (often a variable
> whose type is or points to that structure).

But C++ allows the same identifier to be used as a type name and a
variable name in such contexts, precisely for the purpose of preserving
C compatibility.  Of course, if you do recycle type names that way,
you must use `struct' or `class' before each occasion in which you want
to use the name as a type, just as you do in C.

> The grammar ambiguity shows up in

>    int (x);

> which could either declare `x' as an integer or convert it to one.

and which is therefore supposed to be interpreted as a declaration
in C++.  The rule is `if it looks like a declaration, it is.'
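
A small sketch of that rule, assuming any standard C++ compiler (the
function name `declared_value' is made up for the example, and some
compilers may warn about the redundant parentheses):

```cpp
#include <cassert>

int declared_value() {
    int (x);            // looks like a declaration, so it is one: `int x;`
    x = 7;              // x is therefore a new local variable

    int y = int(3.9);   // in an expression, int(...) is a conversion
    assert(y == 3);     // conversion to int truncates toward zero

    return x + y;
}
```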

>     I think that my design choices in this area traded
>     implementor convenience for user convenience.

> I don't see that they help the implementors enough to compensate for
> the inconvenience to the users.

It's the other way around: C++ gave up implementer convenience
to gain user convenience.

> Users switching from C to C++ wish that C++ were a superset of C.

Every C++ user I know *likes* the ability to use structure names
as types without further formality.

> There is no fundamental or important reason it is not.  The real
> features of C++ are upward compatible with C.  The problems are all
> superficial.

Actually, the most significant incompatibility between C++ and
ANSI C is that C++ treats

	extern int f();

as being equivalent to

	extern int f(void);

whereas ANSI C treats it as

	extern int f(...);

However, I believe that that treatment on the part of ANSI C is
stated to be an anachronism.

It is clear that the old treatment opens a hole in the type system
through which one could drive a moving van.

In practice, it is closing this hole that takes the most time when
converting large C programs to C++.  Doing so also usually discovers
serious errors in the programs being converted, even when they
have appeared to work in the past.
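
The C++ side of this can be checked mechanically. The sketch below
assumes a C++11 or later compiler (for `static_assert' and `decltype',
which did not exist in 1990); `g', `call_both' and `demo' are
hypothetical names added for the illustration:

```cpp
#include <cassert>
#include <type_traits>

int f();        // in C++: a function taking no arguments
int g(void);    // the explicit spelling of the same thing

// Both declarations have the identical function type int().
static_assert(std::is_same<decltype(f), decltype(g)>::value,
              "f() and g(void) have the same type in C++");

int f() { return 1; }
int g(void) { return 2; }

int call_both() {
    // f(1, 2);   // rejected by a C++ compiler: too many arguments
    return f() + g();
}

void demo() {
    assert(call_both() == 3);
}
```

A mismatched call such as the commented-out one is the kind of error
that surfaces when a large C program is first compiled as C++.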
-- 
				--Andrew Koenig
				  ark@europa.att.com

bs@alice.att.com (Bjarne Stroustrup) (11/20/90)

Henry Spencer at U of Toronto Zoology writes:

 > What concerns me is exception handling.

That is indeed a legitimate cause for concern. I share it up to a point.
It is the least tried aspect of the language as currently defined and
the one we ought to worry the most about.

We built partly on experience from the Clu, Modula2+, Modula3 sequence
of languages, partly on experience with faking exceptions in C++, and
partly on an experimental C++ with exception handling implementation
at Sun. So even though we don't have as much experience as we would
like we are not without practical experience with exception handling
in the context of C++.

There was another concern: There was a strong push from some users and
from some implementors: Agree on something NOW or else we will go our
own way. Caught between a rock and a hard place it seemed most sensible
for us all to agree to do the same thing. Anything else would guarantee
divergence.

I would have preferred a couple of years to experiment with exception
handling on a larger scale, but you can't always get exactly what you
want. People were NOT willing to wait with standardization and people
were NOT willing to define C++ without exception handling.

henry@zoo.toronto.edu (Henry Spencer) (11/20/90)

In article <30021@boulder.Colorado.EDU> grunwald@foobar.colorado.edu writes:
>HS> What concerns me is exception handling...  I'm not convinced
>HS> that this feature has had enough implementation and use in C++ to be a
>HS> legitimate candidate for standardization at all ...
>
>However, one would assume that other languages (Clu, Ada, Lisp, Mesa,
>Modula-2+ and Modula-3) have had exception handling, some of them for
>years. Certainly the design rationale of those languages could be applied
>to C++?

The design philosophy, yes.  But *details are critical* in language design.
This is why it is so important to have *actual experience* with a feature,
in the language in question or a very close relative.  There is a big
difference between being able to say "something somewhat along these lines
ought to be workable, if we can get the details right" and being able to
say "we've proved that this exact design works".
-- 
"I don't *want* to be normal!"         | Henry Spencer at U of Toronto Zoology
"Not to worry."                        |  henry@zoo.toronto.edu   utzoo!henry

philip@pescadero.Stanford.EDU (Philip Machanick) (11/21/90)

In article <11645@alice.att.com>, ark@alice.att.com (Andrew Koenig) writes:
|> Actually, the most significant incompatibility between C++ and
|> ANSI C is that C++ treats
|> 
|> 	extern int f();
|> 
|> as being equivalent to
|> 
|> 	extern int f(void);
|> 
|> whereas ANSI C treats it as
|> 
|> 	extern int f(...);
|> 
|> However, I believe that that treatment on the part of ANSI C is
|> stated to be an anachronism.
|> 
|> It is clear that the old treatment opens a hole in the type system
|> through which one could drive a moving van.
|> 
|> In practice, it is closing this hole that takes the most time when
|> converting large C programs to C++.  Doing so also usually discovers
|> serious errors in the programs being converted, even when they
|> have appeared to work in the past.
I second this. Even with programs only a few hundred lines long (one
of which is in widespread use), I've discovered errors when converting
from C to C++. Sometimes, it's worth breaking a few programs to improve a
language - the effort required to do the repairs pays off in cutting the
number of bugs to be searched for. (None of this of course is an argument
for gratuitous changes.)
-- 
Philip Machanick
philip@pescadero.stanford.edu

jimad@microsoft.UUCP (Jim ADCOCK) (11/27/90)

Well, just to represent the other end of the spectrum, *my* concerns are
that C++ is being sacrificed too much on an altar of ANSI-C compatibility.
I believe a couple of years from now no one is going to give a fig about X3J11,
but rather are going to be cursing all these weird little "features"
of X3J16 asking: "where the hell did *that* come from."  -- The answer
being that "that" came from striving for the last iota of backwards
compatibility.

*I'd* rather see C++ cleaned up a little bit, make some of the rules a
little simpler rather than stretching the rules as far as possible in order
to encompass as many old C and C++ programs as possible.  Old programs
can be handled via a compiler switch -- even if it does make the compiler
writer's job a little harder.

I think we should be looking less towards the past, and more towards the
future.  If one is not willing to give up on some backwards compatibility,
the language can only become messier and messier as new features are added.
Let's make C++ a good language in its own right, not the poor step-child of
ANSI-C.

[the opinions of a C++ user, not a compiler writer :-]

jimad@microsoft.UUCP (Jim ADCOCK) (11/29/90)

In article <1990Nov20.185345.21001@Neon.Stanford.EDU> philip@pescadero.stanford.edu writes:
|In article <11645@alice.att.com>, ark@alice.att.com (Andrew Koenig) writes:
||> whereas ANSI C treats it as
||> 
||> 	extern int f(...);
||> 
||> However, I believe that that treatment on the part of ANSI C is
||> stated to be an anachronism.
||> 
||> It is clear that the old treatment opens a hole in the type system
||> through which one could drive a moving van.
||> 
||> In practice, it is closing this hole that takes the most time when
||> converting large C programs to C++.  Doing so also usually discovers
||> serious errors in the programs being converted, even when they
||> have appeared to work in the past.

|I second this. Even with programs only a few hundred lines long (one
|of which is in widespread use), I've discovered errors when converting
|from C to C++. Sometimes, it's worth breaking a few programs to improve a
|language - the effort required to do the repairs pays off in cutting the
|number of bugs to be searched for. (None of this of course is an argument
|for gratuitous changes.)

I believe these examples and similar ones show that the complaint that
C++ is not _exactly_ ANSI-C is a red herring.  The real problem in
moving a program from one compiler to another is overcoming program
dependencies on implementation defined issues.  The practical
differences between ANSI-C and C++ are small compared to the number of
issues both leave implementation defined.  As pointed out above, at
least C++ provides a number of features that help detect these bugs
at compile time.  So my claim is that in general making some features
of C++ more type safe than ANSI-C actually helps in porting software from
one compiler to another -- even if there is an initial hurdle of small
changes one has to make in order to get old code to compile.  Once you
do get it to compile under C++, you have a much better chance of it
actually working.  It is more important to make C++ _strictly_ typed
and _object oriented_, than to make it _ansi-c oriented_.