[adsp.sw] Major problem with Comeau C++

bojsen@moria.UUCP (Per Bojsen) (06/27/91)

MAJOR C++ GRIEF ALERT!  READ THIS NOW!

I have found a serious problem with the AT&T cfront that Comeau C++ is based on!
This problem is associated with static members of classes, and thus affects
usage of the iostream library (when including iostream.h).

Consider a class definition:

        class Object
        {
        public:
          Object() { ObjectCount++; }
          ~Object() { ObjectCount--; }

        private:
          static int ObjectCount;
        };

All objects of class Object share the ObjectCount member.  How can this
behavior be represented in a C program (after all, cfront translates the
C++ source to C)?  By defining an external variable of type int to hold
the value of Object::ObjectCount.  This is precisely the source of the
problem.  Every module that uses (read: defines objects of) class Object
will need to reference this external variable.  But it needs to be
*defined* somewhere.

The problem is that cfront processes each source file individually and
thus is unable to determine which module the definition of the
Object::ObjectCount variable should be placed in;  it's either all or
none.  What does cfront do, then?  It actually *defines* the variable
in *all* modules using the class!  Unfortunately, this does not work
well with blink:  you get ``variable multiply defined'' errors.

A short digression on external variables as defined in the C reference
manual of K&R 2 is necessary at this point.

An external declaration of a variable in C is a declaration outside
functions.  If the declaration does not include the `static' specifier
the variable has external linkage (static external variables have
*internal* linkage).  An external declaration is a definition if it
has an initializer.  An external declaration that does not have an
initializer, and does not contain the `extern' specifier is a *tentative
definition*.

If a definition of a variable appears in the module, any tentative
definitions are treated merely as redundant declarations.  If no
definition appears in the module, all tentative definitions become a
single definition with initializer 0.
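
A small sketch of these rules inside one module may help.  It is plain
C (C++ has no tentative definitions), and the names counter, limit,
counter_value and limit_value are made up for illustration only:

```c
/* Tentative definitions within a single module (translation unit). */

int counter;         /* tentative definition: no initializer, no extern */
int counter;         /* a second tentative definition; still legal in C */
extern int counter;  /* plain declaration, never a definition */

int limit;           /* tentative definition ... */
int limit = 10;      /* ... made redundant by this real definition */

/* No real definition of counter appears in this module, so the
   tentative definitions collapse into one definition initialized to 0. */
int counter_value(void) { return counter; }

/* limit has a real definition, so its tentative one is just a
   redundant declaration. */
int limit_value(void) { return limit; }
```

Calling counter_value() yields 0 and limit_value() yields 10.  The
trouble described in this post only starts when the same tentative
definition shows up in more than one module.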

Each object (e.g., a variable) must have *exactly* one definition.  For
objects with external linkage this rule applies to the entire program.

Now, in one of the previous paragraphs I said that cfront will generate
definitions for the external variables representing static members of
classes in each module using these classes.  Well, actually cfront
generates *tentative definitions*.  But that violates the one-definition
rule of the preceding paragraph!

This seeming paradox is resolved under UNIX by an extension of the
treatment of tentative definitions.  In UNIX C *all* tentative
definitions of an externally-linked object, throughout all modules of
the program, are considered together.  As an example, consider a program
consisting of the following two modules:

        /**** Main.c ****/
        int ExternalVariable;

        main()
        {
          ExternalVariable++;
          module();
          printf("%d\n", ExternalVariable);
        }

        /**** Module.c ****/
        int ExternalVariable;

        module()
        {
          printf("%d\n", ExternalVariable);
          ExternalVariable++;
        }

Under UNIX both modules share a single variable ExternalVariable, while
under SAS/C we get a multiply-defined error during linking.  The UNIX
behavior is recognized as a common extension by the ANSI C standard
according to K&R 2 p. 227.

So, we now understand why cfront works under UNIX and why it doesn't
work with SAS/C and blink: cfront relies on a UNIX extension to the
C language.  Is there anything we can do about it under AmigaDOS?
I see several kludges and some solutions to the problem.  I'll list
the kludges first:

1) Use only one module in programs.

2) Only use classes with static members in *one* module each.

3) Compile the generated C source files with option -x, which will
   treat *all* external definitions as external declarations.  Make
   a special source file which includes all class definitions that
   use static members, as well as normal external (global) variables.
   Compile this file without the -x option.  The object module thus
   obtained should contain all *definitions* of C-objects with external
   linkage.

(1) is generally not acceptable for obvious reasons and (2) is not much
better.  (3) might be an acceptable interim solution but is not
acceptable in the long run.  The acceptability of this kludge depends
on how easy it is to find all the declarations one needs.  By refraining
from using global variables things will be somewhat easier, I think.
Incidentally, this is the solution Lattice suggested in the manual to
the old Lattice C++.
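
Written by hand, the arrangement that kludge (3) aims for looks roughly
like the sketch below: declarations everywhere, and a single definition
in one place.  The three parts are shown in one listing for brevity.
The header name globals.h and the function bump_twice are hypothetical,
and the listing assumes nothing about -x beyond the description above:

```c
/* ---- shared header, e.g. a hypothetical globals.h ---- */
extern int ExternalVariable;   /* declaration only, safe in every module */
void module(void);

/* ---- exactly one source file holds the definition ---- */
int ExternalVariable = 0;      /* the single definition in the program */

void module(void)
{
    ExternalVariable++;
}

/* ---- any other source file includes the header and uses it ---- */
int bump_twice(void)
{
    ExternalVariable++;        /* refers to the one shared object */
    module();
    return ExternalVariable;
}
```

With this layout no linker extension is needed, since only one module
ever defines the variable.  The cost is that somebody, or some tool,
has to decide which module that is.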

The solutions:

1) Create a program that will analyze all object modules of the program,
   extract all multiply defined objects and place them in a new object
   module, while removing them from the other object modules.

2) Create a program that will analyze all C source files generated by
   cfront for the program, extract all definitions of objects with
   external linkage and place them in a special C source file.  Compile
   the cfront generated C files with the -x option.  Compile the special
   C file without the -x option.

3) Change cfront to output the information gathered by the program in
   (2) to an auxiliary file.  Use this file to generate the external
   definitions C source file in (2) (this may be a trivial step if the
   information generated by cfront is C code).

4) Convince SAS/C to change their C compiler and blink to behave like
   UNIX C.

Solution (1) is possible but somewhat ugly since object modules may
have to be changed/patched.

The hardest part of (2) is the implementation of (parts of) a C
parser.  The program should output all external declarations and all
type definitions for every module in the program.  Since the output
of cfront does not need to be cpp'ed things are somewhat simpler.  The
parser can skip function definitions.

I don't think (3) is likely, and I don't know whether (4) is possible.
Solution (1) combined with the blink part of (4), i.e., integrating the
linker preprocessor in (1) into the linker, might be the best possible
solution, since it'll be completely transparent to the user.

Is there any linker for the Amiga that coalesces multiply defined
objects?

I guess this is all for now.  Any comments are welcome!

--
.------------------------------------------------------------------------------.
|  Greetings from Per Bojsen.                                                  |
+------------------------------+-----------------------------------------------+
|  EMail: cbmehq!lenler!bojsen | "Names do have power, after all, that of      |
|     Or: bojsen@dc.dth.dk     |  conjuring images of places we have not seen" |
`------------------------------+-----------------------------------------------'

bojsen@moria.UUCP (Per Bojsen) (06/27/91)

In part 1 I wrote:

> The solutions:
>
> 1) Create a program that will analyze all object modules of the program,
>    extract all multiply defined objects and place them in a new object
>    module, while removing them from the other object modules.
>
> [...]
>
> Solution (1) is possible but somewhat ugly since object modules may
> have to be changed/patched.
>
I have pondered this solution a bit.  I think it's the easiest of the
four I mentioned in my previous post.  The AmigaDOS object file format
is fairly simple to parse---a lot simpler than parsing C code.

I think there're at least six possible ways to implement (1):

a) Scan the object files for external definitions and put all these in
   a new definition module; remove the definitions from the main object
   files.

b) Like (a) but remove the space allocated for the defined objects
   that are moved to the definitions module.

c) Like (a) but only for external definitions that occur in more than
   one object module.

d) Like (c) but remove the space allocated for the defined objects
   that are moved to the definitions module.

e) Scan the object files for multiply defined external objects; pick
   some occurrence (e.g., the first) as the definition and remove the
   others.

f) Like (e) but remove the space allocated for the defined objects
   that are eliminated.

The last two are appealing because no extra object file is needed.  All
methods require that some or all object modules are patched and/or
otherwise changed.  (a) and (b) are the symmetric cases.  (b), (d), and
(f) are better than the others because they avoid duplicate allocation
of space.
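
The bookkeeping behind (e) can be sketched as follows.  This is only a
model of the decision step, not of object file parsing: struct ext_def
and coalesce_first_wins are hypothetical names, and a real tool would
fill the table from the AmigaDOS object modules and then patch them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record of one external definition found
   while scanning the object modules of a program. */
struct ext_def {
    const char *module;   /* object module the definition came from */
    const char *symbol;   /* name of the external object */
    int keep;             /* 1 = surviving definition, 0 = to be removed */
};

/* Strategy (e): the first definition of each symbol wins and every
   later one is marked for removal.  Returns how many were dropped. */
size_t coalesce_first_wins(struct ext_def *defs, size_t n)
{
    size_t dropped = 0;

    for (size_t i = 0; i < n; i++) {
        defs[i].keep = 1;
        for (size_t j = 0; j < i; j++) {
            if (strcmp(defs[j].symbol, defs[i].symbol) == 0) {
                defs[i].keep = 0;   /* an earlier module defines it */
                dropped++;
                break;
            }
        }
    }
    return dropped;
}
```

For the two-module example from part 1, the entry from the first module
scanned would be kept and the other dropped.  Variant (f) would also
have to reclaim the space of each dropped entry, which this sketch does
not attempt.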

Should I decide to do the program I will probably implement (f) unless
there are good arguments for implementing some of the others.  Before I
start programming: does anyone know of a tool like this one?

Note that this program is a frontend to the linker, and that it could
also be extended to do the magic that the stid program currently does
(i.e., generating the calls to the constructors and destructors of
external class objects).

Ideally, the functionality of this program should be built into blink,
I think.

What do other users of Comeau C++ (and the old Lattice C++) do to
cope with this problem?  Does anyone know if Comeau Computing is working
on the problem?

If I get the time to do the program I'll make it freely distributable,
allowing the C++ vendors to distribute the program with their C++ packages
if they're interested.

--
.------------------------------------------------------------------------------.
|  Greetings from Per Bojsen.                                                  |
+------------------------------+-----------------------------------------------+
|  EMail: cbmehq!lenler!bojsen | "Names do have power, after all, that of      |
|     Or: bojsen@dc.dth.dk     |  conjuring images of places we have not seen" |
`------------------------------+-----------------------------------------------'



ricwe@ida.liu.se (Rickard Westman) (06/29/91)

bojsen@moria.UUCP (Per Bojsen) writes:

>I have found a serious problem with the AT&T cfront Comeau C++ is based on!
>This problem is associated with static members of classes, and thus affects
>usage of the iostream library (when including iostream.h).

>Consider a class definition:

>        class Object
>        {
>        public:
>          Object() { ObjectCount++; }
>          ~Object() { ObjectCount--; }

>        private:
>          static int ObjectCount;
>        };

>All objects of class Object share the ObjectCount member.  How can this
>behavior be represented in a C program (after all, cfront translates the
>C++ source to C)?  By defining an external variable of type int to hold
>the value of Object::ObjectCount.  This is precisely the source of the
>problem.  Every module that uses (read: defines objects of) class Object
>will need to reference this external variable.  But it needs to be
>*defined* somewhere.

Yes, and *you* need to do that, yourself.  Define ObjectCount at file
scope, as 'int Object::ObjectCount = 0;', in exactly one of the source
files that use the Object class.  The class declaration itself serves
as the declaration everywhere else.

In 'The Annotated C++ Reference Manual' (Ellis & Stroustrup), the
practice of allowing the definition of a static class member to be
omitted is described as an anachronism.  It does not have to be
provided by a correct C++ implementation:

"The declaration of a static data member in its class declaration is
 *not* a definition.  A definition is required elsewhere; see also
 paragraph 18.3"  (paragraph 18.3 describes anachronisms)

Clearly, Cfront implements this anachronism, but in a way that doesn't
work well with SAS/Lattice C.  You might just as well consider this
'feature' unsupported and adhere to the current language definition.

--
Rickard Westman, University of Linkoping, Sweden        ricwe@ida.liu.se