[comp.lang.c++] Multiple inclusion of virtual tables...

kipp@warp.SGI.COM (Kipp Hickman) (03/18/89)

I implemented a simple hack in our cfront 1.2 compiler as follows to ``solve''
this problem at SGI...

In addition to +e0 and +e1, the compiler understands +e<name> where <name>
is the name of a class to generate a virtual table for.  During the print
phase of the compiler, it checks the name against the class name it is
printing the virtual tables for.  If the name matches, the virtual table
is written out global and defined.  Otherwise, the table is extern'd.

Given that, our Makefiles have something that looks like this:

    .c++.o:
	    $(C++) +e$* -c $<

And then all you have to do is name the source file that implements the
class the same as the class (i.e.  class Foo in file Foo.c++).
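
The matching rule described above can be modelled roughly as follows. This is only a sketch, not cfront source: the function name definesVtable and its parameters are hypothetical, and the treatment of +e0 (declare only) and +e1 (define) is inferred from how the thread describes those switches.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the +e<name> check; not actual cfront code.
// eArg is the text that follows "+e" on the command line.
// Returns true if the vtable for className should be written out as a
// global definition, false if it should only be declared extern.
bool definesVtable(const std::string& eArg, const std::string& className) {
    assert(!eArg.empty());          // this sketch only models an explicit +e option
    if (eArg == "1") return true;   // assumed: +e1 defines every vtable it sees
    if (eArg == "0") return false;  // assumed: +e0 only declares them
    return eArg == className;       // +e<name>: define only the matching class
}
```

With the Makefile rule above, compiling Foo.c++ passes +eFoo, so under this model definesVtable("Foo", "Foo") is true in that one file and false for every other class whose header it includes.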

A minor side effect is that developers know where to find the implementation
for a particular class without hunting...

happy hacking...

					kipp hickman
					kipp@sgi.com

grunwald@flute.cs.uiuc.edu (03/18/89)

I'd thought about +e<name> -- the problem being that
	(a) this requires changing the entire makefile
	(b) requires a very strict naming convention for files.

E.g., If I have class ``SimulationMultiplexor'' (I do),
then I need to have a file with the same name. That's not so bad,
except on SysV (which, thank god, I'm not using).

Using #pragma means you need to change one line in the Makefile
(the one defining C++FLAGS) and lets you use whatever file
naming convention you want.
--
Dirk Grunwald
Univ. of Illinois
grunwald@flute.cs.uiuc.edu

dld@F.GP.CS.CMU.EDU (David Detlefs) (03/21/89)

In article 2970, Dirk Grunwald suggests an alternative to the +e0/+e1
method of cutting down on the number of vtables in programs.  His
suggestion is that programmers insert

#pragma class instance Foo

in the .c file corresponding to the .o file in which the programmer
wishes the vtable (and bodies of inline functions) be defined.

I would like to suggest a slight variation on this theme that I've
implemented here at CMU.  It's much like the above, except that at
class declaration time I choose a distinguished non-inline member
function, and use the implementation of that distinguished member
function as a marker for where the vtable should be defined.  (I don't
at present insert bodies for virtual functions, but adding that should
be easy; similarly, if the semantics of the language were modified,
static member initialization could be done here as well.)  In my
implementation, I choose the first non-inline member-function name in
alphabetic order (being careful to use fully expanded names of
overloaded functions).
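
The selection rule just described might be sketched like this. The MemberFn record and the distinguishedMember function are hypothetical names invented for illustration, and the expanded names are treated as opaque strings; how cfront actually encodes them is not shown in this post.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one member function as seen at class declaration time.
struct MemberFn {
    std::string expandedName;  // fully expanded name, so overloads compare as distinct
    bool isInline;             // declared inline in the class declaration?
};

// Returns the alphabetically first non-inline member function name, or an
// empty string when the class has no non-inline member functions.
std::string distinguishedMember(const std::vector<MemberFn>& fns) {
    std::string best;
    for (const MemberFn& f : fns) {
        assert(!f.expandedName.empty());  // every member needs a name
        if (f.isInline) continue;         // inline functions cannot mark a single .o
        if (best.empty() || f.expandedName < best) best = f.expandedName;
    }
    return best;
}
```

The file that defines the returned function would then also receive the vtable definition. An empty result corresponds to the case where the scheme has nothing to anchor on.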

Pro: Just as with Mr. Grunwald's suggestion, each class gets exactly
one vtable.  Vtables for classes in libraries are in the .o's in the
.a, where common sense says that they belong.  The added value of my
scheme (which I'm sure has been thought of many times before by
others) is that in most cases, it does the right thing with absolutely no
programmer effort.

Cons: bad things about this are:
  1) it doesn't work for classes that have no non-inline functions,
  2) if a member function is not declared inline in the class
declaration, it may be chosen as the distinguished member function.
If an inline definition for that function is given later on in the .h
file, things break.
  3) you have to link your program with all .o files that contain
implementations of member functions for classes used in the program.

3) Is not a big restriction, I think.  Most people generally follow a
"one .h, one .c per class" protocol.  This is especially not a problem
for libraries, where all the .o's for a class will be in the .a.
It would be good if linkers could be modified to produce a suggestive
error message whenever a vtable is found to be undefined.

2) is easy to "solve."  Just change the language :-)  Seriously, the
only way around this I can think of would be to make it illegal to
include an inline implementation of a member function outside a class
declaration unless the member function had been declared inline in the
declaration.  That is

  class foo {
    int i;
   public:
    void bar();
  };

  inline void foo::bar() { i++; }

would be illegal, and would have to be fixed to say

  class foo {
    int i;
   public:
    inline void bar();   // <--- Note inline.
  };

  inline void foo::bar() { i++; }

This would require some modification of existing C++ code, but note
that the result would be equivalent to the code you started with.  You
could further reduce the impact by only making it an error for classes
with virtual functions.  I plan to further hack our local version of
cfront to produce this error message.

1) Can be solved by using "main" as the distinguished function for all
classes with nothing but inline functions.  This would require that
all .h files defining such classes be included by the .c file that
contains "main," in much the way that the .c file that is compiled
using +e1 must include all the .h files used by the program in the
+e0/+e1 paradigm.

To sum up, I think: +e0/+e1 is a hack (albeit a necessary one at the
time), so something like Mr. Grunwald's suggestion or the one I've
outlined above should probably be used instead.  In many ways Mr.
Grunwald's suggestion is cleaner, but this one seems to work in almost
all cases without anyone other than the compiler-writer having to
do anything.

Comments are of course welcome, and anyone who wants to use this
idea is of course free to do so (unless it's not original with me and
someone has copyrighted it or something :-).
--
Dave Detlefs			Any correlation between my employer's opinion
Carnegie-Mellon CS		and my own is statistical rather than causal,
dld@cs.cmu.edu			except in those cases where I have helped to
				form my employer's opinion.  (Null disclaimer.)

malcolmp@spinifex.eecs.unsw.oz (Malcolm Purvis) (03/23/89)

From article <GRUNWALD.89Mar17220936@flute.cs.uiuc.edu>, by grunwald@flute.cs.uiuc.edu:
>
>I'd thought about +e<name> -- the problem being that
>	(a) this requires changing the entire makefile
>	(b) requires a very strict naming convention for files.
>
>E.g., If I have class ``SimulationMultiplexor'' (I do),
>then I need to have a file with the same name. That's not so bad,
>except on SysV (which, thank god, I'm not using).
>
>Using #pragma means you need to change one line in the Makefile
>(the one defining C++FLAGS) and lets you use whatever file
>naming convention you want.

[My apologies if this topic has been discussed in depth before.]

	Couldn't you change the compiler to do the equivalent of +e<name>
automatically so you wouldn't have these problems?  You could put in rules
that say if when compiling a file it finds a constructor/destructor for a
class, then the vtbl for that class is defined in this file, otherwise it is
declared external.  This results in the vtbl being defined in one place
just as with +e<name>, but you could then name your files however you like,
and also you wouldn't have to change the Makefile.

	Of course the constructor/destructor rule is probably insufficient by
itself, especially if the class has all inline member functions (inline
virtual functions? How strange), but you should get the idea.
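
As a rough illustration of the rule proposed above, the per-file decision could look like the sketch below. The FnDef record and fileDefinesVtbl are hypothetical names, and skipping inline constructors and destructors is an added assumption, prompted by the all-inline caveat in the preceding paragraph.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical summary of one function definition found while compiling a file.
struct FnDef {
    std::string className;  // class the function belongs to
    bool isCtorOrDtor;      // is it a constructor or destructor?
    bool isInline;          // inline definitions can appear in many files
};

// True if, under the proposed rule, this file should define the vtbl for cls;
// false means the vtbl is only declared external here.
bool fileDefinesVtbl(const std::vector<FnDef>& defsInFile, const std::string& cls) {
    assert(!cls.empty());
    for (const FnDef& d : defsInFile) {
        // Assumption: only a non-inline constructor/destructor counts as a marker.
        if (d.className == cls && d.isCtorOrDtor && !d.isInline) return true;
    }
    return false;
}
```

One gap the sketch makes visible: if a class's constructor and destructor are defined in different files, both files answer true, so the rule would need a tie-breaker to yield exactly one vtbl.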

	Malcolm Purvis (malcolmp@{spectrum,spinifex}.eecs.unsw.oz)
		University of NSW, Sydney, Australia.

Honours: The students treat you like geniuses, the staff like postgrads, and
the administration like children........

grunwald@flute.cs.uiuc.edu (03/25/89)

In article <340@spinifex.eecs.unsw.oz> malcolmp@spinifex.eecs.unsw.oz (Malcolm Purvis) writes:
....	
	   Couldn't you change the compiler to do the equivalent of +e<name>
   automatically so you wouldn't have these problems?  You could put in rules
   that say if when compiling a file it finds a constructor/destructor for a

Yes, as others have noted, this would work. The more general solution is to
pick some arbitrary function as the ``anchor'' for the vtbl & virtual inlines.
I don't know if this would be more suitable or not. I think this might be a
better way to do it.

	(inline virtual functions? How strange)

Not that strange. Consider the Priority Queue classes in libg++. The
root class (call it PQ) defines the interface for subclasses using
virtual functions. Now consider subclass ``PairPQ'' and ``SplayPQ''.
If I do:
		SplayPQ pq;
		pq.enq(.....)

then I *know* that I'm calling ``SplayPQ::enq''. By that same light, if I say:
		SplayPQ *pq = new SplayPQ;
		pq -> enq(.....)

I still know which enq I'm talking about, but if I say:
		SplayPQ *pq = someOtherFunction();
		pq -> enq(....)

I no longer know the enq routine to use, so I must use the virtual function,
unless I can determine something about ``someOtherFunction()''.

In the first two cases, you get the generality of virtual functions and
class hierarchies without always having to pay the cost if you're using
a specific instance of a class lattice. This can be very important in
performance-constrained implementations.
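
A small self-contained sketch of the cases above follows. The PQ and SplayPQ classes here are hypothetical stand-ins that only count insertions; they are not the libg++ interfaces, and whether a given compiler actually binds the first call statically is up to that compiler.

```cpp
#include <cassert>

// Hypothetical stand-ins for the classes named above; not the libg++ API.
class PQ {
public:
    virtual ~PQ() {}
    virtual void enq(int item) = 0;
    virtual int length() const = 0;
};

class SplayPQ : public PQ {
    int count_;
public:
    SplayPQ() : count_(0) {}
    // Defined in the class body, so these are inline and virtual at once.
    void enq(int) { ++count_; }
    int length() const { return count_; }
};

// Exact type known: the compiler may call SplayPQ::enq directly and inline it.
int knownType() {
    SplayPQ pq;
    pq.enq(1);
    return pq.length();
}

// Pointer of unknown origin: it could refer to a subclass of SplayPQ,
// so the call generally has to go through the virtual table.
int unknownType(SplayPQ* pq) {
    assert(pq != 0);
    pq->enq(1);
    return pq->length();
}
```

Both functions produce the same result for a plain SplayPQ; the difference is only in how the call to enq can be bound, which is where the inline definition pays off.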


--
Dirk Grunwald
Univ. of Illinois
grunwald@flute.cs.uiuc.edu