schuh@whey.cs.wisc.edu (Dan Schuh) (03/08/91)
I was originally going to call this Multiple Inheritance Considered Harmful, but it's really just a bug report. What I'm writing about is a series of tests I did of multiple inheritance and mixing virtual and non-virtual base classes, under both AT&T C++ and g++. This came up because my work requires pretty detailed knowledge of the internals of the multiple inheritance implementation in cfront; once I started testing my modifications, I discovered the base I was working on was a little wobbly, to put it mildly.

To try to get a handle on what did and didn't work, I came up with the following simple multiple inheritance hierarchy. Schematically, the hierarchy looks like the following lattice:

        d1   a
          \ /  - va
      d2   b
        \ /  - vb
    d3   c
      \ /  - vc
  d4   d
    \ /  - vd
     e

where each of the links va, vb, vc, vd indicates either virtual or non-virtual inheritance. This hierarchy uses 4 levels of multiple inheritance; classes d1-d4 are simple base classes that are used to force each level of class derivation to use multiple inheritance but are otherwise not used in the test. The 4 optional virtual attributes on the a-b-c-d-e inheritance chain lead to 16 combinations of virtual and non-virtual inheritance on the secondary derivation chain.

In C++, the declarations of the classes a, b, c, d, and e look approximately like the following. Note all base classes are public, as indicated by the struct keyword.

    struct a { int i; virtual void f(int *); };
    struct b : d1, [virtual] a { void f(int *); };
    struct c : d2, [virtual] b { void f(int *); };
    struct d : d3, [virtual] c { void f(int *); };
    struct e : d4, [virtual] d { void f(int *); };

Each of the classes a, b, c, d, and e defines one member function which tests the address of a member variable of the base class a against an address passed in as a parameter, and then passes its idea of the address of i to the next member function up the chain.
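
To make the property under test concrete: class a appears only once on the chain, so an object of class e contains a single a subobject, and &i should be the same address no matter which class in the chain the object is viewed as. The following is a minimal sketch of that check for one fixed configuration (all four bases virtual, test 15 in the numbering used later), written for a current C++ compiler. It is not the test source from this post: it uses reference conversions instead of chained f() calls, and the helper name count_mismatches is made up for illustration.

```cpp
// Fixed configuration: a, b, c and d are all virtual bases.
struct d1 { int d1i; };
struct d2 {};
struct d3 {};
struct d4 {};

struct a { int i; };
struct b : d1, virtual a { int i2; };
struct c : d2, virtual b { int i3; };
struct d : d3, virtual c { int i4; };
struct e : d4, virtual d { int i5; };

// Hypothetical helper: view the same object as each class on the chain and
// count the views whose &i differs from the address seen through class e.
int count_mismatches(e& obj) {
    int* expected = &obj.i;

    d& as_d = obj;   // derived-to-base conversions, one per level
    c& as_c = obj;
    b& as_b = obj;
    a& as_a = obj;

    int bad = 0;
    if (&as_d.i != expected) ++bad;
    if (&as_c.i != expected) ++bad;
    if (&as_b.i != expected) ++bad;
    if (&as_a.i != expected) ++bad;
    return bad;
}
```

With a conforming compiler, count_mismatches should return 0 for any e object, both before and after one e object is assigned to another.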
I then tested the resulting code by checking the address of class a's public member field i in member functions of each class (a, b, c, d, e), and by assigning between 2 objects of class e and again testing the address of a's member in each class on the right side inheritance chain. The results of these tests were, well, enlightening.

I ran these tests with 3 compilers: AT&T C++ (cfront) 2.0, running on a decstation 3100; Object Design's OSCC, part of their Objectstore product, running on a sun 4/110 with sunos 4.1; and g++ 1.37.2, also on a decstation 3100.

AT&T C++ 2.0 and g++ each handled 5 of the 16 cases correctly, to the point of reporting consistent addresses for the member field i in the member function of each of the 5 classes a, b, c, d, and e. The only case that both compilers handled correctly was when the entire a-b-c-d chain was declared virtual.

For C++ 2.0, of the 11 cases not handled, 4 produced "Sorry, not implemented" messages from cfront, 5 caused cfront to generate illegal C code, and 2 produced compilable C code that gave incorrect results. We don't have cfront 2.1, but we have a test copy of Object Design's Objectstore C++ compiler, which I believe is based on cfront 2.1; the only difference between that compiler and AT&T C++ 2.0 is that the 5 cases that produce illegal C for 2.0 produced compilable C that gave incorrect results.

Of the 11 cases not handled by g++, 4 resulted in runnable executables that produced incorrect results and 7 resulted in executables that terminated abnormally with core dumps.

The results I got are summarized below. The source code used to run these tests is included at the end of this msg.

test #  virtual bases  AT&T 2.0    Obj.Design 2.1  g++ 1.37.2
   0    none           ok          ok              incorrect
   1       D           ok          ok              incorrect
   2      C            incorrect   incorrect       incorrect
   3      CD           ok          ok              incorrect
   4     B             not impl.   not impl.       core dump
   5     B D           bad C code  incorrect       core dump
   6     BC            bad C code  incorrect       core dump
   7     BCD           ok          ok              core dump
   8    A              not impl.   not impl.       core dump
   9    A  D           not impl.   not impl.       core dump
  10    A C            bad C code  incorrect       ok
  11    A CD           incorrect   incorrect       ok
  12    AB             not impl.   not impl.       core dump
  13    AB D           bad C code  incorrect       ok
  14    ABC            bad C code  incorrect       ok
  15    ABCD           ok          ok              ok

ok         -> the test was compiled and executed; no errors detected.
incorrect  -> compiler produced a runnable executable that produced
              incorrect results. The member variable i had an
              inconsistent address in some context.
bad C code -> cfront produced C that was not accepted by ultrix (mips) cc.
not impl.  -> not implemented; cfront produced a "Sorry, Not Implemented"
              message.
core dump  -> g++ produced an executable that had some kind of run time
              fault.

Just to throw rocks at somebody else, all of the cases that worked for Object Design C++ on the sun4 had to have the C output of cfront compiled by gcc; the sun-supplied cc failed with a compiler internal error for all but one of the 12 cases that didn't get the "Sorry not implemented" msg.

I realize these examples are pretty contrived and may not correspond to any real world examples, but they aren't really very complicated, and four levels of inheritance doesn't seem unrealistically deep. There are probably only a couple of bugs that produce most of the failures, and for all I know they may be fixed in later releases. I actually have a fairly precise idea of where some of the problems in cfront come from, and why the "Not Implemented" cases can't be handled; I'm not at all familiar with the internals of g++. It's also sort of depressing that g++ won't even handle the totally non-virtual case.

I'd be interested to know if any of the other C++ implementations out there do better. But given the somewhat messy semantics of MI and virtual/non-virtual bases and the possibility of bad code generation, it's probably best to stick to single inheritance if at all possible. MI may work ok in a shallow hierarchy, but it sure looks like trying to use base classes with a mix of virtual/non-virtual multiple inheritance is not horribly reliable.
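
As a reading aid for the "virtual bases" column of the results table: the test number is a 4-bit mask in which bit value 8 makes a virtual, 4 makes b virtual, 2 makes c virtual, and 1 makes d virtual. The sketch below spells out that decoding. The function name virtual_bases is hypothetical and is not part of the test source; it only reproduces the column for a given test number.

```cpp
#include <string>

// Hypothetical helper: return the letters of the virtual base classes for a
// test number, using the same bit assignment as the results table
// (8 -> A, 4 -> B, 2 -> C, 1 -> D). Only the low 4 bits are examined.
std::string virtual_bases(unsigned testn) {
    if ((testn & 0xF) == 0) {
        return "none";
    }
    std::string out;
    if (testn & 0x8) out += 'A';
    if (testn & 0x4) out += 'B';
    if (testn & 0x2) out += 'C';
    if (testn & 0x1) out += 'D';
    return out;
}
```

For example, test 10 is binary 1010, so a and c are the virtual bases, and test 6 is binary 0110, so b and c are.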
If there is something wrong with my understanding of MI that would make these test cases illegal C++, let me know; in that case the compilers would just seem to have problems with warnings.

Anyway, here is the code for these tests. The source is parameterized by the externally defined macro TESTN, which needs to be provided when the source is compiled. To recreate case 11 on a unix system running AT&T C++, for example,

    CC -DTESTN=11 mitest.c

=====================mitest.c cut here===============================
// mitest.c - test a variety of mixtures of multiple and virtual inheritance
// many combinations fail

// Class hierarchy:
//         d1   a
//           \ /  - va
//       d2   b
//         \ /  - vb
//     d3   c
//       \ /  - vc
//   d4   d
//     \ /  - vd
//      e
// va, vb, vc, and vd indicate if the respective base classes a, b, c, and d
// are declared virtual or not virtual. They are controlled by the constant
// TESTN, which should be defined as a flag via -DTESTN=<n> when the compiler
// is called.
// The low order bit of TESTN indicates if vd is virtual or
// not, the second bit controls vc, bit 3 controls vb, and bit 4 va,
// so, for example, if TESTN == 0, no base classes are virtual,
// if TESTN == 15 == binary 1111, all of a, b, c, and d are virtual base
// classes, and if TESTN == 10 == binary 1010, c and a are virtual bases

#if ((TESTN) & 0x1)
#define vd virtual
#else
#define vd
#endif

#if ((TESTN) & 0x2)
#define vc virtual
#else
#define vc
#endif

#if ((TESTN) & 0x4)
#define vb virtual
#else
#define vb
#endif

#if ((TESTN) & 0x8)
#define va virtual
#else
#define va
#endif

extern "C" printf(...);

int rc = 0;

struct d1 { int d1i; };
struct d2 {};
struct d3 {};
struct d4 {};

struct a {
    int i;
    virtual void f(int *);
};

struct b : d1, va a {
    int i2;
    void f(int *);
};

struct c : d2, vb b {
    int i3;
    void f(int *);
};

struct d : d3, vc c {
    int i4;
    void f(int *);
};

struct e : d4, vd d {
    int i5;
    void f(int *);
};

// each of the member functions a::f, b::f, c::f, d::f, e::f
// is passed one parameter p, which indicates the caller's idea
// of what the address of this->a::i should be.
// Each function compares this
// value to what it sees as &this->a::i and prints a msg if the
// values are not equal. Each function passes its idea of &this->a::i
// to its base class version of f, i.e., e::f calls d::f, d::f calls c::f,
// etc.

void a::f(int * p)
{
    if (p != &i)
    {
        printf("a::f: parameter %x != &this->i %x\n",p,&i);
        rc = -1;
    }
}

void b::f(int * p)
{
    if (p != &i)
    {
        printf("b::f: parameter %x != &this->i %x\n",p,&i);
        rc = -1;
    }
    a::f(&i);
}

void c::f(int * p)
{
    if (p != &i)
    {
        printf("c::f: parameter %x != &this->i %x\n",p,&i);
        rc = -1;
    }
    b::f(&i);
}

void d::f(int * p)
{
    if (p != &i)
    {
        printf("d::f: parameter %x != &this->i %x\n",p,&i);
        rc = -1;
    }
    c::f(&i);
}

void e::f(int * p)
{
    if (p != &i)
    {
        printf("e::f: parameter %x != &this->i %x\n",p,&i);
        rc = -1;
    }
    d::f(&i);
}

e evar;

main()
{
    // we use 2 objects of type e: the directly declared evar and the
    // dynamically allocated object pointed to by ept.
    e * ept = new e;

    // Save the initial values we get for &evar.a::i and &ept->a::i
    int * ept_iaddr = &ept->i;
    int * evar_iaddr = &evar.i;

    /* initial test of evar, *ept */
    evar.f(evar_iaddr);
    ept->f(ept_iaddr);

    evar = *ept;
    /* test after assignment */
    evar.f(evar_iaddr);
    ept->f(ept_iaddr);

    *ept = evar;
    /* test after 2nd assignment */
    evar.f(evar_iaddr);
    ept->f(ept_iaddr);

    if (rc != 0)
        printf("test %d failed\n",TESTN);
    else
        printf("test %d ok\n",TESTN);
    return(rc);
}
====================== end of mitest.c=================================

Feel free to mail me any questions about this; I will summarize any additional results from other compilers. I also humbly accept any flames about whether any of this means anything; it just seems that the compilers should do better.

Cheers,
Dan Schuh
UW-Madison
(608) 262-7892
schuh@cs.wisc.edu, {...}!uwvax!schuh