strick@osc.COM (henry strickland) (07/15/89)
[] CLASS LAYOUT FOR G++ (1.35.1- / sun3-os4-nfp)
[]
[] Henry Strickland <strick@osc.com> (also <strick@gatech.edu>)
[] Object Sciences Corporation Menlo Park CA
I have attempted to reverse-engineer how G++ (v1.35.1-) allocates
storage for classes, including multiple inheritance and virtual base
cases. This was done by trying things and looking at the output: so
it could be wrong. This is for sun3 with os level 4 and no floating
point processor, but I don't see why that would matter any.
IF ANYONE KNOWS WHERE THE FOLLOWING IS WRONG, PLEASE LET ME KNOW!
Also, if anyone knows if cfront 2.0 does things differently,
please let me know. I do not have access to cfront 2.0 yet.
I am most hesitant about the algorithm at the end for determining
virtual base placement.
1. a simple structure
A struct has fields for the variables in order. There may be
some gaps (dead fields) to meet alignment constraints.
Funny cases exist, e.g. structs with no fields, zero-length
arrays, bit fields, and they're probably compiler-dependent.
Nonvirtual methods (including inlines) require no storage in
an instance.
EXAMPLE
struct A {
char a; /* variables */
int x,y;
short b;
double z;
void print(Stream*); /* methods */
};
memory layout A* --> | 1: a
| 1: gap
| 4: x
| 4: y
| 2: b
| 0: gap? (not on sun3, but on vax, i think)
| 8: z
2. a simple structure with a virtual function table
There will be ONE anonymous field ("v_ptr") of type (void*) at
the end of the structure IF the struct has one or more virtual
functions. The format of this virtual function table dependent
on whether multiple inheritance is used or not; anyway, it is
not described in this document.
EXAMPLE
struct B {
char a; /* variables */
int x,y;
short b;
float z;
void print(Stream*); /* methods */
virtual void one(); /* virtual methods */
virtual int two(char*);
virtual int one(int);
virtual char* three(char*, int);
};
memory layout B* --> | 1: a
| 1: gap
| 4: x
| 4: y
| 2: b
| 0: gap?
| 8: z
| 4: vt$B /* virtual func table ptr */
|
| |
| |
V increasing memory addresses V
3. single or multiple inheritance, non-virtual bases
The non-virtual superclasses are allocated, in order of
appearance on the class definition line, and then the member
variables and possibly the virtual function pointer of the
current struct. If the superclasses themselves have
superclasses or virtual functions, then the layout method
is recursive, with each superclass having a full layout
within itself.
EXAMPLE
struct X { /* ... sizeof X is 20 ... */ };
struct Y { /* ... sizeof X is 12 ... */ };
struct Z { /* ... sizeof X is 36 ... */ };
struct M : X, Y, Z {
char a;
int x,y;
short b;
float z;
void print(Stream*);
virtual void one();
virtual int two(char*);
virtual int one(int);
virtual char* three(char*, int);
};
memory layout M* --> | 20: X /* superclasses first */
| 12: Y
| 36: Z
| 1: a /* M's variables begins */
| 1: gap
| 4: x
| 4: y
| 2: b
| 0: gap?
| 8: z
| 4: vt$M /* virtual func table ptr */
The case of single inheritance is just the multiple-inheritance
case with only one superclass. The supers have their virtual
function table pointers if they ordinarily need it. The class
M only has its own virtual table pointer if M defines new
virtual functions that did NOT exist in any of the supers. If
M only reimplements virtual functions declared in supers, then
M has no v_ptr of its own.
4. multiple inheritance, with virtual bases
A class in which itself or ANY of its supers (or any supers of
supers, transitively ) has one or more virtual bases, the
struct will have two sizes: A "lean" size, which does not
count the storage for the virtual bases, and a "full" size,
which does count storage for them.
Virtual bases are NOT included inside the "lean" structure the
way that normal bases are. Instead, only a pointer to the
virtual base is stored inside the "lean" structure. These
virtual base pointers come at the end of the normal
superclasses and before any variables of the current class.
There will be one pointer for each virtual base declared as a
virtual super of the CURRENT class. If there are virtual bases
that are found in the closure of supers but NOT listed in the
current struct as a virtual super, there will be no virtual
base pointer for this base at this level. These pointers
appear in REVERSE order of their declaration: see the diagram.
EXAMPLE
struct F { /* ... sizeof X is 40 ... */ };
struct G { /* ... sizeof X is 24 ... */ };
struct X { /* ... sizeof X is 20 ... */ };
struct Y { /* ... sizeof X is 12 ... */ };
struct Z { /* ... sizeof X is 36 ... */ };
struct M : virtual F, virtual G, X, Y, Z {
char a;
int x,y;
short b;
float z;
void print(Stream*);
virtual void one();
virtual int two(char*);
virtual int one(int);
virtual char* three(char*, int);
};
*LEAN* layout M* --> | 20: X /* non-virtual
| 12: Y superclasses first
| 36: Z if any */
| 1: a /* M's variables begins
| 1: gap if any */
| 4: x
| 4: y
| 2: b
| 0: gap?
| 8: z
| 4: POINTER TO G /* virtual base pointers */
| 4: POINTER TO F /* first last */
| 4: v_ptr /* if any */
When an object is instantiated (with operator new) a "full"
structure of the specified class is allocated. The full
structure includes the lean structure, above, followed by lean
structures of all the virtual bases in the superclass closure.
The order of the virtual bases is not trivial. The following
algorithm FindVirtualBaseOrder() seems to find it. This may
not be right -- it is hard to check all the possibilities, but
this explains what I've seen so far, and is a fairly likely
implementation. It does have its nutty aspects, though, and
makes me nervous that I may have missed something. I would
have to delve into the compiler code to be absolutely sure.
//////////// BEGIN ALGORITHM
static List b;
static Class orig;
FindVirtualBaseOrder( Class c )
{
b = NIL; // collect virtual bases on this list
orig = c; // remember the original class
TraverseClass( c );
return b;
// ... the order of the bases in the List b is the order of the
// bases after the lean structure for class c
// i.e. the most recent addition to b appears first, and the
// very first thing that was added comes at the very
// end of the full structure.
}
TraverseClass(Class x)
{
// this implements a special kind of post-order traversal,
// in which we visit traverse the supers of our current
// class, and then record all of our virtual supers.
// (it is different from a normal post-order in that we do not
// record each super immediately after traversing it, but
// rather come back and get them after doing all supers.)
// It is also strange that the recording of virtual supers
// is one backwards ONLY for the original class, and done
// frontwards for virtual supers of any transitively-super class.
FOR EACH superclass s OF x
in the order of declaration in the class definition of x
{
TraverseClass( s )
}
FOR EACH superclass s OF x
IF s == orig THEN // the original class
in the REVERSE order of declaration in the class def of x
ELSE // all supers
in the order of declaration in the class definition of x
{
IF s is a virtual superclass of x AND
s is not already in list b
{
add s to the FRONT of b
}
}
}
//////////// END ALGORITHM
EXAMPLE
class F {...};
class G {...};
class H {...};
class J {...};
class K {...};
class L {...};
class U {...};
class V {...};
class A : virtual U {...};
class B : virtual V {...};
class X : virtual U, A, virtual K, B, virtual L {...};
class Y : virtual V, virtual F, A, B, virtual J {...};
class Z : virtual G, virtual U, virtual V, X, Y,
virtual F, virtual H {...};
Flow of algorithm:
FindVirtualBaseOrder Z
TraverseClass Z
TraverseClass G
TraverseClass U
TraverseClass V
TraverseClass X
TraverseClass U
TraverseClass A
TraverseClass U
add U <== U
TraverseClass B
TraverseClass V
add V <== V
add K <== K // frontwards
add L <== L
TraverseClass Y
TraverseClass V
TraverseClass F
TraverseClass A
TraverseClass U
TraverseClass B
TraverseClass V
add F <== F // frontwards
add J <== J
TraverseClass F
TraverseClass H
add H <== H // backwards!!
add G <== G
So the order of the virtual bases is G, H, J, F, L, K, V, U.
layout of a full class Z:
Z* zp = new Z;
zp --> | Z::X::A ->U // begin A within X within Z
| Z::X::A vars // variables for A
| Z::X::A vt$A // virt func table for A
| Z::X::B ->V
| Z::X::B vars
| Z::X::B vt$B
| Z::X ->U // i.e. pointer to U vars, below
| Z::X vars
| Z::X vt$X // end X within Z
| Z::Y::A ->U
| Z::Y::A vars
| Z::Y::A vt$A
| Z::Y::B ->V
| Z::Y::B vars
| Z::Y::B vt$B
| Z::Y ->V
| Z::Y ->F
| Z::Y vars
| Z::Y vt$Y // end Y within Z
| Z ->H // Z names 5 virtual bases
| Z ->F // (notice inverse order)
| Z ->V
| Z ->U
| Z ->G
| Z vars
| Z vt$Z ///// end of lean Z ////
| G vars // begin of virtual bases
| G vt$G // run algorithm to
| H vars // determine ordering
| H vt$H
| J vars
| J vt$J
| F vars
| F vt$F
| L vars
| L vt$L
| K vars
| K vt$K
| V vars
| V vt$V
| U vars
| U vt$U // end of full structure
5. other comments
I have not been able to get a virtual base that has a virtual
base to compile with "g++ version 1.35.1-". I always get this
from the compiler:
> Failed assertion prev != NULL_TREE at line 2069 of `cplus-search.c'.
> g++: Program cc1plus got fatal signal 6.
From my understanding of C++, I see no reason why a virtual
base cannot have a virtual base, and I see no reason why the
above algorithm wouldn't still hold, but I haven't been able to
try it.
////////////////////////////////////////////////////////////////////////////
strick@osc.osc.com 415-325-2300
uunet!lll-winken!pacbell!osc!strick
( strick@gatech.edu )
They know their limits 'cause they
cross 'em every night. -- DEVO
--
strick@osc.osc.com 415-325-2300
uunet!lll-winken!pacbell!osc!strick
( strick@gatech.edu )