[comp.lang.c++] Byte padding question

keving@r4.uk.ac.man.cs (Kevin Glynn) (09/05/90)

	When building a class hierarchy in C++ we found that some
extraneous chars were being inserted into the C structures produced by
the AT&T front end. After writing a fairly large test program we
found that these bytes were only being generated (at least in our
tests) when an empty class (methods only, no member variables) was the
base class. I also noticed that in its child class the extra byte
generated for the empty base class was added, as well as a new char
array of size 1.

   In the work our department is doing, this will cause problems in
the future and we would like to clear up the reason why this is
happening (and even better how to stop it happening!). 


/********************** Part of the original code ***************************/


class GraphCell
	{

public :

	void	*operator new(size_t size);

	void	*operator new(size_t, int sz);

	void	operator delete(void *cell);

	};


class Instance : public GraphCell
	{

public	:


	void	suspend(InstFn function);

	void	activate();

protected :

	instance header_state;

	};

/******************* This was generated by cfront (v2.0) ********************/

struct GraphCell {	/* sizeof GraphCell == 2 */

char __W3__9GraphCell ;
};

struct Instance {	/* sizeof Instance == 22 */

char __W3__9GraphCell ;
char __W4[1];

struct instance header_state__8Instance ;
};

/****************************************************************************/

   Does anybody know why these chars are added?

Kevin Glynn
ICL

--

keving@r1.cs.man.ac.uk

sig? No thanx, I'm trying to give up ...

-----------------------------------------------------------------

petergo@microsoft.UUCP (Peter GOLDE) (09/08/90)

In article <KEVING.90Sep5134659@r4.uk.ac.man.cs> keving@r4.uk.ac.man.cs (Kevin Glynn) writes:
>	When building a class hierarchy in C++ we found that some
>extraneous chars were being inserted into the C structure produced by
>the AT&T front end. After writing a fairly large test program it was
>found that these bytes were only being generated (at least in our
>tests) when an empty class (methods only, no member variables) was the

Off the top of my head, I'd guess that cfront was trying to conform with
section 5.3.2 of the language standard: "The size of any class or class
object is larger than zero."  Obviously there are ways to conform
with this that do not waste space in derived classes, but I don't
believe cfront uses them.
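
A minimal sketch of the size rule quoted above, assuming a C++11 (or later)
compiler rather than cfront 2.0. `Header` is a hypothetical stand-in for the
`instance` type in the original post, whose definition was not shown, and
`touch` and `base_overhead` are illustrative names only.

```cpp
#include <cstddef>

// Hypothetical stand-in for `instance`; the real definition was not posted.
struct Header { int words[5]; };

class GraphCell {                  // methods only, no data members
public:
    void touch() {}
};

class Instance : public GraphCell {
protected:
    Header header_state;
};

// Guaranteed by the language: a class never has size zero.
static_assert(sizeof(GraphCell) >= 1, "an empty class still has nonzero size");

// Bytes that the empty base adds to the derived class. Many current
// compilers report 0 here, but the language does not promise that.
constexpr std::size_t base_overhead() {
    return sizeof(Instance) - sizeof(Header);
}
```

Printing `base_overhead()` on a given compiler shows whether it reuses the
empty base's storage; cfront 2.0, per the generated struct above, did not.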

--Peter Golde   
I speak for myself only -- not Microsoft.

rfg@NCD.COM (Ron Guilmette) (09/08/90)

In article <KEVING.90Sep5134659@r4.uk.ac.man.cs> keving@r4.uk.ac.man.cs (Kevin Glynn) writes:
>
>
>	When building a class hierarchy in C++ we found that some
>extraneous chars were being inserted into the C structure produced by
>the AT&T front end. After writing a fairly large test program it was
>found that these bytes were only being generated (at least in our
>tests) when an empty class (methods only, no member variables) was the
>base class. I also noticed that in its child class the extra byte
>generated for the empty base class was added as well as a new char
>array of size 1.
>
>   In the work our department is doing, this will cause problems in
>the future and we would like to clear up the reason why this is
>happening (and even better how to stop it happening!). 

Well, you can't stop it.  As I recollect, the reason that it is happening
is that Bjarne thought that it would be a good idea to ensure that each
element of each array in any given program should have a unique address.

Obviously, given:

	struct s {} array[20];

if the elements are given zero length then:

	&array[1] == &array[2]

I can't tell you precisely why Bjarne thought it would be important to
ensure that array elements all had unique addresses (because I don't
know for sure and because I don't have my copy of E&S handy), but I'm
willing to bet that he found some very good reason for it.

(Note that in ANSI C, it is illegal to even declare a struct or union
type which has no members, so in ANSI C this issue never even comes up.)
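
A small sketch of the uniqueness property described above, assuming a C++11
(or later) compiler. The function name `neighbours_distinct` is made up for
illustration.

```cpp
struct s {};          // no members, as in the example above
s array[20];

// Because sizeof(s) is at least 1, the array occupies at least 20 bytes.
static_assert(sizeof(s) >= 1, "empty struct still has nonzero size");

// Adjacent elements therefore live at different addresses.
bool neighbours_distinct() {
    return &array[1] != &array[2];
}
```

With zero-length elements the comparison would be between two identical
addresses, which is the situation the rule avoids.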

Regarding class types derived (singly) from `empty' class/struct types,
the compiler can't simply forget about the (artificial 1 byte) size of
the base type.  That would cause problems for the compiler when assignments
like:

	*base_pointer = *derived_pointer;

were executed.
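
A sketch of the hazard, under the assumption that a compiler overlays the
empty base on the derived class's first member (as many current compilers
do). `Empty`, `Derived` and `value_after_base_assignment` are hypothetical
names. If assignment through a base pointer blindly copied sizeof(Empty)
bytes, it would overwrite part of `value`; a compiler that overlays the base
has to make the assignment copy nothing.

```cpp
struct Empty {};                   // hypothetical empty base

struct Derived : Empty {
    int value;
};

// Assign to the base subobject of a Derived through a base pointer and
// return the derived member afterwards.
int value_after_base_assignment() {
    Derived derived;
    derived.value = 42;

    Empty other;
    Empty* base_pointer = &derived;   // points at the base part of `derived`
    Empty* source = &other;

    *base_pointer = *source;          // assigns only the (empty) base part
    return derived.value;             // must still be what we stored
}
```
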
-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

mckenney@sparkyfs.istc.sri.com (Paul Mckenney) (09/09/90)

In article <1477@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>In article <KEVING.90Sep5134659@r4.uk.ac.man.cs> keving@r4.uk.ac.man.cs (Kevin Glynn) writes:
>Well, you can't stop it.  As I recollect, the reason that it is happening
>is that Bjarne thought that it would be a good idea to ensure that each
>element of each array in any given program should have a unique address.

One reason for wanting this is to avoid a divide-by-zero fault in response
to (&a[20]-&a[0]):  recall that the subtraction operation for pointers
subtracts the values and then divides by the length of an individual
array element.
					Thanx, Paul
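
The division described above can be written out by hand. This is only a
sketch of what a compiler conceptually does, not any particular compiler's
code; `element_distance` is a hypothetical helper.

```cpp
#include <cstddef>

struct s {};          // empty struct from the earlier post
s a[20];

// Pointer subtraction spelled out: byte distance divided by element size.
// If sizeof(s) could be zero, the division below would be a divide by zero.
std::ptrdiff_t element_distance(const s* last, const s* first) {
    const char* hi = reinterpret_cast<const char*>(last);
    const char* lo = reinterpret_cast<const char*>(first);
    return (hi - lo) / static_cast<std::ptrdiff_t>(sizeof(s));
}
```

Calling `element_distance(a + 20, a)` gives the same answer as the built-in
`(a + 20) - a`.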

mat@mole-end.UUCP (Mark A Terribile) (09/12/90)

> 	When building a class hierarchy in C++ we found that some extraneous
> chars were being inserted into the C structure produced by the AT&T front
> end. After writing a fairly large test program it was found that these bytes
> were only being generated ... when an empty class (methods only, no member
> variables) was the base class. I also noticed that in its child class the
> extra byte generated for the empty base class was added as well as a new char
> array of size 1.

>    In the work our department is doing, this will cause problems in
> the future and we would like to clear up the reason why this is
> happening (and even better how to stop it happening!). 

As to why the extra char is being inserted into the empty base class--
the underlying C implementation can't handle an empty struct; there must
be a member, so the smallest possible datum is used.

The array in the derived class is *probably* there to force subsequent
data members back to the ``correct'' or machine optimal alignment.  After
a non-array char member, the next member can be begun at an odd boundary
if the machine will support it.  On some machines, this will be costly; on
others it will leave a hole that C++ may have a hard time dealing with when
computing member addresses (although THAT is speculation on my part).

You are probably stuck with it.  You might try (I haven't got a recent
cfront handy so this is pure speculation) making the base class abstract
(put a pure virtual function in it); cfront *might* be smart enough to
know that it won't have to create the base class object as a complete
object.

But I wouldn't bet on it.

You might take your added data for the derived class and stuff it all into
a struct from which you can multiply and publicly derive the derived
class; that way you will at least have a well-defined data layout for those
data and you can convert the derived class to it at will.

	class	Base_with_funcs
	{
		. . . your member functions for this hierarchy . . .
	};

	struct	Data_for_Derived_1
	{
		. . . all the member data needed for Derived_1 . . .
	};

	class	Derived_1
	  : public Base_with_funcs, public Data_for_Derived_1
	{
		. . . the member functions for Derived_1,     . . .
		. . . which will have the Data_for_Derived_1  . . .
		. . . member data in their class name space   . . .
	};

	. . .

	// For the sake of argument ...
	ostream& operator<< ( ostream&, const Base_with_funcs& );
	ostream& operator<< ( ostream&, const Data_for_Derived_1& );

	Derived_1 d1( . . . );

	cout << (Data_for_Derived_1&) d1;

I don't know when cast to a reference type slipped into C++ (or whether it
was there all along) but it's real nice to have.
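
For reference, a compilable rendering of the outline above. The member
function `kind`, the data members `id` and `weight`, and the helper
`describe` are all hypothetical, since the outline elides them; a
`static_cast` stands in for the C-style reference cast.

```cpp
#include <sstream>
#include <string>

// Hypothetical member function; the outline leaves the real ones out.
class Base_with_funcs {
public:
    const char* kind() const { return "Derived_1"; }
};

// Hypothetical data members with a plain, predictable layout.
struct Data_for_Derived_1 {
    int id;
    double weight;
};

class Derived_1 : public Base_with_funcs, public Data_for_Derived_1 {
public:
    Derived_1(int i, double w) { id = i; weight = w; }
};

std::ostream& operator<<(std::ostream& os, const Data_for_Derived_1& d) {
    return os << d.id << ' ' << d.weight;
}

// Convert the derived object to its data part and format only that part.
std::string describe(const Derived_1& d1) {
    std::ostringstream out;
    out << static_cast<const Data_for_Derived_1&>(d1);
    return out.str();
}
```
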
-- 

 (This man's opinions are his own.)
 From mole-end				Mark Terribile

gwu@nujoizey.tcs.com (George Wu) (09/15/90)

In article <440@mole-end.UUCP>, mat@mole-end.UUCP (Mark A Terribile) writes:
|> > 	When building a class hierarchy in C++ we found that some extraneous
|> > chars were being inserted into the C structure produced by the AT&T front
|> > end.
|> >
|> > [ stuff about empty structs deleted ]
|> >
|> > I also noticed that in its child class the
|> > extra byte generated for the empty base class was added as well as a new char
|> > array of size 1.
|> 
|> >    In the work our department is doing, this will cause problems in
|> > the future and we would like to clear up the reason why this is
|> > happening (and even better how to stop it happening!). 
|> 
|> [ more stuff deleted ]
|> 
|> The array in the derived class is *probably* there to force subsequent
|> data members back to the ``correct'' or machine optimal alignment.  After
|> a non-array char member, the next member can be begun at an odd boundary
|> if the machine will support it.

     Nah, that can't be it.  The C compiler already does that for you.  The
C++ compiler/translator should never have to do alignment explicitly.
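
     A quick way to check this, assuming a C++11 (or later) compiler.
`Header` is a hypothetical stand-in for `struct instance`, and the two
structs mimic the generated layout with and without the extra array.

```cpp
#include <cstddef>

// Hypothetical stand-in for `struct instance`.
struct Header { int words[5]; };

// One leading char, then the real data.
struct WithChar      { char pad;               Header h; };

// Leading char plus a char[1], as in the cfront output.
struct WithCharArray { char pad; char more[1]; Header h; };

// In both cases the compiler itself places `h` on a suitable boundary.
static_assert(offsetof(WithChar, h) % alignof(Header) == 0,
              "member after a char is aligned by the compiler");
static_assert(offsetof(WithCharArray, h) % alignof(Header) == 0,
              "member after a char array is aligned by the compiler");
```

Both assertions hold with or without the array; whether the two offsets
also happen to be equal depends on the target.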

----
George J Wu                           | gwu@tcs.com or ucbcad!tcs!gwu
Software Engineer                     | 2121 Allston Way, Berkeley, CA, 94704
Teknekron Communications Systems, Inc.| (415) 649-3752

mat@mole-end.UUCP (Mark A Terribile) (09/21/90)

> |> > I also noticed that in its child class the
> |> > extra byte generated for the empty base class was added as well as a new char
> |> > array of size 1.

> |> [ more stuff deleted ]

> |> The array in the derived class is *probably* there to force subsequent
> |> data members back to the ``correct'' or machine optimal alignment.  After
> |> a non-array char member, the next member can be begun at an odd boundary
> |> if the machine will support it.

>      Nah, that can't be it.  The C compiler already does that for you.  The
> C++ compiler/translator should never have to do alignment explicitly.


> George J Wu                           | gwu@tcs.com or ucbcad!tcs!gwu


It may if it has to type pun one struct into another, which is exactly how
it handles certain cases of inheritance. 

The C compiler must maintain consistent alignment within one struct; it
is not guaranteed to set up two structs such that one can be punned into
the other, at least not in the way that C++ does it.  Cfront seems to be
capable of generating a short struct for the base class, a longer struct
that includes the short struct as its first member to represent a derived
class, AND a longer struct whose initial members are synonyms for the short
struct.  I suspect that the easiest way to ensure that alignment is
handled consistently on many different underlying C compilers is to put
that array in, relying on the requirement that what follows the array have
the machine's natural alignment (or however it's stated).  (For a single
char, that requirement does not hold.)

cfront does some incredible stuff.  Some of it is very clever, some of it
is very dumb, and some of it is cleverness in one place to allow dumbness
in another.  I suspect that this is one of the last kind.
-- 

 (This man's opinions are his own.)
 From mole-end				Mark Terribile