[comp.lang.c] C structs alignment

greg@utcsri.UUCP (Gregory Smith) (11/06/86)

In article <2904@rsch.WISC.EDU> mcvoy@rsch.WISC.EDU (Lawrence W. McVoy) writes:
>
>I have a question about alignment and padding.  I have noticed (context:
>Vax 780, 4.3BSD) that the c compiler pads out struct sizes to be
>long word aligned.  And it does the pointer arithmetic based on the
>padded sizes. (no sh*t, sherlock, one would hope that they are the same)
>For instance,
>
>    typedef struct {
>	    char	byte;
>	    short	word;
>    } three_bytes;
>
>    sizeof(three_bytes) == 4, not 3.
>
>    three_bytes* p = 100;
>
>    p == 100, p+1 == 104, not 103.
>
>For all of you that knew this, you're all saying big deal, so what?  Well,
>I do (did) stuff like this all the time:
>
>    head = (three_bytes*)calloc(N, sizeof(three_bytes));
>
>This wastes N bytes.  Sometimes N is around 10 to the 7th or 8th.  Bad news.
				My, what a big computer you have...
>The fact that pointer arith is "wrong" makes this very icky to work around
>even if you are aware of the problem.  Anyone have any comments or
>suggestions?  Does everyone except me know about this?

The arithmetic is not 'wrong'. On a VAX, 16-bit quantities should be at
an even address so that they can be read and written efficiently
(I think the VAX can read and write a 16-bit field at any address, but it
is much slower. This is not what you want).
Similarly, 32-bit quantities (ints and longs) should be at an address
which is a multiple of four.

The struct three_bytes is laid out like this:

0:	byte
1:	( not used )
2:	word ( lo byte
3:		hi byte )

The 'not used' byte makes sure that the word will be on an even address,
and furthermore the compiler will make sure that all structs of this type
start on an even address.
Thus it is of size four. Structs are not always padded out to a multiple
of four; a struct consisting of 3 chars will have sizeof = 3.
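
If you want to see the layout on your own compiler rather than take my
word for it, here is a minimal sketch. It assumes a compiler that
supplies offsetof in <stddef.h>; three_chars and gap_before_word are
names I made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

/* The struct from the question: a char followed by a short. */
typedef struct {
    char  byte;
    short word;
} three_bytes;

/* For comparison: chars need no alignment, so no padding is expected. */
typedef struct {
    char a, b, c;
} three_chars;

/* Number of unused bytes the compiler put between 'byte' and 'word'. */
size_t gap_before_word(void)
{
    size_t off = offsetof(three_bytes, word);

    assert(off >= sizeof(char));   /* 'word' cannot overlap 'byte' */
    return off - sizeof(char);
}
```

On a machine with 2-byte shorts aligned to even addresses I would
expect gap_before_word() to give 1 and sizeof(three_bytes) to give 4,
but the numbers are whatever your compiler decides.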

Suppose you put byte *after* word. You will get this:

0:	word ( lo byte
1:		hi byte )
2:	byte
3:	not used

And sizeof is still 4. Why? Suppose you make an array of these things.
In order to ensure that the 'word's are on even addresses, each of the
structs must start on an even address. Thus a blank byte must be inserted
after each to pad them out to four. If you have 'three_bytes *p',
then ++p must add '4' to the pointer to reach the next one. To keep
things consistent, sizeof() gives 4, and the extra byte is added to *any*
instance of the struct. Make sense yet?
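
The same point as a small sketch: the distance between two neighbouring
array elements is always sizeof the element, and any trailing padding is
counted in it. The names word_then_byte, stride_in_bytes and
tail_padding are hypothetical, and offsetof from <stddef.h> is assumed.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

/* The reordered struct: the short first, then the char. */
typedef struct {
    short word;
    char  byte;
} word_then_byte;

/* Distance in bytes between consecutive elements of an array. */
size_t stride_in_bytes(void)
{
    word_then_byte a[2];
    size_t stride = (size_t)((char *)&a[1] - (char *)&a[0]);

    /* The language requires array elements to be sizeof apart. */
    assert(stride == sizeof(word_then_byte));
    return stride;
}

/* Unused bytes after the last member, added so the next element's
   'word' lands on a properly aligned address. */
size_t tail_padding(void)
{
    return sizeof(word_then_byte)
         - (offsetof(word_then_byte, byte) + sizeof(char));
}
```

With 2-byte shorts on even addresses, tail_padding() should come out as
1, which is the 'not used' byte at offset 3 in the picture above.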

If there is a 32-bit quantity within the struct, the whole struct must be
padded out to a multiple of four, for the same reasons. Check out
Kernighan & Ritchie, page 196.

If you can't afford this wasted memory, you can keep two separate arrays,
one containing the bytes and the other containing the words.
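
A rough sketch of the two-array approach, in case it helps. Everything
named split_table is invented for the example; it is one way to do it,
not the only one.

```c
#include <assert.h>
#include <stdlib.h>   /* calloc, free */

/* Parallel arrays: element i is the pair (bytes[i], words[i]). */
typedef struct {
    size_t n;
    char  *bytes;
    short *words;
} split_table;

/* Allocate both arrays, zero-filled. n should be nonzero.
   Returns 0 on success, -1 if either allocation fails. */
int split_table_init(split_table *t, size_t n)
{
    t->n = 0;
    t->bytes = calloc(n, sizeof *t->bytes);
    t->words = calloc(n, sizeof *t->words);
    if (t->bytes == NULL || t->words == NULL) {
        free(t->bytes);
        free(t->words);
        t->bytes = NULL;
        t->words = NULL;
        return -1;
    }
    t->n = n;
    return 0;
}

/* Store one logical element. */
void split_table_set(split_table *t, size_t i, char b, short w)
{
    assert(i < t->n);
    t->bytes[i] = b;
    t->words[i] = w;
}

/* Release both arrays and leave the table empty. */
void split_table_free(split_table *t)
{
    free(t->bytes);
    free(t->words);
    t->bytes = NULL;
    t->words = NULL;
    t->n = 0;
}
```

Assuming 1-byte chars and 2-byte shorts, N elements now cost about 3N
bytes instead of 4N. The price is that you index two arrays instead of
following one pointer.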
	
>
>Also, what's this about alignment that I hear all the time?  If compilers are
>already aligning things for you, why bother to do it explicitly?  You might
>say "so it works on stupid compilers" but who are you to say what alignment
>should be?  I mean, if you port code to a machine with 24 bit alignment and
>you've carefully aligned all your stuff to 32 bit boundries, you've screwed
>yourself.  No fun.  Also no gain.
>
Mostly the compiler takes care of it. If you are using the same pointer type
to point to different object types, or other things like that, you may have to
go to some trouble to make the compiler take care of it. Most of the
discussion occurs because people want more ways to make their programs
portable. Again, you only get into trouble here if you are doing 'unusual'
things.

A classic example is a program which must create a sequence of
mixed structs of different sizes. The type ( and thus size ) of each
struct is determined by looking at its first member ( which would be of
the same type for all of them ). You can use a union containing all the
struct types, but then space will be wasted because they will all be the
same ( worst-case ) size.

The difficulty here is to write the program in such a way that all these
structs will be aligned properly, and to make it work on any machine.
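
One common way to attack it, sketched under some assumptions: round
every record's size up to a worst-case alignment, so that whatever
follows it starts on a safe boundary. The record types, the tags and
all the rec_ names below are hypothetical. The union worst_align is
assumed to list the most strictly aligned types you actually store, and
the buffer is assumed to come from malloc, which returns suitably
aligned memory.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdlib.h>   /* malloc, for the buffer the records live in */

enum { TAG_SMALL = 1, TAG_BIG = 2 };

/* Every record type begins with the same 'tag' member. */
struct rec_small { int tag; char c; };
struct rec_big   { int tag; double d; long l; };

/* Assumption: these are the strictest types stored in any record. */
union worst_align { long l; double d; void *p; };

/* The compiler pads after 'c' exactly enough to align 'u', so the
   offset of 'u' is the alignment we need. */
struct align_probe { char c; union worst_align u; };
#define REC_ALIGN offsetof(struct align_probe, u)

/* Round n up to the next multiple of REC_ALIGN. */
size_t rec_round_up(size_t n)
{
    return (n + REC_ALIGN - 1) / REC_ALIGN * REC_ALIGN;
}

/* Space a record with this tag occupies in the buffer; 0 if unknown. */
size_t rec_size(int tag)
{
    switch (tag) {
    case TAG_SMALL: return rec_round_up(sizeof(struct rec_small));
    case TAG_BIG:   return rec_round_up(sizeof(struct rec_big));
    default:        return 0;
    }
}

/* Step from one record to the place where the next one starts. */
void *rec_next(void *rec)
{
    int tag = *(int *)rec;        /* first member of every record */
    size_t size = rec_size(tag);

    assert(size != 0);            /* unknown tag: cannot step over it */
    return (char *)rec + size;
}
```

Each record then wastes at most REC_ALIGN - 1 bytes, instead of every
record being the worst-case size as with the union. Nothing in the code
hard-wires a particular alignment; it asks the compiler.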

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...