[net.lang.c] C structure alignment

tjr@ihnet.UUCP (Tom Roberts) (04/11/84)

Structure alignment in C varies, depending upon which machine you are on!

Many CPU's cannot reference an int at an odd-byte address, so their
C-compiler ALWAYS locates a structure on an even-byte address. On some
machines (with 4-byte int's), the restriction is to 4-byte addresses.
This applies to structures within structures, even when the structures
only contain char-arrays.
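
A quick way to see what a particular compiler does is to print the offsets. The sketch below assumes a compiler that provides offsetof in <stddef.h> (a later standard C facility, not something from the original post). The types `tag` and `record` and the function `show_layout` are made up for illustration. The numbers printed differ from machine to machine, which is the point.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <stdio.h>

/* Hypothetical types, for illustration only. */
struct tag {
    char name[3];            /* nothing but a char array */
};

struct record {
    char       flag;
    struct tag t;            /* nested struct; may be preceded by padding */
    int        count;        /* usually aligned to a multiple of sizeof(int) */
};

/* Print where this compiler actually placed each member. */
void show_layout(void)
{
    /* The first member of a struct is always at offset 0. */
    assert(offsetof(struct record, flag) == 0);

    printf("sizeof(struct tag)    = %lu\n", (unsigned long)sizeof(struct tag));
    printf("offset of t           = %lu\n", (unsigned long)offsetof(struct record, t));
    printf("offset of count       = %lu\n", (unsigned long)offsetof(struct record, count));
    printf("sizeof(struct record) = %lu\n", (unsigned long)sizeof(struct record));
}
```

If the offset of t comes out greater than 1, the compiler has aligned the nested structure even though it holds only chars.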

Note that the compilers are permitted to do this, because the only LEGAL
things you can do to a structure are to take its address, reference a
single member, or assign the entire structure to another structure of
the same type (newer C-compilers only). Internal padding can never be
significant for these operations (see the recent discussion of == for
structures).
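
As a rough illustration of why padding does not matter for the legal operations: the sketch below compares two structures member by member, so whatever sits in the padding bytes is never looked at. The type `pair` and the functions `pair_equal` and `make_pair` are hypothetical names, and it assumes a compiler that supports structure assignment and returning structures by value.

```c
#include <assert.h>
#include <string.h>

struct pair {          /* hypothetical example type */
    char tag;
    long value;        /* most compilers pad between tag and value */
};

/* Compare by members only; padding bytes are never examined. */
int pair_equal(const struct pair *a, const struct pair *b)
{
    assert(a != NULL && b != NULL);
    return a->tag == b->tag && a->value == b->value;
}

/* Build a pair on top of storage pre-filled with `fill`, so that any
   padding bytes are likely (not guaranteed) to hold that value. */
struct pair make_pair(int fill, char tag, long value)
{
    struct pair p;
    memset(&p, fill, sizeof p);
    p.tag = tag;
    p.value = value;
    return p;
}
```

Two pairs built with different fill values but the same members compare equal with pair_equal, while a byte-wise comparison such as memcmp might or might not agree, depending on the padding.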

It is NOT PORTABLE to assume the ordering of members within a structure,
nor to assume they are contiguous in memory. Some machines that are
not byte-addressable may have some big surprises here (the PDP-10
comes to mind). For all byte-addressable machines with which I am familiar,
if you make EVERY member of a structure a multiple of 4 bytes long, data
can be passed between machines in the format of the structure, ASSUMING THE
BYTE-ORDERING IS THE SAME, OR HAS BEEN CORRECTED.
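
One way to "correct" the byte ordering is to never ship the in-memory form at all, and instead write each value out a byte at a time in an agreed order. This is a minimal sketch of that idea, not anything from the original post: `put32` and `get32` are hypothetical helpers, and most-significant-byte-first is just the order assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Store the low 32 bits of v in buf[0..3], most significant byte first. */
void put32(unsigned char *buf, unsigned long v)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)((v >> 24) & 0xFF);
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* Rebuild the value from buf[0..3], whatever the host byte order is. */
unsigned long get32(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((unsigned long)buf[0] << 24) |
           ((unsigned long)buf[1] << 16) |
           ((unsigned long)buf[2] << 8)  |
            (unsigned long)buf[3];
}
```

Because only shifts and masks are used, the same code gives the same bytes on either byte order, and structure padding never enters into it.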

	Tom Roberts
	ihnp4!ihnet!tjr

tll@druxu.UUCP (04/12/84)

>From: tjr@ihnet.UUCP (Tom Roberts)
>
>It is NOT PORTABLE to assume the ordering of members within a structure,
>nor to assume they are contiguous in memory.

It's true that members may not be contiguous in memory, but (time for
the bible-thumping quote from The C Reference Manual, p196)

	"Within a structure, the objects declared have addresses which
	increase as their declarations are read left-to-right."

So the ordering is clearly defined.  This feature is useful when you
need to store a large number of structures that have differing amounts
of data in them (for instance, in a circuit simulator, you might want a
structure for each component -- early in the structure, you define what
kind of component this is, and the rest of the structure is optimized
for the particular component).  A union could be used, but then you use
the maximum amount of storage for each structure, which may be more than
you have.
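
Something along these lines, as a sketch only. The component kinds and the struct layouts are invented for illustration, as are the names `new_component` and `component_kind`. The one thing it relies on is that the first member of a structure sits at the structure's own address.

```c
#include <assert.h>
#include <stdlib.h>

enum { RESISTOR, TRANSISTOR };        /* hypothetical component kinds */

/* Each struct starts with the same member, saying what follows. */
struct resistor   { int kind; double ohms; };
struct transistor { int kind; double beta; double vbe; int nodes[3]; };

/* Allocate only as much storage as this kind of component needs. */
void *new_component(int kind)
{
    size_t n;
    int *p;

    assert(kind == RESISTOR || kind == TRANSISTOR);
    n = (kind == RESISTOR) ? sizeof(struct resistor)
                           : sizeof(struct transistor);
    p = malloc(n);
    if (p != NULL)
        *p = kind;                    /* kind is the first member of both */
    return p;
}

/* Read the kind without knowing which struct this really is. */
int component_kind(const void *c)
{
    return *(const int *)c;
}
```

The caller looks at component_kind() and then casts to the matching struct type. A union of the two would make every resistor as big as a transistor.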

		Tom Laidig
		AT&T Information Systems Laboratories, Denver
		...!ihnp4!druxu!tll

tim@unc.UUCP (Tim Maroney) (04/12/84)

Excuse me, but you CAN assume that members of a struct appear in the listed
order; furthermore, if two structs have a common beginning (that is, their
first n members are identical), you can assume that they will be at the same
offset.  My favorite use of this is writing generic linked list manipulation
functions, in which each cell starts with a pointer to the next, and the
rest of the cell is a value of arbitrary type.  I don't have my K&R handy,
but it's in there.
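
A minimal sketch of the idea, with made-up names (`link`, `int_cell`, `list_push`, `list_length`). It uses one variant of the trick: the cell embeds a small `link` struct as its first member rather than repeating the pointer declaration, so a pointer to the cell and a pointer to its link refer to the same address.

```c
#include <assert.h>
#include <stddef.h>

/* Generic header: every cell starts with a pointer to the next one. */
struct link {
    struct link *next;
};

/* One particular cell type; the value part can be anything. */
struct int_cell {
    struct link link;     /* must be the first member */
    int         value;
};

/* Push a cell on the front of a list; returns the new head. */
struct link *list_push(struct link *head, struct link *cell)
{
    assert(cell != NULL);
    cell->next = head;
    return cell;
}

/* Count cells without knowing what kind of cells they are. */
size_t list_length(const struct link *head)
{
    size_t n = 0;

    while (head != NULL) {
        n++;
        head = head->next;
    }
    return n;
}
```

To get at the value, cast the `struct link *` back to `struct int_cell *`. The list functions themselves never need to know the cell type.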

There was a discussion about this a few months ago.  For a lot of
applications, you don't care about the ordering of struct members, and you'd
like the compiler to rearrange them for maximum storage efficiency within
its alignment requirements.  Unfortunately, you can't do that in C.  (It
would be easy enough to add -- just put a new keyword before the open brace
to signal that you don't care about the order -- but then again, what
wouldn't?)
--
Tim Maroney, The Censored Hacker
mcnc!unc!tim (USENET), tim.unc@csnet-relay (ARPA)

All opinions expressed herein are completely my own, so don't go assuming
that anyone else at UNC feels the same way.

crane@fortune.UUCP (John Crane) (04/23/84)

	Many CPU's cannot reference an int at an odd-byte
	address, so their C-compiler ALWAYS locates a
	structure on an even-byte address.  On some
	machines (with 4-byte int's), the restriction is
	to 4-byte addresses.  This applies to structures
	within structures, even when the structures only
	contain char-arrays.

If you recall my question, I am not interested in getting ints; I
want chars.


	Note that the compilers are permitted to do this,
	because the only LEGAL things you can do to a
	structure are to take its address, reference a
	single member, or assign the entire structure to
	another structure of the same type (newer C-
	compilers only).  Internal padding can never be
	significant for these operations (see the recent
	discussion of == for structures).

All I legally wanted to do was refer to a single member which
happened to be a char.

	It is NOT PORTABLE to assume the ordering of
	members within a structure, nor to assume they are
	contiguous in memory.  Some machines that are not
	byte-addressable may have some big surprises here
	(the PDP-10 comes to mind).  For all byte-
	addressable machines with which I am familiar, if
	you make EVERY member of a structure a multiple of
	4 bytes long, data can be passed between machines
	in the format of the structure, ASSUMING THE BYTE-
	ORDERING IS THE SAME, OR HAS BEEN CORRECTED.

Portability schmortability! Does portability ALWAYS have to be an issue?
My question related to converting data from ONE SPECIFIC MACHINE to
ANOTHER SPECIFIC MACHINE. I did not make this clear in my original
question. Now listen. If I programmed for the PDP-10 and all its foibles,
or any other machine for that matter, that's not portable either. Maybe
there's a difference between portability and machine-independence.

I still put aligning nested structures that begin with chars in the
area of dirty tricks. It's one of the things I don't like about BASIC.
It assumes you don't know what you are doing and does what it "thinks"
is going to make life easier for you. One nice thing about assembly
language is that you can control your own alignment.

John Crane