[net.unix-wizards] C sizeof

gwyn@Brl@sri-unix (09/01/82)

From:     Doug Gwyn <gwyn@Brl>
Date:     26 Aug 82 3:58:13-EDT (Thu)
Welcome to the hole-in-the-struct club.
On a PDP-11,
struct	{
	char	a;
	long	b;	/* not "int" whose size varies a lot */
	}
has to have a "padding" char so that the next MOS (member of structure) starts on a word (2-byte)
boundary, due to hardware restrictions (infamous "odd address trap").
Also, total struct size will always be even so that arrays of structs obey
the hardware restrictions too.

On the VAX this garbage is TOTALLY UNNECESSARY but some PCC implementor
apparently decided to squeeze a few nanoseconds out of the hardware by
longword (4-byte) aligning longs, floats, and doubles.  What burns me
about this is that it isn't even PDP-11 compatible so your binary files
will very likely not be accessible to the SAME C code on the VAX, and
there is no easy fix for this (contrary to initial appearances).

sizeof thing	always includes any alignment padding, so an array of n
things will have sizeof array == n * sizeof thing.
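
As a rough illustration of that rule, here is a minimal sketch in present-day C. The names "struct thing", thing_padding and array_bytes_10 are made up for the example, and the amount of padding it reports depends entirely on the compiler and machine:

```c
#include <assert.h>
#include <stddef.h>

struct thing {
    char a;
    long b;
};

/* Padding bytes in one struct thing: holes between members plus any
   padding at the end.  The value varies by compiler and machine. */
size_t thing_padding(void)
{
    return sizeof(struct thing) - (sizeof(char) + sizeof(long));
}

/* Size in bytes of an array of 10 things, taken from the array itself. */
size_t array_bytes_10(void)
{
    struct thing array[10];

    /* The rule from the text: no extra bytes appear between elements. */
    assert(sizeof array == 10 * sizeof(struct thing));
    return sizeof array;
}
```

Because the padding is counted once per element inside sizeof thing, nothing further is added when the things are laid end to end in an array.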

I/O of a padded struct does read/write the padding as well as the useful
data if you do something like
	write( 1, &thing, sizeof thing );
whereas to avoid the padding you would have to access MOSs separately:
	write( 1, &thing.a, sizeof thing.a );
	write( 1, &thing.b, sizeof thing.b );
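
A variation on the member-by-member approach is to gather the useful bytes into one buffer and issue a single write. This is only a sketch: pack_thing, unpack_thing and PACKED_SIZE are hypothetical names, and the bytes of the long are still in whatever order and width the host uses, so this removes the padding but nothing else:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct thing {
    char a;
    long b;
};

/* Number of useful bytes in a struct thing, padding excluded. */
#define PACKED_SIZE (sizeof(char) + sizeof(long))

/* Copy only the members of *t into buf, skipping the padding.
   buf must hold PACKED_SIZE bytes.  Returns the bytes stored. */
size_t pack_thing(const struct thing *t, unsigned char *buf)
{
    size_t n = 0;

    memcpy(buf + n, &t->a, sizeof t->a);
    n += sizeof t->a;
    memcpy(buf + n, &t->b, sizeof t->b);
    n += sizeof t->b;
    return n;
}

/* Inverse of pack_thing: rebuild *t from PACKED_SIZE bytes in buf.
   Returns the bytes consumed. */
size_t unpack_thing(struct thing *t, const unsigned char *buf)
{
    size_t n = 0;

    memcpy(&t->a, buf + n, sizeof t->a);
    n += sizeof t->a;
    memcpy(&t->b, buf + n, sizeof t->b);
    n += sizeof t->b;
    return n;
}
```

The packed buffer can then go out in one call, e.g. write( 1, buf, pack_thing( &thing, buf ) ).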

Nothing can be done in general about this problem, since there are CPU
architectures that necessitate alignment (IBM 360 is maybe the worst).
It is a pity, though, that the VAX implementation is unnecessarily ugly.

obrien@Rand-Unix@sri-unix (09/01/82)

Date: Thursday, 26 Aug 1982 09:37-PDT
Well, it turns out that the size of the structure you mentioned:

struct x {
	char x1;
	int x2;
} x;

really, truly is 8 bytes.  There are three wasted bytes in the middle.
Even if you reverse the order of the structure elements the result is still
eight bytes long.  If you don't read and write eight bytes when trying to
fill this structure you will lose data.

	As you surmised, this is due to alignment.  The C compiler chooses
to waste space in order to save time.  "sizeof" is not lying and you shouldn't
lose faith in it just because only 5 of the bytes contain "real data".  The
object occupies 8 bytes of memory.  Alignment is NOT "transparent" when it
comes time to externalize the representation of the data, for example by
writing it to disk in its internal form.  It is for this reason that
transferring data from a VAX to a PDP-11 gives great pain, sometimes, unless
you convert everything to character streams via carefully written things
like "tar".
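
To see where the wasted bytes sit, one can ask the compiler directly. The sketch below assumes a compiler that provides offsetof in <stddef.h>; "struct x_rev" and the two function names are invented for the example, and the numbers they return are machine-dependent (3 bytes each where int is 4 bytes and 4-byte aligned):

```c
#include <assert.h>
#include <stddef.h>

struct x {
    char x1;
    int x2;
};

/* Hypothetical variant with the members in the opposite order. */
struct x_rev {
    int x2;
    char x1;
};

/* Bytes wasted between x1 and x2 in struct x. */
size_t hole_before_x2(void)
{
    return offsetof(struct x, x2) - sizeof(char);
}

/* Bytes wasted after x1 at the end of struct x_rev, so that the
   next element of an array starts on a suitable boundary. */
size_t tail_padding_reversed(void)
{
    return sizeof(struct x_rev)
         - (offsetof(struct x_rev, x1) + sizeof(char));
}
```

Reversing the members only moves the hole from the middle to the end, which is why the total comes out the same either way.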

lazear@Mitre@sri-unix (09/01/82)

From:   <lazear@Mitre>
Date: 26 Aug 1982 16:42:17 EDT (Thursday)
As recently seen on this list, the Berkeley compiler *knows* it is more
efficient to align int's, long's, and other biggies on certain boundaries.
Sure, the VAX book says they can start anywhere, but C knows *better*!

This problem annoys only those concerned about the inefficiency of
writing extra bytes, *and* those who are communicating between
dissimilar machines.  The latter was my case and gave me something to
do when I didn't really need the hassle.  Re-specifying structures so
they looked the same to all machines that were reading from each other
is a boring task, but is at the heart of real portability.
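
One way such a re-specified structure might look in present-day C is sketched below. It assumes the fixed-width types from <stdint.h>, which are not what was used here, and the names "struct wire_rec" and wire_rec_layout_ok are hypothetical. The filler is declared explicitly so that no compiler has a reason to insert its own, and the layout is checked rather than trusted:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A record whose member sizes and offsets are spelled out. */
struct wire_rec {
    uint8_t a;        /* 1 byte of data                          */
    uint8_t pad[3];   /* explicit filler up to a 4-byte boundary */
    int32_t b;        /* 4 bytes of data at offset 4             */
};

/* Returns 1 if this compiler laid the record out as intended. */
int wire_rec_layout_ok(void)
{
    return offsetof(struct wire_rec, a) == 0
        && offsetof(struct wire_rec, b) == 4
        && sizeof(struct wire_rec) == 8;
}
```

This pins down sizes and offsets only. It does nothing about the order of the bytes within b, so two machines can agree on the layout and still disagree on the value.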

Walt Lazear at Mitre

mo@Lbl-Unix@sri-unix (09/04/82)

From: mo at Lbl-Unix (Mike O'Dell [system])
Date: 1 Sep 1982 10:01:24-PDT
Walt,
Not only is specifying structures so they look the same a boring task,
IT WON'T WORK!  You will never get around the BigEndian-LittleEndian
problem; Vax and 11's even argue over arrangement of Longs! The
real issue is that binary files are NEVER portable between machines.
You can write portable programs which use binary files internally
as temporaries, but you will never succeed in making them portable
without extreme inefficiency in one place or another.  If you are
moving data between systems, it has gotta be ASCII strings,
and then you have to fight with IBM systems.  It is ironic that
the Unix world is just now discovering what the Software Tools
people have been doing for several years now: building software
which runs on many, many machines without modification.  If there
is interest (and I think there is), I am planning a talk on
"Software Tools Experiences Applied to the Unix World" for the
San Diego Usenix meeting.
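
For what it is worth, the ASCII-strings approach can be sketched in a few lines of C. The function names and the "char code, space, long, newline" text format are assumptions made up for this example, snprintf is a later addition to the C library, and the sketch assumes the value fits in a long on the receiving machine:

```c
#include <assert.h>
#include <stdio.h>

struct thing {
    char a;
    long b;
};

/* Format *t as decimal text: "<code of a> <b>\n".
   Returns the number of characters written (not counting the NUL),
   or -1 if buf is too small. */
int thing_to_text(const struct thing *t, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%d %ld\n", (int)t->a, t->b);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Parse text produced by thing_to_text back into *t.
   Returns 0 on success, -1 if the text does not match. */
int thing_from_text(struct thing *t, const char *buf)
{
    int a;
    long b;

    if (sscanf(buf, "%d %ld", &a, &b) != 2)
        return -1;
    t->a = (char)a;
    t->b = b;
    return 0;
}
```

Neither padding nor byte order appears in the text, at the cost of the formatting and parsing work on each end.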

	-Mike