[comp.lang.c] Overzealous alignment and padding

scjones@sdrc.UUCP (Larry Jones) (10/15/88)

I am looking for information about C compilers that are overzealous
about aligning structure members and padding structures.  By this
I mean that the compiler aligns more strictly than is required by
the underlying hardware - for example, aligning an array of char
on an int boundary rather than a char boundary.  Similarly, a
structure may be padded out to an even number of ints even though
all the members are only char.

Although this behavior is acceptable under the draft ANSI
standard, it makes it impossible to declare a struct that
conforms to some external format (e.g. records in a file).  If a
compiler does this by default but has some way to prevent it,
please let me know what compiler and how to disable it (e.g.
compiler switch, pragma, etc.).  If a compiler doesn't have a way
to disable it, I DEFINITELY want to know!

If there are lots of compilers like this (which I doubt since
I've just run into my first and I've dealt with LOTS before),
then we're going to have to do a major rethink of a product we
were just about to announce.  If there are only a few (fingers
crossed), we can just write them off.

If you know of a compiler like this, please mail me the info and
I will summarize.

----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                      scjones@sdrc.uucp
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150                  AT&T: (513) 576-2070
"Save the Quayles" - Mark Russell

henry@utzoo.uucp (Henry Spencer) (10/16/88)

In article <410@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
>I am looking for information about C compilers that are overzealous
>about aligning structure members and padding structures...
>If there are lots of compilers like this ...
>then we're going to have to do a major rethink...

You're going to have to do a major rethink.  Such compilers are indeed
common.  Many compilers pad all structs out to a word boundary just on
general principles.  Some of them actually have important reasons for
doing so:  on any machine where char pointers are a hassle -- Data General
and Cray, to name two -- there is a strong efficiency incentive to pad
all structs out so they can use word pointers.  Life is also much simpler
if all struct pointers share the same representation, which again bites
you if char pointers are different.

>... it makes it impossible to declare a struct that
>conforms to some external format (e.g. records in a file)...

Right.  This can't be done portably in C.  You have to read the data in
as a lump of bytes and then do a component-by-component copy.  (This is
necessary anyway for ints and such, due to byte-ordering problems.  To
say nothing of questions like the size of ints and the representation
of floating point...)

Yes, this is a hassle.
-- 
The meek can have the Earth;    |    Henry Spencer at U of Toronto Zoology
the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu
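
The read-as-bytes-then-copy approach described above might look
something like the following.  This is only a minimal sketch: the
10-byte big-endian record layout, the struct, and every function
name here are made up for illustration and are not from any
poster's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical external record: 10 bytes, no padding, big-endian.
 *   bytes 0-3  name  (4 chars, not NUL-terminated)
 *   bytes 4-5  id    (16-bit unsigned)
 *   bytes 6-9  count (32-bit unsigned)
 */
#define RECORD_SIZE 10

/* In-memory form; the compiler may pad this however it likes. */
struct record {
    char name[5];          /* 4 chars plus a terminating NUL */
    unsigned int id;
    unsigned long count;
};

/* Assemble a 16-bit big-endian value from two bytes. */
static unsigned int get_u16_be(const unsigned char *p)
{
    return ((unsigned int)p[0] << 8) | (unsigned int)p[1];
}

/* Assemble a 32-bit big-endian value from four bytes. */
static unsigned long get_u32_be(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8)  |  (unsigned long)p[3];
}

/* Decode one external record from buf into *out, field by field. */
void decode_record(const unsigned char *buf, struct record *out)
{
    assert(buf != NULL && out != NULL);
    memcpy(out->name, buf, 4);
    out->name[4] = '\0';
    out->id = get_u16_be(buf + 4);
    out->count = get_u32_be(buf + 6);
}
```

A caller would fread() RECORD_SIZE bytes into an unsigned char
buffer and pass it to decode_record().  Because each field is
rebuilt from individual bytes, neither the host's struct padding
nor its byte order affects the result.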

rkl1@hound.UUCP (K.LAUX) (10/17/88)

	The best way is to align the structure yourself!  Sounds as if
you have structures where the first field is a 'char', so you don't
know how the compiler will align it.  So don't make the first field a
char...maybe you might have to insert a dummy 'int' field if all you
have are 'char' fields.  Alternatively, you could use a *union* of an
'int' with your structure of 'char' fields.

	What you're doing is allowing the *compiler* to decide how to
align instead of *telling* the compiler how to align...

--rkl
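
The union suggestion above could be sketched roughly as follows.
The struct, the union, and the probe helpers are hypothetical
names chosen for illustration.  Note that this only pins down the
alignment of the object as a whole; it does not promise that the
compiler inserts no padding between or after the char members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record made only of char fields. */
struct tag_chars {
    char kind;
    char flags;
    char code[2];
};

/* The int member is never read; it is there only to force the
 * union (and so the char fields) onto an int boundary. */
union aligned_tag {
    int align;
    struct tag_chars t;
};

/* Probe structs: in practice the offset of the second member
 * reveals the alignment the compiler chose for its type. */
struct int_probe { char c; int i; };
struct tag_probe { char c; union aligned_tag u; };

size_t int_alignment(void) { return offsetof(struct int_probe, i); }
size_t tag_alignment(void) { return offsetof(struct tag_probe, u); }

/* Confirm the union is aligned at least as strictly as int. */
void check_tag_alignment(void)
{
    assert(tag_alignment() % int_alignment() == 0);
    assert(sizeof(union aligned_tag) >= sizeof(int));
}
```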

scjones@sdrc.UUCP (Larry Jones) (10/18/88)

In article <410@sdrc.UUCP>, I wrote:
> I am looking for information about C compilers that are overzealous
> about aligning structure members and padding structures.  By this
> I mean that the compiler aligns more strictly than is required by
> the underlying hardware - for example, aligning an array of char
> on an int boundary rather than a char boundary.  Similarly, a
> structure may be padded out to an even number of ints even though
> all the members are only char.

Well, I've been inundated by the first wave of replies saying
"LOTS of compilers do this!" which means I didn't make myself
clear and, sure enough, as I reread what I said, not only did I
not make myself clear, I didn't say what I meant at all!

What I'm interested in knowing about is NOT compilers that align
more strictly than the hardware requires, but rather compilers
that align more strictly in some cases than in others.  For
example, if a compiler chooses to align char on a multiple of 1,
short on a multiple of 2, and long on a multiple of 4, that's OK
by me.  If, however, it further aligns char arrays or char
structure members on a multiple of 2 or 4 (even though char is
only aligned on a multiple of 1), THAT'S what I want to know
about (particularly if there's no way to disable it).

So far, the only compilers I know of that behave this way are the
Microsoft V4 (and related Xenix) compiler (which had a switch to
disable it), and the Apollo compiler (which does not).
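
One way to find out whether a given compiler behaves like this is
a small probe built on offsetof and sizeof.  The following is a
minimal sketch; the struct and function names are invented for
illustration, and the expected offsets assume that plain char is
aligned on a multiple of 1.

```c
#include <assert.h>
#include <stddef.h>

/* A structure containing nothing but char. */
struct inner {
    char a;
    char b;
};

struct probe {
    char first;        /* offset 0 */
    char text[3];      /* offset 1 unless char arrays get extra alignment */
    struct inner in;   /* offset 4 unless char-only structs get extra alignment */
    char last;         /* offset 6 unless struct inner was padded */
};

/* Returns 1 if char arrays and char-only structs are laid out with
 * no more alignment or padding than a plain char, 0 otherwise. */
int char_members_are_packed(void)
{
    return offsetof(struct probe, text) == 1
        && offsetof(struct probe, in) == 4
        && offsetof(struct probe, last) == 6
        && sizeof(struct inner) == 2
        && sizeof(struct probe) == 7;
}

/* A program that depends on this layout could call this at startup. */
void require_packed_chars(void)
{
    assert(char_members_are_packed());
}
```

A result of 0 means the compiler is doing the kind of extra
alignment described above, at least with the options in effect
when the probe was compiled.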
----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                      scjones@sdrc.uucp
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150                  AT&T: (513) 576-2070
"Save the Quayles" - Mark Russell

scs@athena.mit.edu (Steve Summit) (10/23/88)

In article <410@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
>I am looking for information about C compilers that are overzealous
>about aligning structure members and padding structures...
>it makes it impossible to declare a struct that
>conforms to some external format (e.g. records in a file)

As was discussed recently in this newsgroup, you can avoid all
structure arrangement problems by simply not attempting to
conform to external binary formats, but by using an external text
file format instead.  (Efficiency hackers find this solution
unpalatable, but the parse time and file size issues are often
not real problems in practice.)

                                            Steve Summit
                                            scs@adam.pika.mit.edu
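
As a rough illustration of the text-format approach, a record can
be written as one line of text and parsed back, so that struct
layout never reaches the file.  This is a sketch only: the struct,
the field names, and the one-line format are assumptions made for
the example, and the name field is assumed to contain no
whitespace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory record. */
struct part {
    char name[16];     /* no embedded whitespace, at most 15 chars */
    int quantity;
    long serial;
};

/* Write *p into line as "name quantity serial\n".
 * Returns 0 on success, -1 if the name cannot round-trip or the
 * buffer is too small. */
int format_part(const struct part *p, char *line, size_t size)
{
    int n;

    assert(p != NULL && line != NULL);
    if (p->name[0] == '\0' || strpbrk(p->name, " \t\r\n") != NULL)
        return -1;
    n = snprintf(line, size, "%s %d %ld\n",
                 p->name, p->quantity, p->serial);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Parse one line produced by format_part() into *p.
 * Returns 0 on success, -1 if any field is missing. */
int parse_part(const char *line, struct part *p)
{
    assert(line != NULL && p != NULL);
    return sscanf(line, "%15s %d %ld",
                  p->name, &p->quantity, &p->serial) == 3 ? 0 : -1;
}
```

The file then depends only on the agreed text format, not on
alignment, padding, byte order, or the size of int on the machine
that wrote it.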

cd@hrc63.co.uk (Colin Denman "GECCL") (11/01/88)

In article <7613@bloom-beacon.MIT.EDU>, scs@athena.mit.edu (Steve Summit) writes:
> In article <410@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
> >[...]
> 
> As was discussed recently in this newsgroup, you can avoid all
                                                             ^^^
> structure arrangement problems by simply not attempting to
> conform to external binary formats, but by using an external text
> file format instead.  (Efficiency hackers find this solution
> unpalatable, but the parse time and file size issues are often
> not real problems in practice.)

My most frequent problems with enforced alignments occur with
*hardware* formats and ones over which I have no control.  What
external text format do I use for them :->.

If my machine doesn't have floating point hardware, I still
expect a worthy C implementation to simulate it with loss of
efficiency.  Why can't the same attitude be adopted to matters of
alignment and bit manipulation?  It would be neat and portable
if I could express a format (bitfields and all) secure in the
knowledge that the only thing I lose is efficiency.  If my
solution permits, I can optimise data layouts, but at least I get
something that *works*.

It is not just a problem for "efficiency hackers", quite the
opposite.

		Colin J Denman
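
Lacking such a language feature, the usual hand-written workaround
for a fixed hardware format is explicit shifting and masking,
which trades efficiency for a layout that does not depend on the
compiler.  The following is a minimal sketch; the 16-bit status
word, its field positions, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 16-bit device status word, stored as two bytes with
 * the high byte first:
 *   bits 15..12  channel
 *   bits 11..1   count
 *   bit  0       ready
 */
struct status {
    unsigned int channel;
    unsigned int count;
    unsigned int ready;
};

/* Extract the three fields from the two raw bytes. */
void unpack_status(const unsigned char bytes[2], struct status *s)
{
    unsigned int word;

    assert(bytes != NULL && s != NULL);
    word = ((unsigned int)bytes[0] << 8) | (unsigned int)bytes[1];
    s->channel = (word >> 12) & 0xFu;
    s->count   = (word >> 1) & 0x7FFu;
    s->ready   = word & 0x1u;
}

/* Rebuild the two raw bytes from the three fields. */
void pack_status(const struct status *s, unsigned char bytes[2])
{
    unsigned int word;

    assert(bytes != NULL && s != NULL);
    word = ((s->channel & 0xFu) << 12)
         | ((s->count & 0x7FFu) << 1)
         | (s->ready & 0x1u);
    bytes[0] = (unsigned char)((word >> 8) & 0xFFu);
    bytes[1] = (unsigned char)(word & 0xFFu);
}
```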