[net.unix] Porting mallocs and structures

vlv@drux1.UUCP (Vaughn Vernon) (07/19/85)

Thanks for all the responses!  So fast also!

So many of the folks that responded said: "So what's the problem?".
That's the same thing that I asked myself.  In the interest of finding
out more on alignment and possible problems I made up some of the members
in the structure example.  I figured: if you're going to ask a question,
be as stupid as possible and get all the information you *may* ever need!

You see, I made up almost all of the members in the structure.
My real structure only contained character arrays and single
character variables.  I did it not knowing how the 3B would treat
int's (and alignment), and for some other reasons too.  My code was
blamed for problems but nothing appears to be wrong with my code.
At run-time we were getting something like:
	"Invalid Instruction - core dumped" (?huh?-you got me?!)
I was told that I had a one in four chance of getting a word boundary
from malloc on a four byte word non-aligned system.  Sounds like Las
Vegas to me!

But I did learn from *your* help that I don't have to give up all those
slick coding methods.  I'm not saying sleazy.  I don't do that.
One of my big concerns was whether sizeof(struct line) took into
consideration (knew) the gaps that are needed to align int's, doubles and
the like.
Why wouldn't it?!

According to Rich Hammond (rest below):
>3) Sizeof is built into the compiler and knows how big the structure is
>   counting the gaps between elements.
I tested it.  Even on the VAX, cc and sizeof() account for the gaps.

I also realize that arranging the members so they align well on any
machine is the best thing to do, since the code will run faster and
use less memory.  The 8086's bus will make two trips to give you an int
that straddles two two-byte words.  This has some overhead.  Along this
line, the winner of the Fastest Response Award, Alan Bland writes:
>by some external interface).  I would define your structure as follows.
>The first members are the ones that typically need the largest
>alignment boundary.  Character arrays are always last.

By coding: struct x { int a; char b,c,d,e,f,n[81]; };
instead of: struct x { char n[81]; int a; char b,c,d,e,f; };
or even:   struct x { int a; char n[81]; char b,c,d,e,f; };
you use 4 and 3 fewer bytes respectively per structure on the VAX.
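
A minimal sketch of the ordering rule, assuming a compiler that pads
members to their natural alignment.  The tags "ordered" and "char_first"
and the two helper functions are made-up names for illustration; the
member lists are the ones shown above.  The actual byte counts depend
on the machine and compiler.

```c
#include <stddef.h>

/* Largest-alignment member first, character data last. */
struct ordered    { int a; char b, c, d, e, f, n[81]; };

/* Character array first: the int that follows may need padding before it. */
struct char_first { char n[81]; int a; char b, c, d, e, f; };

/* Padding added by the compiler: total size minus the sum of the members. */
size_t padding_ordered(void)
{
    return sizeof(struct ordered) - (sizeof(int) + 5 + 81);
}

size_t padding_char_first(void)
{
    return sizeof(struct char_first) - (sizeof(int) + 5 + 81);
}
```

Comparing padding_ordered() with padding_char_first() on your own
machine shows how much the reordering saves there, if anything.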

This is the kind of rule that I was looking for.  It makes simple sense
and you don't have to use any crazy unions.  According to the other
responses about sizeof() though, it's not necessary on 3B's and 68000's.
It certainly is safe and efficient.

Yes I did goof in my example.  Matt Crawford caught this (first):
	if((lines[i]->xyz = malloc(sizeof(lines[i]->xyz)))==(char *)NULL)
	                           ---------------------
It was supposed to be:
	if((lines[i]->xyz = malloc(strlen(something)+1))==(char *)NULL)
	                           -------------------
	strcpy(lines[i]->xyz, something);
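
The corrected allocation can be wrapped up as a small helper.  This is
only a sketch; copy_string is a hypothetical name, not something from
the original code.

```c
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of s, or NULL if malloc fails. */
char *copy_string(const char *s)
{
    char *p = malloc(strlen(s) + 1);    /* +1 for the terminating '\0' */

    if (p == NULL)
        return NULL;
    strcpy(p, s);
    return p;
}
```

With it, the assignment reads lines[i]->xyz = copy_string(something);
and the caller still has to check the result against NULL.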

For the most part all stressed the rule : char	*malloc(num);
	"allocates num bytes.  The pointer returned is sufficiently well
	 aligned to be usable for any purpose.  ..."

I totally agree with Jim:
> ....  What is the
>use of a high level language if you have to worry about stuff like this.

Enjoy reading and thanks again.
Vaughn Vernon
ihnp4!drutx!drux1!vlv
Disclaimer!Disclaimer!Disclaimer!
(response cut off Jul 18 - 1:00pm - this is getting huge! edited.)
-----------------
From mab Wed Jul 17 14:39 MDT 1985 remote from druca

First, according to the manual page, malloc is guaranteed to return
a pointer that is suitably aligned to hold any data object.  So, at
least for the first structure member, it will always be aligned
properly.
As far as alignment within the structure, it's often possible to
rearrange the structure members so that there are no alignment problems
(of course you can't do this if the structure ordering is imposed
by some external interface).  I would define your structure as follows.
The first members are the ones that typically need the largest
alignment boundary.  Character arrays are always last.
struct	line	{
	double	abc;	/* no alignment by me! */
	char	*xyz;
	int	x,	/* here either */
		y,
		z;
	char	n[81];
} *lines[MAXLINES];
This definition shouldn't have any extra filler bytes anywhere on most
machine architectures.  If there are any machines where ints are larger
than pointers, then there might be filler characters stuffed between
*xyz and x, but the above technique works for me on vax, 3b, and 8086.
	Alan Bland, druca!mab
-------------------
From: ulysses!ggs (Griff Smith)

It's really not a problem.  The specification of malloc requires
that the pointer it returns must point to a "safe" address.
...
To get to a 32-bit word boundary for a character pointer, you can use
	p += 3;
	p &= ~3;
I don't recommend this for portable code, however; it makes too many
assumptions about the structure of a pointer.  Malloc is safe.
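
For reference, the two statements above do not compile as written on
compilers that refuse bitwise operators on pointers.  A sketch of the
same rounding through an integer type follows.  It assumes a C99
compiler with uintptr_t (not available in 1985) and a flat address
space where the integer value of a pointer is its byte address;
align_up4 is a made-up name.

```c
#include <stdint.h>

/* Round p up to the next 4-byte boundary (p itself if already aligned).
   Assumes uintptr_t exists and that the conversion keeps byte addresses
   intact, which is common but not guaranteed by the language. */
char *align_up4(char *p)
{
    uintptr_t addr = (uintptr_t)p;

    addr = (addr + 3) & ~(uintptr_t)3;
    return (char *)addr;
}
```

The same caveat applies: this is not portable, and malloc already hands
back suitably aligned storage.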
-------------------
From: ihnp4!oddjob!matt (Matt Crawford)

The compiler decides at compile time what padding is needed and where,
assuming that the beginning of the struct is on a major boundary.
...
You are doing one fishy thing, however.  Why do you write
	if((lines[i]->xyz = malloc(sizeof(lines[i]->xyz)))==(char *)NULL)
	                           ---------------------
?  lines[i]->xyz already has space to hold a pointer-to-char.
You should perhaps be allocating space for a string of some
other size than the size of the pointer itself.
-----------------
From: ihnp4!bellcore!hammond (Rich A. Hammond)

What was your problem?  The malloc(sizeof (struct line) ) should work on
any machine because sizeof is built into the compiler and knows how big
the structure is and the compiler should allocate offsets in the structure
so that all elements fall on legal boundaries.  Note that malloc is
defined to return at least the requested number of bytes on a boundary
suitable for storage of any object.
Are you sure your problem was with malloc?
Please post the responses to the net, I don't see a problem with what
you did.
-----------------
From: hounx!bear
It's supposed to be automatic.  malloc(3C) says: "Malloc returns a pointer
to a block of at least size bytes SUITABLY ALIGNED for any use."  In my
experience (VAX, IBM 3081, 6300) what you originally used works.  How much
space the structure takes up is a function of the machine.  What is the
use of a high level language if you have to worry about stuff like this.
Jim
------------------------
From: ihnp4!bellcore!hammond (Rich A. Hammond)

You don't need the unions, the original example,
struct line, ... malloc(sizeof (struct line)) will work because
1) malloc returns a pointer to an area that is at least as large
   as requested, "suitably aligned (after possible pointer coercion) for
   storage of ANY [emphasis mine] type of object."
...
2) The compiler automagically leaves gaps in structures when needed to
   make sure that the elements fall on an appropriate addressing boundary.
3) Sizeof is built into the compiler and knows how big the structure is
   counting the gaps between elements.
...
What was the problem?  All you said was that you had one, but the code
(not the union stuff) looked fine.
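
The three points above can be sketched as follows.  The member list is
the struct line quoted elsewhere in these replies; new_line is a
hypothetical helper name, and the zero initial values are an arbitrary
choice for the example.

```c
#include <stdlib.h>

struct line {
    char    n[81];
    double  abc;
    int     x, y, z;
    char   *xyz;
};

/* Allocate one struct line.  sizeof already counts any padding the
   compiler inserted, and malloc returns suitably aligned storage. */
struct line *new_line(void)
{
    struct line *lp = malloc(sizeof *lp);

    if (lp == NULL)
        return NULL;
    lp->n[0] = '\0';
    lp->abc = 0.0;
    lp->x = lp->y = lp->z = 0;
    lp->xyz = NULL;
    return lp;
}
```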
---------------------
From: ihnp4!ihnet!tjr

The subject of the internal alignment of objects in structures
is one of the (several) areas where C is not very portable.
The lack of portability is in the data formats, not the code (i.e.
any code should work on any machine, as long as the data is not
passed to another machine).
Unfortunately, different compilers work differently when aligning
objects within structures (and outside, too).
VAX:	no alignment (SVR2 C compiler)
3B2:	every int or short or long aligned to 4-byte addr;
	every struct aligned to 4-byte addr.
	chars and char arrays have no alignment.
BELLMAC8: no alignment.
In general, we have found it necessary to add filler arrays of chars
to manually align everything other than chars to a 4-byte addr
within the enclosing struct; we also fill all structs to be a
multiple of 4-bytes long.
EXAMPLE: struct bletch { int a; char b[3]; char filler1; struct foo c;
	char d; char filler2[3]; int e; };
Again, this problem normally only gets you when you have multiple
CPU-types passing data around.
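
A sketch of how the manual filler in the EXAMPLE could be checked with
offsetof.  It assumes 4-byte ints; struct foo is not defined in this
message, so the one-int version below is a stand-in, and
bletch_is_4_aligned is a made-up name.

```c
#include <stddef.h>

struct foo { int v; };      /* hypothetical stand-in for the real struct foo */

struct bletch {
    int        a;
    char       b[3];
    char       filler1;     /* pads b out to 4 bytes */
    struct foo c;
    char       d;
    char       filler2[3];  /* pads d out to 4 bytes */
    int        e;
};

/* Return 1 if every non-char member starts at a multiple of 4 and the
   whole struct is a multiple of 4 bytes long, else 0. */
int bletch_is_4_aligned(void)
{
    return offsetof(struct bletch, a) % 4 == 0
        && offsetof(struct bletch, c) % 4 == 0
        && offsetof(struct bletch, e) % 4 == 0
        && sizeof(struct bletch) % 4 == 0;
}
```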
----------------------
From lfd Thu Jul 18 11:01:25 EDT 1985 remote from whuxlm

Then, quoting from MALLOC(3C) or MALLOC(3X) in the System V
Release 2.0 Programmer Reference Manual,
    Malloc returns a pointer to a block of at least
    size bytes suitably aligned for any use.
	Lee Derbenwick
	AT&T BL, Whippany NJ
------------------
From mp  Thu Jul 18 10:15:42 1985 remote from allegra

malloc returns addresses that are aligned on suitably strict boundaries
so that you shouldn't have any problem accessing structure elements.
By the way, what does this do?  It looks like it's allocating 4 bytes
(the size of a character pointer) and assigning the address
of those 4 bytes to a character pointer.
	if((lines[i]->xyz = malloc(sizeof(lines[i]->xyz)))==(char *)NULL)
	Mark Plotnick
	allegra!mp
-------------------
From: vax135!petsd!pedsgd!jje

according to the malloc(3C) manual page (at least the System III
and System V versions) "...each of the allocation routines returns
a pointer to space suitably aligned (after possible pointer coercion)
for storage of *any* type of object..." (emphasis added).
--Jeremy Epstein Perkin-Elmer {decvax,ucbvax}!vax135!petsd!pedsgd!jje
----------------
From: ihnp4!uiucdcs!uiucdcsb.Uiuc.ARPA!johnston (Gary Johnston)

	1. Malloc(III) *always* returns a word-aligned address (i.e.,
	   even).  This is true for every malloc() that I've ever
	   heard of.  Malloc has no idea of the type of object that
	   you are intending to allocate memory for, but everything
	   should be OK if it's word-aligned.
	2. The compiler should take care of aligning fields within
	   structs.  The offsets that my 68000 C compiler would use
	   are
		struct	line	{
			char	n[81];	/* 0 */
			double	abc;	/* 82 (NOTE: PADDING WAS DONE) */
			int	x,	/* 90 */
				y,	/* 94 */
				z;	/* 98 */
			char	*xyz;	/* 102 */
		} *lines[MAX_LINES];
		So, sizeof(struct line) is 106 bytes.
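
To see the offsets a particular compiler picks, something like the
sketch below can be used.  It assumes the offsetof macro from
<stddef.h>, which older compilers may not provide; print_line_layout is
a made-up name.  The numbers printed will differ from machine to
machine.

```c
#include <stddef.h>
#include <stdio.h>

struct line {
    char    n[81];
    double  abc;
    int     x, y, z;
    char   *xyz;
};

/* Print the offset of each member and the total size of struct line. */
void print_line_layout(void)
{
    printf("n    %lu\n", (unsigned long)offsetof(struct line, n));
    printf("abc  %lu\n", (unsigned long)offsetof(struct line, abc));
    printf("x    %lu\n", (unsigned long)offsetof(struct line, x));
    printf("y    %lu\n", (unsigned long)offsetof(struct line, y));
    printf("z    %lu\n", (unsigned long)offsetof(struct line, z));
    printf("xyz  %lu\n", (unsigned long)offsetof(struct line, xyz));
    printf("size %lu\n", (unsigned long)sizeof(struct line));
}
```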
--------------
From: ihnp4!plus5!hokey

I am surprised that you had a problem given the example you posted.  The
compiler is supposed to align structure members.  This means that the
total size of the structure may be larger than it "could" be, but all the
elements are properly aligned for you.  Note that writing the structure
to disk will keep the "alignment holes" in place, which means one can
read the *structure* back in without problems.  If you write the structure
out to disk element-by-element, these "alignment holes" are not written,
but you get better space allocation on disk.  This also means that the data
must be read back in element-by-element.
Are you doing something tricky like trying to reference elements based on
an offset from the beginning of the structure?
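
A sketch of the element-by-element approach, using a made-up two-member
struct rec and hypothetical helpers write_rec and read_rec.  Note that
this only removes the padding; byte order and the size of int can still
differ between machines.

```c
#include <stdio.h>

struct rec {            /* hypothetical record for illustration */
    char tag;
    int  count;
};

/* Write each member separately, so no padding bytes reach the file.
   Returns 0 on success, -1 on a short write. */
int write_rec(FILE *fp, const struct rec *r)
{
    if (fwrite(&r->tag, sizeof r->tag, 1, fp) != 1)
        return -1;
    if (fwrite(&r->count, sizeof r->count, 1, fp) != 1)
        return -1;
    return 0;
}

/* Read the members back in the same order they were written.
   Returns 0 on success, -1 on a short read. */
int read_rec(FILE *fp, struct rec *r)
{
    if (fread(&r->tag, sizeof r->tag, 1, fp) != 1)
        return -1;
    if (fread(&r->count, sizeof r->count, 1, fp) != 1)
        return -1;
    return 0;
}
```

Each record then takes sizeof(char) + sizeof(int) bytes on disk, which
may be less than sizeof(struct rec).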
----------------
From merlyn Thu Jul 18 11:21 EDT 1985 remote from avalon

The malloc routine "returns a pointer ... suitably aligned
for any use" according to the manual. Is this what you're
looking for?
			Steve Humphrey