[net.unix] Portability using structures and malloc - Help

vlv@drux1.UUCP (Vaughn Vernon) (07/18/85)

No first time guessers please!  I need help.

I've really been wondering about the way in which the UNIX 'C' compiler
handles structures and memory allocation across processors.

I recently ported a large program from a VAX to a 3B2 and had some
real problems with it.  I would like to propose an example and some
questions about the simple concept that was used.  I will also give
my (probably wrong) proposed answer.

struct	line	{
	char	n[81];
	double	abc;	/* no alignment by me! */
	int	x,	/* here either */
		y,
		z;
	char	*xyz;
} *lines[MAXLINES];
...
	if((lines[i] = (struct line *)malloc(sizeof(struct line)))\
		==(struct line *)NULL)
		...
	if((lines[i]->xyz = malloc(sizeof(lines[i]->xyz)))==(char *)NULL)
		...

Look closely.  I'm allocating memory for the structure and getting a
pointer back.  The pointer returned on a 3B or 68K may or may not be
on a word boundary.  Right?  In either case, what will this do
to the xyz character pointer and the ints?  Malloc() does not know it's
dealing with a structure pointer so will xyz be aligned?  What about the
address being returned to the xyz pointer?  I would think that as long as
xyz is on a word boundary then the pointer returned to the character array
would not have to be aligned since they are only characters.  Right?

The C programmer's manual (K&R) says (speaking of bit fields):

" ... an unnamed field with a width of 0 specifies alignment of the
next field at word boundary.  The "next field" presumably is a field,
not an ordinary structure member, (***) because in the latter case
the alignment would have been automatic (***)."

I would not think that this would apply to malloc() since a character
pointer can point to an odd address and it would not matter, since
characters are contiguous.  Does malloc() know the object that it's
working on?

Do I need to know what processor I'm on and how much extra memory
I need to allocate so I can sloppily adjust the struct pointer forward to
the even address I need?  How do I get xyz into an even word address?
Can someone please tell me a set of rules to follow for each processor
(i.e. 3B && 68K etc.) ?!

I would hate to think that I would have to take the (*) out from
in front of the lines array and let the compiler handle it!  That's
a whole lot of memory that would not be used at any one time.

Proposed partial answer: use unions to get alignment.

union aln_int { char	c; int		x; };
union aln_ptr { char	c; char		*xyz; };
union aln_dbl { char	c; double	d; };

struct	line	{
	char		n[81];
	union aln_dbl	abc;
	union aln_int	x,
			y,
			z;
	union aln_ptr	xyz;
} *lines[MAXLINES];

Does this help at all?  Or is malloc still going to give me a problem?

				Thanks in advance,

				Vaughn Vernon
				AT&T ISL
				Denver, CO
				ihnp4!drutx!drux1!vlv

I will post answers to the net.
Unix is AT&T's Trademark
VAX is Digital's Trademark

All that disclaimer stuff.  Besides, my intelligence is artificial!

mjs@eagle.UUCP (M.J.Shannon) (07/18/85)

> No first time guessers please!  I need help.
> 
> I've really been wondering about the way in which the UNIX 'C' compiler
> handles structures and memory allocation across processors.
> 

The compilers produced by AT&T pad structures as necessary to deal properly
with alignment of members.  The only guarantee made about structure members is
that the (lexically) N+1th member will have an address larger than the Nth
member.  Further, malloc() returns a pointer suitable to be cast to the most
strictly aligned aggregate, so you shouldn't be experiencing any problems (from
the compiler at least).  What problem are you experiencing?
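
A minimal sketch of how both claims about structure layout could be
checked on a given machine, assuming a C11 compiler (offsetof and
_Alignof did not exist in this form in 1985).  The structure is the one
from the original article; the function names are made up for
illustration.

```c
#include <stddef.h>   /* offsetof, size_t */

/* The structure from the original article. */
struct line {
	char	n[81];
	double	abc;
	int	x, y, z;
	char	*xyz;
};

/* Returns 1 if member offsets increase in declaration order. */
int members_in_order(void)
{
	return offsetof(struct line, n)   < offsetof(struct line, abc)
	    && offsetof(struct line, abc) < offsetof(struct line, x)
	    && offsetof(struct line, x)   < offsetof(struct line, y)
	    && offsetof(struct line, y)   < offsetof(struct line, z)
	    && offsetof(struct line, z)   < offsetof(struct line, xyz);
}

/* Returns 1 if each non-char member sits at a multiple of its alignment. */
int members_aligned(void)
{
	return offsetof(struct line, abc) % _Alignof(double) == 0
	    && offsetof(struct line, x)   % _Alignof(int)    == 0
	    && offsetof(struct line, xyz) % _Alignof(char *) == 0;
}

/* Bytes of padding inserted: total size minus the sum of member sizes. */
size_t padding_bytes(void)
{
	return sizeof(struct line)
	     - (81 + sizeof(double) + 3 * sizeof(int) + sizeof(char *));
}
```

The amount of padding differs from machine to machine, which is why the
sketch computes it rather than stating a number.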
-- 
	Marty Shannon
UUCP:	ihnp4!eagle!mjs
Phone:	+1 201 522 6063

chris@umcp-cs.UUCP (Chris Torek) (07/19/85)

The answer is simple: malloc has been written by someone who knows
the hardware alignment constraints of the machine, and it returns
a pointer that is aligned for *any* use.

(If malloc has been written by someone who does NOT know the hardware
alignment constraints, complain.)
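
A small sketch of how one might test that promise, assuming a C11
compiler, where max_align_t names a type with the strictest fundamental
alignment.  The function names are hypothetical, and converting a
pointer to an integer to inspect its low bits is itself an assumption
about a flat address space.

```c
#include <stddef.h>   /* max_align_t (C11), size_t */
#include <stdint.h>   /* uintptr_t */
#include <stdlib.h>   /* malloc, free */

/*
 * Returns 1 if p is a multiple of the strictest fundamental alignment,
 * 0 otherwise.
 */
int aligned_for_anything(const void *p)
{
	return (uintptr_t)p % _Alignof(max_align_t) == 0;
}

/* Allocates n bytes, checks the alignment, frees; -1 if malloc fails. */
int malloc_is_aligned(size_t n)
{
	void *p = malloc(n);
	int ok;

	if (p == NULL)
		return -1;
	ok = aligned_for_anything(p);
	free(p);
	return ok;
}
```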
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

jpl@allegra.UUCP (John P. Linderman) (07/20/85)

> The answer is simple: malloc has been written by someone who knows
> the hardware alignment constraints of the machine, and it returns
> a pointer that is aligned for *any* use.
> 
> In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)

malloc knows, but it's a pity that you can't make malloc tell.
If there were a nice

    int malign() {return ALIGNMENT_MULTIPLE;}

entry in the malloc package, I could do my own storage allocation out
of an area acquired from malloc or elsewhere.  [I might want to do this
so I could allocate variable-length structures from both ends of an area
until the area was filled, something malloc itself cannot be made to do.]
A totally trivial one-liner that would make it much easier to write
portable software.  How about it, system implementors?
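
A rough sketch of the kind of allocator described above: one area
obtained from malloc and handed out from both ends until the two meet.
It assumes a C11 compiler, whose _Alignof(max_align_t) stands in here
for the ALIGNMENT_MULTIPLE that malloc would not reveal.  The struct
and function names are invented for illustration.

```c
#include <stddef.h>   /* size_t, max_align_t (C11) */
#include <stdlib.h>   /* malloc, free */

#define ALIGNMENT_MULTIPLE _Alignof(max_align_t)

/* A region handed out from both ends until the two cursors meet. */
struct arena {
	char	*base;	/* start of the malloc'ed region */
	size_t	lo;	/* offset of the first free byte at the low end */
	size_t	hi;	/* offset one past the last free byte at the high end */
};

/* Rounds n up to the next multiple of ALIGNMENT_MULTIPLE. */
static size_t round_up(size_t n)
{
	size_t a = ALIGNMENT_MULTIPLE;
	return (n + a - 1) / a * a;
}

/* Returns 0 on success, -1 if malloc fails. */
int arena_init(struct arena *ar, size_t size)
{
	size = round_up(size);
	ar->base = malloc(size);
	if (ar->base == NULL)
		return -1;
	ar->lo = 0;
	ar->hi = size;
	return 0;
}

/* Takes n bytes from the low end; NULL when there is not enough room. */
void *arena_alloc_low(struct arena *ar, size_t n)
{
	void *p;

	if (n > ar->hi - ar->lo)	/* check before rounding: no overflow */
		return NULL;
	n = round_up(n);
	p = ar->base + ar->lo;
	ar->lo += n;
	return p;
}

/* Takes n bytes from the high end; NULL when there is not enough room. */
void *arena_alloc_high(struct arena *ar, size_t n)
{
	if (n > ar->hi - ar->lo)
		return NULL;
	n = round_up(n);
	ar->hi -= n;
	return ar->base + ar->hi;
}

/* Releases the whole area at once. */
void arena_free_all(struct arena *ar)
{
	free(ar->base);
	ar->base = NULL;
	ar->lo = ar->hi = 0;
}
```

Because the base comes from malloc and every offset is a multiple of
ALIGNMENT_MULTIPLE, each pointer handed out is as well aligned as the
base itself.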

John P. Linderman     The much-maligned allegra!jpl

jack@boring.UUCP (07/21/85)

In article <81@drux1.UUCP> vlv@drux1.UUCP (Vaughn Vernon) writes:
>
>struct	line	{
>	char	n[81];
>	double	abc;	/* no alignment by me! */
>	int	x,	/* here either */
>		y,
>		z;
>	char	*xyz;
>} *lines[MAXLINES];
>...
>	if((lines[i] = (struct line *)malloc(sizeof(struct line)))\
>		==(struct line *)NULL)
>		...
>	if((lines[i]->xyz = malloc(sizeof(lines[i]->xyz)))==(char *)NULL)
>		...
What is done here is probably not what was intended. You've allocated
a character array with the size of a pointer (i.e. probably 2 or
4 bytes).
You should either change the second malloc() to 
	... lines[i]->xyz = malloc(MAXLINESIZE) ...
if you *really* want xyz to be a pointer to the data, or change the
declaration to
	...
	char xyz[1];
and replace both malloc() calls with one like this:
	lines[i] = (struct line *)malloc(sizeof(struct line)+
			MAXLINESIZE-1);
This way you'll have the string inside the structure.
This has the disadvantage of being slightly tricky code, but the
advantage that the whole thing is contiguous, and you can dispose
of it with a single free() call.
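
For comparison, a sketch of the single-allocation variant as it could
be written with a C99 flexible array member (char xyz[] instead of
char xyz[1]).  MAXLINESIZE is not defined anywhere in this thread, so
the sketch sizes the allocation from the string itself; new_line is a
hypothetical helper name.

```c
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* strlen, memcpy */

struct line {
	char	n[81];
	double	abc;
	int	x, y, z;
	char	xyz[];	/* flexible array member (C99); must come last */
};

/*
 * Allocates one struct line with room for a copy of text, including its
 * terminating NUL, stored inline.  Returns NULL if malloc fails.
 * A single free() releases the structure and the string together.
 */
struct line *new_line(const char *text)
{
	size_t len = strlen(text) + 1;
	struct line *lp = malloc(sizeof(struct line) + len);

	if (lp == NULL)
		return NULL;
	lp->n[0] = '\0';
	lp->abc = 0.0;
	lp->x = lp->y = lp->z = 0;
	memcpy(lp->xyz, text, len);
	return lp;
}
```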

About alignment of malloc():
- It always assumes the worst case, so the pointer returned can
point to anything, if correctly cast.
- *NEVER* assume that two malloc() calls will give you contiguous
storage. On the contrary, they won't on any machine that I know of,
since malloc() allocates a few bytes for its own administration.
-- 
	Jack Jansen, jack@mcvax.UUCP
	The shell is my oyster.

chris@umcp-cs.UUCP (Chris Torek) (07/21/85)

> malloc knows [the alignment constraints of the machine], but it's
> a pity that you can't make malloc tell.  If there were a nice
>
>   int malign() {return ALIGNMENT_MULTIPLE;}

True.  However, there is a general rule you can use that I've not
yet seen fail on any machine: align your object on an ``n'' byte
boundary, where ``n'' is the smallest power of 2 that is not less
than the size of the object you're allocating.  (Of course this
can be quite wasteful for large areas....)

In case what I wrote doesn't say what I really meant, here's an
example:

	int malign(size)
	register int size;
	{
		register int n = 0;

		while ((1 << n) < size)
			n++;
		return (1 << n);
	}
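
The same rule as a sketch in ANSI C with size_t, plus a hypothetical
align_up helper showing how the result might be applied to an offset.
The rule is conservative for a reason that can be stated: the size of
an object is a multiple of its alignment, and alignments are powers of
two, so the required alignment never exceeds the value computed here.

```c
#include <stddef.h>   /* size_t */

/*
 * Smallest power of two that is not less than size (1 for size 0 or 1).
 * Saturates at the largest power of two a size_t can hold.
 */
size_t malign(size_t size)
{
	size_t a = 1;

	while (a < size && a <= (size_t)-1 / 2)
		a <<= 1;
	return a;
}

/* Rounds offset up to the next multiple of align, a power of two. */
size_t align_up(size_t offset, size_t align)
{
	return (offset + align - 1) & ~(align - 1);
}
```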
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

jpl@allegra.UUCP (John P. Linderman) (07/21/85)

>>> The answer is simple: malloc has been written by someone who knows
>>> the hardware alignment constraints of the machine, and it returns
>>> a pointer that is aligned for *any* use.
>>>
>>> In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
>
> malloc knows, but it's a pity that you can't make malloc tell.
> If there were a nice
>
>     int malign() {return ALIGNMENT_MULTIPLE;}
>
> entry in the malloc package, I could do my own storage allocation.
> A totally trivial one-liner that would make it much easier to write
> portable software.  How about it, system implementors?
>
> John P. Linderman     The much-maligned allegra!jpl

Close, but no cigar, fish-breath.  Don't put the problem on malloc's
doorstep.  Both you and malloc should be able to determine this value, and
a host of others, like minimum alignment required to avoid core dumps,
bits per byte, which release of whose UN*X, paging/non-paging, big/little
endian, host name, number of file descriptors, maximum lengths for
directory entries, path names, arguments passed to exec*, space available
per process, and so on, without building them into your programs.

Some values, like bits per byte, are unlikely to change without forcing a
recompilation, so they could live in a header like
   /usr/include/portable.h
Others, like host name and number of file descriptors, might reasonably be
expected to differ between binary-compatible machines or over time, so
they should be determinable at execution time.  Most of these values are
already available if you know where to look, but knowing where to look is
not portable.  Responsible people on the net [that leaves you out, John]
should collect a list of these values that make porting difficult, and
decide how they can be made known in a standard way.
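
A sketch of the compile-time versus execution-time split, assuming a
POSIX system: CHAR_BIT in <limits.h> is fixed when the program is
compiled, while sysconf() in <unistd.h> answers when it runs.  The
wrapper names are invented for illustration, and only a few of the
values listed above are covered.

```c
#include <limits.h>   /* CHAR_BIT */
#include <unistd.h>   /* sysconf, _SC_OPEN_MAX, _SC_ARG_MAX */

/* Compile time: bits per byte on the machine this was compiled for. */
int bits_per_byte(void)
{
	return CHAR_BIT;
}

/* Run time: maximum number of open file descriptors, or -1 if unknown. */
long max_open_files(void)
{
	return sysconf(_SC_OPEN_MAX);
}

/* Run time: maximum bytes of arguments to exec*, or -1 if unknown. */
long max_exec_args(void)
{
	return sysconf(_SC_ARG_MAX);
}
```

A caller has to be prepared for -1, since a system may have no fixed
limit to report.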

I suggest that those interested in kicking these issues around move the
discussion to mod.unix, since there is likely to be a lot of overlap of
proposed additions [whence the mod] and since the problems are more those
of UN*X than C [whence the unix].  And why don't you look for a job
commensurate with your skills, Linderman, like waxing VAXes?

John P. Linderman   Departments of Schizophrenia   allegra!jpl

wendt@bocklin.UUCP (07/26/85)

An easy way to determine alignment requirements is to take the size
of a structure containing one character.  Odd-length structures are
invariably rounded up to convenient units by the compiler.

ado@elsie.UUCP (Arthur David Olson) (07/27/85)

In article <257@bocklin.UUCP>, wendt@bocklin.UUCP writes:
> An easy way to determine alignment requirements is to take the size
> of a structure containing one character.  Odd-length structures are
> invariably rounded up to convenient units by the compiler.

Or, preferably,
	#define ALIGNMENT (sizeof (struct { char :1;}))
which avoids having to make up a name for a structure element.
--
UNIX is an AT&T Bell Laboratories trademark.
--
	UUCP: ..decvax!seismo!elsie!ado    ARPA: elsie!ado@seismo.ARPA
	DEC, VAX and Elsie are Digital Equipment and Borden trademarks

franka@mmintl.UUCP (Frank Adams) (07/29/85)

In article <5181@elsie.UUCP> ado@elsie.UUCP (Arthur David Olson) writes:
>In article <257@bocklin.UUCP>, wendt@bocklin.UUCP writes:
>> An easy way to determine alignment requirements is to take the size
>> of a structure containing one character.  Odd-length structures are
>> invariably rounded up to convenient units by the compiler.
>
>Or, preferably,
>	#define ALIGNMENT (sizeof (struct { char :1;}))
>which avoids having to make up a name for a structure element.

This also has the advantage of working properly on a hypothetical machine
where alignment on a character boundary is not required.
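
On many present-day compilers the size of a structure holding a single
char is simply 1, in which case the trick above reports only the
alignment of char.  A sketch of a variant that asks about one type at a
time: put a char in front of the type and read back the offset the
compiler chose for it.  The probe structures and the ALIGN_OF macro are
hypothetical names; C11 provides the same information as _Alignof.

```c
#include <stddef.h>   /* offsetof, size_t */

/* One probe structure per type: a char followed by the type of interest. */
struct probe_char   { char c; char   member; };
struct probe_int    { char c; int    member; };
struct probe_double { char c; double member; };
struct probe_ptr    { char c; char  *member; };

/* Alignment of the probed type: the offset the compiler gave member. */
#define ALIGN_OF(probe) offsetof(struct probe, member)

/* Strictest alignment among the member types used in struct line. */
size_t worst_alignment(void)
{
	size_t a = ALIGN_OF(probe_char);

	if (ALIGN_OF(probe_int) > a)
		a = ALIGN_OF(probe_int);
	if (ALIGN_OF(probe_double) > a)
		a = ALIGN_OF(probe_double);
	if (ALIGN_OF(probe_ptr) > a)
		a = ALIGN_OF(probe_ptr);
	return a;
}
```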