[comp.lang.c] RISC Machine Data Structure Word Alignment Problems?

toppin@melpar.UUCP (Doug Toppin X2075) (01/20/90)

We are using the SUN 4/260 which is a RISC architecture machine.
We are having trouble with data alignment in our data structures.
We have to communicate with external devices that require data structures
such as the following:
	struct
	{
		long  a;
		short b;
		long  c;
	};
When we compile and link something referencing this structure the
data produced appears to have had each element word boundary aligned
so that what results appears to be as follows:
	struct
	{
		long  a;
		short b;
		short pad; <==== this was inserted by cc to align next thing
		long  c;
	};
This means that we lose the benefit of data abstraction and have
to create our own output without using structures.
We have not been able to find any Sun-4 cc option that eliminates
this problem. We cannot use the 'compile as Sun-3' option.
Please let us know if you know of a built-in way around this.
thanks
Doug Toppin
uunet!melpar!toppin
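
The padding described above can be measured rather than guessed. What follows is a minimal sketch, assuming an ANSI C compiler that supplies offsetof in <stddef.h>; the tag msg and the helper names pad_before_c and show_layout are made up for illustration and do not come from the post.

```c
#include <stddef.h>
#include <stdio.h>

/* The structure from the post; the tag name is invented for this sketch. */
struct msg {
	long  a;
	short b;
	long  c;
};

/* Bytes of padding the compiler placed between b and c (may be zero). */
size_t pad_before_c(void)
{
	return offsetof(struct msg, c)
	     - (offsetof(struct msg, b) + sizeof(short));
}

/* Print the layout this particular compiler chose. */
void show_layout(void)
{
	printf("a at %lu, b at %lu, c at %lu, size %lu, pad before c %lu\n",
	       (unsigned long)offsetof(struct msg, a),
	       (unsigned long)offsetof(struct msg, b),
	       (unsigned long)offsetof(struct msg, c),
	       (unsigned long)sizeof(struct msg),
	       (unsigned long)pad_before_c());
}
```

Calling show_layout() on each target shows whether the in-memory layout happens to match what the external device expects. The numbers are a property of the compiler and machine, not of the C language.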

johnl@esegue.segue.boston.ma.us (John R. Levine) (01/22/90)

In article <111@melpar.UUCP> toppin@melpar.UUCP (Doug Toppin   X2075) writes:
>We are using the SUN 4/260 which is a RISC architecture machine.
>We are having trouble with data alignment in our data structures.
>We have to communicate with external devices that require data structures
>such as the following:
>	struct
>	{
>		long  a;
>		short b;
>		long  c;
>	};

I guess all the world's not a Vax any more, now it's a 68020.  It would be
more correct to say that your external device requires a four-byte integer, a
two-byte integer, and a four-byte integer, all sent highest byte first.  C
makes no promise that the layout of structures will be the same from machine
to machine.  For instance, if you ran this code on a 386, there doesn't need
to be any padding (though many compilers add it to make the code run faster)
but the words are all in the opposite byte order.

The SPARC and every other RISC chip requires that items be aligned on their
natural boundaries, because there is considerable performance to be gained by
doing so, and because it is not very hard to write programs that are totally
insensitive to padding and byte order.  Many people have observed this.  In
an article on the IBM 370 series in the CACM about 10 years ago one of the
370's architects noted that the 370 permits misaligned data while its
predecessor the 360 didn't, and it was a mistake to have done so because it's
rarely used and adds considerable complication to every 370 machine.

In the particular case of the SPARC, there is a C compiler option (documented
in the FM) to allow misaligned data at the enormous cost of several
instructions and sometimes a subroutine call for every load and store.  I
presume you are passing byte streams back and forth to your device; a memory
mapped interface that requires misaligned operands is too awful to
contemplate.  You need to write something like this:

void read_foo_structure(struct foo *p)
{
	p->a = read_long();
	p->b = read_short();
	p->c = read_long();
}

long read_long(void)
{
	unsigned long v;

	/* read in big endian order */
	/* widen before shifting: int may be only 16 bits */
	v  = (unsigned long)getc(f) << 24;	/* should do some error checking */
	v |= (unsigned long)getc(f) << 16;
	v |= (unsigned long)getc(f) << 8;
	v |= (unsigned long)getc(f);
	return (long)v;
}

This may seem like more work, but in my experience you write a few of these
things and use them all over the place.  Then your code is really portable.
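
As a slightly fuller sketch of the same idea, the routines below decode the three fields from a byte buffer instead of a stream, check the length, and sign-extend by hand. The names get_be32, get_be16, decode_foo and FOO_WIRE_SIZE are hypothetical, and the 4-2-4 big-endian wire layout is an assumption taken from the description above, not a documented device format.

```c
#include <stddef.h>

/* In-memory form; only the tag foo comes from the post. */
struct foo {
	long  a;
	short b;
	long  c;
};

#define FOO_WIRE_SIZE 10	/* 4 + 2 + 4 bytes, no padding on the wire */

/* Decode a 4-byte big-endian two's-complement integer. */
long get_be32(const unsigned char *p)
{
	unsigned long v = ((unsigned long)p[0] << 24) |
	                  ((unsigned long)p[1] << 16) |
	                  ((unsigned long)p[2] << 8)  |
	                   (unsigned long)p[3];

	/* Sign-extend by hand so this also works where long exceeds 32 bits. */
	if (v & 0x80000000UL)
		return -(long)(0xFFFFFFFFUL - v) - 1;
	return (long)v;
}

/* Decode a 2-byte big-endian two's-complement integer. */
short get_be16(const unsigned char *p)
{
	unsigned int v = ((unsigned int)p[0] << 8) | (unsigned int)p[1];

	if (v & 0x8000U)
		return (short)(-(int)(0xFFFFU - v) - 1);
	return (short)v;
}

/* Returns 0 on success, -1 if the buffer is too short. */
int decode_foo(const unsigned char *buf, size_t len, struct foo *p)
{
	if (len < FOO_WIRE_SIZE)
		return -1;
	p->a = get_be32(buf);
	p->b = get_be16(buf + 4);
	p->c = get_be32(buf + 6);
	return 0;
}
```

Nothing here depends on how the compiler pads struct foo or on the byte order of the host, which is the point of doing it this way.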
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl
"Now, we are all jelly doughnuts."

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (01/22/90)

johnl@esegue.segue.boston.ma.us (John R. Levine) writes:

| long read_long(void)
| {
| 	long v;
| 
| 	/* read in big endian order */
| 	v = getc(f) << 24;	/* should do some error checking */
| 	v |= getc(f) << 16;
| 	v |= getc(f) << 8;
| 	v |= getc(f);
| 	return v;
| }
| 
| This may seem like more work, but in my experience you write a few of these
| things and use them all over the place.  Then your code is really portable.

  I agree with your thought, although for portable transfer I usually do
LSB first (not because of any preference) just for the loop. Since I
work with 36 and 64 bit machines, I always add a sign extend on the
read.
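
A minimal sketch of the LSB-first loop with an explicit sign extend might look like this. The names put_le and get_le are invented for illustration, and the code assumes two's-complement values of one to four bytes on the wire.

```c
/* Write the low n bytes of val, least significant byte first (1 <= n <= 4). */
void put_le(unsigned char *p, long val, int n)
{
	unsigned long v = (unsigned long)val;	/* conversion is modulo 2^N */
	int i;

	for (i = 0; i < n; i++) {
		p[i] = (unsigned char)(v & 0xFF);
		v >>= 8;
	}
}

/* Read an n-byte little-endian two's-complement value (1 <= n <= 4). */
long get_le(const unsigned char *p, int n)
{
	unsigned long v = 0;
	unsigned long sign = 1UL << (8 * n - 1);	/* sign bit of an n-byte value */
	int i;

	for (i = 0; i < n; i++)
		v |= (unsigned long)p[i] << (8 * i);

	/* Sign-extend by arithmetic so a 36- or 64-bit long gets the right value. */
	if (v & sign)
		return -(long)((sign - 1) - (v & (sign - 1))) - 1;
	return (long)v;
}
```

With the least significant byte first, the loop index doubles as the byte position, so the same loop serves every field width.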

  At one time I was operating a PC (original IBM) with a unique
coprocessor Cray2 on an ethernet link. The C2 calculated data and passed
it in 32 bit RLE format to a BASIC program which used calls to write the
display. Amazing what you can do to get a demo up FAST.
-- 
	bill davidsen - sysop *IX BBS and Public Access UNIX
davidsen@sixhub.uucp		...!uunet!crdgw1!sixhub!davidsen

"Getting old is bad, but it beats the hell out of the alternative" -anon

peter@ficc.uu.net (Peter da Silva) (01/22/90)

> I guess all the world's not a Vax any more, now it's a 68020.

Worse, since non-word-aligned values do cost extra cycles to access, any
68020 C compiler that didn't pad that structure is broken. Some "features"
of CISC processors are just too expensive to use.
-- 
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'

slackey@bbn.com (Stan Lackey) (01/23/90)

In article <LJ81OX3ggpc2@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>> I guess all the world's not a Vax any more, now it's a 68020.
>Worse, since non-word-aligned values do cost extra cycles to access, any
>68020 C compiler that didn't pad that structure is broken. Some "features"
>of CISC processors are just too expensive to use.

Just a quick summary of the last time we went around on this issue:

There are a number of interesting applications that build many
instances of small data structures, each containing varied data types.
It was said that logic simulators do this.  In a machine that forces
you to always have data aligned, this can result in lots of wasted
memory.  Not because the programmer is stupid, but because of the
nature of the application.

Now, if I have a 4MB workstation, and alignment restrictions increase
the need from under 4MB to over 4MB, there will be significant paging.
I'd rather spend two cycles to access a word sometimes, than have to
page over the Ethernet.  So would the people with whom I share the
network.

------

Also: the comments on the 360 (aligned) vs 370 (unaligned):

Boy did I hear a different story.  The version I heard was that the
370 supported unaligned data, because the experience with the 360
showed it was incredibly painful to be without it.  Remember in those
days memory was VERY expensive.

:-) Stan

cik@l.cc.purdue.edu (Herman Rubin) (01/23/90)

In article <LJ81OX3ggpc2@ficc.uu.net>, peter@ficc.uu.net (Peter da Silva) writes:
> > I guess all the world's not a Vax any more, now it's a 68020.
> 
> Worse, since non-word-aligned values do cost extra cycles to access, any
> 68020 C compiler that didn't pad that structure is broken. Some "features"
> of CISC processors are just too expensive to use.

Having seen the statement about penalties for unaligned, I tried the following
code (hand coded in assembler to eliminate unnecessary overhead):

.....
while(k < end)*k++ = *i++ ^ *j++;

and the j pointer was deliberately unaligned.  Now this was on a VAX, and it is
possible that other machines may give different results, but the time penalty,
while there, was not excessive.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

weaver@weitek.WEITEK.COM (01/24/90)

In article <51245@bbn.COM> slackey@BBN.COM (Stan Lackey) writes:
>Just a quick summary of the last time we went around on this issue:
>
>There are a number of interesting applications that build many
>instances of small data structures, each containing varied data types.
>It was said that logic simulators do this.  In a machine that forces
>you to always have data aligned, this can result in lots of wasted
>memory.  Not because the programmer is stupid, but because of the
>nature of the application.
>

I want to point out here that this data alignment problem can be 
mostly worked around for application programs. 

On a machine with "natural" alignment, a structure (record, common) 
made of primitive data items (integers, pointers, floats, etc.) 
needs no padding if the elements are ordered such that smaller items 
always follow larger items. The size ordering of primitive data
items is machine dependent, but similar from one machine to the next. 
If the entire record is not a multiple of the largest required alignment,
then some space may be lost between structures, or in nested 
structures. This cannot be handled so easily.
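
The effect of ordering can be seen by comparing two declarations of the same members. This is only a sketch: the struct tags, member names and the helper bytes_saved are invented, and the actual sizes depend on the compiler and machine.

```c
#include <stddef.h>

/* Declaration order that tends to force padding under natural alignment. */
struct scattered {
	char   tag;
	double x;
	short  n;
	long   id;
	char   flag;
};

/* The same members, larger items first. */
struct sorted {
	double x;
	long   id;
	short  n;
	char   tag;
	char   flag;
};

/* Bytes saved per element by reordering, on this compiler. */
long bytes_saved(void)
{
	return (long)sizeof(struct scattered) - (long)sizeof(struct sorted);
}
```

Whether long or double should come first is exactly the machine-dependent part. Any tail padding left in struct sorted is the space lost between array elements mentioned above.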

In summary, if you are writing an application from scratch, you
can minimize this effect in an almost (but not quite!) machine
independent way. So for new programs, I think natural alignment
is a good time/speed tradeoff. I also think that supporting 
unaligned data by both traps and special in-line code is a good
idea, since so many programs have long histories. 

Michael.

hascall@cs.iastate.edu (John Hascall) (01/24/90)

In article <21361> weaver@weitek.UUCP (Michael Gordon Weaver) writes:
}In article <51245> slackey@BBN.COM (Stan Lackey) writes:
}>There are a number of interesting applications that build many
}>instances of small data structures, each containing varied data types.
}>It was said that logic simulators do this.  In a machine that forces
}>you to always have data aligned, this can result in lots of wasted
}>memory.  Not because the programmer is stupid, but because of the
}>nature of the application.
 
}I want to point out here that this data alignment problem can be 
}mostly worked around for application programs. 
 
} [sort elements of structures by decreasing size...]


  It seems to me that now we have a conflict between "software engineering"
  and architecture.

  It surely seems to me that, from a programming point of view, you would
  want your structures in some meaningful order as an aid to program
  understanding.  Shouldn't elements that are used together, be located
  together?

  And doesn't everyone pretty much expect certain elements at the top
  of structures, for example:

    struct FOO {                   struct BAR {
        struct FOO  *next;             struct BAR  *left;
        struct FOO  *prev;             struct BAR  *right;
          :                              :
    };                             };

  And on machines with "displacement mode" addressing (i.e., 32(R4) addresses
  the element 32 bytes into the structure at the address in register four)
  there is often a bonus (e.g., speed or code size) for elements within some
  distance (i.e., 127 bytes) from the start of the structure.  So if you put
  the big elements first, you minimize the number of "close" elements.

John Hascall  /  ISU Comp Ctr

gary@dgcad.SV.DG.COM (Gary Bridgewater) (01/24/90)

In article <21361@weitek.WEITEK.COM> weaver@weitek.UUCP (Michael Gordon Weaver) writes:
>In article <51245@bbn.COM> slackey@BBN.COM (Stan Lackey) writes:
>>Just a quick summary of the last time we went around on this issue:
>>
>>There are a number of interesting applications that build many
>>instances of small data structures, each containing varied data types.
>>It was said that logic simulators do this.  In a machine that forces
>>you to always have data aligned, this can result in lots of wasted
>>memory.  Not because the programmer is stupid, but because of the
>>nature of the application.
>>
>
>I want to point out here that this data alignment problem can be 
>mostly worked around for application programs. 

I think you missed the phrase "Not because the programmer is stupid..."

>On a machine with "natural" alignment, a structure (record, common) 
>made of primitive data items (integers, pointers, floats, etc.) 
>needs no padding if the elements are ordered such that smaller items 
>always follow larger items. The size ordering of primitive data
>items is machine dependent, but similar from one machine to the next. 
>If the entire record is not a multiple of the largest required alignment,
>then some space may be lost between structures, or in nested 
>structures. This cannot be handled so easily.

I need to allocate an array of 50,000,000 8 bit integers. How do I do this?
Which is more important: 1) overall memory use, 2) misalignment penalty, or
3) code readability? 
Then I need to allocate 1,000,000 structs containing other structs written
by another programmer. What is the natural order of the data a priori on any
machine? How big is an addr_t on a 386? Sparc? Cray? Is it bigger than a
long float?
I plan to pass these structures from a Sun 4 to a Vax to a Cray via an
ethernet connection. Now what is the natural order?

>In summary, if you are writing an application from scratch, you
>can minimize this effect in an almost (but not quite!) machine
>independent way. So for new programs, I think natural alignment
>is a good time/speed tradeoff. I also think that supporting 
>unaligned data by both traps and special in-line code is a good
>idea, since so many programs have long histories. 

I suggest that when RE-writing a program from scratch you can mitigate this
effect if you have some idea where the code is going to run. This is of
little help to Simulator vendors who have to run across different architectures.
When you write a program you have no idea if it will be successful enough to be
bothered by data alignment inefficiencies. You are usually more worried about
getting it up quickly and in the same execution universe as the specs.

In general, you are stuck and at best will have to go back and micro-tune the
heck out of it on a case-by-case basis. In your spare time, study malloc
algorithms so you can figure out how to allocate bit structures for fun and
profit.

I agree that it is easier if the hardware lets you misalign but that thinking is
passe in the brave new world of RISC where using the computer is a compiler
problem.
-- 
Gary Bridgewater, Data General Corporation, Sunnyvale California
gary@proa.sv.dg.com or {amdahl,aeras,amdcad}!dgcad!gary
Networking is the worst form of data exchange except for all the others
(apologies to WC).

larus@primost.cs.wisc.edu (James Larus) (01/25/90)

In article <21361@weitek.WEITEK.COM>, weaver@weitek.WEITEK.COM writes:
> In summary, if you are writing an application from scratch, you
> can minimize this effect in an almost (but not quite!) machine
> independent way. So for new programs, I think natural alignment
> is a good time/speed tradeoff. I also think that supporting 
> unaligned data by both traps and special in-line code is a good
> idea, since so many programs have long histories. 

This statement may be true in general, but it is not always true.  For example,
I wrote a program tracing system that writes out a trace file consisting of a
mixture of bytes, halfwords, and full words.  It is crucial to this system
that the byte quantities only take up 8 bits (otherwise the size of the already
large files grow by a factor of 2 or more).  However, it means that I need
to do unaligned stores into the trace buffer.  And, since I trace programs in
real time, I need to do the stores fast.

The MIPS R2000 has a 2 instruction sequence that can store a half/fullword
quantity on any byte boundary.  On SPARC, it takes 7 instructions to store
fullwords byte-by-byte.  Coming from Berkeley, I hate to say it, but this
is another case in which MIPS has a much better designed machine than Sun (-:
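
In C, the byte-by-byte store looks roughly like the sketch below. The names trace_put8, trace_put16 and trace_put32 are hypothetical, and the big-endian order in the buffer is an assumption made for the example, not a description of the actual trace format.

```c
#include <stddef.h>

/* Append one byte at offset off; returns the new offset. */
size_t trace_put8(unsigned char *buf, size_t off, unsigned int v)
{
	buf[off] = (unsigned char)(v & 0xFF);
	return off + 1;
}

/* Append a halfword at any byte offset, most significant byte first. */
size_t trace_put16(unsigned char *buf, size_t off, unsigned int v)
{
	buf[off]     = (unsigned char)((v >> 8) & 0xFF);
	buf[off + 1] = (unsigned char)(v & 0xFF);
	return off + 2;
}

/* Append a fullword at any byte offset: four byte stores and three shifts. */
size_t trace_put32(unsigned char *buf, size_t off, unsigned long v)
{
	buf[off]     = (unsigned char)((v >> 24) & 0xFF);
	buf[off + 1] = (unsigned char)((v >> 16) & 0xFF);
	buf[off + 2] = (unsigned char)((v >> 8) & 0xFF);
	buf[off + 3] = (unsigned char)(v & 0xFF);
	return off + 4;
}
```

An alternative is memcpy(buf + off, &v, sizeof v), which leaves the choice of instructions to the compiler but records the value in host byte order.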

/Jim

venkat@matrix.UUCP (Desikan Venkatrangan) (02/02/90)

In article <1990Jan21.224826.1699@esegue.segue.boston.ma.us> johnl@esegue.segue.boston.ma.us (John R. Levine) writes:
>From article <111@melpar.UUCP>, by toppin@melpar.UUCP (Doug Toppin   X2075):
>> We are using the SUN 4/260 which is a RISC architecture machine.
>> We are having trouble with data alignment in our data structures.

and suggests:

>              You need to write something like this:
>
>read_foo_structure(struct foo *p)
>{
>	p->a = read_long();
>	p->b = read_short();
>	p->c = read_long();
>}
>
>long read_long(void)
>{
>	long v;
>
>	/* read in big endian order */
>	v = getc(f) << 24;	/* should do some error checking */
>	v |= getc(f) << 16;
>	v |= getc(f) << 8;
>	v |= getc(f);
>	return v;
>}
>
>This may seem like more work, but in my experience you write a few of these
>things and use them all over the place.  Then your code is really portable.
>-- 
For complete portability and ease of maintenance, try to pattern such 
routines along the External Data Representation (XDR) as SUN has done, 
for support of the RPC mechanism.  You will be writing

bool_t
xdr_foo(xdrs, objp)
register XDR *xdrs;
register objtype *objp;
{
	return (xdr_long(xdrs, &objp->a) &&
		xdr_short(xdrs, &objp->b) &&
		xdr_long(xdrs, &objp->c));
}

This way, both read and write operations can be done using the same routines;
(with proper setting of XDR_ENCODE/XDR_DECODE in xdrs.)  Also, the read/write
can be from memory or a file.

The xdr routines for the primitive types are provided by SUN.  But they have
chosen to represent 'shorts' as 4-byte quantities externally.  If you wish to
avoid this and prefer little-endian representation, you should write similar
routines yourself.

The utility rpcgen may be useful as well.

carlw@mercury.sybase.com (carl weidling) (02/03/90)

	The question is whether or not C's requirement to build structures
with the components in the order in which they were declared is a mistake
or not.
In article <1990Jan29.173412.2859@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
	< stuff deleted >
>The basic problem here is that the compiler cannot read minds, and the
>language does not provide a way to tell the compiler which of two
>interpretations is wanted.  The two possibilities are "I want precise
>control of what goes into memory" and "I want these members but please
>pad as necessary to make accesses fast".  Unfortunately, you can't just
>say "well, if I want padding I'll put it in myself", because many people
>want to write portable programs, and the padding requirements are *very*
>machine-specific.  Precise control of memory layout is not necessary for
	< rest of article deleted>
	Reading this I got an idea which is a slight variation on the idea
of a pragma or directive in the language.
	Why not have a PRE-processor directive that will re-arrange the
fields in a structure to maximize efficiency one way or the other? The
C-language itself is untouched, the programmer can run the pre-processor
by itself on the code to see what was done.  Perhaps lint could be made
smart enough to tell if someone was playing too many games with one of
these re-arranged structures. Something like
struct { int alpha;
#ARRANGE_ANY_WAY_YOU_WANT /* maybe specify criteria? i.e. speed vs compact */
	 long beta;
	 char gamma[3];
#END_ARRANGE
	}
-Carl Weidling

shap@delrey.sgi.com (Jonathan Shapiro) (02/04/90)

In article <8314@sybase.sybase.com> carlw@mercury.UUCP (carl weidling) writes:
>	Why not have a PRE-processor directive that will re-arrange the
>fields in a structure to maximize efficiency one way or the other?

Yuck.  If this problem is worth solving, it is worth solving right.

Jon

carr@gandalf.UUCP (Dave Carr) (02/06/90)

In article <11666@thorin.cs.unc.edu>, tuck@jason.cs.unc.edu (Russ Tuck) writes:
> 
> If the compiler did what you suggest and did not align struct members,
> it would in most cases be impossible to access the data member "c" above 
> without causing the program to dump core.  This would not be a useful 
> compiler "feature" :-).  SPARC (and most other RISC archs) requires all 
> ordinary memory accesses to be aligned. 

That's *most* RISC architecture.  At least with the 80960 (I know, not a true
RISC), I have the freedom to access non word aligned data.  I would rather
have the choice than let the RISC architecture force me.

Data explosion on RISC computers is pretty bad.  We should have the choice 
between slowing the CPU down only for those accesses which are not word 
aligned.  We could pad the structures to speed it back up.
-- 
Dave Carr                |  carr@e.gandalf.ca   | If you don't know where  
Gandalf Data Limited	 |  TEL (613) 723-6500  | you are going, you will
Nepean, Ontario, Canada  |  FAX (613) 226-1717  | never get there.

ingoldsb@ctycal.UUCP (Terry Ingoldsby) (02/08/90)

In article <1648@skye.ed.ac.uk>, richard@aiai.ed.ac.uk (Richard Tobin) writes:
> In article <LJ81OX3ggpc2@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
> >Worse, since non-word-aligned values do cost extra cycles to access, any
> >68020 C compiler that didn't pad that structure is broken. 
> 
> This is nonsense.  Which you want depends whether speed or size is more
> important.  A valid criticism would be that too many C compilers don't let
> you specify which kind of optimisation you want.
> 

This discussion, IMHO, is pointless.  The C compilers work just fine the way
they are (or at least the ones I am familiar with).  I don't think some of the
people discussing this realize the implications of what they propose.

I work on an Intergraph Clipper based workstation.  Unless I am mistaken,
floating point values can only be aligned on 8 byte boundaries if the
processor is to be able to access them in a single instruction.  If you
try to access a floating point value that is not 8 byte aligned, it
actually grabs the value at the next lowest 8 byte boundary.  It doesn't
even give a bus error trap!  In theory, the compiler could place it on
arbitrary boundaries by generating a sequence of instructions that would
read adjacent values and AND and OR the values into memory.  It sounds to
me that we are talking about 4 or 5 instructions to do this, so your
access speed would be the pits!
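
The shift-and-mask sequence described above can be sketched in C for the fetch side. This is an illustration only: it uses 32-bit words rather than 8-byte floats, assumes byte 0 of each word is its most significant byte, and the name load32_unaligned is made up.

```c
/*
 * words[] models memory as aligned 32-bit words, byte 0 of each word
 * being its most significant byte (an assumption of this sketch).
 * Return the 32-bit value that starts at byte offset off.  When off is
 * not a multiple of 4, words[off / 4 + 1] must exist.
 */
unsigned long load32_unaligned(const unsigned long *words, unsigned long off)
{
	unsigned long i = off / 4;			/* word holding the first byte */
	unsigned int  k = (unsigned int)(off % 4) * 8;	/* misalignment in bits */
	unsigned long hi, lo;

	if (k == 0)
		return words[i] & 0xFFFFFFFFUL;		/* aligned: one access */

	hi = (words[i] << k) & 0xFFFFFFFFUL;		/* tail of the first word */
	lo = (words[i + 1] & 0xFFFFFFFFUL) >> (32 - k);	/* head of the second */
	return hi | lo;
}
```

Two loads, two shifts and an OR, plus the address arithmetic, is in the same range as the 4 or 5 instructions estimated above. A store needs masking as well, so it costs more.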

The reason people seem to want to be able to store values at arbitrary
locations seems to have to do with the need to write out contiguous
regions of memory to a binary file.  They then complain that reading
that file back into the memory of another machine doesn't work.  No
one ever said it would.  If you want portable code, don't write it
that way.  It is almost always possible to sacrifice portability for
speed.

I don't know why this is so astonishing; you can't write out binary
values for integers between machines, what would lead anyone to believe
that structures should be any different.

C is a low level language.  If you want greater data abstraction, move
to a higher level language that guarantees that data will appear to
be in the same format across systems.  That guarantee is not in the
C definition; doing so would probably limit C's ability to blast bits.
The only format that C guarantees to understand is ascii represented
numeric values.

The only thread of this discussion that might relate to comp.arch is
why processors (such as Clipper) do not give a trap if you try to
access memory on illegal boundaries.  Surely that would not require
much silicon?
-- 
  Terry Ingoldsby                ctycal!ingoldsb@calgary.UUCP
  Land Information Systems                 or
  The City of Calgary       ...{alberta,ubc-cs,utai}!calgary!ctycal!ingoldsb

cik@l.cc.purdue.edu (Herman Rubin) (02/11/90)

In article <328@ctycal.UUCP>, ingoldsb@ctycal.UUCP (Terry Ingoldsby) writes:
> In article <1648@skye.ed.ac.uk>, richard@aiai.ed.ac.uk (Richard Tobin) writes:
> > In article <LJ81OX3ggpc2@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:

			......................

> I don't know why this is so astonishing; you can't write out binary
> values for integers between machines, what would lead anyone to believe
> that structures should be any different.

I can see no more reason why strings of ASCII characters should be
transferrable by hardware with little software intervention than binary
integers, other fixed place binary numbers, other types of numbers (not
strings of numerals), mathematical symbols beyond the usual ones, etc.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

woody@rpp386.cactus.org (Woodrow Baker) (02/11/90)

In article <328@ctycal.UUCP>, ingoldsb@ctycal.UUCP (Terry Ingoldsby) writes:
> In article <1648@skye.ed.ac.uk>, richard@aiai.ed.ac.uk (Richard Tobin) writes:

> This discussion, IMHO, is pointless.  The C compilers work just fine the way
> they are (or at least the ones I am familiar with).  I don't think some of the
> people discussing this realize the implications of what they propose.

Wrong.  It depends on what you do.  I happen to do programming dealing
with industrial controllers.  Specifically, I maintain a compiler, editor
downloader, and monitor package used to program Eagle Signal Controls
EPTAK series industrial controllers.  The code that I work on runs under
MS-DOS.  I have to do things like reach out over the network, and read
data structures out of the remote controllers.  These structures for the
most part, are a mix of byte and word fields.  I then have to parse through
them, and isolate the parts.  Structures are the obvious way to do this.
BUT, the @#$% compiler chooses to pad byte or char values out to ints.
This, obviously screws up the data structure access to the retrieved
values.  I have wound up doing things that I am not proud of, like unions,
monkeying around with pointers to the structures such that they don't
point to where they should, but to some offset other than the first byte
of the structure etc.  Yes, I could choose to use an array, but it is clearer
to use standard field names, (at least standard for the EPTAK controllers)
to access these data fields.

Cheers
Woody

peter@ficc.uu.net (Peter da Silva) (02/11/90)

Use structs internally.

Provide functions to read and write each structure, that do the needed
conversions. Never touch the external format internally.

For example:

	Analog accumulator:

		| flags  | val.lo   val.hi |
		+--------+--------+--------+
		| BYTE 0 | BYTE 1 | BYTE 2 |

	struct accumulator {
		char flags;
		int value;
	};

	read_accumulator(addr, info)
	char *addr;
	struct accumulator *info;
	{
		info->flags = addr[0];
		/* mask each byte so a signed char cannot sign-extend into value */
		info->value = addr[2] & 0xFF;
		info->value = (info->value << 8) | (addr[1] & 0xFF);
	}

	write_accumulator(addr, info)
	char *addr;
	struct accumulator *info;
	{
		*addr++ = info->flags;
		*addr++ = info->value & 0xFF;		/* val.lo */
		*addr   = (info->value >> 8) & 0xFF;	/* val.hi */
	}
-- 
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'

ronald@robobar.co.uk (Ronald S H Khoo) (02/12/90)

In article <17906@rpp386.cactus.org> woody@rpp386.cactus.org (Woodrow Baker) writes:
> 
> MS-DOS.  I have to do things like reach out over the network, and read
> data structures out of the remote controllers.  These structures for the
> most part, are a mix of byte and word fields.  I then have to parse through
> them, and isolate the parts.  Structures are the obvious way to do this.
> BUT, the @#$% compiler chooses to pad byte or char values out to ints.

#ifdef MEDIUM_MADRAS

You don't think this is a hint that it would have been *so* much easier
if everything spoke *text* instead.  Sure, there's the overhead of
binary->text->binary, but the advantages outweigh the cost, especially
if you ever have a mix of controllers with wildly differing internal
architectures.

Oh, you want to discourage that to lock your customers in? Excuse me.

#endif

-- 
Eunet: Ronald.Khoo@robobar.Co.Uk   Phone: +44 1 991 1142    Fax: +44 1 998 8343
Paper: Robobar Ltd. 22 Wadsworth Road, Perivale, Middx., UB6 7JD ENGLAND.
$Header: /usr/ronald/.signature,v 1.2 90/01/26 15:17:15 ronald Exp $ :-)

msb@sq.sq.com (Mark Brader) (02/13/90)

> I can see no more reason why strings of ASCII characters should be
> transferrable by hardware with little software intervention than binary
> integers, other fixed place binary numbers, other types of numbers ...etc.

Because ASCII is, after all, the American Standard Code for Information
Interchange, and those other things aren't.  See signature quote.


Followups to comp.arch.

-- 
Mark Brader, SoftQuad Inc., Toronto, utzoo!sq!msb, msb@sq.com
	A standard is established on sure bases, not capriciously but with
	the surety of something intentional and of a logic controlled by
	analysis and experiment. ... A standard is necessary for order
	in human effort.				-- Le Corbusier

This article is in the public domain.

cik@l.cc.purdue.edu (Herman Rubin) (02/13/90)

In article <S_O1_F6xds13@ficc.uu.net>, peter@ficc.uu.net (Peter da Silva) writes:
> Use structs internally.
> 
> Provide functions to read and write each structure, that do the needed
> conversions. Never touch the external format internally.

			[Example deleted.]

This is another situation where the procedure is extremely slow in software.
If the appropriate hardware were provided, this would not be a problem.  But
would the machine then be RISC?
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

peter@ficc.uu.net (Peter da Silva) (02/14/90)

In article <1925@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> This is another situation where the procedure is extremely slow in software.
> If the appropriate hardware were provided, this would not be a problem.  But
> would the machine then be RISC?

Who cares if it's RISC, CISC, VLIW, or a bunch of elves with abaci? If it's
fast enough, fine. If it's not, unroll the loop to the LCD of the struct
size and the data size. If that doesn't do it, recode in assembler. Then get
a faster machine (where faster is defined in terms of the problem you have
to solve: if the problem involves moving weird numbers of bits around all the
byte ops in the world won't help you). Maybe a coprocessor would help (like
having a disk controller to convert NRZ into MFM instead of doing it yourself).

Most of the time this particular operation isn't a bottleneck, so who cares
how fast it is?
-- 
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'

pasek@ncrcce.StPaul.NCR.COM (Michael A. Pasek) (02/14/90)

In <17906@rpp386.cactus.org> woody@rpp386.cactus.org (Woodrow Baker) writes:
>In <328@ctycal.UUCP>, ingoldsb@ctycal.UUCP (Terry Ingoldsby) writes:
>> This discussion, IMHO, is pointless.  The C compilers work just fine the way
>> they are (or at least the ones I am familiar with).  I don't think some of the
>> people discussing this realize the implications of what they propose.
>Wrong.  It depends on what you do.  [specifics deleted..]
>  I have to do things like reach out over the network, and read
>data structures out of the remote controllers.  These structures for the
>most part, are a mix of byte and word fields.  I then have to parse through
>them, and isolate the parts.  Structures are the obvious way to do this.
>BUT, the @#$% compiler chooses to pad byte or char values out to ints.

I also have the same problem.  Having the compiler pad to the "native" data
size is OK if (and ONLY if) you have complete control over that data structure
and do not need to share it with other programs/systems.  However, in data
communications protocols (pick one), the programmer has NO control over the
data structure -- it is predefined, and doesn't come with that nice padding
that the compiler likes to put in.  Some recent RISC compilers (I'm looking
at the 29K) allow you to specify whether structures are "packed" or not, 
which I think is mandatory.  Unfortunately, in the case of the 29K compiler,
although it will "pack" structures, as far as I know it will NOT generate
the appropriate instructions to access those structures if the external
memory subsystem does NOT support non-aligned accesses. Oh, well....
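
A compiler-independent way around this is to copy the fields into a byte
array one at a time. The following is only a sketch, assuming the wire format
is the 4-byte, 2-byte, 4-byte record from the start of this thread, sent
highest byte first. The names struct rec, put_be16, put_be32 and rec_encode
are made up for illustration.

```c
/* Hypothetical in-memory form; the compiler may pad this as it likes. */
struct rec {
    unsigned long  a;   /* 4 bytes on the wire */
    unsigned short b;   /* 2 bytes on the wire */
    unsigned long  c;   /* 4 bytes on the wire */
};

#define REC_WIRE_SIZE 10   /* 4 + 2 + 4, no padding on the wire */

/* Store the low 16 bits of v at p, highest byte first. */
static void put_be16(unsigned char *p, unsigned short v)
{
    p[0] = (unsigned char)((v >> 8) & 0xFF);
    p[1] = (unsigned char)(v & 0xFF);
}

/* Store the low 32 bits of v at p, highest byte first. */
static void put_be32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)((v >> 24) & 0xFF);
    p[1] = (unsigned char)((v >> 16) & 0xFF);
    p[2] = (unsigned char)((v >> 8) & 0xFF);
    p[3] = (unsigned char)(v & 0xFF);
}

/* Encode r into exactly REC_WIRE_SIZE bytes at out. */
static void rec_encode(unsigned char *out, const struct rec *r)
{
    put_be32(out + 0, r->a);
    put_be16(out + 4, r->b);
    put_be32(out + 6, r->c);
}
```

Only single bytes are stored, so neither the padding in struct rec nor the
alignment of the output buffer matters.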

M. A. Pasek               Software Development              NCR Comten, Inc.
(612) 638-7668              MNI Development               2700 N. Snelling Ave.
pasek@c10sd3.StPaul.NCR.COM                               Roseville, MN  55113

ingoldsb@ctycal.UUCP (Terry Ingoldsby) (02/15/90)

In article <17906@rpp386.cactus.org>, woody@rpp386.cactus.org (Woodrow Baker) writes:
> In article <328@ctycal.UUCP>, ingoldsb@ctycal.UUCP (Terry Ingoldsby) writes:
> > In article <1648@skye.ed.ac.uk>, richard@aiai.ed.ac.uk (Richard Tobin) writes:
> > This discussion, IMHO, is pointless.  The C compilers work just fine the way
> > they are (or at least the ones I am familiar with).  I don't think some of the
> > people discussing this realize the implications of what they propose.
> Wrong.  It depends on what you do.  I happen to do programming dealing
> with industrial controllers.  Specifically, I maintain a compiler, editor
> downloader, and monitor package used to program Eagle Signal Controls
> EPTAK series industrial controllers.  The code that I work on runs under
> MS-DOS.  I have to do things like reach out over the network, and read
> data structures out of the remote controllers.  These structures for the
> most part, are a mix of byte and word fields.  I then have to parse through
> them, and isolate the parts.  Structures are the obvious way to do this.
> BUT, the @#$% compiler chooses to pad byte or char values out to ints.
If you are passing values across a network to dissimilar machines, you
should be using something like XDR (External Data Representation).  This
makes for portable (although messy) code.  In your case, I would agree that
your compiler might reasonably be considered to be malfunctioning, since
the Intel processors can access arbitrarily aligned data.  The discussion
originally concerned RISC processors, which can NOT access arbitrary
alignments for all data types.  In this case padding is necessary.  To
minimize the amount of padding, it is necessary to reorder the structure
elements.  This is in accordance with K&R which (as I recall) explicitly
states that the elements may be re-ordered.
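
What reordering a declaration by hand buys can be checked with offsetof.
This is only a sketch: the first struct is the one from the start of this
thread, while struct mixed, struct grouped and hole_after_b are made-up
names, and the actual numbers depend on the compiler and machine.

```c
#include <stddef.h>

/* The struct from the original posting. */
struct as_declared { long a; short b; long c; };

/* Same three members in two different orders (hypothetical example). */
struct mixed   { char x; long a; char y; };   /* small members split up  */
struct grouped { long a; char x; char y; };   /* small members together  */

/* Number of padding bytes the compiler put between b and c. */
static size_t hole_after_b(void)
{
    return offsetof(struct as_declared, c)
         - (offsetof(struct as_declared, b) + sizeof(short));
}
```

Printing hole_after_b(), sizeof(struct mixed) and sizeof(struct grouped) on
the target machine shows where the holes are and how much grouping the small
members saves there.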
 
I re-iterate my original claim; it is not the compilers that are causing
the problems (your case excepted).  Rather, it is the fact that different
processors have different access requirements for data types.  Even if you
wrote your programs in RISC assembler (a horrible thought) then you could
not align your variables arbitrarily.  You would be forced to make the
same decisions/tradeoffs that the compilers make.
-- 
  Terry Ingoldsby                ctycal!ingoldsb@calgary.UUCP
  Land Information Systems                 or
  The City of Calgary       ...{alberta,ubc-cs,utai}!calgary!ctycal!ingoldsb

martin@mwtech.UUCP (Martin Weitzel) (02/21/90)

There were some recent postings that pointed out (or complained about)
'holes' in C-struct definitions. I hope it benefits some readers if I
explain an alternate view of C-struct-s and give some advice on how
to access a certain byte layout in memory in a portable (yet painless)
way that avoids struct-s completely. Because the latter may be of
more interest, I'll come to it first.

Suppose you have some library function 'getmsg' to which you supply the
address of a buffer, and when the function returns it has filled the
buffer with the following information:

	2 Byte Integer	- length of message
	1 Byte		- several flag bits
	1 Byte		- type of message
	4 Byte Integer	- checksum
	100 Byte	- arbitrary message

Many C-Programmers now think about defining the following

struct m {
	short m_length;
	unsigned char m_flags;
	char m_type;
	unsigned long m_checksum;
	char m_bytes[100];
} buffer;

so that after a 'getmsg(&buffer)' they can access the individual
parts 'by name', eg: buffer.m_length, buffer.m_flags, ....

... and as the previous posters pointed out, they eventually
get trapped by the 'holes' inserted into the struct by the
compiler for the sake of efficiency.

My advice in this situation is, to change this code as follows:

char buffer[
	  2 /* length of message */
	+ 1 /* several flag bits */
	+ 1 /* type of message */
	+ 4 /* checksum */
	+ 100 /* arbitrary message */
];

#define m_length(b)	(*(short *)        ((char *)(b) + 0))
#define m_flags(b)	(*(unsigned char *)((char *)(b) + 2))
#define m_type(b)	(*(char *)         ((char *)(b) + 3))
#define m_checksum(b)	(*(unsigned long *)((char *)(b) + 4))
#define m_bytes(b)	(                   (char *)(b) + 8 )

(I inserted some white space for readability.)

The least you must know of your compiler in that case is that
a 'char' occupies exactly one byte in an 'array of char'. But
as before, you can access the individual parts 'by name' as
follows: m_length(buffer), m_flags(buffer), ....
If 'getmsg' is always supplied with the same buffer, you could
make it even simpler by avoiding parametrized macros and using

#define m_length (*(short *)buffer)
#define m_flags (*(unsigned char *)(buffer + 2))
......

Note that the above expressions are also 'lvalues', i.e. you
can use them on the left side of an assignment.

There remains only the minor problem, that 'buffer' must be
properly aligned. (Techniques for achieving this are shown
in K&R - you simply have to define buffer as a union with the
type of desired alignment. Alternatively you may allocate
the buffer with 'malloc'.)
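
The union trick could look roughly like this. It is only a sketch: the name
union msgbuf and its member names are made up, and the alignment members are
chosen to match the short and unsigned long fields of the message above.

```c
#define MSG_SIZE (2 + 1 + 1 + 4 + 100)

/* The align_* members are never used for data. They only force the union,
   and with it the byte array, onto a boundary that is suitable for
   short and unsigned long accesses. */
union msgbuf {
    unsigned char bytes[MSG_SIZE];
    short         align_short;
    unsigned long align_long;
};

static union msgbuf buffer;
```

A call would then pass buffer.bytes, e.g. getmsg(buffer.bytes), and the
access macros would be applied to buffer.bytes as well.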

If your concern is only 'reading' the elements out of the buffer,
you have the additional benefit that you can transparently compensate
for possible 'byte-order' problems. Suppose the message is produced
by some piece of hardware that assumes the LSB of a 16 Bit Integer
on the lower address, and you want to move this hardware to a system,
where the CPU takes just the opposite view. All you have to change is:

#define m_length ((short)(\
	((*(unsigned char *)(buffer+1))<<8)\
	|(*(unsigned char *)buffer)))
.......

(Hope I missed no brackets ... :-)) 
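
The same idea written as small functions instead of one large macro might
look as follows. This is a sketch only: get_le16, get_le32, msg_length and
msg_checksum are made-up names, and it assumes that both the length and the
checksum arrive with the LSB at the lower address.

```c
/* Read a 16-bit value stored LSB first, one byte at a time. */
static unsigned int get_le16(const unsigned char *p)
{
    return (unsigned int)p[0] | ((unsigned int)p[1] << 8);
}

/* Read a 32-bit value stored LSB first, one byte at a time. */
static unsigned long get_le32(const unsigned char *p)
{
    return  (unsigned long)p[0]
         | ((unsigned long)p[1] << 8)
         | ((unsigned long)p[2] << 16)
         | ((unsigned long)p[3] << 24);
}

/* Offsets 0 and 4 follow the message layout given earlier. */
#define msg_length(b)   get_le16((const unsigned char *)(b) + 0)
#define msg_checksum(b) get_le32((const unsigned char *)(b) + 4)
```

Because only single bytes are fetched, the result is the same on either kind
of CPU. Unlike the cast-based macros, these are not lvalues.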

Now back to an alternate view of the C-struct-s; hit 'n' if
you are no longer interested.

IMHO many features of the C language can elegantly be explained in
an easy way, if you 'translate' the feature to the 'machine level'.
(Eg I explain much about pointers and arrays to my classes by
sketching pictures with the contents of the data segment.)

One thing easily misunderstood here is that such an explanation often
describes only *one* possible approach to implementing the abstract
concept: Though it seems natural to think about a C-struct as
being a collection of individual variables located at increasing
memory addresses in the order they are declared(%) as struct-components,
it often makes more sense to see a C-struct only as a collection
of data items that are guaranteed *not* to overlap(%%). Furthermore,
the compiler ensures that access to a named struct-component will
always refer to the same part of memory whenever the struct's
address is the same (important when passing struct-pointers as
function parameters).

The other guarantee, that the struct-components are located (more
or less) adjacent in memory, is only of some 'practical' value,
especially if you have an 'array of struct'-s or write one struct
to a file (using write/fwrite together with sizeof), but has
nothing to do with the abstract concept of a C-struct. 

(%): Even the guarantee, that the struct elements are at ascending
addresses in the order they are declared, IMHO only was given
to avoid complex (and hard to understand) rules, when and when
not it would be allowed to rearrange the elements. Readers who
know other good reasons why this guarantee is given are welcome
to correct me (hello Chris :-)).

(%%): Note that in the case of a C-union the guarantee is *not*
that the elements overlap: They only *may* overlap (unless they
are of the same type or they are different C-structs but with
components of the same type at the beginning, which leads back
to the problem when and when not rearranging could have been
allowed ... again, correct me if I'm wrong).
-- 
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83

peter@ficc.uu.net (Peter da Silva) (02/23/90)

> (%): Even the guarantee, that the struct elements are at ascending
> addresses in the order they are declared, IMHO only was given
> to avoid complex (and hard to understand) rules, when and when
> not it would be allowed to rearrange the elements. Readers who
> know other good reasons why this guarantee is given are welcome
> to correct me (hello Chris :-)).

It makes the following two practices reasonably portable:

1:
	struct list_header {
		struct list_header *next, *prev;
	};

	struct object {
		struct list_header list;
		...
	};

	struct list_header *my_list = NULL;
	struct object my_object;
	extern add_list(struct list_header **list, struct list_header *elt);

	add_list(&my_list, &my_object);

2:
	struct buffer {
		int len;
		char *next;
		char data[1];
	};

	struct buffer *new_buffer(size)
	int size;
	{
		struct buffer *temp;
		
		temp = (struct buffer *) malloc(sizeof *temp + size);
		if(temp) {
			temp->len = size;
			temp->next = &temp->data[0];
		}
		return temp;
	}
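
For practice 1, a compilable sketch might look like this. The body of
add_list is an assumption (the fragment above only declares it), and the
value member stands in for the elided contents of struct object.

```c
#include <stddef.h>

struct list_header {
    struct list_header *next, *prev;
};

struct object {
    struct list_header list;   /* must remain the first member */
    int value;                 /* hypothetical payload */
};

/* Push elt onto the front of the list whose head pointer is *list. */
static void add_list(struct list_header **list, struct list_header *elt)
{
    elt->prev = NULL;
    elt->next = *list;
    if (*list != NULL)
        (*list)->prev = elt;
    *list = elt;
}
```

A caller passes (struct list_header *)&my_object and later casts a list
pointer back to struct object *. That round trip relies on the first member
sitting at the very start of the struct.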
-- 
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'

djones@megatest.UUCP (Dave Jones) (02/23/90)

From article <645@mwtech.UUCP>, by martin@mwtech.UUCP (Martin Weitzel):
) There were some recent postings, that pointed out/complained about
) 'holes' in C-struct definitions.
...
) 
) My advice in this situation is, to change this code as follows:
) 
) char buffer[
) 	  2 /* length of message */
) 	+ 1 /* several flag bits */
) 	+ 1 /* type of message */
) 	+ 4 /* checksum */
) 	+ 100 /* arbitrary message */
) ];
) 
) #define m_length(b)	(*(short *)        ((char *)(b) + 0))
) #define m_flags(b)	(*(unsigned char *)((char *)(b) + 2))
) #define m_type(b)	(*(char *)         ((char *)(b) + 3))
) #define m_checksum(b)	(*(unsigned long *)((char *)(b) + 4))
) #define m_bytes(b)	(                   (char *)(b) + 8 )
) 

There's probably going to be a flurry of replies telling you why
this will not work in the general case.

These casts from char* to this-or-that* are not going to work
unless the data just happen to be properly aligned for whatever
processor you happen to be using.
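
One common way to sidestep the alignment question is to copy the bytes into
a properly aligned variable with memcpy instead of casting the pointer. A
minimal sketch, with load_ulong as a made-up name; it does nothing about
byte order, which has to be handled separately:

```c
#include <string.h>

/* Fetch an unsigned long stored at any byte offset from base.
   The bytes are taken as they are, i.e. in host byte order. */
static unsigned long load_ulong(const void *base, size_t offset)
{
    unsigned long v;
    memcpy(&v, (const unsigned char *)base + offset, sizeof v);
    return v;
}
```

This works at odd offsets too, since memcpy places no alignment requirement
on its arguments.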

martin@mwtech.UUCP (Martin Weitzel) (02/24/90)

In article <12118@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
}From article <645@mwtech.UUCP>, by me (Martin Weitzel):
}) There were some recent postings, that pointed out/complained about
}) 'holes' in C-struct definitions.
}...
}) 
}) My advice in this situation is, to change this code as follows:
}) 
}) char buffer[
}) 	  2 /* length of message */
}) 	+ 1 /* several flag bits */
}) 	+ 1 /* type of message */
}) 	+ 4 /* checksum */
}) 	+ 100 /* arbitrary message */
}) ];
}) 
}) #define m_length(b)	(*(short *)        ((char *)(b) + 0))
}) #define m_flags(b)	(*(unsigned char *)((char *)(b) + 2))
}) #define m_type(b)	(*(char *)         ((char *)(b) + 3))
}) #define m_checksum(b)	(*(unsigned long *)((char *)(b) + 4))
}) #define m_bytes(b)	(                   (char *)(b) + 8 )
}) 
}
}There's probably going to be a flurry of replies telling you why
}this will not work in the general case.
}
}These casts from char* to this-or-that* are not going to work
}unless the data just happen to be properly aligned for whatever
}processor you happen to be using.

I'm well aware that alignment restrictions may invalidate
certain casts from one pointer type to another, but you must
see my proposal in the context of the original questions:

The posters generally complained that they were not able to
overlay certain byte patterns in memory, because the C-struct
they defined for that purpose contained holes (introduced by
the compiler). The question of which hardware or software produced
the byte patterns was never mentioned in these postings,
but because the posters seemed to be sure that (only) the
holes in the structures caused the problems, the parts must
have been already properly aligned.

If the parts of the byte patterns were not properly
aligned, struct-s *without* holes could not have been
used for this purpose either(%). So my proposal is no worse than
a struct, but sometimes gives (better) control over which
memory locations are accessed than struct-s can provide.
If it is only necessary to *read-access* the bytes in question,
the approach described later in my original posting for getting
'wrong' byte order 'right' may also be used in the case
of not properly aligned short-s, int-s or long-s.

(%) If a compiler that supports an option to pack structures
does this *always tightly*, even on systems with specific alignment
requirements for short-s, int-s and long-s, it may emit code
to access the LSB/MSB individually and combine them in a register,
but this would be such an extreme performance penalty that
I guess such compilers are rare.
-- 
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83