toppin@melpar.UUCP (Doug Toppin X2075) (01/20/90)
We are using the SUN 4/260 which is a RISC architecture machine.
We are having trouble with data alignment in our data structures.
We have to communicate with external devices that require data structures
such as the following:

    struct
    {
        long a;
        short b;
        long c;
    };

When we compile and link something referencing this structure the data
produced appears to have had each element word boundary aligned so that
what results appears to be as follows:

    struct
    {
        long a;
        short b;
        short pad;    <==== this was inserted by cc to align next thing
        long c;
    };

This means that we lose the benefit of data abstraction and have to create
our own output without using structures.
We have not been able to find any Sun-4 cc option that eliminates this problem.
We cannot use the 'compile as Sun-3' option.
Please let us know if you know of a built-in way around this.

thanks
Doug Toppin
uunet!melpar!toppin
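One way to see exactly what the compiler inserted is to ask it for the member offsets. The following is a minimal sketch, assuming a compiler that provides offsetof in <stddef.h>; the tag msg and both function names are made up for illustration and do not come from the post.

```c
#include <assert.h>
#include <stddef.h>

/* Same member list as the structure in the post. */
struct msg {
    long  a;
    short b;
    long  c;
};

/* Bytes of padding the compiler placed between b and c. */
size_t padding_after_b(void)
{
    size_t end_of_b = offsetof(struct msg, b) + sizeof(short);

    /* Members are laid out in declaration order, so c cannot start before b ends. */
    assert(offsetof(struct msg, c) >= end_of_b);
    return offsetof(struct msg, c) - end_of_b;
}

/* Total padding in the structure, including any padding after c. */
size_t total_padding(void)
{
    return sizeof(struct msg)
         - (sizeof(long) + sizeof(short) + sizeof(long));
}
```

With 4-byte longs aligned on 4-byte boundaries, as on the Sun-4 described above, both functions return 2. Other machines can give other answers, which is the point of the replies that follow.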
johnl@esegue.segue.boston.ma.us (John R. Levine) (01/22/90)
In article <111@melpar.UUCP> toppin@melpar.UUCP (Doug Toppin X2075) writes:
>We are using the SUN 4/260 which is a RISC architecture machine.
>We are having trouble with data alignment in our data structures.
>We have to communicate with external devices that require data structures
>such as the following:
>    struct
>    {
>        long a;
>        short b;
>        long c;
>    };

I guess all the world's not a Vax any more, now it's a 68020.

It would be more correct to say that your external device requires a
four-byte integer, a two-byte integer, and a four-byte integer, all sent
highest byte first. C makes no promise that the layout of structures will be
the same from machine to machine. For instance, if you ran this code on a
386, there doesn't need to be any padding (though many compilers add it to
make the code run faster) but the words are all in the opposite byte order.

The SPARC and every other RISC chip requires that items be aligned on their
natural boundaries, because there is considerable performance to be gained
by doing so, and because it is not very hard to write programs that are
totally insensitive to padding and byte order. Many people have observed
this. In an article on the IBM 370 series in the CACM about 10 years ago,
one of the 370's architects noted that the 370 permits misaligned data while
its predecessor the 360 didn't, and that it was a mistake to have done so
because it's rarely used and adds considerable complexity to every 370
machine.

In the particular case of the SPARC, there is a C compiler option
(documented in the FM) to allow misaligned data at the enormous cost of
several instructions and sometimes a subroutine call for every load and
store.

I presume you are passing byte streams back and forth to your device; a
memory mapped interface that requires misaligned operands is too awful to
contemplate.
You need to write something like this:

    void read_foo_structure(struct foo *p)
    {
        p->a = read_long();
        p->b = read_short();
        p->c = read_long();
    }

    long read_long(void)
    {
        long v;

        /* read in big endian order; f is the stream open on the device */
        v = (long)getc(f) << 24;    /* should do some error checking */
        v |= (long)getc(f) << 16;   /* cast first: int may be only 16 bits */
        v |= (long)getc(f) << 8;
        v |= getc(f);
        return v;
    }

This may seem like more work, but in my experience you write a few of these
things and use them all over the place. Then your code is really portable.
-- 
John R. Levine, Segue Software, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@esegue.segue.boston.ma.us, {ima|lotus|spdcc}!esegue!johnl
"Now, we are all jelly doughnuts."
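The comment in read_long notes that error checking is still needed. A minimal sketch of one way to add it follows, assuming the device is read through a stdio stream; the name read_be and its return convention are hypothetical and not part of the post.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read nbytes (1..4) from f, most significant byte first, into *out.
 * Returns 0 on success, or -1 if the stream ends or fails before all
 * bytes have arrived; *out is left untouched in that case.
 */
int read_be(FILE *f, int nbytes, unsigned long *out)
{
    unsigned long v = 0;
    int i, c;

    assert(nbytes >= 1 && nbytes <= 4);
    for (i = 0; i < nbytes; i++) {
        c = getc(f);
        if (c == EOF)
            return -1;
        v = (v << 8) | (unsigned long)c;
    }
    *out = v;
    return 0;
}
```

A caller in the style of read_foo_structure would then check each return value and give up on the whole record as soon as one field comes back short.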
davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (01/22/90)
johnl@esegue.segue.boston.ma.us (John R. Levine) writes:

| long read_long(void)
| {
|     long v;
|
|     /* read in big endian order */
|     v = getc(f) << 24;    /* should do some error checking */
|     v |= getc(f) << 16;
|     v |= getc(f) << 8;
|     v |= getc(f);
|     return v;
| }
|
| This may seem like more work, but in my experience you write a few of these
| things and use them all over the place. Then your code is really portable.

I agree with your thought, although for portable transfer I usually do
LSB first (not because of any preference) just for the loop. Since I work
with 36 and 64 bit machines, I always add a sign extend on the read.

At one time I was operating a PC (original IBM) with a unique coprocessor
Cray2 on an ethernet link. The C2 calculated data and passed it in 32 bit
RLE format to a BASIC program which used calls to write the display.
Amazing what you can do to get a demo up FAST.
-- 
bill davidsen - sysop *IX BBS and Public Access UNIX
davidsen@sixhub.uucp ...!uunet!crdgw1!sixhub!davidsen
"Getting old is bad, but it beats the hell out of the alternative" -anon
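The LSB-first loop and the sign extend mentioned above could look roughly like this. It is only a sketch under the assumption that the wire format is two's complement and at most 32 bits wide; get_le_signed is a made-up name, and it decodes from a byte buffer rather than a stream to keep the example short.

```c
#include <assert.h>

/*
 * Decode an nbytes-wide (1..4) two's-complement value stored least
 * significant byte first. The result is correct whether long is 32,
 * 36, or 64 bits wide, because the sign is extended by arithmetic
 * rather than by relying on the width of long.
 */
long get_le_signed(const unsigned char *p, int nbytes)
{
    unsigned long v = 0, sign, mask;
    int i;

    assert(nbytes >= 1 && nbytes <= 4);
    for (i = 0; i < nbytes; i++)
        v |= (unsigned long)p[i] << (8 * i);

    sign = 1UL << (8 * nbytes - 1);   /* top bit of the wire format */
    mask = sign | (sign - 1);         /* every bit of the wire format */
    if (v & sign)
        return -(long)(~v & mask) - 1;
    return (long)v;
}
```

The byte index and the shift count advance together, which is what makes the little-endian order convenient for a loop.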
peter@ficc.uu.net (Peter da Silva) (01/22/90)
> I guess all the world's not a Vax any more, now it's a 68020.
Worse, since non-word-aligned values do cost extra cycles to access, any
68020 C compiler that didn't pad that structure is broken. Some "features"
of CISC processors are just too expensive to use.
--
_--_|\ Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/ \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
v "Have you hugged your wolf today?" `-_-'
slackey@bbn.com (Stan Lackey) (01/23/90)
In article <LJ81OX3ggpc2@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>> I guess all the world's not a Vax any more, now it's a 68020.
>Worse, since non-word-aligned values do cost extra cycles to access, any
>68020 C compiler that didn't pad that structure is broken. Some "features"
>of CISC processors are just too expensive to use.

Just a quick summary of the last time we went around on this issue:

There are a number of interesting applications that build many
instances of small data structures, each containing varied data types.
It was said that logic simulators do this. In a machine that forces
you to always have data aligned, this can result in lots of wasted
memory. Not because the programmer is stupid, but because of the
nature of the application.

Now, if I have a 4MB workstation, and alignment restrictions increase the
need from under 4MB to over 4MB, there will be significant paging. I'd
rather spend two cycles to access a word sometimes, than have to page over
the Ethernet. So would the people with whom I share the network.

------

Also: the comments on the 360 (aligned) vs 370 (unaligned): Boy did I
hear a different story. The version I heard was that the 370 supported
unaligned data, because the experience with the 360 showed it was
incredibly painful to be without it. Remember in those days memory was
VERY expensive.
:-) Stan
cik@l.cc.purdue.edu (Herman Rubin) (01/23/90)
In article <LJ81OX3ggpc2@ficc.uu.net>, peter@ficc.uu.net (Peter da Silva) writes: > > I guess all the world's not a Vax any more, now it's a 68020. > > Worse, since non-word-aligned values do cost extra cycles to access, any > 68020 C compiler that didn't pad that structure is broken. Some "features" > of CISC processors are just too expensive to use. Having seen the statement about penalties for unaligned, I tried the following code (hand coded in assembler to eliminate unnecessary overhead): ..... while(k < end)*k++ = *i++ ^ *j++; and the j pointer was deliberately unaligned. Now this was on a VAX, and it is possible that other machines may give different results, but the time penalty, while there, was not excessive. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
weaver@weitek.WEITEK.COM (01/24/90)
In article <51245@bbn.COM> slackey@BBN.COM (Stan Lackey) writes:
>Just a quick summary of the last time we went around on this issue:
>
>There are a number of interesting applications that build many
>instances of small data structures, each containing varied data types.
>It was said that logic simulators do this. In a machine that forces
>you to always have data aligned, this can result in lots of wasted
>memory. Not because the programmer is stupid, but because of the
>nature of the application.
>

I want to point out here that this data alignment problem can be
mostly worked around for application programs.

On a machine with "natural" alignment, a structure (record, common)
made of primitive data items (integers, pointers, floats, etc.)
needs no padding if the elements are ordered such that smaller items
always follow larger items. The size ordering of primitive data
items is machine dependent, but similar from one machine to the next.
If the entire record is not a multiple of the largest required alignment,
then some space may be lost between structures, or in nested
structures. This cannot be handled so easily.

In summary, if you are writing an application from scratch, you
can minimize this effect in an almost (but not quite!) machine
independent way. So for new programs, I think natural alignment
is a good time/speed tradeoff. I also think that supporting
unaligned data by both traps and special in-line code is a good
idea, since so many programs have long histories.

Michael.
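The ordering rule can be checked directly with sizeof. Below is a minimal sketch; the two structures and the helper names are invented for illustration, and the byte counts depend on the target's sizes and alignment rules.

```c
#include <assert.h>
#include <stddef.h>

/* The same four members, first in an arbitrary order ... */
struct scattered { char tag; double value; short count; int id; };

/* ... then largest first, so each member starts where the last one ended. */
struct sorted { double value; int id; short count; char tag; };

/* Bytes the members themselves need, with no padding at all. */
#define PAYLOAD (sizeof(double) + sizeof(int) + sizeof(short) + sizeof(char))

/* Padding bytes in the arbitrarily ordered structure. */
size_t wasted_scattered(void)
{
    assert(sizeof(struct scattered) >= PAYLOAD);
    return sizeof(struct scattered) - PAYLOAD;
}

/* Padding bytes in the size-ordered structure (tail padding only). */
size_t wasted_sorted(void)
{
    assert(sizeof(struct sorted) >= PAYLOAD);
    return sizeof(struct sorted) - PAYLOAD;
}
```

On a common 64-bit x86 ABI, for example, the first structure occupies 24 bytes and the second 16, so the waste drops from 9 bytes to 1. The remaining byte is the tail padding that the post says cannot be handled so easily: it is still paid once per element in an array of these structures.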
hascall@cs.iastate.edu (John Hascall) (01/24/90)
In article <21361> weaver@weitek.UUCP (Michael Gordon Weaver) writes:
}In article <51245> slackey@BBN.COM (Stan Lackey) writes:
}>There are a number of interesting applications that build many
}>instances of small data structures, each containing varied data types.
}>It was said that logic simulators do this. In a machine that forces
}>you to always have data aligned, this can result in lots of wasted
}>memory. Not because the programmer is stupid, but because of the
}>nature of the application.

}I want to point out here that this data alignment problem can be
}mostly worked around for application programs.
}  [sort elements of structures by decreasing size...]

It seems to me that now we have a conflict between "software
engineering" and architecture. It surely seems to me that, from
a programming point of view, you would want your structures in
some meaningful order as an aid to program understanding.
Shouldn't elements that are used together, be located together?
And doesn't everyone pretty much expect certain elements at the
top of structures, for example:

    struct FOO {                struct BAR {
        struct FOO *next;           struct BAR *left;
        struct FOO *prev;           struct BAR *right;
           :                           :
    };                          };

And on machines with "displacement mode" addressing (i.e., 32(R4)
addresses the element 32 bytes into the structure at the address in
register four) there is often a bonus (e.g., speed or code size) for
elements within some distance (i.e., 127 bytes) from the start of the
structure. So if you put the big elements first, you minimize the
number of "close" elements.

John Hascall / ISU Comp Ctr
gary@dgcad.SV.DG.COM (Gary Bridgewater) (01/24/90)
In article <21361@weitek.WEITEK.COM> weaver@weitek.UUCP (Michael Gordon Weaver) writes: >In article <51245@bbn.COM> slackey@BBN.COM (Stan Lackey) writes: >>Just a quick summary of the last time we went around on this issue: >> >>There are a number of interesting applications that build many >>instances of small data structures, each containing varied data types. >>It was said that logic simulators do this. In a machine that forces >>you to always have data aligned, this can result in lots of wasted >>memory. Not because the programmer is stupid, but because of the >>nature of the application. >> > >I want to point out here that this data alignment problem can be >mostly worked around for application programs. I think you missed the phrase "Not because the programmer is stupid..." >On a machine with "natural" alignment, a structure (record, common) >made of primitive data items (integers, pointers, floats, etc.) >needs no padding if the elements are ordered such that smaller items >always follow larger items. The size ordering of primitive data >items is machine dependant, but similar from one machine to the next. >If the entire record is not a multiple of the largest required alignment, >then some space may be lost between structures, or in nested >structures. This cannot be handled so easily. I need to allocate an array of 50,000,000 8 bit integers. How do I do this? Which is more important 1) overall memory use, 2) misalignment penalty, or code readability? Then I need to allocate 1,000,000 structs containing other structs written by another programmer. What is the natural order of the data a priori on any machine? How big is an addr_t on a 386? Sparc? Cray? Is it bigger than a long float? I plan to pass these structures from a Sun 4 to a Vax to a Cray via an ethernet connection. Now what is the natural order? >In summary, if you are writing an application from scratch, you >can minimize this effect in an almost (but not quite!) machine >independant way. 
So for new programs, I think natural alignment >is a good time/speed tradeoff. I also think that supporting >unaligned data by both traps and special in-line code is a good >idea, since so many programs have long histories. I suggest that when RE-writing a program from scratch you can mitigate this effect if you have some idea where the code is going to run. This is of little help to Simulator vendors who have to run across different architectures. When you write a program you have no idea if it will be successful enough to be bothered by data alignment inefficiencies. You are usually more worried about getting it up quickly and in the same execution universe as the specs. In general, you are stuck and at best will have to go back and micro-tune the heck out of it on a case-by-case basis. In your spare time, study malloc algorithms so you can figure out how to allocate bit structures for fun and profit. I agree that it is easier if the hardware lets you misalign but that thinking is passe in the brave new world of RISC where using the computer is a compiler problem. -- Gary Bridgewater, Data General Corporation, Sunnyvale California gary@proa.sv.dg.com or {amdahl,aeras,amdcad}!dgcad!gary Networking is the worst form of data exchange except for all the others (apologies to WC).
larus@primost.cs.wisc.edu (James Larus) (01/25/90)
In article <21361@weitek.WEITEK.COM>, weaver@weitek.WEITEK.COM writes:
> In summary, if you are writing an application from scratch, you
> can minimize this effect in an almost (but not quite!) machine
> independant way. So for new programs, I think natural alignment
> is a good time/speed tradeoff. I also think that supporting
> unaligned data by both traps and special in-line code is a good
> idea, since so many programs have long histories.

This statement may be true in general, but it is not always true. For
example, I wrote a program tracing system that writes out a trace file
consisting of a mixture of bytes, halfwords, and full words. It is
crucial to this system that the byte quantities only take up 8 bits
(otherwise the size of the already large files grows by a factor of 2
or more). However, it means that I need to do unaligned stores into
the trace buffer. And, since I trace programs in real time, I need to
do the stores fast.

The MIPS R2000 has a 2 instruction sequence that can store a
half/fullword quantity on any byte boundary. On SPARC, it takes 7
instructions to store fullwords byte-by-byte. Coming from Berkeley, I
hate to say it, but this is another case in which MIPS has a much
better designed machine than Sun (-:

/Jim
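In portable C the usual way to write such a densely packed trace buffer is to copy each item in with memcpy and let the compiler pick the store sequence for the target. The sketch below assumes that approach; the buffer type, its 64-byte size, and the function names are all hypothetical, and how many instructions each store costs still depends on the machine and compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trace buffer: items are appended with no padding between them. */
struct trace_buf {
    unsigned char data[64];
    size_t used;
};

/* Append n raw bytes at the current position, whatever its alignment. */
void trace_put(struct trace_buf *t, const void *src, size_t n)
{
    assert(t->used + n <= sizeof t->data);
    memcpy(t->data + t->used, src, n);
    t->used += n;
}

/* Append a one-byte item. */
void trace_put_byte(struct trace_buf *t, unsigned char v)
{
    trace_put(t, &v, sizeof v);
}

/* Append a full-word item in host byte order. */
void trace_put_word(struct trace_buf *t, unsigned int v)
{
    trace_put(t, &v, sizeof v);
}

/* Read back a word stored at any byte offset. */
unsigned int trace_get_word(const struct trace_buf *t, size_t off)
{
    unsigned int v;

    assert(off + sizeof v <= t->used);
    memcpy(&v, t->data + off, sizeof v);
    return v;
}
```

Writing one byte and then one word leaves the word at offset 1, which is the unaligned store the post is about. The words are kept in host byte order, so a trace written this way is only readable on a machine with the same byte order and word size.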
venkat@matrix.UUCP (Desikan Venkatrangan) (02/02/90)
In article <1990Jan21.224826.1699@esegue.segue.boston.ma.us> johnl@esegue.segue.boston.ma.us (John R. Levine) writes:
>From article <111@melpar.UUCP>, by toppin@melpar.UUCP (Doug Toppin X2075):
>> We are using the SUN 4/260 which is a RISC architecture machine.
>> We are having trouble with data alignment in our data structures.

and suggests:

> You need to write something like this:
>
>read_foo_structure(struct foo *p)
>{
>    p->a = read_long();
>    p->b = read_short();
>    p->c = read_long();
>}
>
>long read_long(void)
>{
>    long v;
>
>    /* read in big endian order */
>    v = getc(f) << 24;    /* should do some error checking */
>    v |= getc(f) << 16;
>    v |= getc(f) << 8;
>    v |= getc(f);
>    return v;
>}
>
>This may seem like more work, but in my experience you write a few of these
>things and use them all over the place. Then your code is really portable.
>--

For complete portability and ease of maintenance, try to pattern such
routines after the External Data Representation (XDR) as SUN has done, for
support of the RPC mechanism. You will be writing

    bool_t
    xdr_foo(xdrs, objp)
        register XDR *xdrs;
        register objtype *objp;
    {
        return (xdr_long(xdrs, &objp->a) &&
                xdr_short(xdrs, &objp->b) &&
                xdr_long(xdrs, &objp->c));
    }

This way, both read and write operations can be done using the same
routines (with proper setting of XDR_ENCODE/XDR_DECODE in xdrs). Also, the
read/write can be from memory or a file.

The xdr routines for the primitive types are provided by SUN. But they have
chosen to represent 'shorts' as 4-byte quantities externally. If you wish to
avoid this and prefer little-endian representation, you should write similar
routines yourself. The utility rpcgen may be useful as well.
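For anyone writing similar routines by hand, the idea of one routine serving both directions can be sketched without the Sun library. Everything below (struct stream, code_long, code_short, code_foo) is hypothetical and only imitates the XDR pattern; it is not the XDR API. It assumes a big-endian wire format with a 4-byte long and a 2-byte short, which is the layout the original device wanted rather than XDR's 4-byte shorts.

```c
#include <assert.h>
#include <stddef.h>

enum dir { ENCODE, DECODE };

/* Hypothetical stream: a direction plus a cursor into a byte buffer. */
struct stream { enum dir op; unsigned char *buf; size_t pos, len; };

struct foo { long a; short b; long c; };

/* Move n bytes (big-endian) between *v and the buffer, in either direction. */
int code_uint(struct stream *s, unsigned long *v, size_t n)
{
    size_t i;

    assert(n >= 1 && n <= 4);
    if (s->len - s->pos < n)
        return 0;                          /* out of room or out of data */
    if (s->op == ENCODE) {
        for (i = 0; i < n; i++)
            s->buf[s->pos + i] = (unsigned char)((*v >> (8 * (n - 1 - i))) & 0xFF);
    } else {
        *v = 0;
        for (i = 0; i < n; i++)
            *v = (*v << 8) | s->buf[s->pos + i];
    }
    s->pos += n;
    return 1;
}

/* Encode or decode a long as 4 bytes, sign-extending on decode. */
int code_long(struct stream *s, long *v)
{
    unsigned long u = 0;

    if (s->op == ENCODE)
        u = (unsigned long)*v & 0xFFFFFFFFUL;
    if (!code_uint(s, &u, 4))
        return 0;
    if (s->op == DECODE)
        *v = (u & 0x80000000UL) ? -(long)(~u & 0xFFFFFFFFUL) - 1 : (long)u;
    return 1;
}

/* Encode or decode a short as 2 bytes, sign-extending on decode. */
int code_short(struct stream *s, short *v)
{
    unsigned long u = 0;

    if (s->op == ENCODE)
        u = (unsigned long)*v & 0xFFFFUL;
    if (!code_uint(s, &u, 2))
        return 0;
    if (s->op == DECODE)
        *v = (u & 0x8000UL) ? (short)(-(long)(~u & 0xFFFFUL) - 1) : (short)u;
    return 1;
}

/* One description of the record, used for both reading and writing. */
int code_foo(struct stream *s, struct foo *p)
{
    return code_long(s, &p->a) && code_short(s, &p->b) && code_long(s, &p->c);
}
```

The record layout is written down once, in code_foo, so the reader and the writer cannot drift apart. The encoded record is 10 bytes long however the compiler pads struct foo in memory.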
carlw@mercury.sybase.com (carl weidling) (02/03/90)
The question is whether or not C's requirement to build structures with the components in the order in which they were declared is a mistake or not. In article <1990Jan29.173412.2859@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: < stuff deleted > >The basic problem here is that the compiler cannot read minds, and the >language does not provide a way to tell the compiler which of two >interpretations is wanted. The two possibilities are "I want precise >control of what goes into memory" and "I want these members but please >pad as necessary to make accesses fast". Unfortunately, you can't just >say "well, if I want padding I'll put it in myself", because many people >want to write portable programs, and the padding requirements are *very* >machine-specific. Precise control of memory layout is not necessary for < rest of article deleted> Reading this I got an idea which is a slight variation on the idea of a pragma or directive in the language. Why not have a PRE-processor directive that will re-arrange the fields in a structure to maximize efficiency one way or the other? The C-language itself is untouched, the programmer can run the pre-processor by itself on the code to see what was done. Perhaps lint could be made smart enough to tell if someone was playing too many games with one of these re-arranged structures. Something like struct { int alpha; #ARRANGE_ANY_WAY_YOU_WANT /* maybe specify criteria? i.e. speed vs compact */ long beta; char gamma[3]; #END_ARRANGE } -Carl Weidling
shap@delrey.sgi.com (Jonathan Shapiro) (02/04/90)
In article <8314@sybase.sybase.com> carlw@mercury.UUCP (carl weidling) writes: > Why not have a PRE-processor directive that will re-arrange the >fields in a structure to maximize efficiency one way or the other? Yuck. If this problem is worth solving, it is worth solving right. Jon
carr@gandalf.UUCP (Dave Carr) (02/06/90)
In article <11666@thorin.cs.unc.edu>, tuck@jason.cs.unc.edu (Russ Tuck) writes: > > If the compiler did what you suggest and did not align struct members, > it would in most cases be impossible to access the data member "c" above > without causing the program to dump core. This would not be a useful > compiler "feature" :-). SPARC (and most other RISC archs) requires all > ordinary memory accesses to be aligned. That's *most* RISC architecture. At least with the 80960 (I know, not a true RISC), I have the freedom to access non word aligned data. I would rather have the choice than let the RISC architecture force me. Data explosion on RISC computers is pretty bad. We should have the choice between slowing the CPU down only for those accesses which are not word aligned. We could pad the structures to speed it back up. -- Dave Carr | carr@e.gandalf.ca | If you don't know where Gandalf Data Limited | TEL (613) 723-6500 | you are going, you will Nepean, Ontario, Canada | FAX (613) 226-1717 | never get there.
ingoldsb@ctycal.UUCP (Terry Ingoldsby) (02/08/90)
In article <1648@skye.ed.ac.uk>, richard@aiai.ed.ac.uk (Richard Tobin) writes: > In article <LJ81OX3ggpc2@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes: > >Worse, since non-word-aligned values do cost extra cycles to access, any > >68020 C compiler that didn't pad that structure is broken. > > This is nonsense. Which you want depends whether speed or size is more > important. A valid criticism would be that too many C compilers don't let > you specify which kind of optimisation you want. > This discussion, IMHO, is pointless. The C compilers work just fine the way are (or at least the ones I am familiar with). I don't think some of the people discussing this realize the implications of what they propose. I work on an Intergraph Clipper based workstation. Unless I am mistaken, floating point values can only be aligned on 8 byte boundaries if the processor is to be able to access them in a single instruction. If you try to access a floating point value that is not 8 byte aligned, it actually grabs the value at the next lowest 8 byte boundary. It doesn't even give a bus error trap! In theory, the compiler could place it on arbitrary boundaries by generating a sequence of instructions that would read adjacent values and AND and OR the values into memory. It sounds to me that we are talking about 4 or 5 instructions to do this, so your access speed would be the pits! The reason people seem to want to be able to store values at arbitrary locations seems to have to do with the need to write out contiguous regions of memory to a binary file. They then complain that reading that file back into the memory of another machine doesn't work. No one ever said it would. If you want portable code, don't write it that way. It is almost always possible to sacrifice portability for speed. I don't know why this is so astonishing; you can't write out binary values for integers between machines, what would lead anyone to believe that structures should be any different. 
C is a low level language. If you want greater data abstraction, move to a higher level language that guarantees that data will appear to be in the same format across systems. That guarantee is not in the C definition; doing so would probably limit C's ability to blast bits. The only format that C guarantees to understand is ascii represented numeric values. The only thread of this discussion that might relate to comp.arch is why processors (such as Clipper) do not give a trap if you try to access memory on illegal boundaries. Surely that would not require much silicon? -- Terry Ingoldsby ctycal!ingoldsb@calgary.UUCP Land Information Systems or The City of Calgary ...{alberta,ubc-cs,utai}!calgary!ctycal!ingoldsb
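The "read adjacent values and AND and OR" sequence described above can be illustrated in C. This is a minimal sketch assuming a big-endian machine, modeled here as an array of aligned 32-bit words in which the most significant byte of each word has the lowest address; load32_at is a made-up name, and a real compiler would emit the equivalent instructions directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fetch the 32-bit value that starts byte_off bytes into words[],
 * using only aligned word loads plus shifts and an OR.
 * When byte_off is not a multiple of 4, words[byte_off / 4 + 1]
 * must also be readable.
 */
uint32_t load32_at(const uint32_t *words, size_t byte_off)
{
    size_t idx = byte_off / 4;
    unsigned shift = (unsigned)(byte_off % 4) * 8;

    assert(words != NULL);
    if (shift == 0)
        return words[idx];                    /* aligned: a single load */
    return (words[idx] << shift)              /* tail of the first word */
         | (words[idx + 1] >> (32 - shift));  /* head of the second word */
}
```

An unaligned load therefore costs two loads, two shifts, and an OR, in line with the post's estimate of 4 or 5 instructions. A store is worse, since both neighbouring words must be read, masked, merged, and written back.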
cik@l.cc.purdue.edu (Herman Rubin) (02/11/90)
In article <328@ctycal.UUCP>, ingoldsb@ctycal.UUCP (Terry Ingoldsby) writes: > In article <1648@skye.ed.ac.uk>, richard@aiai.ed.ac.uk (Richard Tobin) writes: > > In article <LJ81OX3ggpc2@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes: ...................... > I don't know why this is so astonishing; you can't write out binary > values for integers between machines, what would lead anyone to believe > that structures should be any different. I can see no more reason why strings of ASCII characters should be transferrable by hardware with little software intervention than binary integers, other fixed place binary numbers, other types of numbers (not strings of numerals), mathematical symbols beyond the usual ones, etc. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
woody@rpp386.cactus.org (Woodrow Baker) (02/11/90)
In article <328@ctycal.UUCP>, ingoldsb@ctycal.UUCP (Terry Ingoldsby) writes:
> In article <1648@skye.ed.ac.uk>, richard@aiai.ed.ac.uk (Richard Tobin) writes:
> This discussion, IMHO, is pointless. The C compilers work just fine the way
> are (or at least the ones I am familiar with). I don't think some of the
> people discussing this realize the implications of what they propose.

Wrong. It depends on what you do. I happen to do programming dealing
with industrial controllers. Specifically, I maintain a compiler, editor,
downloader, and monitor package used to program Eagle Signal Controls
EPTAK series industrial controllers. The code that I work on runs under
MS-DOS. I have to do things like reach out over the network, and read
data structures out of the remote controllers. These structures, for the
most part, are a mix of byte and word fields. I then have to parse through
them, and isolate the parts. Structures are the obvious way to do this.
BUT, the @#$% compiler chooses to pad byte or char values out to ints.
This obviously screws up the data structure access to the retrieved values.
I have wound up doing things that I am not proud of, like unions, and
monkeying around with pointers to the structures such that they don't
point to where they should, but to some offset other than the first byte
of the structure, etc. Yes, I could choose to use an array, but it is
clearer to use standard field names (at least standard for the EPTAK
controllers) to access these data fields.

Cheers
Woody
peter@ficc.uu.net (Peter da Silva) (02/11/90)
Use structs internally.

Provide functions to read and write each structure, that do the needed
conversions. Never touch the external format internally. For example:

Analog accumulator:

    | flags  | val.lo   val.hi |
    +--------+--------+--------+
    | BYTE 0 | BYTE 1 | BYTE 2 |

struct accumulator {
    char flags;
    int value;
};

read_accumulator(addr, info)
unsigned char *addr;    /* unsigned, so val.lo is not sign-extended */
struct accumulator *info;
{
    info->flags = addr[0];
    info->value = addr[2];
    info->value = (info->value << 8) | addr[1];
}

write_accumulator(addr, info)
unsigned char *addr;
struct accumulator *info;
{
    *addr++ = info->flags;
    *addr++ = info->value & 0xFF;
    *addr = (info->value >> 8) & 0xFF;
}
-- 
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'
ronald@robobar.co.uk (Ronald S H Khoo) (02/12/90)
In article <17906@rpp386.cactus.org> woody@rpp386.cactus.org (Woodrow Baker) writes: > > MS-DOS. I have to do things like reach out over the network, and read > data structures out of the remote controllers. These structures for the > most part, are a mix of byte and word fields. I then have to parse through > them, and isolate the parts. Structures are the obvious way to do this. > BUT, the @#$% compiler choses to pad byte or char values out to ints. #ifdef MEDIUM_MADRAS You don't think this is a hint that it would have been *so* much easier if everything spoke *text* instead. Sure, there's the overhead of binary->text->binary, but the advantages outweigh the cost, especially if you ever have a mix of controllers with wildly differing internal architectures. Oh, you want to discourage that to lock your customers in? Excuse me. #endif -- Eunet: Ronald.Khoo@robobar.Co.Uk Phone: +44 1 991 1142 Fax: +44 1 998 8343 Paper: Robobar Ltd. 22 Wadsworth Road, Perivale, Middx., UB6 7JD ENGLAND. $Header: /usr/ronald/.signature,v 1.2 90/01/26 15:17:15 ronald Exp $ :-)
msb@sq.sq.com (Mark Brader) (02/13/90)
> I can see no more reason why strings of ASCII characters should be > transferrable by hardware with little software intervention than binary > integers, other fixed place binary numbers, other types of numbers ...etc. Because ASCII is, after all, the American Standard Code for Information Interchange, and those other things aren't. See signature quote. Followups to comp.arch. -- Mark Brader, SoftQuad Inc., Toronto, utzoo!sq!msb, msb@sq.com A standard is established on sure bases, not capriciously but with the surety of something intentional and of a logic controlled by analysis and experiment. ... A standard is necessary for order in human effort. -- Le Corbusier This article is in the public domain.
cik@l.cc.purdue.edu (Herman Rubin) (02/13/90)
In article <S_O1_F6xds13@ficc.uu.net>, peter@ficc.uu.net (Peter da Silva) writes: > Use structs internally. > > Provide functions to read and write each structure, that do the needed > conversions. Never touch the external format internally. [Example deleted.] This is another situation where the procedure is extremely slow in software. If the appropriate hardware were provided, this would not be a problem. But would the machine then be RISC? -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
peter@ficc.uu.net (Peter da Silva) (02/14/90)
In article <1925@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > This is another situation where the procedure is extremely slow in software. > If the appropriate hardware were provided, this would not be a problem. But > would the machine then be RISC? Who cares if it's RISC, CISC, VLIW, or a bunch of elves with abaci? If it's fast enough, fine. If it's not, unroll the loop to the LCD of the struct size and the data size. If that doesn't do it, recode in assembler. Then get a faster machine (where faster is defined in terms of the problem you have to solve: if the problem involves moving weird numbers of bits around all the byte ops in the world won't help you). Maybe a coprocessor would help (like having a disk controller to convert NRZ into MFM instead of doing it yourself). Most of the time this particular operation isn't a bottleneck, so who cares how fast it is? -- _--_|\ Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>. / \ \_.--._/ Xenix Support -- it's not just a job, it's an adventure! v "Have you hugged your wolf today?" `-_-'
pasek@ncrcce.StPaul.NCR.COM (Michael A. Pasek) (02/14/90)
In <17906@rpp386.cactus.org> woody@rpp386.cactus.org (Woodrow Baker) writes: >In <328@ctycal.UUCP>, ingoldsb@ctycal.UUCP (Terry Ingoldsby) writes: >> This discussion, IMHO, is pointless. The C compilers work just fine the way >> are (or at least the ones I am familiar with). I don't think some of the >> people discussing this realize the implications of what they propose. >Wrong. It depends on what you do. [specifics deleted..] > I have to do things like reach out over the network, and read >data structures out of the remote controllers. These structures for the >most part, are a mix of byte and word fields. I then have to parse through >them, and isolate the parts. Structures are the obvious way to do this. >BUT, the @#$% compiler choses to pad byte or char values out to ints. I also have the same problem. Having the compiler pad to the "native" data size is OK if (and ONLY if) you have complete control over that data structure and do not need to share it with other programs/systems. However, in data communications protocols (pick one), the programmer has NO control over the data structure -- it is predefined, and doesn't come with that nice padding that the compiler likes to put in. Some recent RISC compilers (I'm looking at the 29K) allow you to specify whether structures are "packed" or not, which I think is mandatory. Unfortunately, in the case of the 29K compiler, although it will "pack" structures, as far as I know it will NOT generate the appropriate instructions to access those structures if the external memory subsystem does NOT support non-aligned accesses. Oh, well.... M. A. Pasek Software Development NCR Comten, Inc. (612) 638-7668 MNI Development 2700 N. Snelling Ave. pasek@c10sd3.StPaul.NCR.COM Roseville, MN 55113
ingoldsb@ctycal.UUCP (Terry Ingoldsby) (02/15/90)
In article <17906@rpp386.cactus.org>, woody@rpp386.cactus.org (Woodrow Baker) writes: > In article <328@ctycal.UUCP>, ingoldsb@ctycal.UUCP (Terry Ingoldsby) writes: > > In article <1648@skye.ed.ac.uk>, richard@aiai.ed.ac.uk (Richard Tobin) writes: > > This discussion, IMHO, is pointless. The C compilers work just fine the way > > are (or at least the ones I am familiar with). I don't think some of the > > people discussing this realize the implications of what they propose. > Wrong. It depends on what you do. I happen to do programming dealing > with industrial controllers. Specificaly, I maintain a compiler, editor > downloader, and monitor package used to program Eagle Signal Controls > EPTAK series industrial controllers. The code that I work on runs under > MS-DOS. I have to do things like reach out over the network, and read > data structures out of the remote controllers. These structures for the > most part, are a mix of byte and word fields. I then have to parse through > them, and isolate the parts. Structures are the obvious way to do this. > BUT, the @#$% compiler choses to pad byte or char values out to ints. If you are passing values across a network to dissimilar machines, you should be using something like the XDR (External Data Representation). This makes for portable (although messy) code. In your case, I would agree that your compiler might reasonably be considered to be malfunctioning, since the Intel processors can access arbitrarily aligned data. The discussion originally discussed RISC processors which can NOT access arbitrary alignments for all data types. In this case padding is necessary. To minimize the amount of padding, it is necessary to reorder the structure elements. This is in accordance with K&R which (as I recall) explicitly states that the elements may be re-ordered. I re-iterate my original claim; it is not the compilers that are causing the problems (your case excepted). 
Rather, it is the fact that different processors have different access requirements for data types. Even if you wrote your programs in RISC assembler (a horrible thought), you could not align your variables arbitrarily. You would be forced to make the same decisions and tradeoffs that the compilers make.
--
Terry Ingoldsby                ctycal!ingoldsb@calgary.UUCP
Land Information Systems           or
The City of Calgary       ...{alberta,ubc-cs,utai}!calgary!ctycal!ingoldsb
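The effect of member order on padding can be measured directly. The sketch below uses two hypothetical structs (`mixed` and `sorted` are names invented for illustration) holding the same members in different orders. Note that the reordering is done by hand in the source: standard C lays members out in declaration order, so the compiler will not do it for you.

```c
#include <stddef.h>

/* Members in an order that tends to force padding around 'value'. */
struct mixed {
    char tag;
    long value;
    char flag;
};

/* Same members, largest first, so the two chars can share one slot. */
struct sorted {
    long value;
    char tag;
    char flag;
};

/* Padding = total size minus the sum of the member sizes. */
size_t mixed_padding(void)
{
    return sizeof(struct mixed) - (sizeof(long) + 2 * sizeof(char));
}

size_t sorted_padding(void)
{
    return sizeof(struct sorted) - (sizeof(long) + 2 * sizeof(char));
}
```

The exact numbers depend on the compiler and target, so no particular sizes are claimed here; on typical machines `sorted_padding()` comes out smaller than `mixed_padding()`.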
martin@mwtech.UUCP (Martin Weitzel) (02/21/90)
There were some recent postings that pointed out (or complained about) 'holes' in C-struct definitions. I hope it benefits some readers if I explain an alternate view of C-struct-s and give some advice on how to access a given byte layout in memory in a portable (and nevertheless painless) way that avoids struct-s completely. Because the latter may be of more interest, I'll come to it first.

Suppose you have some library function 'getmsg' to which you supply the address of a buffer, and when the function returns it has filled the buffer with the following information:

    2 Byte Integer - length of message
    1 Byte         - several flag bits
    1 Byte         - type of message
    4 Byte Integer - checksum
    100 Byte       - arbitrary message

Many C programmers now think about defining the following

    struct m {
        short m_length;
        unsigned char m_flags;
        char m_type;
        unsigned long m_checksum;
        char m_bytes[100];
    } buffer;

so that after a 'getmsg(&buffer)' they can access the individual parts 'by name', eg: buffer.m_length, buffer.m_flags, .... ... and as the previous posters pointed out, they eventually get trapped by the 'holes' inserted into the struct by the compiler for the sake of efficiency.

My advice in this situation is to change this code as follows:

    char buffer[
          2   /* length of message */
        + 1   /* several flag bits */
        + 1   /* type of message */
        + 4   /* checksum */
        + 100 /* arbitrary message */
    ];

    /* add the byte offset first, then cast to the field's type */
    #define m_length(b)   (*(short *)        ((char *)(b) + 0))
    #define m_flags(b)    (*(unsigned char *)((char *)(b) + 2))
    #define m_type(b)     (*(char *)         ((char *)(b) + 3))
    #define m_checksum(b) (*(unsigned long *)((char *)(b) + 4))
    #define m_bytes(b)    (                   (char *)(b) + 8 )

(I inserted some white space for readability.) The least you must know of your compiler in that case is that a 'char' occupies exactly one byte in an 'array of char' (and, for the casts above, that 'short' and 'long' are 2 and 4 bytes wide). But as before, you can access the individual parts 'by name' as follows: m_length(buffer), m_flags(buffer), ....
If 'getmsg' is always supplied with the same buffer, you could make it even simpler by avoiding parametrized macros and using

    #define m_length (*(short *)buffer)
    #define m_flags  (*(unsigned char *)(buffer + 2))
    ......

Note that the above expressions are also 'lvalues', ie you can use them on the left side of an assignment. There remains only the minor problem that 'buffer' must be properly aligned. (Techniques for achieving this are shown in K&R - you simply have to define buffer as a union with a member of the type whose alignment you need. Alternatively you may allocate the buffer with 'malloc'.)

If your concern is only 'reading' the elements out of the buffer, you have the additional benefit that you can transparently compensate for possible 'byte-order' problems. Suppose the message is produced by some piece of hardware that puts the LSB of a 16 Bit Integer at the lower address, and you want to move this hardware to a system where the CPU takes just the opposite view. All you have to change is:

    #define m_length ((short)(((*(unsigned char *)(buffer + 1)) << 8) \
                              | (*(unsigned char *)buffer)))
    .......

(Hope I missed no brackets ... :-))

Now back to an alternate view of the C-struct-s, hit 'n' if you are no longer interested. IMHO many features of the C language can be explained elegantly and easily if you 'translate' the feature to the 'machine level'. (Eg I explain much about pointers and arrays to my classes by sketching pictures of the contents of the data segment.) One thing that is easily misunderstood here is that such an explanation often describes only *one* possible approach to implementing the abstract concept: Though it seems natural to think about a C-struct as being a collection of individual variables located at increasing memory addresses in the order they are declared(%) as struct-components, it often makes more sense to see a C-struct only as a collection of data-items that are guaranteed *not* to overlap(%%).
Furthermore the compiler asserts that access to a named struct-component will always refer to the same part of memory whenever the struct's address is the same (important when transferring struct-pointers as function parameters). The other guarantee, that the struct-components are located (more or less) adjacent in memory, is only of some 'practical' value, especially if you have an 'array of struct'-s or write one struct to a file (using write/fwrite together with sizeof), but has nothing to do with the abstract concept of a C-struct.

(%): Even the guarantee that the struct elements are at ascending addresses in the order they are declared was IMHO only given to avoid complex (and hard to understand) rules about when it would and would not be allowed to rearrange the elements. Readers who know other good reasons why this guarantee is given are welcome to correct me (hello Chris :-)).

(%%): Note that in the case of a C-union the guarantee is *not* that the elements overlap: They only *may* overlap (unless they are of the same type or they are different C-structs but with components of the same type at the beginning, which leads back to the problem of when rearranging could and could not have been allowed ... again, correct me if I'm wrong).
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
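The byte-order trick described in this post can also be written as small functions instead of macros. The following is a minimal sketch, assuming the 108-byte message layout from the post and a sender that stores the least significant byte first; the names `get_le16`, `get_le32`, `msg_length` and so on are made up for illustration. Because every access goes through `unsigned char`, neither the alignment of the buffer nor the byte order of the host matters.

```c
#include <stdint.h>

/* Byte offsets of the fields in the message layout described in the post. */
enum {
    MSG_LENGTH_OFF   = 0,
    MSG_FLAGS_OFF    = 2,
    MSG_TYPE_OFF     = 3,
    MSG_CHECKSUM_OFF = 4,
    MSG_BYTES_OFF    = 8,
    MSG_SIZE         = 108
};

/* Read a 16-bit value stored least significant byte first. */
uint16_t get_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit value stored least significant byte first. */
uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

uint16_t msg_length(const unsigned char *buf)
{
    return get_le16(buf + MSG_LENGTH_OFF);
}

unsigned char msg_flags(const unsigned char *buf)
{
    return buf[MSG_FLAGS_OFF];
}

uint32_t msg_checksum(const unsigned char *buf)
{
    return get_le32(buf + MSG_CHECKSUM_OFF);
}
```

Unlike the macros, these accessors are read-only; writing a field would need matching `put_` functions that store one byte at a time.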
peter@ficc.uu.net (Peter da Silva) (02/23/90)
> (%): Even the guarantee, that the struct elements are at ascending > adresses in the order they are declared, IMHO only was given > to avoid complex (and hard to understand) rules, when and when > not it would be allowed to rearrange the elements. Readers who > know other good reasons why this guarantee is given are welcome > to correct me (hello Chris :-)).

It makes the following two practices reasonably portable:

1:
    struct list_header {
        struct list_header *next, *prev;
    };
    struct object {
        struct list_header list;    /* must be the first member */
        ...
    };
    struct list_header *my_list = NULL;
    struct object my_object;
    extern add_list(struct list_header **list, struct list_header *elt);

    /* the header sits at the very start of the object, so the cast is safe */
    add_list(&my_list, (struct list_header *)&my_object);

2:
    #include <stdlib.h>    /* malloc */

    struct buffer {
        int len;
        char *next;
        char data[1];    /* must be the last member: extra storage follows it */
    };

    struct buffer *new_buffer(size)
    int size;
    {
        struct buffer *temp;

        temp = (struct buffer *) malloc(sizeof *temp + size);
        if(temp) {
            temp->len = size;
            temp->next = &temp->data[0];
        }
        return temp;
    }
--
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'
djones@megatest.UUCP (Dave Jones) (02/23/90)
From article <645@mwtech.UUCP), by martin@mwtech.UUCP (Martin Weitzel): ) There were some recent postings, that pointed out/complained about ) 'holes' in C-struct definitions. ... ) ) My advice in this situation is, to change this code as follows: ) ) char buffer[ ) 2 /* length of message */ ) + 1 /* several flag bits ) + 1 /* type of message */ ) + 4 /* checksum */ ) + 100 /* arbitrary message */ ) ]; ) ) #define m_length(b) (*((short *) (char *)(b) + 0)) ) #define m_flags(b) (*((unsigned char *)(char *)(b) + 2)) ) #define m_type(b) (*((char *) (char *)(b) + 3)) ) #define m_checksum(b) (*((unsigned long *)(char *)(b) + 4)) ) #define m_bytes(b) ( (char *)(b) + 8 ) ) There's probably going to be a flurry of replies telling you why this will not work in the general case. These casts from char* to this-or-that* are not going to work unless the data just happen to be properly aligned for whatever processor you happen to be using.
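One common way around the alignment objection is to copy the bytes into a properly aligned local variable instead of dereferencing a cast pointer. This is a sketch with hypothetical helper names (`load_ulong`, `store_ulong`), not something proposed in the thread. It keeps the host's byte order and the host's size of `unsigned long`, so it addresses only the alignment problem, not byte order or field width.

```c
#include <string.h>

/* Read an unsigned long from any address, aligned or not. */
unsigned long load_ulong(const void *p)
{
    unsigned long v;
    memcpy(&v, p, sizeof v);   /* byte copy into an aligned local */
    return v;
}

/* Write an unsigned long to any address, aligned or not. */
void store_ulong(void *p, unsigned long v)
{
    memcpy(p, &v, sizeof v);
}
```

For example, `load_ulong(buffer + 4)` would fetch the checksum field of the quoted layout on a machine where `unsigned long` is 4 bytes wide, whatever the alignment of `buffer`.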
martin@mwtech.UUCP (Martin Weitzel) (02/24/90)
In article <12118@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes: }From article <645@mwtech.UUCP), by me (Martin Weitzel): }) There were some recent postings, that pointed out/complained about }) 'holes' in C-struct definitions. }... }) }) My advice in this situation is, to change this code as follows: }) }) char buffer[ }) 2 /* length of message */ }) + 1 /* several flag bits }) + 1 /* type of message */ }) + 4 /* checksum */ }) + 100 /* arbitrary message */ }) ]; }) }) #define m_length(b) (*((short *) (char *)(b) + 0)) }) #define m_flags(b) (*((unsigned char *)(char *)(b) + 2)) }) #define m_type(b) (*((char *) (char *)(b) + 3)) }) #define m_checksum(b) (*((unsigned long *)(char *)(b) + 4)) }) #define m_bytes(b) ( (char *)(b) + 8 ) }) } }There's probably going to be a flurry of replies telling you why }this will not work in the general case. } }These casts from char* to this-or-that* are not going to work }unless the data just happen to be properly aligned for whatever }processor you happen to be using.

I'm well aware that alignment restrictions may invalidate certain casts from one pointer type to another, but you must see my proposal in the context of the original questions: The posters generally complained that they were not able to overlay certain byte patterns in memory, because the C-struct they defined for that purpose contained holes (introduced by the compiler). The question of which hardware or software produced the byte patterns was never mentioned in these postings, but because the posters seemed to be sure that (only) the holes in the structures caused the problems, the parts must have been already properly aligned. If the parts of the byte patterns were not properly aligned, struct-s *without* holes could not have been used for this purpose either(%). So my proposal is no worse than a struct, but sometimes helps to get better control over which memory locations are accessed than struct-s can provide.
If it is only necessary to *read-access* the bytes in question, the approach described later in my original posting for getting 'wrong' byte order 'right' may also be used for short-s, int-s or long-s that are not properly aligned.

(%) If a compiler that supports an option to pack structures does this *always tightly*, even on systems with specific alignment requirements for short-s, int-s and long-s, it may emit code to access the LSB and MSB individually and combine them in a register, but this would be such an extreme performance penalty that I guess such compilers are rare.
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83