swanson@ihlpl.UUCP (Swanson) (11/20/86)
Could someone please explain to me the rationale behind the way INTEL
stores words in memory?  The way Motorola stores words in memory?
Please email.  Thank you.

Robert Swanson
ihnp4!tss
gnu@hoptoad.UUCP (11/23/86)
In article <1509@ihlpl.UUCP>, swanson@ihlpl.UUCP (Swanson) writes:
> Could someone please explain to me the rationale behind the
> way INTEL stores words in memory?  The way Motorola stores
> words in memory?

If enough people are apathetic (i.e. don't complain), I will post a
great piece, "On Holy Wars and a Plea for Peace", which is the best
description of byte-ordering problems I've ever seen.  It was written
by Danny Cohen of USC-ISI, released as an Internet Experiment Note
(IEN-137), and eventually published in Datamation.  It runs about 36K
bytes.  Send me mail if you think I should not post it.

I will post it to mod.sources.doc if I can get any response out of the
moderator.  That group has been inactive for a long time.
--
John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu  jgilmore@lll-crg.arpa
"I can't think of a better way for the War Dept to spend money than to
 subsidize the education of teenage system hackers by creating the Arpanet."
gene@cooper.UUCP (Eugene Kwiecinski) (11/24/86)
In article <1509@ihlpl.UUCP>, swanson@ihlpl.UUCP (Swanson) writes:
> Could someone please explain to me the rationale behind the
> way INTEL stores words in memory?  The way Motorola stores
> words in memory?  Please email.  Thank you.

To save on a few transistors, of course.  (Money is money.)  From a
hardware point of view, it's easier to add the LSBs first and work up
to the MSBs (B = byte, not bit).

Bye,
Gene

Usenet (UUCP) Address:
                           cucard\
                    psuvax!cmcl2\
{psuvax1!princeton, ucbvax!ulysses}!allegra>!phri!cooper!gene
                         columbia/
  {decwrl!ihnp4, harvard!seismo, decvax}!philabs/
(Whew!)
hansen@mips.UUCP (Craig Hansen) (11/26/86)
In article <1509@ihlpl.UUCP>, swanson@ihlpl.UUCP (Swanson) writes:
> Could someone please explain to me the rationale behind the
> way INTEL stores words in memory?  The way Motorola stores
> words in memory?  Please email.  Thank you.

You will undoubtedly get several lame excuses why one ordering is
better than the other.  They are both right... and both wrong.  The
two conventions are commonly described as little-endian and
big-endian, a reference to the frivolous dispute the Lilliputians
engaged in over which end an egg should be broken on, in Jonathan
Swift's _Gulliver's_Travels_.

At this point, a debate between the two conventions is entirely
frivolous.  What's important is that there are two distinct
conventions, and that this can be a barrier to porting programs and
databases, and to data communications between machines employing
different conventions.  At MIPS, we refer to these conventions as
"byte sex," emphasizing the two-ness of and incompatibility between
them, but begging the question as to which is male and which is
female.  (The MIPS processor is, in this way, bisexual, and can be
configured to follow either convention.)

What I find inexcusable is the existence of machines that mix up the
two conventions on a single machine.  Motorola-endian is little-endian
for bits and big-endian for most everything else (including the 68020
bit-field operations).  VAX-endian is mostly little-endian, except for
floating-point values, which are, well, little-big-endian.
--
Craig Hansen             |  "Evahthun' tastes
MIPS Computer Systems    |   bettah when it
...decwrl!mips!hansen    |   sits on a RISC"
bjorn@alberta.UUCP (Bjorn R. Bjornsson) (11/27/86)
In article <1335@hoptoad.uucp>, gnu@hoptoad.uucp (John Gilmore) writes:
> If enough people are apathetic (e.g. don't complain), I will post a
> great piece, "On Holy Wars and a Plea for Peace", which is the best
> description of byte ordering problems I've ever seen. ......

If I recall correctly, the biggest problem with this paper was its
bias: Cohen expresses a definite preference (not in so many words, but
it shines through) and leaves out some good arguments for the
little-endian side.  I'm not unbiased either, but I certainly don't
pretend to be.  I'll elucidate if this discussion gets off the ground
again.

Then again, I can work quite comfortably with either byte ordering,
and do, on Suns and Vaxen, many times with applications that are
sensitive to the particular order.  When it's an issue, big-endian
usually makes things a little bit more interesting, if you have
trouble disposing of your free time, that is.

Bjorn R. Bjornsson
alberta!bjorn
lamaster@nike.uucp (Hugh LaMaster) (12/04/86)
In article <138@pembina.alberta.UUCP> bjorn@alberta.UUCP (Bjorn R. Bjornsson) writes:
>In article <1335@hoptoad.uucp>, gnu@hoptoad.uucp (John Gilmore) writes:
>> If enough people are apathetic (e.g. don't complain), I will post a
>> great piece, "On Holy Wars and a Plea for Peace", which is the best
>> description of byte ordering problems I've ever seen. ......
> ..... .... and leaves out some good arguments
>for the little endian side.  I'm not unbiased either, but I certainly
>don't pretend to be.  I'll elucidate if this discussion gets off the
>ground again.
>
> Bjorn R. Bjornsson
> alberta!bjorn

I, for one, would like to hear some good arguments for or against a
particular byte ordering.  It is my belief that there is no intrinsic
architectural reason for either one.  However, I am an unapologetic
big-endian for two reasons:

1) A STANDARD is needed for the benefit of those of us who need to
move BINARY data files between machines of different types, such as
graphics and solids-modeling data files;

2) Big-endian is easier to read for English-speaking people because
characters and floating point are in the same order as in English.
(Has anyone ever wondered why we don't write 1 million as 000,000,1?)

But are there any intrinsic reasons for a particular order?  Some
people seem to think so.  What are they?

Hugh LaMaster, m/s 233-9,    UUCP: {seismo,hplabs}!nike!pioneer!lamaster
NASA Ames Research Center    ARPA: lamaster@ames-pioneer.arpa
Moffett Field, CA 94035      ARPA: lamaster%pioneer@ames.arpa
Phone: (415)694-6117         ARPA: lamaster@ames.arc.nasa.gov

"He understood the difference between results and excuses."

("Any opinions expressed herein are solely the responsibility of the
author and do not represent the opinions of NASA or the U.S. Government")
fouts@orville (Marty Fouts) (12/04/86)
A number of people have made claims along the line that BIG-ENDIAN is
"easier to read because it's like English."  This is perhaps an
oversimplification.  You can write a memory-dump program to present
data in whichever format most amuses you, and I have seen terrible
examples of several possible formats, my favorite being the one which
gives lines of hex bytes alongside lines of the ASCII character codes
for the same memory addresses, with the hex reading right to left and
the ASCII reading left to right, like:

	20 6e 69 74 72 61 4d 20      Martin
	20 20 20 73 74 74 6f 46     Fouts

Obviously, any of the four combinations LL, LR, RL, RR could have been
coded, independent of the word size and byte ordering of the machine
in question.  Three of the four would be hard to read compared to the
fourth, depending on who you are.

I don't believe that there is an overriding hardware or software
architectural requirement that makes one byte ordering obviously
right.  There are application- and implementation-dependent factors
favoring either, depending on the circumstance.  (And of course, there
is always the 60-bit word length machine :-)

A standard would be nice, but it's probably too late for that.
(Anybody care to discuss the superiority of EBCDIC over ASCII?)  I
guess we should just be happy that there aren't more byte-within-word
order choices being made.
tim@amdcad.UUCP (Tim Olson) (12/05/86)
In article <791@nike.UUCP> lamaster@pioneer.UUCP (Hugh LaMaster) writes:
>I, for one, would like to hear some good arguments for or against
>a particular byte ordering.  It is my belief that there is no
>intrinsic architectural reason for either one.  However, I am an
>unapologetic big-endian for two reasons:
>1) A STANDARD is needed for the benefit of those of us who need
>to move BINARY data files between machines of different types,
>such as graphics and solids modeling data files;
>2) Big Endian is easier to read for English speaking people
>because characters and floating point are in the same order as
>in English.  (Has anyone ever wondered why we don't write 1 Million
>as 000,000,1 ?)
>But are there any intrinsic reasons for a particular order?  Some
>people seem to think so.  What are they?

I think a major reason why many microprocessors are little-endian is
that they have an 8-bit ancestry; to perform multi-precision
arithmetic efficiently, they must index from the least-significant
byte to the most significant.  However, this is less of a problem with
larger word-size microprocessors, since multi-precision arithmetic is
not used as much past 16 or 32 bits.

One benefit of big-endian byte ordering on large (32-bit or more)
word-size machines is possible fast lexicographic comparison of
character data with the use of integer compare instructions.  Since
the "direction" of MSB to LSB (B = byte) is the same as MSb to LSb
(b = bit), strings may be compared a word at a time instead of
byte-by-byte.  (That is, as long as you use a sane character encoding,
not something like the CDC display codes ;-)
--
Tim Olson
Advanced Micro Devices

"byte ordering preferences expressed in this article do not
necessarily represent the views of this station or its management"
mayer@rochester.ARPA (Jim Mayer) (12/05/86)
I have never been convinced of any fundamental reason to prefer one
byte ordering over another; however, I believe there are some
practical ones.  Basically, any networked machine that uses a
different byte order than the network(s) it is connected to will pay a
(possibly significant) performance penalty.  Furthermore, code that is
written to run on both byte orders will always pay some penalty, even
when run on the "right" machine.  The rest of this article contains an
example, a possible way out of the mess, an observation, and a
question.

THE EXAMPLE:

Suppose a C program reads a message into a structure (let's assume
problems with byte size and alignment have gone away).  A correct
program has two choices: it can convert the structure to the machine's
byte order, or it can leave the structure in network order and convert
on each reference.  In the first case, there are three options:

    struct message { short x; long y; } m;

    /* (1) */
    m.x = ntohs(m.x);
    m.y = ntohl(m.y);

    /* (2) */
    if (host byte order is not network byte order) {
        m.x = ntohs(m.x);
        m.y = ntohl(m.y);
    }

    /* (3) */
    if (sending machine byte order is not host machine byte order) {
        m.x = ntohs(m.x);
        m.y = ntohl(m.y);
    }

In (1), there is a constant penalty of at least one unnecessary copy.
Unnecessary copy operations can quickly destroy the performance of a
message-passing system.  Also, any missing swap operations will not be
detected on a machine with network byte order.  Case (2) assigns the
penalty where it is due, but opens up even more possibilities for
errors.  Case (3) uses the same trick the X display server uses:
accept messages in either byte order.  The code ends up being similar
to (2), but swapping is only done if the sending machine has a
different byte order than the receiving machine.  It has the same
testability problems as (2).

If the structure is maintained in network byte order, then each
reference to the structure entails a possible conversion.
The possibilities for programmer error are quite large here as well.

THE SUGGESTION:

All of the above solutions are error-prone when written in a language
like C that has no notion of byte order.  Languages like CLU and C++
offer the possibility of encapsulating most of the nasty conversions.
Another possibility is the addition of a "message" type constructor,
analogous to "struct" in C, but maintaining a particular byte order
(and floating-point representation, etc.) and either prohibiting or
interpreting correctly things like pointer references.  Adding a
"message" construct would be syntactically prettier (I think) than
forcing a lot of calls to "m.get_x()" and "m.set_x(value)".  It would
also help with other representation issues (like value alignment, byte
and word size, and floating point).

THE OBSERVATION:

I can easily envision a load/store-architecture machine with a
completely bisexual instruction set.  Only the load and store
operations would have to be modified.  I understand from other
postings that the MIPS processor can already be configured in either
mode.

THE QUESTION:

TCP and friends use a "big-endian" order (since TCP is a byte-stream
protocol, this applies only to things like addresses and headers).
What order do other protocol families use?  Are there any major
"little-endian" protocol families?
--
Jim Mayer
Computer Science Department        (arpa) mayer@Rochester.EDU
University of Rochester            (uucp) rochester!mayer
Rochester, New York 14627
billw@navajo.STANFORD.EDU (William E. Westfield) (12/09/86)
Network interfaces should do any byte swapping that may be necessary
for a given machine.  That way there isn't any speed penalty.
(Having to byte-swap the headers only is not so bad; note that the
ones-complement checksum used in TCP/IP is insensitive to byte
swapping: summing byte-swapped words gives the byte-swap of the
original sum...)

BillW
rb@cci632.UUCP (Rex Ballard) (12/10/86)
There are several protocols that are non-endian, such as AX.25.  Most
such protocols simply establish a hierarchy of addresses.  This
hierarchy simplifies routing significantly, since only the machine
that matches the first ID will have to look at the second...

In general, there is no "best" endian approach.  Little-endian
machines have advantages in the integer processing of words larger
than their registers/backplane, while big-endian has advantages in raw
search/compare/sort type applications.  Even graphics isn't "cut and
dried" in favor of either one, since there is lots of shifting (which
gives big-endian an advantage) and lots of integer computation (which
favors little-endian).

The worst, of course, is "middle-endian", such as that used on the
PDP-11 and some 808X machines for longs.

When discussing network protocols or data interchange formats, it is a
good idea to avoid "endian" thinking at all.  Bytes with "extension
bits", or just raw bytes, are definitely preferable.  One
psychological factor that works well is to think in terms of
"system; node" rather than "nodelow; nodehigh" type labelling.

Internally, so long as the machine is consistent, either end is
acceptable, based on the trade-offs desired for the application.

Rex B.
ihm@minnie.UUCP (Ian Merritt) (12/11/86)
>There are several protocols that are non-endian, such as AX.25.
>Most such protocols simply establish a hierarchy of addresses.
>This hierarchy simplifies routing significantly, since only the
>machine that matches the first ID will have to look at the
>second...

What you describe sounds like IP source routing, where each address
examines the address chain, strips the first, and sends the rest on to
that address.  Still, if the basic address size is larger than the CPU
basic word (as it will certainly be in any network with more than 256
nodes, or at least 256 nodes per hierarchical subnet), the issue of
how to combine the basic units at the destination is no better
established by your suggestion than by the existing standards.

>In general, there is no "best" endian approach.  Little-endian
>machines have advantages in the integer processing of words larger
>than their registers/backplane, while big-endian has advantages
>in raw search/compare/sort type applications.  Even graphics
>isn't "cut and dried" in favor of either one, since there is
>lots of shifting (which gives big-endian an advantage) and
>lots of integer computation (which favors little-endian).

Agreed, mostly, though I am somewhat partial (as I have mentioned
before) to the little-endian school.

>The worst, of course, is "middle-endian", such as that used on
>the PDP-11 and some 808X machines for longs.

"Middle-endian" you seem to use as a generic term for anything that is
not consistently either little or big.  I disagree with the term, but
not with the conclusion you draw.  To what 808x machines do you refer?

>When discussing network protocols or data interchange formats, it is
>a good idea to avoid "endian" thinking at all.  Bytes with "extension
>bits", or just raw bytes, are definitely preferable.

I can't agree that extension bits are preferable, and in any case, as
I mentioned above, there is often a need for numbers larger than can
be represented in a single byte.
There is no need to constrict the capability of the protocol just
because it's "hard" to juggle the bytes around to accommodate a
particular machine.  If computers aren't here to juggle bits and
bytes, what ARE they for?  Granted, it would save time in this case if
every machine had the same standard, but since we can't agree on a
uniform standard, juggling is likely to be here to stay for a while at
least.

>One psychological factor that works well is to think in terms of
>"system; node" rather than "nodelow; nodehigh" type labelling.

Again, source routing: not always useful; not necessarily byte
oriented.

>Internally, so long as the machine is consistent, either end
>is acceptable, based on the trade-offs desired for the application.

Agreed, but there are very few machines that are totally consistent
about it.

>Rex B.

Cheerz--

<>IHM<>
--
uucp: ihnp4!nrcvax!ihm