bcase@cup.portal.com (Brian bcase Case) (12/30/88)
>Both representations are useful. Are there any other respective >advantages I've left out? Big endian has the significant advantage that, when properly aligned, character strings can be compared using the full width of the machine's ALU. For 32-bit machines, this means that two four-character (sub)strings can be compared at one time. This is because the lowest address always points to the *first* character in the string. Little endian requires character-at-a-time processing or hardware gymnastics. Since it forces inefficiency, little-endian is for CISCs. :-) :-)
bcase@cup.portal.com (Brian bcase Case) (12/31/88)
>Big endian has the significant advantage that, when properly aligned, >character strings can be compared using the full width of the machine's >ALU. For 32-bit machines, this means that two four-character (sub)strings >can be compared at one time. This is because the lowest address always >points to the *first* character in the string. Oops, I meant to say that the first character is always in a more significant position than the second and succeeding characters. This corresponds to the convention that the first character in a string is the most important in determining its position in an alphabetically sorted list of strings. Thus, after properly aligning, (sub)strings can be compared as if they were simple (unsigned) integers.
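A minimal C sketch of the comparison described above, assuming 8-bit bytes and four-character groups. The names load_be32, load_le32, cmp4_be, cmp4_le and check_pair are invented for illustration. The loads are written with shifts so the result does not depend on the host's own byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Four characters as a big-endian word load would see them:
   the character at the lowest address is the most significant byte. */
uint32_t load_be32(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* The same four characters as a little-endian word load would see them:
   the character at the lowest address is the least significant byte. */
uint32_t load_le32(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Three-way result (-1, 0, 1) from a single unsigned word compare. */
int cmp4_be(const char *a, const char *b)
{
    uint32_t x = load_be32(a), y = load_be32(b);
    return (x > y) - (x < y);
}

int cmp4_le(const char *a, const char *b)
{
    uint32_t x = load_le32(a), y = load_le32(b);
    return (x > y) - (x < y);
}

/* Self-check: the big-endian word compare has the same sign as a
   byte-at-a-time memcmp over the same four characters. */
void check_pair(const char *a, const char *b)
{
    int m = memcmp(a, b, 4);
    assert(cmp4_be(a, b) == (m > 0) - (m < 0));
}
```

With these definitions, cmp4_be("abzz", "bbaa") orders the groups the way a dictionary would, while cmp4_le on the same arguments gives the opposite sign, because the last character becomes the most significant byte. Equality tests come out the same either way.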
patterso@hardees.rutgers.edu (Ross Patterson) (01/01/89)
>Since it forces inefficiency, little-endian is for CISCs. :-) :-)
Does that make the IBM 370 a RISC? ;-)
stuart@bms-at.UUCP (Stuart Gathman) (01/04/89)
In article <13045@cup.portal.com>, bcase@cup.portal.com (Brian bcase Case) writes:
> Big endian has the significant advantage that, when properly aligned,
> character strings can be compared using the full width of the machine's
> ALU. For 32-bit machines, this means that two four-character (sub)strings
> can be compared at one time. This is because the lowest address always
> points to the *first* character in the string. Little endian requires
> character-at-a-time processing or hardware gymnastics.

I do the same thing on little endian. It all depends on how you store the characters. Read the "HOLY WAR" article for a detailed explanation. The problem is, there are no consistent little-endian machines, the big-endian infiltrators have sabotaged every last one (that I know of).

The major (dis)advantages are:

    BIGend     (numeric) compares / divides are faster
    LITTLEend  adds / multiplies are faster
--
Stuart D. Gathman <stuart@bms-at.uucp> <..!{vrdxhq|daitc}!bms-at!stuart>
w-colinp@microsoft.UUCP (Colin Plumb) (01/04/89)
In article <145@bms-at.UUCP> stuart@bms-at.UUCP (Stuart Gathman) writes: >The problem is, there are no consistent little-endian machines, the >big-endian infiltrators have sabotaged every last one (that I know of). The Inmos transputer is uniformly little-endian. This applies to both integers and floating-point numbers (where most others mess up). -- -Colin (uunet!microsof!w-colinp)
hutch@delft (David Hutchens) (01/05/89)
From article <170@microsoft.UUCP>, by w-colinp@microsoft.UUCP (Colin Plumb): > In article <145@bms-at.UUCP> stuart@bms-at.UUCP (Stuart Gathman) writes: >>The problem is, there are no consistent little-endian machines, the >>big-endian infiltrators have sabotaged every last one (that I know of). > > The Inmos transputer is uniformly little-endian. This applies to both > integers and floating-point numbers (where most others mess up). > -- > -Colin (uunet!microsof!w-colinp) Actually, where most little-endian machines screw up is storing the bits in the byte in the wrong order. It is good to hear that somebody got it right and stored a one as 100000...0000 rather than 00000001000...000. (That is what you meant, wasn't it?) Note that this implies one multiplies by 2 by using a RIGHT shift (else there is an inconsistency in the little-endian view in the registers)! The Inmos sounds interesting. David Hutchens hutch@hubcap.clemson.edu ...!gatech!hubcap!hutch
w-colinp@microsoft.UUCP (Colin Plumb) (01/05/89)
In article <4008@hubcap.UUCP> hutch@delft writes:
>Actually, where most little-endian machines screw up is storing the
>bits in the byte in the wrong order. It is good to hear that somebody got it
>right and stored a one as 100000...0000 rather than 00000001000...000.
>(That is what you meant wasn't it?). Note that this implies one
>multiplies by 2 by using a RIGHT shift (else there is an inconsistancy
>in the little-endian view in the registers)! The Inmos sounds interesting.

Sorry, no. Little endian means that if two addressed objects (on the Transputer, the smallest object that can be addressed is a byte) are part of the same number, the object (byte) with the lower address is less significant. Note two things:

-> There is no ordering implied on bits within bytes; a byte is an atomic object, and you can't say which bit of it comes "first." (Of course, in serial communications, the other site of the Holy War, this is significant.)

-> Both big- and little-endian types agree that more significant bits should be to the left, conceptually (the Arabic heritage, remember?); they *don't* agree on whether addresses increase left-to-right (big-endian) or right-to-left (little-endian).

See On Holy Wars and a Plea For Peace for more diagrams.

A one is stored as

    00000000 00000000 00000000 00000001

with the bytes' addresses being

    base+3   base+2   base+1   base+0

Thus, it is impossible for a byte-addressed machine to store the bits in a byte in the wrong order, unless it has bitfield instructions or some such bit-addressing kludge.
--
-Colin (uunet!microsof!w-colinp)
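The layout above can be sketched in C, assuming 8-bit bytes and a 32-bit word. The function names store_le32, store_be32, load_le32 and self_check are made up for this illustration. Note that doubling a value is a left shift in the register under either convention; only the address at which each byte lands differs.

```c
#include <assert.h>
#include <stdint.h>

/* Little-endian store: the byte at base+0 is the least significant. */
void store_le32(unsigned char *base, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        base[i] = (unsigned char)(v >> (8 * i));
}

/* Big-endian store: the byte at base+0 is the most significant. */
void store_be32(unsigned char *base, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        base[i] = (unsigned char)(v >> (8 * (3 - i)));
}

/* Little-endian load, the inverse of store_le32. */
uint32_t load_le32(const unsigned char *base)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)base[i] << (8 * i);
    return v;
}

/* Self-check: a stored value survives a round trip, and multiplying by
   two is a left shift of the register value, whatever the byte order. */
void self_check(uint32_t v)
{
    unsigned char mem[4];
    store_le32(mem, v);
    assert(load_le32(mem) == v);
    store_le32(mem, v << 1);
    assert(load_le32(mem) == (uint32_t)(v << 1));
}
```

Storing the value 1 with store_le32 puts 0x01 at base+0 and zeros at base+1 through base+3; store_be32 puts the 0x01 at base+3 instead. In neither case is there any question of bit order inside a byte.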
mac3n@babbage.acc.virginia.edu (Alex Colvin) (01/05/89)
> > The Inmos transputer is uniformly little-endian. This applies to both
> > integers and floating-point numbers (where most others mess up).

You mean the beginning of a double looks like a float?

    f(x) float x;
    { g(&x); }    /* g() is actually passed a (double *) */

> Actually, where most little-endian machines screw up is storing the
> bits in the byte in the wrong order. It is good to hear that somebody got it

On most of these machines, bits are stored vertically :-/. [half :-)] If you can't index or address bits, there is no order. If it makes you happy, call a right shift (to less significance) a down shift, a left shift an up shift. The big/little thing only has meaning in addressing parts.

Another notational screw-up is where to put address 0 when drawing memory. I always put it at the top ("up there at the bottom of memory").
lamaster@ames.arc.nasa.gov (Hugh LaMaster) (01/05/89)
In article <13045@cup.portal.com> bcase@cup.portal.com (Brian bcase Case) writes: >Big endian has the significant advantage that, when properly aligned, >character strings can be compared using the full width of the machine's >Since it forces inefficiency, little-endian is for CISCs. :-) :-) I could be wrong, but I think a fully consistent little-endian machine (e.g. nsc 32xxx) does not have this disadvantage. All this was covered about 2 years ago on this group: the conclusion then was that little-endian had a small advantage on tiny machines (e.g. 8008 class and slower) needing to do BCD arithmetic, big endian machines have the "advantage" that it is easier to read dumps, and there are no other significant differences. VAXes, of course, are not consistent little- endian or big-endian, but then, we are not supposed to have to read dumps anymore anyway, remember ? :-) -- Hugh LaMaster, m/s 233-9, UUCP ames!lamaster NASA Ames Research Center ARPA lamaster@ames.arc.nasa.gov Moffett Field, CA 94035 Phone: (415)694-6117
seanf@sco.COM (Sean Fagan) (01/05/89)
In article <20264@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster) writes: >I could be wrong, but I think a fully consistent little-endian machine >(e.g. nsc 32xxx) does not have this disadvantage. You're wrong. On a NSC 32k, addresses are in the wrong order (actually, I think it might just be displacements), because the upper 1 or 2 bits determine the size of the address (and means that you can't use a displacement of 2gigs unsigned, or 1 gig signed. everybody sigh in unison 8-)). Also, I'd bet that the FP format is backwards (wrt big vs. little endian). Now, *Cybers* don't have this problem, you betcha. It's kinda nice not having to worry about byte addressing... -- Sean Eric Fagan | "Merry Christmas, drive carefully and have some great sex." seanf@sco.UUCP | -- Art Hoppe (408) 458-1422 | Any opinions expressed are my own, not my employers'.
lamaster@ames.arc.nasa.gov (Hugh LaMaster) (01/06/89)
In article <20264@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster) writes: >I could be wrong, but I think a fully consistent little-endian machine >(e.g. nsc 32xxx) does not have this disadvantage. I was assuming an equality comparison. Most people seem to assume strcmp(), for which it does make a difference (this could lead to a very long discussion of how important strcmp-like comparisons are, etc., which I will avoid.) -- Hugh LaMaster, m/s 233-9, UUCP ames!lamaster NASA Ames Research Center ARPA lamaster@ames.arc.nasa.gov Moffett Field, CA 94035 Phone: (415)694-6117
bcw@rti.UUCP (Bruce Wright) (01/06/89)
In article <20264@ames.arc.nasa.gov>, lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes: > In article <13045@cup.portal.com> bcase@cup.portal.com (Brian bcase Case) writes: > > >Since it forces inefficiency, little-endian is for CISCs. :-) :-) > > All this was covered about 2 years ago on this group: the conclusion then > was that little-endian had a small advantage on tiny machines (e.g. 8008 > class and slower) needing to do BCD arithmetic, big endian machines have > the "advantage" that it is easier to read dumps, and there are no other > significant differences. VAXes, of course, are not consistent little- > endian or big-endian, but then, we are not supposed to have to read dumps > anymore anyway, remember ? :-) > My immediate thought on seeing the VAX instruction set when it first came out was that by making the byte order "little endian" it allowed something like a Fortran compiler to take a statement like: call sub (1) and pass a number to it (a longword - 4 bytes) which would be interpreted correctly whether the receiving formal parameter was a byte, a word (2 bytes), or a longword (4 bytes). This is not possible in a "big endian" machine - you have to know how many bytes of high order 0's to write before you get to the low order byte. Considering that the Fortran of the day had no way to declare the formal parameters for subroutines, and the importance of Fortran in the early days of the VAX (and the fact that the VAX was built with a great deal of input from the software guys), could this be the REAL motivation for "little endian"? Of course the fact I even thought of the possibility of such a trick probably shows I'm just an old Fortrash hacker ... Bruce C. Wright
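A small C model of the trick described above, assuming 8-bit bytes and an argument passed by reference as a 4-byte longword. The enum and the names store32, load_n and pass_long_receive_n are hypothetical; memory is modelled as an explicit byte array so the result does not depend on the machine running the sketch.

```c
#include <assert.h>
#include <stdint.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG };

/* Store a 32-bit actual argument at mem[0..3] in the given byte order. */
void store32(unsigned char *mem, uint32_t v, enum byte_order o)
{
    for (int i = 0; i < 4; i++) {
        int shift = (o == ORDER_LITTLE) ? 8 * i : 8 * (3 - i);
        mem[i] = (unsigned char)(v >> shift);
    }
}

/* Load an nbytes-wide integer starting at the same address mem[0]. */
uint32_t load_n(const unsigned char *mem, int nbytes, enum byte_order o)
{
    assert(nbytes == 1 || nbytes == 2 || nbytes == 4);
    uint32_t v = 0;
    for (int i = 0; i < nbytes; i++) {
        int shift = (o == ORDER_LITTLE) ? 8 * i : 8 * (nbytes - 1 - i);
        v |= (uint32_t)mem[i] << shift;
    }
    return v;
}

/* Caller passes a longword; callee reads it as a byte, word or longword. */
uint32_t pass_long_receive_n(uint32_t actual, int nbytes, enum byte_order o)
{
    unsigned char arg[4];
    store32(arg, actual, o);
    return load_n(arg, nbytes, o);
}
```

In this model, pass_long_receive_n(1, n, ORDER_LITTLE) gives 1 for n = 1, 2 and 4, while with ORDER_BIG only n = 4 gives 1 and the narrower reads give 0. That holds only for values that fit in the narrower type.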
steve@nuchat.UUCP (Steve Nuchia) (01/06/89)
In article <20293@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster) writes:
[supposed advantages of little-endian]
>I was assuming an equality comparison. Most people seem to assume strcmp(),
>for which it does make a difference (this could lead to a very long discussion
>of how important strcmp-like comparisons are, etc., which I will avoid.)

Well, I won't. :-)

The literature on sorting algorithms focuses on the use of a "<=" oracle, by analogy with the mathematical definition of "well order", which is what a sort is supposed to do. In a previous life I derived a sort algorithm that used a three-way oracle (strcmp, in fact) to good advantage. I based the work on the fact that a large part of the comparison expense for strings is in scanning the initial equal part; the three-way answer comes for free after that.

My algorithm maintained in-core data in a trinary tree with a degenerate (linear) subtree for the equals case. The expected data had significant clumping around discrete values so the extra space was well justified. The disk-resident format for intermediate runs included a bit for "known equal" so the tests didn't have to be repeated during merging. It was a very fast sort, given the expected input distribution. (It used a number of other tricks, including bidirectional run management and very-high-order merging. The other tricks exploited the unavoidable disk block caching in unix, but the trinary tree is quite general.)

So, don't discount strcmp's value. Very many programs use it for an equality test only, but sorting still consumes a great deal of computer time in the real world, and when sorting we need to know which way it went. It would be a good idea for computer architects to bear this in mind: As mundane as sorting may seem, it is the benchmark of choice for a great many check-signers.
-- Steve Nuchia South Coast Computing Services uunet!nuchat!steve POB 890952 Houston, Texas 77289 (713) 964 2462 Consultation & Systems, Support for PD Software.
wen-king@cit-vax.Caltech.Edu (King Su) (01/06/89)
In article <2695@rti.UUCP> bcw@rti.UUCP (Bruce Wright) writes:
>My immediate thought on seeing the VAX instruction set when it first came
<out was that by making the byte order "little endian" it allowed something
>like a Fortran compiler to take a statement like:
<
> call sub (1)
<
>and pass a number to it (a longword - 4 bytes) which would be interpreted
<correctly whether the receiving formal parameter was a byte, a word (2 bytes),
>or a longword (4 bytes). This is not possible in a "big endian" machine -
<you have to know how many bytes of high order 0's to write before you get to
>the low order byte.

This cannot have anything to do with byte-ordering because the two byte-ordering conventions are totally symmetrical and isomorphic. Any difference between two machines must have been a result of some asymmetry that was imposed on the machine when the machine was designed.

In the example above, the asymmetry was imposed when the following question was answered: If a data unit consists of a sequence of bytes, what should the address of the data unit be: the address of the MSByte or the address of the LSByte? In VAX, and in most little-endian machines, the address of the LSByte was used to represent the address of the data unit. In 68K and most big-endian machines, the address of the MSByte was used. The choice is quite arbitrary, but the important thing is that it imposes an asymmetry.

The supposed "advantage" of the little-endian byte-ordering is really the advantage of choosing the address of the LSByte to be the address of a multi-byte unit. We can build a big-endian machine with exactly the same advantage if we make the same choice for it as we have made for VAX. In this case, a 'long' that occupies byte addresses 0x20, 0x21, 0x22 and 0x23 will have 0x23 as its address. In general, given any little-endian machine, we can build a big-endian machine that is exactly as good as the little-endian machine (in fact, they will be duals), and vice versa.
Byte-ordering should cease to be the focal point of any arguments; talks about the decisions that lead to the asymmetries should replace it. -- /*------------------------------------------------------------------------*\ | Wen-King Su wen-king@vlsi.caltech.edu Caltech Corp of Cosmic Engineers | \*------------------------------------------------------------------------*/
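The dual machine described in this post can be modelled in a few lines of C. This is only a sketch of the argument, assuming 8-bit bytes; the machine is hypothetical, and so are the names store_be_lsb_addressed and load_be_lsb_addressed. Bytes are stored big-endian (lower address is more significant), but a multi-byte unit is named by the address of its LSByte, so a narrower load at the same address still finds the low-order bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value so that it occupies addr-3 .. addr, with the
   least significant byte at addr and the most significant at addr-3. */
void store_be_lsb_addressed(unsigned char *mem, size_t addr, uint32_t v)
{
    assert(addr >= 3);
    for (int i = 0; i < 4; i++)
        mem[addr - (size_t)i] = (unsigned char)(v >> (8 * i));
}

/* Load an nbytes-wide value whose LSByte is at addr; the unit occupies
   addr-(nbytes-1) .. addr, most significant byte first. */
uint32_t load_be_lsb_addressed(const unsigned char *mem, size_t addr, int nbytes)
{
    assert(nbytes >= 1 && nbytes <= 4 && addr + 1 >= (size_t)nbytes);
    uint32_t v = 0;
    for (int i = nbytes - 1; i >= 0; i--)
        v = (v << 8) | mem[addr - (size_t)i];
    return v;
}
```

Storing 1 as a 'long' at address 0x23 fills 0x20 through 0x23 with 00 00 00 01, and loads of width 1, 2 or 4 at address 0x23 all return 1, which is the same property claimed for the VAX in the quoted text.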
d25001@mic.UUCP (Carrington Dixon) (01/07/89)
In article <2695@rti.UUCP> bcw@rti.UUCP (Bruce Wright) writes:
>My immediate thought on seeing the VAX instruction set when it first came
>out was that by making the byte order "little endian" it allowed something
>like a Fortran compiler to take a statement like:
> call sub (1)
>and pass a number to it (a longword - 4 bytes) which would be interpreted
>correctly whether the receiving formal parameter was a byte, a word (2 bytes),
>or a longword (4 bytes). This is not possible in a "big endian" machine -
>you have to know how many bytes of high order 0's to write before you get to
>the low order byte. Considering that the Fortran of the day had no way to
>declare the formal parameters for subroutines, and the importance of
>Fortran in the early days of the VAX (and the fact that the VAX was built
>with a great deal of input from the software guys), could this be the REAL
                                                     ^^^^^^^^^^^^^^^^^^^^^^
>motivation for "little endian"?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If I thought that it was, my respect for DEC would take a BIG drop. Passing a longword and receiving a word is an ERROR. Sure, it works fine for your example, but what if the statement were:

    call sub (70000)

Now both the big- and little-endian machines are receiving the "wrong" value. On the big-endian machine, the developer will probably find his/her mistake during initial checkout. On the little-endian machine the error _may_ not be found until the module has been in production, spewing out wrong answers for months. I don't know about what kind of environment you work in, but where I work this kind of error could cost my company $k in data that had to be reprocessed. (Not to mention egg on our corporate face if a client were to discover the gaffe.)

And now to lighten up.... No, this cannot be the _REAL_ motivation for the little-endian data format, because this is INTEGER data (-::-) snicker ... snicker ...

Carrington Dixon UUCP: { convex, killer }!mic!d25001
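To put numbers on the 70000 example: 70000 is 0x00011170, so a 16-bit formal sees either the low half or the high half of the longword, depending on byte order. The C sketch below assumes 8-bit bytes and a 2-byte word read at the address of the 4-byte argument; the name word_seen is invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Value a 16-bit formal parameter sees when the caller stored a 32-bit
   actual at the same address. big_endian selects the byte order. */
uint16_t word_seen(uint32_t actual, int big_endian)
{
    unsigned char mem[4];

    assert(big_endian == 0 || big_endian == 1);

    /* Caller stores the longword. */
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        mem[i] = (unsigned char)(actual >> shift);
    }

    /* Callee reads two bytes starting at the argument's address. */
    if (big_endian)
        return (uint16_t)((mem[0] << 8) | mem[1]);
    return (uint16_t)(mem[0] | (mem[1] << 8));
}
```

Here word_seen(70000, 0) is 4464 (0x1170, the low half) and word_seen(70000, 1) is 1 (the high half). For the original call with 1, the little-endian read gives 1 and the big-endian read gives 0, which is why the mismatch is more likely to show up in early testing on the big-endian machine.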
bcw@rti.UUCP (Bruce Wright) (01/08/89)
In article <205@mic.UUCP>, d25001@mic.UUCP (Carrington Dixon) writes: > In article <2695@rti.UUCP> bcw@rti.UUCP (Bruce Wright) writes: > > >My immediate thought on seeing the VAX instruction set when it first came > >out was that by making the byte order "little endian" it allowed something > >like a Fortran compiler to take a statement like: > > > call sub (1) > > >and pass a number to it (a longword - 4 bytes) which would be interpreted > >correctly whether the receiving formal parameter was a byte, a word (2 bytes), > >or a longword (4 bytes). [...] Could this be the REAL > ^^^^^^^^^^^^^^^^^^^^^^ > >motivation for "little endian"? > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > If I thought that it was, my respect for DEC would take a BIG drop. > Passing a longword and receiving a word is an ERROR. Sure, it works fine > for your example, but what if the statement were: [...] Well, you can assert that the FORTRAN language itself is an error; this is essentially what you are saying. The point is that there is NO WAY (repeat: NO WAY) in the Fortran 77 standard to declare a formal argument list. That means that there is NO WAY to declare to the compiler that it is to pass an INTEGER*2 value (as opposed to, say, an INTEGER*4 value) as a parameter to the subroutine. In other words, the INTENT OF THE PROGRAMMER was all along to pass a word rather than a longword, but the DEFINITION OF THE LANGUAGE does not allow this to be explicitly declared. Now I am not going to defend FORTRAN as a "safe" language, or an "elegant" language, or even a "good" language. It is however a very commercially significant language - which is not at all the same thing (eg, COBOL). FORTRAN is still in considerable use (even for new development in some environments). > the error _may_ not be found until the module has been in production, > spewing out wrong answers for months. 
I don't know about what kind of
> environment you work in, but where I work this kind of error could cost my
> company $k in data that had to be reprocessed. (Not too mention egg on
> our corporate face if a client were to discover the gaffe.)

The classic FORTRAN error had nothing whatsoever to do with this kind of error, but with the terseness that FORTRAN uses for its syntax: a statement something like

    do 100 i=1,10

got permuted to something like

    do 100 i=1.10

The former statement starts a loop varying "I" from 1 to 10, and the latter assigns a value of 1.10 to a variable named "DO100I". Because FORTRAN has no explicit end-loop construct (the end of the loop is specified by the statement label "100"), the error went undetected ... until the satellite got dumped in the ocean and NASA had lots of egg on its face.

In other words, if you want to flame anything about safe computing, you should probably be flaming FORTRAN, not DEC or the VAX or me.

Bruce C. Wright
glennw@nsc.nsc.com (Glenn Weinberg) (01/10/89)
In article <2015@scolex> seanf@scolex.UUCP (Sean Fagan) writes: >In article <20264@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster) writes: >>I could be wrong, but I think a fully consistent little-endian machine >>(e.g. nsc 32xxx) does not have this disadvantage. > >You're wrong. On a NSC 32k, addresses are in the wrong order (actually, I >think it might just be displacements), because the upper 1 or 2 bits >determine the size of the address (and means that you can't use a >displacement of 2gigs unsigned, or 1 gig signed. everybody sigh in unison >8-)). Also, I'd bet that the FP format is backwards (wrt big vs. little >endian). He's less wrong than you are :-) The 32K is completely consistently little-endian (including floating-point), except for displacements, which are as you described: the upper two bits of the displacement determine whether it is one, two or four bytes long. Since displacements are part of the instruction stream rather than the data, all data representations in the 32K are consistently little-endian. Unless you write self-modifying code, the only time the reverse order of the displacements is annoying is when you're writing an assembler or disassembler. -- Glenn Weinberg Email: glennw@nsc.nsc.com National Semiconductor Corporation Phone: (408) 721-8102 (My opinions are strictly my own, but you can borrow them if you want.)
d25001@mic.UUCP (Carrington Dixon) (01/10/89)
In article <2702@rti.UUCP> bcw@rti.UUCP (Bruce Wright) writes: >In article <205@mic.UUCP>, d25001@mic.UUCP (Carrington Dixon) writes: >Well, you can assert that the FORTRAN language itself is an error; this is >essentially what you are saying. The point is that there is NO WAY (repeat: >NO WAY) in the Fortran 77 standard to declare a formal argument list. That We both agree that there is no provision in FORTRAN to catch mismatched arguments at compile time. We even seem to agree that this is a failing of that language. Thus there is a large category of errors that FORTRAN cannot find at compile time. I maintain that those who wish to create "correct" programs will want to test these modules in order to find as many errors as possible before dumping the mess on some hapless user. With this in mind, I maintain that some data formats lend themselves to finding such latent errors more readily than do others and that it would be pernicious of any vendor to choose its data formats with an eye to making such checkout as difficult as possible. DEC and little-endian integers was just the example at hand; I can think of other architectures that allow the equally unfortunate passing of double-reals and receiving single-reals with similar problems in runtime diagnoses. >In other words, if you want to flame anything about safe computing, you >should probably be flaming FORTRAN, not DEC or the VAX or me. > > Bruce C. Wright I thought my response was a little mild to qualify as a full-fledged usenet flame, but I suppose that opinions may differ. For the record, I do not think that DEC was guilty of choosing its data formats in some blind and misguided attempt to follow FORTRAN's lead into the dismal swamp. They chose the "little-endian" format for other reasons. I am sure that they were under no delusion that they had to perpetuate FORTRAN's shortcomings in their hardware. Incidentally, I think that the phrase that you were trying to use (twice) was "in error." 
I might be offended if I thought that you really meant that I was "an error." Carrington Dixon UUCP: { convex, killer }!mic!d25001
jesup@cbmvax.UUCP (Randell Jesup) (01/11/89)
In article <482@babbage.acc.virginia.edu> mac3n@babbage.acc.virginia.edu (Alex Colvin) writes: >> Actually, where most little-endian machines screw up is storing the >> bits in the byte in the wrong order. It is good to hear that somebody got it >Another notational screw-up is where to put address 0 when drawing memory. I >always put it at the top ("up there at the bottom of memory"). I think there are two main reasons for what appears to be more programmers liking big-endian (no flames, local observation) and hardware people liking little-endian: 1) Little-endian used to make it easier to support big integers on small-buswidth machines (minor issue, solved or irrelevant now in general). 2) Hardware people like to draw diagrams with 0 at bottom-right; software people, used to printers and screens that print top to bottom, left to right, like to put 0 at upper-left. It also makes dumping memory with strings easier to read. -- Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup
bbadger@x102c.uucp (Badger BA 64810) (01/13/89)
In article <5658@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
>2) Hardware people like to draw diagrams with 0 at bottom-right, software
>people, used to printers and screens that print top to bottom, left to right,
>like to put 0 at upper-left. It also makes dumping memory with strings easier
>to read.

DEC VAX DUMP prints out in a format that makes both integers and strings easy to read. Namely, it prints out each in their ``natural'' order: Integers in little-endian (right to left), and strings from left to right. Here's an example:

Virtual block number 1 (00000001), 512 (0200) bytes

 4E4D4C4B 4A494847 46454443 4241002F /.ABCDEFGHIJKLMN 000000
 69685420 5A595857 56555453 5251504F OPQRSTUVWXYZ Thi 000010
 74736574 20612079 6C6E6F20 73692073 s is only a test 000020
 00000000 00000000 00000000 FFFF0021 !............... 000030
 00000000 00000000 00000000 00000000 ................ 000040
 <----- numbers go this way    <---*---> strings go this way --->

People who expect the first word (000000) to appear first (at left) will be surprised by this, but it's perfectly consistent with the way we write our numbers and strings.

Bernard A. Badger Jr. 407/984-6385 |``Use the Source, Luke!''
Secure UNIX Products |It's not a bug; it's a feature!
Harris GISD, Melbourne, FL |Buddy, can you paradigm?
Internet: bbadger@cobra@trantor.harris-atd.com|Recursive: see Recursive.
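One row of a dump in this style can be produced with a short C routine. This is a sketch that mimics the layout shown above, not the actual VMS DUMP implementation; the name format_dump_row and the 64-byte minimum buffer are choices made for the illustration, and it assumes an ASCII host in the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format 16 bytes as one dump row: hex longwords with addresses rising
   from right to left, then the same bytes as text from left to right,
   then the offset of the row. out must hold at least 64 bytes. */
void format_dump_row(const unsigned char *p, unsigned long offset,
                     char *out, size_t outsz)
{
    size_t n = 0;

    assert(outsz >= 64);
    memset(out, 0, outsz);

    /* Hex part: highest-addressed byte first, a space after each longword. */
    for (int i = 15; i >= 0; i--) {
        n += (size_t)snprintf(out + n, outsz - n, "%02X", (unsigned)p[i]);
        if (i % 4 == 0)
            out[n++] = ' ';
    }

    /* Text part: lowest-addressed byte first, '.' for unprintable bytes. */
    for (int i = 0; i < 16; i++)
        out[n++] = isprint(p[i]) ? (char)p[i] : '.';

    /* Offset of the first byte in the row. */
    snprintf(out + n, outsz - n, " %06lX", offset);
}
```

Fed the sixteen bytes 2F 00 41 42 ... 4E at offset 0, the routine reproduces the first row of the example: the longword at offset 0 reads 4241002F in the rightmost hex column, and the text column starts with "/.AB".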
jesup@cbmvax.UUCP (Randell Jesup) (01/17/89)
In article <1433@trantor.harris-atd.com> bbadger@x102c.UUCP (Badger BA 64810) writes:
>In article <5658@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
>>2) Hardware people like to draw diagrams with 0 at bottom-right, software
>>people, used to printers and screens that print top to bottom, left to right,
>>like to put 0 at upper-left. It also makes dumping memory with strings easier
>>to read.
>DEC VAX DUMP prints out in a format that makes both integers and strings
>easy to read. Namely, it prints out each in their ``natural'' order:
>Integers in little-endian (right to left), and strings from left to right.
> 4E4D4C4B 4A494847 46454443 4241002F /.ABCDEFGHIJKLMN 000000
> 69685420 5A595857 56555453 5251504F OPQRSTUVWXYZ Thi 000010
> 74736574 20612079 6C6E6F20 73692073 s is only a test 000020
> <----- numbers go this way    <---*---> strings go this way --->
>
>People who expect the first word (000000) to appear first (at left) will be
>suprised by this, but it's perfectly consistent with the way we write
>our numbers and strings.

I don't know about you (or your hardware), but I tend to write from left to right, not right to left. :-) And I don't start writing in the middle of the page, and go both left and right from there. :-) Sure you can write this way, or even make things scroll up, but most terminals/whatever are easier to deal with in a sequential, left to right, top to bottom fashion. It's marginally more annoying to deal with in your way.

Also, I get a headache trying to find the word/byte/whatever I'm looking for in a listing like that, I have to reverse my thinking. :-)

Personally, that's a nice kludge to get around the fact that little-endian is "naturally" written right to left, bottom to top by most people. However, people don't read that way, certainly not text.

I think little-endian is a long-standing joke played by hardware engineers on software writers. :-)
--
Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup
ggs@ulysses.homer.nj.att.com (Griff Smith) (01/17/89)
In article <5703@cbmvax.UUCP>, jesup@cbmvax.UUCP (Randell Jesup) writes: > Personally, that's a nice kludge to get around the fact that little- > endian is "naturally" written right to left, bottom to top by most people. > However, people don't read that way, certainly not text. Where `people' are defined to be those who happen to be members of the Western cultures that read left to right. What does that make the others? > I think little-endian is a long-standing joke played by hardware > engineers of software writers. :-) Big-endian is a long-standing mistake imposed on us by merchants from the Middle Ages who missed the point. In transcribing the number system from the Arabic, they should have had the sense to reverse the digits to compensate for the strange Western custom of writing from left to right. ( :-), I suppose). > -- > Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup -- Griff Smith AT&T (Bell Laboratories), Murray Hill Phone: 1-201-582-7736 UUCP: {most AT&T sites}!ulysses!ggs Internet: ggs@ulysses.att.com
jesup@cbmvax.UUCP (Randell Jesup) (01/18/89)
In article <11113@ulysses.homer.nj.att.com> ggs@ulysses.homer.nj.att.com (Griff Smith) writes: >In article <5703@cbmvax.UUCP>, jesup@cbmvax.UUCP (Randell Jesup) writes: >> Personally, that's a nice kludge to get around the fact that little- >> endian is "naturally" written right to left, bottom to top by most people. >> However, people don't read that way, certainly not text. > >Where `people' are defined to be those who happen to be members of the >Western cultures that read left to right. What does that make the others? Yes, sorry, I forgot to qualify that as people in "Western" cultures. This is the smallest problem with existing systems/software for non-"Western" people (does your software support kanji? Arabic?) -- Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup
cik@l.cc.purdue.edu (Herman Rubin) (01/19/89)
In article <5721@cbmvax.UUCP>, jesup@cbmvax.UUCP (Randell Jesup) writes: > In article <11113@ulysses.homer.nj.att.com> ggs@ulysses.homer.nj.att.com (Griff Smith) writes: > >In article <5703@cbmvax.UUCP>, jesup@cbmvax.UUCP (Randell Jesup) writes: > >> Personally, that's a nice kludge to get around the fact that little- > >> endian is "naturally" written right to left, bottom to top by most people. > >> However, people don't read that way, certainly not text. > > > >Where `people' are defined to be those who happen to be members of the > >Western cultures that read left to right. What does that make the others? > > Yes, sorry, I forgot to qualify that as people is "Western" > cultures. This is the smallest problem with existing systems/software > for non-"Western" people (does your software support kanji? Arabic?) > > -- > Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup A look at ancient writing of numbers, both in symbols and spelled out, indicates that it is pretty much big-endian. Except for the units and tens digits, I know of no language in either the Semitic or the Indo-European group which does not express numbers with the most significant part first. For example, in Hebrew (and probably also in Arabic, they are sufficiently similar), one would say the equivalent of two hundred and thirty, NOT thirty and two hundred. It would be written right-to-left big-endian, just as the language is written. These languages then introduced (mostly) decimal representations, using different characters for multiples of different powers of 10. Again, they were written big-endian. Then the idea of using the same symbol in each place, with a zero to hold the place, originated in India. The Indian writing is left-to-right. After the Moslem invasion of India, they adopted the Indian decimal notation without change. That is why the Arabic expression appears as little-endian. 
There does not seem to be any support from "natural" languages for the little-endian approach. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
gpwrdcs@gp.govt.nz (Don Stokes, Govt Print, Wellington) (01/20/89)
In article <20264@ames.arc.nasa.gov>, lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:
> VAXes, of course, are not consistent little-endian
> or big-endian, but then, we are not supposed to have to read dumps
> anymore anyway, remember? :-)
>

VAXes are definitely little-endian as far as integers go ... and reading dumps is not a problem ... VMS DUMP puts the hex part of the dump in reverse order, so all the bytes are in the right order, and numeric values can be easily distinguished. It is just a matter of learning to read from right to left...

The important part about little endian vs big endian (which can cause problems) is overlaying of dissimilar data types. If I overlay a byte onto a word on a VAX (or any other little-endian processor), put in a word value < 256, and do a byte read from the same address, I will get a correct response. If I do the same thing with a big-endian processor, I will get zero, because the byte at the lowest address is then the most significant one.

Of course you don't usually overlay floating point numbers ... so the order of the bytes in a floating-point number is (usually) irrelevant ...

Don Stokes
Systems Programmer
Government Printing Office, Wellington, New Zealand.
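The overlay behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `byte_at_word_address` is hypothetical, and `int.to_bytes` stands in for storing a 16-bit word in memory under each byte order.

```python
def byte_at_word_address(value, byteorder):
    """Store a 16-bit word, then read one byte back from the word's own address."""
    # Memory image of the word; index 0 is the lowest address.
    image = value.to_bytes(2, byteorder)
    # A byte read at the same address as the word sees only index 0.
    return image[0]


print(byte_at_word_address(200, "little"))  # 200: the low-order byte sits at the lowest address
print(byte_at_word_address(200, "big"))     # 0: the high-order byte sits at the lowest address
```

For any word value below 256 the little-endian layout hands back the value itself, while the big-endian layout hands back the (zero) high-order byte.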
mac@mrk.ardent.com (Michael McNamara) (01/21/89)
In article <1102@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: |There does not seem to be any support from "natural" languages for the |little-endian approach. Four and twenty black birds, baked in a pie.... |-- |Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 |Phone: (317)494-6054 |hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP) Michael McNamara mac@ardent.com
jkl@csli.STANFORD.EDU (John Kallen) (01/21/89)
In article <1102@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>
>There does not seem to be any support from "natural" languages for the
>little-endian approach.

What about Danish: fem og halvfirsindtyve (75 (my Danish is rusty))
Or Norwegian: en og femti (51). This fooled me once into believing one could rent a room in Paris for Fr 1.50... :-)
Or better yet, German: Zwei und Vierzig (42!)

I believe Danish, Norwegian and German count as "natural" languages. At least in Denmark, Norway and German[y|ies] :-)

John.
_______________________________________________________________________________
John Kallen              "The light works. The gravity
PoBox 11215               works. Anything else we must
Stanford CA 94309         take our chances with."
jkl@csli.stanford.edu
bbadger@x102c.uucp (Badger BA 64810) (01/22/89)
In article <5703@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
>In article <1433@trantor.harris-atd.com> bbadger@x102c.UUCP (Badger BA 64810) writes:
>>In article <5658@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
>>>2) Hardware people like to draw diagrams with 0 at bottom-right, software
>>>people, used to printers and screens that print top to bottom, left to right,
>>>like to put 0 at upper-left. It also makes dumping memory with strings easier
>>>to read.
>
>>DEC VAX DUMP prints out in a format that makes both integers and strings
>>easy to read. Namely, it prints out each in their ``natural'' order:
>>Integers in little-endian (right to left), and strings from left to right.
>
>> 4E4D4C4B 4A494847 46454443 4241002F /.ABCDEFGHIJKLMN 000000
>> 69685420 5A595857 56555453 5251504F OPQRSTUVWXYZ Thi 000010
>> 74736574 20612079 6C6E6F20 73692073 s is only a test 000020
>> <----- numbers go this way <---*---> strings go this way --->
>>
>>People who expect the first word (000000) to appear first (at left) will be
>>surprised by this, but it's perfectly consistent with the way we write
>>our numbers and strings.
>
> I don't know about you (or your hardware), but I tend to write from
>left to right, not right to left. :-) And I don't start writing in the
>middle of the page, and go both left and right from there. :-)
>

Actually, my hardware (VT100 terminal) normally writes left-to-right, but this doesn't stop me from *reading* right-to-left (and LtR) once an entire line is on-screen.

> Sure you can write this way, or even make things scroll up, but
>most terminals/whatever are easier to deal with in a sequential, left to
>right, top to bottom fashion. It's marginally more annoying to deal with
>in your way. Also, I get a headache trying to find the word/byte/whatever
>I'm looking for in a listing like that, I have to reverse my thinking. :-)

(Left-to-right and Top-to-bottom are separate issues.)
>
> Personally, that's a nice kludge to get around the fact that little-endian
>is "naturally" written right to left, bottom to top by most people.
>However, people don't read that way, certainly not text.
>

Aaahh! That's just it. People reading VMS DUMP output looking for numbers *do* read from right-to-left (RtL) (once they get the hang of it :-). It's not really hard, and it makes sense of all lengths of integers from 1 byte to n.

The reasons for *choosing* big- or little-endian integer representations play more to hardware and software issues than adherence to historical human reading conventions.

The point I'm trying to make about DUMP output is that (Western) people expect to be able to *read* numeric output from left-to-right with the most-significant digits first. If you think the first (i.e., leftmost) byte printed should also have the lowest byte-address, you are really *specifying* big-endian order. By dropping this arbitrary restriction, VMS DUMP can print the bytes out in a contiguous block for that line.

Taking the first line of the dump as an example,

>> 4E4D4C4B 4A494847 46454443 4241002F /.ABCDEFGHIJKLMN 000000

note that the first two bytes of the file specify a single integer number, LSB order: 002F ==> byte(0) = 2F, byte(1) = 00. It's certainly easier to read written MSB (002F) than in storage order (2F00). If the next element of the file were ``really'' an INTEGER*4 variable (please excuse the use of FORTRAN in mixed company :-), you would catenate the "4443 4241" into 44434241. But if it turned out to be two INTEGER*2 values you would read "4241" first, then "4443".

This does result in your eyes moving RtL to increment addressing -- as when counting to a specific offset in a record structure -- and then scanning back LtR to read an integer. This is far easier to put up with than printing hexadecimal output with addresses increasing from left-to-right on a little-endian machine!
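The reading rule above can be sketched in Python. This is only an approximation of the layout shown in the quoted dump, not the actual VMS DUMP implementation; the function name `dump_line` and the exact spacing are assumptions made for illustration.

```python
def dump_line(chunk, offset):
    """Format up to 16 bytes roughly the way the quoted dump lines look."""
    # Hex field: each 4-byte group is byte-reversed and the groups are
    # listed highest address first, so little-endian integers read naturally.
    groups = [chunk[i:i + 4][::-1].hex().upper() for i in range(0, len(chunk), 4)]
    hex_field = " ".join(reversed(groups))
    # Text field: lowest address on the left, non-printable bytes shown as '.'.
    text_field = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return f"{hex_field} {text_field} {offset:06X}"


data = b"/\x00ABCDEFGHIJKLMN"
print(dump_line(data, 0))

# The same bytes read as little-endian integers of different lengths:
print(hex(int.from_bytes(data[0:2], "little")))  # 0x2f, printed in the dump as 002F
print(hex(int.from_bytes(data[2:6], "little")))  # 0x44434241, one INTEGER*4
print(hex(int.from_bytes(data[2:4], "little")))  # 0x4241, first of two INTEGER*2
print(hex(int.from_bytes(data[4:6], "little")))  # 0x4443, second of two INTEGER*2
```

Whatever the integer length, the digits appear in the hex field in the order one would write them, which is the property being argued for.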
As far as consistency goes, I always liked the fact that on little-endian architectures, the bit numbering (0..31) makes bit $ k $ represent $ 2^k $ no matter what the word size is. Whereas on big-endian 32-bit words bit $ k $ equals $ 2 ^ {31 - k} $ and on 16-bit (half) words, the value is $ 2 ^ {15 - k} $. That is:

LSB (little-endian):

      3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
      1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
2^7 = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0

So 2^7 sets bit number 7.

MSB (big-endian):

       0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2^7 = [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]

So 2^7 sets bit number 24.

On a 16-bit (half) word:

      0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
2^7 = 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0

So 2^7 sets bit number 8.

Normally we can sweep these distinctions under a rug of abstraction. It's only when we start to examine machine code or numeric representations that we operate on that low a level.

> I think little-endian is a long-standing joke played by hardware
>engineers on software writers. :-)

Right. So if we just play along with the joke in DUMP output, we won't have to tangle up our bits too badly. Of course, then there's communications software where some data is MSB and some is LSB, depending whether you're using the host format or the network format. In that case, no matter which way we print our dump lines, some data will be written with the LSB on the left.

P.S. You mentioned the bottom/top issue: whether to print the low addresses at the top (normal first-things-first order) or at the bottom (like most hardware address space diagrams, or STACK dumps). Again the most convenient order depends on the use that is made of the data, what its internal format *is*. Both forms of output are useful. The VAX DUMP doesn't have a "FFFFFFFF at top" option.
Too bad.

Bernard A. Badger Jr. 407/984-6385      | ``Use the Source, Luke!''
Secure UNIX Products                    | That's not a bug! It's a feature!
Harris GISD, Melbourne, FL 32902        | Buddy, can you paradigm?
Internet: bbadger@x102c.harris-atd.com  | 's/./&&/' Tom sed [sic] expansively.
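The bit-numbering argument in the post above comes down to one line of arithmetic. A minimal sketch, assuming a hypothetical helper `bit_weight` (not taken from any of the machines discussed): with LSB-first numbering the weight of bit k is 2^k regardless of width, while with MSB-first numbering it is 2^(width - 1 - k).

```python
def bit_weight(bit, width, msb_first):
    """Value represented by bit number `bit` in a word of `width` bits."""
    # MSB-first numbering counts from the most significant end, so the
    # exponent depends on the word width; LSB-first numbering does not.
    exponent = width - 1 - bit if msb_first else bit
    return 2 ** exponent


# LSB-first: bit 7 is worth 2^7 in both word sizes.
print(bit_weight(7, 32, msb_first=False), bit_weight(7, 16, msb_first=False))
# MSB-first: 2^7 is bit 24 of a 32-bit word but bit 8 of a 16-bit word.
print(bit_weight(24, 32, msb_first=True), bit_weight(8, 16, msb_first=True))
```

All four calls return 128, which matches the three diagrams: the bit number for 2^7 stays at 7 under LSB-first numbering and moves between 24 and 8 under MSB-first numbering.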
john@frog.UUCP (John Woods) (01/24/89)
In article <7193@csli.STANFORD.EDU>, jkl@csli.STANFORD.EDU (John Kallen) writes: > In article <1102@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > >There does not seem to be any support from "natural" languages for the > >little-endian approach. > What about Danish: fem og halvfirsindtyve (75 (my Danish is rusty)) > Or norwegian: en og femti (51). This fooled me once into believing > one could rent a room in Paris for Fr 1.50... :-) > Or better yet, German: Zwei und Vierzig (42!) Ah, but consider the German for 1988: neunzehn hundert acht und achtzig (nine-and-ten hundred eight and eighty). Middle-endian. AHA! Germans are PDP-11s! :-) -- John Woods, Charles River Data Systems, Framingham MA, (508) 626-1101 ...!decvax!frog!john, john@frog.UUCP, ...!mit-eddie!jfw, jfw@eddie.mit.edu Presumably this means that it is vital to get the wrong answers quickly. Kernighan and Plauger, The Elements of Programming Style
nol2105@dsacg2.UUCP (Robert E. Zabloudil) (01/25/89)
In article <1916@ardent.UUCP>, mac@mrk.ardent.com (Michael McNamara) writes:
> In article <1102@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> |There does not seem to be any support from "natural" languages for the
> |little-endian approach.
> Four and twenty black birds, baked in a pie....

In German: 24 == vierundzwanzig
In Dutch it's expressed similarly.

Also compare English thirteen, fourteen, ... nineteen.
cik@l.cc.purdue.edu (Herman Rubin) (01/25/89)
In article <250@dsacg2.UUCP>, nol2105@dsacg2.UUCP (Robert E. Zabloudil) writes:
> In article <1916@ardent.UUCP>, mac@mrk.ardent.com (Michael McNamara) writes:
> > In article <1102@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> > |There does not seem to be any support from "natural" languages for the
> > |little-endian approach.
> > Four and twenty black birds, baked in a pie....
>
> In German: 24 == vierundzwnzig
> In Dutch it's expressed similarly
>
> Also compare English thirteen, fourteen, ... nineteen.

If you read my posting, I did state that there was reversal of the units and tens digits in many languages. This occurs regularly in the Germanic languages, as many have posted. In Spanish, it only occurs from 11-15, and in French, from 11-16. A correction to my statement about Hebrew; it also applies there to hundreds, but either order can occur, and in fact both orders occur in the same passage.

However, my statement still holds. To give a counterexample, it would be necessary to come up with examples where such numbers as 46,378 have the 378 before the 46,000. I know of no such examples. The clear resolution of this problem occurs in these cases of multi-"byte" expressions.

The early symbolic representation of numbers by alphabetic characters or other symbols is, in every case to my knowledge, in the same order as the written letters. Even the Roman numerals do this, in that if a less significant symbol appears before a more significant one, it is treated anomalously. But the Roman numerals were not used for calculating. The early numerical representations used letters, but because of no 0 symbol, different letters were used in different places, or other devices were used. I know of no ancient little-endian devices. In Hebrew, 378 would always be 300 first, then 70, then 8, in the right-to-left direction of the writing, even though both word orders occur, and the other order would be unambiguous.
The apparent little-endianness of Arabic is due to the direct importation of the left-to-right symbolic numerical writing from India. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
cramer@optilink.UUCP (Clayton Cramer) (01/27/89)
In article <1371@X.UUCP>, john@frog.UUCP (John Woods) writes:
> In article <7193@csli.STANFORD.EDU>, jkl@csli.STANFORD.EDU (John Kallen) writes:
> > In article <1102@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> > >There does not seem to be any support from "natural" languages for the
> > >little-endian approach.
> > What about Danish: fem og halvfirsindtyve (75 (my Danish is rusty))
> > Or norwegian: en og femti (51). This fooled me once into believing
> > one could rent a room in Paris for Fr 1.50... :-)
> > Or better yet, German: Zwei und Vierzig (42!)
>
> Ah, but consider the German for 1988: neunzehn hundert acht und achtzig
> (nine-and-ten hundred eight and eighty). Middle-endian. AHA! Germans
> are PDP-11s!
> --
> John Woods, Charles River Data Systems, Framingham MA, (508) 626-1101

I used to work for a German company, and you haven't seen confusion until you've seen a bunch of German engineers trying to say "68000" in English, and it keeps coming out "86000", for exactly that reason. It's an understandable mistake, and we rather got used to it after a while.

--
Clayton E. Cramer {pyramid,pixar,tekbspa}!optilink!cramer
Disclaimer? You must be kidding! No company would hold opinions like mine!