mandrews@alias.com (Mark Andrews) (04/25/91)
I have a question concerning the byte and bit order of fields within packet
headers. Many of the RFCs (including RFC 1060) state rules about the byte
(octet) order:

    Data Notations

    The convention in the documentation of Internet Protocols is to
    express numbers in decimal and to picture data in "big-endian"
    order [21]. That is, fields are described left to right, with the
    most significant octet on the left and the least significant octet
    on the right.

    The order of transmission of the header and data described in this
    document is resolved to the octet level. Whenever a diagram shows a
    group of octets, the order of transmission of those octets is the
    normal order in which they are read in English. For example, in the
    following diagram the octets are transmitted in the order they are
    numbered.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       1       |       2       |       3       |       4       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       5       |       6       |       7       |       8       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       9       |      10       |      11       |      12       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Transmission Order of Bytes

So, all multi-octet fields are transmitted in big-endian byte order. This is
a small problem for little-endian machines (most significant byte is on the
right). They must correct their byte order before the packet is transmitted.
In BSD systems, this job is performed by the htonl() and htons() functions
(host to network long, host to network short), but what about the bit order?
It is still little-endian (bits are numbered right to left instead of left to
right)! What order are the bits transmitted in?
This is further complicated by the following code fragment from
/usr/include/netinet/ip.h:

    struct ip {
    #if BYTE_ORDER == LITTLE_ENDIAN
            u_char  ip_hl:4,        /* header length */
                    ip_v:4;         /* version */
    #endif
    #if BYTE_ORDER == BIG_ENDIAN
            u_char  ip_v:4,         /* version */
                    ip_hl:4;        /* header length */
    #endif
    <etc.>

When all this is translated, there are two views of the first byte of the ip
structure; one little endian:

     7             0
    +------+-------+
    | ip_v | ip_hl |
    +------+-------+

and one big endian:

     0             7
    +------+-------+
    | ip_v | ip_hl |
    +------+-------+

Now according to the 4.3BSD C Reference Manual by B. Kernighan, the addresses
of a structure increase as the declarations are read left to right
(irrespective of the bit or byte order), so in terms of a C addressing model,
the first byte of the ip structure is:

     0             7
    +------+-------+-----
    | ip_v | ip_hl | other elements of ip structure
    +------+-------+-----

Perhaps the bits are transmitted MSB to LSB based on the C model. I don't
know. Any clarifications on my confusion or any other help would be
appreciated.

Thanks,

Mark
------------------------------------------------------------------------------
Mark Andrews
Systems Programmer, Alias Research, Toronto, Canada
Phone: (416)-362-9181
Mail box: mark@alias.com
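As a concrete illustration of the byte-order half of the question, here is a minimal sketch in C. It assumes a POSIX system that provides htonl() and ntohl() in <arpa/inet.h>; the helper names put_net32 and get_net32 are made up for this example and do not come from BSD or from any RFC.

```c
#include <arpa/inet.h>  /* htonl(), ntohl() (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit host value into buf in network
 * (big-endian) byte order, most significant octet first. */
void put_net32(uint8_t *buf, uint32_t host_value)
{
    uint32_t net_value = htonl(host_value);

    assert(buf != NULL);
    memcpy(buf, &net_value, sizeof net_value);
}

/* Hypothetical helper: read a 32-bit network-order value from buf and
 * return it in host byte order. */
uint32_t get_net32(const uint8_t *buf)
{
    uint32_t net_value;

    assert(buf != NULL);
    memcpy(&net_value, buf, sizeof net_value);
    return ntohl(net_value);
}
```

On both big-endian and little-endian hosts, put_net32(buf, 0x01020304) should leave the octets 01 02 03 04 in buf, in that order. Only the order of whole octets is touched here; the bits inside each octet are not rearranged.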
greene@coral.com (Jeremy Greene) (04/26/91)
) This is a small problem for little endian machines (most significant byte is
) on the right). They must correct their byte order before the packet is
) transmitted. In BSD systems, this job is performed by the htonl() and htons()
) functions (host to network long, host to network short), but what about the
) bit order? It is still little-endian (bits are numbered right to left instead
) of left to right)! What order are the bits transmitted in?
)

Byte order is a host-to-host issue. One host has a different view of byte
order than the other. Fortunately, there is an agreed-upon inter-host
(network) format, which is big-endian.

Bit order is not a host problem; it is only network related, and more
specifically, MAC layer related. In other words, you always have the same
hardware at both ends of the connection, and from the host perspective the
bit order is treated the same. If you send 0x01 on FDDI you will receive 0x01
on some other FDDI interface. Same for Ethernet. Unlike the byte order issue,
you never send from FDDI to Ethernet, which would make a big mess.

The fact that bits are sent in a different order does not present a problem
in getting data from one host to another. The problem is that the actual
network hardware interprets the MAC address. Given that:

 - the address on both rings and Ethernets is the same on the wire: group
   bit first, and
 - rings transmit the leftmost bit of a byte first, while Ethernet transmits
   the rightmost bit first,

the same address has to be placed in memory in a different bit order
depending on the media type. So, if the address starts 0x01 in Ethernet land
(which is a group address) then it must start 0x80 for an FDDI interface.
From the network perspective it's the same address.

In other words, similar to the byte order problem, you want to have the
macros 'ntomac' and 'macton'. To do this, there has to be a canonical format,
which IEEE has (recently) stated is the Ethernet format.
The bottom line is that you only have to worry about bit order if you're
working at the MAC layer.

Jeremy
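The 0x01 versus 0x80 example above amounts to reversing the bits within each octet of the address. Here is a minimal sketch of what a 'macton'/'ntomac' style conversion might do. The names reverse_bits8, mac_swap_bit_order, and MAC_ADDR_LEN are hypothetical, chosen for this illustration only, and the sketch assumes a 6-octet (48-bit) MAC address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_ADDR_LEN 6  /* assumed 48-bit MAC address */

/* Reverse the order of the eight bits in one octet, so that the
 * low-order bit becomes the high-order bit (0x01 <-> 0x80). */
uint8_t reverse_bits8(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4)); /* swap nibbles */
    b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2)); /* swap bit pairs */
    b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1)); /* swap neighbours */
    return b;
}

/* Bit-reverse every octet of a MAC address in place. The order of the
 * octets is left alone. Applying the function twice restores the
 * original address, so one routine serves both directions. */
void mac_swap_bit_order(uint8_t addr[MAC_ADDR_LEN])
{
    size_t i;

    assert(addr != NULL);
    for (i = 0; i < MAC_ADDR_LEN; i++)
        addr[i] = reverse_bits8(addr[i]);
}
```

Because the operation is its own inverse, a single function can play the role of both macros; which in-memory form counts as "canonical" is a convention outside the code.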
erick@sunee.waterloo.edu (Erick Engelke) (04/26/91)
In article <9104241753.AA21589@dino.alias.com> mandrews@alias.com (Mark Andrews) writes:
>
>I have a question concerning the byte and bit order of fields
>within packet headers. Many of the RFCs (including RFC 1060) state rules
>about the byte (octet) order:
>

Much confusion stems from the fact that Intel processors store
bits in the following order

    +--+--+--+--+--+--+--+---+---+--+--+--+--+--+--+---+
    |07 06 05 04 03 02 01 00 | 15 14 13 12 11 10 09 08 |
    +--+--+--+--+--+--+--+---+---+--+--+--+--+--+--+---+
    |   first stored byte    |   second stored byte    |  etc.

whereas network order is

    +--+--+--+--+--+--+--+---+---+--+--+--+--+--+--+---+
    |15 14 13 12 11 10 09 08 | 07 06 05 04 03 02 01 00 |
    +--+--+--+--+--+--+--+---+---+--+--+--+--+--+--+---+

so the bits and nybbles are already in network order; you simply need to
reorder quantities larger than a byte, namely 16- and 32-bit values.

The Intel code should have

    unsigned ip_hl : 4;
    unsigned ip_v  : 4;

I hope this clears it up a bit.

Erick
--
----------------------------------------------------------------------------
Erick Engelke                            Watstar Computer Network
Watstar Network Guy                      University of Waterloo
Erick@Development.Watstar.UWaterloo.ca   (519) 885-1211 Ext. 2965
mark@alias.com (Mark Andrews) (04/26/91)
From NIC.DDN.MIL!tcp-ip-RELAY@utcsri Fri Apr 26 03:20:29 1991
Date: 26 Apr 91 05:15:36 GMT
From: usc!rpi!news-server.csri.toronto.edu!utgpu!watserv1!sunee!erick@apple.com (Erick Engelke)
Organization: University of Waterloo
Subject: Re: Byte and bit order within packet headers
References: <9104241753.AA21589@dino.alias.com>
Sender: tcp-ip-relay@nic.ddn.mil
To: tcp-ip@nic.ddn.mil

Erick Engelke (usc!rpi!news-server.csri.toronto.edu!utgpu!watserv1!sunee!erick@apple.com)
responds to my question:

>In article <9104241753.AA21589@dino.alias.com> mandrews@alias.com (Mark Andrews) writes:
>>
>>I have a question concerning the byte and bit order of fields
>>within packet headers. Many of the RFCs (including RFC 1060) state rules
>>about the byte (octet) order:
>>
>
>Much confusion stems from the fact that Intel processors store
>bits in the following order
>
>    +--+--+--+--+--+--+--+---+---+--+--+--+--+--+--+---+
>    |07 06 05 04 03 02 01 00 | 15 14 13 12 11 10 09 08 |
>    +--+--+--+--+--+--+--+---+---+--+--+--+--+--+--+---+
>    |   first stored byte    |   second stored byte    |  etc.
>
>whereas network order is
>
>    +--+--+--+--+--+--+--+---+---+--+--+--+--+--+--+---+
>    |15 14 13 12 11 10 09 08 | 07 06 05 04 03 02 01 00 |
>    +--+--+--+--+--+--+--+---+---+--+--+--+--+--+--+---+
>
>so the bits and nybbles are already in network order, you simply need to
>organize quantities larger than a byte, namely 16 and 32 bit values.
>
>The intel code should have
>    unsigned ip_h : 4;
>    unsigned ip_v : 4;
>
>I hope this clears it up a bit.
>
>Erick

Fine, this is a good example. In my specific case, I was looking at how the
BSD code interprets the version number and header length of an IP packet
header:

    struct ip {
    #if BYTE_ORDER == LITTLE_ENDIAN
            u_char  ip_hl:4,        /* header length */
                    ip_v:4;         /* version */
    #endif
    #if BYTE_ORDER == BIG_ENDIAN
            u_char  ip_v:4,         /* version */
                    ip_hl:4;        /* header length */
    #endif

Unfortunately, the bit order is machine and compiler dependent.
On little endian machines, the bit fields are assigned least significant bit
first (right to left), resulting in:

    MSB                   LSB
    +--+--+--+--+--+--+--+--+
    |   ip_v    |   ip_hl   |
    +--+--+--+--+--+--+--+--+
     07 06 05 04 03 02 01 00

On big endian machines, the bit fields are also assigned least significant
bit first, but this time the bit fields are assigned left to right:

    LSB                   MSB
    +--+--+--+--+--+--+--+--+
    |   ip_v    |   ip_hl   |
    +--+--+--+--+--+--+--+--+
     00 01 02 03 04 05 06 07

In which order are the bits transmitted such that the integrity of the data
is not compromised, independent of the endian order? For example, if a big
endian machine is talking to a little endian machine, in what order are the
bits transmitted so that the ip_v and ip_hl fields from the big endian
machine are interpreted properly on the little endian machine? In the case of
the ip_v field, bits 0-3 of the big endian byte must be transmitted to bits
4-7 of the little endian byte!

Thanks for any information,

Mark
henry@zoo.toronto.edu (Henry Spencer) (04/27/91)
In article <9104241753.AA21589@dino.alias.com> mandrews@alias.com (Mark Andrews) writes:
>... what about the
>bit order? It is still little-endian (bits are numbered right to left instead
>of left to right)! What order are the bits transmitted in?

Fortunately, this is not an issue, because the data is fed to the hardware
as bytes (usually) and consequently it is the hardware's business to get
the bit order right on both ends. The usual practice is to send lsb first,
but this is completely invisible to the software.

There is sometimes confusion about how the bits are *numbered*, but that is
a separate issue. The high-order bit is always the high-order bit and is
always in the same place, regardless of whether the manual calls it bit 7
or bit 0.

>This is further complicated by the following code fragment from
>/usr/include/netinet/ip.h:

Here we have a different issue. The reason why ip.h is #if'd is that C
does not define the order of bitfields within a word, and it is both
machine-specific and compiler-specific. Using bitfields for this was
dumb, actually; convincing them to match an externally-defined storage
layout can be tricky.

>Now according to the 4.3BSD C Reference Manual by B. Kernighan, the addresses
>of a structure increase as the declarations are read left to right...

Bitfields do not have addresses and that rule does not apply to them.
--
And the bean-counter replied,           | Henry Spencer @ U of Toronto Zoology
"beans are more important".             |  henry@zoo.toronto.edu  utzoo!henry
zweig@cs.uiuc.edu (Johnny Zweig) (04/27/91)
My solution is never to use bit-fields when decoding packets. Just mask
things with 0x0F and 0xF0 and let the compiler optimize it. The problem is
not a network thing but a C language thing. Quoting from K&R (2nd ed.):

    Almost everything about [bit] fields is implementation-dependent. ...
    Fields are assigned left to right on some machines and right to left
    on others.

The term machine here is misleading -- it is actually the implementation of
C on a particular machine that decides what to do with bit fields. One could
imagine two different compilers on the same architecture that did it
differently. So just get a compiler with inline expansion and a decent
optimizer and define functions to access fields inside of headers.

And this htonl()/ntohs() business is a poor solution to the problem. It is
too easy to forget what byte order a particular int is currently in. In my
TCP/IP implementation (in C++) I have a class that hides all that junk. I
just assign values into number-holders according to what byte order they are
in, and retrieve the values in the appropriate order for manipulation. This
makes errors such as calling htons() twice never happen....

-ynnhoJ redro-etyB
kre@cs.mu.oz.au (Robert Elz) (04/29/91)
mark@alias.com (Mark Andrews) writes:

>Unfortunately, the bit order is machine and compiler dependent.

This is true, in a sense. But this ...

>On little endian machines, the bit fields are assigned least significant bit
>first (right to left),

>On big endian machines, the bit fields are also assigned least significant
>bit first, but this time the bit fields are assigned left to right:

is simply wrong, or perhaps inaccurate. Except on those processors that
have "extract/insert bitfield" instructions, the order in which bitfields
are placed in a byte (or whatever) is purely compiler dependent - on a
little endian host you could assign bit fields either way, on a big endian
host you could assign bit fields either way. Even on a host with bitfield
instructions, the compiler could do it either way (the bit numbers of the
fields will be constant; whether the compiler emits instructions to extract
bits 0-3 or 4-7 when fetching the ip_v field is pretty much irrelevant to
anything - except the expectations of the author of the code).

The ifdef's in the BSD source are just a latent bug waiting to bite someone
who isn't very careful porting the code to a new compiler. (It just happens
to work out right on the compilers the code is normally compiled with).

>In which order are the bits transmitted such that the integrity of the
>data is not compromised, independent of the endian order.

It depends entirely on the medium over which the data is being sent, and
only on that - on serial (point to point) wires, sync or async, and on
ethernet (ISO 8802/3) the least significant bit is sent first, on rings the
most significant bit is sent first, and if you happen to have an 8 bit
parallel bus, then all the bits are sent simultaneously. But as long as all
the hardware understands this, the most significant bit of a received byte
will be the most significant bit of the transmitted byte, so once the
hardware is designed & built correctly you never need to worry about this.
On the other hand, you do need to worry about the order of the bytes wrt
their interpretation as multi-byte objects (int's etc), and the way that
the compiler lays out storage in structs, including bits for bit fields.

Anyone attempting to use struct definitions to represent network packet
formats must have intimate knowledge of the way the compiler works - and
should the compiler decide to change from one version to another, and the
network code breaks because of that, it's a bug in the net code - not the
compiler.

kre
romkey@ASYLUM.SF.CA.US (John Romkey) (04/29/91)
It's true. In the portable UDP stack which Epilogue sells, I originally had
defined the header length and IP version numbers as bitfields but finally
took them out because we found that bitfield ordering was really compiler
dependent instead of processor dependent.

I replaced our bitfields with appropriate masking operations to avoid the
problem.

		- john romkey			Epilogue Technology
USENET/UUCP/Internet: romkey@asylum.sf.ca.us	voice/fax: 415 594-1141
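For the sending side, replacing the two bitfields with shift-and-mask operations could be sketched as below. This is not Epilogue's code; ip_make_vhl is a hypothetical name, and the layout assumed is the usual one for the first IP header octet (version in the high-order four bits, header length in 32-bit words in the low-order four).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack the IP version and header length (in 32-bit
 * words) into the first header octet without using bit fields. Both
 * inputs must fit in four bits. */
uint8_t ip_make_vhl(unsigned version, unsigned header_words)
{
    assert(version <= 0x0Fu && header_words <= 0x0Fu);
    return (uint8_t)((version << 4) | (header_words & 0x0Fu));
}
```

For IPv4 with a 20-octet header, ip_make_vhl(4, 5) yields the octet 0x45 regardless of compiler or processor.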
nreadwin@micrognosis.co.uk (Neil Readwin) (04/30/91)
In article <9104252242.AA28673@taipan.coral.com>, greene@coral.com (Jeremy Greene) writes:
|> The bottom line is that you only have to worry about bit order if
|> you're working at the MAC layer.

Or reading IEEE documents that specify everything in a counter-intuitive
bit ordering :-\

 Phone: +44 71 528 8282  E-mail: nreadwin@micrognosis.co.uk
 Quote: Everything is a cause for sorrow that my mind or body has made
lance@motcsd.csd.mot.com (lance.norskog) (04/30/91)
mark@alias.com (Mark Andrews) writes:

>I have a question concerning the byte and bit order of fields
>within packet headers. Many of the RFCs (including RFC 1060) state rules
>about the byte (octet) order:
>
 [ use of C bit fields elided ]

The C programming language is missing a lot of useful features;
paradoxically, bit fields should never have been added. In particular,
they should not be used for expressing the movement of binary data
between machines.

BSD TCP/IP shouldn't have used bit fields to begin with, and should be
rewritten to get rid of them. Sorry, I can't volunteer. You should quit
trying to use them, and rewrite your code to remove them.

Lance Norskog