[comp.lang.c] byte order and bit order

rick@tmiuv0.uucp (06/27/90)

In article <1990Jun20.182745.3730@csrd.uiuc.edu>, pommerel@sp30.csrd.uiuc.edu (Claude Pommerell) writes:
> In article <1990Jun19.221359.7399@ecn.purdue.edu>,
> luj@delta.ecn.purdue.edu (Jun Lu) writes:
> |>Also, what's the byte order for a binary UNIX disk file?  In other words,
> |>if a 4-byte integer (for example) is written to a disk file by using
> |>write/fwrite, what's the byte order of the integer as stored in the disk
> |>file?  Is byte-swapping necessary if the integer is going to be retrieved
> |>by using read/fread?
> 
> There is no standard byte ordering for binary UNIX files. You have to do
> byte-swapping when data written on a big-endian machine is read on a
> little-endian one, or vice versa.
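
A minimal sketch of one way to avoid depending on the host's byte order at
all: build the file bytes with shifts, so the layout on disk is fixed by the
code rather than by the machine.  The names put_u32_be and get_u32_be are
hypothetical, and the sketch assumes a compiler that provides <stdint.h>;
big-endian is chosen here only for illustration.

```c
#include <stdint.h>

/* Store v into buf[0..3], most significant byte first,
 * whatever the byte order of the host happens to be. */
void put_u32_be(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xFF);
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* Rebuild the value from four bytes stored most significant first. */
uint32_t get_u32_be(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

The four-byte buffer would then be handed to write/fwrite, and the bytes
obtained from read/fread passed to get_u32_be, instead of reading or writing
the integer's memory image directly.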

In fact, it can get even more complicated.  Now, this goes back a few years,
but the old Whitesmiths C on VAXen and PDP-11s (RSX/RT-11) had to store
multi-byte variables in several formats.  Both systems are little-endian
and store two-byte ints in ascending order (low byte to high byte).  On the
VAX, longs (4-byte integers) were stored in ascending order (low byte to
high byte), while on the PDP-11s, longs were stored as two integers (each
low byte to high byte), but the integers themselves were stored with the
_most_ significant int stored first.  Assuming a 4-byte integer with bytes
numbered 0 (least significant) to 3 (most significant), a VAX stored them:

    Address:   0    1    2    3
    Data byte: 0    1    2    3

while the PDP stored them:

    Address:   0    1    2    3
    Data byte: 2    3    0    1

In much of my code (which had to have data portability between the systems),
I ended up defining a preprocessor symbol, "WSWAP" (word swapped), which was
TRUE for PDP-11s, but FALSE for VAXen.  I suspect that this strange behaviour
was caused by the fact that PDP-11s don't have a natural "long" (4-byte)
variable type, and one had to be manufactured.  The VAX, of course, has
4-byte variables, since that's the natural size of the machine.
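
The two layouts above can be sketched as byte-level decoders.  This is only
an illustration, not the original Whitesmiths code or my WSWAP macros: the
function names are hypothetical, and <stdint.h> is assumed to be available.
Data byte 0 is taken to be the least significant byte, as in the tables.

```c
#include <stdint.h>

/* VAX layout: data bytes 0,1,2,3 at addresses 0,1,2,3. */
uint32_t get_u32_vax(const unsigned char *buf)
{
    return  (uint32_t)buf[0]        |
           ((uint32_t)buf[1] << 8)  |
           ((uint32_t)buf[2] << 16) |
           ((uint32_t)buf[3] << 24);
}

/* PDP-11 layout: data bytes 2,3,0,1 at addresses 0,1,2,3
 * (high word first, each word low byte first). */
uint32_t get_u32_pdp(const unsigned char *buf)
{
    return  (uint32_t)buf[2]        |
           ((uint32_t)buf[3] << 8)  |
           ((uint32_t)buf[0] << 16) |
           ((uint32_t)buf[1] << 24);
}

/* Exchange the two 16-bit halves of a 32-bit value; applying this to a
 * long read in the "wrong" layout converts between the two formats. */
uint32_t word_swap(uint32_t v)
{
    return (v << 16) | (v >> 16);
}
```

Because the decoders work on individual bytes, they give the same answer on
any host; word_swap is the cheaper fix when the value has already been read
as a native long on one of these two machines.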

However, if you used their buffered file I/O routines (putf, getf), disk
files were compatible.  That entailed linking in lots of library code, so I
ended up using the Unixish I/O (read, write) and dealing with this problem
myself.
-- 
  .-------------------------------------------------------------------------.
 / [- O] Rick Stevens (All opinions are mine. Everyone ignores them anyway.) \
|    ?   +--------------------------------------------------------------------|
|    V   | uunet!zardoz!tmiuv0!rick             (<-- Work (ugh!))             |
|--------+ uunet!zardoz!xyclone!sysop           (<-- Home Unix (better!))     |
|  uunet!perigrine!ccicpg!conexch!amoeba2!rps2  (<-- Home Amiga (Best!!)      |
 \ 75006.1355@compuserve.com (CIS: 75006,1355)  (<-- CI$)                    /
  `-------------------------------------------------------------------------'
"I am firm.  You are obstinate.  He is pigheaded!"
                                         - James P. Hogan

gaynor@busboys.rutgers.edu (Silver) (07/02/90)

This is where the Internet's `network ordering' conventions really come in
handy.  Non-local i/o is assumed to be with a device that formats this
low-level stuff in the `standard' network format.  Data read from such a source
must be converted to conform to local conventions with a function (or macro, as
the case may be); data to be written must be converted to network format the
same way.
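
As a rough sketch of that convention, assuming a system that supplies
htonl() and ntohl() in <arpa/inet.h> (the POSIX location; older systems
declare them elsewhere), a 32-bit value can be packed and unpacked like this.
The helper names pack_net32 and unpack_net32 are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Convert a host-order value to network order (most significant
 * byte first) and copy its four bytes into out. */
void pack_net32(unsigned char out[4], uint32_t host_value)
{
    uint32_t net_value = htonl(host_value);
    memcpy(out, &net_value, sizeof net_value);
}

/* Copy four network-order bytes into an integer and convert
 * the result back to host order. */
uint32_t unpack_net32(const unsigned char in[4])
{
    uint32_t net_value;
    memcpy(&net_value, in, sizeof net_value);
    return ntohl(net_value);
}
```

On a big-endian host the conversions do nothing; on a little-endian host
they swap the bytes, so the packed bytes are the same either way.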

Just my two bits, [Ag] gaynor@topaz.rutgers.edu