[net.lang.c] Bit field differences

p500vax:pat (04/05/83)

Our 4.1 VAX compiler assigns bit fields from right to left,
i.e. from low order to high,
while the 68000 compiler from MIT assigns bit fields from left to right,
i.e. from high order to low.  So,

to set bit 15 on the VAX:
	struct {
		unsigned short unused:15,	/* bits 0-14, assigned first (low order) */
			       bit:1;		/* bit 15 */
	} foo;

	foo.bit = 1;

to set bit 15 on the 68000:
	struct {
		unsigned short bit:1;	/* first field lands in the high-order bit, bit 15 */
	} foo;

	foo.bit = 1;

The C reference manual says
"Fields are assigned to words ... right-to-left on the PDP-11 and
left-to-right on other machines" so the VAX compiler is "wrong".
Of course the manual also says that it "describes the C language
on the DEC PDP-11, the Honeywell 6000, the IBM System/370, and the
Interdata 8/32", so maybe this is what they mean by other.
If I were writing a C compiler I would put
a high value on maintaining compatibility with PDP-11s, so I'd say that
MIT took the standard a bit too literally.
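
For comparison, here is a minimal sketch of setting bit 15 with an explicit
mask, which does not depend on the order in which a compiler assigns fields.
The names BIT, set_bit16 and test_bit16 are made up for illustration and come
from neither compiler; bit 0 is taken to be the low-order bit.

```c
#include <assert.h>

/* Bit numbers count from the low-order end: bit 0 is the LSB. */
#define BIT(n) (1u << (n))

/* Return word with bit n (0..15) set, truncated to 16 bits. */
unsigned set_bit16(unsigned word, unsigned n)
{
    assert(n < 16);
    return (word | BIT(n)) & 0xFFFFu;
}

/* Return 1 if bit n (0..15) of word is set, 0 otherwise. */
unsigned test_bit16(unsigned word, unsigned n)
{
    assert(n < 16);
    return (word >> n) & 1u;
}
```

With these, set_bit16(0, 15) gives 0x8000 whichever way the compiler lays
out bit fields.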

pn (04/06/83)

Perhaps the VAX compiler was written by someone who considered it a big
PDP-11. In any case, it's putting it a bit strongly to claim it is wrong
because the C reference manual says "other machines" assign fields
left-to-right.

crc (04/06/83)

One would expect compilers to assign bit fields the same way the CPU
manufacturer does. In this case MIT is wrong, as the 68000 counts bits from
LSB to MSB, like the PDP-11.  Unfortunately the 68000 counts bytes the other
way around, i.e. from MSByte to LSByte. It would be cleaner if they counted
bits and bytes the same way. Well, at least you can *REALLY* directly address
more than 64k, unlike the 8086, Z8000 and 9900.

donchin (04/08/83)

uiucdcs!donchin    Apr  8 02:47:00 1983

What are they talking about?  I mean, what is the dispute?  The topic
is clear.

guy (04/09/83)

Well, it's probably actually a function of how you number your bytes.  On
the PDP-11, the low-order byte of a word has the even address and the
high-order byte has the next odd address, i.e.:

	+--------+--------+
	| n + 1  |    n   |
	+--------+--------+

For hysterical raisins (namely, the way the FP-11 handles 32-bit integers),
the high- and low-order *words* of a *longword* are not handled this way
on PDP-11 C, although they ARE handled that way on all of DEC's PDP-11
software.  The VAX-11 handles bytes this way (and handles words in a longword
this way, i.e. differently from PDP-11 C).  The bits of a byte/word/longword/
quadword are also numbered this way.  As such, they probably decided to
assign bit-fields from the low-order bit to the high-order bit.

The 68000 handles bytes and words differently (although, for no conceivable
reason, it still numbers bits from the low-order bit up), so they probably
decided to do bit fields from the high-order bit to the low-order bit.
Since you can't transmit raw binary data from the PDP-11 or the VAX-11 to
the 68000 because of the byte-order problem, it's not clear that doing the
bit fields in the PDP-11/VAX-11 fashion would really help.

The moral of the story is: if you want to exchange data between systems, put
it either in ASCII or in a binary form which is insensitive to bit order.
The "tar" tape format, the new "cpio" tape format, and the Berkeley archive
format do the former, while the System V archive format and the USG UNIX
"pack" Huffman-coded file format do the latter.

					Guy Harris
					RLG Corporation
					{seismo,mcnc,we13}!rlgvax!guy
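
One way to get a binary form that does not depend on the host's byte order is
to write each integer a byte at a time using shifts. The sketch below stores
a 32-bit value most significant byte first; put32 and get32 are illustrative
names, and the choice of big-endian order on the wire is an assumption, not
something any of the formats named above is claimed to use.

```c
#include <assert.h>

/* Store the low 32 bits of v into buf[0..3], most significant byte first. */
void put32(unsigned char *buf, unsigned long v)
{
    assert(buf != 0);
    buf[0] = (unsigned char)((v >> 24) & 0xFF);
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* Rebuild the value from buf[0..3]; works the same on any host byte order. */
unsigned long get32(const unsigned char *buf)
{
    assert(buf != 0);
    return ((unsigned long)buf[0] << 24) |
           ((unsigned long)buf[1] << 16) |
           ((unsigned long)buf[2] << 8) |
            (unsigned long)buf[3];
}
```

Because only shifts and masks on values are used, never a cast of the buffer
to an integer pointer, the same bytes come out on a VAX and on a 68000.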

tjt (04/09/83)

I presume that the intent of assigning bit fields from right to left
on pdp-11's and left to right on other machines is to match the order
of bytes within larger integers.  This is certainly the
interpretation of PCC, in which #ifdef RTOLBYTES surrounds much of
the code dealing with bit fields.

This makes bit fields unportable for retro-fitting code whose
specifications are defined in terms of shifts and masks. An example
of this is <wait.h> in 4BSD which declares a structure using bit
fields as returned by the wait system call -- the fields have to be
declared in reverse order using PCC on a MC68000.  It should also be
emphasized that the portability problem here arises only because the
wait system call is described as returning information in the "low
order bits" and the "high order 8 bits" (from System III).  There
would be no problem if the bit field declarations were used
exclusively.  As an example of this, ld with vax-style symbol tables
(declared using bit fields) works fine on a 68000, at least for
object files generated on a 68000.  It's true that the bit field
order messes up the relocation information, but the byte order also
messes up the header and just about everything else in the object
file.
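
To make the shifts-and-masks view concrete, here is a sketch of decoding a
wait status without bit fields. The layout is an assumption based on the
traditional 16-bit status word (bits 0-6 termination signal, bit 7 core dump
flag, bits 8-15 exit code); the function names are hypothetical and are not
the ones in <wait.h>.

```c
#include <assert.h>

/* Low-order 7 bits: signal that terminated the process (0 if it exited). */
unsigned status_termsig(unsigned status)
{
    return status & 0x7Fu;
}

/* Bit 7: set if a core image was produced. */
unsigned status_coredump(unsigned status)
{
    return (status >> 7) & 1u;
}

/* High-order 8 bits of the 16-bit status: the exit code. */
unsigned status_exitcode(unsigned status)
{
    return (status >> 8) & 0xFFu;
}

/* Build a status word from its parts, for testing the decoders. */
unsigned make_status(unsigned exitcode, unsigned coredump, unsigned termsig)
{
    assert(exitcode <= 0xFFu && coredump <= 1u && termsig <= 0x7Fu);
    return (exitcode << 8) | (coredump << 7) | termsig;
}
```

Code written this way gives the same answer under either field order, which
is why a bit field declaration meant to match it has to be reversed on one
of the two machines.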

Unfortunately, having made this concession to portability (presumably
in the interest of efficiency), PCC doesn't go far enough.  The C
reference manual (p196 in Kernighan and Ritchie) states that:

	Field members are packed into machine integers; they do
	not straddle words.  A field which does not fit into the
	space remaining in a word is put into the next word.

and in the explanatory text (p137) "A field is a set of adjacent bits
within a single int." and later (p138) "A field may not overlap an
int boundary; if the width would cause this to happen, the field is
aligned at the next int boundary."

The code in PCC for dealing with bit fields in the absence of special
hardware does generate the obvious shifts and masks, operating on
values of data type int.  However, it does not pad out the next
non-field element of a structure to start at an int boundary, nor
does it align the first field element specially.  If it did, there
should have been little problem with defining bit fields to always go
from left to right or right to left (transforming positions of bit
fields to accommodate particular machine architectures can be done at
compile time since field widths are all constants).
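
The transformation mentioned in the parenthesis is simple arithmetic. As a
sketch, assuming a 16-bit storage unit (WORD_BITS is an assumed constant, and
all the names here are made up for illustration rather than taken from PCC),
a field declared `offset` bits from the left end maps to a right shift like
this:

```c
#include <assert.h>

#define WORD_BITS 16u   /* assumed size of the storage unit */

/* A field starting `offset` bits from the high-order end and `width` bits
   wide has its lowest bit this many places above the low-order end. */
unsigned shift_from_left_offset(unsigned offset, unsigned width)
{
    assert(width > 0 && offset + width <= WORD_BITS);
    return WORD_BITS - offset - width;
}

/* Mask of `width` one bits, right justified. */
unsigned field_mask(unsigned width)
{
    assert(width > 0 && width <= WORD_BITS);
    return (width >= WORD_BITS) ? 0xFFFFu : ((1u << width) - 1u);
}

/* Extract a field that was laid out left to right. */
unsigned extract_ltor(unsigned word, unsigned offset, unsigned width)
{
    return (word >> shift_from_left_offset(offset, width)) & field_mask(width);
}
```

Since offset and width are constants in a declaration, a compiler can fold
the subtraction at compile time and emit the same shift and mask either way.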

As it is, PCC generates code which touches e.g. 32 bits in order to
change 1 bit. On 68000's (at least for the next year or two until the
68020 comes out), it is significantly more expensive to touch all
these bits than it is to touch say 16 or 8 bits.  There is an even
bigger problem if you attempt to use bit fields to access device
registers where reading and writing memory can have side effects even
if the "value" wasn't changed.  This was noticed at Stanford, where
the MIT PCC/68000 compiler was modified to access fields as a short
(16-bits) if possible (I don't know if the modifications were
incorporated into later versions of the compiler distributed by MIT).

The logical extension of this would be to access fields as char's
(8-bits) if possible as well, and in general, to treat fields
declared as char:8, short:16 and int:32 as much as possible like structure
elements of type char, short and int respectively.  Obviously, these
field widths are machine dependent.  Also, these declarations can't
be equivalent since you can take the address of a structure element
of type char, but not of a bit field.  This is also the only place
where I can see the advantage of assigning bit fields in the same
order as bytes, since assigning bits from right to left on a machine
where bytes are numbered from left to right would require knowing
where the right end of the whatever was, i.e. the next non-field
element of the structure.
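
As a rough sketch of the narrowest-access idea (not the Stanford
modification itself, whose details I don't have), a compiler could pick the
smallest aligned unit of 8, 16 or 32 bits that wholly contains a field. The
function name is hypothetical, bit_offset is counted from the start of the
structure in storage order, and naturally aligned accesses are assumed.

```c
#include <assert.h>

/* Return 8, 16 or 32: the smallest aligned access size, in bits, that
   contains the whole field, or 0 if the field straddles a 32-bit unit. */
unsigned smallest_access_bits(unsigned bit_offset, unsigned width)
{
    unsigned size;
    unsigned last;

    assert(width > 0 && width <= 32u);
    last = bit_offset + width - 1u;     /* index of the field's last bit */

    for (size = 8u; size <= 32u; size *= 2u) {
        /* First and last bit fall in the same aligned unit of this size. */
        if (bit_offset / size == last / size)
            return size;
    }
    return 0u;
}
```

A 1-bit field at offset 0 then needs only a byte access, while a 2-bit field
at offset 7 crosses a byte boundary and needs a 16-bit access.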

Since the portability of bit fields is only a problem for binary
objects, and only where byte order is also a portability problem, I
think the only change that should be made is to get the compilers to
generate better code.