[comp.lang.c] Portability Question: sign of bit fields

msb@sq.uucp (Mark Brader) (12/24/87)
This should have been in comp.lang.c, so I'm moving it there and
redirecting followups.  Apologies to comp.unix.questions readers for the
long inclusions, which I provide for the sake of comp.lang.c readers.

David Hitz (hitz@quacky.UUCP) writes:
> In article <9803@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
> ] In article <122@insyte.uucp> jad@insyte.uucp (Jill Diewald) writes:
> ] ] ... HPUX c ... [and] VMS c ... [give] different answers.  We want to
> ] ] know which is right (if either) so we can report it as a bug to the
> ] ] correct source.
> ]   ...
> ] ] main() {
> ] ]    struct { int x : 1; } foo;
> ] ]
> ] ]    foo.x = 1;
> ] ]    printf ("%d\n", foo.x);
> ] ] }
> ]
> ] Whether bitfields are signed is undefined.  I believe the current
> ] draft says that to get a particular behaviour, you must use either
> ] of the `signed' or `unsigned' keywords.  In other words, the code
> ] is wrong, not either of the compilers.
> 
> Now I'm curious.  Does ANSII require a hardware implementation to use
> 2s complement arithmetic for its integer representation?  (Am I allowed
> to build a grey code machine?)

First, it's not a standard yet.  It's a draft proposed standard...
Draft Proposed American National Standard to be formal.  I usually
say Draft Standard; some write dpANS.  And the organization is ANSI
with one I (even if they are responsible for ASCII with two I's).

Now, finally, to answer the question.  The Draft Standard does not
require any particular representation for negative integers, but it
does require that non-negative integers be stored in ordinary binary
form.  (There is C on non-2's-complement hardware now; there is no C on
non-binary hardware and it seems hard to imagine it ever existing.)

Therefore x<<1 for x*2 is guaranteed to work if and only if the value of
x cannot be negative.  (An important special case of this, of course,
would be if x was declared unsigned.)   And the use of

	#define	FIELD1	1
	#define	FIELD2	2
	#define	FIELD3	4
	#define	FIELD4	8

as bitmasks is also guaranteed to work, again provided that the
numbers fit in the range of positive values of the particular kind
of integer being used.
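
As a minimal sketch of the two guarantees above, assuming the values are
kept in unsigned ints so that they can never be negative: the helper names
(twice, set_flags, clear_flags, has_flag) are made up for illustration, and
the masks carry a `u' suffix that the #defines above do not have.

```c
/* The same four masks as above, written as unsigned constants
   (the `u' suffix is an addition for this sketch). */
#define FIELD1 1u
#define FIELD2 2u
#define FIELD3 4u
#define FIELD4 8u

/* Doubling by shifting.  With an unsigned operand the value can never
   be negative, so the shift has the same result as x * 2. */
unsigned twice(unsigned x)
{
    return x << 1;
}

/* Turn on every bit of `mask' in `flags'. */
unsigned set_flags(unsigned flags, unsigned mask)
{
    return flags | mask;
}

/* Turn off every bit of `mask' in `flags'. */
unsigned clear_flags(unsigned flags, unsigned mask)
{
    return flags & ~mask;
}

/* Nonzero if any bit of `mask' is on in `flags'. */
int has_flag(unsigned flags, unsigned mask)
{
    return (flags & mask) != 0;
}
```

For instance, set_flags(0u, FIELD1 | FIELD3) yields 5 and twice(21u)
yields 42, whatever representation the machine uses for negative numbers.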

And no, you aren't allowed to build a Gray code machine.  Or rather,
if you do, any time you perform a bitwise operation such as setting a
bit field or using the & operator, you must convert the numbers to pure
binary form, do the operation, and convert back to your internal form.
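
To illustrate the convert, operate, convert back sequence, here is a rough
sketch of what the & operator would have to do on such a machine.  It
assumes the usual reflected Gray code; the function names are hypothetical
and the "machine" is only simulated with ordinary unsigned ints.

```c
/* Pure binary -> reflected Gray code. */
unsigned binary_to_gray(unsigned b)
{
    return b ^ (b >> 1);
}

/* Reflected Gray code -> pure binary: XOR together g, g>>1, g>>2, ... */
unsigned gray_to_binary(unsigned g)
{
    unsigned b = 0;

    for (; g != 0; g >>= 1)
        b ^= g;
    return b;
}

/* What C's `a & b' would have to compute if both operands were held
   internally in Gray code: decode, AND in pure binary, re-encode. */
unsigned gray_machine_and(unsigned ga, unsigned gb)
{
    return binary_to_gray(gray_to_binary(ga) & gray_to_binary(gb));
}
```

The point is only that the result seen by the C program must be the one
that pure binary would give; how the hardware stores it is its own affair.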

> If the standard doesn't say, then setting a single bit in a bitfield
> could result in an arbitrary integer.

Well, arbitrary within the size allowed by the field.  But as I have
explained, this is not possible.  However, it IS true that storing a
negative value into an integer and turning off an arbitrary bit can
result in an arbitrary value.


Let me explain about "signed" here, for the benefit of those who haven't
been following the developments of the last couple of years.
What follows is an interpretation of the Draft Standard in my own words.

The current Draft Standard provides 9 types for integers and characters.
The signed types are "long", "int", "short", "signed char", and maybe "char";
the unsigned types are "unsigned long", "unsigned int", "unsigned short",
"unsigned char", and maybe "char".  By "maybe" I mean that an implementation
is allowed to treat chars either as signed chars or as unsigned chars, but
the types are all distinct even when they are treated the same (just as
ints and longs are distinct types even when they are both 32 bits).
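
A program can find out which way its implementation treats plain char by
looking at CHAR_MIN from <limits.h>.  A small sketch, with a made-up
function name:

```c
#include <limits.h>

/* Returns 1 if plain char has the same range as signed char on this
   implementation, 0 if it has the same range as unsigned char.
   CHAR_MIN is 0 exactly when plain char is treated as unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Either answer is legitimate, so portable code should not depend on it.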

The value of any ordinary character is guaranteed to be positive when
stored in a char or signed char.  The char types are at least 8 bits,
the short and int types at least 16, the long types at least 32.  Also
sizeof(char)<=sizeof(short)<=sizeof(int)<=sizeof(long), if I may mix C
and math notation for a moment.  Operations on the unsigned types are
guaranteed to be performed in modular arithmetic without the possibility
of overflow trapping, but with the signed types an implementation can do
what it likes on an overflow.  (In case "modular arithmetic" scares
anyone -- that's what 2's complement machines that don't check overflow
do in any case; on these machines signed and unsigned values are mostly
handled by the same operations.)
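
The following sketch checks the sizes as I have described them and shows
the modular behaviour of unsigned arithmetic.  The function names are
invented for the example; the limit macros come from <limits.h>.

```c
#include <limits.h>

/* Returns 1 if the minimum sizes and the sizeof ordering described
   above hold on this implementation, 0 otherwise. */
int minimum_ranges_hold(void)
{
    return CHAR_BIT >= 8
        && SHRT_MAX >= 32767
        && INT_MAX >= 32767
        && LONG_MAX >= 2147483647L
        && sizeof(char) <= sizeof(short)
        && sizeof(short) <= sizeof(int)
        && sizeof(int) <= sizeof(long);
}

/* Unsigned addition never overflows; the result is reduced
   modulo UINT_MAX + 1. */
unsigned wrap_add(unsigned a, unsigned b)
{
    return a + b;
}
```

So wrap_add(UINT_MAX, 1u) is 0 everywhere, while the signed counterpart,
INT_MAX + 1, is an overflow and the implementation can do what it likes.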

The above introduces a new keyword "signed".  It is additionally allowed
to be used as a noise-word with the larger integer types; thus "signed
long" is another way of saying "long", just as "long int" is.  (Yes,
you can even say "signed long int".)

And "signed" is also allowed to be used with bit fields, but here it is *not*
a noise-word.  Bit fields effectively come in three flavors, like chars.
That is, they may be plain, signed, or unsigned, and an implementation
is allowed to treat plain ones one way or the other.  And this, of course,
is what gives the conflicting results in the original example.
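
Here is a sketch of the three flavors side by side, assuming a compiler
that already accepts the `signed' keyword.  The struct and function names
are hypothetical.

```c
struct flavors {
    int          plain : 1;  /* signed or unsigned: implementation's choice */
    signed int   s     : 1;  /* always signed */
    unsigned int u     : 1;  /* always unsigned: holds 0 or 1 */
};

/* Store v in the plain field and read it back. */
int store_plain(int v)
{
    struct flavors f = { 0, 0, 0 };

    f.plain = v;
    return f.plain;
}

/* Store v in the explicitly signed field and read it back. */
int store_signed(int v)
{
    struct flavors f = { 0, 0, 0 };

    f.s = v;
    return f.s;
}

/* Store v in the explicitly unsigned field and read it back. */
int store_unsigned(int v)
{
    struct flavors f = { 0, 0, 0 };

    f.u = v;
    return f.u;
}
```

store_unsigned(1) gives 1 on every implementation.  store_plain(1) is the
original program in miniature: it gives 1 where plain fields are unsigned,
and typically -1 where they are signed on a 2's complement machine, since
a one-bit signed field there has no room for +1.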

Mark Brader				"C takes the point of view
SoftQuad Inc., Toronto			 that the programmer is always right"
utzoo!sq!msb, msb@sq.com				-- Michael DeCorte