[comp.lang.c] type and size of bit-fields

taft@adobe.com (Ed Taft) (03/17/91)

Consider a C compiler that treats int as 16 bits and long int as 32 bits.
What should be the interpretation of the following bit-field appearing in a
struct?

  int foo: 24;

The ANSI C standard states: "A bit-field is interpreted as an integral type
consisting of the specified number of bits." Based on this, one might expect
foo to be treated as a 24-bit integer. That is, in this context, "int" means
"integral type", not "16-bit integer".

However, this interpretation may be contradicted by an earlier statement
that the width of a bit-field "shall not exceed the number of bits in an
ordinary object of compatible type." This statement is somewhat enigmatic,
since it depends on what you think is meant by "compatible type" in this
instance.

Alas, different compilers seem to disagree about this. Ed McCreight recently
surveyed four C compilers for the IBM PC and reports the following
results:

* WATCOM 8.0 works as described above.

* Microsoft 6.00A, Zortech 2.1, and Borland BCC 2.0 complain that the
  bit-field is too large to be an int.

In light of this, a possible work-around comes to mind:

  long int foo: 24;

Unfortunately, this violates the ANSI C standard, which states: "A bit-field
may have type int, unsigned int, or signed int" -- omitting long types.
Surveying the same four compilers, we observe the following:

* WATCOM 8.0 and Borland BCC 2.0 reject this declaration.

* Microsoft 6.00A and Zortech 2.1 accept it without complaint and
  (apparently) interpret the bit-field as intended.

We haven't yet discovered any way to get Borland BCC 2.0 to accept
bit-fields longer than 16 bits.
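
One escape hatch, if all else fails, is to give up on bit-field
syntax and pack the field by hand with shifts and masks in an
unsigned long. A sketch (the struct and names are invented for
illustration):

  #define FOO_MASK 0xFFFFFFUL          /* low 24 bits */

  struct rec {
      unsigned long bits;              /* low 24 bits hold "foo" */
  };

  static void set_foo(struct rec *r, unsigned long v)
  {
      r->bits = (r->bits & ~FOO_MASK) | (v & FOO_MASK);
  }

  static unsigned long get_foo(const struct rec *r)
  {
      return r->bits & FOO_MASK;
  }

This behaves the same under any ANSI compiler, whatever the size of
int; a signed foo would additionally need explicit sign extension on
the way out.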

Now admittedly, nearly everything to do with bit-fields is implementation
dependent. On the other hand, it doesn't seem unreasonable to expect an
implementation to support bit-fields of any size up to and including the
largest integral type. Can anyone offer authoritative information on this
matter?


Ed Taft      taft@adobe.com      ...decwrl!adobe!taft

torek@elf.ee.lbl.gov (Chris Torek) (03/18/91)

In article <12638@adobe.UUCP> taft@adobe.COM writes:
>The ANSI C standard states: "A bit-field is interpreted as an integral type
>consisting of the specified number of bits." Based on this, one might expect
>foo to be treated as a 24-bit integer. That is, in this context, "int" means
>"integral type", not "16-bit integer".

>However, this interpretation may be contradicted by an earlier statement
>that the width of a bit-field "shall not exceed the number of bits in an
>ordinary object of compatible type." This statement is somewhat enigmatic,
>since it depends on what you think is meant by "compatible type" in this
>instance.

There is no contradiction here (just a bit of confusing language).  What this
means is that there are various ways to declare bitfields:

	int <name>:<size>;
	signed int <name>:<size>;
and	unsigned int <name>:<size>;

where <size> is `small enough'.  The last two produce signed and
unsigned values that are exactly <size> bits long, while the first one
produces either signed or unsigned, at the implementation's discretion.
`Small enough' here means that the <size> should be no larger than the
number of bits in a `signed int' or `unsigned int', whichever is the
`compatible type' (to use the phrase quoted above, and thus define it
implicitly [I hope]).
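
To make the distinction concrete, a minimal test (a sketch only;
any ANSI compiler should do):

  #include <stdio.h>

  struct bits {
      int          plain : 3;  /* signedness is the implementation's choice */
      signed int   s     : 3;  /* exactly 3 bits, always signed */
      unsigned int u     : 3;  /* exactly 3 bits, always unsigned */
  };

  int main(void)
  {
      struct bits b;

      b.plain = -1;   /* reads back -1 if plain is signed, 7 if unsigned */
      b.s = -1;       /* must read back as -1 on any conforming compiler */
      b.u = -1;       /* must wrap to 7, as unsigned arithmetic requires */
      printf("plain=%d s=%d u=%u\n", b.plain, b.s, (unsigned)b.u);
      return 0;
  }
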
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

henry@zoo.toronto.edu (Henry Spencer) (03/21/91)

In article <12638@adobe.UUCP> taft@adobe.COM writes:
>Now admittedly, nearly everything to do with bit-fields is implementation
>dependent. On the other hand, it doesn't seem unreasonable to expect an
>implementation to support bit-fields of any size up to and including the
>largest integral type. Can anyone offer authoritative information...

It's pretty much as you've stated.  The standard promises almost nothing
about bitfields.  The degree of support for them is a "quality of
implementation" issue.  One would hope for the property you request, but
there is no guarantee of it.
-- 
"[Some people] positively *wish* to     | Henry Spencer @ U of Toronto Zoology
believe ill of the modern world."-R.Peto|  henry@zoo.toronto.edu  utzoo!henry

ckp@grebyn.com (Checkpoint Technologies) (03/21/91)

In article <1991Mar20.172906.3645@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <12638@adobe.UUCP> taft@adobe.COM writes:
>>Now admittedly, nearly everything to do with bit-fields is implementation
>>dependent. On the other hand, it doesn't seem unreasonable to expect an
>>implementation to support bit-fields of any size up to and including the
>>largest integral type. Can anyone offer authoritative information...
>
>It's pretty much as you've stated.  The standard promises almost nothing
>about bitfields.  The degree of support for them is a "quality of
>implementation" issue.  One would hope for the property you request, but
>there is no guarantee of it.

Here's a kicker...  The best information I have right now is that SAS C
version 6.0 for the Amiga will use a different ordering and packing
scheme for bit fields than SAS C version 5.10.

Does it seem reasonable to anyone else that bit field ordering is *so*
indefinite that different versions of the compiler from the same vendor
for the same platform may do things differently?  This would seem to
prohibit writing records with bit fields to external storage, if there's
any chance that the program in question will ever be recompiled and then
asked to deal with the same records.

Perhaps the Standard has something to say in this regard?
-- 
First comes the logo: C H E C K P O I N T  T E C H N O L O G I E S      / /  
                                                ckp@grebyn.com      \\ / /    
Then, the disclaimer:  All expressed opinions are, indeed, opinions. \  / o
Now for the witty part:    I'm pink, therefore, I'm spam!             \/

scs@adam.mit.edu (Steve Summit) (03/21/91)

In article <1991Mar20.224317.1265@grebyn.com> ckp@grebyn.com (Checkpoint Technologies) writes:
>Here's a kicker...  The best information I have right now is that SAS C
>version 6.0 for the Amiga will use a different ordering and packing
>scheme for bit fields than SAS C version 5.10.
>
>Does it seem reasonable to anyone else that bit field ordering is *so*
>indefinite that different versions of the compiler from the same vendor
>for the same platform may do things differently?  This would seem to
>prohibit writing records with bit fields to external storage, if there's
>any chance that the program in question will ever be recompiled and then
>asked to deal with the same records.

This is why I (and many others) recommend *never* using "binary"
data files.

A couple of years ago, I made two mistakes: I disregarded my own
advice about binary data files, and I used a certain popular but
widely vilified processor having several architectural
peculiarities, including various "memory models," which end up
being highly visible at the source code level.

It happened that the structures I was writing out contained a few
pointer fields.  Of course, the pointer values were meaningless
when read back in (and the program took care of fixing them), but
arranging not to write them out at all would have meant writing
out the rest of the structure piecemeal, which would destroy the
whole point of these accursed "binary" files, namely that you can
read and write entire structures in one swell foop with fread and
fwrite.
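
The trap, in miniature (a sketch; not the actual structures):

  #include <stdio.h>

  struct node {
      char         name[16];
      struct node *next;          /* meaningless once read back in */
  };

  void save(FILE *fp, const struct node *n)
  {
      /* sizeof *n changes when the pointer size changes -- e.g.
         between "small" and "large" memory models -- so old files
         stop matching the new layout. */
      fwrite(n, sizeof *n, 1, fp);
  }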

Well, one day I had occasion to recompile the programs in
question using the "large memory model," in which pointers are 32
bits rather than 16.  Presto: all my old data files (and there
were hundreds of them, scattered all over the place) were
unreadable by new programs.  To this day I have to maintain
separate, "small model" versions of key programs (and several
large libraries) in order to deal with the older data files,
many of which are still around.

The moral: DON'T USE BINARY DATA FILES.  (You'll say, "but we
have to, for efficiency."  Sure, and go ahead, but don't come
crying to me :-) when you have problems -- and you will have
problems.)
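
The text alternative is only slightly more work.  A sketch, with an
invented structure -- each record becomes one line of text, with no
layout or byte-order assumptions:

  #include <stdio.h>

  struct rec {
      long id;
      char name[32];
  };

  /* Any build of the program (and a human with an editor) can read
     this format.  Names containing spaces would need quoting,
     omitted here. */
  int rec_write(FILE *fp, const struct rec *r)
  {
      return fprintf(fp, "%ld %.31s\n", r->id, r->name) < 0 ? -1 : 0;
  }

  int rec_read(FILE *fp, struct rec *r)
  {
      return fscanf(fp, "%ld %31s", &r->id, r->name) == 2 ? 0 : -1;
  }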

>Perhaps the Standard has something to say in this regard?

It says (implicitly) "It's a quality of implementation issue."
I agree that the abovementioned change to the SAS compiler is
extremely low-quality: whether or not you're using binary
data files, you'll have to recompile *all* of your object files
(and vendor-supplied libraries, assuming you have source for
them) if they deal with structures containing bitfields.

                                            Steve Summit
                                            scs@adam.mit.edu

gwyn@smoke.brl.mil (Doug Gwyn) (03/21/91)

In article <1991Mar20.224317.1265@grebyn.com> ckp@grebyn.com (Checkpoint Technologies) writes:
>...  This would seem to prohibit writing records with bit fields to
>external storage, if there's any chance that the program in question
>will ever be recompiled and then asked to deal with the same records.

Yes, indeed, if any program other than THE SAME PROCESS THAT WROTE IT
reads a binary "dump" of C data structures, there is no guarantee that
it will have the appropriate format.

>Perhaps the Standard has something to say in this regard?

It does NOT require all implementations to make the same choice for
bit field ordering, structure member padding, etc.  Successive
releases of the "same" compiler are considered different implementations.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/22/91)

In article <1991Mar21.021023.25615@athena.mit.edu> scs@adam.mit.edu writes:
> The moral: DON'T USE BINARY DATA FILES.

Binary files often increase not only efficiency, but security and
reliability. What am I referring to? Compilations.

There are examples of this other than language compilations. I'm working
on a system program, shuctld, which takes a readable configuration file,
typically /etc/shu.conf, compiles it into /etc/shu.cc, and then uses
/etc/shu.cc. This saves up to a few seconds of parsing and service-host
lookup time at the beginning of every network connection.

Not only that, but if someone makes a syntax error in /etc/shu.conf,
shuctld will blare a few warnings in various places and continue using
the old /etc/shu.cc. You know what happens if you mess up /vmunix and
then reboot? That can't happen with /etc/shu.conf.

shuctld automatically checks whether /etc/shu.cc is newer than
/etc/shu.conf, so you don't have to worry about the files getting out of
sync. It can also check /etc/shu.cc against its own executable, so that
you don't even have to worry if the compiled format changes.
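
(The freshness check itself is only a few lines on any Unix; here's
a sketch, not shuctld's actual code:)

  #include <sys/types.h>
  #include <sys/stat.h>

  /* Nonzero if the compiled file is at least as new as its source,
     i.e., safe to use without recompiling. */
  int compiled_is_fresh(const char *conf, const char *cc)
  {
      struct stat sconf, scc;

      if (stat(conf, &sconf) == -1 || stat(cc, &scc) == -1)
          return 0;                   /* missing file: recompile */
      return scc.st_mtime >= sconf.st_mtime;
  }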

Now why shouldn't shuctld use an easy-to-parse, reliable binary file,
rather than a text file that users will invariably trash?

---Dan

sbs@ciara.Frame.COM (Steven Sargent) (03/22/91)

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

> Now why shouldn't shuctld use an easy-to-parse, reliable binary file,
> rather than a text file that users will invariably trash?

---
We've moved from general rules about writing binary files
to your particular application (and even further from topics
in the C language!).  So I'll try not to overdo it.

In your application, do what you want.  You're providing
a tool for wheels (/etc/shuctld); system administration
stuff often cuts corners on portability, so you'll not be
much worse than other providers.  (You doubtless will run
into somebody who wants to build shu.cc files on one machine
and then distribute them automagically to heterogeneous
machines; but that's *his* problem, not yours, right?)  Much
better, you provide shu.cc as an efficiency hack to shu.conf,
so that it can be regenerated more or less painlessly.

For applications whose charter is to be portable across
heterogeneous systems, binary files are rather messier.
Probably the worst thing to do is to drop in arbitrary unpadded
struct/unions, pointers, floating point numbers, and similar
trash.  Write transfer functions between file- and VM-resident
versions of the structures, and you've solved most of the
problem, given that you do something sensible with byte/word
ordering.  Provide a schema for the file, either explicitly,
or implicitly via the file's version string.
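
For instance, a transfer-function pair for a 32-bit quantity can fix
the byte order once and for all (a sketch):

  #include <stdio.h>

  /* Move a 32-bit value to and from the file in a fixed (big-endian)
     byte order, whatever the host's own layout. */
  int put_u32(FILE *fp, unsigned long v)
  {
      int shift;

      for (shift = 24; shift >= 0; shift -= 8)
          if (putc((int)((v >> shift) & 0xff), fp) == EOF)
              return -1;
      return 0;
  }

  int get_u32(FILE *fp, unsigned long *v)
  {
      int i, c;

      *v = 0;
      for (i = 0; i < 4; i++) {
          if ((c = getc(fp)) == EOF)
              return -1;
          *v = (*v << 8) | (unsigned long)c;
      }
      return 0;
  }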

Users can and do "invariably trash" binary files as well as
text files, by using them via soft NFS mounts, or sending them
around over 7-bit communication paths, or mistakenly opening
and saving them with vi; a little paranoia on the application's
part doesn't hurt.  Write a header including a magic number
and version string.  If you're really nervous, think about
writing checksums.
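
Such a header might look like this (a sketch; the magic and version
values are invented):

  #include <stdio.h>
  #include <string.h>

  static const char magic[4]   = { 'S', 'H', 'U', 'C' };
  static const char version[4] = { '0', '0', '0', '1' };

  int header_write(FILE *fp)
  {
      return fwrite(magic, 1, 4, fp) == 4
          && fwrite(version, 1, 4, fp) == 4 ? 0 : -1;
  }

  /* Refuse files trashed in transit or written by an incompatible
     version of the program. */
  int header_check(FILE *fp)
  {
      char buf[8];

      if (fread(buf, 1, 8, fp) != 8)
          return 0;
      return memcmp(buf, magic, 4) == 0
          && memcmp(buf + 4, version, 4) == 0;
  }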

Using these hints, you're still getting most of the speed
benefit of binary files (generally, but not always, more
compact than the ASCII version; no text strings to decode),
and a better chance of portability.  Your code will be more
complex, but you know better than I do where that trades off
for you.

-- 
Steven Sargent sbs@frame.com	"Frame's phone bill, my opinions."