[comp.lang.c] question about an array of enum

it1@ra.MsState.Edu (Tim Tsai) (11/03/90)

  consider the following code:

typedef enum {false, true} boolean;

main()
{
	boolean bit_fields[1000];

	bit_fields[0] = true;
}

  How much memory does bit_fields actually take up, assuming a 32-bit
architecture?  Will my C compiler treat it as an array of int's?
(Sun OS 4.1, Sun cc).  What about other C compilers?  (I know turbo C
has a switch that allows it to treat enum data types as int's).

  I thought about using an array of characters and calculating
the positions myself, but I'd rather use enum if possible.  It's quite
a bit simpler.

  Any help is greatly appreciated...

-- 
  Sometimes I think the surest sign that intelligent life exists elsewhere
  in the universe is that none of it has tried to contact us.
                                         <Calvin of Calvin & Hobbes>
  Disclaimer: I don't speak for ANYBODY but myself! <Me>

henry@zoo.toronto.edu (Henry Spencer) (11/04/90)

In article <it1.657622430@ra> it1@ra.MsState.Edu (Tim Tsai) writes:
>typedef enum {false, true} boolean;
>	boolean bit_fields[1000];
>
>  How much memory does bit_fields actually take up, assuming a 32-bit
>architecture?  Will my C compiler treat it as an array of int's?

Almost certainly it will treat it as an array of some integer type.
(ANSI C requires that, in fact.)  The compiler is within its rights
to treat it as an array of `int', although it could also be nice and
treat it as an array of `char' or some other small integer type.

It is unlikely, verging on impossible, that you will get the effect
you want this way.  C enums are not Pascal enumerations, and the
compiler would have to work fairly hard to be sure that `true' and
`false' were the only values you were assigning to elements of that
array.  There would also be problems with things like `sizeof'.
Finally, indexing into arrays is a colossal pain if the array elements
are smaller than the minimum addressable unit, which is typically the
`char', and compilers seldom want to take the trouble, especially since
it means bending the rules anyway.

If you really want the effect of an array of bits, you're going to have
to do the two-part addressing (get a byte/word and then pick out the bit
with a mask or a shift) yourself.
-- 
"I don't *want* to be normal!"         | Henry Spencer at U of Toronto Zoology
"Not to worry."                        |  henry@zoo.toronto.edu   utzoo!henry

gary@hpavla.AVO.HP.COM (Gary Jackoway) (11/05/90)

Henry Spencer writes:

> In article <it1.657622430@ra> it1@ra.MsState.Edu (Tim Tsai) writes:
> >typedef enum {false, true} boolean;
> >	boolean bit_fields[1000];
> >
> >  How much memory does bit_fields actually take up, assuming a 32-bit
> >architecture?  Will my C compiler treat it as an array of int's?

> Almost certainly it will treat it as an array of some integer type.
> (ANSI C requires that, in fact.)  The compiler is within its rights
   --------------------
> to treat it as an array of `int', although it could also be nice and
> treat it as an array of `char' or some other small integer type.
I don't think a compiler can make enums smaller than int and be ANSI
compliant, since the standard says that enums are ints.  Further, enums
don't accept the short/long keywords (at least not in MSC and HP-UX C).
It is for this reason that I almost never use enums.  Their size varies
from machine to machine and there is nothing you can do about it.

> If you really want the effect of an array of bits, you're going to have
> to do the two-part addressing (get a byte/word and then pick out the bit
> with a mask or a shift) yourself.

Absolutely.  Use macros to make it look like you are addressing an
array of bits.  Then, some day if the language supports this, you will
be able to upgrade painlessly.

Gary Jackoway

pds@lemming.webo.dg.com (Paul D. Smith) (11/06/90)

[] I don't think a compiler can make enums smaller than int and be
[] ANSI compliant, since the standard says that enums are ints.

K&R II: Section A8.4, p. 215:
    "The identifiers in an enumerator list are declared as constants
     of type `int' ... "

(sorry, I don't have a copy of ANSI ... )

[] Further, enums don't accept the short/long keywords (at least not
[] in MSC and HP-UX C).  It is for this reason that I almost never use
[] enums.  Their size varies from machine to machine and there is
[] nothing you can do about it.

If you *care* about how many bits are in the physical representation
of an enum, then you shouldn't be using one.  Enums are newfangled
(:-) software engineering / data abstraction concepts; their purpose
is to represent the type of a variable with a predefined, distinct set
of possible values (presumably a smaller set than those representable
by an `int'! :-).

I admit that I use the actual values of enums sometimes, but I've
never had a reason to use a value larger than even a 16-bit integer
can hold -- mostly I use them as array indices when I use the actual
value, since my debugger is quite intelligent about enums and handily
prints their symbolic value.  When I'm not using the actual value I
never bother to set it, just to show I don't care.
--

                                                                paul
-----
 ------------------------------------------------------------------
| Paul D. Smith                          | pds@lemming.webo.dg.com |
| Data General Corp.                     |                         |
| Network Services Development           |   "Pretty Damn S..."    |
| Open Network Applications Department   |                         |
 ------------------------------------------------------------------

mcdaniel@adi.com (Tim McDaniel) (11/07/90)

gary@hpavla.AVO.HP.COM (Gary Jackoway) writes:
> I don't think a compiler can make enums smaller than int and be ANSI
> compliant, since the standard says that enums are ints.

You're mixing 'enumeration constants' and 'enumerated types'.  See
Section 3.5.2.2 of the ANSI Standard.  Enumeration constants (the
identifiers between the { and the }) are of type 'int'.  Then again,
so are character constants, like '*'.  "Each enumerated type shall be
compatible with an integer type; the choice of type is implementation-
defined."  So the enum value itself might be a 'char', say, if the
range of enumeration constants defined permits it.

> It is for this reason that I almost never use enums.  Their size
> varies from machine to machine and there is nothing you can do about
> it.

It is for this reason that I almost never use character constants like
'a'.  They're ints, so their size varies from machine to machine and
there is nothing you can do about it.

What C type DOESN'T vary in size from machine to machine?  (Answer:
none.  The Standard only sets minima, and in practice there is
variation.  I suspect that there are or will be compilers with 16 bit
or 32 bit 'char'.)

Why in creation do you CARE how much space they take?  I can see only two
reasons: (1) binary files, (2) memory space.  Binary files are
inherently unportable.  Often, worrying about memory space is
worshipping the Little Tin God of Efficiency.  If you have a large
array AND have severely limited memory, you may need to worry.
Otherwise, you may save a few bytes, but you may pay for it in slower
access time.

Only optimize if you absolutely need the space or the time, and then
only attack the areas with the biggest payoffs until you fit into your
budget.

> Then, some day if the language supports [arrays of bits], you will
> be able to upgrade painlessly.

Can't happen.  "sizeof (char) == 1" is now fixed in the language, as
is the fact that it yields an integral type.
--
Tim McDaniel                 Applied Dynamics Int'l.; Ann Arbor, Michigan, USA
Work phone: +1 313 973 1300                        Home phone: +1 313 677 4386
Internet: mcdaniel@adi.com                UUCP: {uunet,sharkey}!amara!mcdaniel

steve@taumet.com (Stephen Clamage) (11/07/90)

gary@hpavla.AVO.HP.COM (Gary Jackoway) writes:

>I don't think a compiler can make enums smaller than int and be ANSI
>compliant, since the standard says that enums are ints.

That's not quite what the standard says.  It says (3.5.2.2) that
the enum identifiers have type int, but that
    "Each enumerated type shall be compatible with an integer type;
     the choice of type is implementation-defined."

So given:
	enum color { red, yellow, blue };
The constants red, yellow, and blue each have type int, but you
cannot predict the value of sizeof(enum color).  It could be
anything from sizeof(char) to sizeof(long).
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

henry@zoo.toronto.edu (Henry Spencer) (11/07/90)

In article <9130019@hpavla.AVO.HP.COM> gary@hpavla.AVO.HP.COM (Gary Jackoway) writes:
>I don't think a compiler can make enums smaller than int and be ANSI
>compliant, since the standard says that enums are ints...

No it doesn't.  It says that each one is "compatible with an integer
type; the choice of type is implementation-defined" and that the *constants*
must have values that fit in an int.  There is actually no promise, as
nearly as I can tell, that an enumerated type is big enough to hold the
constants it is declared with!  Certainly there is no promise that
anything other than those constants will fit.  So an implementation
could easily decide that `enum { f=0, t=1 }' is represented as `char'
rather than `int'.
-- 
"I don't *want* to be normal!"         | Henry Spencer at U of Toronto Zoology
"Not to worry."                        |  henry@zoo.toronto.edu   utzoo!henry

scs@adam.mit.edu (Steve Summit) (11/08/90)

In article <1990Nov7.003126.23445@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>...each [enum] is "compatible with an integer
>type; the choice of type is implementation-defined"...
>So an implementation could easily decide that `enum { f=0, t=1 }'
>is represented as `char' rather than `int'.

I think this has been discussed before, perhaps on comp.std.c .
(I do hope it hasn't been discussed much, or recently.  I can't
remember the consensus, but it would be particularly embarrassing
for me to ask a frequently-asked question :-) .)  Does the
standard really allow a different choice to be made for different
enumerations?  Henry's word "each" (which also appears in the
standard, section 3.5.2.2) is interesting, and might suggest that
different sizes are legal, but Appendix F.3.9 (which is, to be
sure, not part of the formal Standard) notes as implementation-
defined "the integer type chosen to represent the values of an
enumeration type."

If different enums can have different sizes (which seems like a
useful license to grant the compiler) the documentation would
have to state "the algorithm by which the integral types for
enumerations are chosen," not "the [single] integral type."

                                            Steve Summit
                                            scs@adam.mit.edu

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/08/90)

In article <1990Nov8.000847.16340@athena.mit.edu>, scs@adam.mit.edu (Steve Summit) writes:
> Does the standard really allow a different choice to be made for
> different enumerations?

Consider the following program fragment:

	enum foo *p;
	... { ... f((char*)p); ... } ...
	enum foo {a,b,c};

Last time I tried that in a C program it worked.  But on a machine
where byte pointers and word pointers have different formats, the
compiler needs to know what casting p to (char*) involves.  Such a
machine's C compiler could of course always use 'int'.

-- 
The problem about real life is that moving one's knight to QB3
may always be replied to with a lob across the net.  --Alasdair Macintyre.

gwyn@smoke.brl.mil (Doug Gwyn) (11/09/90)

In article <1990Nov8.000847.16340@athena.mit.edu> scs@adam.mit.edu (Steve Summit) writes:
>Does the standard really allow a different choice to be made for different
>enumerations?

Yes.  In fact it might be nice to allot only as large an integer type as is
needed to hold all valid values (enumeration constant values for the type).

Enumeration constants themselves are int, but who cares.  It's the type we
would like to be compact.

>If different enums can have different sizes (which seems like a
>useful license to grant the compiler) the documentation would
>have to state "the algorithm by which the integral types for
>enumerations are chosen," not "the [single] integral type."

Yes, I would think that the implementation definition should be expected
to spell out how the choice is made.

henry@zoo.toronto.edu (Henry Spencer) (11/13/90)

In article <14401@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>>Does the standard really allow a different choice to be made for different
>>enumerations?
>
>Yes.  In fact it might be nice to allot only as large an integer type as is
>needed to hold all valid values (enumeration constant values for the type).

Out of curiosity, Doug, can you find anywhere where the standard promises
that the integer type *is* large enough to hold all those values?  I can't.
That's "obviously" the intent, but it doesn't seem to have made it into
the final document!
-- 
"I don't *want* to be normal!"         | Henry Spencer at U of Toronto Zoology
"Not to worry."                        |  henry@zoo.toronto.edu   utzoo!henry

gwyn@smoke.brl.mil (Doug Gwyn) (11/14/90)

In article <1990Nov13.041806.10991@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>Out of curiosity, Doug, can you find anywhere where the standard promises
>that the integer type *is* large enough to hold all those values?  I can't.
>That's "obviously" the intent, but it doesn't seem to have made it into
>the final document!

I recall somebody on X3J11 commenting on this at the last meeting.
It's the sort of thing that could use a firm interpretation ruling.
(I'm not sure that an implementation could make enums work according
to the spec if they were incapable of holding distinct representations
for all their members, but I don't have a proof ready.)