[comp.lang.c] Enum vs Define

awd@dbase.UUCP (Alastair Dallas) (06/20/87)

This may sound like a beginner's question, but I don't consider myself
a beginner:  Why use enum instead of #define?  For example, if I've
got an array of error message strings indexed with ERRMSGA, ERRMSGB
identifiers, isn't it more robust to explicitly define the values
(which must correspond to the array) than to let the compiler do it?
What are the tradeoffs involved between:

	enum {
		ERRMSGA = 1,
		ERRMSGB = 2
		};

and 

	#define ERRMSGA 1
	#define ERRMSGB 2

I'm asking because it's unusual to find two ways to do the same thing
that are effectively equal.  Any comments?

Alastair Dallas
ASHTON-TATE Glendale

These comments have nothing to do with my employer; I'm just curious.

guy%gorodish@Sun.COM (Guy Harris) (06/22/87)

> Why use enum instead of #define?  For example, if I've
> got an array of error message strings indexed with ERRMSGA, ERRMSGB
> identifiers, isn't it more robust to explicitly define the values
> (which must correspond to the array) than to let the compiler do it?

Absent a mechanism for declaring arrays with "enum"s as subscripts, the
#defines may be better *in this case*.  However, this is NOT a
typical example of the use of "enum"s.  Most uses of "enum" do not
consider the numerical value used to represent the various values of
the "enum" to be important.  In those cases, the advantages would be:

	1) Better type checking.  The compiler will complain (or,
	   one would hope, warn, even in an ANSI C compiler) about
	   mixing one "enum" type with another.  Such mixing rarely
	   makes sense (the compiler should probably allow the
	   programmer to use casts to indicate that they have
	   determined that mixing does make sense in some particular
	   case).

	2) Better interfaces with debuggers.  If you declare

		enum state { START, STATEA, STATEB, FINAL };

	   a good debugger, when asked to print the value of "state",
	   will print it as "START", "STATEA", "STATEB", or "FINAL".
	   However, if you declare

		int state;

		#define	START	0
		#define	STATEA	1
		#define	STATEB	2
		#define	FINAL	3

	   and ask it to print "state", it will print the numerical
	   value and force you to translate it to something
	   meaningful.

If you could declare arrays indexed by "enum"s, I would use them as
error message indices as well.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com
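
The contrast above can be sketched as follows. This is only an illustration, assuming the four-state example from the post; the helper names next_state and state_name are hypothetical and do not come from the thread. With the enum, the declared type records which set of values a variable is meant to hold, which is the information lint or a debugger can use. The state_name helper spells out by hand the number-to-name translation that a debugger would otherwise do for an enum.

```c
/* The enum version: the type itself names the set of legal values. */
enum state { START, STATEA, STATEB, FINAL };

/* Hypothetical helper: advance to the next state, stopping at FINAL. */
enum state next_state(enum state s)
{
    switch (s) {
    case START:  return STATEA;
    case STATEA: return STATEB;
    case STATEB: return FINAL;
    case FINAL:  return FINAL;
    }
    return FINAL;   /* not reached for valid inputs */
}

/* Hypothetical helper: the number-to-name translation written out by
   hand, as one would need with "int state" and #define'd values. */
const char *state_name(enum state s)
{
    static const char *const names[] = {
        "START", "STATEA", "STATEB", "FINAL"
    };
    if ((int)s < 0 || (int)s > (int)FINAL)
        return "?";
    return names[s];
}
```

Calling state_name(next_state(START)) yields the string "STATEA"; whether a particular compiler or lint complains about mixing different enum types varies by tool.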

franka@mmintl.UUCP (Frank Adams) (06/24/87)

In article <196@dbase.UUCP> awd@dbase.UUCP (Alastair Dallas) writes:
>Why use enum instead of #define?  For example, if I've
>got an array of error message strings indexed with ERRMSGA, ERRMSGB
>identifiers, isn't it more robust to explicitly define the values
>(which must correspond to the array) than to let the compiler do it?
[i.e.]
>	enum {
>		ERRMSGA = 1,
>		ERRMSGB = 2
>		};

This is a good example of a case where I would *not* put explicit constants
in the enum.  Suppose, having built a list of 100 error messages, the third
one becomes irrelevant and is to be removed?  This way, you have to either
manually renumber 96 error messages, leave a hole in the table, or move
another message down.

Trying to make message files consistent from release to release is a fool's
errand.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108
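
A minimal sketch of the approach described above, letting the compiler number the codes. Everything named here is an assumption made for illustration: the enum tag errcode, the three codes and their messages, the ERR_COUNT sentinel, and the error_text function. Unlike the original example, numbering starts at 0 so the codes can index the table directly.

```c
#include <assert.h>

/* The compiler assigns 0, 1, 2, ...; ERR_COUNT is a hypothetical
   sentinel that always equals the number of real codes. */
enum errcode {
    ERR_NOFILE,
    ERR_NOMEM,
    ERR_BADARG,
    ERR_COUNT
};

/* The table must list its messages in the same order as the enum. */
static const char *const errmsg[] = {
    "file not found",
    "out of memory",
    "bad argument"
};

/* Compile-time check: the array size below is -1, which is an error,
   if the table and the enum disagree in length. */
typedef char errmsg_table_matches_enum[
    ((int)(sizeof errmsg / sizeof errmsg[0]) == ERR_COUNT) ? 1 : -1];

/* Look up the message for a code, rejecting out-of-range values. */
const char *error_text(enum errcode code)
{
    assert((int)code >= 0 && (int)code < ERR_COUNT);
    return errmsg[code];
}
```

Removing a message then means deleting one enumerator and one table row; the remaining codes renumber themselves. The length check catches a forgotten row, though not two rows swapped.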

jimp@cognos.uucp (Jim Patterson) (06/29/87)

In article <196@dbase.UUCP> awd@dbase.UUCP (Alastair Dallas) writes:
>This may sound like a beginner's question, but I don't consider myself
>a beginner:  Why use enum instead of #define?  

It's often preferable to use a language feature rather than a
preprocessor feature simply because the compiler can then make
more intelligent decisions.  This also extends to the language
support systems such as debuggers.

A good illustration of this is the VAX/VMS debugger which supports
C as a high-level language.  In your example, if you declare a value
x as an enum and then in the debugger say
	EXAMINE x 
the debugger will give you an answer of ERRMSGA or ERRMSGB (assuming
it contains a valid value). If you use #define, then the debugger
will answer 1 or 2.  Having the debugger provide symbolic answers
instead of numeric ones can be a real benefit when you have a lot
of such code and no handy listing.



-- 

Jim Patterson          decvax!utzoo!dciem!nrcaer!cognos!jimp
Cognos Incorporated    

am@cl.cam.ac.uk (Alan Mycroft) (06/30/87)

In article <196@dbase.UUCP> awd@dbase.UUCP (Alastair Dallas) writes:
> (what is the difference between)
>	enum {
>		ERRMSGA = 1,
>		ERRMSGB = 2
>		};
>and 
>
>	#define ERRMSGA 1
>	#define ERRMSGB 2
>
Here's one which is facetious, but none-the-less risky if you haven't
thought carefully:
If I say
        if (ERRMSGA == ERRMSGB)
           ...
then ... is always executed in both cases.
Now, if I do
        #if (ERRMSGA == ERRMSGB)
           ...
        #endif
then the ANSI-C draft REQUIRES(!!!!!)
the first case NOT to compile ... and the second case to compile ... .
Put simply, all undefined macros (including enum constants) are treated
as 0 by the pre-processor, often (sadly) without warning.

RMRichardson.PA@Xerox.COM (Rich) (07/30/87)

In article <729@jenny.cl.cam.ac.uk> Alan Mycroft
<am@computer-lab.cambridge.ac.uk> writes:
>In article <196@dbase.UUCP> awd@dbase.UUCP (Alastair Dallas) writes:
>> (what is the difference between)
>>	enum {
>>		ERRMSGA = 1,
>>		ERRMSGB = 2
>>		};
>>and 
>>
>>	#define ERRMSGA 1
>>	#define ERRMSGB 2
>>
>Here's one which is facetious, but none-the-less risky if you haven't
>thought carefully:
>If I say
>        if (ERRMSGA == ERRMSGB)
>           ...
>then ... is always executed in both cases.
>Now, if I do
>        #if (ERRMSGA == ERRMSGB)
>           ...
>        #endif
>then the ANSI-C draft REQUIRES(!!!!!)
>the first case NOT to compile ... and the second case to compile ... .
>Put simply, all undefined macros (including enum constants) are treated
>as 0 by the pre-processor, often (sadly) without warning.

Pardon me, but I think of the four cases the first two compile and do
NOT execute (that is, (1 == 2) is false), the third compiles because the
two macro names are undefined (thus replaced by 0 and 0 == 0, which is
true) and the fourth fails to compile because (1 == 2) is false.  

From H&S (2nd ed.) pg. 41 sec. 3.5.1 The #if, #else, and #endif Commands

> ...
>     If an undefined macro name appears in the constant-expression 
> of #if or #elif it is replaced by the integer constant 0.  This 
> means that the commands "#ifdef name" and "#if name" will have 
> the same effect as long as the macro name, when defined, has a 
> constant, arithmetic, nonzero value.  We think it is much 
> clearer to use #ifdef or the defined operator in these cases, 
> but even Draft Proposed ANSI C supports the use of #if.

(Has the Draft changed?)

I assume the statement:

	enum {
		ERRMSGA = 1,
		ERRMSGB = 2
		};

would not cause the preprocessor to take ERRMSGA and ERRMSGB as anything
other than undefined macro names; thus, #if (ERRMSGA == ERRMSGB) becomes
#if (0 == 0).  

If the tests were:

        if (ERRMSGA != ERRMSGB) 
        #if (ERRMSGA != ERRMSGB) 

then I would expect the first two cases to compile and execute, the
third case to be excluded from the compile, and the fourth case to
compile and execute.  

I've been looking at this message for a week or so trying to find
something arcane that would make the claims correct, unsuccessfully.
Either the tests are backwards or I may never understand C.  (:-)  When
I read the section on #if in K&R, I thought the preprocessor might give
an error because of the undefined names, but H&S (and the Draft?) is
quite specific -- there is no warning.  

Did I miss anything here?

Rich
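
The enum half of the reading above can be checked with a small sketch. The macro PP_SAW_EQUAL and both function names are hypothetical, introduced only for this illustration. It assumes a preprocessor that follows the rule quoted from H&S, replacing undefined names in #if with 0.

```c
enum { ERRMSGA = 1, ERRMSGB = 2 };

/* The preprocessor knows nothing about enum constants: inside #if,
   ERRMSGA and ERRMSGB are undefined macro names, each replaced by 0. */
#if (ERRMSGA == ERRMSGB)
#define PP_SAW_EQUAL 1      /* branch taken, because 0 == 0 */
#else
#define PP_SAW_EQUAL 0
#endif

/* Reports which branch the preprocessor chose. */
int preprocessor_saw_equal(void)
{
    return PP_SAW_EQUAL;
}

/* Reports what the compiler proper computes: 1 == 2 is false. */
int compiler_sees_equal(void)
{
    return ERRMSGA == ERRMSGB;
}
```

Under that assumption preprocessor_saw_equal() returns 1 while compiler_sees_equal() returns 0. Swapping the enum for the two #defines should make both return 0. Some compilers can be asked to warn about undefined names in #if; GCC's -Wundef option is one example.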

oster@dewey.soe.berkeley.edu (David Phillip Oster) (07/31/87)

I missed the beginning of this discussion, so apologies in advance if
this posting just repeats issues that have already been covered.

One hard won bit of knowledge for me is how the difference between
enums and defines can screw you up:

in :
       enum {
                ERRMSGA = 1,
                ERRMSGB = 2
                };
 ERRMSGA is defined to be a subtype of the smallest type that will
hold any of these enums. Since none of these enums are very large,
you'd expect:

sizeof ERRMSGA == sizeof(char)	/* the parens are not necessary on the
			   left hand side, but they are on the right */

On the other hand, if I had said:

#define ERRMSGA 1
#define ERRMSGB 2

I would expect

sizeof ERRMSGA == sizeof(int)

Ordinarily, this fine distinction is harmless. However, the LightSpeed
C compiler for the macintosh has been augmented with a non-standard
keyword: "pascal". If you declare a function to be of type pascal, the
compiler compiles a different code preamble and postamble for the
routine, and things that call the "pascal" routine pass their
arguments to it in a different manner. In particular, the automatic
conversion of char parameters into int parameters gets suppressed. (I
mean, the normal process by which the caller passes a char, the callee
receives a char, but an int is really being passed on the stack.)

In addition, LightSpeed C acts as if the operating system calls (which
are all of type "pascal") automatically had function prototypes, i.e.
it automatically coerces char to int on operating system calls. Since
operating system calls, and ordinary C calls are handled correctly,
the above oddity will only bite you if you define a "pascal"
routine of your own.

--- David Phillip Oster            --My Good News: "I'm a perfectionist."
Arpa: oster@dewey.soe.berkeley.edu --My Bad News: "I don't charge by the hour."
Uucp: {seismo,decvax,...}!ucbvax!oster%dewey.soe.berkeley.edu

alex@umbc3.UMD.EDU (Alex S. Crain) (07/31/87)

>One hard won bit of knowledge for me is how the difference between
>enums and defines can screw you up:
>
>in :
>       enum {
>                ERRMSGA = 1,
>                ERRMSGB = 2
>                };
> ERRMSGA is defined to be a subtype of the smallest type that will
>hold any of these enums. Since none of these enums are very large,
>you'd expect:

    No, wait stop, enough. I like C. I find it gives me lots of freedom to
work in and I believe that it is my responsibility as a programmer to write
readable, maintainable code. if i felt that i needed the language to do it
for me, i would use pascal or ada. 

    Anyone who doesn't feel comfortable with an untyped language but for some 
reason must use C has my sympathies, but this is really getting out of hand.
I remember reading somewhere that there is not a 'complete' ada compiler in 
existence because no-one can afford the disk space...this is why. the team
that i work on has agreed among ourselves not to use enum, not to #define
char int as 'byte', not to define { as begin and not to use TRUE as 1. we
leave that to the guys upstairs using turbo-pascal.

 AAAA	RRRR	FFFFF	 	Do what you want to do : You will anyway.
AA  AA	RR RR	FF
AAAAAA	RRRR	FFFF	 	If found wandering aimlessly, 
AA  AA	RR RR	FF           	   feed and return to:
AA  AA	RR RR	FF	 	   alex@umbc3.umd.edu

guy%gorodish@Sun.COM (Guy Harris) (07/31/87)

>     No, wait stop, enough. I like C. I find it gives me lots of freedom to
> work in and I believe that it is my responsibility as a programmer to write
> readable, maintainable code. if i felt that i needed the language to do it
> for me, i would use pascal or ada. 

I'm sure lots of people feel they don't need the language to help
them out when writing code; after the first few coding errors of theirs are
caught by "lint" *before* they compile and run the program, they'll
probably change their mind.  (After having several kernel problems
turned up by "lint", I know I'm convinced.)

> 
>     Anyone who doesn't feel comfortable with an untyped language but for
> some reason must use C has my sympathies,

Anyone who doesn't feel comfortable with a *typed* language but for
some reason must use C has *my* sympathies; C *is* a typed language,
and I believe a strongly-typed one (albeit with weak type checking).

> I remember reading somewhere that there is not a 'complete' ada compiler in 
> existence because no-one can afford the disk space...this is why.

Because it has enumerated data types?  Give me a break!  First of
all, I'm not sure I believe the assertion in question, and second of
all, even if no such compiler exists because "no-one can afford the
disk space", I don't believe it's just because ADA is strongly
typed!  (Consider all the features ADA has: operator overloading,
generics, etc., etc., etc..)

There *do* exist strongly-typed languages with strong type-checking
that *do* have complete compilers and that, according to at least
some of their users, make their life easier for having strong
type-checking.  Mesa is one of them.

> the team that i work on has agreed among ourselves not to use enum,

Sounds like your team is cutting off their noses to spite their
faces.  I can think of three advantages to "enum" right off the bat:

	1) Given the proper fixes to "lint", it can detect some
	   coding mistakes at compile time.

	2) Debuggers like "dbx" can print "enum"s much more
	   meaningfully than they can print "int"s whose values are
	   restricted, by convention, to a member of a set of
	   #defines.

	3) Many compilers (including most PCC-based ones) will
	   automatically choose a reasonable size for an "enum"
	   variable based on the largest and smallest numbers used to
	   represent the values of that "enum".

We use both "enum"s and "lint" here, at least in the group doing OS
development.  "lint"ing the kernel catches quite a few problems.

Strong typing isn't for weak minds; the argument "strong typing is
for weak minds" is for weak minds.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

jimp@cognos.uucp (Jim Patterson) (08/11/87)

In article <19913@ucbvax.BERKELEY.EDU> oster@dewey.soe.berkeley.edu.UUCP (David Phillip Oster) writes:
>in :
>       enum {
>                ERRMSGA = 1,
>                ERRMSGB = 2
>                };
> ERRMSGA is defined to be a subtype of the smallest type that will
>hold any of these enums. Since none of these enums are very large,
>you'd expect:
>
>sizeof ERRMSGA == sizeof(char)	

I'm not familiar with any compiler that works this way. Apparently
your LightSpeed compiler does; perhaps PC compilers are more
space-conscious.  Compilers I've used always make enum constants
subtypes of a plain int. On the SUN, on VAX/VMS (using VAX-11 C) and
on a DG MV system (AOS/VS C), enum constants all occupy 4 bytes, the
same as int. This is in fact mandated by the current ANSI draft. In
section 3.1.3.3 Enumeration Constants, it says

    "An identifier declared as an enumeration constant has type int".

The compilers I've used have always made enum's (as opposed to enum
constants) of int size as well.  This is permissible but not required
by the ANSI draft, which says (section 3.5.2.3 Enumeration Specifiers):

    "The implementation may use the set of values in the enumeration to
    determine whether to allocate less storage than an int".

So, like normal integer constants, enum constants (like ERRMSGA) are
required by the ANSI draft to be of size int. In this regard, it appears
that your LightSpeed compiler is non-compliant.  Regarding the enumeration
itself, the compiler is free to make it of size int, or to make it of a
size less than an int but sufficient to contain all of its values.
-- 

Jim Patterson          decvax!utzoo!dciem!nrcaer!cognos!jimp
Cognos Incorporated
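
The distinction drawn above between an enumeration constant and an object of the enumerated type can be sketched as follows. The enum tag errmsg and the three function names are hypothetical, added only for this illustration. The sketch assumes a compiler that follows the draft wording quoted in the post.

```c
#include <stddef.h>

enum errmsg { ERRMSGA = 1, ERRMSGB = 2 };

/* An enumeration constant has type int, so this is sizeof(int). */
size_t size_of_enum_constant(void)
{
    return sizeof ERRMSGA;
}

/* What "#define ERRMSGA 1" would expand to: an int literal. */
size_t size_of_int_literal(void)
{
    return sizeof 1;
}

/* An object of the enumerated type: the size is the implementation's
   choice, so callers should not assume a particular value. */
size_t size_of_enum_object(void)
{
    return sizeof(enum errmsg);
}
```

On a conforming compiler the first two functions return the same value, so the constant and the #define do not differ in size. Only size_of_enum_object() may vary from one implementation to another.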