[comp.lang.c] union or type casting, which to use?

ik@laic.UUCP (Ik Su Yoo) (11/27/90)

(Excuse me if this question is covered in the FAQ list.)

Suppose I have the following typedefs:

  typedef struct _foo { /* foo def */ } foo;
  typedef struct _bar { /* bar def */ } bar;

I want to create a new typedef with a field to hold a pointer to
either `foo' or `bar'. Are there any pros/cons of using a union vs.
(explicit) type casting? For instance, is

  typedef struct {
    int type; /* 0 for foo pointer, 1 for bar pointer */
    union {
      foo *f;
      bar *b;
    } item;
  } oneof1;

better (or worse) than

  #include <sys/types.h>

  typedef struct {
    int     type; /* 0 for foo pointer, 1 for bar pointer */
    caddr_t item;
  } oneof2;

In the second method, the `item' field will be cast to (foo *)
or (bar *) as necessary.

One of my major needs is to be able to statically initialize the
`item' field. If the first method is better, how do I initialize the
`item' field? I don't care much about inefficiency due to run-time type
casting.

Thanks in advance.

-- 
|  Ik Su Yoo                                 |  Office: (415) 354-5584        |
|  Scientist @ Lockheed AI Center            |  Fax:    (415) 354-5235        |
|  Orgn. 96-20, Bldg. 259, 3251 Hanover St.  |  E-mail: ik@laic.lockheed.com  |
|  Palo Alto, CA  94304-1191                 |          ...!leadsv!laic!ik    |

bilbo@bisco.kodak.COM (Charles Tryon) (11/27/90)

In <931@laic.UUCP>, ik@laic.UUCP (Ik Su Yoo) asks:
> Suppose I have the following typedefs:
> 
>   typedef struct _foo { /* foo def */ } foo;
>   typedef struct _bar { /* bar def */ } bar;

  One side note:  Avoid using names beginning with underscores.  Such names
  are reserved for the implementation (the compiler and its libraries), and a
  clash could break things in ways that would be VERY hard to track down.

  I use something like this:
        typedef struct bar { /* bar def */ } Bar;
                                             ^^^
                            All my typedef'ed names start with Ucase letters.

> I want to create a new typedef with a field to hold a pointer to
> either `foo' or `bar'. Is there any pros/cons of using union vs.
> (explicit) type casting? For instance, is
> 
>   typedef struct {
>     int type; /* 0 for foo pointer, 1 for bar pointer */
>     union {
>       foo *f;
>       bar *b;
>     } item;
>   } oneof1;

  Again, in order to make things more readable later on (i.e., less prone to
  misunderstandings and errors :-), I would use a pair of #defined constants,
  or even better, an enum, for your type identifier, rather than 0/1.

        typedef enum {FOO_TYPE, BAR_TYPE} Type;
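
  A minimal sketch of how the enum tag might be used, assuming placeholder
  contents for the two structures.  The names Foo, Bar, OneOf and
  oneof_kind() are made up for illustration and are not from the original
  question:

```c
typedef struct foo { int n; } Foo;          /* placeholder contents */
typedef struct bar { const char *s; } Bar;  /* placeholder contents */

typedef enum { FOO_TYPE, BAR_TYPE } Type;

typedef struct {
    Type type;              /* says which union member is live */
    union {
        Foo *f;
        Bar *b;
    } item;
} OneOf;

/* Return a printable name for whichever member the tag selects. */
const char *oneof_kind(const OneOf *p)
{
    switch (p->type) {
    case FOO_TYPE: return "foo";
    case BAR_TYPE: return "bar";
    }
    return "unknown";       /* tag holds a value outside the enum */
}
```

  With named constants, a switch on the tag reads as a list of the cases
  handled, and many compilers can warn when an enumerator is left out.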

> One of the major need is to be able to statically initialize the
> `item' field. ...

  Hummm...  Now, after all of this, I'm not going to answer your real question.
  I don't know if you can statically initialize a union.  Sorry.  :-/

> |  Ik Su Yoo

--
Chuck Tryon
    (PLEASE use this address, as Kodak foobars one in header!)
    <bilbo@bisco.kodak.com>
    USmail: 46 Post Ave.;Roch. NY 14619                       B. Baggins
    <<...include standard disclamer...>>                      At Your Service

  "Swagger knows no upper bound, but the laws of physics remain unimpressed."
                                                            (D. Mocsny)

pds@lemming.webo.dg.com (Paul D. Smith) (11/28/90)

[] > One of my major needs is to be able to statically initialize the
[] > `item' field. ...

[]   Hummm...  Now, after all of this, I'm not going to answer your
[]   real question.  I don't know if you can statically initialize a
[]   union.  Sorry.  :-/

ANSI C allows a union to be initialized, but only through its first
member.  You do this by writing an initializer for the first member,
enclosed in braces, just as if the union were a structure, i.e.:

typedef union
{
    struct
    {
        int     u1_int;
        char    *u1_str;
    } u1;
    struct
    {
        char    u2_char;
        char    *u2_charp;
    } u2;
} MY_STRUCT;

Initialize with:

MY_STRUCT my_struct =
{                                   /* union MY_STRUCT */
    {                               /* struct u1 */
        5,                          /* u1_int */
        "blooper"                   /* u1_str */
    }
};

Note that u2 cannot be initialized.  If you don't need your program to
be portable, it is sometimes possible, with some foreknowledge of the
relative sizes etc. of the different data types, to declare the first
member to be general enough to hold all the other members, provided
they are cast correctly.

[] > I want to create a new typedef with a field to hold a pointer to
[] > either `foo' or `bar'. Are there any pros/cons of using a union vs.
[] > (explicit) type casting? For instance, is
[] > 
[] >   typedef struct {
[] >     int type; /* 0 for foo pointer, 1 for bar pointer */
[] >     union {
[] >       foo *f;
[] >       bar *b;
[] >     } item;
[] >   } oneof1;

In your case, you have no problem: the union elements are both the
same size and are guaranteed by ANSI to be compatible with a cast:
both are pointers.  In the static initialization you can simply cast
the pointer to type (foo *), like this:

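foo a_foo;                  /* objects with static storage, so their */
bar a_bar;                  /* addresses are constant expressions    */

oneof1 my_foo =
{
    0,                      /* type: foo pointer */
    {                       /* union item */
        &a_foo              /* f */
    }
};

oneof1 my_bar =
{
    1,                      /* type: bar pointer */
    {                       /* union item */
        (foo *)&a_bar       /* f: bar pointer cast to the first member's type */
    }
};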

And all will work wonderfully!
--

                                                                paul
-----
 ------------------------------------------------------------------
| Paul D. Smith                          | pds@lemming.webo.dg.com |
| Data General Corp.                     |                         |
| Network Services Development Division  |   "Pretty Damn S..."    |
| Open Network Systems Development       |                         |
 ------------------------------------------------------------------

chris@mimsy.umd.edu (Chris Torek) (11/29/90)

In article <PDS.90Nov27180241@lemming.webo.dg.com>
pds@lemming.webo.dg.com (Paul D. Smith) writes:
>In your case, you have no problem: the union elements are both the
>same size and are guaranteed by ANSI to be compatible with a cast:
>both are pointers. [details deleted, see original article]

Are you certain?

I think that it is possible to conclude, given nothing more than
X3.159-1989 and `ANSI standard nomenclature', that all struct pointers
*are* the same size, but that it is *not* possible to conclude that
they have the same format (e.g., different types might use different
bits, where struct pointers to A use the low bits and struct pointers
to B use the high bits).  I think you also cannot conclude that

	struct i { int x; };
	struct c { char x; };
	union u { struct i *i; struct c *c; } u;
	struct c foo;

	foo.x = 'a';
	u.i = (struct i *)&foo;
	printf("u.c->x = %c\n",
	    u.c->x				/* ERROR? */
	);

will work.  In particular, the line marked `ERROR?' *might* be guaranteed
to work if you change it to

	    ((struct c *)u.i)->x

since the `struct c *' cast `undoes' any effect that the `struct i *'
cast might have had (e.g., shifting the bits around to use low or high).

It *is* true that there are very few implementations, if any at all,
in which the line marked `ERROR?' will in fact fail.  But I think it is
not guaranteed.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

pds@lemming.webo.dg.com (Paul D. Smith) (11/30/90)

[] In article <PDS.90Nov27180241@lemming.webo.dg.com>
[] pds@lemming.webo.dg.com (Paul D. Smith) writes:
[] >In your case, you have no problem: the union elements are both the
[] >same size and are guaranteed by ANSI to be compatible -->with
[] >a cast<--:
[] >both are pointers. [details deleted, see original article]

[] Are you certain?  [proof deleted, see original article]

Hmm.  I see your point here.  I retract my statement about "guaranteed
by ANSI".  It would appear that there really is no truly portable way
to initialize unions of any reasonable complexity; if you don't want
to use the first element of the union then you are SOL (unless you are
making a union where all elements have the same type structure, which
could be useful in some applications).
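
One way to sidestep the representation question might be the original
poster's second method with `void *' in place of caddr_t, since ANSI C
does guarantee that an object pointer converted to `void *' and back to
its original type compares equal to the original.  A rough sketch, in
which the structure contents and all of the names (Foo, Bar, OneOf2,
foo_value, bar_value) are invented for illustration:

```c
typedef struct foo { int n; } Foo;     /* placeholder contents */
typedef struct bar { char c; } Bar;    /* placeholder contents */

enum { FOO_TYPE, BAR_TYPE };

typedef struct {
    int   type;     /* FOO_TYPE or BAR_TYPE */
    void *item;     /* really a Foo * or a Bar *, as `type' says */
} OneOf2;

Foo a_foo = { 42 };
Bar a_bar = { 'a' };

/* Addresses of objects with static storage are constant expressions,
   and any object pointer converts to void * without a cast. */
OneOf2 my_foo = { FOO_TYPE, &a_foo };
OneOf2 my_bar = { BAR_TYPE, &a_bar };

/* Convert back to the original pointer type before dereferencing. */
int  foo_value(const OneOf2 *p) { return ((Foo *)p->item)->n; }
char bar_value(const OneOf2 *p) { return ((Bar *)p->item)->c; }
```

The price is that the compiler no longer checks the pointer type, so the
tag has to be trusted at every use.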

Of course, I'd venture to say that the static initialization I posted
would work on 99% of machines available today; does *anyone* know of a
machine which uses different internal representations of pointers for
different structure types?  Also of course, it's always that last 1%
which is the kick in the pants :-)

Bummer.  Too bad ANSI couldn't come up with some more syntax for
initializing unions ... (yes I know it would be difficult and not
very compatible, and no I don't have any good ideas offhand, and I'm
not criticizing them, just thinking wistfully ... )
--

                                                                paul
-----
 ------------------------------------------------------------------
| Paul D. Smith                          | pds@lemming.webo.dg.com |
| Data General Corp.                     |                         |
| Network Services Development Division  |   "Pretty Damn S..."    |
| Open Network Systems Development       |                         |
 ------------------------------------------------------------------

karl@ima.isc.com (Karl Heuer) (12/03/90)

In article <PDS.90Nov30094546@lemming.webo.dg.com> pds@lemming.webo.dg.com (Paul D. Smith) writes:
>Bummer.  Too bad ANSI couldn't come up with some more syntax for
>initializing unions ...

More precisely, it's too bad that a better syntax, such as my proposal for
labeled initializers (posted to alt.lang.cfutures not too long ago and now in
the hands of GNU) wasn't already existing practice at the time when it could
have made a difference.  I think it's a much better solution than the kludge
X3J11 had to invent.

X3J11 wasn't particularly interested in making it possible to initialize
unions.  They just wanted to have a rule that would make uninitialized
static-duration unions have a well-defined value.
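
For what it is worth, initializers labeled by member name were later
standardized in C99 as designated initializers, which allow any member
of a union to be initialized.  A sketch, assuming a C99 compiler; the
type and object names are invented here, and this is the C99 syntax,
not necessarily that of the proposal mentioned above:

```c
typedef struct foo { int n; } Foo;     /* placeholder contents */
typedef struct bar { char c; } Bar;    /* placeholder contents */

typedef struct {
    int type;               /* 0 for Foo pointer, 1 for Bar pointer */
    union {
        Foo *f;
        Bar *b;
    } item;
} OneOf1;

Foo a_foo = { 7 };
Bar a_bar = { 'b' };

/* C99: name the member being initialized, so the second union
   member can be set directly, with no cast. */
OneOf1 my_foo = { .type = 0, .item = { .f = &a_foo } };
OneOf1 my_bar = { .type = 1, .item = { .b = &a_bar } };
```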

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint

chris@mimsy.umd.edu (Chris Torek) (12/04/90)

In article <28080@mimsy.umd.edu> I wrote:
>I think that it is possible to conclude, given nothing more than
>X3.159-1989 and `ANSI standard nomenclature', that all struct pointers
>*are* the same size, but that it is *not* possible to conclude that
>they have the same format (e.g., different types might use different
>bits, where struct pointers to A use the low bits and struct pointers
>to B use the high bits). ...
>
>	struct i { int x; };
>	struct c { char x; };
>	union u { struct i *i; struct c *c; } u;
>	struct c foo;
>
>	foo.x = 'a';
>	u.i = (struct i *)&foo;
>	printf("u.c->x = %c\n",
>	    u.c->x				/* ERROR? */
>	);

Tony Hansen sent me an interesting argument based on the idea of
`incomplete structure pointers'.  Given a module.h that declares:

	struct xyz;
	typedef struct xyz *hidden_t;
	hidden_t newxyz(void);
	void xyz(hidden_t), endxyz(hidden_t);
	#define NOXYZ ((hidden_t)0)

and a separate module that uses it, it becomes clear that at least
nil pointers to `xyz' structures must be known/knowable `outside'
the module that actually defines them.  With some stretching (and
the use of `void *' intermediaries), the argument can be extended
to non-nil pointers as well.
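
A sketch of the kind of module this argument has in mind, collapsed
into one file for brevity.  The declarations are those of module.h
above; the contents of `struct xyz', the function bodies and client()
are invented here, and xyz() is assumed to bump a counter just so that
it does something:

```c
#include <stdlib.h>

/* --- module.h: the type is incomplete as far as clients can tell --- */
struct xyz;
typedef struct xyz *hidden_t;
hidden_t newxyz(void);
void xyz(hidden_t), endxyz(hidden_t);
#define NOXYZ ((hidden_t)0)

/* --- client code: compiled knowing only the declarations above --- */
int client(void)
{
    hidden_t h = newxyz();
    if (h == NOXYZ)         /* comparing with nil needs no layout info */
        return -1;
    xyz(h);                 /* the pointer is passed around, never looked into */
    endxyz(h);
    return 0;
}

/* --- module.c: only here is the structure completed --- */
struct xyz { int count; };

hidden_t newxyz(void)
{
    hidden_t h = malloc(sizeof *h);
    if (h != NOXYZ)
        h->count = 0;
    return h;
}

void xyz(hidden_t h)    { h->count++; }
void endxyz(hidden_t h) { free(h); }
```

The client part has to store, compare and pass `struct xyz *' values
without ever seeing the structure's contents.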

There is, however, a counterargument that applies to both of these.
There is no rule that says that the compiler must generate real code
until it `has all the pieces'.  `cc -c foo.c' might write a foo.o
that contains an intermediate form, and only when foo.o is `linked'
with module.o (which actually defines `struct xyz') are the pointers
given final types, and only then would the compiler generate the
machine instructions needed to use those pointers.

This argument might even obviate the need to make all struct pointers
the same `size'.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

henry@zoo.toronto.edu (Henry Spencer) (12/05/90)

In article <1990Dec02.231041.8281@dirtydog.ima.isc.com> karl@ima.isc.com (Karl Heuer) writes:
>... I think it's a much better solution than the kludge
>X3J11 had to invent.

Once again X3J11 gets blamed for "inventing" existing practice.  Yes,
believe it or not, there was existing practice for the first-member rule,
which gave it a big edge over various better-but-untried proposals.
-- 
"The average pointer, statistically,    |Henry Spencer at U of Toronto Zoology
points somewhere in X." -Hugh Redelmeier| henry@zoo.toronto.edu   utzoo!henry