[comp.lang.c] Union initialization

ray@burdvax.UUCP (04/21/87)

Is it possible to initialize structures containing unions?  I have
declared some structures which contain unions, i.e.

struct xxx
{
    ...
    union yyy
    {
        char a;
        int b;
    } c;
};

Then I define some variables of this structure type and want to give them
initial values.  Is there a way to give an initial value in the variable
definition?

-------------

Ray					ray@burdvax.PRC.Unisys.COM
					...!burdvax!ray

hokey@plus5.UUCP (04/23/87)

There is no current way to perform compile-time initialization of unions.

If the braindead compile-time first-member initialization rule in the proposed
C standard gets accepted, you will have a way to perform compile-time
initialization of unions which only need to be initialized to the type of
their initial members.  In my opinion, this mechanism is *worse* than not
having *any* way to perform compile-time initialization of unions.
-- 
Hokey

greg@utcsri.UUCP (04/26/87)

In article <1722@plus5.UUCP> hokey@plus5.UUCP (Hokey) writes:
>There is no current way to perform compile-time initialization of unions.
>
>If the braindead compile-time first-member initialization rule in the proposed
>C standard gets accepted, you will have a way to perform compile-time
>initialization of unions which only need to be initialized to the type of
>their initial members.  In my opinion, this mechanism is *worse* than not
>having *any* way to perform compile-time initialization of unions.
>-- 
>Hokey

Darn Tootin, Hokey. What's the &&#%!@%? point of initializing a union
(actually, making an initialized thing use unions ) if you can't use
the union feature to initialize different parts differently?

When a structure is initialized, {}'s are placed around the initializers for
its members ( they aren't always required, but I try not to know that ).  When
a union is initialized, and the default 'first member' is used, there are
presumably (no ANSI draft HERE...) no {}'s required, since there is only one
'thing' inside a union initializer ( the thing often requiring {}'s itself).
Is this the case? Actually I seem to remember that ANSI allows redundant {}'s
around scalar initializers and around other {}'s, and therefore around
union initializers too.

How about this:

union {
	double dvar;
	int ivar[2];
	struct {
		char flob[2];
		char *(*func)();
	} svar;
} unarray[4] = {	/* this { is for the array */
.dvar =	12.00,		/* This entire line (without the comma) is the
			   initializer for unarray[0]. Since that is a union
			   object, the initializer is . <member-name> = <init>
			   where <init> is a valid initializer for an object
			   of the same type as the selected member, in this
			   case double */
.dvar =	13.100,
.ivar =	{ 1,0 },	/* the {}'s are for ivar, an array of ints */
.svar = { "?", malloc } /* the {}'s are for svar, a struct */
};			/* end the array unarray */

If no '. tag =' is given, the first member of the union is used.
So what d'y'all think? The '.' I think may be useful to allow
lexers to treat the identifier differently if they want to (once defined,
union and struct tags appear only after -> and . in C, and since they
are in a separate name space, this fact may be relied upon by some parsing
methods ).
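
For comparison, the `.member =` spelling proposed here is close to the designated initializers that C99 later standardized. The sketch below assumes a C99 compiler; the tag `un`, the member name `svar`, and the `void *(*)(size_t)` type for `func` are adjustments made for the illustration, not part of the original proposal. Note that in C99 each union element of the array gets its own braces.

```c
#include <stdlib.h>

/* C99 designated initializers applied to a union like the one above.
   The tag `un` and the ISO C prototype for func are assumptions made
   so that the example compiles cleanly today. */
union un {
    double dvar;
    int ivar[2];
    struct {
        char flob[2];
        void *(*func)(size_t);
    } svar;
};

union un unarray[4] = {
    { .dvar = 12.00 },              /* unarray[0]: the double member */
    { .dvar = 13.100 },             /* unarray[1]: the double member */
    { .ivar = { 1, 0 } },           /* unarray[2]: inner braces for the int array */
    { .svar = { "?", malloc } }     /* unarray[3]: inner braces for the struct */
};
```

With no designator, C99 keeps the ANSI rule and initializes the first member.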

The device only appears when the compiler 'knows' that a union
is being initialized. However YACC doesn't know that (barring some
strange mechanism to tell it). So it is important that such a thing
be parseable in a context-free way. I believe this thing is, since
'.' can never appear at the start of a normal expression. ( the token '.',
not the character ).

If directly nested unions are used, you can end up with things like
.member = .deeper_member = { 2, "blat" }
It is tempting to suggest that .member.deeper_member = ... be allowed,
which would be far more readable. This seems like rampantly feeping
creaturism, especially as directly nested unions are so scarcely used.

I guess it's too late to get anything done about this, right?
-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...

karl@haddock.UUCP (Karl Heuer) (04/27/87)

In article <1722@plus5.UUCP> hokey@plus5.UUCP (Hokey) writes:
>In my opinion, [ANSI's] mechanism is *worse* than not having *any* way to
>perform compile-time initialization of unions.

First off, let me state that I think that initializing a union is a perfectly
valid thing to want to do.  Let's look at some of the possibilities:

K&R says it can't be done.  This is annoying; it requires one to use run-time
initialization or non-portable kludges (including, perhaps, writing the
definition in assembly language).
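
A minimal sketch of the run-time workaround, using the struct from the original question. The `tag` member, the variable names, and `init_xxx()` are hypothetical additions for illustration only.

```c
/* Run-time initialization of a struct that contains a union.
   `tag` and init_xxx() are hypothetical; only struct xxx and
   union yyy come from the original question. */
struct xxx {
    int tag;            /* hypothetical: records which member is live */
    union yyy {
        char a;
        int b;
    } c;
};

struct xxx first, second;   /* file scope, so they start out zeroed */

/* Call once at startup, before first and second are used. */
void init_xxx(void)
{
    first.tag = 0;
    first.c.a = 'q';        /* this object uses the char member */
    second.tag = 1;
    second.c.b = 1987;      /* this object uses the int member */
}
```

The cost is that the values live in code rather than in initialized data, and every user of the variables depends on `init_xxx()` having run.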

ANSI proposes that unions be initializable, but only to their first member.
The sudden introduction of a distinguished member of a (previously unordered)
union bothers me a bit; I feel they're attacking the symptom instead of the
problem.  Also, this proposal doesn't always help: the initialization
    { a[0].asfloat = 17.5, a[1].asstring = "hello" }
still can't be performed at compile time.

A cast to union type would fix this (the example would then read
    union floatstr a[2] = { (union floatstr)17.5, (union floatstr)"hello" };
), but this could still cause problems -- if a union includes a char member as
well as an int member, how does one initialize the char?  (Recall that C has
no rvalues of type char.)
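
One way to see the difficulty, as a small sketch: a character constant in C has type int, and a char operand is promoted to int in arithmetic, so an initializer expression gives a cast little to distinguish a char member from an int member. The function names are hypothetical.

```c
/* A character constant such as 'a' has type int in C (unlike C++),
   so its size matches sizeof(int). */
int char_constant_is_int_sized(void)
{
    return sizeof 'a' == sizeof(int);
}

/* A char operand is promoted to int before arithmetic, so the
   expression c + 0 is int-sized as well. */
int char_promotes_to_int(void)
{
    char c = 'a';
    return sizeof(c + 0) == sizeof(int);
}
```

Both functions return 1 when the code is compiled as C.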

Here's my new proposal:

I think what I'd like to see is a syntax for initializing an aggregate by name
rather than position.  Something like
    union floatstr a[2] = { { asfloat: 17.5 }, { asstring: "hello" } };
(the outer braces are for the array, the inner ones for the union).  This
would also apply to structs, so one could initialize a struct without having
to know the number or order of the members:
    struct tm today = { tm_year: 87, tm_mon: 4, tm_mday: 27 };
(Other elements would, as usual, be initialized to zero (if the struct is
static) or garbage (if auto).)  This could even be extended to arrays:
    int a[30] = { 29: -1 }; /* an array with a -1 in the last position */
I would find this last feature useful when dealing with an array which is
logically subscripted by an enum:
    int val[NCOLORS] = { RED: 0x00f, GREEN: 0x0f0, WHITE: 0xfff };

The primary objection to this scheme, I think, is that it is a significant
addition to the language.
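
For comparison, C99 later standardized initialization by name and by index, spelled `.member =` and `[index] =` rather than `member:`. A sketch of the examples above in that spelling, assuming a C99 compiler; the enum `color` and the array name `last` are made up for the illustration.

```c
#include <time.h>

enum color { RED, GREEN, WHITE, NCOLORS };

/* Members and elements that are not named are initialized to zero,
   for automatic objects as well as static ones. */
struct tm today = {
    .tm_year = 87,      /* years since 1900 */
    .tm_mon  = 3,       /* months count from 0, so April is 3 */
    .tm_mday = 27
};

int last[30] = { [29] = -1 };   /* an array with a -1 in the last position */

int val[NCOLORS] = { [RED] = 0x00f, [GREEN] = 0x0f0, [WHITE] = 0xfff };
```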

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

chris@mimsy.UUCP (Chris Torek) (04/28/87)

In article <455@haddock.UUCP> karl@haddock.UUCP (Karl Heuer) writes:
>ANSI proposes that unions be initializable, but only to their first member.
>The sudden introduction of a distinguished member of a (previously unordered)
>union bothers me a bit; I feel they're attacking the symptom instead of the
>problem.

To refresh memories (in some cases) or provide background (in others),
the problem, or symptom, is this:  Uninitialised variables that have
static allocation (that is, globals and local static variables) are, in
C, defined as though they were initialised to `zero'.  That is, given
foo.c:

	% cat foo.c
	int i;
	double d;
	char *cp;
	%

code compiled with foo.c must perform equivalently to code
compiled with this:

	int i = 0;
	double d = (double) 0;		/* casts included for clarity */
	char *cp = (char *) 0;		/* (C does not require them) */

In most existing systems, this is trivial because 0 and 0.0 and
(char *)0 are all all-zero bit patterns, and operating systems or
runtime startups need know only how many bits of zeros are required
(the infamous `bss' space).  There are machines, though, in which
not all of these are all zero bit patterns; a Lisp machine with
tagged pointers might take the source to mean

	four_byte_object "i" 0, 0, 0, 0			# integer 0
	eight_byte_object "d" 0, 0, 0, 0, 0, 0, 0, 0	# 0 exponent
	five_byte_object "cp" 0, 0, 0, 0, 3		# 3 = nil tag

in which case `i' and `d' might be put in bss space, but cp would
be in initialised data space.

This leaves us with the problem of unions:

	union {
		int i;
		double d;
		char *cp;
	} u;

What can our tagged-pointer Lisp machine do?  It can make u.i 0,
or it can make u.d 0.0, or it can make u.cp NULL, but it cannot
possibly do all three, for the last two conflict.

The dpANS assigns meaning to this case by using a `first member'
rule:  On our Lisp machine, we set u.i = 0, and the other members
are undefined.  Another potential answer is to say that *none* of
the members are defined.
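
A short sketch of the first-member rule as it applies to a static union with no initializer. The tag `value`, the variable names, and the helper function are hypothetical.

```c
union value {
    int i;
    double d;
    char *cp;
};

static union value u;           /* no initializer: the first member, i, is zeroed */
static union value v = { 0 };   /* the explicit form of the same thing */

/* Only the first member is guaranteed; d and cp happen to be zero or
   null only on machines where those values are all-zero bit patterns. */
int first_member_is_zero(void)
{
    return u.i == 0 && v.i == 0;
}
```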

Even if we were to introduce aggregate constants, including union
constants, we would still have to define uninitialised global or
static union values:  Which member or members are in fact initialised?
You may not like the answer in the dpANS, but there must be an
answer.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP:	seismo!mimsy!chris

hokey@plus5.UUCP (Hokey) (04/28/87)

My preference is for explicit casts instead of using the member names.

No matter; different strokes.  Both could be supported.

I like to think I could come up with examples, but I'm too burned out
right now.
-- 
Hokey

bzs@bu-cs.BU.EDU (Barry Shein) (04/28/87)

My relatively worthless 2c:

union foo {
	int a;
	char *p;
	double foo;
} u[] = {
	{0,,},		/* init the int */
	{,NULL,},	/* init the char * */
	{,,0}		/* init the double */
};

Or, similarly:

union foo goo[] = {
	{0,void,void},		/* init the int */
	{void,NULL,void},	/* init the char * */
	{void,void,0}		/* init the double */
};

or very similarly:

union foo goo[] = {
	{0,(void),(void)},	/* init the int */
	{(void),NULL,(void)},	/* init the char * */
	{(void),(void),0}	/* init the double */
};

Seems all that is needed is a way to specify positional parameters,
any of these would do and all have slightly different historical
precedent (the first ",," not C, but macro assemblers.) I like the
first for its brevity (tho it would be error prone) and the last two
for their C-ish style.

Anyhow, that and several hundred million $$ will pay off the nat'l debt.

	-Barry "voidoid" Shein, Boston University

brett@wjvax.UUCP (Brett Galloway) (04/30/87)

In article <6483@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>To refresh memories (in some cases) or provide background (in others),
>the problem, or symptom, is this:  Uninitialised variables that have
>static allocation (that is, globals and local static variables) are, in
>C, defined as though they were initialised to `zero' [example]
>In most existing systems, this is trivial because 0 and 0.0 and
>(char *)0 are all all-zero bit patterns, and operating systems or
>runtime startups need know only how many bits of zeros are required
>(the infamous `bss' space).  There are machines, though, in which
>not all of these are all zero bit patterns ...

This has always bugged me about C -- the requirement that uninitialized
variables be implicitly initialized to `zero' is problematic.  Where
binary 0000...000 is not `zero' for every type, it is wasteful, increasing
the size of binaries.  It is even more wasteful for embedded applications
(on PROM) where static initialization must be done at run-time.

The problem is that MOST of the time, the fact that uninitialized statics
are initialized to `zero' is NOT used.  It would have been better if K&R
had not imposed this requirement.  It would also have made feasible the
(I think) better solution to union and structure initialization proposed by
Karl Heuer.

Oh well ... wishful thinking.
-- 
-------------
Brett Galloway
{pesnta,twg,ios,qubix,turtlevax,tymix,vecpyr,certes,isi}!wjvax!brett

dick@cs.vu.nl (Dick Grune) (05/04/87)

A union looks like a struct, acts like a struct and quacks like a struct,
except that only one member can be initialized at any one given time. So
by analogy, a strong force in the universe:

union {
	int i;
	char ch1;
	char ch2;
} u = {, 'X', };	/* which initializes the ch1 */

					Dick Grune
					Vrije Universiteit
					de Boelelaan 1081
					1081 HV  Amsterdam
					the Netherlands
					dick@cs.vu.nl
					...!mcvax!vu44!dick

john@viper.UUCP (John Stanley) (05/07/87)

In article <7023@bu-cs.BU.EDU> bzs@bu-cs.BU.EDU (Barry Shein) writes:
 >
 >My relatively worthless 2c:
 >
 >union foo {
 >	int a;
 >	char *p;
 >	double foo;
 >} u[] = {
 >	{0,,},		/* init the int */
 >	{,NULL,},	/* init the char * */
 >	{,,0}		/* init the double */
 >};
 >

  How about the following? (after I change 2nd foo to baz to avoid confusion):

  union foo
	{
	int a;
	char *p;
	double baz;
	}
     fu[4] =
	{
	 {        0},	/* Defaults to init the 1st element (int) ((ANSI)) */
	 {a:      0},	/* init the int */
	 {p:   NULL},	/* init the char pointer */
	 {baz:  0.0}	/* init the double */
	};

I don't know about you, but this looks very readable to me....

	
--- 
John Stanley (john@viper.UUCP)
Software Consultant - DynaSoft Systems
UUCP: ...{amdahl,ihnp4,rutgers}!{meccts,dayton}!viper!john

hascall@atanasoff.cs.iastate.edu (John Hascall) (02/18/89)

  Does 'ANSI' C allow for union initialization?  If not, why not?

  John Hascall / ISU Comp Center

chris@mimsy.UUCP (Chris Torek) (02/19/89)

In article <816@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu
(John Hascall) writes:
>Does 'ANSI' C allow for union initialization?

Because all static and global variables are initialised to 0 (cast to
the appropriate type), the pANS *must* allow for union initialisation,
else how could one talk about the initial value of a static or global
union?  But since members of a union may overlay one another, and
a 0 of one type may not match a zero of another, there must be some
rule for deciding *which* union member(s) are to be zero.

The rule is (perhaps overly) simple: the first member of the union
is initialised.  Given

	union { float f; int i; } u;

u.f is 0.0, and u.i is indeterminate.  You may write

	union { float f; int i; } u = { 1.0 };

to set u.f, but you cannot initialise u.i since it is not the first
member.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

bill@twwells.uucp (T. William Wells) (02/19/89)

In article <816@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
:
:   Does 'ANSI' C allow for union initialization?

Yes, two ways:

1) Initializing via the first member:

union FOO {
	int     bar;
	char    *gak;
} Bletch = { 42 };

2) Initializing from a union of compatible type:

func()
{
	union FOO frab = Bletch;
	...
}

Note the brace use: braces are required for the first way and
forbidden for the second.
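
Both ways together, as a compilable sketch. `copy_bar()` is a hypothetical function wrapped around the second form so that the result can be inspected.

```c
union FOO {
    int     bar;
    char    *gak;
};

union FOO Bletch = { 42 };      /* way 1: braces, sets the first member */

int copy_bar(void)
{
    union FOO frab = Bletch;    /* way 2: no braces, copies the whole union */
    return frab.bar;
}
```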

---
Bill
{ uunet!proxftl | novavax } !twwells!bill

henry@utzoo.uucp (Henry Spencer) (02/19/89)

In article <816@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>  Does 'ANSI' C allow for union initialization?  If not, why not?

Yes it does.  But you won't like it.  Initialization of a union is
initialization of the first member.  Nobody thinks this is wonderful,
but it does at least define (e.g.) the initial value of a static union,
and it has the virtue that it has been implemented and found workable.
Designing a more general facility is tricky -- you really have to name
the member you're initializing, it can't always be guessed from the type
of the value -- and there doesn't seem to be a desperate need for it,
since we've been living without it for a long time.  I believe there
were some union-initialization proposals to X3J11, none of which got
enough support to make it in.
-- 
The Earth is our mother;       |     Henry Spencer at U of Toronto Zoology
our nine months are up.        | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gwyn@smoke.BRL.MIL (Doug Gwyn ) (02/19/89)

In article <816@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>  Does 'ANSI' C allow for union initialization?  If not, why not?

Yes, it does.  The initializer is for the first member of the union.

Nobody suggested an acceptable way to specify initializing members
other than the first.  Please don't post your suggestion here, as
it's probably already been seen and rejected.

wald-david@CS.YALE.EDU (david wald) (02/20/89)

In article <16019@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <816@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu
>(John Hascall) writes:
>>Does 'ANSI' C allow for union initialization?
>
>The rule is (perhaps overly) simple: the first member of the union
>is initialised.  Given
>
>       union { float f; int i; } u;
>
>u.f is 0.0, and u.i is indeterminate.  You may write
>
>       union { float f; int i; } u = { 1.0 };
>
>to set u.f, but you cannot initialise u.i since it is not the first
>member.

I wonder...

Yes, this question deals with some hypothetical C' or C+=2 (not quite D,
since it's a language extension rather than a revision), but...

Would it make the syntax more ambiguous to have allowed

union { float f; int i; } u.i = {1};

?

The only difficulty comes in extending this to structures containing
unions:

struct { char *cp; union { float f; int i; } u; } s = { NULL, ????};

Any suggestions?
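
For reference, a sketch of what the struct case looks like under the ANSI first-member rule, and how C99 designated initializers later answered the question. The tag `holder` and the variable names are hypothetical, and the last two definitions assume a C99 compiler.

```c
#include <stddef.h>

struct holder {
    char *cp;
    union { float f; int i; } u;
};

/* ANSI C: the nested braces initialize the union's first member, u.f. */
struct holder s1 = { NULL, { 1.0f } };

/* C99: a designator inside the nested braces picks the member ... */
struct holder s2 = { NULL, { .i = 1 } };

/* ... or the whole path is written as one designator; cp is then
   initialized to a null pointer because it is not named. */
struct holder s3 = { .u.i = 1 };
```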


============================================================================
David Wald                                              wald-david@yale.UUCP
waldave@yalevm.bitnet                                 wald-david@cs.yale.edu
"A monk, a clone and a ferengi decide to go bowling together..."
============================================================================

dg@lakart.UUCP (David Goodenough) (02/21/89)

From article <51116@yale-celray.yale.UUCP>, by wald-david@CS.YALE.EDU (david wald):
>>u.f is 0.0, and u.i is indeterminate.  You may write
>>
>>       union { float f; int i; } u = { 1.0 };
>>
>>to set u.f, but you cannot initialise u.i since it is not the first
>>member.
> 
> I wonder...
> 
> Yes, this question deals with some hypothetical C' or C+=2 (not quite D,
> since it's a language extension rather than a revision), but...
> 
> Would it make the syntax more ambiguous to have allowed
> 
> union { float f; int i; } u.i = {1};

Try:

union
 {
    float f;
    int i;
    char *c;
 } u[3] =
 {
    { 1.0 ; ; },	/* could also be { 1.0 } - trailing ; are optional */
    { ; 76 ; },		/* ditto: { ; 76 } would also be OK */
    { ; ; "STUG" }
 };

Not ambiguous at all - it is left as an exercise to the reader to figure
out what is happening. If you don't like the overloaded ';' then I suggest
you start bitching about ',' - that is overloaded far worse. :-P

BTW ....= { 1.0 ; 76 ; }, .....

WOULD NOT BE ALLOWED - it would generate some sort of complaint.
-- 
	dg@lakart.UUCP - David Goodenough		+---+
						IHS	| +-+-+
	....... !harvard!xait!lakart!dg			+-+-+ |
AKA:	dg%lakart.uucp@xait.xerox.com		  	  +---+

karl@haddock.ima.isc.com (Karl Heuer) (02/24/89)

In comp.lang.c article <437@lakart.UUCP> dg@lakart.UUCP (David Goodenough)
suggests that union initialization could be done thus:
>union { float f; int i; char *c; } u[3] = {
>    { 1.0 ; ; },	/* could also be { 1.0 } - trailing ; are optional */
>    { ; 76 ; },	/* ditto: { ; 76 } would also be OK */
>    { ; ; "STUG" }
>};

Not bad, but I see no reason to use semicolon as the separator.  Comma is
already used in this context for struct initializers, so it would be more
consistent to use the same token for unions.  (No, this would not conflict
with the comma operator.)

In fact, one could even generalize the notation to allow missing expressions
in a struct or array initializer: int a[3]={3, ,5} would initialize a[0]=3,
a[2]=5, and leave a[1] uninitialized (zero or garbage, depending on storage
duration).

However, initializing by position may be a mistake anyway; it requires the
application to know the order of the members.  This information is often
deliberately left unspecified.  If we're going to tweak the language, let's
try to do it in a way that will assist with this sort of data abstraction.

Would the following be workable?
	int a[3] = { 0: 3, 2: 5 };
	union { float f; int i; char *c; } u[3] = {
	    { f: 1.0 },
	    { i: 76 },
	    { c: "STUG" }
	};

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
Followups to comp.lang.misc; we're talking `D' again.

ch@maths.tcd.ie (Charles Bryant) (02/25/89)

In article <51116@yale-celray.yale.UUCP> wald-david@CS.YALE.EDU (david wald) writes:
>In article <16019@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>In article <816@atanasoff.cs.iastate.edu> hascall@atanasoff.cs.iastate.edu
>>(John Hascall) writes:
>>>Does 'ANSI' C allow for union initialization?
>>
.
.
.
>Any suggestions?

Given:
	If a union has two members with the same type, the compiler need
	not distinguish between them.

How about:
	union {
		float f;
		double d;
		int i;
		char c;
	} foo = { 1 };		/* initialises i */

OR	} foo = { (float) 1.1 };	/* f */
OR	} foo = {(double) 1.1 };	/* d */
OR	} foo = { (char) 'a'};		/* c */

Perhaps this would be too much of a special case for the compiler (it
otherwise doesn't need to know that an expression is of type 'char' for
instance).

-- 

		Charles Bryant.
Working at Datacode Electronics Ltd.

gwyn@smoke.BRL.MIL (Doug Gwyn ) (02/28/89)

In article <609@maths.tcd.ie> ch@maths.tcd.ie (Charles Bryant) writes:
>How about: ...

Look, I asked you guys not to propose ways of doing this.  All the
ones posted so far have already been evaluated by X3J11 and found
wanting in one way or another.  It's rather a waste of your time
to worry further about this.

henry@utzoo.uucp (Henry Spencer) (02/28/89)

In article <609@maths.tcd.ie> ch@maths.tcd.ie (Charles Bryant) writes:
>[proposal in which compiler guesses which member based on type of initializer
>expression]
>
>Perhaps this would be too much of a special case for the compiler...

It's too much of a special case for the language designer, too, I'm afraid.
This particular suggestion always seems to come up.  It doesn't work very
well.  How do you initialize a struct inside a union?  (Cast to the struct
type?  Now we have a unique situation in which such casts are legal.)  What
about a union inside the union?  Is a string an initializer for a "char *"
or a "char []"?  Do implicit conversions get done?  (If so, chaos.  If not,
we now have a unique situation in which they aren't.)  If there are both
int and long members, which one gets initialized if the initializer is,
say, 75000 (the type of which is implementation-dependent)?  Is the cast
mandatory?  (If so, we now have a unique etc. etc.)
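
The int-versus-long point can be checked directly. This sketch assumes a C11 compiler for `_Generic`; the macro `TYPE_NAME` and both function names are made up for the illustration.

```c
#include <limits.h>

/* Map an expression to the name of its type (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x), int: "int", long: "long", default: "other")

/* An unsuffixed decimal constant takes the first type that can hold
   it, starting with int, so the answer depends on the width of int. */
const char *type_of_75000(void)
{
    return TYPE_NAME(75000);
}

/* True where int is wide enough to hold 75000. */
int int_can_hold_75000(void)
{
    return INT_MAX >= 75000;
}
```

With a 32-bit int the constant is an int; with a 16-bit int it is a long. A rule that picks the union member from the type of the initializer would therefore pick different members on different machines.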

There are just too many problems with guessing member based on type.  It
really has to be done by name or position.
-- 
The Earth is our mother;       |     Henry Spencer at U of Toronto Zoology
our nine months are up.        | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

friedl@vsi.COM (Stephen J. Friedl) (03/03/89)

In article <9733@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
> Look, I asked you guys not to propose ways of doing this.  All the
> ones posted so far have already been evaluated by X3J11 and found
> wanting in one way or another.  It's rather a waste of your time
> to worry further about this.

Some of us are not wizards of Committee quality, and as such we
don't really understand the issue as clearly as Doug obviously
does.  There is a difference between "OK folks, why can't XXX work
for this?" and "I propose XXX for the Standard".  The former can
be a very valuable tool for expanding one's understanding of the
deeper issues of language design.

In the past I've been generally favorable to a posting here or
there on how this or that feature in the language would be a nice
thing (hey, even /noalias/ looked plausible when I first read about
it).  Later, though, some helpful soul shows how it is not
portable or cannot be extended in the general case or exposes some
other fundamental flaw.  After a while, I can start to see these
kinds of problems myself: I think this is called "learning".

Saying "We said no so don't think about it anymore" does wonders
to foster enlightenment.

     Steve

-- 
Stephen J. Friedl / V-Systems, Inc. / Santa Ana, CA / +1 714 545 6442 
3B2-kind-of-guy   / friedl@vsi.com  / {attmail, uunet, etc}!vsi!friedl

    "vi2000: the editor of the 21st century" -- Dr. Bertrand Meyer

austin@scallion.ucdavis.edu (Darren Austin) (05/30/89)

Hello,
	I have an application that uses unions for one of the
major data structures.  I would like to be able to initialize one
of these unions similar to the way that you can with arrays and
structures, for example:

struct foo {
	int a;
	char b[10];
};

struct foo f = { 4, "hello" };

is allowed, but when I tried a similar method to initialize a
union, the compiler complained.  

Is there a way to initialize static unions in C?  A quick glance
at K&R didn't yield any information on the subject.  I assume
that means that it is not possible. If it isn't, are there ways
around this problem, or am I just missing something very basic?
Any help would be most appreciated, but please SEND RESPONSES TO
ME THROUGH E-MAIL.  I do not want to start another
silly/stupid question flame fest.

Thanks,
--Darren
--
--------------------------------------+-------------------------------
Darren Austin                         | It is a mistake to think you 
UC Davis Division of Computer Science | can solve any major problem
austin@clover.ucdavis.edu	      | just with potatoes.
--------------------------------------+-------------------------------