[comp.lang.c] Data initialization -- a major problem.

connors@druco.ATT.COM (ConnorsPA) (12/11/87)

There seems to be a major inconsistency in how various C compilers
interpret the braces in data initializations.

EXAMPLE 1
--------

Either:
	int a = 1;
or:
	int a = {1};

is a valid way of initializing the integer 'a' to 1.
(See page 198 of the "C Programming Language" White Book.)

No problem with this.

But what about:
	int a[2] = { 1, 2 };
or:
	int a[2] = { {1}, {2} };
with the intention of initializing the array 'a' such that a[0]=1, a[1]=2 ?

Here, some compilers will reject the second form,
with some such message as "initialization alignment error".
Others will accept it and initialize the array in the intended way.
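
To see what a particular compiler does with the second form, a small
self-checking sketch like the one below may help. It assumes a compiler
that follows the later ANSI/ISO C rule that the initializer for a scalar
may optionally be enclosed in braces; older compilers may reject it, and
current ones may warn about it. The names flat, braced, same_contents and
check_example1 are made up for illustration.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
int flat[2]   = { 1, 2 };          /* first form from Example 1 */
int braced[2] = { {1}, {2} };      /* second form: each scalar in its own braces */

/* Returns 1 when both arrays hold the same values, 0 otherwise. */
int same_contents(void)
{
    int i;

    for (i = 0; i < 2; i++)
        if (flat[i] != braced[i])
            return 0;
    return 1;
}

/* Aborts if the braced form did not produce a[0]=1, a[1]=2. */
void check_example1(void)
{
    assert(braced[0] == 1);
    assert(braced[1] == 2);
    assert(same_contents());
}
```

Calling check_example1() from main() turns a silent difference into
an immediate failure.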

EXAMPLE 2
---------

A much more serious problem occurs in the following example,
abstracted from some real code which had portability problems,
and where different compilers will SILENTLY interpret the data
in different ways:

---------------------------------------------------------------------------
struct s1 {
		int a;
		int b;
		int c;
	  };

struct s2 {
		struct s1 sarr[2];
	 };

struct s2 SS[2] = {
			{2, 3},		/* ROW A */
			{4, 5}		/* ROW B */
		  };
---------------------------------------------------------------------------

The difference between compilers is that the two rows of data, A and B,
commented in the above section of code, are interpreted in two different ways. 
Looking at the assembler generated shows that the initializations
follow one of the following methods:

                     METHOD A    METHOD B
                     --------    --------
SS[0].sarr[0].a      = 2;        = 2;
SS[0].sarr[0].b      = 3;        = 3;
SS[0].sarr[0].c      = 0;        = 0;
SS[0].sarr[1].a      = 4;        = 0;
SS[0].sarr[1].b      = 5;        = 0;
SS[0].sarr[1].c      = 0;        = 0;
SS[1].sarr[0].a      = 0;        = 4;
SS[1].sarr[0].b      = 0;        = 5;
SS[1].sarr[0].c      = 0;        = 0;
SS[1].sarr[1].a      = 0;        = 0;
SS[1].sarr[1].b      = 0;        = 0;
SS[1].sarr[1].c      = 0;        = 0;

In Method A, the compiler thinks that the two rows of data
are for SS[0].sarr[0] and SS[0].sarr[1].

In Method B, the compiler thinks that the two rows of data
are for SS[0].sarr[0] and SS[1].sarr[0].

Naively, one would expect that only one of these two forms
is the "correct" one. But which?

After a careful perusal of the White Book, I have come up
with no clear answer. I can generate several other variations
on the same problem. In general, there seems to be a lack of precision
about the interpretation of braces inside data initializations.

Whatever the answer is, we are left with the problem
that current compilers (I have looked at eight) operate under
different assumptions.

But what IS the answer?
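
Rather than reading the generated assembler, one way to find out which
method a given compiler uses is to let the program report it. The sketch
below reuses the declarations from Example 2 unchanged; which_method and
check_common_part are hypothetical helper names, not part of the original
code.

```c
#include <assert.h>

struct s1 { int a; int b; int c; };
struct s2 { struct s1 sarr[2]; };

/* Same partially braced initializer as in Example 2. */
struct s2 SS[2] = {
    {2, 3},     /* ROW A */
    {4, 5}      /* ROW B */
};

/* Hypothetical helper: returns 'A' or 'B' for the two methods
   tabulated above, or '?' if the layout matches neither. */
char which_method(void)
{
    if (SS[0].sarr[1].a == 4 && SS[0].sarr[1].b == 5 &&
        SS[1].sarr[0].a == 0 && SS[1].sarr[0].b == 0)
        return 'A';
    if (SS[1].sarr[0].a == 4 && SS[1].sarr[0].b == 5 &&
        SS[0].sarr[1].a == 0 && SS[0].sarr[1].b == 0)
        return 'B';
    return '?';
}

/* Both methods agree on where ROW A goes; abort if even that fails. */
void check_common_part(void)
{
    assert(SS[0].sarr[0].a == 2);
    assert(SS[0].sarr[0].b == 3);
    assert(SS[0].sarr[0].c == 0);
}
```

A compiler that follows the later ANSI/ISO C rules should report 'B':
each inner brace pair is taken as the complete initializer for one element
of SS, and the values inside it fill that element's members in order.
That is a statement about a standard finalized after this thread, so
older compilers may still report 'A'.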

throopw@xyzzy.UUCP (Wayne A. Throop) (12/17/87)

> connors@druco.ATT.COM (ConnorsPA)
> [...] different compilers will SILENTLY interpret the data
> in different ways:
>     struct s1 { int a; int b; int c; };
>     struct s2 { struct s1 sarr[2]; };
>     struct s2 SS[2] = { {2, 3},		/* ROW A */
>                         {4, 5}		/* ROW B */
>     };

The problem being that some compilers take the 4 to be initializing
SS[0].sarr[1].a, and others take it to be initializing SS[1].sarr[0].a.

> Whatever the answer is, we are left with the problem
> that current compilers (I have looked at eight) operate under
> different assumptions.
> But what IS the answer?

Always apply a curly-bracketed list of initializers for every aggregate,
and a non-curly-bracketed expression for every primitive type.  When all
aggregates have a curly-bracketed list, and all primitive types have
non-bracketed expressions, I don't know of a C compiler that won't
assign the values where they are intended.  For example, I compiled and
debugged this program:

    struct s1 { int a; int b; int c; };
    struct s2 { struct s1 sarr[2]; };
    struct s2 SS1[2] = {{{{ 2, 3 }, { 4, 5 }}}};
    struct s2 SS2[2] = {{{{ 2, 3 }}}, {{{ 4, 5 }}}};

    int main(){ return(0); }

with these results:

    (dbx) print SS1
    {
    {
     sarr = {
    {
     a =  000000000002,
     b =  000000000003,
     c =  000000000000
    },
    {
     a =  000000000004,
     b =  000000000005,
     c =  000000000000
    }
    }
    },
    {
     sarr = {
    {
     a =  000000000000,
     b =  000000000000,
     c =  000000000000
    },
    {
     a =  000000000000,
     b =  000000000000,
     c =  000000000000
    }
    }
    }
    } 
    (dbx) print SS2
    {
    {
     sarr = {
    {
     a =  000000000002,
     b =  000000000003,
     c =  000000000000
    },
    {
     a =  000000000000,
     b =  000000000000,
     c =  000000000000
    }
    }
    },
    {
     sarr = {
    {
     a =  000000000004,
     b =  000000000005,
     c =  000000000000
    },
    {
     a =  000000000000,
     b =  000000000000,
     c =  000000000000
    }
    }
    }
    } 
    (dbx) 
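
The same check can be made without a debugger. The following is only a
sketch: it repeats the two fully braced declarations from the program
above and asserts the values that the dbx session shows. The function
name check_full_braces is an assumption made for illustration.

```c
#include <assert.h>

struct s1 { int a; int b; int c; };
struct s2 { struct s1 sarr[2]; };

/* One brace level per aggregate: array SS, struct s2, array sarr, struct s1. */
struct s2 SS1[2] = {{{{ 2, 3 }, { 4, 5 }}}};       /* both rows in SS1[0] */
struct s2 SS2[2] = {{{{ 2, 3 }}}, {{{ 4, 5 }}}};   /* one row per SS2[i]  */

/* Hypothetical helper: aborts if a value lands anywhere other than
   where the braces say it should. */
void check_full_braces(void)
{
    /* SS1: rows go to SS1[0].sarr[0] and SS1[0].sarr[1]. */
    assert(SS1[0].sarr[0].a == 2 && SS1[0].sarr[0].b == 3);
    assert(SS1[0].sarr[1].a == 4 && SS1[0].sarr[1].b == 5);
    assert(SS1[1].sarr[0].a == 0 && SS1[1].sarr[1].a == 0);

    /* SS2: rows go to SS2[0].sarr[0] and SS2[1].sarr[0]. */
    assert(SS2[0].sarr[0].a == 2 && SS2[0].sarr[0].b == 3);
    assert(SS2[1].sarr[0].a == 4 && SS2[1].sarr[0].b == 5);
    assert(SS2[0].sarr[1].a == 0 && SS2[1].sarr[1].a == 0);

    /* Members without an initializer are zero in both cases. */
    assert(SS1[0].sarr[0].c == 0 && SS2[1].sarr[0].c == 0);
}
```

Calling check_full_braces() at the top of main() would catch a compiler
that places either row differently.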

--
I cain't git a long little doggie,
I cain't even git one that's small...
I cain't git a long little doggie,
I cain't git a doggie ay-tall.
                                        --- Yosemite Sam
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw