[comp.lang.c] Array initialization question

carey@eniac.seas.upenn.edu (Robert Carey) (05/14/91)

The SunOS C compiler seems to be doing me an unwanted favor.  If in
defining a multidimensional array of char I initialize a row using a
string which is one character longer than the row (admittedly a bad
thing to be doing - it was an accident), it inserts all the values up
to and not including the NUL byte, and does not issue a warning. (More
likely the second initializer actually clobbers the NUL byte from the
first one.)  If I try to use a string that is even one character longer
than that it prints a message and truncates the initialization string
to one less than the length of the row and inserts a NUL in the last
byte of the row.  I would have expected the compiler to give me a
warning in both cases.  Is it supposed to work this way?

Example initializing a 5 character row from a 6 character string
(5 characters plus the terminating NUL):

  $ cat foo.c
  main()
  {
      static char foo[2][5]={"12345", "67890"};

      printf("[%s][%s]\n", foo[0], foo[1]);
  }
  $ cc -o foo foo.c
  $ foo
  [1234567890][67890]
  $

Example initializing a 5 character row from a 7 character string
(6 characters plus the terminating NUL):

  $ cat foo.c
  main()
  {
      static char foo[2][5]={"123456", "678901"};

      printf("[%s][%s]\n", foo[0], foo[1]);
  }
  $ cc -o foo foo.c
  "foo.c", line 3: warning: string initializer too long
  "foo.c", line 3: warning: string initializer too long
  $ foo
  [1234][6789]
  $

gwyn@smoke.brl.mil (Doug Gwyn) (05/15/91)

In article <43075@netnews.upenn.edu> carey@eniac.seas.upenn.edu (Robert Carey) writes:
>defining a multidimensional array of char I initialize a row using a
>string which is one character longer than the row (admittedly a bad
>thing to be doing - it was an accident), it inserts all the values up
>to and not including the NUL byte, and does not issue a warning.

That's exactly what it is supposed to do.

torek@elf.ee.lbl.gov (Chris Torek) (05/15/91)

In article <43075@netnews.upenn.edu> carey@eniac.seas.upenn.edu
(Robert Carey) writes:
>      static char foo[2][5]={"12345", "67890"};

The ANSI C standard (X3.159-1989) explicitly says that if you use a
string literal as an initializer for a character array and you have
specified the size of the array and the string literal exactly fills
the array when the trailing '\0' character is dropped, that is what you
get.  Thus, this creates two adjacent blocks of 5 `char's and sets them
to:

	'1' '2' '3' '4' '5' '6' '7' '8' '9' '0'

(Some argued that this was a misfeature, but alternative proposals such
as `\c' for suppressing trailing NULs in string literals were
eventually rejected.  Some find the whole thing incomprehensible:  if
you wanted this result you could have used

	static char foo[2][5] = {
		{'1', '2', '3', '4', '5'},
		{'6', '7', '8', '9', '0'},
	};

I tend toward the latter camp; if a mechanism for omitting the trailing
NUL in string literals is to be required, I tend toward the former camp.)
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

gwyn@smoke.brl.mil (Doug Gwyn) (05/16/91)

In article <13193@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:
>(Some argued that this was a misfeature, ...

However, it was well-established existing practice in UNIX C compilers
(among others).

carey@eniac.seas.upenn.edu (Robert Carey) (05/17/91)

In article <16161@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>In article <13193@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:
>>(Some argued that this was a misfeature, ...
>
>However, it was well-established existing practice in UNIX C compilers
>(among others).

As Chris Torek correctly pointed out:

  "The ANSI C standard (X3.159-1989) explicitly says that if you use a
  string literal as an initializer for a character array and you have
  specified the size of the array and the string literal exactly fills
  the array when the trailing '\0' character is dropped, that is what you
  get."

SunOS, at least, handles this inconsistently:

    $ cat foo.c
    main()
    {
        static char foo[2][5]={"12345", "67890"};

        printf("[%s][%s]\n", foo[0], foo[1]);
    }
    $ cc -o foo foo.c
    $ ./foo
    [1234567890][67890]
    $ cat bar.c
    main()
    {
         static char foo[5]="12345";
         static char bar[]="abc";
    
         printf("%s\n", foo);
    }
    $ cc -o bar bar.c
    "bar.c", line 3: too many initializers
    $ ./bar
    ./bar: command not found

Then again, the Sun C compiler is not an ANSI compiler.  It looks like
gcc handles the second example correctly:

    $ gcc -o bar bar.c
    $ ./bar
    12345abc
    $