[comp.lang.c] Arrays of Unknown Length in Structures

jym@mit-prep.UUCP (09/22/87)

Consider the following:

	struct  STRUCT_FOO
	{
	  int  one;
	  int  two;
	  int  open[];
	};

	main()
	{
	  extern void  *malloc();

	  short int  i;
	
	  struct STRUCT_FOO  *foo_p;

	  printf("Size = %d\n",sizeof (struct STRUCT_FOO));

	  foo_p = malloc(sizeof (struct STRUCT_FOO) + (7 * sizeof (int)));

	  for (i = 0; i < 7; ++i)
	   foo_p->open[i] = i * 100;

	  for (i = 0; i < 7; ++i)
	   printf("open[%d] = %d\n",i,foo_p->open[i]);
	}

I've tried this on two systems:  VMS V4.5 running VAX C V2.3, and BSD 4.2
 running "cc".  Both do what I want them to do, though "cc" gives a warning
  message for line 6.

My question:  is this kosher?  I've seen people do this by declaring the
 struct with an array of 1 element ("int  open[1];"), but this seems a bit
  kludgy to me.  Declaring it as an array of 0 elements ("int  open[0];")
   works with "cc" (albeit with two warning messages) but VAX C won't
    compile it at all.

As far as I'm concerned, if this isn't kosher, it should be!
 <_Jym_>
-- 
jym@prep.ai.mit.edu			-- A STRANGE GAME.
--
.signature file under construction	-- THE ONLY WINNING MOVE IS
--
read at your own risk			-- NOT TO PLAY.
--
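
For readers coming to this thread later: the 1999 revision of the C Standard
added "flexible array members", which make a trailing "int open[];" legal.
The following is a minimal sketch of the original program in that style,
assuming a C99-or-later compiler. The names foo, foo_alloc and foo_fill are
made up for illustration and do not come from the post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct foo {
    int one;
    int two;
    int open[];          /* flexible array member: C99 or later, must be last */
};

/* Allocate a struct foo with room for n trailing ints (NULL on failure). */
struct foo *foo_alloc(size_t n)
{
    struct foo *p = malloc(sizeof *p + n * sizeof p->open[0]);
    if (p != NULL) {
        p->one = 0;
        p->two = 0;
    }
    return p;
}

/* Fill open[0..n-1] with i * 100, as in the original loop. */
void foo_fill(struct foo *p, size_t n)
{
    size_t i;
    assert(p != NULL);
    for (i = 0; i < n; ++i)
        p->open[i] = (int)i * 100;
}
```

A caller would do something like "p = foo_alloc(7); foo_fill(p, 7);" and
later free(p). In C99, sizeof (struct foo) does not count the open member,
which matches the "two ints" value reported by both compilers on common
implementations.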

jym@mit-prep.UUCP (09/22/87)

(*Blush*  Whoops, I forgot to mention something!  *Blush*)

Also, notice the use of sizeof.  The way I have it set up now,
 "sizeof (struct STRUCT_FOO)" is a useful value for me.  Both of
  my compilers generate a value equal to two ints.
   <_Jym_>

edw@ius1.cs.cmu.edu (Eddie Wyatt) (09/27/87)

> 	struct  STRUCT_FOO
> 	{
> 	  int  one;
> 	  int  two;
> 	  int  open[];
> 	};
> 
> 	  foo_p = malloc(sizeof (struct STRUCT_FOO) + (7 * sizeof (int)));
> 
> 	  for (i = 0; i < 7; ++i)
> 	   foo_p->open[i] = i * 100;
> 
.......
> I've tried this on two systems:  VMS V4.5 running VAX C V2.3, and BSD 4.2
>  running "cc".  Both do what I want them to do, though "cc" gives a warning
>   message for line 6.
> 
.....
> As far as I'm concerned, if this isn't kosher, it should be!

   Let me give you a reason of why this should not be "kosher".

  	    I don't think it is required in C that the fields in a structure
	    be physically ordered in the same order they are declared.
	    Hence any of the 6 permutations of the fields you describe
	    is valid as a physical offset ordering (i.e. open could have
	    an offset of 0, two could have an offset of 0 and one could
	    have an offset of 0 + sizeof(int)).  This would wreak havoc!!

-- 

					Eddie Wyatt

e-mail: edw@ius1.cs.cmu.edu

gwyn@brl-smoke.UUCP (09/28/87)

In article <243@mit-prep.ARPA> jym@mit-prep.ARPA (Jym Dyer) writes:
>	struct  STRUCT_FOO
>	{
...
>	  int  open[];
>	};
...
>	  printf("Size = %d\n",sizeof (struct STRUCT_FOO));
...
>As far as I'm concerned, if this isn't kosher, it should be!

How can it possibly be?  Just out of curiosity, what do you
think the printf() should print??

breck@aimt.UUCP (Robert Breckinridge Beatie) (09/30/87)

In article <1044@ius1.cs.cmu.edu>, edw@ius1.cs.cmu.edu (Eddie Wyatt) writes:
> > 	struct  STRUCT_FOO
> > 	{
> > 	  int  one;
> > 	  int  two;
> > 	  int  open[];
> > 	};
> > 
> > 	  foo_p = malloc(sizeof (struct STRUCT_FOO) + (7 * sizeof (int)));
> > 
> > 	  for (i = 0; i < 7; ++i)
> > 	   foo_p->open[i] = i * 100;
> > 
> .......
> > As far as I'm concerned, if this isn't kosher, it should be!
> 
>    Let me give you a reason of why this should not be "kosher".
> 
[ Reason deleted for brevity ]

Here's another possible reason (corrections welcome) that this isn't "kosher".
If the space for the array is allocated in the structure itself, then this
makes the size of the structure undefined (I think).  What would the compiler
do for an array of STRUCT_FOO?  What would be the address of bar[10] when
bar is an array of STRUCT_FOO?

This "feature" is interesting but I don't think it's interesting enough to
give up the ability to declare arrays of structures.  Also, what would a
compiler do if the structure includes more than one field of unknown size?
This seems a potentially nightmarish situation.

Now since the original article stated that "both [compilers] do what I want"
it seems that I may be making a fool of myself again.  Have I missed something
obvious?  What is really going on here?

-- 
Breck Beatie
uunet!aimt!breck

keesan@cc5.bbn.com.UUCP (09/30/87)

In article <1044@ius1.cs.cmu.edu> edw@ius1.cs.cmu.edu (Eddie Wyatt) writes:
>In article <something> someone <someone> writes:
>> 	struct  STRUCT_FOO
>> 	{
>> 	  int  one;
>> 	  int  two;
>> 	  int  open[];
>> 	};
>> 
>> 	  foo_p = malloc(sizeof (struct STRUCT_FOO) + (7 * sizeof (int)));
>> 
>> 	  for (i = 0; i < 7; ++i)
>> 	   foo_p->open[i] = i * 100;
>> 
>.....
>> As far as I'm concerned, if this isn't kosher, it should be!
>
>   Let me give you a reason of why this should not be "kosher".
>
>  	    I don't think it is required in C that the fields in a structure
>	    be physically ordered in the same order they are declared.
>	    Hence any of the 6 permutations of the fields you describe
>	    is valid as a physical offset ordering (i.e. open could have
>	    an offset of 0, two could have an offset of 0 and one could
>	    have an offset of 0 + sizeof(int)).  This would wreak havoc!!

From K&R p. 196, C Reference Manual 8.5 Structure and union declarations:

    Within a structure, the objects declared have addresses which increase as
    their declarations are read left-to-right.

The reason the above is not kosher is that int open[] declares an array of
unknown length, and therefore the enclosing structure is of unknown length and
sizeof(struct STRUCT_FOO) is undefined.  However, given the above citation from
K&R, I think the example would be completely kosher if "int open[]" were
replaced with "int open[0]".  There's nothing forbidding zero-sized arrays.
Even the 4.2BSD cc allows them, while issuing a bogus error message about
"illegal zero-sized arrays".  The published Draft Standard from X3J11
introduces this prohibition, and I complained of this in my formal comments.
(Speaking of formal comments, my form-letter reply says that the committee's
response will probably be coming in September 1987.  Any update from committee
members present?  Doug?  Courtney?)
-- 
Morris M. Keesan
keesan@bbn.com
{harvard,decvax,ihnp4,etc.}!bbn!keesan

dph@beta.UUCP (David P Huelsbeck) (10/01/87)

In article <786@cc5.bbn.com.BBN.COM> keesan@bbn.com (Morris M. Keesan) writes:
>In article <1044@ius1.cs.cmu.edu> edw@ius1.cs.cmu.edu (Eddie Wyatt) writes:
>>In article <something> someone <someone> writes:
>>> 	struct  STRUCT_FOO
>>> 	{
>>> 	  int  one;
>>> 	  int  two;
>>> 	  int  open[];
>>> 	};
>>> 
 [... some code using the above deleted ...]

>>   Let me give you a reason of why this should not be "kosher".
 [... some reasons deleted ...]

>The reason the above is not kosher is that int open[] declares an array of
>unknown length, and therefore the enclosing structure is of unknown length and
>sizeof(struct STRUCT_FOO) is undefined.  However, given the above citation from
>K&R, I think the example would be completely kosher if "int open[]" were
>replaced with "int open[0]".  There's nothing forbidding zero-sized arrays.
               ^^^^^^^^^^^^^                             ^^^^^^^^^^^^^^^^^
>-- 
>Morris M. Keesan
>keesan@bbn.com
>{harvard,decvax,ihnp4,etc.}!bbn!keesan


Hmmm?

I'm trying to imagine what possible semantics could be associated
with a zero-sized array. An array name is a constant that contains
the starting address of some pre-allocated space in memory. In other
words it's a pointer constant. What value could the compiler assign
that would point to zero pre-allocated memory locations? NULL?
And what good would it do you to have this constant?

I think the idea here was to declare a structure that could be
of unknown total size. This would be a nice thing to do when declaring
something like a function's activation record to be placed on the stack.
You know you're going to have a static link, a dynamic link, and most likely
some arguments. But how many args, and of what size? So you want a structure
with the "bottom" left open until later. I think what you need is to
declare a structure like:

	typedef struct FOO {
		int *open;
		int one;
		int two;
		int open0;
	} foo;

Then when you know that you'll need "x" elements in "open" you:

	foop = (foo *) malloc(sizeof(foo) + (x - 1) * sizeof(int));
	foop->open = &foop->open0;
	for (i=0; i < x; i++) {
		foop->open[i] = i; /* or whatever */
	}

Is this what the original poster intended?
I'll admit it looks pretty sick but I think it'd work.

How 'bout it C hacks? Is there a better way to do this?
Will this even work? I'm sorry I don't have the time to check
just now.

	David Huelsbeck
	dph@lanl.gov.arpa
	{cmcl2!ihnp4}!lanl!dph

gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/02/87)

In article <786@cc5.bbn.com.BBN.COM> keesan@bbn.com (Morris M. Keesan) writes:
>"illegal zero-sized arrays".  The published Draft Standard from X3J11
>introduces this prohibition, and I complained of this in my formal comments.
>(Speaking of formal comments, my form-letter reply says that the committee's
>response will probably be coming in September 1987.  Any update from committee
>members present?  Doug?  Courtney?)

The issue of 0-sized objects has been debated several times by X3J11,
and every time the majority opposes permitting them, even though good
arguments for them have been made.  The analogy to 0-trip loops has
not been sufficiently persuasive, so I doubt any argument will sway
the opponents.

The main use for 0-length arrays seems to be as the last member of a
struct of indefinite length.  Apparently, the committee does not want
to encourage the design of such ugly data structures, but if you
really have to have them, you can use a 1-long array member instead.
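
A minimal sketch of the 1-long array member idiom mentioned above. The names
foo1, foo1_size and foo1_alloc are hypothetical. Indexing past open[0] relies
on behaviour the Standard does not promise, although it is widely relied upon
in practice, so treat this as an illustration rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct foo1 {
    int one;
    int two;
    int open[1];         /* placeholder; the real length is chosen at allocation */
};

/* Bytes needed for a struct foo1 whose open[] holds n ints (n >= 1). */
size_t foo1_size(size_t n)
{
    assert(n >= 1);
    /* sizeof already includes one element, so add only the other n - 1. */
    return sizeof(struct foo1) + (n - 1) * sizeof(int);
}

/* Allocate such a struct; returns NULL on failure. */
struct foo1 *foo1_alloc(size_t n)
{
    return malloc(foo1_size(n));
}
```

Note that sizeof (struct foo1) is no longer "two ints" here, which is why
the allocation subtracts one element.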

The responses to formal public comments (also some of the informal
letters received) are being reviewed this month and MAY finally be
mailed, along with current drafts of the Standard and Rationale, to
correspondents in November.  Every issue raised will have an individual
response, generally with explanation when the idea was not adopted.

gwyn@brl-smoke.UUCP (10/02/87)

In article <10760@beta.UUCP> dph@LANL.GOV.ARPA (David P Huelsbeck) writes:
>I'm trying to imagine what possible semantics could be associated
>with a zero-sized array. An array name is a constant that contains
>the starting address of some pre-allocated space in memory. In other
>words it's a pointer constant. What value could the compiler assign
>that would point to zero pre-allocated memory locations? NULL?

There isn't any problem, really.  The compiler can allocate a location
for the 0-sized object.  The tricky part is that it should probably
skip at least one byte before allocating the next object anyway, so
that all distinct objects have distinct addresses.  I think this
consideration is what killed the proposal for malloc(0).

xsimon@its63b.ed.ac.uk (Simon Brown) (10/03/87)

In article <1044@ius1.cs.cmu.edu> edw@ius1.cs.cmu.edu (Eddie Wyatt) writes:
>> 	struct  STRUCT_FOO
>> 	{
>> 	  int  one;
>> 	  int  two;
>> 	  int  open[];
>> 	};
>> 
>> 	  foo_p = malloc(sizeof (struct STRUCT_FOO) + (7 * sizeof (int)));
>> 
>> 	  for (i = 0; i < 7; ++i)
>> 	   foo_p->open[i] = i * 100;
>> 
>> As far as I'm concerned, if this isn't kosher, it should be!
>
>   Let me give you a reason of why this should not be "kosher".
>  	    I don't think it is required in C that the fields in a structure
>	    be physically ordered in the same order they are declared.
>	    Hence any of the 6 permutations of the fields you describe
>	    is valid as a physical offset ordering (i.e. open could have
>	    an offset of 0, two could have an offset of 0 and one could
>	    have an offset of 0 + sizeof(int)).  This would wreak havoc!!
>

Umm, but what about the System V "msgbuf" type, used for the msg IPC system 
calls? This has type
	struct msgbuf {
		long mtype;
		char mtext[1];
	};
and is intended to be used as a template type, by malloc()'ing space of whatever
message buffer size is required, with something like
	msg = (struct msgbuf *)malloc(sizeof(struct msgbuf) + NBYTES);
which results in a message with bytes mtext[0] ... mtext[NBYTES].

If a compiler is free to alter the order of occurrence of the fields of a
structure, then this is no longer portable. 
(Yes, I realize that this has nothing to do with the C language as such, but 
it can't be just ignored!).

(BTW, certainly I'd much prefer something like "char mtext[]" or even 
"char mtext[0]", but it seems that zero-sized types are not too popular with
compilers, more's the pity :-().

    %{
       Simon.
    %}
-- 
----------------------------------
| Simon Brown                    | UUCP:  seismo!mcvax!ukc!{lfcs,its63b}!simon
| Department of Computer Science | JANET: simon@uk.ac.ed.{lfcs,its63b}
| University of Edinburgh,       | ARPA:  simon%lfcs.ed.ac.uk@cs.ucl.ac.uk
| Scotland, UK.                  |     or simon%its63b.ed.ac.uk@cs.ucl.ac.uk
----------------------------------     or simon%cstvax.ed.ac.uk@cs.ucl.ac.uk
                      "Life's like that, you know"

bts@sas.UUCP (Brian T. Schellenberger) (10/06/87)

Physical ordering in structures is, indeed, required to match the order
in which the fields are declared.  Only *bit-fields* are not required to be in
order.  ANSI requires this, and massive amounts of code would break if it
were not so; lots of code does things like:

(char *)&s->field2 - (char *)&s->field1
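
The pointer subtraction above can be sketched as a small function and checked
against the offsetof macro from <stddef.h>. The struct pair and its pad member
are made up for illustration. Both operands must be addresses of members of
the same object for the subtraction to be meaningful.

```c
#include <assert.h>
#include <stddef.h>

struct pair {
    int  field1;
    char pad;            /* hypothetical extra member, so the gap is not trivial */
    long field2;
};

/* Distance in bytes from field1 to field2 within the same object. */
ptrdiff_t field_gap(const struct pair *s)
{
    assert(s != NULL);
    return (const char *)&s->field2 - (const char *)&s->field1;
}
```

Because members are laid out in declaration order, the result is positive and
equals offsetof(struct pair, field2) - offsetof(struct pair, field1).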


-- 
                                                         --Brian.
(Brian T. Schellenberger)				 ...!mcnc!rti!sas!bts

stuart@bms-at.UUCP (Stuart D. Gathman) (10/06/87)

A variable size array can usefully be declared as the last element of 
a structure.  The size returned by sizeof is the 'minimum' size of the
structure.  Arrays of the structure are created using the minimum size, but
are rarely useful.  A minimum size of [1] works with any compiler (that
doesn't reorder structure elements).  Some compilers allow [0] or [].

A logical equivalent of the construct is to include a pointer to a variable
size array in the structure.  This form is even referenced with the same
syntax!  Putting the variable array in the structure eliminates a pointer
dereference thereby gaining some speed and simplicity of allocation.
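
A sketch of the two forms side by side, to show that element access reads the
same either way. All names here are hypothetical, and the embedded form uses
the C99 "[]" spelling rather than [1], which assumes a compiler that accepts it.

```c
#include <assert.h>
#include <stdlib.h>

struct vec_ptr {          /* variable part lives in a separate allocation */
    size_t len;
    int   *data;
};

struct vec_emb {          /* variable part lives inside the same allocation */
    size_t len;
    int    data[];
};

/* Two allocations in total: the caller's struct plus the array. Returns 1 on success. */
int vec_ptr_init(struct vec_ptr *v, size_t n)
{
    assert(v != NULL && n >= 1);
    v->data = malloc(n * sizeof *v->data);
    v->len = (v->data != NULL) ? n : 0;
    return v->data != NULL;
}

/* One allocation: header and array together. Returns NULL on failure. */
struct vec_emb *vec_emb_new(size_t n)
{
    struct vec_emb *v = malloc(sizeof *v + n * sizeof v->data[0]);
    if (v != NULL)
        v->len = n;
    return v;
}
```

In both cases an element is written as v->data[i]; the difference is that the
pointer form loads v->data first, and needs a second free().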
-- 
Stuart D. Gathman	<stuart@bms-at.uucp>
			<..!{vrdxhq|dgis}!bms-at!stuart>

thorinn@diku.UUCP (Lars Henrik Mathiesen) (10/07/87)

In article <668@its63b.ed.ac.uk> xsimon@its63b.ed.ac.uk (Simon Brown) writes:
>Umm, but what about the System V "msgbuf" type, used for the msg IPC system 
>calls? This has type
>	struct msgbuf {
>		long mtype;
>		char mtext[1];
>	};
>and is intended to be used as a template type, by malloc()'ing space of whatever
>message buffer size is required, with something like
>	msg = (struct msgbuf *)malloc(sizeof(struct msgbuf) + NBYTES);
>which results in a message with bytes mtext[0] ... mtext[NBYTES].

Just to note (again) that this may allocate space for up to
			NBYTES + sizeof(long) bytes:
The structure size will be rounded to the alignment of a long for use in
arrays - on most machines this will be stricter than sizeof(char), typically
sizeof(long). If you really care about those "wasted" bytes, you can use a
"portable" trick like this:
	struct msgbuf {
		long mtype;
		union mu {
			char mt[1];
			double manon;
		} mu;
	};
#define mtext mu.mt
#define MSGBUFBASE (sizeof(struct msgbuf) - sizeof(double))
	msg = (struct msgbuf *)malloc(MSGBUFBASE + NBYTES);

But given a structure like
	struct msgbuf { long mtype; char mflags, mtext[1]; };
the former trick would force a stricter alignment on mtext than "necessary",
so you would "have to" be unportable and write something like

#define MSGBUFBASE (sizeof(struct msgbuf) - LONGALIGN + CHARARRAYALIGN)

By the way, is there anything in the various standards that says whether
the alignment of a double (or float) has to be at least as strict as
that of any other datatype? Ditto for sizeof? If not, the first trick has
to be fixed.
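
Another way to avoid counting the trailing padding, sketched here under the
assumption that the offsetof macro from <stddef.h> is available, is to size
the allocation from the offset of mtext instead of from sizeof. The names
msgbuf1, msg_size and msg_alloc are hypothetical; msgbuf1 is used only to
avoid clashing with the real <sys/msg.h> declaration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct msgbuf1 {
    long mtype;
    char mtext[1];
};

/* Bytes needed for a message carrying nbytes of text, without the padding. */
size_t msg_size(size_t nbytes)
{
    size_t need;
    /* Guard against size_t overflow in the addition below. */
    assert(nbytes <= (size_t)-1 - offsetof(struct msgbuf1, mtext));
    need = offsetof(struct msgbuf1, mtext) + nbytes;
    /* Never allocate less than the declared struct itself. */
    return need < sizeof(struct msgbuf1) ? sizeof(struct msgbuf1) : need;
}

/* Allocate a message with room for nbytes of text; NULL on failure. */
struct msgbuf1 *msg_alloc(size_t nbytes)
{
    return malloc(msg_size(nbytes));
}
```

This does not answer the alignment question above; it only sidesteps the
rounding of sizeof for the common case of a trailing char array.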
--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcvax!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.

daveb@geac.UUCP (Dave Collier-Brown) (10/14/87)

  A bit of side-discussion on the question:  here are some examples from
real things which use both variable and fixed-size structs, and why.

  Fixed:
	This one's easy.  Unix wants to have the compiler generate
efficient code for stepping through small, possibly ordered, tables of
data. Therefore it defines p++ as "increment p by sizeof the thing p
points to", and the code generated looks like:
	add	regP,size
	compare	regX,offset+regP
	...
when searching a table of elements of size "size" for something at
offset "offset" in each element.  For example, disk-cache buffer
headers. 

  Variable:
	Not so easy.  The best example I've seen was declarations for
describing x.400-like hierarchical message structures...

   +--------------------------------------------------------------------+
   || tag || length || subtag | length | data | subtag | length | data ||
   +--------------------------------------------------------------------+
                     +------------------------+------------------------+
                        an embedded record         another one

   +--------------------------------------------------------------------+
                        the containing record.
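
A rough sketch of walking the embedded records in such a layout. The encoding
assumed here (a 1-byte tag, a 1-byte length, then that many data bytes) is
invented for illustration; real X.400 encodings are more elaborate. The
function name tlv_find is hypothetical, and the caller is assumed to have
already stepped past the containing record's own tag and length.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return a pointer to the data of the first sub-record with the given tag
 * inside buf[0..len-1], storing its data length in *out_len.
 * Returns NULL if no such sub-record exists or the buffer is malformed.
 */
const unsigned char *tlv_find(const unsigned char *buf, size_t len,
                              unsigned char tag, size_t *out_len)
{
    size_t pos = 0;
    assert(buf != NULL && out_len != NULL);
    while (len - pos >= 2) {                 /* room for a tag and a length */
        unsigned char t = buf[pos];
        size_t n = buf[pos + 1];
        pos += 2;
        if (n > len - pos)                   /* data would overrun the container */
            return NULL;
        if (t == tag) {
            *out_len = n;
            return buf + pos;
        }
        pos += n;                            /* skip to the next sub-record */
    }
    return NULL;
}
```

Each step advances by a length read from the data itself, which is exactly
what a fixed-size struct and p++ cannot express.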

Most of the other examples of variable-sized structs were attempts to
save a pointer-indirection instruction.  Since indirections are
usually cheap (even under GCOS), avoidance of them is *not* a win if
the price is hard-to-maintain variable-size struct-munging code.  And
on many machines (Vax, HW-BULL, Motorola), the cost of an indirection
in space and time works out to less than the extra calculations for
doing anything more than allocating and discarding the variable-size
structs. 

 --dave (compilers are smawt, hawold) c-b
-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

crowl@cs.rochester.edu (Lawrence Crowl) (10/15/87)

In article <1609@geac.UUCP> daveb@geac.UUCP (Dave Collier-Brown) writes:
>Most of the other examples of variable-sized structs were attempts to save a
>pointer-indirection instruction.  Since indirections are usually cheap (even
>under GCOS), avoidance of them is *not* a win if the price is hard-to-maintain
>variable-size struct-munging code.  And on many machines (Vax, HW-BULL,
>Motorola), the cost of an indirection in space and time works out to less than
>the extra calculations for doing anything more than allocating and discarding
>the variable-size structs.

Variable-size structs are needed in shared memory environments where a data
structure can be mapped at many different addresses.  For example, consider a
shared-memory implementation of a message buffer.  It has a fixed size header
followed by a variable number of message slots.  Because the message buffer may
reside at different addresses, the header cannot contain a pointer to the
variable number of slots.  This could also be solved with relative pointers.
To directly support such structures, at least one of the two methods must be
provided.
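
The relative-pointer alternative can be sketched as follows. The header
stores a byte offset instead of an address, so it stays valid wherever the
block is mapped. All names are hypothetical, and malloc plus memcpy stand in
for a real shared-memory mapping appearing at two different addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct msgq {
    size_t nslots;        /* number of int slots that follow the header */
    size_t slots_off;     /* byte offset of slot 0 from the header's own address */
};

/* Total bytes occupied by a header plus nslots slots. */
size_t msgq_bytes(size_t nslots)
{
    return sizeof(struct msgq) + nslots * sizeof(int);
}

/* Resolve the relative pointer against wherever this copy happens to live. */
int *msgq_slots(struct msgq *q)
{
    assert(q != NULL);
    return (int *)((char *)q + q->slots_off);
}

/* Build a queue in fresh memory; malloc stands in for mapping a segment. */
struct msgq *msgq_new(size_t nslots)
{
    struct msgq *q = malloc(msgq_bytes(nslots));
    if (q != NULL) {
        q->nslots = nslots;
        q->slots_off = sizeof *q;
    }
    return q;
}

/* Copy the whole block to dst, simulating the same bytes mapped elsewhere. */
struct msgq *msgq_remap(void *dst, const struct msgq *q)
{
    assert(dst != NULL && q != NULL);
    memcpy(dst, q, msgq_bytes(q->nslots));
    return dst;
}
```

Had the header held an "int *" instead of slots_off, the copy made by
msgq_remap would still point into the original block.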
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627