[comp.lang.c] Zero Length Arrays Allowed in C Standard?

baalke@mars.jpl.nasa.gov (Ron Baalke) (12/01/89)

I've inherited some C code that had the following declaration in it:

     char tbi[0];

When I tried to compile this using Turbo C v2.0 or VAX C, it was flagged as
a fatal error. My question is this: are zero length arrays allowed in the
ANSI standard for C?

 Ron Baalke                       |    baalke@mars.jpl.nasa.gov 
 Jet Propulsion Lab  M/S 301-355  |    baalke@jems.jpl.nasa.gov 
 4800 Oak Grove Dr.               |
 Pasadena, CA 91109               |

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/01/89)

In article <2298@jato.Jpl.Nasa.Gov> baalke@mars.jpl.nasa.gov (Ron Baalke) writes:
>are zero length arrays allowed in the ANSI standard for C?

No; Standard C does not support zero-sized objects.

I'm POC for a zero-sized object special interest group,
but frankly there has been little activity since the
committee's consensus was clearly against such objects.

You might consider changing the [0]s to [1]s and where
the allocation/pointer trickery occurs in the code
making an adjustment for the additional byte in the
object type.

bruno@sdcc10.ucsd.edu (Bruce W. Mohler) (12/01/89)

In article <11715@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
 >I'm POC for a zero-sized object special interest group,
 >but frankly there has been little activity since the
 >committee's consensus was clearly against such objects.

Wouldn't this kind of group, because of its purpose, have 
zero members?  (Not counting the POC (which is, after all,
just a pointer)   :-)
--
Bruce W. Mohler
Systems Programmer (aka Staff Analyst)
bruno@sdcc10.ucsd.edu
voice: 619/586-2218

ark@alice.UUCP (Andrew Koenig) (12/01/89)

In article <11715@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:

> I'm POC for a zero-sized object special interest group,

Does it have any members?  :-)
-- 
				--Andrew Koenig
				  ark@europa.att.com

rbutterworth@watmath.waterloo.edu (Ray Butterworth) (12/02/89)

In article <11715@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>I'm POC for a zero-sized object special interest group,
>but frankly there has been little activity since the
>committee's consensus was clearly against such objects.
>
>You might consider changing the [0]s to [1]s and where
>the allocation/pointer trickery occurs in the code
>making an adjustment for the additional byte in the
>object type.

That of course leads to problems if anyone does
    s = (Structure *)malloc(sizeof(Structure));
without knowing about this special kludge in the definition,
or if anyone does any of the other things that are now being discussed
in comp.std.unix about the problems associated with the structure
in <dirent.h>.

====

If you want a suggestion for your SIG, has this been thought of before?
(probably only a few hundred times)

If object[0] were made legal, it should have the restriction
that its size can never be known by the compiler.
i.e. sizeof(object) should never be 0.
This implies that if it appears in a structure it must be the last member,
and the size restriction is inherited by that structure.
(sort of like how incomplete types are ok if you don't do anything with them.)

    typedef struct { int x; int y[]; } Struct;
    Struct *p;
would make each of these six lines illegal:
    auto int y[0];
    static Struct s;
    func(s)
    malloc(sizeof(Struct))
    ++p
    p[NON_ZERO]
since all of these require the size of the object.

One could of course have
    extern Struct s;
    extern int z[0];
    func(s, &s, p->x, &s.y[0], s.y);
since the size isn't needed.

That would solve the current POSIX argument about dirent.h's
unusable structure.  Since doing anything funny with the structure
would be a compiler error, it would not be necessary to explain
why one shouldn't do funny things with the structure.

====

By the way, PCC and GCC both compile this and print "4".
i.e. they treat an undefined structure as having zero size.
Sounds both useless and dangerous to me.

    struct s;
    struct {
        int x;
        struct s;
    } z;
    
    main() {
        printf("%d\n", sizeof(z));
    }

bret@codonics.COM (Bret Orsburn) (12/02/89)

In article <11715@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <2298@jato.Jpl.Nasa.Gov> baalke@mars.jpl.nasa.gov (Ron Baalke) writes:
>>are zero length arrays allowed in the ANSI standard for C?
>
>No; Standard C does not support zero-sized objects.
>

Aargh! Whatever happened to "don't break existing code"?!

What was the rationale behind this (IMHO) arbitrary obstruction?




-- 
-------------------
bret@codonics.com
uunet!codonics!bret
Bret Orsburn

news@ism780c.isc.com (News system) (12/02/89)

In article <2298@jato.Jpl.Nasa.Gov> baalke@mars.jpl.nasa.gov (Ron Baalke) writes:
:
:I've inherited some C code that had the following declaration in it:
:
:     char tbi[0];
:
:When I tried to compile this using Turbo C v2.0 or VAX C, it was flagged as
:a fatal error. My question is this: are zero length arrays allowed in the
:ANSI standard for C?

No. Here is the quote from the standard of Dec 7, 1988 (page 68, line 3):

    The expression delimited by [ and ] (which specifies the size of
    an array) shall be an integral constant expression that has a value
    greater than zero.

I believe this is one of the hot issues debated by the committee.

  Marv Rubinstein

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/02/89)

In article <5486@sdcc6.ucsd.edu> bruno@sdcc10.ucsd.edu (Bruce W. Mohler) writes:
>In article <11715@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
> >I'm POC for a zero-sized object special interest group,
> >but frankly there has been little activity since the
> >committee's consensus was clearly against such objects.
>Wouldn't this kind of group, because of its purpose, have 
>zero members?  (Not counting the POC (which is, after all,
>just a pointer)   :-)

One nifty use for zero-sized objects would be to store the information
content of postings like that :-)

Anybody remember the Signetics WOM ad?

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/02/89)

In article <480@codonics.COM> bret@codonics.com (Bret Orsburn) writes:
>In article <11715@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>>In article <2298@jato.Jpl.Nasa.Gov> baalke@mars.jpl.nasa.gov (Ron Baalke) writes:
>>>are zero length arrays allowed in the ANSI standard for C?
>>No; Standard C does not support zero-sized objects.
>Aargh! Whatever happened to "don't break existing code"?!
>What was the rationale behind this (IMHO) arbitrary obstruction?

Many existing C compilers do not support 0-sized objects,
so this is not a "change to C".  K&R1 was not explicit about this
and so offers no guidance; indeed, taken literally K&R1 prohibits
use of the name of a 0-length array in expressions (although the
C Standard fixed the thing that caused a problem there), and also
permits negative constants for array lengths in declarations.

There are several technical problems that would have to be overcome
if 0-sized objects were allowed in C.  (I don't want to discuss them
in this forum.)  As a proponent of zero-sized objects, I don't think
these obstacles are insurmountable, but given the lack of a clear
need, X3J11 decided not to open that can of worms.

bill@twwells.com (T. William Wells) (12/03/89)

In article <480@codonics.COM> bret@codonics.com (Bret Orsburn) writes:
: >No; Standard C does not support zero-sized objects.
:
: Aargh! Whatever happened to "don't break existing code"?!
:
: What was the rationale behind this (IMHO) arbitrary obstruction?

Here we !@#$ing go again. Someone mistaking the details of their
particular implementation for legal C.

This "arbitrary obstruction" is common practice; most of the C
compilers I've worked with do *not* support zero sized objects.

I like the idea but life is that it is not portable.

That ANSI chose not to require it is unfortunate but does not
change anything for those of us who believe that programs should
be portable.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

amull@Morgan.COM (Andrew P. Mullhaupt) (12/03/89)

In article <5486@sdcc6.ucsd.edu>, bruno@sdcc10.ucsd.edu (Bruce W. Mohler) writes:
> In article <11715@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>  >I'm POC for a zero-sized object special interest group,
>  >but frankly there has been little activity since the
>  >committee's consensus was clearly against such objects.
> 
> Wouldn't this kind of group, because of its purpose, have 
> zero members?  (Not counting the POC (which is, after all,
> just a pointer)   :-)

It would depend on how many APL programmers also program in C.
Almost all programmers with APL experience are quite interested
in empty arrays. There is a long, long litany of 'empty array'
jokes, like the following:



Later,
Andrew Mullhaupt

jeenglis@nunki.usc.edu (Joe English) (12/03/89)

bill@twwells.com (T. William Wells) writes:
>In article <480@codonics.COM> bret@codonics.com (Bret Orsburn) writes:
[ on zero-sized objects ]
>: Aargh! Whatever happened to "don't break existing code"?!
>:
>: What was the rationale behind this (IMHO) arbitrary obstruction?
>
>Here we !@#$ing go again. Someone mistaking the details of their
>particular implementation for legal C.
>
>This "arbitrary obstruction" is common practice; most of the C
>compilers I've worked with do *not* support zero sized objects.

So now that they're explicitly forbidden, code written
for compilers that don't support them won't break.  If
they were supported, code written for those compilers
*still* wouldn't break.  (Not for that reason, anyway.)

Seeing as how most compiler vendors have to make major
revisions to support ANSI (or have already done so)
anyway, how is this an issue?

(I do see, however, that supporting zero-sized arrays
would have required extensive revisions to the draft
itself, and probably would have substantially increased
language complexity.  That sounds like a valid reason
to me, though I'd like to see them supported too.)

--Joe English

  jeenglis@nunki.usc.edu

libes@cme.nbs.gov (Don Libes) (12/04/89)

In article <563@s5.Morgan.COM> amull@Morgan.COM (Andrew P. Mullhaupt) writes:
>in empty arrays. There is a long, long litany of 'empty array'
>jokes, like the following:
>
>
>
>Later,
>Andrew Mullhaupt

That's rather feeble.  Here are some better ones (pulled out of an old
Quote Quad that I'm embarrassed to admit still haunts my shelves).
They aren't gutbusters, but they do illustrate the point.

A woman gets on a bus with three sets of twins.
Driver: Gosh, lady, do you always get twins?
Woman: Not always - hundreds of times we don't get anything at all.

Patient: Doctor, have you got a cure for complete loss of voice?
Doctor: Good morning, can I help you?

Archaeologist: My research shows that the ancient Egyptians knew all
about wireless radio.
Reporter: That's astounding!  How did you determine that?
Archaeologist: In all my investigations, I never found any wire.

Amazingly, there is even one about empty arrays themselves:

Q: How many empty arrays does it take to fill all of memory?
A: One, if it's big enough.

(To appreciate this last one, it helps to know that APL 2 arrays
require structural information at run-time that can be arbitrarily
large.)

Don Libes          libes@cme.nist.gov      ...!uunet!cme-durer!libes

kenmoore@unix.cis.pitt.edu (Kenneth L Moore) (12/04/89)

In article <2298@jato.Jpl.Nasa.Gov> baalke@mars.jpl.nasa.gov (Ron Baalke) writes:
==>
==>I've inherited some C code that had the following declaration in it:
==>
==>     char tbi[0];
==>
==>When I tried to compile this using Turbo C v2.0 or VAX C, it was flagged as
==>a fatal error. My question is this: are zero length arrays allowed in the
==>ANSI standard for C?
==>
==> Ron Baalke                       |    baalke@mars.jpl.nasa.gov 
==> Jet Propulsion Lab  M/S 301-355  |    baalke@jems.jpl.nasa.gov 
==> 4800 Oak Grove Dr.               |
==> Pasadena, CA 91109               |

If you want to find out if your C code is conservative, run lint on it.
You are much more likely to get transportable code if you do this.

Ken

fredex@cg-atla.UUCP (Fred Smith) (12/05/89)

In article <1989Dec2.210042.12668@twwells.com> bill@twwells.com (T. William Wells) writes:
>In article <480@codonics.COM> bret@codonics.com (Bret Orsburn) writes:
>: >No; Standard C does not support zero-sized objects.
>:


Excuse me, but I must ask a stupid question. Why the !@#$ would anyone even want
to declare an array of zero size ???? Isn't that rather similar to a pointer
to the same type of object??  If so, what is wrong with declaring a pointer
rather than an empty array ??

I don't want to get flamed for this question, but I would like an answer from one
of the gurus (Doug Gwyn, Chris Torek, Henry Spencer, etc.)!

Fred

davidsen@sungod.crd.ge.com (William Davidsen) (12/05/89)

In article <8129@cg-atla.UUCP> fredex@cg-atla.UUCP (Fred Smith) writes:

| Excuse me, but I must ask a stupid question. Why the !@#$ would anyone even want
| to declare an array of zero size ???? Isn't that rather similar to a pointer
| to the same type of object??  If so, what is wrong with declaring a pointer
| rather than an empty array ??

  A pointer is not the same as an array (of any size). One use of an
array of zero size (or one) is at the end of a struct definition. Then,
if the actual struct is allocated by malloc et al, the size allocated
can be larger than the size of the struct as defined, and the array will
be addressable using positive non-zero subscripts.

  The array must be used instead of the pointer to ensure that
alignment considerations don't creep in. The portable solution is to
declare the array with size one, and then adjust the size of the malloc
call as needed.
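
A minimal sketch of the size-one approach described above, in ANSI C. The
names IntVec and intvec_new are hypothetical, chosen only for illustration.
As noted elsewhere in this thread, indexing past data[0] is not guaranteed
by the Standard, although it works on many implementations.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical variable-length record: the trailing array is declared
 * with one element, and the extra room is requested from malloc(). */
typedef struct IntVec {
    size_t count;    /* number of usable elements in data[] */
    int    data[1];  /* really `count` elements long */
} IntVec;

/* Allocate room for `count` ints (count >= 1) in a single block. */
IntVec *intvec_new(size_t count)
{
    IntVec *v;
    size_t i;

    assert(count >= 1);
    /* sizeof(IntVec) already includes one element, hence count - 1. */
    v = (IntVec *)malloc(sizeof(IntVec) + (count - 1) * sizeof(int));
    if (v == NULL)
        return NULL;
    v->count = count;
    for (i = 0; i < count; i++)
        v->data[i] = 0;
    return v;
}
```

A caller would write v = intvec_new(n), use v->data[0] through
v->data[n - 1], and release everything with a single free(v).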

	bill davidsen		(davidsen@crdos1.crd.GE.COM)
  {uunet | philabs}!crdgw1!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/05/89)

In article <8129@cg-atla.UUCP> fredex@cg-atla.UUCP (Fred Smith) writes:
>Why the !@#$ would anyone even want to declare an array of zero size ????

There are several contexts in which it would be useful.

The general philosophical answer is, since it could be useful
and since we all agree that control structures that can loop
zero times are preferable to ones that insist on looping at
least once, why shouldn't the same considerations be applied
to the data structures manipulated by such control structures?

For example, suppose that one can individually configure a
bunch of options when compiling an application, and that OPTA,
etc. are either 1 or 0 to determine whether or not the option
is to be supported at run time.  Then one might find it
convenient to have an array of run-time option enable/disable
flags, or some similar data structure:
	bool	opt_on[OPTA+OPTB+OPTC+...+OPTZ];
In the case where no option support was configured at compile
time, it would be nice if the resulting 0-length array could
merely be left as is, since we know that no accesses will be
made to its contents at run time.  Yet according to the C
standard, that would be an illegal declaration, so we have to
code in a special kludge just to take care of this special
case.
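
One form the special kludge might take is to round the array length up
to 1 when no options are configured. This is only a sketch: the OPTA..OPTC
values, the NOPTS and OPT_SLOTS macros, and opt_enabled_count are
assumptions for illustration, and char stands in for bool, which ANSI C
does not define.

```c
#include <assert.h>
#include <stddef.h>

/* Compile-time configuration; each is 1 (supported) or 0 (not). */
#define OPTA 0
#define OPTB 0
#define OPTC 0

#define NOPTS     (OPTA + OPTB + OPTC)
/* The kludge: never declare fewer than one element. */
#define OPT_SLOTS (NOPTS > 0 ? NOPTS : 1)

char opt_on[OPT_SLOTS];   /* run-time enable/disable flags */

/* Count enabled options; the loop body runs NOPTS times, possibly zero. */
size_t opt_enabled_count(void)
{
    size_t i, n = 0;

    /* Sanity check: the array has room for every configured option. */
    assert((size_t)NOPTS <= sizeof opt_on / sizeof opt_on[0]);
    for (i = 0; i < (size_t)NOPTS; i++)
        if (opt_on[i])
            n++;
    return n;
}
```

With all options configured out, the array still occupies one unused
element, which is exactly the waste a zero-length declaration would avoid.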

bill@twwells.com (T. William Wells) (12/05/89)

In article <8129@cg-atla.UUCP> fredex@cg-atla.UUCP (Fred Smith) writes:
: In article <1989Dec2.210042.12668@twwells.com> bill@twwells.com (T. William Wells) writes:
: >In article <480@codonics.COM> bret@codonics.com (Bret Orsburn) writes:
: >: >No; Standard C does not support zero-sized objects.
:
: Excuse me, but I must ask a stupid question. Why the !@#$ would anyone even want
: to declare an array of zero size ???? Isn't that rather similar to a pointer
: to the same type of object??  If so, what is wrong with declaring a pointer
: rather than an empty array ??
:
: I don't want to get flamed for this question, but I would like an answer from one
: of the gurus (Doug Gwyn, Chris Torek, Henry Spencer, etc.)!

Well, I don't know if you think of me as a guru, but here goes:

Consider a symbol table that is used to store strings. You could
declare a member of it as:

	typedef struct SYMTAB {
		struct SYMTAB *sym_next;
		int     sym_type;
		char    *sym_text;
	} SYMTAB;

This has the drawback that one needs two allocates for the
structure and there is a pointer that really is not needed. The
"ideal" solution would be to stick the string right where the
pointer is.

But how do you declare it? char sym_text[MAX_SYM];? Not only does
this waste space (we'd expect most strings are much shorter than
MAX_SYM), but it may be the case that there *is* no maximum
symbol length and one does not want to impose one. So that is out.

Here's another: define it as char sym_text[1]; and cheat. By
cheat, I mean to allocate the structure with something like:

	sp = (SYMTAB *)malloc(sizeof(SYMTAB) + strlen(text));

(There's a +1 for the null character and -1 for the 1 in the
structure which cancel.)

This still can waste memory, because of padding in the size of
SYMTAB. Moreover, some systems might take that [1] declaration
seriously and give you an error when you access something beyond
the first element of the string. The kind that immediately comes
to mind is debugging interpreters: these, one hopes, will check
for accessing outside the bounds of an array.

In ANSI C, one could make the allocation:

	sp = (SYMTAB *)malloc(offsetof(SYMTAB, sym_text) + strlen(text) + 1);

which eliminates the potential waste due to padding.
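
Put together, the offsetof approach might look like the sketch below. It
assumes the structure has been redeclared with char sym_text[1] as its
last member; the function name sym_new and its type parameter are
hypothetical. The same caveat applies as for any [1] trick: the Standard
does not guarantee that accesses past sym_text[0] work.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <stdlib.h>
#include <string.h>

typedef struct SYMTAB {
    struct SYMTAB *sym_next;
    int            sym_type;
    char           sym_text[1];   /* really strlen(text) + 1 bytes */
} SYMTAB;

/* Allocate a symbol and its text in one block, sized from the offset of
 * sym_text so that trailing padding in sizeof(SYMTAB) is not counted. */
SYMTAB *sym_new(const char *text, int type)
{
    SYMTAB *sp;
    size_t len;

    assert(text != NULL);
    len = strlen(text);
    sp = (SYMTAB *)malloc(offsetof(SYMTAB, sym_text) + len + 1);
    if (sp == NULL)
        return NULL;
    sp->sym_next = NULL;
    sp->sym_type = type;
    memcpy(sp->sym_text, text, len + 1);   /* copies the '\0' as well */
    return sp;
}
```

Because the block may be smaller than sizeof(SYMTAB), such an object
must never be copied by structure assignment.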

A better solution would be to declare the structure with char
sym_text[0]; The compiler would then forbid taking the size of the
structure (it being undefined), but the compiler would know that
accesses beyond the end of the structure are intended. You'd use
a malloc like the second one to allocate such a structure.

Here's another example. Suppose you want an array to represent
all of data memory, perhaps for an OS. You'll arrange that the
linker will put this at the start of data but you don't want to
declare a size since you don't want the compiler (or the reader,
for that matter) to believe that there is a fixed amount of memory
in your machine. This is different from an extern, in that it is a
definition, not just a declaration. You'd declare it as char
Memory[0];

However, since zero sized objects haven't been sanctified, and
have not been all that available anyway, this is not portable
practice.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

tbrakitz@phoenix.Princeton.EDU (Triantaphyllos Byron Rakitzis) (12/06/89)

Bill Wells says:

>This still can waste memory, because of padding in the size of
>SYMTAB. Moreover, some systems might take that [1] declaration
>seriously and give you an error when you access something beyond
>the first element of the string. The kind that immediately comes
>to mind is debugging interpreters: these, one hopes, will check
>for accessing outside the bounds of an array.

Wait, hold on a second. If an implementation of C does bounds checking
then it isn't C any more. That's not how C works. There's no such
thing as an array in C. Just pointers, and memory. It's up to the 
system to do segmentation protection, if it wants.

Byron Rakitzis

(Honestly, I don't see any problem with declaring an array [1] and
then subtracting one from the malloc() call.  Really, what's the
problem? How can this be a waste of memory? You never have a zero-byte
string! C strings are always terminated with a '\0' anyway!!)


-- 
"C Code."
	  "C Code run."
			"Run, Code, run!"
Byron Rakitzis. (tbrakitz@phoenix.princeton.edu ---- tbrakitz@pucc.bitnet)

fredex@cg-atla.UUCP (Fred Smith) (12/06/89)

OK. Thanks to all of you who sent me good answers to my question about
reasons for zero-length arrays.

Now I think I understand. In fact, I recall having seen code previously
which used a zero-length array at the tail end of a structure. I thought
at the time that it was a HORRIBLE way to write code. I STILL think so.
Sure, it works (on some implementation!!!!). Sure, it's also convenient
and useful. Does that make it good C?  Does that make it easy to figure
out when you trip over it for the first time in somebody's undocumented
piece of code?  NO.

I think that its convenience is not sufficient reason to insist that
it should be a part of the language. After all, there are other
conveniences which are used in certain implementations that are also not
a part of the language as specified by X3J11, usually for good reason.

One reason that comes to mind is a conflict with the way arrays 
relate to pointers. For normal arrays, simply mentioning its name
in a program evaluates to the address of the array. Now what is this
address??  It is the address of the first element of the array! If
the thing is declared as zero length, what do you get when you
mention its address ??  I dunno, haven't tried it yet (I intend to),
but I bet it makes no sense, whatever it is! Even if it does evaluate
to some address somewhere, it certainly is not an address at which
one would dare to store data (without, of course, the use of malloc
to create some data space there)! I would guess that it is because of this kind
of sticky issue that the committee chose not to embrace zero-length
arrays.

Sure, it would be nice if everybody's favorite feature (or abuse of the
language) were specified in the standard. However, the guys on the committee
were also trying to make all the pieces fit together in a more-or-less
logical manner, so as to reduce some of the hidden gotchas, and it seems
to me that this particular one does have a few hidden gotchas. It certainly
is easy enough to work around the "lack" of this "feature".

Fred

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/06/89)

In article <11963@phoenix.Princeton.EDU> tbrakitz@phoenix.Princeton.EDU (Triantaphyllos Byron Rakitzis) writes:
-Bill Wells says:
->... some systems might take that [1] declaration
->seriously and give you an error when you access something beyond
->the first element of the string. The kind that immediately comes
->to mind is debugging interpreters: these, one hopes, will check
->for accessing outside the bounds of an array.
-Wait, hold on a second. If an implementation of C does bounds checking
-then it isn't C any more.

Wrong.

-That's not how C works. There's no such thing as an array in C.
-Just pointers, and memory.

Wrong.

-It's up to the system to do segmentation protection, if it wants.

Not relevant.

Just because you aren't familiar with an implementation like Bill
described does not mean they don't exist or that they're not valid
C implementations.

I use the [1] kludge on occasion, but I raise flags when I do because
it is not guaranteed to work.  It happens to, on all the systems I
commonly have access to.

asd@cbnewsj.ATT.COM (adam.denton) (12/07/89)

In article <480@codonics.COM> bret@codonics.com (Bret Orsburn) writes:
>In article <11715@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>>In article <2298@jato.Jpl.Nasa.Gov> baalke@mars.jpl.nasa.gov (Ron Baalke) writes:
>>>are zero length arrays allowed in the ANSI standard for C?
>>
>>No; Standard C does not support zero-sized objects.
>>
>
>Aargh! Whatever happened to "don't break existing code"?!
>
>What was the rationale behind this (IMHO) arbitrary obstruction?


Pardon me, but I just can't keep quiet any longer.

What fathomable purpose could a programmer possibly want by
declaring a zero-length array?  To store nothing?
Just what is a compiler supposed to do when it sees
  int a[0]; ??
Set `a' to point somewhere in memory (i.e., "allocate" this "array") ?
But since it's a zero-length
object, the compiler might well assign ANOTHER variable's location
to that same somewhere in memory.  And how could I assign something
to that location (which is dubious anyway%)?  C never allows assignment
to the name of an array.  `a' is not an lvalue.  And a[0] is not a member
of the array `a' !!  (or a[_anything_] for that matter!!)

I suspect the a[0] that the original poster found was a typo that
was never detected.

[End of my sermon on the mount.]
-------
% Akin to { int *p; *p = 3; } without p being malloc'ed.
-------

Adam S. Denton
asd@mtqua.ATT.COM

All flames are appreciated!  Please enclose $1.00 per flame to cover
processing and handling!!

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/07/89)

In article <8141@cg-atla.UUCP> fredex@cg-atla.UUCP (Fred Smith) writes:
-One reason that comes to mind is a conflict with the way arrays 
-relate to pointers. For normal arrays, simply mentioning its name
-in a program evaluates to the address of the array. Now what is this
-address??  It is the address of the first element of the array! If
-the thing is declared as zero length, what do you get when you
-mention its address ??  I dunno, haven't tried it yet (I intend to),
-but I bet it makes no sense, whatever it is!

It makes perfect sense.

-Even if it does evaluate to some address somewhere, it certainly is
-not an address at which one would dare to store data

This is no different from usual -- you're not allowed to store beyond
the bounds of ANY array.

The main problem is that two distinct 0-length objects might have
the same address.  This bothers some people.

maart@cs.vu.nl (Maarten Litmaath) (12/07/89)

In article <1989Dec5.112553.24087@twwells.com>
bill@twwells.com (T. William Wells) writes:
\...
\Consider a symbol table that is used to store strings. You could
\declare a member of it as:
\
\	typedef struct SYMTAB {
\		struct SYMTAB *sym_next;
\		int     sym_type;
\		char    *sym_text;
\	} SYMTAB;
\
\This has the drawback that one needs two allocates for the
\structure [...]

symtabptr = (SYMTAB *) malloc(sizeof(SYMTAB) + strlen(text) + 1);
symtabptr->sym_text = (char *) symtabptr + sizeof(SYMTAB);
strcpy(symtabptr->sym_text, text);
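
Wrapped in a function, the three lines above might look like the sketch
below. The function name sym_alloc, the NULL check, and the initial
values given to sym_next and sym_type are assumptions added for
illustration; the allocation and pointer arithmetic are unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct SYMTAB {
    struct SYMTAB *sym_next;
    int            sym_type;
    char          *sym_text;   /* points just past the structure itself */
} SYMTAB;

/* One malloc() for both the structure and a copy of `text`. */
SYMTAB *sym_alloc(const char *text)
{
    SYMTAB *symtabptr;

    assert(text != NULL);
    symtabptr = (SYMTAB *) malloc(sizeof(SYMTAB) + strlen(text) + 1);
    if (symtabptr == NULL)
        return NULL;
    symtabptr->sym_next = NULL;
    symtabptr->sym_type = 0;
    /* The string lives in the same block, right after the structure. */
    symtabptr->sym_text = (char *) symtabptr + sizeof(SYMTAB);
    strcpy(symtabptr->sym_text, text);
    return symtabptr;
}
```

A single free(symtabptr) releases both the structure and the string.
The pointer member is still present, so this addresses only the
two-allocation drawback.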
-- 
`Take John Berryhill: the guy is everywhere!  All because one day he typed "rn"
instead of [rm]'  (Richard Sexton)  | maart@cs.vu.nl, uunet!mcsun!botter!maart

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/07/89)

In article <2678@cbnewsj.ATT.COM> asd@cbnewsj.ATT.COM (adam.denton,mt,) writes:
>What fathomable purpose could a programmer possibly want by
>declaring a zero-length array?  To store nothing?

What fathomable purpose could a programmer possibly want by
coding a loop that gets executed zero times?  To do nothing?

Possible uses for zero-sized objects (notably arrays) have
already been posted.  Let me add that there are no logical
problems with the concept; all properties of such objects
would be well defined.
	sizeof zero would be 0
	&zero points at the object
	++ptr_to_zero would still point to zero
	*ptr_to_zero needn't do anything to "access" the contents
	&zero_length_array[0] points one past the last valid element
etc.

As I recall, zero was invented by Arabic mathematicians
thousands of years ago.  It's a pity it still frightens
or confuses people.

c9h@psuecl.bitnet (12/07/89)

Organization: Engineering Computer Lab, Pennsylvania State University

In article <11759@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>
> The main problem is that two distinct 0-length objects might have
> the same address.  This bothers some people.

In *most* compilers, a 0-length array would share the same address as the
next data item defined after it.  However, you should not rely on this,
because it may be non-portable.

It seems that the main reason (and *only* even half-way decent reason) for
using a 0-length array is to allocate a variable amount of memory for a
structure depending on the length of the array.  This seems reasonable.

However, as usual, some compilers try to protect the dumb, stupid, idiotic,
crazed, terminal-bashing, cpu-smashing programmer from making such a stupid
mistake.  (BTW: I'm being sarcastic.)  Hell, C doesn't do any other array
bounds checking; why should it bother me about something like this?

As far as I'm concerned, your compiler is broken.

--
- Charles Martin Hannum II       "Klein bottle for sale ... inquire within."
    (That's Charles to you!)     "To life immortal!"
  c9h@psuecl.{bitnet,psu.edu}    "No noozzzz izzz netzzzsnoozzzzz..."
  cmh117@psuvm.{bitnet,psu.edu}  "Mem'ry, all alone in the moonlight ..."

bret@codonics.COM (Bret Orsburn) (12/07/89)

Too bad I was out of town and missed most of the fun on this one!

In article <1989Dec2.210042.12668@twwells.com> bill@twwells.com (T. William Wells) writes:
>In article <480@codonics.COM> bret@codonics.com (Bret Orsburn) writes:
>: >No; Standard C does not support zero-sized objects.
>:
>: Aargh! Whatever happened to "don't break existing code"?!
>:
>: What was the rationale behind this (IMHO) arbitrary obstruction?
>
>Here we !@#$ing go again. Someone mistaking the details of their
>particular implementation for legal C.

Read my posting again.

I did not say (a) zero length objects are a Good Thing, (although some of
the follow-ups have got me pretty convinced) or (b) zero length
objects are portable, or (c) zero length objects are widely implemented.
I certainly don't "mistake the details of [my] particular implementation
for legal C." In fact, the C I have been working in for the last three
years is a dialect that runs on a *bit* *addressable* processor. I have *no*
illusions about any of that code being standard or portable. (Although
T.W. Wells has used a wonderful piece of ex post facto reasoning there,
as it is precisely the ANSI standard that will determine what is "legal C". :-)

>This "arbitrary obstruction" is common practice; most of the C
>compilers I've worked with do *not* support zero sized objects.

Which doesn't change the fact that some existing code uses zero sized
objects, and that such code will be broken by the ANSI restriction.

>I like the idea but life is that it is not portable.

And life is also that portability is not an issue and not an option for
a lot of software. A lot of us use C primarily as an assembly language
substitute and develop code that is as un-portable as the special-purpose
systems we have built. (Try porting the device drivers from a custom-designed
embedded system some time!)

(Please direct all wonderful anecdotes about that piece of code you never
thought you'd have to port to /dev/null. :-)

But that is ALSO not the point of my original posting.

>That ANSI chose not to require it is unfortunate but does not
>change anything for those of us who believe that programs should
>be portable.

"That ANSI chose not to require it" is ALSO not the point of the original
posting. That ANSI chose to FORBID it is the point. And that ANSI chose
to forbid it in the face of existing implementations and existing code
deserves at least an "Aargh!".

Well, I've had my turn. Thank you all for your time.

Cheers!


-- 
-------------------
bret@codonics.com
uunet!codonics!bret
Bret Orsburn

srg@quick.COM (Spencer Garrett) (12/07/89)

> OK. Thanks to all of you who sent me good answers to my question about
> reasons for zero-length arrays.
> 
> Now I think I understand. In fact, I recall having seen code previously
> which used a zero-length array at the tail end of a structure. I thought
> at the time that it was a HORRIBLE way to write code. I STILL think so.
> Sure, it works (on some implementation!!!!). Sure, it's also convenient
> and useful. Does that make it good C?  Does that make it easy to figure
> out when you trip over it for the first time in somebody's undocumented
> piece of code?  NO.

Oh, please.  Zero length arrays are clean, simple, easy to understand,
and easy to implement.  They are by far the most elegant way to deal
with expandable arrays at the end of structures, and this is a very
important problem.

> I think that its convenience is not sufficient reason to insist that
> it should be a part of the language. After all, there are other
> conveniences which are used in certain implementations that are also not
> a part of the language as specified by X3J11, usually for good reason.

But this "convenience" is in no way implementation specific.  It is
cleanly implementable in any environment which is capable of hosting
C at all.

> One reason that comes to mind is a conflict with the way arrays 
> relate to pointers. For normal arrays, simply mentioning its name
> in a program evaluates to the address of the array. Now what is this
> address??  It is the address of the first element of the array! If
> the thing is declared as zero length, what do you get when you
> mention its address ??  I dunno, haven't tried it yet (I intend to),
> but I bet it makes no sense, whatever it is! Even if it does evaluate
> to some address somewhere, it certainly is not an address at which
> one would dare to store data (without, of course, the use of malloc
> to create some data space there)! I would guess that it is because of this kind
> of sticky issue that the committee chose not to embrace zero-length
> arrays.

Zero-length arrays work *exactly* like arrays *of any other length*.
You'll get exactly the same address no matter what the size.  It's no
accident that the first element past the end of an array is a legal
address.  There's no other clean way to deal with any array.  The only
distinction is that, for zero-length arrays, the 0'th and N'th elements
are at the same address (because N is 0).  Of course you can't store
anything in a zero-length array that has been statically allocated.
That's not what they're for.  You can *always* store N elements in
an array of length N, so what did you expect?

> Sure, it would be nice if everybody's favorite feature (or abuse of the
> language) were specified in the standard. However, the guys on the committee
> were also trying to make all the pieces fit together in a more-or-less
> logical manner, so as to reduce some of the hidden gotchas, and it seems
> to me that this particular one does have a few hidden gotchas. It certainly
> is easy enough to work around the "lack" of this "feature".

But it's always a work-around, and there are lots of inelegant possibilities.
How would you deal with the following?

	struct blat {
		struct blat    *next;
		int		value;
		char		name[0];
	} *new;
	char *str;

	new = (struct blat *)malloc(sizeof(struct blat) + strlen(str) + 1);
	new->next = ??;
	new->value = ??;
	strcpy(new->name, str);

The "char name[0]" is telling the compiler that a character array is going
to start at that point in the structure, but that sizeof(struct blat)
should not include any of the elements of that array.  You don't want
to leave out the definition of "name" entirely, since you want to have
something to call it as part of the structure, and you need to tell
the compiler its type anyway, so that alignment restrictions can
be dealt with (not all such arrays are character arrays).  If you
declare it "char name[1]" then there's no portable way to get the
size of the header (since you can't predict how much padding may
need to be added).  The best you can do is to throw away that 1 element
you declared just to avoid having to understand the concept of zero.
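
For what it's worth, ANSI C's offsetof macro (in <stddef.h>) reports the
offset of a member including any padding before it, which gives the header
size in the [1] case.  A minimal sketch, assuming the [1] declaration;
blat_new is a made-up helper name, and the copy goes through a char pointer
computed from the start of the allocation:

```c
#include <stddef.h>   /* offsetof, size_t */
#include <stdlib.h>   /* malloc */
#include <string.h>   /* strlen, memcpy */

struct blat {
    struct blat *next;
    int          value;
    char         name[1];   /* placeholder; the real storage follows the header */
};

/* Hypothetical helper: allocate a node with room for a copy of str. */
struct blat *blat_new(const char *str, int value)
{
    size_t header = offsetof(struct blat, name);  /* bytes before name, padding included */
    size_t len = strlen(str) + 1;                 /* characters plus the NUL */
    size_t total = header + len;
    struct blat *p;

    /* Never allocate less than the declared structure. */
    if (total < sizeof(struct blat))
        total = sizeof(struct blat);

    p = (struct blat *)malloc(total);
    if (p == NULL)
        return NULL;

    p->next = NULL;
    p->value = value;
    /* Copy via a char pointer from the allocation base. */
    memcpy((char *)p + header, str, len);
    return p;
}
```

The caller releases the whole node, string included, with a single free().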

srg@quick.COM (Spencer Garrett) (12/07/89)

In article <11760@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <2678@cbnewsj.ATT.COM> asd@cbnewsj.ATT.COM (adam.denton,mt,) writes:
> >What fathomable purpose could a programmer possibly want by
> >declaring a zero-length array?  To store nothing?
> 
> What fathomable purpose could a programmer possibly want by
> coding a loop that gets executed zero times?  To do nothing?

Excellent point.  Bravo!  I'm going to save your article for
the next time this comes up!

> Possible uses for zero-sized objects (notably arrays) have
> already been posted.  Let me add that there are no logical
> problems with the concept; all properties of such objects
> would be well defined.
> 	sizeof zero would be 0
> 	&zero points at the object
> 	++ptr_to_zero would still point to zero
> 	*ptr_to_zero needn't do anything to "access" the contents
> 	&zero_length_array[0] points one past the last valid element
> etc.

Just some minor nits so as not to confuse those who are still having
trouble with this concept.

	++ptr_to_zero   *will* increment the pointer.  The *array*
has size 0.  Its *elements* do not.  There just aren't any *of*
them.  Thus sizeof(zero) == 0,  sizeof(zero[0]) > 0, and
(sizeof(zero)/sizeof(zero[0])) gives the number of elements in
the array (as always) which in this case happens to be 0.

	*ptr_to_zero   will access storage wherever the pointer
points.  If that's beyond the end of the malloc'd storage it may
bomb like any other array reference.  If the array was statically
allocated, then this is always an illegal reference.

	&zero_length_array[0]   does indeed point one past the
last valid element.  It also points to the first element of the
array, but that element *isn't a valid element* unless the
array has been expanded via malloc (or friends).
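
The size arithmetic can be tried with an ordinary array, since a [0]
declaration will not get past a strict ANSI compiler.  A small sketch;
NELEM, table and the two functions are made-up names:

```c
#include <stddef.h>   /* size_t */

/* Element-count idiom: total size divided by the size of one element. */
#define NELEM(a) (sizeof(a) / sizeof((a)[0]))

static int table[4];

/* Number of elements in table. */
size_t table_count(void)
{
    return NELEM(table);
}

/* Size of one element; sizeof does not evaluate table[0]. */
size_t table_elem_size(void)
{
    return sizeof(table[0]);
}
```

On a compiler that accepts [0] as an extension, the same macro would be
expected to yield 0 for the count while the element size stays nonzero.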

> As I recall, zero was invented by Arabic mathematicians
> thousands of years ago.  It's a pity it still frightens
> or confuses people.

Yup.  At least we're not burning heretics so often anymore.

Actually, I must confess that structures with no members
of nonzero size do cause some problems, but arrays of
length zero make perfect sense, and that's the usage that
started this discussion.

bengsig@oracle.nl (Bjorn Engsig) (12/07/89)

Article <11751@smoke.BRL.MIL> by gwyn@brl.arpa (Doug Gwyn) says:
|I use the [1] kludge on occasion, but I raise flags when I do because
|it is not guaranteed to work.
Would you please explain why, and also tell what is the portable way of
having a 'variable length "thing"' in C.

Article <4733@solo11.cs.vu.nl> by maart@cs.vu.nl (Maarten Litmaath) proposed:
|	typedef struct SYMTAB {
|		struct SYMTAB *sym_next;
|		int     sym_type;
|		char    *sym_text;
|	} SYMTAB;
|
|symtabptr = (SYMTAB *) malloc(sizeof(SYMTAB) + strlen(text) + 1);
|symtabptr->sym_text = (char *) symtabptr + sizeof(SYMTAB);
|strcpy(symtabptr->sym_text, text);
Is that portable, or do you have to use two mallocs (one for the struct,
and one for the string)?
-- 
Bjorn Engsig,	Domain:		bengsig@oracle.nl, bengsig@oracle.com
		Path:		uunet!{mcsun!orcenl!bengsig,oracle!bengsig}

bill@twwells.com (T. William Wells) (12/07/89)

In article <4733@solo11.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
: In article <1989Dec5.112553.24087@twwells.com>
: bill@twwells.com (T. William Wells) writes:
: \...
: \Consider a symbol table that is used to store strings. You could
: \declare a member of it as:
: \
: \     typedef struct SYMTAB {
: \             struct SYMTAB *sym_next;
: \             int     sym_type;
: \             char    *sym_text;
: \     } SYMTAB;
: \
: \This has the drawback that one needs two allocates for the
: \structure [...]
:
: symtabptr = (SYMTAB *) malloc(sizeof(SYMTAB) + strlen(text) + 1);
: symtabptr->sym_text = (char *) symtabptr + sizeof(SYMTAB);
: strcpy(symtabptr->sym_text, text);

I suppose so. A strict reading of the standard does not permit
this; it says that the result of malloc can be cast to *an*
object. I can imagine a debugging interpreter that says that, once
the result of malloc is cast, that's it. That could make
maintaining type information much easier and maybe eliminate
certain kinds of bugs. Another possibility is a capability machine
might do something esoteric.

On the other hand, the following code would work, even in those
environments:

	char    *dummyp;

	dummyp = (char *)malloc(sizeof(SYMTAB) + strlen(text) + 1);
	symtabptr = (SYMTAB *)dummyp;
	symtabptr->sym_text = dummyp + sizeof(SYMTAB);
	strcpy(symtabptr->sym_text, text);

But there is still that pointer going to waste.
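
Spelled out as a complete function, the single-allocation version might
look like the following.  It is only a sketch: symtab_new is a made-up
name, the NULL check and the memcpy are additions, and the trick relies on
the trailing data being char, which has no alignment requirement:

```c
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* malloc */
#include <string.h>   /* strlen, memcpy */

typedef struct SYMTAB {
    struct SYMTAB *sym_next;
    int            sym_type;
    char          *sym_text;   /* points just past the struct, same allocation */
} SYMTAB;

/* Hypothetical wrapper: one malloc holds both the entry and its text. */
SYMTAB *symtab_new(const char *text, int type)
{
    size_t len = strlen(text) + 1;
    char *raw = (char *)malloc(sizeof(SYMTAB) + len);
    SYMTAB *sp;

    if (raw == NULL)
        return NULL;

    sp = (SYMTAB *)raw;
    sp->sym_next = NULL;
    sp->sym_type = type;
    sp->sym_text = raw + sizeof(SYMTAB);   /* first byte after the struct */
    memcpy(sp->sym_text, text, len);
    return sp;
}
```

Only the SYMTAB pointer is ever passed to free(); sym_text must not be.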

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

gsf@ulysses.homer.nj.att.com (Glenn Fowler[drew]) (12/07/89)

In article <70188@psuecl.bitnet>, c9h@psuecl.bitnet writes:
> It seems that the main reason (and *only* even half-way decent reason) for
> using a 0-length array is to allocate a variable amount of memory for a
> structure depending on the length of the array.  This seems reasonable.

why not make the last element huge rather than small and then at malloc time
decrease the sizeof rather than increase:

	struct x
	{
		...
		char var[max_size];
	} *p;

	p = (struct x*)malloc(sizeof(struct x) - max_size + current_size);

or is this just as sleazy as the undersized array example?
-- 
Glenn Fowler    (201)-582-2195    AT&T Bell Laboratories, Murray Hill, NJ
uucp: {att,decvax,ucbvax}!ulysses!gsf       internet: gsf@ulysses.att.com

khera@macbeth.cs.duke.edu (Vick Khera) (12/07/89)

In article <11760@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>
>As I recall, zero was invented by Arabic mathematicians
>thousands of years ago.  It's a pity it still frightens
>or confuses people.

No, it was invented by Indians.  If you check your history, you will learn
that the first evidence of the existence of the zero was in Gwalior, India
(which by amazing coincidence is where I was born ;-). The Arabs "borrowed"
the Indian numbering system and eventually got credit for inventing it.

Enough said.

							vick.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vick Khera                              Department of Computer Science
ARPA:   khera@cs.duke.edu               Duke University
UUCP:   ..!{mcnc,decvax}!duke!khera     Durham, NC 27706

les@chinet.chi.il.us (Leslie Mikesell) (12/08/89)

In article <7350@quick.COM> srg@quick.COM (Spencer Garrett) writes:

>> already been posted.  Let me add that there are no logical
>> problems with the concept; all properties of such objects
>> would be well defined.
>> 	sizeof zero would be 0
>> 	&zero points at the object
>> 	++ptr_to_zero would still point to zero
>> 	*ptr_to_zero needn't do anything to "access" the contents
>> 	&zero_length_array[0] points one past the last valid element
>> etc.

>Just some minor nits so as not to confuse those who are still having
>trouble with this concept.

>	++ptr_to_zero   *will* increment the pointer.  The *array*
>has size 0.  Its *elements* do not.  There just aren't any *of*
>them.  Thus sizeof(zero) == 0,  sizeof(zero[0]) > 0, and
>(sizeof(zero)/sizeof(zero[0])) gives the number of elements in
>the array (as always) which in this case happens to be 0.

This sounds right to me. And you can't cheat by pretending that
sizeof(element) == 0.  Otherwise loops that detect the end of
an array by checking for an element address > address of last
element would never end.  Besides, when referencing an array
through a pointer (otherwise you can't ++an_address), there is
no notion of the size of the array being pointed to.
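
A sketch of the kind of loop meant here, with sum_ints as a made-up name.
The end pointer is computed but never dereferenced, and a count of zero
simply skips the body:

```c
#include <stddef.h>   /* size_t */

/* Hypothetical helper: add up n ints starting at base. */
long sum_ints(const int *base, size_t n)
{
    const int *end = base + n;   /* one past the last element */
    const int *p;
    long total = 0;

    /* Each p++ advances by sizeof(int), which is never zero. */
    for (p = base; p < end; p++)
        total += *p;
    return total;
}
```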

>	*ptr_to_zero   will access storage wherever the pointer
>points.  If that's beyond the end of the malloc'd storage it may
>bomb like any other array reference.  If the array was statically
>allocated, then this is always an illegal reference.

But it's up to the compiler to allocate array storage. Remember that
this was declared as "type name[0]"  (probably some symbol that
the preprocessor evaluates as 0, actually).  Thus the compiler can
make sure that the reference is legal.  However, because a reference
to array[end + 1] is also supposed to be legal, the address would have
to point at enough real memory to hold one element, although it could
be treated as a union with anything convenient.  Whether it is addressed
as name[0] or ptr=name; *ptr can't make any difference, and the
compiler can't know the size of the array being accessed via ptr.  To
make the concept useful, it would have to allow linking to code that
accesses arrays via pointers, so the compiler can't cheat and pretend
that there is a special type of element stored in arrays of 0 length.

>Actually, I must confess that structures with no members
>of nonzero size do cause some problems, but arrays of
>length zero make perfect sense, and that's the usage that
>started this discussion.

How about an array of 0 length of structures containing elements
of nonzero size?  Could you then sizeof(array[0].element) without 
having to create a real instance of the struct?

Les Mikesell
  les@chinet.chi.il.us

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (12/08/89)

In article <12468@ulysses.homer.nj.att.com> gsf@ulysses.homer.nj.att.com (Glenn Fowler[drew]) writes:

| why not make the last element huge rather than small and then at malloc time
| decrease the sizeof rather than increase:
| 
|	[  example  ]
|
| or is this just as sleazy as the undersized array example?

  It has one advantage over the use of size zero or one: if you have a
compiler which checks subscripts (I haven't seen one, but Sabre might)
this will allow subscripts up to the max range.

  The advantage of the zero size array is that then the size of the
malloc is (sizeof(struct whatever) + datasize), while in the case of the
big declaration it's (sizeof(struct whatever) - (maxsize - datasize)).
The former is easier to write and understand, although I'd bury the
whole thing in a macro called with just the size.

#define getstructx(size) \
  ((struct whatever *)malloc(sizeof(struct whatever) + (size)))
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon

peter@ficc.uu.net (Peter da Silva) (12/08/89)

In article <7349@quick.COM> srg@quick.COM (Spencer Garrett) writes:
> But this "convenience" is in no way implementation specific.  It is
> cleanly implementable in any environment which is capable of hosting
> C at all.

Including the infamous Burroughs A-series processors?
-- 
`-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
 'U`  Also <peter@ficc.lonestar.org> or <peter@sugar.lonestar.org>.

      "If you want PL/I, you know where to find it." -- Dennis

jm36+@andrew.cmu.edu (John Gardiner Myers) (12/08/89)

gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <11963@phoenix.Princeton.EDU> tbrakitz@phoenix.Princeton.EDU
> (Triantaphyllos Byron Rakitzis) writes:
> -Bill Wells says:
> ->... some systems might take that [1] declaration
> ->seriously and give you an error when you access something beyond
> ->the first element of the string. The kind that immediately comes
> ->to mind is debugging interpreters: these, one hopes, will check
> ->for accessing outside the bounds of an array.
> -Wait, hold on a second. If an implementation of C does bounds checking
> -then it isn't C any more.
> 
> Wrong.
[...]
> Just because you aren't familiar with an implementation like Bill
> described does not mean they don't exist or that they're not valid
> C implementations.

Ok, I'll bite.  I claim that the following program is (modulo typos)
strictly conforming.  Could someone please point out which constraint
I missed?

#include <stdio.h>
#include <stdlib.h>
main()
{
    struct foo_struct {
	int bar;
	char baz[1];
    } *foo;

    foo = (struct foo_struct *) malloc(sizeof(struct foo_struct)+1);
    foo->baz[1] = 1;  /* error? */
    return 0;
}

Note that it is provable that the char pointer (foo->baz + 1) points
within the object returned by malloc.

-- 
_.John G. Myers		Internet: John.G.Myers@andrew.cmu.edu
(412) 268-2984		LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up

bill@twwells.com (T. William Wells) (12/08/89)

In article <526@codonics.COM> bret@codonics.com (Bret Orsburn) writes:
: "That ANSI chose not to require it" is ALSO not the point of the original
: posting. That ANSI chose to FORBID it is the point. And that ANSI chose
: to forbid it in the face of existing implementations and existing code
: deserves at least an "Aargh!".

Let's see if I have this straight:

	1) Some implementations provide feature X.
	2) X3J11 chose not to permit feature X.
	3) Programs using feature X will break under an ANSI compiler.
	4) Therefore X3J11 deserves at least an Aargh!

Nope. Doesn't wash. Only if feature X were either a de facto or a
de jure standard (such as they were), or filled a very important,
portable, need, would this be a valid argument.

While I'd like to have zero length arrays (and, in fact,
complained about their lack in the comments I sent to X3J11) I'm
not going to pretend that this particular feature is really all
that important.

Never mind that the committee included at least one feeping
creature, it was not their business to include every feature that
someone had dreamt up for a C compiler. And code that relied on
those features will necessarily break when used on an ANSI
compliant compiler. Such is life.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

rhg@cpsolv.UUCP (Richard H. Gumpertz) (12/09/89)

Reply-To: rhg@cpsolv.uucp (Richard H. Gumpertz)
Organization: Computer Problem Solving, Leawood, Kansas

In article <UZTerlu00VcJEDulRd@andrew.cmu.edu> jm36+@andrew.cmu.edu (John Gardiner Myers) writes:
>Note that it is provable that the char pointer (foo->baz + 1) points
>within the object returned by malloc.

Unfortunately, it is not provable that the char pointer (foo->baz + 1)
points within the sub-object baz.  Hence, the behavior is undefined
(X3J11/88-158, 3.3.6, page 48, lines 24-27). 

-- 
===============================================================================
| Richard H. Gumpertz rhg%cpsolv@uunet.uu.NET -or- ...uunet!amgraf!cpsolv!rhg |
| Computer Problem Solving, 8905 Mohawk Lane, Leawood, Kansas 66206-1749      |
===============================================================================

bret@codonics.COM (Bret Orsburn) (12/09/89)

In article <7349@quick.COM> srg@quick.COM (Spencer Garrett) writes:
>How would you deal with the following?
>
>	struct blat {
>		struct blat    *next;
>		int		value;
>		char		name[0];
>	} *new;
>If you
>declare it "char name[1]" then there's no portable way to get the
>size of the header (since you can't predict how much padding may
>need to be added).

Sorry if I am missing some subtlety here, (if so, I have probably edited
out the meat of it) but it seems pretty clear to me that the padding bytes
should exist in the zero-length case as well. Otherwise, the array "name"
is not legally aligned.

--
rn:
go
away
-- 
-------------------
bret@codonics.com
uunet!codonics!bret
Bret Orsburn

bret@codonics.COM (Bret Orsburn) (12/09/89)

>Never mind that the committee included at least one feeping
>creature, it was not their business to include every feature that
>someone had dreamt up for a C compiler.

There's a false dichotomy in there somewhere.

There must be some ground between mandating a feature and forbidding it,
or none of the unique features of any architecture can be entailed in
a conforming implementation.

(Yes, I *will* shut up very soon now :-)

-- 
-------------------
bret@codonics.com
uunet!codonics!bret
Bret Orsburn

dricejb@drilex.UUCP (Craig Jackson drilex1) (12/09/89)

In article <7227@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>In article <7349@quick.COM> srg@quick.COM (Spencer Garrett) writes:
>> But this "convenience" is in no way implementation specific.  It is
>> cleanly implementable in any environment which is capable of hosting
>> C at all.
>
>Including the infamous Burroughs A-series processors?
>`-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
> 'U`  Also <peter@ficc.lonestar.org> or <peter@sugar.lonestar.org>.

I think the "convenience" referred to was the practice of putting an
array of length 1 (in the absence of zero-length arrays) at the end
of the structure.  You may rest assured that A-Series C will handle
this as well as the next compiler.  Which means that if you malloc'ed the
space, the malloc'ed size is what matters.

The implementors of A-Series C knew that this sort of thing was rampant,
and therefore implemented pointed-to objects using a simulated linear address
space.

They do special-case things which are not pointed to; if one declares an
array:
   int a[1];
and *never* points to it, then referencing a[1] will get you an interrupt.

Note: because of this pointed-to rule, a[1] and *(a+1) can have dramatically
different compilations.
-- 
Craig Jackson
dricejb@drilex.dri.mgh.com
{bbn,axiom,redsox,atexnet,ka3ovk}!drilex!{dricej,dricejb}

swatt@cup.portal.com (Steven Edward Watt) (12/10/89)

[lots of previous references about sizeof(zero[0]) being non-zero...]

  Unless, of course, zero is declared

void zero[0];

  I have met a compiler (on CP/M-86, of all things!) that allowed this.
And yes, zero++ did INDEED equal zero.

  I just post 'em like I see 'em.

swatt@cup.portal.com
preferred mailing:  ...!zok!wattres!steve

exspes@gdr.bath.ac.uk (P E Smee) (12/12/89)

In article <8129@cg-atla.UUCP> fredex@cg-atla.UUCP (Fred Smith) writes:
>Excuse me, but I must ask a stupid question. Why the !@#$ would anyone even want
>to declare an array of zero size ???? 

One reason I haven't seen given yet is to provide a marker for the end
of an entity.  I could use such a beast (more or less) as follows
(example slightly simplified)...

    struct fred {
	struct fred * last;
	struct fred * next;
	long offset;
	int thing;
	   ...  bunch of more data stuff
	char end_marker[0];         /* MUST BE LAST IN STRUCTURE */
	};

    ....

    bzero (&fred.offset, &fred.end_marker - &fred.offset);

to provide an easy way of clearing a member of the list without
disturbing its 2 list links (the first 2 pointers).  At present I use
an int end_marker as I don't trust size 0 arrays.  This is quicker than
zeroing each item in the struct separately, and I don't have to worry
about possible padding -- where I would if I tried to work out the size
of the bzero from the sizes of the data items in the region I want to
zap.
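
A sketch of the same partial clear using offsetof from <stddef.h>, with
memset standing in for bzero.  It assumes the data members run to the end
of the structure, so no end marker is needed; only the members named above
are declared, and fred_clear_data is a made-up name:

```c
#include <stddef.h>   /* offsetof, size_t */
#include <string.h>   /* memset */

struct fred {
    struct fred *last;
    struct fred *next;
    long offset;        /* first member to be cleared */
    int  thing;
};

/* Hypothetical helper: zero everything from offset to the end of the struct. */
void fred_clear_data(struct fred *f)
{
    size_t start = offsetof(struct fred, offset);

    /* Padding inside the region is covered, since the length comes from sizeof. */
    memset((char *)f + start, 0, sizeof(*f) - start);
}
```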

-- 
 Paul Smee, Univ. of Bristol Comp. Centre, Bristol BS8 1TW (Tel +44 272 303132)
 Smee@bristol.ac.uk   :-)   (..!uunet!ukc!gdr.bath.ac.uk!exspes if you HAVE to)

lee@sq.sq.com (Liam R. E. Quin) (12/13/89)

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:
> gsf@ulysses.homer.nj.att.com (Glenn Fowler[drew]) writes:

>| why not make the last element huge rather than small and then at malloc time
>| decrease the sizeof rather than increase: [...]

You can only do that with malloc, of course, and that sounds to me to be
even more dangerous!

>  It has one advantage over the use of size zero or one: if you have a
>compiler which checks subscripts (I haven't seen one, but Sabre might)
>this will allow subscripts up to the max range.

Sabre-C checks bounds on both statically defined arrays (declared with
	t_SomeType ArrayName[ArraySize];
or whatever) and also on malloc'd areas of memory
	char *p = malloc( (unsigned) (BUFSIZ + 1));

It also reports references to data (including array elements) before they
have been initialised.
The tests are a little simplistic, involving initialising things with a
magic number (which you can set), but they are robust enough to be useful.

You can turn off the array- and malloc-checking if you want; although I
don't remember if you can do so on a per-variable basis, you can on a per-
file basis.

I have no idea what Sabre-C does with zero sized arrays.  The Sun with
Sabre on it isn't here right now (!) so I can't check.


Seeing as I've posted into this argume -- er, discussion,...

Yes, I sometimes use structures whose last element is of type char[1] so that
I can later malloc a larger lump.  Every time I do this (including in the
text retrieval package I sent to comp.sources.unix), I debate between doing
this and using a pointer, which I can set to point just beyond the structure
in the same allocated space.
I find that the [1] trick is a little simpler, and I am less likely to call
free() on the pointer by mistake in my "destructor function" that frees a
structure.

>#define getstructx(size) \
>  ((struct whatever *)malloc(sizeof(struct whatever) + (size)))

I prefer *not* to do this, because if there are problems with the technique,
porting will involve more than simply changing this macro.  Some things are
better hidden, and some are not, I think.
-- 
Liam R. Quin, Unixsys (UK) Ltd [note: not an employee of "sq" - a visitor!]
lee@sq.com (Whilst visiting Canada from England, until Christmas)
 -- I think I'm going to come out at last...
 -- What?  Admit you're not a fundamentalist Jew?  They'll *crucify* you!  :-)

bill@twwells.com (T. William Wells) (12/13/89)

In article <557@codonics.COM> bret@codonics.UUCP (Bret Orsburn) writes:
: >Never mind that the committee included at least one feeping
: >creature, it was not their business to include every feature that
: >someone had dreamt up for a C compiler.
:
: There's a false dichotomy in there somewhere.
:
: There must be some ground between mandating a feature and forbidding it,
: or none of the unique features of any architecture can be entailed in
: a conforming implementation.

No false dichotomy at all. Reread the paragraph two before that
one:

: Nope. Doesn't wash. Only if feature X were either a de facto or a
: de jure standard (such as they were), or filled a very important,
: portable, need, would this be a valid argument.

In other words, OF COURSE, they should include some features. But
the mere fact that some systems permitted a thing does not,
without further consideration, imply that ANSI was remiss in not
including the feature.

Features they added that fit the above were prototypes, the const
and volatile keywords, and many things in the library.

To repeat what I said earlier: it would have been nice if they had
permitted zero length arrays and I even asked about them in my
comments in the public review, but it is certainly no disaster.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

peter@isgtec.UUCP (Peter Curran) (12/14/89)

This is a re-posting, due to technical difficulties.

Without repeating all the discussion, much of this topic has been centred
around a single zero-length array in a structure, for which the actual
size is to be determined when an instance is malloc'd.

The Rationale document (my copy is X3J11/88-091, 13 May 1988), in section
3.5.4.2, pp. 54-55, suggests the use of array size "1".  This works
just as well as zero, except you have to subtract 1 (or sizeof(elt[0]))
from the amount malloc'd (except for the most common use, a variable-length
string, where the initial element accounts for the terminating NUL).
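
A sketch of that arithmetic for a non-char element type, assuming a
made-up struct vec and helper vec_new.  One element is already counted in
sizeof(struct vec), so only n - 1 more are requested.  Whether indexing
past the declared bound is strictly conforming is exactly what is disputed
elsewhere in this thread:

```c
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* malloc */

struct vec {
    size_t count;
    long   elt[1];   /* declared with one element; the rest follow in the allocation */
};

/* Hypothetical helper: allocate a vec with n zeroed elements (n >= 1). */
struct vec *vec_new(size_t n)
{
    struct vec *v;
    size_t i;

    if (n == 0)
        return NULL;   /* this sketch does not handle the empty case */

    /* sizeof(struct vec) already includes elt[0]. */
    v = (struct vec *)malloc(sizeof(struct vec) + (n - 1) * sizeof(long));
    if (v == NULL)
        return NULL;

    v->count = n;
    for (i = 0; i < n; i++)
        v->elt[i] = 0;
    return v;
}
```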

I realize the Rationale is not part of the standard, but I think it
would take an awfully brave implementor to disallow a construct
that the Rationale condones.

battle@alphard.cs.utk.edu (David Battle) (12/14/89)

In article <70188@psuecl.bitnet> c9h@psuecl.bitnet (Charles Hannum) writes:
>It seems that the main reason (and *only* even half-way decent reason) for
>using a 0-length array is to allocate a variable amount of memory for a
>structure depending on the length of the array.  This seems reasonable.

Is the order of elements of a structure guaranteed to be the same in memory
as in the program?  That is, given:

struct foo {
    int a;
    int b;
};

struct foo bar;

Is &bar.a guaranteed to be < &bar.b?

					-David L. Battle
					 battle@battle.esd.ornl.gov
					 battle@utkux1.utk.edu

darcy@druid.uucp (D'Arcy J.M. Cain) (12/14/89)

In article <1989Dec12.110042.29290@gdt.bath.ac.uk> exspes@gdr.bath.ac.uk (P E Smee) writes:
>
>One reason I haven't seen given yet is to provide a marker for the end
>of an entity.  I could use such a beast (more or less) as follows
>(example slightly simplified)...
>
>    struct fred {
>	struct fred * last;
>	struct fred * next;
>	long offset;
>	int thing;
>	   ...  bunch of more data stuff
>	char end_marker[0];         /* MUST BE LAST IN STRUCTURE */
>	};
>
>    ....
>
>    bzero (&fred.offset, &fred.end_marker - &fred.offset);
>
This looks like a good idea but I don't understand why it must be the
last element.  How about the following:

    struct fred {
	struct fred * last;
	long offset;
	int thing;
	   ...  bunch of more data stuff
	char end_marker[0];
	struct fred * next;
	};

Then your bzero still works but on a chunk of the structure internal to
the structure.  You could also use the address of next but the above 
should still work (assuming of course an ANSI standard different from what
we have.)

Of course I am assuming that this mythical construct would act as it does in
assembler, where

LABEL1:
LABEL2:
	dw	0

evaluates both LABEL1 and LABEL2 as pointing to the same memory location.

This is fun.  Maybe we need a comp.lang.mythical :-)


-- 
D'Arcy J.M. Cain (darcy@druid)     |   Thank goodness we don't get all 
D'Arcy Cain Consulting             |   the government we pay for.
West Hill, Ontario, Canada         |
No disclaimers.  I agree with me   |

henry@utzoo.uucp (Henry Spencer) (12/15/89)

In article <1498@utkcs2.cs.utk.edu> battle@alphard.cs.utk.edu (David Battle) writes:
>Is the order of elements of a structure guaranteed to be the same in memory
>as in the program? ...

In ANSI C, yes.  (Section 3.5.2.1)  In pre-ANSI implementations, well,
probably.
-- 
1755 EST, Dec 14, 1972:  human |     Henry Spencer at U of Toronto Zoology
exploration of space terminates| uunet!attcan!utzoo!henry henry@zoo.toronto.edu

bill@twwells.com (T. William Wells) (12/15/89)

In article <1989Dec12.110042.29290@gdt.bath.ac.uk> exspes@gdr.bath.ac.uk (P E Smee) writes:
: One reason I haven't seen given yet is to provide a marker for the end
: of an entity.  I could use such a beast (more or less) as follows
: (example slightly simplified)...
:
:     struct fred {
:       struct fred * last;
:       struct fred * next;
:       long offset;
:       int thing;
:          ...  bunch of more data stuff
:       char end_marker[0];         /* MUST BE LAST IN STRUCTURE */
:       };
:
:     ....
:
:     bzero (&fred.offset, &fred.end_marker - &fred.offset);

The above is missing some casts. And it can be done as well with
the following:

	bzero((char *)&fred.offset, sizeof(fred)
	    - ((char *)&fred.offset - (char *)&fred));

which does not need the end_marker. This being as ugly as it is,
I might just code it as:

	tmp1 = fred.last;
	tmp2 = fred.next;
	bzero((char *)&fred, sizeof(fred));
	fred.last = tmp1;
	fred.next = tmp2;

I would certainly do so for exactly one element to be saved.

Here's another solution; it too is a bit ugly, but hides the
ugliness in the structure definition:

	struct fred {
		struct fred * fred_last;
		struct fred * fred_next;
		struct {
			long _fred_offset;
			int _fred_thing;
			   ...  bunch of more data stuff
		} fred_stuff;
	};
	#define fred_offset fred_stuff._fred_offset
	#define fred_thing  fred_stuff._fred_thing

	...

	bzero((char *)&fred.fred_stuff, sizeof(fred.fred_stuff));

This method leads to many headaches unless you make it a practice
to use structure prefixes or some other method of ensuring
uniqueness of names.

On the other hand, it has the advantage that it will likely
generate slightly better code than any of the other methods. The
first two methods, for example, are very likely to generate
several extra instructions to compute the amount of space to
clear.

:                                                    This is quicker than
: zeroing each item in the struct separately, and I don't have to worry
: about possible padding -- where I would if I tried to work out the size
: of the bzero from the sizes of the data items in the region I want to
: zap.

Warning: null pointers and floating point zeros are *not*
necessarily represented by bit patterns of all zero bits.
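
One way to sidestep that, sketched on the assumption that the data members
are grouped in a substructure: assign from a static object, which the
compiler initializes member by member (0, 0.0, null pointer) rather than
with all-zero bits.  The names fred_data, fred_data_zero and fred_reset,
and the scale and label members, are made up for illustration:

```c
#include <stddef.h>   /* NULL */

struct fred_data {
    long   offset;
    int    thing;
    double scale;   /* hypothetical floating point member */
    char  *label;   /* hypothetical pointer member */
};

struct fred {
    struct fred *last;
    struct fred *next;
    struct fred_data stuff;
};

/* Members without an explicit initializer get 0, 0.0 or a null pointer. */
static const struct fred_data fred_data_zero = { 0 };

/* Hypothetical helper: reset the data, leave the list links alone. */
void fred_reset(struct fred *f)
{
    f->stuff = fred_data_zero;   /* structure assignment, no bit-pattern assumptions */
}
```

Structure assignment may be slower than bzero on some compilers, so this
trades a little speed for not depending on the representation of zero.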

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

bill@twwells.com (T. William Wells) (12/15/89)

In article <1498@utkcs2.cs.utk.edu> battle@alphard.cs.utk.edu (David Battle) writes:
: Is the order of elements of a structure guaranteed to be the same in memory
: as in the program?  That is, given:
:
: struct foo {
:     int a;
:     int b;
: };
:
: struct foo bar;
:
: Is &bar.a guaranteed to be < &bar.b?

I don't think that K&R guarantees it but it might.

The dpANS does, however, guarantee it. See 3.5.2.1.
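
A small check of that guarantee, as a sketch only; foo_check_order
is a made-up name, and the casts to char * are there merely to make
the comparison explicit:

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

struct foo {
    int a;
    int b;
};

/* Aborts via assert if the members are not laid out in
   declaration order. */
void foo_check_order(void)
{
    struct foo bar;

    /* The first member sits at the start of the structure. */
    assert(offsetof(struct foo, a) == 0);
    /* Later members have higher offsets, hence higher addresses. */
    assert(offsetof(struct foo, a) < offsetof(struct foo, b));
    assert((char *)&bar.a < (char *)&bar.b);
}
```

Note that the guarantee is only about order; there may still be
padding between the members.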

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

exspes@gdr.bath.ac.uk (P E Smee) (12/18/89)

In article <1989Dec14.182113.5398@twwells.com> bill@twwells.com (T. William Wells) writes:
>In article <1989Dec12.110042.29290@gdt.bath.ac.uk> exspes@gdr.bath.ac.uk (P E Smee) writes:
>The above is missing some casts. And it can be done as well with

Well, I said I'd simplified it.  

Saving the pointers, zeroing the whole thing, and then putting the
pointers back I considered, but the construct occurs in a
frequently-used part of the program, so I was willing to sacrifice
beauty for speed.  I don't recall why I didn't substructure the stuff
after the pointers, but I do recall thinking at the time that it would
not be a great idea in context.  I *have* localized the definition of
my clear operation so that you only have to understand what I'm doing
(and why) once in the program.

>On the other hand, it has the advantage that it will likely
>generate slightly better code than any of the other methods. The
>first two methods, for example, are very likely to generate
>several extra instructions to compute the amount of space to
>clear.

I was trusting the compiler to realize that the size of the structure was
known at compile time (though not to me), and so to optimize that
computation.  My trust was repaid on the one machine where I looked at
the object code generated.

>Warning: null pointers and floating point zeros are *not*
>necessarily represented by bit patterns of all zero bits.

True, well known, and not an issue in my case.  (In fact, on all my
target machines, they are; but I've learned not to count on that.)

I also contemplated (since my struct was fixed size) putting the
pointers at the end, and using one of them as the effective marker.
However, that interacted badly with the paging algorithm on one of my
target machines (extra paging to get at the most frequently used item)
and didn't gain any compensating advantage on any of the others.
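
For the record, that rejected layout would look roughly like the
following. This is a sketch with assumed member types and a made-up
function name, written with offsetof and memset rather than the
address arithmetic and bzero used elsewhere in this thread:

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memset */

/* Hypothetical layout with the link pointers moved to the end. */
struct fred {
    long offset;
    int thing;
    struct fred *last;   /* first member NOT to be cleared */
    struct fred *next;
};

/* Clear everything that precedes `last`; the pointer's own offset
   serves as the end marker. */
void fred_clear_leading(struct fred *f)
{
    assert(f != NULL);
    memset(f, 0, offsetof(struct fred, last));
}
```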

I suppose I ought to say that the construct is *NOT* one that I am
particularly proud of, and I don't find it beautiful, but it *IS* a
taken-from-life example of where a zero-length entity could sensibly be
used.  (Which is all I meant it to be.)  It is also well-commented in
my original to indicate what I am doing, why I am doing it that way,
and why I am not happy that I have done so.

-- 
Paul Smee, Univ of Bristol Comp Centre, Bristol BS8 1TW, Tel +44 272 303132
 Smee@bristol.ac.uk  :-)  (..!uunet!ukc!gdr.bath.ac.uk!exspes if you MUST)