escott%deis.uci.edu@icsg.uci.edu (01/30/87)
Somebody recently came to me with a program that worked on a VAX 11/750 running 4.2BSD but failed on our Sequent Balance 21000 running Dynix 2.1. The apparent culprit was of course the C compiler on the latter machine. However, after examining the code in question, I found a construct that seems a little strange to me: an automatic variable was declared as a "struct foo **bar[]". "How could this be right?" I said to myself. "How can you declare an automatic variable that has no size?"

So I wrote a program that contained a similar declaration, and then tried to take sizeof( bar ). Sure enough:

    warning: sizeof returns 0

[This from the VAX 11/750 4.2BSD compiler]

Okay, that makes sense. My question is: is there any reason why you should be able to declare an array with zero elements as an automatic variable? What's strange is that, on the VAX, the program apparently successfully dereferenced bar, both setting a value for "*bar" and then using that value later. How can this be right? How can "bar" have any value at all, much less "*bar"? If there is no use for a zero-sized automatic variable, how come the compiler lets you do it? (Even a C compiler should occasionally clamp down 8^).

And, just for the heck of asking, does ANSI C let you make such a declaration?

+-------------------------------------------------------------------------+
Scott Menter
UCI ICS Computing Support Group
Univ. of Calif. at Irvine
(714) 856 7552
Irvine, California 92717
Internet: escott@ics.uci.edu
UUCP: ...!ucbvax!ucivax!escott
Bitnet: escott@uci
CSNet: escott%ics.uci.edu@csnet-relay
Internet (with Name Server): TBA
+-------------------------------------------------------------------------+
chris@mimsy.UUCP (Chris Torek) (02/02/87)
In article <4114@brl-adm.ARPA> escott%deis.uci.edu@icsg.uci.edu (Scott Menter) writes: >... is there any reason why you should be able to declare an array >with zero elements as an automatic variable? Why not? It makes sense. Perhaps it should elicit a warning, since no members of that array are accessible: Valid subscripts are in the range [0..0). >What's strange is that, on the VAX, the program apparently successfully >dereferenced bar, both setting a value for "*bar" and then using that value >later. How can this be right? Just luck. >And, just for the heck of asking, does ANSI C let you make such a >declaration? There seems to be a great debate over malloc(0), with some support as well for empty arrays. It is trivial to allow either, or to disallow either; some argue in favour of `catching the programmer's mistakes for him', while others argue that the construct may not be a mistake, or may have been written by a machine, and that having special cases for zero is both unnecessary and ugly. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690) UUCP: seismo!mimsy!chris ARPA/CSNet: chris@mimsy.umd.edu
pinkas@mipos3.UUCP (02/02/87)
In article <4114@brl-adm.ARPA> escott%deis.uci.edu@icsg.uci.edu (Scott Menter) writes:
> ... I found a construct that
>seems a little strange to me: an automatic variable was declared as a
>"struct foo **bar[]". "How could this be right?" I said to myself. "How
>can you declare an automatic variable that has no size?"
>
>So I wrote a program that contained a similar declaration, and then tried to
>take sizeof( bar ). Sure enough:
>
>    warning: sizeof returns 0
>
>[This from the VAX 11/750 4.2BSD compiler]
>
>Okay, that makes sense. My question is: is there any reason why you should
>be able to declare an array with zero elements as an automatic variable?
>What's strange is that, on the VAX, the program apparently successfully
>dereferenced bar, both setting a value for "*bar" and then using that value
>later. How can this be right? How can "bar" have any value at all, much
>less "*bar"? If there is no use for a zero-sized automatic variable, how
>come the compiler lets you do it?

I don't see the problem with this declaration. bar is declared to be an array of pointers to pointers to struct foo. That is, **(bar[0]) is of type foo. bar initially has no memory allocated to it. This type of construct appears to be a dynamic array, where malloc will be called to get some memory.

Since the array is declared to have zero elements, sizeof will return zero. (Remember that sizeof(array) =~ sizeof(element of array) times number of elements. This is approximate because C allows a compiler to pack arrays.) So in your case, the compiler was correct in warning you that bar was of size zero (taking sizeof a zero sized element is not very useful as the most common uses for sizeof are malloc and pointer arithmetic when something casts the pointer to a different type). You should inspect the code, but if it worked on one machine, it should work on another. It could be that they really wanted to say sizeof(foo), in something like:

    bar = malloc(sizeof(struct foo) * 100)

which would allocate 100 elements to the array bar, making it the equivalent of the auto declaration struct foo **bar[100].

-Israel
--
----------------------------------------------------------------------
UUCP: {amdcad,decwrl,hplabs,oliveb,pur-ee,qantel}!intelca!mipos3!pinkas
ARPA: pinkas%mipos3.intel.com@relay.cs.net
CSNET: pinkas%mipos3.intel.com
tim@amdcad.UUCP (02/03/87)
In article <397@mipos3.UUCP> pinkas@mipos3.UUCP (Israel Pinkas) writes:
>In article <4114@brl-adm.ARPA> escott%deis.uci.edu@icsg.uci.edu (Scott Menter) writes:
>> ... I found a construct that
>>seems a little strange to me: an automatic variable was declared as a
>>"struct foo **bar[]". "How could this be right?" I said to myself. "How
>>can you declare an automatic variable that has no size?"

It isn't right!

>
>I don't see the problem with this declaration. bar is declared to be an
>array of pointers to pointers to struct foo. That is, **(bar[0]) is of
>type foo. bar initially has no memory allocated to it. This type of
>construct appears to be a dynamic array, where malloc will be called to get
>some memory. Since the array is declared to have zero elements, sizeof
>will return zero. (Remember that sizeof(array) =~ sizeof(element of array)
>times number of elements. This is approximate because C allows a compiler
>to pack arrays.) So in your case, the compiler was correct in warning you
>that bar was of size zero (taking sizeof a zero sized element is not very
>useful as the most common uses for sizeof are malloc and pointer arithmetic
>when something casts the pointer to a different type). You should inspect
>the code, but if it worked on one machine, it should work on another. It
>could be that they really wanted to say sizeof(foo), in something like:
>
>    bar = malloc(sizeof(struct foo) * 100)
     ^^ Won't work; bar is a *constant* (see pp 94, 95 of K&R)

>which would allocate 100 elements to the array bar, making it the
>equivalent of the auto declaration struct foo **bar[100].

There are only 3 places where an array declaration is not required to declare a size between the brackets []:

1: an extern array --> extern int foo[];
2: an initialized array --> int foo[] = {1,2,3};
3: an array parameter --> foo(bar) int bar[];

Tim Olson
Advanced Micro Devices
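The three cases Tim lists can be sketched roughly as follows. This is only an illustration, not code from the program under discussion: the names `table`, `primes`, and `sum` are invented here, and the parameter is written with an ANSI-style prototype rather than the K&R form shown in the post.

```c
#include <stddef.h>

/* 1: an extern array. The size comes from the defining declaration,
      which would normally live in another source file. */
extern int table[];

/* 2: an initialized array. The compiler counts the initializers. */
int primes[] = {2, 3, 5, 7};

/* 3: an array parameter. It is really a pointer, so the caller has
      to pass the element count separately. */
int sum(int values[], size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* The definition that completes the extern declaration above
   (kept in the same file only so that the sketch is self-contained). */
int table[3] = {10, 20, 30};
```

In none of these cases does the array actually end up with an unknown size: the linker, the initializer list, or the caller supplies it. An automatic `struct foo **bar[];` has no such source.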
greg@utcsri.UUCP (Gregory Smith) (02/03/87)
In article <4114@brl-adm.ARPA> escott%deis.uci.edu@icsg.uci.edu writes: >Somebody recently came to me with a program that worked on a VAX 11/750 >running 4.2BSD but failed on our Sequent Balance 21000 running Dynix 2.1. > [...] after examining the code in question, I found a construct that >seems a little strange to me: an automatic variable was declared as a >"struct foo **bar[]". "How could this be right?" I said to myself. "How >can you declare an automatic variable that has no size?" > >So I wrote a program that contained a similar declaration, and then tried to >take sizeof( bar ). Sure enough: > warning: sizeof returns 0 >[This from the VAX 11/750 4.2BSD compiler] > >Okay, that makes sense. My question is: is there any reason why you should >be able to declare an array with zero elements as an automatic variable? >What's strange is that, on the VAX, the program apparently successfully >dereferenced bar, both setting a value for "*bar" and then using that value >later. How can this be right? How can "bar" have any value at all, much >less "*bar"? If there is no use for a zero-sized automatic variable, how >come the compiler lets you do it? Well, *bar is just bar[0], and is of type (struct foo **). There is indeed no storage reserved for this array element. It is like declaring 'char blat[6]' and then setting "blat[6]='?';". blat[0] through blat[5] exist, and blat[6] is simply the next char after blat[5]. The C language does not guarantee what that is. On the vax compiler the 'bar' declaration reserves zero words for the array 'bar', and then bar[0] is the word *after* that zero-length array. Despite having no length, the array still has an address, and bar[0] is effectively stored at this same address. Since the array occupies zero memory, another variable may start in the same place, and bar[0] will reference the memory occupied by this other variable. 
The goof who wrote it probably knew that, with the VAX compiler, setting *bar would actually set the next declared auto (a little knowledge is a dangerous thing). I.e. if it looks like this:

    {
        struct foo **bar[];
        char *ptr;
        ...

Then ptr and *bar are stored in the same place. Of course this is non-portable as you have found. Presumably the code using *bar depends on this shared storage. A quick and dirty fix (fight nasty with nasty?) is

    #define bar (struct foo ***)&ptr

instead of the declaration for bar. This will achieve the same effect and is considerably more portable. It still isn't really correct; the shared storage should be done either by use of a union, or by casting between the stored type and the 'struct foo **' type. The choice depends on what is actually being done with this pointer.

"No one ever said it was going to be easy...."
--
----------------------------------------------------------------------
Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...
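A minimal sketch of the union alternative Greg mentions, assuming the code really does want one slot that is sometimes a `char *` and sometimes a `struct foo **`. The original program is not shown in this thread, so `union slot`, `store_and_fetch`, and the contents of `struct foo` are all hypothetical.

```c
struct foo { int value; };

/* One storage slot, viewed either as a char * or as a struct foo **.
   Only the member most recently stored should be read back. */
union slot {
    char        *ptr;
    struct foo **bar;
};

/* Store a value through the `bar` member and read it back through
   the same member, with no reliance on how autos are laid out. */
struct foo **store_and_fetch(struct foo **p)
{
    union slot s;
    s.bar = p;
    return s.bar;
}
```

Unlike the zero-length array trick, the sharing is stated in the declaration, so it does not depend on the order in which a particular compiler places automatic variables.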
chris@mimsy.UUCP (Chris Torek) (02/04/87)
>>>"struct foo **bar[]". >In article <397@mipos3.UUCP> pinkas@mipos3.UUCP (Israel Pinkas) writes: >>I don't see the problem with this declaration. ... >> >> bar = malloc(sizeof(struct foo) * 100) In article <14575@amdcad.UUCP> tim@amdcad.UUCP (Tim Olson) writes: > ^^ Won't work; bar is a *constant* (see pp 94, 95 of K&R) To be picky, bar is neither a constant nor a variable. Its value is set at entry to the function, and cannot be changed within that function invocation---not without cheating: all automatic variables are really just names for stack frame offsets, so altering the stack or frame pointer shuffles all the variables. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690) UUCP: seismo!mimsy!chris ARPA/CSNet: chris@mimsy.umd.edu
pinkas@mipos3.UUCP (02/04/87)
In article <5258@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <4114@brl-adm.ARPA> escott%deis.uci.edu@icsg.uci.edu
>(Scott Menter) writes:
>>... is there any reason why you should be able to declare an array
>>with zero elements as an automatic variable?
>
>Why not? It makes sense. Perhaps it should elicit a warning, since
>no members of that array are accessible: Valid subscripts are in the
>range [0..0).

Wrong. There are no valid subscripts to the array. To allow a subscript of 0, the array must be declared as bar[1]. Remember, the valid subscripts of an array declared foo[n] are [0..n-1].

Regarding this problem, in a former posting I mentioned that the declaration of struct foo **bar[] as an auto variable would be useful as a dynamic array. I have since been corrected. Someone (I deleted the mail message, so I don't have your name, sorry) pointed out that K&R state that an array is a constant and is thus unusable as an lvalue. To make a dynamic array, the declaration should read struct foo ***bar. When malloc'ed, it will yield an array of pointers to pointers to struct foo.

-Israel
--
----------------------------------------------------------------------
UUCP: {amdcad,decwrl,hplabs,oliveb,pur-ee,qantel}!intelca!mipos3!pinkas
ARPA: pinkas%mipos3.intel.com@relay.cs.net
CSNET: pinkas%mipos3.intel.com
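The corrected declaration might be used along these lines. This is a sketch only: `make_table` is an invented helper, and the element size is taken from `*bar` (that is, `sizeof(struct foo **)`) rather than `sizeof(struct foo)` as in the earlier posting.

```c
#include <stdlib.h>

struct foo { int value; };

/* Allocate room for n elements of type `struct foo **` and clear them.
   Returns NULL if the allocation fails. */
struct foo ***make_table(size_t n)
{
    struct foo ***bar = malloc(n * sizeof *bar);
    if (bar != NULL) {
        for (size_t i = 0; i < n; i++)
            bar[i] = NULL;
    }
    return bar;
}
```

After `bar = make_table(100)`, the subscripts bar[0] through bar[99] are valid, much as with the auto declaration `struct foo **bar[100]`, except that the storage lives until it is passed to free().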
chris@mimsy.UUCP (02/05/87)
>In article <5258@mimsy.UUCP> I wrote: >>[for automatic array declarations of the form `int a[];' valid subscripts >>are in the range [0..0). In article <409@mipos3.UUCP> pinkas@mipos3.UUCP (Israel Pinkas) writes: >Wrong. Not so. >There are no valid subscripts to the array. That is what I said. The valid subscripts are in the range [0..0). Perhaps you would prefer the [0..0[ form? (I always thought that form particularly vile.) The notation [0..0) means the half-open interval between zero and zero, i.e., the null set. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690) UUCP: seismo!mimsy!chris ARPA/CSNet: chris@mimsy.umd.edu
throopw@dg_rtp.UUCP (02/07/87)
> chris@mimsy.UUCP (Chris Torek) >> escott%deis.uci.edu@icsg.uci.edu (Scott Menter) >>... is there any reason why you should be able to declare an array >>with zero elements as an automatic variable? > Why not? It makes sense. [...] > some argue in favour of `catching the programmer's > mistakes for him', while others argue that the construct may not > be a mistake, or may have been written by a machine, and that having > special cases for zero is both unnecessary and ugly. This is sensible, I agree. But it is worth noting that int foo[0]; and int foo[]; are NOT the same thing. Even if X3J11 were to take the reasonable approach and allow the first as an automatic declaration, the second should still be an error as an automatic declaration. -- IBM manuals are written by little old ladies in Poughkeepsie who are instructed to say nothing specific. --- R. T. Lillington -- Wayne Throop <the-known-world>!mcnc!rti-sel!dg_rtp!throopw
greg@utcsri.UUCP (Gregory Smith) (02/10/87)
In article <397@mipos3.UUCP> pinkas@mipos3.UUCP (Israel Pinkas) writes: > [...] (Remember that sizeof(array) =~ sizeof(element of array) >times number of elements. This is approximate because C allows a compiler >to pack arrays.) The relationship is exact: sizeof(array)==sizeof( array[0] )*(# of elements). An array may not be packed in a way which makes this relationship inexact. -- ---------------------------------------------------------------------- Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg Have vAX, will hack...
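A small check of the relationship Greg states, assuming nothing beyond standard C. The struct layout, the array names, and the `COUNT_OF` macro are invented for the illustration.

```c
#include <stddef.h>

/* A struct with internal padding on most machines, to show that the
   padding is part of each element and not something added per array. */
struct foo { char tag; double weight; };

/* Number of elements in a true array (not a pointer). */
#define COUNT_OF(a) (sizeof (a) / sizeof (a)[0])

struct foo   items[7];
struct foo **ptrs[12];

/* Size of the whole `items` array in bytes. */
size_t items_bytes(void)
{
    return sizeof items;
}
```

Any padding the compiler wants goes inside `struct foo` itself, so `sizeof items` is exactly `7 * sizeof(struct foo)`. This is why the usual element-count idiom in `COUNT_OF` works.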
braner@batcomputer.UUCP (02/11/87)
In the famous "microEMACS" by David Conroy, which has been widely
utilized and modified, the basic text-line structure looks like this:
typedef struct LINE {
    struct LINE *nextline;
    struct LINE *prevline;
    short size; /* s.b. int! */
    short used;
    char text[]; /* !!!!!!!!! */
} LINE;
The idea is to allocate memory for lines as follows:
lineptr = malloc(sizeof(LINE)+length);
where length is as needed at the time for that line. The actual text
of the line is stored OUTSIDE the struct, starting at lineptr->text[0].
This is, of course, "illegal". Some compilers give a warning about
"zero-size structure element".
Question: Do some compilers refuse to accept this? Is there a GOOD
way to do it legally? (NOTE: I KNOW that you can use:
...
char *text;
...
lineptr = malloc(sizeof(LINE));
lineptr->text = malloc(length);
- but the illegal version saves the overhead of the extra pointer and
the overhead of the extra malloc() control block. In this application
this saving is important, since there will be hundreds or even thousands
of LINEs.)
- Moshe Braner
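For readers coming to this thread later: the C standard published in 1999 made essentially this construct legal under the name "flexible array member", provided the unsized array is the last member of the struct. A sketch under that assumption follows; `line_new` is an invented helper, not a function from microEMACS, and the size fields are widened to int as the comment in the original suggests.

```c
#include <stdlib.h>
#include <string.h>

typedef struct LINE {
    struct LINE *nextline;
    struct LINE *prevline;
    int  size;     /* bytes allocated for text */
    int  used;     /* bytes of text in use */
    char text[];   /* flexible array member (C99 and later) */
} LINE;

/* Hypothetical helper: allocate a LINE holding a copy of `length`
   bytes from `src`. Returns NULL if malloc fails. */
LINE *line_new(const char *src, size_t length)
{
    LINE *lp = malloc(sizeof(LINE) + length);
    if (lp == NULL)
        return NULL;
    lp->nextline = NULL;
    lp->prevline = NULL;
    lp->size = (int)length;
    lp->used = (int)length;
    memcpy(lp->text, src, length);
    return lp;
}
```

With this form `sizeof(LINE)` does not count the text at all, so the allocation is the same one-malloc, no-extra-pointer layout the original code was after.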
mouse@mcgill-vision.UUCP (02/11/87)
In article <409@mipos3.UUCP>, pinkas@mipos3.UUCP (Israel Pinkas) writes: > In article <5258@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes: >> In article <4114@brl-adm.ARPA> escott%deis.uci.edu@icsg.uci.edu (Scott Menter) writes: >>> ... is there any reason why you should be able to declare an array >>> with zero elements as an automatic variable? Uniformity. Note that this, ie [0], is not the same as []. >> Why not? [...] no members of that array are accessible: Valid >> subscripts are in the range [0..0). It doesn't even occupy any storage (at least it does, zero bytes of it), so sure, why not? > Wrong. There are no valid subscripts to the array. That is what Chris meant (I'm sure). Mathematicians use square brackets to denote a closed interval end and parentheses to denote an open end, so that [1..10) would indicate those x for which 1<=x<10. This is arguably inconsistent when both ends are the same value, but I, at least, found his meaning perfectly clear anyway. (Generally, if you disagree with Chris about a point of fact (as opposed to opinion), check your beliefs, assumptions, and understanding of his posting very carefully; he's usually right.) der Mouse USA: {ihnp4,decvax,akgua,utzoo,etc}!utcsri!mcgill-vision!mouse think!mosart!mcgill-vision!mouse Europe: mcvax!decvax!utcsri!mcgill-vision!mouse ARPAnet: think!mosart!mcgill-vision!mouse@harvard.harvard.edu
drw@cullvax.UUCP (02/12/87)
braner@batcomputer.tn.cornell.edu (braner) writes: > In the famous "microEMACS" by David Conroy, which has been widely > utilized and modified, the basic text-line structure looks like this: > > typedef struct LINE { > struct LINE *nextline; > struct LINE *prevline; > short size; /* s.b. int! */ > short used; > char text[]; /* !!!!!!!!! */ > } LINE; > > The idea is to allocate memory for lines as follows: > > lineptr = malloc(sizeof(LINE)+length); > > where length is as needed at the time for that line. The actual text > of the line is stored OUTSIDE the struct, starting at lineptr->text[0]. > This is, of course, "illegal". Some compilers give a warning about > "zero-size structure element". > > Question: Do some compilers refuse to accept this? Is there a GOOD > way to do it legally? (NOTE: I KNOW that you can use: Replace "char text[]" with "char text[0]". This leaves the declaration perfectly legitimate. Probably it isn't kosher according to ANSI to reference foo.text[27], but the various requirements that ANSI puts on make it extremely likely that it will work in any conforming implementation. Dale -- Dale Worley Cullinet Software UUCP: ...!seismo!harvard!mit-eddie!cullvax!drw ARPA: cullvax!drw@eddie.mit.edu
tanner@ki4pv.UUCP (02/12/87)
) braner@batcomputer.tn.cornell.edu (braner) writes ) ... char text[] ... ) ... Question: Do some compilers refuse to accept this? Yes, the microsoft compiler (at least the one distributed by SCO as 2.2\(*b) refuses to accept this. Says "unknown size", of all things. I can understand and sympathise with this, of course -- the size is not specified, so I'd consider it unknown too. However, note that the obvious replacement (char text[0]) elicits the same error message, even though the size IS known (to be zero). In all fairness, this is a beta release of the compiler, and it may have been fixed already. -- <std dsclm, copies upon request> Tanner Andrews
dhb@rayssd.UUCP (02/12/87)
In article <159@batcomputer.tn.cornell.edu> braner@batcomputer.UUCP (braner) writes:
> [much discussion of a structure with a trailing character string
> and the fact that the way it is being used is illegal. also
> mention of the fact that the overhead of an extra pointer and
> malloc control block might be critical factors.]

I have run into this problem on several occasions and have come up with what I think is a reasonable solution (actually two solutions). The approach that I prefer is the following:

1. change the definition of 'text' to 'char *text;'
2. do the malloc() for 'sizeof(LINE)+length'
3. set the text pointer to the base address of the structure plus sizeof(LINE).

This approach has the added overhead of an extra pointer but it eliminates the extra malloc control block. Since the malloc control block is generally larger than a pointer this is a reasonable tradeoff. It also eliminates the extra call to malloc which can be important if you want your application to run fast.

Another approach that I have used is to define multiple structures. The first structure has everything except the 'text' variable, the second structure consists of an instance of the first structure followed by a huge text buffer. For example:

    struct header_junk {
        int length;
        int other stuff;
        whatever else you need;
    };
    struct LINE {
        struct header_junk hj;
        char text[32768]; /* or other large number */
    };

When you malloc() the structure, specify the size as 'sizeof(struct header_junk)+length' but assign the pointer to something of type 'struct LINE'. This has the minor drawback of adding another level of indirection to get at the variables in the header and on some machines (actually: some compilers) this might add to the execution time. This can be taken care of by using a pointer to the header area. Note that this only adds one pointer instead of one pointer for each line element. You could probably even cheat a little and just use a cast to convert the pointer to the proper type.
-- David H. Brierley Raytheon Submarine Signal Division; Portsmouth RI; (401)-847-8000 x4073 smart mailer or arpanet: dhb@rayssd.ray.com old dumb mailer or uucp: {cbosgd,gatech,ihnp4,linus!raybed2} !rayssd!dhb
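The first approach in the post above (one malloc, with the text pointer aimed just past the struct) could look roughly like this. It is a sketch, not David's actual code: `line_alloc` is an invented name and the field types are guesses based on the microEMACS struct quoted earlier.

```c
#include <stdlib.h>

typedef struct LINE {
    struct LINE *nextline;
    struct LINE *prevline;
    int   size;
    int   used;
    char *text;    /* points into the same malloc block, after the struct */
} LINE;

/* Hypothetical helper following the three steps in the post:
   text is a pointer, one malloc covers header plus text, and the
   pointer is set to the base address plus sizeof(LINE). */
LINE *line_alloc(size_t length)
{
    LINE *lp = malloc(sizeof(LINE) + length);
    if (lp == NULL)
        return NULL;
    lp->nextline = NULL;
    lp->prevline = NULL;
    lp->size = (int)length;
    lp->used = 0;
    lp->text = (char *)lp + sizeof(LINE);
    return lp;
}
```

Because the text is plain char data, no alignment adjustment is needed after the struct, and a single free(lp) releases both the header and the text.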
colin@vu-vlsi.UUCP (02/13/87)
In article <159@batcomputer.tn.cornell.edu> braner@batcomputer.UUCP (braner) writes: > >In the famous "microEMACS" by David Conroy, which has been widely >utilized and modified, the basic text-line structure looks like this: > >typedef struct LINE { > struct LINE *nextline; > struct LINE *prevline; > short size; /* s.b. int! */ > short used; > char text[]; /* !!!!!!!!! */ >} LINE; > >The idea is to allocate memory for lines as follows: > > lineptr = malloc(sizeof(LINE)+length); Some other people suggested declaring the text field to be char *text, but I'm surprised no one suggested this: Declare the text field to be char text[1], then use lineptr = malloc(sizeof(LINE)-1+length); Almost all compilers will optimize sizeof(LINE)-1 into a single constant, so the code generated is likely to be exactly the same as that generated for the uEmacs example above...[Of course you can cast the argument to (unsigned) to keep lint happy.] Gnuplot (which we posted a couple weeks ago) uses this technique because it seemed to be the most portable... -Colin Kelley ..{cbmvax,pyrnj,bpa}!vu-vlsi!colin
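Colin's text[1] variant, written out as a sketch. `line_alloc1` is an invented name, and the code assumes `length` is at least 1. As others in this thread point out, indexing past text[0] relies on the compiler not range-checking the declared bound, so treat this as the traditional idiom rather than something the language guarantees.

```c
#include <stdlib.h>

typedef struct LINE {
    struct LINE *nextline;
    struct LINE *prevline;
    short size;
    short used;
    char  text[1];   /* declared with one element, really `size` bytes */
} LINE;

/* Hypothetical helper: the -1 gives back the one byte of text that
   sizeof(LINE) already counts. `length` is assumed to be >= 1. */
LINE *line_alloc1(size_t length)
{
    LINE *lp = malloc(sizeof(LINE) - 1 + length);
    if (lp == NULL)
        return NULL;
    lp->nextline = NULL;
    lp->prevline = NULL;
    lp->size = (short)length;
    lp->used = 0;
    return lp;
}
```

Since sizeof(LINE) usually includes some trailing padding, this tends to allocate a few bytes more than strictly necessary, which is harmless.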
gwyn@brl-smoke.UUCP (02/13/87)
In article <795@cullvax.UUCP> drw@cullvax.UUCP (Dale Worley) writes: -braner@batcomputer.tn.cornell.edu (braner) writes: -> typedef struct LINE { ->... -> char text[]; /* !!!!!!!!! */ -> } LINE; -Replace "char text[]" with "char text[0]". Since X3J11 hasn't agreed to permit 0-sized objects anywhere, you might be better off using "char text[1]". Turn off any range-checking your C system might have.
mikes@apple.UUCP (02/18/87)
We got bit by this one, where the user does

    struct Line {
        some overhead fields, like links, etc.
        char data[1]; /* actually, can be a lot more than one */
    };

The idea was that the data part is at the end of struct Line, and can be VERY long. We were using a Green Hills C compiler, which had this nice feature of using 'short math' for certain array index calculations. Of course, when the 'data' got to be VERY long, short math won't do, and this caused some hard-to-find problems.

Personally, I would incur the overhead of having an extra pointer in the structure, but if you really want to allocate the data *as part of struct Line*, then I am left with the feeling that the proper way to do this is:

    struct Line {
        overhead fields
        char data[MAX_IT_CAN_EVER_BE];
    };

and allocate it via

    malloc(sizeof(struct Line) - MAX_IT_CAN_EVER_BE + sizeofyourdata)

This isn't super clean, but I expect that for a language without dynamic arrays.

-- Michael Shannon {apple!mikes}