[comp.lang.c] Indefinite-length array as member of struct: how?

charles@c3pe.UUCP (Charles Green) (07/06/89)

I have an application where I'm building and manipulating a stack of
variable-length strings.  I've set up a linked list of nodes, each one
declared as follows:

struct node {
	struct node* next;
	char	string[];
} *nodeptr;

When I know how long the string is I'm pushing onto the stack, I say:

	nodeptr = malloc(strlen(data)+5);

to cover the struct node* and terminating NUL, and then simply

	strcpy(nodeptr->string, data);

The only problem I have is with compilation:  I get a warning about the
zero-length element "string".  I'd like to find out the "correct" way to
do this.  I'll be glad to summarize any Emailed responses, of course.

Thanks,		Charles Green		charles%c3pe@decuac.dec.COM
-- 
{decuac.dec.com,cucstud,sundc}!c3pe!charles	ex::!echo Boo:

henry@utzoo.uucp (Henry Spencer) (07/07/89)

In article <7360@c3pe.UUCP> charles@c3pe.UUCP (Charles Green) writes:
>I have an application where I'm building and manipulating a stack of
>variable-length strings...
>
>struct node {
>	struct node* next;
>	char	string[];
>} *nodeptr;

I believe I recall Dennis Ritchie once characterizing this sort of thing
as "unwarranted chumminess with the compiler"!  I suspect that a close
reading of the draft C standard says you can get away with it, but it's
still cheating in at least a small way.

If you want to shut your compiler up, try making that "[1]".
-- 
$10 million equals 18 PM       |     Henry Spencer at U of Toronto Zoology
(Pentagon-Minutes). -Tom Neff  | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

lmiller@venera.isi.edu (Larry Miller) (07/08/89)

In article <7360@c3pe.UUCP> charles@c3pe.UUCP (Charles Green) writes:
>I have an application where I'm building and manipulating a stack of
>variable-length strings.  I've set up a linked list of nodes, each one
>declared as follows:
>
>struct node {
>	struct node* next;
>	char	string[];
>} *nodeptr;
>
>When I know how long the string is I'm pushing onto the stack, I say:
>
>	nodeptr = malloc(strlen(data)+5);
>

	This is a cute way to build nodes in a linked list with
	different size contents, but it is fraught with peril.
	Here are some potential problems:

	1)  You malloc space for the length of the string plus 5.
	You are assuming that a struct node * is 4 bytes, but this
	won't be the case on a PC with a near pointer, for example.
	Instead try:

		nodeptr = malloc(strlen(data)+1 + sizeof(struct node *));

	You should test the return from malloc too.

	2)  This works because of a trick: the order in which fields in the
	structure are declared.  Simply changing the definition of a struct node
	to:
		struct node {
			char	string[];
			struct node* next;
		};
	causes disaster.

	3)  Because of the indeterminate length of the string field in each
	node, you can't pass a struct node to/from a function.  All that gets
	copied over will be the next field and, at best, one character from 
	the string field.
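
Point 3 can be put in numbers. The following is only a sketch, assuming the "[1]" spelling of the member so that it compiles everywhere; the helper names bytes_copied and bytes_needed are hypothetical and do not come from the posts.

```c
#include <stddef.h>	/* offsetof, size_t */
#include <string.h>	/* strlen */

struct node {
	struct node *next;
	char string[1];		/* declared length only; real data runs past it */
};

/* Bytes moved by a struct assignment or a by-value argument. */
size_t bytes_copied(void)
{
	return sizeof(struct node);
}

/* Bytes the node really occupies once data has been stored in it. */
size_t bytes_needed(const char *data)
{
	return offsetof(struct node, string) + strlen(data) + 1;
}
```

Whenever bytes_needed(data) exceeds bytes_copied(), a struct copy leaves the tail of the string behind, which is why such nodes are normally handled through pointers only.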

An alternative method of storing arbitrary length strings is presented in K&R,
and in our book.

Larry Miller				lmiller@venera.isi.edu (no uucp)
USC/ISI					213-822-1511
4676 Admiralty Way
Marina del Rey, CA. 90292

dwho@nmtsun.nmt.edu (David Olix) (07/08/89)

>regarding variable-length strings...
>
>struct node {
>	struct node* next;
>	char	string[];
>} *nodeptr;
>
>If you want to shut your compiler up, try making that "[1]".

I guess I am unclear on something here.  If you define string as
'char string[1];', that only gives you one char's worth of space in string.
If that's the case why don't you define string as 'char string;'?

savela@tel2.tel.vtt.fi (Markku Savela) (07/09/89)

In article <8870@venera.isi.edu>, lmiller@venera.isi.edu (Larry Miller) writes:
> In article <7360@c3pe.UUCP> charles@c3pe.UUCP (Charles Green) writes:
>>struct node {
>>	struct node* next;
>>	char	string[];
>>} *nodeptr;
>
> 		nodeptr = malloc(strlen(data)+1 + sizeof(struct node *));

   This is a prime example of where the ANSI "offsetof" macro comes in handy.
I write the above as

	(struct node *)malloc(offsetof(struct node,string)+strlen(data)+1);

   About the original question, I guess we have to resign ourselves to using
either "string[1]" or "string[HUGE_NUMBER]" to satisfy picky compilers.
(Isn't this one of the recurring questions of comp.lang.c?)
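
A minimal sketch of the offsetof sizing with the error check added, assuming the "string[1]" spelling and a compiler that tolerates indexing past the declared bound (the very point this thread debates). The helper name new_node is hypothetical.

```c
#include <stddef.h>	/* offsetof */
#include <stdlib.h>	/* malloc */
#include <string.h>	/* strlen, strcpy */

struct node {
	struct node *next;
	char string[1];		/* placeholder length; must stay the last member */
};

/* Hypothetical helper: allocate a node holding a copy of data.
 * Returns NULL if malloc fails. */
struct node *new_node(const char *data)
{
	struct node *n;

	n = (struct node *)malloc(offsetof(struct node, string) + strlen(data) + 1);
	if (n == NULL)
		return NULL;
	n->next = NULL;
	strcpy(n->string, data);
	return n;
}
```

Unlike sizeof(struct node *), offsetof would also count any padding a compiler placed in front of the string member.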
--
  Markku Savela

henry@utzoo.uucp (Henry Spencer) (07/09/89)

In article <2831@nmtsun.nmt.edu> dwho@nmtsun.nmt.edu (David Olix) writes:
>>	char	string[];
>>} *nodeptr;
>>
>>If you want to shut your compiler up, try making that "[1]".
>
>I guess I am unclear on something here.  If you define string as
>'char string[1];', that only gives you one char's worth of space in string.
>If that's the case why don't you define string as 'char string;'?

Because then nodeptr->string gives you a value of type char, not char *.
It's a convenience issue.
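
A small sketch of that difference, with two hypothetical struct names (with_array, with_char) invented purely for the comparison:

```c
struct with_array {
	struct with_array *next;
	char string[1];
};

struct with_char {
	struct with_char *next;
	char string;
};

/* The array member converts to a pointer to its first element by itself. */
char *text_of_array(struct with_array *p)
{
	return p->string;
}

/* The plain char member is a value, so its address must be taken explicitly. */
char *text_of_char(struct with_char *p)
{
	return &p->string;
}
```

Both functions return the address of the same kind of byte; only the array form lets nodeptr->string be handed straight to strcpy().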
-- 
$10 million equals 18 PM       |     Henry Spencer at U of Toronto Zoology
(Pentagon-Minutes). -Tom Neff  | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

davidg@eleazar.dartmouth.edu (David Gelhar) (07/10/89)

>In article <7360@c3pe.UUCP> charles@c3pe.UUCP (Charles Green) writes:
>>I have an application where I'm building and manipulating a stack of
>>variable-length strings.  [...]
>>
>>struct node {
>>	struct node* next;
>>	char	string[];
>>} *nodeptr;
>>

In article <8870@venera.isi.edu> lmiller@venera.isi.edu.UUCP (Larry Miller) writes:

>[...]
>
>	2)  This works because of a trick: the order in which fields in the
>	structure are declared.  Simply changing the definition of a struct node
>	to:
>		struct node {
>			char	string[];
>			struct node* next;
>		};
>	causes disaster.
>
>	3)  Because of the indeterminate length of the string field in each
>	node, you can't pass a struct node to/from a function.  All that gets
>	copied over will be the next field and, at best, one character from 
>	the string field.
>[...]

2) Isn't a very strong argument.  K & R specifies that addresses are
assigned in increasing order from left to right; as long as you
DON'T change the definition order, disaster is avoided.  The danger
isn't that the compiler will do the wrong thing, rather that some
later (and lesser :-)) programmer will not understand the
significance of the ordering of the structure members.  Situations
like this are an excellent opportunity to try out a seldom-used
feature of C -- the "comment".

The obvious answer to 3) is to simply pass the address of the
structure instead of the structure itself.  Admittedly, this could
be a limitation, but copying structures for a function call is
expensive, so maybe you didn't want to do that anyway. :-)

This is certainly a trick, and not one I'd use every day, but it
could have its uses.  For one thing, it requires only one malloc per
element, while more straightforward methods (like putting a char *
in the struct) require two.  Depending on the malloc overhead (in
terms of both internal fragmentation and execution time), it seems
this could be a useful approach.

pc@cs.keele.ac.uk (Phil Cornes) (07/10/89)

From article <7360@c3pe.UUCP>, by charles@c3pe.UUCP (Charles Green):
> I have an application where I'm building and manipulating a stack of
> variable-length strings.  I've set up a linked list of nodes, each one
> declared as follows:
> 
> struct node {
> 	struct node* next;
> 	char	string[];
> } *nodeptr;
> 
> When I know how long the string is I'm pushing onto the stack, I say:
> 
> 	nodeptr = malloc(strlen(data)+5);
> 	strcpy(nodeptr->string, data);
> 
> The only problem I have is with compilation.....

This is not surprising: dynamically sized structures are not supported in C,
and your solution to the problem won't work. Here is a piece of code you might
try instead (when you include error checking):

	nodeptr = (struct node *) malloc (sizeof(struct node)+strlen(data)+1);
	strcpy ((char *)nodeptr+sizeof(struct node),data);
	nodeptr->string = (char *)nodeptr+sizeof(struct node);

This solution will malloc() a block of memory large enough to hold a node
structure plus the space required to hold the data string and a terminating
null (line 1). Then the data bytes are copied into the tail end of that
block of memory (line 2). And finally the string pointer in the node 
structure at the start of the block of memory is set to point to the string
itself (line 3).

Set up in this way the various components can be accessed as follows:

	nodeptr             -  Is a pointer to the entire memory block
	nodeptr->next       -  Will access this node's link (next) pointer
	nodeptr->string     -  Is a pointer to the start of the data array
	nodeptr->string[n]  -  Will access the nth character in the data string

I hope this solves some of your problems.


Phil Cornes          I just called to say .....
-----------*
                     JANET: cdtpc@uk.ac.stafpol.cr83
                     Phone: +44 (0)785 53511 x6058
                     Smail: Staffordshire Polytechnic, Computing Department
                            Blackheath Lane, STAFFORD, ST18 0AD, ENGLAND.

asst-jos@yetti.UUCP (Jonathan) (07/10/89)

I'm no guru, but don't use
	char string[]; 
in your struct. Use
	char *string;

Remember that although by definition, the name of an array is a pointer
to the array, there are certain limitations. If I remember correctly, 
you can't use the name of a declared array as a pointer, namely

	char string[SIZE];

	*string = ....

is invalid. 



Jeffrey Klein

davidsen@sungod.crd.ge.com (William Davidsen) (07/11/89)

In article <7360@c3pe.UUCP> charles@c3pe.UUCP (Charles Green) writes:
| I have an application where I'm building and manipulating a stack of
| variable-length strings.  I've set up a linked list of nodes, each one
| declared as follows:
| 
| struct node {
| 	struct node* next;
| 	char	string[];
| } *nodeptr;
| 
| When I know how long the string is I'm pushing onto the stack, I say:
| 
| 	nodeptr = malloc(strlen(data)+5);

  You could change the string size to one, this would solve the problem
with compiler warnings. If it were my code I would use
"sizeof(nodeptr)+1" rather than 5, but I work on 16/32/64 bit machines
with a lot of my code.

  You could also make the string a char pointer and allocate the string
and node space, then set the string to point to the byte after the node
before copying the data in. This clearly imposes a performance penalty
(albeit tiny) but no longer relies on the string being last.

Here are a few passages from the draft standard (begin quote):

  Section 3.5.2.1 line 17:
	"as discussed in section 3.1.2.5, a structure is a type
consisting of a sequence of named members, whose storage is allocated in
an ordered sequence, and a union is a type consisting of a sequence of
named members, whose storage overlap."

  Section 3.1.2.5 line 21:
	"A *structure type* indicates a sequentially allocated set of
member objects, each of which has an optionally specified name and
possibly a distinct type."

==== end quote ====

  Although I can't imagine anyone implementing this incorrectly,
neither section says anything about "increasing" sequence, just that
the allocation is sequential. Still, how much better it would have been
to say something like:

	"A *structure type* indicates a set of member objects,
allocated in increasing sequential order, each of which has an
optionally specified name and possibly a distinct type."

  I just have to feel that any wording which is less ambiguous is better.
	bill davidsen		(davidsen@crdos1.crd.GE.COM)
  {uunet | philabs}!crdgw1!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

scs@adam.pika.mit.edu (Steve Summit) (07/11/89)

In article <321@yetti.UUCP> asst-jos@yetti.UUCP (Jonathan) writes:
>I'm no guru...

Then perhaps you shouldn't post when you aren't sure of the answer.

>...but don't use
>	char string[]; 
>in your struct. Use
>	char *string;
>Remember that although by definition, the name of an array is a pointer
>to the array,

This definition is generally misleading, and only makes sense if
you understand arrays and pointers so well that the definition is
not needed.

>there are certain limitations. If I remember correctly, 
>you can't use the name of a declared array as a pointer, namely
>	char string[SIZE];
>	*string = ....
>is invalid. 

There are limitations, but this is not one of them.  The principal
differences are that given

	char array[SIZE];
	char *string;

"array" can't be directly assigned to (that is,
	array = newptr;
is invalid; the members of the array can of course be assigned
to), and that sizeof(array) != sizeof(string) (in general).

In article <661@kl-cs.UUCP> pc@cs.keele.ac.uk (Phil Cornes) writes:
>...dynamically sized structures are not supported in C
>and your solution to the problem won't work. Here is a piece of code you might
>try instead (when you include error checking):
>	nodeptr = (struct node *) malloc (sizeof(struct node)+strlen(data)+1);
>	strcpy ((char *)nodeptr+sizeof(struct node),data);
>	nodeptr->string = (char *)nodeptr+sizeof(struct node);

This is unnecessarily baroque, and no more guaranteed to work
than the original attempt at simulating a "dynamically sized
structure."  I usually implement them as follows, and I believe
that the pANS contains sufficiently detailed descriptions of
required structure behavior that this sort of thing will work.
(Structure punning of a similar sort was, as I recall,
fundamental to the usage advocated in Thomas Plum's book Reliable
Data Structures in C, and although the book predated X3J11C, the
techniques should still be valid.)

	#define INITIALALLOC 1

	struct node {
		struct node* next;
		char string[INITIALALLOC];
	} *nodeptr;

	nodeptr = (struct node *)malloc(sizeof(struct node) +
		strlen(data) - INITIALALLOC + 1);	/* + 1 for \0 */
	(void)strcpy(nodeptr->string, data);

The only real difference here is the use of sizeof (which many
others have suggested) and the macro INITIALALLOC which obviates
the need for a 0-sized array while documenting and coordinating
the adjustment required when allocating.
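
Fleshed out into a small stack with the malloc check added, this might look as follows. It is only a sketch: the helper names push and pop are hypothetical, and it assumes, as above, that storing past string[INITIALALLOC - 1] into the extra malloc'ed space is tolerated.

```c
#include <stdlib.h>	/* malloc, free */
#include <string.h>	/* strlen, strcpy */

#define INITIALALLOC 1

struct node {
	struct node *next;
	char string[INITIALALLOC];	/* must stay the last member */
};

/* Hypothetical helper: push a copy of data.
 * Returns 0 on success, -1 if malloc fails. */
int push(struct node **top, const char *data)
{
	struct node *n = (struct node *)malloc(sizeof(struct node) +
		strlen(data) - INITIALALLOC + 1);	/* + 1 for \0 */

	if (n == NULL)
		return -1;
	(void)strcpy(n->string, data);
	n->next = *top;
	*top = n;
	return 0;
}

/* Hypothetical helper: unlink and free the top node.
 * Does nothing on an empty stack. */
void pop(struct node **top)
{
	struct node *n = *top;

	if (n != NULL) {
		*top = n->next;
		free(n);	/* one free() releases link and string together */
	}
}
```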

                                            Steve Summit
                                            scs@adam.pika.mit.edu

walter@hpclwjm.HP.COM (Walter Murray) (07/12/89)

William Davidsen writes:

  [Quotes from dpANS about structure members being allocated "sequentially"]

>   Although I can't imagine anyone implementing this incorrectly,
> neither section says anything about "increasing" sequence, just that
> the allocation is sequential.

But the dpANS does provide such a guarantee, in 3.5.2.1:

   "Within a structure object, the non-bit-field members and the
   units in which bit-fields reside have addresses that increase
   in the order in which they are declared."

Also, 3.3.8 requires that a relational test involving pointers 
to members of the same structure will work in the expected way,
with the pointer to the member declared later testing higher.
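
Both guarantees can be checked in a few lines. This is a sketch only; the function name string_follows_next is hypothetical, and the struct is the "[1]" variant discussed earlier in the thread.

```c
#include <stddef.h>	/* offsetof */

struct node {
	struct node *next;
	char string[1];
};

/* Returns 1 when string sits at a higher address than next,
 * judged both by offsetof and by a pointer comparison within one object. */
int string_follows_next(void)
{
	struct node n;

	return offsetof(struct node, string) > offsetof(struct node, next)
	    && (char *)&n.string[0] > (char *)&n.next;
}
```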

Walter Murray
-------------

roelof@idca.tds.PHILIPS.nl (R. Vuurboom) (07/12/89)

|In article <1148@crdgw1.crd.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
|
|Here are a few passages from the draft standard (begin quote):
|
|  Although I can't imagine anyone implementing this incorrectly,
|neither section says anything about "increasing" sequence, just that
|the allocation is sequential. Still, how much better it would have been
|to say something like:
|
|	"A *structure type* indicates a set of member objects,
|allocated in increasing sequential order, each of which has an
|optionally specified name and possibly a distinct type."
|

Well if we _really_ want to be nit-picky: I don't think I care whether
the objects are allocated sequentially or all at the same time :-).

You might need something like this:

A *structure type* indicates an ordered set of member objects. The  
corresponding allocated areas (in memory) follow the same ordering. Each
member object has an optionally...bla bla

-- 
Roelof Vuurboom  SSP/V3   Philips TDS Apeldoorn, The Netherlands   +31 55 432226
domain: roelof@idca.tds.philips.nl             uucp:  ...!mcvax!philapd!roelof

chris@mimsy.UUCP (Chris Torek) (07/12/89)

In article <321@yetti.UUCP> asst-jos@yetti.UUCP (Jonathan) writes:
>Remember that although by definition, the name of an array is a pointer
>to the array, there are certain limitations.

The name of an array is *not* a pointer to the array.  Forget `array
equals pointer'; it is false.  In an rvalue context, an object of type
`array N of T' is converted to one of type `pointer to T'; its value
is the address of the 0'th (first) element of the array (&array[0]).

>... namely
>
>	char string[SIZE];
>
>	*string = ....
>
>is invalid. 

Apply the rule: `In an rvalue context'---do we have an rvalue context?
The context here in which the object of type `array SIZE of char'
appears is `*string'.  Looking up the description for `*' in any good C
book, we find that it means either multiply (infix) or indirect
(prefix).  Here it is being used as the latter.  The target of the `*'
(here `string') is indeed in an rvalue context.  Thus, `string' is
converted from an object of type `array SIZE of char' to one of type
`pointer to char'.  Its value is the address of `string[0]'
(&string[0]).  The details of the indirection operator are that it
converts a pointer into the object to which that pointer points (here
string[0]), and yields an lvalue (so that we can assign into
string[0]).  Thus,

	char string[SIZE];

	*string = <any integer expression>;

is legal and means `write the value produced by the integer expression
into string[0]'.

Arrays are not pointers; pointers are not arrays.  There is a symmetry
between the two, but they are different.  The real relationship is that
array objects are sometimes converted into pointers, and that pointers
that point to one member of an array may be used to reach any member of
that array.  If you want an analogy, they are like matter and energy,
interconvertible (E = mc^2) but not identical (you would eat an apple,
but you would not eat 7600000 gigajoules of energy [~3 oz]).
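
As a sketch of the `*string' case, assuming an arbitrary SIZE of 16 and a hypothetical function name first_after_store:

```c
#define SIZE 16

/* Hypothetical helper: store ch through *string, then read it back
 * through string[0] to show that both name the same element. */
int first_after_store(int ch)
{
	char string[SIZE];

	*string = ch;		/* string converts to &string[0]; *string is an lvalue */
	return string[0];
}
```

A compiler that accepts this function is accepting exactly the assignment claimed to be invalid.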
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

ggg@sunquest.UUCP (Guy Greenwald) (07/12/89)

In article <12574@bloom-beacon.MIT.EDU>, scs@adam.pika.mit.edu (Steve Summit) writes (in part):
> 
> 	#define INITIALALLOC 1
> 
> 	struct node {
> 		struct node* next;
> 		char string[INITIALALLOC];
> 	} *nodeptr;
> 
> 	nodeptr = (struct node *)malloc(sizeof(struct node) +
> 		strlen(data) - INITIALALLOC + 1);	/* + 1 for \0 */
> 	(void)strcpy(nodeptr->string, data);
> 
> The only real difference here is the use of sizeof (which many
> others have suggested) and the macro INITIALALLOC which obviates
> the need for a 0-sized array while documenting and coordinating
> the adjustment required when allocating.


The continuing discussion on the problem of dynamically allocating memory
for a structure of the nature:

	struct node {
		struct node *next;
		char *string;		/* Instead of char string[], which
					 * started the whole fuss */
		...
	} *nodeptr;

seems to have ignored the possibility of two malloc() calls, one for the
node, another for the string.

	nodeptr = (struct node *) malloc(sizeof(*nodeptr));
	/* After the length of char data[] is known: */
	nodeptr->string = (char *) malloc(strlen(data) + 1);
	(void) strcpy(nodeptr->string, data);

Perhaps this isn't as much fun as fooling the compiler with
char string[THINGAMAJIG], but it is straightforward, easy to understand
and (I think) more flexible.
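
With the error checks filled in, the two-malloc version might be sketched like this. The helper names make_node and free_node are hypothetical, and the `...' members of the declaration above are left out.

```c
#include <stdlib.h>	/* malloc, free */
#include <string.h>	/* strlen, strcpy */

struct node {
	struct node *next;
	char *string;		/* separately allocated copy of the data */
};

/* Hypothetical helper: build a node with two malloc() calls.
 * Returns NULL if either call fails. */
struct node *make_node(const char *data)
{
	struct node *n = (struct node *)malloc(sizeof(*n));

	if (n == NULL)
		return NULL;
	n->string = (char *)malloc(strlen(data) + 1);
	if (n->string == NULL) {
		free(n);	/* do not leak the node when the second call fails */
		return NULL;
	}
	(void)strcpy(n->string, data);
	n->next = NULL;
	return n;
}

/* Hypothetical helper: both blocks must be released, string first. */
void free_node(struct node *n)
{
	if (n != NULL) {
		free(n->string);
		free(n);
	}
}
```

The price of the extra flexibility is a second failure path on allocation and a second free() on release.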

Slings and arrows, anyone?

--G. Guy Greenwald II

dwho@nmtsun.nmt.edu (David Olix) (07/13/89)

In article <18492@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>Thus,
>
>	char string[SIZE];
>
>	*string = <any integer expression>;
>
>is legal ...

I am not one to be picky (usually), but isn't *string (a.k.a. string[0])
of type char?  If so, any expression yielding a result of type char
would be legal.  Yes, an integer expression would also be legal, but
depending on the result of the expression and the sizes of char and int on
the particular machine you could end up losing bits.

--David Olix (dwho@nmtsun.nmt.edu)

chris@mimsy.UUCP (Chris Torek) (07/13/89)

(I should change the title, but cannot think of a better one.)

>In article <18492@mimsy.UUCP> I wrote:
>>	char string[SIZE]; ... *string = <any integer expression>;
>>is legal ...

In article <2948@nmtsun.nmt.edu> dwho@nmtsun.nmt.edu (David Olix) writes:
>I am not one to be picky (usually), but isn't *string (a.k.a. string[0])
>of type char?  If so, any expression yielding a result of type char
>would be legal.

It would, except that no expression yields a result of type `char'
by the time one gets around to doing assignment, because the value
promotes to one of type `int'.  (The exact duration of the `char'
type really depends on your compiler.  The longer the compiler
retains the type, the more likely it is to generate decent code.
But in principle, `char c, d; c = d;' means `fetch d, widen to int,
narrow to char, store in c'.)

>Yes, an integer expression would also be legal, but depending on the
>result of the expression and the sizes of char and int on the particular
>machine you could end up losing bits.

As long as we are being picky: `integer expression' includes values
of type `long'.  At any rate, the situation is possibly worse: code
such as

	char c; c = 12345;

is allowed (by the pANS) to produce weird results such as having the
display leap off your desk, run in circles, then turn into a butterfly.
The value of that integer-expression must fit in an object of type
`char', or the result of the assignment is {un or implementation}-
defined.  (I cannot recall which offhand.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

scs@adam.pika.mit.edu (Steve Summit) (07/13/89)

In article <661@kl-cs.UUCP> pc@cs.keele.ac.uk (Phil Cornes) writes:
>...dynamically sized structures are not supported in C
>and your solution to the problem won't work. Here is a piece of code you might
>try instead (when you include error checking):
>	nodeptr = (struct node *) malloc (sizeof(struct node)+strlen(data)+1);
>	strcpy ((char *)nodeptr+sizeof(struct node),data);
>	nodeptr->string = (char *)nodeptr+sizeof(struct node);

In article <12574@bloom-beacon.MIT.EDU> I wrote:
>This is unnecessarily baroque, and no more guaranteed to work
>than the original attempt at simulating a "dynamically sized
>structure."

I was hasty in my judgement.  In the absence of a new definition
of struct node, I assumed that Phil was overlaying the string
field in some tricky way.  In fact, given

	struct node {
		struct node* next;
		char *string;
	} *nodeptr;

the space allocated for the contents of the string has nothing
to do with the structure, and the effect (the resultant level of
indirection) is almost as if two separate mallocs had been done,
except of course that only one call is required.  This is a fine
technique, and I should not have criticized it.

(If I had thought about it, I would have realized that Phil's
last line implied that his string field was declared differently
than the original char string[some_indeterminate_size], because
as we all know a char string[] could not have been assigned to.)

It might (and I mean might; I'm not sure) be slightly clearer to
rearrange it as

	nodeptr = (struct node *)malloc(sizeof(struct node)+strlen(data)+1);
	nodeptr->string = (char *)nodeptr+sizeof(struct node);
	strcpy(nodeptr->string, data);

but this is not a real complaint.

                                            Steve Summit
                                            scs@adam.pika.mit.edu

bobmon@iuvax.cs.indiana.edu (RAMontante) (07/13/89)

-In article <661@kl-cs.UUCP> pc@cs.keele.ac.uk (Phil Cornes) writes:
->...dynamically sized structures are not supported in C
->and your solution to the problem won't work. Here is a piece of code you might
->try instead (when you include error checking):
->	nodeptr = (struct node *) malloc (sizeof(struct node)+strlen(data)+1);
->	strcpy ((char *)nodeptr+sizeof(struct node),data);
->	nodeptr->string = (char *)nodeptr+sizeof(struct node);

scs@adam.pika.mit.edu (Steve Summit) <12642@bloom-beacon.MIT.EDU> :
-It might (and I mean might; I'm not sure) be slightly clearer to
-rearrange it as
-
-	nodeptr = (struct node *)malloc(sizeof(struct node)+strlen(data)+1);
-	nodeptr->string = (char *)nodeptr+sizeof(struct node);
-	strcpy(nodeptr->string, data);


Don't these result in a chunk of memory that looks like:

        .-------------v---------------v--------------- - - - --.
        | ptr to next | ptr to string | "I AM A STRING . . . " |
        `-------------^---------------^--------------- - - - --'

where the second field (ptr to string) just points to the third?
I would think the desired memory chunk would look like:

        .-------------v--------------- - - - --.
        | ptr to next | "I AM A STRING . . . " |
        `-------------^--------------- - - - --'

In the first case, I access the string with "*(nodeptr->string)".  In
the second case I just use "nodeptr->string".

Go ahead and flame me.  I learn more from my failures than from my
successes...

chris@mimsy.UUCP (Chris Torek) (07/14/89)

In article <23282@iuvax.cs.indiana.edu> bobmon@iuvax.cs.indiana.edu
(RAMontante) writes:
[in re changing

	struct node { struct node *next; char string[?]; };

for some `?' to

	struct node { struct node *next; char *string; };

]
>Don't these result in a chunk of memory that looks like:
>
>        .-------------v---------------v--------------- - - - --.
>        | ptr to next | ptr to string | "I AM A STRING . . . " |
>        `-------------^---------------^--------------- - - - --'
>
>where the second field (ptr to string) just points to the third?
>I would think the desired memory chunk would look like:
>
>        .-------------v--------------- - - - --.
>        | ptr to next | "I AM A STRING . . . " |
>        `-------------^--------------- - - - --'

This is largely correct.  What is somewhat misleading is the phrase
`second field (ptr to string) ... points to the third.'  The second
field is a pointer to char, not a pointer to string.  When it points to
a character that is the first of an array of characters, where that
array is formed of a series of values other than '\0' whose end is
marked by one (or more) '\0' values, people generally say that the
pointer `points to a string', but it really points to one character.

>In the [char *string] case, I access the string with "*(nodeptr->string)".

No: *nodeptr->string (the parentheses are unnecessary) gets you the
character to which `string' points: one object of type char.  Without
the `*' it gets you an object of type `pointer to char', which in
this example points to the first character of a C-style string.

>[with char string[SOMESIZE]] I just use "nodeptr->string".

This names an object of type `array SOMESIZE of char'.  In rvalue
contexts, such as

	printf("%s\n", nodeptr->string);

the array-object converts to an object of type `pointer to char',
which points to the first character of that array---in this example,
the first character of a C-style string.

Given either declaration, one uses the name `nodeptr->string' in the
same way in rvalue contexts.  The difference between the two declarations
appears only in lvalue contexts (including `sizeof') and in the actual
memory layout (as you illustrated above).
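
A sketch of both halves of that statement, assuming SOMESIZE is 32 and using hypothetical names (anode, pnode, alen, plen, asize, psize) that appear nowhere in the earlier posts:

```c
#include <stddef.h>	/* size_t */
#include <string.h>	/* strlen */

#define SOMESIZE 32

struct anode { struct anode *next; char string[SOMESIZE]; };
struct pnode { struct pnode *next; char *string; };

/* In an rvalue context both members yield a pointer to char,
 * so strlen() is called the same way for either declaration. */
size_t alen(struct anode *p) { return strlen(p->string); }
size_t plen(struct pnode *p) { return strlen(p->string); }

/* Under sizeof the array is not converted, so the results differ in general.
 * The operand of sizeof is not evaluated, so no pointer is dereferenced here. */
size_t asize(void) { return sizeof(((struct anode *)0)->string); }
size_t psize(void) { return sizeof(((struct pnode *)0)->string); }
```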
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

pc@cs.keele.ac.uk (Phil Cornes) (07/14/89)

From article <23282@iuvax.cs.indiana.edu>, by bobmon@iuvax.cs.indiana.edu (RAMontante):
> Go ahead and flame me.  I learn more from my failures than from my
> successes...

With a structure declaration:

	struct node1 {
		struct node1 *next1;
		char string1[1];
	} *nodeptr1;

and a block of code:

	nodeptr1=(struct node1 *)malloc(sizeof(struct node1)+strlen(data));
	(void)strcpy(nodeptr1->string1,data);

you end up with a memory layout as you suggest:
 
         .-------------v--------------- - - - --.
         | ptr to next | "I AM A STRING . . . " |
         `-------------^--------------- - - - --'
 
In this case the expression:

	nodeptr1->string1

evaluates to a pointer constant to the first (and only declared) element
of the string1 array. Accessing a single character in the stored data string
(say the 'M') can be done as follows:

	nodeptr1->string1[3]

On the other hand, with a structure declared as:

	struct node2 {
		struct node2 *next2;
		char *string2;
	} *nodeptr2;

and a block of code like:

	nodeptr2=(struct node2 *)malloc(sizeof(struct node2)+strlen(data)+1);
	nodeptr2->string2 = (char *)nodeptr2+sizeof(struct node2);
	(void)strcpy(nodeptr2->string2,data);

you end up with the memory laid out again as you suggest:
 
         .-------------v---------------v--------------- - - - --.
         | ptr to next | ptr to string | "I AM A STRING . . . " |
         `-------------^---------------^--------------- - - - --'
 
These two may look very different but you will find that they don't behave
so in programs.  Once again, the expression:

	nodeptr2->string2

is a pointer to the first element of the string2 array. Accessing a single
character in the stored data string (say the 'M') can be done as follows:

	nodeptr2->string2[3]

As you can see this is the same as the first example. My own preference in
this case is to use the second of these solutions for two reasons:

1. The first solution relies upon a lot more knowledge of the way that C
   operates internally, in order to be confident that it will work.

2. The first solution also suffers from the problem that it relies upon
   accessing the string1 array outside its declared subscript range,
   which must be classed as a 'tacky' practice at best.

So, no flames... you pays your money and you takes your choice!!!


Phil Cornes      I just called to say .....
-----------*
                 JANET: cdtpc@uk.ac.stafpol.cr83
                 Phone: +44 (0)785 53511 x6058
                 Smail: Staffordshire Polytechnic, Computing Department
                        Blackheath Lane, STAFFORD, ST18 0AD, ENGLAND.

ellis@fozzy.UUCP (Randy Ellis) (07/17/89)

In article <7360@c3pe.UUCP>, charles@c3pe.UUCP (Charles Green) writes:
> I have an application where I'm building and manipulating a stack of
> variable-length strings.  I've set up a linked list of nodes, each one
> declared as follows:
> When I know how long the string is I'm pushing onto the stack, I say:
> 	nodeptr = malloc(strlen(data)+5);
> to cover the struct node* and terminating NUL, and then simply
> 	strcpy(nodeptr->string, data);
> The only problem I have is with compilation:  I get a warning about the
> zero-length element "string".  I'd like to find out the "correct" way to
> do this.  I'll be glad to summarize any Emailed responses, of course.

Does this fit your needs?  Change string into a char pointer, then allocate
dynamic memory for the string and save the pointer into nodeptr->string.
	struct node {
		struct node* next;
		char	*string;
	} *nodeptr;
	nodeptr = malloc(sizeof(struct node));
	nodeptr->string = malloc(strlen(data)+1);
	strcpy(nodeptr->string,data);
Good Luck!

ari@eleazar.dartmouth.edu (Ari Halberstadt) (07/18/89)

Ari Halberstadt '91, "Long live short signatures"

svirsky@ttidca.TTI.COM (Bill Svirsky) (07/20/89)

In article <669@kl-cs.UUCP> pc@cs.keele.ac.uk (Phil Cornes) writes:
+With a structure declaration:
+
+	struct node1 {
+		struct node1 *next1;
+		char string1[1];
+	} *nodeptr1;
+
+and a block of code:
+
+	nodeptr1=(struct node1 *)malloc(sizeof(struct node1)+strlen(data));
+	(void)strcpy(nodeptr1->string1,data);

[stuff deleted]

+On the other hand, with a structure declared as:
+
+	struct node2 {
+		struct node2 *next2;
+		char *string2;
+	} *nodeptr2;
+
+and a block of code like:
+
+	nodeptr2=(struct node2 *)malloc(sizeof(struct node2)+strlen(data)+1);
+	nodeptr2->string2 = (char *)nodeptr2+sizeof(struct node2);
+	(void)strcpy(nodeptr2->string2,data);

[more stuff deleted]

+                                              My own preference in
+this case is to use the second of these solutions for two reasons:
+
+1. The first solution relies upon a lot more knowledge of the way that C
+   operates internally, in order to be confident that it will work.
+
+2. The first solution also suffers from the problem that it relies upon
+   accessing the string1 array outside its declared subscript range,
+   which must be classed as a 'tacky' practice at best.

One more advantage to the 2nd solution is that it is easily expandable
and very flexible.  For instance, given:

	struct node {
		struct node *next;
		char *string1;
		char *string2;
	} *nodeptr;

use a block of code like:

	nodeptr=(struct node *)malloc(sizeof(struct node)+
			strlen(data1)+strlen(data2)+2);

	nodeptr->string1 = (char *)nodeptr+sizeof(struct node);
	nodeptr->string2 = nodeptr->string1+strlen(data1)+1;

	(void)strcpy(nodeptr->string1,data1);
	(void)strcpy(nodeptr->string2,data2);
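
Wrapped in a function with the malloc check added, the two-string layout can be sketched as below. The helper name make_pair is hypothetical.

```c
#include <stdlib.h>	/* malloc */
#include <string.h>	/* strlen, strcpy */

struct node {
	struct node *next;
	char *string1;
	char *string2;
};

/* Hypothetical helper: one malloc() holds the node and both strings.
 * Returns NULL if malloc fails. */
struct node *make_pair(const char *data1, const char *data2)
{
	struct node *n = (struct node *)malloc(sizeof(struct node) +
			strlen(data1) + strlen(data2) + 2);	/* + 2 for two \0 */

	if (n == NULL)
		return NULL;
	n->next = NULL;
	n->string1 = (char *)n + sizeof(struct node);	/* just past the node */
	n->string2 = n->string1 + strlen(data1) + 1;	/* just past string1's \0 */
	(void)strcpy(n->string1, data1);
	(void)strcpy(n->string2, data2);
	return n;
}
```

A single free() of the node releases both strings as well. The stored pointers refer into the node's own block, so they go stale if the block is ever moved or copied elsewhere.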

-- 
Bill Svirsky, Citicorp+TTI, 3100 Ocean Park Blvd., Santa Monica, CA 90405
Work phone: 213-450-9111 x2597
svirsky@ttidca.tti.com | ...!{csun,psivax,rdlvax,retix}!ttidca!svirsky

ecd@ncs-med.UUCP (Elwood C. Downey) (07/20/89)

You might make an array member of length 1 at the end of a struct, then malloc
enough room for all that you really need and index into it via the array.
for example:

struct s {
    ...			/* anything ... */
    char array[1];	/* must be last member */
};

f(n)
int n;
{
	struct s *sp;

	sp = malloc (sizeof(struct s) + n - 1); /* you get 1 already */
	/* now you can use sp->array[0..n-1] */
}

smryan@garth.UUCP (s m ryan) (07/30/89)

>	nodeptr->string = (char *)nodeptr+sizeof(struct node);

or	nodeptr->string=(char*)(nodeptr+1)
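
A quick sketch confirming that the two spellings name the same address, since nodeptr+1 advances by exactly sizeof(struct node) bytes. The function name same_tail_address and its extra parameter are hypothetical.

```c
#include <stdlib.h>	/* malloc, free, size_t */

struct node {
	struct node *next;
	char *string;
};

/* Returns 1 if both expressions give the same address for a freshly
 * allocated node with extra bytes behind it, or -1 if malloc fails. */
int same_tail_address(size_t extra)
{
	struct node *nodeptr = (struct node *)malloc(sizeof(struct node) + extra);
	int same;

	if (nodeptr == NULL)
		return -1;
	same = ((char *)nodeptr + sizeof(struct node)) == (char *)(nodeptr + 1);
	free(nodeptr);
	return same;
}
```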
-- 
23. Till life was gone he kept his hoard          Steven Ryan: ingr!garth!smryan
when Fanir hewed with biting sword                 2400 Geng Road, Palo Alto, CA
as father rested. Hreithmar yelled            `Do you mean Nancy?' `I'm not mean
for daughters twain ere life was quelled.       to Nancy--she like it that way.'