[comp.lang.c] Allocation for pointers

gtchen@faline.bellcore.com (George T. Chen) (09/04/87)

I have a question about the following program:

main()
{	char *a;
	a = "text";
	recurs(a);
}

void recurs(a);
char *a;
{	strcat(a,"more text");
	if (some conditional test) recurs(a);
}

As I understand it, I'm passing the address of the variable a into
the routine recurs.  At no point do I specify how large of a buffer
a will eventually point to.  Is there a limit and is it compiler
dependent?


-- 
+-----------------------------------------------------------------------------+
|What's a .signature? Life is an equation whose only solutions are irrational |
|gtchen@thumper.bellcore.com ! gtchen@romeo.caltech.edu ! gtchen@where.am.i?. |
+-----------------------------------------------------------------------------+

chris@mimsy.UUCP (Chris Torek) (09/04/87)

In article <1357@faline.bellcore.com> gtchen@faline.bellcore.com
(George T. Chen) writes:
>main()
>{	char *a;
>	a = "text";
>	recurs(a);
>}
>
>void recurs(a);
>char *a;
>{	strcat(a,"more text");
>	if (some conditional test) recurs(a);
>}

This will not work (aside from the syntax error and the type clash
at the definition of `recurs').

>As I understand it, I'm passing the address of the variable a into
>the routine recurs.

No, to do that, you must say `recurs(&a)'.  You are passing the
value of a to the routine recurs; you have claimed that this
value points to zero or more characters (`char *a'), which it
does indeed.  We just covered this one:  The value passed from
main is the address of the first element of an anonymous array
of five characters which is initialised to {'t', 'e', 'x', 't', '\0'}.
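
The difference between passing the value of `a` and passing its address can be sketched as follows. This is only an illustration: the function names `literal_size`, `takes_value` and `takes_address` are made up for the example and do not appear in the original program.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of the anonymous array behind the literal "text":
 * four letters plus the terminating '\0'. */
size_t literal_size(void)
{
	return sizeof "text";
}

/* Receives a copy of the caller's pointer value.  Changing the
 * parameter here has no effect on the caller's variable. */
void takes_value(const char *p)
{
	p = "other";
	(void) p;
}

/* Receives the address of the caller's pointer variable, so it can
 * make that variable point somewhere else. */
void takes_address(const char **pp)
{
	assert(pp != NULL);
	*pp = "other";
}
```

With `const char *a = "text";`, a call `takes_value(a)` leaves `a` pointing at the same five-element array, while `takes_address(&a)` makes it point at a different one. Neither call makes the array behind "text" any larger.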

>At no point do I specify how large of a buffer a will eventually
>point to.

At no point do you specify a buffer!  (For one thing, there is
no `buffer' construct *per se* in C, although `array N of char'
where N is `large' is conventionally called a buffer.)  You made
`a' point to this anonymous array of five characters, then told
strcat to write over a[4] with 'm', a[5] with 'o', a[6] with 'r',
and so forth.

>Is there a limit

Certainly:  The array to which `a' points contains only five
slots.

>and is it compiler dependent?

The result of writing nonexistent array elements is certainly
compiler (and runtime system) dependent!

You can, instead, pick your own limit:

	void	recurs();

	main(...)
	{
		char a[LIMIT];

		(void) strcpy(a, "text");
		recurs(a);
		...
	}

	void recurs(a)
		char *a;
	{

		(void) strcat(a, "more text");
		if (...)
			recurs(a);
	}

or (to save a tiny bit of space and time)

	main(...) {
		static char a[LIMIT] = "text";
		...

or you can come up with fancy dynamic allocation routines:

	struct string {
		char	*s_addr;	/* address of text */
		int	s_len;		/* current length */
		int	s_space;	/* and space remaining */
	};

	struct string *string_append();

	#define	APPEND(result, app) \
	((result)->s_space < (app)->s_len ? string_append(result, app) : \
		(bcopy((app)->s_addr, (result)->s_addr + (result)->s_len, \
				(app)->s_len), \
			(result)->s_len += (app)->s_len, \
			(result)->s_space -= (app)->s_len, \
			result))

or whatnot.
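
The post only declares `string_append`, so the body below is one possible sketch of it, not the author's code. It assumes the text is stored without a trailing '\0', that `s_addr` is either NULL or came from malloc/realloc, and that `app` is a different object from `result`. It uses `size_t` instead of `int` for the counts and standard `memcpy` (destination first) instead of `bcopy` (source first); the doubling growth policy is likewise an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct string {
	char	*s_addr;	/* address of text (no trailing '\0') */
	size_t	s_len;		/* current length */
	size_t	s_space;	/* and space remaining */
};

/* Append app to result, growing result's storage when it is too
 * small.  Returns result, or NULL if memory could not be obtained,
 * in which case result is left unchanged. */
struct string *string_append(struct string *result, const struct string *app)
{
	assert(result != NULL && app != NULL && app != result);
	if (app->s_len == 0)
		return result;		/* nothing to copy */
	if (result->s_space < app->s_len) {
		/* Ask for twice what is needed so that repeated
		 * appends do not reallocate every time. */
		size_t need = result->s_len + app->s_len;
		size_t cap = need * 2;
		char *p = realloc(result->s_addr, cap);

		if (p == NULL)
			return NULL;
		result->s_addr = p;
		result->s_space = cap - result->s_len;
	}
	/* Copy to the current end of the text, then update the counts. */
	memcpy(result->s_addr + result->s_len, app->s_addr, app->s_len);
	result->s_len += app->s_len;
	result->s_space -= app->s_len;
	return result;
}
```

Starting from an empty `struct string` (`{ NULL, 0, 0 }`), each call either appends in place or moves the text to a larger block first; the caller eventually releases the storage with `free(result->s_addr)`.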
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

guy%gorodish@Sun.COM (Guy Harris) (09/04/87)

> main()
> {	char *a;
> 	a = "text";
> 	recurs(a);
> }
> 
> void recurs(a);
> char *a;
> {	strcat(a,"more text");
> 	if (some conditional test) recurs(a);
> }
> 
> As I understand it, I'm passing the address of the variable a into
> the routine recurs.

No, you are passing the *value* of the variable "a" into the routine "recurs";
it points to a 5-byte string.

> At no point do I specify how large of a buffer a will eventually point to.

No, the value of "a" is not changed after it has the address of the string
"text" assigned to it, so it always points to a 5-byte string.  "recurs" just
passes it along to itself.

In ANSI C, at least, there is no guarantee that this is a writable 5-byte
buffer; the compiler can e.g. place it into a non-writable portion of your
address space, so this code may just drop core.  In *no* implementation that I
know of is there any guarantee that you can append anything such as the "more
text" to the end of it; I suspect that in most implementations that allow you
to write something after the string at all, you will be scribbling on some
random portion of your address space, possibly on top of some other strings.
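
One way to avoid scribbling past the end is to append only into an array whose total size is known, and to refuse any append that would not fit. The sketch below assumes the caller supplies a writable, '\0'-terminated buffer and its capacity; `bounded_append` is a hypothetical helper, not a library routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append src to the string in buf, which has room for cap bytes in
 * total (including the terminating '\0').  Returns 0 on success, or
 * -1 without modifying buf if the result would not fit. */
int bounded_append(char *buf, size_t cap, const char *src)
{
	size_t used, extra;

	assert(buf != NULL && src != NULL);
	used = strlen(buf);
	extra = strlen(src);
	if (used >= cap || extra >= cap - used)
		return -1;	/* no room for the text plus '\0' */
	memcpy(buf + used, src, extra + 1);	/* copies the '\0' too */
	return 0;
}
```

With `char a[16] = "text";`, a first `bounded_append(a, sizeof a, "more text")` fits (13 characters plus the terminator), and a second identical call is rejected instead of overrunning the array.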

> Is there a limit and is it compiler dependent?

Yes, this will blow up very quickly on most compilers, since the "strcat" will
at best be writing on a random portion of your data space and will at worst
attempt to write into a non-writable portion of your data space.

If you want dynamically-extendable strings, you'll have to implement them
yourself.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

jv@mhres.mh.nl (Johan Vromans) (09/05/87)

In article <1357@faline.bellcore.com> gtchen@faline.UUCP (George T. Chen) writes:
>I have a question about the following program:
>
>main() { char *a; a = "text"; recurs(a);}
>
>void recurs(a) 
>char *a; 
>{	strcat(a,"more text");
>	if (some conditional test) recurs(a);
>}
>
>As I understand it, I'm passing the address of the variable a into
>the routine recurs.  At no point do I specify how large of a buffer
>a will eventually point to.  Is there a limit and is it compiler
>dependent?

Your pointer is pointing in your program space. The "strcat" is therefore
overwriting your program. Depending on how the compiler/linker have
arranged your program and data, disaster will result pretty soon.
The precise result is implementation dependent. Imagine the original text
being placed on the stack ...

--
Johan Vromans                              | jv@mh.nl via European backbone
Multihouse N.V., Gouda, the Netherlands    | uucp: ..{?????!}mcvax!mh.nl!jv
"It is better to light a candle than to curse the darkness"

john@chinet.UUCP (John Mundt) (09/06/87)

In article <27282@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes:
>> main()
>> {	char *a;
>> 	a = "text";
>> 	recurs(a);
>> }
>> 
>> void recurs(a);
>> char *a;
>> {	strcat(a,"more text");
>> 	if (some conditional test) recurs(a);
>> }
>> 
>> As I understand it, I'm passing the address of the variable a into
>> the routine recurs.

When the program was compiled, "text" was placed into the static area
of the object code along with other static variables.  a was then 
assigned to point to the beginning of the string.

When you use strcat to append more text, you will merrily rub out
whatever static data was kept beyond the 5 bytes assigned to store
"text\0".  Strange and wonderful things will happen when the static
area is overrun and you get out into other sections of the program.

The only safe way to do what you want is to either pre-assign a with
enough memory to hold whatever you'll put into it, as in 

char a[BUFSIZ];

or to get the size of the present a, the size of the piece you want
to append + 1 byte for the \0, and call malloc() to hold the new
string.  You should probably free the old string, provided it came
from malloc() itself (a string constant or an array must not be freed).

char *recurs(a)
char *a;		/* must point to malloc()ed storage */
{
...
int len;
char *ptr, *malloc();
	len = strlen(a);
	len += strlen(string_you_want_to_add);
	ptr = malloc(len + 1);	/* + 1 for the trailing \0 */
	strcpy(ptr, a);
	strcat(ptr, string_you_want_to_add);
	free(a);
	return(ptr);
...
}

Up in main, you have to make a = recurs(a) each time you call it.
You should add error checking to see that malloc returns a valid
pointer too.
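
The allocate, copy and free sequence with the error check added might look like the sketch below. The name `append_copy` is hypothetical, and the code assumes the old string was itself obtained from malloc(); it would be wrong to pass it the string constant "text" directly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly malloc()ed string holding old followed by add, and
 * free old.  old must itself have come from malloc().  On allocation
 * failure, return NULL and leave old untouched so the caller can
 * still use or free it. */
char *append_copy(char *old, const char *add)
{
	size_t oldlen, addlen;
	char *ptr;

	assert(old != NULL && add != NULL);
	oldlen = strlen(old);
	addlen = strlen(add);
	ptr = malloc(oldlen + addlen + 1);	/* + 1 for the '\0' */
	if (ptr == NULL)
		return NULL;
	memcpy(ptr, old, oldlen);
	memcpy(ptr + oldlen, add, addlen + 1);	/* includes the '\0' */
	free(old);
	return ptr;
}
```

Because a failed call returns NULL while the old string is still valid, the caller is better off assigning the result to a temporary first and only then overwriting `a`, rather than writing `a = append_copy(a, ...)` and losing the old pointer on failure.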

John Mundt                    ...ihnp4!chinet!john
Teachers' Aide, Inc.	      ...ihnp4!chinet!teachad!fred