[comp.std.c] char *strcat

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/24/88)

In article <1309@ark.cs.vu.nl> maart@cs.vu.nl (Maarten Litmaath) writes:
>Why do the functions named above return 'char *', instead of 'int', viz. the
>REALLY useful new size of the first argument string, or the number of chars
>moved?

"Historical reasons."

>Will this feature ever be changed?

No.  That would break a large number of existing correctly-written programs.

>One gets() tired of typing '(void) strcpy(buf, str);'.

Oh, I don't know about that.  Here is a real example from source code
I happened to have open in another layer on my terminal:

	(void)strcat(strcat(strcat(strcat(strcat(strcpy(fn,
							TargetDir
						       ),
       						 target
						),
					  Slash
  					 ),
 				   approx
   				  ),
  			    Slash
    			   ),
   		     CCMAP
     		    );

mkhaw@teknowledge-vaxc.ARPA (Mike Khaw) (06/24/88)

From article <1309@ark.cs.vu.nl>, by maart@cs.vu.nl (Maarten Litmaath):
> Why do the functions named above return 'char *', instead of 'int', viz. the
> REALLY useful new size of the first argument string, or the number of chars
> moved?

I suppose it's so that you can do

	strcpy(foo, strcat(bar, fgets(baz, size, stream)));

(Not that I do such things often).

Mike Khaw
-- 
internet: mkhaw@teknowledge.arpa
uucp:	  {uunet|sun|ucbvax|decwrl|uw-beaver}!mkhaw%teknowledge-vaxc.arpa
hardcopy: Teknowledge Inc, 1850 Embarcadero Rd, POB 10119, Palo Alto, CA 94303

jgk@speech2.cs.cmu.edu (Joe Keane) (06/24/88)

To be useful, strcpy and strcat should return the end of the new
string.  The old method is this:
	(void) strcpy (base, "foo");
	(void) strcat (base, "bar");
	(void) strcat (base, "baz");
This can be replaced by:
	end = strcpy (base, "foo");
	end = strcpy (end, "bar");
	(void) strcpy (end, "baz");
or more concisely:
	(void) strcpy (strcpy (strcpy (base, "foo"), "bar"), "baz");

If you're concatenating many strings, remember this is linear while
the old method is quadratic.  Unfortunately, there's no chance of
getting this changed.  I already have a function like this, I just
don't call it strcpy.
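
A minimal sketch of such an end-returning copy, for illustration only.
The name copy_end is hypothetical (it appears nowhere in this thread or
in the C library); it returns a pointer to the terminating '\0' it wrote,
so each later copy starts exactly where the previous one stopped:

```c
#include <assert.h>
#include <stddef.h>

/*
 * copy_end (hypothetical name): copy src into dst, like strcpy(), but
 * return a pointer to the terminating '\0' stored in dst rather than
 * to the start of dst.  dst must be large enough to hold src.
 */
char *copy_end(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    while ((*dst = *src++) != '\0')
        dst++;
    return dst;         /* points at the '\0', ready for the next copy */
}
```

Used as in the example above, each call touches only the characters it
copies:

	end = copy_end(base, "foo");
	end = copy_end(end, "bar");
	(void) copy_end(end, "baz");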

--Joe

jgm@k.gp.cs.cmu.edu (John Myers) (06/24/88)

In article <2029@pt.cs.cmu.edu> jgk@speech2.cs.cmu.edu (Joe Keane) writes:
>To be useful, strcpy and strcat should return the end of the new
>string.

While returning the end of the new string would be more useful than
the current specification, returning the first argument *is* actually
useful.  One of my standard macros goes something like:

#define SAVESTR(s) (strcpy(malloc(strlen(s)+1),(s)))

The semantics of strcpy(), etc. have already been cast in stone--
changing them now would break current correctly-written programs.
X3J11 is not in the business of making gratuitous incompatible changes
to C.  Perhaps gratuitous or incompatible, but not both.

>[...]  I already have a function like this, I just
>don't call it strcpy.

...which is of course the right thing to do.
-- 
_.John G. Myers		Internet: John.Myers@cs.cmu.edu
			LoseNet:  ...!seismo!inhp4!wiscvm.wisc.edu!k!nobody
"The world is full of bozos.  Some of them even have PhD's in Computer Science"

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/24/88)

In article <2030@pt.cs.cmu.edu> jgm@k.gp.cs.cmu.edu (John Myers) writes:
>#define SAVESTR(s) (strcpy(malloc(strlen(s)+1),(s)))

Most of us call this function strdup(), and we test for NULL before
proceeding (since malloc() can fail).
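
A sketch of the checked version, assuming nothing beyond malloc() and
the standard string functions.  The name save_string is hypothetical and
is chosen only to avoid colliding with a library strdup():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * save_string (hypothetical name): return a malloc()ed copy of s, or
 * NULL if the allocation fails.  The caller releases it with free().
 */
char *save_string(const char *s)
{
    size_t n;
    char *p;

    assert(s != NULL);
    n = strlen(s) + 1;          /* room for the trailing '\0' */
    p = malloc(n);
    if (p != NULL)              /* copy only if malloc() succeeded */
        memcpy(p, s, n);
    return p;
}
```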

fyl@ssc.UUCP (Phil Hughes) (06/25/88)

In article <1309@ark.cs.vu.nl>, maart@cs.vu.nl (Maarten Litmaath) writes:
> Why do the functions named above return 'char *', instead of 'int', viz. the
> REALLY useful new size of the first argument string, or the number of chars
> moved?

So you can say
	perror(strcat(message," more message"));

or something like that.

-- 
Phil    uunet!pilchuck!ssc!fyl 

decot@hpisod2.HP.COM (Dave Decot) (06/25/88)

> Oh, I don't know about that.  Here is a real example from source code
> I happened to have open in another layer on my terminal:
> 
> 	(void)strcat(strcat(strcat(strcat(strcat(strcpy(fn,
> 							TargetDir
> 						       ),
>        						 target
> 						),
> 					  Slash
>   					 ),
>  				   approx
>    				  ),
>   			    Slash
>     			   ),
>    		     CCMAP
>      		    );

...and because the number of characters added to the string is not returned,
this statement has to examine the characters in TargetDir (which may be long)
five extra times; those in target four extra times; those in Slash three
extra times; those in approx two extra times; those in Slash one extra time.
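
The arithmetic can be sketched as a small function.  The name rescan_cost
is hypothetical, and the model assumes that each strcat() scans the whole
destination from its start before appending:

```c
#include <assert.h>
#include <stddef.h>

/*
 * rescan_cost (hypothetical name): given the lengths of the pieces in
 * the order they are appended, return how many characters are examined
 * again by later strcat() calls.  Piece i is already in the buffer when
 * each of the (count - 1 - i) later calls scans for the end.
 */
size_t rescan_cost(const size_t *len, size_t count)
{
    size_t total = 0;
    size_t i;

    assert(len != NULL || count == 0);
    for (i = 0; i + 1 < count; i++)
        total += len[i] * (count - 1 - i);
    return total;
}
```

With six pieces, the first length is weighted by five, the second by
four, and so on down to zero for the last piece, matching the count
above.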

This is much more efficiently (and readably!) rendered as:

     (void) sprintf(fn, "%s%s/%s/%s", TargetDir, target, approx, CCMAP);

...assuming that Slash is somebody's misguided idea of saving space for
strings consisting of "/".

While it's clear that we can't change the return value of strcat() because
of applications (such as the one above) that use it, there's nothing to
prevent adding more useful functions:

    char *strecpy(x, y)	  /* strcpy(x, y); returns pointer after end of y */
    char *x, *y;

    char *strecat(x, y)	  /* strcat(x, y); returns pointer after end of y */
    char *x, *y;

    int strlcpy(x, y)	  /* strcpy(x, y); returns strlen(y) */
    char *x, *y;

    int strlcat(x, y)	  /* strcat(x, y); returns strlen(y) */
    char *x, *y;
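
A sketch of one reading of the length-returning variant, under the
assumption that the returned count excludes the trailing '\0'.  The name
copy_len is hypothetical, used here so that the sketch cannot clash with
any library function:

```c
#include <assert.h>
#include <stddef.h>

/*
 * copy_len (hypothetical name): copy src into dst, like strcpy(), and
 * return the number of characters copied, not counting the '\0'.
 * dst must be large enough to hold src.
 */
size_t copy_len(char *dst, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);
    while ((dst[n] = src[n]) != '\0')
        n++;
    return n;
}
```

The caller keeps a running offset, so nothing is scanned twice:

	n  = copy_len(fn, TargetDir);
	n += copy_len(fn + n, target);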

Dave Decot
hpda!decot

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/26/88)

In article <11580010@hpisod2.HP.COM> decot@hpisod2.HP.COM (Dave Decot) writes:
>> ...  Here is a real example from source code
>> I happened to have open in another layer on my terminal:
>> 	(void)strcat(strcat(strcat(strcat(strcat(strcpy(fn, [...]

>This is much more efficiently (and readably!) rendered as:
>     (void) sprintf(fn, "%s%s/%s/%s", TargetDir, target, approx, CCMAP);

What you say is true (indeed, the reason the code was open in a layer was
because I was straightening this out), but sometimes sprintf() is inadvisable
since it causes the general-purpose formatting package to be linked in from
the C library; in the unusual event that the application is not using stdio,
this adds substantially to the size of its executable image.

>... there's nothing to prevent adding more useful functions:
> [strecpy, strecat, strlcpy, strlcat]

Those are indeed useful, but probably the names should not start with "str",
to avoid conflicts with possible revisions of the C standard.

ljz@fxgrp.UUCP (Lloyd Zusman) (06/26/88)

In article <8160@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
  In article <11580010@hpisod2.HP.COM> decot@hpisod2.HP.COM (Dave Decot) writes:
  >> ...  Here is a real example from source code
  >> ...
  >> 	(void)strcat(strcat(strcat(strcat(strcat(strcpy(fn, [...]
  
  >This is much more efficiently (and readably!) rendered as:
  >     (void) sprintf(fn, "%s%s/%s/%s", TargetDir, target, approx, CCMAP);
  
  What you say is true (indeed, the reason the code was open in a layer was
  because I was straightening this out), but sometimes sprintf() is inadvisable
  ...

Here are a couple of routines that can be used in place of the
messy string of strcat's and also in place of sprintf:

1)  char *strset(result, s1, s2, ..., NULL)

    (where all arguments are "char *").  This concatenates 's1', 's2', ...
    together, putting the result plus a trailing '\0' into 'result',
    which must be large enough to hold everything.

    Variations of this routine appear in many places.

2)  This one is rarer:

    char *stralloc(s1, s2, ..., NULL)

    (again, where all arguments are "char *").  This allocates a string
    large enough to hold 's1', 's2', ... plus a trailing '\0', and then
    concatenates everything into this allocated string, whose address is
    returned to the caller.  Allocation is done via malloc(), and the
    return value is NULL if this failed.

I used the 'varargs' convention to implement this, but I believe that
'stdarg' would also work (using 'stdarg' is left as an exercise for
the reader).

Please forgive me if I accidentally introduced a bug or two into these
routines ... I had to copy them by hand into this article due to
problems at my site, and I may have mistyped something or other.

------------------------- routines follow ----------------------------
#include <varargs.h>

extern char *malloc();

#define NULL	((char *)0)

/*
 * strset(result, string1, string2, ..., NULL)
 *
 *	Concatenate string1, string2, ... together into the result,
 *	which must be big enough to hold all of them plus a trailing
 *	'\0'.  Returns the address of the result.
 *
 *	Example:
 *
 *		char result[100];
 *
 *		(void)strset(result, "This", " ", "is a ", "test", NULL);
 *
 *			result now contains "This is a test".
 */
char *
strset(va_alist)
va_dcl
{
    va_list ap;		/* argument list pointer */
    char *result;	/* pointer to result string */
    char *sp;		/* used as a temporary string pointer */
    char *cp;		/* used as a temporary string pointer */

    va_start(ap);
    
    sp = va_arg(ap, char *);	/* get first argument ... do nothing if NULL */
    if (sp == NULL) {
	va_end(ap);	/* pair every va_start() with a va_end() */
	return (sp);
    }

    result = sp;	/* save pointer to beginning of result string */

    /*
     * Loop through all the arguments, concatenating each one to the
     * result string.
     */
    for (cp = va_arg(ap, char *); cp != NULL; cp = va_arg(ap, char*)) {
	while (*cp != '\0') {
	    *sp++ = *cp++;
	}
    }
    *sp = '\0';		/* we need a trailing '\0' */

    va_end(ap);

    return (result);
}

/*
 * stralloc(string1, string2, ..., NULL)
 *
 *	Allocate a string big enough to hold the concatenation of
 *	string1, string2, ... plus a '\0', and then concatenate the
 *	strings into it, returning its address to the caller.
 *
 *	Example:
 *
 *		char *result;
 *
 *		result = stralloc("This", " ", "is a ", "test", NULL);
 *
 *			result now points to "This is a test" unless
 *			malloc() failed, in which case it is NULL.
 *			If the call succeeded, the result can be
 *			freed via "free(result)".
 */
char *
stralloc(va_alist)
va_dcl
{
    va_list ap;		/* argument list pointer */
    char *result;	/* result string pointer */
    char *cp;		/* temporary string pointer */
    char *sp;		/* temporary string pointer */
    int size = 1;	/* string size: leave room for trailing '\0' */

    /*
     * First pass: count characters in list of strings.
     */
    va_start(ap);
    for (cp = va_arg(ap, char *); cp != NULL; cp = va_arg(ap, char*)) {
	while (*cp++ != '\0') {
	    ++size;
	}
    }
    va_end(ap);		/* finish the first traversal before restarting */

    /*
     * Allocate the string.
     */
    result = malloc(size);
    if (result == NULL) {
	return (result);
    }
    sp = result;

    /*
     * Second pass: copy the arguments.
     */
    va_start(ap);
    for (cp = va_arg(ap, char *); cp != NULL; cp = va_arg(ap, char*)) {
	while (*cp) {
	    *sp++ = *cp++;
	}
    }
    *sp = '\0';

    va_end(ap);

    return (result);
}
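
For comparison, a sketch of the first routine written against
<stdarg.h>.  This is an illustration, not the author's code: the name
concat_into is hypothetical, and the result is made a named parameter
because va_start() in <stdarg.h> needs one.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/*
 * concat_into (hypothetical name): concatenate the strings that follow
 * result into result, which must be large enough to hold them plus a
 * trailing '\0'.  The argument list ends with (char *)NULL.
 * Returns result.
 */
char *concat_into(char *result, ...)
{
    va_list ap;
    char *dst = result;
    const char *src;

    assert(result != NULL);
    va_start(ap, result);
    while ((src = va_arg(ap, char *)) != NULL) {
        size_t n = strlen(src);

        memcpy(dst, src, n);    /* append this piece */
        dst += n;
    }
    va_end(ap);
    *dst = '\0';                /* we need a trailing '\0' */
    return result;
}
```

The terminator should be written as (char *)NULL in the call, since a
bare NULL passed through "..." is not guaranteed to have pointer type:

	(void)concat_into(result, "This", " ", "is a ", "test", (char *)NULL);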
--
  Lloyd Zusman                          UUCP:   ...!ames!fxgrp!ljz
  Master Byte Software              Internet:   ljz%fx.com@ames.arc.nasa.gov
  Los Gatos, California               or try:   fxgrp!ljz@ames.arc.nasa.gov
  "We take things well in hand."

ado@elsie.UUCP (Arthur David Olson) (06/27/88)

In article <8160@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
< In article <11580010@hpisod2.HP.COM> decot@hpisod2.HP.COM (Dave Decot) writes:
< >... there's nothing to prevent adding more useful functions:
< > [strecpy, strecat, strlcpy, strlcat]
< 
< Those are indeed useful, but probably the names should not start with "str",
< to avoid conflicts with possible revisions of the C standard.

While in article <8148@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
< In article <2030@pt.cs.cmu.edu> jgm@k.gp.cs.cmu.edu (John Myers) writes:
< >#define SAVESTR(s) (strcpy(malloc(strlen(s)+1),(s)))
< 
< Most of us call this function strdup(). . .

Well. . .strdup is indeed useful, but probably the name should not start with
"str", to avoid conflicts with possible revisions of the C standard.
(Here at elsie we call it "icpyalloc", but then again we're not "most of us.")
-- 
	ado@ncifcrf.gov			ADO is a trademark of Ampex.

karl@haddock.ISC.COM (Karl Heuer) (06/28/88)

In article <11580010@hpisod2.HP.COM> decot@hpisod2.HP.COM (Dave Decot) writes:
>While it's clear that we can't change the return value of strcat() because
>of applications (such as the one above) that use it, there's nothing to
>prevent adding more useful functions:

Since this is comp.std.c, I'll redeclare your list in ANSI.

>    char *strecpy(char *, char *)
>    char *strecat(char *, char *)
>    size_t strlcpy(char *, char *)
>    size_t strlcat(char *, char *)

Actually, given a useful set of end-string functions (including the simplest
one, "char *strend(char *)"), I see no need for any of the "cat" functions.
Even the standard strcat(x,y) is just strcpy(strend(x),y), and if the
application was designed properly it probably already has the value of
strend(x) in hand.
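
A sketch of that equivalence.  The names end_of and append are
hypothetical stand-ins for strend() and strcat(), chosen to stay out of
the reserved "str" namespace:

```c
#include <assert.h>
#include <string.h>

/* end_of (hypothetical name): return a pointer to the '\0' ending s. */
char *end_of(char *s)
{
    assert(s != NULL);
    return s + strlen(s);
}

/*
 * append (hypothetical name): behaves like strcat(dst, src), built from
 * a copy to the end of dst.  dst must have room for the result.
 */
char *append(char *dst, const char *src)
{
    assert(src != NULL);
    (void) strcpy(end_of(dst), src);
    return dst;
}
```

A caller that already holds the end pointer can skip end_of() and call
strcpy() on it directly.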

As Doug has noted, "str*" is reserved to the implementation.  Thus, the vendor
is free to add any of these to <string.h> as an extension (and would still
have a conforming implementation).  I suppose many implementations will put
strdup() there.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

decot@hpisod2.HP.COM (Dave Decot) (06/28/88)

> < Most of us call this function strdup(). . .
> 
> Well. . .strdup is indeed useful, but probably the name should not start with
> "str", to avoid conflicts with possible revisions of the C standard.
> -- 
>     ado@ncifcrf.gov			ADO is a trademark of Ampex.

The reason "most of us" call it strdup() is that it's in the SVID that way.

For that reason, I'd object strongly if ANSI defined it to be something else.

So, go ahead and call it strdup().  :-)

Dave "I call it strsave()" Decot
hpda!decot

smryan@garth.UUCP (Steven Ryan) (06/28/88)

>While it's clear that we can't change the return value of strcat() because
>of applications (such as the one above) that use it, there's nothing to
>prevent adding more useful functions:

With luck your implementation supports libraries--that's what they're for.

anw@nott-cs.UUCP (06/28/88)

In article <2029@pt.cs.cmu.edu> jgk@speech2.cs.cmu.edu (Joe Keane) writes:
>To be useful, strcpy and strcat should return the end of the new
>string.

	If it has to be one or t'other, don't forget that the way C works
it is easier to find the end of a string given its beginning than to find
its beginning given its end.

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK
anw@maths.nott.ac.uk

jagardner@watmath.waterloo.edu (Jim Gardner) (06/29/88)

In article <4773@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>In article <11580010@hpisod2.HP.COM> decot@hpisod2.HP.COM (Dave Decot) writes:
>
>As Doug has noted, "str*" is reserved to the implementation.  Thus, the vendor
>is free to add any of these to <string.h> as an extension (and would still
>have a conforming implementation).  I suppose many implementations will put
>strdup() there.

Section 4.1.2 "Each header declares and defines only those identifiers listed
in its associated section". Sounds like you can't put prototypes for any
new str* in string.h. It also sounds like I can't put a prototype for
__filbuf in stdio.h (can't use _filbuf 'cause that's a valid user auto name).

karl@haddock.ISC.COM (Karl Heuer) (06/30/88)

In article <568@tuck.nott-cs.UUCP> anw@maths.nott.ac.uk (Dr A. N. Walker) writes:
>In article <2029@pt.cs.cmu.edu> jgk@speech2.cs.cmu.edu (Joe Keane) writes:
>>To be useful, strcpy and strcat should return the end of the new string.
>
>If it has to be one or t'other, don't forget that the way C works it is
>easier to find the end of a string given its beginning than to find its
>beginning given its end.

If strcpy() and strcat() return the beginning of the string, the cost of
finding the end is O(N): you make another pass through the string.  If they
return the end of the string, the cost of finding the beginning is O(1): you
already had the value, since you passed it as the first argument.  At worst,
you have to declare a temporary variable to hold it.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
Followups to comp.lang.c.

fst@mcgp1.UUCP (Skip Tavakkolian) (07/01/88)

In article <4773@haddock.ISC.COM>, karl@haddock.ISC.COM (Karl Heuer) writes:
> In article <11580010@hpisod2.HP.COM> decot@hpisod2.HP.COM (Dave Decot) writes:
> >While it's clear that we can't change the return value of strcat() because
> >of applications (such as the one above) that use it, there's nothing to
> >prevent adding more useful functions:
> Since this is comp.std.c, I'll redeclare your list in ANSI.
> >    char *strecpy(char *, char *)
> >    char *strecat(char *, char *)
> >    size_t strlcpy(char *, char *)
> >    size_t strlcat(char *, char *)
> Actually, given a useful set of end-string functions (including the simplest
> one, "char *strend(char *)"), I see no need for any of the "cat" functions.
> Even the standard strcat(x,y) is just strcpy(strend(x),y), and if the
> application was designed properly it probably already has the value of
> strend(x) in hand.
> As Doug has noted, "str*" is reserved to the implementation.  Thus, the vendor
> is free to add any of these to <string.h> as an extension (and would still
> have a conforming implementation).  I suppose many implementations will put
> strdup() there.
> Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint


Whitesmiths Ltd. had (and still has, though they are now called WSL extensions
to ANSI C) several library routines that handle multiple copy operations.
For example:

	/* I do not recall the return value, but I think it is */
	char *cpystr(destination, _1st_source [, _2nd_source, ...], NULL)
		char *destination, *_1st_source, *_2nd_source, ...;

	/* last arg passed has to be `NULL' */

so that you can say:

	char buf[BUFSIZ];
	cpystr(&buf[0], "This ", "is ", "a test", NULL);

I think this does what the originally posted code was trying to do.
(Yes/No?)

Sincerely
-- 
Fariborz ``Skip'' Tavakkolian

UUCP	...!uw-beaver!tikal!mcgp1!fst

karl@haddock.ISC.COM (Karl Heuer) (07/01/88)

In article <19644@watmath.waterloo.edu> jagardner@watmath.waterloo.edu (Jim Gardner) writes:
>Section 4.1.2 "Each header declares and defines only those identifiers listed
>in its associated section". Sounds like you can't put prototypes for any
>new str* in string.h. It also sounds like I can't put a prototype for
>__filbuf in stdio.h

It also sounds like I can't define the symbol _H_STDIO to lock out multiple
includes, since that would be defining an undocumented identifier.  Clearly
this isn't what the Committee meant.

It appears that 4.1.2 should be slightly rephrased.  Something on the order of
"Each header declares and defines no identifiers except those listed in its
associated section or the corresponding subsection of Future Library
Directions, or the globally reserved identifiers."

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint