[comp.std.c] strncat is insufficient

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (08/12/90)

Having recently written some code that does string manipulation and
takes some pains to avoid buffer overflow, I conclude that strncat() is
nearly useless.  Most of the time I want to avoid string overflow by
doing this:

     append str2 to str1, but cause at most n characters to be in str1.

Here is what strncat() does:

     append str2 to str1, but copy at most n characters.

I think the standard C library badly needs another function called,
say, strlimcat(), which limits the length of the destination string to
some value.
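
A minimal sketch of the mismatch described above, assuming a hypothetical
helper named append_bounded() that is not part of any library.  Because
strncat()'s count limits the characters appended rather than the size of
the destination, the caller has to subtract the current length (and
reserve a byte for the terminator) before every call:

```c
#include <string.h>

/* Hypothetical helper: append src to the string already in buf, where
   bufsize is the total size of buf in bytes, terminator included. */
void append_bounded(char *buf, size_t bufsize, const char *src)
{
    size_t used = strlen(buf);

    /* strncat's third argument counts characters to append, not the
       capacity of buf, so the remaining room is computed here.  One
       byte stays reserved for the '\0' that strncat always writes. */
    if (used + 1 < bufsize)
        strncat(buf, src, bufsize - used - 1);
}
```

With an 8-byte buffer holding "abc", appending "defghij" this way leaves
"abcdefg" in the buffer.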
--
Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP:  oliveb!cirrusl!dhesi

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/13/90)

In article <2201@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
>I think the standard C library badly needs another function called,
>say, strlimcat(), which limits the length of the destination string to
>some value.

Feel free to implement one of these for your own applications.
There isn't much point in discussing it in this newsgroup.
We're not soliciting suggestions for changes to the standard.

shankar@hpclscu.HP.COM (Shankar Unni) (08/15/90)

Rahul: 
> Most of the time I want to avoid string overflow by
> doing this:
>    append str2 to str1, but cause at most n characters to be in str1.

Well, you could write a cheapo function that goes:

strlimcat (char *s1, char *s2, int n)
{
   strncat (s1, s2, (n - strlen(s1)));
}

which is exactly what strlimcat would have to do anyway..
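
One hedged caveat on the function above: strlen() returns an unsigned
size_t, so if s1 is already longer than n the subtraction wraps around to
a very large count instead of going negative.  A sketch with a guard,
under the assumption that n is the maximum length of the result and that
s1 has room for n + 1 bytes (the name strlimcat_guarded is hypothetical):

```c
#include <string.h>

/* Sketch only: n is the maximum length of the resulting string, so the
   buffer behind s1 must hold at least n + 1 bytes. */
void strlimcat_guarded(char *s1, const char *s2, size_t n)
{
    size_t len = strlen(s1);

    /* Without this test, n - len would wrap to a huge unsigned value
       whenever s1 is already longer than n. */
    if (len < n)
        strncat(s1, s2, n - len);
}
```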
 
Doug G:
> >I think the standard C library badly needs another function called,
> >say, strlimcat(), which limits the length of the destination string to
> >some value.
> 
> Feel free to implement one of these for your own applications.
> There isn't much point in discussing it in this newsgroup.
> We're not soliciting suggestions for changes to the standard.

Ooohhh. Does that mean that C is frozen for eternity? The "ultimate,
greatest-ever language on earth"?  I hardly think so.

I think this is an appropriate place for suggestions for extensions to the
language, too, in addition to questions about clarifications of items in
the current standard...

$0.02,
-----
Shankar Unni                                   E-Mail: 
Hewlett-Packard California Language Lab.     Internet: shankar@hpda.hp.com
Phone : (408) 447-5797                           UUCP: ...!hplabs!hpda!shankar

karl@haddock.ima.isc.com (Karl Heuer) (08/16/90)

In article <12570054@hpclscu.HP.COM> shankar@hpclscu.HP.COM (Shankar Unni) writes:
>I think this is an appropriate place for suggestions for extensions to the
>language, too, in addition to questions about clarifications of items in
>the current standard...

I somewhat agree, as long as we're careful not to let the discussion explode
into the usual "let's make C into Ada" nonsense.

>>[Rahul notes that strncat() doesn't do what he usually wants]
>[Shankar shows how to do it in terms of strlen() and strncat()]

In my opinion, the string library was poorly designed in the first place; and
adding a strlimcat() along the lines you suggest would be compounding the
error.

Naive use of strcat() gives you quadratic behavior, which can be avoided by
operating at the *end* of the string instead of the beginning.  With this in
mind, I would propose replacement functions along these lines:

	#include <nstring.h>
	char *nstrend(char const *s) {
	    while (*s != '\0') ++s;
	    return (char *)s;
	}
	char *nstrcpy(char *d, char const *s) {
	    while (*s != '\0') *d++ = *s++;
	    *d = '\0';	/* terminate, so the result points at the null */
	    return d;
	}
	char *nstrlimcpy(char *d, char const *dlim, char const *s) {
	    while (d < dlim-1 && *s != '\0') *d++ = *s++;
	    *d = '\0';
	    return d;
	}

Notes:
- The `n' prefix means `new', and is used to distinguish these from the
  existing string library.  Better names are possible.

- Pointer-valued nstring functions normally point to the end of the
  destination, i.e. to the terminating null character.

- As with <string.h>, these would probably be optimized to death by a serious
  vendor.  They are likely to perform slightly better, since the algorithm
  will probably already have the correct return value in hand when the
  algorithm terminates.  (The old-string functions often have to save a copy
  of an incoming value for no other reason than to return it when done.)

- There is no `nstrcat'.  You probably don't need it: the old-string idiom
  `path=strcat(strcat(strcpy(buf, dir), "/"), file)' becomes
  `nstrcpy(nstrcpy(nstrcpy(path=buf, dir), "/"), file)'.  If you really need
  it, you can use `#define nstrcat(d, s) nstrcpy(nstrend(d), s)'.

- The extra arg to nstrlimcpy() is a limit pointer, not a length.  This way it
  will remain constant even if you're pasting several strings together.

- Unlike strncpy(), nstrlimcpy() always terminates with exactly one '\0'.
  This is the more useful behavior, now that the 14-character directory buffer
  is obsolete.

- Even if this package is generally agreed to be better than <string.h>, there
  is very little incentive for standardizing it, given that the old-string
  library already exists and is standardized.
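
The limit-pointer note above can be sketched as follows.  nstrlimcpy() is
repeated from the post so the example stands alone; join_path() and its
parameters are hypothetical names chosen for illustration, and bufsize is
assumed to be at least 1:

```c
#include <stddef.h>

/* Repeated from the post above so this example is self-contained. */
static char *nstrlimcpy(char *d, char const *dlim, char const *s)
{
    while (d < dlim - 1 && *s != '\0') *d++ = *s++;
    *d = '\0';
    return d;
}

/* Hypothetical helper: build "dir/file" in buf without overrunning it.
   Returns a pointer to the terminating '\0' of the result. */
char *join_path(char *buf, size_t bufsize, char const *dir, char const *file)
{
    char const *lim = buf + bufsize;  /* computed once, reused unchanged */
    char *p = buf;

    p = nstrlimcpy(p, lim, dir);
    p = nstrlimcpy(p, lim, "/");
    p = nstrlimcpy(p, lim, file);
    return p;
}
```

Each call resumes at the end of the previous copy, so nothing rescans the
buffer, and a result that does not fit is truncated but still terminated.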

Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint

diamond@tkou02.enet.dec.com (diamond@tkovoa) (08/16/90)

In article <12570054@hpclscu.HP.COM> shankar@hpclscu.HP.COM (Shankar Unni) writes:
>Rahul: 
>> Most of the time I want to avoid string overflow by doing this:
>>    append str2 to str1, but cause at most n characters to be in str1.
[me too -- Diamond]
>Well, you could write a cheapo function that goes:
> strlimcat (char *s1, char *s2, int n) { strncat (s1, s2, (n - strlen(s1))); }
>which is exactly what strlimcat would have to do anyway..

Well, it's cheap in terms of development time, but not usually in execution
time.  You MIGHT have a super-optimizing compiler that has learned a trick
(newly authorized by ANSI) of remembering that strlen and strcat are special,
so it can remember that the length of s1 has been computed and does not have
to be computed a second time.  But you PROBABLY don't.
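
A sketch of a version that avoids the second scan, assuming the same
meaning for n as above (maximum length of the result, so s1 needs n + 1
bytes).  The name strlimcat_onepass is hypothetical:

```c
#include <string.h>

/* Sketch only: find the end of s1 once, then copy from there. */
char *strlimcat_onepass(char *s1, const char *s2, size_t n)
{
    size_t len = strlen(s1);   /* the only scan of s1 */
    char *d = s1 + len;

    while (len < n && *s2 != '\0') {
        *d++ = *s2++;
        len++;
    }
    *d = '\0';   /* rewrites the existing terminator if nothing was copied */
    return s1;
}
```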

>Doug G:
>> There isn't much point in discussing it in this newsgroup.
>> We're not soliciting suggestions for changes to the standard.
>Ooohhh. Does that mean that C is frozen for eternity?

The standard suggests that a few features might be deprecated in the future.

>The "ultimate, greatest-ever language on earth"?

This opinion seems to be widely held though (except for some of the
adherents of C++).

>I think this is an appropriate place for suggestions for extensions to the
>language, too, in addition to questions about clarifications of items in
>the current standard...

I think so too.  But these suggestions are not being solicited.  Perhaps
we need yet another newsgroup, which certain users would not want to read.
comp.std.c.2001?
-- 
Norman Diamond, Nihon DEC     diamond@tkou02.enet.dec.com
This is me speaking.  If you want to hear the company speak, you need DECtalk.

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/16/90)

In article <17418@haddock.ima.isc.com> karl@kelp.ima.isc.com (Karl Heuer) writes:
>... I would propose replacement functions along these lines: ...

I maintain that this is inappropriate in the comp.std.c newsgroup,
since there is no connection with the topic of the C standard (unless
of course you're submitting your suggestions to ISO, which hasn't
wrapped up the IS yet).

peter@ficc.ferranti.com (Peter da Silva) (08/17/90)

In article <1925@tkou02.enet.dec.com> diamond@tkou02.enet.dec.com (diamond@tkovoa) writes:
> I think so too.  But these suggestions are not being solicited.  Perhaps
> we need yet another newsgroup, which certain users would not want to read.
> comp.std.c.2001?

comp.std.c.futures?

(I'm planning on creating an alt.c-futures newsgroup for this sort of
 discussion. Naming suggestions (alt.lang.c.futures, alt.lang.future-c,
 alt.std.c.futures, ...) are encouraged.)
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com (currently not working)
peter@hackercorp.com

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (08/23/90)

gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> karl@kelp.ima.isc.com (Karl Heuer) writes:
>>... I would propose replacement functions along these lines: ...

>I maintain that this is inappropriate in the comp.std.c newsgroup,
>since there is no connection with the topic of the C standard (unless
>of course you're submitting your suggestions to ISO, which hasn't
>wrapped up the IS yet).

[I _swore_ I wouldn't let myself get into this mess again. I find I lied.]

Unless the rules for ANSI Technical Committees have changed a lot since I
was a member, the rule is that a standard must be reviewed at least every
five years.  While the _current_ ANSI standard being formalized by ISO may
not be in the part of its life cycle where technical change input is very
welcome, the committee has the ongoing responsibility to accept and file
suggestions for change/criticisms of the current standard for the _next_
review cycle.

The standard, as opposed to its current draft, is open for suggestions so
long as the TC continues to do business.  Hoping the user community will
shut up and go away, and trying to promote that behavior by deprecating
further suggestions, are not appropriate actions within the formal 
rules for ANSI technical committees.

However, posting to comp.std.c does not (again, unless the rules have been
changed) constitute "suggesting a change"; that is done on paper, to the
TC's business address, and is best done by quoting the current standard's
wording, and suggesting new wording, in concert with reasons why the change
is a good idea.

Given all that, suggestions for improvements to the standard are as valid
a discussion topic here as are requests for clarification of the standard.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

colin@array.UUCP (Colin Plumb) (08/25/90)

char *strlimcat(char *dst, char const *src, int maxlen)
{
	int len;

	len = strlen(dst);
	if (len < maxlen)
		strncpy(dst+len, src, maxlen-len);
	return dst;
}

browns@iccgcc.decnet.ab.com (Stan Brown, Oak Road Systems) (08/26/90)

In article <587@array.UUCP>, colin@array.UUCP (Colin Plumb) writes:
> char *strlimcat(char *dst, char const *src, int maxlen)
> {
> 	int len;
> 
> 	len = strlen(dst);
> 	if (len < maxlen)
> 		strncpy(dst+len, src, maxlen-len);
> 	return dst;
> }

Caution:  This will not necessarily put a '\0' at the end of the string.

Can Karl, or someone else who knows, explain why strncpy was standardized
to copy n characters even at the expense of a zero byte; or why no
alternative that always terminates the string was provided?  This
implementation is a fertile source of bugs that seem to bite every C
programmer at least once (and not always early in a career!). 
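
A sketch of one way to close the gap noted in the caution above: keep the
strncpy() call but store the terminator explicitly.  This assumes maxlen
is the maximum string length and that dst has room for maxlen + 1 bytes;
the name strlimcat_terminated is hypothetical:

```c
#include <string.h>

/* Sketch only: dst must have room for maxlen + 1 bytes. */
char *strlimcat_terminated(char *dst, char const *src, size_t maxlen)
{
    size_t len = strlen(dst);

    if (len < maxlen) {
        strncpy(dst + len, src, maxlen - len);
        /* strncpy writes no '\0' when src has maxlen - len or more
           characters, so the terminator is stored unconditionally. */
        dst[maxlen] = '\0';
    }
    return dst;
}
```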

Stan Brown, Oak Road Systems, Cleveland, Ohio, U.S.A.         (216) 371-0043
The opinions expressed are mine. Mine alone!  Nobody else is responsible for
them or even endorses them--except my cat Dexter, and he signed the power of
attorney only under my threat to cut off his Cat Chow!

dik@cwi.nl (Dik T. Winter) (08/26/90)

In article <620.26d6d0cd@iccgcc.decnet.ab.com> browns@iccgcc.decnet.ab.com (Stan Brown, Oak Road Systems) writes:
 > In article <587@array.UUCP>, colin@array.UUCP (Colin Plumb) writes:
 > > char *strlimcat(char *dst, char const *src, int maxlen)
....
 > > 		strncpy(dst+len, src, maxlen-len);
....
I would say that strncat is better here.  Consider when maxlen-len is very large
while strlen(src) is very small.  strncpy will do all that zero-filling!
 > 
 > Caution:  This will not necessarily put a '\0' at the end of the string.
True.
 > 
 > Can Karl, or someone else who knows, explain why strncpy was standardized
 > to copy n characters even at the expense of a zero byte; or why no
 > alternative that always terminates the string was provided.
I would not gripe against strncpy, because it truncates or zero fills,
whatever is needed.  In my opinion the gripe is against strncat.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl

henry@zoo.toronto.edu (Henry Spencer) (08/26/90)

In article <620.26d6d0cd@iccgcc.decnet.ab.com> browns@iccgcc.decnet.ab.com (Stan Brown, Oak Road Systems) writes:
>Can Karl, or someone else who knows, explain why strncpy was standardized
>to copy n characters even at the expense of a zero byte...

Because that is the way strncpy() behaved in existing implementations,
and there was code that depended on it.

> or why no
>alternative that always terminates the string was provided...

Because there was no such alternative that had been implemented and used.

ANSI C standardized -- by and large -- an existing language.  This is a
feature, not a bug.
-- 
Committees do harm merely by existing. | Henry Spencer at U of Toronto Zoology
                       -Freeman Dyson  |  henry@zoo.toronto.edu   utzoo!henry

wildbill@haddock.ima.isc.com (Bill Torcaso) (08/27/90)

I recall a kernel guru telling me (in the Unix System III era, which is
when the strn* functions appeared), that strncpy was used to copy a
filename into a 14-character long entry in a directory.  If the name was
shorter, it got NUL-extended.  Otherwise it filled the 14 chars and no
space was 'wasted' on the NUL byte.
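
A sketch of that use, with an illustrative struct modeled loosely on the
old 14-character directory entry.  The struct layout and the helper names
set_name() and name_matches() are assumptions for the example, not a
claim about any particular kernel source:

```c
#include <string.h>

#define DIRSIZ 14   /* width of the name field */

/* Illustrative layout only. */
struct old_dirent {
    unsigned short d_ino;
    char d_name[DIRSIZ];   /* '\0'-padded, not necessarily terminated */
};

/* Short names are NUL-extended to fill the field; names of DIRSIZ or
   more characters fill it completely and no '\0' is stored. */
void set_name(struct old_dirent *e, const char *name)
{
    strncpy(e->d_name, name, DIRSIZ);
}

/* The field cannot be handed to strcmp(), so compare with a bound. */
int name_matches(const struct old_dirent *e, const char *name)
{
    return strncmp(e->d_name, name, DIRSIZ) == 0;
}
```

This is the case strncpy()'s truncate-or-pad behavior fits: a fixed-width
field rather than a C string.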