stephen@dcl-cs.UUCP (Stephen J. Muir) (11/08/85)
I think that there should be a function called "strnlen" as follows: int strnlen (string, size) char *string; int size; where "size" is the maximum number of bytes in "string". The reason is that, if a character array is passed as argument, and it is not terminated with a null byte, then "strlen" will keep going, possibly hitting an unallocated piece of memory. -- UUCP: ...!seismo!mcvax!ukc!dcl-cs!stephen DARPA: stephen%comp.lancs.ac.uk@ucl-cs | Post: University of Lancaster, JANET: stephen@uk.ac.lancs.comp | Department of Computing, Phone: +44 524 65201 Ext. 4599 | Bailrigg, Lancaster, UK. Project:Alvey ECLIPSE Distribution | LA1 4YR
gwyn@BRL.ARPA (VLD/VMB) (11/11/85)
Re: proposed strnlen() All the str*() functions work with NUL-terminated strings. Probably the best thing that could happen if you fed them something else would be for the implementation to access unallocated memory. At least then you would be able to find the bug. Your suggestion would make more sense for the mem*() functions, which work with arbitrary byte arrays. A good generalization would be "find the first occurrence of a given byte, not going beyond a certain distance". If you look up memchr(), you will see that that's what it does. It is possible that your system is not System V compatible, in which case the memchr() function may not be in your C library. But it is very easy to implement and there is no need to invent a different-shaped wheel. char *memchr( const char *s, int c, int n ); returns a pointer to the first occurrence of `c' in the first `n' bytes of memory area (whose lowest address is) `s', or NULL if not found. Calling this function with `c' set to '\0' is equivalent to your proposed strnlen(). (You would have to handle the exception case, and use whatever memchr() returns minus `s' in the regular case for the length of the string.)
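A minimal sketch of the memchr() approach described above. It is written against the ANSI C prototype (void * and size_t) rather than the System V one quoted, and the name bounded_len is hypothetical, not a library function. It treats "no NUL found" as length n, which is one possible way of handling the exception case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of s, looking at no more than n bytes.
 * Returns n when no NUL byte is found in the first n bytes. */
size_t bounded_len(const char *s, size_t n)
{
    const char *end;

    assert(s != NULL);
    end = memchr(s, '\0', n);            /* NULL if no NUL within n bytes */
    return end ? (size_t)(end - s) : n;  /* pointer minus s is the length */
}
```

A caller with a fixed-size field would pass the field width as n, for example bounded_len(field, sizeof field).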
dlc@a.sei.cmu.edu (Daryl Clevenger) (11/13/85)
One should never allow a character array to not have a null terminating byte. Doing things like this should be automatic. Besides, usually when the program dumps core, one usually can find the problem readily. Also, if I am not mistaken, having character arrays that do not have terminating null bytes will cause problems with many other functions, e.g. printf(). printf() (or maybe _doprint() I'm not sure which) will keep printing characters until they hit that null byte, but they probably won't find it where it should be. Unless I'm wrong, making sure strings are null terminated should be as automatic as making sure that you aren't trying to use NULL pointers or that malloc() returns a valid pointer.
vishniac@wanginst.UUCP (Ephraim Vishniac) (11/14/85)
> I think that there should be a function called "strnlen" as follows: > > int strnlen (string, size) > char *string; > int size; > > where "size" is the maximum number of bytes in "string". > I agree that such a facility is needed, but I don't think it will ever be provided as "standard" C. The basic problem (or the C problem, if you prefer :-) is that C defines a representation for strings, but leaves the user to implement the operations. This allows one to cook up all sorts of invalid strings (such as the unterminated ones the original poster is worried about). To my mind, a string consists of three things: 1. The characters of the string; 2. The length of the string, either as such or encoded by marking the string; 3. The storage block where the string is located, which has its own attributes (alignment and size, to name two). C has no problem with the first (the characters are easy to access); some problems with the second (null termination is good for some purposes, rotten for others); and completely ignores the third. But: since this is C, you don't have to use the standard representation and functions. Just as I did when sufficiently burned, you can use your own representation and macros. Then the only problem is that nobody will use your modules, because they're "non-standard". -- Ephraim Vishniac [apollo, bbncca, cadmus, decvax, harvard, linus, masscomp]!wanginst!vishniac vishniac%Wang-Inst@Csnet-Relay
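The three-part view of a string above (characters, length, storage block) can be sketched as a small struct. This is only an illustration under assumptions: the type name cstr, its field names, and cstr_append are hypothetical and are not the representation the poster actually used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string type: characters, length, and storage size
 * are tracked together instead of relying on a terminating NUL. */
struct cstr {
    char  *data;   /* 1. the characters (not necessarily NUL-terminated) */
    size_t len;    /* 2. the current length in bytes */
    size_t cap;    /* 3. the size of the storage block at data */
};

/* Append n bytes from src; returns 0 on success, -1 if they do not fit.
 * Assumes the invariant len <= cap holds on entry. */
int cstr_append(struct cstr *s, const char *src, size_t n)
{
    assert(s != NULL && src != NULL);
    if (n > s->cap - s->len)
        return -1;                 /* would overrun the storage block */
    memcpy(s->data + s->len, src, n);
    s->len += n;
    return 0;
}
```

Because the length and capacity travel with the pointer, an operation can refuse to overrun the block rather than trusting a terminator to be present.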
mikes@3comvax.UUCP (Mike Shannon) (11/14/85)
Stephen Muir in the cited article: > I think that there should be a function called "strnlen" as follows: > > int strnlen (string, size) > char *string; > int size; > > where "size" is the maximum number of bytes in "string". > > The reason is that, if a character array is passed as argument, and it is not > terminated with a null byte, then "strlen" will keep going, possibly hitting > an unallocated piece of memory. Then write one! Geez!! -- Michael Shannon {ihnp4,hplabs}!oliveb!3comvax!mikes
mek@rruxd.UUCP (M Kaufman) (11/16/85)
>I think that there should be a function called "strnlen" as follows: >int strnlen (string, size) > char *string; > int size; > >where "size" is the maximum number of bytes in "string". I THINK WE SHOULD PLACE A TAX ON ALL FOREIGNERS LIVING ABROAD! With Apologies to R. P. Gumby, Esq.
jack@boring.UUCP (11/16/85)
In article <207@a.sei.cmu.edu> dlc@a.sei.cmu.edu (Daryl Clevenger) writes:
>One should never allow a character array to not have a null terminating byte.

That's right, one *should* never allow it. On the other hand, if you look at the format of /etc/utmp (v7, at least, dunno about 4.n or S5), there's 8 bytes available for loginnames, so if you've got an 8 char loginname, it won't be zero terminated. Or, look at the V7 directory entry for a 14-char filename. I *know* these should be avoided, but I would still prefer to see code like
    len = strnlen(buffer,BUFSIZ);
above
    if( (len=strlen(buffer))>BUFSIZ) len=BUFSIZ;
which will fail at some unknown point in the future, long after everyone who knew the code has passed away.......
--
Jack Jansen, jack@mcvax.UUCP
The shell is my oyster.
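For a fixed-width field like the 8-byte login name mentioned above, one common way to cope is to copy the field into a buffer one byte larger and terminate the copy. The sketch below assumes a hypothetical record type; struct rec, NAME_FIELD, and rec_name are illustrative names, not the real utmp definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_FIELD 8   /* assumed field width, as in the utmp example */

/* Hypothetical record with a fixed-width, possibly unterminated name. */
struct rec {
    char name[NAME_FIELD];
};

/* Copy the field into out (at least NAME_FIELD + 1 bytes) and terminate
 * it, so the ordinary str*() functions can be used on the copy.
 * Returns the length of the name. */
size_t rec_name(const struct rec *r, char *out)
{
    size_t n = 0;

    assert(r != NULL && out != NULL);
    while (n < NAME_FIELD && r->name[n] != '\0')
        n++;                       /* bound checked first: never reads past the field */
    memcpy(out, r->name, n);
    out[n] = '\0';
    return n;
}
```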
jsdy@hadron.UUCP (Joseph S. D. Yao) (11/16/85)
In article <6691@boring.UUCP> jack@boring.UUCP (Jack Jansen) writes:
>I *know* these should be avoided, but I would still prefer to see
>code like
>    len = strnlen(buffer,BUFSIZ);
>above
>    if( (len=strlen(buffer))>BUFSIZ) len=BUFSIZ;
>which will fail at some unknown point in the future, long
>after everyone who knew the code has passed away.......

That code should have failed immediately! If it ever got to a code review, it should have been shot dead immediately. You NEVER assume that there's something beyond where you put it, unless ... you put it there. ;-)

#define NUL '\0'    /* Now, class, tell me why this is not NULL. */

    char buffer[BUFLEN+1];
    ...
    fread(buffer, 1, BUFLEN, infile);
    buffer[BUFLEN] = NUL;
    len = strlen(buffer);    /* if you must */
--
    Joe Yao    hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
steiny@scc.UUCP (Don Steiny) (11/18/85)
**
Yikes! Too much talk, I have 25 seconds to spare, here:
strnlen(str,len)
char *str;
{
register *s;
for(s=str;*s && (int) s-str<len;s++)
;
return((int) s-str);
}
--
scc!steiny
Don Steiny @ Don Steiny Software
109 Torrey Pine Terrace
Santa Cruz, Calif. 95060
(408) 425-0382
stephen@dcl-cs.UUCP (Stephen J. Muir) (11/19/85)
In article <207@a.sei.cmu.edu> dlc@a.sei.cmu.edu (Daryl Clevenger) writes: >One should never allow a character array to not have a null terminating byte. This is absolute rubbish. If I want character arrays without a terminating null byte then I'm quite entitled to do that. In fact, I *have* to do that as I'm writing interface routines for ADA. "strnlen" is to "strlen" as "strncmp" is to "strcmp". I've written the routine myself now, but I just think that it should be part of the standard library, that's all. -- UUCP: ...!seismo!mcvax!ukc!dcl-cs!stephen DARPA: stephen%comp.lancs.ac.uk@ucl-cs | Post: University of Lancaster, JANET: stephen@uk.ac.lancs.comp | Department of Computing, Phone: +44 524 65201 Ext. 4599 | Bailrigg, Lancaster, UK. Project:Alvey ECLIPSE Distribution | LA1 4YR
levy@ttrdc.UUCP (Daniel R. Levy) (11/19/85)
In article <562@scc.UUCP>, steiny@scc.UUCP (Don Steiny) writes: >** > > Yikes! Too much talk, I have 25 seconds to spare, here: > Yikes! Needs a brushup, I have 25 seconds to spare, here: [:-)] > int strnlen(str,len) ^^^ > char *str; int len; ^^^^^^^^ > { > register char *s; ^^^^ > > for(s=str;*s && s-str<len;s++) /* Diff of pointers is auto- matically cast to int */ > ; > return(s-str); > } > /* Did I get it right, Guy? Huh? Huh??? :-) */ > >-- >scc!steiny >Don Steiny @ Don Steiny Software >109 Torrey Pine Terrace >Santa Cruz, Calif. 95060 >(408) 425-0382 -- ------------------------------- Disclaimer: The views contained herein are | dan levy | yvel nad | my own and are not at all those of my em- | an engihacker @ | ployer or the administrator of any computer | at&t computer systems division | upon which I may hack. | skokie, illinois | -------------------------------- Path: ..!ihnp4!ttrdc!levy
mash@mips.UUCP (John Mashey) (11/19/85)
> One should never allow a character array to not have a null terminating byte. > Doing things like this should be automatic. Yes, except that for (at the time, and perhaps still) good reasons, certain structures had strings that might not be null-terminated. Such include directories (except in 4.2), accounting file command names, utmp/wtmp. Those things are why the strn* routines were written in the first place. I generally agree with the first advice, but there are at least a few occasions where one extra byte is fairly handy. -- -john mashey UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!mash DDD: 415-960-1200 USPS: MIPS Computer Systems, 1330 Charleston Rd, Mtn View, CA 94043
roger@dedalus.UUCP (Roger L. Cordes Jr.) (11/19/85)
What is the big deal here? How about:
extern int strlen();
static int _nstrnlen_; /* to avoid two calls to strlen() */
#define strnlen(S,N) ( (_nstrnlen_=strlen(S)) > (N) ? (N) : _nstrnlen_ )
or:
int strnlen(s,n)
char *s;
int n;
{
char *c;
int len = 0;
for ( c=s; *c && (len<n); c++, len++ )
;
return(len);
}
Was it a joke? Did I miss something?
Roger L. Cordes, Jr. William G. Daniel & Associates
...!mcnc!ikonas!dedalus!roger 8000 Regency Parkway, Suite 140
(919) 467-9708 Cary, N.C. 27511
bc@cyb-eng.UUCP (Bill Crews) (11/20/85)
> One should never allow a character array to not have a null terminating byte.
> Doing things like this should be automatic. Besides, usually when the program
> dumps core, one usually can find the problem readily. Also, if I am not
> mistaken, having character arrays that do not have terminating null bytes
> will cause problems with many other functions, e.g. printf(). printf() (or
> maybe _doprint() I'm not sure which) will keep printing characters until they
> hit that null byte, but they probably won't find it where it should be.
> Unless I'm wrong, making sure strings are null terminated should be as
> automatic as making sure that you aren't trying to use NULL pointers or
> that malloc() returns a valid pointer.

It should be obvious that one is sometimes governed by external constraints, such as the format of a directory entry. If a file or directory entry format contains a fixed-length field which contains textual data that can fill the entire field, one must deal with it somehow.
--
- bc - ..!{seismo,topaz,gatech,nbires,ihnp4}!ut-sally!cyb-eng!bc (512) 835-2266
gwyn@BRL.ARPA (VLD/VMB) (11/20/85)
Maybe you should have taken more than 25 seconds. Your strnlen() code has a bug (repeated twice).
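The post does not say which bug is meant. Two candidates appear in the code posted earlier in the thread. First, a cast binds tighter than subtraction, so `(int) s-str` parses as `((int) s) - str`. Second, testing `*s` before the bound reads the byte at `str[len]` when no NUL is found. A sketch that avoids both, assuming ANSI-style prototypes and a hypothetical function name:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical corrected version: the bound is tested before the byte
 * is read, so str[n] is never touched, and the pointer difference is
 * converted as a whole rather than casting one operand. */
size_t ptr_bounded_len(const char *str, size_t n)
{
    const char *s;

    assert(str != NULL);
    for (s = str; (size_t)(s - str) < n && *s != '\0'; s++)
        ;
    return (size_t)(s - str);
}
```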
jsdy@hadron.UUCP (Joseph S. D. Yao) (11/20/85)
OK. Let's all stop arguing over whether this routine should exist.

#define NUL '\0'

/*
** This routine, for whatever reason, wants to calculate the length
** of string 's' but be sure that it stops before element 'n'.
*/
int strnlen(s, n)
  register char *s;
  register int n;
{
    register int i;

    for (i = 0; i < n; i++) {
        if (*s++ == NUL)
            break;
    }

    return(i);
}

There. It exists. If you must use it, do so. I will try not to, but may have to some day. Fair 'nuff?
--
    Joe Yao    hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
peter@graffiti.UUCP (Peter da Silva) (11/21/85)
> #define NUL '\0' /* Now, class, tell me why this is not NULL. */
>
> char buffer[BUFLEN+1];
> ...
> fread(buffer, 1, BUFLEN, infile);
> buffer[BUFLEN] = NUL;
> len = strlen(buffer); /* if you must */

Uh...

    len = fread(buffer, BUFLEN, 1, infile);
    if(len==0)
        aha(endoffile);
    if(len<0)
        ohmygod(panic);
    else
        buffer[len] = NUL;

...would be much better code. Don't assume fread succeeds (it fails at least once on almost any file :-)).
--
Name: Peter da Silva
Graphic: `-_-'
UUCP: ...!shell!{graffiti,baylor}!peter
IAEF: ...!kitty!baylor!peter
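One caveat on the reply above: fread() returns the number of complete items read as an unsigned size_t, so with an item size of BUFLEN and a count of 1 it can only return 0 or 1, and it is never negative. A sketch that asks for BUFLEN one-byte items, so the return value is a byte count, and uses ferror() to tell an error from end of file. The value of BUFLEN and the name read_terminated are assumptions.

```c
#include <assert.h>
#include <stdio.h>

#define BUFLEN 512   /* assumed size; the thread does not give one */

/* Hypothetical helper: read up to BUFLEN bytes and NUL-terminate them.
 * buffer must have room for BUFLEN + 1 bytes.
 * Returns the number of bytes read (0 at end of file), or -1 on error. */
long read_terminated(char *buffer, FILE *infile)
{
    size_t len;

    assert(buffer != NULL && infile != NULL);
    len = fread(buffer, 1, BUFLEN, infile);  /* item size 1: count is in bytes */
    if (len == 0 && ferror(infile))
        return -1;                           /* read error, not end of file */
    buffer[len] = '\0';                      /* len <= BUFLEN, so in bounds */
    return (long)len;
}
```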
preece@ccvaxa.UUCP (11/21/85)
> I think that there should be a function called "strnlen" as follows: > > int strnlen (string, size) > char *string; > int size; > > where "size" is the maximum number of bytes in "string". > /* Written 4:14 pm Nov 7, 1985 by stephen@dcl-cs.UUCP > in ccvaxa:net.lang.c */ ---------- This seems perfectly reasonable and useful. Most of the previously posted responses seem pretty obnoxious. On the other hand, where you should have put this is mod.std.c. This is something that belongs in the standard for the string library routines. You might make a proposal for what value you'd like returned if no null character is found before the given limit (-1 and size are reasonable alternatives). -- scott preece gould/csd - urbana ihnp4!uiucdcs!ccvaxa!preece
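The two return conventions mentioned above (size or -1 when no null character is found) can be compared side by side. Both functions below are hypothetical sketches with made-up names; neither is a proposed standard interface.

```c
#include <assert.h>
#include <stddef.h>

/* Convention A (hypothetical): return size when no NUL is found.
 * The result is always usable directly as a byte count. */
int len_or_size(const char *s, int size)
{
    int i;

    assert(s != NULL);
    for (i = 0; i < size; i++)
        if (s[i] == '\0')
            break;
    return i;
}

/* Convention B (hypothetical): return -1 when no NUL is found, so the
 * caller can detect an unterminated field but must check before use. */
int len_or_minus1(const char *s, int size)
{
    int i;

    assert(s != NULL);
    for (i = 0; i < size; i++)
        if (s[i] == '\0')
            return i;
    return -1;
}
```

Convention A suits callers that only want a safe length; convention B suits callers that treat a missing terminator as an error.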
steiny@scc.UUCP (Don Steiny) (11/22/85)
> > >** > > > int strnlen(str,len) > ^^^ > > char *str; > > int len; > ^^^^^^^^ Why bother to do that? It is implicitly int anyway. -- scc!steiny Don Steiny @ Don Steiny Software 109 Torrey Pine Terrace Santa Cruz, Calif. 95060 (408) 425-0382
gwyn@BRL.ARPA (VLD/VMB) (11/25/85)
I disagree that strnlen() should be proposed for the X3J11 standard. The draft standard already includes memchr(), which offers similar functionality. The standard should not serve as a repository for everybody's personal favorite private function library.
jsdy@hadron.UUCP (Joseph S. D. Yao) (11/27/85)
In article <715@dedalus.UUCP> roger@dedalus.UUCP (Roger L. Cordes Jr.) writes:
> What is the big deal here? How about:
>extern int strlen();
>static int _nstrnlen_; /* to avoid two calls to strlen() */
>#define strnlen(S,N) ( (_nstrnlen_=strlen(S)) > (N) ? (N) : _nstrnlen_ )

Once again: doing a strlen() on something not guaranteed to be NUL-terminated is guaranteed to net you a core dump sooner or later. You can declare str to be char str[MAXSIZE+1] and always do a two-step:

    str[MAXSIZE] = NUL;
    n = strlen(str);

or you can use the deplorable function I put on the net to try and end this line of discussion; or you can give the people who use your code fun trying to figure out "Oh no! How did I break the computer? Where did the core dump, and can I sweep it under the rug before morning?" ;-)

Incidentally, Doug's suggestion to use memchr() is great, once we all have ANSI compilers. ;-}
--
    Joe Yao    hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
preece@ccvaxa.UUCP (11/28/85)
If someone would post either a draft of x3j11 or a pointer to somewhere whence it can be FTPed, more of us might know what was in it... -- scott preece gould/csd - urbana ihnp4!uiucdcs!ccvaxa!preece
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (11/30/85)
> If someone would post either a draft of x3j11 or a pointer to > somewhere whence it can be FTPed, more of us might know what > was in it... I asked its editor about this, and he noted that (a) It is too big for net-copying around (b) To cover costs, ANSI needs to charge a nominal fee for membership or observership I think you can get a copy from the X3 Secretariat for on the order of $20. Perhaps someone has more exact information on this..
peter@baylor.UUCP (Peter da Silva) (01/18/86)
> One should never allow a character array to not have a null terminating byte.

Except that in lots of places in UNIX you find character arrays that may or may not be null-terminated. Examples: directory entries and entries in /etc/utmp. The following program fragment lists all files in the current directory separated by commas:

    if(!(fp = fopen(".", "r"))) {
        perror(".");
        return(ERROR);
    }
    commastring = "";
    while(fread(dirp, 14, 1, fp))
        if(dirp->d_ino) {
            printf("%s%.14s", commastring, dirp->d_name);
            commastring = ", ";
        }
    if(commastring[0])
        putchar('\n');

dirp->d_name may or may not be null terminated. Printf doesn't get bent out of shape over it, now does it?

(please, no flames from 4.2 people who want me to use their routines. I know all about them. (1) I'm on a 4.2 system right now. (2) I posted a generic UNIX implementation of them to net.sources a few months ago (and had to deal with non-null-terminated strings then)).
--
-- Peter da Silva
-- UUCP: ...!shell!{baylor,graffiti}!peter; MCI: PDASILVA; CIS: 70216,1076
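The precision trick used above works because a precision on %s limits how many bytes the conversion will read and write, so the array need not be terminated as long as the precision does not exceed its size. A small sketch using "%.*s", which takes the limit as an int argument; NAME_WIDTH and format_name are hypothetical names, and snprintf() assumes a C99 library rather than the V7-era one.

```c
#include <assert.h>
#include <stdio.h>

#define NAME_WIDTH 14   /* V7 filename field width, as in the post above */

/* Hypothetical helper: format a fixed-width name field into out.
 * The conversion stops after NAME_WIDTH bytes or at a NUL, whichever
 * comes first, so field need not be NUL-terminated.
 * Returns the number of characters the name occupies. */
int format_name(char *out, size_t outsize, const char *field)
{
    assert(out != NULL && field != NULL);
    return snprintf(out, outsize, "%.*s", NAME_WIDTH, field);
}
```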