jeff@rlgvax.UUCP (Jeffrey Kegler) (08/19/83)
As most of you know, the UNIX philosophy is to have everything have explicit terminators, instead of being terminated by counts. In strcpy() this has raised the complaint that a slight error in an argument to strcpy() can result in destruction of a considerable amount of data. This could happen when the string to be copied is not properly terminated, or is a bad pointer, or is just overlong. One can say that it is really up to the programmer to check for these things. Another part of the UNIX philosophy is, wherever something might be seen as either a protection or a restriction of the programmer, depending on point of view, to assume it is a restriction.

For those of us who do desire to be protected from our own lapses, especially when these lapses may lead to a core with its stack destroyed, strncpy() exists. It takes a third argument which gives an explicit maximum length to the string. However, for some strange reason, it still does not guarantee its result will be null terminated if it is overlong. Instead it puts characters right up into the last permitted location. Always writing strncpy(string1, string2, n); string1[n] = '\0'; solves the problem, but whenever I see this in code, I wonder why UNIX's string copy of explicitly n characters does not always result in a string of n characters.

Further, it always copies a string of the length specified, adding nulls once the source string has been terminated. I can think of situations where this is useful, but one has never occurred to me, and far more often I wind up copying a string whose maximum length is 5000 and whose average length is 10. So as a byproduct of this feature, I would get an average of 4990 extra trips through the loop.

Finally, strncpy() is far less efficient than it could be. Its main loop maintains and updates both a string pointer and a count, and each character copied requires a comparison to n, which is not in a register.
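A minimal sketch of the two strncpy() behaviors described in this post, assuming a standard C library. The helper names copy_terminated and trailing_nuls are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most n bytes with strncpy(), then force
 * a terminator. dst must have room for n + 1 bytes. */
void copy_terminated(char *dst, const char *src, size_t n)
{
    /* Pads with '\0' when src is shorter than n; writes no '\0' at all
     * when src holds n or more characters. */
    strncpy(dst, src, n);
    dst[n] = '\0';      /* the explicit terminator the post describes */
}

/* Hypothetical helper: count the '\0' bytes at the end of buf[0..n-1]. */
size_t trailing_nuls(const char *buf, size_t n)
{
    size_t count = 0;

    while (n > 0 && buf[n - 1] == '\0') {
        count++;
        n--;
    }
    return count;
}
```

Calling strncpy() on an 8-byte buffer with a 2-character source leaves 6 trailing nulls, which is the padding cost the post complains about; with a source of 8 or more characters it leaves none.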
I would get more into its behavior, but I do not know if the code is public domain, unlike strcpy(), 4 versions of which are in K&R on pages 100-101.

Below is the code for my rewrite of strncpy(), which never accesses anything except a register in its main loop (if your compiler supports this sort of thing), has a main loop one instruction shorter on the VAX (all claims are for code generated by the C optimizer), guarantees the copied string will be a legal string, even where the original was not, and does no copying beyond the end-of-string. The main efficiency improvement does not depend on my changes from the behavior of strncpy(), and scpy() could be rewritten to simulate strncpy() exactly.

===========================
/*
 * Copy a string in a fail-safe manner.
 */
void scpy(s1,s2,n)
register char *s1,*s2;
{
    register char *eos = s1+n-1;

    while(*s1++ = *s2++)
        if (s1 >= eos) {
            *eos = '\0';
            return;
        }
}
==============================

One could argue that the above is not as fail-safe as it should be, because if n is zero or negative, the byte before s1 is zeroed, and a check for such an n could be placed before the main loop.
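The check for a zero n mentioned at the end of this post could look roughly like the following. This is a sketch in present-day C, not the author's code: the name scpy_checked and the size_t parameter are assumptions, and the loop tests the bound before each store rather than after, so it gives up the one-test-per-character main loop the post is proud of.

```c
#include <stddef.h>

/* Sketch of a bounded copy in the spirit of scpy(). The name and the
 * size_t parameter are assumptions, not part of the original post.
 * Writes at most n bytes to dst and always terminates when n > 0. */
void scpy_checked(char *dst, const char *src, size_t n)
{
    char *eos;

    if (n == 0)
        return;             /* no room even for the terminator */

    eos = dst + n - 1;      /* last byte this call may write */
    while (dst < eos && *src != '\0')
        *dst++ = *src++;
    *dst = '\0';            /* terminate; no padding beyond this byte */
}
```

With n unsigned, a negative length cannot be passed at all, which is one way of handling the other half of the concern; a signed n with an explicit n <= 0 test would serve equally well.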
tll@druxu.UUCP (08/20/83)
One place where strncpy's behavior is desirable is in dealing with UNIX* directory entries. A directory entry contains a 14-character array for the filename, which is padded with nulls. It is only null-terminated if the filename is less than 14 characters long.

Tom Laidig

* UNIX is a trademark of some subsidiary of AT&T, which may have almost any name or logo, depending on the date and Judge Greene's mood.
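As a rough illustration of the directory case, the sketch below fills and reads a 14-byte name field. The struct layout, the field names, and the helper names are assumptions made for this example and are not copied from any system header.

```c
#include <string.h>

#define NAME_LEN 14     /* width of the filename field described above */

/* Illustrative layout only; not taken from a real header. */
struct dir_entry {
    unsigned short inode;
    char name[NAME_LEN];    /* null-padded, not always null-terminated */
};

/* Fill the fixed-width field. strncpy() pads short names with '\0'
 * and leaves a full 14-character name without a terminator. */
void set_entry_name(struct dir_entry *e, const char *filename)
{
    strncpy(e->name, filename, NAME_LEN);
}

/* Copy the name out as an ordinary C string.
 * out must have room for NAME_LEN + 1 bytes. */
void get_entry_name(const struct dir_entry *e, char *out)
{
    memcpy(out, e->name, NAME_LEN);
    out[NAME_LEN] = '\0';
}
```

Here the padding that looks wasteful for ordinary strings is exactly what the fixed-width field needs, and the reader has to supply the terminator when the name uses all 14 bytes.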
mash@whuxlb.UUCP (John Mashey) (08/20/83)
strcpyn() (before it was renamed strncpy()) was originally written specifically to deal with directories, accounting records, and various other places in UNIX that expect fixed-length fields containing variable length strings padded with nulls. There used to be numerous instances of slightly different code to do the strncpy/strncmp functions in-line.

-mashey
stevesu@bronze.UUCP (Steve Summit) (08/31/83)
(Tom Laidig pointed out that the sometimes peculiar behavior of strncpy(), i.e. not always appending a '\0', is useful in strings that need not be null terminated, like filenames in directories.)

By the way, that reminds me of a tidbit I learned while building directories by hand (I was salvaging a trashed filesystem, and it was PAINFUL): Those filenames that are shorter than DIRSIZE (14) characters should be fully null padded. The kernel must not use the strncmp() I'm used to, because if there are characters other than '\0' after the first '\0', it won't match.

Steve Summit
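The difference observed here can be sketched by comparing a string-style test with a whole-field test. The memcmp() version is only an assumed stand-in for a comparison that looks at all 14 bytes; it is not kernel code, and both function names are hypothetical.

```c
#include <string.h>

#define NAME_LEN 14

/* Assumed whole-field comparison: both arguments are NAME_LEN-byte
 * fields and every byte, including the padding, must match. */
int field_equal(const char *a, const char *b)
{
    return memcmp(a, b, NAME_LEN) == 0;
}

/* String-style comparison: stops at the first '\0' in either field,
 * or after NAME_LEN bytes, so bytes after the first '\0' are ignored. */
int string_equal(const char *a, const char *b)
{
    return strncmp(a, b, NAME_LEN) == 0;
}
```

A field holding "core" followed by stray non-null bytes matches a cleanly padded "core" under string_equal() but not under field_equal(), which is the failure a hand-built entry with leftover characters would run into.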
guy@rlgvax.UUCP (Guy Harris) (09/01/83)
The kernel does not use "strncmp" at all for comparing directory entries. Given this, there is a bug in "fsck" (V7, S3, 4.1BSD) where it will NOT zero out the full d_name portion of a directory entry when it is reattaching an orphan file to "lost+found". If the slot being used for that had a pathname longer than the (6 or 7, depending on which UNIX) characters used by the reattachment name, you will end up with a totally inaccessible file with a null character in the middle of its name.

Guy Harris
{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy
mrm@datagen.UUCP (09/01/83)
Strncpy() is spec'ed to completely null pad the buffer if given a shorter string, and not to append any nulls if given a same size or larger string.

Michael Meissner
Data General Corporation
...(decvax!ittvax, allegra)!datagen!mrm
hal@cornell.UUCP (Hal Perkins) (09/04/83)
Why are so many articles about C also posted to unix-wizards? Is there any need for this?

The reason I complain is that there is a lot of traffic in both news groups. I read these groups in my spare time (when waiting for troff or TeX to do something, for instance), and I rarely have time to read through both of these groups in one sitting. Readnews appears to be unable to filter articles posted to several groups unless you read all of the groups at the same time.

Of course, the real solution would be to fix the news programs so they would remember what articles have been read, regardless of when. This is a long standing bug, and it is pretty annoying that it hasn't been fixed sometime in the last few releases of readnews.

But even if the news stuff is fixed, why can't all the C articles be posted to net.lang.c, all the micro stuff to net.micro, and just the Unix operating system items to net.unix-wizards? It would be easier to filter through the tremendous amount of stuff in unix-wizards if as much of it as possible were posted to more relevant specialized groups.

Arrrrgh

Hal Perkins
UUCP: {decvax|vax135|...}!cornell!hal
Cornell Computer Science
ARPA: hal@cornell
BITNET: hal@crnlcs