chris@mimsy.UUCP (Chris Torek) (03/22/88)
In article <10731@mimsy.UUCP> I wrote:
>>	/* remove leading junk (n < strlen(buf)) */
>>	(void) strcpy(buf, buf + n);
In article <7506@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
>This usage was never a good idea, because a valid implementation of
>strcpy() would be to copy right-to-left rather than left-to-right
`That turns out not to be the case'---or rather, are you certain?
I agree that a generic block copy operation (one of {memcpy, memmove}
---I cannot remember which allows overlap) might do this; I do not
agree that strcpy() may be implemented that way.  (I could be wrong.)
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu	Path: uunet!mimsy!chris
ok@quintus.UUCP (Richard A. O'Keefe) (03/22/88)
The UNIX manuals say of strcpy(s1, s2) that it
"copies s2 to s1, stopping after the null character has been copied."
While they don't strictly speaking say anything about the order in which
the other characters are copied, they _do_ say that the NUL character must
be copied last, so
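    char *strcpy(char *dst, char *src)
    {
        int n = strlen(src) + 1;          /* count includes the NUL */
        dst += n, src += n;               /* point one past each end */
        while (--n >= 0) *--dst = *--src; /* copy right to left */
        return dst;                       /* back at the start of dst */
    }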
is clearly illegal (it copies the NUL first).
wca@ut-emx.UUCP (William C. Anderson) (03/22/88)
In article <10753@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
-> In article <7506@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
-> ->This usage was never a good idea, because a valid implementation of
-> ->strcpy() would be to copy right-to-left rather than left-to-right
-> `That turns out not to be the case'---or rather, are you certain?
Chris is right here, Doug.  For example, the ndbm(3) routines in 4.3BSD
depend upon bcopy() doing the correct ordering in cases of overlap.
Luckily, it is simple to do the code correctly.
William Anderson - University of Texas Computation Center - wca@emx.utexas.edu
gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/23/88)
In article <10753@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
-In article <7506@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
->This usage was never a good idea, because a valid implementation of
->strcpy() would be to copy right-to-left rather than left-to-right
-I agree that a generic block copy operation (one of {memcpy, memmove}
----I cannot remember which allows overlap) might do this; I do not
-agree that strcpy() may be implemented that way.  (I could be wrong.)
I've never seen a specification of strcpy() that promised left-to-right
processing.  The dpANS specifies (redundantly, now that noalias qualifiers
are shown on the parameters) that copying between overlapping objects
results in undefined behavior.
gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/23/88)
In article <1304@ut-emx.UUCP> wca@ut-emx.UUCP (William C. Anderson) writes:
-In article <10753@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
--> In article <7506@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
--> ->This usage was never a good idea, because a valid implementation of
--> ->strcpy() would be to copy right-to-left rather than left-to-right
--> `That turns out not to be the case'---or rather, are you certain?
-Chris is right here, Doug.  For example, the ndbm(3) routines in 4.3BSD
-depend upon bcopy() doing the correct ordering in cases of overlap.
Talk about non-sequiturs!  The subject was strcpy() and the implications
of noalias on its parameters.
jrl@anuck.UUCP (j.r.lupien) (03/24/88)
From article <793@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
> The UNIX manuals say of strcpy(s1, s2) that it
> "copies s2 to s1, stopping after the null character has been copied."
> While they don't strictly speaking say anything about the order in which
> the other characters are copied, they _do_ say that the NUL character must
> be copied last, so
Stopping after something occurs, as with "after the NULL has been copied"
does NOT equate, as you go on to assume, to "nothing will be done after
the NULL is copied. The function will return immediately."
The definition says "after". "After", as I recall, means "not before",
which in no way precludes doing the required act first, and taking
care of other requirements next.
As an aside, I would be loath to assume that the string was copied
front to rear even if the function WAS so specified. Documentation
has been known in the past to lag or lead the actual code, or to
simply ignore it.
The moral of this is, don't depend on bizarre side effects unless
there is no other efficient way to get the job done, and even then
be quite sure that things work the way you expect (test it). Be
prepared to adopt some less efficient method, because you WILL
get bitten.
John R. Lupien
twitch!mvuxa!anuxh!jrl
Watch out for that piranha!
ok@quintus.UUCP (Richard A. O'Keefe) (03/24/88)
In article <545@anuck.UUCP>, jrl@anuck.UUCP (j.r.lupien) writes:
> From article <793@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
> > The UNIX manuals say of strcpy(s1, s2) that it
> > "copies s2 to s1, stopping after the null character has been copied."
> > While they don't strictly speaking say anything about the order in which
> > the other characters are copied, they _do_ say that the NUL character must
> > be copied last, so
> Stopping after something occurs, as with "after the NULL has been copied"
> does NOT equate, as you go on to assume, to "nothing will be done after
> the NULL[*] is copied. The function will return immediately."
That's not what I assumed.  The function could well compute factorial 5000.
So what?  The manual says that COPYING stops after the NUL character has
been copied.  So whatever strcpy does after copying NUL, either it doesn't
copy any part of s2 to s1, or the manual entry is just plain wrong (which
would not be unprecedented).  The point of my message was that AT&T
documentation provides some warrant for expecting a left-to-right order
rather than some other order.
The VMS C documentation says that strcpy(str_1, str_2)
"copies str_2 into str_1, stopping after copying str_2's NUL character."
which again says that COPYING stops after the NUL is copied.
> The moral of this is, don't depend on bizarre side effects unless
The order in which strcpy works is hardly a "bizarre side effect".
The ADA LRM takes the trouble to point out in section 5.2.1 that
the effect of assignments such as
    A : STRING(1..31);
    A(1..9) := "tar sauce";
    A(4..12) := A(1..9);
yields A(1..12) = "tartar sauce".  There doesn't seem to be any good
reason for C being less well defined.  We want a block transfer which
works correctly with overlapping blocks.  We want a "string" transfer
which is defined to copy left to right (only one direction, because
C "strings" have one easy end (p) and one hard end (p+strlen(p))).
And there is room for incompletely specified block/string transfers as well.
Not relying on the documentation, as (j.r.lupien) suggests, leads to
people writing their own version of the C library so that they know what
will happen.  If the ANSI C library doesn't include something like
strcpy() which is defined to work left to right, people will have to
keep on rolling their own.
[*] The name of the null character is NUL, not NULL, just as
    the name of the bell character is BEL, not BELL.
PS: I have found on some machines that calling my own routine
    char *mycpy(char *dst, const char *src)
    {
        char *d;
        const char *s;
        /* copy left to right; the NUL is the last character stored */
        for (d = dst, s = src; (*d++ = *s++) != '\0'; ) ;
        return dst;
    }
can be FASTER than calling the vendor's strcpy()!  You might like to
measure it for yourself.  I was very surprised by this result.
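An illustrative sketch, not from the original posts: the Ada slice assignment quoted above can be reproduced in C with memmove(), which is defined for overlapping regions. The function names slide() and tartar() are made up for this example, and the caller is assumed to pass a buffer of at least 13 bytes.

```c
#include <string.h>

/*
 * Hypothetical helper: copy `len` bytes from buf[from..] to buf[to..]
 * inside one array.  memmove() is specified to handle overlap;
 * memcpy() and strcpy() are not.
 */
void slide(char *buf, size_t to, size_t from, size_t len)
{
    memmove(buf + to, buf + from, len);
}

/* Builds "tartar sauce" in `a`, which must hold at least 13 bytes. */
void tartar(char *a)
{
    memcpy(a, "tar sauce", 9);   /* A(1..9)  := "tar sauce" */
    slide(a, 3, 0, 9);           /* A(4..12) := A(1..9)     */
    a[12] = '\0';                /* C strings need a terminator */
}
```

Here the overlap runs toward higher indices, which is the direction a plain left-to-right character loop would get wrong.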
barmar@think.COM (Barry Margolin) (03/25/88)
In article <810@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>In article <545@anuck.UUCP>, jrl@anuck.UUCP (j.r.lupien) writes:
>> From article <793@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
>> > The UNIX manuals say of strcpy(s1, s2) that it
>> > "copies s2 to s1, stopping after the null character has been copied."
>> > While they don't strictly speaking say anything about the order in which
>> > the other characters are copied, they _do_ say that the NUL character must
>> > be copied last, so
>> Stopping after something occurs, as with "after the NULL has been copied"
>> does NOT equate, as you go on to assume, to "nothing will be done after
>> the NULL[*] is copied. The function will return immediately."
>
>That's not what I assumed.  The function could well compute factorial 5000.
>So what?  The manual says that COPYING stops after the NUL character has
>been copied.  So whatever strcpy does after copying NUL, either it doesn't
>copy any part of s2 to s1, or the manual entry is just plain wrong (which
>would not be unprecedented).  The point of my message was that AT&T
>documentation provides some warrant for expecting a left-to-right order
>rather than some other order.
Will you guys stop playing word games, and think about what that
sentence was really intended to mean?  I think the point of the
"stopping after the NUL" phrase is that it doesn't copy any characters
after the NUL.  Thus, if you have
    char dest[10], source[10];
    strcpy (source, "abcdefghi");
    strcpy (dest, "123456789");
    source [3] = '\0';
    strcpy (dest, source);
the resulting contents of dest will be
    'a' 'b' 'c' '\0' '5' '6' '7' '8' '9' '\0'
i.e. the last six characters are not affected.  Programmers accustomed
to some other programming languages might have expected all the
declared contents of the string to be copied, and this phrase serves
as a reminder that string functions don't know about declared array
dimensions, all they know is that '\0' ends a string.
Barry Margolin
Thinking Machines Corp.
barmar@think.com
uunet!think!barmar
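An illustrative sketch, not from the original posts: the example above wrapped in a function so that the bytes past the NUL can be inspected. The name short_copy_demo() is made up here, and the "tail is untouched" outcome is what one would expect from an implementation that writes only the string and its terminator, not something this sketch proves for every library.

```c
#include <string.h>

/* Fills dest (10 bytes) following the steps of the example above. */
void short_copy_demo(char dest[10])
{
    char source[10];

    strcpy(source, "abcdefghi");
    strcpy(dest, "123456789");
    source[3] = '\0';            /* source is now the string "abc" */
    strcpy(dest, source);        /* stores 'a' 'b' 'c' '\0' */
}
```

Comparing all ten bytes with memcmp(), rather than using strcmp(), is what makes the tail visible, since strcmp() stops at the first NUL.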
dsill@NSWC-OAS.arpa (Dave Sill) (03/25/88)
From article <793@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
> The UNIX manuals say of strcpy(s1, s2) that it
> "copies s2 to s1, stopping after the null character has been copied."
> While they don't strictly speaking say anything about the order in which
> the other characters are copied, they _do_ say that the NUL character must
> be copied last, so
I think you're misinterpreting that statement.  I don't think that
statement says anything about the order in which the characters are
copied or that the NUL is copied last.
As we all know, a string in C is a pointer to a list of characters
that, by convention, is terminated by a NUL character.  Given a
string, the ONLY way to determine its contents or length is to start
at the beginning and scan for the terminating NUL.
The statement above is merely restating the NUL-terminator convention.
I don't think it was intended to specify the actual order in which the
characters are copied.
Of course, with C's string representation, copying from beginning to
end is more efficient than finding the end of the source string and
copying backward.
=========
The opinions expressed above are mine.
"We are offended and resent it when people do not respect us; and yet
no man, deep down in his heart, has any considerable respect for
himself."
					-- Mark Twain
barmar@think.COM (Barry Margolin) (03/25/88)
In article <12622@brl-adm.ARPA> dsill@NSWC-OAS.arpa (Dave Sill) writes:
>Of course, with C's string representation, copying from beginning to
>end is more efficient than finding the end of the source string and
>copying backward.
Depends on the hardware.  If a machine has a fast instruction that
does a search for a byte and a fast block move instruction, it would
probably be best for strcpy to be written
    memcpy (dest, src, strpos (src, '\0') + 1);
(assuming that memcpy and strpos are inlined into the appropriate
instructions, otherwise strcpy should be written in assembler to take
advantage of the hardware).
Of course, if you have hardware that knows how to do a block transfer
until a particular character is reached, that would even be better.
Honeywell's Multics mainframes have such a thing in their Extended
Instruction Set.  (It's actually more general than this, because you
give it a table where each byte corresponds to a particular character
value -- it stops when it encounters a character whose table entry is
nonzero and returns the pointer to that character in a register and
the table entry value.  It can therefore implement the strspn family
of functions, and is especially useful for lexical analyzers because
the character table can be used to implement a state transition table
-- I can envision a three-instruction FSM loop.)
Barry Margolin
Thinking Machines Corp.
barmar@think.com
uunet!think!barmar
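An illustrative sketch, not from the original posts: the two-pass idea above written with standard calls only. strpos() is not a standard C function, so strlen() stands in for it here, and the name two_pass_copy() is made up for this example.

```c
#include <string.h>

/*
 * Hypothetical two-pass copy: locate the terminator first, then move
 * the whole block at once.  As with strcpy(), dst and src are assumed
 * not to overlap, because memcpy() gives no guarantee in that case.
 */
char *two_pass_copy(char *dst, const char *src)
{
    size_t n = strlen(src) + 1;  /* + 1 so the '\0' is copied too */

    memcpy(dst, src, n);
    return dst;
}
```

Whether this beats a one-pass loop depends on how well the library's strlen() and memcpy() map onto the machine, which is the point of the post.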
ok@quintus.UUCP (Richard A. O'Keefe) (03/25/88)
In article <18488@think.UUCP>, barmar@think.COM (Barry Margolin) writes:
> Will you guys stop playing word games, and think about what that
> sentence was really intended to mean?  I think the point of the
We don't KNOW "what that sentence was really intended to mean".
All we can tell is what it SAYS.  Anyone who needs reminding in
the manual page for strcpy() "that string functions don't know about
declared array dimensions" is going to have a hard time with almost
anything in C.
I've just checked in Harbison & Steele, and while they do not feel
constrained to point out that string functions don't know about
declared array dimensions, they *do* explicitly say that strcat()
and strcpy() may not work with overlapping strings.  Presumably this
means that they knew of C implementations that didn't do the copies
left to right.  Anyone have any idea what those implementations might be?
dsill@NSWC-OAS.arpa (Dave Sill) (03/25/88)
In article <18488@think.UUCP> Barry Margolin <barmar@think.COM> writes:
>Will you guys stop playing word games, and think about what that
>sentence was really intended to mean?  I think the point of the
>"stopping after the NUL" phrase is that it doesn't copy any characters
>after the NUL.  Thus, if you have
>
>    char dest[10], source[10];
>    strcpy (source, "abcdefghi");
>    strcpy (dest, "123456789");
>    source [3] = '\0';
>    strcpy (dest, source);
>
>the resulting contents of dest will be
>
>    'a' 'b' 'c' '\0' '5' '6' '7' '8' '9' '\0'
>
>i.e. the last six characters are not affected.
I don't think that that's guaranteed, or even implied by that
sentence.  I would expect the contents of `dest' to be:
    'a' 'b' 'c' '\0' ? ? ? ? ? ?
where `?' may or may not be the same character that was in that
position before the call to strcpy.  I could imagine an implementation
that would null-out the destination string if it was longer than the
source.
ANSI describes `strcpy' a little differently:
    "The `strcpy' function copies the string pointed to by `s2'
    (including the terminating null character) into the array pointed
    to by `s1'.  If copying takes place between objects that overlap,
    the behavior is undefined."
There is nothing said about the order in which the copying takes
place, or the contents of the destination string past the null
character.
=========
The opinions expressed above are mine.
"The wretched reflect either too much or too little."
					-- Publilius Syrus
jrl@anuck.UUCP (j.r.lupien) (03/26/88)
From article <810@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
+ In article <545@anuck.UUCP>, jrl@anuck.UUCP (j.r.lupien) writes:
+> From article <793@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
+> > The UNIX manuals say of strcpy(s1, s2) that it
+> > "copies s2 to s1, stopping after the null character has been copied."
+> > While they don't strictly speaking say anything about the order in which
+> > the other characters are copied, they _do_ say that the NUL character must
+> > be copied last, so
+ That's not what I assumed. The function could well compute factorial 5000.
+ So what? The manual says that COPYING stops after the NUL character has
+ been copied.
I realize this is not really addressing the issue of how strcpy should
or shouldn't work. The point of my RESPONSE has to do with the direct
interpretation of documentation. The above quoted statement does not
say that the copying stops as soon as the NUL gets copied. It just
does not say that, at all. If you assume that it intends to give
that impression, perhaps you are making a reasonable assumption,
but I only try adding words to the statements from the manual
(words like "as soon as") after I have tried something out and I
find that the behavior does not correspond exactly to what the
manual says.
+ So whatever strcpy does after copying NUL, either it doesn't
+ copy any part of s2 to s1, or the manual entry is just plain wrong (which
+ would not be unprecedented).
No, as I have just pointed out, the manual entry is NOT "just plain
wrong", it is just plain MISLEADING. Misleading is more than just
"not unprecedented", it seems to be a way of life in the UNIX manuals.
+ The point of my message was that AT&T
+ documentation provides some warrant for expecting a left-to-right order
+ rather than some other order.
+
+ The VMS C documentation says that strcpy(str_1, str_2)
+ "copies str_2 into str_1, stopping after copying str_2's NUL character."
+ which again says that COPYING stops after the NUL is copied.
Indeed, but it fails just as fully to specify at what point the copying
stops after the NUL is copied. After means later in time. Something
else to specify immediacy is required before I will assume it.
+
+> The moral of this is, don't depend on bizarre side effects unless
+
+ The order in which strcpy works is hardly a "bizarre side effect".
[ADA example omitted]
I don't really agree that having things work the way you might expect
them to is "not bizarre". Few things surprise me more than to have my
first impression of what the manual said being borne out in fact.
+ Not relying on the documentation, as (j.r.lupien) suggests,
No, no. I meant that you should not rely on a "reasonable interpretation"
of what the documentation says. Relying on LITERAL interpretation
will get you in trouble more often than it should. If you expect things
to behave in a "reasonable" manner on top of the literal specification,
some implementor's concept of what is reasonable will at some point
diverge from your own, and you will suffer unreasonably as a result.
I will indeed rely on the function to stop after the NUL has been
copied. If it stops before the NUL has been copied, I will call the
implementor and get them to fix either the library or the manual
so that they agree.
+ leads to people writing their own version of the C library so
+ that they know what will happen.
As you go on to explain, there are many very good reasons to
"roll your own". Having the code do what you expect is only one
of them. However, I am an enthusiastic and appreciative user
of other people's libraries. I prefer to use "standard calls"
whenever possible. I just try not to read more into the documentation
than is actually written there in ink.
twitch!mvuxa!anuxh!jrl
Watch out for that nuance!
jgy@hropus.UUCP (John Young) (03/26/88)
In response to Barry Margolin, Dave Sill writes:
> I don't think that that's guaranteed, or even implied by that
> sentence.  I would expect the contents of `dest' to be:
>
>     'a' 'b' 'c' '\0' ? ? ? ? ? ?
..........
> There is nothing said about the order in which the copying takes
> place, or the contents of the destination string past the null
> character.
> The opinions expressed above are mine.
I'm glad these are just yours!  You're mistaken.
If the "contents of the destination string past the null character"
are not guaranteed why would anyone use strcpy()?
barmar@think.COM (Barry Margolin) (03/28/88)
In article <12636@brl-adm.ARPA> dsill@NSWC-OAS.arpa (Dave Sill) writes:
]In article <18488@think.UUCP> Barry Margolin <barmar@think.COM> writes:
]>Will you guys stop playing word games, and think about what that
]>sentence was really intended to mean?  I think the point of the
]>"stopping after the NUL" phrase is that it doesn't copy any characters
]>after the NUL.  Thus, if you have
]>
]>    char dest[10], source[10];
]>    strcpy (source, "abcdefghi");
]>    strcpy (dest, "123456789");
]>    source [3] = '\0';
]>    strcpy (dest, source);
]>
]>the resulting contents of dest will be
]>
]>    'a' 'b' 'c' '\0' '5' '6' '7' '8' '9' '\0'
]>
]>i.e. the last six characters are not affected.
]
]I don't think that that's guaranteed, or even implied by that
]sentence.  I would expect the contents of `dest' to be:
]
]    'a' 'b' 'c' '\0' ? ? ? ? ? ?
]
]where `?' may or may not be the same character that was in that
]position before the call to strcpy.  I could imagine an implementation
]that would null-out the destination string if it was longer than the
]source.
Well, I can't, because of C's rules about passing array arguments to
functions.  Only the address is passed, not the allocated length.  If
strcpy were to affect the portion of the destination array past the
NUL character, it would have to be careful not to modify anything
outside the destination array.  But since it can't know where the
destination array ends, it must not modify any elements but the ones
necessary to perform its stated function (which, by the way, still
doesn't prevent it from exceeding the destination's length -- it is
the programmer's responsibility to make sure that sizeof(dest) >
strlen(source)).
] ANSI describes `strcpy' a little differently:
]
]    "The `strcpy' function copies the string pointed to by `s2'
]    (including the terminating null character) into the array pointed
]    to by `s1'.  If copying takes place between objects that overlap,
]    the behavior is undefined."
]
]There is nothing said about the order in which the copying takes
]place, or the contents of the destination string past the null
]character.
There is also nothing said about the effect on /dev/icbm, but that
doesn't imply that it is permitted to send it the "launch" signal.
Since it doesn't say that the other elements of the destination are
modified, I believe that an implementation would be incorrect if it
did.  And I suspect that there are many existing applications that
assume that they can use strcpy to copy into the middle of a string
without affecting later elements.
Barry Margolin
Thinking Machines Corp.
barmar@think.com
uunet!think!barmar
dsill@NSWC-OAS.arpa (Dave Sill) (03/29/88)
In article <90@hropus.UUCP> John Young <hropus!jgy> writes:
>If the "contents of the destination string past the null character"
>are not guaranteed why would anyone use strcpy()?
Well, at the risk of sounding flippant, one would use strcpy() to make
a copy of a string.  I personally don't use strcpy() for any other
reason, and I don't see how writing past the null but within the
bounds of the destination array would preclude this.
[In a previous posting I retracted my statement that strcpy() could
write something past the null in the destination string.]
=========
The opinions expressed above are mine.
"We must remove the TV-induced stupor that lies like a fog across the
land."
					-- Ted Nelson
throopw@xyzzy.UUCP (Wayne A. Throop) (03/29/88)
>,>>> ok@quintus.UUCP (Richard A. O'Keefe)
>> jrl@anuck.UUCP (j.r.lupien)
>>> The UNIX manuals say of strcpy(s1, s2) that it
>>> "copies s2 to s1, stopping after the null character has been copied."
>> Stopping after something occurs, as with "after the NULL has been copied"
>> does NOT equate, as you go on to assume, to "nothing will be done after
>> the NULL[*] is copied. The function will return immediately."
> That's not what I assumed.  The function could well compute factorial 5000.
> So what?  The manual says that COPYING stops after the NUL character has
> been copied.  So whatever strcpy does after copying NUL, either it doesn't
> copy any part of s2 to s1
It does indeed say that the copying stops after the null has been
copied.  But this in no way indicates that no more copying occurs after
the copy of the null has been made.  Consider:
    The car went careening down the street, stopping after the
    pedestrian had been hit.
Do you think this means that the car does no more traveling after the
impact?  I think not.  If that had been meant, it should have said
"stopping immediately after the pedestrian had been hit" or "stopping
when the pedestrian had been hit".
I think the same applies to the manual entry.  And apparently other
people think so too, since the ANSI clarification of this passage does
not guarantee that the null is the last item copied.  I'm fairly
certain that the only thing the phrasing of the manual guarantees (or
is even intended to guarantee) is that the null is copied during the
process, and not what the relative order is.
--
Sometimes I wonder whether God enjoys Christmas.
				--- Horace Rumpole
--
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw
john@frog.UUCP (John Woods, Software) (03/29/88)
In article <1304@ut-emx.UUCP>, wca@ut-emx.UUCP (William C. Anderson) writes:
> In article <10753@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> -> In article <7506@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
> -> ->This usage was never a good idea, because a valid implementation of
> -> ->strcpy() would be to copy right-to-left rather than left-to-right
> -> `That turns out not to be the case'---or rather, are you certain?
> Chris is right here, Doug.  For example, the ndbm(3) routines in 4.3BSD
> depend upon bcopy() doing the correct ordering in cases of overlap.
> Luckily, it is simple to do the code correctly.
BUZZ!  No, Doug is right.  The standard (3 August 87 draft) explicitly
states that "If copying takes place between objects that overlap, the
behavior is undefined."  You can't *depend* on the behavior of strcpy()
and expect to have your program be portable, QED.  Perhaps bcopy() is
*defined* to work correctly in cases of overlap, though people worried
about that less back in the old days :-).
The following is a perfectly *legal* (if perfectly awful) implementation.
    char *strcpy(char *s1, const char *s2) {
        const char *eos = s2 + strlen(s2);  /* points at the NUL */
        s1 += strlen(s2);
        while (eos != s2) *s1-- = *eos--;   /* NUL first, then backward */
        *s1 = *eos;                         /* first character last */
        return s1;
    }
--
John Woods, Charles River Data Systems, Framingham MA, (617) 626-1101
...!decvax!frog!john, ...!mit-eddie!jfw, jfw@eddie.mit.edu
FUN: THE FINAL FRONTIER
Zippy the Pinhead in '88!
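An illustrative sketch, not from the original posts: two hand-written copy loops, one in each direction, to show why the disputed idiom strcpy(buf, buf + n) depends on copy order. The names copy_ltr() and copy_rtl() are made up here. Because these are ordinary functions rather than the library's strcpy(), calling them on overlapping arguments is well defined.

```c
#include <string.h>

/* Left-to-right copy: first character first, NUL last. */
char *copy_ltr(char *dst, const char *src)
{
    char *d = dst;

    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}

/* Right-to-left copy: NUL first, first character last. */
char *copy_rtl(char *dst, const char *src)
{
    size_t i = strlen(src) + 1;  /* number of bytes, NUL included */

    while (i-- > 0)
        dst[i] = src[i];
    return dst;
}
```

With buf holding "xxhello", copy_ltr(buf, buf + 2) leaves "hello", while copy_rtl(buf, buf + 2) overwrites source bytes before it reads them and leaves "o".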
jgy@hropus.UUCP (John Young) (03/29/88)
> In article <90@hropus.UUCP> John Young <hropus!jgy> writes:
> >If the "contents of the destination string past the null character"
> >are not guaranteed why would anyone use strcpy()?
>
> Well, at the risk of sounding flippant, one would use strcpy() to make
> a copy of a string.  I personally don't use strcpy() for any other
> reason, and I don't see how writing past the null but within the
> bounds of the destination array would preclude this.
>
> [In a previous posting I retracted my statement that strcpy() could
> write something past the null in the destination string.]
>
There was no mention of "within the bounds of the destination array"
in the original posting.  How would you suggest strcpy() check for this?
Good thing you retracted your statement (I didn't see it!)
djones@megatest.UUCP (Dave Jones) (03/30/88)
in article <725@xyzzy.UUCP>, throopw@xyzzy.UUCP (Wayne A. Throop) says:
>
> It does indeed say that the copying stops after the null has been
> copied.  But this in no way indicates that no more copying occurs after
> the copy of the null has been made.
>
> Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw
Huh?
nevin1@ihlpf.ATT.COM (00704a-Liber) (03/30/88)
In article <810@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>
>> The moral of this is, don't depend on bizarre side effects unless
>
>The order in which strcpy works is hardly a "bizarre side effect".
I'm sorry, but it is!  If you are writing code which is dependent upon the
IMPLEMENTATION of strcpy instead of the DESCRIPTION of strcpy, then you ARE
depending on side effects of strcpy.  Whenever possible, code should NOT
depend on the side effects/implementation details of a function that it
calls.
Suppose I looked at the source for some obscure system call (call it foo)
and found out that it modified a static variable somewhere in memory.
Would you say that it is okay for me to look at the variable that it
modified (assuming that this was not a documented property of foo, of
course)?  I think not.  The strcpy argument is no different.
This is one of the things that makes code very hard to maintain.  For
example: one of the routines that I was using returned a unique number
according to a certain set of constraints.  It also happened that the
number it returned was pseudo-random (it would not necessarily return the
same number under similar circumstances), but this was a property of the
implementation, not the description of the routine.  I modified the routine
so that it always returned the lowest number that met the constraints
(changed the implementation, not the description).  Guess what happened?
Another part of the program was in an infinite loop because it called this
routine to generate two separate numbers which met the constraints, but my
(new) routine always returned the same number.  Now I know why people don't
like to touch code that already works (old code is just too delicately
intertwined).
With languages such as C++ becoming more popular, abstraction will be
forced so that these types of problems do not occur.  But until the time
that this is commonplace, we should be trying to abstract on a procedural
level.  By this I mean that code, whenever possible, should be written so
that it depends ONLY upon the description of a subroutine and NOT on the
implementation of that subroutine.
--
 _ __          NEVIN J. LIBER  ..!ihnp4!ihlpf!nevin1  (312) 510-6194
' )  )         "The secret compartment of my ring I fill
 /  / _ , __o  ____  with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_  These are solely MY opinions, not AT&T's, blah blah blah
ok@quintus.UUCP (Richard A. O'Keefe) (03/30/88)
In article <4190@ihlpf.ATT.COM>, nevin1@ihlpf.ATT.COM (00704a-Liber) writes:
> In article <810@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
> >> The moral of this is, don't depend on bizarre side effects unless
> >The order in which strcpy works is hardly a "bizarre side effect".
> I'm sorry, but it is!
Questions like "what happens to the rest of the destination" and "what
happens if the two areas overlap" are so important that the answers
SHOULD be part of the description of strcpy().
It is extremely useful to have a function which can safely be used to
move part of a character array towards its origin.  Given that strcpy()
is the only possible candidate for this in the SVID, that the description
in the SVID can be naturally construed as describing a left to right
copy, and that the descriptions of the string operations are pretty
vague, it is reasonable for someone to expect that strcpy() will work
this way.
If the memcpy() question was solved by adding a memmove(), is there
also a strmove() in the current dpANS draft?
Does anyone know whether the vagueness of the SVID description of
strcpy() was intentional, or whether strcpy() was originally intended
to work left-to-right and the vagueness was accidental?
chris@mimsy.UUCP (Chris Torek) (03/31/88)
In article <836@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>Questions like "what happens to the rest of the destination" and "what
>happens if the two areas overlap" are so important that the answers
>SHOULD be part of the description of strcpy().
Unless there were some overriding reason not to do so, I agree.  The
claim as to efficiency is similar to the claim that Unix should have a
`spawn' system call.  After all, most of the time you are copying from
one string to another.  After all, most of the time you are going to
exec immediately after a fork.
>If the memcpy() question was solved by adding a memmove(), is there
>also a strmove() in the current dpANS draft?
No.  This sort of thing can lead to a function space explosion: strcpy
for one-to-another; strltor for left-to-right copy; strrtol for
right-to-left; strmove for whichever is `right'; `strunsharedcpy' for
memory regions that are guaranteed unshared; ....  Where does one stop?
That is a matter of taste.  In the case of strcpy, I happen to believe
that defining it to work left-to-right is worth any expense it may
cause (because I believe that cost will be small if not zero).
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu	Path: uunet!mimsy!chris
rbutterworth@watmath.waterloo.edu (Ray Butterworth) (03/31/88)
In article <836@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes: > If the memcpy() question was solved by adding a memmove(), is there > also a strmove() in the current dpANS draft? You could #define strmove(out,in) ((char*)memmove((void*)out, (void*)in, 1+strlen(in))) if it weren't for the fact that the identifier "strmove" is already reserved by the Standard. (If I got the return value of memmove() wrong, please don't bother posting to tell me. I don't have a copy of the Standard with me.)
msb@sq.uucp (Mark Brader) (04/01/88)
Regarding: > > If a machine has a fast instruction that > > does a search for a byte and a fast block move instruction, it would > > probably be best for strcpy to be written > > > > memcpy (dest, src, strpos (src, '\0') + 1); > > > > (assuming that memcpy and strpos are inlined into the appropriate > > instructions ... ) I said, in an article I am now canceling: > The above > algorithm is two-pass, and therefore not robust in the face > of *shared memory*. It has been pointed out to me that a one-pass algorithm is also not robust in a shared-memory situation. The two-pass and one-pass algorithms merely fail in slightly different ways and on slightly different race conditions. Mark Brader, SoftQuad Inc., Toronto, utzoo!sq!msb, msb@sq.com "I'm a little worried about the bug-eater," she said. "We're embedded in bugs, have you noticed?" -- Niven, "The Integral Trees"
gnu@hoptoad.uucp (John Gilmore) (04/01/88)
john@frog.UUCP (John Woods, Software) wrote: > BUZZ! No, Doug is right. The standard (3 August 87 draft) explicitly > states that "If copying takes place between objects that overlap, the > behavior is undefined." You can't *depend* on the behavior of strcpy() > and expect to have your program be portable, QED. If the standard was already perfect there would be no need to discuss it. But having an August draft say X doesn't mean that X is proven to be true and correct, QED. The whole discussion here is about what the standard *should* say. Arguing that "this is right because the draft standard says it" carries no weight at all; we knew that already and are arguing that it is wrong. In particular, both Chris and I have seen programs that depend on strcpy() being able to slide a string into lower array indices without destroying it. We think this is a valid interpretation of the man page. Now, some people are picking nits with the English used to document it, which reminds me of people spending years analyzing the Bible and quoting it to support their claims -- without once reading the original Arameic (sp?) to see what it really said. In our case, we know what the original source code said -- it copied left to right and made no bones about it. And so far nobody has named a compiler/library/OS/environment that *doesn't* just copy left to right. But somebody somewhere wants the freedom to copy all the even bytes and then all the odd bytes, or something, and so we burn a few hundred K of comp.lang.c... -- {pyramid,pacbell,amdahl,sun,ihnp4}!hoptoad!gnu gnu@toad.com "Don't fuck with the name space!" -- Hugh Daniel
dsill@NSWC-OAS.arpa (Dave Sill) (04/01/88)
>> It does indeed say that the copying stops after the null has been
>> copied. But this in no way indicates that no more copying occurs after
>> the copy of the null has been made.
>
>Huh? It means the copying does not stop until the null is copied.

This reminds me of the Saturday Night Live sketch with Ed Asner as the nuclear plant manager going on vacation, whose parting advice is "You can't use too much cooling water in a nuclear reactor."

The intent of the statement:

    Strcpy copies string s2 to s1, stopping after the null character has been moved.

is that all characters in s2, up to and including the terminating null, are copied to s1. Nothing at all is said about the order in which the copying takes place. To assume that all implementations copy from right-to-left or left-to-right is plainly wrong.
karl@haddock.ISC.COM (Karl Heuer) (04/02/88)
In article <17942@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes: >You could >#define strmove(out,in) ((char*)memmove((void*)out, (void*)in, 1+strlen(in))) >if it weren't for the fact that the identifier "strmove" is already >reserved by the Standard. You could do it anyway -- provided "you" means the vendor, rather than the user. You'd still be a conforming implementation; the name "strmove" is part of the implementation's available namespace. I object on different grounds, though: "in" is evaluated twice. Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
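The double evaluation of "in" goes away if the macro is written as a function. A minimal sketch, assuming a hypothetical name `shift_string` (chosen to stay out of the reserved `str...` namespace discussed above) and the usual memmove() behaviour of returning its first argument:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of any standard library.
 * Moves the NUL-terminated string at src to dst; the two areas may
 * overlap. Because src is a function parameter, the caller's argument
 * expression is evaluated exactly once. */
char *shift_string(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    /* +1 so that the terminating NUL is moved along with the text */
    return (char *)memmove(dst, src, strlen(src) + 1);
}
```

A call such as `shift_string(buf, buf + n)` slides the tail of `buf` down to its start; `shift_string(buf + n, buf)` needs `buf` to have room for `n` extra bytes.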
pablo@polygen.uucp (Pablo Halpern) (04/02/88)
From article <725@xyzzy.UUCP>, by throopw@xyzzy.UUCP (Wayne A. Throop):
> [ Referring to the manual entry for strcpy() ]
> I'm fairly certain that the only thing the phrasing of the manual
> guarantees (or even is intended to guarantee) is that the null is
> copied during the process, and not what the relative order is.

Fine. To avoid incorrect inferences from readers, the entry should be revised to just say that the copy INCLUDES the NUL terminator. Also, since the length of the destination array cannot be determined by strcpy() (because of C's array/pointer semantics), the manual entry should explicitly state that no characters in the destination string following the NUL are modified. (Again, considering the array/pointer semantics, perhaps someone could come up with an even more precise rewording of my last sentence.)

Pablo Halpern | mit-eddie \
Polygen Corp. | princeton \ !polygen!pablo (UUCP)
200 Fifth Ave. | bu-cs /
Waltham, MA 02254 | stellar /
djones@megatest.UUCP (Dave Jones) (04/02/88)
Geez. Enough already!!
Everybody knows what strcpy does:
void strcpy(str1, str2)
char* str1;
char* str2;
{
while(*str1++ = *str2++) {;}
}
If I remember correctly, it says as much right in K&R. I don't
think you want to break K&R without darn good reason.
If you want a function that does something else, give it another name.
"strcpy" is already taken.
-- Sgt. Dave Jones, Naming Conventions Police, ret.
chris@mimsy.UUCP (Chris Torek) (04/02/88)
In article <4295@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>The whole discussion here is about what the standard *should* say.

Precisely.

>In particular, both Chris and I have seen programs that depend on
>strcpy() being able to slide a string into lower array indices without
>destroying it.

Yes. `strcpy(p, p+n)' is not an uncommon idiom.

>We think this is a valid interpretation of the man page.

(Well, perhaps the manual entry should be clarified.)

>... In our case, we know what the original
>source code said -- it copied left to right and made no bones about it.
>And so far nobody has named a compiler/library/OS/environment
>that *doesn't* just copy left to right.

To be fair, I *do* know of one: The 4.3BSD Vax strcpy() uses the Vax locc and movc3 instructions. movc3 moves in whichever direction is nondestructive. This implies that strcpy(p+n, p) moves the string *up* n bytes nondestructively, except when the string is more than 65535 bytes long (the limit for a single locc/movc3).

I would not mind having to change this if the standard mandated left-to-right copying (which has a duplication effect on (p+n,p)-style overlapping strings). Alternatively, the standard could proclaim that if the strings overlap and dst<src, the copy is done left-to-right, otherwise the result is implementation dependent; this, however, is an overly grotesque description. I prefer the simple and well-defined semantics of `if the strings overlap, the copy acts as if it were performed from left to right, one byte at a time.'
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
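The "as if performed from left to right, one byte at a time" semantics can be sketched in portable C. Both function names below are hypothetical, and this is only an illustration of the proposed behaviour, not what any particular library does. The bounded variant is there to show the duplication effect safely: the string version must not be used for (p+n,p)-style overlap, since it would overwrite the terminating NUL before reaching it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the NUL-terminated string at src to dst one
 * byte at a time, lowest address first. For overlapping areas this is
 * only safe when dst <= src, i.e. the strcpy(p, p+n) idiom. */
char *copy_low_to_high(char *dst, const char *src)
{
    char *ret = dst;
    assert(dst != NULL && src != NULL);
    while ((*dst++ = *src++) != '\0')
        ;   /* the assignment does all the work */
    return ret;
}

/* Hypothetical bounded variant: copy exactly count bytes, lowest
 * address first. With dst = p+n and src = p, the first n bytes are
 * replicated across the destination. */
char *copy_bytes_low_to_high(char *dst, const char *src, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        dst[i] = src[i];
    return dst;
}
```

With `char a[] = "xxhello";`, `copy_low_to_high(a, a + 2)` leaves "hello" in `a`. With `char b[] = "abcdef";`, `copy_bytes_low_to_high(b + 2, b, 4)` leaves "ababab", the duplication effect mentioned in the post.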
ok@quintus.UUCP (Richard A. O'Keefe) (04/02/88)
In article <10895@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes: > In article <4295@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes: > >And so far nobody has named a compiler/library/OS/environment > >that *doesn't* just copy left to right. > > To be fair, I *do* know of one: The 4.3BSD Vax strcpy() uses the Vax > locc and movc3 instructions. movc3 moves in whichever direction is > nondestructive. This implies that > This may not be such a wonderful idea: according to the DEC manuals, some VAX models do not implement the locc instruction. (The machine will trap to some sort of library which emulates the missing instructions.) Getting this right for strings longer than 2^16-a few characters must be a nightmare: both locc and movc3 have a 16-bit length operand. (This has never made sense to me.)
chris@mimsy.UUCP (Chris Torek) (04/03/88)
>In article <10895@mimsy.UUCP> I mentioned that
>>... The 4.3BSD Vax strcpy() uses the Vax locc and movc3 instructions.

In article <848@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>This may not be such a wonderful idea: according to the DEC manuals,
>some VAX models do not implement the locc instruction. (The machine
>will trap to some sort of library which emulates the missing instructions.)

This is true. In particular, the Microvax I and II chips do not. (Indeed, the uVax I does not even implement movc3 in hardware.) The II traps to kernel code that emulates locc. (And people wonder why strcpy() and index() are slow there! I argued for a `getcputype' syscall just for library optimisation, but no one has done it.)

>Getting this right for strings longer than 2^16-a few characters must be
>a nightmare: both locc and movc3 have a 16-bit length operand. (This
>has never made sense to me.)

(Since VMS string descriptor lengths are Words rather than Longwords, obviously no one would ever want strings longer than that. Right.)

Actually, it is not that bad; in particular, movc3 leaves registers r1 and r3 pointing to the `next' string, so that you wind up with something like this:

	# strcpy(dst, src)
		...
	loop:	/* src in r1, dst in r3 */
		locc	$0,$65535,src	# find the \0 in src
		beql	last_block	# if we found it, finish up
		movc3	$65535,src,dst	# otherwise move 64K
		brb	loop		# and keep going
	last_block:
		/* convert to a count and move <65535 bytes */

The code for bcopy/memcpy/memmove that handles overlapping `backwards' moves, however, is perhaps best described as `amusing':

		/* length in r6, src in r1, dst in r3 */
		addl2	r6,r1		# jump to end of block
		addl2	r6,r3
		movzwl	$65535,r0	# get a handy 64K
		brb	5f
	4:
		subl2	r0,r6		# count 64K moved
		/* here begins the silliness: note how r1 and r3 need adjustment now */
		subl2	r0,r1		# ... from 64K behind where we were
		subl2	r0,r3
		movc3	r0,(r1),(r3)	# the VAX does this back to front
		movzwl	$65535,r0	# but we still have to fix the pointers
		/* ... and again! */
		subl2	r0,r1		# afterward
		subl2	r0,r3
	5:
		cmpl	r6,r0		# 64K?
		bgtr	4b		# more
		subl2	r6,r1		# 64K or less;
		subl2	r6,r3		# adjust the pointers
		movc3	r6,(r1),(r3)	# and move
		movl	4(ap),r0	# always return dst
		ret

In other words, even though the microcode decides to move the string `back to front' (high addresses to low addresses), and therefore sets the registers to count down from the top, it very carefully adjusts them afterward so that they point to the high addresses---exactly what we do NOT want. (I suspect the high bits of one of the counting registers are used to flag the direction, which would give another reason why the lengths are limited. Too bad they are not limited to 30 bits, which is as much as you can address in one segment [no, not iNTEL segments].)
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
rbutterworth@watmath.waterloo.edu (Ray Butterworth) (04/03/88)
In article <10895@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> I would not mind having to change this if the standard mandated
> left-to-right copying (which has a duplication effect on (p+n,p)-style
> overlapping strings). Alternatively, the standard could proclaim
> that if the strings overlap and dst<src, the copy is done left-to-right,
> otherwise the result is implementation dependent; this, however,
> is an overly grotesque description. I prefer the simple and well-defined
> semantics of `if the strings overlap, the copy acts as if
> it were performed from left to right, one byte at a time.'

I'm not disagreeing with what you said, only with the way you said it.

The terms "left-to-right" and "right-to-left" can be misleading. Standard C can presumably be used in countries where the natural direction of the language is right-to-left (e.g. Hebrew or Arabic) rather than left-to-right (e.g. English or French). In such an environment, one would consider the terminating nul on a string to be its left-most character, not its right-most as we would in English.

Similarly, many of us who use VAX or other similar equipment tend to think of the bytes being laid out in memory numbered right-to-left (then shorts, ints, and longs line up nicely without any of the complications that arise if one thinks in terms of byte-swapping).

If the standard is going to specify an order for strcpy (and I really see no reason why it shouldn't), please let that order be in terms of "start-to-end" or "low-to-high address" or some other notation that doesn't presume which end of a string is the "right" end.
nevin1@ihlpf.ATT.COM (00704a-Liber) (04/05/88)
In article <836@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>Questions like "what happens to the rest of the destination" and "what
>happens if the two areas overlap" are so important that the answers
>SHOULD be part of the description of strcpy(). It is extremely useful to
>have a function which can safely be used to move part of a character
>array towards its origin.

I agree that it is useful to have a function which can safely move strings with overlapping characters. That is what memmove() is for.

BTW, the answer to "what happens to the rest of the destination" in strcpy() is that it is unaffected, since there is no way of conveying what is meant by "the rest of the destination" to a function call; i.e., how can strcpy() tell the difference between an exact fit and an inexact fit? It can't.

And the answer to "what happens if the two areas overlap" is found directly in the standard:

"If copying takes place between objects that overlap, the behavior is undefined."

You may not like the answer, but the standard answers the question just the same.

>If the memcpy() question was solved by adding a memmove(), is there
>also a strmove() in the current dpANS draft?

strmove() is not needed since it is just a very special case of memmove(). In order to copy possibly overlapping strings, you need to know the length of the source string. Therefore, given a source string s2 (char *s2) and a destination string s1 (char *s1):

(char *)memmove((void *)s1, (void *)s2, strlen(s2) + (size_t)1)

will accomplish what you would want a strmove() to do.
--
 _ __ NEVIN J. LIBER ..!ihnp4!ihlpf!nevin1 (312) 510-6194
' ) ) "The secret compartment of my ring I fill
 / / _ , __o ____ with an Underdog super-energy pill."
/ (_</_\/ <__/ / <_ These are solely MY opinions, not AT&T's, blah blah blah
nevin1@ihlpf.ATT.COM (00704a-Liber) (04/05/88)
In article <425@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>Geez. Enough already!!
>Everybody knows what strcpy does:
>void strcpy(str1, str2)
> char* str1;
> char* str2;
>{
> while(*str1++ = *str2++) {;}
>}

First off, you got it slightly wrong (where is your return value?). Secondly, many implementations of C convert strcpy() into inline assembly code. It is conceivable that there may be hardware move instructions which will copy in a right-to-left order. Since 'good' C programs should not be depending on the implementation of strcpy() anyway, why should the implementation of it be restricted??

>If I remember correctly, it says as much right in K&R. I don't
>think you want to break K&R without darn good reason.

K&R gives *possible* ways of implementing strcpy() in C (see page 100 in the first edition). These are not entirely correct (they do not include the return value), nor are they all-inclusive. They are merely there as examples so that someone reading the book can understand how to implement a function like strcpy().

BTW, I wonder if the people who are saying that C won't be used on a multiprocessing machine are the same people who used to say that Unix will never be implemented on a Cray?? :-) :-)

>If you want a function that does something else, give it another name.
>"strcpy" is already taken.

It seems that you, not I, want a function to do something other than what strcpy() is *guaranteed* to do now.
--
 _ __ NEVIN J. LIBER ..!ihnp4!ihlpf!nevin1 (312) 510-6194
' ) ) "The secret compartment of my ring I fill
 / / _ , __o ____ with an Underdog super-energy pill."
/ (_</_\/ <__/ / <_ These are solely MY opinions, not AT&T's, blah blah blah
ok@quintus.UUCP (Richard A. O'Keefe) (04/05/88)
In article <4263@ihlpf.ATT.COM>, nevin1@ihlpf.ATT.COM (00704a-Liber) writes: > In article <836@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes: > I agree that it is useful to have a function which can safely move strings > with overlapping characters. That is what memmove() is for. Nope, memmove() is for when IN ADDITION you already know the exact amount you want to move. > "If copying takes place between objects that overlap, the behavior is > undefined." > You may not like the answer, but the standard answers the question just the > same. Refusing to answer is _not_ an answer! > In order to copy possibly overlapping strings, you need to know the length > of the source string. Therefore, give a source string s2 (char *s2) and a > destination string s1 (char *s1): > (char *)memmove((void *)s1, (void *)s2, strlen(s2) + (size_t)1) > will accomplish that you would want a strmove() to do. (a) I don't want "to copy possibly overlapping strings", I want to move a NUL-terminated sequence to another area of memory which may overlap the current copy of that sequence. I am happy to destroy the original (so it's "move", not "copy"), and in general I neither know nor care whether the destination has a NUL in it (so one of the areas might not be a "string"). [Actually, my objection to calling a move a copy counts against me: strcpy() is a copy only when the two areas don't overlap.] (b) if you are moving towards lower addresses, you do _not_ need to know the length of the source string in advance, but can check as you go. The implementor of a strmov() function can check for this, and only calculate strlen() when necessary. Anyway, I give in. From now on I'll stick with my own code, so that I can be _sure_ what it does. [PS: is it really so vital to wring the very last microsecond out of strcpy? 
I once went through a program changing things like sprintf(buffer, "foo%s", X); to strcpy(buffer, "foo"), strcpy(buffer+3, X); and it didn't make any appreciable difference. Letting the implementor optimise the whatever out of strcpy() while not requiring that 1.0+1.0 be a good approximation to 2.0 doesn't seem like quite the right balance.]
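Point (b) above can be sketched as follows. This is only an illustration under stated assumptions: the name `move_string` is hypothetical, and the address comparison goes through `uintptr_t`, which assumes a flat address space where integer order matches address order (comparing unrelated pointers directly with `<` is not defined by the C standard).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: move the NUL-terminated sequence at src to dst,
 * where the two areas may overlap. */
char *move_string(char *dst, const char *src)
{
    char *ret = dst;
    assert(dst != NULL && src != NULL);
    if ((uintptr_t)dst <= (uintptr_t)src) {
        /* Moving toward lower addresses: no need to know the length
         * in advance, just check for the NUL as we go. */
        while ((*dst++ = *src++) != '\0')
            ;
    } else {
        /* Moving toward higher addresses: the length is needed first,
         * so pay for strlen() only in this case. */
        memmove(dst, src, strlen(src) + 1);
    }
    return ret;
}
```

The downward case makes a single pass over the string; only the upward case pays for the extra scan.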
djones@megatest.UUCP (Dave Jones) (04/05/88)
in article <4264@ihlpf.ATT.COM>, nevin1@ihlpf.ATT.COM (00704a-Liber) says:
>
> In article <425@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>>Geez. Enough already!!
>
>>Everybody knows what strcpy does:
>
>>void strcpy(str1, str2)
>> char* str1;
>> char* str2;
>>{
>> while(*str1++ = *str2++) {;}
>>}
>
> First off, you got it slightly wrong (where is your return value?).

Okay. You win round one. It's supposed to return a char*, not void. Says so in the documentation. K&R is wrong. Gad! Is nothing sacred?

> Secondly, many implementations of C convert strcpy() into inline
> assembly code.

So? If they get it right, that's fine with me.

> It is conceivable that there may be hardware move
> instructions which will copy in a right-to-left order.

Then you better not use that conceivable hardware move! It doesn't do the right thing. Besides, how is that right-to-left instruction going to find the terminating null character?

I cut this directly out of the on-line UNIX documentation:

    strcpy copies string s2 to s1, stopping after the null character has been copied.

If you expect the jury to believe that means anything other than a "left-to-right" copy, you better have a darn good lawyer.

"Your Honor, the copy stops after the null character has been copied. STOPS. Nothing is copied after the null character."

"But Your Honor, it stops after EVERY character is copied. It doesn't say it stops IMMEDIATELY after the null character is copied. They just phrased it that way to trick you."

"Your Honor, that is very silly."

> Since 'good' C
> programs should not be depending on the implementation of
> strcpy() anyway, why should the implementation of it be restricted??
>

Because not all C programs are 'good' ones. The ones you and I write are, of course. But there's all those other programs out there, just waiting to rear their shrouded strcpys in agony. I get bored chasing down bugs in old brittle code. I'd rather be at the beach.
>>If I remember correctly, it says as much right in K&R. I don't
>>think you want to break K&R without darn good reason.
>
> K&R gives *possible* ways of implementing strcpy() in C (see page 100 in
> the first edition). These are not entirely correct (they do not include
> the return value), nor are they all-inclusive. They are merely there as
> examples so that someone reading the book can understand how to implement
> a function like strcpy().
>

Probably they were meant to be only example implementations, but I'll guess many have taken it quite literally, and programmed accordingly.

> BTW, I wonder if the people who are saying that C won't be used on a
> multiprocessing machine are the same people who used to say that Unix will
> never be implemented on a Cray?? :-) :-)
>

Yep. Same lot. Bunch of Fortran geeks. Just ignore them.

>>If you want a function that does something else, give it another name.
>>"strcpy" is already taken.
>
> It seems that you, not I, want a function to do something other than what
> strcpy() is *guaranteed* to do now.

I never said I want the function to do what it is *guaranteed* to do now. I want it to do what it *does* now.

> --
> _ __ NEVIN J. LIBER ..!ihnp4!ihlpf!nevin1 (312) 510-6194
> ' ) ) "The secret compartment of my ring I fill
> / / _ , __o ____ with an Underdog super-energy pill."
> / (_</_\/ <__/ / <_ These are solely MY opinions, not AT&T's, blah blah blah

If you want to attempt a counterrebuttal (and I don't recommend it), I won't *string*it*out*. (Urk.) I'll let you have the last word. It's not really an earth-shaking matter is it?

Dave (Break it, you bought it) Jones
karl@haddock.ISC.COM (Karl Heuer) (04/07/88)
In article <7007@ki4pv.uucp> tanner@ki4pv.uucp (Dr. T. Andrews) writes: >The real net effect of the X3J11 "improvement" of strcpy() definitions is >likely to be that folks need to write their own version in order to be sure >that something useful is done. Fine with me. If people use strcpy() only for non-overlapping areas, and roll their own when they want to modify a string in place, then at least I can tell the two apart when I read the code. >A hundred programmers, each dreaming up his own name for strcpy() ... Is this any worse than, say, everybody dreaming up his own name for "bool"%? I suspect this is more common than in-place string shifting. Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint %among those of us who don't like to overload "int" for this.
swarbric@tramp.Colorado.EDU (Frank Swarbrick) (04/08/88)
In article <3365@haddock.ISC.COM> karl@haddock.ISC.COM (Karl Heuer) writes: >Is this any worse than, say, everybody dreaming up his own name for "bool"%? [...] >%among those of us who don't like to overload "int" for this. Gee, and I don't even define it as an int. I use typedef enum {false,true=!false} bool; I don't suppose that there's any real value in doing this, but oh well... Frank Swarbrick (and his cat) swarbric@tramp.UUCP swarbric@tramp.Colorado.EDU ...!{ncar|nbires}!boulder!tramp!swarbric "Timothy Leary is dead..."
nw@amdahl.uts.amdahl.com (Neal Weidenhofer) (04/08/88)
In article <137@polygen.UUCP>, pablo@polygen.uucp (Pablo Halpern) writes: > To avoid incorrect inferences from readers, the entry should > be revised to just say that the copy INCLUDES the NUL terminator. > > Pablo Halpern | mit-eddie \ > Polygen Corp. | princeton \ !polygen!pablo (UUCP) > 200 Fifth Ave. | bu-cs / > Waltham, MA 02254 | stellar / In dpANS Section 4.11.2.3 the description is: The |strcpy| function copies the string pointed to by |s2| (including the terminating null character) into the array pointed to by |s1|. If copying takes place between objects that overlap, the behavior is undefined. The opinions expressed above are mine (but I'm willing to share.) Sometimes I live Regards, in the country Neal Weidenhofer Sometimes I live ...{hplabs|ihnp4|ames|decwrl}!amdahl!nw in town Amdahl Corporation Sometimes I take 1250 E. Arques Ave. (M/S 316) a great notion P. O. Box 3470 To jump in the river Sunnyvale, CA 94088-3470 and drown (408)737-5007
flaps@dgp.toronto.edu (Alan J Rosenthal) (04/09/88)
david@dhw68k.cts.com (David H. Wolfskill) writes:
>Suppose... [ strcpy's order were implementation-defined, and this implementation defined it as being left-to-right. ]
>Then, an algorithm to clear a given
>string (str1) to a given value (other than NUL) could be coded:
>
> *str1 = ch;
> for (c1 = str1; *++c1 != '\0'; *c1 = *(c1 -1));
>
>or (remembering the characteristics of the implementation):
>
> *str1 = ch;
> strcpy(str1+1, str1)
>
>but I think the latter is easier to comprehend.

Gosh, I find these both really complicated. (I must say however that the most complicated part of the first example is the fact that the _for_ body is placed inside the control structure, and the increment inside the test!)

Why not do the simple:

	for(p = str1; *p; p++) /* optionally insert "!= '\0'" */
		*p = ch;

ajr
--
"Comment, Spock?"
"Very bad poetry, Captain."
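The simple loop can also be written with memset(), which does not depend on any copy order at all. A minimal sketch, assuming a hypothetical helper name `fill_string`. Note also that under a strict byte-at-a-time, low-to-high copy, the quoted strcpy(str1+1, str1) would overwrite the terminating NUL before ever reading it, so it would not stop at the end of the string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: overwrite every character of the NUL-terminated
 * string s with ch, leaving the length and the terminator unchanged. */
char *fill_string(char *s, char ch)
{
    assert(s != NULL && ch != '\0');
    memset(s, ch, strlen(s));   /* strlen() excludes the NUL, so it survives */
    return s;
}
```

For an empty string the call does nothing, since strlen() returns 0.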
mouse@mcgill-vision.UUCP (der Mouse) (04/12/88)
In article <848@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes: > In article <10895@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes: >> In article <4295@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes: >>> And so far nobody has named a compiler/library/OS/environment that >>> *doesn't* just copy left to right. >> To be fair, I *do* know of one: The 4.3BSD Vax strcpy() uses the >> Vax locc and movc3 instructions. movc3 moves in whichever direction >> is nondestructive. This is not quite true. I just looked at it, and the 4.3 VAX strcpy() does use locc and movc3. However, this doesn't imply that the strcpy() operation is done whichever way is nondestructive. Why is this? Because the string may be longer than 64k. The code loops, from left to right, doing 64k-1 chunks until it has it whittled down to less than 64k. Thus, the code works right for non-overlapping operands and for cases where left-to-right would work. The other sort of overlap will work non-destructively for lengths up to 64k-1, and above that will do replication with a stride of 64k-1. > This may not be such a wonderful idea: according to the DEC manuals, > some VAX models do not implement the locc instruction. Primarily the MicroVAX-II (and possibly the -I as well). > (The machine will trap to some sort of library which emulates the > missing instructions.) It just traps through a specific vector in the SCB, much like a device interrupt or an exception. > Getting this right for strings longer than 2^16-a few characters must > be a nightmare: both locc and movc3 have a 16-bit length operand. This is whence the looping I mentioned above. > (This has never made sense to me.) The 16-bit limitation on the string instructions? Yeah, me either. Anybody from DEC care to explain what this silly restriction is doing there? der Mouse uucp: mouse@mcgill-vision.uucp arpa: mouse@larry.mcrcim.mcgill.edu
ray@micomvax.UUCP (Ray Dunn) (04/13/88)
In article <1988Mar31.183321.4740@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
>Regarding:
>> > If a machine has a fast instruction that
>> > does a search for a byte and a fast block move instruction, it would
>> > probably be best for strcpy to be written....

Hmm. I fell foul of this practice in the Microsoft 4.0 string library routines fairly recently.

strchr (at least, probably others), unbeknownst to me, does this to determine the length of the string prior to doing the character search.

....Now, to optimize my file operations, I used a 32K buffer, and was using strchr to find the line endings!

Can you say S.L.O.W.

I wonder how much software is out there just now running s.l.o.w.l.y because of this practice?

Ray Dunn. ..{philabs,mnetor}!micomvax!ray
karl@haddock.ima.isc.com (Karl Heuer) (06/13/89)
In article <4400001@tdpvax> scott@tdpvax.UUCP writes: >The second question deals with strcpy(). Is it like memcpy in that if the >arguments memory overlap the behavior is undefined or is it different. Is >pre-ANSI and ANSI different on this. In both pre-ANSI and ANSI, strcpy() has the same disclaimer as memcpy(). If you want to copy overlapping strings, you should probably use memmove(dest, src, strlen(src)); since memmove() does have predictable behavior on overlap. (In all known implementations, strcpy() happens to do the right thing for ONE direction of copy, but this has never been guaranteed, and I wouldn't try to rely on it.) Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
bright@Data-IO.COM (Walter Bright) (06/17/89)
In article <13674@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
<If you want to copy overlapping strings, you should probably use
< memmove(dest, src, strlen(src));
<since memmove() does have predictable behavior on overlap.
I get involved with helping people debug C code from time to time, and the
bug in the above code occurs frequently. I.e., the line should be:
memmove(dest, src, strlen(src) + 1);
I'm pointing this out because it's such a common bug that it's one of the
things I routinely look for. Remember:
static char src[] = "abc";
sizeof(src) == 4
strlen(src) == 3
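To make the off-by-one concrete, here is a minimal sketch of shifting a string toward higher addresses inside a fixed buffer. The function name `shift_right` and its capacity parameter are hypothetical, introduced only for illustration; the point is that the byte count handed to memmove() is strlen() + 1, which matches sizeof for an array initialised from a literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: move the string in buf up by k bytes, assuming
 * buf holds a NUL-terminated string and has cap bytes in total.
 * Returns 0 on success, -1 if the shifted string would not fit.
 * The first k bytes of buf are left as they were. */
int shift_right(char *buf, size_t cap, size_t k)
{
    size_t need;
    assert(buf != NULL);
    need = strlen(buf) + 1;          /* characters plus the terminating NUL */
    if (k > cap || need > cap - k)
        return -1;                   /* no room for text and terminator */
    memmove(buf + k, buf, need);     /* overlap is fine for memmove() */
    return 0;
}
```

With `char buf[8] = "abc";`, `shift_right(buf, sizeof buf, 2)` succeeds and `buf + 2` then holds "abc" with its terminator intact.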
karl@haddock.ima.isc.com (Karl Heuer) (06/20/89)
In article <2013@dataio.Data-IO.COM> bright@dataio.Data-IO.COM (Walter Bright) writes:
>In article <13674@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>> memmove(dest, src, strlen(src));
>
> memmove(dest, src, strlen(src) + 1);

(I die. My replacement reads my uncommented code and deletes a fragment he doesn't understand. Eventually the subroutine is sold to the government, and the bug causes nuclear missiles to be launched by accident. The other side retaliates, and all die. O the embarrassment.)%

This is correct, of course; it only works to stop after strlen() if the receiving buffer is known to have a null character in the appropriate spot. Even if this were known information (e.g. if we're right-shifting a string in a null-padded buffer), it's cheaper to use the +1 than to document why it isn't necessary.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
________
% First person to identify the reference wins a defunct root password.