[comp.lang.c] strcpy

chris@mimsy.UUCP (Chris Torek) (03/22/88)

>In article <10731@mimsy.UUCP> I wrote:
>>		/* remove leading junk (n < strlen(buf)) */
>>		(void) strcpy(buf, buf + n);

In article <7506@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
>This usage was never a good idea, because a valid implementation of
>strcpy() would be to copy right-to-left rather than left-to-right

`That turns out not to be the case'---or rather, are you certain?
I agree that a generic block copy operation (one of {memcpy, memmove}
---I cannot remember which allows overlap) might do this; I do not
agree that strcpy() may be implemented that way.  (I could be wrong.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris
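
On the {memcpy, memmove} question: in the ANSI draft it is memmove() that is specified to work when the objects overlap; memcpy() is not. A minimal sketch of the "remove leading junk" idiom on that basis follows. The helper name strip_leading is hypothetical and does not come from the thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: drop the first n characters of buf in place.
 * Requires n <= strlen(buf).  memmove() is defined for overlapping
 * source and destination, so no copy direction is assumed. */
static char *strip_leading(char *buf, size_t n)
{
    size_t len = strlen(buf);

    assert(n <= len);
    /* len - n characters remain, plus the terminating NUL */
    memmove(buf, buf + n, len - n + 1);
    return buf;
}
```

Calling strip_leading(buf, 2) on a buffer holding "xxhello" should leave "hello", whichever direction the library's block move uses internally.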

ok@quintus.UUCP (Richard A. O'Keefe) (03/22/88)

The UNIX manuals say of strcpy(s1, s2) that it
	"copies s2 to s1, stopping after the null character has been copied."
While they don't strictly speaking say anything about the order in which
the other characters are copied, they _do_ say that the NUL character must
be copied last, so 
	char *strcpy(char *dst, char *src)
	    {
		int n = strlen(src) + 1;
		dst += n, src += n;
		while (--n >= 0) *--dst = *--src;
		return dst;
	    }
is clearly illegal (it copies the NUL first).
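
To make the overlap question concrete, here is a sketch of both copy directions as they would apply to the strcpy(buf, buf + n) idiom. The names copy_ltor and copy_rtol are hypothetical, and neither is claimed to be any vendor's actual strcpy().

```c
#include <assert.h>
#include <string.h>

/* Left-to-right copy: each source byte is read before any later
 * destination byte is written. */
static char *copy_ltor(char *dst, const char *src)
{
    char *d = dst;

    assert(dst != NULL && src != NULL);
    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}

/* Right-to-left copy: find the end first, copy the NUL, then work
 * back toward the start of the string. */
static char *copy_rtol(char *dst, const char *src)
{
    size_t n;

    assert(dst != NULL && src != NULL);
    n = strlen(src) + 1;            /* bytes to copy, NUL included */
    while (n-- > 0)
        dst[n] = src[n];
    return dst;
}
```

With buf holding "xxhello", copy_ltor(buf, buf + 2) leaves "hello", while copy_rtol(buf, buf + 2) leaves "o": the right-to-left version overwrites source bytes before it has read them.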

wca@ut-emx.UUCP (William C. Anderson) (03/22/88)

In article <10753@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:

-> In article <7506@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:

-> ->This usage was never a good idea, because a valid implementation of
-> ->strcpy() would be to copy right-to-left rather than left-to-right

-> `That turns out not to be the case'---or rather, are you certain?

Chris is right here, Doug.  For example, the ndbm(3) routines in 4.3BSD
depend upon bcopy() doing the correct ordering in cases of overlap.
Luckily, it is simple to do the code correctly.

William Anderson - University of Texas Computation Center - wca@emx.utexas.edu

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/23/88)

In article <10753@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
-In article <7506@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
->This usage was never a good idea, because a valid implementation of
->strcpy() would be to copy right-to-left rather than left-to-right
-I agree that a generic block copy operation (one of {memcpy, memmove}
----I cannot remember which allows overlap) might do this; I do not
-agree that strcpy() may be implemented that way.  (I could be wrong.)

I've never seen a specification of strcpy() that promised left-to-right
processing.  The dpANS specifies (redundantly, now that noalias
qualifiers are shown on the parameters) that copying between overlapping
objects results in undefined behavior.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/23/88)

In article <1304@ut-emx.UUCP> wca@ut-emx.UUCP (William C. Anderson) writes:
-In article <10753@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
--> In article <7506@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
--> ->This usage was never a good idea, because a valid implementation of
--> ->strcpy() would be to copy right-to-left rather than left-to-right
--> `That turns out not to be the case'---or rather, are you certain?
-Chris is right here, Doug.  For example, the ndbm(3) routines in 4.3BSD
-depend upon bcopy() doing the correct ordering in cases of overlap.

Talk about non-sequiturs!  The subject was strcpy() and the implications
of noalias on its parameters.

jrl@anuck.UUCP (j.r.lupien) (03/24/88)

From article <793@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
> The UNIX manuals say of strcpy(s1, s2) that it
> 	"copies s2 to s1, stopping after the null character has been copied."
> While they don't strictly speaking say anything about the order in which
> the other characters are copied, they _do_ say that the NUL character must
> be copied last, so 
Stopping after something occurs, as with "after the NULL has been copied"
does NOT equate, as you go on to assume, to "nothing will be done after
the NULL is copied. The function will return immediately." The definition
says "after". "After", as I recall, means "not before", which in no way
precludes doing the required act first, and taking care of other
requirements next. 
  As an aside, I would be loath to assume that the string was copied
front to rear even if the function WAS so specified. Documentation
has been known in the past to lag or lead the actual code, or to
simply ignore it. 
  The moral of this is, don't depend on bizarre side effects unless
there is no other efficient way to get the job done, and even then
be quite sure that things work the way you expect (test it). Be
prepared to adopt some less efficient method, because you WILL get
bitten.

John R. Lupien
twitch!mvuxa!anuxh!jrl


Watch out for that piranha!

ok@quintus.UUCP (Richard A. O'Keefe) (03/24/88)

In article <545@anuck.UUCP>, jrl@anuck.UUCP (j.r.lupien) writes:
> From article <793@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
> > The UNIX manuals say of strcpy(s1, s2) that it
> > 	"copies s2 to s1, stopping after the null character has been copied."
> > While they don't strictly speaking say anything about the order in which
> > the other characters are copied, they _do_ say that the NUL character must
> > be copied last, so 
> Stopping after something occurs, as with "after the NULL has been copied"
> does NOT equate, as you go on to assume, to "nothing will be done after
> the NULL[*] is copied. The function will return immediately."

That's not what I assumed.  The function could well compute factorial 5000.
So what?  The manual says that COPYING stops after the NUL character has
been copied.  So whatever strcpy does after copying NUL, either it doesn't
copy any part of s2 to s1, or the manual entry is just plain wrong (which
would not be unprecedented).  The point of my message was that AT&T
documentation provides some warrant for expecting a left-to-right order
rather than some other order.

The VMS C documentation says that strcpy(str_1, str_2)
	"copies str_2 into str_1, stopping after copying str_2's NUL character."
which again says that COPYING stops after the NUL is copied.

>   The moral of this is, don't depend on bizarre side effects unless

The order in which strcpy works is hardly a "bizarre side effect".
The ADA LRM takes the trouble to point out in section 5.2.1 that the
effect of assignments such as
	A : STRING(1..31);
	A(1..9) := "tar sauce";
	A(4..12) := A(1..9);
yields A(1..12) = "tartar sauce".  There doesn't seem to be any good
reason for C being less well defined.  We want a block transfer which
works correctly with overlapping blocks.  We want a "string" transfer
which is defined to copy left to right (only one direction, because
C "strings" have one easy end (p) and one hard end (p+strlen(p))).
And there is room for incompletely specified block/string
transfers as well.
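
For comparison, a C sketch of the same slice assignment, using memmove() as the block transfer that tolerates overlap. The helper name slice_assign is hypothetical, and offsets here are 0-based where the Ada indices are 1-based.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper mirroring the Ada slice assignment
 * A(4..12) := A(1..9): copy len bytes within one array, from
 * offset `from` to offset `to`, where the two ranges may overlap. */
static void slice_assign(char *a, size_t to, size_t from, size_t len)
{
    assert(a != NULL);
    memmove(a + to, a + from, len);
}
```

Starting from char a[32] = "tar sauce", the call slice_assign(a, 3, 0, 9) gives "tartar sauce". A plain byte-at-a-time left-to-right loop over the same ranges would instead produce "tartartartar", because it rereads bytes it has already overwritten.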

Not relying on the documentation, as (j.r.lupien) suggests, leads to
people writing their own version of the C library so that they know
what will happen.  If the ANSI C library doesn't include something
like strcpy() which is defined to work left to right, people will
have to keep on rolling their own.

[*] The name of the null character is NUL, not NULL, just as
    the name of the bell character is BEL, not BELL.

PS: I have found on some machines that calling my own routine
    char *mycpy(dst, src)
	char *dst, *src;
	{    
	    register char *d, *s;
	    for (d = dst, s = src; *d++ = *s++; ) ;
	    return dst;
	}
    can be FASTER than calling the vendor's strcpy()!  You might like
    to measure it for yourself.  I was very surprised by this result.

barmar@think.COM (Barry Margolin) (03/25/88)

In article <810@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>In article <545@anuck.UUCP>, jrl@anuck.UUCP (j.r.lupien) writes:
>> From article <793@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
>> > The UNIX manuals say of strcpy(s1, s2) that it
>> > 	"copies s2 to s1, stopping after the null character has been copied."
>> > While they don't strictly speaking say anything about the order in which
>> > the other characters are copied, they _do_ say that the NUL character must
>> > be copied last, so 
>> Stopping after something occurs, as with "after the NULL has been copied"
>> does NOT equate, as you go on to assume, to "nothing will be done after
>> the NULL[*] is copied. The function will return immediately."
>
>That's not what I assumed.  The function could well compute factorial 5000.
>So what?  The manual says that COPYING stops after the NUL character has
>been copied.  So whatever strcpy does after copying NUL, either it doesn't
>copy any part of s2 to s1, or the manual entry is just plain wrong (which
>would not be unprecedented).  The point of my message was that AT&T
>documentation provides some warrant for expecting a left-to-right order
>rather than some other order.

Will you guys stop playing word games, and think about what that
sentence was really intended to mean?  I think the point of the
"stopping after the NUL" phrase is that it doesn't copy any characters
after the NUL.  Thus, if you have

	char dest[10], source[10];
	strcpy (source, "abcdefghi");
	strcpy (dest, "123456789");
	source [3] = '\0';
	strcpy (dest, source);

the resulting contents of dest will be

	'a' 'b' 'c' '\0' '5' '6' '7' '8' '9' '\0'

i.e. the last six characters are not affected.  Programmers accustomed
to some other programming languages might have expected all the
declared contents of the string to be copied, and this phrase serves
as a reminder that string functions don't know about declared array
dimensions, all they know is that '\0' ends a string.
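
A compilable sketch of that example, wrapped in a hypothetical function named copy_short_over_long so the result can be inspected by the caller. It shows the behavior described here on ordinary implementations; it does not settle what the manual wording guarantees.

```c
#include <assert.h>
#include <string.h>

/* Fill a 10-byte destination as in the example above, then copy a
 * 3-character string over its start.  dest must have room for at
 * least 10 bytes. */
static void copy_short_over_long(char dest[10])
{
    char source[10];

    assert(dest != NULL);
    strcpy(source, "abcdefghi");
    strcpy(dest, "123456789");
    source[3] = '\0';           /* source is now "abc" */
    strcpy(dest, source);       /* writes 'a' 'b' 'c' '\0' */
}
```

Afterward dest[4] through dest[9] can be examined with memcmp() rather than strcmp(), since string functions stop at the first NUL.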

Barry Margolin
Thinking Machines Corp.

barmar@think.com
uunet!think!barmar

dsill@NSWC-OAS.arpa (Dave Sill) (03/25/88)

From article <793@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
> The UNIX manuals say of strcpy(s1, s2) that it
> 	"copies s2 to s1, stopping after the null character has been copied."
> While they don't strictly speaking say anything about the order in which
> the other characters are copied, they _do_ say that the NUL character must
> be copied last, so 

I think you're misinterpreting that statement.  I don't think that
statement says anything about the order in which the characters are
copied or that the NUL is copied last.

As we all know, a string in C is a pointer to a list of characters
that, by convention, is terminated by a NUL character.  Given a
string, the ONLY way to determine its contents or length is to start
at the beginning and scan for the terminating NUL.  The statement
above is merely restating the NUL-terminator convention.  I don't
think it was intended to specify the actual order in which the
characters are copied.

Of course, with C's string representation, copying from beginning to
end is more efficient than finding the end of the source string and
copying backward.

=========
The opinions expressed above are mine.

"We are offended and resent it when people do not respect us;
and yet no man, deep down in his heart, has any considerable
respect for himself."
					-- Mark Twain

barmar@think.COM (Barry Margolin) (03/25/88)

In article <12622@brl-adm.ARPA> dsill@NSWC-OAS.arpa (Dave Sill) writes:
>Of course, with C's string representation, copying from beginning to
>end is more efficient than finding the end of the source string and
>copying backward.

Depends on the hardware.  If a machine has a fast instruction that
does a search for a byte and a fast block move instruction, it would
probably be best for strcpy to be written

	memcpy (dest, src, strpos (src, '\0') + 1);

(assuming that memcpy and strpos are inlined into the appropriate
instructions, otherwise strcpy should be written in assembler to take
advantage of the hardware).
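
Since strpos() is not a standard library function, here is a sketch of the same two-pass idea using strlen(), on the assumption that strpos(src, '\0') was meant to return the index of the NUL. The name two_pass_copy is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Two-pass string copy: one scan to find the NUL, then one block
 * copy.  Like strcpy(), it assumes dest is large enough and that
 * the two arrays do not overlap. */
static char *two_pass_copy(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);
    return memcpy(dest, src, strlen(src) + 1);
}
```

Whether this beats a single byte-at-a-time loop depends on how the compiler and hardware implement strlen() and memcpy(), which is the point being made above.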

Of course, if you have hardware that knows how to do a block transfer
until a particular character is reached, that would even be better.
Honeywell's Multics mainframes have such a thing in their Extended
Instruction Set.  (It's actually more general than this, because you
give it a table where each byte corresponds to a particular character
value -- it stops when it encounters a character whose table entry is
nonzero and returns the pointer to that character in a register and
the table entry value.  It can therefore implement the strspn family
of functions, and is especially useful for lexical analyzers because
the character table can be used to implement a state transition table
-- I can envision a three-instruction FSM loop.)
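
A rough C analogue of such a table-driven scan, offered only as an illustration of the idea and not as a description of the Multics instruction itself. The function name scan_table and its interface are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Scan s until a byte whose table entry is nonzero; return a pointer
 * to that byte and store its table entry in *entry.  The caller must
 * make table[0] nonzero so the scan stops at the terminating NUL. */
static const char *scan_table(const char *s,
                              const unsigned char table[256],
                              unsigned char *entry)
{
    const unsigned char *p = (const unsigned char *)s;

    assert(s != NULL && table != NULL && entry != NULL);
    assert(table[0] != 0);
    while (table[*p] == 0)
        p++;
    *entry = table[*p];
    return (const char *)p;
}
```

With table[0] and table[' '] set to distinct nonzero values, the caller can tell from *entry whether the scan stopped at a blank or at the end of the string.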

Barry Margolin
Thinking Machines Corp.

barmar@think.com
uunet!think!barmar

ok@quintus.UUCP (Richard A. O'Keefe) (03/25/88)

In article <18488@think.UUCP>, barmar@think.COM (Barry Margolin) writes:
> Will you guys stop playing word games, and think about what that
> sentence was really intended to mean?  I think the point of the

We don't KNOW "what that sentence was really intended to mean".
All we can tell is what it SAYS.
Anyone who needs reminding in the manual page for strcpy()
"that string functions don't know about declared array dimensions"
is going to have a hard time with almost anything in C.

I've just checked in Harbison & Steele, and while they do not feel
constrained to point out that string functions don't know about
declared array dimensions, they *do* explicitly say that strcat()
and strcpy() may not work with overlapping strings.  Presumably
this means that they knew of C implementations that didn't do the
copies left to right.  Anyone have any idea what those implementations
might be?

dsill@NSWC-OAS.arpa (Dave Sill) (03/25/88)

In article <18488@think.UUCP> Barry Margolin <barmar@think.COM> writes:
>Will you guys stop playing word games, and think about what that
>sentence was really intended to mean?  I think the point of the
>"stopping after the NUL" phrase is that it doesn't copy any characters
>after the NUL.  Thus, if you have
>
>	char dest[10], source[10];
>	strcpy (source, "abcdefghi");
>	strcpy (dest, "123456789");
>	source [3] = '\0';
>	strcpy (dest, source);
>
>the resulting contents of dest will be
>
>	'a' 'b' 'c' '\0' '5' '6' '7' '8' '9' '\0'
>
>i.e. the last six characters are not affected.

I don't think that that's guaranteed, or even implied by that
sentence.  I would expect the contents of `dest' to be:

	'a' 'b' 'c' '\0'  ?   ?   ?   ?   ?   ?

where `?' may or may not be the same character that was in that
position before the call to strcpy.  I could imagine an implementation
that would null-out the destination string if it was longer than the
source.  ANSI describes `strcpy' a little differently:

  "The `strcpy' function copies the string pointed to by `s2'
   (including the terminating null character) into the array pointed
   to by `s1'.  If copying takes place between objects that overlap,
   the behavior is undefined."

There is nothing said about the order in which the copying takes
place, or the contents of the destination string past the null
character.

=========
The opinions expressed above are mine.

"The wretched reflect either too much or too little."
					-- Publilius Syrus

jrl@anuck.UUCP (j.r.lupien) (03/26/88)

From article <810@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
+ In article <545@anuck.UUCP>, jrl@anuck.UUCP (j.r.lupien) writes:
+> From article <793@cresswell.quintus.UUCP>, by ok@quintus.UUCP (Richard A. O'Keefe):
+> > The UNIX manuals say of strcpy(s1, s2) that it
+> > 	"copies s2 to s1, stopping after the null character has been copied."
+> > While they don't strictly speaking say anything about the order in which
+> > the other characters are copied, they _do_ say that the NUL character must
+> > be copied last, so 
+ That's not what I assumed.  The function could well compute factorial 5000.
+ So what?  The manual says that COPYING stops after the NUL character has
+ been copied.  

I realize this is not really addressing the issue of how strcpy should
or shouldn't work. The point of my RESPONSE has to do with the direct
interpretation of documentation. The above quoted statement does not
say that the copying stops as soon as the NUL gets copied. It just
does not say that, at all. If you assume that it intends to give
that impression, perhaps you are making a reasonable assumption,
but I only try adding words in to the statements from the manual
(words like "as soon as") after I have tried something out and I
find that the behavior does not correspond exactly to what the
manual says.

+ So whatever strcpy does after copying NUL, either it doesn't
+ copy any part of s2 to s1, or the manual entry is just plain wrong (which
+ would not be unprecedented).

No, as I have just pointed out, the manual entry is NOT "just plain
wrong", it is just plain MISLEADING. Misleading is more than just
"not unprecedented", it seems to be a way of life in the UNIX manuals.

+ The point of my message was that AT&T
+ documentation provides some warrant for expecting a left-to-right order
+ rather than some other order.
+ 
+ The VMS C documentation says that strcpy(str_1, str_2)
+ 	"copies str_2 into str_1, stopping after copying str_2's NUL character."
+ which again says that COPYING stops after the NUL is copied.

Indeed, but it fails just as fully to specify at what point the copying
stops after the NUL is copied. After means later in time. Something
else to specify immediacy is required before I will assume it.

+ 
+>   The moral of this is, don't depend on bizarre side effects unless
+ 
+ The order in which strcpy works is hardly a "bizarre side effect".
[ADA example omitted]

I don't really agree that having things work the way you might expect
them to is "not bizarre". Few things surprise me more than to have my
first impression of what the manual said being borne out in fact.

+ Not relying on the documentation, as (j.r.lupien) suggests,

No, no. I meant that you should not rely on a "reasonable interpretation"
of what the documentation says. Relying on LITERAL interpretation
will get you in trouble more often than it should. If you expect things
to behave in a "reasonable" manner on top of the literal specification,
some implementor's concept of what is reasonable will at some point
diverge from your own, and you will suffer unreasonably as a result.
I will indeed rely on the function to stop after the NUL has been
copied. If it stops before the NUL has been copied, I will call the
implementor and get them to fix either the library or the manual
so that they agree.

+ leads to people writing their own version of the C library so
+ that they know what will happen.

As you go on to explain, there are many very good reasons to 
"roll your own". Having the code do what you expect is only one
of them. However, I am an enthusiastic and appreciative user
of other people's libraries. I prefer to use "standard calls"
whenever possible. I just try not to read more into the documentation
than is actually written there in ink. 

twitch!mvuxa!anuxh!jrl

Watch out for that nuance!

jgy@hropus.UUCP (John Young) (03/26/88)

In response to Barry Margolin, Dave Sill writes:
> I don't think that that's guaranteed, or even implied by that
> sentence.  I would expect the contents of `dest' to be:
> 
> 	'a' 'b' 'c' '\0'  ?   ?   ?   ?   ?   ?
		..........
> There is nothing said about the order in which the copying takes
> place, or the contents of the destination string past the null
> character.
> The opinions expressed above are mine.

I'm glad these are just yours!
You're mistaken.
If the "contents of the destination string past the null character"
are not guaranteed why would anyone use strcpy()? 

barmar@think.COM (Barry Margolin) (03/28/88)

In article <12636@brl-adm.ARPA> dsill@NSWC-OAS.arpa (Dave Sill) writes:
]In article <18488@think.UUCP> Barry Margolin <barmar@think.COM> writes:
]>Will you guys stop playing word games, and think about what that
]>sentence was really intended to mean?  I think the point of the
]>"stopping after the NUL" phrase is that it doesn't copy any characters
]>after the NUL.  Thus, if you have
]>
]>	char dest[10], source[10];
]>	strcpy (source, "abcdefghi");
]>	strcpy (dest, "123456789");
]>	source [3] = '\0';
]>	strcpy (dest, source);
]>
]>the resulting contents of dest will be
]>
]>	'a' 'b' 'c' '\0' '5' '6' '7' '8' '9' '\0'
]>
]>i.e. the last six characters are not affected.
]
]I don't think that that's guaranteed, or even implied by that
]sentence.  I would expect the contents of `dest' to be:
]
]	'a' 'b' 'c' '\0'  ?   ?   ?   ?   ?   ?
]
]where `?' may or may not be the same character that was in that
]position before the call to strcpy.  I could imagine an implementation
]that would null-out the destination string if it was longer than the
]source.

Well, I can't, because of C's rules about passing array arguments to
functions.  Only the address is passed, not the allocated length.  If
strcpy were to affect the portion of the destination array past the
NUL character, it would have to be careful not to modify anything
outside the destination array.  But since it can't know where the
destination array ends, it must not modify any elements but the ones
necessary to perform its stated function (which, by the way, still
doesn't prevent it from exceeding the destination's length -- it is
the programmer's responsibility to make sure that sizeof(dest) >
strlen(source)).
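
One way a caller might discharge that responsibility is to pass the destination size explicitly, since the callee cannot discover it. This is a sketch only; copy_if_fits is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy src into dest only if it fits,
 * terminating NUL included.  Returns dest on success, or NULL if
 * the destination is too small (dest is then left unmodified). */
static char *copy_if_fits(char *dest, size_t dest_size, const char *src)
{
    size_t need;

    assert(dest != NULL && src != NULL);
    need = strlen(src) + 1;
    if (need > dest_size)
        return NULL;
    memcpy(dest, src, need);
    return dest;
}
```

A typical call would be copy_if_fits(dest, sizeof dest, source), which only works where dest is a true array in scope and not a pointer parameter.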

]  ANSI describes `strcpy' a little differently:
]
]  "The `strcpy' function copies the string pointed to by `s2'
]   (including the terminating null character) into the array pointed
]   to by `s1'.  If copying takes place between objects that overlap,
]   the behavior is undefined."
]
]There is nothing said about the order in which the copying takes
]place, or the contents of the destination string past the null
]character.

There is also nothing said about the effect on /dev/icbm, but that
doesn't imply that it is permitted to send it the "launch" signal.
Since it doesn't say that the other elements of the destination are
modified, I believe that an implementation would be incorrect if it
did.  And I suspect that there are many existing applications that
assume that they can use strcpy to copy into the middle of a string
without affecting later elements.

Barry Margolin
Thinking Machines Corp.

barmar@think.com
uunet!think!barmar

dsill@NSWC-OAS.arpa (Dave Sill) (03/29/88)

In article <90@hropus.UUCP> John Young <hropus!jgy> writes:
>If the "contents of the destination string past the null character"
>are not guaranteed why would anyone use strcpy()? 

Well, at the risk of sounding flippant, one would use strcpy() to make
a copy of a string.  I personally don't use strcpy() for any other
reason, and I don't see how writing past the null but within the
bounds of the destination array would preclude this.

[In a previous posting I retracted my statement that strcpy() could
write something past the null in the destination string.]

=========
The opinions expressed above are mine.

"We must remove the TV-induced stupor that lies like a fog across the
land."
					-- Ted Nelson

throopw@xyzzy.UUCP (Wayne A. Throop) (03/29/88)

>,>>> ok@quintus.UUCP (Richard A. O'Keefe)
>> jrl@anuck.UUCP (j.r.lupien)

>>> The UNIX manuals say of strcpy(s1, s2) that it
>>> 	"copies s2 to s1, stopping after the null character has been copied."
>> Stopping after something occurs, as with "after the NULL has been copied"
>> does NOT equate, as you go on to assume, to "nothing will be done after
>> the NULL[*] is copied. The function will return immediately."
> That's not what I assumed.  The function could well compute factorial 5000.
> So what?  The manual says that COPYING stops after the NUL character has
> been copied.  So whatever strcpy does after copying NUL, either it doesn't
> copy any part of s2 to s1

It does indeed say that the copying stops after the null has been
copied.  But this in no way indicates that no more copying occurs after
the copy of the null has been made.  Consider:

        The car went careening down the street, stopping after the
        pedestrian had been hit.

Do you think this means that the car does no more traveling after the
impact?  I think not.  If that had been meant, it should have said
"stopping immediately after the pedestrian had been hit" or "stopping
when the pedestrian had been hit".  I think the same applies to the
manual entry.  And apparently other people think so too, since the ANSI
clarification of this passage does not guarantee that the null is the
last item copied.

I'm fairly certain that the only thing the phrasing of the manual
guarantees (or is even intended to guarantee) is that the null is
copied during the process, and not what the relative order is.

--
Sometimes I wonder whether God enjoys Christmas.
                                --- Horace Rumpole
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw

john@frog.UUCP (John Woods, Software) (03/29/88)

In article <1304@ut-emx.UUCP>, wca@ut-emx.UUCP (William C. Anderson) writes:
> In article <10753@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> -> In article <7506@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
> -> ->This usage was never a good idea, because a valid implementation of
> -> ->strcpy() would be to copy right-to-left rather than left-to-right
> -> `That turns out not to be the case'---or rather, are you certain?
> Chris is right here, Doug.  For example, the ndbm(3) routines in 4.3BSD
> depend upon bcopy() doing the correct ordering in cases of overlap.
> Luckily, it is simple to do the code correctly.

BUZZ!  No, Doug is right.  The standard (3 August 87 draft) explicitly
states that "If copying takes place between objects that overlap, the
behavior is undefined."  You can't *depend* on the behavior of strcpy()
and expect to have your program be portable, QED.  Perhaps bcopy() is
*defined* to work correctly in cases of overlap, though people worried about
that less back in the old days :-).

The following is a perfectly *legal* (if perfectly awful) implementation.

char *strcpy(char *s1, const char *s2)
{
	const char *eos = s2 + strlen(s2);
	s1 += strlen(s2);
	while (eos != s2) *s1-- = *eos--;
	*s1 = *eos;
	return s1;
}

--
John Woods, Charles River Data Systems, Framingham MA, (617) 626-1101
...!decvax!frog!john, ...!mit-eddie!jfw, jfw@eddie.mit.edu

FUN:  THE FINAL FRONTIER
Zippy the Pinhead in '88!

jgy@hropus.UUCP (John Young) (03/29/88)

> In article <90@hropus.UUCP> John Young <hropus!jgy> writes:
> >If the "contents of the destination string past the null character"
> >are not guaranteed why would anyone use strcpy()? 
> 
> Well, at the risk of sounding flippant, one would use strcpy() to make
> a copy of a string.  I personally don't use strcpy() for any other
> reason, and I don't see how writing past the null but within the
> bounds of the destination array would preclude this.
> 
> [In a previous posting I retracted my statement that strcpy() could
> write something past the null in the destination string.]
> 

There was no mention of "within the bounds of the destination array"
in the original posting.  How would you suggest strcpy() check for
this?  Good thing you retracted your statement (I didn't see it!)

djones@megatest.UUCP (Dave Jones) (03/30/88)

in article <725@xyzzy.UUCP>, throopw@xyzzy.UUCP (Wayne A. Throop) says:
> 
>
> It does indeed say that the copying stops after the null has been
> copied.  But this in no way indicates that no more copying occurs after
> the copy of the null has been made.
> 
>
> Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw


Huh?

nevin1@ihlpf.ATT.COM (00704a-Liber) (03/30/88)

In article <810@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>
>>   The moral of this is, don't depend on bizarre side effects unless
>
>The order in which strcpy works is hardly a "bizarre side effect".

I'm sorry, but it is!  If you are writing code which is dependent upon the
IMPLEMENTATION of strcpy instead of the DESCRIPTION of strcpy, then you ARE
depending on side effects of strcpy.  Whenever possible, code should NOT
depend on the side effects/implementation details of a function that it
calls.

Suppose I looked at the source for some obscure system call (call it foo)
and found out that it modified a static variable somewhere in memory.
Would you say that it is okay for me to look at the variable that it
modified (assuming that this was not a documented property of foo, of
course)?  I think not.  The strcpy argument is no different.

This is one of the things that makes code very hard to maintain.  For
example:  one of the routines that I was using returned a unique number
according to a certain set of constraints.  It also happened that the
number it returned was pseudo-random (it would not necessarily return the
same number under similar circumstances), but this was a property of the
implementation, not the description of the routine.   I modified the
routine so that it always returned the lowest number that met the
constraints (changed the implementation, not the description).  Guess what
happened?  Another part of the program was in an infinite loop because it
called this routine to generate two separate numbers which met the
constraints, but my (new) routine always returned the same number.  Now I
know why people don't like to touch code that already works (old code is
just too delicately intertwined).

With languages such as C++ becoming more popular, abstraction will be
forced so that these types of problems do not occur.  But until the time
that this is commonplace, we should be trying to abstract on a
procedural level.  By this I mean that code, whenever possible, should be
written so that it depends ONLY upon the description of a subroutine and NOT
dependent on the implementation of that subroutine.
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

ok@quintus.UUCP (Richard A. O'Keefe) (03/30/88)

In article <4190@ihlpf.ATT.COM>, nevin1@ihlpf.ATT.COM (00704a-Liber) writes:
> In article <810@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
> >>   The moral of this is, don't depend on bizarre side effects unless
> >The order in which strcpy works is hardly a "bizarre side effect".
> I'm sorry, but it is!

Questions like "what happens to the rest of the destination" and "what
happens if the two areas overlap" are so important that the answers
SHOULD be part of the description of strcpy(). It is extremely useful to
have a function which can safely be used to move part of a character
array towards its origin.  Given that strcpy() is the only possible
candidate for this in the SVID, that the description in the SVID can be
naturally construed as describing a left to right copy, and that the
descriptions of the string operations are pretty vague, it is reasonable
for someone to expect that strcpy() will work this way.

If the memcpy() question was solved by adding a memmove(), is there
also a strmove() in the current dpANS draft?

Does anyone know whether the vagueness of the SVID description of
strcpy() was intentional, or whether strcpy() was originally intended
to work left-to-right and the vagueness was accidental?

chris@mimsy.UUCP (Chris Torek) (03/31/88)

In article <836@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard
A. O'Keefe) writes:
>Questions like "what happens to the rest of the destination" and "what
>happens if the two areas overlap" are so important that the answers
>SHOULD be part of the description of strcpy().

Unless there were some overriding reason not to do so, I agree.  The
claim as to efficiency is similar to the claim that Unix should have
a `spawn' system call.  After all, most of the time you are copying
from one string to another.  After all, most of the time you are
going to exec immediately after a fork.

>If the memcpy() question was solved by adding a memmove(), is there
>also a strmove() in the current dpANS draft?

No.

This sort of thing can lead to a function space explosion: strcpy for
one-to-another; strltor for left-to-right copy; strrtol for right-to-
left; strmove for whichever is `right'; `strunsharedcpy' for memory
regions that are guaranteed unshared; ....

Where does one stop?  That is a matter of taste.  In the case of strcpy,
I happen to believe that defining it to work left-to-right is worth
any expense it may cause (because I believe that cost will be small
if not zero).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

rbutterworth@watmath.waterloo.edu (Ray Butterworth) (03/31/88)

In article <836@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes:
> If the memcpy() question was solved by adding a memmove(), is there
> also a strmove() in the current dpANS draft?

You could

#define strmove(out,in) ((char*)memmove((void*)out, (void*)in, 1+strlen(in)))

if it weren't for the fact that the identifier "strmove" is already
reserved by the Standard.

(If I got the return value of memmove() wrong, please don't bother
posting to tell me.  I don't have a copy of the Standard with me.)

msb@sq.uucp (Mark Brader) (04/01/88)

Regarding:
> > If a machine has a fast instruction that
> > does a search for a byte and a fast block move instruction, it would
> > probably be best for strcpy to be written
> > 
> > 	memcpy (dest, src, strpos (src, '\0') + 1);
> > 
> > (assuming that memcpy and strpos are inlined into the appropriate
> > instructions ... )

I said, in an article I am now canceling:
> The above
> algorithm is two-pass, and therefore not robust in the face 
> of *shared memory*.

It has been pointed out to me that a one-pass algorithm is also not
robust in a shared-memory situation.  The two-pass and one-pass
algorithms merely fail in slightly different ways and on slightly
different race conditions.
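
For concreteness, the two-pass scheme under discussion might be sketched
in C as follows.  This is only an illustration: strlen() stands in for
the hypothetical strpos() search, the name two_pass_strcpy is made up
here, and the arguments are assumed not to overlap, since memcpy() makes
no promise otherwise.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical two-pass copy: pass 1 finds the terminator, pass 2 moves
 * the block.  Assumes dst and src do not overlap. */
char *two_pass_strcpy(char *dst, const char *src)
{
    size_t n;

    assert(dst != NULL && src != NULL);
    n = strlen(src) + 1;        /* pass 1: length including the NUL */
    memcpy(dst, src, n);        /* pass 2: block move */
    return dst;
}
```

If another process changes src between the two passes, the count from
pass 1 no longer matches the data that pass 2 moves.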

Mark Brader, SoftQuad Inc., Toronto, utzoo!sq!msb, msb@sq.com
	"I'm a little worried about the bug-eater," she said.  "We're embedded
	in bugs, have you noticed?"		-- Niven, "The Integral Trees"

gnu@hoptoad.uucp (John Gilmore) (04/01/88)

john@frog.UUCP (John Woods, Software) wrote:
> BUZZ!  No, Doug is right.  The standard (3 August 87 draft) explicitly
> states that "If copying takes place between objects that overlap, the
> behavior is undefined."  You can't *depend* on the behavior of strcpy()
> and expect to have your program be portable, QED.

If the standard was already perfect there would be no need to discuss
it.  But having an August draft say X doesn't mean that X is proven
to be true and correct, QED.

The whole discussion here is about what the standard *should* say.
Arguing that "this is right because the draft standard says it"
carries no weight at all; we knew that already and are arguing that
it is wrong.

In particular, both Chris and I have seen programs that depend on
strcpy() being able to slide a string into lower array indices without
destroying it.  We think this is a valid interpretation of the man page.
Now, some people are picking nits with the English used to document
it, which reminds me of people spending years analyzing the Bible and quoting
it to support their claims -- without once reading the original Aramaic
to see what it really said.  In our case, we know what the original
source code said -- it copied left to right and made no bones about it.
And so far nobody has named a compiler/library/OS/environment
that *doesn't* just copy left to right.  But somebody somewhere wants
the freedom to copy all the even bytes and then all the odd bytes,
or something, and so we burn a few hundred K of comp.lang.c...
-- 
{pyramid,pacbell,amdahl,sun,ihnp4}!hoptoad!gnu			  gnu@toad.com
"Don't fuck with the name space!" -- Hugh Daniel

dsill@NSWC-OAS.arpa (Dave Sill) (04/01/88)

>> It does indeed say that the copying stops after the null has been
>> copied.  But this in no way indicates that no more copying occurs after
>> the copy of the null has been made.
>
>Huh?

It means the copying does not stop until the null is copied.

This reminds me of the Saturday Night Live sketch with Ed Asner as the
nuclear plant manager going on vacation whose parting advice is "You
can't use too much cooling water in a nuclear reactor."

The intent of the statement:
     Strcpy copies string s2 to s1, stopping after the null character
     has been moved.
is that all characters in s2, up to and including the terminating
null, are copied to s1.  Nothing at all is said about the order in
which the copying takes place.  To assume that all implementations
copy from right-to-left or left-to-right is plainly wrong.

karl@haddock.ISC.COM (Karl Heuer) (04/02/88)

In article <17942@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes:
>You could
>#define strmove(out,in) ((char*)memmove((void*)out, (void*)in, 1+strlen(in)))
>if it weren't for the fact that the identifier "strmove" is already
>reserved by the Standard.

You could do it anyway -- provided "you" means the vendor, rather than the
user.  You'd still be a conforming implementation; the name "strmove" is part
of the implementation's available namespace.

I object on different grounds, though: "in" is evaluated twice.
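
One way around the double evaluation, sketched here on the assumption
that a function call is acceptable, is to wrap the same expression in a
function.  The name my_strmove is hypothetical, chosen to stay out of
the reserved str* namespace.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical function form of the macro: each argument is evaluated
 * exactly once, and memmove() handles overlap in either direction. */
char *my_strmove(char *out, const char *in)
{
    assert(out != NULL && in != NULL);
    return memmove(out, in, strlen(in) + 1);
}
```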

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

pablo@polygen.uucp (Pablo Halpern) (04/02/88)

From article <725@xyzzy.UUCP>, by throopw@xyzzy.UUCP (Wayne A. Throop):
> [ Referring to the manual entry for strcpy() ]
> I'm fairly certain that the only thing the phrasing of the manual
> guarantees (or even is intended to guarantee) is that the null is
> copied during the process, and not what the relative order is.

Fine.  To avoid incorrect inferences from readers, the entry should
be revised to just say that the copy INCLUDES the NUL terminator.
Also, since the length of the destination array cannot be determined
by strcpy() (because of C's array/pointer semantics), the manual entry
should explicitly state that no characters in the destination string
following the NUL are modified.  (Again, considering the array/pointer
semantics, perhaps someone could come up with an even more precise
rewording of my last sentence.)

Pablo Halpern		|	mit-eddie \
Polygen Corp.		|	princeton  \ !polygen!pablo  (UUCP)
200 Fifth Ave.		|	bu-cs      /
Waltham, MA 02254	|	stellar   /

djones@megatest.UUCP (Dave Jones) (04/02/88)

Geez.  Enough already!! 

Everybody knows what strcpy does:

void strcpy(str1, str2)
  char* str1;
  char* str2;
{
  while(*str1++ = *str2++) {;} 
}

If I remember correctly, it says as much right in K&R.  I don't
think you want to break K&R without darn good reason.

If you want a function that does something else, give it another name.
"strcpy" is already taken.


  -- Sgt. Dave Jones, Naming Conventions Police, ret.

chris@mimsy.UUCP (Chris Torek) (04/02/88)

In article <4295@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>The whole discussion here is about what the standard *should* say.

Precisely.

>In particular, both Chris and I have seen programs that depend on
>strcpy() being able to slide a string into lower array indices without
>destroying it.

Yes.  `strcpy(p, p+n)' is not an uncommon idiom.

>We think this is a valid interpretation of the man page.

(Well, perhaps the manual entry should be clarified.)

>... In our case, we know what the original
>source code said -- it copied left to right and made no bones about it.
>And so far nobody has named a compiler/library/OS/environment
>that *doesn't* just copy left to right.

To be fair, I *do* know of one:  The 4.3BSD Vax strcpy() uses the Vax
locc and movc3 instructions.  movc3 moves in whichever direction is
nondestructive.  This implies that

	strcpy(p+n, p)

moves the string *up* n bytes nondestructively, except when the string
is more than 65535 bytes long (the limit for a single locc/movc3).  I
would not mind having to change this if the standard mandated
left-to-right copying (which has a duplication effect on (p+n,p)-style
overlapping strings).  Alternatively, the standard could proclaim
that if the strings overlap and dst<src, the copy is done left-to-
right, otherwise the result is implementation dependent; this, however,
is an overly grotesque description.  I prefer the simple and well-
defined semantics of `if the strings overlap, the copy acts as if
it were performed from left to right, one byte at a time.'
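
The duplication effect mentioned above can be seen with an explicit byte
loop.  This is a sketch only: forward_copy is a hypothetical helper, and
it takes a count rather than looking for the NUL, because with this kind
of overlap a forward copy overwrites the terminator before reaching it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy count bytes from low to high addresses,
 * one byte at a time, with no attempt to handle overlap. */
void forward_copy(char *dst, const char *src, size_t count)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < count; i++)
        dst[i] = src[i];
}
```

With char buf[9] = "abcdef", forward_copy(buf + 2, buf, 6) leaves
"abababab" in buf: the first two bytes are replicated rather than the
string being shifted up.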
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

ok@quintus.UUCP (Richard A. O'Keefe) (04/02/88)

In article <10895@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> In article <4295@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
> >And so far nobody has named a compiler/library/OS/environment
> >that *doesn't* just copy left to right.
> 
> To be fair, I *do* know of one:  The 4.3BSD Vax strcpy() uses the Vax
> locc and movc3 instructions.  movc3 moves in whichever direction is
> nondestructive.  This implies that
> 
This may not be such a wonderful idea:  according to the DEC manuals,
some VAX models do not implement the locc instruction.  (The machine
will trap to some sort of library which emulates the missing instructions.)

Getting this right for strings longer than 2^16-a few characters must be
a nightmare:  both locc and movc3 have a 16-bit length operand.  (This
has never made sense to me.)

chris@mimsy.UUCP (Chris Torek) (04/03/88)

>In article <10895@mimsy.UUCP> I mentioned that
>>... The 4.3BSD Vax strcpy() uses the Vax locc and movc3 instructions.

In article <848@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard
A. O'Keefe) writes:
>This may not be such a wonderful idea:  according to the DEC manuals,
>some VAX models do not implement the locc instruction.  (The machine
>will trap to some sort of library which emulates the missing instructions.)

This is true.  In particular, the Microvax I and II chips do not.
(Indeed, the uVax I does not even implement movc3 in hardware.)  The II
traps to kernel code that emulates locc.  (And people wonder why
strcpy() and index() are slow there!  I argued for a `getcputype'
syscall just for library optimisation, but no one has done it.)

>Getting this right for strings longer than 2^16-a few characters must be
>a nightmare:  both locc and movc3 have a 16-bit length operand.  (This
>has never made sense to me.)

(Since VMS string descriptor lengths are Words rather than Longwords,
obviously no one would ever want strings longer than that.  Right.)
Actually, it is not that bad; in particular, movc3 leaves registers
r1 and r3 pointing to the `next' string, so that you wind up with
something like this:

	# strcpy(dst, src)
	...
	loop:
		/* src in r1, dst in r3 */
		locc	$0,$65535,src	# find the \0 in src
		beql	last_block	# if we found it, finish up
		movc3	$65535,src,dst	# otherwise move 64K
		brb	loop		# and keep going
	last_block:
		/* convert to a count and move <65535 bytes */

The code for bcopy/memcpy/memmove that handles overlapping `backwards'
moves, however, is perhaps best described as `amusing':

	/* length in r6, src in r1, dst in r3 */
		addl2	r6,r1		# jump to end of block
		addl2	r6,r3
		movzwl	$65535,r0	# get a handy 64K
		brb	5f
	4:
		subl2	r0,r6		# count 64K moved
/* here begins the silliness: note how r1 and r3 need adjustment now */
		subl2	r0,r1		# ... from 64K behind where we were
		subl2	r0,r3
		movc3	r0,(r1),(r3)	# the VAX does this back to front
		movzwl	$65535,r0	# but we still have to fix the pointers
/* ... and again! */
		subl2	r0,r1		# afterward
		subl2	r0,r3
	5:
		cmpl	r6,r0		# 64K?
		bgtr	4b		# more
		subl2	r6,r1		# 64K or less;
		subl2	r6,r3		# adjust the pointers
		movc3	r6,(r1),(r3)	# and move
		movl	4(ap),r0	# always return dst
		ret

In other words, even though the microcode decides to move the string
`back to front' (high addresses to low addresses), and therefore sets
the registers to count down from the top, it very carefully adjusts
them afterward so that they point to the high addresses---exactly what
we do NOT want.  (I suspect the high bits of one of the counting
registers are used to flag the direction, which would give another
reason why the lengths are limited.  Too bad they are not limited
to 30 bits, which is as much as you can address in one segment [no,
not iNTEL segments].)
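
In C terms, the forward strcpy() loop shown earlier amounts to roughly
the following.  This is a hedged sketch, not the 4.3BSD source:
chunked_strcpy and its chunk parameter are hypothetical (the VAX limit
would be 65535), the search is written as a plain loop standing in for
locc, and memmove() stands in for movc3 since both tolerate overlap
within a single block.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch: copy src to dst in blocks of at most chunk bytes,
 * working from low addresses to high. */
char *chunked_strcpy(char *dst, const char *src, size_t chunk)
{
    char *ret = dst;

    assert(dst != NULL && src != NULL && chunk > 0);
    for (;;) {
        size_t n = 0;

        /* look for the NUL within the next chunk bytes (the locc step) */
        while (n < chunk && src[n] != '\0')
            n++;
        if (n < chunk) {
            /* found it: move the tail, NUL included, and stop */
            memmove(dst, src, n + 1);
            return ret;
        }
        /* no NUL yet: move one full block and keep going */
        memmove(dst, src, chunk);
        dst += chunk;
        src += chunk;
    }
}
```

Each block is safe on its own, but a string longer than one block that
is moved upward onto itself can still be clobbered block by block.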
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

rbutterworth@watmath.waterloo.edu (Ray Butterworth) (04/03/88)

In article <10895@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> I would not mind having to change this if the standard mandated
> left-to-right copying (which has a duplication effect on (p+n,p)-style
> overlapping strings).  Alternatively, the standard could proclaim
> that if the strings overlap and dst<src, the copy is done left-to-
> right, otherwise the result is implementation dependent; this, however,
> is an overly grotesque description.  I prefer the simple and well-
> defined semantics of `if the strings overlap, the copy acts as if
> it were performed from left to right, one byte at a time.'

I'm not disagreeing with what you said, only with the way you said it.
The terms "left-to-right" and "right-to-left" can be misleading.
Standard C can presumably be used in countries where the natural
direction of the language is right-to-left (e.g. Hebrew or Arabic)
rather than left-to-right (e.g. English or French).  In such an
environment, one would consider the terminating nul on a string
to be its left-most character, not its right-most as we would in
English.  Similarly, many of us that use VAX or other similar
equipment tend to think of the bytes being laid out in memory
numbered right-to-left (then shorts, ints, and longs line up
nicely without any of the complications that arise if one thinks
in terms of byte-swapping).

If the standard is going to specify an order for strcpy,
(and I really see no reason why it shouldn't), please let
that order be in terms of "start-to-end" or "low-to-high address"
or some other notation that doesn't presume which end of a string
is the "right" end.

nevin1@ihlpf.ATT.COM (00704a-Liber) (04/05/88)

In article <836@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:

>Questions like "what happens to the rest of the destination" and "what
>happens if the two areas overlap" are so important that the answers
>SHOULD be part of the description of strcpy(). It is extremely useful to
>have a function which can safely be used to move part of a character
>array towards its origin.

I agree that it is useful to have a function which can safely move strings
with overlapping characters.  That is what memmove() is for.

BTW, the answer to "what happens to the rest of the destination" in
strcpy() would be that it is unaffected, since there is no way of conveying
what is meant by "the rest of the destination" to a function call; ie, how
can strcpy() tell the difference between an exact fit and an inexact fit?
It can't.  And the answer to "what happens if the two areas overlap" is
found directly in the standard:

	"If copying takes place between objects that overlap, the behavior is
	undefined."

You may not like the answer, but the standard answers the question just the
same.

>If the memcpy() question was solved by adding a memmove(), is there
>also a strmove() in the current dpANS draft?

strmove() is not needed since it is just a very special case of memmove().

In order to copy possibly overlapping strings, you need to know the length
of the source string.  Therefore, given a source string s2 (char *s2) and a
destination string s1 (char *s1):

	(char *)memmove((void *)s1, (void *)s2, strlen(s2) + (size_t)1)

will accomplish what you would want a strmove() to do.
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

nevin1@ihlpf.ATT.COM (00704a-Liber) (04/05/88)

In article <425@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>Geez.  Enough already!! 

>Everybody knows what strcpy does:

>void strcpy(str1, str2)
>  char* str1;
>  char* str2;
>{
>  while(*str1++ = *str2++) {;} 
>}

First off, you got it slightly wrong (where is your return value?).
Secondly, many implementations of C convert strcpy() into inline
assembly code.  It is conceivable that there may be hardware move
instructions which will copy in a right-to-left order.  Since 'good' C
programs should not be depending on the implementation of
strcpy() anyway, why should the implementation of it be restricted??

>If I remember correctly, it says as much right in K&R.  I don't
>think you want to break K&R without darn good reason.

K&R gives *possible* ways of implementing strcpy() in C (see page 100 in
the first edition).  These are not entirely correct (they do not include
the return value), nor are they all-inclusive.  They are merely there as
examples so that someone reading the book can understand how to implement
a function like strcpy().

BTW, I wonder if the people who are saying that C won't be used on a
multiprocessing machine are the same people who used to say that Unix will
never be implemented on a Cray??  :-) :-)

>If you want a function that does something else, give it another name.
>"strcpy" is already taken.

It seems that you, not I, want a function to do something other than what
strcpy() is *guaranteed* to do now.
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

ok@quintus.UUCP (Richard A. O'Keefe) (04/05/88)

In article <4263@ihlpf.ATT.COM>, nevin1@ihlpf.ATT.COM (00704a-Liber) writes:
> In article <836@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
> I agree that it is useful to have a function which can safely move strings
> with overlapping characters.  That is what memmove() is for.
Nope, memmove() is for when IN ADDITION you already know the exact amount
you want to move.

> 	"If copying takes place between objects that overlap, the behavior is
> 	undefined."
> You may not like the answer, but the standard answers the question just the
> same.
Refusing to answer is _not_ an answer!

> In order to copy possibly overlapping strings, you need to know the length
> of the source string.  Therefore, given a source string s2 (char *s2) and a
> destination string s1 (char *s1):
> 	(char *)memmove((void *)s1, (void *)s2, strlen(s2) + (size_t)1)
> will accomplish what you would want a strmove() to do.

(a) I don't want "to copy possibly overlapping strings", I want to move
    a NUL-terminated sequence to another area of memory which may overlap
    the current copy of that sequence.  I am happy to destroy the original
    (so it's "move", not "copy"), and in general I neither know nor care
    whether the destination has a NUL in it (so one of the areas might not
    be a "string").	[Actually, my objection to calling a move a copy
    counts against me:  strcpy() is a copy only when the two areas don't
    overlap.]
(b) if you are moving towards lower addresses, you do _not_ need to know
    the length of the source string in advance, but can check as you go.
    The implementor of a strmov() function can check for this, and only
    calculate strlen() when necessary.
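
Point (b) might be sketched as follows.  This is an illustration under
stated assumptions, not anyone's library code: str_slide is a
hypothetical name, and the pointer comparison is only meaningful when
both pointers refer to the same array, which is the overlapping case of
interest.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical move of a NUL-terminated sequence that may overlap its
 * destination.  Assumes dst and src point into the same array whenever
 * they overlap, so that comparing them is meaningful. */
char *str_slide(char *dst, const char *src)
{
    char *ret = dst;

    assert(dst != NULL && src != NULL);
    if (dst <= src) {
        /* toward lower addresses: copy and test for NUL as we go */
        while ((*dst++ = *src++) != '\0')
            ;
    } else {
        /* toward higher addresses: the length is needed up front */
        memmove(dst, src, strlen(src) + 1);
    }
    return ret;
}
```

Only the upward branch pays for a strlen() pass; the downward branch
makes a single pass over the source.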

Anyway, I give in.  From now on I'll stick with my own code, so that I
can be _sure_ what it does.

[PS: is it really so vital to wring the very last microsecond out of
 strcpy?  I once went through a program changing things like
	sprintf(buffer, "foo%s", X);
 to	strcpy(buffer, "foo"), strcpy(buffer+3, X);
 and it didn't make any appreciable difference.  Letting the implementor
 optimise the whatever out of strcpy() while not requiring that 1.0+1.0
 be a good approximation to 2.0 doesn't seem like quite the right balance.]

djones@megatest.UUCP (Dave Jones) (04/05/88)

in article <4264@ihlpf.ATT.COM>, nevin1@ihlpf.ATT.COM (00704a-Liber) says:
> 
> In article <425@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>>Geez.  Enough already!! 
> 
>>Everybody knows what strcpy does:
> 
>>void strcpy(str1, str2)
>>  char* str1;
>>  char* str2;
>>{
>>  while(*str1++ = *str2++) {;} 
>>}
> 
> First off, you got it slightly wrong (where is your return value?).

  Okay.  You win round one. It's supposed to return a char*, not void.
  Says so in the documentation. K&R is wrong.  Gad!  Is nothing sacred?

> Secondly, many implementations of C convert strcpy() into inline
> assembly code.

  So?  If they get it right, that's fine with me.

>  It is conceivable that there may be hardware move
> instructions which will copy in a right-to-left order.

  Then you better not use that conceivable hardware move! It doesn't do
  the right thing.  Besides, how is that right-to-left instruction
  going to find the terminating null character?
  
  I cut this directly out of the on-line UNIX documentation:      

	strcpy copies string s2 to s1, stopping after the null character
	has been copied.

  If you expect the jury to believe that means anything other than
  a "left-to-right" copy, you better have a darn good lawyer.

  "Your Honor, the copy stops after the null character has been copied.
  STOPS.  Nothing is copied after the null character."

  "But Your Honor, it stops after EVERY character is copied.  It
  doesn't say it stops IMMEDIATELY after the null character is copied.
  They just phrased it that way to trick you."

  "Your Honor, that is very silly."

> Since 'good' C
> programs should not be depending on the implementation of
> strcpy() anyway, why should the implementation of it be restricted??
>

Because not all C programs are 'good' ones.  The ones you and I write are,
of course.  But there's all those other programs out there, just waiting
to rear their shrouded strcpys in agony.  I get bored chasing down bugs
in old brittle code.  I'd rather be at the beach.
 
>>If I remember correctly, it says as much right in K&R.  I don't
>>think you want to break K&R without darn good reason.
> 
> K&R gives *possible* ways of implementing strcpy() in C (see page 100 in
> the first edition).  These are not entirely correct (they do not include
> the return value), nor are they all-inclusive.  They are merely there as
> examples so that someone reading the book can understand how to implement
> a function like strcpy().
> 

Probably they were meant to be only example implementations, but
I'll guess many have taken it quite literally, and programmed accordingly.

> BTW, I wonder if the people who are saying that C won't be used on a
> multiprocessing machine are the same people who used to say that Unix will
> never be implemented on a Cray??  :-) :-)
> 

Yep.  Same lot.  Bunch of Fortran geeks.  Just ignore them.

>>If you want a function that does something else, give it another name.
>>"strcpy" is already taken.
> 
> It seems that you, not I, want a function to do something other than what
> strcpy() is *guaranteed* to do now.

I never said I want the function to do what it is *guaranteed* to do now.
I want it to do what it *does* now.



If you want to attempt a counterrebuttal (and I don't recommend it), 
I won't *string*it*out*.  (Urk.)  I'll let you have the last word.  It's 
not really an earth-shaking matter, is it?




		Dave (Break it, you bought it) Jones

karl@haddock.ISC.COM (Karl Heuer) (04/07/88)

In article <7007@ki4pv.uucp> tanner@ki4pv.uucp (Dr. T. Andrews) writes:
>The real net effect of the X3J11 "improvement" of strcpy() definitions is
>likely to be that folks need to write their own version in order to be sure
>that something useful is done.

Fine with me.  If people use strcpy() only for non-overlapping areas, and roll
their own when they want to modify a string in place, then at least I can tell
the two apart when I read the code.

>A hundred programmers, each dreaming up his own name for strcpy() ...

Is this any worse than, say, everybody dreaming up his own name for "bool"%?
I suspect this is more common than in-place string shifting.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
%among those of us who don't like to overload "int" for this.

swarbric@tramp.Colorado.EDU (Frank Swarbrick) (04/08/88)

In article <3365@haddock.ISC.COM> karl@haddock.ISC.COM (Karl Heuer) writes:
>Is this any worse than, say, everybody dreaming up his own name for "bool"%?
[...]
>%among those of us who don't like to overload "int" for this.

Gee, and I don't even define it as an int.  I use

typedef enum {false,true=!false} bool;

I don't suppose that there's any real value in doing this, but oh well...

Frank Swarbrick (and his cat)
swarbric@tramp.UUCP               swarbric@tramp.Colorado.EDU
...!{ncar|nbires}!boulder!tramp!swarbric
"Timothy Leary is dead..."

nw@amdahl.uts.amdahl.com (Neal Weidenhofer) (04/08/88)

In article <137@polygen.UUCP>, pablo@polygen.uucp (Pablo Halpern) writes:
>        To avoid incorrect inferences from readers, the entry should
> be revised to just say that the copy INCLUDES the NUL terminator.
> 
> Pablo Halpern		|	mit-eddie \
> Polygen Corp.		|	princeton  \ !polygen!pablo  (UUCP)
> 200 Fifth Ave.		|	bu-cs      /
> Waltham, MA 02254	|	stellar   /

In dpANS Section 4.11.2.3 the description is:

The |strcpy| function copies the string pointed to by |s2|
(including the terminating null character) into the array
pointed to by |s1|.  If copying takes place between objects that
overlap, the behavior is undefined.

The opinions expressed above are mine (but I'm willing to share.)

Sometimes I live	Regards,
	in the country		Neal Weidenhofer
Sometimes I live		...{hplabs|ihnp4|ames|decwrl}!amdahl!nw
	in town			Amdahl Corporation
Sometimes I take		1250 E. Arques Ave. (M/S 316)
	a great notion		P. O. Box 3470
To jump in the river		Sunnyvale, CA 94088-3470
	and drown		(408)737-5007

flaps@dgp.toronto.edu (Alan J Rosenthal) (04/09/88)

david@dhw68k.cts.com (David H. Wolfskill) writes:
 >Suppose...
  [
    strcpy's order were implementation-defined, and this implementation
    defined it as being left-to-right.
  ]
 >Then, an algorithm to clear a given
 >string (str1) to a given value (other than NUL) could be coded:
 >
 >	*str1 = ch;
 >	for (c1 = str1; *++c1 != '\0'; *c1 = *(c1 -1));
 >
 >or (remembering the characteristics of the implementation):
 >
 >	*str1 = ch;
 >	strcpy(str1+1, str1);
 >
 >but I think the latter is easier to comprehend.


Gosh, I find these both really complicated.  (I must say however that
the most complicated part of the first example is the fact that the
_for_ body is placed inside the control structure, and the increment
inside the test!)

Why not do the simple:

    for(p = str1; *p; p++)    /* optionally insert "!= '\0'" */
	*p = ch;

ajr

-- 
"Comment, Spock?"
"Very bad poetry, Captain."

mouse@mcgill-vision.UUCP (der Mouse) (04/12/88)

In article <848@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes:
> In article <10895@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
>> In article <4295@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>>> And so far nobody has named a compiler/library/OS/environment that
>>> *doesn't* just copy left to right.
>> To be fair, I *do* know of one:  The 4.3BSD Vax strcpy() uses the
>> Vax locc and movc3 instructions.  movc3 moves in whichever direction
>> is nondestructive.

This is not quite true.  I just looked at it, and the 4.3 VAX strcpy()
does use locc and movc3.  However, this doesn't imply that the strcpy()
operation is done whichever way is nondestructive.  Why is this?
Because the string may be longer than 64k.  The code loops, from left
to right, doing 64k-1 chunks until it has it whittled down to less than
64k.  Thus, the code works right for non-overlapping operands and for
cases where left-to-right would work.  The other sort of overlap will
work non-destructively for lengths up to 64k-1, and above that will do
replication with a stride of 64k-1.

> This may not be such a wonderful idea:  according to the DEC manuals,
> some VAX models do not implement the locc instruction.

Primarily the MicroVAX-II (and possibly the -I as well).

> (The machine will trap to some sort of library which emulates the
> missing instructions.)

It just traps through a specific vector in the SCB, much like a device
interrupt or an exception.

> Getting this right for strings longer than 2^16-a few characters must
> be a nightmare:  both locc and movc3 have a 16-bit length operand.

This is whence the looping I mentioned above.

> (This has never made sense to me.)

The 16-bit limitation on the string instructions?  Yeah, me either.
Anybody from DEC care to explain what this silly restriction is doing
there?

					der Mouse

			uucp: mouse@mcgill-vision.uucp
			arpa: mouse@larry.mcrcim.mcgill.edu

ray@micomvax.UUCP (Ray Dunn) (04/13/88)

In article <1988Mar31.183321.4740@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
>Regarding:
>> > If a machine has a fast instruction that
>> > does a search for a byte and a fast block move instruction, it would
>> > probably be best for strcpy to be written....

Hmm.  I fell foul of this practice in the Microsoft 4.0 string library
routines fairly recently.

strchr (at least, probably others) unbeknownst to me does this to determine
the length of the string prior to doing the character search.

....Now, to optimize my file operations, I used a 32K buffer, and was using
strchr to find the line endings!  Can you say S.L.O.W.

I wonder how much software is out there just now running s.l.o.w.l.y
because of this practice?
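
When the buffer length is already known, one way to avoid a hidden
length scan on every call is to search with memchr() and an explicit
remaining count.  A sketch, assuming newline-terminated lines;
count_lines is a hypothetical helper, and nothing here describes the
internals of the library mentioned above beyond what is reported.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: count '\n' bytes in buf[0..len) in one pass. */
size_t count_lines(const char *buf, size_t len)
{
    const char *p = buf;
    const char *end = buf + len;
    size_t lines = 0;

    assert(buf != NULL);
    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));

        if (nl == NULL)
            break;
        lines++;
        p = nl + 1;     /* resume just past this newline */
    }
    return lines;
}
```

Each byte of the buffer is examined once, however many lines it holds.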

Ray Dunn.  ..{philabs,mnetor}!micomvax!ray

karl@haddock.ima.isc.com (Karl Heuer) (06/13/89)

In article <4400001@tdpvax> scott@tdpvax.UUCP writes:
>The second question deals with strcpy().  Is it like memcpy in that if the
>arguments' memory overlaps the behavior is undefined, or is it different?  Are
>pre-ANSI and ANSI different on this?

In both pre-ANSI and ANSI, strcpy() has the same disclaimer as memcpy().
If you want to copy overlapping strings, you should probably use
	memmove(dest, src, strlen(src));
since memmove() does have predictable behavior on overlap.

(In all known implementations, strcpy() happens to do the right thing for ONE
direction of copy, but this has never been guaranteed, and I wouldn't try to
rely on it.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

bright@Data-IO.COM (Walter Bright) (06/17/89)

In article <13674@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
<If you want to copy overlapping strings, you should probably use
<	memmove(dest, src, strlen(src));
<since memmove() does have predictable behavior on overlap.

I get involved with helping people debug C code from time to time, and the
bug in the above code occurs frequently. I.e., the line should be:
	memmove(dest, src, strlen(src) + 1);
I'm pointing this out because it's such a common bug that it's one of the
things I routinely look for. Remember:
	static char src[] = "abc";
	sizeof(src) == 4
	strlen(src) == 3
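
The difference is easy to see with a destination that is not already
null-padded.  A small sketch, with hypothetical helper names
copy_without_nul and copy_with_nul wrapping the two calls:

```c
#include <assert.h>
#include <string.h>

/* The buggy form: moves the characters but not the terminating NUL. */
void copy_without_nul(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);
    memmove(dest, src, strlen(src));
}

/* The corrected form: the +1 carries the terminating NUL along. */
void copy_with_nul(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);
    memmove(dest, src, strlen(src) + 1);
}
```

Starting from char dest[8] = "XXXXXXX" and copying "abc", the first
leaves "abcXXXX" in dest and the second leaves "abc".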

karl@haddock.ima.isc.com (Karl Heuer) (06/20/89)

In article <2013@dataio.Data-IO.COM> bright@dataio.Data-IO.COM (Walter Bright) writes:
>In article <13674@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>	memmove(dest, src, strlen(src));
>
>	memmove(dest, src, strlen(src) + 1);

(I die.  My replacement reads my uncommented code and deletes a fragment he
doesn't understand.  Eventually the subroutine is sold to the government, and
the bug causes nuclear missiles to be launched by accident.  The other side
retaliates, and all die.  O the embarrassment.)%

This is correct, of course; it only works to stop after strlen() if the
receiving buffer is known to have a null character in the appropriate spot.
Even if this were known information (e.g. if we're right-shifting a string in
a null-padded buffer), it's cheaper to use the +1 than to document why it
isn't necessary.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
________
% First person to identify the reference wins a defunct root password.