chris@mimsy.UUCP (Chris Torek) (04/09/88)
In article <4343@ihlpf.ATT.COM> nevin1@ihlpf.ATT.COM (00704a-Liber) writes:
>You are not defining *what* the function does (ie, you are not making an
>abstract *description* of the function); you are defining *how* the
>function does a strcpy (ie, how it is supposed to be *implemented*).

Nope. I can define what strcpy should do without saying how it should
do it:

    char *strcpy(char *dst, char *src);

0. copies string `src' to `dst'. `src' and `dst' shall not overlap.

or

1. copies string `src' to `dst'; if src and dst overlap, the result
   is implementation-defined.

or

2. copies string `src' to `dst' nondestructively.

or

3. copies string `src' to `dst' such that the copy is nondestructive
   when src and dst are distinct or when src < dst.

or

4. copies string `src' to `dst'. By the time strcpy returns, the
   result is as if the copy were done using the following code:

       while ((*dst++ = *src++) != 0) /*void*/;

>There is no 'such that' part in the specification of strcpy().

In *whose* specification?

>Strcpy(), according to the man page, INCLUDING THE WARNING

What warning? There is no warning in my string(3). Maybe the warning
in yours is a bug :-) .

>You are saying that overlapping does *not* yield surprises, which is a direct
>contradiction with the specification.

WHAT specification? The dpANS uses something like number 1 above; I
have been saying that it may be best for it to use any of 2, 3, or 4.
V7 Unix uses none of the above.

As general design principles, let me offer these statements:

 - provide as few primitives as you can get away with;
 - make them as general as possible.

Moving a string within a single buffer is a reasonable thing to want to
do; if it is cheap enough to do that with the same primitive that moves
strings from one buffer to another, I would say it should be done. I
think having separate `memcpy' and `memmove' routines is a mistake,
just as I think having multiple kinds of files (blocked, unblocked,
random, sequential, ...) in an O/S is a mistake.

If you must add a feature, or a restriction, or a new routine, make
sure it carries its weight (as dmr put it). I think allowing
overlapping strings in strcpy carries its weight better than does
asking people to use memmove(dst, src, strlen(src)).
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
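
Definition 4 in the post above can be read as a complete function. The following is only a sketch of that reading: the name `ref_strcpy` is hypothetical, the `const` qualifier and the `assert` are additions not found in the post, and nothing here claims to be how any particular library implements strcpy. Because the loop reads each byte before it writes to a lower address, it happens to give the "remove a prefix" result when dst points earlier in the same buffer than src.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name: a literal rendering of definition 4's loop. */
char *ref_strcpy(char *dst, const char *src)
{
    char *ret = dst;            /* strcpy returns the original dst */

    assert(dst != NULL && src != NULL);

    /* Copy bytes in increasing address order, including the NUL. */
    while ((*dst++ = *src++) != 0)
        /* void */;

    return ret;
}
```

With a buffer holding "/tmp", calling `ref_strcpy(buf, buf + 1)` leaves "tmp" in buf. The opposite overlap (dst inside the source string, after src) is not safe with this loop, since the terminating NUL is overwritten before it is read.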
nevin1@ihlpf.ATT.COM (00704a-Liber) (04/12/88)
In article <10987@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <4343@ihlpf.ATT.COM> nevin1@ihlpf.ATT.COM (00704a-Liber) writes:
>>You are not defining *what* the function does (ie, you are not making an
>>abstract *description* of the function); you are defining *how* the
>>function does a strcpy (ie, how it is supposed to be *implemented*).

>Nope. I can define what strcpy should do without saying how it should
>do it:

>    char *strcpy(char *dst, char *src);

>0. copies string `src' to `dst'. `src' and `dst' shall not overlap.

As my man page states.

>1. copies string `src' to `dst'; if src and dst overlap, the result
>   is implementation-defined.

As a few people on the net want it stated.

>2. copies string `src' to `dst' nondestructively.

This one can never be right, since some types of overlap are destructive.

>3. copies string `src' to `dst' such that the copy is nondestructive
>   when src and dst are distinct or when src < dst.

By 'src < dst' do you mean 'strlen(src) < sizeof(dst) / sizeof(char)', or
do you mean that the addresses should just be subtracted?? (Assuming you
are talking about the length(dst) instead of address dst, I'm not sure
what you mean by length(dst). It can't be strlen(dst), since this is
meaningless for a newly malloc()ed block.)

Since you are (theoretically, anyway) trying to define a standard, please be
more precise with your terms. That's what got us into this trouble in the
first place! :-)

>4. copies string `src' to `dst'. By the time strcpy returns, the
>   result is as if the copy were done using the following code:
>
>       while ((*dst++ = *src++) != 0) /*void*/;

Sorry, but this IS defining it in terms of an implementation! If you were
to define it in terms of what the properties of your 'while' statement are,
then I would be satisfied that your definition is implementation-free.

Just what are the properties of implementing strcpy() as the while loop you
stated above?
Here is the list I came up with:

Case I: strlen(src) < abs(src - dst)

    This is the non-destructive strcpy() that we all know and love. :-)

Case II: 0 < dst - src <= strlen(src)

    This is an infinite loop which trashes memory starting at location dst.

Case III: src == dst

    Nothing happens except for a few lost CPU cycles.

Case IV: 0 < src - dst <= strlen(src)

    This is the DESTRUCTIVE strcpy(). When done, the array src[] is the
    string which was formerly pointed to by 2 * src - dst.

Now, ask yourself a question. Wouldn't it be nice to tell the difference
between someone using strcpy() for a non-destructive use instead of using
strcpy() for this VERY SPECIALIZED destructive use (as given in Case IV)??
Personally, there are very few times that I need or want to change the src
string AND the dst string in this manner at the same time (none that I can
think of). And those few times that I need to do both of these at the same
time I would rather call a different function, for the sake of readability
and maintainability. But, if this is added to the Standard, then strcpy()
should always be used when Case IV destructive copies are needed as well as
when Case I non-destructive copies are needed.

From what I understand, a degenerate of Case IV is currently being relied
upon (ie, destructive copies where the programmer doesn't care about what
happens to the src string). If this were added to the Standard, then the
whole of Case IV would start being relied upon, and this would just lead to
horrible programming styles!!

>As general design principles, let me offer these statements:

> - provide as few primitives as you can get away with;
> - make them as general as possible.

I agree. However, it is worth adding a not-so-general primitive if it will
be used a lot and/or its efficiency can be significantly improved (such as
having printf() as well as the more general fprintf()).

>I think allowing overlapping strings in
>strcpy carries its weight better than does asking people to use
>memmove(dst, src, strlen(src)).

But you don't allow all types of overlapping strings with your primitive;
only a very special subset of overlapping strings (where src >= dst).
And by adding this to the Standard, you also allow an abuse of strcpy()
when it is used specifically to modify the src string.
--
_ __ NEVIN J. LIBER ..!ihnp4!ihlpf!nevin1 (312) 510-6194
' ) ) "The secret compartment of my ring I fill
/ / _ , __o ____ with an Underdog super-energy pill."
/ (_</_\/ <__/ / <_ These are solely MY opinions, not AT&T's, blah blah blah
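
The four cases listed in the post above can be told apart by comparing the two pointers with the length of the source string. The sketch below is illustrative only: `classify_overlap` and the `overlap_case` enumerators are hypothetical names, and the code assumes dst and src point into the same array, since comparing or subtracting pointers into unrelated objects is not meaningful in C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for Cases I to IV as listed in the post. */
enum overlap_case { CASE_I = 1, CASE_II, CASE_III, CASE_IV };

/* Assumption: dst and src point into the same array. */
enum overlap_case classify_overlap(const char *dst, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);

    if (src == dst)
        return CASE_III;                    /* copy onto itself */

    if (dst > src) {
        size_t d = (size_t)(dst - src);
        /* dst lands inside the source string (or on its NUL):
         * the forward loop overwrites the NUL before reading it. */
        return d <= len ? CASE_II : CASE_I;
    } else {
        size_t d = (size_t)(src - dst);
        /* src starts inside the region about to be written:
         * the forward loop works but rewrites the source. */
        return d <= len ? CASE_IV : CASE_I;
    }
}
```

For a buffer holding "hello", `classify_overlap(buf, buf + 1)` falls under Case IV, which is the leading-character removal discussed in the thread, while `classify_overlap(buf + 2, buf)` falls under Case II.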
chris@mimsy.UUCP (Chris Torek) (04/12/88)
I am not going to respond to the whole thing, for I am getting quite
sick of this debate. `>>' below is mine, `>' is Nevin:

In article <4383@ihlpf.ATT.COM> nevin1@ihlpf.ATT.COM (00704a-Liber) writes:
>>[optional def. 2] copies string `src' to `dst' nondestructively.

>This one can never be right, since some types of overlap are destructive.

It is `right' in the same sense that it is `right' to describe
memmove(src, dst, len) as `nondestructive'.

>>3. copies string `src' to `dst' such that the copy is nondestructive
>>   when src and dst are distinct or when src < dst.

>By 'src < dst' do you mean 'strlen(src) < sizeof(dst) / sizeof(char)', or
>do you mean that the addresses should just be subtracted??

If `src' and `dst' point to different objects, they cannot overlap, and
there is no question as to interference. If `src' and `dst' *do* point
to places within the same object---e.g.,

    char buf[1000];
    src = &buf[0];
    dst = &buf[500];

---then the two pointers can be meaningfully subtracted, so the
condition is simply `src < dst'. In other words, yes, the pointers
should simply be subtracted, as long as it is meaningful to do so.

>Since you are (theoretically, anyway) trying to define a standard, please be
>more precise with your terms.

I might if I thought anything might come of this.

>>4. copies string `src' to `dst'. By the time strcpy returns, the
>>   result is as if the copy were done using the following code:
>>
>>       while ((*dst++ = *src++) != 0) /*void*/;

>Sorry, but this IS defining it in terms of an implementation!

Yes. You may have noticed that I was working from least to most
concrete [possibly with definitions 0 & 1 reversed]. The absolute most
concrete definition is to say `this library function shall be
implemented by the following C code'---that pins the semantics of the
routine down as firmly as they may ever be pinned (providing, of
course, that you have already defined the actions of the various
statements).
I was merely trying to show (with the `overlap' definitions 2, 3) that
one can be less concrete and still make claims about overlapping copies.

>If you were to define it in terms of what the properties of your
>'while' statement are, then I would be satisfied that your definition
>is implementation-free.

The properties of the `while' statement were defined back in section
three. Why should I repeat them? But this *is* an implementational
specification, although there is an escape clause (the `as if' rule).

>From what I understand, a degenerate of Case IV is currently being relied
>upon (ie, destructive copies where the programmer doesn't care about what
>happens to the src string).

Yes.

>If this were added to the Standard, then the whole of Case IV would
>start being relied upon,

Quite possibly.

>and this would just lead to horrible programming styles!!

That remains to be demonstrated.

>>I think allowing overlapping strings in
>>strcpy carries its weight better than does asking people to use
>>memmove(dst, src, strlen(src)).

>But you don't allow all types of overlapping strings with your primitive;

[which definition? I think he means 3]

>only a very special subset of overlapping strings (where src >= dst).
>And by adding this to the Standard, you also allow an abuse of strcpy()
>when it is used specifically to modify the src string.

[now he probably means 4, but more generally:]

What makes this an abuse, beyond the fact that right now *your* manual
entry (but not mine!) says so? Why is this *inherently* wrong?

The facts:

 - There exists code now that looks like this:

       char buf[SIZE];
       ...
       if (buf[0] == '/') /* remove the leading slash */
           (void) strcpy(buf, buf + 1);

 - According to the current draft, this operation is `implementation
   defined'.

 - There are no known implementations in which this does anything other
   than what the comment in the above code suggests.

 - Hence, making this particular action well-defined would affect no
   known implementations, but would make the code above portable.

Opinions:

 - The intended action of that code is a reasonable thing to want.
   (`char *bp; bp = buf[0]=='/' ? buf+1 : buf' will usually be faster
   and cleaner, but may be contraindicated for some other reason.)

 - Defining strcpy such that this operation is well-defined is a
   reasonable thing to do.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
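
For readers who want the leading-slash removal without depending on how strcpy treats overlap, the memmove route mentioned in the thread can be sketched as follows. `strip_leading_slash` is a hypothetical helper name, not something from the posts. One detail matters: the byte count handed to memmove has to include the terminating NUL, so it is the length of the tail plus one, which for this case equals strlen(buf).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: removes one leading '/' from buf, in place. */
void strip_leading_slash(char *buf)
{
    assert(buf != NULL);

    if (buf[0] == '/') {
        /* Tail length is strlen(buf + 1); adding 1 for the NUL gives
         * strlen(buf). memmove is specified to handle overlap. */
        memmove(buf, buf + 1, strlen(buf));
    }
}
```

A buffer holding "/usr/lib" ends up holding "usr/lib"; a buffer without a leading slash is left untouched. The pointer-adjusting alternative quoted in the post avoids the copy entirely when the caller does not need the string to start at buf.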
ok@quintus.UUCP (Richard A. O'Keefe) (04/13/88)
In article <4383@ihlpf.ATT.COM>, nevin1@ihlpf.ATT.COM (00704a-Liber) writes:
> As my man page states.
> >1. copies string `src' to `dst'; if src and dst overlap, the result
> >   is implementation-defined.

I find that the 2nd edition of the System V Interface Definition says that

    Character movement is performed differently in different
    implementations. Thus overlapping moves may yield surprises.

Since I generally take the SVID to be the official definition of things,
it follows that I was wrong to rely on strcpy(s, s+1) and should now use
my own C code in such cases. Sigh.

It would be nice to have a definition of what precisely IS an
"overlapping move": I have encountered machines where the critical area
was the range of _words_ including a sequence, not the bytes alone.

Nevin Liber objected to Chris Torek's attempt to define strcpy() by
exhibiting C code for it. It may be naive of me, but since the rest of
the standard is supposed to define the constructs Chris Torek used,
giving a definition by means of C code seems to me to be the ideal way
of defining such operations. I have no reason to expect the dpANS
drafters to be any better at writing English definitions of things than
Chris Torek is at writing C code, and you can at least _test_ the C code
to see if it does what you intended. From a user's point of view, having
something defined so clearly and unambiguously seems like a good idea.

C code is appropriate for defining some things (notably the "string"
operations) and not appropriate for others (notably the floating-point
library functions). I think there are two reasons for this:

(a) The "string" operations only need primitive operations which the rest of the standard is supposed to define thoroughly, but cos() and so on depend on floating-point arithmetic which the standard leaves rather vague and (necessarily) rather implementation-dependent.
(b) The "string" operations belong to C, so the C community can define them however they please, but cos() and so on already have other definitions, so can't be bent to suit C's convenience.