bobc@tektools.UUCP (Bob Crane) (08/09/85)
I was looking through the book, _The C Programming Language_, and came across something very disturbing. In chapter 5, pg. 101 it says:

    As the final abbreviation, we again observe that a comparison
    against \0 is redundant, so the function is often written as

        strcpy(s, t)    /* copy t to s; pointer version 3 */
        char *s, *t;
        {
            while (*s++ = *t++)
                ;
        }

    Although this may seem cryptic at first sight, the notational
    convenience is considerable, and the idiom should be mastered,
    if for no other reason than that you will see it frequently in
    C programs.

Yeaacch!!!!!! It was still very cryptic to me the tenth time that I read it!!! A friend explained it to me by saying that the character in the 'while' expression is converted to an int and that the NULL character has an ascii value of 0 so the test will exit when the NULL character is encountered.

I have trouble believing that the above has advantages of great speed OR readability over:

    strcpy(s,t)     /* copy t to s; pointer version 2 */
    char *s, *t;
    {
        while ((*s++ = *t++) != '\0')
            ;
    }

Does anyone out there support the author by saying that Version 3 of 'strcpy' is better than Version 2?

Bob Crane
!tektronix!tektools!bobc
(503)627-5379
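For readers who want to try the two versions side by side, here is a minimal sketch. It assumes a modern compiler, so the K&R-style parameter declarations are replaced with prototypes, and the hypothetical names copy_v2 and copy_v3 are used to avoid clashing with the library strcpy. The return value is also an addition for convenience and is not part of the book's code.

```c
/* Version 2 from the post: explicit comparison against '\0'. */
char *copy_v2(char *s, const char *t)
{
    char *start = s;

    while ((*s++ = *t++) != '\0')
        ;                       /* all the work happens in the test */
    return start;
}

/* Version 3 from the post: the value just assigned is the loop test. */
char *copy_v3(char *s, const char *t)
{
    char *start = s;

    /* The extra parentheses only tell the compiler that the
       assignment is intentional; they do not change the meaning. */
    while ((*s++ = *t++))
        ;
    return start;
}
```

Both loops store the terminating '\0' before they stop, because the assignment happens first and its result is what gets tested. The caller is responsible for making the destination buffer large enough.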
gwyn@BRL.ARPA (VLD/VMB) (08/11/85)
First, the (char) in the while(expression) is NOT converted to an (int) in case 3; it is tested against zero directly. In case 2 it is converted to (int) for the comparison against '\0'.

I think case 2 is certainly more readable, but as the book says, you need to learn to read things like case 3 since a lot of code is like that. More usually one will see something like

    char *s;
    ...
    while ( *s++ )
        ...

This really is a standard C idiom, although I don't recommend writing code that way. I personally prefer to distinguish between Boolean expressions (such as comparisons) and arithmetic expressions, using strictly Boolean expressions as conditions. Thus:

    while ( *s++ != '\0' )

or even

    while ( (int)*s++ != '\0' )

The typecast is perhaps overly fussy; it is not required by the language rules and may detract from readability.

Tests for NULL pointers and flags often are written

    if ( p )
        ...
    if ( flag & BIT )
        ...

rather than

    if ( p != NULL )
        ...
    if ( (flag & BIT) != 0 )
        ...

(I prefer the latter.) Get used to it..
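The two styles described in this post can be compared directly with a small sketch. The function names and the value chosen for BIT are hypothetical, picked only for illustration; the loop and test forms themselves are the ones quoted above.

```c
#include <stddef.h>

#define BIT 0x04u   /* hypothetical flag bit, chosen only for this sketch */

/* Terse idiom: the character itself is the condition. */
size_t len_terse(const char *s)
{
    const char *p = s;

    while (*p++)
        ;
    return (size_t)(p - s - 1);     /* p stepped one past the '\0' */
}

/* Explicit style: a comparison is the condition. */
size_t len_explicit(const char *s)
{
    const char *p = s;

    while (*p++ != '\0')
        ;
    return (size_t)(p - s - 1);
}

/* Flag test, terse form. */
int bit_set_terse(unsigned flag)
{
    if (flag & BIT)
        return 1;
    return 0;
}

/* Flag test, explicit form. */
int bit_set_explicit(unsigned flag)
{
    if ((flag & BIT) != 0)
        return 1;
    return 0;
}
```

One practical note on the explicit flag test: the inner parentheses are required. Because != binds more tightly than &, writing flag & BIT != 0 would parse as flag & (BIT != 0), which tests the lowest bit instead of BIT.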
nather@utastro.UUCP (Ed Nather) (08/11/85)
> Does anyone out there support the author by saying that Version 3 of
> 'strcpy' is better than Version 2?
> Bob Crane

No.

--
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather%utastro.UTEXAS@ut-sally.ARPA
ark@alice.UUCP (Andrew Koenig) (08/12/85)
    strcpy(s, t)    /* copy t to s; pointer version 3 */
    char *s, *t;
    {
        while (*s++ = *t++)
            ;
    }

    strcpy(s,t)     /* copy t to s; pointer version 2 */
    char *s, *t;
    {
        while ((*s++ = *t++) != '\0')
            ;
    }

> Does anyone out there support the author by saying that Version 3 of
> 'strcpy' is better than Version 2?

Yes. In version 3, I am saying that the character that terminates a string is the same character that is the implicit subject of an unstated comparison in a `while' statement. In version 2, the string terminator is an explicitly stated constant. Viewed that way, the two versions are equivalent only by coincidence.
darryl@ISM780.UUCP (08/12/85)
> Although this may seem cryptic at first sight, the notational
> convenience is considerable, and the idiom should be mastered,
> if for no other reason than that you will see it frequently in
> C programs.
>
>I have trouble believing that the above has advantages of great
>speed OR readability over:

Note that K&R didn't say that the terse form had speed or readability advantages; their comment was that the lack of keystrokes overrode other considerations, once you got used to it. They were writing code using the ed editor on 110 or 300 baud terminals; anything that cut down the number of keystrokes was a big win.

If you don't like the popular idioms in C, no one says (well, at least, I don't) you have to use them. But you'd better get used to them, 'cause you'll see them a lot. Henry Spencer aside, there does not seem to be a great force in the C community to throw these idioms out of the language or out of common use. I suggest that they are here to stay; if you don't like them, you're going to be lumping them for a long time to come.

--Darryl Richman, INTERACTIVE Systems Corp.
...!cca!ima!ism780!darryl
The views expressed above are my opinions only.
haahr@siemens.UUCP (08/12/85)
Relevant code: (Kernighan & Ritchie, chapter 5, page 105)

    strcpy(s, t)    /* copy t to s; pointer version 3 */
    char *s, *t;
    {
        while (*s++ = *t++)
            ;
    }

Bob Crane (tektools!bobc) writes:
> ... [text from K&R, everyone owns it, no point quoting again] ...
>
> Yeaacch!!!!!! It was still very cryptic to me the tenth time that I read
> it!!! A friend explained it to me by saying that the character in the
> 'while' expression is converted to an int and that the NULL character has
> an ascii value of 0 so the test will exit when the NULL character is
> encountered.
>
> I have trouble believing that the above has advantages of great
> speed OR readability over:
>
> strcpy(s,t) /* copy t to s; pointer version 2 */
> char *s, *t;
> {
>     while ((*s++ = *t++) != '\0')
>         ;
> }
>
> Does anyone out there support the author by saying that Version 3 of
> 'strcpy' is better than Version 2?

I do. Why? Read on.

Doug Gwyn (brl-tgr!gwyn) responds:
> I think case 2 is certainly more readable, but as the book says, you
> need to learn to read things like case 3 since a lot of code is like
> that. More usually one will see something like
>     char *s;
>     ...
>     while ( *s++ )
>         ...
> This really is a standard C idiom, although I don't recommend writing
> code that way. I personally prefer to distinguish between Boolean
> expressions (such as comparisons) and arithmetic expressions, using
> strictly Boolean expressions as conditions. Thus:
>     while ( *s++ != '\0' )
>
> Tests for NULL pointers and flags often are written
>     if ( p )
>         ...
>     if ( flag & BIT )
>         ...
> rather than
>     if ( p != NULL )
>         ...
>     if ( (flag & BIT) != 0 )
>         ...
> (I prefer the latter.) Get used to it..

The two pieces are different in terms of the abstraction presented.

    while (*s++ != ANYTHING)
        ...

This code looks for some character in a string. In C, the character '\0' is the character after the last character in a string, so when you find that character, you have reached the end of the string.
It is an idiom that we have all gotten used to, knowing to look for '\0'. On the other hand:

    while (*s++)
        ...

loops through a string until the first 'false' character. Now what does falseness for a character mean? A logical (and, in the case of C, correct) interpretation is that we have reached the end of a string.

With the case of (p != NULL) I can understand Doug's argument a little bit better, because NULL is a better abstraction for pointer to nothing (i.e. end of a list) than '\0' is for end of a string. But code like

    while (p->next)
        ...

says "while there is a pointer after p on the list" very clearly.

The (flag & BIT) comparison is also easier for me to understand than the explicit test because it allows me to forget about the low-level bit-twiddling that is going on, and worry about the actual test.

Now, the hard case is the one Bob brought up.

    while (*s++ = *t++)
        ...

looks very much like the

    while (*s++ == *t++)
        ...

one would expect from strcmp or similar functions. I think a comment or something in this case is much more help than the explicitly redundant comparison against zero. This is a matter of personal preference. The reason I wouldn't put the "!= '\0'" in this code is that it doesn't tell you anything, unless you are used to a convention that says something like "thou shalt always compare everything but explicit tests to 0." But putting in a '\0' test won't even make lint complain on the one where it doesn't belong. Again, with the possible confusion brought up because of the = and == operators, maybe one should take special care with tests like this one.

The idea of abstracting a test beyond explicitly testing for zero is nice and C is not the first language to do it. Bjarne Stroustrup recognized this and included in C++ (as part of the general overloading capability), the ability to overload comparisons, and retained the convention that an if is an implicit comparison against zero.
Any class can be the object of an if or while and the appropriate comparison operator is called. A conditional of the form

    while (cin)     // cin is the stream associated with stdin
        ...

will fail when there is no more input on cin. Exactly what one would expect. While

    while (cin.state != eof && cin.state != fail)
        ...

(or whatever it is exactly -- I forget) tells you explicitly what it is doing, it tells you more than you normally need to know.

Because the values that fail tests in C (null pointer, character beyond end of string) are logical and consistent, they provide a nice abstraction beyond worrying about what should be implementation details (i.e. '\0' is the end of a string, NULL is the pointer to nothing).

Paul Haahr
..!princeton!macbeth!haahr
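The = versus == confusion raised in this post is easy to see when the two loops sit next to each other. The sketch below is only an illustration: the names copy_string and common_prefix are hypothetical, and the compare loop carries an extra end-of-string guard that the bare while (*s++ == *t++) fragment above does not have, since without it two equal strings would be read past their terminators.

```c
#include <stddef.h>

/* Assignment in the test: copies t into s, terminator included.
   Returns the number of characters copied, not counting '\0'. */
size_t copy_string(char *s, const char *t)
{
    size_t n = 0;

    while ((*s++ = *t++))       /* = : store, then test what was stored */
        n++;
    return n;
}

/* Comparison in the test: counts how many leading characters match.
   Nothing is modified. */
size_t common_prefix(const char *s, const char *t)
{
    size_t n = 0;

    while (*s != '\0' && *s++ == *t++)  /* == : compare only */
        n++;
    return n;
}
```

The loop bodies and overall shape are nearly identical, which is the point being made: a single missing or extra = turns a comparison into a copy that overwrites its first argument, so a short comment on the assignment form costs little.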
jeq@laidbak.UUCP (Jonathan E. Quist) (08/15/85)
>Relevant code: (Kernighan & Ritchie, chapter 5, page 105)
>
>    strcpy(s, t)    /* copy t to s; pointer version 3 */
>    char *s, *t;
>    {
>        while (*s++ = *t++)
>            ;
>    }
>
>
>Bob Crane (tektools!bobc) writes:
>> ... [text from K&R, everyone owns it, no point quoting again] ...
>>
>> Yeaacch!!!!!! It was still very cryptic to me the tenth time that I read
>> it!!! A friend explained it to me by saying that the character in the
>> 'while' expression is converted to an int and that the NULL character has
>> an ascii value of 0 so the test will exit when the NULL character is
>> encountered.
>>
>> I have trouble believing that the above has advantages of great
>> speed OR readability over:
>>
>> strcpy(s,t) /* copy t to s; pointer version 2 */
>> char *s, *t;
>> {
>>     while ((*s++ = *t++) != '\0')
>>         ;
>> }
>>
>> Does anyone out there support the author by saying that Version 3 of
>> 'strcpy' is better than Version 2?

Personally, I find Version 3 easier to deal with, but then, I learned C after years of assembly language bit-twiddling. Version 3 (and many other "standard" C constructs) happens to be the form I settled on in various assembly language implementations with various microprocessors. In my case, compactness of code and speed were of utmost importance. (In some cases, saving 10 bytes of instructions meant the difference between using a 2K or 4K EPROM.)

In working on things like tty drivers and such, I found forms without the "extra" comparison easier to live with while scanning unfamiliar code, because, though cryptic, they are compact and unambiguous, and after a while (i.e. with experience), I think it is easier to recognize

    (*s++ = *t++)

than to see

    ((*s++ = *t++) != '\0')

and stop to think `what are they comparing?'

I think it's more or less the same as finding it easier to scan through lower case comments (as opposed to all upper case) to find something I wrote "some time back."
I suspect that this has something to do with the fact that there is more size contrast between lower case letters than between the same letters in upper case. This is only my personal theory. I don't mean to flame those who prefer version 2; this is just my own preference.

As to efficiency, that would depend upon the hardware and the cleverness of the compiler. If the machine sets a zero flag when a '\0' is transferred, then a "branch if zero" type of instruction can be immediately executed, without additionally comparing the character to 0. Whether the compiler takes advantage of this is another matter...

Jonathan E. Quist
Lachman Associates, Inc.
ihnp4!laidbak!jeq
``I deny this is a disclaimer.''
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (08/15/85)
Well, yes, but. Abstract objects (such as input data streams) can have more than one interesting predicate. What would testing for the "truth" of such an object mean? Clearly you would have to include a (predicate) selection operation, and that isn't notably different from just writing the boolean expression (predicate) that one has in mind. Just packaged differently.
guy@sun.uucp (Guy Harris) (08/19/85)
> As to efficiency (of "if (*p++ = *q++)" vs "if ((*p++ = *q++) != '\0')"
> - gh), that would depend upon the hardware and
> the cleverness of the compiler. If the machine sets
> a zero flag when a '\0' is transferred, then a
> "branch if zero" type of instruction can be immediately executed,
> without additionally comparing the character to 0.
> Whether the compiler takes advantage of this is another matter...

What?? I dunno about your compilers, but every one I've worked with generates the exact same code for both constructs; some may even convert the first to the second in their internal representation (it's been too long since I've poked inside PCC to remember). It doesn't take much in the way of compiler technology to make the efficiency issue irrelevant in this case.

Guy Harris