edh@ux.acs.umn.edu (Eric "The Mentat Philosopher" Hendrickson) (10/17/90)
Basically, what I want to do is take a string of upper/lower case, and make
it all upper case.  Here is a first try at it,

#include <ctype.h>
main()
{
    char *duh = "Hello";
    printf("%s\n", duh);
    while (*duh <= strlen(duh)) {
        if (islower(*duh)) *duh = toupper(*duh);
        *duh++;
    }
    printf("%s\n", duh);
}

And what I get is :
Hello
Hello

What I want is:
Hello
HELLO

Can anybody point out a good way of doing this?

Thanks much,
Eric Hendrickson
-- 
/----------"Oh carrots are divine, you get a dozen for dime, its maaaagic."--
|Eric (the "Mentat-Philosopher") Hendrickson      Academic Computing Services
|edh@ux.acs.umn.edu   The game is afoot!          University of Minnesota
\-"What does 'masochist' and 'amnesia' mean?   Beats me, I don't remember."--
poser@csli.Stanford.EDU (Bill Poser) (10/17/90)
In article <2466@ux.acs.umn.edu> edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
[believes that there is a problem with toupper and gives code including
the following]
>
>    char *duh = "Hello";
>    printf("%s\n", duh);
>    while (*duh <= strlen(duh)) {
>        if (islower(*duh)) *duh = toupper(*duh);
>        *duh++;
>    }

The problem here is in the while termination condition. What this tests
is whether the numerical value of the current character (*duh) is less than
or equal to the length of the string duh, which happens always to be five.
This condition is never satisfied, so the code in the loop is never executed.

(An aside: since strlen(duh) never changes, either you or the compiler
should move it outside the loop.)

                                        Bill
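To make the failing test concrete, here is a minimal sketch (not from the thread) that evaluates the same comparison on its own for ordinary characters. The helper name original_test_holds is hypothetical, and the character value quoted in the comment assumes ASCII.

```c
#include <string.h>

/* Hypothetical helper: evaluates the original loop test for string s. */
int original_test_holds(const char *s)
{
    /* *s is a character code (72 for 'H' in ASCII), not a position,
     * so for "Hello" this compares 72 against 5. */
    return (size_t)(unsigned char)*s <= strlen(s);
}
```

Called with "Hello", the helper reports that the test fails on the very first character, which is why the loop body never runs.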
bhoughto@cmdnfs.intel.com (Blair P. Houghton) (10/17/90)
In article <2466@ux.acs.umn.edu> edh@ux.acs.umn.edu (Eric D. Hendrickson) writes
>#include <ctype.h>
>main()
>{
>    char *duh = "Hello";
>    printf("%s\n", duh);
>    while (*duh <= strlen(duh)) {

Change `*duh' to `duh'.

>        if (islower(*duh)) *duh = toupper(*duh);
>        *duh++;

Ditto.  Increment the pointer, not the character.

>    }
>    printf("%s\n", duh);

Use a different variable here.  `duh' will now point
to 'O', not 'H', if the loop is entered.

>}
>
>And what I get is :
>Hello
>Hello

Basically, since `*duh' is a character, and a printable one
at that, its value as an integer in (*duh <= strlen(duh)) is
going to be something on the order of 60, while strlen(duh)
is 5.  The loop is skipped because 60 is never <= 5.

                                --Blair
                                  "End of lesson.  No opportunistic
                                   comment on use of the word 'duh.'
                                   ...except maybe indirectly..."
bhoughto@cmdnfs.intel.com (Blair P. Houghton) (10/17/90)
Sorry if anyone saw my earlier posting to this thread;
a) I thought I was mailing it; b) I thought I had hit
'l' for 'list' instead of 's' for 'send': and there
were several things yet to edit.  I went after ^C, but
it was too late... the cancellation should be getting
through any time now...

In article <15857@csli.Stanford.EDU> poser@csli.stanford.edu (Bill Poser) writes:
>In article <2466@ux.acs.umn.edu> edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
>>    char *duh = "Hello";
>>    printf("%s\n", duh);
>>    while (*duh <= strlen(duh)) {
>>        if (islower(*duh)) *duh = toupper(*duh);
>>        *duh++;
>>    }
>>    printf("%s\n",duh)
>
>The problem here is in the while termination condition. What this tests

There's more than just the one problem (that *duh will be > strlen(duh));

0.  *duh refers to the character, not the location;

1.  The loop changes the value of the pointer `duh', so it may print
nothing other than "O" once you get the loop to work;

2.  merely using (duh <= strlen(duh)) won't fix it; the
value of the pointer `duh' is almost certain to be larger
than strlen(duh).

I won't give fixes here; it's too instructive to work them
out yourself, especially at the apparent level of understanding.

>(An aside: since strlen(duh) never changes, either you or the compiler
>should move it outside the loop.)

Trivial optimization compared to the massive bugs extant.

                                --Blair
                                  "I've been saying 'duh'
                                   myself a lot, lately..."
bruce@seismo.gps.caltech.edu (Bruce Worden) (10/17/90)
In article poser@csli.stanford.edu (Bill Poser) writes:
>In article edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
>[believes that there is a problem with toupper and gives code including
>the following]
>>
>>    char *duh = "Hello";
>>    printf("%s\n", duh);
>>    while (*duh <= strlen(duh)) {
>>        if (islower(*duh)) *duh = toupper(*duh);
>>        *duh++;
>>    }
>The problem here is in the while termination condition. [ .... ]
>[ ... ] so the code in the loop is never executed.
>(An aside: since strlen(duh) never changes, either you or the compiler
>should move it outside the loop.)

On the contrary, if this loop actually executed, the value of `strlen(duh)'
would change at every iteration because `duh' is incremented in the loop.
Similarly, in the final statement (deleted above):

    printf("%s\n",duh);

`duh' would point off the end of the string if the loop actually executed.

I sent the original poster this code with an explanation, which people may
comment on as they see fit:

#include <ctype.h>
main() {
    char *duh = "Hello";
    int i, limit = strlen(duh);
    printf("%s\n", duh);
    for(i=0; i<limit; i++) {
        if (islower(duh[i])) duh[i] = toupper(duh[i]);
    }
    printf("%s\n", duh);
}

P.S. Why did I rewrite the `while' loop above as a `for' loop?  I have
found `for' loops to be very efficient (if that is a consideration) and,
as I have said here before, I find subscripted arrays to be clearer and
less error prone than incremented pointers (plus, vectorizing compilers
love finding those iteration variables.)  (Having said that, I hope nobody
finds a bug in my loop.)
--------------------------------------------------------------------------
C. Bruce Worden                            bruce@seismo.gps.caltech.edu
252-21 Seismological Laboratory, Caltech, Pasadena, CA 91125
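The point about strlen(duh) shrinking as the pointer moves can be checked in isolation. The following is a small sketch under my own assumptions, not code from the thread; length_after_advancing is a hypothetical name, and the caller is assumed to keep steps within the string.

```c
#include <string.h>

/* Hypothetical helper: the length strlen() reports after a pointer into s
 * has been advanced `steps` characters.  The caller must ensure that
 * steps <= strlen(s). */
size_t length_after_advancing(const char *s, size_t steps)
{
    const char *p = s + steps;  /* same effect as incrementing p `steps` times */
    return strlen(p);           /* counts from p up to the terminating '\0' */
}
```

For "Hello" the result drops from 5 to 0 as steps goes from 0 to 5, so a limit computed once before the loop is not the same thing as one recomputed from the moving pointer.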
poser@csli.Stanford.EDU (Bill Poser) (10/17/90)
In article <473@inews.intel.com> bhoughto@cmdnfs.intel.com (Blair P. Houghton) writes:
>In article <15857@csli.Stanford.EDU> poser@csli.stanford.edu (Bill Poser) writes:
>>The problem here is in the while termination condition. What this tests
>
>There's more than just the one problem (that *duh will be > strlen(duh));
>
>0.  *duh refers to the character, not the location;

This is one aspect of the problem I pointed out, that comparing the value
of a character to the length of the string is not useful.

>1.  The loop changes the value of the pointer `duh', so it may print
>nothing other than "O" once you get the loop to work;

This will be avoided if an index is used and compared to strlen(duh).
The issue only arises if one compares duh to duh+strlen(duh), in which
case a copy of the pointer must be used.

>2.  merely using (duh <= strlen(duh)) won't fix it; the
>value of the pointer `duh' is almost certain to be larger
>than strlen(duh).

Another aspect of the same problem I pointed out. Why worry about
an obviously wrong "fix" that nobody has suggested? The problem doesn't
have to do with dereferencing - it has to do with confusing pointers
and indices.

>>(An aside: since strlen(duh) never changes, either you or the compiler
>>should move it outside the loop.)
>
>Trivial optimization compared to the massive bugs extant.

Which is why it's an aside.

                                        Bill
salomon@ccu.umanitoba.ca (Dan Salomon) (10/17/90)
In article <2466@ux.acs.umn.edu> edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
>Basically, what I want to do is take a string of upper/lower case, and make
>it all upper case.  Here is a first try at it,
>
>
>#include <ctype.h>
>main()
>{
>    char *duh = "Hello";
>    printf("%s\n", duh);
>    while (*duh <= strlen(duh)) {
>        if (islower(*duh)) *duh = toupper(*duh);
>        *duh++;
>    }
>    printf("%s\n", duh);
>}

There are at least four errors in this code.  Two of them are in your
while statement.  There is no point in repeatedly recomputing the
string length, and no point in comparing either a pointer, or the
character it points to, to that length.  Instead test for the end of the
string by finding the terminating null character.  The other two
errors, incrementing the character instead of the pointer, and trying
to print the string by pointing to its end, were mentioned in earlier
postings.

Try the following version:

#include <ctype.h>
main()
{
    char *duh = "Hello";
    char *cur;
    printf("%s\n", duh);
    cur = duh;
    while (*cur) {
        if (islower(*cur)) *cur = toupper(*cur);
        cur++;
    }
    printf("%s\n", duh);
}

Sometimes it pays to stay in bed on Monday, rather than spending the
rest of the week debugging Monday's code. :-)
-- 

Dan Salomon -- salomon@ccu.UManitoba.CA
               Dept. of Computer Science / University of Manitoba
               Winnipeg, Manitoba  R3T 2N2 / (204) 275-6682
ghoti+@andrew.cmu.edu (Adam Stoller) (10/17/90)
Both the original code posted and that supplied by others seem to
accept the fact that

      char *duh = "Hello";

can be modified.  From what I recall, for your simple test function to
work, you would either have to use:

      char duh[] = "Hello";

or pass/read in a string into either a malloc'ed area or char array --
before being able to modify it.

Of course I could be wrong - but...for my $0.02 function contribution:

#include <stdio.h>      /* printf() and NULL */
#include <ctype.h>

int main()
{
    char duh[] = "Hello";       /* see (1), below */
    char *s = NULL;

    printf("%s\n", duh);
    for (s = duh; *s != '\0'; s++){
        *s = toupper(*s);       /* see (2), below */
    }
    printf("%s\n", duh);
}

(1) some older compilers will require this to be declared static, before
allowing you to use aggregate initialization.

(2) under ANSI you don't need to test for islower() - pre-ANSI requires
the islower() test because many of the macros used to define islower and
toupper were brain-dead

--fish
scc@rlgvax.UUCP (Stephen Carlson) (10/17/90)
In article <2466@ux.acs.umn.edu> edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
>Basically, what I want to do is take a string of upper/lower case, and make
>it all upper case.  Here is a first try at it,
>
>
>#include <ctype.h>
>main()
>{
>    char *duh = "Hello";
>    printf("%s\n", duh);
>    while (*duh <= strlen(duh)) {
>        if (islower(*duh)) *duh = toupper(*duh);
>        *duh++;
>    }
>    printf("%s\n", duh);
>}

Since others have pointed out the problem with the while loop condition, I
would like to point out that with a declaration of

    char *duh = "Hello";

the compiler is free to put this string in read-only memory (text).  Then the
subsequent

    if (...) *duh = toupper(*duh);

will dump core with a segmentation violation (SIGSEGV).  You may have lucked
out since the incorrect loop condition avoids this statement.

I would recommend declaring `duh' as a (static) array and then using a pointer
to do the work on the array:

#include <stdio.h>
#include <ctype.h>

int main()
{
    static char duh[] = "Hello";
    char *p = duh;

    printf("%s\n", duh);
    while (*p) {        /* or (*p != '\0') if that is your style */
        if (islower(*p))
            *p = toupper(*p);
        p++;
    }
    printf("%s\n", duh);
    return 0;
}

Notes:

Declaring a char array and initializing it to a string will copy it to a
writable area.  It might even be more efficient.

On some systems, toupper() is safe to use even if the char is not a lower
case letter.  On other systems, the islower() test is necessary.  ANSI
standardizes this.

The expression `*duh++' will increment the pointer as you want, but it will
do a useless dereference (hence lint's "null effect").  In no case will it
increment the char it points to as others incorrectly state.

By the way, the new program lints (ignoring the frivolous "returns a value
that is always ignored" message) and runs with no problem.
-- 
Stephen Carlson            | ICL OFFICEPOWER Center
scc@rlgvax.opcr.icl.com    | 11490 Commerce Park Drive
..!uunet!rlgvax!scc        | Reston, VA  22091
jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) (10/17/90)
Try this one:

void strupper(char *str)
{
for (;*str!='\0';str++)
    *str=toupper(*str);
}
-----------------
Jeffrey Hutzelman
America Online: JeffreyH11
Internet/BITNET:jh4o+@andrew.cmu.edu, jhutz@drycas.club.cc.cmu.edu,
jh4o@cmuccvma
>> Apple // Forever!!! <<
jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) (10/17/90)
No, the loop will work as advertised. See my previous post for a
function that does it with an incremented pointer.
-----------------
Jeffrey Hutzelman
America Online: JeffreyH11
Internet/BITNET:jh4o+@andrew.cmu.edu, jhutz@drycas.club.cc.cmu.edu,
jh4o@cmuccvma
>> Apple // Forever!!! <<
will@kfw.COM (Will Crowder) (10/17/90)
In article <1990Oct16.221035.10764@nntp-server.caltech.edu> bruce@seismo.gps.caltech.edu (Bruce Worden) writes:
>I sent the original poster this code with an explanation, which people may
>comment on as they see fit:
>
>#include <ctype.h>
>main() {
>    char *duh = "Hello";
>    int i, limit = strlen(duh);
>    printf("%s\n", duh);
>    for(i=0; i<limit; i++) {
>        if (islower(duh[i])) duh[i] = toupper(duh[i]);
>    }
>    printf("%s\n", duh);
>}
>
>P.S. Why did I rewrite the `while' loop above as a `for' loop?  I have
>found `for' loops to be very efficient (if that is a consideration) and,
>as I have said here before, I find subscripted arrays to be clearer and
>less error prone than incremented pointers (plus, vectorizing compilers
>love finding those iteration variables.)  (Having said that, I hope nobody
>finds a bug in my loop.)

Well, I don't immediately see any bugs in the loop.  Agreed that incremented
pointers are less clear than subscripted arrays, but subscripted arrays are
usually more expensive, especially with older compilers.  In this case, in
order to explain the problem to the poster, you have to start talking about
pointer/array equivalence, what duh[i] really means, etc. etc., and he's
obviously not quite ready for that yet.

I sent the poster a heavily commented version of the following, along with
a blanket apology for the ridiculously large number of partially or completely
incorrect answers to his very simple question.

#include <stdio.h>
#include <ctype.h>

main()
{
    char *duh = "Hello";
    char *p;

    printf("%s\n", duh);
    p = duh;
    while (*p != '\0') {
        if (islower(*p))
            *p = toupper(*p);
        p++;
    }
    printf("%s\n", duh);
}

Will
profesor@wpi.WPI.EDU (Matthew E Cross) (10/18/90)
In article <wb76pN600awOE3SaQz@andrew.cmu.edu> jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) writes:
>Try this one:
>
>void strupper(char *str)
>{
>for (;*str!='\0';str++)
>    *str=toupper(*str);
>}

Nope, won't work - the return value of 'toupper' is undefined if the input is
not a lowercase character.

Try:

void strupper(char *str)
{
for (;*str!='\0';str++)
    *str=islower(*str)?toupper(*str):*str;
}

(I hope I got the '? :' syntax right...)
-- 
+----------------------------------------------------+------------------------+
| "The letter U has a lot of uses ...  | Looking for | profesor@wpi.wpi.edu   |
|  I like to play it like a guitar!"   | suggestions +------------------------+
|           -Sesame Street             | for new gweepco programs...          |
salomon@ccu.umanitoba.ca (Dan Salomon) (10/18/90)
In article <wb76pN600awOE3SaQz@andrew.cmu.edu> jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) writes:
>Try this one:
>
>void strupper(char *str)
>{
>for (;*str!='\0';str++)
>    *str=toupper(*str);
>}

There is a problem with this solution on some systems.  Berkeley UNIX
BSD 4.3 requires that the parameter of toupper be a lowercase letter.
The result is undefined if it is not.  Therefore the test using
islower may be necessary on some systems.  This makes toupper pretty
useless in portable programs, but those are the breaks.
-- 

Dan Salomon -- salomon@ccu.UManitoba.CA
               Dept. of Computer Science / University of Manitoba
               Winnipeg, Manitoba  R3T 2N2 / (204) 275-6682
will@kfw.COM (Will Crowder) (10/18/90)
In article <1990Oct17.165509.10914@kfw.COM> I wrote:
>I sent the poster a heavily commented version of the following, along with
>a blanket apology for the ridiculously large number of partially or completely
>incorrect answers to his very simple question.

Mea culpa.  First I go off and complain about partially or completely
incorrect answers to his simple question, and then, as has been pointed out
to me in e-mail by <ico.isc.com!rcd>, my solution also contains an
error:

    char *duh = "Hello";

duh points to a constant string.

Should've been

    char duh[] = "Hello";

Now, maybe I just didn't want to start a whole go-around again about
the difference between the two, or maybe I was too lazy to explain, or
maybe (and this is the most likely) I just overlooked it.

Oooopppps!  <sheepish :) :)>

Will
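One way to sidestep the literal-versus-array question altogether is never to write through the literal at all: read from it and write the result into a separate buffer. The sketch below is my own illustration, not something posted in the thread; upcase_copy and its parameter order are hypothetical, and it truncates silently if the buffer is too small.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: writes an upper-cased copy of src into dst, which
 * holds dstsize bytes.  Always NUL-terminates when dstsize > 0 and returns
 * the number of characters written, not counting the terminator. */
size_t upcase_copy(char *dst, size_t dstsize, const char *src)
{
    size_t i = 0;

    if (dstsize == 0)
        return 0;
    /* Stop at the end of src or when only the terminator's slot is left. */
    while (src[i] != '\0' && i + 1 < dstsize) {
        dst[i] = (char)toupper((unsigned char)src[i]);
        i++;
    }
    dst[i] = '\0';
    return i;
}
```

With a declaration such as char buf[16], the call upcase_copy(buf, sizeof buf, "Hello") leaves the literal untouched and puts the converted text in buf.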
mikey@ontek.com (michelle (international krill) lee) (10/18/90)
In comp.lang.c, edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
| 
| #include <ctype.h>
| main()
| {
|     char *duh = "Hello";
|     printf("%s\n", duh);
|     while (*duh <= strlen(duh)) {
|         if (islower(*duh)) *duh = toupper(*duh);
|         *duh++;
|     }
|     printf("%s\n", duh);
| }

The usual suspects have pointed out the obvious problems; thus
the only things remaining are nitpicky in the extreme, but that
never stopped me from posting before...

  1. While not a problem in this context, it's generally advisable
     to use isascii() to check that whatever is being converted to
     upper case is actually an ascii character.

  2. My manual page makes reference to a _toupper() macro.  Adding
     an "#ifdef _toupper" to check if the macro is available could
     speed things up marginally, at the expense of defeating whatever
     locale facility is available.

  3. Making "duh" a register variable wouldn't hurt, especially if
     the above code were to be completely debugged and turned into
     a more general utility.

  4. Modification of the constant character array "Hello" may be a
     no-no for certain compilers and/or certain compiler options.

  5. An exit or a return statement would be nice.

  6. #includ-ing <stdio.h> is considered good practice in code which
     uses the standard i/o facilities like printf().
brad@SSD.CSD.HARRIS.COM (Brad Appleton) (10/18/90)
In article <wb76pN600awOE3SaQz@andrew.cmu.edu> jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) writes:
>Try this one:
>
>void strupper(char *str)
>{
>for (;*str!='\0';str++)
>    *str=toupper(*str);
>}

You need to be careful here! It all depends on your compiler. For some
compilers, the toupper function/macro performs the functional equivalent
of:
        c = (c - 'a') + 'A';

with other compilers, the functionality of toupper is more like this:

        if ( c >= 'a'  &&  c <= 'z' )   c = (c - 'a') + 'A';

In other words, some compilers will blindly convert the character to
uppercase, regardless of what the character was, whereas other compilers
will make sure the value is indeed lowercase before trying to modify it
to be uppercase. You will have to double-check your documentation for this.

I think that the BSD Unix/C toupper() MUST take a lowercase letter and
has undefined results otherwise, whereas the AT&T Unix/C toupper() will
give the desired result even if the character was not lowercase to begin
with (I'm not 100% positive about that though, anyone care to enlighten me).
______________________ "And miles to go before I sleep." ______________________
 Brad Appleton        brad@travis.ssd.csd.harris.com   Harris Computer Systems
                          ...!uunet!hcx1!brad          Fort Lauderdale, FL USA
~~~~~~~~~~~~~~~~~~~~ Disclaimer: I said it, not my company! ~~~~~~~~~~~~~~~~~~~
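The two behaviours described here can be put side by side as ordinary functions. This is only a sketch of the idea, not any vendor's actual <ctype.h>; blind_toupper and checked_toupper are hypothetical names, and both rely on the letters being contiguous, as they are in ASCII.

```c
/* Hypothetical model of the "blind" style: always applies the offset,
 * so it is only meaningful when c really is a lowercase letter. */
int blind_toupper(int c)
{
    return (c - 'a') + 'A';
}

/* Hypothetical model of the "checked" style: leaves anything that is
 * not a lowercase letter alone (assumes 'a'..'z' are contiguous). */
int checked_toupper(int c)
{
    return (c >= 'a' && c <= 'z') ? (c - 'a') + 'A' : c;
}
```

Both map 'h' to 'H', but only the checked version returns 'H' unchanged; the blind one turns it into some unrelated character, which is the failure the islower() guard protects against.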
jackm@agcsun.UUCP (Jack Morrison) (10/18/90)
>>    char *duh = "Hello";
>>    printf("%s\n", duh);
>>    while (*duh <= strlen(duh)) {
>>        if (islower(*duh)) *duh = toupper(*duh);
>>        *duh++;
>>    }
>
>(An aside: since strlen(duh) never changes, either you or the compiler
>should move it outside the loop.)
>
>                                       Bill

Even better, just use

        while (*duh) {
                if (islower(*duh)) *duh = toupper(*duh);
                duh++;
        }

(or for anal types, :-)
        while (*duh != '\0') {
-- 
"How am I typing?  Call 1-303-279-1300"
Jack C. Morrison        Ampex Video Systems
581 Conference Place, Golden CO 80401
svissag@hubcap.clemson.edu (Steve L Vissage II) (10/18/90)
From article <1990Oct17.170914.683@wpi.WPI.EDU>, by profesor@wpi.WPI.EDU (Matthew E Cross):
> Nope, won't work - the return value of 'toupper' is undefined if the input is
> not a lowercase character.

So define your own toupper() macro.  That's what I did.

#define toupper(ch) ((ch<123 && ch>96) ? ch-32 : ch)

You don't even have to do any casts, because C is pretty free with its
int<->char conversions.

> void strupper(char *str)
> {
> for (;*str!='\0';str++)
>     *str=islower(*str)?toupper(*str):*str;
> }       ^
          |
      *str=toupper(*str);

Steve L Vissage II
bruce@seismo.gps.caltech.edu (Bruce Worden) (10/19/90)
svissag@hubcap.clemson.edu (Steve L Vissage II) writes:
>From article <1990Oct17.170914.683@wpi.WPI.EDU>, by profesor@wpi.WPI.EDU (Matthew E Cross):
>> Nope, won't work - the return value of 'toupper' is undefined if the input is
>> not a lowercase character.
>
>So define your own toupper() macro.  That's what I did.
>#define toupper(ch) ((ch<123 && ch>96) ? ch-32 : ch)
> [ ... ]

I wouldn't recommend defining a macro with the same name as a library
function.  And from what I remember from the `toupper()' and `tolower()'
discussion here about three months ago, I think it was generally agreed
that a macro that evaluates its argument three times must be used with
great caution ( toupper(getchar()) can happen, e.g.), and that the
simple subtraction ( ch-32 ) and comparisons ( ch<123, ch>96) are
inherently non-portable.

P.S. I, among others, missed the *duh = "hello"; bug.  My apologies.
--------------------------------------------------------------------------
C. Bruce Worden                            bruce@seismo.gps.caltech.edu
252-21 Seismological Laboratory, Caltech, Pasadena, CA 91125
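The multiple-evaluation hazard is easy to reproduce without real input. The sketch below is an illustration under stated assumptions: NAIVE_TOUPPER is a hypothetical macro in the same style (with the argument parenthesised), next_char() is a stand-in for getchar() that counts its own calls, and the numeric constants assume ASCII.

```c
/* Hypothetical macro in the style under discussion; its argument
 * appears three times in the expansion. */
#define NAIVE_TOUPPER(ch) (((ch) < 123 && (ch) > 96) ? (ch) - 32 : (ch))

int next_char_calls = 0;                 /* how often next_char() has run */
const char *next_char_input = "abc";     /* stand-in for an input stream */

/* Returns the next character of the stand-in stream, or -1 at the end,
 * and counts the call. */
int next_char(void)
{
    next_char_calls++;
    return *next_char_input ? *next_char_input++ : -1;
}

/* Looks like one read, but expands to three next_char() calls: two for
 * the range test and one for the chosen branch. */
int naive_upper_of_next(void)
{
    return NAIVE_TOUPPER(next_char());
}
```

A single call consumes 'a' and 'b' in the range test and converts 'c', so the caller gets 'C' back and has lost two characters of input. Reading into a variable first and passing the variable avoids this.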
jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) (10/19/90)
No. In ANSI C, toupper is required to leave the character alone if it
is not lowercase.
-----------------
Jeffrey Hutzelman
America Online: JeffreyH11
Internet/BITNET:jh4o+@andrew.cmu.edu, jhutz@drycas.club.cc.cmu.edu,
jh4o@cmuccvma
>> Apple // Forever!!! <<
karl@haddock.ima.isc.com (Karl Heuer) (10/19/90)
In article <1990Oct17.170914.683@wpi.WPI.EDU> profesor@wpi.WPI.EDU (Matthew E Cross) writes:
>Nope, won't work - the return value of 'toupper' is undefined if the input is
>not a lowercase character.

Fixed in ANSI C.

For those who are using pre-ANSI systems where this doesn't hold, I recommend
coding in ANSI style, and writing your own ANSI-compatible headers and
libraries as needed.  This minimizes the trauma when you finally graduate to
ANSI C.  My personal ansi/ctype.h is:

        #include "/usr//include/ctype.h"
        #undef tolower
        #undef toupper
        #if defined(__STDC__)
        extern int toupper(int);
        extern int tolower(int);
        #else
        extern int toupper();
        extern int tolower();
        #endif

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
pgd@bbt.se (10/19/90)
In article <18575@haddock.ima.isc.com> karl@ima.isc.com (Karl Heuer) writes:
>
>ANSI C.  My personal ansi/ctype.h is:
>       #include "/usr//include/ctype.h"
>       #undef tolower
>       #undef toupper
...

Is there some special benefit of saying "/usr//include" instead of
                                             ^^
"/usr/include"?
jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) (10/19/90)
I wrote:
> No.  In ANSI C, toupper is required to leave the character alone if it
> is not lowercase.

However, as several people have pointed out to me, BSD 4.3 UNIX does not
follow this rule.  I ran the test program on the following machine
types, and got the following results:

Machine             O/S             Works Correctly?
-------             ---             ----------------
DECstation 3100     4.3 BSD*        Yes
Sun 3               4.2 BSD**       No
VAXstation 3100     VMS 5.4         Yes
Apple IIgs          GS/OS 5.0.2     Should, but not
                    ORCA/C 1.1      actually tested***

*or so it claims (Ultrix V something)
**or so it claims (I think SunOS 3.5)
***I didn't test it, but it claims to work that way.
-----------------
Jeffrey Hutzelman
America Online: JeffreyH11
Internet/BITNET:jh4o+@andrew.cmu.edu, jhutz@drycas.club.cc.cmu.edu,
jh4o@cmuccvma
>> Apple // Forever!!! <<
stanley@fozzie.UUCP (John Stanley) (10/20/90)
jackm@agcsun.UUCP (Jack Morrison) writes:
>
> (or for anal types, :-)
>       while (*duh != '\0') {
>

Or, for even less possibility for screw-ups:

        while ( '\0' != *duh ) {

The reason for the order becomes clearer in equality testing, when the
compiler will complain about ( '\0' = *duh ) and not ( *duh = 0 ). It is
real easy to catch a == vs. = problem this way.


This is my signature. It doesn't contain my name at all!
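As a small illustration of the constant-first style, here is a sketch of a character-counting loop written that way. The function name count_chars is hypothetical and the code is not from the thread.

```c
#include <stddef.h>

/* Hypothetical helper: counts the characters before the terminating '\0'. */
size_t count_chars(const char *s)
{
    size_t n = 0;

    /* Constant on the left: mistyping this test as ('\0' = *s) cannot
     * compile, because a character constant is not assignable. */
    while ('\0' != *s) {
        n++;
        s++;
    }
    return n;
}
```

The behaviour is identical to writing (*s != '\0'); the ordering only changes what happens when the comparison is mistyped as an assignment.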
karl@haddock.ima.isc.com (Karl Heuer) (10/20/90)
In article <1990Oct19.145302.24826@bbt.se> pgd@bbt.se writes:
>Is there some special benefit of saying "/usr//include" instead of
>"/usr/include"?

Yes.  It kludges around the warning provided by some compilers that believe
it's a bad idea to explicitly #include from /usr/include.  (I agree with them,
but since I don't have a good alternative, I do it anyway.)

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
msb@sq.sq.com (Mark Brader) (10/21/90)
Not yet pointed out in all this discussion is that just because you
retrieve a value through a pointer of type char *, it isn't necessarily
a permissible argument of EITHER islower() or toupper().

In early implementations, an argument of islower() or toupper() has to
be in the range 0 to 127, which isascii() checks.  In ANSI implementations,
isascii() is allowed to not exist, but the argument of islower() or toupper()
can validly go as high as MAX_UCHAR, so you only need to ensure the char
is nonnegative.  This, then, should be a solution:

        #ifdef __STDC__                         /* ANSI C */
        # if (MAX_CHAR < MAX_UCHAR)             /* chars are signed */
        #  define TOUPP(c) ((c) < 0? (c): toupper((c)))
        # else
        #  define TOUPP(c) toupper((c))
        # endif
        #else
        #  define TOUPP(c) ((isascii((c)) && islower((c)))? toupper((c)): (c))
        #endif

                for (p = duh; *p != '\0'; ++p)
                        *p = TOUPP(*p);

-- 
Mark Brader, SoftQuad Inc., Toronto, utzoo!sq!msb, msb@sq.com
        #define MSB(type)       (~(((unsigned type)-1)>>1))

This article is in the public domain.
msb@sq.sq.com (Mark Brader) (10/23/90)
My previous posting misspelled CHAR_MAX and UCHAR_MAX.  What I meant
to say was, of course, this:

        #ifdef __STDC__                         /* ANSI C */
        # if (CHAR_MAX < UCHAR_MAX)             /* chars are signed */
        #  define TOUPP(c) ((c) < 0? (c): toupper((c)))
        # else
        #  define TOUPP(c) toupper((c))
        # endif
        #else
        #  define TOUPP(c) ((isascii((c)) && islower((c)))? toupper((c)): (c))
        #endif

                for (p = duh; *p != '\0'; ++p)
                        *p = TOUPP(*p);

-- 
Mark Brader               "I don't care HOW you format   char c; while ((c =
SoftQuad Inc., Toronto     getchar()) != EOF) putchar(c);   ... this code is a
utzoo!sq!msb, msb@sq.com   bug waiting to happen from the outset." --Doug Gwyn

This article is in the public domain.
asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) (10/23/90)
In article <11021@hubcap.clemson.edu> svissag@hubcap.clemson.edu
(Steve L Vissage II) writes:
> From article <1990Oct17.170914.683@wpi.WPI.EDU>, by profesor@wpi.WPI.EDU
> (Matthew E Cross):
> > Nope, won't work - the return value of 'toupper' is undefined if the input
> > is not a lowercase character.
>
> So define your own toupper() macro.  That's what I did.
> #define toupper(ch) ((ch<123 && ch>96) ? ch-32 : ch)

Two points: first, your macro is *disgustingly unreadable* (and probably
incorrect ...haven't checked).  This is better:

#define toupper(ch) (((ch) >= 'a' && (ch) <= 'z') ? (ch) + 'A' - 'a' : (ch))
#define tolower(ch) (((ch) >= 'A' && (ch) <= 'Z') ? (ch) + 'a' - 'A' : (ch))

Doing it this way, you *almost* know what's going on without comments.
The other way, who the heck knows why you're using 123?  Shouldn't that
be ">= 96", not "> 96"?  I happen to 'remember' that 'a' is 96 in ASCII,
and that the ASCII lower case and upper case are 32 apart ... but I
'forget' whether it's +32 or -32.  My way, I need to remember diddly,
and the compiler handles everything at compile time.

Your way, if I'm trying to fix or improve your program, how do I know
for sure that you're using 123 and 96 as ASCII characters?  Worse, what
happens if we have to port this to (heaven forbid!) EBCDIC?  Those
numbers are less than worthless.  (I know, I know, mine is worthless in
EBCDIC also, but that's not the point!  I'll talk about that ...)

Second point: The ANSI standard (referring to _Standard C_ by Plauger &
Brodie, published by Microsoft Press), shows "toupper" and "tolower" as
converting after checking (i.e., it's safe to pass it a non-alpha
character).  Power C (and, I believe others) have a "_toupper" and
"_tolower" (with a leading underscore) which perform the conversion
without checking (for speed, when you're sure of the input).

Defining this macro all over again is unnecessary.

In fact, defining this macro is *dangerous*.  Using a locally defined
"toupper" routine from the standard <ctype.h> *guarantees* that the
local hardware has been taken into account.  Furthermore, this is one
more thing you don't have to debug.  Have you tested that macro over the
entire ASCII character set?  Checked 'a', '`', '@', 'A', 'Z', '[', 'z',
'{', etc.? (boundary checks)

Conclusion: If there is a standard routine that does what you want,
*use it*.  This increases reliability and portability and reduces debug
time.  If you're not sure what the routine does, RTFM and/or get help.
If you're not sure that something you want to do has already been done,
ask someone.  Good chance it's already been done.
--
=============Opinions are Mine, typos belong to /bin/ucb/vi=============
"We're sorry, but the reality you have dialed is no   |            Alvin
longer in service.  Please check the value of pi,     |   "the Chipmunk"
or pray to your local deity for assistance."          |          Sylvain
=============================================UUCP: hplabs!felix!asylvain
jrbd@craycos.com (James Davies) (10/23/90)
In article <1990Oct18.182650.7188@nntp-server.caltech.edu> bruce@seismo.gps.caltech.edu (Bruce Worden) writes:
>svissag@hubcap.clemson.edu (Steve L Vissage II) writes:
>>From article <1990Oct17.170914.683@wpi.WPI.EDU>, by profesor@wpi.WPI.EDU (Matthew E Cross):
>>> Nope, won't work - the return value of 'toupper' is undefined if the input is
>>> not a lowercase character.
>>
>>So define your own toupper() macro.  That's what I did.
>>#define toupper(ch) ((ch<123 && ch>96) ? ch-32 : ch)
>> [ ... ]
>
>I wouldn't recommend defining a macro with the same name as a library
>function.

I would certainly second that notion, and go further to say that
the average programmer also shouldn't be messing around with system
includes at all.

I once sent out the source of a C program to a guy who had modified his
C compiler's definition of "isalpha" so that it used a table lookup
rather than the supplied library function (for "efficiency", of course).
He lifted the macro definitions for his new isalpha from another compiler
and then made up the table by himself.  Trouble was, he had an off-by-one
error in the table, so that it considered "Z" to not be a letter.  After
about two hours on the phone with him running his debugger and me coaching,
we found the problem.  (Of course, he didn't tell me about this in advance,
I had to infer it from my program's behaviour).

I suspect his isalpha macro will make up the time we wasted sometime
in the next century...
bruce@seismo.gps.caltech.edu (Bruce Worden) (10/24/90)
Just a thought, but since toupper() and islower() claim to want an int as their argument, shouldn't we all be explicitly casting the char that we have been happily feeding them, rather than rely on the implicit conversion? -------------------------------------------------------------------------- C. Bruce Worden bruce@seismo.gps.caltech.edu 252-21 Seismological Laboratory, Caltech, Pasadena, CA 91125
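On the casting question, the usual ANSI C answer is to convert through unsigned char rather than straight to int, because the <ctype.h> functions expect an int whose value is representable as unsigned char (or equal to EOF). The following is a minimal sketch along those lines; the name upcase_in_place is hypothetical, and the buffer passed in is assumed to be writable (an array, not a string literal).

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases a writable, NUL-terminated string
 * in place. */
void upcase_in_place(char *s)
{
    for (; *s != '\0'; s++) {
        /* Convert through unsigned char so that a char with a negative
         * value never reaches toupper() as a negative int. */
        *s = (char)toupper((unsigned char)*s);
    }
}
```

With char buf[] = "Hello"; a call to upcase_in_place(buf) leaves buf holding the upper-cased text. A plain (int) cast would not help on machines where char is signed, since it preserves the negative value.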
george@hls0.hls.oz (George Turczynski) (10/24/90)
In article <2466@ux.acs.umn.edu>, edh@ux.acs.umn.edu (Eric "The Mentat Philosopher" Hendrickson) writes:
> Basically, what I want to do is take a string of upper/lower case, and make
> it all upper case.  Here is a first try at it,
>
> [Code deleted]
>
> Can anybody point out a good way of doing this?

Since people have already commented on the oversights (?) in
your code, I won't add any more.

So here's a piece of code that does the trick, but is perhaps best
implemented as a function:

/* --- Cut here --- */

#include<stdio.h>
#include<ctype.h>

main()
{
    char *work, *duh= "Hello";

    printf("%s\n",duh);

    /*  The important piece follows...  */

    for( work= duh; *work; work++ )
        if( islower(*work) )
            *work= toupper(*work);

    /*  That was it !  */

    printf("%s.\n",duh);

    exit(0);
}

/* --- Cut here --- */

I hope that this might help you to solve your problem.

Have a good day...

-- 
George P. J. Turczynski,   Computer Systems Engineer. Highland Logic Pty Ltd.
ACSnet: george@highland.oz |^^^^^^^^^^^^^^^^^^^^^^^^| Suite 1, 348-354 Argyle St
Phone:  +61 48 683490      |  Witty remarks are as  | Moss Vale, NSW. 2577
Fax:    +61 48 683474      |  hard to come by as is | Australia.
---------------------------   space to put them !    ---------------------------
asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) (10/26/90)
Ye gadz, recursive follow-ups!

In article <152580@felix.UUCP> asylvain@felix.UUCP, I wrote:
> In article <11021@hubcap.clemson.edu> svissag@hubcap.clemson.edu
> (Steve L Vissage II) writes:
> > From article <1990Oct17.170914.683@wpi.WPI.EDU>, by profesor@wpi.WPI.EDU
> > (Matthew E Cross):
> > > Nope, won't work - the return value of 'toupper' is undefined if the input
> > > is not a lowercase character.
> >
> > So define your own toupper() macro. That's what I did.
> > #define toupper(ch) ((ch<123 && ch>96) ? ch-32 : ch)
>
> Two points: first, your macro is *disgustingly unreadable* (and probably
> incorrect ...haven't checked). This is better:
>
> #define toupper(ch) (((ch) >= 'a' && (ch) <= 'z') ? (ch) + 'A' - 'a' : (ch))
> #define tolower(ch) (((ch) >= 'A' && (ch) <= 'Z') ? (ch) + 'a' - 'A' : (ch))

Upon re-reading this, I've decided that it's not *much* better from a
readability point of view. Try this:

#define toupper(ch) (((ch) >= 'a' && (ch) <= 'z') \
                        ? (ch) + 'A' - 'a'        \
                        : (ch))
#define tolower(ch) (((ch) >= 'A' && (ch) <= 'Z') \
                        ? (ch) + 'a' - 'A'        \
                        : (ch))

Please note that spaces are deliberate. This is also what is known as
a "dangerous" macro, in that if you pass it something like '*ch++', your
results may not be what you expect. Therefore, following convention, it
ought to be TOUPPER and TOLOWER as warning.

I still maintain that you should forget the whole thing and use the
library functions.
avery@netcom.UUCP (Avery Colter) (10/26/90)
comp.lang.c/16831, edh@ux.acs.umn.edu

> Basically, what I want to do is take a string of upper/lower case, and make
> it all upper case. Here is a first try at it,

> #include <ctype.h>
> main()
> {
> char *duh = "Hello";
> printf("%s\n", duh);
> while (*duh <= strlen(duh)) {

a) You should #include <string.h> if you're going to use strlen.

b) (*duh <= strlen (duh)) is a type mismatch. strlen returns an integer,
while *duh is a character. And even with typecasting, as some others
have said it is not a good condition.

Better to do something like:

while (*duh != NULL)

Then you don't need to use strlen or include <string.h>.

You can do this for one simple reason, at least with the compiler I use:
the value strlen returns is the number of characters BEFORE THE TERMINATING
NULL CHARACTER!

> if (islower(*duh)) *duh = toupper(*duh);
> *duh++;

You gottit backwards, and dereferenced: ++duh; is what you want.

> }
> printf("%s\n", duh);
> }

> And what I get is :
> Hello
> Hello

Small wonder: that while condition, even if it gets automatically
typecasted, is comparing the ASCII value of 'H' to the length of the
string, which is 5.

Just make the while condition that the terminating NULL character of
the string has been reached.
-- 
Avery Ray Colter {apple|claris}!netcom!avery {decwrl|mips|sgi}!btr!elfcat
(415) 839-4567 "Fat and steel: two mortal enemies locked in deadly combat."
- "The Bending of the Bars", A. R. Colter
karl@haddock.ima.isc.com (Karl Heuer) (10/31/90)
In article <15591@netcom.UUCP> avery@netcom.UUCP (Avery Colter) writes:
>Better to do something like:
> while (*duh != NULL)

Almost, but please don't spell it "NULL". This is traditionally used for the
null pointer constant, which is not at all related to the null character
(except that each is obtained by converting a constant zero to the appropriate
type). On some systems the compiler won't accept the above, since the macro
NULL is defined with pointer syntax. "while (*duh != '\0')" is better.

>> *duh++;
>
>You gottit backwards, and dereferenced: ++duh; is what you want.

You're right about the dereference being redundant, but the "backwards" bit is
purely a style issue: "++duh" and "duh++" are exactly equivalent in this
context, since the result of the expression isn't being used.

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
karl@robot.in-berlin.de (Karl-P. Huestegge) (11/07/90)
asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) writes:

>In fact, defining this macro is *dangerous*. Using a locally defined
>"toupper" routine from the standard <ctype.h> *guarantees* that the local
>hardware has been taken into account. Furthermore, this is one more thing
>you don't have to debug. Have you tested that macro over the entire ASCII
>character set? Checked 'a', '`', '@', 'A', 'Z', '[', 'z', '{', etc.?
>(boundary checks)

>Conclusion: If there is a standard routine that does what you want, *use it*.
>This increases reliability and portability and reduces debug time. If you're
>not sure what the routine does, RTFM and/or get help. If you not sure that
>something you want to do has already been done, ask someone. Good chance
>it's already been done.

Sorry, I missed the starting point of the discussion.

There is another reason to use the standard library functions:
the international character sets (8-bit, ISO-8859-1 for example). On my
international development system toupper('a-umlaut') is ('A-umlaut'),
which is of course *not* 'a-umlaut'-32 or ('a-umlaut' - ('a'-'A')).
The functions access a library of the local language set (depending
on the environment variable LC_CTYPE).

One additional piece of advice: Please don't use isascii() in text functions,
because this forbids all international chars > 127. Use isprint()
instead (or whatever is appropriate).

Please keep your code 8-bit clean. Thousands of users thank you.
(all the Renes, Angeliques, Mullers and Angstroms would be happy ;-).
-- 
Karl-Peter Huestegge karl@robot.in-berlin.de
Berlin Friedenau ..unido!fub!geminix!robot!karl
jimp@cognos.UUCP (Jim Patterson) (11/09/90)
In article <1990Nov7.043705.15051@robot.in-berlin.de> karl@robot.in-berlin.de (Karl-P. Huestegge) writes:
>asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) writes:
>
>>Conclusion: If there is a standard routine that does what you want, *use it*.
>>This increases reliability and portability and reduces debug time.
>
>There is another reason to use the standard library functions:
>The international Charactersets (8bit, ISO-8859-1 for example). On my
>international development system toupper('a-umlaut') is ('A-umlaut'),
>which is of course *not* 'a-umlaut'-32 or ('a-umlaut' - 'a'-'A').
>The functions accesses a library of the local language-set (depending
>on the environment-var LC_CTYPE)

You're fortunate to be working on a system that's "working" in this
regard. We at one point abandoned the vendor's ctype.h functions
because they simply ignored 8-bit character sets. It wouldn't have
been so bad if they just considered the extended characters as
graphics or something, but in fact the functions were implemented with
128-byte tables so anything with the 8th bit set returned arbitrary
results.

I won't mention the vendor since they have since put in very good
internationalization support, but I suspect this sort of problem is
present in a number of older implementations.
-- 
Jim Patterson                              Cognos Incorporated
UUCP:uunet!mitel!cunews!cognos!jimp        P.O. BOX 9707
PHONE:(613)738-1440                        3755 Riverside Drive
NOT a Jays fan (not even a fan)            Ottawa, Ont K1G 3Z4
zvs@bby.oz.au (Zev Sero) (11/12/90)
>>>>> On 7 Nov 90 04:37:05 GMT, karl@robot.in-berlin.de (Karl-P. Huestegge) said:
Karl> One additional advice: Please don't use isascii() in text-functions,
Karl> because this forbits all international chars > 127. Use isprint()
Karl> instead (or whatever is appropriate).
Unfortunately, in many implementations, including SunOS, the only
ctype.h functions/macros that are guaranteed to work on chars >127 are
isascii and toascii. If you want your code to work on such systems,
i.e. you are doing things like
c = isupper (c) ? tolower (c) : c;
which is unnecessary in standard C, then you must also use isascii.
c = isascii (c) && isupper (c) ? tolower (c) : c;
To find out whether a character can safely be sent to a printer, in
such an implementation, you must use
if (isascii (c) && isprint (c))
otherwise, as I learned the hard way, your program will dump core.
---
Zev Sero - zvs@bby.oz.au
As I recall, zero was invented by Arabic mathematicians
thousands of years ago. It's a pity it still frightens
or confuses people. - Doug Gwyn
msb@sq.sq.com (Mark Brader) (11/12/90)
Karl-P. Huestegge (karl@robot.in-berlin.de) writes:
> One additional advice: Please don't use isascii() in text-functions,
> because this forbids all international chars > 127. Use isprint()
> instead (or whatever is appropriate).
> Please keep your code 8-bit clean. Thousands of Users thank you.

The trouble with this advice is that isprint() is not a replacement for
isascii(). All of the "ctype functions" other than isascii() are
restricted in the arguments they can take, so as to permit the simple
implementation by table lookup. In an ASCII environment, isascii()
serves as a validator, to see whether the argument value is permissible
to pass to other "ctype functions". isprint() is merely another "ctype
function" with the same domain of validity for its argument as the rest.

Now, isascii() itself is not in ANSI C. (More precisely, implementations
are allowed but not required to provide it, along with any other is...()
functions not mentioned explicitly in the standard.) As a replacement
for it in its role as a validator, my usual suggestion is:

	#include <stdio.h>	/* for EOF */
	#include <ctype.h>
	#ifdef __STDC__
	# include <limits.h>
	# define IS_CTYPABLE(c) (((c) <= UCHAR_MAX && (c) >= 0) || (c) == EOF)
	#else
	# define IS_CTYPABLE isascii
	#endif

We would then see things like

	if (IS_CTYPABLE (*p) && islower (*p))
		*p = toupper (*p);

But this does not allow for non-ANSI, non-ASCII environments where the
"ctype functions" accept a greater range of argument values than isascii()
returns true on. I'm not aware of any way to make an automated test for
those environments, which could conveniently be added to the #ifdef above.
Perhaps Karl can suggest a way.

Caveat: this article was prepared without reference to the final standard.
Please email me if you detect errors, and I'll post a correction. (This is,
incidentally, *almost always* the way that errors on Usenet are best handled:
give the poster a chance to announce their own error first. For these
purposes, not reading the FAQ list counts as an error.)
-- Mark Brader, SoftQuad Inc., Toronto "... pure English is de rigueur" utzoo!sq!msb, msb@sq.com -- Manchester Guardian Weekly This article is in the public domain.
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/12/90)
In article <1990Nov12.040933.5419@sq.sq.com>, msb@sq.sq.com (Mark Brader) writes:
> # define IS_CTYPABLE(c) (((c) <= UCHAR_MAX && (c) >= 0) || (c) == EOF)

Knowing that EOF is -1, one could do this with one evaluation of (c)
-- always a courteous thing to do in a macro --

	# define IS_CTYPABLE(c) \
		((unsigned)((c)-EOF) <= (unsigned)(UCHAR_MAX-EOF))

Subtracting EOF shifts the valid range -1..UCHAR_MAX up to 0..UCHAR_MAX+1,
so a single unsigned comparison rejects both negative and oversized values.

I've never used isascii() myself because I had always constructed the
program so that I knew the codes were in range without needing a run-time
test; if you've got something you _think_ is a character and it's
outside the range that the ctype macros can handle what can you do but
report an error, and why leave it that late to check?
-- 
The problem about real life is that moving one's knight to QB3
may always be replied to with a lob across the net. --Alasdair Macintyre.
gwyn@smoke.brl.mil (Doug Gwyn) (11/12/90)
In article <1990Nov12.040933.5419@sq.sq.com> msb@sq.sq.com (Mark Brader) writes: >Now, isascii() itself is not in ANSI C. It doesn't need to be. All values of unsigned char, as well as EOF, work just fine as is*() arguments.
doom@informix.com (Mark Dooling) (11/23/90)
Simple question: Is there a newsgroup specialising in curses?
If not, is anyone interested?
=======mark dooling - informix uk