[comp.lang.c] Is there a good example of how toupper

edh@ux.acs.umn.edu (Eric "The Mentat Philosopher" Hendrickson) (10/17/90)

Basically, what I want to do is take a string of upper/lower case, and make
it all upper case.  Here is a first try at it,


#include <ctype.h>
main()
{
	char *duh = "Hello";
	printf("%s\n", duh);
	while (*duh <= strlen(duh)) {
		if (islower(*duh)) *duh = toupper(*duh);
		*duh++;
	}
	printf("%s\n", duh);
}

And what I get is :

Hello
Hello

What I want is:

Hello
HELLO

Can anybody point out a good way of doing this?

Thanks much,
			Eric Hendrickson
-- 
/----------"Oh carrots are divine, you get a dozen for dime, its maaaagic."--
|Eric (the "Mentat-Philosopher") Hendrickson	  Academic Computing Services
|edh@ux.acs.umn.edu	   The game is afoot!	      University of Minnesota
\-"What does 'masochist' and 'amnesia' mean?   Beats me, I don't remember."--

poser@csli.Stanford.EDU (Bill Poser) (10/17/90)

In article <2466@ux.acs.umn.edu> edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
[believes that there is a problem with toupper and gives code including
the following]
>
>	char *duh = "Hello";
>	printf("%s\n", duh);
>	while (*duh <= strlen(duh)) {
>		if (islower(*duh)) *duh = toupper(*duh);
>		*duh++;
>	}

The problem here is in the while termination condition. What this tests
is whether the numerical value of the current character (*duh) is
less than or equal to the length of the string duh, which happens always
to be five. This condition is never satisfied, so the code in the loop 
is never executed.
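
To make the comparison concrete, here is a minimal sketch that evaluates the
same test the original `while' performs.  The helper name loop_condition is
made up for illustration, and the sketch assumes a character set such as
ASCII, where 'H' has a value far above 5.

```c
#include <stddef.h>
#include <string.h>

/* Evaluates the original test: is the VALUE of the first character
 * less than or equal to the LENGTH of the string?  (Hypothetical helper.) */
int loop_condition(const char *s)
{
    return (size_t)(unsigned char)*s <= strlen(s);
}
```

For "Hello" the function returns 0, which is why the loop body is skipped.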

(An aside: since strlen(duh) never changes, either you or the compiler
should move it outside the loop.) 
							Bill

bhoughto@cmdnfs.intel.com (Blair P. Houghton) (10/17/90)

In article <2466@ux.acs.umn.edu> edh@ux.acs.umn.edu (Eric D. Hendrickson) writes
>#include <ctype.h>
>main()
>{
>	char *duh = "Hello";
>	printf("%s\n", duh);
>	while (*duh <= strlen(duh)) {

Change `*duh' to `duh'.

>		if (islower(*duh)) *duh = toupper(*duh);
>		*duh++;

Ditto.  Increment the pointer, not the character.

>	}
>	printf("%s\n", duh);

Use a different variable here.  `duh' will now point to 'O', not 'H',
if the loop is entered.

>}
>
>And what I get is :
>Hello
>Hello

Basically, since `*duh' is a character, and a printable one at that,
its value as an integer in (*duh <= strlen(duh)) is going to
be something on the order of 60, while strlen(duh) is 5.  The
loop is skipped because 60 is never <= 5.

				--Blair
				  "End of lesson.  No opportunistic
				   comment on use of the word 'duh.'
				   ...except maybe indirectly..."

bhoughto@cmdnfs.intel.com (Blair P. Houghton) (10/17/90)

Sorry if anyone saw my earlier posting to this thread; a) I thought
I was mailing it; b) I thought I had hit 'l' for 'list' instead
of 's' for 'send': and there were several things yet to edit.
I went after ^C, but it was too late...
the cancellation should be getting through any time now...

In article <15857@csli.Stanford.EDU> poser@csli.stanford.edu (Bill Poser) writes:
>In article <2466@ux.acs.umn.edu> edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
>>	char *duh = "Hello";
>>	printf("%s\n", duh);
>>	while (*duh <= strlen(duh)) {
>>		if (islower(*duh)) *duh = toupper(*duh);
>>		*duh++;
>>	}
>>      printf("%s\n",duh)
>
>The problem here is in the while termination condition. What this tests

There's more than just the one problem (that *duh will be > strlen(duh));

0.  *duh refers to the character, not the location;

1.  The loop changes the value of the pointer `duh', so it may print
nothing other than "O" once you get the loop to work;

2.  merely using (duh <= strlen(duh)) won't fix it; the
value of the pointer `duh' is almost certain to be larger
than strlen(duh).

I won't give fixes here; it's too instructive to work them out
yourself, especially at the apparent level of understanding.

>(An aside: since strlen(duh) never changes, either you or the compiler
>should move it outside the loop.) 

Trivial optimization compared to the massive bugs extant.

				--Blair
				  "I've been saying 'duh' myself
				   a lot, lately..."

bruce@seismo.gps.caltech.edu (Bruce Worden) (10/17/90)

In article poser@csli.stanford.edu (Bill Poser) writes:
>In article edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
>[believes that there is a problem with toupper and gives code including
>the following]
>>
>>	char *duh = "Hello";
>>	printf("%s\n", duh);
>>	while (*duh <= strlen(duh)) {
>>		if (islower(*duh)) *duh = toupper(*duh);
>>		*duh++;
>>	}
>The problem here is in the while termination condition. [ .... ]
>[ ... ] so the code in the loop is never executed.
>(An aside: since strlen(duh) never changes, either you or the compiler
>should move it outside the loop.) 

On the contrary, if this loop actually executed, the value of `strlen(duh)' 
would change at every iteration because `duh' is incremented in the loop.  
Similarly, in the final statement (deleted above):

	printf("%s\n",duh);

`duh' would point off the end of the string if the loop actually executed.  
I sent the original poster this code with an explanation, which people may 
comment on as they see fit:

#include <ctype.h>
main() {
        char *duh = "Hello";
        int i, limit = strlen(duh);
        printf("%s\n", duh);
        for(i=0; i<limit; i++) {
                if (islower(duh[i])) duh[i] = toupper(duh[i]);
        }        
        printf("%s\n", duh);
}

P.S.  Why did I rewrite the `while' loop above as a `for' loop?  I have 
found `for' loops to be very efficient (if that is a consideration) and, 
as I have said here before, I find subscripted arrays to be clearer and 
less error prone than incremented pointers (plus, vectorizing compilers 
love finding those iteration variables.)  (Having said that, I hope nobody 
finds a bug in my loop.)
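
The remark that `strlen(duh)' changes as `duh' is incremented can be checked
with a small sketch.  The helper name length_seen_after is an assumption made
for illustration; it is not part of the code sent to the original poster.

```c
#include <stddef.h>
#include <string.h>

/* Returns strlen() as seen through a pointer that has been advanced
 * `steps' characters into s.  Hypothetical helper; the caller must keep
 * steps <= strlen(s). */
size_t length_seen_after(const char *s, size_t steps)
{
    const char *p = s + steps;
    return strlen(p);
}
```

With "Hello", the length seen through the pointer drops from 5 to 3 after two
increments, and to 0 once the pointer reaches the terminating null.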
--------------------------------------------------------------------------
C. Bruce Worden                            bruce@seismo.gps.caltech.edu
252-21 Seismological Laboratory, Caltech, Pasadena, CA 91125

poser@csli.Stanford.EDU (Bill Poser) (10/17/90)

In article <473@inews.intel.com> bhoughto@cmdnfs.intel.com (Blair P. Houghton) writes:
>In article <15857@csli.Stanford.EDU> poser@csli.stanford.edu (Bill Poser) writes:
>>The problem here is in the while termination condition. What this tests
>
>There's more than just the one problem (that *duh will be > strlen(duh));
>
>0.  *duh refers to the character, not the location;

This is one aspect of the problem I pointed out, that comparing the
value of a character to the length of the string is not useful.

>1.  The loop changes the value of the pointer `duh', so it may print
>nothing other than "O" once you get the loop to work;

This will be avoided if an index is used and compared to strlen(duh).
The issue only arises if one compares duh to duh+strlen(duh), in which
case a copy of the pointer must be used.
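
A rough sketch of that second approach, comparing a working copy of the
pointer against the end of the string.  The function name upcase_to_end is
hypothetical, the buffer is assumed to be writable, and the casts to unsigned
char are an addition that keeps the arguments of islower() and toupper()
nonnegative.

```c
#include <ctype.h>
#include <string.h>

/* Uppercases buf in place, walking a copy of the pointer up to
 * buf + strlen(buf).  buf itself is never modified, so the caller can
 * still print the whole string afterwards.  buf must be writable. */
void upcase_to_end(char *buf)
{
    char *end = buf + strlen(buf);
    char *p;

    for (p = buf; p < end; p++) {
        if (islower((unsigned char)*p))
            *p = (char)toupper((unsigned char)*p);
    }
}
```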

>2.  merely using (duh <= strlen(duh)) won't fix it; the
>value of the pointer `duh' is almost certain to be larger
>than strlen(duh).

Another aspect of the same problem I pointed out. Why worry
about an obviously wrong "fix" that nobody has suggested?
The problem doesn't have to do with dereferencing - it has to do
with confusing pointers and indices.

>>(An aside: since strlen(duh) never changes, either you or the compiler
>>should move it outside the loop.) 
>
>Trivial optimization compared to the massive bugs extant.

Which is why it's an aside.

						Bill

salomon@ccu.umanitoba.ca (Dan Salomon) (10/17/90)

In article <2466@ux.acs.umn.edu> edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
>Basically, what I want to do is take a string of upper/lower case, and make
>it all upper case.  Here is a first try at it,
>
>
>#include <ctype.h>
>main()
>{
>	char *duh = "Hello";
>	printf("%s\n", duh);
>	while (*duh <= strlen(duh)) {
>		if (islower(*duh)) *duh = toupper(*duh);
>		*duh++;
>	}
>	printf("%s\n", duh);
>}

There are at least four errors in this code.  Two of them are in your
while statement.  There is no point in repeatedly recomputing the
string length, and no point in comparing either a pointer, or the
character it points to to that length.  Instead test for the end of the
string by finding the terminating null character.  The other two
errors, incrementing the character instead of the pointer, and trying
to print the string by pointing to its end, were mentioned in earlier
postings.  Try the following version:

#include <ctype.h>
main()
{
	char *duh = "Hello";
	char *cur;

	printf("%s\n", duh);
	cur = duh;
	while (*cur) {
		if (islower(*cur)) *cur = toupper(*cur);
		cur++;
	}
	printf("%s\n", duh);
}

Sometimes it pays to stay in bed on Monday, rather than spending the
rest of the week debugging Monday's code.  :-)
-- 

Dan Salomon -- salomon@ccu.UManitoba.CA
               Dept. of Computer Science / University of Manitoba
	       Winnipeg, Manitoba  R3T 2N2 / (204) 275-6682

ghoti+@andrew.cmu.edu (Adam Stoller) (10/17/90)

Both the original code and the versions supplied by others seem to
assume that

	char *duh = "Hello";

can be modified.  From what I recall, for your simple test function to
work, you would either have to use:

	char duh[] = "Hello";

or pass/read in a string into either a malloc'ed area or char array --
before being able to modify it.
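
The malloc'ed variant might look something like the sketch below.  The name
upper_copy is invented for illustration, and the code assumes an ANSI
toupper() that leaves non-lowercase characters (including the terminating
null) unchanged.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a freshly malloc'ed, uppercased copy of src, or NULL if the
 * allocation fails.  The caller owns the result and must free() it. */
char *upper_copy(const char *src)
{
    size_t n = strlen(src) + 1;          /* include the terminating '\0' */
    char *dst = malloc(n);
    size_t i;

    if (dst == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        dst[i] = (char)toupper((unsigned char)src[i]);
    return dst;
}
```

Because the copy lives in allocated storage, it does not matter whether the
compiler placed the literal "Hello" in read-only memory.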

Of course I could be wrong - but...for my $0.02 function contribution:

#include <stdio.h>
#include <ctype.h>
int main()
{
    char duh[] = "Hello"; /* see (1), below */
    char *s = NULL;
    printf("%s\n", duh);
    for (s = duh; *s != '\0'; s++){
        *s = toupper(*s); /* see (2), below */
    }
    printf("%s\n", duh);
}

(1) some older compilers will require this to be declared static, before
allowing you to use aggregate initialization.

(2) under ANSI you don't need to test for islower() - pre-ANSI requires
the islower() test because many of the macros used to define islower and
toupper were brain-dead

--fish

scc@rlgvax.UUCP (Stephen Carlson) (10/17/90)

In article <2466@ux.acs.umn.edu> edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
>Basically, what I want to do is take a string of upper/lower case, and make
>it all upper case.  Here is a first try at it,
>
>
>#include <ctype.h>
>main()
>{
>	char *duh = "Hello";
>	printf("%s\n", duh);
>	while (*duh <= strlen(duh)) {
>		if (islower(*duh)) *duh = toupper(*duh);
>		*duh++;
>	}
>	printf("%s\n", duh);
>}

Since others have pointed out the problem with the while loop condition,
I would like to point out that with a declaration of

	char *duh = "Hello";

the compiler is free to put this string in read-only memory (text).  Then
the subsequent

	if (...) *duh = toupper(*duh);

will dump core with a segmentation violation (SIGSEGV).  You may have lucked
out since the incorrect loop condition avoids this statement.  I would
recommend declaring `duh' as a (static) array and then using a pointer to
do the work on the array:

#include <stdio.h>
#include <ctype.h>

int main()
{
	static char duh[] = "Hello";
	char *p = duh;

	printf("%s\n", duh);
	while (*p) {          /* or (*p != '\0') if that is your style */
		if (islower(*p))
			*p = toupper(*p);
		p++;
	}
	printf("%s\n", duh);
	return 0;
}

Notes:
	Declaring a char array and initializing it to a string will copy it
to a writable area.  It might even be more efficient.

	On some systems, toupper() is safe to use even if the char is not a
lower case letter.  On other systems, the islower() test is necessary.
ANSI standardizes this.

	The expression `*duh++' will increment the pointer as you want, but
it will do a useless dereference (hence lint's "null effect").  In no case
will it increment the char it points to as others incorrectly state.

	By the way, the new program lints (ignoring the frivolous "returns a
value that is always ignored" message) and runs with no problem.
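
The note about `*duh++' can be checked with a short sketch.  Both function
names are made up for illustration.

```c
#include <stddef.h>

/* Executes `*p++;' once and reports how far the pointer moved. */
size_t advance_with_star_plus_plus(char *buf)
{
    char *p = buf;

    (void)*p++;   /* same parse as *(p++): pointer moves, character untouched */
    return (size_t)(p - buf);
}

/* For contrast: (*p)++ increments the character itself. */
void increment_pointee(char *p)
{
    (*p)++;
}
```

The first function reports an advance of one character and leaves the buffer
unchanged; only the parenthesized form in the second function alters the
character.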

-- 
Stephen Carlson            | ICL OFFICEPOWER Center
scc@rlgvax.opcr.icl.com    | 11490 Commerce Park Drive
..!uunet!rlgvax!scc        | Reston, VA  22091

jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) (10/17/90)

Try this one:

void strupper(char *str)
{
for (;*str!='\0';str++)
	*str=toupper(*str);
}
-----------------
Jeffrey Hutzelman
America Online: JeffreyH11
Internet/BITNET:jh4o+@andrew.cmu.edu, jhutz@drycas.club.cc.cmu.edu,
                jh4o@cmuccvma

>> Apple // Forever!!! <<

jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) (10/17/90)

No, the loop will work as advertised.  See my previous post for a
function that does it with an incremented pointer.
-----------------
Jeffrey Hutzelman
America Online: JeffreyH11
Internet/BITNET:jh4o+@andrew.cmu.edu, jhutz@drycas.club.cc.cmu.edu,
                jh4o@cmuccvma

>> Apple // Forever!!! <<

will@kfw.COM (Will Crowder) (10/17/90)

In article <1990Oct16.221035.10764@nntp-server.caltech.edu> bruce@seismo.gps.caltech.edu (Bruce Worden) writes:

>I sent the original poster this code with an explanation, which people may 
>comment on as they see fit:
>
>#include <ctype.h>
>main() {
>        char *duh = "Hello";
>        int i, limit = strlen(duh);
>        printf("%s\n", duh);
>        for(i=0; i<limit; i++) {
>                if (islower(duh[i])) duh[i] = toupper(duh[i]);
>        }        
>        printf("%s\n", duh);
>}
>
>P.S.  Why did I rewrite the `while' loop above as a `for' loop?  I have 
>found `for' loops to be very efficient (if that is a consideration) and, 
>as I have said here before, I find subscripted arrays to be clearer and 
>less error prone than incremented pointers (plus, vectorizing compilers 
>love finding those iteration variables.)  (Having said that, I hope nobody 
>finds a bug in my loop.)

Well, I don't immediately see any bugs in the loop.  Agreed that incremented
pointers are less clear than subscripted arrays, but subscripting is usually
more expensive, especially with older compilers.  In this case, in order to explain
the problem to the poster, you have to start talking about pointer/array
equivalence, what duh[i] really means, etc. etc., and he's obviously
not quite ready for that yet.
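
For readers who do want the short version of what duh[i] really means, a
minimal sketch follows.  The function name is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if s[i] and *(s + i) name the same object, which C
 * guarantees for any valid index i. */
int subscript_is_pointer_arith(const char *s, size_t i)
{
    return &s[i] == s + i && s[i] == *(s + i);
}
```

In other words, the subscripted loop and the incremented-pointer loop visit
exactly the same characters; they differ only in notation.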

I sent the poster a heavily commented version of the following, along with
a blanket apology for the ridiculously large number of partially or completely
incorrect answers to his very simple question.

#include <stdio.h>
#include <ctype.h>

main()
{

	char *duh = "Hello";
	char *p;
	printf("%s\n", duh);
	p = duh;
	while (*p != '\0') {
		if (islower(*p))
			*p = toupper(*p);
		p++;
	}
	printf("%s\n", duh);

}

Will

profesor@wpi.WPI.EDU (Matthew E Cross) (10/18/90)

In article <wb76pN600awOE3SaQz@andrew.cmu.edu> jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) writes:
>Try this one:
>
>void strupper(char *str)
>{
>for (;*str!='\0';str++)
>	*str=toupper(*str);
>}

Nope, won't work - the return value of 'toupper' is undefined if the input is
not a lowercase character.  Try:

void strupper(char *str)
{
for (;*str!='\0';str++)
       *str=islower(*str)?toupper(*str):*str;
}

(I hope I got the '? :' syntax right...)
-- 
+----------------------------------------------------+------------------------+
| "The letter U has a lot of uses ... | Looking for  |  profesor@wpi.wpi.edu  |
|  I like to play it like a guitar!"  | suggestions  +------------------------+
|          -Sesame Street             | for new gweepco programs...           |

salomon@ccu.umanitoba.ca (Dan Salomon) (10/18/90)

In article <wb76pN600awOE3SaQz@andrew.cmu.edu> jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) writes:
>Try this one:
>
>void strupper(char *str)
>{
>for (;*str!='\0';str++)
>	*str=toupper(*str);
>}

There is a problem with this solution on some systems.
Berkeley UNIX BSD 4.3 requires that the parameter of toupper
be a lowercase letter.  The result is undefined if it is not.
Therefore the test using islower may be necessary on some systems.
This makes toupper pretty useless in portable programs, but
those are the breaks.
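
One way to hedge against that, sketched below, is to wrap the call so that
toupper() only ever sees characters that islower() accepts.  The names
safe_toupper and str_upper are assumptions made for illustration, and the
string passed in must be writable.

```c
#include <ctype.h>

/* Hypothetical wrapper: only calls toupper() on characters that
 * islower() accepts, so it does not rely on ANSI behaviour. */
int safe_toupper(int c)
{
    return islower(c) ? toupper(c) : c;
}

/* Uppercases a writable string in place using the wrapper. */
void str_upper(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)safe_toupper((unsigned char)*s);
}
```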
-- 

Dan Salomon -- salomon@ccu.UManitoba.CA
               Dept. of Computer Science / University of Manitoba
	       Winnipeg, Manitoba  R3T 2N2 / (204) 275-6682

will@kfw.COM (Will Crowder) (10/18/90)

In article <1990Oct17.165509.10914@kfw.COM> I wrote:

>I sent the poster a heavily commented version of the following, along with
>a blanket apology for the ridiculously large number of partially or completely
>incorrect answers to his very simple question.

Mea culpa.  First I go off and complain about partially or completely
incorrect answers to his simple question, and then, as has been pointed
out to me in e-mail by <ico.isc.com!rcd>, it turns out that my solution
also contains an error:

	char *duh = "Hello";

duh points to a constant string.  Should've been

	char duh[] = "Hello";

Now, maybe I just didn't want to start a whole go-around again about
the difference between the two, or maybe I was too lazy to explain,
or maybe (and this is the most likely) I just overlooked it. 

Oooopppps!    <sheepish :) :)>

Will

mikey@ontek.com (michelle (international krill) lee) (10/18/90)

In comp.lang.c, edh@ux.acs.umn.edu (Eric D. Hendrickson) writes:
|
| #include <ctype.h>
| main()
| {
|       char *duh = "Hello";
|       printf("%s\n", duh);
|       while (*duh <= strlen(duh)) {
|               if (islower(*duh)) *duh = toupper(*duh);
|               *duh++;
|       }
|       printf("%s\n", duh);
| }

 The usual suspects have pointed out the obvious problems;  thus 
 the only things remaining are nitpicky in the extreme, but that
 never stopped me from posting before...

1. While not a problem in this context, it's generally advisable
   to use isascii() to check that whatever is being converted to 
   upper case is actually an ascii character.

2. My manual page makes reference to a _toupper() macro.  Adding 
   an "#ifdef _toupper" to check if the macro is available could 
   speed things up marginally, at the expense of defeating what-
   ever locale facility is available.  

3. Making "duh" a register variable wouldn't hurt, especially if
   the above code were to be completely debugged and turned into
   a more general utility.

4. Modification of the constant character array "Hello" may be a 
   no-no for certain compilers and/or certain compiler options.

5. An exit or a return statement would be nice.

6. #includ-ing <stdio.h> is considered good practice in code which 
   uses the standard i/o facilities like printf().

brad@SSD.CSD.HARRIS.COM (Brad Appleton) (10/18/90)

In article <wb76pN600awOE3SaQz@andrew.cmu.edu> jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) writes:
>Try this one:
>
>void strupper(char *str)
>{
>for (;*str!='\0';str++)
>	*str=toupper(*str);
>}

You need to be careful here! It all depends on your compiler. For some
compilers, the toupper function/macro performs the functional equivalent
of:

	c = (c - 'a') + 'A';

with other compilers, the functionality of toupper is more like this:

	if ( c >= 'a'  &&  c <= 'z' )
		c = (c - 'a') + 'A';

In other words, some compilers will blindly convert the character to uppercase,
regardless of what the character was whereas other compilers will make sure
the value is indeed lowercase before trying to modify it to be uppercase.

You will have to double-check your documentation for this. I think that 
the BSD Unix/C toupper() MUST take a lowercase letter and has undefined
results otherwise whereas the AT&T Unix/C toupper() will give the desired
result even if the character was not lowercase to begin with (I'm not 100%
positive about that though, anyone care to enlighten me).
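
The two behaviours can be written out as functions to see the difference.
This is only a sketch: the names blind_toupper and checked_toupper are
invented here, and neither is claimed to be any vendor's actual
implementation.

```c
/* The "blind" conversion described above: correct only for 'a'..'z'. */
int blind_toupper(int c)
{
    return (c - 'a') + 'A';
}

/* The checked conversion: leaves everything outside 'a'..'z' alone.
 * Assumes the letters are contiguous, as they are in ASCII. */
int checked_toupper(int c)
{
    if (c >= 'a' && c <= 'z')
        c = (c - 'a') + 'A';
    return c;
}
```

On an ASCII system the blind version turns 'Q' into '1', while the checked
version returns 'Q' unchanged.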

______________________ "And miles to go before I sleep." ______________________
 Brad Appleton        brad@travis.ssd.csd.harris.com   Harris Computer Systems
                          ...!uunet!hcx1!brad          Fort Lauderdale, FL USA
~~~~~~~~~~~~~~~~~~~~ Disclaimer: I said it, not my company! ~~~~~~~~~~~~~~~~~~~

jackm@agcsun.UUCP (Jack Morrison) (10/18/90)

>>	char *duh = "Hello";
>>	printf("%s\n", duh);
>>	while (*duh <= strlen(duh)) {
>>		if (islower(*duh)) *duh = toupper(*duh);
>>		*duh++;
>>	}
>
>(An aside: since strlen(duh) never changes, either you or the compiler
>should move it outside the loop.) 
>							Bill

Even better, just use

	while (*duh) {
		if (islower(*duh)) *duh = toupper(*duh);
		duh++;
	}

(or for anal types, :-)
	while (*duh != '\0') {

-- 
"How am I typing?  Call 1-303-279-1300"     Jack C. Morrison
Ampex Video Systems    581 Conference Place, Golden CO 80401

svissag@hubcap.clemson.edu (Steve L Vissage II) (10/18/90)

From article <1990Oct17.170914.683@wpi.WPI.EDU>, by profesor@wpi.WPI.EDU (Matthew E Cross):
> Nope, won't work - the return value of 'toupper' is undefined if the input is
> not a lowercase character.
  
So define your own toupper() macro.  That's what I did.
#define toupper(ch) ((ch<123 && ch>96) ? ch-32 : ch)
 
You don't even have to do any casts, because C is pretty free with its
int<->char conversions.

> void strupper(char *str)
> {
> for (;*str!='\0';str++)
>        *str=islower(*str)?toupper(*str):*str;
> }                  ^
                     | 
         *str=toupper(*str);
              
Steve L Vissage II

bruce@seismo.gps.caltech.edu (Bruce Worden) (10/19/90)

svissag@hubcap.clemson.edu (Steve L Vissage II) writes:
>From article <1990Oct17.170914.683@wpi.WPI.EDU>, by profesor@wpi.WPI.EDU (Matthew E Cross):
>> Nope, won't work - the return value of 'toupper' is undefined if the input is
>> not a lowercase character.
>  
>So define your own toupper() macro.  That's what I did.
>#define toupper(ch) ((ch<123 && ch>96) ? ch-32 : ch)
> [ ... ]

I wouldn't recommend defining a macro with the same name as a library
function.  And from what I remember from the `toupper()' and `tolower()'
discussion here about three months ago, I think it was generally agreed 
that a macro that evaluates its argument three times must be used with 
great caution ( toupper(getchar()) can happen, e.g.), and that the simple 
subtraction ( ch-32 ) and comparisons ( ch<123, ch>96) are inherently
non-portable.
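
The multiple-evaluation hazard can be made visible with a counter.  In this
sketch fetch() is a made-up stand-in for getchar(), and the macro is renamed
UPPER_MACRO so that it does not clash with the library's toupper; it keeps
the quoted macro's ASCII-only numbers.

```c
static int fetch_count;

/* Stand-in for getchar(): counts how often it is called. */
static int fetch(void)
{
    fetch_count++;
    return 'a';
}

/* Same shape as the quoted macro, renamed so it does not clash with
 * the library's toupper.  Assumes ASCII codes. */
#define UPPER_MACRO(ch) (((ch) < 123 && (ch) > 96) ? (ch) - 32 : (ch))

/* A function evaluates its argument exactly once. */
static int upper_func(int ch)
{
    return (ch < 123 && ch > 96) ? ch - 32 : ch;
}

/* Number of fetch() calls made by one use of the macro. */
int calls_via_macro(void)
{
    fetch_count = 0;
    (void)UPPER_MACRO(fetch());
    return fetch_count;
}

/* Number of fetch() calls made by one use of the function. */
int calls_via_function(void)
{
    fetch_count = 0;
    (void)upper_func(fetch());
    return fetch_count;
}
```

For a lowercase input the macro calls fetch() three times, so with a real
getchar() it would consume three characters of input; the function form
calls it once.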

P.S. I, among others, missed the *duh = "hello"; bug.  My apologies.
--------------------------------------------------------------------------
C. Bruce Worden                            bruce@seismo.gps.caltech.edu
252-21 Seismological Laboratory, Caltech, Pasadena, CA 91125

jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) (10/19/90)

No.  In ANSI C, toupper is required to leave the character alone if it
is not lowercase.
-----------------
Jeffrey Hutzelman
America Online: JeffreyH11
Internet/BITNET:jh4o+@andrew.cmu.edu, jhutz@drycas.club.cc.cmu.edu,
                jh4o@cmuccvma

>> Apple // Forever!!! <<

karl@haddock.ima.isc.com (Karl Heuer) (10/19/90)

In article <1990Oct17.170914.683@wpi.WPI.EDU> profesor@wpi.WPI.EDU (Matthew E Cross) writes:
>Nope, won't work - the return value of 'toupper' is undefined if the input is
>not a lowercase character.

Fixed in ANSI C.

For those who are using pre-ANSI systems where this doesn't hold, I recommend
coding in ANSI style, and writing your own ANSI-compatible headers and
libraries as needed.  This minimizes the trauma when you finally graduate to
ANSI C.  My personal ansi/ctype.h is:
	#include "/usr//include/ctype.h"
	#undef tolower
	#undef toupper
	#if defined(__STDC__)
	extern int toupper(int);
	extern int tolower(int);
	#else
	extern int toupper();
	extern int tolower();
	#endif

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint

pgd@bbt.se (10/19/90)

In article <18575@haddock.ima.isc.com> karl@ima.isc.com (Karl Heuer) writes:
>
>ANSI C.  My personal ansi/ctype.h is:
>	#include "/usr//include/ctype.h"
>	#undef tolower
>	#undef toupper
...

Is there some special benefit of saying "/usr//include" instead of
					     ^^
"/usr/include"?

jh4o+@andrew.cmu.edu (Jeffrey T. Hutzelman) (10/19/90)

I wrote:

> No.  In ANSI C, toupper is required to leave the character alone if it
> is not lowercase.

However, as several people have pointed out to me, BSD 4.3 UNIX does not
follow this rule.  I ran the test program on the following machine
types, and got the following results:

Machine              O/S               Works Correctly?
-------              ---               ----------------
DECstation 3100      4.3 BSD*          Yes
Sun 3                4.2 BSD**         No
VAXstation 3100      VMS 5.4           Yes
Apple IIgs           GS/OS 5.0.2       Should, but not
                     ORCA/C 1.1        actually tested***

*or so it claims (Ultrix V something)
**or so it claims (I think SunOS 3.5)
***I didn't test it, but it claims to work that way.
-----------------
Jeffrey Hutzelman
America Online: JeffreyH11
Internet/BITNET:jh4o+@andrew.cmu.edu, jhutz@drycas.club.cc.cmu.edu,
                jh4o@cmuccvma

>> Apple // Forever!!! <<

stanley@fozzie.UUCP (John Stanley) (10/20/90)

jackm@agcsun.UUCP (Jack Morrison) writes:
> 
> (or for anal types, :-)
> 	while (*duh != '\0') {
> 
   Or, for even less possibility for screw-ups:

	while ( '\0' != *duh ) {

The reason for the order becomes clearer in equality testing, when the
compiler will complain about ( '\0' = *duh ) and not ( *duh = 0 ). It is
real easy to catch a == vs. = problem this way.

This is my signature. It doesn't contain my name at all!

karl@haddock.ima.isc.com (Karl Heuer) (10/20/90)

In article <1990Oct19.145302.24826@bbt.se> pgd@bbt.se writes:
>Is there some special benefit of saying "/usr//include" instead of
>"/usr/include"?

Yes.  It kludges around the warning provided by some compilers that believe
it's a bad idea to explicitly #include from /usr/include.  (I agree with them,
but since I don't have a good alternative, I do it anyway.)

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint

msb@sq.sq.com (Mark Brader) (10/21/90)

Not yet pointed out in all this discussion is that just because
you retrieve a value through a pointer of type char *, it isn't
necessarily a permissible argument of EITHER islower() or toupper().

In early implementations, an argument of islower() or toupper()
has to be in the range 0 to 127, which isascii() checks.  In ANSI
implementations, isascii() is allowed to not exist, but the
argument of islower() or toupper() can validly go as high as
MAX_UCHAR, so you only need to ensure the char is nonnegative.

This, then, should be a solution:

	#ifdef __STDC__			/* ANSI C */
	# if (MAX_CHAR < MAX_UCHAR)	/* chars are signed */
	#   define TOUPP(c) ((c) < 0? (c): toupper((c)))
	# else
	#   define TOUPP(c) toupper((c))
	# endif
	#else
	# define TOUPP(c) ((isascii((c)) && islower((c)))? toupper((c)): (c))
	#endif

	for (p = duh; *p != '\0'; ++p)
		*p = TOUPP(*p);

-- 
Mark Brader, SoftQuad Inc., Toronto, utzoo!sq!msb, msb@sq.com
#define	MSB(type)	(~(((unsigned type)-1)>>1))

This article is in the public domain.

msb@sq.sq.com (Mark Brader) (10/23/90)

My previous posting misspelled CHAR_MAX and UCHAR_MAX.  What I meant
to say was, of course, this:

	#ifdef __STDC__			/* ANSI C */
	# if (CHAR_MAX < UCHAR_MAX)	/* chars are signed */
	#   define TOUPP(c) ((c) < 0? (c): toupper((c)))
	# else
	#   define TOUPP(c) toupper((c))
	# endif
	#else
	# define TOUPP(c) ((isascii((c)) && islower((c)))? toupper((c)): (c))
	#endif

	for (p = duh; *p != '\0'; ++p)
		*p = TOUPP(*p);
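
On an ANSI system, a commonly used alternative to testing for negative
values is to convert each char to unsigned char before the call.  The sketch
below shows that idea; the name upcase_bytes is hypothetical, and the code
assumes an ANSI toupper() and a writable string.

```c
#include <ctype.h>

/* Uppercases a writable string in place.  Each char is converted to
 * unsigned char before it reaches toupper(), so the argument is never
 * negative even where plain char is signed. */
void upcase_bytes(char *s)
{
    for (; *s != '\0'; ++s)
        *s = (char)toupper((unsigned char)*s);
}
```

Unlike the macro, this evaluates its argument only once per character, at
the cost of a function call.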

-- 
Mark Brader		   "I don't care HOW you format   char c; while ((c =
SoftQuad Inc., Toronto	    getchar()) != EOF) putchar(c);   ... this code is a
utzoo!sq!msb, msb@sq.com    bug waiting to happen from the outset." --Doug Gwyn

This article is in the public domain.

asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) (10/23/90)

In article <11021@hubcap.clemson.edu> svissag@hubcap.clemson.edu
(Steve L Vissage II) writes:
> From article <1990Oct17.170914.683@wpi.WPI.EDU>, by profesor@wpi.WPI.EDU
> (Matthew E Cross):
> > Nope, won't work - the return value of 'toupper' is undefined if the input
> > is not a lowercase character.
>   
> So define your own toupper() macro.  That's what I did.
> #define toupper(ch) ((ch<123 && ch>96) ? ch-32 : ch)

Two points: first, your macro is *disgustingly unreadable* (and probably
incorrect ...haven't checked).  This is better:

#define toupper(ch) (((ch) >= 'a' && (ch) <= 'z') ? (ch) + 'A' - 'a' : (ch))
#define tolower(ch) (((ch) >= 'A' && (ch) <= 'Z') ? (ch) + 'a' - 'A' : (ch))

Doing it this way, you *almost* know what's going on without comments.
The other way, who the heck knows why you're using 123?   Shouldn't that
be ">= 96", not "> 96"?  I happen to 'remember' that 'a' is 96 in ASCII,
and that the ASCII lower case and upper case are 32 apart ... but I
'forget' whether it's +32 or -32.  My way, I need to remember diddly,
and the compiler handles everything at compile time.  Your way, if I'm
trying to fix or improve your program, how do I know for sure that
you're using 123 and 96 as ASCII characters?

Worse, what happens if we have to port this to (heaven forbid!) EBCDIC?
Those numbers are less than worthless.  (I know, I know, mine is worthless
in EBCDIC also, but that's not the point!  I'll talk about that ...)

Second point: The ANSI standard (referring to _Standard C_ by Plauger & Brodie,
published by Microsoft Press), shows "toupper" and "tolower" as converting
after checking (I.e., it's safe to pass it a non-alpha character).  Power C
(and, I believe others) have a "_toupper" and "_tolower" (with a leading
underscore) which perform the conversion without checking (for speed, when
you're sure of the input).  Defining this macro all over again is unnecessary.

In fact, defining this macro is *dangerous*.  Using a locally defined
"toupper" routine from the standard <ctype.h> *guarantees* that the local
hardware has been taken into account.  Furthermore, this is one more thing
you don't have to debug.  Have you tested that macro over the entire ASCII
character set?  Checked 'a', '`', '@', 'A', 'Z', '[', 'z', '{', etc.?
(boundary checks)
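
A boundary check of that kind might be sketched as follows.  The function
names are made up for illustration, and the conversion is written as a
function rather than a macro so that its argument is evaluated once.  For
reference, ASCII puts 'a' at 97 and 'z' at 122, so the numeric tests
`ch>96' and `ch<123' select the same range as the character constants used
here.

```c
/* Range-based conversion as a function, so the argument is evaluated
 * once.  Like the macros above it assumes 'a'..'z' are contiguous,
 * which holds for ASCII but not for EBCDIC. */
int range_toupper(int ch)
{
    return (ch >= 'a' && ch <= 'z') ? ch + 'A' - 'a' : ch;
}

/* Returns 1 if every boundary case behaves as expected, 0 otherwise. */
int boundaries_ok(void)
{
    return range_toupper('a') == 'A' && range_toupper('z') == 'Z'
        && range_toupper('`') == '`' && range_toupper('{') == '{'
        && range_toupper('@') == '@' && range_toupper('[') == '['
        && range_toupper('A') == 'A' && range_toupper('Z') == 'Z';
}
```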

Conclusion: If there is a standard routine that does what you want, *use it*.
This increases reliability and portability and reduces debug time.  If you're
not sure what the routine does, RTFM and/or get help.  If you're not sure that
something you want to do has already been done, ask someone.  Good chance
it's already been done.
--
=============Opinions are Mine, typos belong to /bin/ucb/vi=============
"We're sorry, but the reality you have dialed is no   |            Alvin
longer in service.  Please check the value of pi,     |   "the Chipmunk"
or pray to your local deity for assistance."          |          Sylvain
=============================================UUCP: hplabs!felix!asylvain

jrbd@craycos.com (James Davies) (10/23/90)

In article <1990Oct18.182650.7188@nntp-server.caltech.edu> bruce@seismo.gps.caltech.edu (Bruce Worden) writes:
>svissag@hubcap.clemson.edu (Steve L Vissage II) writes:
>>From article <1990Oct17.170914.683@wpi.WPI.EDU>, by profesor@wpi.WPI.EDU (Matthew E Cross):
>>> Nope, won't work - the return value of 'toupper' is undefined if the input is
>>> not a lowercase character.
>>  
>>So define your own toupper() macro.  That's what I did.
>>#define toupper(ch) ((ch<123 && ch>96) ? ch-32 : ch)
>> [ ... ]
>
>I wouldn't recommend defining a macro with the same name as a library
>function.  

I would certainly second that notion, and go further to say that the
average programmer also shouldn't be messing around with system includes
at all.  I once sent the source for a C program to a guy
who had modified his C compiler's definition of "isalpha" so that it
used a table lookup rather than the supplied library function (for
"efficiency", of course).  He lifted the macro definitions for his
new isalpha from another compiler and then made up the table by
himself.  Trouble was, he had an off-by-one error in the table, so that
it considered "Z" to not be a letter.  After about two hours on the
phone with him running his debugger and me coaching, we found the
problem.  (Of course, he didn't tell me about this in advance, I
had to infer it from my program's behaviour).  

I suspect his isalpha macro will make up the time we wasted
sometime in the next century...

bruce@seismo.gps.caltech.edu (Bruce Worden) (10/24/90)

Just a thought, but since toupper() and islower() claim to want an int as
their argument, shouldn't we all be explicitly casting the char that we 
have been happily feeding them, rather than rely on the implicit conversion?
--------------------------------------------------------------------------
C. Bruce Worden                            bruce@seismo.gps.caltech.edu
252-21 Seismological Laboratory, Caltech, Pasadena, CA 91125
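
On the casting question: the conversion from char to int happens implicitly
either way, so an (int) cast changes nothing.  The cast that does matter is
to unsigned char.  Where plain char is signed, an 8-bit character arrives as
a negative int, which the ctype routines are not required to handle.  A
minimal sketch (safe_toupper is a made-up name, not a library routine):

```c
#include <ctype.h>

/* Hypothetical wrapper: upper-case one plain char safely.
 * Converting to unsigned char first keeps the argument in the range
 * 0..UCHAR_MAX even where plain char is signed. */
char safe_toupper(char c)
{
	return (char)toupper((unsigned char)c);
}
```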

george@hls0.hls.oz (George Turczynski) (10/24/90)

In article <2466@ux.acs.umn.edu>, edh@ux.acs.umn.edu (Eric "The Mentat Philosopher" Hendrickson) writes:
> Basically, what I want to do is take a string of upper/lower case, and make
> it all upper case.  Here is a first try at it,
> 
> [Code deleted]
> 
> Can anybody point out a good way of doing this?

Since people have already commented on the oversights (?) in your code, I won't
add any more.  So here's a piece of code that does the trick, but is perhaps
best implemented as a function:


/* --- Cut here --- */

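#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

int main(void)
{
	/* An array, not a pointer to a literal: literals may be read-only. */
	char *work, duh[] = "Hello";

	printf("%s\n",duh);

	/* The important piece follows... */

	for( work= duh; *work; work++ )
		if( islower((unsigned char)*work) )
			*work= toupper((unsigned char)*work);

	/* That was it ! */

	printf("%s.\n",duh);

	exit(0);
}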

/* --- Cut here --- */

I hope that this might help you to solve your problem.  Have a good day...


-- 
George P. J. Turczynski,   Computer Systems Engineer. Highland Logic Pty Ltd.
ACSnet: george@highland.oz |^^^^^^^^^^^^^^^^^^^^^^^^| Suite 1, 348-354 Argyle St
Phone:  +61 48 683490      |  Witty remarks are as  | Moss Vale, NSW. 2577
Fax:    +61 48 683474      |  hard to come by as is | Australia.
---------------------------   space to put them !    ---------------------------
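
For the "best implemented as a function" remark, here is one way it might
look.  This is a sketch, not George's code: the name str_upper and the
choice to return the argument are assumptions.  The caller must pass a
writable buffer (an array, not a string literal).

```c
#include <ctype.h>

/* Hypothetical helper: upper-case a null-terminated string in place.
 * Returns its argument so the call can be used inside an expression.
 * No islower() test is needed: under ANSI C, toupper() returns any
 * non-lower-case argument unchanged. */
char *str_upper(char *s)
{
	char *work;

	for (work = s; *work != '\0'; work++)
		*work = (char)toupper((unsigned char)*work);
	return s;
}
```

Given "char duh[] = "Hello";", the call puts(str_upper(duh)) should print
HELLO.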

asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) (10/26/90)

Ye gadz, recursive follow-ups!

In article <152580@felix.UUCP> asylvain@felix.UUCP,
  I wrote:
> In article <11021@hubcap.clemson.edu> svissag@hubcap.clemson.edu
> (Steve L Vissage II) writes:
> > From article <1990Oct17.170914.683@wpi.WPI.EDU>, by profesor@wpi.WPI.EDU
> > (Matthew E Cross):
> > > Nope, won't work - the return value of 'toupper' is undefined if the input
> > > is not a lowercase character.
> >   
> > So define your own toupper() macro.  That's what I did.
> > #define toupper(ch) ((ch<123 && ch>96) ? ch-32 : ch)
> 
> Two points: first, your macro is *disgustingly unreadable* (and probably
> incorrect ...haven't checked).  This is better:
> 
> #define toupper(ch) (((ch) >= 'a' && (ch) <= 'z') ? (ch) + 'A' - 'a' : (ch))
> #define tolower(ch) (((ch) >= 'A' && (ch) <= 'Z') ? (ch) + 'a' - 'A' : (ch))

Upon re-reading this, I've decided that it's not *much* better from a
readability point of view.  Try this:

#define toupper(ch) (((ch) >= 'a' && (ch) <= 'z')      \
                                ?  (ch) + 'A' - 'a'     \
				:  (ch))

#define tolower(ch) (((ch) >= 'A' && (ch) <= 'Z')      \
                                ?  (ch) + 'a' - 'A'     \
				:  (ch))

Please note that spaces are deliberate.  This is also what is known as a
"dangerous" macro, in that if you pass it something like '*ch++', your
results may not be what you expect.  Therefore, following convention,
it ought to be TOUPPER and TOLOWER as warning.  I still maintain that
you should forget the whole thing and use the library functions.
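
The "dangerous" part can be measured.  In the sketch below, TOUPPER is the
macro above under the suggested name; fetch, evaluations and the two
count_* functions are made-up names for illustration.  fetch() stands in
for an argument with a side effect such as '*ch++' and counts how often it
runs.

```c
#include <ctype.h>

/* The macro from the post, renamed as suggested. */
#define TOUPPER(ch) (((ch) >= 'a' && (ch) <= 'z') ? (ch) + 'A' - 'a' : (ch))

int evaluations;	/* how many times the argument expression ran */

/* Hypothetical stand-in for an argument with a side effect. */
int fetch(int c)
{
	evaluations++;
	return c;
}

/* Count argument evaluations when going through the macro. */
int count_macro_evals(int c)
{
	evaluations = 0;
	(void)TOUPPER(fetch(c));
	return evaluations;
}

/* Count argument evaluations when going through the library routine. */
int count_library_evals(int c)
{
	evaluations = 0;
	(void)toupper(fetch(c));
	return evaluations;
}
```

For a lower-case letter the macro evaluates its argument three times (two
comparisons plus the result); for 'Q' the first comparison fails, so it is
evaluated twice.  An ANSI library toupper() evaluates it once, even where
it is implemented as a macro.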

avery@netcom.UUCP (Avery Colter) (10/26/90)

comp.lang.c/16831, edh@ux.acs.umn.edu

> Basically, what I want to do is take a string of upper/lower case, and make
> it all upper case.  Here is a first try at it,


> #include <ctype.h>
> main()
> {
> 	char *duh = "Hello";
> 	printf("%s\n", duh);
> 	while (*duh <= strlen(duh)) {

a) You should #include <string.h> if you're going to use strlen.

b) (*duh <= strlen (duh)) is a type mismatch. strlen returns an
integer, while *duh is a character. And even with typecasting,
as some others have said it is not a good condition.

Better to do something like:

while (*duh != NULL)

Then you don't need to use strlen or #include <string.h>.
You can do this for one simple reason, at least with the
compiler I use: the value strlen returns is the number of
characters BEFORE THE TERMINATING NULL CHARACTER!	
>	if (islower(*duh)) *duh = toupper(*duh);
>	*duh++;

You gottit backwards, and dereferenced:

++duh; is what you want.

>	}
>	printf("%s\n", duh);
> }

> And what I get is :

> Hello
> Hello

Small wonder: that while condition, even if it gets automatically
typecasted, is comparing the ASCII value of 'H' to the length of
the string, which is 5.

Just make the while condition that the terminating NULL character
of the string has been reached.


-- 
Avery Ray Colter    {apple|claris}!netcom!avery  {decwrl|mips|sgi}!btr!elfcat
(415) 839-4567   "Fat and steel: two mortal enemies locked in deadly combat."
                                     - "The Bending of the Bars", A. R. Colter

karl@haddock.ima.isc.com (Karl Heuer) (10/31/90)

In article <15591@netcom.UUCP> avery@netcom.UUCP (Avery Colter) writes:
>Better to do something like:
>	while (*duh != NULL)

Almost, but please don't spell it "NULL".  This is traditionally used for the
null pointer constant, which is not at all related to the null character
(except that each is obtained by converting a constant zero to the appropriate
type).  On some systems the compiler won't accept the above, since the macro
NULL is defined with pointer syntax.  "while (*duh != '\0')" is better.

>>	*duh++;
>
>You gottit backwards, and dereferenced: ++duh; is what you want.

You're right about the dereference being redundant, but the "backwards" bit is
purely a style issue: "++duh" and "duh++" are exactly equivalent in this
context, since the result of the expression isn't being used.

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
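
To make both points concrete, a small sketch (length_by_walk is a made-up
name).  It walks a string up to the null character, spelled '\0', and
advances the pointer without dereferencing it.  The count it returns should
match strlen(), which likewise counts the characters before the terminator.

```c
#include <stddef.h>

/* Hypothetical helper: count characters before the terminating '\0'. */
size_t length_by_walk(const char *s)
{
	size_t n = 0;

	while (*s != '\0') {	/* '\0' is the null character; NULL is for pointers */
		n++;
		s++;		/* same effect as ++s here; the value is unused */
	}
	return n;
}
```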

karl@robot.in-berlin.de (Karl-P. Huestegge) (11/07/90)

asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) writes:

>In fact, defining this macro is *dangerous*.  Using a locally defined
>"toupper" routine from the standard <ctype.h> *guarantees* that the local
>hardware has been taken into account.  Furthermore, this is one more thing
>you don't have to debug.  Have you tested that macro over the entire ASCII
>character set?  Checked 'a', '`', '@', 'A', 'Z', '[', 'z', '{', etc.?
>(boundary checks)

>Conclusion: If there is a standard routine that does what you want, *use it*.
>This increases reliability and portability and reduces debug time.  If you're
>not sure what the routine does, RTFM and/or get help.  If you not sure that
>something you want to do has already been done, ask someone.  Good chance
>it's already been done.

Sorry, I missed the starting point of the discussion.

There is another reason to use the standard library functions:
international character sets (8-bit, ISO-8859-1 for example). On my
international development system toupper('a-umlaut') is 'A-umlaut',
which is of course *not* 'a-umlaut'-32 or 'a-umlaut' - ('a'-'A').
The functions access a library of the local language set (depending
on the environment variable LC_CTYPE).

One additional piece of advice: please don't use isascii() in text
functions, because this forbids all international chars > 127. Use
isprint() instead (or whatever is appropriate).
Please keep your code 8-bit clean. Thousands of users will thank you
(all the Renes, Angeliques, Mullers and Angstroms would be happy ;-).

-- 
Karl-Peter Huestegge                       karl@robot.in-berlin.de
Berlin Friedenau                           ..unido!fub!geminix!robot!karl
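
A sketch of how a program opts in to that behaviour under ANSI C.  A program
starts in the "C" locale, where toupper() knows only the unaccented letters;
setlocale(LC_CTYPE, "") asks for whatever the environment specifies.  The
helper name upcase_in_locale and its return convention are assumptions for
illustration, which locale names exist is system-dependent, and the
byte-at-a-time loop only suits single-byte character sets such as
ISO-8859-1.

```c
#include <ctype.h>
#include <locale.h>

/* Hypothetical helper: select a character-classification locale, then
 * upper-case s in place.  Pass "" to follow the environment or "C" for
 * the minimal default.  Returns 0 on success, -1 if the requested locale
 * is not available (s is then left untouched). */
int upcase_in_locale(char *s, const char *locale_name)
{
	if (setlocale(LC_CTYPE, locale_name) == NULL)
		return -1;
	for (; *s != '\0'; s++)
		*s = (char)toupper((unsigned char)*s);
	return 0;
}
```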

jimp@cognos.UUCP (Jim Patterson) (11/09/90)

In article <1990Nov7.043705.15051@robot.in-berlin.de> karl@robot.in-berlin.de (Karl-P. Huestegge) writes:
>asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) writes:
>
>>Conclusion: If there is a standard routine that does what you want, *use it*.
>>This increases reliability and portability and reduces debug time.
>
>There is another reason to use the standard library functions:
>international character sets (8-bit, ISO-8859-1 for example). On my
>international development system toupper('a-umlaut') is 'A-umlaut',
>which is of course *not* 'a-umlaut'-32 or 'a-umlaut' - ('a'-'A').
>The functions access a library of the local language set (depending
>on the environment variable LC_CTYPE).

You're fortunate to be working on a system that's "working" in this
regard.  We at one point abandoned the vendor's ctype.h functions
because they simply ignored 8-bit character sets. It wouldn't have
been so bad if they just considered the extended characters as
graphics or something, but in fact the functions were implemented with
128-byte tables so anything with the 8th bit set returned arbitrary
results. I won't mention the vendor since they have since put in very
good internationalization support, but I suspect this sort of problem
is present in a number of older implementations.
-- 
Jim Patterson                              Cognos Incorporated
UUCP:uunet!mitel!cunews!cognos!jimp        P.O. BOX 9707    
PHONE:(613)738-1440                        3755 Riverside Drive
NOT a Jays fan (not even a fan)            Ottawa, Ont  K1G 3Z4

zvs@bby.oz.au (Zev Sero) (11/12/90)

>>>>> On 7 Nov 90 04:37:05 GMT, karl@robot.in-berlin.de (Karl-P. Huestegge) said:
Karl> One additional piece of advice: please don't use isascii() in text
Karl> functions, because this forbids all international chars > 127. Use
Karl> isprint() instead (or whatever is appropriate).

Unfortunately, in many implementations, including SunOS, the only
ctype.h functions/macros that are guaranteed to work on chars >127 are
isascii and toascii.  If you want your code to work on such systems,
i.e. you are doing things like 
  c = isupper (c) ? tolower (c) : c;
which is unnecessary in standard C, then you must also use isascii.
  c = isascii (c) && isupper (c) ? tolower (c) : c;

To find out whether a character can safely be sent to a printer, in
such an implementation, you must use 
  if (isascii (c) && isprint (c))
otherwise, as I learned the hard way, your program will dump core.
---
                                Zev Sero  -  zvs@bby.oz.au
As I recall, zero was invented by Arabic mathematicians
thousands of years ago.  It's a pity it still frightens
or confuses people.           - Doug Gwyn

msb@sq.sq.com (Mark Brader) (11/12/90)

Karl-P. Huestegge (karl@robot.in-berlin.de) writes:
> One additional piece of advice: please don't use isascii() in text
> functions, because this forbids all international chars > 127. Use
> isprint() instead (or whatever is appropriate).
> Please keep your code 8-bit clean. Thousands of Users thank you.

The trouble with this advice is that isprint() is not a replacement
for isascii().  All of the "ctype functions" other than isascii()
are restricted in the arguments they can take, so as to permit the
simple implementation by table lookup.  In an ASCII environment,
isascii() serves as a validator, to see whether the argument value
is permissible to pass to other "ctype functions".  isprint() is
merely another "ctype function" with the same domain of validity
for its argument as the rest.

Now, isascii() itself is not in ANSI C.  (More precisely, implementations
are allowed but not required to provide it, along with any other is...()
functions not mentioned explicitly in the standard.)  As a replacement
for it in its role as a validator, my usual suggestion is:

	#include <ctype.h>

	#ifdef __STDC__
	#  include <limits.h>
	#  define IS_CTYPABLE(c) (((c) <= UCHAR_MAX && (c) >= 0) || (c) == EOF)
	#else
	#  define IS_CTYPABLE isascii
	#endif

We would then see things like

	if (IS_CTYPABLE (*p) && islower (*p)) *p = toupper (*p);

But this does not allow for non-ANSI, non-ASCII environments where the
"ctype functions" accept a greater range of argument values than
isascii() returns true on.  I'm not aware of any way to make an
automated test for those environments, which could conveniently be
added to the #ifdef above.  Perhaps Karl can suggest a way.

Caveat: this article was prepared without reference to the final standard.
Please email me if you detect errors, and I'll post a correction.
(This is, incidentally, *almost always* the way that errors on Usenet
are best handled: give the poster a chance to announce their own error
first.  For these purposes, not reading the FAQ list counts as an error.)
-- 
Mark Brader, SoftQuad Inc., Toronto	"... pure English is de rigueur"
utzoo!sq!msb, msb@sq.com			-- Manchester Guardian Weekly

This article is in the public domain.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/12/90)

In article <1990Nov12.040933.5419@sq.sq.com>, msb@sq.sq.com (Mark Brader) writes:
> 	#  define IS_CTYPABLE(c) (((c) <= UCHAR_MAX && (c) >= 0) || (c) == EOF)

Knowing that EOF is -1, one could do this with one evaluation of (c)
-- always a courteous thing to do in a macro --
	# define IS_CTYPABLE(c) \
	((unsigned)((c)-EOF) <= (unsigned)(UCHAR_MAX-EOF))

I've never used isascii() myself because I had always constructed the
program so that I knew the codes were in range without needing a run-
time test; if you've got something you _think_ is a character and it's
outside the range that the ctype macros can handle what can you do but
report an error, and why leave it that late to check?
-- 
The problem about real life is that moving one's knight to QB3
may always be replied to with a lob across the net.  --Alasdair Macintyre.
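
If a single evaluation is the goal, a function gets it without the unsigned
arithmetic and without assuming any particular value for EOF.  A sketch
(is_ctypable is a hypothetical name echoing the macro; the accepted range is
the one ANSI C specifies for the ctype routines: every unsigned char value,
plus EOF):

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical function form of the validator: nonzero if c may be
 * passed to the <ctype.h> routines.  The argument is evaluated once
 * because this is a function call, not a macro. */
int is_ctypable(int c)
{
	return (c >= 0 && c <= UCHAR_MAX) || c == EOF;
}
```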

gwyn@smoke.brl.mil (Doug Gwyn) (11/12/90)

In article <1990Nov12.040933.5419@sq.sq.com> msb@sq.sq.com (Mark Brader) writes:
>Now, isascii() itself is not in ANSI C.

It doesn't need to be.  All values of unsigned char, as well as EOF,
work just fine as is*() arguments.

doom@informix.com (Mark Dooling) (11/23/90)

Simple question: Is there a newsgroup specialising on curses? If not,
is anyone interested?

=======mark dooling - informix uk