[comp.lang.c] toupper, isupper.., and the flamewar in comp.binaries.ibm.pc.d

mcdonald@aries.scs.uiuc.edu (Doug McDonald) (06/19/91)

Ok, it's time to move this here. There has been a very big flamefest in
comp.binaries.ibm.pc.d about 8 bit chars and editors for use with Icelandic.

What does this have to do with comp.lang.c? The answer is, of course,
the isupper, islower, toupper, tolower functions or macros. Now that
ANSI C is the standard for C, and all compiler vendors have had plenty
of time to implement it [no flames on that - it is a fact], what should we
expect those four functions to do? I think we just should say "be
done with this problem" and, for 8 bit (or nine or 10, but not 16) chars,
just use a table. But how to set the table for all possible languages?
A 1024 byte table is not very big - but 1024 bytes for all possible
locales is obviously a bit large. What to do?
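
For concreteness, here is a rough sketch of what such a 1024-byte set of
tables might look like for one 8-bit character set. It assumes ISO 8859-1
(which covers Icelandic); all of the names (up_tab, tbl_toupper, and so on)
are hypothetical, and this is not how any particular library does it.

```c
/* Four 256-entry tables, one per function: 4 x 256 = 1024 bytes.
 * Assumption: the character set is ISO 8859-1. All names are hypothetical. */
static unsigned char up_tab[256];   /* result of toupper for each code  */
static unsigned char low_tab[256];  /* result of tolower for each code  */
static unsigned char is_up[256];    /* nonzero for uppercase letters    */
static unsigned char is_low[256];   /* nonzero for lowercase letters    */

/* Record one lower/upper pair in all four tables. */
static void add_pair(int lower, int upper)
{
    up_tab[lower] = (unsigned char)upper;
    low_tab[upper] = (unsigned char)lower;
    is_low[lower] = 1;
    is_up[upper] = 1;
}

void build_latin1_tables(void)
{
    int c;

    /* Start with every code mapping to itself and classified as neither. */
    for (c = 0; c < 256; c++) {
        up_tab[c] = low_tab[c] = (unsigned char)c;
        is_up[c] = is_low[c] = 0;
    }
    /* ASCII letters a-z. */
    for (c = 0x61; c <= 0x7A; c++)
        add_pair(c, c - 0x20);
    /* Accented letters, eth and thorn; 0xF7 is the division sign. */
    for (c = 0xE0; c <= 0xFE; c++)
        if (c != 0xF7)
            add_pair(c, c - 0x20);
    /* Sharp s and y-diaeresis are lowercase with no uppercase here. */
    is_low[0xDF] = 1;
    is_low[0xFF] = 1;
}

/* Lookups; the cast keeps negative plain-char values in range.
 * EOF is not handled in this sketch. */
int tbl_toupper(int c) { return up_tab[(unsigned char)c]; }
int tbl_tolower(int c) { return low_tab[(unsigned char)c]; }
int tbl_isupper(int c) { return is_up[(unsigned char)c]; }
int tbl_islower(int c) { return is_low[(unsigned char)c]; }
```

After build_latin1_tables() has run, tbl_toupper(0xFE) (thorn) gives 0xDE.
Every other character set would need its own build function or its own
1024 bytes of data, which is exactly the size problem.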


I see a big set of tables, which exists as a separate file, which
an installation mode of the program reads at install time and loads the
correct table (or tables, allowing limited settability inside the
main program) into the data part of the executable.


The outcome of the flamewar on c.b.i.p.d is clear: this is NECESSARY
for a lot of different programs. We need to figure out the best way to
do it.


Doug McDonald
 

mouse@thunder.mcrcim.mcgill.edu (der Mouse) (06/22/91)

In article <1991Jun19.142712.18614@ux1.cso.uiuc.edu>, mcdonald@aries.scs.uiuc.edu (Doug McDonald) writes:

> Ok, it's time to move this here. There has been a very big flamefest
> in comp.binaries.ibm.pc.d about 8 bit chars and editors for use with
> Icelandic.

> What does this have to do with comp.lang.c?  The answer is, of
> course, the isupper, islower, toupper, tolower functions or macros.
> Now that ANSI C is the standard for C, and all compiler vendors have
> had plenty of time to implement it [...], what should we expect those
> four functions to do?

Test for, and convert to, lower and upper case.  Isn't that what
they're supposed to do?

> I think we just should say "be done with this problem" and, for 8 bit
> (or nine or 10, but not 16) chars, just use a table.

Why not use a table for 16-bit chars?  In some environments, it's
entirely reasonable.  (And for some environments, using a table for
8-bit chars is an unreasonable waste of space.)

> But how to set the table for all possible languages?

You don't need to deal with all possible languages; you need worry
about only those in your particular implementation.  For example, it
seems entirely reasonable to me to read a table from the filesystem
when setlocale() is called, and perhaps have a table or two (or
knowledge of how to build them) compiled in, in case the disk file is
inaccessible for some reason at run-time.
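
Seen from the program's side, that fallback might look roughly like the
sketch below. It uses only the standard setlocale(); the helper name
select_ctype_locale is hypothetical, and whether the library reads its
tables from disk or has them compiled in is up to the implementation.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: ask for the named locale's character tables and
 * fall back to the built-in "C" locale if the request cannot be honored
 * (for example, because the locale data is not available at run-time).
 * Returns the locale string reported by setlocale(). */
const char *select_ctype_locale(const char *name)
{
    const char *got = setlocale(LC_CTYPE, name);

    if (got == NULL)
        got = setlocale(LC_CTYPE, "C");  /* "C" is always available */
    return got;
}
```

A program would call select_ctype_locale("") once at startup to pick up
the user's environment; after that the ordinary <ctype.h> functions use
whatever tables the chosen locale supplies.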

> I see a big set of tables, which exists as a separate file, which an
> installation mode of the program reads at install time and loads the
> correct table (or tables, allowing limited settability inside the
> main program) into the data part of the executable.

You could do that, too: have some way to choose which table is the
default one....

But realize also that this is the least of your problems.  For example,
"%s: -%c flag needs an additional argument" isn't of much use in an
environment where users can't be expected to understand English.  You
not only need to change your <ctype.h> macros and functions, you also
need to change all the messages in your program.  Often you'll need to
change more; for example, different countries have different
conventions for displaying many things.  (Money and time are two
obvious examples, obvious enough that mechanisms have been invented to
help deal with them.  There are plenty of others.)
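
The mechanisms meant here are presumably the standard localeconv() and
strftime(). A minimal sketch of using them follows; the wrapper names
format_date and decimal_point are hypothetical.

```c
#include <locale.h>
#include <stddef.h>
#include <time.h>

/* Format a date the way the current locale prefers ("%x").
 * Returns the number of characters written, or 0 if buf is too small. */
size_t format_date(char *buf, size_t size, const struct tm *tm)
{
    return strftime(buf, size, "%x", tm);
}

/* The current locale's decimal-point string, as reported by
 * localeconv(). The same structure also carries the monetary
 * conventions (currency_symbol, mon_decimal_point, and so on). */
const char *decimal_point(void)
{
    return localeconv()->decimal_point;
}
```

In the default "C" locale, format_date produces the mm/dd/yy form and
decimal_point returns "."; after a successful setlocale(LC_ALL, "") both
follow the user's conventions instead. Neither helps with the text of
the program's own messages.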

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu