[comp.std.c] Asian Character Sets & wide characters

rja@edison.GE.COM (rja) (05/09/89)

In article <1989May3.154934.3205@utzoo.uucp>, 
     henry@utzoo.uucp (Henry Spencer) writes:
> [deleted stuff]  There are standard shift sequences to reach other alphabets.
> (Although shifts are an enormous pain in string manipulation, which is
> why ANSI C recognizes the notion of "wide character" to deal with such
> things internally as unshifted codes.)  Someday the terminals etc. will
> speak ISO Latin, and that will solve this set of problems.  (Then we'll
> have the oriental languages to deal with... the existing code-extension
> hooks can cope in theory, but in practice it's cumbersome.)

I believe that there is either an ISO or an X/OPEN standard for shift
sequences to change character sets.
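
To make the shift-versus-wide-character point concrete, here is a rough
sketch in C.  It is only an illustration: the two shift bytes (SO and SI
used as locking shifts), the function name unshift(), and the rule that
every shifted character is exactly two bytes are my own assumptions for a
toy encoding, not the scheme of any particular ISO or X/OPEN standard.
It turns a shifted byte stream into unshifted wchar_t codes, so that
string-handling code no longer has to track the shift state.

```c
#include <assert.h>
#include <stddef.h>

#define SHIFT_OUT 0x0E  /* assumed: switch to the two-byte set */
#define SHIFT_IN  0x0F  /* assumed: switch back to the one-byte set */

/*
 * Decode a shifted byte stream into unshifted wide codes.
 * Returns the number of wide codes written, or (size_t)-1 if the
 * input ends in the middle of a two-byte character or if the
 * output buffer is too small.
 */
size_t unshift(const unsigned char *in, size_t n, wchar_t *out, size_t cap)
{
    size_t i = 0, w = 0;
    int two_byte = 0;               /* current shift state */

    assert(in != NULL && out != NULL);

    while (i < n) {
        unsigned char b = in[i];

        /* Shift bytes only change state; they produce no output. */
        if (b == SHIFT_OUT) { two_byte = 1; i++; continue; }
        if (b == SHIFT_IN)  { two_byte = 0; i++; continue; }

        if (w == cap)
            return (size_t)-1;      /* output buffer full */

        if (two_byte) {
            if (i + 1 >= n)
                return (size_t)-1;  /* truncated two-byte character */
            out[w++] = (wchar_t)(((unsigned)b << 8) | in[i + 1]);
            i += 2;
        } else {
            out[w++] = (wchar_t)b;
            i += 1;
        }
    }
    return w;
}
```

Under this toy scheme, the bytes 'A', SO, 0x30, 0x21, SI, 'B' decode to
the three wide codes 'A', 0x3021, 'B'.  Once decoded, indexing and
searching work on whole characters with no shift state to carry around.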

I think that wide characters are already set up to handle Asian
character-based languages, rather than leaving Asian character sets out in
the cold.  I know that the Japanese character set standards all fit as either
8-bit (for Kana) or 16-bit (for Kanji) characters.  The group trying to
draft an international standard for Chinese reportedly has already
decided to use a 16-bit definition with the space 0-255 (decimal)
reserved for ISO 8859 character sets and ISO control sequences.
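
As a rough sketch of what such a 16-bit layout would mean for code,
here is a small C classifier.  The names (code_class, classify) are
hypothetical, and the split of 0-255 into control and graphic ranges is
my assumption, based on the usual 8-bit layout (controls in 0x00-0x1F
and 0x7F-0x9F, graphics elsewhere); the draft itself may draw the lines
differently.

```c
#include <assert.h>

enum code_class {
    CC_CONTROL,   /* control code within 0-255 (assumed ranges) */
    CC_ISO8859,   /* ISO 8859 graphic character within 0-255 */
    CC_WIDE       /* anything above 255: left for the 16-bit set */
};

/* Classify one 16-bit character code. */
enum code_class classify(unsigned int code)
{
    assert(code <= 0xFFFFu);        /* codes are 16 bits wide */

    if (code > 0xFFu)
        return CC_WIDE;
    if (code < 0x20u || (code >= 0x7Fu && code <= 0x9Fu))
        return CC_CONTROL;
    return CC_ISO8859;
}
```

The point of reserving 0-255 is that existing 8-bit text stays valid
as-is: each byte widens to a 16-bit code with the same value.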

I have no idea whether the Chinese group is trying to coordinate with
the Kanji standard.  I certainly hope so.

The AT&T Japanese NLS for AT&T UNIX System V is probably a good
place for folks to look as an example of how Asian languages can
be cleanly supported.  HP did some pioneering work in Japanese UNIX,
but has gotten out of sync with the Japanese character set standards,
so it isn't a really good example just now.

This has strayed a bit.  Followups on character set standards
should be redirected to comp.std.internet...