[comp.std.internat] code for char set conversions

rschwartz@OFFICE.WANG.COM (R. Schwartz@Wang R&D Net) (05/21/91)

Has anyone got code written (preferably in 'C') for doing conversions
between Unicode and 8859 (various flavors handled by a table-driven
algorithm, of course) and vice versa?

Same question regarding Unicode and 6937 flavors, and vice versa?

Same question regarding 8859 flavors and 10646?

Same question regarding 6937 flavors and 10646?

Same question regarding Unicode and 10646 and vice versa, at least for the
European languages subsets?

The reason for my interest is theoretical, not practical.  By examining
the complexity and performance of these routines I hope to either confirm or
rebut my feelings about the technical implications of Unicode's use
of trailing non-spacing diacritics.
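
To make the concern concrete, here is a minimal sketch of the lookahead that
trailing non-spacing diacritics force on a Unicode to 8859-1 converter.  It is
an illustration only, not code from any existing package: the names
compose_table, is_combining and ucs2_to_latin1 are hypothetical, the table
covers just a handful of letters, and it assumes 16-bit code units with at most
one combining mark (U+0300 to U+036F) after each base letter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical composition table:
 * (base letter, trailing combining mark) -> ISO 8859-1 byte. */
struct compose { uint16_t base, mark; uint8_t latin1; };

static const struct compose compose_table[] = {
    { 'A', 0x0300, 0xC0 },  /* A + combining grave     */
    { 'A', 0x0301, 0xC1 },  /* A + combining acute     */
    { 'e', 0x0300, 0xE8 },  /* e + combining grave     */
    { 'e', 0x0301, 0xE9 },  /* e + combining acute     */
    { 'n', 0x0303, 0xF1 },  /* n + combining tilde     */
    { 'u', 0x0308, 0xFC },  /* u + combining diaeresis */
};

static int is_combining(uint16_t c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Converts n 16-bit code units to ISO 8859-1 bytes in out (which must hold
 * at least n bytes).  Returns the number of bytes written, or (size_t)-1
 * if something cannot be represented in 8859-1. */
size_t ucs2_to_latin1(const uint16_t *in, size_t n, uint8_t *out)
{
    size_t i = 0, o = 0;

    while (i < n) {
        uint16_t c = in[i++];

        /* The converter cannot emit c until it has looked at the NEXT
         * code unit, because a trailing mark changes the output byte. */
        if (i < n && is_combining(in[i])) {
            size_t k;
            int found = 0;
            for (k = 0; k < sizeof compose_table / sizeof compose_table[0]; k++) {
                if (compose_table[k].base == c && compose_table[k].mark == in[i]) {
                    out[o++] = compose_table[k].latin1;
                    found = 1;
                    break;
                }
            }
            if (!found)
                return (size_t)-1;   /* no precomposed 8859-1 character */
            i++;                     /* consume the combining mark      */
            continue;
        }

        if (c > 0xFF || is_combining(c))
            return (size_t)-1;       /* outside 8859-1, or a stray mark */
        out[o++] = (uint8_t)c;       /* 8859-1 equals the first 256 codes */
    }
    return o;
}
```

The opposite direction (8859-1 to Unicode) is a plain one-byte table lookup
with no lookahead, which is the asymmetry whose cost I would like to measure.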

rich schwartz   (All views expressed are my own, and not Wang Labs, Inc.'s.)
 rschwartz@office.wang.com      VOICE (508) 967 5027     FAX (508) 967 0947
     Wang Labs, Inc., M/S 019-58A, 1 Industrial Ave., Lowell, MA 01851

keld@login.dkuug.dk (Keld Jørn Simonsen) (05/23/91)

rschwartz@OFFICE.WANG.COM (R. Schwartz@Wang R&D Net) writes:

>Has anyone got code written (preferably in 'C') for doing conversions
>between Unicode and 8859 (various flavors handled by a table-driven
>algorithm, of course) and vice versa?

And same question on conversion between other character sets...


I have written code (in C) for conversion between about 100 character sets,
but none of the conversions you mention is included yet.
They are planned, though.  The reason these conversions are
not among the roughly 10000 possible conversions I have is that
they are the most difficult.  I find the conversions to and from
Unicode the most difficult of them all, so they will be the last to
be implemented, but they will come.
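
As an illustration of how about 100 character sets can yield about 10000
conversions without 10000 tables, here is a minimal sketch of converting
through a common pivot code.  This is only one possible approach and is not
necessarily how the code in ch.shar works.  All names (struct charset,
to_pivot, from_pivot, convert_byte) are hypothetical, the pivot numbers are
arbitrary labels (Unicode values are used here purely for convenience), only
three letters per set are listed, and bytes below 0x80 are assumed to be plain
ASCII in both sets, which does not hold for the national ISO 646 variants.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_MAPPING (-1)

/* One entry per non-ASCII byte: byte value in the set, and its pivot code. */
struct pair { uint8_t byte; uint16_t pivot; };

struct charset {
    const char *name;
    const struct pair *pairs;
    size_t npairs;
};

/* Tiny excerpts only; a real table would list every byte of the set. */
static const struct pair latin1_pairs[] = {
    { 0xE6, 0x00E6 },  /* ae ligature      */
    { 0xE9, 0x00E9 },  /* e with acute     */
    { 0xFC, 0x00FC },  /* u with diaeresis */
};
static const struct pair cp437_pairs[] = {
    { 0x91, 0x00E6 },  /* ae ligature      */
    { 0x82, 0x00E9 },  /* e with acute     */
    { 0x81, 0x00FC },  /* u with diaeresis */
};

static const struct charset latin1 = {
    "ISO 8859-1", latin1_pairs, sizeof latin1_pairs / sizeof latin1_pairs[0]
};
static const struct charset cp437 = {
    "IBM code page 437", cp437_pairs, sizeof cp437_pairs / sizeof cp437_pairs[0]
};

/* Byte in a given set -> pivot code, or NO_MAPPING. */
static int to_pivot(const struct charset *cs, uint8_t b)
{
    size_t i;
    if (b < 0x80)
        return b;                        /* assumption: ASCII is shared */
    for (i = 0; i < cs->npairs; i++)
        if (cs->pairs[i].byte == b)
            return cs->pairs[i].pivot;
    return NO_MAPPING;
}

/* Pivot code -> byte in a given set, or NO_MAPPING. */
static int from_pivot(const struct charset *cs, int p)
{
    size_t i;
    if (p < 0)
        return NO_MAPPING;
    if (p < 0x80)
        return p;
    for (i = 0; i < cs->npairs; i++)
        if (cs->pairs[i].pivot == p)
            return cs->pairs[i].byte;
    return NO_MAPPING;
}

/* Any-to-any conversion built from two per-set tables. */
int convert_byte(const struct charset *from, const struct charset *to, uint8_t b)
{
    return from_pivot(to, to_pivot(from, b));
}
```

With this layout each new character set costs one table, and every pair of
sets becomes convertible; for example convert_byte(&cp437, &latin1, 0x82)
gives 0xE9.  It also shows why sets with trailing non-spacing diacritics are
harder: a single byte-to-byte step like this cannot express them.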

The code is available in dkuug.dk:pub/ch.shar0[12] by ftp, FTAM or email.

Keld Simonsen