[comp.std.internat] ISO 10646 and Unicode

sommar@enea.se (Erland Sommarskog) (12/29/90)

Also sprach Dimitri Vulis, (dlv@cunyvms1.gc.cuny.edu):
>Many folks (including myself) feel that 10646 is a bad DIS.
>
>There is, however, a 16-bit proposal known as Unicode. I'd like to
>see a Unix that handles 16-bit text well! :)

I don't know too much about either Unicode or 10646, but from
what I know 10646 seems to be the winner. Unicode seems to
be incompatible and insufficient. Also, Unicode is always
16-bit, whereas 10646 leaves room for 8-bit as well as
16- and 32-bit characters. Also, 10646 will have an ISO
stamp on it. Things like Unicode will only make a mess
of it. I want *one* standard, not two, at the risk of
having none in practice. If you think 10646 is in bad shape,
object to it and try to change it. It is still in draft.
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
"There is only one success, namely to lead your life in your own way"
Anyone who can give a source for this?

keld@login.dkuug.dk (Keld Jørn Simonsen) (12/29/90)

Well, ISO 10646 is at DIS stage now. It is out for a vote that closes
on 1991-06-06. The DIS has the document number ISO/IEC JTC1/SC2/WG2 N666.

That 10646 is a Draft International Standard (DIS) should mean that
the standard is technically stable, and that only editorial changes
can still be made to it. No substantial changes can be made.

This is also one of the advantages of ISO 10646: it is available now,
while UNICODE will not be available until about April 91.

On the other hand, I do not care that much about one or two more
character sets. There have been several 16-bit character sets on
the market for quite some time, viz. the Chinese, Japanese and Korean
standards.

Keld Simonsen

dlv@cunyvms1.gc.cuny.edu (Dimitri Vulis, CUNY GC Math) (12/31/90)

My impression is that Unicode is pretty stable. Way back in July I made
a few comments on the Cyrillic section; they accepted the major ones,
but didn't want to mess with the rest, thinking it to be pretty stable.
It's unlikely that the final version will be more than slightly
different from what's floating around now.

Happy new year,
Dimitri Vulis, D&M
BITNET:            DLV@CUNYVMS1
Internet:          DLV@CUNYVMS1.GC.CUNY.EDU
Snail:             Department of Mathematics/Box 330
                   City University of New York Graduate Center
                   33 West 42 Street
                   New York, NY 10036-8099
                   USA

ck@voa3.VOA.GOV (Chris Kern) (01/09/91)

It is obvious to us, as a major consumer of office automation in multiple
languages, that anything short of Unicode (or something of its ilk) is just
a short-term expedient -- and not a particularly desirable one, at that.

In the long run, widespread acceptance of an eight-bit standard actually
will retard progress toward the free interchange of text, since it is
certain to be more difficult to persuade the industry to adopt two
standards -- an interim eight-bit one followed by a permanent 16-bit
standard -- than it would be to make a single, clean break.

-- 
Chris Kern     ck@voa3.voa.gov     ...uunet!voa3!ck     +1 202-619-2020

keld@login.dkuug.dk (Keld Jørn Simonsen) (01/13/91)

It was stated in this group some time ago that UNICODE could be viewed
as one of the most useful implementations of ISO 10646. The article
(which has now expired on my system) said that ISO 10646 would allow
many kinds of subsetting and that UNICODE could be one of them.

Having read the specifications of ISO 10646, and from what I know
of UNICODE, it is not possible to have UNICODE as a version or subset
of ISO 10646. It is true that ISO 10646 can have subsets, but they
all use encodings defined in ISO 10646. For instance, a 16-bit
subset has to use the encoding of an ISO 10646 plane: every
character in the subset must use the same two-byte code that
10646 specifies.

As far as I know, there is no direct relationship between 10646
and UNICODE encodings. For example, UNICODE uses byte codes
0-31 and 127-159, which are otherwise reserved for control characters.
A feature of UNICODE is thus that it can have more (about 33% more)
characters in 16 bits than 10646, at the price of losing compatibility
with ISO standards like ISO 646 and 8859.
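
The capacity comparison can be checked with a little arithmetic. This is
only a sketch, assuming that the reserved ranges named above (0-31 and
127-159) are excluded from each of the two bytes independently; the
names below are illustrative, and the actual code-space layout in the
draft may differ. Under those assumptions the gain per byte is about 34%,
close to the figure quoted, while the gain over a full two-byte code
works out to roughly 80%.

```python
# Byte values the post describes as reserved for control characters.
# Assumption: the restriction applies to each byte of a 16-bit code.
CONTROL_BYTES = set(range(0, 32)) | set(range(127, 160))


def is_graphic_byte(value: int) -> bool:
    """True if a single byte value lies outside the reserved ranges."""
    return 0 <= value <= 255 and value not in CONTROL_BYTES


def avoids_control_bytes(code: int) -> bool:
    """True if neither byte of a 16-bit code falls in a reserved range."""
    high, low = (code >> 8) & 0xFF, code & 0xFF
    return is_graphic_byte(high) and is_graphic_byte(low)


usable_per_byte = 256 - len(CONTROL_BYTES)   # values left per byte
restricted_16 = usable_per_byte ** 2         # two restricted bytes
unrestricted_16 = 256 ** 2                   # all 16-bit values

per_byte_ratio = 256 / usable_per_byte
two_byte_ratio = unrestricted_16 / restricted_16

print(f"usable values per byte:    {usable_per_byte}")
print(f"restricted 16-bit codes:   {restricted_16}")
print(f"unrestricted 16-bit codes: {unrestricted_16}")
print(f"gain per byte:             {per_byte_ratio:.2f}x")
print(f"gain over two bytes:       {two_byte_ratio:.2f}x")
```

For instance, avoids_control_bytes(0x0041) is false under these
assumptions, because the high byte 0x00 falls in the reserved range.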

Keld Simonsen