randall@uvaarpa.virginia.edu (Randall Atkinson) (03/15/90)
From: randall@uvaarpa.virginia.edu (Randall Atkinson)

As one who is fairly active in the multilingual computing side of things, I'm fairly certain that it just isn't worth it to try to make ISO 646 the basis of *anything*, for the practical reason that it wasn't well thought out to begin with and has already been superseded by the ISO 8859/* family of 8-bit character sets. The latter fully support European linguistic needs (yes, including Danish and Icelandic and ...) and can be used quite nicely with most UNIX shells that I'm familiar with.

I thought that trigraphs got excessive attention back when ANSI C was being developed, and I fear that excessive attention will be devoted to ISO 646 when there are other areas of internationalisation that really deserve to be thought about and solved cleanly.

Most of the vendors of hardware in Europe are supporting ISO 8859/1 now, so it is the real long-term solution to European needs anyway. Worrying about support for ISO 646 is a mistake; worrying about support for ISO 8859/*, about the Asian need for larger character sets, and about ways of handling date formats and the like is not a mistake at all.

Volume-Number: Volume 18, Number 73
marius@rhi.hi.is (Marius Olafsson) (03/17/90)
From: marius@rhi.hi.is (Marius Olafsson)

randall@uvaarpa.virginia.edu (Randall Atkinson) writes:

> I'm fairly certain that it just isn't worth
>it to try to make ISO 646 the basis of *anything* for the
>practical reason that it wasn't well thought out to begin with
>and has already been superseded by the ISO 8859/* family of
>8-bit character sets.

I agree. The ISO 8859 series of character sets has the (in my opinion necessary) quality that the *complete* set of ASCII characters can be represented. If ISO 646 is taken into consideration, must we then allow alternate syntax in the various shells and utilities that make use of the characters {}[]@\| and `? I think that is a can of worms best left unopened.

>The latter fully support European linguistic needs (yes, including
>Danish and Icelandic and ...) and can be used quite nicely with
>most UNIX shells that I'm familiar with.

And it seems that most major manufacturers already have (or have announced) support for ISO 8859: at least HP-UX, Ultrix, AIX and SunOS, and more, I am sure. The X Window System now supports ISO 8859 fonts, the latest Adobe release of PostScript supports ISO 8859 encoding of the fonts, and the list goes on ... NONE provides any support for, or gives any consideration to, ISO 646 (fortunately).

> I fear that excessive attention will be
>devoted to ISO 646 when there are other areas of internationalisation
>that really deserve being thought about and solved cleanly.

Definitely, and serious consideration should be given to the way X/Open has defined some of these other areas. That system actually works pretty well in practice. It has been used here for about two years (on HP-UX).

--
Marius Olafsson          internet: marius@rhi.hi.is
University of Iceland    UUCP: {mcsun,sunic,uunet}!isgate!rhi!marius

Volume-Number: Volume 18, Number 77
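
The "can of worms" is easy to see with a small sketch. The Python below shows roughly how a shell command would read on a terminal using the Danish national variant of ISO 646. The six replacements in the table are the ones commonly listed for that variant and are an assumption here, not a quotation from the standard; the name `as_seen_on_danish_terminal` is hypothetical.

```python
# Assumed (partial) replacements in the Danish ISO 646 variant:
# the same 7-bit code that means "{" in ASCII displays as a national letter.
DANISH_646 = str.maketrans({
    "[": "Æ", "\\": "Ø", "]": "Å",
    "{": "æ", "|": "ø", "}": "å",
})


def as_seen_on_danish_terminal(text: str) -> str:
    """Return ASCII text as it would appear under the assumed mapping."""
    return text.translate(DANISH_646)


if __name__ == "__main__":
    shell_line = "ls | awk '{ print $1 }'"
    print(shell_line)
    print(as_seen_on_danish_terminal(shell_line))
```

The bytes sent to the shell are unchanged; only their appearance differs, which is why supporting ISO 646 would mean either unreadable syntax or alternate spellings for every such character.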
wheeler@ida.org (David Wheeler) (03/17/90)
From: wheeler@ida.org (David Wheeler)

domo@tsa.co.uk (Dominic Dunlop):
= From: Dominic Dunlop <domo@tsa.co.uk>
=
=    Report on ISO/IEEE JTC1/SC22/WG15 Rapporteur Group on
=    Internationalization Meeting of 5th - 7th
=    March, 1990, Copenhagen, Denmark
=
=    Dominic Dunlop -- domo@tsa.co.uk
=
=    The Standard Answer Ltd.
=

I enjoyed your posting, thank you! You included a lot of "what this phrase really means" that I appreciated.

=
= 3.  ISO 646[4], the earliest ISO standard for information
=     technology, is the international derivative of ASCII.
=     Its Danish variant replaces ASCII's } with aa. Around
=     the world, #$@[\]^`{|}~, all of which have a special
=     meaning to the shell, are replaced by other characters
=     in standards derived from ISO 646. See [5] for much
=     more information.
=

Isn't there an 8-bit standard character set that defines the first 128 characters as a standard set (say as USASCII, provincial I'm afraid but it would break no Unix tools), then includes all the international characters as those with values > 127? If this were used in the POSIX standard, wouldn't this solve many problems for those using a Latin-based alphabet? Or is this standard unused in the real world?

Admittedly this eliminates the non-Latin alphabet world, and that is a weakness.

= Apart from all this organizational stuff, we did review some
= existing documents. For example, DTR (draft technical
= report) 10176, a product of SC14, discusses the treatment of
= characters appearing in language constructs, variable names,
= literals and comments, and turns out to have implications
= for sh, awk, yacc and the other ``little languages'' defined
= in DP 9945-2, the forthcoming international standard for the
= shell and tools. And a document from SC22's study group on
= character sets suggests that source files should have some
= means of announcing the character set that they're using.
= Could this mean typed files or resource forks for POSIX6?
= Gee. How would we hide that?
=

Some C programs would have to be fixed to deal with signed characters, but at least the rules would be simple: characters 128 and above are ordinary characters and can be used in identifiers, etc.

Source file tagging for language sounds like an abomination!

--- David A. Wheeler
    wheeler@ida.org

Volume-Number: Volume 18, Number 80
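
A minimal sketch of the two technical points in the post above, written in Python rather than C for brevity. It checks that ISO 8859-1 leaves the 128 ASCII code points unchanged, and shows what each character above 127 becomes when a C program stores it in a `signed char`, which is the case such programs would have to be fixed for (for example, code that uses a plain `char` as an array index). The helper names `as_signed_char` and `high_half` are hypothetical, and `struct`'s `"b"` format is used only to emulate an 8-bit two's-complement signed char.

```python
import struct

# ISO 8859-1 keeps every ASCII code point (0..127) exactly where ASCII has it.
ascii_bytes = bytes(range(128))
assert ascii_bytes.decode("latin-1") == ascii_bytes.decode("ascii")


def as_signed_char(byte_value: int) -> int:
    """Reinterpret an unsigned byte (0..255) as an 8-bit signed value."""
    return struct.unpack("b", bytes([byte_value]))[0]


def high_half(text: str) -> dict:
    """Map each character above 127 to (unsigned code, signed-char value)
    under ISO 8859-1; plain ASCII characters are skipped."""
    result = {}
    for ch in text:
        code = ch.encode("latin-1")[0]
        if code > 127:
            result[ch] = (code, as_signed_char(code))
    return result


if __name__ == "__main__":
    for ch, (unsigned, signed) in high_half("Þórður og Søren").items():
        print(f"{ch!r}: unsigned {unsigned}, as signed char {signed}")
```

Every national character comes out negative in the signed view, so a lookup table indexed by an unconverted `char` would be read out of bounds on such a platform; casting to `unsigned char` first is the usual fix.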