irf@kuling.UUCP (Bo Thide) (04/03/88)
In article <1399@edison.GE.COM> rja@edison.GE.COM (rja) writes:
>In recent postings in comp.unix.wizards, International language support
>has been mentioned. Dave Decot (spelling ?) @ HP indicated that there is
>on-going standards development in these areas. I'd really like to find
>out more about what is going on. Would folks involved with this care to
>comment ??
> My particular area of interest is Asian Language Support. I'm aware
[deleted ...]
> I'd also be interested in the European support, particularly X/OPEN
>standards that exist or are in development. In fact, it would be nice
>if X/OPEN could post (quarterly perhaps) a summary of its standards
>development. This newsgroup seems under-utilised as it is.

NLS is a great idea! To give you some background, I take the liberty of
quoting from the EUUG Newsletter Vol7 No2 article "An Overview of the
Native Language System" by Michael J. C. Terry (mcjt@inset.co.uk):

"In January this year [1987 -bt], the X/OPEN group published the second
edition of its X/OPEN Portability Guide (XPG). Section 3 of the guide
included a software internationalisation interface standard
specification -- the Native Language System (NLS). Although many
proprietary solutions to the internationalisation problem have been
attempted over the years, this is the first time that a commercial
standard has been specified for internationalisation on UNIX (R) [or
should it be (TM)? -bt] systems. The X/OPEN NLS standard specification
has arrived as a response to a pressure that has been growing slowly but
relentlessly from non-English-speaking UNIX users as use of the system
has filtered down from the ivory towers of academe to the
air-conditioned offices of modern commerce.
It is not surprising that this internationalisation specification has
emerged from the X/OPEN group rather than from AT&T -- after all,
despite the recent addition of American companies to the X/OPEN roll
call, X/OPEN started out as a purely European grouping, and is still
predominantly European. What is perhaps surprising is that the NLS
specification is based on an internationalisation architecture developed
in the USA by Hewlett-Packard.

...

Hewlett-Packard have a working version of NLS on their HP-UX operating
system. The source code has been made available to the other members of
X/OPEN in order to expedite its implementation on currently available
versions of UNIX.

...

The eventual intention is that NLS will support multiple 8-bit character
sets. The XPG states:

    This first issue of the X/OPEN NLS specification defines the major
    transmission codeset for Western European use as the standard
    IS8859/1, and also recommends its use as the corresponding internal
    codeset. Other codesets will be identified in later issues. The
    IS8859/1 codeset is capable of supporting most major Western
    European languages. In addition, it is compatible with ASCII
    functionality, since it incorporates the ASCII codeset as the first
    128 characters of the codeset"

To describe the agreed-upon standard codeset I quote from "International
Standard. Information processing -- 8-bit single byte coded graphic
character sets -- Part 1: Latin alphabet No. 1", ISO 8859-1, First
edition 1987-02-15:

"ISO 8859 [this is the correct name -bt] consists of several parts. Each
part specifies a set of up to 191 graphic characters and the coded
representation of each of these characters by means of a single 8-bit
byte. The use of control functions for the coded representation of
composite characters is prohibited by ISO 8859. Each set is intended for
use for a group of languages. ISO 8859/2 specifies a set of 191 graphic
characters identified as Latin alphabet No. 2.

....
This set of graphic characters, the Latin alphabet No. 1, is intended
for use in data processing and text applications and may also be used
for information interchange. The set contains graphic characters used
for general purpose applications in typical office environments in at
least the following languages: Danish, Dutch, English, Faroese, Finnish,
French, German, Icelandic, Irish, Italian, Norwegian, Portuguese,
Spanish and Swedish"

The ISO 8859/1 codeset contains things like soft hyphen, capital and
small letter A with acute accent, capital and small Icelandic characters
ETH and THORN, the small German letter SHARP S, capital and small letter
A with ring above, diaeresis characters, and much more. ISO 8859/2 is
useful for Albanian, Czech, English, German, Hungarian, Polish,
Rumanian, Serbocroatian, Slovak and Slovene and contains, for instance,
characters with carons ("inverted circumflex accents") used in some of
these languages.

The ISO 8859 codesets are very complete and are extremely cleverly
designed with the capitals coded as SHIFTed small characters. This is
NOT true for other 8-bit character codesets! (Do you listen, HP???)

-Bo
--
>>> Bo Thide', Swedish Institute of Space Physics, S-755 90 Uppsala, Sweden <<<
Phone (+46) 18-300020. Telex: 76036 (IRFUPP S). UUCP: ..enea!kuling!irfu!bt
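
P.S. For anyone who wants to check the codeset claims above for ISO
8859/1, here is a minimal sketch. It rests on assumptions that are not
part of the quoted texts: Python's built-in "latin-1" codec stands in
for the ISO 8859/1 code table, Python's own case mapping decides which
characters count as capitals, and everything that is not a control
character is counted as a graphic character. The function names are
invented for illustration. Only Part 1 is checked; the sketch says
nothing about the other parts of ISO 8859.

```python
import unicodedata

# Assumption: the bit that separates 'A' (0x41) from 'a' (0x61) in ASCII
# is also the "shift" meant in the post.
CASE_BIT = 0x20


def decode(code):
    """Map one byte value to its character via the latin-1 codec."""
    return bytes([code]).decode("latin-1")


def case_pairs():
    """Return (capital_code, small_code) for every capital in the codeset
    whose small letter is also inside the 8-bit codeset."""
    pairs = []
    for code in range(256):
        ch = decode(code)
        lower = ch.lower()
        if lower != ch and len(lower) == 1 and ord(lower) < 256:
            pairs.append((code, ord(lower)))
    return pairs


def shift_exceptions(pairs):
    """Pairs where the small letter is NOT the capital with CASE_BIT set."""
    return [(up, low) for (up, low) in pairs if low != (up | CASE_BIT)]


def graphic_count():
    """Count code positions that are not control characters."""
    return sum(
        1 for code in range(256)
        if unicodedata.category(decode(code)) != "Cc"
    )


def ascii_compatible():
    """True if the first 128 positions decode exactly like ASCII."""
    raw = bytes(range(128))
    return raw.decode("latin-1") == raw.decode("ascii")


pairs = case_pairs()
exceptions = shift_exceptions(pairs)

if __name__ == "__main__":
    print("capital/small pairs:", len(pairs))
    print("pairs not related by the case bit:", exceptions)
    print("graphic characters:", graphic_count())
    print("ASCII in first 128 positions:", ascii_compatible())
```

An empty exception list means every capital found maps to its small
letter by setting one bit. Small letters with no capital partner in the
set (sharp s, for example) never appear in the pair list, so the check
does not cover them.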