leif@erisun.UUCP (Leif Samuelsson) (06/11/85)
In article <211@pyrltd.UUCP> bejc@pyrltd.UUCP (Brian Clark) writes:
> /usr/group/UK is proposing to establish an International Working Group to
> develop current ideas on the integration of European character sets into
> formal proposals.

I think we need to define our terms here. What is the difference between
"internationalising" and "nationalising"? To me, they seem to be two
radically different concepts.

Translating Unix commands to other languages and/or incorporating other
character sets should really be called "nationalising", while the word
"internationalising" should be used to describe the act of making Unix
less dependent on the U.S. character set (and thereby making
"nationalising" Unix an easier task).

For everyone's info, the following eleven characters are to be
considered national, and should be avoided in software meant to
be "international":

	#$@[\]^{|}~

----
Leif Samuelsson                 Ericsson Information Systems AB
..mcvax!enea!erix!erisun!leif   Advanced Workstations Division
                                S-172 93 SUNDBYBERG
59 19 N / 17 57 E               SWEDEN
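
As a rough illustration of the list above, here is a minimal sketch that scans a piece of text for the eleven characters named in the post. The function names are hypothetical and not part of any Unix tool. The Swedish mapping for the bracket and brace positions is an assumption based on the Swedish national 7-bit variant, included only to show why C source becomes hard to read on such a terminal.

```python
# The eleven characters the post lists as "national".
NATIONAL_USE = set("#$@[\\]^{|}~")

# Assumption: in the Swedish national 7-bit variant, the code positions
# of [ \ ] { | } display as the letters below.
SWEDISH_7BIT = str.maketrans("[\\]{|}", "ÄÖÅäöå")


def national_chars(text):
    """Return the national-use characters in text, in order of first appearance."""
    seen = []
    for ch in text:
        if ch in NATIONAL_USE and ch not in seen:
            seen.append(ch)
    return seen


def as_seen_on_swedish_terminal(text):
    """Show how the same 7-bit codes would be displayed under the assumed mapping."""
    return text.translate(SWEDISH_7BIT)


if __name__ == "__main__":
    line = "main() { a[i] = 0; }"
    print(national_chars(line))
    print(as_seen_on_swedish_terminal(line))
```

Any line of C that indexes an array or opens a block uses several of these positions, which is why avoiding them entirely is a strong requirement for "international" software.
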
aeb@mcvax.UUCP (Andries Brouwer) (06/11/85)
In article <330@erisun.UUCP> leif@erisun.UUCP (Leif Samuelsson) writes:
>
>For everyone's info, the following eleven characters are to be
>considered national, and should be avoided in software meant to
>be "international":
>
> #$@[\]^{|}~
>

No, one wishes to use the full national character set in identifiers,
command names etc. On the other hand, one also wishes to use the graphics
mentioned, both in texts and as syntax specifiers. Finally, to write all
European languages that use the Roman alphabet requires a little more
than eleven additional characters.

Conclusion: make the codes for Scandinavian aa, ae, oe, for Icelandic
-d, th, for German sz, for Dutch ij, for French c,, for Spanish n~, for
Turkish dotless i, for accented vowels in many languages and the various
special symbols in Polish, Czech and Romanian distinct from each other
and from the codes for the graphics mentioned above.

Clearly this requires an expansion of the ASCII space from 7-bit to 8-bit.
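
The requirement in the conclusion above can be sketched as a small check: every national letter gets a code distinct from the others and from the eleven graphics, and none of those codes fits in 7 bits. The sketch uses present-day Unicode code points purely for illustration; Unicode postdates this thread, and the choice of letters for "Scandinavian aa, ae, oe" (å, æ, ø) is an assumption. All names are hypothetical.

```python
# Letters named in the post, written with present-day Unicode characters.
# Assumption: these are the letters the ASCII transliterations stand for.
LETTERS = {
    "Scandinavian aa, ae, oe": "åæø",
    "Icelandic -d, th": "ðþ",
    "German sz": "ß",
    "Dutch ij": "\u0133",  # single-character ij ligature
    "French c,": "ç",
    "Spanish n~": "ñ",
    "Turkish dotless i": "ı",
}

# The eleven graphics from the quoted post.
GRAPHICS = "#$@[\\]^{|}~"


def all_letters():
    """Flatten the table above into one string of letters."""
    return "".join(LETTERS.values())


def fits_in_7_bits(ch):
    """True if the character's code is below 128, i.e. inside 7-bit ASCII."""
    return ord(ch) < 128


def codes_are_distinct(chars):
    """True if no two characters in chars share a code."""
    return len(set(chars)) == len(chars)


if __name__ == "__main__":
    letters = all_letters()
    print("graphics fit in 7 bits:", all(fits_in_7_bits(c) for c in GRAPHICS))
    print("letters fit in 7 bits: ", any(fits_in_7_bits(c) for c in letters))
    print("all codes distinct:    ", codes_are_distinct(letters + GRAPHICS))
```

The check covers only the letters spelled out by name; the accented vowels and the Polish, Czech and Romanian symbols mentioned in the post would add to the count.
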
andersa@kuling.UUCP (Anders Andersson) (06/18/85)
In article <330@erisun.UUCP> leif@erisun.UUCP (Leif Samuelsson) writes:
>In article <211@pyrltd.UUCP> bejc@pyrltd.UUCP (Brian Clark) writes:
>> /usr/group/UK is proposing to establish an International Working Group to
>> develop current ideas on the integration of European character sets into
>> formal proposals.

>For everyone's info, the following eleven characters are to be
>considered national, and should be avoided in software meant to
>be "international":
>
> #$@[\]^{|}~

Regardless of whether this is the proper way to go about the problem or
not (I don't think it is), shouldn't "`" be among those characters?

The subject line (and the fact that this discussion goes to net.unix, not
net.text) seems a little strange to me. Does the Working Group focus on
character sets in Unix specifically? I consider this a problem of natural
language text representation in general.

I would appreciate it if someone could make these things clear.