sommar@enea.se (Erland Sommarskog) (11/19/89)
(This is hardly news for comp.std.internat readers, but the subject
belongs to that group.)

Salmela Jarmo (js@kaarne.tut.fi) writes:
>PS. The ASCII standard that supports national characters is really
>needed.

Well, ASCII supports all national characters it can think of. I.e.,
American. But seriously, such a standard exists. The standard you want
is ISO 8859, which is a family of eight-bit standards, all with good
old ASCII in the 0-127 slots, new control characters in 128-159,
non-break space in 160 and "soft hyphen" in ord('-') + 128. The rest
differs between the various standards: there are five with Latin
characters, and one each with Cyrillic, Arabic, Hebrew and Greek
characters. I don't know if all of them are settled, but at least
Latin-1 and Latin-2 are. One can predict that for the next few years
Latin-1 will be the most important, since it covers all major Western
European languages except, I think, Welsh and Catalan. Latin-2 covers
Eastern European languages.

Then of course there is the problem of posting Usenet articles from
your VT320 using Latin-1. People with seven-bit terminals, of which
there probably are a few, will get the new characters folded into old
ones, making your text quite incomprehensible, even worse than those
brackets and braces you get using the national seven-bit conventions
for dotted "a":s and "o":s.
--
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
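The layout described in the post above can be written down as a small sketch. This is only an illustration of the regions as the post names them, not a statement of the standard's full text; the function name `classify` and the region labels are made up for the example, and Python's built-in "latin-1" codec stands in for ISO 8859-1.

```python
# Sketch of the ISO 8859 layout as described in the post.
# classify() and its labels are hypothetical names for this example.

SOFT_HYPHEN = ord("-") + 128  # 45 + 128 = 173 (0xAD)


def classify(code: int) -> str:
    """Name the region that an eight-bit code (0-255) falls into."""
    if not 0 <= code <= 255:
        raise ValueError("an ISO 8859 code fits in one byte")
    if code < 128:
        return "ASCII"
    if code < 160:
        return "new control character"
    if code == 160:
        return "non-break space"
    if code == SOFT_HYPHEN:
        return "soft hyphen"
    return "varies by part (Latin-1, Latin-2, ...)"


if __name__ == "__main__":
    for code in (0x41, 0x85, 0xA0, SOFT_HYPHEN, 0xDE):
        print(f"0x{code:02X}: {classify(code)}")
    # In Latin-1 the byte 0xDE is the capital thorn.
    print(repr(bytes([0xDE]).decode("latin-1")))
```

Only the last category changes from one part of the family to the next, which is why a seven-bit ASCII text is valid in every part.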
heimir@rhi.hi.is (Heimir Thor Sverrisson) (11/20/89)
sommar@enea.se (Erland Sommarskog) writes:

... deleted description of the eight bit character set standard,
ISO 8859 (especially ISO 8859/1 or Latin-1).

>Then of course there is the problem of posting Usenet articles
>from your VT320 using Latin-1. People with seven-bit terminals,
>of which there probably are a few, will get the new characters
>folded into old ones, making your text quite incomprehensible, even
>worse than those brackets and braces you get using the national
>seven-bit conventions for dotted "a":s and "o":s.

People with seven bit terminals can put filters on their news readers
so they get something meaningful out of the eight bit characters. They
could for example translate the upper case Icelandic thorn into 'Th'
and 'o acute' into 'o'. Then I would be able to use my middle name
SPELLED CORRECTLY in my signature. I could also send you direct mail
in Danish and you could answer me in Swedish.

We have been using the ISO set here in Iceland for some years now and
I'm very surprised at how far behind the Scandinavian countries are in
this sense; they all seem to be using (their own special version of)
seven bit modified ASCII sets.
--
Heimir Thor Sverrisson heimir@rhi.hi.is
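A filter of the kind suggested above might look roughly like the following sketch. It assumes the incoming article is Latin-1; the table `FALLBACK` and the function `to_seven_bit` are hypothetical names, and only the thorn and o-acute entries come from the post. The other entries, and the choice of "?" for unmapped characters, are assumptions added for illustration.

```python
# Sketch of a news-reader filter for seven-bit terminals.
# FALLBACK and to_seven_bit() are hypothetical; the article is
# assumed to be encoded in ISO 8859-1 (Latin-1).

FALLBACK = {
    "\u00de": "Th",  # capital thorn (from the post)
    "\u00f3": "o",   # o acute (from the post)
    "\u00fe": "th",  # small thorn (assumed by analogy)
    "\u00d3": "O",   # capital O acute (assumed by analogy)
    "\u00e6": "ae",  # ae ligature (assumed)
    "\u00f6": "oe",  # o with diaeresis (assumed)
    "\u00e5": "aa",  # a with ring (assumed)
}


def to_seven_bit(raw: bytes) -> str:
    """Decode Latin-1 bytes and spell out anything beyond ASCII."""
    pieces = []
    for ch in raw.decode("latin-1"):
        if ord(ch) < 128:
            pieces.append(ch)  # plain ASCII passes through
        else:
            pieces.append(FALLBACK.get(ch, "?"))  # "?" if unmapped
    return "".join(pieces)


if __name__ == "__main__":
    # Bytes 0xDE 0xF3 'r': thorn, o acute, r.
    print(to_seven_bit(b"\xde\xf3r"))
```

The point of the design is that the reader with the old terminal installs the filter once, and the sender keeps writing eight-bit text.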
psv@nada.kth.se (Peter Svanberg) (11/21/89)
In article <1353@krafla.rhi.hi.is> heimir@rhi.hi.is (Heimir Thor Sverrisson) writes:
>
>People with seven bit terminals can put filters on their news readers
>so they get something meaningful out of the eight bit characters. They
>could for example translate the upper case Icelandic thorn into 'Th'
>and 'o acute' into 'o'. Then I would be able to use my middle name
>SPELLED CORRECTLY in my signature. I could also send you direct mail
>in Danish and you could answer me in Swedish.
>

As usual, when you change fundamental things like this, you must make
it as invisible as possible for everybody who hasn't got the equipment
for, or isn't interested in, the improvements you can get as a
consequence of the change. So those who want the improvements are the
ones who must make an effort to GET them, not everybody else to AVOID
them (at least not when "everybody else" is in great majority).

>We have been using the ISO set here in Iceland for some years now and
>I'm very surprised at how far behind the Scandinavian countries are in
>this sense; they all seem to be using (their own special version of)
>seven bit modified ASCII sets.

There are a number of problems with converting to an eight bit
character set. A large one is that most of the software and hardware
we use doesn't know anything about it. (Yes, this is slowly changing
now, but it isn't good yet, and certainly was not several years ago!)

What did you use before? Have you really converted to ISO 8859-1
everywhere in Iceland? On which operating systems?

Other differences between us and you are that you have more non-ASCII
characters than we have and that you - being a small isolated country
- are very caring of your language etc. (For us it's rather the
opposite on the latter point.)

But, as I said, things are changing. I predict some character set
confusion (of another kind than the current) in Europe in the next few
years, followed by - comparatively - calm, in perhaps five years.
---                      psv@nada.kth.se (should work!)
Peter Svanberg           uunet!nada.kth.se!psv (for lazy nodes...)
Dept of Num An & CS      psv%nada.kth.se@uunet.uu.net (ARPA nodes)
Royal Institute of Tech
Stockholm, SWEDEN
finn@mojo.UUCP (Finn Markmanrud) (11/22/89)
Please be kind to us poor beginners! I have no idea how to convert
^ to Th or anything similar. Being the only Norwegian in the company
(I think), I am pretty sure I cannot get a request through to include
this on our system. Some day I might be able to make my own conversion
in my own directory, but until then, I would appreciate being able to
read mail & news from my Scandinavian friends. Most of them use oe, ae,
and aa as substitutes, and it works very well.

We use 7-bits, and from what I hear, this is no longer any good. Am I
about to lose touch with my old country / continent?

Maybe it's not as bad as it sounds, but I thought I'd remind all you
whizzes out there that there are a few people who call themselves
"users," and do just that - use the facilities provided.

Please be gentle!
--
+=====================+========================+=============================+
| Finn Markmanrud     | finn@mojo.nec.com      | "It can't happen here."     |
| (508) 264 8668      | Boxboro, MA            | F.Z.                        |
+=====================+========================+=============================+
heimir@rhi.hi.is (Heimir Thor Sverrisson) (11/23/89)
psv@nada.kth.se (Peter Svanberg) writes:

>>People with seven bit terminals can put filters on their news readers
>>so they get something meaningful out of the eight bit characters.

>As usual, when you change fundamental things like this, you must make
>it as invisible as possible for everybody who hasn't got the equipment
>for, or isn't interested in, the improvements you can get as a
>consequence of the change. So those who want the improvements are the
>ones who must make an effort to GET them, not everybody else to AVOID
>them (at least not when "everybody else" is in great majority).

Because of the structure of ISO 8859, the eight-bit characters will
fold into 'printable' seven-bit characters anyhow. If someone does not
change his old system to interpret the eight-bit characters, so what?
He's not interested anyway!

>>We have been using the ISO set here in Iceland for some years now and
>>I'm very surprised at how far behind the Scandinavian countries are in
>>this sense; they all seem to be using (their own special version of)
>>seven bit modified ASCII sets.

>There are a number of problems with converting to an eight bit
>character set. A large one is that most of the software and hardware
>we use doesn't know anything about it. (Yes, this is slowly changing
>now, but it isn't good yet, and certainly was not several years ago!)

You will be surprised if you really try to use eight bit data :-)
Most systems are at least 'eight-bit transparent', i.e. they don't
'scrub' the data to seven-bit. Unix systems that I've used that do
better than that are, for example, HP-UX, IBM's AIX (both RT and PS/2)
and all the Unixes for the Intel 80386 that I've tested. The worst
experience I've had recently was with a Sun 4 csh that logs you out if
you enter a character with the eighth bit set!

Many software packages now allow eight-bit data. I was just testing
Informix RDBS on this same Sun 4 and found out that I could really
enter eight bit data into forms, which I could not do two years ago.
We've also got some public domain software that has been *corrected*
to be able to use eight-bit characters, such as mailers, editors and
news readers.

>What did you use before? Have you really converted to ISO 8859-1
>everywhere in Iceland? On which operating systems?

We did have a national version of ISO-646 that could not cover all the
accented characters we've got. The Unix systems are generally using
ISO, which is the only official Icelandic standard for eight-bit
character sets. On PCs people are using a national version of the
American PC-set (yuk) and very few have adopted Code Page 850 that
came from IBM when they introduced the PS/2 line. On the IBM-360/370
and 3X and AS400 they are using some (different) versions of EBCDIC :-(

>Other differences between us and you are that you have more non-ASCII
>characters than we have and that you - being a small isolated country
>- are very caring of your language etc. (For us it's rather the
>opposite on the latter point.)

The first point is certainly true: our alphabet has 36 characters,
which means that we need 20 characters (uc+lc) that are not in ASCII.
I would certainly not tolerate a letter from the authorities that did
not have my name spelled correctly!

>But, as I said, things are changing. I predict some character set
>confusion (of another kind than the current) in Europe in the next few
>years, followed by - comparatively - calm, in perhaps five years.

I don't think it will even take that long. All major hardware
manufacturers have made most of their terminal equipment independent
of the character set by moving functions into software that were
previously done in hardware.

The European market is also the fastest growing for many software
houses and is in many cases already bigger than the US market. If
these people really want to make it over here they can solve many of
their problems by using ONE character set that covers the US, Europe
and South America!
--
Heimir Thor Sverrisson heimir@rhi.hi.is
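The "folding" mentioned in this post can be illustrated with a short sketch. It assumes that a seven-bit device simply drops the eighth bit of each byte, which is one plausible reading of the post and not a description of any particular terminal; `fold` is a hypothetical name.

```python
# Sketch of eighth-bit stripping, assuming the seven-bit device just
# masks each byte with 0x7F. fold() is a hypothetical helper name.

def fold(raw: bytes) -> bytes:
    """Clear the eighth bit of every byte."""
    return bytes(b & 0x7F for b in raw)


def folds_to_printable_range(code: int) -> bool:
    """True if the folded code lands in 32-127 (space through DEL)."""
    return 32 <= (code & 0x7F) <= 127


if __name__ == "__main__":
    # Latin-1 bytes for capital thorn, o acute, 'r'.
    print(fold(b"\xde\xf3r"))
    # Every code in 160-255 folds into the range 32-127, while the
    # new control codes 128-159 fold onto the old controls 0-31.
    print(all(folds_to_printable_range(c) for c in range(160, 256)))
    print(all((c & 0x7F) < 32 for c in range(128, 160)))
```

Under that assumption the capital thorn (0xDE) comes out as "^" (0x5E), so the text stays displayable but is no longer readable without a filter.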
magnus@rhi.hi.is (Magnus Gislason) (11/25/89)
heimir@rhi.hi.is (Heimir Thor Sverrisson) writes:

[Talking about the Icelandic alphabet]

>The first point is certainly true, our alphabet has 36 characters, which
>means that we need 20 characters (uc+lc) that are not in ASCII. I would

You should know that the Icelandic alphabet does not include C, Q, W and Z,
and thus only contains 32 characters. :-)
einari@rhi.hi.is (Einar Indridason) (11/26/89)
In article <1383@krafla.rhi.hi.is> magnus@rhi.hi.is (Magnus Gislason) writes:
>heimir@rhi.hi.is (Heimir Thor Sverrisson) writes:
>
>[Talking about the Icelandic alphabet]
>
>>The first point is certainly true, our alphabet has 36 characters, which
>>means that we need 20 characters (uc+lc) that are not in ASCII. I would
>
>You should know that the Icelandic alphabet does not include C, Q, W and Z,
>and thus only contains 32 characters. :-)

I will most definitely not write 'pizza' as 'pissa' :-)
(Besides, 'pissa' has another meaning in Icelandic as well.)

But I'm really pissed off (no 'pizza' here :-) about 'americaned' software
which does not allow us here in Iceland to use our full national character
set. For example, DBase-III does not allow the big 'thorn', but instead
treats it as an end-of-file, meaning that whatever comes after the big
thorn is ignored. Some editors choke or perform unwanted commands
whenever the special Icelandic characters are used, like 'kill-file',
'save-and-quit' and other nasties like that.

If there are any software-writers out there, please consider us Icelanders
(and others) who must use an 8-bit character set. While you are doing that,
could you consider adding some 'sorting tables' so that we can sort in our
applications the Icelandic way.
????????????????????
--
To quote Alfred E. Neuman: "What! Me worry????"

Internet: einari@rhi.hi.is
UUCP: ..!mcvax!hafro!rhi!einari
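The 'sorting tables' asked for above could be as simple as a rank table used as a sort key. The sketch below is a simplification under stated assumptions: it uses the 32-letter alphabet order (a, á, b, d, ð, ...), compares letter by letter, ignores case, and puts any other character (including c, q, w and z) after the Icelandic letters. The names `ICELANDIC_ORDER` and `icelandic_key` are made up for the example.

```python
# Sketch of a sorting table for Icelandic, used as a sort key.
# ICELANDIC_ORDER and icelandic_key() are hypothetical names; placing
# letters outside the table last is an assumption, not a rule.

ICELANDIC_ORDER = (
    "a\u00e1bd\u00f0e\u00e9fghi\u00edjklmno\u00f3"
    "prstu\u00favxy\u00fd\u00fe\u00e6\u00f6"
)  # a á b d ð e é f g h i í j k l m n o ó p r s t u ú v x y ý þ æ ö

RANK = {letter: position for position, letter in enumerate(ICELANDIC_ORDER)}


def icelandic_key(word: str) -> list[int]:
    """Map a word to a list of ranks; unknown characters sort last."""
    return [
        RANK.get(ch, len(ICELANDIC_ORDER) + ord(ch))
        for ch in word.lower()
    ]


if __name__ == "__main__":
    words = ["\u00feorn", "\u00e6ska", "\u00f6r", "tala", "\u00e1r", "agi"]
    print(sorted(words))                     # plain code-point order
    print(sorted(words, key=icelandic_key))  # table-driven order
```

Sorting by raw character codes puts the thorn after ö, because its Latin-1 code is higher, while the table puts it before æ and ö as the alphabet does.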
stefan@svax.cs.cornell.edu (Kjartan Stefansson) (11/26/89)
In article <1386@krafla.rhi.hi.is> einari@rhi.hi.is (Einar Indridason) writes:
>In article <1383@krafla.rhi.hi.is> magnus@rhi.hi.is (Magnus Gislason) writes:
>>heimir@rhi.hi.is (Heimir Thor Sverrisson) writes:
>>
>>[Talking about the Icelandic alphabet]
>>
>>>The first point is certainly true, our alphabet has 36 characters, which
>>>means that we need 20 characters (uc+lc) that are not in ASCII. I would
>>
>>You should know that the Icelandic alphabet does not include C, Q, W and Z,
>>and thus only contains 32 characters. :-)

We can argue about this, but the main point is of course that, for all
practical purposes, Icelanders need to deal with those 36 characters.
For instance, every character you mention appears in the phone
directory, in names of Icelandic people (although the roots of their
names are typically foreign, or poor foreign imitations :-)

>But I'm really pissed off (no 'pizza' here :-) about 'americaned' software
>which does not allow us here in Iceland to use our full national character
>set.
...[examples deleted]
>If there are any software-writers out there, please consider us Icelanders
>(and others) who must use an 8-bit character set.

Reminds me of this fantastic software called X11. It has several nice
fonts, including the full ISO 8859-1 set. But typically applications
strip the most significant bit in the data, so they can only display
the English set :-(

Of course there is always a way around it, and I know Icelanders have
managed to hack their way through in several cases. But that simply
illustrates how stupid the design was not to make this an option in
the first place.

Kjartan.
matsc@sics.se (Mats Carlsson) (11/27/89)
In article <1383@krafla.rhi.hi.is> magnus@rhi.hi.is (Magnus Gislason) writes:
> You should know that the Icelandic alphabet does not include C, Q, W and Z,
> and thus only contains 32 characters. :-)

Really? Wasn't it quite recently that a spelling reform said words
like "yzt" should be spelled with an s instead of a z, reverting an
earlier law which banned writing s instead of z? Didn't Halldor
Laxness even spend some time in prison for this "crime"?
--
Mats Carlsson
SICS, PO Box 1263, S-164 28 KISTA, Sweden Internet: matsc@sics.se
Tel: +46 8 7521543 Ttx: 812 61 54 SICS S Fax: +46 8 7517230
stefan@svax.cs.cornell.edu (Kjartan Stefansson) (11/27/89)
In article <MATSC.89Nov27092541@vishnu.sics.se> matsc@sics.se (Mats Carlsson) writes:
>In article <1383@krafla.rhi.hi.is> magnus@rhi.hi.is (Magnus Gislason) writes:
> You should know that the Icelandic alphabet does not include C, Q, W and Z,
> and thus only contains 32 characters. :-)
>
>Really? Wasn't it quite recently that a spelling reform said words
>like "yzt" should be spelled with an s instead of a z, reverting an
>earlier law which banned writing s instead of z?

Yes, this is correct. 'z' used to be a perfectly valid Icelandic
letter, but it is pronounced as 's' in modern Icelandic. The only way
to distinguish between 's' and 'z' in spelling was to know the root of
the word. A few years ago, a spelling reform replaced the 'z' with
an 's'.

> Didn't Halldor
>Laxness even spend some time in prison for this "crime"?

Halldor Laxness has been known for his style of spelling, which in
general is closer to the spoken language than the official spelling.
In his early work he was criticized a lot for this, but I don't
believe he was ever imprisoned for it!

Kjartan.