minow@decvax.UUCP (Martin Minow) (01/18/86)
"ISO Latin 1 8-bit alphabet, what is it?" -- these notes are mostly from memory, and I apologize in advance for any errors.

Latin-1 is intended to replace the current mess of National Replacement Character Sets (the ones that use any or all of #@[\]^`{|} for letters that aren't in the US national alphabet that we usually call ASCII). The alphabet is currently a draft international standard, being developed by ISO, ANSI, and CBEMA (European Business Equipment Manufacturers). It is very similar to the "Dec-Multinational" alphabet available with the VT200-series terminals and Dec's personal computers. It suits the needs of the majority of Western European Latin-letter languages, and there are proposals for "Latin-2" and "Latin-3" to suit the needs of Polish, Lithuanian, etc.

Latin-1 adds accented variants to upper- and lower-case vowels, as well as a number of other language-specific letters. There are also a number of additional symbols.

AEIOU and aeiou are provided in grave, acute, circumflex, and umlaut variants. The following letters are also provided:

    A-ring and a-ring (Swedish, Danish, Finnish, Norwegian)
    AE and ae ligatures (Danish)
    A-tilde and a-tilde
    C-cedilla and c-cedilla (French)
    N-tilde and n-tilde (Spanish)
    O-tilde and o-tilde
    O-slash and o-slash (Danish, Norwegian)
    OE and oe ligatures (Danish)
    ss (German sharp-s)
    Y-umlaut and y-umlaut (French, also used for the ij ligature in Dutch)

The above refers only to Dec-Multinational. Latin-1 adds a few more letters -- I believe these include Icelandic th and dh, and Turkish undotted-i and dotted-I.

While upper- and lower-case variants of the letters are related in the same way as "standard" ASCII, the rules to convert between cases are language-dependent. For example, lower-case accented letters generally lose their accents in French, but not in Swedish.

In preparing for Latin-1, you should carefully go over your programs to remove any instance of "high-bit used for a flag". Also, programs such as grep that let you search for "any alphabetic" or -- worse -- "upper-case" are going to need rethinking.

Hoping the above hasn't been too incorrect,

Martin Minow
decvax!minow
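To make the "high-bit used for a flag" warning concrete, here is a minimal sketch in Python. It is not taken from the posting or from any real program; the function names are hypothetical. It assumes a 7-bit-era program that masks off bit 7 of every byte, and shows what that does to an 8-bit Latin-1 letter, next to one possible alternative that keeps the flag outside the character.

```python
# Sketch only: strip_flag_bit and split_text_and_flags are illustrative names,
# not functions from vi or any other real program.

HIGH_BIT = 0x80  # bit 7: unused in 7-bit ASCII, part of the character in Latin-1


def strip_flag_bit(data: bytes) -> bytes:
    """Mask off bit 7 of every byte, as a program written for 7-bit text might."""
    return bytes(b & ~HIGH_BIT & 0xFF for b in data)


def split_text_and_flags(data: bytes, flagged: set) -> list:
    """Keep the flag beside the byte instead of inside it: (byte, is_flagged) pairs."""
    return [(b, i in flagged) for i, b in enumerate(data)]


ascii_text = "Noel".encode("latin-1")
latin1_text = "No\u00ebl".encode("latin-1")  # e with two dots is byte 0xEB

# Pure ASCII survives the masking; the Latin-1 letter is silently changed.
print(strip_flag_bit(ascii_text))   # b'Noel'
print(strip_flag_bit(latin1_text))  # b'Nokl'  (0xEB & 0x7F == 0x6B, which is 'k')

# Storing the flag separately leaves all eight bits for the character.
pairs = split_text_and_flags(latin1_text, flagged={2})
print(bytes(b for b, _ in pairs) == latin1_text)  # True
```

The separate-flag layout costs more memory per character, which is presumably why the high-bit trick was attractive in the first place; the sketch only shows that the two uses of bit 7 cannot coexist.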
kay@warwick.UUCP (Kay Dekker) (02/05/86)
In article <163@decvax.UUCP> minow@decvax.UUCP (Martin Minow) writes, apologising in advance for inexactitudes, as he is doing it from memory, and I (little smartass!) step in for a couple of corrections and a swipe at my antepenultimately loathed editor:

> Y-umlaut and y-umlaut (French, also used for the ij ligature in Dutch)

Since when has French used umlaute?

>For example, lower-case accented letters
>generally lose their accents in French

*Upper*-case letters in French usually lose their accents.

>In preparing for Latin-1, you should carefully go over your programs
>to remove any instance of "high-bit used for a flag".

Ho Boy! isn't vi going to need rewriting...

Kay.
--
Virtue is its own punishment.
... mcvax!ukc!warwick!kay
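The case-conversion point can be sketched in a few lines of Python. This is an illustration under stated assumptions, not anyone's actual implementation: the helper names are hypothetical, the accent-dropping rule stands in for the French convention described above, and the accent-keeping rule stands in for the Swedish one. It also checks the claim that accented Latin-1 letters keep the ASCII relationship between cases (codes 0x20 apart).

```python
import unicodedata


def latin1_case_offset(lower: str) -> int:
    """Distance between a lower-case letter's Latin-1 code and its upper-case code.

    Only meaningful for letters whose upper-case form is a single Latin-1
    character (so not sharp-s or y with two dots).
    """
    lo = lower.encode("latin-1")[0]
    up = lower.upper().encode("latin-1")[0]
    return lo - up


def upper_keep_accents(word: str) -> str:
    """Upper-case and keep the accents (the Swedish-style rule in this sketch)."""
    return word.upper()


def upper_drop_accents(word: str) -> str:
    """Upper-case and drop the accents (the French-style rule in this sketch)."""
    decomposed = unicodedata.normalize("NFD", word.upper())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Same 0x20 offset for a plain and an accented letter.
print(latin1_case_offset("e"), latin1_case_offset("\u00e9"))  # 32 32

# Same lower-case input, two language-dependent results.
print(upper_drop_accents("\u00e9t\u00e9"))  # ETE
print(upper_keep_accents("\u00e5l"))        # upper-case A-ring followed by L
```

So the byte arithmetic is uniform, but which function a program should call depends on the language of the text, which a character set alone cannot tell it.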
mikeb@inset.UUCP (Mike Banahan) (02/05/86)
In article <402@snow.warwick.UUCP> kay@warwick.UUCP (Kay Dekker) writes:
> .. Isn't vi going to need rewriting (to remove use of the eighth bit)
> Kay.

Doesn't it just! But the work is under way as we speak. You should see the Japanese vi that UNIX Pacific have done (though they call it jvi).

--
Mike Banahan, Technical Director, The Instruction Set Ltd.
mcvax!ukc!inset!mikeb
goudreau@dg_rtp.UUCP (Bob Goudreau) (02/07/86)
>Since when has French used umlaute?
For quite a long time. For example, "Citro\:en", "Saint-Sa\:ens", "No\:el",
where "\:e" stands for umlaut-e.
Bob Goudreau
urban@spp2.UUCP (Mike Urban) (02/08/86)
In article <402@snow.warwick.UUCP> kay@warwick.UUCP (Kay Dekker) writes:
>
>>In preparing for Latin-1, you should carefully go over your programs
>>to remove any instance of "high-bit used for a flag".
>
>Ho Boy! isn't vi going to need rewriting...
>

And everything else. But is Latin-1 really suitable as a replacement for any particular national character set? In particular, it doesn't collate correctly for any single country's alphabetization scheme, except of course the English-speakers. And just think of the rewrites for "isalpha" and all that stuff...

I think we have a mess on our hands.

--
Mike Urban
...!trwrb!trwspp!spp2!urban
"You're in a maze of twisty UUCP connections, all alike"
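Both worries, "isalpha" and collation, can be shown with a short Python sketch. Everything named here is hypothetical. The letter ranges assume Latin-1 as eventually published (letters at 0xC0-0xFF except the multiplication and division signs at 0xD7 and 0xF7); the draft under discussion may have differed. The accent-folding sort key is just one simplistic ordering chosen for illustration, not any country's actual rule.

```python
import unicodedata


def ascii_isalpha(b: int) -> bool:
    """The 7-bit test: A-Z or a-z only."""
    return 0x41 <= b <= 0x5A or 0x61 <= b <= 0x7A


def latin1_isalpha(b: int) -> bool:
    """ASCII letters plus 0xC0-0xFF, minus the two arithmetic signs.

    Ignores the few letter-like symbols below 0xC0 (assumption of this sketch).
    """
    return ascii_isalpha(b) or (0xC0 <= b <= 0xFF and b not in (0xD7, 0xF7))


def fold_accents(word: str) -> str:
    """Naive sort key: treat accented letters as their unaccented base letter."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


e_acute = "\u00e9".encode("latin-1")[0]  # byte 0xE9
print(ascii_isalpha(e_acute), latin1_isalpha(e_acute))  # False True

words = ["zebra", "\u00e9cole", "apple"]

# Sorting by code value puts every accented word after 'z'.
by_code = sorted(words, key=lambda w: w.encode("latin-1"))
print([w.encode("latin-1") for w in by_code])  # [b'apple', b'zebra', b'\xe9cole']

# The folded key files the accented word under 'e' instead.
by_folded = sorted(words, key=fold_accents)
print([fold_accents(w) for w in by_folded])  # ['apple', 'ecole', 'zebra']
```

Even the folded key is only right for some languages: Swedish, for example, sorts its A-ring and the dotted A and O after Z, so a single table cannot serve every country.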
minow@decvax.UUCP (Martin Minow) (02/10/86)
(sigh) Having started this mess, let me state that the two dots over vowels (and perhaps y) fulfill different roles (and have different names). They define totally distinct vowels in the Scandinavian and Finnish languages, a vowel modification in German, and a syllable boundary in English and French.

The technical term for the two dots is "dieresis". Since I can't spell dieresis (had to look it up) and assumed the gentle reader would understand (or not care), I used a more familiar term. My apologies.

Dieresis is used in English and French to indicate a syllable break. Proper journals, such as the New York Times and the New Yorker, add a dieresis to the second 'o' of "cooperate" and most readers should be familiar with "Noel" (Christmas) spelled with dieresis over the 'e'.

Hope this clears things up.

Martin Minow
decvax!minow
taylor@glasgow.glasgow.UUCP (Jem Taylor) (02/10/86)
In article <133@dg_rtp.UUCP> goudreau@dg_rtp.UUCP (Bob Goudreau) writes:
>>Since when has French used umlaute?
>For quite a long time. For example, "Citro\:en", "Saint-Sa\:ens", "No\:el",
>where "\:e" stands for umlaut-e.

The point is that 'umlaut' is the German for a mark placed on a vowel to indicate a vowel+letter-e combination - as in Go:ring/Goering.

In French the symbol 'trema' (visually identical to umlaut) is used on the letters i and e to indicate that the sound is broken in two (Noe:l) rather than flowing (Noel, pronounced as per knoll).

"Vive l'Alsace libre!"

-Jem
kay@warwick.UUCP (Kay Dekker) (02/12/86)
I asked (following a posting about the ISO Latin 1 alphabet):

>>Since when has French used umlaute?

and Bob Goudreau replied, saying:

>For quite a long time. For example, "Citro\:en", "Saint-Sa\:ens", "No\:el",
>where "\:e" stands for umlaut-e.

Err, those aren't umlaute (well, at least not in my book); they're diaereses: marks to indicate that adjacent vowels should be pronounced separately. I believe my question still stands.

Kay.
--
Virtue is its own punishment.
... mcvax!ukc!warwick!kay
goudreau@dg_rtp.UUCP (02/17/86)
In article <360@glasgow.glasgow.UUCP> taylor@glasgow.UUCP (Jem Taylor) writes:
>>>Since when has French used umlaute?
>>For quite a long time. For example, "Citro\:en", "Saint-Sa\:ens", "No\:el",
>>where "\:e" stands for umlaut-e.
>
>The point is that 'umlaut' is the german for a mark placed on a vowel to
>indicate a vowel+letter-e combination - as in Go:ring/Goering.
>
>In French the symbol 'trema' (visually identical to umlaut) is used on the
>letters i and e to indicate that the sound is broken in two (Noe:l) rather
>than flowing ( Noel, pronounced as per knoll ).
>
>"Vive l'Alsace libre!"
>
>-Jem

Actually, my point is that *any* information system's implementation of a French character set *should* include a way of generating this character. Whether you want to call it "e avec trema", "umlaut - e" or even "yo" (as in Russian) makes no difference. The important issue is its distinction from plain "e" or even from similar (but not identical) looking accents like the Hungarian dieresis.

Bob Goudreau
wagner@utcs.uucp (Michael Wagner) (02/23/86)
In article <163@dg_rtp.UUCP> goudreau@dg_rtp.UUCP (Bob Goudreau) writes:
> (...) The important issue is its distinction from plain "e"
>or even from similar (but not identical) looking accents like the Hungarian
>dieresis.
>
>Bob Goudreau

Well, my Hungarian dictionary, having been written to enable Hungarians to learn English rather than to enable me to understand Hungarian, doesn't give the proper name for these symbols. But there are two of them. One which looks like an oomlaut (although it has a different name), and one where the two dots are stretched into lines that slope up and to the right. The second form lengthens the vowel but otherwise keeps it sounding like the oomlaut form.

Michael
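The distinction between the two Hungarian marks, and between either of them and a plain letter, can be checked mechanically. The following Python sketch is an illustration only: it relies on Unicode character names and on the Latin-1 and Latin-2 tables as later published (Python's "latin-1" and "iso8859_2" codecs), neither of which existed in this form at the time of the thread, and the `encodable` helper is a hypothetical name.

```python
import unicodedata


def encodable(ch: str, codec: str) -> bool:
    """True if the character has a code in the given 8-bit character set."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


letters = {
    "e with two dots": "\u00eb",
    "o with two dots": "\u00f6",
    "o with two slanted strokes": "\u0151",
}

for label, ch in letters.items():
    print(
        label,
        "|", unicodedata.name(ch),
        "| Latin-1:", encodable(ch, "latin-1"),
        "| Latin-2:", encodable(ch, "iso8859_2"),
    )
```

Under these assumptions the two-dot letters are distinct characters from their plain bases, and the stretched Hungarian form is a third, separate character that Latin-1 has no code for at all, which is one reason a "Latin-2" was proposed alongside it.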
goudreau@dg_rtp.UUCP (02/26/86)
In article <1118@utcs.uucp> wagner@utcs.UUCP (Michael Wagner) writes:
>In article <163@dg_rtp.UUCP> goudreau@dg_rtp.UUCP (Bob Goudreau) writes:
>> (...) The important issue is its distinction from plain "e"
>>or even from similar (but not identical) looking accents like the Hungarian
>>dieresis.
>
>Well, my Hungarian dictionary, having been written to enable Hungarians
>to learn English rather than to enable me to understand Hungarian, doesn't
>give the proper name for these symbols. But there are two of them.
>One which looks like an oomlaut (although it has a different name), and
>one where the two dots are stretched into lines that slope up and to the
>right. The second form lengthens the vowel but otherwise keeps it sounding
>like the oomlaut form.
>
>Michael

That's what I meant. The distinction between these accents is sometimes lost by non-Hungarian readers.

Bob Goudreau