aeb@mcvax.UUCP (Andries Brouwer) (07/26/85)
Last time I just mentioned a few accents that occurred to me while
writing - let me now give a more detailed overview of what accents
exist.

1. Accents on top

- Acute accent (') occurs on top of almost anything; many languages
  have 'a 'e 'i 'o 'u ; Icelandic also 'y ; Slovak also 'y 'r 'l ;
  Polish also 'c 'n 's 'z ; Latvian has a character that is sometimes
  printed as 'g (see below); etc.
  Note that the ' on 'a does not have the same slope as the ' on 'i .

- Grave accent (`) occurs in many languages in `a `e `i `o `u ;
  Slovene `r

- Circumflex (^) occurs in many languages in ^a ^e ^i ^o ^u ;
  Esperanto has ^c ^g ^h ^j ^s ; accented Latvian has ^l .

- Trema/Diaeresis/Umlaut (::/") occurs as umlaut in many languages in
  "a "o "u (e.g. German, Slovak, Finnish, Swedish, Turkish, Hungarian);
  as trema in ::a ::e ::i ::o ::u .

- Hacek (h\'a\vcek) (v) occurs in many Slavic languages;
  Czech has ve vc vn vs vr vz ; Slovak also vD ; Esperanto vu .
  In transcriptions one meets other letters with hacek, e.g.
  Armenian vj .

- When the letter that should get the hacek is tall, then it gets a
  comma at the upper right instead: Czech has ,d ,t ; Slovak also ,l .

- Dot above (:) occurs in various places; the most obvious ones are
  :z in Polish and :e in Lithuanian, but I found it also e.g. as :n
  in the African language Bamoum.

- Macron (overline) (-) occurs as -a -e -i in Latvian, as -u in
  Lithuanian and is otherwise generally used to denote the length of
  vowels.

- Corona (circle above) (o) is found in Scandinavian oa and Czech ou .

- Tilde (~) is found in Spanish ~n , Portuguese ~a ~o and otherwise
  e.g. in accented Baltic languages: ~a ~e ~i ~o ~y ~m ~n ~l ~r ~.e .

- Breve (half circle above) (U) is found in Rumanian Ua , Turkish Ug ,
  Vietnamese Ua and is otherwise generally used to denote short vowels.

- Double acute ('') is found in Hungarian ''o and ''u .

- High tone mark (question mark without dot) (?) is found in
  Vietnamese ?a ?o ?u .
- In Latvian the palatalized sounds have a comma below, as we shall
  see, but in ,g there is no room for the , to go below, and one finds
  it on top instead. I have met three variations: 'g (acute accent),
  ,g (high centered comma) and I,g (high centered inverted comma).
  Sometimes the high centered inverted comma is met in other places;
  I have seen I,k and I,t in transliterated Armenian and I,p in
  Sorbian.

- In old Croatian texts one finds the double grave accent (``) as in
  ``a ``e ``i ``r .

2. Accents below

- Cedille (,) or left hook occurs in French ,c ; in Turkish ,s ;
  in Rumanian ,s ,t ; in Latvian ,k ,l ,n ,r (and ,K ,L ,N ,R ,G -
  for ,g see above). These hooks do not always resemble a comma.

- Rude (L) or right hook occurs in Polish La Le ; in Thai and old
  Norse Lo ; in Lithuanian La Li Lu ; in old Latvian Le Lk .
  These hooks start right from the center, sometimes almost at the
  center, sometimes at the lower right hand corner.

- Dot below (.) occurs in Vietnamese .a .e .o ; in transliterations
  from Arabic or Sanskrit one meets .d .t .s .r .h etc.

- Corona below (0) occurs in transliterations, often to indicate that
  a sonorant has syllabic value: 0m 0n 0l 0r 0s .

- Breve below (u) occurs in transliteration of Sanskrit and Hittite uh .

- Double dot below (..) seems to occur in transliterated Urdu ..t .

- Vertical bar below (|) seems to occur in Yoruba |o .

- Circumflex below (A) seems to occur in Bamileki and Venda Ae .

3. Accents on more than one letter simultaneously

- An arc on top may join two letters, as in the transliteration of the
  Russian "reflected R" as IU{ia} .

- In Tagalog a tilde occurs on the ng digraph: ~{ng} .

- Underline (_) is often used to indicate that two letters
  transliterate one sound, e.g. in various Indian languages _{kh} .

- Similarly the double underline (=) is sometimes used when the
  combination of two letters represents two distinct sounds,
  e.g. Urdu ={gh} . (See also the ligature above.)
Note that I do not propose a naming scheme for accented symbols here -
the chosen denotations are purely ad hoc. Simple schemes as discussed
earlier almost always work, but fail when one letter carries several
diacritical marks.
In Vietnamese one finds letters with acute and circumflex side by side
(so that it looks like a rotated 'less than or equals' sign):
{'^}a {'^}e {'^}o and towers like '^o ^a. ?Ua ~^e (read from top to
bottom).
In Lithuanian one meets ~.e ~u, {.'}e '-u etc.
Clearly, when symbols can have three or more accents in various mutual
positions then some nontrivial grammar is needed to describe the
situation.

4. Special symbols

Various ligatures are conventionally treated as a single symbol.
One has Dutch ij , German ss (or sz), French oe and Scandinavian
(and Latin) ae .
Turkish has dotless i (.i).
Icelandic has the thorn (bp) or (th).
Some symbols with a crossbar are
Polish /l and /L ; Scandinavian /o and /O ; Vietnamese and Yugoslavian
and Icelandic -d and -D ; Icelandic +d (eth).

Well, this is what I have found so far. Where I said "seems to occur",
the information is quoted from an old draft version of ISO standard
ISO 5426 (dated 1975-07-10).
I would be thankful if people mailed me their additions and corrections.
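
The point made before section 4, that a letter carrying several marks
needs some grammar and not just a flat naming scheme, can be sketched
in Python. This is only an illustration under stated assumptions: it
maps a handful of the ad hoc prefix markers used in this post onto
Unicode combining characters (which the post itself does not mention),
the names MARKS_ABOVE and compose are hypothetical, and only marks that
sit above the letter are handled. A token lists the marks from top to
bottom followed by the base letter, as in '^o .

```python
import unicodedata

# Hypothetical mapping from a few of the post's ad hoc prefix markers
# to Unicode combining characters. All of these sit above the letter.
MARKS_ABOVE = {
    "'": "\u0301",  # combining acute accent
    "`": "\u0300",  # combining grave accent
    "^": "\u0302",  # combining circumflex accent
    "~": "\u0303",  # combining tilde
    '"': "\u0308",  # combining diaeresis (umlaut/trema)
}


def compose(token: str) -> str:
    """Convert a token such as "'^o" into a Unicode string.

    The token lists the marks from top to bottom, then the base letter.
    """
    *marks, base = token  # last character is the base letter
    unknown = [m for m in marks if m not in MARKS_ABOVE]
    if unknown:
        raise ValueError(f"unsupported marks: {unknown!r}")
    # Unicode attaches combining marks from the base letter outwards,
    # so the top-to-bottom order of the notation has to be reversed.
    combining = "".join(MARKS_ABOVE[m] for m in reversed(marks))
    # NFC picks a single precomposed code point where one exists.
    return unicodedata.normalize("NFC", base + combining)


if __name__ == "__main__":
    for token in ["'e", "~n", '"u', "'^o"]:
        print(token, "->", compose(token))
```

The reversal in compose is the small piece of "grammar": the notation
reads towers from the top down, while the combining sequence is built
from the letter upwards. Marks below the letter, side-by-side marks such
as {'^}a , and marks spanning two letters would each need further rules
that this sketch does not attempt.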
irenas@tekig4.UUCP (Irena Sifrar) (08/01/85)
Andries Brouwer writes:
>1. Accents on top
>
>- Grave accent (`) occurs in many languages in `a `e `i `o `u ;
>  Slovene `r
>

I have never seen `r in Slovene. There are no accents on Slovene
letters except when you want to denote the stress (mostly only
dictionary use). In a way "r" can be one of the stressed letters, as in
"mrtev", but the word is actually pronounced [mer'tev], so the accent
actually falls on the implicit e (sounds like "a" in English, not like
"ei"). I'd really like to see some examples of `r, if there are any.

Actually, Slovene does have three occurrences of accent that just have
to be there: hacek on top of c, s, z. Even if c, s, or z are
capitalized, the hacek stays the same. (see below)

>- When the letter that should get the hacek is tall, then it gets a
>  comma at the upper right instead: Czech has ,d ,t ; Slovak also ,l .
>
>4. Special symbols
>
>Some symbols with a crossbar are
>Polish /l and /L ; Scandinavian /o and /O ; Vietnamese and Yugoslavian
>and Icelandic -d and -D ; Icelandic +d (eth).
>

There is no such language as Yugoslavian. There is Macedonian,
Serbo-Croatian (slight differences between the two), and Slovene.
Serbo-Croatian is the most common; even the Macedonians and the
Slovenes can speak it. The language of the government is usually
Serbo-Croatian, though at the assemblies people can talk in any of the
three above mentioned languages.

Irena Sifrar