[net.text] about diacritical marks

gadfly@ihuxn.UUCP (Gadfly) (01/01/70)

--
Vis-a-vis the Danes' addition of {oA} to the alphabet to supersede
the use of double-a, I heard that when this happened (this century),
all people with names beginning "Aa..." got their choice of leaving
them as they were or respelling them with the {oA}.  Going with
the new letter meant moving from the beginning of the phone book
(and any other alphabetized list) to the end.
-- 
                    *** ***
JE MAINTIENDRAI   ***** *****
                 ****** ******  15 Aug 85 [28 Thermidor An CXCIII]
ken perlow       *****   *****
(312)979-7753     ** ** ** **
..ihnp4!iwsl8!ken   *** ***

storm@diku.UUCP (Kim Fabricius Storm) (07/31/85)

In article <775@mcvax.UUCP> aeb@mcvax.UUCP (Andries Brouwer) writes:
>Last time I just mentioned a few accents that occurred to me while
>writing - let me now give a more detailed overview of what accents
>exist.

>- Corona (circle above) (o) is found in Scandinavian oa and Czech ou .
                                                      ^^
>Various ligatures are conventionally treated as a single symbol.
>One has Dutch ij , German ss (or sz), French oe and
>Scandinavian (and Latin) ae .
                          ^^
>Some symbols with a crossbar are
>Polish /l and /L ; Scandinavian /o and /O ; ...
                                 ^^
As a Dane, I would like to point out some misunderstandings in your article.
If not a misunderstanding, then somebody is completely ignorant of facts!

According to my knowledge ae /o and oa (marked with ^^ in the extract) are NOT
an 'a-e ligature', an 'o with a crossbar' or an 'a with a circle above' - they
are genuine letters in the Danish alphabet:
   A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA
(please observe the ordering of the letters.)
I don't think you can show me any Danish dictionary, in which AE /O and oA don't
have their own sections far away from those of A E and O!
  I have seen a few American books with a Danish summary at the end, e.g.
Brinch-Hansen: The Architecture of Concurrent Programs, where it is obvious
that the publisher has treated AE /O and oA as special cases of A E and O, as
in your list; the result is terrible, and if that is what we can expect from
future versions of troff then ... UURGH! It might be ok for a short summary
in Danish... But not a whole book, please! 
  Please don't correct me about troff - I know that one can map the proper
letters into the fonts, if they are available in the printer, but I would
not like to see a version of troff in Denmark claiming to have a Danish
alphabet, if it is made with a-e ligatures, / on O's and a scaled down
'o' on top of A's!

  BTW: As indicated above W is not always considered a genuine letter in Danish.
In Danish W only occurs in a few personal names and in foreign words, and in
most dictionaries it is treated just as if it was a V.

>I would be thankful if people mailed me their additions and corrections.

If you still persist in calling AE a ligature, then I would suggest that
you included w (or vv) as a ligature too; according to your rules, any letter
that COULD be composed by other 'normal' (English?) letters, should be
considered a ligature. Any objections from the people concerned should of
course be ignored!
   Treating W as a ligature also has the great advantage, that the
character becomes available for general purpose use, e.g. as delimiter or
escape character. This is consistent with the use of {|} and [\], which
unfortunately are used for the Danish letters ae /o oa AE /O and oA -
C-programs are great fun :-( to edit on my terminal, you wouldn't believe it
was C if you saw it!

----
Kim F. Storm, DIKU, University of Copenhagen, Denmark.

andersa@kuling.UUCP (Anders Andersson) (08/02/85)

AE This goes to both net.text and net.nlang, and currently I think this  AA
AE cross-posting is appropriate. Probably it won't be at a later time... AA

In article <1087@diku.UUCP> storm@diku.UUCP (Kim Fabricius Storm) writes:
>According to my knowledge ae /o and oa (marked with ^^ in the extract) are NOT
>an 'a-e ligature', an 'o with a crossbar' or an 'a with a circle above' - they
>are genuine letters in the Danish alphabet:
>   A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA

I was about to bring up almost the same question. However, I wasn't sure
whether this part of the problem is within the scope of the discussion,
and I don't think anyone has actually claimed that these "ligatures",
"umlauts" etc. really are less important versions of other characters in
all languages concerned. So far only their visual representation has been
considered.

Just for anyone's information, here is the Swedish alphabet also:
   A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z oA "A "O
Note the different ordering in the end. The same for Finnish I guess,
except that they don't have oA.

>In Danish W only occurs in a few personal names and in foreign words, and in
>most dictionaries it is treated just as if it was a V.

The same in Swedish. However, I think 'E and "U should be mentioned
together with W, as they also show up sometimes in personal names.
'E is treated like E, and "U like Y.
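
As a rough illustration of the Swedish rules above, here is a minimal Python
sketch of a sort key. It is not taken from any real collation library: the
names SWEDISH_ORDER, SWEDISH_FOLD and swedish_key are made up for this
example, and the code uses the precomposed Unicode characters for the
letters written in this thread as oA, "A, "O, 'E and "U.

```python
# Swedish letter order as given above, with W left out because it is
# folded into V before ranking (an assumption matching "treated like V").
SWEDISH_ORDER = "abcdefghijklmnopqrstuvxyzåäö"
SWEDISH_RANK = {ch: i for i, ch in enumerate(SWEDISH_ORDER)}

# Letters that are "treated like" another letter for sorting purposes.
SWEDISH_FOLD = str.maketrans({"w": "v", "é": "e", "ü": "y"})


def swedish_key(name):
    """Return a list of letter ranks usable as a sort key (hypothetical helper)."""
    folded = name.lower().translate(SWEDISH_FOLD)
    # Anything outside the alphabet (spaces, hyphens, ...) sorts last.
    return [SWEDISH_RANK.get(ch, len(SWEDISH_ORDER)) for ch in folded]


names = ["Östen", "Zara", "Ärla", "Åke", "Adam"]
print(sorted(names, key=swedish_key))
# ['Adam', 'Zara', 'Åke', 'Ärla', 'Östen']
```

Because "U is folded into Y here, a name spelled with "U sorts among the Ys
rather than the Us under this key.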

If our intention is to create some digital representation of European
written language in a wider sense, and not just those funny graphical
things, then we have to look into this problem as well, yes. And when
we refer to text formatters, this will most likely be the case, I guess...

1. Different alphabets simply sort their letters differently.

2. Various languages put different "value" on the letters they use. In
   the Scandinavian languages, "A and "O (or their counterparts) are
   "real" letters, while in German they are not, just "umlauts". This
   might affect sorting, in that they are "treated as" some other letters.
   I would be glad to receive some Frenchman's view on their myriad of
   accents!

3. There might be slight differences in the printed representation.
   In handwriting, I might use tilde (~) instead of double-dot (")
   over A and O, but when I started writing in German, my teacher
   pointed out that I should not use tilde on the "umlauts".

4. Try to define an international case conversion function when there are
   only two representations of I (with and without dot). "International"
   means that it should work properly in both Paris and Ankara.
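
Point 4 can be made concrete with a small Python sketch. It assumes a
character set that does have separate code points for dotted capital I
(U+0130) and dotless small i (U+0131), and it assumes the caller knows the
language of the text. The function names and the language tags "tr" and
"fr" are invented for this illustration.

```python
# Turkish pairs i with dotted capital I (U+0130) and dotless i (U+0131)
# with plain capital I, so the two letters must be remapped before the
# generic case conversion runs.
TURKISH_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})
TURKISH_LOWER = str.maketrans({"I": "\u0131", "\u0130": "i"})


def to_upper(text, lang):
    """Upper-case text; lang is a hypothetical tag such as 'tr' or 'fr'."""
    if lang == "tr":
        text = text.translate(TURKISH_UPPER)
    return text.upper()


def to_lower(text, lang):
    """Lower-case text with the same language-dependent rule."""
    if lang == "tr":
        text = text.translate(TURKISH_LOWER)
    return text.lower()


print(to_upper("paris", "fr"))     # PARIS
print(to_lower("ISTANBUL", "tr"))  # first letter becomes dotless i
```

The point of the sketch is the extra lang argument: without knowing the
language, no single mapping for I can be right in both cities.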

I don't think we can count on a single text being "written" in one
language only, and thus we cannot make general assumptions on how to treat
the letters. For instance, when sorting a list of personal names: Should
G"unther be put before or after Gustaf? Or just think of a world atlas!

Note to the eventual implementors: Please reserve some place where we
could later put an escape sequence to switch over to an entirely
different alphabet -- soon we will want to write in Greek, Hebrew or
even Bulgarian...

   Anders Andersson
   ...!seismo!mcvax!enea!kuling!andersa

esa@kvvax4.UUCP (Esa K Viitala) (08/06/85)

In article <kuling.777> andersa@kuling.UUCP (Anders Andersson) writes:
  >Just for anyone's information, here is the Swedish alphabet also:
  >   A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z oA "A "O
  >Note the different ordering in the end. The same for Finnish I guess,
  >except that they don't have oA.

Oh, but the Finns do.  Not all the Finns like it but it is there, the 
ordering is the same as above, though.  (There are some 6-8% Swedish 
speaking citizens in Finland, and Swedish is an official language 
in Finland, too.)

In Norwegian the ordering is the same as in Danish:
  A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA
but Norwegians treat double A (or double a) as oA (oa), which 
causes some additional problems to the sorting algorithm.  (Maybe 
the Danes do it, too?) For instance the phone book lists surnames 
beginning with double A (Aa) together with names beginning with 
the *letter* oA.  I.e.  a name, say, 'Aasmundsen' will be listed 
after a name 'oAgren', but 'Aasmundsen' will be before 
'oAsmundsen' and, of course 'Asmundsen' will be listed in the 
beginning of the book, under A.  Got it?  :-) :-).  
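
The phone book rule just described can be sketched as a Python sort key.
This is only an illustration under simplifying assumptions: every "aa" is
treated as the letter written here as oA (which would be wrong for some
compound words), the names NORDIC_ORDER and phone_book_key are invented,
and the code uses the precomposed Unicode letters for AE, /O and oA.

```python
# Danish/Norwegian letter order, with the three extra letters at the end.
NORDIC_ORDER = "abcdefghijklmnopqrstuvwxyzæøå"
NORDIC_RANK = {ch: i for i, ch in enumerate(NORDIC_ORDER)}


def phone_book_key(name):
    """Sort key: 'aa' ranks as the last letter; ties put the 'aa' spelling first."""
    lowered = name.lower()
    folded = lowered.replace("aa", "å")
    primary = [NORDIC_RANK.get(ch, len(NORDIC_ORDER)) for ch in folded]
    # False sorts before True, so a name spelled with "aa" precedes the
    # otherwise identical name spelled with the single letter.
    return (primary, "aa" not in lowered)


names = ["Åsmundsen", "Aasmundsen", "Ågren", "Asmundsen"]
print(sorted(names, key=phone_book_key))
# ['Asmundsen', 'Ågren', 'Aasmundsen', 'Åsmundsen']
```

The two-part key keeps the double-a spelling next to, but ahead of, the
single-letter spelling, matching the ordering of the four example names.
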
  >
  >>In Danish W only occurs in a few personal names and in foreign words, and in
  >>most dictionaries it is treated just as if it was a V.
  >The same in Swedish. 
That, I believe, is very much the same in Finland and in Norway.  
Norwegians are a bit more careless in adapting words from other 
languages though.  Therefore, in a Norwegian dictionary, one finds 
words such as 'whisky', 'wienerbr/od' and 'wagon', whereas Finns 
write 'viski', 'viinerleip"a' and 'vaunu'.  Except that Finns 
rarely say 'viski', they prefer 'votka' :-) :-).  

-- 

---ekv, {seismo,okstate,garfield,decvax,philabs}!mcvax!kvport!kvvax4!esa

tmb@talcott.UUCP (Thomas M. Breuel) (08/08/85)

In article <642@kvvax4.UUCP>, esa@kvvax4.UUCP (Esa K Viitala) writes:
> In article <kuling.777> andersa@kuling.UUCP (Anders Andersson) writes:
>   >Just for anyone's information, here is the Swedish alphabet also:
>   >   A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z oA "A "O
>   >Note the different ordering in the end. 
>
> In Norwegian the ordering is the same as in Danish:
>   A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA
> but Norwegians treat double A (or double a) as oA (oa), which 
> causes some additional problems to the sorting algorithm.  

German has special characters for the vowel combinations 'ae', 'oe',
'ue', and 'sz'. These were introduced as a matter of convenience in
handwriting: the first three combinations ('Umlaute') are written
as the first vowel with two small parallel lines on top (contracted
to dots in printed matter), which is actually a small script 'e'.
The consonant combination 'sz' is written as a 'beta' like character,
which is a contracted form of the script combination of 's' and 'z'.

In dictionaries, the umlaute are either found under the corresponding
vowel, or under the corresponding vowel combinations. They are never
listed separately. Likewise, for dictionary purposes, 'sz' is treated
as 'ss'.
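
The two dictionary conventions mentioned above can be sketched in Python as
follows. This is a minimal illustration, not a description of any actual
dictionary's rules: the function name german_key and its expand flag are
invented, and the umlauts and the 'sz' character are assumed to be
available as precomposed Unicode characters.

```python
# Convention 1: file each umlaut under its plain vowel.
UNDER_VOWEL = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
# Convention 2: file each umlaut under the spelled-out vowel combination.
UNDER_COMBINATION = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def german_key(word, expand=False):
    """Sort key for either convention; 'sz' is treated as 'ss' in both."""
    table = UNDER_COMBINATION if expand else UNDER_VOWEL
    return word.lower().translate(table)


words = ["Müller", "Muffe"]
print(sorted(words, key=german_key))
# ['Muffe', 'Müller']
print(sorted(words, key=lambda w: german_key(w, expand=True)))
# ['Müller', 'Muffe']
```

The same two words change places depending on the convention, so a sorted
list is only meaningful once the convention is stated.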

It is considered acceptable in informal writing to spell out
umlaute and to replace the 'sz' character by 'ss' if the special
characters are not available. (It is a sign of lack of knowledge,
when, as I have seen quite frequently in texts prepared by speakers
of English, the umlaute are replaced by plain vowels. This is
annoying and hard to read!)

Personally, I think the introduction of the umlaute and the 'sz'
into German print was a mistake: they do not improve readability;
their function is at most to make the print look nicer, much as
letter combinations like 'ft' in certain English typefaces. I
sincerely hope that they will disappear from the common written
language. On the other hand, compared to Danish, Swedish, or Finnish,
German at least does not have problems with ordering or representation
in standard Roman letters. It is possible to write readable German
without the use of diacritical marks, without special characters,
and without changing the dictionary order in doing so.

Diacritical marks, contracted letters, and special characters are
not a sign of cultural identity -- they are annoying leftovers from
a time in which people used to do most of their writing with a pen
(or a brush, on the other side of the world). Let's hope they'll
soon get out of fashion!

						Thomas.

storm@diku.UUCP (Kim Fabricius Storm) (08/09/85)

In article <642@kvvax4.UUCP> esa@kvvax4.UUCP (Esa K Viitala) writes:

>In Norwegian the ordering is the same as in Danish:
>  A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA
>but Norwegians treat double A (or double a) as oA (oa), which 
>causes some additional problems to the sorting algorithm.  (Maybe 
>the Danes do it, too?)
Yes, aa and oa are treated alike in Danish also.  In fact, oa was invented
more than 100 years ago, as an abbreviation for the frequent use of
double-a in Danish (and Norwegian and Swedish). But instead of having it
as a diacritical mark on an A, it became a whole new letter in itself,
placed last in the alphabet.

>Norwegians are a bit more careless in adapting words from other 
>languages though.  Therefore, in a Norwegian dictionary, one finds 
>words such as 'whisky', 'wienerbr/od' and 'wagon', whereas Finns 
>write 'viski', 'viinerleip"a' and 'vaunu'.
Danes are just as careless as the Norwegians - we also write
whisky, wienerbr/od, and waggon (with two g's!).

Kim F. Storm, U of Copenhagen, Denmark.  storm@diku.UUCP

kimcm@diku.UUCP (Kim Christian Madsen) (08/09/85)

In article <642@kvvax4.UUCP> esa@kvvax4.UUCP (Esa K Viitala) writes:
>In Norwegian the ordering is the same as in Danish:
>  A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA
>but Norwegians treat double A (or double a) as oA (oa), which 
>causes some additional problems to the sorting algorithm.  (Maybe 
>the Danes do it, too?) For instance the phone book lists surnames 
>beginning with double A (Aa) together with names beginning with 
>the *letter* oA.  I.e.  a name, say, 'Aasmundsen' will be listed 
>after a name 'oAgren', but 'Aasmundsen' will be before 
>'oAsmundsen' and, of course 'Asmundsen' will be listed in the 
>beginning of the book, under A.  Got it?  :-) :-).  

We Danes also order the double A as oA. In fact, in the old days
the letter oA didn't exist in the Danish alphabet; I think it was
officially introduced as a result of a spelling reform in 1948.

						Regards
						Kim Chr. Madsen
					a.k.a.	kimcm@diku.uucp

kimcm@diku.UUCP (Kim Christian Madsen) (08/09/85)

In article <642@kvvax4.UUCP> esa@kvvax4.UUCP (Esa K Viitala) writes:
>In Norwegian the ordering is the same as in Danish:
>  A B C D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA

Oh, I believe the Norwegian alphabet must look this way:
    A B (C) D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA

I've never seen any Norwegian using the letter 'C'; they seem to replace
it with an 'S', as in 'sentrum', 'sykkel'...

						Regards
						Kim Chr. Madsen
					a.k.a.  kimcm@diku.uucp

esa@kvvax4.UUCP (Esa K Viitala) (08/12/85)

In article <diku.1118> kimcm@diku.UUCP (Kim Christian Madsen) writes:
  >Oh, I believe the Norwegian alphabet must look this way:
  >    A B (C) D E F G H I J K L M N O P Q R S T U V (W) X Y Z AE /O oA

Hmmm, if we continue along this line, then it really is
A B (C) D E F G H I J K L M N O P (Q) R S T U V (W) (X) Y (Z) AE /O oA

However, there are words listed under C. But some dictionaries do not have
an entry for Q. X and Z usually have very small entries, too.

For Finnish I'd write the alphabet like this:
A (B) (C) (D) E (F) (G) H I J K L M N O P (Q) R S T U V (W) (X) Y (Z) (oA) "A "O 
But one does find words starting with the letters in parentheses. At least in names.


-- 

---ekv, {seismo,okstate,garfield,decvax,philabs}!mcvax!kvport!kvvax4!esa

pooh@ut-sally.UUCP (Pooh @ the Utility Muffin Research Kitchen) (08/13/85)

In article <483@talcott.UUCP> tmb@talcott.UUCP (Thomas M. Breuel) writes:
>
>German has special characters for the vowel combinations 'ae', 'oe',
>'ue', and 'sz'. These were introduced as a matter of convenience in
>handwriting: the first three combinations ('Umlaute') are written
>as the first vowel with two small parallel lines on top (contracted
>to dots in printed matter), which is actually a small script 'e'.
>The consonant combination 'sz' is written as a 'beta' like character,
>which is a contracted form of the script combination of 's' and 'z'.

Actually, I met extremely few Germans in West Germany who
even knew what the word "Umlaut" meant.  They just call the
letters by their pronounced form.  We're the ones who make
a distinction between the letter and the diacritical mark.

Pooh

pooh@purdue-ecn.ARPA     pur-ee!pooh

"If there is a God, then He will reward you;
and if there isn't, who has been playing all
these games with Jacques Kohn?" -- Isaac Bashevis Singer

guido@boring.UUCP (08/13/85)

In Dutch, the alphabet is as follows:
A B C D E F G H I J K L M N O P Q R S T U V W X IJ Z
:-)

jaap@mcvax.UUCP (Jaap Akkerhuis) (08/14/85)

In article <6574@boring.UUCP> guido@mcvax.UUCP (Guido van Rossum) writes:
 > In Dutch, the alphabet is as follows:
 > A B C D E F G H I J K L M N O P Q R S T U V W X IJ Z
 > :-)

Of course this no longer has anything to do with diacritical marks,
but more with "how to find a name in a dictionary".
Of course Guido isn't serious. But to explain the subtleties of the
Dutch alphabet takes a while.

There are basically three ways:

The Dutch alphabet:

A B C D E F G H I J K L M N O P Q R S T U V W X IJ Z
Note that Y isn't in the Dutch alphabet.

The tolerant alphabet:
A B C D E F G H I J K L M N O P Q R S T U V W X IJ Y Z
Here it is allowed to freely mix IJ and Y.

The PTT alphabet:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Here the ligature is not considered as a special char, but as a
combination of I and J.

This according to the "Opperlandse taal-& letterkunde", Battus 1981.

But of course, since the Dutch Post office changed the typography to
something worse they changed also to the "Tolerant Alphabet", as I just
found out.

These ways of ordering make things rather amusing. In the nearest
dictionary lying around (Prisma Nederlands-Engels...) I find IJsland
under the "I" and in the phone book under "Y".
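
The dictionary and phone book placements can be reproduced with a small
Python sketch. It is only an approximation of two of the conventions above:
with the flag off, IJ is sorted as plain I followed by J (the PTT style),
and with it on, IJ is filed together with Y (the tolerant style). The name
dutch_key and the flag ij_as_y are invented for this example, and the
place names are just sample data.

```python
def dutch_key(word, ij_as_y=False):
    """Sort key: optionally file the IJ digraph together with Y."""
    folded = word.lower()
    if ij_as_y:
        # Every "ij" pair is replaced, a simplification for this sketch.
        folded = folded.replace("ij", "y")
    return folded


places = ["Zwolle", "Yerseke", "IJsland", "Ierland"]
print(sorted(places, key=dutch_key))
# ['Ierland', 'IJsland', 'Yerseke', 'Zwolle']
print(sorted(places, key=lambda w: dutch_key(w, ij_as_y=True)))
# ['Ierland', 'Yerseke', 'IJsland', 'Zwolle']
```

With the flag off IJsland lands under I, as in the dictionary; with it on
it lands among the Y entries, as in the phone book.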

	jaap	(mcavx!jaap)

PS. How to phone Iceland from Holland? 09-354, costs ca. f 1,80 a
    minute. Write this down, you may need it.