[net.nlang] Hor.Hacking Finnish/Estonian/Hungarian/Turkish

dick@tjalk.UUCP (Dick Grune) (10/27/85)

Due to massive requests (plus/or the absence of massive protests):
here are Hacks 6-9 to distinguish between Finnish, Estonian,
Hungarian and Turkish; apologies will be found on the bottom line
of this article.

These are funny languages; they look weird, as if to be used by
recent immigrants from Mars only.  They look like APL to an
honest C programmer, but, ... once you are hooked, you turn out one
one-liner after the other!  So be warned!

Finnish and Estonian are closely related; they watch each other's TV
programs, apparently with little difficulty.  Hungarian is related
but not more so than English is to Russian (F ka"si, Es ka"si, H ke'z:
E: hand, but F talo, H ha'z: house).

Turkish has quite the same structure, but scientists are still
fighting over its relation with the above.  It has few words in
common with the above, but its structure is so similar that it helps
in learning it; see below.  There are words like T elma, H alma: E apple;
T anne, H anya: E mother, which may mean something.

These languages differ considerably from English and other familiar
languages, both in their structure and in their external appearance.

1.	They have double dots over the vowels, like German, Swedish or
Icelandic.   The reason is that they have some eight different vowels,
which are not easily accommodated by the Latin A E I O U (Y).
So they put double dots over the A O and U.  Hungarian even has
short double dots and long double dots! (Short ones for a short o:
and u:, long ones for long o" and u").

2.	They don't have gender, i.e., no difference between he, she
or it, or his or her.  I suppose they have non-linguistic means of
distinguishing!

3.	They don't have consonant clusters (except an occasional ng
or nk).  There is no way to write "skruntch".

4.	They have awfully long words, by our standards.  There are
several reasons for this:
    a.	They consider compound words to be single words.  It
	is as if we would write shoebox, spicemerchant or
	leastcosterrorrecovery.
    b.	They put a lot of small words in a single word:
	E.g.: "in my house"
		H ha'zamban (ha'z-am-ban)	E house-mine-in
		T evimde (ev-im-de)		E house-mine-in
	but	F talossani (talo-ssa-ni)	E house-in-mine
	(watch the H -am-, T -im-, both meaning my; M seems to be Me
	all around; is this coincidence?).  This takes care of most
	of the small words that so delightfully brighten up a page
	of English!
    c.	They have no subordinate clauses, i.e. sentences that get
	glued to other sentences and cannot stand alone, in English
	marked by ... which ..., ... that ...  or ... because.
	Instead they use verb-nouns, of which they have plenty and
	which are best explained by example:
	They say, e.g.:
		alive-being-mine gladdens-me
			i.e.: I am glad that I am alive
	or even
		town-in John-uncle my-having-seen-him you-to I-told
			i.e.: I told you I've seen uncle John in town
	Especially Turkish can easily nest this 3 to 4 deep, with
	single sentences that may extend over 7 to 10 lines of code.
	A parser's [Turkish] delight!
	(Except Hungarian, which has normal subordinate clauses,
	like you and me)

5.	They don't have a verb for 'to have'.  Instead they use some
	form of "to be/to exist" with some form of possession:
	F minulla on raha --		on-me is money
	H pe'nzem van --		money-mine is
	T param var --			money-mine is
	E				I have money
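
The gluing in point 4b can be sketched in a few lines of Python.  This
is only an illustration: the GLOSSES table and the gloss() helper are
made up for this example and know nothing beyond the three words
quoted above.

```python
# Hypothetical mini-lexicon: only the morphemes from the "in my house"
# examples, written in the ASCII notation used in this article.
GLOSSES = {
    "H": {"ha'z": "house", "am": "mine", "ban": "in"},
    "T": {"ev": "house", "im": "mine", "de": "in"},
    "F": {"talo": "house", "ssa": "in", "ni": "mine"},
}


def gloss(lang, segmented):
    """Gloss a hyphen-segmented word morpheme by morpheme."""
    table = GLOSSES[lang]
    return "-".join(table[morpheme] for morpheme in segmented.split("-"))


for lang, word in [("H", "ha'z-am-ban"), ("T", "ev-im-de"), ("F", "talo-ssa-ni")]:
    print(lang, word, "=", gloss(lang, word))
```

The output shows the point made above: Hungarian and Turkish put the
possessor before the place ending, Finnish the other way round.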


And now for the hacks (assuming that through some magic agency you
know already you have to choose between F, Es, H and T):

Hack 6: If it contains double vowels, it's either Finnish or
	Estonian.  Hungarian only has double vowels by accident,
that is, where vowels happen to run together in the formation of
compound words:
	H Dunaujva'ros (Duna-uj-va'ros, E Danube-new-town,
a place in Hungary).  It's the same with Turkish:
	T babaanne (baba-anne, E father-mother = grandmother)
(don't take the y for a vowel in Turkish, it's the same y as in
English "you"!)
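
A rough Python sketch of Hack 6, for text written in the ASCII
notation of this article (a" for a-with-dots, a' for a-with-accent).
The function names and the 0.2 threshold are assumptions of this
sketch, not part of the hack; the threshold is there only because
Hungarian and Turkish do produce the odd accidental double vowel.

```python
import re

# One vowel unit: a vowel letter plus an optional mark
# (" or : for double dots, ' for an accent).
VOWEL_UNIT = re.compile(r"[aeiouy][\"':]?")


def has_double_vowel(word):
    """True if the same vowel unit occurs twice in a row (ii, aa, a"a")."""
    prev_unit, prev_end = None, -1
    for m in VOWEL_UNIT.finditer(word.lower()):
        if m.group() == prev_unit and m.start() == prev_end:
            return True
        prev_unit, prev_end = m.group(), m.end()
    return False


def double_vowel_ratio(text):
    """Fraction of whitespace-separated words containing a double vowel."""
    words = text.split()
    if not words:
        return 0.0
    return sum(has_double_vowel(w) for w in words) / len(words)


def looks_finnish_or_estonian(text, threshold=0.2):
    """Hack 6 as a yes/no test; the threshold is an arbitrary guess."""
    return double_vowel_ratio(text) >= threshold
```

Note that has_double_vowel("babaanne") is true, which is exactly the
accident described above; only the ratio over a longer text says
anything useful.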

Hack 7: If all the words end in vowels or in -n or -t, it's Finnish;
	on the other hand, if it contains b or g, it's Estonian;
likewise, if it contains u" it's Estonian, the u"-sound being
written y in Finnish (same sound as u" in German or u in French).
Estonian can be very roughly described as Finnish with most of the
end vowels cut off.
	Exx: F pa"iva", Es pa"ev: E day; F jalka, Es jalg: E foot.
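
Hack 7 might be sketched in Python as follows, again for the ASCII
notation used here.  Two things are assumptions of the sketch rather
than part of the hack: a g directly after n is not counted as an
Estonian marker (Finnish does write ng), and anything that fits
neither test is reported as "undecided".

```python
import re


def strip_marks(word):
    """Drop everything except letters, so pa"iva" ends in a, not in a mark."""
    return re.sub(r"[^a-z]", "", word.lower())


def finnish_or_estonian(text):
    lowered = text.lower()
    # Estonian markers: u", b, or a g that is not part of ng.
    if 'u"' in lowered or "b" in lowered or re.search(r"(?<!n)g", lowered):
        return "Estonian"
    words = [w for w in (strip_marks(w) for w in text.split()) if w]
    # Finnish: every word ends in a vowel or in -n or -t.
    if words and all(w[-1] in "aeiouynt" for w in words):
        return "Finnish"
    return "undecided"


print(finnish_or_estonian('pa"iva" jalka'))
print(finnish_or_estonian('pa"ev jalg'))
```

Loan words will fool it: greippi is perfectly good Finnish and
contains a g.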

Note:	Finnish is by far the most difficult language I've seen, no
insult intended.  This is *not* because of its 15 cases; in essence a
case ending is just a preposition that got moved and glued to the end
of the word (sometimes the gluing involves a lot of hanky-panky, to the
dismay of foreign students).  It is *not* because their verbs have
some 100 forms, since so has Hebrew.  It is *not* because it is
highly irregular, since so are the French verbs (all Indo-European
languages suffer from a bad case of verbitis irregularis), and we've
all mastered them, haven't we.  No, it is because it has ALL these
things together!  For each noun you have to learn 4 forms, duly
supplied by the better dictionary:
	man:
		mies --		a man
		miehen --	man's
		miesta" --	(some) man, partitive singular
		miehia" --	men, partitive plural like in French:
			paljon miehia" -- beaucoup d'hommes -- many men
If you know these 4 you can construct the other 11.  If you think this
is unfair and "man" is probably irregular, here are two other entries:
F: viipale	viipaleen	viipaletta	viipaleita -- slice of bread
F: greippi	greipin	greippia"	greippeja" -- grapefruit
When they started on the verb I fled class, screaming.
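
To give an idea of how the other forms follow from the four dictionary
forms, here is a Python sketch that builds six singular cases from the
genitive alone.  It is a simplification under stated assumptions: the
stem is taken to be the genitive minus its final -n, the choice
between a and a" is made by a plain vowel-harmony test over the whole
stem (compounds would need more care), and the names ENDINGS and
cases_from_genitive are invented for this example.  The cases that
need the other dictionary forms are not attempted.

```python
import re

# Endings glued to the genitive stem; A stands for a or a" (vowel harmony).
ENDINGS = {
    "inessive": "ssA",     # in
    "elative": "stA",      # out of
    "adessive": "llA",     # on, at
    "ablative": "ltA",     # from
    "allative": "lle",     # to, onto
    "translative": "ksi",  # becoming
}


def cases_from_genitive(genitive):
    """Build six singular case forms from a genitive such as miehen."""
    if not genitive.endswith("n"):
        raise ValueError("expected a genitive ending in -n")
    stem = genitive[:-1]
    # Back-vowel word: contains a, o or u that is NOT followed by the " mark.
    back = re.search(r'[aou](?!")', stem) is not None
    a = "a" if back else 'a"'
    return {name: stem + ending.replace("A", a) for name, ending in ENDINGS.items()}


for form in ("miehen", "viipaleen", "talon"):
    print(form, cases_from_genitive(form))
```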

Hack 8:
It's Hungarian when:
-	it has vowels with single accents on them
	Note: it just makes them longer
-	it has sz, cs or words ending in more than one consonant
	Note: cs is pronounced tsh, sz as s, c as ts, and single s as sh!
		this makes H csa'rda's  pr. tshaarrdaash
		and H gulya's  pr. gooyaash
It's Turkish when:
-	it has a c-cedille, that is a c-with-tail, or an s-cedille
	Note: c-cedille is tsh (plain c is dj), s-cedille is sh
-	a g with a half moon over it; it makes the preceding
	vowel long, but is generally derived from an honest g or k.
-	most remarkable of all, an i without a dot, or a capital I
	WITH a dot on it: I.stanbul; this is another unfamiliar vowel.
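
The letter tests of Hack 8 can be sketched in Python.  Unlike the rest
of this article the sketch assumes the text contains the real
accented letters rather than ASCII stand-ins, since the cedilla and
the dotless i have no stand-in here.  The long double dots from point
1 are counted as Hungarian too; the "more than one final consonant"
test is left out.  The names and the simple vote counting are
assumptions of this sketch.

```python
HUNGARIAN_LETTERS = set("áéíóúőűÁÉÍÓÚŐŰ")  # accented vowels, long double dots
HUNGARIAN_DIGRAPHS = ("sz", "cs")
TURKISH_LETTERS = set("çşğıÇŞĞİ")          # cedillas, soft g, dotless i, dotted I


def hungarian_or_turkish(text):
    """Count tell-tale letters for each language and return the winner."""
    hungarian = sum(ch in HUNGARIAN_LETTERS for ch in text)
    lowered = text.lower()
    hungarian += sum(lowered.count(d) for d in HUNGARIAN_DIGRAPHS)
    # Turkish letters are matched on the raw text, in both cases, because
    # lowercasing the dotted capital I does not give a single character.
    turkish = sum(ch in TURKISH_LETTERS for ch in text)
    if hungarian > turkish:
        return "Hungarian"
    if turkish > hungarian:
        return "Turkish"
    return "undecided"


print(hungarian_or_turkish("csárdás gulyás"))
print(hungarian_or_turkish("İstanbul"))
```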

Note:	Turkish may be complicated, but it is completely regular; no
irregular verbs, no irregular plurals etc.  Its main difficulty lies
in the complicated way they say things.

Note:	Given a verb V you can form another verb that means "to have
to V"; same with "to be able to V", and some others.  This is the
normal way to say such things:
H olvasni: E to read;		H olvashatni: E to be able to read,
H olvasom: E I read it;		H olvashatom: E I can read it.


Hack 9:	If it looks like Finnish but has no umlauts (those double
dots) at all, you're barking up the wrong tree altogether and looking
at a piece of Swahili, a language spoken by a considerable chunk
of Africa and remarkably similar to Finnish, but totally unrelated
to it, even in the gullible mind of

					Dick Grune
					Vrije Universiteit
					de Boelelaan 1081
					1081 HV  Amsterdam
					the Netherlands

PS: Get some text or newspaper in these languages, if you are interested.
There's nothing like first-hand experience!

... but Lucretia ... she Borgia to death! -- Tom Lehrer.

zwicky@osu-eddie.UUCP (Elizabeth D. Zwicky) (10/30/85)

In general, Finnish is not all that irregular; it's just that the regularity
is somewhat more complex than English speakers expect. The only irregular
verb I have come across in Finnish is _olen_, to be, which is irregular
in almost every language in the world. Finnish has a complex
set of ordered phonological rules, and begs to be taught by a linguist,
who can teach the rules, instead of the bare facts. It makes a lot
more sense that way. It isn't any easier, but it makes more sense.

It also has 16 cases, give or take; 2 are almost entirely poetic. 
Telling it by a lack of clusters is misleading though, because it
has clusters medially, just not initially or finally. (The rule is
that only a single consonant can occur next to a boundary; within a 
word clusters can occur where the syllable boundary falls between
the consonants). 
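
A small Python sketch of how such medial clusters get split.  It
applies only the basic rule that a syllable boundary falls before
every consonant that is directly followed by a vowel; runs of vowels
are left alone, so some boundaries are missed.  It expects the ASCII
notation used in this thread (a" for a-with-dots) and silently drops
any other characters; the name syllabify is made up for this example.

```python
import re

VOWELS = "aeiouy"


def syllabify(word):
    """Insert '-' before each consonant+vowel sequence inside the word."""
    # A unit is one letter plus an optional " mark (a" counts as one vowel).
    units = re.findall(r'[a-z]"?', word.lower())
    out = []
    seen_vowel = False
    for i, unit in enumerate(units):
        is_vowel = unit[0] in VOWELS
        next_is_vowel = i + 1 < len(units) and units[i + 1][0] in VOWELS
        # No boundary before the first syllable's own initial consonant.
        if not is_vowel and next_is_vowel and seen_vowel:
            out.append("-")
        out.append(unit)
        seen_vowel = seen_vowel or is_vowel
    return "".join(out)


for w in ("jalka", "talossani", "greippi"):
    print(w, "->", syllabify(w))
```

In jalka the boundary lands between l and k, so neither syllable has
a cluster at its edge.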

I like it precisely because it is so regular. Spelling is almost completely
phonetic.

 -Elizabeth D. Zwicky

esa@kvvax4.UUCP (Esa K Viitala) (11/04/85)

In article <> zwicky@osu-eddie.UUCP (Elizabeth D. Zwicky) writes:
  >In general, Finnish is not all that irregular; it's just that the regularity
  >is somewhat more complex than English speakers expect. The only irregular

Great. That's my impression, too. However, it is a long time since I went
through my textbooks (as a friend from ujocs pointed out in a reply 
to my article a couple of months ago). I have been waiting for 
some reactions to this discussion from Finland (ujocs & taycs & al). 
I hope they subscribe to this group, let us hear from you.

It would be interesting to compare, say, hyphenating programs (one for
English and the other for Finnish). This would give some insight into the
regularity/irregularity of the Finnish language.
Unfortunately I do not have any such programs for Finnish. Could someone
in Finland please mail me one? I'd prefer one written in a
language that I can get compiled (C, Pascal, Fortran, Lisp; sorry,
no Algol compilers here ...). Of course any other language will do, if
you don't have it written in any of the aforementioned languages.

  >verb I have come across in Finnish is _olen_, to be, which is irregular
The dictionary form (infinitive) of the verb 'to be' is really 'olla';
'olen' is the first person singular:

I am 			mina" olen
you are			sina" olet
(s)he is		ha"n on
we are			me olemme
you are			te olette
they are		he ovat

  >
  >I like it precisely because it is so regular. Spelling is almost completely
  >phonetic.
Only "almost"??


-- 

---ekv, {seismo,okstate,garfield,decvax,philabs}!mcvax!kvvax4!esa

michael@spar.UUCP (Not Bill Joy) (11/07/85)

>It [Finnish] also has 16 cases, give or take; 2 are almost entirely poetic. 
>Telling it by a lack of clusters is misleading though, because it
>has clusters medially, just not initially or finally. (The rule is
>that only a single consonant can occur next to a boundary; within a 
>word clusters can occur where the syllable boundary falls between
>the consonants).  -Elizabeth D. Zwicky

    I am familiar with the case systems of many IndoEuropean languages
    (8 seems to be the maximum), but 16 cases seems most outrageous!
    If somebody has the time, I would be most interested to understand
    how these cases are used. Do the other languages whose relatedness
    to Finnish is established {Estonian, Hungarian} or suspected 
    {Turkish, Mongolian, Korean} have case inflectional systems that
    bear any resemblance? 
    
    Lappish and, I believe, an AmerIndian language spoken by those who
    once occupied the San Francisco bay area have also been linked to
    the Finno-Ugric languages, BTW.

-michael

michaelm@bcsaic.UUCP (michael b maxwell) (11/12/85)

In article <642@spar.UUCP> michael@max.UUCP (System Administrator) writes:
>    ...Lappish and, I believe, an AmerIndian language spoken by those who
>    once occupied the San Francisco bay area have also been linked to
>    the Finno-Ugric languages, BTW.
Yes, and Shuar (a Jivaroan language of Ecuador) = Japanese.  Don't
believe everything you hear!
-----------------
Disclaimer:  What? Give my employer credit for my opinions?! NO WAY!!!
-----------------
-- 
Mike Maxwell
Boeing Artificial Intelligence Center
	...uw-beaver!uw-june!bcsaic!michaelm

zwicky@osu-eddie.UUCP (Elizabeth D. Zwicky) (11/15/85)

In article <157@kvvax4.UUCP> esa@kvvax4.UUCP (Esa K Viitala) writes:
>In article <> zwicky@osu-eddie.UUCP (Elizabeth D. Zwicky) writes:
>  >I like it precisely because it is so regular. Spelling is almost completely
>  >phonetic.
>Only "almost"??

Yeah, only almost. But only if you want to do it really, really _right_.
My Finnish teacher is a native speaker of Estonian, and a phoneticist, and
she wants it _right_. That means making some distinctions of stress and of
initial-consonant length that are controlled by the presence of glottal
stops at the end of the preceding word, which are not spelled, or by the
properties of some clitics (they look like separate words, but are
stressed as if they were part of the (usually) preceding word). Syllable
boundaries also make subtle length distinctions, and are not always
determinable from spelling. But I admit it is really really close
to being perfectly phonetic; so close that today I ran across a Finnish
word, which I was told means something like "stringing letters together
to get words when you are learning to read and all you can do is recognize
letters". My Finnish teacher was hoping that I, as a native English speaker,
knew a word that meant something like that (closer than my book's 
translation "spelling"), but there isn't one, because it won't really work
in English.

	-Elizabeth D. Zwicky

zwicky@osu-eddie.UUCP (Elizabeth D. Zwicky) (11/15/85)

In article <642@spar.UUCP> michael@max.UUCP (System Administrator) writes:
>>It [Finnish] also has 16 cases, give or take; 2 are almost entirely poetic. 
>>Telling it by a lack of clusters is misleading though, because it
>>has clusters medially, just not initially or finally. (The rule is
>>that only a single consonant can occur next to a boundary; within a 
>>word clusters can occur where the syllable boundary falls between
>>the consonants).  -Elizabeth D. Zwicky
>
>    I am familiar with the case systems of many IndoEuropean languages
>    (8 seems to be the maximum), but 16 cases seems most outrageous!
>    If somebody has the time, I would be most interested to understand
>    how these cases are used. Do the other languages whose relatedness
>    to Finnish is established {Estonian, Hungarian} or suspected 
>    {Turkish, Mongolian, Korean} have case inflectional systems that
>    bear any resemblance? 
>    

I don't know about the rest of them, but Estonian also has 16 cases.
A Finnish linguist of my acquaintance has hypothesized (partly
in jest) that all languages have cases in numbers that are powers
of two, so that if you were going to have more than 8, you had to have
16. Estonian also has three possible lengths for vowels and
consonants, as opposed to two, (Finnish has two, short and long;
Estonian has short, long, and overlong) making it even more terrifying
for English speakers.

	-Elizabeth D. Zwicky