[comp.sys.amiga] Internationalism and phonemes

sutela@polaris.utu.fi (Kari Sutela) (10/31/90)

ifarqhar@sunc.mqcc.mq.oz.au (Ian Farquhar) writes:

>In article <sutela.656751981@polaris> sutela@polaris.utu.fi (Kari Sutela) writes:
[I wrote about hardcoding a language on a chip]

>Do phonemes vary greatly from language to language?

I'm not a linguist, but I'd think that they do (at least a bit). I, for
example, have had difficulty producing reasonable-sounding Finnish
speech. On the other hand, this might be a problem with all languages ---
the phonemes are just an approximation of the real human ones (or perhaps
I just haven't tried hard enough). Anyway, if you think about non-European
languages (for example, some African languages), I'd guess that they do
indeed have phonemes which can't be produced with the ones we are using.

>One feature that I would like to see in AmigaOS is the provision of
>support for internationalism, i.e. a library that knows what time zone
>the user is in, what their system of measurement is (i.e. metric or
>otherwise), the language, the currency, date formats etc.

Exactly, I agree. On the other hand, I seem to remember a vague comment
(by someone from CATS?) about a locale.library in OS 2.0 --- is there
such a beast in 2.0? I'd think that such a library should also provide
functions for deciding word-breaking characters, etc. I find it annoying
when a text-editor word-wraps and breaks a Finnish word into two, just
because it considered an umlaut-a a non-word character. Preferably
all accented characters should be considered as word-characters --- one
doesn't always write in one language, for example, foreign names could
include strange, accented characters.
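
As a rough sketch of the kind of test meant here: the function below treats
every accented letter as a word character, whatever the language. It assumes
the text is ISO 8859-1 (Latin-1), and is_word_char is only a made-up name for
illustration, not a function from any existing library.

```c
#include <ctype.h>

/* Hypothetical helper (not part of any real library): returns nonzero
 * if byte c should count as part of a word.
 * Assumption: the text is encoded as ISO 8859-1 (Latin-1). */
int is_word_char(unsigned char c)
{
    /* 7-bit range: plain ASCII letters and digits */
    if (c < 0x80)
        return isalnum(c) != 0;

    /* 0xC0..0xFF are accented letters in Latin-1, except for the
     * multiplication sign (0xD7) and the division sign (0xF7).
     * 0x80..0xBF (control codes and symbols) are not word characters. */
    return c >= 0xC0 && c != 0xD7 && c != 0xF7;
}
```

A word-wrap routine using a test like this would no longer split a word at
an umlaut-a (0xE4), and would handle accented foreign names the same way.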

Kari Sutela	sutela@polaris.utu.fi

peterk@cbmger.UUCP (Peter Kittel GERMANY) (10/31/90)

In article <sutela.657357692@polaris> sutela@polaris.utu.fi (Kari Sutela) writes:
>I'd think that such a library should also provide
>functions for deciding word-breaking characters, etc. I find it annoying
>when a text-editor word-wraps and breaks a Finnish word into two, just
>because it considered an umlaut-a a non-word character. Preferably
>all accented characters should be considered as word-characters --- one
>doesn't always write in one language, for example, foreign names could
>include strange, accented characters.

Good point, why didn't this come from me?

So here comes a PLEA TO ALL PROGRAMMERS:
If you write a program that is meant to process text, more precisely
plain ASCII text, then support EVERY character that is available.
(Well, a real control char still should remain a control char.)
Two examples of how not to do it:
1. The Type command in DOS with the hex option. Every char with the MSB
   set is displayed as a dot, even when it is quite a normal, readable
   char for other people (seems like I should write another enhancement
   request).
2. The less program used on earlier Fish disks as a replacement for
   more to look into the readmes. It simply cleared the MSB, thus
   changing the 8-bit chars into quite different 7-bit ones. And
   this at a time when Fred is kind enough to let programmers also put
   readmes in their own language (besides the English version) on the
   disk. They simply get mangled and look unprofessional.
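
To illustrate what both examples could do instead, here is a minimal sketch
of the text column of a hex dump that keeps the 8-bit characters and only
replaces real control codes with a dot. It assumes ISO 8859-1 text, where
0x80..0x9F are control codes; is_displayable and dump_text_column are
invented names, not taken from the Type command or from less.

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if c is a readable character.
 * Assumption: ISO 8859-1, so 0x00..0x1F, 0x7F and 0x80..0x9F are
 * control codes and everything else can be shown as it is. */
int is_displayable(unsigned char c)
{
    return (c >= 0x20 && c <= 0x7E) || c >= 0xA0;
}

/* Builds the text column for n bytes of buf. out must have room for
 * n + 1 chars. Readable bytes are copied unchanged (the MSB is never
 * cleared); control codes become '.'. */
void dump_text_column(const unsigned char *buf, size_t n, char *out)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = is_displayable(buf[i]) ? (char)buf[i] : '.';
    out[n] = '\0';
}
```

The point is only that the high bit is left alone: a byte such as 0xE4 passes
through unchanged, while a real control char such as 0x07 still shows as a dot.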

-- 
Best regards, Dr. Peter Kittel  // E-Mail to  \\  Only my personal opinions... 
Commodore Frankfurt, Germany  \X/ {uunet|pyramid|rutgers}!cbmvax!cbmger!peterk

zerkle@iris.ucdavis.edu (Dan Zerkle) (11/01/90)

In article <sutela.657357692@polaris> sutela@polaris.utu.fi (Kari Sutela) writes:
>ifarqhar@sunc.mqcc.mq.oz.au (Ian Farquhar) writes:
>
>>In article <sutela.656751981@polaris> sutela@polaris.utu.fi (Kari Sutela) writes:
>[I wrote about hardcoding a language on a chip]
>
>>Do phonemes vary greatly from language to language?
>
>I'm not a linguist, but I'd think that they do (at least, a bit). I, for
>example, have had difficulties in producing reasonably sounding Finnish
>speech. On the other hand, this might be a problem with all languages ---

This is a really big problem.  My former workplace has mostly
developed an English text-to-speech interpreter (available RSN from
Panasonic!).  However, there was also a large amount of work done to
get the thing to speak Japanese and Chinese.  This is not simply a
matter of getting it to understand the input.  Rather, a tremendous
amount of the work was "tuning" the various sounds to be more
comprehensible and natural.

As a side note, we started off from the MITalk system, which in turn
was derived from a system developed by the late great Dennis Klatt.
After I showed off my Amiga and they listened to the voice, the folks
at work were very sure that the Amiga text-to-speech was a derivative
of one of these.  So, if you hear a box with Panasonic on the outside
talking in a manner similar to your Amiga (only much, much better),
you'll know why they sound alike.

					-Dan

             Dan Zerkle  zerkle@iris.ucdavis.edu  (916) 754-0240
           Amiga...  Because life is too short for boring computers.