[comp.sys.amiga] Questions/Comments on translator/narrator

cc@ucla-cs.UUCP (02/10/87)

Over the last few days I've seen several comments and questions regarding the
Narrator/Translator speech system.  As I am one of the authors, I hope that I
can answer and/or respond appropriately.

First off, please note that Narrator is designed to produce a non-regional
dialect of American English.  It can be bent to speak in foreign languages (in
this context British English is to be considered foreign), but don't expect
it to sound reasonable.  We are currently discussing with Commodore the 
possibility of producing European language versions, but there is no firm
commitment yet.  Note that the development of foreign language versions
entails considerably more than producing a new phoneme inventory; prosodics,
phonological conversion rules (not to be confused with Translator's text-to-
phoneme rules), supra-segmentals, etc, etc, etc all must be developed for each
individual language.  Also remember that we do not just store phonemes and
splice them together (this produces very poor speech quality), but we
generate phonemes from acoustic/phonetic data (formants, amplitudes, stress,
prosodics, etc).  On top of this are the supra-segmental features (such as
fundamental frequency) which differ from language to language (cf. the msg
regarding the use of Narrator for Thai).  It is a big job.

->Keith Doyle
  I don't want to discourage you from experimenting with splicing phonemes
  together, but there have been many papers written about that subject and
  they are not encouraging.  If you are going to proceed, check out IEEE
  ASSP, JASA, and other literature; you'll find a wealth of info there.  You
  are right in assuming that you can't just abut phonemes together.  The
  transitions are very important.  One other problem is how to address stress 
  at the word as well as the sentence level.  Anyhoo, check out the literature
  on diphone and demisyllable synthesis, and by all means keep in touch; I am
  always interested in what people are doing regarding speech synthesis and
  recognition.
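
  To make the diphone idea concrete, here is a minimal sketch (not anything
  taken from Narrator) of how a phoneme string is usually cut into diphone
  units, so that each stored unit contains a transition rather than a bare
  phoneme.  The function name, the "#" silence marker, and the example
  phoneme codes are all made up for illustration.

```python
def to_diphones(phonemes, boundary="#"):
    """List the diphone units needed to cover a phoneme sequence.

    Each unit spans the transition from one phoneme into the next, which is
    why diphone inventories are stored instead of isolated phonemes.
    The boundary symbol stands for silence and is a hypothetical choice.
    """
    if not phonemes:
        return []
    # Pad with silence so the first and last phonemes get their own
    # silence-to-sound and sound-to-silence transitions.
    padded = [boundary] + list(phonemes) + [boundary]
    # Pair every phoneme with its successor.
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


if __name__ == "__main__":
    print(to_diphones(["K", "AE", "T"]))  # ['#-K', 'K-AE', 'AE-T', 'T-#']
```

  Note that this only picks the units.  It does nothing about stress or
  pitch, which is the harder part mentioned above.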

->Alan Kent
  THAI????  I wish you luck.  Actually I am very interested in Thai myself.
  I began to study it a few months ago after visiting Thailand.  I even went
  so far as to produce a Thai font for my (ugh) Mac.  As you determined, you
  can use a ? for the rising tone, and a . for the dropped tone.  Stress
  numbers added to syllables are handled in a fairly complex way due to
  the nature of the pitch rules for English.  However, stress numbers added
  to single syllables should raise the pitch of that syllable and possibly
  make for a halfway acceptable high/acute tone.  I suggest that you use a 
  9.  This probably won't work super well because in English the pitch tends
  to decline as the utterance goes on, but you can try.  I don't know of any
  way to get the kind of tone used in speaking the number 5, but I'll play
  around a bit.  
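
  If you want to automate the experiment, the markers suggested above can be
  applied with a small helper like the one below.  This is only a sketch:
  the function, the tone names used as keys, and the sample phoneme strings
  are hypothetical, and it assumes the stress digit goes directly after the
  vowel of the syllable.  Tones not covered above are deliberately rejected.

```python
# Tone markers as suggested in the text: "?" for rising, "." for dropped,
# and a stress number of 9 for an approximate high tone.
# Each entry is (stress digit placed after the vowel, trailing punctuation).
TONE_MARKERS = {
    "rising": ("", "?"),
    "dropped": ("", "."),
    "high": ("9", ""),
}


def mark_tone(onset, vowel, coda, tone):
    """Build one syllable of phoneme text with a rough tone marker.

    onset, vowel, and coda are phoneme strings supplied by the caller;
    this helper does not validate them against any phoneme table.
    """
    if tone not in TONE_MARKERS:
        raise ValueError(f"no marker suggested for tone: {tone}")
    stress, punctuation = TONE_MARKERS[tone]
    # Assumption: the stress digit immediately follows the vowel.
    return f"{onset}{vowel}{stress}{coda}{punctuation}"


if __name__ == "__main__":
    for tone in ("rising", "dropped", "high"):
        print(tone, mark_tone("M", "AA", "", tone))
```

  Each marked syllable would still have to be spoken as its own short
  utterance to keep the English pitch rules from smearing the tones together,
  and even then the results may not be convincing.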

->everyone
  Sorry if this msg rambles too much; I need to get some sleep.



--jk--

P.S.  If anyone wants more info on Narrator/Translator, drop me a line.