cc@ucla-cs.UUCP (02/10/87)
Over the last few days I've seen several comments and questions regarding the Narrator/Translator speech system. As I am one of the authors, I hope that I can answer and/or respond appropriately.

First off, please note that Narrator is designed to produce a non-regional dialect of American English. It can be bent to speak in foreign languages (in this context British English is to be considered foreign), but don't expect it to sound reasonable. We are currently discussing with Commodore the possibility of producing European language versions, but there is no firm commitment yet. Note that the development of foreign language versions entails considerably more than producing a new phoneme inventory; prosodics, phonological conversion rules (not to be confused with Translator's text-to-phoneme rules), supra-segmentals, etc. must all be developed for each individual language. Also remember that we do not just store phonemes and splice them together (this produces very poor speech quality); we generate phonemes from acoustic/phonetic data (formants, amplitudes, stress, prosodics, etc.). On top of this are the supra-segmental features (such as fundamental frequency), which differ from language to language (cf. the msg regarding the use of Narrator for Thai). It would be a big job.

->Keith Doyle
I don't want to discourage you from experimenting with splicing phonemes together, but there have been many papers written about that subject and they are not encouraging. If you are going to proceed, check out IEEE ASSP, JASA, and other literature; you'll find a wealth of info there. You are right in assuming that you can't just abut phonemes together. The transitions are very important. One other problem is how to address stress at the word as well as the sentence level.
Anyhoo, check out the literature on diphone and demisyllable synthesis, and by all means keep in touch; I am always interested in what people are doing regarding speech synthesis and recognition.

->Alan Kent
THAI???? I wish you luck. Actually I am very interested in Thai myself. I began to study it a few months ago after visiting Thailand. I even went so far as to produce a Thai font for my (ugh) Mac. As you determined, you can use a ? for the rising tone and a . for the dropped tone. Stress numbers added to syllables are handled in a fairly complex way due to the nature of the pitch rules for English. However, a stress number added to a single syllable should raise the pitch of that syllable and possibly make for a halfway acceptable high/acute tone. I suggest that you use a 9. This probably won't work super well because in English the pitch tends to decline as the utterance goes on, but you can try. I don't know of any way to get the kind of tone used in speaking the number 5, but I'll play around a bit.

->everyone
Sorry if this msg rambles too much; I need to get some sleep.

--jk--

P.S. If anyone wants more info on Narrator/Translator, drop me a line.
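
For anyone who wants to try the Thai tone workaround described in the reply to Alan Kent, here is a minimal sketch in Python. It is not part of Narrator or Translator: the function name, the tone labels, and the sample phoneme strings are all made up for illustration. It assumes the stress digit goes directly after the vowel and the punctuation goes at the end of the syllable, and that an unmarked syllable is an acceptable stand-in for a level tone. The tone used for the number 5 is deliberately left out, since the post gives no way to produce it.

```python
# Hypothetical helper: maps a tone label to the markers suggested in the post.
# Each entry is (stress digit placed after the vowel, punctuation placed
# after the whole syllable).
TONE_MARKS = {
    "mid": ("", ""),       # assumption: leave the syllable unmarked
    "high": ("9", ""),     # stress number 9 to raise the pitch
    "rising": ("", "?"),   # question mark for the rising tone
    "dropped": ("", "."),  # period for the dropped tone
}


def mark_tone(onset_and_vowel, coda, tone):
    """Return a phoneme string for one syllable with tone markers added.

    onset_and_vowel: phoneme codes up to and including the vowel
    coda:            phoneme codes after the vowel (may be empty)
    tone:            one of the keys of TONE_MARKS
    """
    if tone not in TONE_MARKS:
        raise ValueError("no known marker for tone: " + tone)
    stress, punctuation = TONE_MARKS[tone]
    return onset_and_vowel + stress + coda + punctuation


if __name__ == "__main__":
    # Illustrative phoneme strings only; not checked against real Thai words.
    for tone in ("mid", "high", "rising", "dropped"):
        print(tone, mark_tone("MAA", "", tone))
```

Because English pitch declines over an utterance, the same marker may sound different early and late in a sentence, so treat the output as a starting point for listening tests rather than a finished mapping.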