Alex.Waibel@CAD.CS.CMU.EDU (03/18/87)
With respect to the inquiry about the Toshiba Voice Recognition Chip, here are two words of caution:

First off, recognition performance claims in percent are nice to know, but in general should be taken with a grain of salt. These numbers are HEAVILY dependent on whether speech was recorded in a quiet or noisy environment, whether the speaker is cooperative or not, whether the test was done speaker-dependently or speaker-independently, and whether the vocabulary in question is ambiguous (BOOK, COOK, TOOK) or not (BOOK, UNIVERSITY). Most of the current systems are also isolated word systems, i.e., one must pause between words. Whether such a system will work or not therefore really depends on your particular recognition task and environment.

Second, Japanese has two convenient properties: words are mostly consonant-vowel sequences, and the Japanese writing system (Kana) consists essentially of sequences of syllable symbols. Toshiba and other Japanese manufacturers therefore have systems that allow the speaker to speak one kana at a time (there are on the order of 100 or so, including some alternates) and have the word processor then convert a sequence of kanas into a kanji (the Chinese word symbol).

Now, unfortunately, this doesn't carry over easily into English. Since English syllables employ complex consonant clusters, there are more on the order of 20,000 English syllables (with 100,000 possible), which makes for a substantially harder recognition task. Also, speaking these syllables in isolation is a lot less natural than in Japanese, since our writing system isn't syllable based. The corresponding recognition of phonemes instead of syllables in English is a VERY hard problem, with good recognition accuracy hard to come by.

Toshiba and other manufacturers (in Japan and the USA) also have whole-word based systems, but most of them require training of the system, i.e., all words in the vocabulary must be read in at least once by the user.
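To make the vocabulary point concrete (BOOK, COOK, TOOK versus BOOK, UNIVERSITY), here is a rough sketch in Python. It is only an illustration and rests on a loud assumption: it uses spelling edit distance as a stand-in for acoustic similarity, which a real recognizer would measure on the speech signal, not on letters. The function names `edit_distance` and `min_pairwise_distance` are made up for this example.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def min_pairwise_distance(vocabulary):
    """Smallest distance between any two words in the vocabulary.
    ASSUMPTION: spelling distance is a crude proxy for how easily
    two words are confused when spoken."""
    return min(edit_distance(a, b) for a, b in combinations(vocabulary, 2))


confusable = ["BOOK", "COOK", "TOOK"]
distinct = ["BOOK", "UNIVERSITY"]

print(min_pairwise_distance(confusable))
print(min_pairwise_distance(distinct))
```

Under this crude proxy, the words of the first vocabulary sit one substitution apart while those of the second share nothing at all, which is why a quoted accuracy figure means little unless you know which kind of vocabulary it was measured on.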
I've seen the systems at Toshiba and they do indeed do impressive work, but as far as hooking it up to your home computer and talking away in English, I'm afraid the story is still a little more complicated than that.

Alex Waibel, CMU