tgd@orstcs.CS.ORST.EDU (Tom Dietterich) (11/09/89)
Your accuracy claims for NETtalk are greatly exaggerated.  I have
replicated the NETtalk study using the same training data, in this
case training on 1000 words chosen at random from the 20,000-word
dictionary provided by Sejnowski.  After running back propagation for
30 epochs using the parameters given in Sejnowski and Rosenberg
(1986), I obtain the following results.  Testing is performed on a
randomly chosen test set of 1000 words.

           WORDS   LETTERS   (PHON / STRESS)   BITS
------------------------------------------------------------------
BP TRAIN:  65.3    94.0       97.0    96.4     99.5
   TEST :  14.9    71.6       81.8    81.4     96.7

Numbers give percentage of correct performance:

  TRAIN:    performance on the training set
  TEST:     performance on the test set
  BITS:     average performance on the 26 output bits of the network
  STRESS:   performance on the 5 stress bits
  PHONEME:  performance on the 21 phoneme bits
  LETTERS:  performance on all 26 bits
  WORDS:    performance on whole words (i.e., each letter must be
            correct)

The NETtalk network has 120 hidden units, 203 input units (which code,
very sparsely, a 7-letter window), and 26 output units (which code, in
a distributed fashion, the 54 phonemes and 6 stresses).  The 26 output
bits are mapped to the nearest phoneme/stress combination that was
observed in the training data.  (That is, a pass is made over the
training data to find all phoneme/stress pairs appearing in it;
decoding considers only those pairs, and ties are broken in favor of
the phoneme/stress pair that appeared more frequently.)  This decoding
scheme is superior to decoding to the nearest syntactically legal
phoneme/stress pair.

--Tom Dietterich
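[Moderator's note: the decoding scheme described above can be sketched
in a few lines of Python.  The names and the Euclidean distance metric
are illustrative assumptions, not the code S&R or Dietterich actually
used.]

```python
# Sketch of the output-decoding scheme described above (hypothetical
# helper names; the distance metric is an assumption).  The network's
# 26 real-valued output bits are mapped to the nearest phoneme/stress
# pair that appeared in the training data, with ties broken in favor
# of the more frequent pair.
from collections import Counter

def build_codebook(training_pairs, encode):
    """training_pairs: list of (phoneme, stress) pairs seen in training.
    encode: maps a (phoneme, stress) pair to its target bit vector."""
    counts = Counter(training_pairs)
    return [(pair, encode(pair), n) for pair, n in counts.items()]

def decode(output_bits, codebook):
    """Return the observed (phoneme, stress) pair whose code is closest
    (squared Euclidean distance) to the network's outputs; ties go to
    the pair that appeared more often in the training data."""
    def key(entry):
        pair, code, n = entry
        dist = sum((o - c) ** 2 for o, c in zip(output_bits, code))
        return (dist, -n)  # smaller distance first, then higher frequency
    return min(codebook, key=key)[0]
```

Restricting the codebook to observed pairs is what makes this scheme
beat decoding to the nearest syntactically legal pair: impossible
combinations simply cannot be emitted.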
heck@Sunburn.Stanford.EDU (Stefan P. Heck) (11/10/89)
According to Rumelhart in his ANN/PDP class here, NETtalk was trained
on a set of the 1000 most common words rather than a random set.  This
run took overnight to learn.  They later also did a second test using
10,000 words.  I don't know which run the accuracy figures are for,
but supposedly it got 87% right, except on words which were irregular.
The best competitor at the time was about 89% accurate.  Human
capability was estimated at 96%.

Stefan
CSD
hougen@umn-cs.CS.UMN.EDU (Dean Hougen) (11/10/89)
In article <13659@orstcs.CS.ORST.EDU> tgd@orstcs.CS.ORST.EDU (Tom
Dietterich) writes:
>Your accuracy claims for NETtalk are greatly exaggerated.  I have
>replicated the NETtalk study using the same training data.  In this
>case, training on 1000 words chosen at random from the 20000-word
                                        ^^^^^^
>dictionary provided by Sejnowski.
>Testing is performed on a randomly chosen test set of 1000 words.
                           ^^^^^^^^
I was under the impression that Sejnowski had NETtalk read real
sentences in real paragraphs, not randomly ordered words.  Right?

BTW, did you present the input as one long string of characters with
the words separated by a single space, or did you present the words
one at a time (i.e., as a long string of characters with the words
separated by three or more spaces), or did you do something else
(what?)?  I'll leave you to determine what effect any of this could
have on NETtalk's performance.

Dean Hougen
--
"Stop making sense.  Stop making sense.  Stop making sense, making
sense." - Talking Heads, "Stop Making Sense," _Stop Making Sense_
tgd@aramis.rutgers.edu (Tom Dietterich) (11/13/89)
heck@Sunburn.Stanford.EDU (Stefan P. Heck) writes:

   According to Rumelhart in his ANN/PDP class here, NETtalk was
   trained on a set of the 1000 most common words rather than a random
   set.  This run took overnight to learn.  They later also did a
   second test using 10,000 words.  I don't know which run the
   accuracy figures are for, but supposedly it got 87% right, except
   on words which were irregular.  The best competitor at the time was
   about 89% accurate.  Human capability was estimated at 96%.

I have also run the algorithm on the 1000 most common words.  The
results are quite similar to those I reported for 1000 randomly
selected words.  Testing is performed on the remaining 19,000 words in
the dictionary.

           WORDS   LETTERS   (PHON / STRESS)   BITS
-------------------------------------------------------------------
BP TRAIN:  76.6    94.8       97.1    97.3     99.6   120 hidden units
   TEST :  13.4    68.1       78.7    80.0     96.0

Sejnowski and Rosenberg also trained and tested NETtalk on a corpus of
connected conversational speech.  I don't have access to that data, so
I haven't replicated that part of their study.

In my work (and in the S&R original), the 1000 most common words are
presented one at a time, surrounded by blanks.

Thomas G. Dietterich
Department of Computer Science
Computer Science Bldg, Room 100
Oregon State University
Corvallis, OR 97331-3902
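[Moderator's note: the one-word-at-a-time presentation through a
7-letter window, with blanks padding past the word boundaries, can be
sketched as follows.  The helper name and exact padding convention are
illustrative assumptions, not the original NETtalk code.]

```python
# Sketch of presenting a single word through a 7-letter sliding window,
# as described above (hypothetical helper; padding convention assumed).
# One window is produced per letter, centered on that letter, with
# blanks filling positions that fall outside the word.
def windows(word, size=7):
    """Yield one size-letter window per letter of `word`, centered on
    that letter and padded with blanks beyond the word's boundaries."""
    half = size // 2
    padded = " " * half + word + " " * half
    for i in range(len(word)):
        yield padded[i:i + size]
```

Each window then gets the sparse 203-unit encoding (29 units per
position x 7 positions) before being fed to the network; presenting
words one at a time, surrounded by blanks, means no window ever spans
two words.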