lynne@brillig.umd.edu (Lynne D'Autrechy) (11/10/89)
In a JHU TR (EECS-86/01) published before the Sejnowski and Rosenberg Complex Systems paper, they describe two criteria that were used to judge the performance of the network: a "perfect match" criterion and a "best guess" criterion. Quoting from the TR: "The output was considered a 'perfect match' if the value of each articulatory feature was within a margin of 0.1 of its correct value. This was a much stricter criterion than the 'best guess', which was the phoneme making the smallest angle with the output vector."

As reported in the TR, two types of input were used -- continuous informal speech and words taken from the dictionary. For the first type of input, informal speech, the percentage of correct best guesses after learning was 95%, while the percentage of perfect matches was 55%. For the second type of input, with a network of 120 hidden units, the best-guess performance was 98% while the perfect-match performance was about 52%. Only the "best guess" statistics were reported in the Complex Systems article.

In summary, how impressive the results achieved by NETtalk are depends on which criterion you use to judge the performance of the network.
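For concreteness, the two criteria from the TR could be sketched roughly as follows. Only the 0.1 margin and the smallest-angle rule come from the TR; the function names and the toy phoneme vectors in the usage example are my own illustration:

```python
import math

def perfect_match(output, target, margin=0.1):
    # "Perfect match": every articulatory feature within `margin`
    # of its correct value (the TR uses margin = 0.1).
    return all(abs(o - t) < margin for o, t in zip(output, target))

def best_guess(output, phoneme_vectors):
    # "Best guess": the phoneme whose code vector makes the smallest
    # angle with the output vector, i.e. the largest cosine similarity.
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)
    return max(phoneme_vectors, key=lambda p: cosine(output, phoneme_vectors[p]))
```

With toy two-feature vectors, an output like [0.8, 0.2] against a target of [1.0, 0.0] is still the right "best guess" but fails the "perfect match" test, which is exactly the gap between the 95-98% and 52-55% figures above.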
kfl@speech2.cs.cmu.edu (Kai-Fu Lee) (11/12/89)
Quoting from Dennis Klatt's "Text-to-speech Conversion" in the Journal of the Acoustical Society of America, Sep. 1987: "[Sejnowski's NETtalk was trained and tested] .. on a 20,000 word phonemic dictionary. When evaluated on the words of this training set, the network was correct for about 90% of phonemes and stress patterns. ... A typical knowledge-based rule system is claimed to perform at about ... 97%. ... Lucassen and Mercer ... used the forward-backward algorithm of the IBM speech recognition strategy on a 50,000 word lexicon... They obtained correct letter-to-phoneme correspondences for 94% of the letters in words in a random sample from a 5000 word office-correspondence task." Note that study (1), NETtalk, tests on the training set (earlier posts indicate at most 80% accuracy was obtained on a test set), while study (3), Lucassen and Mercer, reports test-set results, and study (2), the rule system, is somewhere in the middle. So as far as performance is concerned, NETtalk does not work nearly as well as conventional techniques.
nf0a+@andrew.cmu.edu (Nathan W. Fullerton) (11/12/89)
In response to the many messages that have been claiming conventional rule based methods get more accurate results than NETtalk, I would like to point out that accuracy is not the only advantage NETtalk claims. I am not familiar with the specifics of NETtalk, but I have done some work with back propagation and found that the code is remarkably simple and easy to manipulate. I assume that since NETtalk uses back propagation it also has those same advantages. I've heard that rule based systems can become EXTREMELY large when the application is not strictly conducive to a rule based system. I've written back propagation programs in less than 45 pages of LISP code (I've heard higher numbers are the norm, but the programs worked: 87% accuracy on OCR applications). We can't take only accuracy into account. Back propagation has other advantages: small program code, speed of training, and versatility. -Nathan Fullerton
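To give a sense of how little code back propagation takes, here is a minimal one-hidden-layer sketch in Python rather than LISP. The class name, layer sizes, learning rate, and initialization are arbitrary illustrative choices, not anything from NETtalk or the poster's programs:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNet:
    # One hidden layer, sigmoid units, trained by plain gradient descent
    # on squared error -- the core of a back propagation implementation.
    def __init__(self, n_in, n_hid, n_out, seed=0):
        rnd = random.Random(seed)
        # One extra weight per unit serves as the bias.
        self.w1 = [[rnd.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hid)]
        self.w2 = [[rnd.uniform(-1, 1) for _ in range(n_hid + 1)] for _ in range(n_out)]

    def forward(self, x):
        xb = x + [1.0]                                   # append bias input
        h = [sigmoid(sum(w * v for w, v in zip(ws, xb))) for ws in self.w1]
        hb = h + [1.0]
        o = [sigmoid(sum(w * v for w, v in zip(ws, hb))) for ws in self.w2]
        return h, o

    def train_step(self, x, target, lr=0.5):
        h, o = self.forward(x)
        xb, hb = x + [1.0], h + [1.0]
        # Output deltas: derivative of squared error through the sigmoid.
        d_out = [(t - ov) * ov * (1 - ov) for t, ov in zip(target, o)]
        # Hidden deltas: output error propagated back through w2.
        d_hid = [hv * (1 - hv) * sum(d * self.w2[k][j] for k, d in enumerate(d_out))
                 for j, hv in enumerate(h)]
        for k, d in enumerate(d_out):
            for j in range(len(hb)):
                self.w2[k][j] += lr * d * hb[j]
        for j, d in enumerate(d_hid):
            for i in range(len(xb)):
                self.w1[j][i] += lr * d * xb[i]
```

The whole learning procedure is two loops and a delta rule per layer, which is the simplicity being claimed; a comparably capable rule base would need the rules written out one by one.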
eliot@phoenix.Princeton.EDU (Eliot Handelman) (11/12/89)
In article <cZLCp3S00Xc5I6tElg@andrew.cmu.edu> nf0a+@andrew.cmu.edu (Nathan W. Fullerton) writes:
;
; In response to the many messages that have been claiming conventional
;rule based methods get more accurate results than NETtalk, I would like
;to point out that accuracy is not the only advantage NETtalk claims.
What appears to be true is:
1. You get the EFFECT of coding rules without explicit rules
2. The net is a more compact piece of data than a rule base
3. It still works when you perform the neural net equivalent of a lobotomy
I may be wrong, but I seem to recall that Rosenberg's work centered on
point 3: they cut away areas of connections and the net still worked
reasonably well. What is interesting is how the memory is distributed,
not what the thing can do.
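The "lobotomy" experiment described above amounts to zeroing out a random fraction of a trained network's connections and re-measuring performance. A minimal sketch of the lesioning step (the function name and list-of-lists weight representation are my own, not from the Sejnowski and Rosenberg experiments):

```python
import random

def lesion(weights, fraction, seed=0):
    # Return a copy of a weight matrix (list of rows) with roughly
    # `fraction` of the connections cut, i.e. set to zero. Running the
    # network with the lesioned weights and re-scoring it on the test
    # set shows how gracefully performance degrades.
    rnd = random.Random(seed)
    lesioned = [row[:] for row in weights]               # leave original intact
    for row in lesioned:
        for i in range(len(row)):
            if rnd.random() < fraction:
                row[i] = 0.0
    return lesioned
```

Because no single connection carries any one letter-to-sound correspondence, accuracy typically falls off gradually with the lesioned fraction instead of collapsing, which is the distributed-memory point being made.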
tgd@aramis.rutgers.edu (Tom Dietterich) (11/13/89)
In article <cZLCp3S00Xc5I6tElg@andrew.cmu.edu>, nf0a+@andrew.cmu.edu (Nathan W. Fullerton) writes:
>
> In response to the many messages that have been claiming conventional
> rule based methods get more accurate results than NETtalk, I would like
> to point out that accuracy is not the only advantage NETtalk claims.
> [...] I've written back propagation programs in less than 45 pages of
> LISP code (I've heard higher numbers are the norm, but the programs
> worked: 87% accuracy on OCR applications).
> We can't take only accuracy into account. Back propagation has other
> advantages: small program code, speed of training, and versatility.
>
> -Nathan Fullerton

ID3 can be implemented in a handful of functions (4000 bytes). It is a simpler and more direct algorithm than backpropagation. Several studies have shown speed of training for ID3 to be between 10 and 100 times faster than backpropagation. While ID3 is very versatile, backpropagation *is* definitely more versatile. Rule-based systems (such as those described by Klatt) may attain superior performance. The challenge is to come up with learning methods that can match the performance of hand-crafted rule bases. It looks like neither ID3 nor backpropagation can meet this challenge, but the precise comparative studies have not been done.

--Tom Dietterich
tgd@cs.orst.edu
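A bare-bones ID3 really does fit in a handful of functions: compute entropy, pick the attribute with the highest information gain, split, and recurse. This sketch is my own illustration (the dict-based example representation is a convenience, not Quinlan's code):

```python
import math
from collections import Counter

# Each example is (features, label), where features is a dict
# mapping attribute name -> discrete value.

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain(examples, attr):
    # Information gain = entropy before the split minus the
    # size-weighted entropy of the subsets the split produces.
    labels = [lab for _, lab in examples]
    by_value = {}
    for feats, lab in examples:
        by_value.setdefault(feats[attr], []).append(lab)
    remainder = sum(len(ls) / len(examples) * entropy(ls)
                    for ls in by_value.values())
    return entropy(labels) - remainder

def id3(examples, attrs):
    labels = [lab for _, lab in examples]
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]      # leaf: majority class
    best = max(attrs, key=lambda a: gain(examples, a))
    subsets = {}
    for feats, lab in examples:
        subsets.setdefault(feats[best], []).append((feats, lab))
    return (best, {v: id3(sub, [a for a in attrs if a != best])
                   for v, sub in subsets.items()})

def classify(tree, feats):
    # Internal nodes are (attribute, branches); leaves are labels.
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[feats[attr]]
    return tree
```

Because the algorithm is a single greedy recursion with no gradient computation and no repeated passes over the data per weight, the 10-100x training-speed advantage over backpropagation cited above is unsurprising.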