[comp.ai.neural-nets] NETtalk results

lynne@brillig.umd.edu (Lynne D'Autrechy) (11/10/89)

In a JHU TR (EECS-86/01) published before their Complex Systems paper,
Sejnowski and Rosenberg describe two criteria that were used to judge the
performance of the network.  The first is a "perfect match" criterion and
the second is a "best guess" criterion.  Quoting from the TR,

	The output was considered a "perfect match" if the value of
	each articulatory feature was within a margin of 0.1 of its
	correct value.  This was a much stricter criterion than the 
	"best guess", which was the phoneme making the smallest angle
	with the output vector.
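
Those two criteria are straightforward to state in code.  A minimal sketch
(hypothetical: the feature vectors and phoneme table below are stand-ins,
not NETtalk's actual 26-feature representation):

```python
import math

def perfect_match(output, target, margin=0.1):
    """Every articulatory feature within `margin` of its correct value."""
    return all(abs(o - t) < margin for o, t in zip(output, target))

def best_guess(output, phoneme_vectors):
    """Index of the phoneme vector making the smallest angle with `output`
    (smallest angle == largest cosine similarity)."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))
    return max(range(len(phoneme_vectors)),
               key=lambda i: cosine(output, phoneme_vectors[i]))
```

Note how much looser "best guess" is: an output can pick out the right
phoneme by angle while several of its features miss the 0.1 margin, which
is consistent with the 95%-vs-55% gap reported above.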


As reported in the TR, two types of input were used -- continuous informal
speech and words taken from the dictionary.  For the first type of input,
informal speech, the percentage of correct best guesses after learning was
95% while the percentage of perfect matches was 55%.  For the second type
of input, with a network of 120 hidden units, the best guess performance
was 98% while the perfect match performance was about 52%.

Only the "best guess" statistics were reported in the Complex Systems
article.  In summary, the impressiveness of the results achieved by
NETtalk depends on which criterion you use to judge the performance
of the network.

kfl@speech2.cs.cmu.edu (Kai-Fu Lee) (11/12/89)

Quoting from Dennis Klatt's "Text-to-speech Conversion" in the Journal
of the Acoustical Society of America, Sep. 1987:

"[Sejnowsk's NETALK was trained and tested] .. on a 20,000 word phonemic
dictionary.  When evaluated on the words of this training set, the network
was correct for about 90% of phonemes and stress patterns.

... A typical knowledge-based rule system is claimed to perform at
about ... 97%.

... Lucassen and Mercer ... used the forward-backward algorithm of the IBM
speech recognition strategy on a 50,000 word lexicon...  They obtained
correct letter-to-phoneme correspondences for 94% of the letters in words in
a random sample from a 5000 word office-correspondence task."

Note that study (1), NETtalk, tests on the training set (earlier posts
indicate at most 80% accuracy was obtained on a test set), while study
(3), Lucassen and Mercer, reports test-set results, and study (2), the
knowledge-based rule system, is somewhere in the middle.  So as far
as performance is concerned, NETtalk does not work nearly as well as
conventional techniques.

nf0a+@andrew.cmu.edu (Nathan W. Fullerton) (11/12/89)

	In response to the many messages that have been claiming conventional
rule based methods get more accurate results than NETtalk, I would like
to point out that accuracy is not the only advantage NETtalk claims.
I am not familiar with the specifics of NETtalk, but I have done some
work with back propagation and found that the code is remarkably simple
and easy to manipulate; I assume that since NETtalk uses back
propagation it also has those same advantages.  I've heard that rule
based systems can become EXTREMELY large when the application is not
strictly conducive to a rule based system.  I've written back
propagation programs in fewer than 45 pages of LISP code (I've heard
higher numbers are the norm, but the programs worked: 87% accuracy on
OCR applications).
	We can't take only accuracy into account.  Back propagation has other
advantages: small program code, fast training, and versatility.
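
To give a feel for how small the core algorithm is, here is a minimal
one-hidden-layer back propagation sketch (learning XOR rather than
text-to-phoneme; the learning rate, hidden size, and epoch count are
arbitrary choices of mine, not NETtalk's):

```python
import math
import random

def train_xor(epochs=5000, lr=0.5, hidden=3, seed=0):
    """Train a tiny sigmoid net on XOR by plain online back propagation.
    Returns a predict(a, b) closure over the learned weights."""
    rng = random.Random(seed)
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    # w1[j]: weights from (input a, input b, bias) into hidden unit j
    w1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # w2: weights from (hidden units, bias) into the single output unit
    w2 = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    for _ in range(epochs):
        for (a, b), t in data:
            x = (a, b, 1.0)
            # forward pass
            h = [sig(sum(w * xi for w, xi in zip(row, x))) for row in w1]
            hb = h + [1.0]
            y = sig(sum(w * hi for w, hi in zip(w2, hb)))
            # backward pass: delta = error * sigmoid derivative
            dy = (y - t) * y * (1 - y)
            dh = [dy * w2[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            for j in range(hidden + 1):
                w2[j] -= lr * dy * hb[j]
            for j in range(hidden):
                for i in range(3):
                    w1[j][i] -= lr * dh[j] * x[i]
    def predict(a, b):
        x = (a, b, 1.0)
        h = [sig(sum(w * xi for w, xi in zip(row, x))) for row in w1] + [1.0]
        return sig(sum(w * hi for w, hi in zip(w2, h)))
    return predict
```

The whole thing, forward and backward pass included, fits on a page;
scaling it to NETtalk is mostly a matter of wider layers and more data.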

-Nathan Fullerton

eliot@phoenix.Princeton.EDU (Eliot Handelman) (11/12/89)

In article <cZLCp3S00Xc5I6tElg@andrew.cmu.edu> nf0a+@andrew.cmu.edu (Nathan W. Fullerton) writes:
;
;	In response to the many messages that have been claiming conventional
;rule based methods get more accurate results than NETtalk, I would like
;to point out that accuracy is not the only advantage NETtalk claims. 

What appears to be true is:

1. You get the EFFECT of coding rules without explicit rules
2. The net is a more compact piece of data than a rule base
3. It still works when you perform the neural net equivalent of a lobotomy

I may be wrong, but I seem to recall that Rosenberg's work centered on
point 3. They cut away areas of connections and the net still worked
reasonably well. What is interesting is how the memory is distributed,
not what the thing can do.
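
The experiment itself is easy to sketch: knock out a random fraction of
connections and re-measure accuracy before and after.  A generic version
(my own sketch, not Rosenberg's actual procedure):

```python
import random

def lobotomize(weights, fraction, seed=0):
    """Zero out a random fraction of the entries of a weight matrix
    (list of rows), in place.  Measuring task accuracy before and after
    shows whether performance degrades gracefully -- the distributed-memory
    point above -- or collapses, as a rule base would if you deleted rules."""
    rng = random.Random(seed)
    flat = [(i, j) for i, row in enumerate(weights) for j in range(len(row))]
    for i, j in rng.sample(flat, int(fraction * len(flat))):
        weights[i][j] = 0.0
    return weights
```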

tgd@aramis.rutgers.edu (Tom Dietterich) (11/13/89)

In article <cZLCp3S00Xc5I6tElg@andrew.cmu.edu>, nf0a+@andrew.cmu.edu (Nathan W. Fullerton) writes:
> 
> 	In response to the many messages that have been claiming conventional
> rule based methods get more accurate results than NETtalk, I would like
> to point out that accuracy is not the only advantage NETtalk claims. 
> [...]  I've written back
> propagation programs in less than 45 pages of LISP code (I've heard
> higher numbers are the norm but the programs worked, 87% accuracy on OCR
> applications). 
> 	We can't take only accuracy into account.  Back propagation has other
> advantages, small size program code, speed of training, and versatility. 
> 
> -Nathan Fullerton

ID3 can be implemented in a handful of functions (4000 bytes).  It is
a simpler and more direct algorithm than backpropagation.  Several
studies have shown ID3 to train between 10 and 100 times faster than
backpropagation.  While ID3 is very versatile, backpropagation *is*
definitely more versatile.
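
To make the size claim concrete, a bare-bones ID3 really does fit in
three short functions.  A sketch (categorical attributes only, no
pruning; the nested tuple/dict tree format is my own choice):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a multiset of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def id3(examples, attrs):
    """examples: list of (feature_dict, label).  Returns either a bare
    label (leaf) or (attribute, {value: subtree}) for the attribute with
    the highest information gain."""
    labels = [y for _, y in examples]
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    def gain(a):
        vals = set(x[a] for x, _ in examples)
        rem = sum((len(sub) / len(examples)) * entropy([y for _, y in sub])
                  for v in vals
                  for sub in [[(x, y) for x, y in examples if x[a] == v]])
        return entropy(labels) - rem
    best = max(attrs, key=gain)
    return (best, {v: id3([(x, y) for x, y in examples if x[best] == v],
                          [a for a in attrs if a != best])
                   for v in set(x[best] for x, _ in examples)})

def classify(tree, x):
    """Walk the tree: internal nodes are tuples, leaves are labels."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[x[attr]]
    return tree
```

Training here is a single recursive pass over the data, with no iterative
weight updates at all, which is where the 10-100x speed advantage over
backpropagation comes from.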

Rule-based systems (such as those described by Klatt) may attain
superior performance.  The challenge is to come up with learning
methods that can match the performance of hand-crafted rule bases.

It looks like neither ID3 nor backpropagation can meet this challenge,
but the precise comparative studies have not been done.

--Tom Dietterich
tgd@cs.orst.edu