[comp.ai.neural-nets] NETtalk

wilkins@nprdc.arpa (Charles Wilkins) (11/13/89)

It seems to me that people are missing the point regarding NETtalk.  Terry
is a biophysicist, and most of the other people involved are psychologists
and linguists.  It would be great if NETtalk outperformed every possible
alternative method.  But the main purpose of NETtalk is to model human
performance.  Obviously NETtalk is far from a perfect model of the brain
(just compare its number of nodes to the number of neurons), but it is
probably a much better model than a list of linguistic rules.  And if you
look at Terry's research (and similar models, such as the model of learning
the past tense of verbs by Rumelhart and McClelland, two psychologists by
the way), you will see that the nature of the mistakes (i.e., how similar
they are to the mistakes humans make) is as important to their models as is
the degree to which the networks perform.
It is valid to discuss how well networks compare to other methods, but
it is unfair to attack Terry's work solely on that criterion.
 

kfl@speech2.cs.cmu.edu (Kai-Fu Lee) (11/14/89)

In article <4539@arctic.nprdc.arpa>, wilkins@nprdc.arpa (Charles Wilkins) writes:
> It seems to me that people are missing the point regarding NETtalk.  
> ....
> It is valid to discuss how well networks compare to other methods, but
> it is unfair to attack Terry's work solely on that criterion.

Agreed.  But I didn't see any 'attacks' -- merely citations of results and
papers.  Even if there was an attack, it was not (and should not be)
on NetTalk, but on incorrect and misleading claims such as the following 
quote from William Allman's book:  

	"NETalk was trained on a 1000-word transcript of a first-grader's 
	recorded conversation.  Using back propagation, in just 36 hours of 
	training, NETalk learned to enunciate English with over 95% accuracy."

tgd@orstcs.CS.ORST.EDU (Tom Dietterich) (11/17/89)

   wilkins@nprdc.arpa (Charles Wilkins) writes:

   It seems to me that people are missing the point regarding NETtalk.  Terry
   is a biophysicist, and most of the other people involved are psychologists
   and linguists.  It would be great if NETtalk outperformed every possible
   alternative method.  But the main purpose of NETtalk is to model human
   performance.  

This is incorrect.  Here is an excerpt from the original paper [1]:

  "Despite these similarities with human learning and memory, NETtalk is
  too simple to serve as a good model for the acquisition of reading
  skills in humans.  The network attempts to accomplish in one stage
  what occurs in two stages in human development.  Children learn to
  talk first, and only after representations for words and their
  meanings are well developed do they learn to read. [...]

  NETtalk can be used as a research tool to explore many aspects of
  network coding, scaling, and training in a domain that is far from
  trivial. [...]"


The NETtalk task is an interesting learning task with potential
practical applications.  It is entirely appropriate to compare the
training time and final performance of NETtalk with other learning
algorithms. 
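For anyone who wants to make that comparison concrete, the shape of the NETtalk task is easy to state: slide a fixed window of letters over the text and ask a feed-forward network, trained by backpropagation, for the phoneme of the centre letter.  Here is a minimal sketch in Python; the alphabet, phoneme symbols, window data, and layer sizes are toy stand-ins invented for illustration, not the configuration from the Sejnowski and Rosenberg paper.

```python
# Toy NETtalk-style sketch.  Assumptions: the alphabet, phoneme inventory,
# training windows, and layer sizes below are made up for illustration; the
# original network read a seven-letter window and was far larger.
import numpy as np

rng = np.random.default_rng(0)

letters = "abcdefghijklmnopqrstuvwxyz_"   # '_' pads word boundaries
phonemes = ["AE", "B", "K", "T"]          # tiny stand-in phoneme inventory

def encode_window(window):
    """One-hot encode a 7-letter window; the centre letter is the target."""
    x = np.zeros(7 * len(letters))
    for i, ch in enumerate(window):
        x[i * len(letters) + letters.index(ch)] = 1.0
    return x

# (window, phoneme of the centre letter) pairs -- hand-made toy data.
data = [("___cat_", "K"), ("__cat__", "AE"), ("_cat___", "T"),
        ("___bat_", "B"), ("__bat__", "AE"), ("_bat___", "T")]

X = np.stack([encode_window(w) for w, _ in data])
Y = np.eye(len(phonemes))[[phonemes.index(p) for _, p in data]]

# One hidden layer trained by plain backpropagation.
H = 16
W1 = rng.normal(0, 0.1, (X.shape[1], H))
W2 = rng.normal(0, 0.1, (H, len(phonemes)))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(2000):
    h = sigmoid(X @ W1)
    y = sigmoid(h @ W2)
    d2 = Y - y                      # sigmoid + cross-entropy output delta
    d1 = (d2 @ W2.T) * h * (1 - h)  # backpropagate through the hidden layer
    W2 += 0.5 * h.T @ d2
    W1 += 0.5 * X.T @ d1

pred = [phonemes[i] for i in sigmoid(sigmoid(X @ W1) @ W2).argmax(axis=1)]
print(pred)
```

On six windows the network simply memorises the mapping; the interesting questions of training time and generalisation to unseen words only arise at realistic scale.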

--Tom Dietterich

[1] T. J. Sejnowski and C. R. Rosenberg, "Parallel Networks that Learn to
Pronounce English Text," Complex Systems 1: 145-168, 1987.

wilkins@nprdc.arpa (Charles Wilkins) (11/17/89)

I don't want to get into a protracted argument about NETtalk, but I would
like to point out the following.  One important point about NETtalk was
the relationship between the mistakes it made and the mistakes children
make as they are learning to talk.  Another PDP model, by Rumelhart and
McClelland, teaches a network to learn the past tense of verbs.  They use
a training corpus consisting of approximately the first few hundred verbs
that children learn.  They then show the network words it hasn't seen.  The
success of the model is demonstrated not by the network responding to
'take' with 'took', but rather with 'taked', the same sort of mistake a child
would make.  
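That over-regularisation effect can be illustrated with a deliberately crude stand-in.  To be clear, this is not the Rumelhart and McClelland model: the final-letter feature encoding, the verb list, and the one-layer network below are all invented for illustration.  The point it shows is only this: a model trained mostly on regular verbs, when given a novel stem whose features it has never seen, falls back on the dominant "+ed" pattern.

```python
# Toy stand-in for the past-tense experiment.  Assumptions: the last-two-
# letters encoding, the verb list, and the one-layer logistic model are
# invented here; the real model used a richer (Wickelfeature) encoding.
import numpy as np

alphabet = "abcdefghijklmnopqrstuvwxyz"

def features(stem):
    """Encode a verb stem by its last two letters (one-hot each)."""
    x = np.zeros(2 * len(alphabet))
    x[alphabet.index(stem[-2])] = 1.0
    x[len(alphabet) + alphabet.index(stem[-1])] = 1.0
    return x

# Mostly regular verbs (label 1 = form the past tense with "-ed").
verbs = [("walk", 1), ("jump", 1), ("play", 1), ("call", 1), ("want", 1),
         ("help", 1), ("look", 1), ("work", 1), ("go", 0), ("sing", 0)]
X = np.stack([features(v) for v, _ in verbs])
y = np.array([label for _, label in verbs], dtype=float)

# One-layer network (logistic regression) trained by gradient descent.
w = np.zeros(X.shape[1])
b = 0.0
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    g = p - y                      # gradient of the cross-entropy loss
    w -= 0.1 * X.T @ g / len(y)
    b -= 0.1 * g.mean()

# 'take' ends in letters the model never saw in those positions, so the
# decision falls back on the dominant regular pattern ("taked").
p_take = 1.0 / (1.0 + np.exp(-(features("take") @ w + b)))
print("regular (+ed)" if p_take > 0.5 else "irregular")
```

A one-layer model like this is far too weak to capture the developmental U-shaped curve the original work discussed; it only demonstrates the direction of the generalisation error.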
These models can be looked at in many different ways, but I stand by my
claim that the primary purpose for which these models are used by the 'PDP'
group (Sejnowski, Rumelhart, McClelland, et al.) is to model (however
successfully or unsuccessfully) human cognition, and if that leads to other
uses (e.g., as a statistical tool), then wonderful.
This is strictly my own opinion, but I think that the two-volume set
'Parallel Distributed Processing: Explorations in the Microstructure of
Cognition' bears me out.

-Chuck Wilkins

dtgcube (Edward Jung) (11/18/89)

In article <4616@arctic.nprdc.arpa> wilkins@nprdc.arpa (Charles Wilkins) writes:

   the relationship between the mistakes it made and the mistakes children
   make as they are learning to talk.  Another PDP model, by Rumelhart and
   McClelland, teaches a network to learn the past tense of verbs.  They use
   a training corpus consisting of approximately the first few hundred verbs
   that children learn.  They then show the network words it hasn't seen.  The
   success of the model is demonstrated not by the network responding to
   'take' with 'took', but rather with 'taked', the same sort of mistake a child
   would make.  

This claim was refuted by linguists; there was subsequently a bit of a
"controversy" surrounding it (the similarity of the errors to those found in
children, and the effects of artificial "lesions" on the system) in Science
(a year or two ago).

David Rumelhart himself might be amused to hear that the PDP group is so united
in its search for the basis of human cognition, although undoubtedly the phrase
"neural network" would imply some relationship to cognition (or at least to its
"microstructure").  Indeed, prior to the 1988 ICNN (while there was still some
doubt about the physiological relevance of the then-current connectionist
models), connectionists were attempting to find a biological mechanism, or an
update to their own back-propagation mechanism, that would unify physiological
and theoretical neural networks (hence the flurry of interest in NMDA
receptors, etc., in 1988-1989).  Backpropagation has little physiological
relevance to real neural networks.

This is a topic rich in controversy, but that should not inhibit connectionist
research!  At this point it is premature to judge approaches by anything
other than their performance.  Even most human behavioral studies are not
backed by anything else.

-- 
Edward Jung                             The Deep Thought Group, L.P.
BIX: ejung                                      3400 Swede Hill Road
NeXT or UNIX mail                                Clinton, WA.  98236
        UUCP: uunet!dtgcube!ed          Internet: ed@dtg.com

gt4150b@prism.gatech.EDU (RODRIGUEZ,THOMAS KENDALL) (07/22/90)

	I'm looking for references to NETtalk, the text-to-speech
translator that used neural networks.  I believe a lot was
posted about this several months ago.  Any references anyone
could give to other text-to-speech translation techniques
would be greatly appreciated.  Thanks.

	tom

Tom Rodriguez            Georgia Institute of Technology, Atlanta Georgia, 30332
uucp: ...!{allegra,amd,hplabs,seismo,ut-ngp}!gatech!prism!gt4150b
ARPA: gt4150b@prism.gatech.edu
"There's no heaven, but there's no hell either... except this one."