[comp.ai.neural-nets] solving the XOR problem with Rumelhart's neural net

greg@huey.Jpl.Nasa.GOV (Greg Wanish) (06/25/91)

Has anyone been able to reproduce Rumelhart's performance on the XOR problem?
I am referring to the results published in PDP Vol. 1, Chapter 8: "Learning
Internal Representations by Error Propagation".  A student in his lab, Yves
Chauvin, trains a 3-node net to solve the XOR problem in an average of 245
iterations.  When I attempt to reproduce this result, my net solves it in
~2000 iterations.  I believe I have used the correct input range (0,1) and
output values (.1, .9), and the correct values for learning rate and
momentum -- eta = .25 and momentum = .9.  Omitting the momentum term did not
decrease the number of iterations.  I have tried several different sets of
initial weights, and my runs have been consistently longer than Rumelhart's.

				Greg
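
A minimal sketch, in Python, of the setup described above: a 2-2-1 logistic
net trained by plain back-propagation with eta = .25, momentum = .9, and
.1/.9 targets.  The initialization range and the stopping rule below are
assumptions, since the post does not give them.

import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # XOR inputs
T = np.array([[0.1], [0.9], [0.9], [0.1]])               # "soft" .1/.9 targets

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_xor(rng, eta=0.25, alpha=0.9, init=0.5, max_passes=10000):
    """Return the number of passes to solution, or None on failure."""
    W1 = rng.uniform(-init, init, (3, 2))     # input (+bias) -> hidden
    W2 = rng.uniform(-init, init, (3, 1))     # hidden (+bias) -> output
    dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
    Xb = np.hstack([X, np.ones((4, 1))])      # fold the bias in as an extra input
    for p in range(1, max_passes + 1):
        H = sigmoid(Xb @ W1)                  # forward pass
        Hb = np.hstack([H, np.ones((4, 1))])
        Y = sigmoid(Hb @ W2)
        if np.all(np.abs(Y - T) < 0.1):       # every case within 0.1 of target
            return p
        d_out = (T - Y) * Y * (1 - Y)         # delta = error * f'(net), f' = y(1-y)
        d_hid = (d_out @ W2[:2].T) * H * (1 - H)
        dW2 = eta * Hb.T @ d_out + alpha * dW2   # dw(t) = eta*grad + alpha*dw(t-1)
        dW1 = eta * Xb.T @ d_hid + alpha * dW1
        W2 += dW2
        W1 += dW1
    return None

print(train_xor(np.random.default_rng(0)))    # passes needed for one random start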

ted@aps1.spa.umn.edu (Ted Stockwell) (06/25/91)

In article <1991Jun24.182310.11745@jato.jpl.nasa.gov> greg@huey.Jpl.Nasa.GOV (Greg Wanish) writes:

> Has anyone been able to reproduce Rumelhart's performance on the XOR
> problem?  I am referring to the results published in PDP Vol. 1,
> Chapter 8: "Learning Internal Representations by Error Propagation".
> A student in his lab, Yves Chauvin, trains a 3-node net to solve the
> XOR problem in an average of 245 iterations.  When I attempt to
> reproduce this result, my net solves it in ~2000 iterations.  I
> believe I have used the correct input range (0,1) and output values
> (.1, .9), and the correct values for learning rate and momentum --
> eta = .25 and momentum = .9.  Omitting the momentum term did not
> decrease the number of iterations.  I have tried several different
> sets of initial weights, and my runs have been consistently longer
> than Rumelhart's.
> 
> 				   Greg

I attempted to train 1000 nets with weights and bias terms randomly
initialized to values in the range from -0.5 to 0.5, with a learning
rate of 0.25 and a momentum of 0.9.  Training was aborted if it had
not succeeded after 750 passes through the training data; this was
the case for 264 of the attempts.  For the networks that completed
training, an average of 499 passes was required; however, the number
of passes needed was spread fairly evenly from 250 to more than 700.

The book does not say how Chauvin's networks were initialized, and it
appears that this makes a significant difference.

--
Ted Stockwell                                     U of MN, Dept. of Astronomy
ted@aps1.spa.umn.edu                          Automated Plate Scanner Project
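
A sketch of the experiment Ted describes, reusing the train_xor() sketch
given earlier in the thread: 1000 nets with weights drawn from [-0.5, 0.5],
training aborted after 750 passes.

import numpy as np

rng = np.random.default_rng(1)
results = [train_xor(rng, init=0.5, max_passes=750) for _ in range(1000)]
failed = results.count(None)
solved = [r for r in results if r is not None]

print(f"{failed} of 1000 runs did not converge within 750 passes")
if solved:
    print(f"mean passes for the rest: {np.mean(solved):.0f}, "
          f"range {min(solved)}-{max(solved)}")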

rcssjh@minyos.xx.rmit.oz.au (Steven Hayes) (06/25/91)

greg@huey.Jpl.Nasa.GOV (Greg Wanish) writes:

>Has anyone been able to reproduce Rumelhart's performance on the XOR problem?
>I am referring to the results published in PDP Vol. 1, Chapter 8: "Learning
>Internal Representations by Error Propagation".  A student in his lab, Yves
>Chauvin, trains a 3-node net to solve the XOR problem in an average of 245
>iterations.  When I attempt to reproduce this result, my net solves it in
>~2000 iterations.  I believe I have used the correct input range (0,1) and
>output values (.1, .9), and the correct values for learning rate and
>momentum -- eta = .25 and momentum = .9.  Omitting the momentum term did not
>decrease the number of iterations.  I have tried several different sets of
>initial weights, and my runs have been consistently longer than Rumelhart's.

>				Greg


Have you tried it with a lower value for the momentum term, say 0.5?

-Steve
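
For what it's worth, the momentum term enters the update as dw(t) =
eta*grad + alpha*dw(t-1), so for a roughly constant gradient the steps
grow toward eta*grad/(1 - alpha): a 10x amplification at alpha = .9, but
only 2x at alpha = .5.  The suggestion is easy to test with the
train_xor() sketch from earlier in the thread:

import numpy as np
for alpha in (0.9, 0.5):
    print(alpha, train_xor(np.random.default_rng(0), alpha=alpha))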

kooijman@duteca4.et.tudelft.nl (Richard Kooijman) (06/25/91)

ted@aps1.spa.umn.edu (Ted Stockwell) writes:

>In article <1991Jun24.182310.11745@jato.jpl.nasa.gov> greg@huey.Jpl.Nasa.GOV (Greg Wanish) writes:

>> Has anyone been able to reproduce Rumelhart's performance on the XOR
>> problem?  I am referring to the results published in PDP Vol. 1,
>> Chapter 8: "Learning Internal Representations by Error Propagation".
>> A student in his lab, Yves Chauvin, trains a 3-node net to solve the
>> XOR problem in an average of 245 iterations.  When I attempt to
>> reproduce this result, my net solves it in ~2000 iterations.  I
>> believe I have used the correct input range (0,1) and output values
>> (.1, .9), and the correct values for learning rate and momentum --
>> eta = .25 and momentum = .9.  Omitting the momentum term did not
>> decrease the number of iterations.  I have tried several different
>> sets of initial weights, and my runs have been consistently longer
>> than Rumelhart's.
>> 
>> 				   Greg

>I attempted to train 1000 nets with weights and bias terms randomly
>initialized to values in the range from -0.5 to 0.5, with a learning
>rate of 0.25 and a momentum of 0.9.  Training was aborted if it had
>not succeeded after 750 passes through the training data; this was
>the case for 264 of the attempts.  For the networks that completed
>training, an average of 499 passes was required; however, the number
>of passes needed was spread fairly evenly from 250 to more than 700.

>The book does not say how Chauvin's networks were initialized, and it
>appears that this makes a significant difference.

For anyone who is interested, I can train a 2-2-1 network to solve the 
XOR function in an average of 15 steps.

I use a neuron output range of -1 to 1.  Any value beyond that crashes the
back-propagation rule, but I haven't figured out exactly why.  Does anybody
know the answer?


Richard.
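
One possible explanation, though it is a guess since Richard's exact update
rule isn't shown: the derivative term in back-propagation has to match the
activation function.  For symmetric units with outputs in (-1, 1) the usual
choice is f(x) = tanh(x), whose derivative is f'(x) = 1 - f(x)^2, and
1 - y*y is only non-negative for y in [-1, 1].  If an output or target falls
outside that interval, the "derivative" goes negative and the weight updates
point the wrong way:

# the tanh derivative term 1 - y*y at a few output values: positive
# inside (-1, 1), negative once y leaves tanh's range
for y in (0.5, 0.99, 1.5):
    print(y, 1 - y * y)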