coms2146@waikato.ac.nz (Alistair Veitch, University of Waikato, New Zealand) (08/16/90)
Has anybody out there worked with Williams and Zipser's "Real-time recurrent learning algorithm"? [Connection Science, Vol 1, No 1]. We are currently trying to implement this algorithm, but have run into some problems. We've got it to run successfully on the various XOR problems described, the "ab" problem (recognise the first "b" after an "a"), and the oscillation problems. What we can't seem to achieve is success on the Turing machine problem. As this is perhaps the major result of the paper, it seems important to duplicate it to reassure ourselves that everything is correct. Has anyone else had success/failure with this problem? If success, would it be possible to post your source? (We think we've got it right, but...)

--
Alistair Veitch                   Phone: +64 71 562889 ext. 8768
Internet: coms2146@waikato.ac.nz         +64 71 562388 (home)
SNAIL: Computer Science Dept, University of Waikato, Hamilton, New Zealand
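[For readers following the thread: the core RTRL recursion can be sketched as below. This is our own minimal reading of the Williams & Zipser paper, not the poster's code; the task (output the previous input bit), network size, learning rate, and all variable names are illustrative assumptions.]

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 1                      # fully recurrent units, external input lines
eta = 0.5                        # learning rate (our choice, not from the paper)
nz = n + m + 1                   # recurrent + input + bias lines into each unit
W = rng.normal(0.0, 0.5, (n, nz))

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

y = np.zeros(n)                  # unit activations y(t)
p = np.zeros((n, n, nz))         # sensitivities p[k,i,j] = dy_k / dw_ij
errs = []
x_prev = 0.0

for t in range(5000):
    x = float(rng.integers(2))           # random input bit
    z = np.concatenate([y, [x, 1.0]])    # activations, input, bias
    s = W @ z
    y_new = sigmoid(s)
    fprime = y_new * (1.0 - y_new)

    e = np.zeros(n)
    e[0] = x_prev - y_new[0]             # toy task: unit 0 echoes the previous input
    errs.append(abs(e[0]))

    # RTRL sensitivity recursion:
    # p'[k,i,j] = f'(s_k) * (sum_l W[k,l] * p[l,i,j] + delta_{ki} * z_j)
    kron = np.zeros_like(p)
    kron[np.arange(n), np.arange(n)] = z
    p = fprime[:, None, None] * (np.tensordot(W[:, :n], p, axes=1) + kron)

    # weight update: dw_ij = eta * sum_k e_k * p[k,i,j]
    W += eta * np.einsum('k,kij->ij', e, p)
    y = y_new
    x_prev = x

print(float(np.mean(errs[:200])), float(np.mean(errs[-200:])))
```

Note the O(n^4) storage/update cost of the sensitivity tensor p, which is the practical limit on scaling RTRL up to problems the size of the Turing machine task.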
ins_atge@jhunix.HCF.JHU.EDU (Thomas G Edwards) (08/17/90)
In article <1243.26cac1c4@waikato.ac.nz> coms2146@waikato.ac.nz (Alistair Veitch, University of Waikato, New Zealand) writes:
>Has anybody out there worked with Williams and Zipsers "Real-time recurrent
>learning algorithm"? [Connection Science, Vol 1, No 1].

I haven't actually implemented this algorithm, but I have heard that it is important to use the "teacher forcing" method they discuss in order to learn difficult problems.

You might also want to look at J. Schmidhuber, "Making the World Differentiable: On using supervised learning fully-recurrent networks for dynamic reinforcement learning and planning in non-stationary environments", FKI Report 125-90, Technische Universität München, 1990. A pole-balancer is trained by reinforcement learning (i.e. pain is applied when the pole is dropped).

And for an explanation of why gradient-descent methods will probably not give you reasonable temporal learning, see J. Schmidhuber, "Towards compositional learning with dynamic neural networks", FKI Report 129-90, TUM, April 1990. He explains that gradient-descent-only methods must take into account training learned during all past time steps when dealing with a new problem. For "toy" temporal learning problems this is not a big impediment. For "serious" temporal learning problems, dynamic neural systems must develop methods of breaking goals down into subgoals, most of which have already been learned and some of which need to be developed by gradient descent. In this way only small problems are trained by gradient descent, and the system uses them combinatorially, allowing the network-of-networks to solve real problems by "divide-and-conquer" methods.

Research in this area is very fresh, and I think that within about a year there will be a move away from naive implementations of gradient-descent learning, in both stationary and temporal learning, towards connectionist compositional learning (Cascade-Correlation is a simple example of this).

-Thomas Edwards
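[Archival note: the "teacher forcing" modification mentioned above amounts to feeding the desired output, rather than the network's own output, back into the recurrence after each error is measured. A hedged sketch of one way to express this, under the assumption that sensitivities of forced units are zeroed because the teacher signal does not depend on the weights; the function name and array layout are our own:]

```python
import numpy as np

def teacher_force(y, p, targets):
    """Clamp target units to their desired values after the error is computed.

    y       : unit activations, shape (n,)
    p       : RTRL sensitivities p[k,i,j] = dy_k/dw_ij, shape (n, n, n_w)
    targets : dict mapping output-unit index -> desired value d_k(t)
    Returns clamped copies (y', p') to carry into the next time step.
    """
    y = y.copy()
    p = p.copy()
    for k, d in targets.items():
        y[k] = d      # feed back the teacher's value, not the net's output
        p[k] = 0.0    # d(teacher signal)/dw = 0, so this sensitivity slice vanishes
    return y, p
```

In a training loop this would be called once per step, after the error for step t is accumulated and before z(t) is formed, so that the network always sees the correct trajectory during training even while its own outputs are still wrong.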