dtinker@gpu.utcs.utoronto.ca (Prof. David Tinker) (05/10/89)
An interesting (non-tech) account of Neural Network applications in Chemistry appeared in "Chemical and Engineering News", April 24, 1989. The article describes papers given at a symposium on NN's held by the Division of Computers in Chemistry of the American Chemical Society (the article doesn't give details on this symposium). Besides a thumbnail sketch of NN theory, several applications are described.

D.W. Elrod, G.M. Maggiora and R.G. Trenary used the "ANSim" commercial NN simulator(*) to predict reaction products in nitrations of monosubstituted benzenes; a 25-element connection table incorporated data on the identity, connectivity and charges of non-H atoms. The net, a back-propagation network, was trained and then tested with 13 test compounds; it predicted 10 of the 13 product distributions correctly, about as well as "three organic chemists at Upjohn". An "artificial intelligence" program (not specified) did not do as well. No reference was cited. (* Source: "Science Applications International Corp", running on a 386 PC-AT clone with math co-processor.)

D.F. Stubbs described use of a NN program (not described in detail) to predict adverse gastrointestinal reactions to non-steroidal anti-inflammatory drugs, using data on pK, blood half-life, molecular weight and dosage. The net was able to predict adverse drug reaction frequency to an accuracy of 1%.

J.D. Bryngelson and J.J. Hopfield discussed use of a NN to predict protein secondary structure from data on amino-acid properties. Here, a couple of recent references are given: N. Qian & T.J. Sejnowski, J. Mol. Biol. 202(4), 865, 1988; L.H. Holley & M. Karplus, Proc. Natl. Acad. Sci. 86(1), 152, 1989. Another approach was described by M.N. Liebman.

If anyone has more details on the meeting or the work described, or further references to chemistry/biochemistry applications, please post!
------------------------------------------------------------------------
Disclaimer: I'm just learning about all this stuff!
--
-----------------------------------------------------------------------------
! David O. Tinker            !  ^   ^  ! UUCP: dtinker@gpu.utcs.utoronto.ca  !
! Department of Biochemistry !< O   O >! BITNET: dtinker@vm.utcs.utoronto.ca !
! University of Toronto      !    ^    ! BIX: dtinker                        !
! TORONTO, Ontario, Canada   !  #####  ! Voice: (416) 978-3636               !
! M5S 1A8                    !         ! And so on.                          !
!                            !         ! Hi ho                               !
-----------------------------------------------------------------------------
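ED note: the article gives no code, but for readers curious what the back-propagation setup described above amounts to in practice, here is a minimal sketch in the same spirit: a fixed-length numeric descriptor per compound goes in, predicted product proportions come out. The layer sizes, the random placeholder data and the plain tanh/softmax network are assumptions for illustration only; this is not the ANSim software or the Elrod/Maggiora/Trenary configuration.

# Minimal back-propagation sketch (illustrative only -- not the ANSim setup).
# Input: a fixed-length descriptor per compound (25 numbers here, standing in
# for the connection-table features described above); output: predicted
# product proportions (e.g. ortho/meta/para).
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hid, n_out = 25, 8, 3          # hypothetical layer sizes
W1 = rng.normal(0, 0.1, (n_in, n_hid))
W2 = rng.normal(0, 0.1, (n_hid, n_out))

def forward(X):
    H = np.tanh(X @ W1)                                    # hidden layer
    Z = H @ W2
    P = np.exp(Z) / np.exp(Z).sum(axis=1, keepdims=True)   # softmax -> proportions
    return H, P

# Placeholder training data: 32 "compounds" with made-up descriptors/targets.
X = rng.normal(size=(32, n_in))
Y = rng.dirichlet(np.ones(n_out), size=32)   # fake product distributions

lr = 0.1
for epoch in range(2000):
    H, P = forward(X)
    dZ = (P - Y) / len(X)              # cross-entropy gradient w.r.t. logits
    dW2 = H.T @ dZ
    dH = (dZ @ W2.T) * (1 - H**2)      # back-propagate through tanh
    dW1 = X.T @ dH
    W2 -= lr * dW2
    W1 -= lr * dW1

_, P = forward(X)
print("mean |error| on training set:", np.abs(P - Y).mean())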
ted@nmsu.edu (Ted Dunning) (05/11/89)
alan lapedes and co. at los alamos were able to predict with good accuracy, using modified neural nets, whether short sequences of dna coded for specific proteins. much of the work at lanl using neural net methods has been supplanted by doyne farmer's local approximation method, which (for many problems) is several orders of magnitude more computationally efficient. the use of radial basis functions improves the value of this method considerably (in addition to making the link to nn techniques even stronger).
andrew@berlioz (Lord Snooty @ The Giant Poisoned Electric Head) (05/11/89)
"Science News", April 29th, p271 "Neural Network Predicts Reactions". The key quote for me was "When tested with 13 [..] not in the training set [of 32] the network predicted [..] proportions within 20% of actual values in 10 cases. That equals the performance by a small set of human chemists and beats out by three an existing conventional computer expert system for predicting reaction outcomes." Go Nets! -- Andrew Palfreyman USENET: ...{this biomass}!nsc!logic!andrew National Semiconductor M/S D3969, 2900 Semiconductor Dr., PO Box 58090, Santa Clara, CA 95052-8090 ; 408-721-4788 there's many a slip 'twixt cup and lip
andrew@berlioz (Lord Snooty @ The Giant Poisoned Electric Head) (05/11/89)
In article <TED.89May10130744@kythera.nmsu.edu>, ted@nmsu.edu (Ted Dunning) writes:
> much of the work at lanl using neural net methods has been supplanted
> by doyne farmer's local approximation method which (for many problems)
> is several orders of magnitude more computationally efficient. the
> use of radial basis functions improves the value of this method
> considerably (in addition to making the link to nn techniques even
> stronger).

This is very interesting, Ted. Could you post some references to the net on Doyne Farmer's method, and perhaps a brief resume of his approach?
--
Andrew Palfreyman                  USENET: ...{this biomass}!nsc!logic!andrew
National Semiconductor M/S D3969, 2900 Semiconductor Dr.,
PO Box 58090, Santa Clara, CA 95052-8090 ; 408-721-4788
                              there's many a slip 'twixt cup and lip
aboulang@bbn.com (Albert Boulanger) (05/11/89)
In article <201@bach.nsc.com> Andrew Palfreyman asks:

   In article <TED.89May10130744@kythera.nmsu.edu>, ted@nmsu.edu (Ted Dunning) writes:
   > much of the work at lanl using neural net methods has been supplanted
   > by doyne farmer's local approximation method which (for many problems)
   > is several orders of magnitude more computationally efficient. the
   > use of radial basis functions improves the value of this method
   > considerably (in addition to making the link to nn techniques even
   > stronger).

   This is very interesting, Ted. Could you post some references to the net
   on Doyne Farmer's method, and perhaps a brief resume of his approach?
   --

I should mention that besides Doyne Farmer's method, James Crutchfield and Bruce S. McNamara have a method for recovering the equations of motion from a time series. The reference is:

"Equations of Motion from a Data Series", James Crutchfield & Bruce McNamara, Complex Systems, 1 (1987) 417-452.

Unfortunately, I never did get a reference to Doyne's method.

Chaotically yours,
Albert Boulanger
BBN Systems & Technologies Corp.
aboulanger@bbn.com
ted@nmsu.edu (Ted Dunning) (05/12/89)
In article <201@bach.nsc.com> andrew@berlioz (Lord Snooty @ The Giant Poisoned Electric Head) writes:

   In article <TED.89May10130744@kythera.nmsu.edu>, ted@nmsu.edu (Ted Dunning) writes:
   > much of the work at lanl using neural net methods has been supplanted
   > by doyne farmer's local approximation method which (for many problems)
   ...
   This is very interesting, Ted. Could you post some references to the net
   on Doyne Farmer's method, and perhaps a brief resume of his approach?

the best reference is the los alamos tech report LA-UR-88-901. i assume that this is available from lanl, somehow (i got mine by hand). (interestingly, on the last page i note a us gov printing office number: 1988-0-573-034/80049.) (you might also try doyne, whose net address is jdf@lanl.gov)

an extract of the abstract follows:

        Exploiting Chaos to Predict the Future and Reduce Noise
                J. Doyne Farmer and John J. Sidorowich

   We discuss new approaches to forecasting, noise reduction, and the analysis of experimental data. The basic idea is to embed the data in a state space and then use straightforward numerical techniques to build a nonlinear dynamical model. We pick an ad hoc nonlinear representation and fit it to the data. For higher dimensional problems we find that breaking the domain into neighborhoods using local approximation is usually better than using an arbitrary global representation. When random behavior is caused by low dimensional chaos our short term forecasts can be several orders of magnitude better than those of standard linear methods. We derive error estimates for the accuracy of approximation in terms of attractor dimension and Lyapunov exponents, the number of data points, and the extrapolation time. We demonstrate that for a given extrapolation time T, iterating a short-term estimate is superior to computing an estimate for T directly.

   ...

   We propose a nonlinear averaging scheme for separating noise from deterministic dynamics. For chaotic time series the noise reduction possible depends exponentially on the length of the time series, whereas for non-chaotic behavior it is proportional to the square root. When the equations of motion are known exactly, we can achieve noise reductions of more than ten orders of magnitude. When the equations are not known the limitation comes from prediction error, but for low dimensional systems noise reductions of several orders of magnitude are still possible.

   The basic principles underlying our methods are similar to those of neural nets, but are more straightforward. For forecasting, we get equivalent or better results with vastly less computer time. We suggest that these ideas can be applied to a much larger class of problems.

----------------

hope that this helps.
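ED note: as a concrete illustration of the abstract's recipe (embed the data in a state space, predict locally, and iterate short-term estimates rather than jumping T steps at once), here is a rough sketch. The test series, embedding dimension and neighbor count are assumed for illustration, and the predictor below uses the crudest possible local approximation (averaging the successors of the nearest neighbors) rather than the authors' local linear/quadratic fits.

# Delay embedding + a zeroth-order local forecast, plus the "iterate a
# short-term estimate" idea from the abstract.  Parameters are assumptions.
import numpy as np

# A chaotic test series: the logistic map x(n+1) = 4 x(n) (1 - x(n)).
x = np.empty(2000)
x[0] = 0.3
for n in range(1999):
    x[n + 1] = 4.0 * x[n] * (1.0 - x[n])

m = 3                                   # embedding dimension (assumed)
# Delay vectors [x(i-2), x(i-1), x(i)] and their one-step-ahead targets.
states  = np.array([x[i - m + 1:i + 1] for i in range(m - 1, len(x) - 1)])
targets = x[m:]

train = 1500
S, y = states[:train], targets[:train]

def predict_one_step(v, k=10):
    # Zeroth-order local approximation: average the successors of the k
    # nearest stored states (a local linear fit would replace this mean).
    d = np.linalg.norm(S - v, axis=1)
    idx = np.argsort(d)[:k]
    return y[idx].mean()

def predict_T_steps(v, T):
    # Iterate the one-step predictor T times (the paper's preferred scheme).
    v = v.copy()
    for _ in range(T):
        v = np.append(v[1:], predict_one_step(v))   # slide the delay window
    return v[-1]

v0 = states[train]
print("2-step forecast:", predict_T_steps(v0, 2), " actual:", targets[train + 1])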
mbkennel@phoenix.Princeton.EDU (Matthew B. Kennel) (05/12/89)
In article <39817@bbn.COM> aboulanger@bbn.com writes:
>In article <201@bach.nsc.com> Andrew Palfreyman asks:
>
>   In article <TED.89May10130744@kythera.nmsu.edu>, ted@nmsu.edu (Ted Dunning) writes:
>   > much of the work at lanl using neural net methods has been supplanted
>   > by doyne farmer's local approximation method which (for many problems)
>   > is several orders of magnitude more computationally efficient. the
>   > use of radial basis functions improves the value of this method
>   > considerably (in addition to making the link to nn techniques even
>   > stronger).
>
>   This is very interesting, Ted. Could you post some references to the net
>   on Doyne Farmer's method, and perhaps a brief resume of his approach?
>   --
>
>I should mention that besides Doyne Farmer's method, James Crutchfield
>and Bruce S. McNamara have a method for recovering the equations of
>motion from a time series. The reference is:
>
>"Equations of Motion from a Data Series", James Crutchfield & Bruce
>McNamara, Complex Systems, 1 (1987) 417-452.
>
>Unfortunately, I never did get a reference to Doyne's method.
>
>Chaotically yours,
>Albert Boulanger
>BBN Systems & Technologies Corp.
>aboulanger@bbn.com

The best reference for Farmer's local approximation method is in the LANL preprint "Exploiting Chaos to Predict the Future and Reduce Noise" by Farmer and Sidorowich. Note that this paper only deals with predicting chaotic dynamical systems.

Having just done a thesis on the same sort of prediction using neural networks, I might be able to give an outline of the various methods. There are basically two major categories:

I) Global methods. In this scheme, one tries to fit some complex function to all the data. There are various functional forms.

   A) Linear methods. By "linear" I mean that the output of the function depends linearly on the _free parameters_; all of these functions can represent nonlinear transformations from input to output, of course. If the output is linear in the free parameters (weights), there is a well-known guaranteed, optimal & deterministic learning algorithm for the least-squared-error measure.

      1) Global polynomials. Basically a Taylor expansion of degree d in m unknowns. This is what Crutchfield & McNamara use. Advantages: easy to compute. Disadvantages: for complicated functions you need large values of d, and the number of free weights increases very rapidly, approx = d^m. High-degree polynomials also tend to blow up away from the domains on which they are fitted (trained).

      2) Rational quotients of polynomials. Similar to the above, but if the degrees of numerator and denominator are the same, they don't blow up.

      3) Radial basis functions, with fixed centers. Here you choose the centers of your radial functions, and then learn the weights of the output layer using linear least squares, for example. This can be put in terms of a network with a single hidden layer of neurons, where the first layer of weights is fixed and the second layer of weights leads to a linear output unit. See a recent paper by Broomhead & (somebody-or-other) in Complex Systems. There's also a preprint by Martin Casdagli using this to predict chaotic systems. ED note: this method looks promising and should probably be investigated further, especially to see whether it works in applications other than chaotic systems prediction; a minimal sketch of the fixed-center fit appears after this message.

   B) Nonlinear methods. Now you have to use an iterative gradient-descent type of method.

      1) Standard sigmoidal back-prop net. Well-known learning algorithm.

      2) Radial basis functions, with adjustable centers. (This is what I used.) Lets you represent the mapping more accurately with the same size network (= free parameters) compared to a sigmoidal net, I found. Learn with conjugate gradient.

II) Local methods. Now, instead of trying to fit a global function to all examples, you make a simple individual fit _every_ time you need an output. Here's how Farmer's method works: given some input vector, find the N closest points in the data base to this input vector. For each of these stored points, you know what the output was. With these input-output pairs, make a linear or quadratic approximation, and fit the free parameters using linear least squares. This should be fast because there aren't that many examples (10-30, say) and free parameters. "Learning" simply consists of putting the input data base into a k-d tree that permits you to retrieve nearest neighbors in O(log N) time, as needed to make predictions.

This method has a nice intuitive interpretation: given some new input, you look around to see what things like that input did, and you do something like that.

Advantages: Much faster computationally than an iterative gradient-descent optimization, especially for large data bases of examples. Doesn't require any hard crunching to learn some function. Probably more accurate for large data bases, too, because most people wouldn't have the patience or computer power to train a large network to high accuracy.

Disadvantages: The mapping isn't in a nice analytic functional form. You can't realize it in silicon. You need to carry around the data base of examples, and making predictions is slower (a search & small fit vs. a simple functional evaluation).

=================================
Personal opinion: If you have a large data base (say over 1K examples) of noise-free continuous-valued examples, local linear and local quadratic methods will probably be the best. I don't know what effect input noise would have on this method compared to neural networks, but I don't think it would be that bad. For binary values it may not be as good, for the whole method is rooted in the field of dynamical systems. But I don't think anybody's tried yet, so I have no real evidence one way or the other.
=================================

Get the preprint (it's very good) from Doyne Farmer at LANL. I believe his e-mail address is "jdf%heretic@lanl.gov". An earlier version of this appeared in Physical Review Letters, so it's definitely real.

Matt Kennel
mbkennel@phoenix.princeton.edu
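ED note: here is the minimal sketch of the fixed-center radial basis function fit (item I.A.3 above) promised in the message. The centers are simply a random subset of the inputs and only the output-layer weights are fitted, so "learning" reduces to one linear least-squares solve; the toy target function, basis width and center count are assumptions for illustration.

# Radial basis functions with *fixed* centers: only the output weights are
# free, so learning is a single linear least-squares solve.  The toy target,
# width and choice of centers below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D function to fit.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(2 * X[:, 0]) + 0.05 * rng.normal(size=200)

# Fix the centers (here: a random subset of the training inputs).
centers = X[rng.choice(len(X), size=20, replace=False)]
width = 0.5

def design_matrix(X):
    # Gaussian basis functions phi_j(x) = exp(-|x - c_j|^2 / (2 w^2)).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

# Linear least squares for the output-layer weights -- no gradient descent.
Phi = design_matrix(X)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

X_test = np.linspace(-3, 3, 7).reshape(-1, 1)
print("RBF prediction:", design_matrix(X_test) @ w)
print("true values   :", np.sin(2 * X_test[:, 0]))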
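ED note: the local method described above (section II) can be sketched just as briefly: put the stored examples in a k-d tree, then for each query fit a small affine model to the nearest neighbors by least squares. This uses scipy's cKDTree for the neighbor search; the toy data, input dimension and neighbor count are assumptions for illustration, not Farmer & Sidorowich's code.

# Minimal local-linear predictor in the spirit described above: store the
# examples in a k-d tree, and for each query fit a small affine model to the
# nearest neighbors by least squares.  Toy data and k are assumptions.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)

# Example data base: inputs in R^3, scalar outputs from some "unknown" map.
X = rng.uniform(-1, 1, size=(5000, 3))
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2]

tree = cKDTree(X)          # "learning" is just building the tree

def local_linear_predict(x_new, k=20):
    # 1. Retrieve the k nearest stored inputs.
    _, idx = tree.query(x_new, k=k)
    Xn, yn = X[idx], y[idx]
    # 2. Fit an affine model  y ~ a.x + b  to just those neighbors.
    A = np.hstack([Xn, np.ones((k, 1))])
    coef, *_ = np.linalg.lstsq(A, yn, rcond=None)
    # 3. Evaluate the local fit at the query point.
    return np.append(x_new, 1.0) @ coef

q = np.array([0.1, -0.2, 0.3])
print("local-linear:", local_linear_predict(q))
print("true value  :", np.sin(q[0]) + q[1] * q[2])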
hsg@romeo.cs.duke.edu (Henry Greenside) (05/12/89)
In these discussions of Farmer et al.'s methods versus neural nets, has anyone addressed the real issue: how to treat high-dimensional data?

In their paper, Farmer et al. point out the crucial fact that one can learn only low-dimensional chaotic systems (where "low" is rather vague, say of dimension less than about 5). High-dimensional systems require huge amounts of data for learning. Presumably many interesting data sets (weather, stock markets, chemical patterns, etc.) are not low-dimensional, and neither method will be useful.
mbkennel@phoenix.Princeton.EDU (Matthew B. Kennel) (05/13/89)
In article <14480@duke.cs.duke.edu> hsg@romeo.UUCP (Henry Greenside) writes:
)In these discussions of Farmer et al's methods versus
)neural nets, has anyone addressed the real issue, how
)to treat high-dimensional data?
)
)In his paper, Farmer et al point out the crucial fact
)that one can learn only low dimensional chaotic systems
)(where low is rather vague, say of dimension less than
)about 5). High dimensional systems require huge amounts
)of data for learning. Presumably many interesting data
)sets (weather, stock markets, chemical patterns, etc.)
)are not low-dimensional and neither method will be
)useful.
Quite true. This is definitely a fundamental problem. As the
dimension gets higher and higher, the data series looks more and more
like true random noise, and so prediction becomes impossible. Note, for
example, that the output of your favorite "random number generator" is
most likely deterministic, but probably has such a high dimension that
you can't predict with any accuracy without knowing the exact algorithm
used.
The real problem comes down to finding a representation with both
smooth mappings and low (fractal) dimensional input spaces. This
requires plain old hard work and clever insight.
Matt Kennel
mbkennel@phoenix.princeton.edu