neuron-request@HPLABS.HP.COM ("Neuron-Digest Moderator Peter Marvit") (06/11/89)
Neuron Digest   Saturday, 10 Jun 1989   Volume 5 : Issue 26

Today's Topics:
  Neural Net Applications in Chemistry
  Re: Neural Net Applications in Chemistry
  Re: Neural Net Applications in Chemistry
  Re: Neural Net Applications in Chemistry
  Re: Neural Net Applications in Chemistry
  Re: Neural Net Applications in Chemistry
  Re: Neural Net Applications in Chemistry
  Re: Neural Net Applications in Chemistry
  Re: Neural Net Applications in Chemistry
  Re: Neural Net Applications in Chemistry
  Re: Neural Net Applications in Chemistry
  computer composition
  Info request on parallel implementation & invariant object recognition
  RE: Neurological Topology (leading to poll)
  Re: Neuron Digest V5 #20
  RE: Neuron Digest V5 #24
  Re: Neuron Digest V5 #24

Send submissions, questions, address maintenance and requests for old issues to "neuron-request@hplabs.hp.com" or "{any backbone,uunet}!hplabs!neuron-request". ARPANET users can get old issues via ftp from hplpm.hpl.hp.com (15.255.16.205).

------------------------------------------------------------

Subject: Neural Net Applications in Chemistry
From: dtinker@gpu.utcs.utoronto.ca (Prof. David Tinker)
Organization: University of Toronto Computing Services
Date: Wed, 10 May 89 15:09:44 +0000

An interesting (non-technical) account of neural network applications in chemistry appeared in "Chemical and Engineering News", April 24, 1989. The article describes papers given at a symposium on NNs held by the Division of Computers in Chemistry of the American Chemical Society (the article doesn't give details on this symposium). Besides a thumbnail sketch of NN theory, several applications are described.

D.W. Elrod, G.M. Maggiora and R.G. Trenary used the "ANSim" commercial NN simulator(*) to predict reaction products of nitrations of monosubstituted benzenes; a 25-element connection table incorporated data on the identity, connectivity and charges of non-H atoms. The net, trained with a back-prop algorithm, was tested with 13 test compounds; it predicted 10 of the 13 product distributions correctly, about as well as "three organic chemists at Upjohn". An "artificial intelligence" program (not specified) did not do as well. No reference was cited. [ ED note: a rough sketch of this kind of setup appears at the end of this message. ]

(* Source: "Science Applications International Corp", running on a 386 PC-AT clone with math co-processor.)

D.F. Stubbs described the use of a NN program (not described in detail) to predict adverse gastrointestinal reactions to non-steroidal anti-inflammatory drugs, using data on pK, blood half-life, molecular weight and dosage. The net was able to predict adverse drug reaction frequency to an accuracy of 1%.

J.D. Bryngelson and J.J. Hopfield discussed the use of a NN to predict protein secondary structure from data on amino-acid properties. Here, a couple of recent references are given: N. Qian & T.J. Sejnowski, J. Mol. Biol. 202(4), 865, 1988; L.H. Holley & M. Karplus, Proc. Natl. Acad. Sci. 86(1), 152, 1989. Another approach was described by M.N. Liebman.

If anyone has more details on the meeting or the work described, or further references to chemistry/biochemistry applications, please post!

Disclaimer: I'm just learning about all this stuff!

David O. Tinker
Department of Biochemistry, University of Toronto
Toronto, Ontario, Canada  M5S 1A8
UUCP:   dtinker@gpu.utcs.utoronto.ca
BITNET: dtinker@vm.utcs.utoronto.ca
BIX:    dtinker
Voice:  (416) 978-3636
And so on. Hi ho!
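[ ED note: the article does not give Elrod et al.'s actual encoding or network, so the following is only a minimal sketch of the general arrangement described above -- a fixed-length connection-table vector fed to a small back-prop network whose outputs are product proportions (e.g. ortho/meta/para fractions). The hidden-layer size, the softmax output, the training step and the placeholder input are assumptions of this note, not details from the paper. Python with NumPy. ]

    # Sketch only: 25-element "connection table" vector -> small back-prop
    # net -> predicted product proportions.  Not the Upjohn model.
    import numpy as np

    rng = np.random.default_rng(0)

    N_IN, N_HID, N_OUT = 25, 8, 3           # sizes are illustrative assumptions
    W1 = rng.normal(0, 0.1, (N_HID, N_IN))  # input  -> hidden weights
    W2 = rng.normal(0, 0.1, (N_OUT, N_HID)) # hidden -> output weights

    def forward(x):
        h = 1.0 / (1.0 + np.exp(-W1 @ x))      # sigmoid hidden layer
        z = W2 @ h
        p = np.exp(z - z.max()); p /= p.sum()  # softmax: product proportions
        return h, p

    def backprop_step(x, target, lr=0.1):
        """One gradient step on the cross-entropy between the predicted and
        the observed product distribution (a length-3 vector summing to 1)."""
        global W1, W2
        h, p = forward(x)
        dz = p - target                # softmax + cross-entropy gradient
        dW2 = np.outer(dz, h)
        dh = (W2.T @ dz) * h * (1 - h) # back through the sigmoid layer
        dW1 = np.outer(dh, x)
        W2 -= lr * dW2
        W1 -= lr * dW1

    # Placeholder input: one hypothetical 25-element connection-table vector
    # (identity / connectivity / charge of non-H atoms, however encoded).
    x_example = rng.normal(size=N_IN)
    print(forward(x_example)[1])       # predicted product proportions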
------------------------------

Subject: Re: Neural Net Applications in Chemistry
From: ted@nmsu.edu (Ted Dunning)
Organization: NMSU Computer Science
Date: Wed, 10 May 89 19:07:44 +0000

alan lapedes and co at los alamos were able to predict whether short sequences of dna coded for specific proteins with good accuracy using modified neural nets.

much of the work at lanl using neural net methods has been supplanted by doyne farmer's local approximation method, which (for many problems) is several orders of magnitude more computationally efficient. the use of radial basis functions improves the value of this method considerably (in addition to making the link to nn techniques even stronger).

------------------------------

Subject: Re: Neural Net Applications in Chemistry
From: andrew@berlioz (Lord Snooty @ The Giant Poisoned Electric Head)
Organization: National Semiconductor, Santa Clara
Date: Thu, 11 May 89 01:44:05 +0000

"Science News", April 29th, p. 271, "Neural Network Predicts Reactions". The key quote for me was: "When tested with 13 [..] not in the training set [of 32] the network predicted [..] proportions within 20% of actual values in 10 cases. That equals the performance by a small set of human chemists and beats out by three an existing conventional computer expert system for predicting reaction outcomes." Go Nets!

Andrew Palfreyman                USENET: ...{this biomass}!nsc!logic!andrew
National Semiconductor M/S D3969, 2900 Semiconductor Dr.,
PO Box 58090, Santa Clara, CA 95052-8090; 408-721-4788
                there's many a slip 'twixt cup and lip

------------------------------

Subject: Re: Neural Net Applications in Chemistry
From: aboulang@bbn.com (Albert Boulanger)
Date: Thu, 11 May 89 15:26:47 +0000

I should mention that besides Doyne Farmer's method, James Crutchfield and Bruce S. McNamara have a method for recovering the equations of motion from a time series. The reference is:

  "Equations of Motion from a Data Series", James Crutchfield & Bruce McNamara, Complex Systems 1 (1987) 417-452.

Unfortunately, I never did get a reference to Doyne's method.

Chaotically yours,
Albert Boulanger
BBN Systems & Technologies Corp.
aboulanger@bbn.com

------------------------------

Subject: Re: Neural Net Applications in Chemistry
From: ted@nmsu.edu (Ted Dunning)
Organization: NMSU Computer Science
Date: Thu, 11 May 89 20:15:34 +0000

the best reference is the los alamos tech report LA-UR-88-901. i assume that this is available from lanl, somehow (i got mine by hand). (interestingly, on the last page i note a us gov printing office number: 1988-0-573-034/80049.) (you might also try doyne, whose net address is jdf@lanl.gov.)

an extract of the abstract follows:

  Exploiting Chaos to Predict the Future and Reduce Noise
  J. Doyne Farmer and John J. Sidorowich

  We discuss new approaches to forecasting, noise reduction, and the analysis of experimental data. The basic idea is to embed the data in a state space and then use straightforward numerical techniques to build a nonlinear dynamical model. We pick an ad hoc nonlinear representation and fit it to the data. For higher dimensional problems we find that breaking the domain into neighborhoods using local approximation is usually better than using an arbitrary global representation. When random behavior is caused by low dimensional chaos our short term forecasts can be several orders of magnitude better than those of standard linear methods.

  We derive error estimates for the accuracy of approximation in terms of attractor dimension and Lyapunov exponents, the number of data points, and the extrapolation time. We demonstrate that for a given extrapolation time T iterating a short-term estimate is superior to computing an estimate for T directly. ...

  We propose a nonlinear averaging scheme for separating noise from deterministic dynamics. For chaotic time series the noise reduction possible depends exponentially on the length of the time series, whereas for non-chaotic behavior it is proportional to the square root. When the equations of motion are known exactly, we can achieve noise reductions of more than ten orders of magnitude. When the equations are not known the limitation comes from prediction error, but for low dimensional systems noise reductions of several orders of magnitude are still possible.

  The basic principles underlying our methods are similar to those of neural nets, but are more straightforward. For forecasting, we get equivalent or better results with vastly less computer time. We suggest that these ideas can be applied to a much larger class of problems.

hope that this helps.
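[ ED note: a minimal sketch of two ideas singled out in the abstract above -- embedding a scalar time series in a state space of delay vectors, and forecasting T steps ahead by iterating a one-step predictor rather than fitting a direct T-step map. The one-step predictor here is a deliberately crude stand-in (copy whatever followed the nearest stored delay vector); the paper's actual fitting procedures are discussed in the following messages. Python with NumPy; all names are this note's own. ]

    import numpy as np

    def delay_embed(series, m):
        """Sliding-window embedding: row i is (x[i], x[i+1], ..., x[i+m-1])."""
        x = np.asarray(series)
        n = len(x) - m + 1
        return np.column_stack([x[j:j + n] for j in range(m)])

    def iterated_forecast(history, m, T):
        """Forecast T steps ahead by iterating a one-step predictor T times,
        feeding each prediction back in, instead of fitting a direct T-step
        map.  The one-step predictor: find the nearest stored delay vector
        and return the value that followed it (the simplest local scheme)."""
        X = delay_embed(history, m)
        states, nexts = X[:-1], np.asarray(history)[m:]   # state -> next value
        window = list(history[-m:])
        for _ in range(T):
            i = np.argmin(np.sum((states - window) ** 2, axis=1))
            window = window[1:] + [nexts[i]]
        return window[-1]

    if __name__ == "__main__":
        # Toy data: a chaotic logistic-map series standing in for measurements.
        x = [0.3]
        for _ in range(2999):
            x.append(3.9 * x[-1] * (1.0 - x[-1]))
        print(iterated_forecast(x[:-5], m=3, T=5), "vs actual", x[-1])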
------------------------------

Subject: Re: Neural Net Applications in Chemistry
From: mbkennel@phoenix.Princeton.EDU (Matthew B. Kennel)
Organization: Princeton University, NJ
Date: Thu, 11 May 89 20:34:22 +0000

The best reference for Farmer's local approximation method is the LANL preprint "Exploiting Chaos to Predict the Future and Reduce Noise" by Farmer and Sidorowich. Note that this paper only deals with predicting chaotic dynamical systems.

Having just done a thesis on the same sort of prediction using neural networks, I might be able to give an outline of various methods. There are basically two major categories:

I) Global methods. In this scheme, one tries to fit some complex function to all the data. There are various functional forms.

  A) Linear methods. By "linear" I mean that the output of the function depends linearly on the _free parameters_; all of these functions can represent nonlinear transformations from input to output, of course. If the output is linear in the free parameters (weights), there is a well-known, guaranteed, optimal & deterministic learning algorithm for the least-squared-error measure.

    1) Global polynomials. Basically a Taylor expansion of degree d in m unknowns. This is what Crutchfield & McNamara use. Advantages: easy to compute. Disadvantages: for complicated functions you need large values of d, and the number of free weights increases very rapidly, approximately as d^m. High-degree polynomials also tend to blow up away from the domains on which they are fitted (trained).

    2) Rational quotients of polynomials. Similar to the above, but if the degrees of the numerator and denominator are the same, they don't blow up.

    3) Radial basis functions, with fixed centers. Here you choose the centers of your radial functions, and then learn the weights of the output layer using linear least squares, for example. This can be put in terms of a network with a single hidden layer of neurons, where the first layer of weights is fixed and the second layer of weights leads to a linear output unit. See a recent paper by Broomhead & (somebody-or other) in Complex Systems. There's also a preprint by Martin Casdagli, using this to predict chaotic systems.

    ED note: this method looks promising and should probably be investigated further, especially to see whether it works in applications other than chaotic systems prediction.

  B) Nonlinear methods. Now you have to use an iterative gradient-descent type of method.

    1) Standard sigmoidal back-prop net. Well-known learning algorithm.

    2) Radial basis functions, with adjustable centers. (This is what I used.) It lets you represent the mapping more accurately using the same size network (= free parameters) compared to a sigmoidal net, I found. Learn with conjugate gradient.

II) Local methods. Now, instead of trying to fit a global function to all examples, you make a simple individual fit _every_ time you need an output. Here's how Farmer's method works: given some input vector, find the N closest points in the data base to this input vector. For each of these stored points, you know what the output was. With these input-output pairs, make a linear or quadratic approximation, and fit the free parameters using linear least squares. This should be fast because there aren't that many examples (10-30, say) and free parameters. "Learning" simply consists of putting the input data base into a k-d tree that permits you to retrieve nearest neighbors in O(log N) time, as needed to make predictions. [ ED note: a minimal code sketch of this local scheme follows this message. ]

This method has a nice intuitive interpretation: given some new input, you look around to see what things like that input did, and you do something like that.

Advantages: much faster computationally than an iterative gradient-descent optimization, especially for large data bases of examples. Doesn't require any hard crunching to learn some function. Probably more accurate for large data bases, too, because most people wouldn't have the patience or computer power to train a large network to high accuracy.

Disadvantages: the mapping isn't in a nice analytic functional form. You can't realize it in silicon. You need to carry around the data base of examples, and making predictions is slower (a search & small fit vs. a simple functional evaluation).

=================================
Personal opinion: if you have a large data base (say over 1K examples) of noise-free continuous-valued examples, local linear and local quadratic methods will probably be the best. I don't know what effect input noise would have on this method compared to neural networks, but I don't think it would be that bad. For binary values it may not be as good, as the whole method is rooted in the field of dynamical systems. But I don't think anybody's tried yet, so I have no real evidence one way or the other.
=================================

Get the preprint (it's very good) from Doyne Farmer at LANL. I believe his e-mail address is "jdf%heretic@lanl.gov". An earlier version of this appeared in Physical Review Letters, so it's definitely real.

Matt Kennel
mbkennel@phoenix.princeton.edu
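[ ED note: a minimal sketch of the local linear scheme described above -- for each query, take the k nearest stored input vectors, fit an affine map to their known outputs by linear least squares, and evaluate it at the query. This is this note's own illustration of the recipe Matt Kennel gives, not Farmer & Sidorowich's code; a real implementation would keep the stored inputs in a k-d tree for O(log N) neighbor retrieval, whereas brute-force search is used here for brevity. Python with NumPy. ]

    import numpy as np

    def local_linear_predict(X, Y, query, k=12):
        """X: (N, d) stored input vectors; Y: (N,) their known outputs.
        Predict the output at `query` from its k nearest neighbors."""
        # Brute-force nearest neighbors (a k-d tree would make this O(log N)).
        d2 = np.sum((X - query) ** 2, axis=1)
        idx = np.argsort(d2)[:k]
        Xk, Yk = X[idx], Y[idx]
        # Affine fit y ~ a.x + b by linear least squares on the neighbors only.
        A = np.hstack([Xk, np.ones((k, 1))])
        coef, *_ = np.linalg.lstsq(A, Yk, rcond=None)
        return np.append(query, 1.0) @ coef

    if __name__ == "__main__":
        # Toy usage: "learning" is just storing (input, output) pairs.
        rng = np.random.default_rng(1)
        X = rng.uniform(-1, 1, size=(1000, 2))
        Y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2      # some smooth map to recover
        print(local_linear_predict(X, Y, np.array([0.2, -0.4])))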
------------------------------

Subject: Re: Neural Net Applications in Chemistry
From: hsg@romeo.cs.duke.edu (Henry Greenside)
Organization: Duke University CS Dept.; Durham, NC
Date: Fri, 12 May 89 03:15:49 +0000

In these discussions of Farmer et al.'s methods versus neural nets, has anyone addressed the real issue, namely how to treat high-dimensional data? In their paper, Farmer et al. point out the crucial fact that one can learn only low-dimensional chaotic systems (where "low" is rather vague, say of dimension less than about 5). High-dimensional systems require huge amounts of data for learning. Presumably many interesting data sets (weather, stock markets, chemical patterns, etc.) are not low-dimensional, and neither method will be useful.

------------------------------

Subject: Re: Neural Net Applications in Chemistry
From: aboulang@bbn.com (Albert Boulanger)
Date: Fri, 12 May 89 19:17:56 +0000

[ Regarding the article above by Henry Greenside: ]

I think you have a legit concern here, but in many cases where high dimensionality is suspected, it turns out to be low-dimensional. Farmer, Crutchfield and others have developed a way of estimating the dimensionality of the dynamics underlying time-series data by using a sliding-time-window technique where data values within the window are treated as independent dimensions. For example, a 10000-point sample would be broken up as 1000 10-vectors for a 10-dimensional embedding. One then varies the size of the window and compares the dimensionality of the attractor with that of its embedding space. The point at which they diverge gives an estimate of the dimensionality of the underlying dynamics. (I may have this somewhat wrong, but this is the gist.)

What is truly amazing to an observer of nature is the tremendous dimensionality reduction that occurs in many-body systems. I don't really understand why this is so, but it is this property that mean-field approaches to NNs capitalize on.

People have looked at the situation of low-dimensional chaos coupled with noise. For example see:

  "Symbolic Dynamics of Noisy Chaos"
  J.P. Crutchfield and N.H. Packard
  Physica 7D (1983)
  (I don't have the page numbers, since I have a preprint copy, sorry.)

I have also heard of a growing interest in treating attractors in a measure-theoretic way to deal with dynamical systems coupled with noise. If anybody has pointers to this, please let me know.

Albert Boulanger
BBN Systems & Technologies Corporation
aboulanger@bbn.com

------------------------------

Subject: Re: Neural Net Applications in Chemistry
From: "Matthew B. Kennel" <phoenix!mbkennel@PRINCETON.EDU>
Organization: Princeton University, NJ
Date: 12 May 89 19:55:05 +0000

[ Regarding the above article by hsg@romeo.UUCP (Henry Greenside) ]

Quite true. This is definitely a fundamental problem. As the dimension gets higher and higher, the data series looks more and more like true random noise, and so prediction becomes impossible. Note, for example, that the output of your favorite "random number generator" is most likely deterministic, but probably has such a high dimension that you can't predict with any accuracy without knowing the exact algorithm used.

The real problem comes down to finding a representation with both smooth mappings and low (fractal) dimensional input spaces. This requires plain old hard work and clever insight.

Matt Kennel
mbkennel@phoenix.princeton.edu

------------------------------

Subject: Re: Neural Net Applications in Chemistry
From: aboulang@bbn.com (Albert Boulanger)
Date: Fri, 12 May 89 23:44:48 +0000

In article <8393@phoenix.Princeton.EDU>, Matthew B. Kennel writes:

  Quite true. This is definitely a fundamental problem. As the dimension gets higher and higher, the data series looks more and more like true random noise, and so prediction becomes impossible. Note, for example, that the output of your favorite "random number generator" is most likely deterministic, but probably has such a high dimension that you can't predict with any accuracy without knowing the exact algorithm used.

Actually, they are low-dimensional chaotic discrete maps. They have to be low-dimensional to be efficient. The reason that they are cyclic is that there is no computer representation for irrationals. For a nice picture of this see page 90 of Knuth Vol. 2: "The Numbers fall mainly in the planes."

I have been playing with increasing the dimensionality of random number generators by asynchronous iterative methods on MIMD parallel machines. If anybody is interested in the dynamical aspects of discrete iterations, a good book to start with is:

  Discrete Iterations: A Metric Study
  Francois Robert
  Springer-Verlag, 1986

The following is a nice think-piece on "discrete" randomness and "continuous" randomness:

  Ford, Joseph. "How Random is a Coin Toss?", Physics Today (April 1983), 40-47. A very good introduction to the notions of chaos and deterministic randomness.

Now, this is off the subject of NNs, and we could follow up on this in the group comp.theory.dynamic-sys.

Albert Boulanger
BBN Systems & Technologies Corp.
aboulanger@bbn.com

------------------------------

Subject: computer composition
From: Eric Harnden <EHARNDEN%AUVM.BITNET@CORNELLC.cit.cornell.edu>
Date: Tue, 23 May 89 15:08:38 -0400

i know this is a little off the wall for the topic of this digest, but the reference to JimiMatrix leads me to think that there may be a resource here... i am sending this simultaneously to several of the lists to which i am subscribed. my apologies to those of my compatriots who have parallel subscriptions, and thus will see this more than once.

there are many approaches to computer-generated music that i am aware of. in general they seem to fall into two categories: 1) analyze known works for event patterns and meta-patterns; this allows long probability sequences to be developed by direct derivation. 2) synthesize sequences with stochastic models with short memory.

i'm interested in the idea of constructing an event tree, whose depth may well equal the number of events to be specified. i want to assign weight to an event based, not just on the occurrence of its predecessor, but on the development of the string of which it is a member. i don't want to derive the rules for weighting from analysis of other strings. so i'm not writing a mozart program, and i'm not doing pink tunes, and i'm not implementing markov chains. before i go crashing headlong into this, does anybody have any thoughts, references, things i should know, people i should talk to... etc.?

Eric Harnden (Ronin) <EHARNDEN@AUVM>
The American University Physics Dept.
(202) 885-2758
------------------------------

Subject: Info request on parallel implementation & invariant object recognition
From: plonski@aerospace.aero.org
Date: Mon, 15 May 89 13:43:50 -0700

I am looking for information on simulating neural networks on a parallel processing architecture such as a Sequent or a network of Suns. My interest is in how one breaks up the processing for various networks, but in particular backpropagation, when you have a small number of processors (~10) available. Should processing be divided by layer, by node subdivision within a layer, by task, etc.? What division yields the minimal communication overhead? Also, what are the effects of asynchronous processing? For associative memory models, asynchronous processing can be used to avoid limit cycles, but at what cost in speed of convergence? [ ED note: a sketch of one simple division, by training examples, follows this message. ]

My other interest is in using higher-order interconnections (a la Giles, et al.) to get a network that is invariant to geometric transformations in 2D images. It seems to me that the net effect is equivalent to using an initial layer to generate an invariant feature space and then applying an ordinary net to select features in this space. If this is the case, then do higher-order networks yield any advantages over preprocessing the data first to get an invariant feature space and using that as your input?

I would be interested in any references or work that you have done, or know about, regarding these issues. You can send the replies directly to me and I will summarize the replies and send them to all interested parties. Thank you.

Michael Plonski
"plonski@aero.org"
The opinions expressed herein are solely those of the author and do not represent those of The Aerospace Corporation.
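[ ED note: not an answer from the literature, just a sketch of one simple division the question invites -- splitting the training examples across a handful of processors, having each compute the gradient on its share, and summing the results. Communication per weight update is then one gradient-sized message per processor, independent of how the network itself is wired. The "network" below is a deliberately trivial linear model so the gradient code stays short; the partition-and-sum logic is the point, and the serial loop stands in for processors running concurrently. Python with NumPy; all names are this note's own. ]

    import numpy as np

    def split_batch_gradients(examples, targets, grad_fn, weights, n_proc=10):
        """Example-level parallelism: each of n_proc workers would take one
        chunk, compute grad_fn(weights, chunk) locally, and send back a single
        gradient vector; the host sums them.  The loop below stands in for the
        n_proc workers running concurrently."""
        chunks = np.array_split(np.arange(len(examples)), n_proc)
        total = np.zeros_like(weights)
        for c in chunks:                      # in reality: one chunk per processor
            total += grad_fn(weights, examples[c], targets[c])
        return total / len(examples)          # mean gradient over the full set

    def grad_fn(w, X, y):
        """Placeholder 'network': linear model with squared error; returns the
        gradient summed over this chunk."""
        return X.T @ (X @ w - y)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.normal(size=(2000, 5))
        y = X @ np.array([1.0, -2.0, 0.0, 3.0, 0.5])
        w = np.zeros(5)
        for _ in range(200):
            w -= 0.1 * split_batch_gradients(X, y, grad_fn, w)
        print(np.round(w, 3))   # approaches the true weights [1, -2, 0, 3, 0.5]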
------------------------------

Subject: RE: Neurological Topology (leading to poll)
From: worden@emx.utexas.edu (Sue J. Worden)
Date: Fri, 02 Jun 89 05:31:32 -0500

James Salsman says:

  "... I have read that in humans, the grey matter lobes act as insulation around a somewhat planar network called the "white matter" that is "crunched up" to form a ball. The white matter is not perfectly planar -- it is arranged in layers. The grey matter has fewer neurons than the white matter, but quite a few more capillaries..."

Dorland's Illustrated Medical Dictionary says:

  "substantia alba, white substance: the white nervous tissue, constituting the conducting portion of the brain and spinal cord, and composed mostly of myelinated nerve fibers."

  "substantia grisea, gray substance: the gray nervous tissue, composed of nerve cell bodies, unmyelinated nerve fibers, and supportive tissue."

Simplistically speaking, gray matter is a mass of neurons' cell bodies and white matter is a mass of neurons' axons. The cerebral cortex, for example, is several (six major) layers of cell bodies and, hence, is gray matter. Axons from cell bodies in the cortical layers (and elsewhere) form an underlying white matter region.

Unfortunately, simple distinctions between white and gray matter, simple maps of anatomical structures, simple neuron models, simple axon-bundle "wiring diagrams", et cetera, don't appear to contribute much to a solid advancement of knowledge. It seems to me that, for the most part, we're all mucking about, spinning our wheels, getting nowhere fast. When I ponder my own work (with systems of randomly interconnected McCulloch-Pitts neurons), and the work of so many others with everything from sodium channels to psychopharmacology, from snails to psychotics, from perceptrons to artificial cochleae, et cetera, what I sense is a modern form of alchemy. It is as though we're all fluttering about on the periphery of our equivalent of a periodic table.

I am, naturally, curious to know where my opinion falls on the bell curve of your honest thoughts. Ignoring for a moment the practicalities of publish-or-perish, winning grants/contracts, and so on:

(1) Do any of you REALLY believe that any present line of research will directly lead to either:
    (a) widely-accepted understanding of our nervous systems, our intelligence, our personalities, our selves?
    (b) machine capabilities widely accepted to be equivalent to human capabilities?
(2) If so, which line(s) of research?
(3) If not, do you believe:
    (a) it is impossible or highly improbable?
    (b) it is possible, but we don't yet have the necessary "key(s)"? (my opinion)
    (c) something else?

- Sue Worden
  Electrical and Computer Engineering
  University of Texas at Austin

------------------------------

Subject: Re: Neuron Digest V5 #20
From: Robert Morelos-Zaragoza <robert@wiliki.eng.hawaii.edu>
Organization: University of Hawaii, College of Engineering
Date: Wed, 24 May 89 16:20:38 -1000

Ladies/Gentlemen:

I am looking for papers that indicate the connection between Neural Networks and Coding Theory. In particular, how can Neural Network Theory help in the construction of new Error-Correcting Codes?

Robert Morelos-Zaragoza (robert@wiliki.eng.hawaii.edu)
Tel. (808) 938-6094

------------------------------

Subject: RE: Neuron Digest V5 #24
From: edstrom%UNCAEDU.BITNET@CORNELLC.cit.cornell.edu
Date: Wed, 31 May 89 08:31:38 -0600

> From: Ian Parberry <omega.cs.psu.edu!ian@PSUVAX1.CS.PSU.EDU>
>
> At NIPS last year, one of the workshop attendees told me that, assuming one
> models neurons as performing a discrete or analog thresholding operation on
> weighted sums of its inputs, the summation appears to be done in the axons
> and the thresholding in the soma. ...
>
> It's now time to write up the fault-tolerance result. I'd like to include
> some references to "accepted" neurobiological sources which back up the
> attendee's observation. Trouble is, I am not a neurobiologist, and do not
> know where to look. Can somebody knowledgeable please advise me?

The best reference that comes to mind is "The Synaptic Organization of the Brain: An Introduction" by Gordon M. Shepherd, Oxford University Press, 1974. It's a bit out of date but still excellent. Many details have been filled in since 1974, but our basic understanding of the integrative properties of the neuron has not changed all that much. Also, any attempt to jump into the contemporary literature on the subject will be much more difficult without this "classic" core knowledge.

This book begins by discussing electrotonic phenomena, space constants, current flow in dendritic trees, etc., and then applies the principles to different neurons (cerebral pyramidal cells, cerebellar Purkinje cells, spinal motorneurons, etc.). He also discusses some of the local circuits in the various parts of the vertebrate brain from which the sample neurons come. This is the best introduction to the subject that I know of, and I recommend it highly.

John Edstrom

Dr. John P. Edstrom                 Bitnet: EDSTROM@UNCAEDU
Div. Neuroscience                   CIS:    7641,21
3330 Hospital Drive NW              BIX:    JPEDstrom
Calgary, Alberta  T2N 4N1
CANADA   (403) 220 4493

------------------------------

Subject: re: Neuron Digest V5 #24
From: stiber@CS.UCLA.EDU (Michael D Stiber)
Date: Wed, 31 May 89 13:00:30 -0700

> Subject: Re: wanted: neurobiology references
> From: Mark Robert Thorson <portal!cup.portal.com!mmm@uunet.uu.net>
> Organization: The Portal System (TM)
> Date: 22 Apr 89 00:13:57 +0000
>
> I was taught, 10 years ago, that action potentials are believed to originate
> at the axon hillock, which might be considered the transition between the
> axon and the soma (cell body). See FROM NEURON TO BRAIN by Kuffler and
> Nicholls (Sinauer 1976), page 349.
>
> I would expect synaptic weights to be proportional to the axon circumference
> where it joins the cell body, but I have no evidence to support that belief.

Actually, this is a simplified model. Purkinje cells, for example, are hypothesized to have trigger zones in their dendritic arborization.
See Segundo, J.P., "What Can Neurons do to Serve as Integrating Devices?", J. Theoret. Neurobiol. 5, 1-59 (1986).

------------------------------

End of Neurons Digest
*********************