[comp.ai.neural-nets] request for philosophic reactions to connectionism

maner@bgsuvax.UUCP (Walter Maner) (04/21/89)

From article <370@eurtrx.UUCP>, by hans@eurtrx.UUCP (Hans Schermer):
> 
> 
> Can anyone out there give me a hand?
> I am looking for philosophical papers, books or articles, with reactions
> to connectionism as a model for the mind. 


The revised (paperback) edition of _Mind Over Machine_ by Hubert & Stuart
Dreyfus would be a good place to begin.  It's published by the Free Press,
which is a division of Macmillan.  The original (hardcover) edition of this
book appeared before connectionism was a hot item, but the revised edition
takes connectionism into account.

-- 
CSNet   : maner@research1.bgsu.edu             | 419/372-8719 
InterNet: maner@research1.bgsu.edu (129.1.1.2) | BGSU CS Dept
UUCP    : ... !osu-cis!bgsuvax!maner           | Bowling Green, OH 43403
BITNet  : MANER@BGSUOPIE

zqli@batcomputer.tn.cornell.edu (Zhenqin Li) (04/21/89)

A whole issue (Vol. XXVI, 1987) of "The Southern Journal of Philosophy"
(published by the Philosophy Dept of Memphis State Univ, Memphis, TN 38152),
is dedicated to philosophical discussions of connectionism. Not having
spent time on the subject, I cannot make judgements. The lists of references
there, however, seem to be extensive. 

rapaport@sunybcs.uucp (William J. Rapaport) (04/21/89)

In article <370@eurtrx.UUCP> hans@eurtrx.UUCP (Hans Schermer) writes:
>
>
>I am looking for philosophical papers, books or articles, with reactions
>to connectionism as a model for the mind. 

Try:

Horgan, Terence, & Tienson, John (eds.), _Connectionism and the
Philosophy of Mind:  Spindel Conference 1987_, in _Southern Journal of
Philosophy_, Vol. 26 supplement (1987).

Available from SJP, Dept. of Phil., Memphis State U., Memphis, TN 38152.

			William J. Rapaport
			Associate Professor of Computer Science
			Co-Director, Graduate Group in Cognitive Science
			Interim Director, Graduate Research Initiative
                       	                  in Cognitive and Linguistic Sciences

Dept. of Computer Science||internet:  rapaport@cs.buffalo.edu
SUNY Buffalo		 ||bitnet:    rapaport@sunybcs.bitnet
Buffalo, NY 14260	 ||uucp: {decvax,watmath,rutgers}!sunybcs!rapaport
(716) 636-3193, 3180     ||fax:  (716) 636-3464

myke@gatech.edu (Myke Rynolds) (04/21/89)

Hans Schermer writes:
>I am looking for philosophical papers, books or articles, with reactions
>to connectionism as a model for the mind. 

I think that BAMs (bidirectional associative memories) and their conceptual
parent, ART (adaptive resonance theory), offer a profound critique of the
connectionist models. Grossberg, the inventor of ART way back in '76, goes
into great detail about how nothing anyone in the connectionist school of
thought has said is new, or even as powerful as what already exists! ART is
proven to converge on input of any complexity; no connectionist model can
claim this. They can learn only by limiting the complexity of the input,
hence the failure of bp to deal with large and complex systems.
For all its greater power, it is much, much simpler than these other models
that cloud the issue with ad hoc hocus-pocus. Grossberg's model is nothing
more than matrix multiplication. You take a vector forward through a weight
matrix, then take it backwards through the same matrix. When it resonates on
the correct answer, you're done. The most obvious way to get a weight matrix
that satisfies this property for a series of such vectors is to stack them in
a matrix and do linear algebra. Voila!
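
In code, the whole scheme fits in a few lines (a toy NumPy sketch of my own,
not anyone's published code; the Hebbian outer-product rule is just the
simplest way to build the weight matrix, not the only one):

    import numpy as np

    # Two bipolar (+1/-1) pattern pairs to associate.
    X = np.array([[ 1, -1,  1, -1],
                  [ 1,  1, -1, -1]])
    Y = np.array([[ 1,  1, -1],
                  [-1,  1,  1]])

    # "Stack them in a matrix and do linear algebra": the Hebbian
    # weight matrix is just the sum of the pairs' outer products.
    W = X.T @ Y

    def bsign(v):
        # bipolar threshold; ties go to +1 so states stay in {+1, -1}
        return np.where(v >= 0, 1, -1)

    def recall(x, steps=10):
        # bounce forward and backward through W until the pair resonates
        for _ in range(steps):
            y = bsign(x @ W)        # forward through the weight matrix
            x_new = bsign(W @ y)    # backward through the same matrix
            if np.array_equal(x_new, x):
                break               # resonance: a stable (x, y) pair
            x = x_new
        return x, y

    print(recall(np.array([1, -1, 1, 1])))   # one-bit-noisy probe -> pair 1
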
An article on BAMs can be found in an issue of Byte from last year.
BTW, Grossberg has three Ph.D.s, two of which are in math and neurophysiology.
Connectionists are generally psychologists and computer scientists who do not
appreciate the deeper simplicity of the math beneath the tremendous outer diversity.
-- 
Myke Rynolds
School of Information & Computer Science, Georgia Tech, Atlanta GA 30332
uucp:	...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!myke
Internet:	myke@gatech.edu

jha@lfcs.ed.ac.uk (Jamie Andrews) (04/21/89)

In article <4034@bgsuvax.UUCP> maner@bgsuvax.UUCP (Walter Maner) writes:
>The revised (paperback) edition of _Mind Over Machine_ by Hubert & Stuart
>Dreyfus would be a good place to begin...
>...  The original (hardcover) edition of this
>book appeared before connectionism was a hot item, but the revised edition
>takes connectionism into account.

     You mean they added a couple of chapters simplistically
trashing connectionism the way they simplistically trash the
rest of AI?  Terrific.

--Jamie, who has been in a bad mood all day
  jha@lfcs.ed.ac.uk
"Gonna melt them down for pills and soap"

kortge@Portia.Stanford.EDU (Chris Kortge) (04/21/89)

In article <18496@gatech.edu> myke@gatech.UUCP (Myke Rynolds) writes:
>
>I think that BAMs (bidirectional associative memories) and their conceptual
>parent, ART (adaptive resonance theory), offer a profound critique of the
>connectionist models. Grossberg, the inventor of ART way back in '76, goes
>into great detail about how nothing anyone in the connectionist school of
>thought has said is new, or even as powerful as what already exists! ART is
>proven to converge on input of any complexity; no connectionist model can
>claim this. They can learn only by limiting the complexity of the input,
>hence the failure of bp to deal with large and complex systems.

Hold on a second!  Why is it, then, that people are using
back-propagation learning on most practical applications?  I agree that
bp has trouble with large systems, but it's important to look at the
*results* of the learning process, too.  BP can learn distributed
representations, which have advantages over the strictly categorical
ones that ART learns.  More importantly, since BP does supervised
learning, its internal representation is automatically suited to the
task at hand; ART is unsupervised, and thus its categories are not
necessarily useful for facilitating the required outputs.

>For all its greater power, [ART] is much, much simpler than these other models
>that cloud the issue with ad hoc hocus-pocus. Grossberg's model is nothing
>more than matrix multiplication. [...]

Then why can't I understand his papers?  (Don't answer that :-))
Most likely, it's because I'm a connectionist, and

>Connectionists are generally psychologists and computer scientists who do not
>appreciate the deeper simplicity of the math beneath the tremendous outer diversity.

Well, be patient with us, okay?

Chris Kortge
kortge@psych.stanford.edu

dhw@itivax.iti.org (David H. West) (04/22/89)

In article <370@eurtrx.UUCP> hans@eurtrx.UUCP (Hans Schermer) writes:
>
>
>Can anyone out there give me a hand?
>I am looking for philosophical papers, books or articles, with reactions
>to connectionism as a model for the mind. 

Huh?  Connectionism can't *be* a model for the mind.  It might be a
good way to *implement* certain models of the mind, or even a
heuristic criterion for evaluating such models ("must map easily to 
hardware of this general nature").  But it doesn't relieve us of the
task of coming up with the models separately.

-David West            dhw@itivax.iti.org
		       {uunet,rutgers,ames}!sharkey!itivax!dhw
COMPIS, Industrial Technology Institute, PO Box 1485, 
Ann Arbor, MI 48106

mbkennel@phoenix.Princeton.EDU (Matthew B. Kennel) (04/22/89)

In article <18496@gatech.edu> myke@gatech.UUCP (Myke Rynolds) writes:
>
>I think that BAMs (bidirectional associative memories) and their conceptual
>parent, ART (adaptive resonance theory), offer a profound critique of the
>connectionist models. Grossberg, the inventor of ART way back in '76, goes
>into great detail about how nothing anyone in the connectionist school of
>thought has said is new, or even as powerful as what already exists! ART is
>proven to converge on input of any complexity; no connectionist model can
>claim this. They can learn only by limiting the complexity of the input,
>hence the failure of bp to deal with large and complex systems.
>For all its greater power, it is much, much simpler than these other models
             ^^^^^^^^^^^^^
>that cloud the issue with ad hoc hocus-pocus. Grossberg's model is nothing
>more than matrix multiplication. You take a vector forward through a weight
           ^^^^^^^^^^^^^^^^^^^^^^
>matrix, then take it backwards through the same matrix. When it resonates on
>the correct answer, you're done. The most obvious way to get a weight matrix
>that satisfies this property for a series of such vectors is to stack them in
>a matrix and do linear algebra. Voila!

That's exactly the point.  For linear problems, I have no doubt that
classical algorithms (linear systems of equations) should work better than
gradient descent (BP), with the whole shebang of nice rigorous results, but
the whole point is that back-prop tries to learn general non-linear
transformations that AREN'T matrix multiplications.
For some kinds of associative memory something like ART
may be fine, but associative memory isn't the whole story.  It's
generalization (i.e. high-dimensional interpolation) that is the most
interesting aspect of multi-layer perceptrons.

Can something like a BAM network be more efficient than an "encoder"
type of perceptron in terms of the number of connections?  

>An article on BAMs can be found in an issue of Byte from last year.
>BTW, Grossberg has three Ph.D.s, two of which are in math and neurophysiology.
>Connectionists are generally psychologists and computer scientists who do not
>appreciate the deeper simplicity of the math beneath the tremendous outer diversity.

I've never been able to discern the deeper simplicity of math in any ART paper
that I've seen (which is very few, I must admit); back-prop is just gradient
descent.


Matt Kennel
mbkennel@phoenix.princeton.edu

myke@gatech.edu (Myke Rynolds) (04/22/89)

At portia?! Hey, do you know Paul Gunnels?
Chris Kortge writes:
|Myke Rynolds writes:
||I think that BAMs (bidirectional associative memories) and their conceptual
||parent, ART (adaptive resonance theory), offer a profound critique of the
||connectionist models. Grossberg, the inventor of ART way back in '76, goes
||into great detail about how nothing anyone in the connectionist school of
||thought has said is new, or even as powerful as what already exists! ART is
||proven to converge on input of any complexity; no connectionist model can
||claim this. They can learn only by limiting the complexity of the input,
||hence the failure of bp to deal with large and complex systems.
||
|Hold on a second!  Why is it, then, that people are using
|back-propagation learning on most practical applications?
Good question. Maybe it's a fad?

|I agree that
|bp has trouble with large systems, but it's important to look at the
|*results* of the learning process, too.  BP can learn distributed
|representations, which have advantages over the strictly categorical
|ones that ART learns.
False! ART learns internal reps too. Both BP and ART generate their own
internal reps (for no good reason, in my opinion), but BAMs simply associate
input vectors with output vectors.

|More importantly, since BP does supervised
|learning, its internal representation is automatically suited to the
|task at hand; ART is unsupervised, and thus its categories are not
|necessarily useful for facilitating the required outputs.
But unless the supervisor is omniscient, it doesn't know when to stop being
plastic to prevent memory washout. ART does not suffer from this. The lack
of need for a supervisor is not a weakness; it is a tremendous advantage!
|
||For all its greater power, [ART] is much, much simpler than these other models
||that cloud the issue with ad hoc hocus-pocus. Grossberg's model is nothing
||more than matrix multiplication. [...]
|
|Then why can't I understand his papers?  (Don't answer that :-))
|Most likely, it's because I'm a connectionist, and
Cuz the man is lost in his own little world. However, he's not being swept
along by any mobs either.
|
||Connectionists are generally psychologists and computer scientists who do not
||appreciate the deeper simplicity of the math beneath the tremendous outer diversity.
|
|Well, be patient with us, okay?
Ok, as long as y'all see the light soon!
-- 
Myke Rynolds
School of Information & Computer Science, Georgia Tech, Atlanta GA 30332
uucp:	...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!myke
Internet:	myke@gatech.edu

myke@gatech.edu (Myke Rynolds) (04/22/89)

Matthew B. Kennel writes:
|That's exactly the point.  For linear problems, I have no doubt that
|classical algorithms (linear systems of equations) should work better than
|gradient descent (BP), with the whole shebang of nice rigorous results, but
|the whole point is that back-prop tries to learn general non-linear
|transformations that AREN'T matrix multiplications.
|For some kinds of associative memory something like ART
|may be fine, but associative memory isn't the whole story.  It's
|generalization (i.e. high-dimensional interpolation) that is the most
|interesting aspect of multi-layer perceptrons.

Ah! It is if you are dealing with real-valued neurons, which BP gives the
facade of doing, but in fact it only uses the high and low ends of the range
and is thus binary. With binary neurons, non-linear models are not one iota
more powerful. In fact, they only increase the complexity of the algorithm.
I'm working on a sparse linear equation solver with binary compaction to
function as a BAM-type associative memory. I want to beat the masters at
chess. (ALL my faculty are excited by my ideas! That's to say, I'm not a crank)

|Can something like a BAM network be more efficient than an "encoder"
|type of perceptron in terms of the number of connections?  
It's an associative memory, not an encoder. Night and day.

|>Connectionists are generally psychologists and computer scientists who do not
|>appreciate the deeper simplicity of the math beneath the tremendous outer diversity.
|I've never been able to discern the deeper simplicity of math in any ART paper
|that I've seen (which is very few, I must admit); back-prop is just gradient
|descent.
That's because the dude is clueless about people.
-- 
Myke Rynolds
School of Information & Computer Science, Georgia Tech, Atlanta GA 30332
uucp:	...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!myke
Internet:	myke@gatech.edu

mbkennel@phoenix.Princeton.EDU (Matthew B. Kennel) (04/22/89)

In article <18504@gatech.edu> myke@gatech.UUCP (Myke Rynolds) writes:
)Matthew B. Kennel writes:
)|That's exactly the point.  For linear problems, I have no doubt that
)|classical algorithms (linear systems of equations) should work better than
)|gradient descent (BP), with the whole shebang of nice rigorous results, but
)|the whole point is that back-prop tries to learn general non-linear
)|transformations that AREN'T matrix multiplications.
)|For some kinds of associative memory something like ART
)|may be fine, but associative memory isn't the whole story.  It's
)|generalization (i.e. high-dimensional interpolation) that is the most
)|interesting aspect of multi-layer perceptrons.
)
)Ah! It is if you are dealing with real-valued neurons, which BP gives the
)facade of doing, but in fact it only uses the high and low ends of the range
)and is thus binary. With binary neurons, non-linear models are not one iota
)more powerful. In fact, they only increase the complexity of the algorithm.
             ????????????????
Huh? Wasn't the inability to learn non-linear
transformations the fatal stake through the heart of the single-layer 1960's
perceptrons?  Can ART learn parity?  What makes it different from a
classical perceptron?  You said ART was basically matrix multiplication; if
so, I have serious doubts about its power.

)I'm working on a sparse linear equation solver with binary compaction to
)function as a BAM-type associative memory. I want to beat the masters at
)chess. (ALL my faculty are excited by my ideas! That's to say, I'm not a crank)
)
You can beat Kasparov with _linear_ transformations?  Or maybe what you
mean by linear isn't the same as what I'm thinking of.  By linear
I mean linear in the input vectors: a single-layer classical perceptron.
I can easily deal with algorithms that are linear in the _free parameters_,
but still represent _nonlinear_ transformations, like radial basis functions.
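
As a concrete illustration (a toy NumPy sketch of my own, nothing from any
particular paper): fix the centers and widths of some Gaussian radial basis
functions, and fitting the weights is ordinary linear least squares, even
though the fitted map is nonlinear in x:

    import numpy as np

    rng = np.random.default_rng(0)

    # Target: a nonlinear 1-D function, sampled with a little noise.
    x = rng.uniform(-3, 3, size=200)
    t = np.sin(2 * x) + 0.05 * rng.standard_normal(200)

    # Gaussian radial basis functions with FIXED centers and width.
    centers = np.linspace(-3, 3, 12)
    width = 0.6

    def design(u):
        # one column of basis-function outputs per center
        return np.exp(-((u[:, None] - centers[None, :]) ** 2) / (2 * width**2))

    # y(x) = sum_i w_i * phi_i(x) is nonlinear in x but linear in the
    # free parameters w, so ordinary least squares solves for w exactly.
    w, *_ = np.linalg.lstsq(design(x), t, rcond=None)

    x_new = np.array([0.5, 1.5])
    print(design(x_new) @ w, np.sin(2 * x_new))   # prediction vs. truth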

)|Can something like a BAM network be more efficient than an "encoder"
)|type of perceptron in terms of the number of connections?  
)It's an associative memory, not an encoder. Night and day.
)
What I mean is some network that has a small internal layer that then
fans back out to an output layer.  As a purely contrived example, consider
associating 128-bit binary numbers where only a single bit in each string
is on.

For a single-layer linear system:
128 neurons -> 128 neurons = 16,384 connections.

For a multi-layer "encoder":
128 neurons -> 7 neurons -> 128 neurons = 1,792 connections.

This is something like what I'm trying to get at.
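
For the curious, here is that contrived example built by hand (my own toy
sketch, nothing standard): the 7 hidden units just hold the binary code of
the active bit's index, so all 128 one-hot patterns squeeze through the
bottleneck exactly:

    import numpy as np

    n, k = 128, 7                    # 2**7 = 128, so 7 hidden units suffice

    # Encoder: hidden unit j reads bit j of the index of the active input,
    # so a one-hot input at position i yields the binary code of i.
    codes = np.array([[(i >> j) & 1 for j in range(k)] for i in range(n)])
    W_enc = codes                    # n x k

    # Decoder: output unit i scores the agreement between the hidden code
    # and its own code (bipolar matching); the true index scores highest.
    W_dec = (2 * codes - 1).T        # k x n

    def run(i):
        x = np.zeros(n); x[i] = 1.0              # one-hot input
        h = x @ W_enc                            # hidden: binary code of i
        scores = (2 * h - 1) @ W_dec             # compare with every code
        return int(np.argmax(scores))            # winner = decoded index

    assert all(run(i) == i for i in range(n))
    print("connections:", n * k + k * n)         # 1792, vs. 128*128 = 16384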

)|)Connectionists are generally psychologists and computer scientists who do not
)|)appreciate the deeper simplicity of the math beneath the tremendous outer diversity.
)|I've never been able to discern the deeper simplicity of math in any ART paper
)|that I've seen (which is very few, I must admit); back-prop is just gradient
)|descent.
)Thats because the dude is clueless about people.

Do you think you could try to make a simple mathematical description of
ART?  It would be enlightening.


myke@gatech.edu (Myke Rynolds) (04/23/89)

Matthew B. Kennel writes:
>Myke Rynolds (me)  writes:
>)Ah! It is if you are dealing with real-valued neurons, which BP gives the
>)facade of doing, but in fact it only uses the high and low ends of the range
>)and is thus binary. With binary neurons, non-linear models are not one iota
>)more powerful. In fact, they only increase the complexity of the algorithm.
>             ????????????????
>Huh? Wasn't the inability to learn non-linear
>transformations the fatal stake through the heart of the single-layer 1960's
>perceptrons?  Can ART learn parity?  What makes it different from a
>classical perceptron?  You said ART was basically matrix multiplication; if
>so, I have serious doubts about its power.
You neglect the fact that any set of vectors can be made orthogonal. If you
tried to teach a BAM to count in binary, it would fail, of course. That hardly
means that you can't teach it to count. By expanding the number of elements in
a vector, fewer of them need to be set, which further decreases the percentage
set. This isn't ART; this was my realization that ART isn't really limited.
You can add higher-order combinations of the inputs to get better
multi-dimensional Taylor approximations of a function, but that is
conceptually the same as expanding the input from binary to unary.
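
To make the orthogonality point concrete (a toy sketch of my own): raw
binary count patterns overlap badly, but expand each 3-bit number into a
unary/one-hot code and the stored vectors become exactly orthogonal, so
even a purely linear associative memory learns to count:

    import numpy as np

    def one_hot(i, n=8):
        v = np.zeros(n)
        v[i] = 1.0
        return v

    # Task: associate each 3-bit number i with its successor i+1 (mod 8).
    # The one-hot (unary) expansions of 0..7 are mutually orthogonal.
    X = np.array([one_hot(i) for i in range(8)])             # inputs
    Y = np.array([one_hot((i + 1) % 8) for i in range(8)])   # targets

    # Linear associator: with orthonormal inputs, W = X^T Y recalls
    # every pair perfectly (all the cross-talk terms vanish).
    W = X.T @ Y

    for i in range(8):
        assert np.argmax(one_hot(i) @ W) == (i + 1) % 8
    print("a purely linear memory counts, once the input is unary")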

>You can beat Kasparov with _linear_ transformations?  Or maybe what you
>mean by linear isn't the same as what I'm thinking of.  By linear
>I mean linear in the input vectors: a single-layer classical perceptron.
>I can easily deal with algorithms that are linear in the _free parameters_,
>but still represent _nonlinear_ transformations, like radial basis functions.
You're blowing me away :-) I don't know what you mean here, but I spent two
quarters talking to mathematicians about multi-dimensional Taylor expansions;
that seemed to me to be where the most power was. But I was always disturbed
by the fact that performing such expansions on binary inputs left me with just
a lot more binary inputs! It's simply another form of binary input expansion.
>
>)|Can something like a BAM network be more efficient than an "encoder"
>)|type of perceptron in terms of the number of connections?  
>)It's an associative memory, not an encoder. Night and day.
>What I mean is some network that has a small internal layer that then
>fans back out to an output layer.  As a purely contrived example, consider
>associating 128-bit binary numbers where only a single bit in each string
>is on.
Encoders are in some ways the reverse of what I'm talking about. You have
a unary input (one element set) that activates a binary rep in a much smaller
internal layer. What's the internal layer for? Why not just associate the
unary input to the output? That is a much simpler problem for a linear
transform.

>Do you think you could try to make a simple mathematical description of
>ART?  It would be enlightening.
Sure! Input signals go through the LTM matrix and get contrast-enhanced.
The extreme of contrast enhancement is winner-takes-all, with only one output
element set. The LTM is an adaptive linear filter with associative decay.

The output F2 activation gets sent back through the LTM traces to F1, and if
the error is too great, the system resets and another F2 activation pattern
is attempted. The reset is self-scaled: if a pattern with three 1's set has
an error, it means much more than if a pattern with fifty 1's set has only one
bit wrong. Unfortunately, I don't understand how the search for a new F2
activation pattern works.

Finally, there is a gain control, giving you three things that can be active
or inactive. The input, the top down F2 pattern, and the gain control. F1 only
sends its activation pattern up when two of these three sources are active.
The presence of an input pattern activates the gain control while the presence
of a top down F2 pattern inhibits it.

But all this stuff makes ART able to deal with a constantly changing input
environment, which a production system needs to deal with. My environment is
constant, thus I prefer the BAM's simplicity.
-- 
Myke Rynolds
School of Information & Computer Science, Georgia Tech, Atlanta GA 30332
uucp:	...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!myke
Internet:	myke@gatech.edu

adverb@bucsb.UUCP (Josh Krieger) (04/23/89)

Please stop dismissing ART out of ignorance!

Grossberg's papers are exceptionally difficult to understand because
each discovery is dependent on the "minimal" biological building
blocks discovered in the past 30 years. It would be impossible
to understand microprocessor intricacies with only a basic knowledge
of transistors; so don't expect to understand Grossberg's work
without a little research and time. ART is a beautiful discovery.

A good introduction might be to try reading "How Does a Brain Build
a Cognitive Code"; it's reprinted in Anderson and Rosenfeld's
"Neurocomputing" and in Grossberg's "Studies of Mind and Brain". It will
take about four readings to get the information solid. But don't
be frightened by it.

What follows is meant to be a very simple explanation of how
ART I works. The description is not complete; however, it gives
an idea of the inherent power in the model.

ART consists of two subsystems: 1) an attentional subsystem
and 2) an orienting subsystem:


            Attentional Subsystem       Orienting Subsystem

              **** Layer 2 ****
               |            <-   <-----\
               |             |           \
               ->            |             \
              **** Layer 1 ****   --->       ****
               ! ! ! ! ! ! ! !
                   Input

The attentional subsystem accepts an input pattern
into Layer 1. Bottom-up signals filter the pattern
into Layer 2, where noise is removed and features are enhanced.
All the cells in Layer 2 compete, and the one which
wins the competition produces a "learned expectancy", or top-down
signal, back to Layer 1. If the expectancy is not an accurate match
of the input pattern (there is a definition of accurate), then
the orienting subsystem is notified by the attentional subsystem.
The orienting subsystem proceeds to send an arousal burst back
to the attentional subsystem, which resets the "category cell"
that produced the expectancy. The process is repeated until
a cell is found which produces an accurate top-down
expectancy of the input. In other words, one cell in Layer 2
will represent the exemplar of a particular category,
and that cell will not be recoded to a different category
even if a continual stream of input patterns is presented to the
network. If a perfect match is not found, a category cell in
Layer 2 is chosen, and bottom-up and top-down activations reverberate
(or resonate) until both the bottom-up connections and the
top-down connections have learned the input pattern.
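
In code, the search loop looks roughly like this (a drastically simplified
ART-1-style sketch of my own, for binary patterns only; the real model is a
continuous-time dynamical system, and the match test below is only a crude
stand-in for the 2/3-rule and gain-control dynamics):

    import numpy as np

    class ToyART1:
        def __init__(self, vigilance=0.7):
            self.rho = vigilance
            self.prototypes = []      # top-down templates, one per category

        def present(self, x):
            x = np.asarray(x, dtype=bool)
            # Bottom-up competition: rank stored categories by overlap.
            order = np.argsort([-np.sum(p & x) for p in self.prototypes])
            for j in order:
                p = self.prototypes[j]
                # Vigilance test: is the top-down expectancy an accurate
                # enough match, scaled by the size of the input?
                if np.sum(p & x) / np.sum(x) >= self.rho:
                    self.prototypes[j] = p & x   # resonate and learn
                    return int(j)
                # else: arousal burst resets this cell; try the next one.
            self.prototypes.append(x.copy())     # no match: new category
            return len(self.prototypes) - 1

    net = ToyART1(vigilance=0.8)
    for pattern in ([1,1,1,0,0], [1,1,0,0,0], [0,0,0,1,1]):
        print(net.present(pattern))              # -> 0, 0, 1

Note how the third pattern recruits a fresh category cell instead of
recoding an old one; that is the stability-plasticity point below.
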

The advantages:

 1) Once a pattern is learned, retrieval occurs in one
    feedforward pass.
 2) ART has a vigilance level which allows it to
    place input patterns into categories based upon coarse
    or fine distinctions (coarse distinctions would be
    classifying the letters C and D in the same category, while fine
    ones would involve placing them in different categories).
 3) Old learning is not washed away by the
    "blooming, buzzing confusion", of the real world.
    ART is stable to old learning yet plastic to new information.
 4) ART will converge regardless of the parameter settings.
    There is no tweaking of learning rates.
 5) ART has a vast amount of solid psychological and biological
    evidence for its existence.

Grossberg himself writes:

"The success of these circuits in organizing large interdisciplinary
data bases suggests that they will remain building blocks in any
future theory that supplants the present stage of understanding."

In other words, ART will not be replaced. It will be added to.

"Studies of Mind and Brain" is an incredibly difficult collection to get
through. "The Adapative Brain" is much easier, although it still
relies on previous concepts. Grossberg's work is THE most important
research available in neural networks and to ignore it would be
the equivalent, putting things into a historical perspective,
of having disregarded the potential significance of the transistor
in the 1940's.

One last comment: Backprop and ART are apples and oranges!

-- Josh Krieger (adverb%bucsx.bu.edu@bu-cs.bu.edu)

rao@enuxha.eas.asu.edu (Arun Rao) (04/23/89)

In article <1@bucsb.UUCP>, adverb@bucsb.UUCP (Josh Krieger) writes:
> Please stop dismissing ART out of ignorance!
> 
> Grossberg's papers are exceptionally difficult to understand because
> each discovery is dependent on the "minimal" biological building
  [stuff deleted]
> ART is a beautiful discovery.
> 
  [simple explanation deleted]
> 
> The advantages:
> 
>  1) Once a pattern is learned, retrieval occurs in one
>     feedforward pass.
>  2) ART has a vigilance level which allows it to
>     place input patterns into categories based upon coarse
>     or fine distinctions (coarse distinctions would be
>     classifying the letters C and D in the same category, while fine
>     ones would involve placing them in different categories).
>  3) Old learning is not washed away by the
>     "blooming, buzzing confusion", of the real world.
>     ART is stable to old learning yet plastic to new information.
>  4) ART will converge regardless of the parameter settings.
>     There is no tweaking of learning rates.
>  5) ART has a vast amount of solid psychological and biological
>     evidence for its existence.
> 

	I would not presume to pass judgement on any theory, but I dispute
the (seemingly) widely held view that ART is a novel method of
"learning", as it were, and of classification. I have actually waded through
the kilometers of prose that Grossberg has written, simulated his equations,
and otherwise beaten the subject to death. A lot may still have escaped me,
but I have reached the conclusion that ART is nothing but a parallel
clustering algorithm in which exemplars are stored in the (bottom-up)
weights. Effectively, a parallel dot-product is computed between the
unknown vector and all stored exemplars, and the maximum dot-product is
chosen. It turns out that the dot-product scheme is "imperfect" in the
sense that multiple vectors may generate identical dot-products, and so
the top-down adaptive filter is used to correct the ambiguity thus
produced.
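
	In conventional terms, that is just a sequential "leader" clustering
algorithm (my own sketch of the analogy, not Grossberg's formulation):

    import numpy as np

    def leader_cluster(patterns, threshold=0.8):
        # Serial twin of ART's parallel dot-product-and-pick-the-max:
        # compare each input against every stored exemplar, keep the best.
        exemplars, labels = [], []
        for x in patterns:
            x = x / np.linalg.norm(x)
            if exemplars:
                sims = [e @ x for e in exemplars]   # "parallel" dot products
                best = int(np.argmax(sims))         # maximum dot-product chosen
                if sims[best] >= threshold:         # close enough to a leader?
                    labels.append(best)
                    continue
            exemplars.append(x)                     # else recruit a new cluster
            labels.append(len(exemplars) - 1)
        return labels

    data = [np.array(v, float) for v in
            ([1, 1, 0], [1, 0.9, 0.1], [0, 0, 1], [0.1, 0, 1])]
    print(leader_cluster(data))                     # -> [0, 0, 1, 1]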

	Conventional clustering methods function in almost exactly the
same manner - if they were parallelized, they would do all that
ART does, complete with "learning". IMHO, the novelty of ART lies not
so much in the method by which classification and learning are accomplished
(and to say so would be to do a grave injustice to the scores of workers
who have studied clustering methods and parallel algorithms) as in the
way in which it is psycho-physiologically justified.

	Lippmann has pointed out the clustering algorithm analogy in
"An Introduction to Computing with Neural Nets" (IEEE ASSP Magazine,
April 1987), but he does not emphasize it to any great degree.

- Arun
-- 
Arun Rao
ARPANET: rao@enuxha.asu.edu BITNET: agaxr@asuacvax
950 S. Terrace Road, #B324, Tempe, AZ 85281
Phone: (602) 968-1852 (Home) (602) 965-2657 (Office)

kortge@Portia.Stanford.EDU (Chris Kortge) (04/24/89)

In article <1@bucsb.UUCP> adverb@bucsb.bu.edu (Josh Krieger) writes:
>Please stop dismissing ART out of ignorance!
>
>Grossberg's papers are exceptionally difficult to understand because
>each discovery is dependent on the "minimal" biological building
>blocks discovered in the past 30 years. It would be impossible
>to understand microprocessor intricacies with only basic knowledge
>of transistors; so don't expect to understand Grossberg's work
>without a little research and time. ART is a beautiful discovery.
>
Good analogy.  Notice that even though I don't understand microprocessor
intricacies, I can still program a computer.  Grossberg seems very
concerned with the physical implementation of his systems, and thus
includes a lot of complicated mathematical detail.  There's nothing
objectively wrong with this, but a lot of it may be unnecessary for
describing the essential information-processing properties, and it keeps
"ignorant" people at a distance.

>The advantages:
> [5 advantages deleted]

Like Grossberg, you neglect to point out the disadvantages.  Categorical
representations are inflexible--they allow similarity, and thus
generalization, based on only one dimension (two patterns are in the
same category, or they are not).  Also, unsupervised learning can't
guarantee its representations will be relevant to survival of the
organism.  Using a complex external teacher might be considered
cheating, but you at *least* need reinforcement somewhere.
 
>In other words, ART will not be replaced. It will be added to.

I agree (surprised?).  Backprop has a major drawback, that being the
interference problem--present learning is highly disruptive of past
learning.  This is probably just another way of saying that learning is
too slow; in any case, everyone agrees it's a problem.  On the other
hand, probably not many "backprop people" would agree with me that the
solution is to add something to ART.  They would rather add something to
backprop.  My view is that ART is what will need to be added to
backprop, so either way of viewing it will work.  I think this merger
would be much easier if Grossberg's writing style were more objective,
less detailed, and more info-processing oriented.  He has extremely
important ideas, and people are not giving them enough attention.

>One last comment: Backprop and ART are apples and oranges!

Unfortunately, you're right.  And it's time to do some genetic
engineering!

Chris Kortge
kortge@psych.stanford.edu

manj@brand.usc.edu (B. S. Manjunath) (04/24/89)

kortge@portia.stanford writes:
>>In article <1@bucsb.UUCP> adverb@bucsb.bu.edu (Josh Krieger) writes:
>>Please stop dismissing ART out of ignorance!
>>

>[deleted]

 >Good analogy.  Notice that even though I don't understand microprocessor
 >intricacies, I can still program a computer.  Grossberg seems very
 >concerned with the physical implementation of his systems, and thus
 >includes a lot of complicated mathematical detail.  There's nothing
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 >objectively wrong with this, but a lot of it may be unnecessary for
 >describing the essential information-processing properties, and it keeps
 >"ignorant" people at a distance.
  
 Unfortunately! I would also like to add that most of it is unnecessary,
too. All the analysis carried out in the paper (BTW, I am referring
to the paper by Carpenter and Grossberg, "A Massively Parallel ....",
CVGIP 1987, pp. 54-115 = -61!) makes many simplifying assumptions.
The STM dynamics are completely ignored and assumed to be in
equilibrium whenever a pattern is presented. See, e.g., Section 18
(yes, eighteen of 27) of the paper.

>The advantages:
> [4 advantages deleted]
>>  5) ART has a vast amount of solid psychological and biological
>>     evidence for its existence.
        
	This probably is the strongest point in favour of ART. However, I have
yet to completely "decode" ART 2 (for analog patterns). Simulations
seem to work, but I see no reason for so many layers within F1. I understand
that the system does some kind of normalization of the input patterns, but
I fail to see why so many of them are needed. Could anyone comment on this?

 >Chris Kortge

 bs manjunath
 manj@brand.usc.edu

myke@gatech.edu (Myke Rynolds) (04/25/89)

Arun Rao writes:
>	I would not presume to pass judgement on any theory, but I dispute
>the (seemingly) widely held view that ART is a novel method of
>"learning", as it were, and of classification.

Grossberg designed ART I in '76. BAM is nothing more than a solution to
systems of linear equations, which is equivalent to least mean squares,
linear programming, linear regression, matrix pseudo-inverses, and so on...
You've nothing to dispute here; it's what I meant in one of my earlier
messages when I referred to the deeper simplicity of math. It's an
exceedingly old trick in a new form: cognition!
-- 
Myke Rynolds
School of Information & Computer Science, Georgia Tech, Atlanta GA 30332
uucp:	...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!myke
Internet:	myke@gatech.edu

aj-mberg@dasys1.UUCP (Micha Berger) (04/25/89)

Hans Schermer writes:

>Can anyone out there give me a hand?
>I am looking for philosophical papers, books or articles, with reactions
>to connectionism as a model for the mind. 


I didn't think connectionism tries to model the mind; I rather thought it
models the brain. An actual working model of the mind doesn't necessarily
require modelling the brain. Psychoanalysts have been doing it (with
varying success) for years.
One could make a connectionist model of the mind if, somehow, one could map
cognitive function (left intentionally ambiguous) to a set of connection
weights.
-- 
					Micha Berger				     
Disclaimer: All opinions expressed here are my own. The spelling, noone's.
email: ...!cmcl2!hombre!dasys1!aj-mberg	       Aspaklaria Publications
  vox: (718) 380-7572			       73-32 173 St, Hillcrest, NY 11366

aarons@syma.sussex.ac.uk (Aaron Sloman) (05/01/89)

hans@eurtrx.UUCP (Hans Schermer) writes:

> Date: 20 Apr 89 10:40:46 GMT
>
> Can anyone out there give me a hand?
> I am looking for philosophical papers, books or articles, with reactions
> to connectionism as a model for the mind.
> I would be interested in texts discussing representationalism, (sub)symbolic
> representations, materialism, and other philosophical subjects that could be
> influenced by a connectionist theory of the mind.
> .....etc.....

Here's my pennyworth.

I am amazed when people try to produce philosophical arguments to
show that connectionist models are superior, or inferior, to other
kinds of AI models of mental processes.

Instead of getting involved in these silly disputes, people should
try to understand the rich multiplicity of function of the human
mind and try to see what kinds of architectures might account for
that multiplicity, and what kinds of mechanisms are capable of
fitting in to those architectures in order to fulfil the roles
required.

For example the mechanisms required for low level vision are likely
to be somewhat different from the mechanisms used in multiplying 395
by 11 in your head. Both are likely to be different from (though
they may overlap with) the mechanisms involved in associative
retrieval of stored memories on the basis of partial matches ("Suzie
had a little goat" Yes? No? Who had what then?) Then there is our
ability to store and retrieve intricate detail exactly, as when we
memorize a long poem or a piano sonata.

Different again must be the mechanisms by which new motives
(desires, fears, wishes, and the like) are generated (by physical
needs, by perceiving something in the environment, by thinking about
past events or future possibilities etc). These motives, in turn,
interact in many intricate ways with other motives, beliefs,
percepts, personality traits, etc. Some, but not all,
motives (desires) become intentions. ("Yes, I will try to get ...."
or "That is very tempting, but I mustn't...").

Planning processes sometimes arise out of intentions ("Now, how can
I get that box open. Perhaps I can borrow a crow-bar from Jim,
though I'll have to offer him something in return, he's so
mean...Now where can I find him. His wife will know..."). But
sometimes intentions directly interact with percepts to generate
behaviour controlled by tight feedback loops (like bringing your car
to a gentle stop just at the traffic lights).

Some kinds of abilities seem to encompass a finite or fixed
dimensional range of possibilities (e.g. the set of ways of moving
your arm so that your forefinger moves quickly in a smooth path from
touching one thing to touching another?) whereas other abilities
involve a kind of generative competence that implies unbounded
complexity, at least in principle, (e.g. the set of algebraic
expressions you can evaluate).

There are very many different kinds of learning, training,
development, improvement. Some kinds of actions can be achieved
perfectly once you know what to do (long division). Others require
training or tuning of low level mechanisms, in ways that are very
hard to understand (coaxing a beautiful tone out of a violin).

Some things are inaccessible to consciousness normally yet can
become accessible after appropriate training, such as the use of
grammatical categories in producing or understanding language. (One
kind of philosophical training is concerned with this kind of
heightened awareness. Compare learning phonetics.)

We can do some things in parallel (walking and talking, listening
and looking, enjoying a meal and a view, or seeing the different
ballet dancers that form an intricate and changing pattern), yet
others are difficult or impossible, like reciting two poems in your
head at once.

Some things are easily reversed (sing a high note and swoop down to
a low note - then do it in reverse) but others not (recite a poem
then say it backwards).

Some kinds of mental processes are transformed by alcohol and other
drugs, and some not. E.g. alcohol (in relatively small doses) may
alter what you will agree to do, but it probably won't change the
semantic interpretation you give to "The cat sat on the mat".

There are many far more detailed requirements for explanatory
mechanisms. It seems to me absurd to argue over whether either
connectionist models or conventionalist AI models provide better
theories of the nature of mind when it is patently clear both are
still miles away from accounting for more than highly simplified
versions of tiny fragments of human ability.

Instead of silly squabbles we need to work both top-down (collecting
requirements for adequate models and explanations), and bottom up
(trying to investigate different kinds of mechanisms and finding out
what can and cannot be achieved by putting them together in
different ways).

It seems very likely that the final story (if we ever find it) will
involve many different kinds of mechanisms put together in a complex
variety of ways. Attempts to do it all using one kind of technique
(Production systems, Logic, PDP mechanisms) will then just look
silly.


Aaron Sloman,
School of Cognitive and Computing Sciences,
Univ of Sussex, Brighton, BN1 9QN, England
    INTERNET: aarons%uk.ac.sussex.cogs@nsfnet-relay.ac.uk
              aarons%uk.ac.sussex.cogs%nsfnet-relay.ac.uk@relay.cs.net
    JANET     aarons@cogs.sussex.ac.uk
    BITNET:   aarons%uk.ac.sussex.cogs@uk.ac
        or    aarons%uk.ac.sussex.cogs%ukacrl.bitnet@cunyvm.cuny.edu
    UUCP:     ...mcvax!ukc!cogs!aarons
            or aarons@cogs.uucp

coggins@coggins.cs.unc.edu (Dr. James Coggins) (05/03/89)

> Can anyone out there give me a hand?
> I am looking for philosophical papers, books or articles, with reactions
> to connectionism as a model for the mind.
> I would be interested in texts discussing representationalism, (sub)symbolic
> representations, materialism, and other philosophical subjects that could be
> influenced by a connectionist theory of the mind.
> .....etc.....

My assessment of the neural net area is as follows:
(consider these Six Theses nailed to the church door)

1. NNs are a parallel implementation technique that shows promise for
making perceptual processes run in real time.

2. There is nothing in the NN work that is fundamentally new except
as a fast implementation.  Their ability to learn incrementally from
a series of samples is nice but not new.  The way they learn and make
decisions is decades old and first arose in communication theory,
then was further developed in statistical pattern recognition.

3. The claims that NNs are fundamentally new are founded on ignorance
of statistical pattern recognition or on simplistic views of the
nature of statistical pattern recognition.  I have heard supposedly
competent people working in NNs claim that statistical pattern
recognition is based on assumptions of Gaussian distributions which
are not required in NNs, therefore NNs are fundamentally different.
This is ridiculous.  Statistical pattern recognition is not bound to
Gaussians, and NNs do, most assuredly, incorporate distributional
assumptions in their decision criteria. 
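
To see how little distributional machinery is involved, here is a toy
sketch (mine, not from any NN paper) of a single-layer "neural" classifier
that is exactly the classical minimum-distance-to-class-mean rule,
rewritten as weights and biases:

    import numpy as np

    # Class means estimated from data, as in classical statistical
    # pattern recognition (no Gaussian assumption required).
    mu = np.array([[0.0, 0.0],
                   [3.0, 3.0]])            # one mean per class

    # argmin_k ||x - mu_k||^2  ==  argmax_k (mu_k . x - ||mu_k||^2 / 2),
    # i.e. a single-layer linear net with weights mu_k, bias -||mu_k||^2/2.
    W = mu.T
    b = -0.5 * np.sum(mu**2, axis=1)

    def classify(x):
        return int(np.argmax(x @ W + b))   # the "neuromorphic" classifier

    for x in (np.array([0.2, -0.1]), np.array([2.5, 3.1])):
        nearest = int(np.argmin(((x - mu) ** 2).sum(axis=1)))
        assert classify(x) == nearest      # identical decisions
        print(classify(x))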

4. A more cynical view that I do not fully embrace says that the main
function of "Neural Networks" is as a label for money.  It is a flag
you wave to attract money dispensed by people who are interested in
the engineering of real-time perceptual processing and who are
ignorant of statistical pattern recognition and therefore the lack of
substance of the neural net field.

5. Neural nets raise lots of engineering questions but little science.
Much of the excitement they have raised is based on uncritical
acceptance of "neat" demos and ignorance. As such, the area resembles
a religion more than a science.  

6. The "popularity" of neural net research is a consequence of the
miserable mathematical backgrounds of computer science students (and
some professors!).  You don't need to know any math to be a hacker, but
you have to know math and statistics to work in statistical pattern
recognition.  Thus, generations of computer science students are
susceptible to hoodwinking by neat demos based on simple mathematical
and statistical techniques that incorporate some engineering hacks
that can be tweaked forever.  They'll think they are accomplishing
something by their endless tweaking because they don't know enough
math and statistics to tell what's really going on.

---------------------------------------------------------------------
Dr. James M. Coggins          coggins@cs.unc.edu
Computer Science Department   A neuromorphic minimum distance classifier!
UNC-Chapel Hill               Big freaking hairy deal.
Chapel Hill, NC 27599-3175                -Garfield the Cat
and NASA Center of Excellence in Space Data and Information Science
---------------------------------------------------------------------
 

pastor@bigburd.PRC.Unisys.COM (Jon Pastor) (05/03/89)

In article <935@syma.sussex.ac.uk> aarons@syma.sussex.ac.uk (Aaron Sloman) writes:
>
>Here's my pennyworth.
>   [...lots of cogent remarks omitted...]
>It seems to me absurd to argue over whether either
>connectionist models or conventionalist AI models provide better
>theories of the nature of mind when it is patently clear both are
>still miles away from accounting for more than highly simplified
>versions of tiny fragments of human ability.
>
>Instead of silly squabbles we need to work both top-down (collecting
>requirements for adequate models and explanations), and bottom up
>(trying to investigate different kinds of mechanisms and finding out
>what can and cannot be achieved by putting them together in
>different ways).
>
>It seems very likely that the final story (if we ever find it) will
>involve many different kinds of mechanisms put together in a complex
>variety of ways. Attempts to do it all using one kind of technique
>(Production systems, Logic, PDP mechanisms) will then just look
>silly.
>
>
I was going to send e-mail directly to Aaron, but I believe that his
message is so critical, and I am so grateful to him for expending the
effort that it quite obviously took to prepare it, that I wanted to thank
him publicly.

The arguments one hears about the relative merits of this or that
computational model (architecture, paradigm, programming language, etc.)
tend to generate considerable friction, and thus much heat -- but not much
light.  As Aaron notes, it is inconceivable that any single AI model will
provide adequate support for understanding human cognition, or even the
relatively simpler -- but still staggeringly difficult -- problem of
building intelligent systems, even in limited domains.  I will add to
Aaron's comments the observation that proponents of various schools of
thought act as though their objectives are the only ones that matter.  For
example, biological plausibility is critical in Neural Networks if and only
if you are using the NNs to model, and thus understand, human cognition;
those of us who are primarily interested in using NNs as a computational
tool are not concerned with biological plausibility except from an abstract
intellectual perspective -- we want to know whether the NN will do the
tasks we want it to do, and how well.  

Similarly, mathematical rigor (e.g., proofs of convergence) is undeniably a
Good Thing, but many of us got into AI because rigorous solution techniques
often require assumptions and restrictions that do not hold in the real world.
For example, Expected Utility Theory presumes rationality on the part of the
decision-maker, but there is ample and incontrovertible evidence that
decision-makers depart from the rationality axioms in numerous ways
(intransitivity of preference, inability to make objective probability
judgements, inability to cope with problems of high dimensionality,
sensitivity to time and other pressures, etc.).  I don't think that it's an
accident that I've seen the same heuristic search techniques applied in
Symbolic AI, Connectionist AI, and Operations Research; we're all trying to
solve problems that do not have closed-form solutions, and for which
analytic solutions are generally intractable.  Nor do I think that it's
coincidental that the most widely-used tool in each of these three disciplines 
(forward-chaining rule-based systems, BP with some form of steepest-descent 
in the error space, and Linear Programming, respectively) has no convergence
proof -- but works quite well in practice.  Only a fool would claim that
mathematical rigor is unimportant, but practitioners will gladly use a tool
that has strong empirical support while the theoreticians continue looking
for formal results vindicating the empirical results.


Thank you, Aaron, for stating the case for eclecticism, and against
parochialism, so eloquently.  I hope that future discussions in this
newsgroup will take heed of his message that all of us need to understand
and use each other's tools, and that squabbles about the relative merits of
one tool over another without regard for the problem at hand are not
constructive, instructive, or good science.

ian@omega (Ian Parberry) (05/04/89)

In article <10139@burdvax.PRC.Unisys.COM> pastor@bigburd.PRC.Unisys.COM (Jon Pastor) writes:

>Similarly, mathematical rigor (e.g., proofs of convergence) is undeniably a
>Good Thing, but many of us got into AI because rigorous solution techniques
>often require assumptions and restrictions that do not hold in the real world.

> Only a fool would claim that
>mathematical rigor is unimportant, but practitioners will gladly use a tool
>that has strong empirical support while the theoreticians continue looking
>for formal results vindicating the empirical results.
>

I want to put in a plug for theoreticians here.  I see too much theory-bashing
at neural network conferences.  I am told that this is the legacy of Minsky
and Papert.  I have heard "distinguished" neural networkers say things like
(I'm paraphrasing from a faulty memory here) "we don't need theory",
"combinatorics is harder than analysis", "those discrete theory people don't
know what they're doing", even in invited talks.  This does not set a good
example.

I am a theoretician.  I believe that theory and practice are complementary.
Experiments may show that there is something deep going on.  Theory tries to
explain it.  Theoretical results steer future experiments (pruning the search
tree, if you like).  Each side feeds the other.

Theory is hard.  We make assumptions and restrictions because the mathematical
tools do not yet (and may never) exist for dealing with the real problems.
This doesn't mean that theory is impractical.  Experimentation is hard too,
but each takes some of the uncertainty out of the other.  Experiments tell
theoreticians what to try to prove (and sometimes how to go about it).  Theory
tells experimenters what approaches are more likely to be profitable than
others.

Some people like to do theory, and some hate it.  That's fine.  But each
side should at least learn what the other side is doing, and what it means.

pastor@bigburd.PRC.Unisys.COM (Jon Pastor) (05/08/89)

In article <4548@psuvax1.cs.psu.edu> ian@theory.cs.psu.edu (Ian Parberry) writes:
>In article <10139@burdvax.PRC.Unisys.COM> pastor@bigburd.PRC.Unisys.COM (Jon Pastor) writes:
>
>>Similarly, mathematical rigor (e.g., proofs of convergence) is undeniably a
>>Good Thing, but many of us got into AI because rigorous solution techniques
>>often require assumptions and restrictions that do not hold in the real world.
>
>> Only a fool would claim that
>>mathematical rigor is unimportant, but practitioners will gladly use a tool
>>that has strong empirical support while the theoreticians continue looking
>>for formal results vindicating the empirical results.
>>
>
>I want to put in a plug for theoreticians here.  I see too much theory-bashing
>at neural network conferences.  

I can see how you might have drawn the inference that I am theory-bashing;
please rest assured that this is not the case, and I am sorry to have given
that impression, if I did.  What I was bashing was the notion that if it
can't be proven, it's useless, which is also prevalent in some circles (e.g.,
the biological plausibility issue, the absence of convergence proofs for
BP).  

>I am a theoretician.  I believe that theory and practice are complementary.

I agree wholeheartedly with this and essentially everything else you said.
We all need each other, and neither the theoretical nor the applied 
researcher can afford to ignore or reject the other.

hkhenson@cup.portal.com (H Keith Henson) (05/20/89)

This may be entirely redundant to these groups, but two books which 
strongly support Aaron Sloman's views (aarons@cogs.sussex.ac.uk) are
_The Social Brain_ by Michael Gazzaniga, and (of course) _The Society of
Mind_ by Marvin Minsky.  If there are others, or articles, I would
appreciate email or postings.  Keith Henson (hkhenson@cup.portal.com)