[comp.ai] The symbol grounding problem

harnad@mind.UUCP (Stevan Harnad) (05/09/87)

To define a SUBsymbolic "level" rather than merely a NONsymbolic
process or phenomenon one needs a formal justification for the implied
up/down-ness of the relationship. In the paradigm case -- the
hardware/software distinction and the hierarchy of compiled
programming languages -- the requisite formal basis for the hierarchy is
quite explicit. It is the relation of compilation and implementation.
Higher-level languages are formally compiled into lower level ones
and the lowest is implemented as instructions that are executed by a
machine. Is there anything in the relation of connectionist processes
to symbolic ones that justifies calling the former "sub"-symbolic in
anything other than a hopeful metaphorical sense at this time?

The fact that IF neural processes are really connectionistic (an
empirical hypothesis) THEN connectionist models are implementable in
the brain defines a super/sub relationship between connectionist
models and neural processes (conditional, of course, on the validity
-- far from established or even suggested by existing evidence -- of
the empirical hypothesis), but this would still have no bearing on
whether connectionism can be considered to stand in a sub/super relationship
to a symbolic "level." There is of course also the fact that any discrete
physical process is formally equivalent in its input/output relations
to some Turing machine state, i.e., some symbolic state. But that would
make every such physical process "subsymbolic," so surely Turing
equivalence cannot be the requisite justification for the putative
subsymbolic status of connectionism in particular.

A fourth sense of down-up (besides hardware/software, neural
implementability and Turing-equivalence) is psychophysical
down-upness. According to my own bottom-up model, presented in the book I
just edited (Categorical Perception, Cambridge University Press 1987),
symbols can be "grounded" in nonsymbolic representations in the
following specific way:

Sensory input generates (1) iconic representations -- continuous,
isomorphic analogs of the sensory surfaces. Iconic representations
subserve relative discrimination performance (telling pairs of things
apart and judging how similar they are).

Next, constraints on categorization (e.g., either natural
discontinuities in the input, innate discontinuities in the internal
representation, or, most important, discontinuities *learned* on the
basis of input sampling, sorting and labeling with feedback) generate
(2) categorical representations -- constructive A/D filters which preserve
the invariant sensory features that are sufficient to subserve reliable
categorization performance. [It is in the process of *finding* the
invariant features in a given context of confusable alternatives that I
believe connectionist processes may come in.] Categorical
representations subserve identification performance (sorting things
and naming them).

Finally, the *labels* of these labeled categories -- now *grounded*
bottom/up in nonsymbolic representations (iconic and categorical)
derived from sensory experience -- can then be combined and recombined
in (3) symbolic representations of the kind used (exclusively, and
without grounding) in contemporary symbolic AI approaches. Symbolic
representations subserve natural language and all knowledge and
learning by *description* as opposed to direct experiential
acquaintance.
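
To make the three levels concrete, here is a toy sketch (in Python; every
particular, such as the feature detectors, the numbers and the "IS-NEXT-TO"
relation, is my own illustrative invention and not part of the model):

# Toy sketch of the three representational levels (illustrative only).
# IR: an analog-like copy of the sensory projection (subserves discrimination).
# CR: a selective filter keeping only features that reliably sort inputs
#     into labeled categories (subserves identification).
# SR: strings of category labels that can be combined and recombined.

from dataclasses import dataclass
from typing import Callable, List

Sensory = List[float]            # stand-in for a raw sensory projection

@dataclass
class IconicRep:
    trace: Sensory               # here simply a verbatim copy of the input

    def similarity(self, other: "IconicRep") -> float:
        # discrimination: compare whole configurations, no feature selection
        return -sum((a - b) ** 2 for a, b in zip(self.trace, other.trace))

@dataclass
class CategoricalRep:
    label: str
    detector: Callable[[Sensory], bool]   # invariance filter for the category

def identify(x: Sensory, crs: List[CategoricalRep]) -> str:
    # identification: apply the category filters and return a label
    for cr in crs:
        if cr.detector(x):
            return cr.label
    return "unknown"

def describe(x: Sensory, y: Sensory, crs: List[CategoricalRep]) -> str:
    # symbolic level: grounded labels combined by rule into a description
    return identify(x, crs) + " IS-NEXT-TO " + identify(y, crs)

crs = [CategoricalRep("bright", lambda s: sum(s) / len(s) > 0.5),
       CategoricalRep("dark", lambda s: sum(s) / len(s) <= 0.5)]
a, b = [0.9, 0.8, 0.7], [0.1, 0.2, 0.0]
print(IconicRep(a).similarity(IconicRep(b)))   # discrimination
print(describe(a, b, crs))                     # "bright IS-NEXT-TO dark"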

In response to my challenge to justify the "sub" in "subsymbolic" when
one wishes to characterize connectionism as subsymbolic rather than
just nonsymbolic, rik%roland@sdcsvax.ucsd.edu (Rik Belew) replies:

>	I do intend something more than non-symbolic when I use the term
>	sub-symbolic. I do not rely upon "hopeful neural analogies" or any
>	other form of hardware/software distinction. I use "subsymbolic"
>	to refer to a level of representation below the symbolic
>	representations typically used in AI... I also intend to connote
>	a supporting relationship between the levels, with subsymbolic
>	representations being used to construct symbolic ones (as in subatomic).

The problem is that the "below" and the "supporting" are not cashed
in, and hence just seem to be synonyms for "sub," which remains to
be justified. An explicit bottom-up hypothesis is needed to
characterize just how the symbolic representations are constructed out
of the "subsymbolic" ones. (The "subatomic" analogy won't do,
otherwise atoms risk becoming subsymbolic too...) Dr. Belew expresses
some sympathy for my own grounding hypothesis, but it is not clear
that he is relying on it for the justification of his own "sub."
Moreover, this would make connectionism's subsymbolic status
conditional on the validity of a particular grounding hypothesis
(i.e., that three representational levels exist as I described them,
in the specific relation I described, and that connectionistic
processes are the means of extracting the invariant features underlying
the categorical [subsymbolic] representation). I would of course be
delighted if my hypothesis turned out to be right, but at this point
it still seems a rather risky "ground" for justifying the "sub" status of
connectionism.

>	my interest in symbols began with the question of how a system might
>	learn truly new symbols. I see nothing in the traditional AI
>	definitions of symbol that helps me with that problem.

The traditional AI definition of symbol is simply arbitrary formal
tokens in a formal symbol system, governed by formal syntactic rules
for symbol manipulation. This general notion is not unique to AI but
comes from the formal theory of computation. There is certainly a
sense of "new" that this captures, namely, novel recombinations of
prior symbols, according to the syntactic rules for combination and
recombination. But that sense is surely too vague and general to capture, say,
the human senses of symbol and new-symbol. In my model this combinatorial
property does make the production of new symbols possible, in a sense.
But combinatorics is limited by several factors. One factor is the grounding
problem, already discussed (symbols alone just generate an ungrounded,
formal syntactic circle that there is no way of breaking out of, just as
in trying to learn Chinese from a Chinese-Chinese dictionary alone). Other
limiting factors on combinatorics are combinatory explosion, the frame problem,
the credit assignment problem and all the other variants that I have
conjectured to be just different aspects of the problem of the
*underdetermination* of theory by data. Pure symbol combinatorics
certainly cannot contend with these. The final "newness" problem is of
course that of creativity -- the stuff that, by definition, is not
derivable by some prior rule from your existing symbolic repertoire. A
rule for handling that would be self-contradictory; the real source of
such newness is probably partly statistical, and again connectionism may
be one of the candidate components.

>	It seems very conceivable to me that the critical property we will
>	choose to ascribe to computational objects in our systems as symbols
>	is that we (i.e., people) can understand their semantic content.

You are right, and what I had inadvertently left out of my prior
(standard) syntactic definition of symbols and symbol manipulation was
of course that the symbols and manipulations must be semantically
interpretable. Unfortunately, so far that further fact has only led to
Searlian mysteries about "intrinsic" vs. "derived intentionality" and
scepticism about the possibility of capturing mental processes
with computational ones. My grounding proposal is meant to answer
these as well.

>	the fact that symbols must be grounded in the *experience* of the
>	cognitive system suggests why symbols in artificial systems (like
>	computers) will be fundamentally different from those arising in
>	natural systems (like people)... if your grounding hypothesis is
>	correct (as I believe it is) and the symbols thus generated are based
>	in a fundamental way on the machine's experience, I see no reason to
>	believe that the resulting symbols will be comprehensible to people.
>	[e.g., interpretations of hidden units... as our systems get more
>	complex]

This is why I've laid such emphasis on the "Total Turing Test": toy
models and modules, based on restricted data and performance
capacities, may simply not be representative of and comparable to
organisms' complexly interrelated robotic and symbolic
functional capacities. The experiential base -- and, more
important, the performance capacity -- must be comparable in a viable
model of cognition. On the other hand, the "experience" I'm talking
about is merely the direct (nonsymbolic) sensory input history, *not*
"conscious experience." I'm a methodological epiphenomenalist on
that. And I don't understand the part about the comprehensibility of
machine symbols to people. This may be the ambiguity of the symbolic
status of putative "subsymbolic" representations again.

>	The experience lying behind a word like "apple" is so different
>	for any human from that of any machine that I find it very unlikely
>	that the "apple" symbol used by these two systems will be comparable.

I agree. But this is why I proposed that a candidate device must pass
the Total Turing Test in order to capture mental function.
Arbitrary pieces of performance could be accomplished in radically different
ways and would hence be noncomparable with our own.

>	Based on the grounding hypothesis, if computers are ever to understand
>	NL as fully as humans, they must have an equally vast corpus of
>	experience from which to draw. We propose that the huge volumes of NL
>	text managed by IR systems provide exactly the corpus of "experience"
>	needed for such understanding. Each word in every document in an IR
>	system constitutes a separate experiential "data point" about what
>	that word means. (We also recognize, however, that the obvious
>	differences between the text-base "experience" and the human
>	experience also implies fundamental limits on NL understanding
>	derived from this source.)... In this application the computer's
>	experience of the world is second-hand, via documents written by
>	people about the world and subsequently through users' queries of
>	the system

We cannot be talking about the same grounding hypothesis, because mine
is based on *direct sensory experience* ("learning by acquaintance")
as opposed to the symbol combinations ("learning by description"),
with which it is explicitly contrasted, and which my hypothesis
claims must be *grounded* in the former. The difference between
text-based and sensory experience is crucial indeed, but for both
humans and machines. Sensory input is nonsymbolic and first-hand;
textual information is symbolic and second-hand. First things first.

>	I'm a bit worried that there is a basic contradiction in grounded
>	symbols. You are suggesting (and I've been agreeing) that the only
>	useful notion of symbols requires that they have "inherent
>	intentionality": i.e., that there is a relatively direct connection
>	between them and the world they denote. Yet almost every definition
>	of symbols requires that the correspondence between the symbol and
>	its referent be *arbitrary*. It seems, therefore, that your "symbols"
>	correspond more closely to *icons* (as defined by Peirce), which
>	do have such direct correspondences, than to symbols. Would you agree?

I'm afraid I must disagree. As I indicated earlier, icons do indeed
play a role in my proposal, but they are not the symbols. They merely
provide part of the (nonsymbolic) *groundwork* for the symbols. The
symbol tokens are indeed arbitrary. Their relation to the world is
grounded in and mediated by the (nonsymbolic) iconic and categorical
representations.

>	In terms of computerized knowledge representations, I think we have
>	need of both icons and symbols...

And reliable categorical invariance filters. And a principled
bottom-up grounding relation among them.

>	I see connectionist learning systems building representational objects
>	that seem most like icons. I see traditional AI knowledge
>	representation languages typically using symbols and indices. One of
>	the questions that most interests me at the moment is the appropriate
>	"ontogenetic ordering" for these three classes of representation.
>	I think the answer would have clear consequences for this discussion
>	of the relationship between connectionist and symbolic representations
>	in AI.

I see analog transformations of the sensory surfaces as the best
candidates for icons, and connectionist learning systems as
possible candidates for the process that finds and extracts the invariant
features underlying categorical representations. I agree about traditional
AI and symbols, and my grounding hypothesis is intended as an answer about
the appropriate "ontogenetic ordering."

>	Finally, this view also helps to characterize what I find missing
>	in most *symbolic* approaches to machine learning: the world
>	"experienced" by these systems is unrealistically barren, composed
>	of relatively small numbers of relatively simple percepts (describing
>	blocks-world arches, or poker hands, for example). The appealing
>	aspect of connectionist learning systems (and other subsymbolic
>	learning approaches...) is that they thrive in exactly those
>	situations where the system's base of "experience" is richer by
>	several orders of magnitude. This accounts for the basically
>	*statistical* nature of these algorithms (to which you've referred),
>	since they are attempting to build representations that account for
>	statistically significant regularities in their massive base of
>	experience.

Toy models and microworlds are indeed barren, unrealistic and probably
unrepresentative. We should work toward models that can pass the Total
Turing Test. Invariance-detection under conditions of high
interconfusability is indeed the problem of a device or organism that
learns its categories from experience. If connectionism turns out to
be able to do this on a life-size scale, it will certainly be a
powerful candidate component in the processes underlying our
representational architecture, especially the categorical level. What
that architecture is, and whether this is indeed the precise
justification for connectionism's "sub" status, remains to be seen.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

harnad@mind.UUCP (05/20/87)

John X. Laporta <rutgers!mit-eddie!apollo!laporta> Apollo Computer,
Chelmsford, MA wrote:

>	You say that symbols are grounded in nonsymbolic sensory input.
>	You propose a model of segmentation... by which discontinuities
>	in the input map to segment boundaries... I wonder what you do with
>	the problem of segmentation of the visual spectrum.
>	...spectral segmentations differ widely across cultures.
>	The problem is that these breaks and their number vary widely...
>	what system intervenes to choose the set a particular culture favors
>	and asserts as obvious? What is the filter in the A/D converter?

More recent evidence seems to suggest that color segmentation does not
vary nearly as widely as had been believed (see M. Bornstein's work). There
may be some variability in the tuning of color boundaries, and some
sub-boundaries may be added sometimes, but the focal colors are governed by our
innate color receptor apparatus and they seem to be universal. The
partial flexibility of the boundaries -- short and long term -- must
be governed by learning, and the learning must consist of readjustment
of boundary locations as a function of color naming experience and
feedback, or perhaps even the formation of new sub-boundaries where
there are none. The innate color-detector mechanism would be the A/D
filter in the default case, and learning may set some of the boundary
fine-tuning parameters.

The really interesting case, though, and one that has not been tested
directly yet, is the one where boundary formation occurs de novo purely
as a result of learning. This does not happen with evolutionarily "prepared"
categories such as colors (although it may have happened in phylogeny),
but it may happen with arbitrary learned ones (e.g., perhaps musical
semitones). Here the A/D filter would be acquired from categorization
training alone: labeling with feedback. In simple one-dimensional continua,
what would be acquired would simply be some sort of a threshold
detector, but with more complex multidimensional stimuli the
feature-filter would have to be constructed by a more active inductive
process. This may be where connectionist algorithms come in.
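
For the one-dimensional case, here is a minimal sketch of what such
acquisition might look like (the stimulus dimension, the learning rule and
all the numbers are illustrative assumptions of mine, not a claim about the
actual mechanism):

import random

# Toy acquisition of a category boundary on a 1-D continuum from
# labeling with feedback alone.
random.seed(0)
TRUE_BOUNDARY = 0.37        # the environment's boundary, unknown to the learner

def feedback(x, guess):
    # correct/incorrect signal, as in supervised categorization training
    return guess == ("high" if x >= TRUE_BOUNDARY else "low")

threshold = 0.5             # the learner's initial threshold detector
rate = 0.05

for _ in range(2000):
    x = random.random()                         # sample a stimulus
    guess = "high" if x >= threshold else "low"
    if not feedback(x, guess):
        # nudge the boundary toward the misclassified stimulus
        threshold += rate * (x - threshold)

print("learned boundary ~ %.3f (target %.2f)" % (threshold, TRUE_BOUNDARY))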

Another important factor in the selectivity of the A/D feature-filter
is the "context" of alternatives: the sample of confusable members and
nonmembers of the categories in question on the basis of which the
features must be extracted; these also focus the uncertainty that the
filter must resolve if it is to generate reliable categorization
performance.

All this is described in the book under discussion (Categorical
Perception: The Groundwork of Cognition, Cambridge University Press
1987, S. Harnad, Ed.).

-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

harnad@mind.UUCP (Stevan Harnad) (05/22/87)

This is part 1 of a response to a longish exchange from
Rik Belew <rik%roland@SDCSVAX.UCSD.EDU> who asks:

>	... [1] what evidence causes you to postulate iconic and categorical
>	representations as being distinct?... Apart from a relatively few
>	cognitive phenomena (short-term sensory storage, perhaps mental
>	imagery), I am aware of little evidence of "continuous, isomorphic
>	analogues of the sensory surfaces" [your "iconic" representations].
>	[2] I see great difficulty in distinguishing between such
>	representations and "constructive A/D filters [`categorical'
>	representations] which preserve the invariant sensory features" based
>	simply on performance at any particular task. More generally, could
>	you [3] motivate your ``subserve'' basis for classifying cognitive 
>	representations.

[1] First of all, short-term sensory storage does not seem to constitute
*little* evidence but considerable evidence. The tasks we can perform
after a stimulus is no longer present (such as comparing and matching)
force us to infer that there exist iconic traces. The alternative
hypothesis that the information is already a symbolic description at
this stage is simply not parsimonious and does not account for all the
data (e.g., Shepard's mental rotation effects). These short-term
effects do suggest that iconic representations may only be temporary
or transient, and that is entirely compatible with my model. Something
permanent is also going on, however, as the sensory exposure studies
suggest: Even if iconic traces are always stimulus-bound and
transient, they seem to have a long-term substrate too, because their
acuity and reliability increase with experience.

I would agree that the subjective phenomenology of mental imagery is very
weak evidence for long-term icons, but successful performance on some
perceptual tasks drawing on long-term memory is at least as economically
explained by the hypothesis that the icons are still accessible as by the
alternative that only symbolic descriptions are being used. In my
model, however, most long-term effects are mediated by the categorical
representations rather than the iconic ones. Iconic representations
are hypothesized largely to account for short-term perceptual
performance (same/difference judgment, relative comparisons,
similarity judgments, mental rotation, etc.). They are also, of
course, more compatible with subjective phenomenology (memory images
seem to be more like holistic sensory images than like selective
feature filters or symbol strings).

[2] The difference between isomorphic iconic representations (IRs)
and selective invariance filters (categorical representations, CRs)
is quite specific, although I must reiterate that CRs are really a
special form of "micro-icon." They are still sensory, but they are
selective, discarding most of the sensory variation and preserving
only the features that are invariant *within a specific context of
confusable alternatives*. (The key to my approach is that identifying
or categorizing something is never an *absolute* task but a relative,
context-dependent one: "What's that?" "Compared to What?") The only
"features" preserved in a CR are the ones that will serve as a reliable
basis for sorting the instances one has sampled into their respective
categories (as learned from feedback indicating correct or incorrect
categorizing). The "context" (of confusable alternatives), however, is
not a short-term phenomenon. Invariant features are provisional, and
always potentially revisable, but they are parts of a stable,
long-term category-representational system, one that is always being
extended and updated on the basis of new categorization tasks and
samples. It constitutes an ever-tightening approximation.

So the difference between IRs and CRs ("constructive A/D filters") is
that IRs are context-independent, depending only on the
comparison of raw sensory configurations and on any transformations that
rely on isomorphism with the unfiltered sensory configuration, whereas
CRs are context-dependent and depend on what confusable alternatives
have been sampled and must then be reliably identified in
isolation. The features on which this successful categorization is based
cannot be the holistic configural ones, which blend continuously into
one another; they are features specifically selected and abstracted to
subserve reliable categorization (within the context of alternatives
sampled to date). They may even be "constructive" features, in the sense
that they are picked out by performing an active operation -- sensory,
comparative or even logical -- on the sensory input. Apart from this invariant
basis for categorization (let's call these selectively abstracted features
"micro-iconic") all the rest of the iconic information is discarded from the
category filter.
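
A toy illustration of this context-dependence (the stimuli, features and
categories below are invented purely for illustration):

# The features a categorical representation keeps depend on the context of
# confusable alternatives, not on the object alone.
# Each stimulus is a pair: (redness, size_in_cm).
APPLES   = [(0.9, 8.0), (0.8, 7.5)]
BANANAS  = [(0.1, 19.0), (0.2, 20.0)]
CHERRIES = [(0.9, 2.0), (0.95, 2.5)]

def invariant_features(members, nonmembers):
    # keep the feature dimensions that by themselves separate the two samples
    keep = []
    for d, name in enumerate(("redness", "size")):
        m_lo = min(s[d] for s in members); m_hi = max(s[d] for s in members)
        n_lo = min(s[d] for s in nonmembers); n_hi = max(s[d] for s in nonmembers)
        if m_hi < n_lo or n_hi < m_lo:          # the ranges do not overlap
            keep.append(name)
    return keep

# Same category ("apple"), two different contexts of confusable alternatives:
print(invariant_features(APPLES, BANANAS))   # ['redness', 'size']
print(invariant_features(APPLES, CHERRIES))  # ['size']: redness no longer helps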

[3] Having said all this, it is easy to motivate my "subserve" as you
request: IRs are the representations that subserve ( = are required in
order to generate successful performance on) tasks that call for
holistic sensory comparisons and isomorphic transformations of
the unfiltered sensory trace (e.g., discrimination, matching,
similarity judgment) and CRs are the representations required to
generate successful performance on tasks that call for reliable
identification of confusable alternatives presented in isolation. As a
bonus, the latter provide the grounding for a third representational
system, symbolic representations (SRs), whose elementary symbols are
the labels of the bounded categories picked out by the CRs and
"fleshed out" by the IRs. These elementary symbols can then be
rulefully combined and recombined into symbolic descriptions which, in
virtue of their reducibility to grounded nonsymbolic representations,
can now refer to, describe, predict and explain objects and events in
the world.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

harnad@mind.UUCP (Stevan Harnad) (05/22/87)

Rik Belew <rik%roland@SDCSVAX.UCSD.EDU> writes:

>	I use ``icon'' to mean much the same as your ``categorical
>	representations''... their direct, albeit statistical,
>	relationship with sensory features... distinguishes icons from
>	``symbols'', which are representations without structural
>	correspondence with the environment.

The criterion for being iconic is physical isomorphism
( = "structural correspondence"). This means that the relationship
between an object and its icon must be a physically invertible
(analog) transformation. In my model, iconic representations
are isomorphic with the unfiltered sensory projection of the
input they represent, whereas categorical representations
are only isomorphic with selected features of the input.
In that sense they are "micro-iconic." The important point is
that they are selective and based on abstracting some features and
discarding all the rest. The basis of selection is: "What features do
I need in order to categorize this input correctly, relative to other
confusable alternatives I have encountered and may encounter in the
future?" To call the input an "X" on the basis of such a selective,
context-governed feature filter, however, is hardly to say that one
has an "icon" of an "X" in the same sense that iconic representations
are icons of input sensory projections. The "structural
correspondence" is only with the selected features, not with the "object"
being named.
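
A toy numerical illustration of the criterion (all values invented): an
iconic representation is an invertible transform of the whole sensory
projection, while a categorical one preserves, and so is "iconic" of, only
the features it selects:

sensory = [0.25, 0.75, 0.5]                 # stand-in sensory projection

# Iconic: a one-to-one (invertible) transform; the original is recoverable.
icon = [2.0 * v + 1.0 for v in sensory]
recovered = [(v - 1.0) / 2.0 for v in icon]
assert recovered == sensory

# Categorical: keep only a selected invariant feature; the rest is discarded,
# so the whole projection can no longer be recovered, only the feature can.
selected_feature = max(sensory)             # e.g., peak intensity
print(selected_feature)                     # 0.75; the other values are gone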

On the other hand, the complete absence of any structural
correspondence whatever is indeed what distinguishes both iconic and
categorical representations from symbolic ones. The heart of my symbol
grounding proposal is that in allowing you to speak of (identify,
label, categorize) "X's" at all, categorical representations have
provided you with a set of elementary labels, based on nonsymbolic
representations, that can now ground an otherwise purely syntactic
symbol system in the objects and events to which it refers. Note,
though, that the grounding is a strong constraint, one that renders
the symbolic system no longer the autonomous syntactic module of
conventional AI. The system is hybrid through-and-through. The
relations between the three kinds of representation are not modular but
bottom-up, with the nonsymbolic representations supporting the
symbolic representations' relation to objects. Most of the rules for
symbol binding, etc. are now constrained in ways that depart from the
freedom of ungrounded formal systems.

>	Your, more restricted, notion of ``symbol'' seems to differ in two
>	major respects: its emphasis on the systematicity of symbols; and its
>	use of LABELS (of categories) as the atomic elements.  I accept
>	the systematicity requirement, but I believe your labeling notion
>	confounds several important factors...
>	First, I believe you are using labels to mean POINTERS:
>	computationally efficient references to more elaborate and complete
>	representations... valuable not only for pointing from symbols
>	to icons (the role you intend for labels) but also from one place in
>	the symbolic representation to another...
>	many connectionists have taken this pointer quality to be
>	what they mean by "symbol."

I believe my grounding proposal is a lot more specific than merely a
pointing proposal. Pointing is, after all, a symbol-to-symbol
function. It may get you to an address, but it won't get you from a
word to the nonsymbolic object to which it refers. The labeling
performance that categorical representations subserve, on the other
hand, is an operation on objects in the world. That is why I proposed
grounding elementary symbols in it: Let the arbitrary labels of
reliably sorted object categories be the elementary symbols of the
symbolic system. Such a hybrid system would continue to have most of
the benefits of higher-order systematicity (compositionality), but with
nonsymbolic constraints "weighing down" its elementary terms. Consider
ordinary syntactic constraints to be "top-down" constraints on a
symbol-system. A grounded hybrid system would have "bottom-up"
constraints on its symbol combinations too.
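
To sharpen the contrast, here is a toy construction of my own (not a fixed
proposal): a pointer relates a symbol to more symbols, whereas a grounded
label pairs an arbitrary token with the nonsymbolic categorizer that picks
out its referents from sensory input:

from typing import Callable, Dict, List

Sensory = List[float]

# Pointer: symbol-to-symbol; following it never leaves the symbol system.
lexicon: Dict[str, str] = {"zebra": "horse & striped", "horse": "..."}

def follow(symbol: str) -> str:
    return lexicon.get(symbol, "undefined")

# Grounded label: an arbitrary token paired with a categorizer over inputs.
class GroundedLabel:
    def __init__(self, token: str, detector: Callable[[Sensory], bool]):
        self.token = token        # the token itself stays arbitrary
        self.detector = detector  # the nonsymbolic constraint "weighing it down"

    def applies_to(self, sensory_input: Sensory) -> bool:
        return self.detector(sensory_input)

zebra = GroundedLabel("zebra", lambda s: max(s) - min(s) > 0.5)

print(follow("zebra"))              # yields more symbols
print(zebra.applies_to([0.1, 0.9])) # yields a verdict about a sensory input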

As to the symbolic status of connectionism -- that still seems to be moot.

>	The other feature of your labeling notion that intrigues me is
>	the naming activity it implies.  This is where I see the issues
>	of language as becoming critical. ...truly symbolic representations and
>	language are co-dependent. I believe we agree on this point...
>	true symbol manipulation arose only as a response to language
>	Current connectionist research is showing just how
>	powerful iconic (and perhaps categorical) representations can
>	be... I use the term language broadly, to
>	include the behavior of other animals for example.

Labeling and categorizing is much more primitive than language, and
that's all I require to ground a symbol system. All this calls for is
reliable discrimination and identification of objects. Animals
certainly do it. Machines should be able to do it (although until they
approach the performance capacity of the "Total Turing Test" they may be
doing it modularly in a nonrepresentative way). Language seems to be
more than labeling and categorizing. It also requires *describing*,
and that requires symbol-combining functions that in my model depend
critically on prior labeling and categorizing.

Again, the symbolic/nonsymbolic status of connectionism still seems to
be under analysis. In my model the provisional role of connectionistic
processes is in inducing and encoding the invariant features in the
categorical representation.

>	the aspect of symbols [that] connectionism
>	needs most is something resembling pointers. More elaborate notions of
>	symbol introduce difficult semantic issues of language that can be
>	separated and addressed independently... Without pointers,
>	connectionist systems will be restricted to ``iconic'' representations
>	whose close correspondence with the literal world severely limits them
>	from ``subserving'' most higher (non-lingual) cognitive functioning.

I don't think pointer function can be divorced from semantic issues in
a symbol system. Symbols don't just combine and recombine according to
syntactic rules, they are also semantically interpretable. Pointing is a
symbol-to-symbol relation. Semantics is a symbol-to-object
relationship. But without a semantically interpretable system you
don't have a symbol system at all, so what would be pointing to what?

For what it's worth, I don't personally believe that there is any
point in connectionism's trying to emulate bits and pieces of the
virtues of symbol systems, such as pointing. Symbolic AI's
problem was that it had symbol strings that were interpretable as
"standing for" objects and events, but that relation seemed to be in
the head of the (human) interpreter, i.e., it was derivative, ungrounded.
Except where this could be resolved by brute-force hard-wiring into a
dedicated system married to its peripheral devices, this grounding
problem remained unsolved for pure symbolic AI. Why should
connectionism aspire to inherit it? Sure, having objects around that
you can interpret as standing for things in the world and yet still
manipulate formally is a strength. But at some point the
interpretation must be cashed in (at least in mind-modeling) and then
the strength becomes a weakness. Perhaps a role in the hybrid mediation
between the symbolic and the nonsymbolic is more appropriate for
connectionism than direct competition or emulation.

>	While I agree with the aims of your Total Turing Test (TTT),
>	viz. capturing the rich interrelated complexity characteristic
>	of human cognition, I have never found this direct comparison
>	to human performance helpful.  A criterion of cognitive
>	adequacy that relies so heavily on comparison with humans
>	raises many tangential issues.  I can imagine many questions
>	(e.g., regarding sex, drugs, rock and roll) that would easily
>	discriminate between human and machine. Yet I do not see such
>	questions illuminating issues in cognition. 

My TTT criterion has been much debated on the Net. The short reply is
that the goal of the TTT is not to capture complexity but to capture
performance capacity, and the only way to maximize your confidence
that you're capturing it the right way (i.e., the way the mind does it)
is to capture all of it. This does not mean sex, drugs and rock and
roll (there are people who do none of these). It means (1) formally,
that a candidate model must generate all of our generic performance
capacities (of discriminating, identifying, manipulating and describing
objects and events, and producing and responding appropriately to names
and descriptions), and (2) (informally) the way it does so must be
intuitively indistinguishable from the way a real person does, as
judged by a real person. The goal is asymptotic, but it's
the only one so far proposed that cuts the underdetermination of
cognitive theory down to the size of the ordinary underdetermination of
scientific theory by empirical observations: It's the next best thing
to being there (in the mind of the robot).

>	First, let's do our best to imagine providing an artificial cognitive
>	system (a robot) with the sort of grounding experience you and I both
>	believe necessary to full cognition.  Let's give it video eyes,
>	microphone ears, feedback from its effectors, etc.  And let's even
>	give it something approaching the same amount of time in this
>	environment that the developing child requires...
>	the corpus of experience acquired by such a robot is orders of magnitude
>	more complex than any system today... [yet] even such a complete
>	system as this would have a radically different experience of the
>	world than our own. The communication barrier between the symbols
>	of man and the symbols of machine to which I referred in my last
>	message is a consequence of this [difference].

My own conjecture is that simple peripheral modules like these will *not* be
enough to ground an artificial cognitive system, at least not
enough to make any significant progress toward the TTT. The kind of
grounding I'm proposing calls for nonsymbolic internal representations
of the kind I described (iconic representations [IRs] and categorical
representations [CRs]), related to one another and to input and output in
the way I described. The critical thing is not the grounding
*experience*, but what the system can *do* with it in order to
discriminate and identify as we do. I have hypothesized that it must have
IRs and CRs in order to do so. The problem is not complexity (at least
not directly), but performance capacity, and what it takes to generate
it. And the only relevant difference between contemporary machine
models and people is not their *experience* per se, but their
performance capacities. No model comes close. They're all
special-purpose toys. And the ultimate test of man/machine
"communication" is of course the TTT!

>	So the question for me becomes: how might we give a machine the
>	same rich corpus of experience (hence satisfying the total part
>	of your TTT) without relying on such direct experiential
>	contact with the world?  The answer for me (at the moment) is
>	to begin at the level of WORDS... the enormous textual
>	databases of information retrieval (IR) systems...
>	I want to take this huge set of ``labels,'' attached by humans to
>	their world, as my primitive experiential database... 
>	The task facing my system, then, is to look at and learn from this
>	world:... the textbase itself [and] interactions with IR users...
>	the system then adapts its (connectionist) representation...

Your hypothesis is that an information retrieval system whose only
source of input is text (symbols) plus feedback from human users (more
symbols) will capture a significant component of cognition. Your
hypothesis may be right. My own conjecture, however, is the exact
opposite. I don't believe that input consisting of nothing but symbols
constitutes "experience." I think it constitutes (ungrounded) symbols,
inheriting, as usual, the interpretations of the users with which the
system interacts. I don't think that doing connectionism instead of
symbol-crunching with this kind of input makes it any more likely to
overcome the groundedness problem, but again, I may be wrong. But
performance capacity (not experience) -- i.e., the TTT -- will have
to be the ultimate arbiter of these hypotheses.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

aweinste@Diamond.BBN.COM (Anders Weinstein) (05/27/87)

In article <770@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:
>
>The criterion for being iconic is physical isomorphism
>( = "structural correspondence"). This means that the relationship
>between an object and its icon must be a physically invertible
>(analog) transformation. 

As I've seen you broach this criterion a few times now, I just thought I'd
remind you of a point that I thought was clearly made in our earlier
discussion of the A/D distinction: loss of information, i.e.
non-invertibility, is neither a necessary nor sufficient condition for 
analog to digital transformation.

Anders Weinstein

harnad@mind.UUCP (05/28/87)

Anders Weinstein of BBN wrote:

>	a point that I thought was clearly made in our earlier
>	discussion of the A/D distinction: loss of information, i.e.
>	non-invertibility, is neither a necessary nor sufficient condition for 
>	analog to digital transformation.

The only point that seems to have been clearly made in the sizable discussion
of the A/D distinction on the Net last year (to my mind, at least) was that no
A/D distinction could be agreed upon that would meet the needs and
interests of all of the serious proponents and that perhaps there was
an element of incoherence in all but the most technical and restricted
of signal-analytic candidates.

In the discussion to which you refer above (a 3-level bottom-up model
for grounding symbolic representations in nonsymbolic -- iconic and
categorical -- representations) the issue was not the A/D
transformation but A/A transformations: isomorphic copies of the
sensory surfaces. These are the iconic representations. So whereas
physical invertibility may not have been more successful than any of
the other candidates in mapping out a universally acceptable criterion
for the A/D distinction, it is not clear that it can be faulted as a
criterion for physical isomorphism.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

aweinste@diamond.bbn.com.UUCP (05/29/87)

Replying to my claim that 
>>	                                ...loss of information, i.e.
>>	non-invertibility, is neither a necessary nor sufficient condition for 
>>	analog to digital transformation.

in article <786@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:
>
>The only point that seems to have been clearly made in the sizable discussion
>of the A/D distinction on the Net last year (to my mind, at least) was that no
>A/D distinction could be agreed upon ...
>
>In the discussion to which you refer above ...  the issue was not the A/D
>transformation but A/A transformations: isomorphic copies of the
>sensory surfaces. These are the iconic representations. So whereas
>physical invertibility may not have been more successful than any of
>the other candidates in mapping out a universally acceptable criterion
>for the A/D distinction, it is not clear that it can be faulted as a
>criterion for physical isomorphism.

Well the point is just the same for the A/A or "physically isomorphic"
transformations you describe.  Although the earlier discussion admittedly did
not yield a positive result, I continue to believe that it was at least
established that invertibility is a non-starter: invertibility has
essentially *nothing* to do with the difference between analog and digital
representation according to anybody's intuitive use of the terms.

The reason I think this is so clear is that for any one of the possible
transformation types -- A/D, A/A, D/A, or D/D -- one can find paradigmatic
examples in which invertibility either does or does not obtain.  A blurry
image is uncontroversially an analog or "iconic" representation, yet it is
non-invertible;  a digital recording of sound in the audible range is surely
an A/D transformation, yet it is completely invertible, etc.  All the
invertibility or non-invertibility of a transformation indicates is whether
or not the transformation preserves or loses information in the technical
sense. But loss of information is of course possible (and not necessary) in
any of the 4 cases.

I admit I don't know what the qualifier means in your criterion of "physical
invertibility"; perhaps this alters the case.

Anders Weinstein

harnad@mind.UUCP (Stevan Harnad) (05/29/87)

aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc.,
Cambridge, MA writes:

>	invertibility has essentially *nothing* to do with the difference
>	between analog and digital representation according to anybody's
>	intuitive use of the terms... A blurry image is uncontroversially
>	an analog or "iconic" representation, yet it is non-invertible;
>	a digital recording of sound in the audible range is surely an A/D
>	transformation, yet it is completely invertible. [I]nvertibility...
>	[only] indicates whether... the transformation preserves or loses
>	information in the technical sense. But loss of information is...
>	possible in any of the 4 cases... A/D, A/A, D/A, D/D...
>	I admit I don't know what the qualifier means in your criterion
>	of "physical invertibility"; perhaps this alters the case.

I admit that the physical-invertibility criterion is controversial and
in the end may prove to be unsatisfactory in delimiting a counterpart
of the technical A/D distinction that will be useful in formulating
models of internal representation in cognitive science. The underlying
idea is this:

There are two stages of A/D even in the technical sense: signal
quantization (making a continuous signal discrete) and symbolization
(assigning names and addresses to the discrete "chunks"). Unless the
original signal is already discrete, the quantization phase involves a
loss of information. Some regions of input variation will not be retrievable
from the quantized image. The transformation is many-to-fewer instead
of one-to-one. A many-to-few mapping cannot be inverted so as to
recover the entire original signal.
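
A small numerical illustration of the many-to-few point (the step size and
sample values are arbitrary):

STEP = 0.25

def quantize(x):
    # map each input onto the nearest discrete level
    return STEP * round(x / STEP)

inputs = [0.51, 0.55, 0.62]           # three distinguishable input values
print([quantize(x) for x in inputs])  # [0.5, 0.5, 0.5]: all collapse together
# Given only the quantized image, the original values cannot be recovered.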

Now I conjecture that it is this physical invertibility -- the possibility
of recovering all the original information -- that may be critical in
cognitive representations. I agree that there may be information loss in
A/A transformations (e.g., smoothing, blurring or loss of some
dimensions of variation), but then the image is simply *not analog in
the properties that have been lost*! It is only an analog of what it
preserves, not what it fails to preserve.

A strong motivation for giving invertibility a central role in
cognitive representations has to do with the second stage of A/D
conversion: symbolization. The "symbol grounding problem" that has
been under discussion here concerns the fact that symbol systems
depend for their "meanings" on only one of two possibilities: One is
an interpretation supplied by human users -- "`Squiggle' means `animal' and
`Squoggle' means `has four legs'" -- and the other is a physical, causal
connection with the objects to which the symbols refer. The first
source of "meaning" is not suitable for cognitive modeling, for
obvious reasons (the meaning must be intrinsic and self-contained, not
dependent on human mental mediation). The second has a surprising
consequence, one that is either valid and instructive about cognitive
representations (as I tentatively believe it is), or else a symptom of
the wrong-headedness of this approach to the grounding problem, and
the inadequacy of the invertibility criterion.

The surprising consequence is that a "dedicated system" -- one that is
hard-wired to its transducers and effectors (and hence their
interactions with objects in the world) may be significantly different
from the very *same* system as an isolated symbol-manipulating module,
cut off from its peripherals -- different in certain respects that could be
critical to cognitive modeling (and cognitive modeling only). The dedicated
system can be regarded as "analog" in the input signal properties that are
physically recoverable, even if there have been (dedicated) "digital" stages
of processing in between. This would only be true of dedicated systems, and
would cease to be true as soon as you severed their physical connection to
their peripherals.

This physical invertibility criterion would be of no interest whatever
to ordinary technical signal processing work in engineering. (It may
even be a strategic error to keep using the engineering "A/D"
terminology for what might only bear a metaphorical relation to it.)
The potential relevance of the physical invertibility criterion
would only be to cognitive modeling, especially in the constraint that
a grounded symbol system must be *nonmodular* -- i.e., it must be hybrid
symbolic/nonsymbolic.

The reason I have hypothesized that symbolic representations in cognition
must be grounded nonmodularly in nonsymbolic representations (iconic and
categorical ones) is based in part on the conjecture that the physical
invertibility of input information in a dedicated system may play a crucial
role in successful cognitive modeling (as described in the book under
discussion: "Categorical Perception: The Groundwork of Cognition,"
Cambridge University Press 1987). Of course, selective *noninvertibility*
-- as in categorizing by ignoring some differences and not others --
plays an equally crucial complementary role.

The reason the invertibility must be physical rather than merely
formal or conceptual is to make sure the system is grounded rather
than hanging by a skyhook from people's mental interpretations.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

aweinste@Diamond.BBN.COM (Anders Weinstein) (06/05/87)

In reply to my objection that
>>	invertibility has essentially *nothing* to do with the difference
>>	between analog and digital representation according to anybody's
>>	intuitive use of the terms

Stevan Harnad (harnad@mind.UUCP) writes in message <792@mind.UUCP>:

>There are two stages of A/D even in the technical sense. ... Unless the
>original signal is already discrete, the quantization phase involves a
>loss of information. Some regions of input variation will not be retrievable
>from the quantized image. The transformation ... cannot be inverted so as to
>recover the entire original signal.

Well, what I think is interesting is not preserving the signal itself but
rather the *information* that the signal carries.  In this sense, an analog
signal conveys only a finite amount of information and it can in fact be
converted to digital form and back to analog *without* any loss.
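
To illustrate the point numerically (this sketch covers only the sampling
half; quantization error can be made as small as you like with enough bits,
and the signal, rates and window here are arbitrary choices of mine):

import numpy as np

f_max = 5.0                  # highest frequency in the "analog" signal (Hz)
fs = 4 * f_max               # sampling rate, comfortably above Nyquist (2*f_max)

def signal(t):
    return np.sin(2 * np.pi * 3.0 * t) + 0.5 * np.cos(2 * np.pi * f_max * t)

n = np.arange(-200, 201)     # a generous window of samples
samples = signal(n / fs)

def reconstruct(t):
    # Whittaker-Shannon (sinc) interpolation from the discrete samples
    return np.sum(samples * np.sinc(fs * t - n))

t_test = np.linspace(-1.0, 1.0, 7)   # points well inside the sample window
errors = [abs(reconstruct(t) - signal(t)) for t in t_test]
print(max(errors))           # small relative to the signal; exact only in the
                             # idealized infinite-sample limit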

But in any case the point I've been emphasizing remains: the A/A
transformations you envisage are not going to be perfect (no "skyhooks" now,
remember?), so preservation or loss of information alone won't distinguish an
(intuitively) A/A from an A/D transformation.  I think the following reply to
this point only muddies the waters:

>                           I agree that there may be information loss in
>A/A transformations (e.g., smoothing, blurring or loss of some
>dimensions of variation), but then the image is simply *not analog in
>the properties that have been lost*! It is only an analog of what it
>preserves, not what it fails to preserve.

You can take this line if you like, but notice that the same is true of a
*digitized* image -- in your terms, it is "analog" in the information it
preserves and not in the information lost.  This seems to me to be a very
unhappy choice of terminology!

Both analog and digitizing transformations must preserve *some* information.
If all you're *really* interested in is the quality of being (naturally)
information-preserving (i.e. physically invertible), then I'd strongly
recommend you just use one of these terms and drop the misleading use of
"analog", "iconic", and "digital".

>                           The "symbol grounding problem" that has
>been under discussion here concerns the fact that symbol systems
>depend for their "meanings" on only one of two possibilities: One is
>an interpretation supplied by human users... and the other is a physical, 
>causal connection with the objects to which the symbols refer. 
>The surprising consequence is that a "dedicated system" -- one that is
>hard-wired to its transducers and effectors... may be significantly different
>from the very *same* system as an isolated symbol-manipulating module,
>cut off from its peripherals ...

With regard to this "symbol grounding problem": I think it's been
well-understood for some time that causal interaction with the world is a
necessary requirement for artificial intelligence.  Recall that in his BBS
reply to Searle, Dennett dismissed Searle's initial target -- the "bedridden"
form of the Turing test -- as a strawman for precisely this reason. (Searle
believes his argument goes through for causally embedded AI programs as well,
but that's another topic.)

The philosophical rationale for this requirement is the fact that some causal
"grounding" is needed in order to determine a semantic interpretation.  A
classic example is due to Georges Rey: it's possible that a program for
playing chess could, when compiled, be *identical* to one used to plot
strategy in the Six Day War. If you look only at the formal symbol
manipulations, you can't distinguish between the two interpretations; it's
only by virtue of the causal relations between the symbols and the world that
the symbols could have one meaning rather than another.

But although everyone agrees that *some* kind of causal grounding is
necessary for intentionality, it's notoriously difficult to explain exactly
what sort it must be.  And although the information-preserving
transformations you discuss may play some role here, I really don't see how
this challenges the premises of symbolic AI in the way you seem to think it
does.  In particular you say that:

>The potential relevance of the physical invertibility criterion
>would only be to cognitive modeling, especially in the constraint that
>a grounded symbol system must be *nonmodular* -- i.e., it must be hybrid
>symbolic/nonsymbolic.

But why must the arrangement you envision be "nonmodular"?  A system
may contain analog and digital subsystems and still be modular if the
subsystems interact solely via well-defined inputs and outputs.

More importantly -- and this is the real motivation for my terminological
objections -- it isn't clear why *any* (intuitively) analog processing need
take place at all.  I presume the stance of symbolic AI is that sensory input
affects the system via an isolable module which converts incoming stimuli
into symbolic representations.  Imagine a vision sub-system that converts
incoming light into digital form at the first stage, as it strikes a grid of
photo-receptor surfaces, and is entirely digital from there on in.  Such a
system is still "grounded" in information-preserving representations in the
sense you require.

In short, I don't see any *philosophical* reason why symbol-grounding
requires analog processing or a non-modular structure.

Anders Weinstein
BBN Labs

harnad@mind.UUCP (06/07/87)

aweinste@Diamond.BBN.COM (Anders Weinstein)
of BBN Laboratories, Inc., Cambridge, MA writes:

>	[regarding invertibility, information preservation and the A/D
>	distinction]: what I think is interesting is not preserving the
>	signal itself but rather the *information* that the signal carries.
>	In this sense, an analog signal conveys only a finite amount of
>	information and it can in fact be converted to digital form and back
>	to analog *without* any loss.

This is an important point and concerns a matter that is at the heart
of the symbolic/nonsymbolic issue: What you're saying is appropriate for
ordinary communication theory and communication-theoretic
applications such as radio signals, telegraph, radar, CDs, etc. In all these
cases the signal is simply a carrier that encodes information which is
subsequently decoded at the receiving end. But in the case of human
cognition this communication-theoretic model -- of signals carrying
messages that are encoded/decoded on either end -- may not be
appropriate. (Formal information theory has always had difficulties
with "content" or "meaning." This has often been pointed out, and I take
this to be symptomatic of the fact that it's missing something as a
candidate model for cognitive "information processing.")

Note that the communication-theoretic, signal-analytic view has a kind of 
built-in bias toward digital coding, since it's the "message" and not
the "medium" that matters. But what if -- in cognition -- the medium
*is* the message? This may well be the case in iconic processing (and
the performances that it subserves, such as discrimination, similarity
judgment, matching, short-term memory, mental rotation, etc.): It may
be the structure or "shape" of the physical signal (the stimulus) itself that
matters, not some secondary information or message it carries in coded
form. Hence the processing may have to be structure- or shape-preserving
in the physical analog sense I've tried to capture with the criterion
of invertibility.

>	a *digitized* image -- in your terms... is "analog" in the
>	information it preserves and not in the information lost. This
>	seems to me to be a very unhappy choice of terminology! Both analog
>	and digitizing transformations must preserve *some* information.
>	If all you're *really* interested in is the quality of being
>	(naturally) information-preserving (i.e. physically invertible),
>	then I'd strongly recommend you just use one of these terms and drop
>	the misleading use of "analog", "iconic", and "digital".

I'm not at all convinced yet that the sense of iconic and analog that I am
referring to is unrelated to the signal-analytic A/D distinction,
although I've noted that it may turn out, on sufficient analysis, to be
an independent distinction. For the time being, I've acknowledged that
my invertibility criterion is, if not necessarily unhappy, somewhat
surprising in its implications, for it implies (1) that being analog
may be a matter of degree (i.e., degree of invertibility) and (2) even
a classical digital system must be regarded as analog to a degree if
one is considering a larger "dedicated" system of which it is a
hard-wired (i.e., causally connected) component rather than an
independent (human-interpretation-mediated) module.

Let me repeat, though, that it could turn out that, despite some
suggestive similarities, these considerations are not pertinent to the
A/D distinction but, say, to the symbolic/nonsymbolic distinction --
and even that only in the special context of cognitive modeling rather than
signal analysis or artificial intelligence in general.

>	With regard to [the] "symbol grounding problem": I think it's been
>	well-understood for some time that causal interaction with the world
>	is a necessary requirement for artificial intelligence...
>	The philosophical rationale for this requirement is the fact that
>	some causal "grounding" is needed in order to determine a semantic
>	interpretation... But although everyone agrees that *some* kind of
>	causal grounding is necessary for intentionality, it's notoriously
>	difficult to explain exactly what sort it must be. And although the
>	information-preserving transformations you discuss may play some role
>	here, I really don't see how this challenges the premises of symbolic
>	AI in the way you seem to think it does.

As far as I know, there have so far been only two candidate proposals
to overcome the symbol grounding problem WITHOUT resorting to the kind
of hybrid proposal I advocate (i.e., without giving up purely symbolic
top-down modules): One proposal, as you note, is that a pure
symbol-manipulating system can be "grounded" by merely hooking it up
causally in the "right way" to the outside world with simple (modular)
transducers and effectors. I have conjectured that this strategy 
will not work in cognitive modeling (and I have given my supporting
arguments elsewhere: "Minds, Machines and Searle"). The strategy may work
in AI and conventional robotics and vision, but that is because these
fields *do not have a grounding problem*! They're only trying to generate
intelligent *pieces* of performance, not to model the mind in *all* its
performance capacity. Only cognitive modeling has a symbol grounding
problem.

The second nonhybrid way to try to ground a purely symbolic system in
real-world objects is by cryptology. Human beings, knowing already at least
one grounded language and its relation to the world, can infer the meanings
of a second one [e.g., ancient cuneiform] by using its internal formal
structure plus what they already know: Since the symbol permutations and
combinations of the unknown system (i.e., its syntactic rules) are constrained
to yield a semantically interpretable system, sometimes the semantics can be
reliably and uniquely decoded this way (despite Quine's claims about the
indeterminacy of radical translation). It is obvious, however, that such
a "grounding" would be derivative, and would depend entirely on the
groundedness of the original grounded symbol system. (This is equivalent
to Searle's "intrinsic" vs. "derived intentionality.") And *that* grounding
problem remains to be solved in an autonomous cognitive model.

My own hybrid approach is simply to bite the bullet and give up on the
hope of an autonomous symbolic level, the hope on which AI and symbolic
functionalism had relied in their attempt to capture mental function.
Although you can get a lot of clever performance by building in purely 
symbolic "knowledge," and although it had seemed so promising that
symbol-strings could be interpreted as thoughts, beliefs, and mental
propositions, I have argued that a mere extension of this modular "top-down" 
approach, hooking up eventually with peripheral modules, simply won't
succeed in the long run (i.e., as we attempt to approach an asymptote of
total human performance capacity, or what I've called the "Total Turing Test")
because of the grounding problem and the nonviability of the two
"solutions" sketched above (i.e., simple peripheral hook-ups and/or
mediating human cryptology). Instead, I have described a nonmodular
hybrid representational system in which symbolic representations are
grounded bottom-up in nonsymbolic ones (iconic and categorical).
Although there is a symbolic level in such a system, it is not quite
the autonomous all-purpose level of symbolic AI. It trades its autonomy
for its groundedness.

>	[W]hy must the arrangement you envision be "nonmodular"? A system
>	may contain analog and digital subsystems and still be modular if
>	the subsystems interact solely via well-defined inputs and outputs.

I'll try to explain why I believe that a successful mind-model (one
able to pass the Total Turing Test) is unlikely to consist merely of a
pure symbol-manipulative module connected to input/output modules.
A pure top-down symbol system just consists of physically implemented
symbol manipulations. You yourself describe a typical example of
ungroundedness (from Georges Rey):

>		it's possible that a program for playing chess could,
>		when compiled, be *identical* to one used to plot
>		strategy in the Six Day War. If you look only at the
>		formal symbol manipulations, you can't distinguish between
>		the two interpretations; it's only by virtue of the causal
>		relations between the symbols and the world that the symbols
>		could have one meaning rather than another.

Now consider two cases of "fixing" the symbol interpretations by
grounding the causal relations between the symbols and the world. In
(1) a "toy" case -- a circumscribed little chunk of performance such as
chess-playing or war-games -- the right causal connections could be
wired according to the human encryption/decryption scheme: Inputs and
outputs could be wired into their appropriate symbolic descriptions.
There is no problem here, because the toy problems are themselves
modular, and we know all the ins and outs. But none but the most
diehard symbolic functionalist would want to argue that such a simple
toy model was "thinking," or even doing anything remotely like what we
do when we accomplish the same performance. The reason is that we are
capable of doing *so much more* -- and not by an assemblage of endless
independent modules of essentially the same sort as these toy models,
but by some sort of (2) integrated internal system. Could that "total"
system be just an oversized toy model -- a symbol system with its
interpretations "fixed" by a means analogous to these toy cases? I am
conjecturing that it is not.

Toy models don't think. Their internal symbols really *are*
meaningless, and hence setting them in the service of generating a toy
performance just involves hard-wiring our intended interpretations
of its symbols into a suitable dedicated system. Total (human-capacity-sized)
models, on the other hand, will, one hopes, think, and hence the
intended interpretations of their symbols will have to be intrinsic in
some deeper way than the analogy with the toy model would suggest, at
least so I think. This is my proposed "nonmodular" candidate:

Every formal symbol system has both primitive atomic symbols and composite
symbol-strings consisting of ruleful combinations of the atoms. Both
the atoms and the combinations are semantically interpretable, but
from the standpoint of the formal syntactic rules governing the symbol
manipulations, the atoms could just as well have been undefined or
meaningless. I hypothesize that the primitive symbols of a nonmodular
cognitive symbol system are actually the (arbitrary) labels of object
categories, and that these labels are reliably assigned to their referents
by a nonsymbolic representational system consisting of (i) iconic (invertible,
one-to-one) transformations of the sensory surface and (ii) categorical
(many-to-few) representations that preserve only the features that suffice to
reliably categorize and label sensory projections of the objects in
question. Hence, rather than being primitive and undefined, and hence
independent of interpretation, I suggest that the atoms of cognitive
symbol systems are grounded, bottom-up, in such a categorization
mechanism. The higher-order symbol combinations inherit the bottom-up
constraints, including the nonsymbolic representations to which they
are attached, rather than being an independent top-down symbol-manipulative
module with its connections to an input/output module open to being
fixed in various extrinsically determined ways.
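
A toy sketch of this grounding chain (my illustration only; the threshold,
the labels and the function names are invented for the example, not drawn
from the CP book): an iconic stage that is a one-to-one copy of the sensory
projection, a categorical stage that is a many-to-few feature filter emitting
an arbitrary label, and a symbolic stage that merely combines those grounded
labels:

    import numpy as np

    def iconic(projection):
        """Iconic representation: an invertible, structure-preserving copy."""
        return np.array(projection, dtype=float)            # one-to-one

    def categorical(icon, threshold=0.5):
        """Categorical representation: keep only the feature that suffices to
        sort this projection into one of two labeled categories."""
        feature = icon.mean()                                # many-to-few reduction
        return "bright" if feature > threshold else "dark"   # arbitrary label

    def symbolic(labels):
        """Symbolic representation: a ruleful combination of grounded atoms."""
        return "(" + " AND ".join(labels) + ")"

    projections = [np.full(16, 0.9), np.full(16, 0.1)]       # two toy sensory inputs
    atoms = [categorical(iconic(p)) for p in projections]
    print(atoms)             # ['bright', 'dark'] -- grounded atomic labels
    print(symbolic(atoms))   # '(bright AND dark)' -- a composite symbol string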

>	it isn't clear why *any* (intuitively) analog processing need
>	take place at all. I presume the stance of symbolic AI is that
>	sensory input affects the system via an isolable module which converts
>	incoming stimuli into symbolic representations. Imagine a vision
>	sub-system that converts incoming light into digital form at the
>	first stage, as it strikes a grid of photo-receptor surfaces, and is
>	entirely digital from there on in. Such a system is still "grounded"
>	in information-preserving representations in the sense you require.
>	In short, I don't see any *philosophical* reason why symbol-grounding
>	requires analog processing or a non-modular structure.

It is exactly this modular scenario that I am calling into question. It
is not clear at all that a cognitive system must conform to it. To get a
device to be able to do what we can do we may have to stop thinking in
terms of "isolable" input modules that go straight into symbolic
representations. That may be enough to "ground" a conventional toy
system, but, as I've said, such toy systems don't have a grounding problem
in the first place, because nobody really believes they're thinking. To get
closer to life-size devices -- devices that can generate *all* of our
performance capacity, and hence may indeed be thinking -- we may have to
turn to hybrid systems in which the symbolic functions are nonmodularly
grounded, bottom-up, in the nonsymbolic ones. The problem is not a
philosophical one, it's an empirical one: What looks as if it's likely
to work, on the evidence and reasoning available?

-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

aweinste@diamond.bbn.com.UUCP (06/10/87)

In article <812@mind.UUCP> Stevan Harnad <harnad@mind.UUCP> replies:

With regard to physical invertibility and the A/D distinction:
>
>>	a *digitized* image -- in your terms... is "analog" in the
>>	information it preserves and not in the information lost. This
>>	seems to me to be a very unhappy choice of terminology! 
>
>                            For the time being, I've acknowledged that
>my invertibility criterion is, if not necessarily unhappy, somewhat
>surprising in its implications, for it implies (1) that being analog
>may be a matter of degree (i.e., degree of invertibility) and (2) even
>a classical digital system must be regarded as analog to a degree ...

Grumble. These consequences only *seem* surprising if we forget that you've
redefined "analog" in a non-standard manner; this is precisely why I keep
harping on your terminology. Compare them with what you're really saying:
"physical invertibility is a matter of degree" or "a classical digital system
still employs physically invertible representations" -- both quite humdrum.

With regard to the symbolic AI approach to the "symbol-grounding problem":
>
>One proposal, as you note, is that a pure symbol-manipulating system can be
>"grounded" by merely hooking it up causally in the "right way" to the outside
>world with simple (modular) transducers and effectors. ...  I  have argued
>that [this approach] simply won't succeed in the long run (i.e., as we
>attempt to approach an asymptote of total human performance capacity ...)
>...In (1) a "toy" case ... the right causal connections could be wired
>according to the human encryption/decryption scheme: Inputs and outputs could
>be wired into their appropriate symbolic descriptions. ... But none but the
>most diehard symbolic functionalist would want to argue that such a simple
>toy model was "thinking," ...  The reason is that we are capable of
>doing *so much more* -- and not by an assemblage of endless independent
>modules of essentially the same sort as these toy models, but by some sort of
>(2) integrated internal system. Could that "total" system be just an
>oversized toy model -- a symbol system with its interpretations "fixed" by a
>means analogous to these toy cases? I am conjecturing that it is not.

I think your reply may misunderstand the point of my objection. I'm not
trying to defend the intentionality of "toy" programs.  I'm not even
particularly concerned to *defend* the symbolic approach to AI (I personally
don't even believe in it).  I'm merely trying to determine exactly what your
argument against symbolic AI is.

I had thought, perhaps wrongly, that you were claiming that the
interpretations of systems conceived by symbolic AI must somehow
inevitably fail to be "grounded", and that only a system which employed
"analog" processing in the way you suggest would have the causal basis
required for fixing an interpretation.  In response, I pointed out first that
advocates of the symbolic approach already understand that causal commerce
with the environment is necessary for intentionality: they envision the use
of complex perceptual systems to provide the requisite "grounding". So it's
not as though the symbolic approach is indifferent to this issue.  And your
remarks against "toy" systems and "hard-wiring" the interpretations of the
inputs are plain unfair -- the symbolic approach doesn't belittle the
importance or complexity of what perceptual systems must be able to do. It is
in total agreement with you that a truly intentional system must be capable
of complex adaptive performance via the use of its sensory input -- it just
hypothesizes that symbolic processing is sufficient to achieve this.

And, as I tried to point out, there is just no reason that a modular,
all-digital system of the kind envisioned by the symbolic approach could not
be entirely "grounded" BY YOUR OWN THEORY OF "GROUNDEDNESS":  it could employ
"physically invertible" representations (only they would be digital ones),
from these it could induct reliable "feature filters" based on training (only
these would use digital rather than analog techniques), etc. I concluded that
the symbolic approach appears to handle your so-called "grounding problem"
every bit as well as any other method.

Now comes the reply that you are merely conjecturing that analog processing
may be required to realize the full range of human, as opposed to "toy",
performance -- in short, you think the symbolic approach just won't work.
But this is a completely different issue! It has nothing to do with some
mythical "symbol grounding" problem, at least as I understand it.  It's just
the same old "intelligent-behavior-generating" problem which everyone in AI,
regardless of paradigm, is looking to solve.

From this reply, it seems to me that this alleged "symbol-grounding problem"
is a real red-herring (it misled me, at least).  All you're saying is that
you suspect that mainstream AI's symbol system hypothesis is false, based on
its lack of conspicuous performance-generating successes.  Obviously everyone
must recognize that this is a possibility -- the premise of symbolic AI is,
after all, only a hypothesis. 

But I find this a much less interesting claim than I originally thought --
conjectures, after all, are cheap.  It *would* be interesting if you could
show, as, say, the connectionist program is trying to, how analog processing
can work wonders that symbol-manipulation can't. But this would require
detailed research, not speculation. Until then, it remains a mystery why your
proposed approach should be regarded as any more promising than any other.

Anders Weinstein
BBN Labs

harnad@mind.UUCP (06/11/87)

aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc.,
Cambridge, MA writes:

>	There's no [symbol] grounding problem, just the old
>	behavior-generating problem

Before responding to the supporting arguments for this conclusion, let
me restate the matter in what I consider to be the right way. There is:
(1) the behavior-generating problem (what I have referred to as the problem of
devising a candidate that will pass the Total Turing Test), (2) the
symbol-grounding problem (the problem of how to make formal symbols
intrinsically meaningful, independent of our interpretations), and (3)
the conjecture (based on the existing empirical evidence and on
logical and methodological considerations) that (2) is responsible for
the failure of the top-down symbolic approach to solve (1).

>>my [SH's] invertibility criterion is, if not necessarily unhappy, somewhat
>>surprising in its implications, for it implies that (1) being analog may
>>be a matter of degree (i.e., degree of invertibility) and that (2) even
>>a classical digital system must be regarded as analog to a degree ...
>
>	These consequences only *seem* surprising if we forget that you've
>	redefined "analog" in a non-standard manner... you're really saying:
>	"physical invertibility is a matter of degree" or "a classical digital
>	system still employs physically invertible representations" -- both
>	quite humdrum.

You've bypassed the three points I brought up in replying to your
challenge to my invertibility criterion for an analog transform the
last time: (1) the quantization in standard A/D is noninvertible, (2) a
representation can only be analog in what it preserves, not in what it
fails to preserve, and, in cognition at any rate, (3) the physical
shape of the signal may be what matters, not the "message" it
"carries." Add to this the surprising logical consequence that a
"dedicated" digital system (hardwired to its peripherals) would be
"analog" in its invertible inputs and outputs according to my
invertibility criterion, and you have a coherent distinction that conforms
well to some features of the classical A/D distinction, but that may prove
to diverge, as I acknowledged, sufficiently to make it an independent,
"non-standard" distinction, unique to cognition and neurobiology. Would it be
surprising if classical electrical engineering concepts did not turn
out to be just right for mind-modeling?

>	I [AW] had thought, perhaps wrongly, that you were claiming that the
>	interpretations of systems conceived by symbolic AI must somehow
>	inevitably fail to be "grounded", and that only a system which employed
>	"analog" processing in the way you suggest would have the causal basis
>	required for fixing an interpretation.

That is indeed what I'm claiming (although you've completely omitted
the role of the categorical representations, which are just as
critical to my scheme, as described in the CP book). But do make sure you
keep my "non-standard" definition of analog in mind, and recall that I'm
talking about asymptotic, human-scale performance, not toy systems.
Toy systems are trivially "groundable" (even by my definition of
"analog") by hard-wiring them into a dedicated input/output
system. But the problem of intrinsic meaningfulness does not arise for
toy models, only for devices that can pass the Total Turing Test (TTT).
[The argument here involves showing that to attribute intentionality to devices
that exhibit sub-TTT performance is not justified in the first place.]
The conjecture is accordingly that the modular solution (i.e., hardwiring an
autonomous top-down symbolic module to conventional peripheral modules
-- transducers and effectors) will simply not succeed in producing a candidate
that will be able to pass the Total Turing Test, and that the fault
lies with the autonomy (or modularity) of the symbolic module.

But I am not simply proposing an unexplicated "analog" solution to the
grounding problem either, for note that a dedicated modular system *would*
be analog according to my invertibility criterion! The conjecture is
that such a modular solution would not be able to meet the TTT
performance criterion, and the grounds for the conjecture are partly
inductive (extrapolating symbolic AI's performance failures), partly
logical and methodological (the grounding problem), and partly
theory and data-driven (psychophysical findings in human categorical
perception). My proposal is not that some undifferentiated,
non-standard "analog" processing must be going on. I am advocating a
specific hybrid bottom-up, symbolic/nonsymbolic rival to the pure
top-down symbolic approach (whether or not the latter is wedded to
peripheral modules), as described in the volume under discussion
("Categorical Perception: The Groundwork of Cognition," CUP 1987).

>	advocates of the symbolic approach already understand that causal
>	commerce with the environment is necessary for intentionality: they
>	envision the use of complex perceptual systems to provide the
>	requisite "grounding". So it's not as though the symbolic approach
>	is indifferent to this issue.

This is the pious hope of the "top-down" approach: That suitably
"complex" perceptual systems will meet for a successful "hook-up"
somewhere in the middle. But simply reiterating it does not mean it
will be realized. The evidence to date suggests the opposite: That the
top-down approach will just generate more special-purpose toys, not a
general purpose, TTT-scale model of human performance capacity. Nor is
there any theory at all of what the requisite perceptual "complexity"
might be: The stereotype is still standard transducers that go from physical
energy via A/D conversion straight into symbols. Nor does "causal
commerce" say anything: It leaves open anything from the modular
symbol-cruncher/transducer hookups of the kind that so far only seem
capable of generating toy models, to hybrid, nonmodular, bottom-up
models of the sort I would advocate. Perhaps it's in the specific
nature of the bottom-up grounding that the nature of the requisite
"complexity" and "causality" will be cashed in.

>	your remarks against "toy" systems and "hard-wiring" the
>	interpretations of the inputs are plain unfair -- the symbolic
>	approach doesn't belittle the importance or complexity of what
>	perceptual systems must be able to do. It is in total agreement
>	with you that a truly intentional system must be capable of complex
>	adaptive performance via the use of its sensory input -- it just
>	hypothesizes that symbolic processing is sufficient to achieve this.

And I just hypothesize that it isn't. And I try to say why not (the
grounding problem and modularity) and what to do about it (bottom-up,
nonmodular grounding of symbolic representations in iconic and categorical
representations).

>	there is just no reason that a modular, all-digital system of the
>	kind envisioned by the symbolic approach could not be entirely
>	"grounded" BY YOUR OWN THEORY OF "GROUNDEDNESS":  it could employ
>	"physically invertible" representations (only they would be digital
>	ones), from these it could induct reliable "feature filters" based on
>	training (only these would use digital rather than analog techniques),
>	etc. ...  the symbolic approach appears to handle your so-called
>	"grounding problem" every bit as well as any other method.

First of all, as I indicated earlier, a dedicated top-down symbol-crunching
module hooked to peripherals would indeed be "grounded" in my sense --
if it had TTT-performance power. Nor is it *logically impossible* that
such a system could exist. But it certainly does not look likely on the
evidence. I think some of the reasons we were led (wrongly) to expect it were
the following:

(1) The original successes of symbolic AI in generating intelligent
performance: The initial rule-based, knowledge-driven toys were great
successes, compared to the alternatives (which, apart from some limited
feats of Perceptrons, were nonexistent). But now, after a generation of
toys that show no signs of converging on general principles and growing
up to TTT-size, the inductive evidence is pointing in the other direction:
More ad hoc toys is all we have grounds to expect.

(2) Symbol strings seemed such hopeful candidates for capturing mental
phenomena such as thoughts, knowledge, beliefs. Symbolic function seemed
like such a natural, distinct, nonphysical level for capturing the mind.
Easy come, easy go.

(3) We were persuaded by the power of computation -- Turing
equivalence and all that -- to suppose that computation
(symbol-crunching) just might *be* cognition. If every (discrete)
thing anyone or anything (including the mind) does is computationally
simulable, then maybe the computational functions capture the mental
functions? But the fact that something is computationally simulable
does not entail that it is implemented computationally (any more than
behavior that is *describable* as ruleful is necessarily following an
explicit rule). And some functions (such as transduction and causality)
cannot be implemented computationally at all.

(4) We were similarly persuaded by the power of digital coding -- the
fact that it can approximate analog coding as closely as we please
(and physics permits) -- to suppose that digital representations were
the only ones we needed to think about. But the fact that a digital
approximation is always possible does not entail that it is always
practical or optimal, nor that it is the one that is actually being
*used* (by, say, the brain). Some form of functionalism is probably
right, but it certainly need not be symbolic functionalism, or a
functionalism that is indifferent to whether a mental function or
representation is analog or digital: The type of implementation may
matter, both to the practical empirical problem of successfully
generating performance and to the untestable phenomenological problem of
capturing qualitative subjective experience. And some functions (let
me again add), such as transduction and (continuous) A/A, cannot be
implemented purely symbolically at all.

A good example to bear in mind is Shepard's mental rotation
experiments. On the face of it, the data seemed to suggest that
subjects were doing analog processing: In making same/different
judgments of pairs of successively presented 2-dimensional projections
of 3-dimensional, computer-generated, unfamiliar forms, subjects' reaction
times for saying "same" when one stimulus was in a standard orientation and
the other was rotated were proportional to the degree of rotation. The
diehard symbolists pointed out (correctly) that the proportionality,
instead of being due to the real-time analog rotation of a mental icon, could
have been produced by, say, (1) serially searching through the coordinates
of a digital grid on which the stimuli were represented, with more distant
numbers taking more incremental steps to reach, or by (2) doing
inferences on formal descriptions that became more complex (and hence
time-consuming) as the orientation became more eccentric. The point,
though, is that although digital/symbolic representations were indeed
possible, so were analog ones, and here the latter would certainly seem to be
more practical and parsimonious. And the fact of the matter -- namely,
which kinds of representations were *actually* used -- is certainly
not settled by pointing out that digital representations are always
*possible.*
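
For concreteness, a small simulation of the contrast being drawn (mine; the
5-degree step size and the constant-time lookup are illustrative assumptions,
not claims about the actual mechanism): an analog process that rotates an
icon in fixed increments yields a response time proportional to angular
disparity, whereas a table-lookup comparison need not:

    import math

    def analog_rotation_steps(angle_deg, step_deg=5.0):
        """Analog account: rotate the icon in small increments until aligned.
        The step count (a stand-in for reaction time) grows linearly with angle."""
        return int(math.ceil(angle_deg / step_deg))

    def lookup_comparison(angle_deg):
        """A symbolic alternative: compare against a precomputed canonical
        description in (roughly) constant time, independent of orientation."""
        return 1

    for angle in (0, 60, 120, 180):
        print(angle, analog_rotation_steps(angle), lookup_comparison(angle))
    # analog step counts 0, 12, 24, 36: proportional to rotation, as in the data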

Maybe a completely digital mind would have required a head the size of
New York State and polynomial evolutionary time in order to come into
existence -- who knows? Not to mention that it still couldn't do the
"A" in the A/D...

>	[you] reply that you are merely conjecturing that analog processing
>	may be required to realize the full range of human, as opposed to "toy",
>	performance -- in short, you think the symbolic approach just won't
>	work. But this... has nothing to do with some mythical "symbol
>	grounding" problem, at least as I understand it. It's just
>	the same old "intelligent-behavior-generating" problem which everyone
>	in AI, regardless of paradigm, is looking to solve... All you're
>	saying is that you suspect that mainstream AI's symbol system
>	hypothesis is false, based on its lack of conspicuous
>	performance-generating successes. Obviously everyone must recognize
>	that this is a possibility -- the premise of symbolic AI is, after
>	all, only a hypothesis. 

I'm not just saying I think the symbolic hypothesis is false. I'm
saying why I think it's false (ungroundedness) and I'm suggesting an
alternative (a bottom-up hybrid).

>	But I find this a much less interesting claim than I originally
>	thought -- conjectures, after all, are cheap. It *would* be
>	interesting if you could show, as, say, the connectionist program
>	is trying to, how analog processing can work wonders that
>	symbol-manipulation can't. But this would require detailed research,
>	not speculation. Until then, it remains a mystery why your proposed
>	approach should be regarded as any more promising than any other.

Be patient. My hypotheses (which are not just spontaneous conjectures,
but are based on an evaluation of the available evidence, the theoretical
alternatives, and the logical and methodological problems involved)
will be tested. They even have a potential connectionist component (in
the induction of the features subserving categorization), although
connectionism comes in for criticism too. For now it would seem only
salutary to attempt to set cognitive modeling in directions that
differ from the unprofitable ones it has taken so far.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

marty1@houdi.UUCP (M.BRILLIANT) (06/11/87)

In article <828@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:
> aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc.,
> Cambridge, MA writes:
> 
> >	There's no [symbol] grounding problem, just the old
> >	behavior-generating problem
> 
> ..... There is:
> (1) the behavior-generating problem (what I have referred to as the problem of
> devising a candidate that will pass the Total Turing Test), (2) the
> symbol-grounding problem (the problem of how to make formal symbols
> intrinsically meaningful, independent of our interpretations), and (3) ...

Just incidentally, what is the intrinsic meaning of "intrinsically
meaningful"?  The Turing test is an objectively verifiable criterion.
How can we objectively verify intrinsic meaningfulness?

> .... Add to this the surprising logical consequence that a
> "dedicated" digital system (hardwired to its peripherals) would be
> "analog" in its invertible inputs and outputs according to my
> invertibility criterion, .....

Using "analog" to mean "invertible" invites misunderstanding, which
invites irrelevant criticism.

Human (in general, vertebrate) visual processing is a dedicated
hardwired digital system.  It employs data reduction to abstract such
features as motion, edges, and orientation of edges.  It then forms a
map in which position is crudely analog to the visual plane, but
quantized.  This map is sufficiently similar to maps used in image
processing machines so that I can almost imagine how symbols could be
generated from it.
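
To make the data-reduction point concrete, a minimal sketch (mine, and
deliberately cartoonish: a one-dimensional "retina" is assumed): a difference
operator extracts edges and the result is thresholded onto a coarse map, so
the output preserves position only crudely and cannot be inverted back to the
original intensities:

    import numpy as np

    retina = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 0], dtype=float)  # toy intensities

    edges = np.abs(np.diff(retina))           # edge detection: a data reduction
    coarse_map = (edges > 0.5).astype(int)    # quantized feature map

    print(edges)        # [0. 0. 1. 0. 0. 0. 1. 0. 0.]
    print(coarse_map)   # edge positions only; the original intensities are gone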

By the time it gets to perception, it is not invertible, except with
respect to what is perceived.  Noninvertibility is demonstrated in
experiments in the identification of suspects.  Witnesses can report
what they perceive, but they don't always perceive enough to invert
the perceived image and identify the object that gave rise to the
perception.  If you don't agree, please give a concrete, objectively
verifiable definition of "invertibility" that can be used to refute my
conclusion.

If I am right, human intelligence itself relies on neither analog nor
invertible symbol grounding, and therefore artificial intelligence
does not require it.

By the way, there is an even simpler argument: even the best of us can
engage in fuzzy thinking in which our symbols turn out not to be
grounded.  Subjectively, we then admit that our symbols are not
intrinsically meaningful, though we had interpreted them as such.

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

aweinste@Diamond.BBN.COM (Anders Weinstein) (06/12/87)

In article <828@mind.UUCP> Stevan Harnad <harnad@mind.UUCP> writes
>
>>	There's no [symbol] grounding problem, just the old
>>	behavior-generating problem
>                                                              There is:
>(1) the behavior-generating problem (what I have referred to as the problem of
>devising a candidate that will pass the Total Turing Test), (2) the
>symbol-grounding problem (the problem of how to make formal symbols
>intrinsically meaningful, independent of our interpretations), and (3)
>the conjecture (based on the existing empirical evidence and on
>logical and methodological considerations) that (2) is responsible for
>the failure of the top-down symbolic approach to solve (1).

It seems to me that in different places, you are arguing the relation between
(1) and (2) in both directions, claiming both

	(A) The symbols in a purely symbolic system will always be 
	    ungrounded because such systems can't generate real performance;
and
	(B) A purely symbolic system can't generate real performance because
	    its symbols will always be ungrounded.

That is, when I ask you why you think the symbolic approach won't work, one
of your reasons is always "because it can't solve the grounding problem", but
when I press you for why the symbolic approach can't solve the grounding
problem, it always turns out to be "because I think it won't work." I think
we should get straight on the priority here.

It seems to me that, contra (3), thesis (A) is the one that makes perfect
sense -- in fact, it's what I thought you were saying. I just don't
understand (B) at all.

To elaborate: I presume the "symbol-grounding" problem is a *philosophical*
question: what gives formal symbols original intentionality? I suppose the
only answer anybody knows is, in brief, that the symbols must be playing a
certain role in what Dennett calls an "intentional system", that is, a system
which is capable of producing complex, adaptive behavior in a rational way.

Since such a system must be able to respond to changes in its environment,
this answer has the interesting consequence that causal interaction with the
world is a *necessary* condition for original intentionality. It tells us
that symbols in a disconnected computer, without sense organs, could never be
"grounded" or intrinsically meaningful. But those in a machine that can
sense and react could be, provided the machine exhibited the requisite
rationality.

And this, as far as I can tell, is the end of what we learn from the "symbol
grounding" problem -- you've got to have sense organs. For a system that is
not causally isolated from the environment, the symbol-grounding problem now
just reduces to the old behavior-generating problem, for, if we could just
produce the behavior, there would be no question of the intentionality of the
symbols. In other words, once we've wised up enough to recognize that we must
include sensory systems (as symbolic AI has), we have completely disposed of
the "symbol grounding" problem, and all that's left to worry about is the
question of what kind of system can produce the requisite intelligent
behavior. That is, all that's left is the old behavior-generating problem.

Now as I've indicated, I think it's perfectly reasonable to suspect that the
symbolic approach is insufficient to produce full human performance. You
really don't have to issue any polemics on this point to me; such a suspicion
could well be justified by pointing out the triviality of AI's performance
achievements to date.

What I *don't* see is any more "principled" or "logical" or "methodological"
reason for such a suspicion; in particular, I don't understand how (B) could
provide such a reason.  My system can't produce intelligent performance
because it doesn't make its symbols meaningful? This statement has just got
things backwards -- if I could produce the behavior, you'd have to admit that
its symbols had all the "grounding" they needed for original intentionality.

In sum, apart from the considerations that require causal embedding, I don't
see that there *is* any "symbol-grounding" problem, at least not any problem
that is any different from the old "total-performance generating" problem.
For this reason, I think your animadversions on symbol grounding are largely
irrelevant to your position -- the really substantial claims pertain only to
"what looks like it's likely to work" for generating intelligent behavior.

On a more specific issue:
>
>You've bypassed the three points I brought up in replying to your
>challenge to my invertibility criterion for an analog transform the
>last time: (1) the quantization in standard A/D is noninvertible, 

Yes, but *my* point has been that since there isn't necessarily any more loss
here than there is in a typical A/A transformation, the "degree of
invertibility" criterion cross-cuts the intuitive A/D distinction.

Look, suppose we had a digitized image, A, which is of much higher resolution
than another analog one, B.  A is more invertible since it contains more
detail from which to reconstruct the original signal, but B is more
"shape-preserving" in an intuitive sense.  So, which do you regard as "more
analog"?  Which does your theory think is better suited to subserving our
categorization performance? If you say B, then invertibility is just not what
you're after.
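
To make the comparison concrete, a small sketch (illustrative only; the
resolutions and the smoothing kernel are arbitrary assumptions): A is a
finely quantized digital copy, B is a blurred but "shape-preserving" copy,
and measured by reconstruction error A comes out the more invertible of the
two:

    import numpy as np

    rng = np.random.default_rng(1)
    original = rng.uniform(0.0, 1.0, size=256)        # a toy image row

    A = np.round(original * 1024) / 1024              # digitized at high resolution
    B = np.convolve(original, np.ones(9) / 9, mode="same")   # blurred "analog" copy

    print(np.mean((original - A) ** 2))   # tiny error: A is the more invertible
    print(np.mean((original - B) ** 2))   # larger error: B loses fine detail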

Anders Weinstein
BBN Labs

marty1@houdi.UUCP (M.BRILLIANT) (06/13/87)

In article <6521@diamond.BBN.COM>, aweinste@Diamond.BBN.COM (Anders Weinstein) writes:
> ....
> 	(A) The symbols in a purely symbolic system will always be 
> 	    ungrounded because such systems can't generate real performance;
> ...
> It seems to me that .... thesis (A) is the one that makes perfect
> sense ....
> 
> ..... I think it's perfectly reasonable to suspect that the
> symbolic approach is insufficient to produce full human performance....

What exactly is this "purely" symbolic approach?  What impure approach
might be necessary?  "Purely symbolic" sounds like a straw man: a
system so purely abstract that it couldn't possibly relate to the real
world, and that nobody seriously trying to mimic human behavior would
even try to build anything that pure.

To begin with, any attempt to "produce full human performance" must
involve sensors, effectors, and motivations.  Does "purely symbolic"
preclude any of these?  If not, what is it in the definition of a
"purely symbolic" approach that makes it inadequate to pull these
factors together?

(Why do I so casually include motivations?  I'm an amateur actor.  Not
even a human can mimic another human without knowing about motivations.)

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

harnad@mind.UUCP (06/13/87)

aweinste@Diamond.BBN.COM (Anders Weinstein)
of BBN Laboratories, Inc., Cambridge, MA writes:

>	X has intrinsic intentionality (is "grounded") iff X can pass the TTT.
>	I thought from your postings that you shared this frankly behavioristic
>	philosophy... So what could it come to to say that symbolic AI must
>	inevitably choke on the grounding problem? Since grounding == behavioral
>	capability, all this claim can mean is that symbolic AI won't be able
>	to generate full TTT performance. I think, incidentally, that you're
>	probably right in this claim. However,...To say that your approach
>	is better grounded is only to say that it may work better (ie.
>	generate TTT performance); there's just no independent content to the
>	claim of "groundedness". Or do you have some non-behavioral definition
>	of intrinsic intentionality that I haven't yet heard?

I think that this discussion has become repetitious, so I'm going to
have to cut down on the words. Our disagreement is not substantive.
I am not a behaviorist. I am a methodological epiphenomenalist.
Intentionality and consciousness are not equivalent to behavioral
capacity, but behavioral capacity is our only objective basis for
inferring that they are present. Apart from behavioral considerations,
there are also functional considerations: What kinds of internal
processes (e.g., symbolic and nonsymbolic) look as if they might work?
and why? and how? The grounding problem accordingly has functional aspects
too. What are the right kinds of causal connections to ground a
system? Yes, the test of successful grounding is the TTT, but that
still leaves you with the problem of which kinds of connections are
going to work. I've argued that top-down symbol systems hooked to
transducers won't, and that certain hybrid bottom-up systems might. All
these functional considerations concern how to ground symbols, they are
distinct from (though ultimately, of course, dependent on) behavioral
success, and they do have independent content.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

marty1@houdi.UUCP (06/14/87)

In article <835@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:
> marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel writes:
> 
> >	Human visual processing is neither analog nor invertible.
> 
> Nor understood nearly well enough to draw the former two conclusions,
> it seems to me. If you are taking the discreteness of neurons, the
> all-or-none nature of the action potential, and the transformation of
> stimulus intensity to firing frequency as your basis for concluding
> that visual processing is "digital," the basis is weak, and the
> analogy with electronic transduction strained.

No, I'm taking more than that as the basis.  I don't have any
names handy, and I'm not a professional in neurobiology, but
I've seen many articles in Science and Scientific American
(including a classic paper titled something like "What the
frog's eye tells the frog's brain") that describe the flow of
visual information through the layers of the retina, and through
the layers of the visual cortex, with motion detection, edge
detection, orientation detection, etc., all going on in specific
neurons.  Maybe a neurobiologist can give a good account of what
all that means, so we can guess whether computer image
processing could emulate it.

> >	what is the intrinsic meaning of "intrinsically meaningful"?
> >	The Turing test is an objectively verifiable criterion. How can
> >	we objectively verify intrinsic meaningfulness?
> 
> We cannot objectively verify intrinsic meaningfulness. The Turing test
> is the only available criterion. Yet we can make inferences...

I think that substantiates Weinstein's position: we're back to
the behavior-generating problem.

> ....: We
> know the difference between looking up a meaning in an English/English
> dictionary versus a Chinese/Chinese dictionary (if we are nonspeakers
> of Chinese): The former symbols are meaningful and the latter are
> not.

Not relevant.  Intrinsically, words in both languages are
equally meaningful.

> >	Using "analog" to mean "invertible" invites misunderstanding,
> >	which invites irrelevant criticism.
> 
> ..... I have acknowledged all
> along that the physically invertible/noninvertible distinction may
> turn out to be independent of the A/D distinction, although the
> overlap looks significant. And I'm doing my best to sort out the
> misunderstandings and irrelevant criticism...

Then please stop using the terms analog and digital.

> 
> >	Human (in general, vertebrate) visual processing is a dedicated
> >	hardwired digital system.  It employs data reduction to abstract such
> >	features as motion, edges, and orientation of edges.  It then forms a
> >	map in which position is crudely analog to the visual plane, but
> >	quantized.  This map is sufficiently similar to maps used in image
> >	processing machines so that I can almost imagine how symbols could be
> >	generated from it.
> 
> I am surprised that you state this with such confidence. In
> particular, do you really think that vertebrate vision is well enough
> understood functionally to draw such conclusions? ...

Yes. See above.

> ... And are you sure
> that the current hardware and signal-analytic concepts from electrical
> engineering are adequate to apply to what we do know of visual
> neurobiology, rather than being prima facie metaphors?

Not the hardware concepts.  But I think some principles of
information theory are independent of the medium.

> >	By the time it gets to perception, it is not invertible, except with
> >	respect to what is perceived.  Noninvertibility is demonstrated in
> >	experiments in the identification of suspects.  Witnesses can report
> >	what they perceive, but they don't always perceive enough to invert
> >	the perceived image and identify the object that gave rise to the
> >	perception....
> >	.... If I am right, human intelligence itself relies on neither
> >	analog nor invertible symbol grounding, and therefore artificial
> >	intelligence does not require it.
> 
> I cannot follow your argument at all. Inability to categorize and identify
> is indeed evidence of a form of noninvertibility. But my theory never laid
> claim to complete invertibility throughout.....

First "analog" doesn't mean analog, and now "invertibility"
doesn't mean complete invertibility.  These arguments are
getting too slippery for me.

> .... Categorization and identification
> itself *requires* selective non-invertibility: within-category differences
> must be ignored and diminished, while between-category differences must
> be selected and enhanced.

Well, that's the point I've been making.  If non-invertibility
is essential to the way we process information, you can't say
non-invertibility would prevent a machine from emulating us.

Anybody can do hand-waving.  To be convincing, abstract
reasoning must be rigidly self-consistent.  Harnad's is not.
I haven't made any assertions as to what is possible.  All
I'm saying is that Harnad has come nowhere near proving his
assertions, or even making clear what his assertions are.

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

harwood@cvl.UUCP (06/14/87)

In article <843@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:
>
(... replying to Anders Weinstein ...who wonders "Where's the beef?" in
Steve Harnad's conceptual and terminological salad ...; uh - let me be first
to prophylactically remind us - lest there be any confusion, and forfending
that he should, perforce of intellectual scruple, need refer to his modest
accomplishments  - Steve Harnad is editor of Behavioral and Brain Sciences,
and I am not, of course. We - all of us - enjoy reading such high-class
stuff...;-)

	Anyway, Steve Harnad replies to A.W., re "Total Turing Tests", 
behavior, and the (great AI) "symbol grounding problem":

>I think that this discussion has become repetitious, so I'm going to
>have to cut down on the words.

	Praise the Lord - some insight - by itself, worthy of a pass of
the "Total Turing Test."

>... Our disagreement is not substantive.
>I am not a behaviorist. I am a methodological epiphenomenalist.

	I'm not a behaviorist, you're not a behaviorist, he's not a
behaviorist too ... We are all methodological solipsists hereabouts
on this planet, having already, incorrigibly, failed the "Total Turing
Test" for genuine intergalactic First Class rational beings, but so what?
(Please, Steve - this is NOT a test - I repeat - this is NOT a test of
your philosophical intelligence. It is an ACTUAL ALERT of your common
sense, not to mention your sense of humor. Please do not solicit BBS review of
this thesis...)

>... Apart from behavioral considerations,
>there are also functional considerations: What kinds of internal
>processes (e.g., symbolic and nonsymbolic) look as if they might work?
>and why? and how? The grounding problem accordingly has functional aspects
>too. What are the right kinds of causal connections to ground a
>system? Yes, the test of successful grounding is the TTT, but that
>still leaves you with the problem of which kinds of connections are
>going to work. I've argued that top-down symbol systems hooked to
>transducers won't, and that certain hybrid bottom-up systems might. All
>these functional considerations concern how to ground symbols, they are
>distinct from (though ultimately, of course, dependent on) behavioral
>success, and they do have independent content.
>-- 
>
>Stevan Harnad                                  (609) - 921 7771
>{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
>harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

	You know what the real problem with your postings is - it's
what I would call "the symbol grounding problem". You want to say obvious
things in the worst possible way, or else say abstract things in the
worst possible way. And ignore what others say. Also, for purposes of
controversial public discussion, ignore scientific 'facts' (e.g., about
neurologic perceptual equivalence) and standard usage of scientific
terminology and interpretation of theories. (Not that these are sacrosanct.)
	It seems to me that your particular "symbol grounding problem"
is indeed the sine qua non of the Total Turing Test for "real"
philosophers of human cognition. As I said, we are all methodological
solipsists hereabouts. However, if you want AI funding from me, I want to
see what real computing system, using your own architecture and object code
of at least 1 megabyte, has been designed by you. Then we will see how
your "symbols" are actually grounded, using the standard, naive but effective
denotational semantics for the "symbols" of your intention, qua "methodological
epiphenomenalist."

David Harwood

marty1@houdi.UUCP (06/14/87)

In article <843@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:
> 
> Intentionality and consciousness are not equivalent to behavioral
> capacity, but behavioral capacity is our only objective basis for
> inferring that they are present. Apart from behavioral considerations,
> there are also functional considerations: What kinds of internal
> processes (e.g., symbolic and nonsymbolic) look as if they might work?
> and why? and how? The grounding problem accordingly has functional aspects
> too. What are the right kinds of causal connections to ground a
> system? Yes, the test of successful grounding is the TTT, but that
> still leaves you with the problem of which kinds of connections are
> going to work. I've argued that top-down symbol systems hooked to
> transducers won't, and that certain hybrid bottom-up systems might. All
> these functional considerations concern how to ground symbols, they are
> distinct from (though ultimately, of course, dependent on) behavioral
> success, and they do have independent content.

Harnad's terminology has proved unreliable: analog doesn't mean
analog, invertible doesn't mean invertible, and so on.  Maybe
top-down doesn't mean top-down either.

Suppose we create a visual transducer feeding into an image
processing module that could delineate edges, detect motion,
abstract shape, etc.  This processor is to be built with a
hard-wired capability to detect "objects" without necessarily
finding symbols for them.

Next let's create a symbol bank, consisting of a large storage
area that can be partitioned into spaces for strings of
alphanumeric characters, with associated pointers, frames,
anything else you think will work to support a sophisticated
knowledge base.  The finite area means that memory will be
limited, but human memory can't really be infinite, either.

Next let's connect the two: any time the image processor finds
an object, the machine makes up a symbol for it.  When it finds
another object, it makes up another symbol and links that symbol
to the symbols for any other objects that are related to it in
ways that it knows about (some of which might be hard-wired
primitives): proximity in time or space, similar shape, etc.  It
also has to make up symbols for the relations it relies on to
link objects.  I'm over my head here, but I don't think I'm
asking for anything we think is impossible.  Basically, I'm
looking for an expert system that learns.
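
To give a flavor of the symbol bank and the object-to-symbol linking I have
in mind, here is a minimal sketch (all names, objects and relations are
hypothetical, invented only for illustration):

    from itertools import count

    class SymbolBank:
        """Toy symbol bank: mints a fresh symbol for each detected object and
        records links between symbols for whatever relations are noticed."""
        def __init__(self):
            self._ids = count()
            self.symbols = {}        # symbol -> object description
            self.links = []          # (symbol, relation, symbol) triples

        def new_symbol(self, description):
            sym = "OBJ%d" % next(self._ids)
            self.symbols[sym] = description
            return sym

        def link(self, a, relation, b):
            self.links.append((a, relation, b))

    bank = SymbolBank()
    cup = bank.new_symbol({"shape": "cylinder", "position": (3, 4)})
    saucer = bank.new_symbol({"shape": "disk", "position": (3, 3)})
    bank.link(cup, "near", saucer)   # proximity as a hard-wired primitive relation
    print(bank.symbols)
    print(bank.links)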

Now we decide whether we want to play a game, which is to make
the machine seem human, or whether we want the machine to
exhibit human behavior on the same basis as humans, that is, to
survive.  For the game, the essential step is to make the
machine communicate with us both visually and verbally, so it
can translate the character strings it made up into English, so
we can understand it and it can understand us.  For the survival
motivation, the machine needs a full set of receptors and
effectors, and an environment in which it can either survive or
perish, and if we built it right it will learn English for its
own reasons.  It could also endanger our survival.

Now, Harnad, Weinstein, anyone: do you think this could work,
or do you think it could not work?

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

aweinste@diamond.bbn.com.UUCP (06/14/87)

In article <1163@houdi.UUCP> marty1@houdi.UUCP (M.BRILLIANT) writes:
>> 	(A) The symbols in a purely symbolic system ...
>
>What exactly is this "purely" symbolic approach?  What impure approach
>might be necessary?  "Purely symbolic" sounds like a straw man ...

The phrase "purely symbolic" was just my short label for the AI strategy that
Stevan Harnad has been criticizing. Yes this strategy *does* encompass the
use of sensors and effectors and (maybe) motivations. Sorry if the term was
misleading; I was only using it as a pointer. Consult Harnad's postings for a
fuller characterization.

berleant@ut-sally.UUCP (06/15/87)

It is interesting that some (presumably significant) visual processing
occurs by graded potentials without action potentials. Receptor cells
(rods & cones), 'horizontal cells' which process the graded output of
the receptors, and 'bipolar cells' which do further processing, use no
action potentials to do it. This seems to indicate the significance of
analog processing to vision.

There may also be significant invertibility at these early stages of
visual processing in the retina: One photon can cause several hundred
sodium channels in a rod cell to close. Such sensitivity suggests a need
for precise representation of visual stimuli which suggests the
representation might be invertible.

Furthermore, the retina cannot be viewed as a module only loosely
coupled to the brain. The optic nerve, which does the coupling, has a
high bandwidth and thus carries much information simultaneously along
many fibers. In fact, the optic nerve carries a topographic
representation of the retina. To the degree that a topographic
representation is an iconic representation, the brain thus receives an
iconic representation of the visual field. 

Furthermore, even central processing of visual information is
characterized by topographic representations. This suggests that iconic
representations are important to the later stages of perceptual
processing. Indeed, all of the sensory systems seem to rely on
topographic representations (particularly touch and hearing as well as
vision).

An interesting example in hearing is direction perception. Direction
seems to be, as I understand it, found by processing the difference in
time from when a sound reaches one ear to when it reaches the other, in
large part. The resulting direction is presumably an invertible
representation of that time difference.
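
As a rough illustration of that computation (the 0.20 m ear separation and
343 m/s speed of sound are assumed round figures, and real auditory
processing is of course more involved), the interaural time difference maps
one-to-one onto azimuth within a hemifield, and so is invertible:

    import math

    SPEED_OF_SOUND = 343.0    # m/s (assumed)
    EAR_SEPARATION = 0.20     # m (assumed round figure)

    def azimuth_from_itd(delta_t):
        """Invert an interaural time difference (seconds) to an azimuth (degrees).
        Within one hemifield the mapping is one-to-one, hence invertible."""
        ratio = max(-1.0, min(1.0, SPEED_OF_SOUND * delta_t / EAR_SEPARATION))
        return math.degrees(math.asin(ratio))

    for dt in (0.0, 0.0002, 0.0004, 0.00058):
        print(dt, round(azimuth_from_itd(dt), 1))   # 0.0, 20.1, 43.3, 84.1 degrees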

Dan Berleant
UUCP: {gatech,ucbvax,ihnp4,seismo,kpno,ctvax}!ut-sally!berleant
ARPA: ai.berleant@r20.utexas.edu

harnad@mind.UUCP (06/15/87)

In two consecutive postings marty1@houdi.UUCP (M.BRILLIANT)
of AT&T Bell Laboratories, Holmdel wrote:

>	the flow of visual information through the layers of the retina,
>	and through the layers of the visual cortex, with motion detection,
>	edge detection, orientation detection, etc., all going on in specific
>	neurons... Maybe a neurobiologist can give a good account of what
>	all that means, so we can guess whether computer image
>	processing could emulate it.

As I indicated the last time, neurobiologists don't *know* what all
those findings mean. It is not known how features are detected and by
what. The idea that single cells are doing the detecting is just a
theory fragment, and one that has currently fallen on hard times. Rivals
include distributed networks (of which the cell is just a component),
or spatial frequency detectors, or coding at some entirely different
level, such as continuous postsynaptic potentials, local circuits,
architectonic columns or neurochemistry. Some even think that the
multiple analog retinas at various levels of the visual system (12 on
each side, at last count) may have something to do with feature
extraction. One cannot just take current neurophysiological data and
replace the nonexistent theory by preconceptions from machine vision
-- especially not by way of justifying the machine-theoretic concepts.

>>     		>[SH:] my theory never laid claim to complete invertibility
>>		>throughout.
>
>	First "analog" doesn't mean analog, and now "invertibility"
>	doesn't mean complete invertibility.  These arguments are
>	getting too slippery for me... If non-invertibility is essential
>	to the way we process information, you can't say non-invertibility
>	would prevent a machine from emulating us.

I have no idea what proposition you think you were debating here. I
had pointed out a problem with the top-down symbolic approach to
mind-modeling -- the symbol grounding problem -- which suggested that
symbolic representations would have to be grounded in nonsymbolic
representations. I had also sketched a model for categorization that
attempted to ground symbolic representations in two nonsymbolic kinds
of representations -- iconic (analog) representations and categorical
(feature-filtered) representations. I also proposed a criterion for
analog transformations -- invertibility. I never said that categorical
representations were invertible or that iconic representations were
the only nonsymbolic representations you needed to ground symbols. Indeed,
most of the CP book under discussion concerns categorical representations.

>	All I'm saying is that Harnad has come nowhere near proving his
>	assertions, or even making clear what his assertions are...
>	Harnad's terminology has proved unreliable: analog doesn't mean
>	analog, invertible doesn't mean invertible, and so on.  Maybe
>	top-down doesn't mean top-down either...
>	Anybody can do hand-waving.  To be convincing, abstract
>	reasoning must be rigidly self-consistent.  Harnad's is not.
>	I haven't made any assertions as to what is possible.

Invertibility is my candidate criterion for an analog transform. Invertible
means invertible, top-down means top-down. Where further clarification is
needed, all one need do is ask.

Now here is M. B. Brilliant's "Recipe for a symbol-grounder" (not to be
confused with an assertion as to what is possible):

>	Suppose we create a visual transducer... with hard-wired
>	capability to detect "objects"... Next let's create a symbol bank
>	Next let's connect the two... I'm over my head here, but I don't
>	think I'm asking for anything we think is impossible. Basically,
>	I'm looking for an expert system that learns... the essential step
>	is to make the machine communicate with us both visually and verbally,
>	so it can translate the character strings it made up into English, so
>	we can understand it and it can understand us. For the survival
>	motivation, the machine needs a full set of receptors and
>	effectors, and an environment in which it can either survive or
>	perish, and if we built it right it will learn English for its
>	own reasons. Now, Harnad, Weinstein, anyone: do you think this
>	could work, or do you think it could not work?

Sounds like a conjecture about a system that would pass the TTT.
Unfortunately, the rest seems far too vague and hypothetical to respond to.

If you want me to pay attention to further postings of yours, stay
temperate and respectful as I endeavor to do. Dismissive rhetoric will not
convince anyone, and will not elicit substantive discussion.

-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

berleant@ut-sally.UUCP (06/15/87)

In article <835@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:

>We cannot objectively verify intrinsic meaningfulness. The Turing test
>is the only available criterion.

Yes, the Turing test is by definition subjective, and also subject to
variable results from hour to hour even from the same judge.

But I think I disagree that intrinsic meaningfulness cannot be
objectively verified. What about the model theory of logic?


Dan Berleant
UUCP: {gatech,ucbvax,ihnp4,seismo,kpno,ctvax}!ut-sally!berleant
ARPA: ai.berleant@r20.utexas.edu

harnad@mind.UUCP (06/15/87)

berleant@ut-sally.UUCP (Dan Berleant) of U. Texas CS Dept., Austin, Texas
has posted this welcome reminder:

>	the retina cannot be viewed as a module, only loosely
>	coupled to the brain. The optic nerve, which does the coupling, has a
>	high bandwidth and thus carries much information simultaneously along
>	many fibers. In fact, the optic nerve carries a topographic
>	representation of the retina. To the degree that a topographic
>	representation is an iconic representation, the brain thus receives an
>	iconic representation of the visual field. 

>	Furthermore, even central processing of visual information is
>	characterized by topographic representations. This suggests that iconic
>	representations are important to the later stages of perceptual
>	processing. Indeed, all of the sensory systems seem to rely on
>	topographic representations (particularly touch and hearing as well as
>	vision).

As I mentioned in my last posting, at last count there were 12 pairs
of successively higher analog retinas in the visual system. No one yet
knows what function they perform, but they certainly suggest that it
is premature to dismiss the importance of analog representations in at
least one well optimized system...

>	Yes, the Turing test is by definition subjective, and also subject to
>	variable results from hour to hour even from the same judge.
>	But I think I disagree that intrinsic meaningfulness cannot be
>	objectively verified. What about the model theory of logic?

In earlier postings I distinguished between two components of the
Turing Test. One is the formal, objective one: Getting a system to generate
all of our behavioral capacities. The second is the informal,
intuitive (and hence subjective) one: Can a person tell such a device
apart from a person? This version must be open-ended, and is no better
or worse than -- in fact, I argue that it is identical to -- the
real-life turing-testing we do of one another in contending with the
"other minds" problem.

The subjective verification of intrinsic meaning, however, is not done
by means of the informal turing test. It is done from the first-person
point of view. Each of us knows that his symbols (his linguistic ones,
at any rate) are grounded, and refer to objects, rather than being
meaningless syntactic objects manipulated on the basis of their shapes.

I am not a model theorist, so the following reply may be inadequate, but it
seems to me that the semantic model for an uninterpreted formal system
in formal model-theoretic semantics is always yet another formal
object, only its symbols are of a different type from the symbols of the
system that is being interpreted. That seems true of *formal* models.
Of course, there are informal models, in which the intended interpretation
of a formal system corresponds to conceptual or even physical objects. We
can say that the intended interpretation of the primitive symbol tokens
and the axioms of formal number theory is "numbers," by which we mean
either our intuitive concept of numbers or whatever invariant physical
property quantities of objects share. But such informal interpretations 
are not what formal model theory trades in. As far as I can tell,
formal models are not intrinsically grounded, but depend on our
concepts and our linking them to real objects. And of course the
intrinsic grounding of our concepts and our references to objects is
what we are attempting to capture in confronting the symbol grounding
problem.

I hope model theorists will correct me if I'm wrong. But even if the
model-theoretic interpretation of some formal symbol systems can truly
be regarded as the "objects" to which it refers, it is not clear that
this can be generalized to natural language or to the "language of
thought," which must, after all, have Total-Turing-Test scope, rather
than the scope of the circumscribed artificial languages of logic and
mathematics. Is there any indication that all that can be formalized
model-theoretically?
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

harnad@mind.UUCP (06/15/87)

Ken Laws <Laws@Stripe.SRI.Com> on ailist@Stripe.SRI.Com writes:

>	Consider a "hash transformation" that maps a set of "intuitively
>	meaningful" numeric symbols to a set of seemingly random binary codes.
>	Suppose that the transformation can be computed by some [horrendous]
>	information-preserving mapping of the reals to the reals.  Now, the
>	hash function satisfies my notion of an analog transformation (in the
>	signal-processing sense). When applied to my discrete input set,
>	however, the mapping does not seem to be analog (in the sense of
>	preserving isomorphic relationships between pairs -- or higher
>	orders -- of symbolic codes). Since information has not been lost,
>	however, it should be possible to define "relational functions" that
>	are analogous to "adjacency" and other properties in the original
>	domain.  Once this is done, surely the binary codes must be viewed
>	as isomorphic to the original symbols rather than just "standing for
>	them".

I don't think I disagree with this. Don't forget that I bit the bullet
on some surprising consequences of taking my invertibility criterion
for an analog transform seriously. As long as the requisite
information-preserving mapping or "relational function" is in the head
of the human interpreter, you do not have an invertible (hence analog)
transformation. But as soon as the inverse function is wired in
physically, producing a dedicated invertible transformation, you do
have invertibility, even if a lot of the stuff in between is as
discrete, digital and binary as it can be.
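To illustrate the kind of case at issue, here is a minimal sketch assuming
an invented multiplicative scrambling mod 2**32 (a stand-in, not Ken Laws's
actual hash function). The mapping looks arbitrary, is thoroughly discrete
and binary, yet loses no information; and because the inverse is computed
once and built into the pair of functions, nothing depends on an outside
interpreter to undo it.

MODULUS = 2 ** 32
MULTIPLIER = 2654435761                   # odd, so the mapping is one-to-one mod 2**32
INVERSE = pow(MULTIPLIER, -1, MODULUS)    # the "wired-in" inverse (Python 3.8+)

def encode(code):
    # Scramble a code into a seemingly arbitrary binary pattern.
    return (code * MULTIPLIER) % MODULUS

def decode(scrambled):
    # Recover the original exactly: the mapping is information-preserving.
    return (scrambled * INVERSE) % MODULUS

assert all(decode(encode(x)) == x for x in range(10000))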

I'm not unaware of this counterintuitive property of the invertibility
criterion -- or even of the possibility that this property may ultimately
do the criterion in as an attempt to capture the essential feature of an
analog transform in general. Invertibility could fail to capture the
standard A/D distinction,
but may be important in the special case of mind-modeling. Or it could
turn out not to be useful at all. (Although Ken Laws's point seems to
strengthen rather than weaken my criterion, unless I've misunderstood.)

Note, however, that what I've said about the grounding problem and the role
of nonsymbolic representations (analog and categorical) would stand
independently of my particular criterion for analog; substituting a more
standard one leaves just about all of the argument intact. Some of the prior
commentators (not Ken Laws) haven't noticed that, criticizing
invertibility as a criterion for analog and thinking that they were
criticizing the symbol grounding problem.

>	The "information" in a signal is a function of your methods for
>	extracting and interpreting the information.  Likewise the "analog
>	nature" of an information-preserving transformation is a function
>	of your methods for decoding the analog relationships.

I completely agree. But to get the requisite causality I'm looking
for, the information must be interpretation-independent. Physical
invertibility seems to give you that, even if it's generated by
hardwiring the encryption/decryption (encoding/decoding) scheme underlying
the interpretation into a dedicated system.

>	Perhaps [information theorists] have too limited (or general!)
>	a view of information, but they have certainly considered your
>	problem of decoding signal shape (as opposed to detecting modulation
>	patterns)... I am sure that methods for decoding both discrete and
>	continuous information in continuous signals are well studied.

I would be interested to hear from those who are familiar with such work.
It may be that some of it is relevant to cognitive and neural modeling
and even the symbol grounding problems under discussion here.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

marty1@houdi.UUCP (M.BRILLIANT) (06/16/87)

In article <849@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:
> .... Invertibility could fail to capture the standard A/D distinction,
> but may be important in the special case of mind-modeling. Or it could
> turn out not to be useful at all....

So what do you think is essential: (A) literally analog transformation,
(B) invertibility, or (C) preservation of significant relational
functions?

> ..... what I've said about the grounding problem and the role
> of nonsymbolic representations (analog and categorical) would stand
> independently of my particular criterion for analog; substituting a more
> standard one leaves just about all of the argument intact.....

Where does that argument stand now?  Can we restate it in terms whose
definitions we all agree on?

> ..... to get the requisite causality I'm looking
> for, the information must be interpretation-independent. Physical
> invertibility seems to give you that......

I think invertibility is too strong.  It is sufficient, but not
necessary, for human-style information-processing.  Real people forget
awesome amounts of detail, we misunderstand each other (our symbol
groundings are not fully invertible), and we thereby achieve levels of
communication that often, but not always, satisfy us.

Do you still say we only need transformations that are analog
(invertible) with respect to those features for which they are analog
(invertible)?  That amounts to limited invertibility, and the next
essential step would be to identify the features that need
invertibility, as distinct from those that can be thrown away.

> Ken Laws <Laws@Stripe.SRI.Com> on ailist@Stripe.SRI.Com writes:
> >	... I am sure that methods for decoding both discrete and
> >	continuous information in continuous signals are well studied.
> 
> I would be interested to hear from those who are familiar with such work.
> It may be that some of it is relevant to cognitive and neural modeling
> and even the symbol grounding problems under discussion here.

I'm not up to date on these methods.  But if you want to get responses
from experts, it might be well to be more specific.  For monaural
sound, decoding can be done with Fourier methods that are in principle
continuous.  For monocular vision, Fourier methods are used for image
enhancement to aid in human decoding, but I think machine decoding
depends on making the spatial dimensions discontinuous and comparing the
content of adjacent cells.
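As a minimal sketch of the Fourier decoding mentioned here for monaural
sound -- assuming NumPy, an arbitrary 440 Hz test tone and a made-up
sampling rate -- the dominant frequency of a continuous (sampled) signal
can be read off the transform:

import numpy as np

RATE = 8000                                   # samples per second (assumed)
t = np.arange(RATE) / RATE                    # one second of signal
signal = np.sin(2 * np.pi * 440.0 * t) + 0.1 * np.random.randn(RATE)

spectrum = np.abs(np.fft.rfft(signal))        # Fourier analysis of the sampled signal
freqs = np.fft.rfftfreq(signal.size, d=1.0 / RATE)
print("dominant frequency:", freqs[np.argmax(spectrum)], "Hz")   # about 440 Hz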

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

aweinste@Diamond.BBN.COM (Anders Weinstein) (06/17/87)

In article <849@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:
>                                   As long as the requisite
>information-preserving mapping or "relational function" is in the head
>of the human interpreter, you do not have an invertible (hence analog)
>transformation. But as soon as the inverse function is wired in
>physically, producing a dedicated invertible transformation, you do
>have invertibility, ...

This seems to relate to a distinction between "physical invertibility" and
plain old invertibility, another of your points which I haven't understood.

I don't see any difference between "physical" and "merely theoretical"
invertibility.  If a particular physical transformation of a signal is
invertible in theory, then I'd imagine we could always build a device to
perform the actual inversion if we wanted to. Such a device would of course
be a physical device; hence the invertibility would seem to count as
"physical," at least in the sense of "physically possible".

Surely you don't mean that a transformation-inversion capability must
actually be present in the device for it to count as "analog" in your sense.
(Else brains, for example, wouldn't count). So what difference are you trying
to capture with this distinction?

Anders Weinstein
BBN Labs

harnad@mind.UUCP (Stevan Harnad) (06/17/87)

marty1@houdi.UUCP (M.BRILLIANT) asks:

>	what do you think is essential: (A) literally analog transformation,
>	(B) invertibility, or (C) preservation of significant relational
>	functions?

Essential for what? For (i) generating the pairwise same/different judgments,
similarity judgments and matching that I've called, collectively,
"discrimination", and for which I've hypothesized that there are
iconic ("analog") representations? For that I think invertibility is
essential. (I think that in most real cases what is actually
physically invertible in my sense will also turn out to be "literally
analog" in a more standard sense. Dedicated digital equivalents that
would also have yielded invertibility will be like a Rube-Goldberg
alternative; they will have a much bigger processing cost. But for my
purposes, the dedicated digital equivalent would in principle serve
just as well. Don't forget the *dedicated* constraint though.)

For (ii) generating the reliable sorting and labeling of objects on the
basis of their sensory projections, which I've called collectively,
"identification" or "categorization"? For that I think only distinctive
features need to be extracted from the sensory projection. The rest need
not be invertible. Iconic representations are one-to-one with the
sensory projection; categorical representations are many-to-few.
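A toy sketch of this contrast, with an invented "sensory projection": the
iconic transform is one-to-one and recoverable, while the categorical one
is many-to-few, keeping only a feature and returning a label.

def iconic(projection):
    # One-to-one: a rescaled copy from which the projection can be recovered.
    return [2.0 * x + 1.0 for x in projection]

def invert_iconic(icon):
    return [(x - 1.0) / 2.0 for x in icon]

def categorical(projection, threshold=0.5):
    # Many-to-few: only the feature "mean intensity above threshold" survives,
    # and what comes out is a label, not an image.
    return "bright" if sum(projection) / len(projection) > threshold else "dark"

projection = [0.25, 0.875, 0.75, 0.5]                    # a made-up sensory projection
assert invert_iconic(iconic(projection)) == projection   # invertible
print(categorical(projection))                           # label only; many projections map here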

But if you're not talking about sensory discrimination or about
stimulus categorization but about, say, (iii) conscious problem-solving,
deduction, or linguistic description, then relation-preserving
symbolic representations would be optimal -- only the ones I advocate
would not be autonomous (modular). The atomic terms of which they were
composed would be the labels of categories in the above sense, and hence they
would be grounded in and constrained by the nonsymbolic representations.
They would preserve relations not just in virtue of their syntactic
form, as mediated by an interpretation; their meanings would be "fixed"
by their causal connections with the nonsymbolic representations that
ground their atoms.

But if your question concerns what I think is necessary to pass the
Total Turing Test (TTT), I think you need all of (i) - (iii), grounded
bottom-up in the way I've described.

>	Where does [the symbol grounding] argument stand now? Can we
>	restate it in terms whose definitions we all agree on?

The symbols of an autonomous symbol-manipulating module are
ungrounded. Their "meanings" depend on the mediation of human
interpretation. If an attempt is made to "ground" them merely by
linking the symbolic module with input/output modules in a dedicated
system, all you will ever get is toy models: Small, nonrepresentative,
nongeneralizable pieces of intelligent performance (a valid objective for
AI, by the way, but not for cognitive modeling). This is only a
conjecture, however, based on current toy performance models and the
kind of thing it takes to make them work. If a top-down symbolic
module linked to peripherals could successfully pass the TTT that way,
however, nothing would be left of the symbol grounding problem.

My own alternative has to do with the way symbolic models work (and
don't work). The hypothesis is that a hybrid symbolic/nonsymbolic
model along the lines sketched above will be needed in order to pass
the TTT. It will require a bottom-up, nonmodular grounding of its
symbolic representations in nonsymbolic representations: iconic
( = invertible with the sensory projection) and categorical ( = invertible
only with the invariant features of category members that are preserved
in the sensory projection and are sufficient to guide reliable
categorization).

>	I think invertibility is too strong. It is sufficient, but not
>	necessary, for human-style information-processing. Real people
>	forget... misunderstand...

I think this is not the relevant form of evidence bearing on this
question.  Sure we forget, etc., but the question concerns what it takes
to get it right when we actually do get it right. How do we discriminate,
categorize, identify and describe things as well as we do (TTT-level)
based on the sensory data we get? And I have to remind you again:
categorization involves at least as much selective *non*invertibility
as it does invertibility. Invertibility is needed where it's needed;
it's not needed everywhere, indeed it may even be a handicap (see
Luria's "Mind of a Mnemonist," which is about a person who seems to
have had such vivid, accurate and persisting eidetic imagery that he
couldn't selectively ignore or forget sensory details, and hence had
great difficulty categorizing, abstracting and generalizing; Borges
describes a similar case in "Funes the Memorious," and I discuss the
problem in "Metaphor and Mental Duality," a chapter in Simon & Sholes' (eds.)
"Language, Mind and Brain," Academic Press 1978).

>	Do you still say [1] we only need transformations that are analog
>	(invertible) with respect to those features for which they are analog
>	(invertible)?  That amounts to limited invertibility, and the next
>	essential step would be [2] to identify the features that need
>	invertibility, as distinct from those that can be thrown away.

Yes, I still say [1]. And yes, the category induction problem is [2].
Perhaps with the three-level division-of-labor I've described a
connectionist algorithm or some other inductive mechanism would be
able to find the invariant features that will subserve a sensory
categorization from a given sample of confusable alternatives. That's
the categorical representation.
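As an illustration only, a simple perceptron-style update can stand in for
whatever connectionist or other inductive mechanism actually finds the
features; the feature vectors, labels and learning rate below are invented
toy data, and the feedback alone defines which responses count as right or
wrong.

# Toy samples: two-feature "sensory projections" with corrective feedback labels.
samples = [([1.0, 0.2], 1), ([0.9, 0.4], 1), ([0.2, 0.9], 0), ([0.1, 0.7], 0)]
weights = [0.0, 0.0]
bias = 0.0

for _ in range(20):                          # repeated passes over the sample
    for features, label in samples:
        activation = sum(w * f for w, f in zip(weights, features)) + bias
        predicted = 1 if activation > 0 else 0
        error = label - predicted            # feedback defines "right" and "wrong"
        weights = [w + 0.1 * error * f for w, f in zip(weights, features)]
        bias += 0.1 * error

print(weights, bias)    # a feature weighting sufficient to sort this sample reliably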
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

aweinste@Diamond.BBN.COM (Anders Weinstein) (06/18/87)

In article <861@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:
>                                   The atomic terms of which they were
>composed would be the labels of categories in the above sense, and hence they
>would be grounded in and constrained by the nonsymbolic representations.
>They would preserve relations not just in virtue of their syntactic
>form, as mediated by an interpretation; their meanings would be "fixed"
>by their causal connections with the nonsymbolic representations that
>ground their atoms.

I don't know how significant this is for your theory, but I think it's worth
emphasizing that the *semantic* meaning of a symbol is still left largely
unconstrained even after you take account of its "grounding" in perceptual
categorization.  This is because what matters for intentional content is not
the objective property in the world that's being detected, but rather how the
subject *conceives* of that external property, a far more slippery notion.

This point is emphasized in a different context in the Churchlands' BBS reply
to Dretske's "Knowledge and the Flow of Information." To paraphrase one of
their examples: primitive people may be able to reliably categorize certain
large-scale atmospheric electrical discharges; nevertheless, the semantic
content of their corresponding states might be "Angry gods nearby" or some
such. Indeed, by varying their factual beliefs we could invent cases where
the semantic content of these states is just about anything you please.
Semantic content is a holistic matter.

Another well-known obstacle to moving from an objective to an intentional
description is that the latter contains an essentially normative component,
in that we must make some distinction between correct and erroneous
classification. For example, we'd probably like to say that a frog has a
fly-detector which is sometimes wrong, rather than a "moving-spot-against-a-
fixed-background" detector which is infallible. Again, this distinction seems
to depend on fuzzy considerations about the purpose or functional role of the
concept in question.

Some of the things you say also suggest that you're attempting to resuscitate
a form of classical empiricist sensory atomism, where the "atomic" symbols
refer to sensory categories acquired "by acquaintance" and the meaning of
complex symbols is built up from the atoms "by description". This approach
has an honorable history in philosophy; unfortunately, no one has ever been
able to make it work. In addition to the above considerations, the main
problems seem to be: first, that no principled distinction can be made
between the simple sensory concepts and the complex "theoretical" ones; and
second, that very little that is interesting can be explicitly defined in
sensory terms (try, for example, "chair").

I realize the above considerations may not be relevant to your program -- I
just can't tell to what extent you expect it to shed any light on the problem
of explaining semantic content in naturalistic terms. In any case, I think
it's important to understand why this fundamental problem remains largely
untouched by such theories.

Anders Weinstein
BBN Labs

marty1@houdi.UUCP (M.BRILLIANT) (06/20/87)

In article <861@mind.UUCP>, harnad@mind.UUCP writes:
> marty1@houdi.UUCP (M.BRILLIANT) asks:
> 
> >	what do you think is essential: (A) literally analog transformation,
> >	(B) invertibility, or (C) preservation of significant relational
> >	functions?
> 
Let me see if I can correctly rephrase his answer:

(i) "discrimination" (pairwise same/different judgments) he associates
with iconic ("analog") representations, which he says have to be
invertible, and will ordinarily be really analog because "dedicated"
digital equivalents will be too complex.

(ii) for "identification" or "categorization" (sorting and labeling of
objects), he says only distinctive features need be extracted from the
sensory projection; this process is not invertible.

(iii) for "conscious problem-solving," etc., he says relation-preserving
symbolic representations would be optimal, if they are not "autonomous
(modular)" but rather are grounded by deriving their atomic symbols
through the categorization process above.

(iv) to pass the Total Turing Test he wants all of the above, tied
together in the sequence described.

I agree with this formulation in most of its terms.  But some of the
terms are confusing, in that if I accept what I think are good
definitions, I don't entirely agree with the statements above.

"Invertible/Analog": The property of invertibility is easy to visualize
for continuous functions. First, continuous functions are what I would
call "analog" transformations.  They are at least locally image-forming
(iconic). Then, saying a continuous transformation is invertible, or
one-to-one, means it is monotonic, like a linear transformation, rather
than many-to-one like a parabolic transformation.  That is, it is
unambiguously iconic.
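A small numeric illustration of this point, with assumed example functions:
a monotonic (linear) transform can be undone, while a parabolic one cannot,
because it maps many inputs to one output.

def monotonic(x):             # one-to-one, hence invertible
    return 3.0 * x + 2.0

def monotonic_inverse(y):
    return (y - 2.0) / 3.0

def parabolic(x):             # many-to-one: x and -x give the same value
    return x * x

print(monotonic_inverse(monotonic(1.5)))    # 1.5 recovered exactly
print(parabolic(2.0), parabolic(-2.0))      # 4.0 4.0 -- the sign is irrecoverable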

It might be argued that physical sensors can be ambiguously iconic,
e.g., an object seen in a half-silvered mirror.  Harnad would argue
that the ambiguity is inherent in the physical scene, and is not
dependent on the sensor.  I would agree with that if no human sensory
system ever gave ambiguous imaging of unambiguous objects.  What about
the ambiguity of stereophonic location of sound sources?  In that case
the imaging (i) is unambiguous; only the perception (ii) is ambiguous.

But physical sensors are also noisy.  In mathematical terms, that noise
could be modeled as discontinuity, as many-to-one, as one-to-many, or
combinations of these.  The noisy transformation is not invertible. 
But a "physically analog" sensory process (as distinct from a digital
one) can be approximately modeled (to within the noise) by a continuous
transformation.  The continuous approximation allows us to regard the
analog transformation as image-forming (iconic).  But only the
continuous approximation is invertible.

"Autonomous/Modular": The definition of "modular" is not clear to me. 
I have Harnad's definition "not analogous to a top-down, autonomous
symbol-crunching module ... hardwired to peripheral modules."  The
terms in the definition need defining themselves, and I think there are
too many of them.

I would rather look at the "hybrid" three-layer system and say it does
not have a "symbol-cruncher hardwired to peripheral modules" because
there is a feature extractor (and classifier) in between.  The main
point is the presence or absence of the feature extractor.

The symbol-grounding problem arises because the symbols are discrete,
and therefore have to be associated with discrete objects or classes.
Without the feature extractor, there would be no way to derive discrete
objects from the sensory inputs.  The feature extractor obviates the
symbol-grounding problem.  I consider the "symbol-cruncher hardwired to
peripheral modules" to be not only a straw man but a dead horse.

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

marty1@houdi.UUCP (M.BRILLIANT) (06/22/87)

In article <6670@diamond.BBN.COM>, aweinste@Diamond.BBN.COM (Anders
Weinstein) writes, with reference to article <861@mind.UUCP>
harnad@mind.UUCP (Stevan Harnad):
> 
> Some of the things you say also suggest that you're attempting to resuscitate
> a form of classical empiricist sensory atomism, where the "atomic" symbols
> refer to sensory categories acquired "by acquaintance" and the meaning of
> complex symbols is built up from the atoms "by description". This approach
> has an honorable history in philosophy; unfortunately, no one has ever been
> able to make it work. In addition to the above considerations, the main
> problems seem to be: first, that no principled distinction can be made
> between the simple sensory concepts and the complex "theoretical" ones; and
> second, that very little that is interesting can be explicitly defined in
> sensory terms (try, for example, "chair").
> 
I hope none of us are really trying to resuscitate classical philosophies,
because the object of this discussion is to learn how to use modern
technologies.  To define an interesting object in sensory terms requires
an intermediary module between the sensory system and the symbolic system.

With a chair in the visual sensory field, the system will use hard-coded
nonlinear (decision-making) techniques to identify boundaries and shapes
of objects, and identify the properties that are invariant to rotation
and translation.  A plain wooden chair and an overstuffed chair will be
different objects in these terms.  But the system might also learn to
identify certain types of objects that move, i.e., those we call people.
If it notices that people assume the same position in association with
both chair-objects, it could decide to use the same category for both.

The key to this kind of classification is that the chair is not defined in
explicit sensory terms but in terms of filtered sensory input.

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

P.S. Sorry for the double posting of my previous article.

adam@gec-mi-at.co.uk (Adam Quantrill) (06/23/87)

In article <6521@diamond.BBN.COM> aweinste@Diamond.BBN.COM (Anders Weinstein) writes:
>
>To elaborate: I presume the "symbol-grounding" problem is a *philosophical*
>question: what gives formal symbols original intentionality? I suppose the
>only answer anybody knows is, in brief, that the symbols must be playing a
>certain role in what Dennett calls an "intentional system", that is, a system
>which is capable of producing complex, adaptive behavior in a rational way.
>[]
>And this, as far as I can tell, is the end of what we learn from the "symbol
>grounding" problem -- you've got to have sense organs. []

It seems to me that the Symbol Grounding problem is a red herring. If I
took a partially self-learning program and data (P & D) that had learnt
from a computer with 'sense organs', and ran it on a computer without,
would the program's output become symbolically ungrounded?

Similarly, if I myself wrote P & D without running it on a computer at
all, where's the difference? Surely it is possible that I can come up
with identical P & D by analysis. Does that make the original P & D
running on the computer with 'sense organs' symbolically ungrounded?

A computer can always interact via the keyboard & terminal screen (if
those are the only 'sense organs'), grounding its internal symbols via
people who react to the output and provide further stimulus.
       -Adam.

/* If at first it don't compile, kludge, kludge again.*/

harnad@mind.UUCP (Stevan Harnad) (06/26/87)

John Cugini <Cugini@icst-ecf.arpa> on ailist@stripe.sri.com writes:

>	What if there were a few-to-one transformation between the skin-level
>	sensors (remember Harnad proposes "skin-and-in" invertibility
>	as being necessary for grounding) and the (somewhat more internal)
>	iconic representation.  My example was to suppose that #1: 
>	a combination of both red and green retinal receptors and #2 a yellow
>	receptor BOTH generated the same iconic yellow.
>	Clearly this iconic representation is non-invertible back out to the
>	sensory surfaces, but intuitively it seems like it would be grounded
>	nonetheless - how about it?

Invertibility is a necessary condition for iconic representation, not
for grounding.  Grounding symbolic representations (according to my
hypothesis) requires both iconic and categorical representations. The
latter are selective, many-to-few, invertible only in the features
they pick out and, most important, APPROXIMATE (e.g., as between
red-green and yellow in your example above). This point has by now
come up several times...
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

marty1@houdi.UUCP (M.BRILLIANT) (06/26/87)

In article <914@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:
> Invertibility is a necessary condition for iconic representation, not
> for grounding.  Grounding symbolic representations (according to my
> hypothesis) requires both iconic and categorical representations...

Syllogism:
    (a) grounding ... requires ... iconic ... representation....
    (b) invertibility is ... necessary ... for iconic representation.
    (c) hence, grounding must require invertibility.

Why then does harnad say "invertibility is a necessary condition
for ..., NOT for grounding" (caps mine, of course)?

This discussion is getting hard to follow.  Does it have to be carried
on simultaneously in both comp.ai and comp.cog-eng?  Could harnad, who
seems to be the major participant, pick one?

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

harnad@mind.UUCP (Stevan Harnad) (06/26/87)

berleant@ut-sally.UUCP (Dan Berleant) of U. Texas CS Dept., Austin, Texas
writes:

>	Are you saying that the categorical representations are to be
>	nonsymbolic?  The review of human concept representation I recently read
>	(Smith and Medin, Categories and Concepts, 1981) came down... hard on
>	the holistic theory of concept representation... The alternative
>	nonsymbolic approach would be the 'dimensional' one. It seems a
>	strongish statement to say that this would be sufficient, to the
>	exclusion of symbolic properties... However, the metric
>	hypothesis -- that a concept is sufficiently characterized by a point
>	in a multi-dimensional space -- seems wrong, as experiments have shown.

Categorical representations are the representations of purely SENSORY
categories, and I am indeed saying that they are to be NONsymbolic.
Let me also point out that the theory I am putting forward represents
a direct challenge to the Roschian line of category research in which
the book you cite belongs. To put it very briefly, I claim that that
line of experimental and theoretical work is not really investigating
the representations underlying the capacity to categorize at all; it is
only looking at the fine tuning of category judgments. The experiments
are typically not addressing the question of how it is that a device
or organism can successfully categorize the inputs in question in the
first place; instead they examine (1) how QUICKLY or EASILY subjects do it,
(2) how TYPICAL (of the members of the category in question) subjects rate
the inputs to be and (3) what features subjects INTROSPECT that they are
using. This completely bypasses the real question of how anyone or anything
actually manages to accomplish the categorization at all.

Let me quickly add that there is nothing wrong with reaction-time
experiments if they suggest hypotheses about the basic underlying
mechanism, or provide ways of testing them. But in this case -- as in
many others in experimental cognitive psychology -- the basic
mechanisms are bypassed and the focus is on fine-tuning questions
that are beside the point (or premature) -- if, that is, the objective
is to explain how organisms or devices actually manage to generate
successful categorization performance given the inputs in question. As
an exercise, see where the constructs you mention above -- "holistic,"
"dimensional," or "metric" representations -- are likely to get you if
you're actually trying to get a device to categorize, as we do.

There is also an "entry point" problem with this line of research,
which typically looks willy-nilly at higher-order, abstract
categories, as well as "basic level" object categories (an incoherent
concept, in my opinion, except as an arbitrary default level), and
even some sensory categories. But it seems obvious that the question
of how the higher-order categories are represented is dependent on how
the lower-order ones are represented, the abstract ones on the
concrete ones, and perhaps all of these depend on the sensory ones.
Moreover, often the inputs used are members of familiar, overlearned
categories, and the task is a trivial one, not engaging the mechanisms
that were involved in their acquisition. In other experiments,
artificial stimuli are used, but it is not clear how representative
these are of the category acquisition process either.

Finally, and perhaps most important: In bypassing the problem of
categorization capacity itself -- i.e., the problem of how devices
manage to categorize as correctly and successfully as they do, given
the inputs they have encountered -- in favor of its fine tuning, this
line of research has unhelpfully blurred the distinction between the
following: (a) the many all-or-none categories that are the real burden
for an explanatory theory of categorization (a penguin, after all, be it
ever so atypical a bird, and be it ever so time-consuming for us to judge
that it is indeed a bird, is, after all, indeed a bird, and we know
it, and can say so, with 100% accuracy every time, irrespective of
whether we can successfully introspect what features we are using to
say so) and (b) true "graded" categories such as "big," "intelligent,"
etc. Let's face the all-or-none problem before we get fancy...

>	To discuss "invariant features... sufficient to guide reliable
>	categorization" sounds like the "classical" theory (as Smith & Medin
>	call it) of concept representation: Concepts are represented as
>	necessary and sufficient features (i.e., there are defining features,
>	i.e. there is a boolean conjunction of predicates for a concept).  This
>	approach has serious problems, not the least of which is the inability
>	of humans to describe these features for seemingly elementary concepts,
>	like "chair", as Weinstein and others point out. I contend that a
>	boolean function (including ORs as well as ANDs) could work, but that
>	is not what was mentioned. An example might be helpful: A vehicle must
>	have a steering wheel OR handlebars. But to remove the OR by saying,
>	a vehicle must have a means of steering, is to rely on a feature which
>	is symbolic, high level, functional, which I gather we are not allowing.

It certainly is the "classical" theory, but the one with the serious
problems is the fine-tuning approach I just described, not the quite
reasonable assumption that if 100% correct, all-or-none categorization
is possible at all (without magic), then there must be a set of features
in the inputs that is SUFFICIENT to generate it. I of course agree
that disjunctive features are legitimate -- but whoever said they
weren't? That was another red herring introduced by this line of
research. And, as I mentioned, "the inability of humans to describe
these features" is irrelevant. If they could do it, they'd be
cognitive modelers! We must INFER what features they're using to
categorize successfully; nothing guarantees they can tell us.

(If by "Weinstein" you mean "Wittgenstein" on "games," etc., I have to remind
you that Wittgenstein did not have the contemporary burden of speaking
in terms of internal mechanisms a device would have to have in order to
categorize successfully. Otherwise he would have had to admit that
"games" are either (i) an all-or-none category, i.e., there is a "right" or
"wrong" of the matter, and we are able to sort accordingly, whether or
not we can introspect the basis of our correct sorting, or (ii) "games"
are truly a fuzzy category, in which membership is arbitrary,
uncertain, or a matter of degree. But if the latter, then games are
simply not representative of the garden-variety all-or-none
categorization capacity that we exercise when we categorize most
objects, such as chairs, tables, birds. And again, there's nothing
whatsoever wrong with disjunctive features.)

Finally, it is not that we are not "allowing" higher-order symbolically
described features. They are the goal of the whole grounding project.
But the approach I am advocating requires that symbolic descriptions
be composed of primitive symbols which are in turn the labels of sensory
categories, grounded in nonsymbolic (iconic and categorical) representations.

>	[Concerning model-theoretic "grounding":] The more statements
>	you have (that you wish to be deemed correct), the more the possible
>	meanings of the terms will be constrained. To illustrate, consider
>	the statement FISH SWIM. Think of the terms FISH and SWIM as variables
>	with no predetermined meaning -- so that FISH SWIM is just another way
>	of writing A B. What variable bindings satisfy this?  Well, many do...
>	Now consider the statement FISH LIVE, where FISH and LIVE are variables.
>	Now there are two statements to be satisfied. The assignment to the
>	variable LIVE restricts the possible assignments to the variable SWIM...
>	Of course, we have many many statements in our minds that must be
>	simultaneously satisfied, so the possible meanings that each word name
>	can be assigned is correspondingly restricted. Could the restrictions be
>	sufficient to require such a small amount of ambiguity that the word
>	names could be said to have intrinsic meaning?...  footnote: This
>	leaves unanswered the question of how the meanings themselves are
>	grounded. Non-symbolically, seems to be the gist of the discussion,
>	in which case logic would be useless for that task even in an
>	"in principle" capacity since the stuff of logic is symbols.

I agree that there are constraints on the correlations of symbols in a
natural language, and that the degrees of freedom probably shrink, in
a sense, as the text grows. That is probably the basis of successful
cryptanalysis. But I still think (and you appear to agree) that even if
the degrees of freedom are close to zero for a natural language's
symbol combinations and their interpretations, this still leaves the
grounding problem intact: How are the symbols connected to their
referents? And what justifies our interpretation of their meanings?
With true cryptanalysis, the decryption of the symbols of the unknown
language is always grounded in the meanings of the symbols of a known
language, which are in turn grounded in our heads, and their
understanding of the symbols and their relation to the world. But
that's the standard DERIVED meaning scenario, and for cognitive
modeling we need INTRINSICALLY grounded symbols. (I do believe,
though, that the degrees-of-freedom constraint on symbol combinations
does cut somewhat into Quine's claims about the indeterminacy of
radical translation, and ESPECIALLY for an intrinsically grounded
symbol system.)
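A brute-force sketch of the constraint idea quoted above, over an invented
mini-world of things and candidate predicates: each added statement must
bind its verb-variable to a distinct predicate, and the set of things FISH
could denote shrinks accordingly. (This says nothing, of course, about how
the surviving bindings are connected to their referents, which is the
grounding point at issue.)

from itertools import product

# Invented mini-world: which candidate predicates hold of which candidate things.
world = {
    "swims": {"trout", "shark"},
    "lives": {"trout", "shark", "sparrow", "snail"},
    "flies": {"sparrow"},
}
things = {"trout", "shark", "sparrow", "snail", "pebble"}

def fish_candidates(n_statements):
    # Things FISH could denote when it must satisfy n statements, each of whose
    # verb-variables must be bound to a different predicate of the world.
    candidates = set()
    for thing in things:
        for verbs in product(world, repeat=n_statements):
            if len(set(verbs)) == n_statements and all(thing in world[v] for v in verbs):
                candidates.add(thing)
    return candidates

print(sorted(fish_candidates(1)))   # one statement ("FISH SWIM"): four candidates remain
print(sorted(fish_candidates(2)))   # a second statement ("FISH LIVE") rules out "snail"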
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

harnad@mind.UUCP (Stevan Harnad) (06/27/87)

aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc.,
Cambridge, MA writes:

>	I don't see any difference between "physical" and "merely theoretical"
>	invertibility... Surely you don't mean that a transformation-inversion
>	capability must actually be present in the device for it to count as
>	"analog" in your sense.  (Else brains, for example, wouldn't count).

I think this is partly an empirical question. "Physically possible"
invertibility is enough for an analog transformation, but actual
physical invertibility may be necessary for an iconic representation
that can generate all of our discrimination capacities. Avoiding
"merely theoretical" invertibility is also part of avoiding any reliance
on mediation by our theoretical interpretations in order to get an
autonomous, intrinsically grounded system.

>	the *semantic* meaning of a symbol is still left largely unconstrained
>	even after you take account of its "grounding" in perceptual
>	categorization. This is because what matters for intentional content
>	is not the objective property in the world that's being detected, but
>	rather how the subject *conceives* of that external property, a far
>	more slippery notion... primitive people may be able to reliably
>	categorize certain large-scale atmospheric electrical discharges;
>	nevertheless, the semantic content of their corresponding states might
>	be "Angry gods nearby" or some such.

I agree that symbol grounding cannot be based on the "objective
property" that's being detected. Categorical representations in my
grounding model are approximate. All they do is sort and label the confusable
alternatives that have been sampled, using the provisional features
that suffice to generate reliable sorting performance according to the feedback
that defines "right" and "wrong." There is always a context of
confusable alternatives, and which features are used to sort reliably
is always a "compared to what?" matter. The exact "objective property" they
pick out is never an issue, only whether they can generate reliable
asymptotic categorization performance given that sample and those
feedback constraints. The representation is indifferent to whether
what you are calling "water," is really "twin-water" (with other
objective properties), as long as you can sort it "correctly" according
to the feedback (say, from the dictates of thirst, or a community of
categorizing instructors).

As to what people "conceive" themselves to be categorizing: My model
is proposed in a framework of methodological epiphenomenalism. I'm
interested in what's going on in people's heads only inasmuch as it is
REALLY generating their performance, not just because they think or
feel it is. So, for example, in criticizing the Roschian approach to
categorization in my reply to Dan Berleant I suggested that it was
irrelevant what features subjects BELIEVED they were using to
categorize, say, chairs; what matters is what features they (or any
organism or device in a similar input situation) really ARE using.
[This does not contradict my previous point about the irrelevance of
"objective properties." "Features" refers to properties of the
proximal projection on the device's sense receptors, whereas
"properties" would be the essential characteristics of distal objects
in the world. Feature detectors are blind to distal differences that
are not preserved in the proximal projection.]

On the other hand, "Angry gods nearby" is not just an atomic label for
"thunder" (otherwise it WOULD be equivalent to it in my model -- both
labels would pick out approximately the same thing); in fact, it is
decomposable, and hence has a different meaning in virtue of the
meanings of "angry" and "gods." There should be corresponding internal
representational differences (iconic, categorical and symbolic) that
capture that difference.

>	Another well-known obstacle to moving from an objective to an
>	intentional description is that the latter contains an essentially
>	normative component, in that we must make some distinction between
>	correct and erroneous classification. For example, we'd probably
>	like to say that a frog has a fly-detector which is sometimes wrong,
>	rather than a "moving-spot-against-a- fixed-background" detector
>	which is infallible. Again, this distinction seems to depend on fuzzy
>	considerations about the purpose or functional role of the concept
>	in question... [In his reply on this point to Dan Berleant,
>	Weinstein continues:] the philosophical problem is to say why any
>	response should count as an *error* at all. What makes it wrong?
>	I.e. who decides which "concept" -- "fly" or "moving-spot..." -- the
>	frog is trying to apply? The objective facts about the frog's
>	perceptual abilities by themselves don't seem to tell you that in
>	snapping out its tongue at a decoy, it's making a *mistake*. To
>	say this, an outside interpreter has to make some judgement about what
>	the frog's brain is trying to accomplish by its detection of moving
>	spots. And this makes the determination of semantic descriptions a
>	fuzzy matter.

I don't think there's any problem at all of what should count as an "error"
for my kind of model. The correctness or incorrectness of a label is
always determined by feedback -- either ecological, as in evolution
and daily nonverbal learning, or linguistic, where it is conventions
of usage that determine what we call what. I don't see anything fuzzy about
such a functional framework. (The frog's feedback, by the way,
probably has to do with edibility, so (i) "something that affords eating"
is probably a better "interpretation" of what it's detecting. And, to
the extent that (ii) flies and (iii) moving spots are treated indifferently by
the detector, the representation is approximate among all three.
The case is not like that of natives and thunder, since the frog's
"descriptions" are hardly decomposable. Finally, there is again no
hope of specifying distal "objective properties" ["bug"/"schmug"] here
either, as approximateness continues to prevail.)

>	Some of the things you say also suggest that you're attempting to
>	resuscitate a form of classical empiricist sensory atomism, where the
>	"atomic" symbols refer to sensory categories acquired "by acquaintance"
>	and the meaning of complex symbols is built up from the atoms "by
>	description". This approach has an honorable history in philosophy;
>	unfortunately, no one has ever been able to make it work. In addition
>	to the above considerations, the main problems seem to be: first,
>	(1) that no principled distinction can be made between the simple
>	sensory concepts and the complex "theoretical" ones; and second,
>	(2) that very little that is interesting can be explicitly defined in
>	sensory terms (try, for example, "chair")...[In reply to Berleant,
>	Weinstein continues:] Of course *some* concepts can be acquired by
>	definition. However, the "classical empiricist" doctrine is committed
>	to the further idea that there is some privileged set of *purely
>	sensory* concepts and that all non-sensory concepts can be defined in
>	terms of this basis. This is what has never been shown to work. If you
>	regard "juice" as a "primitive" concept, then you do not share the
>	classical doctrine. (And if you do not, I invite you try giving
>	necessary and sufficient conditions for juicehood.)

You're absolutely right that this is a throwback to seventeenth-century
bottom-upism.  In fact, in the CP book I call the iconic and
categorical representations the "acquaintance system" and the symbolic
representations the "description system." The only difference is that
I'm only claiming to be giving a theory of categorization. Whether or
not this captures "meaning" depends (for me at any rate) largely on
whether or not such a system can successfully pass the Total Turing
Test. It's true that no one has made this approach work. But it's also
true that no one has tried. It's only in today's era of computer
modeling, robotics and bioengineering that these mechanisms will begin
to be tested to see whether or not they can deliver the goods.

To reply to your "two main problems": (1) Even an elementary sensory
category such as "red" is already abstract once you get beyond the
icon to the categorical representation. "Red" picks out the
electromagnetic wave-lengths that share the feature of being above and
below a certain threshold. That's an abstraction. And in exchange for
generating a feature-detector that reliably picks it out, you get a
label -- "red" -- which can now enter into symbolic descriptions (e.g.,
"red square"). Categorization is abstraction. As soon as you've left
the realm of invertible icons, you've begun to abstract, yet you've
never left the realm of the senses. And so it goes, bottom up, from
there onward. 
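A toy sketch of that abstraction step, using rough textbook wavelength
bounds for "red" that are assumed here purely for illustration: the
categorical representation throws away everything about the icon except
whether the input falls inside the band.

RED_BAND_NM = (620.0, 750.0)     # rough wavelength bounds for "red", in nanometers

def is_red(wavelength_nm):
    # Many-to-few: every wavelength inside the band gets the same label.
    lower, upper = RED_BAND_NM
    return lower <= wavelength_nm <= upper

for wavelength in (450.0, 640.0, 700.0, 780.0):
    print(wavelength, "red" if is_red(wavelength) else "not red")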

(2) As to sensory "definitions": I don't think this is the right thing
to look for, because it's too hard to find a valid "entry point" into
the bottom-up hierarchy. I doubt that "chair" or "juice" are sensory
primitives, picked out purely by sensory feature detectors. They're
probably represented by symbolic descriptions such as "things you can
sit on" and "things you can drink," and of course those are just the
coarsest of first approximations. But the scenario looks pretty
straightforward: Even though it's flexible enough to be revised to
include a chair (suitably homogenized) as a juice and a juice (for a
bug?) as a chair, it seems very clear that it is the resources of (grounded)
symbolic description that are being drawn upon here in picking out
what is and is not a chair, and on the basis of what features.

The categories are too interrelated (and approximate, and provisional) for
an exhaustive "definition," but provisional descriptions that will get
you by in your sorting and labeling -- and, more important, are
revisable and updatable, to tighten the approximation -- are certainly
available and not hard to come by. "Necessary and sufficient conditions for
juicehood," however, are a red herring. All we need is a provisional
set of features that will reliably sort the instances as environmental and
social feedback currently dictates. Remember, we're not looking for
"objective properties" or ontic essences -- just something that will
guide reliable sorting according to the contingencies sampled to date.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

marty1@houdi.UUCP (M.BRILLIANT) (06/27/87)

In article <917@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:
> ...  blurred the distinction between the
> following: (a) the many all-or-none categories that are the real burden
> for an explanatory theory of categorization (a penguin, after all, be it
> ever so atypical a bird, ... is, after all, indeed a bird, and we know
> it, and can say so, with 100% accuracy every time, ....
> ... and (b) true "graded" categories such as "big," "intelligent," ...

> ......
> "games" are either (i) an all-or-none category, i.e., there is a "right" or
> "wrong" of the matter, and we are able to sort accordingly, ...
> ... or (ii) "games"
> are truly a fuzzy category, in which membership is arbitrary,
> uncertain, or a matter of degree. But if the latter, then games are
> simply not representative of the garden-variety all-or-none
> categorization capacity that we exercise when we categorize most
> objects, such as chairs, tables, birds....

Now, much of this discussion is out of my field, but (a) I would like
to share in the results, and (b) I understand membership in classes
like "bird" and "chair."

I learned recently that I can't categorize chairs with 100% accuracy. 
A chair used to be a thing that supported one person at the seat and
the back, and a stool had no back support.  Then somebody invented a
thing that supported one person at the seat, the knees, but not the
back, and I didn't know what it was.  As far as my sensory
categorization was concerned at the time, its distinctive features were
inadequate to classify it.  Then somebody told me it was a chair.  Its
membership in the class "chair" was arbitrary.  Now a "chair" in my
lexicon is a thing that supports the seat and either the back or the
knees.
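As a toy illustration with invented feature names, the revised lexicon
entry is just a small disjunctive description of the kind discussed earlier
in this thread:

def is_chair(supports_seat, supports_back, supports_knees):
    # A provisional, revisable description: seat support plus back OR knee support.
    return supports_seat and (supports_back or supports_knees)

print(is_chair(True, True, False))    # an ordinary chair
print(is_chair(True, False, True))    # the kneeling chair that forced the revision
print(is_chair(True, False, False))   # a stool, on this rule, is still not a chair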

Actually, I think I perceive most chairs by recognizing the object
first as a familiar thing like a kitchen chair, a wing chair, etc., and
then I name it with the generic name "chair."  I think Harnad would
recognize this process.  The class is defined arbitrarily by inclusion
of specific members, not by features common to the class.  It's not so
much a class of objects, as a class of classes....

If that is so, then "bird" as a categorization of "penguin" is purely
symbolic, and hence is arbitrary, and once the arbitrariness is defined
out, that categorization is a logical, 100% accurate, deduction.  The
class "penguin" is closer to the primitives that we infer inductively
from sensory input.

But the identification of "penguin" in a picture, or in the field, is
uncertain because the outlines may be blurred, hidden, etc.  So there
is no place in the pre-symbolic processing of sensory input where 100%
accuracy is essential.  (This being so, there is no requirement for
invertibility.)

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

marty1@houdi.UUCP (M.BRILLIANT) (06/29/87)

In article <919@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:
> marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel writes:
> ......
> >	.... The feature extractor obviates the symbol-grounding
> >	problem.
> 
> ..... You are vastly underestimating the problem of
> sensory categorization, sensory learning, and the relation between
> lower and higher-order categories. Nor is it obvious that symbol manipulation
> can still be regarded as just symbol manipulation when the atomic symbols
> are constrained to be the labels of sensory categories....

I still think we're having more trouble with terminology than we
would have with the concepts if we understood each other.  To
get a little more concrete, how about walking through what a machine
might do in perceiving a chair?

I was just looking at a kitchen chair, a brown wooden kitchen
chair against a yellow wall, in side light from a window.  Let's
let a machine train its camera on that object.  Now either it
has a mechanical array of receptors and processors, like the
layers of cells in a retina, or it does a functionally
equivalent thing with sequential processing.  What it has to do
is compare the brightness of neighboring points to find places
where there is contrast, find contrast in contiguous places so
as to form an outline, and find closed outlines to form objects.
There are some subtleties needed to find partly hidden objects,
but I'll just assume they're solved.  There may also be an
interpretation of shadow gradations to perceive roundness.
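
To make that pipeline concrete, here is a minimal sketch of the first
step (comparing the brightness of neighboring points), assuming a toy
grayscale image stored as a nested list; the threshold and the
four-neighbor rule are illustrative choices, not anything prescribed
above:

    def contrast_edges(image, threshold=50):
        """Mark pixels that differ sharply in brightness from a 4-neighbor."""
        rows, cols = len(image), len(image[0])
        edges = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        if abs(image[r][c] - image[nr][nc]) > threshold:
                            edges[r][c] = 1
        return edges

    # A toy bright object against a darker wall: pixels on either side of the
    # boundary get marked, tracing the contour that later stages would close
    # into an object.
    image = [[10, 10, 10, 10],
             [10, 200, 200, 10],
             [10, 200, 200, 10],
             [10, 10, 10, 10]]
    for row in contrast_edges(image):
        print(row)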

Now the machine has the outline of an object in 2 dimensions,
and maybe some clues to the 3rd dimension.  There are CAD
programs that, given a complete description of an object in
3D, can draw any 2D view of it.  How about reversing this
essentially deductive process to inductively find a 3D form that
would give rise to the 2D view the machine just saw?  Let the
machine guess that most of the odd angles in the 2D view are
really right angles in 3D.  Then, if the object is really
unfamiliar, let the machine walk around the chair, or pick it
up and turn it around, to refine its hypothesis.
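
For the deductive direction that is being inverted here, a toy
perspective projection shows why a single 2D view underdetermines the
3D form; the focal length and sample points are invented for the
illustration:

    def project(point3d, focal=1.0):
        """Perspective-project a 3-D point (x, y, z) onto the image plane z = focal."""
        x, y, z = point3d
        return (focal * x / z, focal * y / z)

    # Two different 3-D points land on the same 2-D point, so the inverse step
    # needs extra hypotheses (depth guesses, "odd angles are really right
    # angles", walking around the object) before a 3-D form can be recovered.
    print(project((1.0, 1.0, 2.0)))   # (0.5, 0.5)
    print(project((2.0, 2.0, 4.0)))   # (0.5, 0.5) -- same view, different depth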

Now the machine has a form.  If the form is still unfamiliar,
let it ask, "What's that, Daddy?"  Daddy says, "That's a chair."
The machine files that information away.  Next time it sees a
similar form it says "Chair, Daddy, chair!"  It still has to
learn about upholstered chairs, but give it time.
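
The "file it away and match similar forms" proposal can be sketched
roughly as follows, assuming some feature vector delivered by the
earlier vision stage; the vectors, distance measure and threshold are
invented placeholders:

    memory = []   # list of (feature_vector, label) pairs the machine has filed away

    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def identify(form, threshold=1.0):
        """Return the label of the most similar stored form, or None if unfamiliar."""
        if not memory:
            return None
        stored, label = min(memory, key=lambda m: distance(m[0], form))
        return label if distance(stored, form) <= threshold else None

    def learn(form, label):
        memory.append((form, label))

    kitchen_chair = (1.0, 0.9, 0.2)   # invented feature vectors from the vision stage
    wing_chair = (1.1, 0.8, 0.3)

    print(identify(kitchen_chair))    # None -> ask "What's that, Daddy?"
    learn(kitchen_chair, "chair")     # Daddy says "That's a chair."
    print(identify(wing_chair))       # 'chair' -> "Chair, Daddy, chair!"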

That brings me to a question: do you really want this machine
to be so Totally Turing that it grows like a human, learns like
a human, and not only learns new objects, but, like a human born
at age zero, learns how to perceive objects?  How much of its
abilities do you want to have wired in, and how much learned?

But back to the main question.  I have skipped over a lot of
detail, but I think the outline can in principle be filled in
with technologies we can imagine even if we do not have them.
How much agreement do we have with this scenario?  What are
the points of disagreement?

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

smoliar@vaxa.isi.edu (Stephen Smoliar) (06/30/87)

In article <1194@houdi.UUCP> marty1@houdi.UUCP (M.BRILLIANT) writes:
>
>I was just looking at a kitchen chair, a brown wooden kitchen
>chair against a yellow wall, in side light from a window.  Let's
>let a machine train its camera on that object.  Now either it
>has a mechanical array of receptors and processors, like the
>layers of cells in a retina, or it does a functionally
>equivalent thing with sequential processing.  What it has to do
>is compare the brightness of neighboring points to find places
>where there is contrast, find contrast in contiguous places so
>as to form an outline, and find closed outlines to form objects.
>There are some subtleties needed to find partly hidden objects,
>but I'll just assume they're solved.  There may also be an
>interpretation of shadow gradations to perceive roundness.
>
I have been trying to keep my distance from this debate, but I would like
to insert a few observations regarding this scenario.  In many ways, this
paragraph represents the "obvious" approach to perception, assuming that
one is dealing with a symbol manipulation system.  However, other approaches
have been hypothesized.  While their viability remains to be demonstrated,
it would be fair to say that, in the broad scope of perception in the real
world, the same may be said of symbol manipulation systems.

Consider the holographic model proposed by Karl Pribram in LANGUAGES OF THE
BRAIN.  As I understand it, this model postulates that memory is a collection
of holographic transforms of experienced images.  As new images are
experienced, the brain is capable of retrieving "best fits" from this
memory to form associations.  Thus, the chair you see in the above
paragraph is recognized as a chair by virtue of the fact that it "fits"
other images of chairs you have seen in the past.

I'm not sure I buy this, but I'm at least willing to acknowledge it as
an alternative to your symbol manipulation scenario.  The biggest problem
I have has to do with retrieval.  As far as I understand, present holographic
retrieval works fine as long as you don't have to worry about little things
like change of scale, translation, or rotation.  If this model is going to
work, then the retrieval process is going to have to be more powerful than
the current technology allows.
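
As a crude stand-in for "retrieving best fits" (real holographic
proposals rely on Fourier-domain correlation, which this toy version
does not), one can score stored patterns against a probe by normalized
correlation; the 1-D patterns are invented, and the example only
illustrates the shift-sensitivity worried about above:

    def correlation(a, b):
        """Normalized inner product of two equal-length patterns."""
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    memory = {
        "chair": [0, 1, 1, 1, 0, 0, 0, 0],
        "stool": [0, 0, 0, 0, 1, 1, 0, 0],
    }

    def best_fit(probe):
        return max(memory, key=lambda name: correlation(memory[name], probe))

    print(best_fit([0, 1, 1, 1, 0, 0, 0, 0]))    # 'chair': the exact trace retrieves well
    print(round(correlation(memory["chair"], [0, 0, 0, 1, 1, 1, 0, 0]), 2))
    # 0.33: the same shape merely shifted no longer matches its stored trace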

The other problem relates to concept acquisition, as was postulated in
Brilliant's continuation of the scenario:
>
>Now the machine has a form.  If the form is still unfamiliar,
>let it ask, "What's that, Daddy?"  Daddy says, "That's a chair."
>The machine files that information away.  Next time it sees a
>similar form it says "Chair, Daddy, chair!"  It still has to
>learn about upholstered chairs, but give it time.
>
The difficulty seems to be in what it means to file something away if
one's memory is simply one of experiences.  Does the memory trace of the
chair experience include Daddy's voice saying "chair?"  While I'm willing
to acknowledge a multi-media memory trace, this seems a bit pat.  It
reminds me of Skinner's VERBAL BEHAVIOR, in which he claimed that one
learned the concept "beautiful" from stimuli of observing people saying
"beautiful" in front of beautiful objects.  This conjures up a vision
of people wandering around the Metropolitan Museum of Art muttering
"beautiful" as they wander from gallery to gallery.

Perhaps the difficulty is that the mind really doesn't want to assign a
symbol to every experience immediately.  Rather, following the model of
Holland et al., it is first necessary to build up some degree of
reinforcement which assures that a particular memory trace is actually
going to be retrieved relatively frequently (whatever that means).
In such a case, then, a symbol becomes a fast-access mechanism for
retrieval of that trace (or a collection of common traces).  However,
this gives rise to at least three questions for which I have no answer:

	1.  What are the criteria by which it is decided that such a
		symbol is required for fast-access?

	2.  Where does the symbol's name come from?

	3.  How is the symbol actually "bound" to what it retrieves?

These would seem to be the sort of questions which might help to tie
this debate down to more concrete matters.
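
One toy reading of the suggestion, with the retrieval threshold and the
naming scheme simply invented (they are precisely the open questions
listed above): a trace is promoted to a named symbol only after it has
been retrieved often enough.

    from collections import defaultdict

    retrieval_count = defaultdict(int)
    symbol_table = {}                 # symbol -> the trace it gives fast access to

    def retrieve(trace_id, trace, threshold=3):
        """Record a retrieval; promote the trace to a named symbol once it is frequent."""
        retrieval_count[trace_id] += 1
        if retrieval_count[trace_id] >= threshold and trace_id not in symbol_table:
            symbol_table[trace_id] = trace    # the "binding" of symbol to trace
        return trace

    for _ in range(3):
        retrieve("chair-episode", {"image": "multi-media trace", "sound": "chair"})
    print(list(symbol_table))          # ['chair-episode'] -- now reachable by symbol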

Brilliant continues:
>That brings me to a question: do you really want this machine
>to be so Totally Turing that it grows like a human, learns like
>a human, and not only learns new objects, but, like a human born
>at age zero, learns how to perceive objects?  How much of its
>abilities do you want to have wired in, and how much learned?
>
This would appear to be one of the directions in which connectionism is
leading.  In a recent talk, Sejnowski talked about "training" networks
for text-to-speech and backgammon . . . not programming them.  On the
other hand, at the current level of his experiments, designing the network
is as important as training it;  training can't begin until one has a
suitable architecture of nodes and connections.  The big unanswered
question would appear to be:  will all of this scale upward?  That
is, is there ultimately some all-embracing architecture which includes
all the mini-architectures examined by connectionist experiments and
enough more to accommodate the methodological epiphenomenalism of real
life?

harnad@mind.UUCP (Stevan Harnad) (06/30/87)

marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel asks:

>	how about walking through what a machine might do in perceiving a chair?
>	...let a machine train its camera on that object.  Now either it
>	has a mechanical array of receptors and processors, like the layers
>	of cells in a retina, or it does a functionally equivalent thing with
>	sequential processing.  What it has to do is compare the brightness of
>	neighboring points to find places where there is contrast, find
>	contrast in contiguous places so as to form an outline, and find
>	closed outlines to form objects... Now the machine has the outline
>	of an object in 2 dimensions, and maybe some clues to the 3rd
>	dimension...  inductively find a 3D form that would give rise to the
>	2D view the machine just saw... Then, if the object is really
>	unfamiliar, let the machine walk around the chair, or pick it
>	up and turn it around, to refine its hypothesis.

So far, apart from its understandable bias toward current engineering hardware
concepts, there is no particular objection to this description of a
stereoptic sensory receptor.

>	Now the machine has a form.  If the form is still unfamiliar,
>	let it ask, "What's that, Daddy?"  Daddy says, "That's a chair."
>	The machine files that information away.  Next time it sees a
>	similar form it says "Chair, Daddy, chair!"  It still has to
>	learn about upholstered chairs, but give it time.

Now you've lost me completely. Having acknowledged the intricacies of
sensory transduction, you seem to think that the problem of categorization
is just a matter of filing information away and finding "similar forms."

>	do you really want this machine to be so Totally Turing that it
>	grows like a human, learns like a human, and not only learns new
>	objects, but, like a human born at age zero, learns how to perceive
>	objects?  How much of its abilities do you want to have wired in,
>	and how much learned?

That's an empirical question. All it needs to do is pass the Total
Turing Test -- i.e., exhibit performance capacities that are
indistinguishable from ours. If you can do it by building everything
in a priori, go ahead. I'm betting it'll need to learn -- or be able to
learn -- a lot.

>	But back to the main question.  I have skipped over a lot of
>	detail, but I think the outline can in principle be filled in
>	with technologies we can imagine even if we do not have them.
>	How much agreement do we have with this scenario?  What are
>	the points of disagreement?

I think the main details are missing, such as how the successful
categorization is accomplished. Your account also sounds as if it
expects innate feature detectors to pick out objects for free, more or
less nonproblematically, and then serve as a front end for another
device (possibly a conventional symbol-cruncher a la standard AI?)
that will then do the cognitive heavy work. I think that the cognitive
heavy work begins with picking out objects, i.e., with categorization.
I think this is done nonsymbolically, on the sensory traces, and that it
involves learning and pattern recognition -- both sophisticated
cognitive activities. I also do not think this work ends, to be taken
over by another kind of work: symbolic processing. I think that ALL of
cognition can be seen as categorization. It begins nonsymbolically,
with sensory features used to sort objects according to their names on
the basis of category learning; then further sorting proceeds by symbolic
descriptions, based on combinations of those atomic names. This hybrid
nonsymbolic/symbolic categorizer is what we are; not a pair of modules,
one that picks out objects and the other that thinks and talks about them.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

marty1@houdi.UUCP (M.BRILLIANT) (06/30/87)

In article <937@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:
> marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel asks:
> ....
> >	do you really want this machine to be so Totally Turing that it
> >	grows like a human, learns like a human, and not only learns new
> >	objects, but, like a human born at age zero, learns how to perceive
> >	objects?  How much of its abilities do you want to have wired in,
> >	and how much learned?
> 
> That's an empirical question. All it needs to do is pass the Total
> Turing Test -- i.e., exhibit performance capacities that are
> indistinguishable from ours. If you can do it by building everything
> in a priori, go ahead. I'm betting it'll need to learn -- or be able to
> learn -- a lot.

To refine the question: how long do you imagine the Total Turing Test
will last?  Science fiction stories have robots or aliens living in
human society as humans for periods of years, as long as they live with
strangers, but failing after a few hours trying to supplant a human and
fool his or her spouse.

By "performance capabilities," do you mean the capability to adapt as a
human does to the experiences of a lifetime?  Or only enough learning
capability to pass a job interview?

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

marty1@houdi.UUCP (M.BRILLIANT) (06/30/87)

In article <937@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:
> ...
> marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel asks:
> >	how about walking through what a machine might do in perceiving a chair?
> >	... (a few steps skipped here)
> >	Now the machine has a form.  If the form is still unfamiliar,
> >	let it ask, "What's that, Daddy?"  Daddy says, "That's a chair."
> >	The machine files that information away.  Next time it sees a
> >	similar form it says "Chair, Daddy, chair!" ...
> 
> Now you've lost me completely. Having acknowledged the intricacies of
> sensory transduction, you seem to think that the problem of categorization
> is just a matter of filing information away and finding "similar forms."

I think it is.  We've found a set of lines, described in 3 dimensions,
that can be rotated to match the outline we derived from the view of a
real chair.  We file it in association with the name "chair."  A
"similar form" is some other outline that can be matched (to within
some fraction of its size) by rotating the same 3D description.
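
A rough sketch of that matching rule, with an invented three-point
"model," rotation step and tolerance (a real matcher would also have to
search over scale and translation):

    import math

    chair_model = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]  # toy 3-D points

    def rotate_y(p, angle):
        x, y, z = p
        c, s = math.cos(angle), math.sin(angle)
        return (c * x + s * z, y, -s * x + c * z)

    def project(p):
        return (p[0], p[1])           # orthographic projection onto the image plane

    def matches(outline, model=chair_model, tolerance=0.1, steps=72):
        """True if some rotation of the stored model projects onto the outline."""
        for k in range(steps):
            angle = 2 * math.pi * k / steps
            projected = [project(rotate_y(p, angle)) for p in model]
            if max(math.dist(a, b) for a, b in zip(projected, outline)) < tolerance:
                return True
        return False

    observed = [project(rotate_y(p, 0.5)) for p in chair_model]  # same chair, new viewpoint
    print(matches(observed))          # True -> file it under the same name, "chair"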

> I think the main details are missing, such as how the successful
> categorization is accomplished......

Are we having a problem with the word "categorization"?  Is it the
process of picking discrete objects out of a pattern of light and
shade ("that's a thing"), or the process of naming the object ("that
thing is a chair")?

> ..... Your account also sounds as if it
> expects innate feature detectors to pick out objects for free, more or
> less nonproblematically.....

You left out the part where I referred to computer-aided-design
modules.  I think we can find outlines by looking for contiguous
contrasts.  If the outlines are straight we (the machine, maybe also
humans) can define the ends of the straight lines in the visual plane,
and hypothesize corresponding lines in space.  If hard-coding this
capability gives an "innate feature detector" then that's what I want.

> ...... and then serve as a front end for another
> device (possibly a conventional symbol-cruncher a la standard AI?)
> that will then do the cognitive heavy work. I think that the cognitive
> heavy work begins with picking out objects, i.e., with categorization.

I think I find objects with no conscious knowledge of how I do it (is
that what you call "categorization")?  Saying what kind of object it is
more often involves conscious symbol-processing (sometimes one forgets
the word and calls a perfectly familiar object "that thing").

> I think this is done nonsymbolically, on the sensory traces, and that it
> involves learning and pattern recognition -- both sophisticated
> cognitive activities.

If you're talking about finding objects in a field of light and shade, I
agree that it is done nonsymbolically, and everything else you just said.

> .....  I also do not think this work ends, to be taken
> over by another kind of work: symbolic processing.....

That's where I have trouble.  Calling a penguin a bird seems to me
purely symbolic, just as calling a tomato a vegetable in one context,
and a fruit in another, is a symbolic process.

> ..... I think that ALL of
> cognition can be seen as categorization. It begins nonsymbolically,
> with sensory features used to sort objects according to their names on
> the basis of category learning; then further sorting proceeds by symbolic
> descriptions, based on combinations of those atomic names. This hybrid
> nonsymbolic/symbolic categorizer is what we are; not a pair of modules,
> one that picks out objects and the other that thinks and talks about them.

Now I don't understand what you said.  If it begins nonsymbolically,
and proceeds symbolically, why can't it be done by linking a
nonsymbolic module to a symbolic module?

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

aweinste@Diamond.BBN.COM (Anders Weinstein) (06/30/87)

In reply to my statement that
>>	the *semantic* meaning  of a symbol is still left largely unconstrained
>>	even after you take account of it's "grounding" in perceptual
>>	categorization. This is because what matters for intentional content
>>	is not the objective property in the world that's being detected, but
>>	rather how the subject *conceives* of that external property, a far
>>	more slippery notion... 

Stevan Harnad (harnad@mind.UUCP) writes: 
>
> As to what people "conceive" themselves to be categorizing: My model
> is proposed in a framework of methodological epiphenomenalism. I'm
> interested in what's going on in people's heads only inasmuch as it is
> REALLY generating their performance, not just because they think or
> feel it is. 

I regret the subjectivistic tone of my loose characterization; what people
can introspect is indeed not at issue. I was merely pointing out that the
*meaning* of a symbol is crucially dependent on the rest of the cognitive
system, as shown in Churchland's example:

>>	                    ... primitive people may be able to reliably
>>	categorize certain large-scale atmospheric electrical discharges;
>>	nevertheless, the semantic content of their corresponding states might
>>	be "Angry gods nearby" or some such.
>>
>                ... "Angry gods nearby" is not just an atomic label for
> "thunder" (otherwise it WOULD be equivalent to it in my model -- both
> labels would pick out approximately the same thing); in fact, it is
> decomposable, and hence has a different meaning in virtue of the
> meanings of "angry" and "gods." There should be corresponding internal
> representational differences (iconic, categorical and symbolic) that
> capture that difference.

"Angry gods nearby" is composite in *English*, but it need not be composite
in the native language, or, more to the point, in the supposed inner language of the
native's categorical mechanisms. They may have a single word, say "gog",
which we would want to translate as "god-noise" or some such. Perhaps they
train their children to detect gog in precisely the same way we train
children to detect thunder -- our internal thunder-detectors are identical.
Nevertheless, the output of their thunder-detector does not *mean* "thunder".

Let me try to clarify the point of these considerations.  I am all for an
inquiry into the mechanisms underlying our categorization abilities. Anything
you can discover about these mechanisms would certainly be a major
contribution to psychology.  My only concern is with semantics:  I was piqued
by what seemed to be an ambitious claim about the significance of the
psychology of categorization for the problem of "intentionality" or intrinsic
meaningfulness. I merely want to emphasize that the former, interesting
though it is, hardly makes a dent in the latter.

As I said, there are two reasons why meaning resists explication by this kind
of psychology:  (1) holism: the meaning of even a "grounded" symbol will
still depend on the rest of the cognitive system; and (2) normativity:
meaning is dependent upon a determination of what is a *correct* response,
and you can't simply read such a norm off from a description of how the
mechanism in fact performs.

I think these points, particularly (1), should be quite clear.  The fact that
a subject's brain reliably asserts the symbol "foo" when and only when
thunder is presented in no way "fixes" the meaning of "foo". Of course it is
obviously a *constraint* on what "foo" may mean: it is in fact part of what
Quine called the "stimulus meaning" of "foo", his first constraint on
acceptable translation.  Nevertheless, by itself it is still way too weak to
do the whole job, for in different contexts the positive output of a reliable
thunder-detector could mean "thunder", something co-extensive but
non-synonymous with "thunder", "god-noise", or just about anything else.
Indeed, it might not *mean* anything at all, if it were only part of a
mechanical thunder-detector which couldn't do anything else.

I wonder if you disagree with this?

As to normativity, the force of problem (2) is particularly acute when
talking about the supposed intentionality of animals, since there aren't any
obvious linguistic or intellectual norms that they are trying to adhere to.
Although the mechanics of a frog's prey-detector may be crystal clear, I am
convinced that we could easily get into an endless debate about what, if
anything, the output of this detector really *means*.

The normativity problem is germane in an interesting way to the problem of
human meanings as well.  Note, for example, that in doing this sort of
psychology, we probably won't care about the difference between correctly
identifying a duck and mis-identifying a good decoy -- we're interested in
the perceptual mechanisms that are the same in both cases.  In effect, we are
limiting our notion of "categorization" to something like "quick and largely
automatic classification by observation alone".

We pretty much *have* to restrict ourselves in this way, because, in the
general case, there's just no limit to the amount of cognitive activity that
might be required in order to positively classify something.  Consider what
might go into deciding whether a dolphin ought to be classified as a fish,
whether a fetus ought to be classified as a person, etc.  These decisions
potentially call for the full range of science and philosophy, and a
psychology which tries to encompass such decisions has just bitten off more
than it can chew:  it would have to provide a comprehensive theory of
rationality, and such an ambitious theory has eluded philosophers for some
time now.

In short, we have to ignore some normative distinctions if we are to
circumscribe the area of inquiry to a theoretically tractable domain of
cognitive activity.  (Indeed, in spite of some of your claims, we seem
committed to the notion that we are limiting ourselves to particular
*modules* as explained in Fodor's modularity book.) Unfortunately -- and
here's the rub -- these normative distinctions *are* significant for the
*meaning* of symbols.  ("Duck" doesn't *mean* the same thing as "decoy").

It seems that, ultimately, the notion of *meaning* is intimately tied to
standards of rationality that cannot easily be reduced to simple features of
a cognitive mechanism.  And this seems to be a deep reason why a descriptive
psychology of categorization barely touches the problem of intentionality.

Anders Weinstein
BBN Labs

harnad@mind.UUCP (Stevan Harnad) (07/01/87)

marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel writes:

>	how long do you imagine the Total Turing Test will last?... By
>	"performance capabilities," do you mean the capability to adapt as a
>	human does to the experiences of a lifetime?

My Total Turing Test (TTT) has two components, one a formal, empirical one,
another an informal, intuitive one. The formal test requires that a
candidate display all of our generic performance capacities -- the
ability to discriminate, manipulate, identify and describe objects and
events as we do, under the same conditions we do, and to generate and respond
to descriptions (language) as we do. The informal test requires that the
candidate do this in a way that is indistinguishable to human beings from
the way human beings do it. The informal component of the TTT is open-ended --
there is no formal constraint on how much is enough. The reason is
that I proposed the TTT to match what we do in the real world anyway,
in our informal everyday provisional "solutions" to the "other-minds" problem
in dealing with one another. Robots should be held to no more or less
exacting standards. (This was extensively discussed on the Net last year.)

>	We've found a set of lines, described in 3 dimensions, that can be
>	rotated to match the outline we derived from the view of a real chair.
>	We file it in association with the name "chair."  A "similar form" is
>	some other outline that can be matched (to within some fraction of its
>	size) by rotating the same 3D description.

I agree that that kind of process gets you similarity (and similarity
gradients), but it doesn't get you categorization. It sounds as if you're trying
to get successful identification using icons. That will only work if
the inputs you have to sort are low in confusability or are separated
by natural gaps in their variation. As soon as the sorting problem
becomes hard, feature-learning becomes a crucial, active process --
not one that anyone really has a handle on yet. Do you really think
your stereognostic icons of chairs will be able, like us, to reliably pick
out all the chairs from the nonchairs with which they might be confused,
using only the kinds of resources you describe here?
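
The worry can be illustrated with invented toy data: raw iconic
similarity to a chair prototype yields a gradient that overlaps across
chairs and stools, whereas a feature treated here as the learned
invariant ("has a back") sorts them all-or-none.

    # (overall iconic similarity to a chair prototype, has_back) -- invented values
    chairs = [(0.9, 1), (0.8, 1), (0.7, 1)]
    stools = [(0.85, 0), (0.75, 0), (0.6, 0)]

    def prototype_similarity(item):
        return item[0]

    def has_back(item):
        return item[1] == 1

    print([prototype_similarity(x) for x in chairs])   # 0.9, 0.8, 0.7 -- overlaps with...
    print([prototype_similarity(x) for x in stools])   # 0.85, 0.75, 0.6 -- ...so no clean cut
    print(all(has_back(x) for x in chairs) and not any(has_back(x) for x in stools))   # True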

>	Is [categorization] the process of picking discrete objects out of a
>	pattern of light and shade ("that's a thing"), or the process of naming
>	the object ("that thing is a chair")?

The latter. And reliably saying which confusable things are chairs vs.
nonchairs as we do is no mean feat.

>	I think I find objects with no conscious knowledge of how I do it (is
>	that what you call "categorization")?  Saying what kind of object it is
>	more often involves conscious symbol-processing

Categorization is saying what kind of object it is. And, especially in
the case of concrete sensory categories, one is no more conscious of
"how" one does this than in resolving figure and ground. And even when
we are conscious of *something*, it's not clear that's really how
we're doing what we're doing. If it were, we could do cognitive
modeling by introspection. (This is one of the reasons I criticize the
Rosch/Wittgenstein line on "necessary/sufficient" features: It relies too
much on what we can and can't introspect.) Finally, conscious
"symbol-processing" and unconscious "symbol-processing" are not
necessarily the same thing. Performance modeling is more concerned with the
latter, which must be inferred rather than introspected.

>	I agree that [finding objects in a field of light and shade]
>	is done nonsymbolically... [but] Calling a penguin a bird seems to me
>	purely symbolic, just as calling a tomato a vegetable in one context,
>	and a fruit in another, is a symbolic process.

Underlying processes must be inferred. They can't just be read off the
performance, or our introspections about how we accomplish it. I am
hypothesizing that higher-order categories are grounded bottom-up in
lower-order sensory ones, and that the latter are represented
nonsymbolically. We're talking about the underlying basis of successful,
reliable, correct naming here. We can't simply take it as given. (And what
we call an object in one context versus another depends precisely on the
sample of confusable alternatives that I've kept stressing.)

>	If  [cognition/categorization] begins nonsymbolically, and proceeds
>	symbolically, why can't it be done by linking a nonsymbolic module to
>	a symbolic module?

Because (according to my model) the elementary symbols out of which
all the rest are composed are really the names of sensory categories
whose representations -- the structures and processes that pick them
out and reliably identify them -- are nonsymbolic. I do not see this
intimate interrelationship -- between names and, on the one hand, the
nonsymbolic representations that pick out the objects they refer to
and, on the other hand, the higher-level symbolic descriptions into
which they enter -- as being perspicuously described as a link between
a pair of autonomous nonsymbolic and symbolic modules. The relationship is
bottom-up and hybrid through and through, with the symbolic component
derivative from, inextricably interdigitated with, and parasitic on the
nonsymbolic.
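
Schematically, and with invented categories and detectors, the hybrid
picture looks something like this: the atomic symbols are just the
names bound to nonsymbolic sensory categorizers, and higher-order
categories are symbolic descriptions composed of those grounded names.

    # Nonsymbolic level: each atomic name is bound to a (toy) sensory detector.
    grounded_detectors = {
        "feathered": lambda obj: obj.get("covering") == "feathers",
        "flies":     lambda obj: obj.get("locomotion") == "flight",
        "swims":     lambda obj: obj.get("locomotion") == "swimming",
    }

    # Symbolic level: higher-order categories are descriptions built from those names.
    symbolic_definitions = {
        "bird":    lambda obj: grounded_detectors["feathered"](obj),
        "penguin": lambda obj: grounded_detectors["feathered"](obj)
                               and grounded_detectors["swims"](obj),
    }

    penguin = {"covering": "feathers", "locomotion": "swimming"}
    print(symbolic_definitions["bird"](penguin))      # True -- inherits its grounding
    print(symbolic_definitions["penguin"](penguin))   # True -- via the same detectors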
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

franka@mmintl.UUCP (Frank Adams) (07/02/87)

In article <917@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:
|Finally, and perhaps most important: In bypassing the problem of
|categorization capacity itself -- i.e., the problem of how devices
|manage to categorize as correctly and successfully as they do, given
|the inputs they have encountered -- in favor of its fine tuning, this
|line of research has unhelpfully blurred the distinction between the
|following: (a) the many all-or-none categories that are the real burden
|for an explanatory theory of categorization (a penguin, after all, be it
|ever so atypical a bird, and be it ever so time-consuming for us to judge
|that it is indeed a bird, is, after all, indeed a bird, and we know
|it, and can say so, with 100% accuracy every time, irrespective of
|whether we can successfully introspect what features we are using to
|say so) and (b) true "graded" categories such as "big," "intelligent,"
|etc. Let's face the all-or-none problem before we get fancy...

I don't believe there are any truly "all-or-none" categories.  There are
always, at least potentially, ambiguous cases.  There is no "100% accuracy
every time", and trying to theorize as though there were is likely to lead
to problems.

Second, and perhaps more to the point, how do you know that "graded"
categories are less fundamental than the other kind?  Maybe it's the other
way around.  Maybe we should try to understand graded
categories first, before we get fancy with the other kind.  I'm not saying
this is the case; but until we actually have an accepted theory of
categorization, we won't know what the simplest route is to get there.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

harnad@mind.UUCP (Stevan Harnad) (07/02/87)

smoliar@vaxa.isi.edu (Stephen Smoliar)
Information Sciences Institute writes:

>	Consider the holographic model proposed by Karl Pribram in LANGUAGES
>	OF THE BRAIN... as an alternative to [M.B. Brilliant's] symbol
>	manipulation scenario.

Besides being unimplemented and hence untested in what they can and can't
do, holographic representations seem to inherit the same handicap as
all iconic representations: Being unique to each input and blending
continuously into one another, how can holograms generate
categorization rather than merely similarity gradients (in the hard
cases, where obvious natural gaps in the input variation don't solve
the problem for you a priori)? What seems necessary is active
feature-selection, based on feedback from success and failure in attempts
to learn to sort and label correctly, not merely passive filtering
based on natural similarities in the input.
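
A minimal sketch of feedback-driven feature selection, as opposed to
passive similarity filtering; the items, the candidate features and the
greedy single-feature search are all invented for the example:

    items = [   # (features, supervised label) -- toy confusable inputs
        ({"size": 0.90, "has_back": 1, "legs": 4}, "chair"),
        ({"size": 0.80, "has_back": 0, "legs": 3}, "stool"),
        ({"size": 0.70, "has_back": 1, "legs": 4}, "chair"),
        ({"size": 0.85, "has_back": 0, "legs": 4}, "stool"),
    ]

    def sorting_accuracy(feature):
        """How well does sorting on this one feature reproduce the labels?"""
        cut = sum(f[feature] for f, _ in items) / len(items)
        guesses = ["chair" if f[feature] > cut else "stool" for f, _ in items]
        return sum(g == label for g, (_, label) in zip(guesses, items)) / len(items)

    candidates = ["size", "has_back", "legs"]
    best = max(candidates, key=sorting_accuracy)
    print(best, sorting_accuracy(best))   # 'has_back' 1.0 -- found via labeled feedback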

>	[A] difficulty seems to be in what it means to file something away if
>	one's memory is simply one of experiences.

Episodic memory -- rote memory for input experiences -- has the same
liability as any purely iconic approach: It can't generate category
boundaries where there is significant interconfusability among
categories of episodes.

>	Perhaps the difficulty is that the mind really doesn't want to
>	assign a symbol to every experience immediately.

That's right. Maybe it's *categories* of experience that must first be
selectively assigned names, not each raw episode.

>	Where does the symbol's name come from? How is the symbol actually
>	"bound" to what it retrieves?

That's the categorization problem.

>	The big unanswered question...[with respect to connectionism]
>	would appear to be:  will [it] all... scale upward?

Connectionism is one of the candidates for the feature-learning
mechanism. That it's (i) nonsymbolic, that it (ii) learns, and that it
(iii) uses the same general statistical algorithm across problem-types
(i.e., that it has generality rather than being ad hoc, like pure
symbolic AI) are connectionism's pluses. (That it's brainlike is not a
plus -- nor is it true on current evidence, nor even relevant at this stage.)
But the real question is indeed: How much can it really do (i.e., will it
scale up)?
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

harnad@mind.UUCP (Stevan Harnad) (07/02/87)

On ailist cugini@icst-ecf.arpa writes:

>	why say that icons, but not categorical representations or symbols
>	are/must be invertible? Isn't it just a vacuous tautology to claim
>	that icons are invertible wrt to the information they preserve, but
>	not wrt the information they lose?... there's information loss (many
>	to one mapping) at each stage of the game: 1. distal object...
>	2. sensory projection... 3. icons... 4. categorical representation...
>	5. symbols... do you still claim that the transition between 2
>	and 3 is invertible in some strong sense which would not be true of,
>	say, [1 to 2] or [3 to 4], and if so, what is that sense?... Perhaps
>	you just want to say that the transition between 2 and 3 is usually
>	more invertible than the other transitions [i.e., invertibility as a
>	graded category]?

[In keeping with Ken Laws' recommendation about minimizing quotation, I have
compressed this query as much as I could to make my reply intelligible.]

Iconic representations (IRs) must perform a very different function from
categorical representations (CRs) or symbolic representations (SRs).
In my model, IRs only subserve relative discrimination, similarity
judgment and sensory-sensory and sensory-motor matching. For all of
these kinds of task, traces of the sensory projection are needed for
purposes of relative comparison and matching. An analog of the sensory
projection *in the properties that are discriminable to the organism*
is my candidate for the kind of representation that will do the job
(i.e., generate the performance). There is no question of preserving
in the IR properties that are *not* discriminable to the organism.

As has been discussed before, there are two ways that IRs could in
principle be invertible (with the discriminable properties of the
sensory projection): by remaining structurally 1:1 with it or by going
into symbols via A/D and an encryption and decryption transformation in a
dedicated  (hard-wired) system. I hypothesize that structural copies are
much more economical than dedicated symbols for generating discrimination
performance (and there is evidence that they are what the nervous system
actually uses). But in principle, you can get invertibility and generate
successful discrimination performance either way.

CRs need not -- indeed cannot -- be invertible with the sensory
projection because they must selectively discard all features except
those that are sufficient to guide successful categorization
performance (i.e., sorting and labeling, identification). Categorical
feature-detectors must discard most of the discriminable properties preserved
in IRs and selectively preserve only the invariant properties shared
by all members of a category that reliably distinguish them from
nonmembers. I have indicated, though, that this representation is
still nonsymbolic; the IR to CR transformation is many-to-few, but it
continues to be invertible in the invariant properties, hence it is
really "micro-iconic." It does not invert from the representation to
the sensory projection, but from the representation to invariant features of
the category. (You can call this invertibility a matter of degree if
you like, but I don't think it's very informative. The important
difference is functional: What it takes to generate discrimination
performance and what it takes to generate categorization
performance.)
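
The contrast can be pictured with invented data: the iconic
representation keeps an analog copy of every discriminable property and
so inverts onto the sensory projection, while the categorical
representation keeps only the features assumed here to be invariant for
the category and inverts only onto those.

    sensory_projection = {"hue": 0.31, "size": 0.62, "has_back": 1, "texture": 0.80}

    def iconic(projection):
        return dict(projection)                  # analog copy: invertible to the projection

    INVARIANTS = ("has_back",)                   # features assumed invariant for "chair"

    def categorical(ir):
        return {k: ir[k] for k in INVARIANTS}    # many-to-few: everything else is discarded

    ir = iconic(sensory_projection)
    cr = categorical(ir)
    print(ir == sensory_projection)              # True: the IR inverts onto the projection
    print(cr)                                    # {'has_back': 1}: "micro-iconic" in the invariants only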

Finally, whatever invertibility SRs have is entirely parasitic on the
IRs and CRs in which they are grounded, because the elementary SRs out
of which the composite ones are put together are simply the names of
the categories that the CRs pick out. That's the whole point of this
grounding proposal.

I hope this explains what is invertible and why. (I do not understand your
question about the "invertibility" of the sensory projection to the distal
object, since the locus of that transformation is outside the head and hence
cannot be part of the internal representation that cognitive modeling is
concerned with.)

-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

marty1@houdi.UUCP (M.BRILLIANT) (07/03/87)

In article <958@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:
> On ailist cugini@icst-ecf.arpa writes:
> >	why say that icons, but not categorical representations or symbols
> >	are/must be invertible? Isn't it just a vacuous tautology to claim
> >	that icons are invertible wrt to the information they preserve, but
> >	not wrt the information they lose?... there's information loss (many
> >	to one mapping) at each stage of the game ...

In Harnad's response he does not answer the question "why?"  He
only repeats the statement with reference to his own model.

Harnad probably has either a real problem or a contribution to
the solution of one.  But when he writes about it, the verbal
problems conceal it, because he insists on using symbols that
are neither grounded nor consensual.  We make no progress unless
we learn what his terms mean, and either use them or avoid them.

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

berleant@ut-sally.UUCP (Dan Berleant) (07/04/87)

In article <956@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:
>Episodic memory -- rote memory for input experiences -- has the same
>liability as any purely iconic approach: It can't generate category
>boundaries where there is significant interconfusability among
>categories of episodes.

Are you assuming a representation of episodes (more generally,
exemplars) that is iconic rather than symbolic? Also, *no* category
representation method can generate category boundaries when there is
significant interconfusability among categories!

Dan Berleant
UUCP: {gatech,ucbvax,ihnp4,seismo...& etc.}!ut-sally!berleant
ARPA: ai.berleant@r20.utexas.edu

marty1@houdi.UUCP (M.BRILLIANT) (07/05/87)

In article <605@gec-mi-at.co.uk>, adam@gec-mi-at.co.uk (Adam Quantrill) writes:
> It seems to me that the Symbol Grounding problem is a red herring.

As one who was drawn into a problem that is not my own, let me
try answering that disinterestedly.  To begin with, a "red
herring" is something drawn across the trail that distracts the
pursuer from the real goal.  Would Adam tell us what his real
goal is? 

Actually, my own real goal, from which I was distracted by the
symbol grounding problem, was an expert system that would (like
Adam's last example) ground its symbols only in terminal I/O. 
But that's a red herring in the symbol grounding problem.

> ..... If I took a partially self-learning program and data (P & D) that had
> learnt from a computer with 'sense organs', and ran it on a computer without,
> would the program's output become symbolically ungrounded?

No, because the symbolic data was (were?) learned from sensory
data to begin with - like a sighted person who became blind.

> Similarly, if I myself wrote P & D without running it on a computer at all,
> [and came] up with identical P & D by analysis.  Does that make the original
> P & D running on the computer with 'sense organs' symbolically ungrounded?

No, as long as the original program learned its symbolic data
from its own sensory data, not by having them defined by a
person in terms of his or her sensory data.

> A computer can always interact via the keyboard & terminal screen (if those
> are the only 'sense organs'), grounding its internal symbols via people who
> react to the output, and provide further stimulus.

That's less challenging and less useful than true symbol
grounding.  One problem that requires symbol grounding (more
useful and less ambitious than the Total Turing Test) is a
seeing-eye robot: a machine with artificial vision that could
guide a blind person by giving and taking verbal instructions. 
It might use a Braille keyboard instead of speech, but the
"terminal I/O" must be "grounded" in visual data from, and
constructive interaction with, the tangible world.  The robot
could learn words for its visual data by talking to people who
could see, but it would still have to relate the verbal symbols
to visual data, and give meaning to the symbols in terms of its
ultimate goal (keeping the blind person out of trouble).

M. B. Brilliant					Marty
AT&T-BL HO 3D-520	(201)-949-1858
Holmdel, NJ 07733	ihnp4!houdi!marty1

harnad@mind.UUCP (Stevan Harnad) (07/05/87)

In Article 184 of comp.cog-eng: adam@gec-mi-at.co.uk (Adam Quantrill)
of Marconi Instruments Ltd., St. Albans, UK writes:

>	It seems to me that the Symbol Grounding problem is a red herring.
>	If I took a partially self-learning program and data (P & D) that had 
>	learnt from a computer with 'sense organs', and ran it on a computer
>	without, would the program's output become symbolically ungrounded?...
>	[or] if I myself wrote P & D without running it on a computer at all?

This begs two of the central questions that have been raised in
this discussion: (1) Can one speak of grounding in a toy device (i.e.,
a device with performance capacities less than those needed to pass
the Total Turing Test)? (2) Could the TTT be passed by just a symbol
manipulating module connected to transducers and effectors? If a
device that could pass the TTT were cut off from its transducers, it
would be like the philosophers' "brain in a vat" -- which is not
obviously a digital computer running programs.
-- 

Stevan Harnad                                  (609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard}  !princeton!mind!harnad
harnad%mind@princeton.csnet       harnad@mind.Princeton.EDU

aweinste@Diamond.BBN.COM (Anders Weinstein) (07/15/87)

In a previous message, I was prompted by Stevan Harnad's postings to try to
explain something I find very interesting, namely, why the psychology of
categorical perception won't do much to illuminate the difficult question of
how formal symbols should be semantically interpreted, i.e. what the symbols
really *mean*.  Harnad sent a long reply (message 972@mind.UUCP) explaining
the nature of his approach in great detail. The upshot, I think, is that in
spite of some of the rhetoric about "symbol grounding", Harnad's project is
not really *attempting* to do any such thing.  It merely aims to discover the
mechanisms underlying certain recognition skills. Since this more modest aim
was precisely what I was urging, I am satisfied that there is no major
disagreement between us.

I want to make clear that I am not here trying to pose any *objection* to
Harnad's model considered as a bit of psychology. I am only trying to
downplay its significance for philosophical issues.

Remember that the traditional conception of "meanings" or "concepts" involves
certain properties: for example, meanings are supposed to contain a criterion
which determines the correct application of the term, in effect defining the
metaphysical essence of the concept in question; they are supposed to serve
as elementary constituents of more complex concepts and thoughts; and they
are supposed to license analytic implications, such as "all bachelors are
unmarried". Since none of these properties seem to be required of the
representations in Harnad's theory, it is in a philosophical sense *not* a
theory of "concepts" or "meanings" at all. As Harnad should be be happy to
concede.

But I want to emphasize again an important reason for this which Harnad
seemed not to acknowledge.  There is a vast difference between the
quick, observational categorization that psychologists tend (rightly) to
focus on and the processes involved in what might be called "conclusive"
classification.  This is the difference between the ability to recognize
something as fish-like in, say, 500 milliseconds, and the ability to
ascertain that something *really* is a fish and not, say, an aquatic mammal.

Now the former quick and largely unconscious ability seems at least a
plausible candidate for revealing fundamental cognitive mechanisms.  The
latter, however, may involve the full exercise of high-level cognition --
remember, conclusive classification can require *years* of experiment,
discussion and debate, and potentially involves everything we know. The
psychology of conclusive categorization does *not* deal with some specialized
area of cognition -- it's just the psychology of all of science and human
rationality, the cognitive scientist's Theory of Everything. And I don't
expect to see such a thing any time soon.

Confusion can result from losing sight of the boundary between these two
domains, for results from the former do not carry over to the latter. And I
think Harnad's model is only reasonably viewed as applying to the first of
these.  The rub is that it seems that the notion of *meaning* has more to do
with what goes on in the second.  Indeed, what I find most interesting in all
this is the way recent philosophy suggests that concepts or meanings in the
traditional sense are essentially *outside* the scope of foreseeable psychology.

Some other replies to Harnad:

Although my discussion was informed by Quine's philosophy in its reference to
"meaning holism", it was otherwise not all that Quinean, and I'm not sure
that Quine's highly counter-intuitive views could be called "standard." Note
also that I was *not* arguing from Quine's thesis of the indeterminacy of
translation; nor did I bring up Putnam's Twin-Earth example. (Both of these
arguments would be congenial to my points, but I think they're excessively
weighty sledgehammers to wield in this context). The distinction between
observational and "conclusive" classification, however, does bear in mind 
Putnam's points about the non-necessity of stereotypical properties.

I also don't think that philosophers have been looking for "the wrong thing
in the wrong way." I think they have made a host of genuine discoveries about
the nature of meaning -- you cite several in your list of issues you'd prefer
to ignore.  The only "failure" I mentioned was the inability to come up with
necessary and sufficient definitions for almost anything. (Not at all, by the
way, a mere failure of "introspection".)

I *do* agree that the aims of philosophy are different than those of
psychology. Indeed, because of this difference of goals, you shouldn't feel
you have to argue *against* Quine or Putnam or even me. You merely have to
explain why you are side-stepping those philosophical issues (as I think you
have done). And the reason in brief is that philosophers are investigating
the notion of meaning and you are not.

Anders Weinstein
BBN Labs