harnad@mind.UUCP (Stevan Harnad) (06/28/87)
marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel asks:

> Why require 100% accuracy in all-or-none categorizing?... I learned
> recently that I can't categorize chairs with 100% accuracy.

This is a misunderstanding. The "100% accuracy" refers to the all-or-none-ness of the kinds of categories in question. The rival theories in the Roschian tradition have claimed that many categories (including "bird" and "chair") do not have "defining" features. Instead, membership is either fuzzy or a matter of degree (i.e., percent), being based on degree of similarity to a prototype or to prior instances, or on "family resemblances" (as in Wittgenstein on "games"), etc. I am directly challenging this family of theories as not really providing a model for categorization at all.

The "100% accuracy" refers to the fact that, after all, we do succeed in performing all-or-none sorting and labeling, and that membership assignment in these categories is not graded or a matter of degree (although our speed and "typicality ratings" may be). I am not, of course, claiming that noise does not exist or that errors may not occur under certain conditions. Perhaps I should have put it this way: Categorization performance (with all-or-none categories) is highly reliable (close to 100%) and MEMBERSHIP is 100%. Only speed/ease of categorization and typicality ratings are a matter of degree. The underlying representation must hence account for all-or-none categorization capacity itself first, then worry about its fine-tuning.

This is not to deny that even all-or-none categorization may encounter regions of uncertainty. Since ALL category representations in my model are provisional and approximate (relative to the context of confusable alternatives that have been sampled to date), it is always possible that the categorizer will encounter an anomalous instance that he cannot classify according to his current representation. The representation must hence be revised and updated under these conditions, if ~100% accuracy is to be re-attained. This still does not imply that membership is fuzzy or a matter of degree, however, only that the (provisional "defining") features that will successfully sort the members must be revised or extended. The approximation must be tightened. (Perhaps this is what happened to you with your category "chair.") The models for the truly graded (non-all-or-none) and fuzzy categories are, respectively, "big" and "beautiful."

> The class ["chair," "bird"] is defined arbitrarily by inclusion
> of specific members, not by features common to the class. It's not so
> much a class of objects, as a class of classes.... If that is so,
> then "bird" as a categorization of "penguin" is purely symbolic, and
> hence is arbitrary, and once the arbitrariness is defined
> out, that categorization is a logical, 100% accurate, deduction.
> The class "penguin" is closer to the primitives that we infer
> inductively [?] from sensory input... But the identification of
> "penguin" in a picture, or in the field, is uncertain because the
> outlines may be blurred, hidden, etc. So there is no place in the
> pre-symbolic processing of sensory input where 100% accuracy is
> essential. (This being so, there is no requirement for invertibility.)

First, most categories are not arbitrary. Physical and ecological constraints govern them. (In the case of "chair," this includes the Gibsonian "affordance" of being something that can be sat upon.)
One of the constraints may be social convention (as in stipulations of what we call what, and why), but for a categorizer that must learn to sort and label correctly, that's just another constraint to be satisfied. Perhaps what counts as a "game" will turn out to depend largely on social stipulation, but that does not make its constraints on categorization arbitrary: Unless we stipulate that "gameness" is a matter of degree, or that there are uncertain cases that we have no way to classify as "game" or "nongame," this category is still an all-or-none one, governed by the features we stipulate.

(And I must repeat: Whether or not we can introspectively report the features we are actually using is irrelevant. As long as reliable, consensual, all-or-none categorization performance is going on, there must be a set of underlying features governing it -- both with sensory and more abstract categories. The categorization theorist's burden is to infer or guess what those features really are.)

Nor is "symbolic" synonymous with arbitrary. In my grounding scheme, for example, the primitive categories are sensory, based on nonsymbolic representations. The primitive symbols are then the names of sensory categories; these can then go on to enter into combinations in the form of symbolic descriptions. There is a very subtle "entry-point" problem in investigating this bottom-up quasi-hierarchy, however: Is a given input sensory or symbolic? And, somewhat independently, is its categorization mediated by a sensory representation or a symbolic one (or both, since there are complicated interrelations [especially inclusion relations] between them, including redundancies and sometimes even incoherencies)? The Roschian experimental and theoretical line of work I am criticizing does not attempt to sort any of this out, and no wonder, because it is not really modeling categorization performance in the first place, just its fine-tuning.

As to invertibility: I must again repeat, an iconic representation is only analog in the properties of the sensory projection that it preserves, not those it fails to preserve. Just as our successful all-or-none categorization performance dictates that a reliable feature set must have been selected, so our discrimination performance dictates the minimal resolution capacity and invertibility there must be in our iconic representations.
--
Stevan Harnad
(609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad
harnad%mind@princeton.csnet
harnad@mind.Princeton.EDU
marty1@houdi.UUCP (M.BRILLIANT) (06/29/87)
In article <931@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes:

> marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel asks:
> > Why require 100% accuracy in all-or-none categorizing?... I learned
> > recently that I can't categorize chairs with 100% accuracy.
>
> This is a misunderstanding. The "100% accuracy" refers to the
> all-or-none-ness of the kinds of categories in question. The rival
> theories in the Roschian tradition have claimed that many categories
> (including "bird" and "chair") do not have "defining" features. Instead,
> membership is either fuzzy or a matter of degree (i.e., percent)....

OK: once I classify a thing as a chair, there are no two ways about it: it's a chair. But there can be a stage when I can't decide. I vacillate: "I think it's a chair." "Are you sure?" "No, I'm not sure, maybe it's a bed." I would never say seriously that I'm 40 percent sure it's a chair, 50 percent sure it's a bed, and 10 percent sure it's an unfamiliar object I've never seen before. I think this is in agreement with Harnad when he says:

> Categorization performance (with all-or-none categories) is highly reliable
> (close to 100%) and MEMBERSHIP is 100%. Only speed/ease of categorization and
> typicality ratings are a matter of degree....
> This is not to deny that even all-or-none categorization may encounter
> regions of uncertainty. Since ALL category representations in my model are
> provisional and approximate ..... it is always possible that
> the categorizer will encounter an anomalous instance that he cannot classify
> according to his current representation.....
> ...... This still does not imply that membership is
> fuzzy or a matter of degree.....

So to pass the Total Turing Test, a machine should respond the way a human does when faced with inadequate or paradoxical sensory data: it should vacillate (or bluff, as some people do). In the presence of uncertainty it will not make self-consistent statements about uncertainty, but uncertain and possibly inconsistent statements about absolute membership.

M. B. Brilliant     Marty
AT&T-BL HO 3D-520   (201)-949-1858
Holmdel, NJ 07733   ihnp4!houdi!marty1
dgordon@teknowledge-vaxc.ARPA (Dan Gordon) (06/30/87)
In article <931@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:

> (And I must repeat: Whether or not we can introspectively report the features
> we are actually using is irrelevant. As long as reliable, consensual,
> all-or-none categorization performance is going on, there must be a set of
> underlying features governing it -- both with sensory and more

Is this so? There is no reliable, consensual all-or-none categorization performance without a set of underlying features? That sounds like a restatement of the categorization theorist's credo rather than a thing that is so.

Dan Gordon
harnad@mind.UUCP (Stevan Harnad) (07/01/87)
dgordon@teknowledge-vaxc.ARPA (Dan Gordon) of Teknowledge, Inc., Palo Alto CA writes:

> There is no reliable, consensual all-or-none categorization performance
> without a set of underlying features? That sounds like a restatement of
> the categorization theorist's credo rather than a thing that is so.

If not, what is the objective basis for the performance? And how would you get a device to do it given the same inputs?
--
Stevan Harnad
(609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad
harnad%mind@princeton.csnet
harnad@mind.Princeton.EDU
aweinste@Diamond.BBN.COM (Anders Weinstein) (07/01/87)
In article <949@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:

>>> There is no reliable, consensual all-or-none categorization performance
>>> without a set of underlying features? That sounds like a restatement of
>>> the categorization theorist's credo rather than a thing that is so.
>
> If not, what is the objective basis for the performance? And how would
> you get a device to do it given the same inputs?

I think there's some confusion as to whether Harnad's claim is just an empty tautology or a significant empirical claim. To wit: it's clear that we can reliably recognize chairs from sensory input, and we don't do this by magic. Hence, we can perhaps take it as trivially true that there are some "features" of the input that are being detected. If we are taking this line, however, we have to remember that it doesn't really say *anything* about the operation of the mechanism -- it's just a fancy way of saying we can recognize chairs.

On the other hand, it might be taken as a significant claim about the nature of the chair-recognition device, viz., that we can understand its workings as a process of actually parsing the input into a set of features and actually comparing these against what is essentially some logical formula in featurese. This *is* an empirical claim, and it is certainly dubitable: there could be pattern recognition devices (holograms are one speculative suggestion) which cannot be interestingly broken down into feature-detecting parts.

Anders Weinstein
BBN Labs
dgordon@teknowledge-vaxc.ARPA (Dan Gordon) (07/02/87)
In article <949@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes:

> dgordon@teknowledge-vaxc.ARPA (Dan Gordon)
> of Teknowledge, Inc., Palo Alto CA writes:
>
>> There is no reliable, consensual all-or-none categorization performance
>> without a set of underlying features? That sounds like a restatement of
>> the categorization theorist's credo rather than a thing that is so.
>
> If not, what is the objective basis for the performance? And how would
> you get a device to do it given the same inputs?

Not a riposte, but some observations:

1) Finding an objective basis for a performance and getting a device to do it given the same inputs are two different things. We may be able to find an objective basis for a performance but be unable (for merely contingent reasons, like engineering problems, etc., or for more fundamental reasons) to get a device to exhibit the same performance. And, I suppose, the converse is true: we may be able to get a device to mimic a performance without understanding the objective basis for the model (chess programs seem to me to fall into this class).

2) There may in fact be categorization performances that a) do not use a set of underlying features; b) have an objective basis which is not feature-driven; and c) can only be simulated (in the strong sense) by a device which likewise does not use features. This is one of the central prongs of Wittgenstein's attack on the positivist approach to language, and although I am not completely convinced by his criticisms, I haven't run across any very convincing rejoinder.

Maybe more later,    Dan Gordon
harnad@mind.UUCP (Stevan Harnad) (07/02/87)
dgordon@teknowledge-vaxc.ARPA (Dan Gordon) of Teknowledge, Inc., Palo Alto CA writes:

> finding an objective basis for a performance and getting a device to
> do it given the same inputs are two different things. We may be able
> to find an objective basis for a performance but be unable...to get a
> device to exhibit the same performance. And, I suppose, the converse
> is true: we may be able to get a device to mimic a performance without
> understanding the objective basis for the model

I agree with part of this. J.J. Gibson argued that the objective basis of much of our sensorimotor performance is in stimulus invariants, but this does not explain how we get a device (like ourselves) to find and use those invariants and thereby generate the performance. I also agree that a device (e.g., a connectionist network) may generate a performance without our understanding quite how it does it (apart from the general statistical algorithm it's using, in the case of nets). But the point I am making is neither of these. It concerns whether performance (correct all-or-none categorization) can be generated without an objective basis (in the form of "defining" features) (a) existing and (b) being used by any device that successfully generates the performance. Whether or not we know what the objective basis is and how it's used is another matter.

> There may in fact be categorization performances that a) do not use
> a set of underlying features; b) have an objective basis which is not
> feature-driven; and c) can only be simulated (in the strong sense) by
> a device which likewise does not use features. This is one of the
> central prongs of Wittgenstein's attack on the positivist approach to
> language, and although I am not completely convinced by his criticisms,
> I haven't run across any very convincing rejoinder.

Let's say I'm trying to provide the requisite rejoinder (in the special case of all-or-none categorization, which is not unrelated to the problems of language: naming and description). Wittgenstein's arguments were not governed by a thoroughly modern constraint that has arisen from the possibility of computer simulation and cognitive modeling. He was introspecting on what the features defining, say, "games" might be, and he failed to find a necessary and sufficient set, so he said there wasn't one. If he had instead asked: "How, in principle, could a device categorize 'games' and 'nongames' successfully in every instance?" he would have had to conclude that the inputs must provide an objective basis which the device must find and use. Whether or not the device can introspect and report what the objective basis is is another matter.

Another red herring in Wittgenstein's "family resemblance" metaphor was the issue of negative and disjunctive features. Not-F is a perfectly good feature. So is Not-F & Not-G, whose negation (by De Morgan's law) quite naturally yields the disjunctive feature F-or-G. None of this is tautologous. It just shows up a certain arbitrary myopia there has been about what a "feature" is. There's absolutely no reason to restrict "features" to monadic, conjunctive features that subjects can report by introspection. The problem in principle is whether there are any logical (and nonmagical) alternatives to a feature-set sufficient to sort the confusable alternatives correctly. I would argue that -- apart from contrived, gerrymandered cases that no one would want to argue formed the real basis of our ability to categorize -- there are none.
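For concreteness, here is a minimal sketch in Python of the point about negative and disjunctive features: any all-or-none predicate over the input can serve as a "feature," and negation, conjunction and disjunction compose such predicates freely. The predicate names and the toy hue thresholds are invented for illustration only; nothing hangs on them.

    from typing import Callable, Dict

    # A "feature" here is just an all-or-none predicate over an input description.
    Feature = Callable[[Dict[str, float]], bool]

    def is_red(x: Dict[str, float]) -> bool:        # monadic feature F
        return x["hue"] < 0.1                       # toy threshold

    def is_green(x: Dict[str, float]) -> bool:      # monadic feature G
        return 0.3 < x["hue"] < 0.4                 # toy threshold

    def not_(f: Feature) -> Feature:                # negative feature: Not-F
        return lambda x: not f(x)

    def and_(f: Feature, g: Feature) -> Feature:    # conjunctive feature: F & G
        return lambda x: f(x) and g(x)

    def or_(f: Feature, g: Feature) -> Feature:     # disjunctive feature: F-or-G
        return lambda x: f(x) or g(x)

    # De Morgan: negating (Not-F & Not-G) gives exactly F-or-G.
    red_or_green = not_(and_(not_(is_red), not_(is_green)))

    sample = {"hue": 0.35}                          # a "greenish" toy input
    assert red_or_green(sample) == or_(is_red, is_green)(sample)
    print("member" if red_or_green(sample) else "nonmember")   # all-or-none verdict

The composed detector passes exactly the red-or-green inputs; there is no explicit, introspectable "formula" stored anywhere beyond the composition itself.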
Finally, in the special case of categorization, the criterion of "defining" features also turns out to be a red herring. According to my own model, categorization is always provisional and context-dependent (it depends on what's needed to successfully sort the confusable alternatives sampled to date). Hence an exhaustive "definition," good till doomsday and formulated from the God's-eye viewpoint, is not at issue, only an approximation that works now, and can be revised and tightened if the context is ever widened by further confusable alternatives that the current feature set would not be able to sort correctly.

The conflation of (1) features sufficient to generate the current provisional (but successful) approximation and (2) some nebulous "eternal," ontologically exact "defining" set (which I agree does not exist, and may not even make sense, since categorization is always a relative, "compared-to-what?" matter) has led to a multitude of spurious misunderstandings -- foremost among them being the misconception that our categories are all graded or fuzzy.
--
Stevan Harnad
(609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad
harnad%mind@princeton.csnet
harnad@mind.Princeton.EDU
harnad@mind.UUCP (Stevan Harnad) (07/04/87)
aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc., Cambridge, MA writes:

> [It's] tempting to suppose that all complex concepts *must* have implicit
> definitions in terms of some atomic ones, even if...largely
> unconscious... [but] philosophy has spent two thousand years searching
> for implicit definitions of concepts without any conspicuous success.

First of all, let me say that this rejoinder of Weinstein's is excellent. It portrays the standard Quinean view on these matters, so far as I can tell, faithfully and resourcefully. It will be a pleasure attempting to refute this sophisticated position, and if I succeed, I hope that the outcome cannot fail to be informative to all who have been following these exchanges, particularly in view of the influential status of the Quinean view.

In replying, however, I have been obliged to quote extensively from Weinstein's articulate statements, despite Ken Laws's valid request that we minimize quotes (and my sincere efforts to comply with it). The facility of quoting is one of the unique powers of electronic communication, though, and I think in this case paraphrase or cross-reference would have caused more confusion and discontinuity than it was worth. Now my response:

It is important to note that -- even in this ecumenical age of "cognitive science" -- the concerns of philosophy and of empirical psychology are not the same. In other words, it may be that philosophy was searching for 2000 years in the wrong way or for the wrong thing. Probably both. This will become clearer in my response, but what I claim is that (i) the only way to find out how far a bottom-up approach to concepts grounded in sensory features can get you is actually to model it -- to see what performance you can get out of a device that functions that way. (Even what I'm doing is just prolegomena to such modeling, by the way, but I think I've got the methodological constraints right and some hints as to how one might start.) Philosophy has certainly not been doing that. (ii) "Definitions" (implicit or otherwise) are not what we're looking for in modeling our use of concepts. We're looking for what kinds of internal structures and processes a device must have in order to be able to do what we can do.

The fact that philosophers have failed to introspect exhaustive definitions of concepts is not evidence about what internal representations may or may not actually underlie concepts. Not only is it possible that explicit, verbalizable definitions are not what these representations consist of, but it is unlikely that even their "implicit" counterparts will be "definitions" at all. According to my own model, for example, the features that pick out a category named "X" will never be the real, "essential" features that define X's ontologically, from the eternal, omniscient point of view. They will only be the local, context-dependent features that allow a categorizer to sort X's and non-X's CORRECTLY (sic -- I'll return to this) on the basis of the sample of interconfusable X's he has encountered to date. We're not defining X's. We're picking out the features available from the sensory projection that will reliably sort the X's and non-X's we encounter. This provisional, approximate, context-dependent representation then allows us to use "X" in grounded composite symbolic descriptions of higher-order objects not so closely tied to our sense experience. These descriptions too merely provisionally pick out rather than definitively define.
Nor is it some exact object that's being picked out; just the best current approximation on the data available.

> [Why the psychology of categorization won't dent the problem of meaning:]
> "Angry gods nearby" is composite in *English*, but it need not be
> composite in native, or, more to the point, in the supposed inner
> language of the native's categorical mechanisms. They may have a single
> word, say "gog", which we would want to translate as "god-noise" or some
> such. Perhaps they train their children to detect gog in precisely the
> same way we train children to detect thunder -- our internal
> thunder-detectors are identical. Nevertheless, the output of their
> thunder-detector does not *mean* "thunder".

Besides the obvious rejoinder that -- in a very real sense -- "gog" and "thunder" ARE picking out the same thing to an approximation, I think you are underestimating the complexity and resources of compositeness versus atomicity (i.e., descriptions versus names). Utterances are not infinitely decomposable; there are elementary labels that simply refer to an object or a state of affairs, rather than predicate something more complex about it, and some of these objects will be sensory, and picked out by sensory attributes alone. The rest, I claim, can be grounded in combinations of these labels (stating category inclusion relations, to begin with). If "gog" is really a holophrastic description rather than an atomic name, then there must be a way of decomposing it into its components ("angry," "gods," etc.), which will then themselves either be composite or atomic, and if atomic (and sensory), then the grounding can start there. "Thunder," on the other hand, need not be decomposable in that way, and its representation need not presuppose a similar set of interrelations with other representations. (This is not to say that it does not have interrelations with other names and descriptions and their underlying features; just that it does not have the ones the composite holophrastic "gog" must have.)

I'll return to the issue of training below. For now, let me say that although I myself introduced the problem of "meaning" ("intentionality" etc.) in formulating the symbol grounding problem in the first place, I was appealing mainly to the informal, intuitive "folk-psychological" meaning of meaning. We all know what "meaningful" vs. "meaningless" means. We all know what it's like to mean something, and what it's like not to know what something means. That's really all I want to take on. On the other hand, the long line of intentionality conundrums -- beginning with Frege's "morning star" and "evening star" and passing through puzzles about referential opacity and culminating in Putnam's "water/twin-water" koans (and related Quinean "gavagai" problems and even Goodmanian "green/grue" and Kuhnian incommensurability) -- I would rather keep my distance from, as not a helpful legacy from philosophers' 2000-year unsuccessful struggle with meaning.

> there are two reasons why meaning resists explication by this kind
> of psychology: (1) holism: the meaning of even a "grounded" symbol will
> still depend on the rest of the cognitive system; and (2) normativity:
> meaning is dependent upon a determination of what is a *correct*
> response, and you can't simply read such a norm off from a description
> of how the mechanism in fact performs.
(1) "Holism" is a vague notion, but I take it that Quine has in mind that the meanings of words are intimately interrelated, and that a change of meaning in one may require adjustments, perhaps even radical ones, throughout the entire system. I think this is something that the kind of bottom-up grounding scheme I'm proposing is particularly well suited to handle, and I discuss it explicitly in the theoretical chapter of the book under discussion here ("Categorical Perception"). One of the most important features of this approach is what I've dubbed "approximationism": All category representations are provisional and approximate, depending on the confusable alternatives sampled to date. This means that feature-sets are open to revision, perhaps even radical revision, if the existing context is too narrow or unrepresentative. The only constraint is that all prior contexts must be subsumed as special cases; in other words, the updating is convergent. Grounding is itself a "holistic" relation, and any ground-level change in the representation will ramify bottom-up to everything that's grounded in it (for example, gog's meaning changes if gods turn out not to exist). This is not to say, however, that incoherencies can't make their way into such a system, or that it will always behave optimally or rationally. (2) "Normativity" is no problem for an approximationist device whose internal principles of function have nothing at all to do with questions about what things "really" are (or "really mean"). These principles only concern what you can reliably sort and label on the evidence available: Every category learning task has a source of feedback about "right" and "wrong." If you are in the wild and you're hungry and only mushrooms are available, there's a distinct ecological constraint to guide you in sorting "edibles" from "inedibles." Less radically, most of our transactions with objects and events that require categorization are attended by feedback from the consequences of MIScategorization (otherwise why bother?). And often the feedback source is good old-fashioned instruction, some of it based on preemptive ecological experience, some of it just based on arbitrary convention. The trick for the theorist is to forget about what a label "really" picks out and just worry about the actually sample a device sorts, and how. So neither holism nor "norms" seem to be a problem for the categorization model I am describing. And whether some of its internal representations are justifiably interpreted as "meanings" depends ultimately on whether or not its performance is TTT-indistinguishable (Total Turing Test) from ours. (Let's not get into another round about whether this is the ONLY criterion again...) > The fact that a subject's brain reliably asserts the symbol "foo" when > and only when thunder is presented in no way "fixes" the meaning of > "foo". Of course it is obviously a *constraint* on what "foo" may > mean: it is in fact part of what Quine called the "stimulus meaning" > of "foo", his first constraint on acceptable translation. Nevertheless, > by itself it is still way too weak to do the whole job, for in different > contexts the positive output of a reliable thunder-detector could mean > "thunder", something co-extensive but non-synonymous with "thunder", > "god-noise", or just about anything else. Indeed, it might not *mean* > anything at all, if it were only part of a mechanical thunder-detector > which couldn't do anything else... I wonder if you disagree with this? 
I agree with most of this. I certainly agree about context-dependence and what sounds like approximateness. I don't really know what Quine's "stimulus meaning" is, but perhaps it could be cashed in by coming up with the right performance model. That theoretical task, however, is anything but trivial, and the real work seems to begin where Quine's vague descriptor leaves off. (Same for "behavioral dispositions.") I also agree that a sub-TTT device may have nothing worthy of being interpreted as "meaning" at all. Hence much of meaning must have to do with the interrelations among the representations subserving our total sorting, labeling and describing capacity; and it of course depends on the context of interconfusable alternatives that any given device can successfully sort and describe -- the "compared to what?" factor. Widen the context and you narrow the options on what an isolated act of stimulus-naming (and the underlying structures and processes generating it) might "mean."

(I've always felt that radical alternative translations are unlikely to exist because of constraints on the permutations and combinations that will still yield a coherently decryptable story. In the propositional calculus, conjunction/negation and disjunction/negation may be "duals," but it's not clear that more complex alternative permutations are possible in the semantics of natural language. They may leave no degrees of freedom. See also the contributions of Dan Berleant to this discussion on that topic. I think similar considerations may apply to inverted-spectrum thought-experiments regarding qualia, i.e., swapping red and green, etc.)

> As to normativity, the force of problem (2) is particularly acute when
> talking about the supposed intentionality of animals, since there aren't
> any obvious linguistic or intellectual norms that they are trying to
> adhere to. Although the mechanics of a frog's prey-detector may be
> crystal clear, I am convinced that we could easily get into an endless
> debate about what, if anything, the output of this detector really
> *means*.

I agree, although that may partly be a problem with the weakness of our ecological knowledge and cross-species intuitions with respect to the infrahuman-TTT. It may also be a consequence of the preeminent role language plays in our judgments (as perhaps it should). But one can certainly speak about "right" and "wrong" in an animal's categorization performance, both with respect to evolutionary adaptation and learning. And approximationism relieves us of having to decide the fact of the matter about what EXACTLY the frog's bug-detector is picking out. To an approximation it might be the same thing ours is picking out... But, not being TTT-equivalent to us, frogs may well be "meaning" nothing at all.

> in doing this sort of psychology, we probably won't care about the
> difference between correctly identifying a duck and mis-identifying
> a good decoy -- we're interested in the perceptual mechanisms that are
> the same in both cases. In effect, we are limiting our notion of
> "categorization" to something like "quick and largely automatic
> classification by observation alone".

Whether duck/decoy is a good enough approximation for duck depends on context and consequences. (For the unfortunate hunted duck, it matters.) But there are big differences between innate and learned categories (the former are not revisable in an individual lifetime) and not all categories are sensory. They're simply all GROUNDED in sensory categories.
> We pretty much *have* to restrict ourselves in this way, because, in the
> general case, there's just no limit to the amount of cognitive activity
> that might be required in order to positively classify something.
> Consider what might go into deciding whether a dolphin ought to be
> classified as a fish, whether a fetus ought to be classified as a
> person, etc. These decisions potentially call for the full range of
> science and philosophy, and a psychology which tries to encompass such
> decisions has just bitten off more than it can chew: it would have to
> provide a comprehensive theory of rationality, and such an ambitious
> theory has eluded philosophers for some time now... we seem committed
> to the notion that we are limiting ourselves to particular *modules*
> as explained in Fodor's modularity book. Unfortunately... these
> normative distinctions *are* significant for the *meaning* of symbols.
> ("Duck" doesn't *mean* the same thing as "decoy").

I'd like to try having that bite and chewing it too. As I suggested before, philosophers may have failed because they never really tried. And holism is not a problem for my kind of model, for example, because there's no restriction on how much of a grounded hybrid system is used to form one categorization, concrete or abstract. And, as I mentioned, the current provisional approximation is always open to updating (say, on the basis of new scientific findings) by widening the context. Nor are "norms" a problem; category formation is always guided by feedback -- either ecological or social -- about what labels and descriptions are right and wrong. I disagree, though, that a successful model calls for a comprehensive theory of rationality (any more than it needs a periscope on ontic reality): It need only be able to make the fallible practical inferences we can and do make. I also see nothing that commits a grounded bottom-up system of the kind I'm describing to any kind of modularity; on the contrary. And "duck" doesn't mean the same as "decoy" only because there are ways we can and do tell them apart.

> I think there's some confusion as to whether Harnad's claim [about
> the necessity of a sufficient feature-set] is just an empty tautology
> or a significant empirical claim. To wit: it's clear that we can
> reliably recognize chairs from sensory input, and we don't do this by
> magic. Hence, we can perhaps take it as trivially true that there are
> some "features" of the input that are being detected. If we are taking
> this line however, we have to remember that it doesn't really say
> *anything* about the operation of the mechanism -- it's just a fancy
> way of saying we can recognize chairs.

I agree that I haven't provided a feature-learning mechanism (although I've suggested some candidates, such as connectionist nets or some other inductive statistical algorithm). I've just argued that one must exist. But those who were disagreeing were suggesting that category membership is really graded, not all-or-none, and that sufficient feature-sets do not and need not exist. Be it ever so fancy, it matters whether we categorize chairs as chairs on an all-or-none featural basis or as a matter of degree (of similarity to a "template," say).
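To make "inductive statistical algorithm" concrete, here is one minimal sketch in Python of a candidate feature-learning mechanism of the sort just alluded to: a toy perceptron-style learner that revises a provisional feature-weighting whenever feedback says an instance has been missorted, and keeps revising until the whole sample encountered to date is sorted correctly. The feature vectors and labels are invented for illustration; this is a sketch of one candidate, not of the model under discussion.

    # Toy labeled sample: (feature vector, 1 = member, 0 = nonmember).
    # The three coordinates stand in for whatever the sensory projection
    # actually affords (invented values, purely illustrative).
    sample = [
        ((1.0, 1.0, 0.0), 1),
        ((1.0, 0.0, 0.0), 0),
        ((0.0, 1.0, 1.0), 0),
        ((1.0, 1.0, 1.0), 1),
    ]

    w = [0.0, 0.0, 0.0]   # provisional feature weights
    b = 0.0               # provisional threshold (bias)

    def categorize(x):
        """All-or-none verdict: 1 (member) or 0 (nonmember), never a degree."""
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

    # Feedback-driven revision: whenever the current provisional representation
    # missorts an instance, adjust it; stop only when the whole sample to date
    # is sorted correctly (the approximation has been "tightened").
    for _ in range(1000):
        errors = 0
        for x, label in sample:
            delta = label - categorize(x)       # feedback: right (0) or wrong (+/-1)
            if delta != 0:
                errors += 1
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                b += delta
        if errors == 0:
            break

    print("weights:", w, "bias:", b)
    print("verdicts:", [categorize(x) for x, _ in sample])

If a later, confusable instance turned up that these weights missort, running the same loop over the enlarged sample would play the role of widening the context and revising the provisional feature set, without ever making membership itself a matter of degree.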
I think the whole line of research based on "family resemblances" and prototype-matching is wrong-headed and based on misunderstandings about what features and feature-detectors are; moreover, it begs most of the questions involved in trying to get a device to perform successful all-or-none categorization at all. If the existence and use of sufficient feature-sets is so certain that it's tautologous, tell that to the ones who seem to be denying it!

> On the other hand, it might be taken as a significant claim about the
> nature of the chair-recognition device, viz., that we can understand
> its workings as a process of actually parsing the input into a set of
> features and actually comparing these against what is essentially some
> logical formula in featurese. This *is* an empirical claim, and it is
> certainly dubitable: there could be pattern recognition devices
> (holograms are one speculative suggestion) which cannot be
> interestingly broken down into feature-detecting parts.

In another response I argue that holograms and other iconic representations cannot do nontrivial categorization (i.e., problems in which there are no obvious gaps in the variation and the feature-set is complex and underdetermined). I also do not favor "logical formulas in featurese" (which sounds as if it has gone symbolic prematurely). A disjunctive feature-detector need not have any explicit formulas. It could be a selective filter that only passes input that is, say, red or green; i.e., it could be "micro-iconic" -- invertible only in red-or-green-ness. I also don't think the representation of "chair" is likely to be purely sensory; it's probably a higher-order category grounded in sensory categories. I think there's plenty in what I claim that is dubitable (hence empirical), if not dubious.
--
Stevan Harnad
(609) - 921 7771
{bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad
harnad%mind@princeton.csnet
harnad@mind.Princeton.EDU