reiter@endor.harvard.edu (Ehud Reiter) (01/18/89)
Steve Harnad writes
>"Information content" depends on resolving uncertainty. Objective
>uncertainty is a function of objective consequences: It matters
>whether this is a "mushroom" or a "toadstool," because if I miscategorize,
>I may die. To the extent that categorization is guided by objective
>consequences, it is nonarbitrary ...

The problem here is who makes up the categories? Professional biologists
don't particularly care if a plant is poisonous or not, or if an animal is
edible - they're much more interested in organ structure, evolutionary
history, biochemical details, and so forth. The categories they come up
with are pretty good at predicting such details, but they're much less
useful at predicting more mundane attributes like edibility.

So, the categories used by professional biologists may not be very useful
to the average language user. This would not matter, except for the fact
that the language community at large has been trying to align its
biological categories with the ones used by biologists. So, the community
has decided that "bird" means a member of class Aves, and thus penguins
are birds, but bats are not. Similarly, "fish" means member of class
Agnatha, Placodermi, Chondrichthyes, or Osteichthyes. Thus, a whale is not
a fish, since it is a member of class Mammalia.

In short, how useful modern English biological categories are to the
average language user (as opposed to the professional biologist) may be
questionable.

>[Categories] may or may not be human creations, but they must be based on
>objectively observable features if they are to be reliably picked out
>(as they are).

Cladistic and evolutionary taxonomists define categories phylogenetically,
not in terms of observable physiological features (see my previous
posting). Identification procedures that are based on observable features
can usually be constructed, although these may be based on "family
resemblance" ideas.
From Ernst Mayr, PRINCIPLES OF SYSTEMATIC ZOOLOGY, pg 88:

"As a heritage from the days when classification was considered synonymous
with identification, there is an erroneous concept of the higher taxon, or
rather the members of a higher taxon, as the carriers of an identifying
character. A taxon is in fact a group of [evolutionary] relatives, and
whether or not they have the same "characters in common" is irrelevant.
Many taxa are based on a combination of characters, and frequently not a
single one of these characters is present in all members of the taxon, yet
such a taxon may have a sound "polythetic" basis."

["polythetic" means "Each species possesses a large (but unspecified)
number of the total number of properties of the taxon" - page 83]

I question whether an average language user is in fact capable of always
reliably identifying a "bird", a "mammal", or a "fish" (i.e. a member of
class Aves, class Mammalia, or classes {Agnatha, Placodermi,
Chondrichthyes, Osteichthyes}). We are taught some special cases in school
(e.g. "penguins are birds, but bats are not", "whales aren't fish, they're
mammals"), but I suspect we would fail on other unusual cases that are not
taught in school (e.g. pterodactyls and ichthyosaurs).

Hilary Putnam (e.g. in "The Meaning of 'Meaning'", chap 12 in MIND,
LANGUAGE, AND REALITY) has suggested that definitions can make reference
to expert knowledge (e.g. "I don't know whether an ichthyosaur is a fish
or a reptile, but I know who to ask to find out"). This sounds like as
good a suggestion as any for how the average language user defines
biological categories.

Ehud Reiter
reiter@harvard (ARPA,BITNET,UUCP)
reiter@harvard.harvard.EDU (new ARPA)
harnad@elbereth.rutgers.edu (Stevan Harnad) (01/22/89)
reiter@endor.harvard.edu (Ehud Reiter) of Aiken Computation Lab Harvard,
Cambridge, MA wrote:

" [W]ho makes up the categories? Professional biologists [categories] are
" pretty good at predicting [biological] details, but much less useful at
" predicting more mundane attributes like edibility... [H]ow useful
" modern English biological categories are to the average language user
" (as opposed to the professional biologist) may be questionable.

A persistent misunderstanding (or perhaps a divergence of interest) seems
to be running through some aspects of this discussion. In my view,
cognitive theory is not -- and should not ITSELF aspire to be -- amateur
taxonomy or amateur ontology. Cognitive theorists should be trying to
model how categories are represented in the head by testing models of how
devices manage to categorize as people do. The only face-valid constraint
on this enterprise is the data on human (and animal) categorization
performance capacity: What people can actually sort and label, and what
labels and sortings they produce.

Ordinary language users are people; biologists are people; ontologists are
people; sometimes they happen to be the same people, sometimes not.
Sometimes people's categorization performance is reliable and all-or-none,
sometimes not. Sometimes the reliability is or can be raised to virtually
100% correct all-or-none performance (this is the core of our
categorization capacity), sometimes not. Sometimes (and this is important)
there is (temporarily or permanently) NO BASIS on which either people OR
cognitive theorists can assess whether or not a categorization is correct,
because no detectable consequences follow from MIScategorization. This may
happen (and often does) in certain anomalous or fuzzy regions of the
sample space; but if it happens for all or most of a "category," then it
is simply not a category (or not yet a category).

So it doesn't really matter who makes up the categories.
It just matters that human performance indicates that they are there, and
can be sorted and labeled on the basis of SOMETHING. If the sorting is
all-or-none and reliable (as it is for a vast core of ordinary cognition)
then, I claim, it must have a classical (invariant featural) basis in the
input instances themselves, or, recursively, in whatever the input
instances are GROUNDED in.

And it also matters that the categories (or, more appropriately,
MIScategorization) must have consequences. This is what guides and
constrains both the categorizer and the categorization theorist. The
categories of ordinary folk are typically calibrated by one species of
consequences (usually related to sustenance and certain [partially
self-imposed] social constraints), whereas the categories of scientists
are calibrated by "empiricism" -- which is to say: the consequences of
experimental tests and the internal coherence and implications of
scientists' explanatory theories.

Sometimes folk and scientific categories square with one another,
sometimes they do not. It is not the cognitive theorist's burden to equate
them, just to model them as both being empirical instances of human
categorization performance capacity. Nor does the "English Language"
integrate them; lay and scientific categories usually simply get different
dictionary entries. In a sense, though, the scientist is closer to having
an integrated category, since he presumably has internal representations
of both, with the lay category encoded as a special case or weaker
approximation of the scientific one. (The factors of approximation,
cumulativity and convergence in categorization are discussed in my book.)
And as POTENTIAL categories that could be formed by all human beings
within one head, it is of course the burden of the cognitive theorist to
model the cumulative representation too.
The intuitions and introspections of ordinary folk about HOW they
accomplish their categorizations are likely to be of limited usefulness to
the cognitive theorist. The introspections of scientists may be somewhat
more useful, because they tend to be more explicit about the features they
are using, but even here they have no FACE-validity: It's what Simon DOES
that matters, not what Simon SAYS he does. But, in the end, no "expert"
will be able to do the cognitive theorist's work for him, which is to
model the internal representations that will successfully generate human
performance capacity.

" [E]volutionary taxonomists define categories phylogenetically,
" not in terms of observable physiological features...
" Identification procedures that are based on observable features can usually
" be constructed, although these may be based on "family resemblance" ideas.

Ernst Mayr wrote:

" "A taxon is in fact a group of [evolutionary] relatives, and whether
" "or not they have the same "characters in common" is irrelevant. Many
" "taxa are based on a combination of characters, and frequently not a
" "single one of these characters is present in all members of the
" "taxon... Each species possesses a large (but unspecified) number of
" "the total number of properties of the taxon"

The key here is that "identification procedures based on observable
features can usually be constructed." That seems to give away the store.
No symbol grounding theory (including my own) -- at least no
non-positivistic one -- would require either laymen or scientists to speak
exclusively in an observation language. But their terms must somehow be
GROUNDED in observations, otherwise how is one to say whether or not the
categorization is "correct"? (In fact, how is one otherwise even to know
what the words mean? Unless grounded somehow in something other than
words, they are just meaningless strings of symbols. That's the symbol
grounding problem.
And to say that the "solution" is simply to connect the symbols to objects
"in the right way" is simply to beg the question. For the categorization
problem IS the problem of how symbols come to be connected to objects!)

"Family resemblances" is simply a red herring. Most of this pseudoproblem
is handled by noting that disjunctive features are perfectly valid
features (which is what launched this whole discussion). So is a complex
"polythetic" rule that says "It's an X if it has at least K out of M
properties." Moreover, "common descent" (though not always available for
observation, obviously) seems a perfectly classical "feature" even on the
arbitrary view that only shared monadic properties qualify as features.

So taxa too, to the extent that they are reliable, decidable, all-or-none
categories at all, must be decided by their consequences: The consequences
are not based on whether or not the biologist eats, but on whether or not
the taxonomic system is internally coherent and has testable consequences.
Internal consistency alone, by the way, is certainly not good enough, as
the long history of arbitrary typologies mankind has come up with
testifies (e.g., astrology, yin/yang, and the many self-fulfilling, ad
hoc, AD LIB classifications that psychologists have proposed to us across
time in place of a substantive predictive theory); see the prior
discussion on imposed vs. ad lib categorization.

" I question whether an average language user is in fact capable of
" always reliably identifying a "bird", a "mammal", or a "fish"...
" I suspect we would fail on unusual cases that are not taught in school
" (e.g. pterodactyls and ichthyosaurs).

It must be repeated that where there is no reliable categorization
performance -- or worse, no objective BASIS for reliable categorization
performance -- there simply IS NO CATEGORY (or not yet a category). For
the cognitive theorist, a category consists of the cases you CAN sort and
label, not those you can't.
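As an aside, the "polythetic" rule is a perfectly mechanical decision
procedure, which is part of why it qualifies as a classical feature rule.
A minimal sketch in Python (the taxon properties, the threshold, and the
example instances are my own invented illustrations, not Mayr's or
anyone's actual diagnostic characters):

```python
# Polythetic ("at least K out of M") category membership: no single
# property need be shared by all members; possessing any K of the
# taxon's M characteristic properties suffices for membership.

def polythetic_member(instance_properties, taxon_properties, k):
    """True if the instance shares at least k of the taxon's properties."""
    shared = instance_properties & taxon_properties
    return len(shared) >= k

# Hypothetical "bird-like" taxon with M = 4 characteristic properties:
BIRD = {"feathers", "beak", "lays_eggs", "flies"}

penguin = {"feathers", "beak", "lays_eggs", "swims"}  # shares 3 of 4
bat     = {"flies", "fur"}                            # shares 1 of 4

print(polythetic_member(penguin, BIRD, k=3))  # True
print(polythetic_member(bat, BIRD, k=3))      # False
```

Note that the rule is disjunctive in form (any K-sized subset will do),
which is exactly why "no single character present in all members" is
compatible with an all-or-none, decidable category.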
To ask for more, as I said, is for cognitive theory to over-reach into the
domain of empirical taxonomy or ontology.

" Hilary Putnam... suggested that definitions can make reference to
" expert knowledge (e.g. "I don't know whether an ichthyosaur is a fish
" or a reptile, but I know who to ask to find out"). This sounds like as
" good a suggestion as any for how the average language user defines
" biological categories.

Putnam is not a cognitive theorist who is concerned with how to model the
internal mechanism that allows us to sort and label inputs. He is a
philosopher concerned with the philosopher's problem of how a name "fixes"
a referent, in the sense that "the elementary particle physicists will say
is basic in the year 2000" seems to "pick out" something "out there" that
I already have "in mind" right now when I refer to "it." And, in a sense,
the physicist's future say-so is a kind of "feature." But it's more like
the "Dumpty-says" feature discussed in another posting in this discussion.
And it's not much use without the expert oracle. To the cognitive theorist
this only indicates that some categories cannot be sorted without someone
else's help. That's not a very interesting representation.

On the other hand, this example does bring out some interesting aspects of
the grounding problem: The higher levels of discourse in a grounded symbol
system can be quite abstract and removed from observation, yet they may
still be coherent and even informative. As long as "fish," "reptile," and,
say, "vertebrate," are grounded, I can go on to talk and learn a lot about
"Ichthyosaurus" knowing only that it's a vertebrate that's either a fish
or a reptile, despite never having seen one, and despite being unable,
with my current resources, to pick one out if I ever did see one. This is
a powerful and remarkable feature of grounding.
But it is a flagrant flouting of what I've dubbed the "entry-point
problem" for category modeling merely to step into the category network at
an arbitrary point like this, simply supposing oneself to be the HEIR to
all the prior requisite categories (such as "fish" and "reptile") without
having worked for them or at least specified how THEY got there, and then
trying to say something general about category representations, such as
that "they need not be based on classical features"!

So I may defer to expert knowledge in order to talk at all about some of
my vaguer categories, but that's hardly the paradigm for my categorization
performance and its substrates. According to my grounding theory, I must
have done a lot of hard work by direct acquaintance with sensory
categories before I built up the grounded system that now allows me to
rely on experts' say-so. The theorist has to do a lot of hard work too,
before he can help himself to this derivative high-level capability.
--
Stevan Harnad
INTERNET: harnad@confidence.princeton.edu harnad@princeton.edu
srh@flash.bellcore.com harnad@elbereth.rutgers.edu harnad@princeton.uucp
BITNET: harnad@pucc.bitnet
CSNET: harnad%princeton.edu@relay.cs.net
(609)-921-7771
lee@uhccux.uhcc.hawaii.edu (Greg Lee) (01/22/89)
From article <Jan.21.13.42.31.1989.10447@elbereth.rutgers.edu>, by
harnad@elbereth.rutgers.edu (Stevan Harnad):

" ...
" No symbol grounding theory (including my own) -- at least no
" non-positivistic one -- would require either laymen or scientists to
" speak exclusively in an observation language.

You underestimate Positivism, of which in fact your theory appears to be a
species. Here is a passage from Moritz Schlick's "Positivism and Realism"
(1932/3, in Ayer, _Logical Positivism_):

"But when do I understand a proposition? When I understand the meanings of
the words which occur in it? These can be explained by definitions. But in
the definitions new words appear whose meanings cannot again be described
in propositions, they must be indicated directly: the meaning of a word
must in the end be _shown_, it must be _given_. This is done by an act of
indication, of pointing; and what is pointed at must be given, otherwise I
cannot be referred to it."

" But their terms must
" somehow be GROUNDED in observations, otherwise how is one to say
" whether or not the categorization is "correct"? (In fact, how is one
" otherwise even to know what the words mean? Unless grounded somehow
" in something other than words, they are just meaningless strings
" of symbols. ...

Except that even Schlick allows grounding in *possible* experiences, I see
no notable differences from your views. For a recent argument against
applying this approach to natural language, see Chomsky's review of
Skinner's _Verbal Behavior_.

Greg, lee@uhccux.uhcc.hawaii.edu
harnad@elbereth.rutgers.edu (Stevan Harnad) (01/25/89)
lee@uhccux.uhcc.hawaii.edu (Greg Lee) of University of Hawaii wrote:

" You underestimate Positivism, of which in fact your theory appears to be
" a species. Here is a passage from Moritz Schlick's "Positivism and
" Realism" (1932/3, in Ayer, _Logical Positivism_): "But when do I
" understand a proposition? When I understand the meanings of the words
" which occur in it? These can be explained by definitions. But in the
" definitions new words appear whose meanings cannot again be described in
" propositions, they must be indicated directly: the meaning of a word
" must in the end be _shown_, it must be _given_. This is done by an act
" of indication, of pointing; and what is pointed at must be given,
" otherwise I cannot be referred to it."
"
" Except that even Schlick allows grounding in *possible* experiences,
" I see no notable differences from your views. For a recent argument
" against applying this approach to natural language, see Chomsky's
" review of Skinner's _Verbal Behavior_.

Although I do confess to a lingering sympathy for certain perfectly valid
features of positivism (the "P" word), as well as for verificationism and
the 18th century empiricism from which they grew (I think positivism was
rejected by psychologists just as hastily, superficially, unselectively
and uncritically as it was first accepted by them), I am nevertheless no
positivist, as is quite evident from the representational model I am
proposing. Nor am I a behaviorist (as the above quote also seems to
imply).

The positivists were concerned with MEANING (especially the meaning of
scientific statements): What statements are and are not meaningful, and in
what does their meaning consist? I, on the other hand, am concerned with
categorization: How do we SORT and LABEL categories and USE the category
labels in statements about categories? The positivists claimed that only
"observation statements" -- or statements from which observation
statements could be readily derived -- were "meaningful."
I certainly don't say anything of the sort. In fact, I happen to find most
of the very statements that the positivists wished to reject as
meaningless and metaphysical to be perfectly meaningful, with their terms
perfectly well grounded (in MY sense, i.e., consisting of the labels of
categories that were grounded in the labels of categories that were
grounded in... the labels of concrete sensory categories that we can sort
and label directly).

Moreover, most of what the positivists themselves said becomes trivially
obvious and no longer "positivistic" in any substantive sense if restated
in terms of categorization rather than meaning, as the following
transcription shows:

"But when can I sort a category described by a label-string? When I can
sort categories for the labels which occur in it? These can be described
by more label-strings. But in the label-strings new labels appear whose
categories cannot again be described by still more label-strings [on pain
of infinite regress]: the members of a category must in the end be
actually sorted."

This is simply a statement of a version of what I've called "the symbol
grounding problem," plus a fairly obvious constraint on its solution in
the case of categorization. That the positivists too noticed this problem
does not mean that all solutions to it are therefore positivistic. -- Not
that the positivists even offered a solution, mind you, for "pointing" is
certainly no solution to a cognitive theorist, who must provide the
underlying causal mechanism that governs the success of the pointing,
i.e., a representational theory! Successful pointing to the right category
members is the behavioral capacity the cognitivist must explain! The
philosopher simply takes it for granted. Behaviorism likewise has as
little to offer a cognitive theorist as does positivism.
It's not helpful to know that a subject's successful pointing performance
was "shaped" by his reinforcement history: The cognitive theorist must
come up with the internal structures and processes that were responsible
for that success, given those inputs and that feedback from the
consequences of MIScategorization. (To this extent one can of course agree
completely with Chomsky's critique of Skinner; but the rest of Chomsky's
argument against learning and empiricism -- the celebrated "poverty of the
stimulus" argument -- has so far only been applied to and provisionally
supported in the special case of certain syntactic categories, certainly
not categorization in general!)

[Among the objections to positivism was one that was directed at
empiricism as a whole and has lately been championed by Chomskian
nativists like Jerry Fodor: the problem of "abstraction" or "vanishing
intersections" -- the (alleged) fact that one cannot ground abstract terms
such as "goodness" or "truth" in the features shared by concrete sensory
instances (such as "this good boy" or "that true statement") because NO
FEATURE is shared by all the sensory instances: their intersection is
simply empty. I, obviously, am not persuaded by this claim (or the radical
nativism about categories that accepting it would entail -- what I've
dubbed elsewhere the "Big-Bang Theory of the Origin of Knowledge"). Let me
note in passing only that this claim has often been made, but never
tested, because testing whether sensory intersections actually vanish is
not in the philosopher's line of work. Other reasons for rejecting
positivism came from some of the Wittgensteinian considerations, likewise
untested, that have surfaced a few times in this discussion (e.g., that
the category "game" has no invariant features).]

Finally, about the "possible" experience that my theory supposedly does
not allow: On the contrary, MOST of the grounding in my theory is based on
possible rather than actual direct sorting.
In fact, even categories that are unverifiable in principle may be
perfectly well-represented categories in my theory. As an example, I will
use the category of a "peekaboo unicorn," which should be meaningless to a
verificationist. But first, let me just quickly sketch the
representational theory (for details, see the last chapter of "Categorical
Perception: The Groundwork of Cognition"):

There are three kinds of internal representations. "Iconic
Representations" (IRs) are internal analogs of the proximal projections of
objects on the receptor surfaces. IRs subserve relative discrimination,
similarity judgment, and tasks based on continuous analog transformations
of the proximal stimulus; but because they blend continuously into one
another, IRs cannot subserve categorization.

Categorical Representations (CRs) are IRs that have been selectively
filtered and reduced to only the invariant features of the proximal
projection that reliably distinguish members of a category from whatever
the confusable alternatives are in a specific "context" or sample of
alternatives. CRs subserve categorical perception and object
identification. CRs are also associated with a label, the category name;
these names are directly "grounded" in their IRs and CRs and the objects
these pick out.

The labels are also the primitives of a third kind of representation,
Symbolic Representations (SRs). SRs can be combined and recombined into
strings of symbols that form composite SRs and likewise pick out
categories, as grounded in IRs and CRs. CRs pick out categories by direct
perceptual experience; SRs pick them out by symbolic description, with its
primitive terms ultimately grounded in perceptual representations. SRs
subserve natural language.

Here is an example of how my "grounding" scheme would work.
The example is recursive on whatever the primitive categories actually are
(they are certainly not the ones I actually give here): Suppose the
category "horse" ("H") is grounded in the categorizer's having learned to
sort and label horses by direct perceptual experience, with feedback from
mislabeling. IRs have been formed, as well as CRs that will correctly sort
horses and non-horses (within a sampled context of confusable
alternatives). Suppose the category "having stripes" ("S") is similarly
grounded, with IRs and CRs. Suppose also that the category "having one
horn" ("O") is similarly grounded, with IRs and CRs. With the IRs and CRs
possessed so far, a categorizer could sort and label H's, S's and O's from
direct experience.

Now introduce the following Symbolic Representation: "Zebra" ("Z") = H &
S. It is evident that, armed only with the IRs and CRs in which "H" and
"S" are grounded, not only WOULD [note the "possible experience"] a
categorizer now be able to sort and label zebras correctly from the very
first time he encountered one (if he were ever to encounter one), but he
also now has a new grounded label "Z" that can henceforth enter into
further grounded SRs, in virtue of the IRs and CRs in which it is
grounded.

Let's take yet another nonpositivistic step forward: "Unicorn" ("U") = H &
O. This category, being fictional, will NEVER be encountered, yet it is
perfectly well-grounded. Let's go still further: By similar means I could
define a "Peekaboo Unicorn" which is not only a horse with one horn, but
has the property that it "disappears" whenever any "sense-organ" or
"detecting device" is trained on it (all these further categories likewise
being grounded as above). Hence a "Peekaboo Unicorn" is a category that is
unobservable and unverifiable IN PRINCIPLE, yet perfectly well-grounded in
IRs, CRs, sensory experience, and actual objects. Such is the power of a
viable grounding scheme.

So do you still think I'm a positivist?
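The compositional step in this example -- new labels defined as symbolic
combinations of already-grounded labels -- can be sketched mechanically.
In the toy Python below, one-line predicates stand in for the learned
IRs/CRs (in reality these would be perceptual mechanisms, not dictionary
lookups), and all the feature names are my own invented stand-ins:

```python
# Toy sketch of the grounding scheme: elementary labels ("H", "S", "O")
# are grounded in detectors (standing in for learned Iconic/Categorical
# Representations); composite labels ("Z", "U") are Symbolic
# Representations defined as conjunctions of grounded labels, and so
# inherit their grounding.

grounded = {
    "H": lambda x: x.get("horse_shaped", False),  # "horse"
    "S": lambda x: x.get("striped", False),       # "having stripes"
    "O": lambda x: x.get("one_horn", False),      # "having one horn"
}

def define(name, *parts):
    """Ground a new label as the conjunction of already-grounded labels."""
    grounded[name] = lambda x: all(grounded[p](x) for p in parts)

define("Z", "H", "S")  # "Zebra"   = H & S
define("U", "H", "O")  # "Unicorn" = H & O

# A categorizer who has never encountered a zebra can still sort one
# correctly on first encounter, because "Z" inherits the grounding of
# "H" and "S"; "U" is equally well-grounded though never instantiated.
zebra_input   = {"horse_shaped": True, "striped": True}
unicorn_input = {"horse_shaped": True, "one_horn": True}

print(grounded["Z"](zebra_input))    # True
print(grounded["U"](unicorn_input))  # True
print(grounded["Z"](unicorn_input))  # False (no stripes)
```

The point the sketch illustrates is that the composite labels never need
their own direct sensory training: their extension is fixed entirely by
the detectors in which their constituent labels are grounded.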
-- Stevan Harnad INTERNET: harnad@confidence.princeton.edu harnad@princeton.edu srh@flash.bellcore.com harnad@elbereth.rutgers.edu harnad@princeton.uucp BITNET: harnad@pucc.bitnet CSNET: harnad%princeton.edu@relay.cs.net (609)-921-7771
lee@uhccux.uhcc.hawaii.edu (Greg Lee) (01/26/89)
From article <Jan.24.15.34.00.1989.27159@elbereth.rutgers.edu>, by
harnad@elbereth.rutgers.edu (Stevan Harnad):

>...
>The positivists were concerned with MEANING (especially the meaning of
>scientific statements): What statements are and are not meaningful, and
>in what does their meaning consist? I, on the other hand, am concerned
>with categorization: ...

I see, I think. You had said:

" But their terms must
" somehow be GROUNDED in observations, otherwise how is one to say
" whether or not the categorization is "correct"? (In fact, how is one
" otherwise even to know what the words mean? Unless grounded somehow
" in something other than words, they are just meaningless strings
" of symbols. ...

which seemed to me to imply that one *does* know what words mean by virtue
of their grounding in observations. Now, I know that you don't *intend* to
be proposing a theory of meaning. And if there is really any difference
between the problem of categorization and the problem of reference, then
perhaps you're not.

In regard to my implication concerning behaviorism, I'll remark further
that I had in mind your remarks in other postings about a categorization
being significant only in regard to its consequences. This seemed to me to
be akin to the central role ascribed to reinforcement in that B-theory
(with which I have certain sympathies, by the way).

>...
>So do you still think I'm a positivist?

Ah, let me put it this way. I think that the strategy of converting some
sophisticated varieties of philosophical empiricism into a psychological
theory is an interesting and plausible move to make, whether or not that
is exactly what you're doing. Just as converting ordinary language
philosophy into linguistic theory has been a rewarding endeavor for
linguists. Thanks for your detailed and very informative discussion.

Greg, lee@uhccux.uhcc.hawaii.edu