harnad@mind.UUCP (Stevan Harnad) (05/09/87)
To define a SUBsymbolic "level" rather than merely a NONsymbolic process or phenomenon one needs a formal justification for the implied up/down-ness of the relationship. In the paradigm case -- the hardware/software distinction and the hierarchy of compiled programming languages -- the requisite formal basis for the hierarchy is quite explicit. It is the relation of compilation and implementation. Higher-level languages are formally compiled into lower level ones and the lowest is implemented as instructions that are executed by a machine. Is there anything in the relation of connectionist processes to symbolic ones that justifies calling the former "sub"-symbolic in anything other than a hopeful metaphorical sense at this time?

The fact that IF neural processes are really connectionistic (an empirical hypothesis) THEN connectionist models are implementable in the brain defines a super/sub relationship between connectionist models and neural processes (conditional, of course, on the validity -- far from established or even suggested by existing evidence -- of the empirical hypothesis), but this would still have no bearing on whether connectionism can be considered to stand in a sub/super relationship to a symbolic "level." There is of course also the fact that any discrete physical process is formally equivalent in its input/output relations to some turing machine state, i.e., some symbolic state. But that would make every such physical process "subsymbolic," so surely turing equivalence cannot be the requisite justification for the putative subsymbolic status of connectionism in particular.

A fourth sense of down-up (besides hardware/software, neural implementability and turing-equivalence) is psychophysical down-upness. According to my own bottom-up model, presented in the book I just edited (Categorical Perception, Cambridge University Press 1987), symbols can be "grounded" in nonsymbolic representations in the following specific way: Sensory input generates (1) iconic representations -- continuous, isomorphic analogs of the sensory surfaces. Iconic representations subserve relative discrimination performance (telling pairs of things apart and judging how similar they are). Next, constraints on categorization (e.g., either natural discontinuities in the input, innate discontinuities in the internal representation, or, most important, discontinuities *learned* on the basis of input sampling, sorting and labeling with feedback) generate (2) categorical representations -- constructive A/D filters which preserve the invariant sensory features that are sufficient to subserve reliable categorization performance. [It is in the process of *finding* the invariant features in a given context of confusable alternatives that I believe connectionist processes may come in.] Categorical representations subserve identification performance (sorting things and naming them). Finally, the *labels* of these labeled categories -- now *grounded* bottom/up in nonsymbolic representations (iconic and categorical) derived from sensory experience -- can then be combined and recombined in (3) symbolic representations of the kind used (exclusively, and without grounding) in contemporary symbolic AI approaches. Symbolic representations subserve natural language and all knowledge and learning by *description* as opposed to direct experiential acquaintance.
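[The three-level proposal above lends itself to a concrete, if greatly simplified, sketch. The Python fragment below only illustrates the claimed ordering -- iconic, then categorical, then symbolic -- and is not an implementation of the model: the one-dimensional "sensory projection," the single mean-intensity feature, the fixed boundary and the labels "dax" and "wug" are all assumptions made for the example.]

import numpy as np

def iconic_representation(sensory_projection):
    # (1) Iconic representation: a continuous, isomorphic analog of the
    # sensory surface -- here simply an invertible copy of the projection.
    return np.asarray(sensory_projection, dtype=float).copy()

def categorical_representation(icon, boundary=0.5):
    # (2) Categorical representation: a selective invariance filter that
    # discards most of the iconic variation, keeping only the feature needed
    # to sort the input within a context of confusable alternatives.
    # The "invariant feature" here is just the mean intensity.
    invariant_feature = icon.mean()
    return "dax" if invariant_feature > boundary else "wug"   # arbitrary labels

def symbolic_description(label_a, label_b):
    # (3) Symbolic representation: ruleful combinations of labels that are
    # already grounded in the nonsymbolic representations above.
    return f"({label_a} IS-NOT {label_b})"

projection = np.random.rand(16)             # stand-in for a sensory projection
icon = iconic_representation(projection)    # subserves discrimination
label = categorical_representation(icon)    # subserves identification (naming)
other = "wug" if label == "dax" else "dax"
print(symbolic_description(label, other))   # subserves description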
In response to my challenge to justify the "sub" in "subsymbolic" when one wishes to characterize connectionism as subsymbolic rather than just nonsymbolic, rik%roland@sdcsvax.ucsd.edu (Rik Belew) replies: > I do intend something more than non-symbolic when I use the term > sub-symbolic. I do not rely upon "hopeful neural analogies" or any > other form of hardware/software distinction. I use "subsymbolic" > to refer to a level of representation below the symbolic > representations typically used in AI... I also intend to connote > a supporting relationship between the levels, with subsymbolic > representations being used to construct symbolic ones (as in subatomic). The problem is that the "below" and the "supporting" are not cashed in, and hence just seem to be synonyms for "sub," which remains to be justified. An explicit bottom-up hypothesis is needed to characterize just how the symbolic representations are constructed out of the "subsymbolic" ones. (The "subatomic" analogy won't do, otherwise atoms risk becoming subsymbolic too...) Dr. Belew expresses some sympathy for my own grounding hypothesis, but it is not clear that he is relying on it for the justification of his own "sub." Moreover, this would make connectionism's subsymbolic status conditional on the validity of a particular grounding hypothesis (i.e., that three representational levels exist as I described them, in the specific relation I described, and that connectionistic processes are the means of extracting the invariant features underlying the categorical [subsymbolic] representation). I would of course be delighted if my hypothesis turned out to be right, but at this point it still seems a rather risky "ground" for justifying the "sub" status of connectionism. > my interest in symbols began with the question of how a system might > learn truly new symbols. I see nothing in the traditional AI > definitions of symbol that helps me with that problem. The traditional AI definition of symbol is simply arbitrary formal tokens in a formal symbol system, governed by formal syntactic rules for symbol manipulation. This general notion is not unique to AI but comes from the formal theory of computation. There is certainly a sense of "new" that this captures, namely, novel recombinations of prior symbols, according to the syntactic rules for combination and recombination. And that's certainly too vague and general for, say, human senses of symbol and new-symbol. In my model this combinatorial property does make the production of new symbols possible, in a sense. But combinatorics is limited by several factors. One factor is the grounding problem, already discussed (symbols alone just generate an ungrounded, formal syntactic circle that there is no way of breaking out of, just as in trying to learn Chinese from a Chinese-Chinese dictionary alone). Other limiting factors on combinatorics are combinatory explosion, the frame problem, the credit assignment problem and all the other variants that I have conjectured to be just different aspects of the problem of the *underdetermination* of theory by data. Pure symbol combinatorics certainly cannot contend with these. The final "newness" problem is of course that of creativity -- the stuff that, by definition, is not derivable by some prior rule from your existing symbolic repertoire. A rule for handling that would be self-contradictory; the real source of such newness is probably partly statistical, and again connectionism may be one of the candidate components. 
> It seems very conceivable to me that the critical property we will choose to ascribe to computational objects in our systems symbols is that we (i.e., people) can understand their semantic content.

You are right, and what I had inadvertently left out of my prior (standard) syntactic definition of symbols and symbol manipulation was of course that the symbols and manipulations must be semantically interpretable. Unfortunately, so far that further fact has only led to Searlian mysteries about "intrinsic" vs. "derived intentionality" and scepticism about the possibility of capturing mental processes with computational ones. My grounding proposal is meant to answer these as well.

> the fact that symbols must be grounded in the *experience* of the cognitive system suggests why symbols in artificial systems (like computers) will be fundamentally different from those arising in natural systems (like people)... if your grounding hypothesis is correct (as I believe it is) and the symbols thus generated are based in a fundamental way on the machine's experience, I see no reason to believe that the resulting symbols will be comprehensible to people. [e.g., interpretations of hidden units... as our systems get more complex]

This is why I've laid such emphasis on the "Total Turing Test." Because toy models and modules, based on restricted data and performance capacities, may simply not be representative of and comparable to organisms' complexly interrelated robotic and symbolic functional capacities. The experiential base -- and, more important, the performance capacity -- must be comparable in a viable model of cognition. On the other hand, the "experience" I'm talking about is merely the direct (nonsymbolic) sensory input history, *not* "conscious experience." I'm a methodological epiphenomenalist on that. And I don't understand the part about the comprehensibility of machine symbols to people. This may be the ambiguity of the symbolic status of putative "subsymbolic" representations again.

> The experience lying behind a word like "apple" is so different for any human from that of any machine that I find it very unlikely that the "apple" symbol used by these two systems will be comparable.

I agree. But this is why I proposed that a candidate device must pass the Total Turing Test in order to capture mental function. Arbitrary pieces of performance could be accomplished in radically different ways and would hence be noncomparable with our own.

> Based on the grounding hypothesis, if computers are ever to understand NL as fully as humans, they must have an equally vast corpus of experience from which to draw. We propose that the huge volumes of NL text managed by IR systems provide exactly the corpus of "experience" needed for such understanding. Each word in every document in an IR system constitutes a separate experiential "data point" about what that word means. (We also recognize, however, that the obvious differences between the text-base "experience" and the human experience also implies fundamental limits on NL understanding derived from this source.)...
> In this application the computer's experience of the world is second-hand, via documents written by people about the world and subsequently through users' queries of the system

We cannot be talking about the same grounding hypothesis, because mine is based on *direct sensory experience* ("learning by acquaintance") as opposed to the symbol combinations ("learning by description"), with which it is explicitly contrasted, and which my hypothesis claims must be *grounded* in the former. The difference between text-based and sensory experience is crucial indeed, but for both humans and machines. Sensory input is nonsymbolic and first-hand; textual information is symbolic and second-hand. First things first.

> I'm a bit worried that there is a basic contradiction in grounded symbols. You are suggesting (and I've been agreeing) that the only useful notion of symbols requires that they have "inherent intentionality": i.e., that there is a relatively direct connection between them and the world they denote. Yet almost every definition of symbols requires that the correspondence between the symbol and its referent be *arbitrary*. It seems, therefore, that your "symbols" correspond more closely to *icons* (as defined by Peirce), which do have such direct correspondences, than to symbols. Would you agree?

I'm afraid I must disagree. As I indicated earlier, icons do indeed play a role in my proposal, but they are not the symbols. They merely provide part of the (nonsymbolic) *groundwork* for the symbols. The symbol tokens are indeed arbitrary. Their relation to the world is grounded in and mediated by the (nonsymbolic) iconic and categorical representations.

> In terms of computerized knowledge representations, I think we have need of both icons and symbols...

And reliable categorical invariance filters. And a principled bottom-up grounding relation among them.

> I see connectionist learning systems building representational objects that seem most like icons. I see traditional AI knowledge representation languages typically using symbols and indices. One of the questions that most interests me at the moment is the appropriate "ontogenetic ordering" for these three classes of representation. I think the answer would have clear consequences for this discussion of the relationship between connectionist and symbolic representations in AI.

I see analog transformations of the sensory surfaces as the best candidates for icons, and connectionist learning systems as possible candidates for the process that finds and extracts the invariant features underlying categorical representations. I agree about traditional AI and symbols, and my grounding hypothesis is intended as an answer about the appropriate "ontogenetic ordering."

> Finally, this view also helps to characterize what I find missing in most *symbolic* approaches to machine learning: the world "experienced" by these systems is unrealistically barren, composed of relatively small numbers of relatively simple percepts (describing blocks-world arches, or poker hands, for example). The appealing aspect of connectionist learning systems (and other subsymbolic learning approaches...) is that they thrive in exactly those situations where the system's base of "experience" is richer by several orders of magnitude.
> This accounts for the basically *statistical* nature of these algorithms (to which you've referred), since they are attempting to build representations that account for statistically significant regularities in their massive base of experience.

Toy models and microworlds are indeed barren, unrealistic and probably unrepresentative. We should work toward models that can pass the Total Turing Test. Invariance-detection under conditions of high interconfusability is indeed the problem of a device or organism that learns its categories from experience. If connectionism turns out to be able to do this on a life-size scale, it will certainly be a powerful candidate component in the processes underlying our representational architecture, especially the categorical level. What that architecture is, and whether this is indeed the precise justification for connectionism's "sub" status, remains to be seen.

-- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
harnad@mind.UUCP (05/20/87)
John X. Laporta <rutgers!mit-eddie!apollo!laporta> Apollo Computer, Chelmsford, MA wrote: > You say that symbols are grounded in nonsymbolic sensory input. > You propose a model of segmentation... by which discontinuities > in the input map to segment boundaries... I wonder what you do with > the problem of segmentation of the visual spectrum. > ...spectral segmentations differ widely across cultures. > The problem is that these breaks and their number vary widely... > what system intervenes to choose the set a particular culture favors > and asserts as obvious? What is the filter in the A/D converter? More recent evidence seems to suggest that color segmentation does not vary nearly as widely as had been believed (see M. Bornstein's work). There may be some variability in the tuning of color boundaries, and some sub-boundaries may be added sometimes, but the focal colors are governed by our innate color receptor apparatus and they seem to be universal. The partial flexibility of the boundaries -- short and long term -- must be governed by learning, and the learning must consist of readjustment of boundary locations as a function of color naming experience and feedback, or perhaps even the formation of new sub-boundaries where there are none. The innate color-detector mechanism would be the A/D filter in the default case, and learning may set some of the boundary fine-tuning parameters. The really interesting case, though, and one that has not been tested directly yet, is the one where boundary formation occurs de novo purely as a result of learning. This does not happen with evolutionarily "prepared" categories such as colors (although it may have happened in phylogeny), but it may happen with arbitrary learned ones (e.g., perhaps musical semitones). Here the A/D filter would be acquired from categorization training alone: labeling with feedback. In simple one-dimensional continua, what would be acquired would simply be some sort of a threshold detector, but with more complex multidimensional stimuli the feature-filter would have to be constructed by a more active inductive process. This may be where connectionist algorithms come in. Another important factor in the selectivity of the A/D feature-filter is the "context" of alternatives: the sample of confusable members and nonmembers of the categories in question on the basis of which the features must be extracted; these also focus the uncertainty that the filter must resolve if it is to generate reliable categorization performance. All this is described in the book under discussion (Categorical Perception: The Groundwork of Cognition, Cambridge University Press 1987, S. Harnad, Ed.). -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
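[The "simple one-dimensional continua" case mentioned above -- a threshold detector acquired purely from labeling with feedback -- can be sketched in a few lines of Python. The error-driven update rule, the learning rate and the particular "true" boundary below are illustrative assumptions, not a claim about the mechanism proposed in the book.]

import random

def learn_boundary(stimuli, true_boundary=0.6, step=0.02, epochs=30):
    # Start with an arbitrary boundary and nudge it whenever the learner's
    # label disagrees with the feedback ("labeling with feedback").
    boundary = 0.5
    for _ in range(epochs):
        for x in stimuli:
            feedback = x > true_boundary        # the teacher's category
            guess = x > boundary                # the learner's category
            if guess != feedback:
                # Boundary was too high if the feedback says "above";
                # too low if the feedback says "below".
                boundary += -step if feedback else step
    return boundary

confusable_alternatives = [random.random() for _ in range(200)]
print(round(learn_boundary(confusable_alternatives), 2))   # settles near 0.6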
harnad@mind.UUCP (Stevan Harnad) (05/22/87)
This is part 1 of a response to a longish exchange from Rik Belew <rik%roland@SDCSVAX.UCSD.EDU> who asks:

> ... [1] what evidence causes you to postulate iconic and categorical representations as being distinct?... Apart from a relatively few cognitive phenomena (short-term sensory storage, perhaps mental imagery), I am aware of little evidence of "continuous, isomorphic analogues of the sensory surfaces" [your "iconic" representations]. [2] I see great difficulty in distinguishing between such representations and "constructive A/D filters [`categorical' representations] which preserve the invariant sensory features" based simply on performance at any particular task. More generally, could you [3] motivate your ``subserve'' basis for classifying cognitive representations.

[1] First of all, short-term sensory storage does not seem to constitute *little* evidence but considerable evidence. The tasks we can perform after a stimulus is no longer present (such as comparing and matching) force us to infer that there exist iconic traces. The alternative hypothesis that the information is already a symbolic description at this stage is simply not parsimonious and does not account for all the data (e.g., Shepard's mental rotation effects). These short-term effects do suggest that iconic representations may only be temporary or transient, and that is entirely compatible with my model. Something permanent is also going on, however, as the sensory exposure studies suggest: Even if iconic traces are always stimulus-bound and transient, they seem to have a long-term substrate too, because their acuity and reliability increase with experience. I would agree that the subjective phenomenology of mental imagery is very weak evidence for long-term icons, but successful performance on some perceptual tasks drawing on long-term memory is at least as economically explained by the hypothesis that the icons are still accessible as by the alternative that only symbolic descriptions are being used. In my model, however, most long-term effects are mediated by the categorical representations rather than the iconic ones. Iconic representations are hypothesized largely to account for short-term perceptual performance (same/difference judgment, relative comparisons, similarity judgments, mental rotation, etc.). They are also, of course, more compatible with subjective phenomenology (memory images seem to be more like holistic sensory images than like selective feature filters or symbol strings).

[2] The difference between isomorphic iconic representations (IRs) and selective invariance filters (categorical representations, CRs) is quite specific, although I must reiterate that CRs are really a special form of "micro-icon." They are still sensory, but they are selective, discarding most of the sensory variation and preserving only the features that are invariant *within a specific context of confusable alternatives*. (The key to my approach is that identifying or categorizing something is never an *absolute* task but a relative, context-dependent one: "What's that?" "Compared to What?") The only "features" preserved in a CR are the ones that will serve as a reliable basis for sorting the instances one has sampled into their respective categories (as learned from feedback indicating correct or incorrect categorizing). The "context" (of confusable alternatives), however, is not a short-term phenomenon.
Invariant features are provisional, and always potentially revisable, but they are parts of a stable, long-term category-representational system, one that is always being extended and updated on the basis of new categorization tasks and samples. It constitutes an ever-tightening approximation. So the difference between IRs and CRs ("constructive A/D filters") is that IRs are context-independent, depending only on the comparison of raw sensory configurations and on any transformations that rely on isomorphism with the unfiltered sensory configuration, whereas CRs are context-dependent and depend on what confusable alternatives have been sampled and must then be reliably identified in isolation. The features on which this successful categorization is based cannot be the holistic configural ones, which blend continuously into one another; they are features specifically selected and abstracted to subserve reliable categorization (within the context of alternatives sampled to date). They may even be "constructive" features, in the sense that they are picked out by performing an active operation -- sensory, comparative or even logical -- on the sensory input. Apart from this invariant basis for categorization (let's call these selectively abstracted features "micro-iconic") all the rest of the iconic information is discarded from the category filter.

[3] Having said all this, it is easy to motivate my "subserve" as you request: IRs are the representations that subserve ( = are required in order to generate successful performance on) tasks that call for holistic sensory comparisons and isomorphic transformations of the unfiltered sensory trace (e.g., discrimination, matching, similarity judgment) and CRs are the representations required to generate successful performance on tasks that call for reliable identification of confusable alternatives presented in isolation. As a bonus, the latter provide the grounding for a third representational system, symbolic representations (SRs), whose elementary symbols are the labels of the bounded categories picked out by the CRs and "fleshed out" by the IRs. These elementary symbols can then be rulefully combined and recombined into symbolic descriptions which, in virtue of their reducibility to grounded nonsymbolic representations, can now refer to, describe, predict and explain objects and events in the world.

-- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
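[A hedged illustration of the "subserve" distinction drawn above: IRs support relative discrimination (how alike are two inputs?), while CRs support identification of an input presented alone. The gain/offset transform standing in for an analog transformation of the sensory surface, and the single mean-intensity feature, are assumptions made only for this sketch.]

import numpy as np

GAIN, OFFSET = 2.0, 1.0

def iconic(projection):
    # IR: an analog, one-to-one (hence invertible) transform of the
    # unfiltered sensory projection.
    return GAIN * np.asarray(projection, dtype=float) + OFFSET

def invert_iconic(icon):
    # The IR preserves all the input information: it can be inverted.
    return (icon - OFFSET) / GAIN

def categorical(icon, boundary=3.0):
    # CR: a many-to-few invariance filter; most iconic detail is discarded,
    # so this mapping cannot be inverted back to the projection.
    return "member" if icon.mean() > boundary else "nonmember"

a, b = np.random.rand(8), np.random.rand(8)
similarity = -float(np.linalg.norm(iconic(a) - iconic(b)))   # discrimination (IR)
assert np.allclose(invert_iconic(iconic(a)), a)              # invertibility of IR
print(similarity, categorical(iconic(a)))                    # identification (CR)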
harnad@mind.UUCP (Stevan Harnad) (05/22/87)
Rik Belew <rik%roland@SDCSVAX.UCSD.EDU> writes: > I use ``icon'' to mean much the same as your ``categorical > representations''... their direct, albeit statistical, > relationship with sensory features... distinguishes icons from > ``symbols'', which are representations without structural > correspondence with the environment. The criterion for being iconic is physical isomorphism ( = "structural correspondence"). This means that the relationship between an object and its icon must be a physically invertible (analog) transformation. In my model, iconic representations are isomorphic with the unfiltered sensory projection of the input they represent, whereas categorical representations are only isomorphic with selected features of the input. In that sense they are "micro-iconic." The important point is that they are selective and based on abstracting some features and discarding all the rest. The basis of selection is: "What features do I need in order to categorize this input correctly, relative to other confusable alternatives I have encountered and may encounter in the future?" To call the input an "X" on the basis of such a selective, context-governed feature filter, however, is hardly to say that one has an "icon" of an "X" in the same sense that iconic representations are icons of input sensory projections. The "structural correspondence" is only with the selected features, not with the "object" being named. On the other hand, the complete absence of any structural correspondence whatever is indeed what distinguishes both iconic and categorical representations from symbolic ones. The heart of my symbol grounding proposal is that in allowing you to speak of (identify, label, categorize) "X's" at all, categorical representations have provided you with a set of elementary labels, based on nonsymbolic representations, that can now ground an otherwise purely syntactic symbol system in the objects and events to which it refers. Note, though, that the grounding is a strong constraint, one that renders the symbolic system no longer the autonomous syntactic module of conventional AI. The system is hybrid through-and-through. The relations between the three kinds of representation are not modular but bottom-up, with the nonsymbolic representations supporting the symbolic representations' relation to objects. Most of the rules for symbol binding, etc. are now constrained in ways that depart from the freedom of ungrounded formal systems. > Your, more restricted, notion of ``symbol'' seems to differ in two > major respects: its emphasis on the systematicity of symbols; and its > use of LABELS (of categories) as the atomic elements. I accept > the systematicity requirement, but I believe your labeling notion > confounds several important factors... > First, I believe you are using labels to mean POINTERS: > computationally efficient references to more elaborate and complete > representations... valuable not only for pointing from symbols > to icons (the role you intend for labels) but also from one place in > the symbolic representation to another... > many connectionists have taken this pointer quality to be > what they mean by "symbol." I believe my grounding proposal is a lot more specific than merely a pointing proposal. Pointing is, after all, a symbol-to-symbol function. It may get you to an address, but it won't get you from a word to the nonsymbolic object to which it refers. 
The labeling performance that categorical representations subserve, on the other hand, is an operation on objects in the world. That is why I proposed grounding elementary symbols in it: Let the arbitrary labels of reliably sorted object categories be the elementary symbols of the symbolic system. Such a hybrid system would continue to have most of the benefits of higher-order systematicity (compositionality), but with nonsymbolic constraints "weighing down" its elementary terms. Consider ordinary syntactic constraints to be "top-down" constraints on a symbol-system. A grounded hybrid system would have "bottom-up" constraints on its symbol combinations too. As to the symbolic status of connectionism -- that still seems to be moot. > The other feature of your labeling notion that intrigues me is > the naming activity it implies. This is where I see the issues > of language as becoming critical. ...truly symbolic representations and > language are co-dependent. I believe we agree on this point... > true symbol manipulation arose only as a response to language > Current connectionist research is showing just how > powerful iconic (and perhaps categorical) representations can > be... I use the term language broadly, to > include the behavior of other animals for example. Labeling and categorizing is much more primitive than language, and that's all I require to ground a symbol system. All this calls for is reliable discrimination and identification of objects. Animals certainly do it. Machines should be able to do it (although until they approach the performance capacity of the "Total Turing Test" they may be doing it modularly in a nonrepresentative way). Language seems to be more than labeling and categorizing. It also requires *describing*, and that requires symbol-combining functions that in my model depend critically on prior labeling and categorizing. Again, the symbolic/nonsymbolic status of connectionism still seems to be under analysis. In my model the provisional role of connectionistic processes is in inducing and encoding the invariant features in the categorical representation. > the aspect of symbols [that] connectionism > needs most is something resembling pointers. More elaborate notions of > symbol introduce difficult semantic issues of language that can be > separated and addressed independently... Without pointers, > connectionist systems will be restricted to ``iconic'' representations > whose close correspondence with the literal world severely limits them > from ``subserving'' most higher (non-lingual) cognitive functioning. I don't think pointer function can be divorced from semantic issues in a symbol system. Symbols don't just combine and recombine according to syntactic rules, they are also semantically interpretable. Pointing is a symbol-to-symbol relation. Semantics is a symbol-to-object relationship. But without a semantically interpretable system you don't have a symbol system at all, so what would be pointing to what? For what it's worth, I don't personally believe that there is any point in connectionism's trying to emulate bits and pieces of the virtues of symbol systems, such as pointing. Symbolic AI's problem was that it had symbol strings that were interpretable as "standing for" objects and events, but that relation seemed to be in the head of the (human) interpreter, i.e., it was derivative, ungrounded. 
Except where this could be resolved by brute-force hard-wiring into a dedicated system married to its peripheral devices, this grounding problem remained unsolved for pure symbolic AI. Why should connectionism aspire to inherit it? Sure, having objects around that you can interpret as standing for things in the world and yet still manipulate formally is a strength. But at some point the interpretation must be cashed in (at least in mind-modeling) and then the strength becomes a weakness. Perhaps a role in the hybrid mediation between the symbolic and the nonsymbolic is more appropriate for connectionism than direct competition or emulation. > While I agree with the aims of your Total Turing Test (TTT), > viz. capturing the rich interrelated complexity characteristic > of human cognition, I have never found this direct comparison > to human performance helpful. A criterion of cognitive > adequacy that relies so heavily on comparison with humans > raises many tangential issues. I can imagine many questions > (e.g., regarding sex, drugs, rock and roll) that would easily > discriminate between human and machine. Yet I do not see such > questions illuminating issues in cognition. My TTT criterion has been much debated on the Net. The short reply is that the goal of the TTT is not to capture complexity but to capture performance capacity, and the only way to maximize your confidence that you're capturing it the right way (i.e., the way the mind does it) is to capture all of it. This does not mean sex, drugs and rock and roll (there are people who do none of these). It means (1) formally, that a candidate model must generate all of our generic performance capacities (of discriminating, identifying, manipulating and describing objects and events, and producing and responding appropriately to names and descriptions), and (2) (informally) the way it does so must be intuitively indistinguishable from the way a real person does, as judged by a real person. The goal is asymptotic, but it's the only one so far proposed that cuts the underdetermination of cognitive theory down to the size of the ordinary underdetermination of scientific theory by empirical observations: It's the next best thing to being there (in the mind of the robot). > First, let's do our best to imagine providing an artificial cognitive > system (a robot) with the sort of grounding experience you and I both > believe necessary to full cognition. Let's give it video eyes, > microphone ears, feedback from its affectors, etc. And let's even > give it something approaching the same amount of time in this > environment that the developing child requires... > the corpus of experience acquired by such a robot is orders of magnitude > more complex than any system today... [yet] even such a complete > system as this would have a radically different experience of the > world than our own. The communication barrier between the symbols > of man and the symbols of machine to which I referred in my last > message is a consequence of this [difference]. My own conjecture is that simple peripheral modules like these will *not* be enough to ground an artificial cognitive system, at least not enough to make any significant progress toward the TTT. The kind of grounding I'm proposing calls for nonsymbolic internal representations of the kind I described (iconic representations [IRs] and categorical representations [CRs]), related to one another and to input and output in the way I described. 
The critical thing is not the grounding *experience*, but what the system can *do* with it in order to discriminate and identify as we do. I have hypothesized that it must have IRs and CRs in order to do so. The problem is not complexity (at least not directly), but performance capacity, and what it takes to generate it. And the only relevant difference between contemporary machine models and people is not their *experience* per se, but their performance capacities. No model comes close. They're all special-purpose toys. And the ultimate test of man/machine "communication" is of course the TTT! > So the question for me becomes: how might we give a machine the > same rich corpus of experience (hence satisfying the total part > of your TTT) without relying on such direct experiential > contact with the world? The answer for me (at the moment) is > to begin at the level of WORDS... the enormous textual > databases of information retrieval (IR) systems... > I want to take this huge set of ``labels,'' attached by humans to > their world, as my primitive experiential database... > The task facing my system, then, is to look at and learn from this > world:... the textbase itself [and] interactions with IR users... > the system then adapts its (connectionist) representation... Your hypothesis is that an information retrieval system whose only source of input is text (symbols) plus feedback from human users (more symbols) will capture a significant component of cognition. Your hypothesis may be right. My own conjecture, however, is the exact opposite. I don't believe that input consisting of nothing but symbols constitutes "experience." I think it constitutes (ungrounded) symbols, inheriting, as usual, the interpretations of the users with which the system interacts. I don't think that doing connectionism instead of symbol-crunching with this kind of input makes it any more likely to overcome the groundedness problem, but again, I may be wrong. But performance capacity (not experience) -- i.e., the TTT -- will have to be the ultimate arbiter of these hypotheses. -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
aweinste@Diamond.BBN.COM (Anders Weinstein) (05/27/87)
In article <770@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes: > >The criterion for being iconic is physical isomorphism >( = "structural correspondence"). This means that the relationship >between an object and its icon must be a physically invertible >(analog) transformation. As I've seen you broach this criterion a few times now, I just thought I'd remind you of a point that I thought was clearly made in our earlier discussion of the A/D distinction: loss of information, i.e. non-invertibility, is neither a necessary nor sufficient condition for analog to digital transformation. Anders Weinstein
harnad@mind.UUCP (05/28/87)
Anders Weinstein of BBN wrote:

> a point that I thought was clearly made in our earlier discussion of the A/D distinction: loss of information, i.e. non-invertibility, is neither a necessary nor sufficient condition for analog to digital transformation.

The only point that seems to have been clearly made in the sizable discussion of the A/D distinction on the Net last year (to my mind, at least) was that no A/D distinction could be agreed upon that would meet the needs and interests of all of the serious proponents and that perhaps there was an element of incoherence in all but the most technical and restricted of signal-analytic candidates. In the discussion to which you refer above (a 3-level bottom-up model for grounding symbolic representations in nonsymbolic -- iconic and categorical -- representations) the issue was not the A/D transformation but A/A transformations: isomorphic copies of the sensory surfaces. These are the iconic representations. So whereas physical invertibility may not have been more successful than any of the other candidates in mapping out a universally acceptable criterion for the A/D distinction, it is not clear that it can be faulted as a criterion for physical isomorphism.

-- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
aweinste@diamond.bbn.com.UUCP (05/29/87)
Replying to my claim that >> ...loss of information, i.e. >> non-invertibility, is neither a necessary nor sufficient condition for >> analog to digital transformation. in article <786@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes: > >The only point that seems to have been clearly made in the sizable discussion >of the A/D distinction on the Net last year (to my mind, at least) was that no >A/D distinction could be agreed upon ... > >In the discussion to which you refer above ... the issue was not the A/D >transformation but A/A transformations: isomorphic copies of the >sensory surfaces. These are the iconic representations. So whereas >physical invertibility may not have been more successful than any of >the other candidates in mapping out a universally acceptable criterion >for the A/D distinction, it is not clear that it can be faulted as a >criterion for physical isomorphism. Well the point is just the same for the A/A or "physically isomorphic" transformations you describe. Although the earlier discussion admittedly did not yield a positive result, I continue to believe that it was at least established that invertibility is a non-starter: invertibility has essentially *nothing* to do with the difference between analog and digital representation according to anybody's intuitive use of the terms. The reason I think this is so clear is that for any one of the possible transformation types -- A/D, A/A, D/A, or D/D -- one can find paradigmatic examples in which invertibility either does or does not obtain. A blurry image is uncontroversially an analog or "iconic" representation, yet it is non-invertible; a digital recording of sound in the audible range is surely an A/D transformation, yet it is completely invertible, etc. All the invertibility or non-invertibility of a transformation indicates is whether or not the transformation preserves or loses information in the technical sense. But loss of information is of course possible (and not necessary) in any of the 4 cases. I admit I don't know what the qualifier means in your criterion of "physical invertibility"; perhaps this alters the case. Anders Weinstein
harnad@mind.UUCP (Stevan Harnad) (05/29/87)
aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc., Cambridge, MA writes: > invertibility has essentially *nothing* to do with the difference > between analog and digital representation according to anybody's > intuitive use of the terms... A blurry image is uncontroversially > an analog or "iconic" representation, yet it is non-invertible; > a digital recording of sound in the audible range is surely an A/D > transformation, yet it is completely invertible. [I]nvertibility... > [only] indicates whether... the transformation preserves or loses > information in the technical sense. But loss of information is... > possible in any of the 4 cases... A/D, A/A, D/A, D/D... > I admit I don't know what the qualifier means in your criterion > of "physical invertibility"; perhaps this alters the case. I admit that the physical-invertibility criterion is controversial and in the end may prove to be unsatisfactory in delimiting a counterpart of the technical A/D distinction that will be useful in formulating models of internal representation in cognitive science. The underlying idea is this: There are two stages of A/D even in the technical sense. Signal quantization (making a continuous signal discrete) and symbolization (assigning names and addresses to the discrete "chunks"). Unless the original signal is already discrete, the quantization phase involves a loss of information. Some regions of input variation will not be retrievable from the quantized image. The transformation is many-to-fewer instead of one-to-one. A many-to-few mapping cannot be inverted so as to recover the entire original signal. Now I conjecture that it is this physical invertibility -- the possibility of recovering all the original information -- that may be critical in cognitive representations. I agree that there may be information loss in A/A transformations (e.g., smoothing, blurring or loss of some dimensions of variation), but then the image is simply *not analog in the properties that have been lost*! It is only an analog of what it preserves, not what it fails to preserve. A strong motivation for giving invertibility a central role in cognitive representations has to do with the second stage of A/D conversion: symbolization. The "symbol grounding problem" that has been under discussion here concerns the fact that symbol systems depend for their "meanings" on only one of two possibilities: One is an interpretation supplied by human users -- "`Squiggle' means `animal' and `Squoggle' means `has four legs'" -- and the other is a physical, causal connection with the objects to which the symbols refer. The first source of "meaning" is not suitable for cognitive modeling, for obvious reasons (the meaning must be intrinsic and self-contained, not dependent on human mental mediation). The second has a surprising consequence, one that is either valid and instructive about cognitive representations (as I tentatively believe it is), or else a symptom of the wrong-headedness of this approach to the grounding problem, and the inadequacy of the invertibility criterion. The surprising consequence is that a "dedicated system" -- one that is hard-wired to its transducers and effectors (and hence their interactions with objects in the world) may be significantly different from the very *same* system as an isolated symbol-manipulating module, cut off from its peripherals -- different in certain respects that could be critical to cognitive modeling (and cognitive modeling only). 
The dedicated system can be regarded as "analog" in the input signal properties that are physically recoverable, even if there have been (dedicated) "digital" stages of processing in between. This would only be true of dedicated systems, and would cease to be true as soon as you severed their physical connection to their peripherals. This physical invertibility criterion would be of no interest whatever to ordinary technical signal processing work in engineering. (It may even be a strategic error to keep using the engineering "A/D" terminology for what might only bear a metaphorical relation to it.) The potential relevance of the physical invertibility criterion would only be to cognitive modeling, especially in the constraint that a grounded symbol system must be *nonmodular* -- i.e., it must be hybrid symbolic/nonsymbolic.

The reason I have hypothesized that symbolic representations in cognition must be grounded nonmodularly in nonsymbolic representations (iconic and categorical ones) is based in part on the conjecture that the physical invertibility of input information in a dedicated system may play a crucial role in successful cognitive modeling (as described in the book under discussion: "Categorical Perception: The Groundwork of Cognition," Cambridge University Press 1987). Of course, selective *noninvertibility* -- as in categorizing by ignoring some differences and not others -- plays an equally crucial complementary role. The reason the invertibility must be physical rather than merely formal or conceptual is to make sure the system is grounded rather than hanging by a skyhook from people's mental interpretations.

-- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
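[The two-stage picture above -- quantization as a many-to-fewer mapping, versus an analog transformation that remains physically invertible -- is easy to check numerically. The particular signal, step size and gain below are arbitrary choices made only for illustration.]

import numpy as np

signal = np.linspace(0.0, 1.0, 101)      # a finely sampled stand-in for a
                                         # continuous input signal

def quantize(x, step=0.25):
    # Quantization: every value within a bin collapses onto one level
    # (many-to-fewer), so the original values cannot be recovered.
    return np.round(x / step) * step

def analog_transform(x, gain=3.0):
    # An analog (one-to-one) transformation: invertible, hence "analog in
    # the properties it preserves."
    return gain * x

digital_version = quantize(signal)
recovered_analog = analog_transform(signal) / 3.0

print(float(np.max(np.abs(digital_version - signal))))   # nonzero: information lost
print(bool(np.allclose(recovered_analog, signal)))       # True: fully invertible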
aweinste@Diamond.BBN.COM (Anders Weinstein) (06/05/87)
In reply to my objection that

>> invertibility has essentially *nothing* to do with the difference between analog and digital representation according to anybody's intuitive use of the terms

Stevan Harnad (harnad@mind.UUCP) writes in message <792@mind.UUCP>:

> There are two stages of A/D even in the technical sense. ... Unless the original signal is already discrete, the quantization phase involves a loss of information. Some regions of input variation will not be retrievable from the quantized image. The transformation ... cannot be inverted so as to recover the entire original signal.

Well, what I think is interesting is not preserving the signal itself but rather the *information* that the signal carries. In this sense, an analog signal conveys only a finite amount of information and it can in fact be converted to digital form and back to analog *without* any loss.

But in any case the point I've been emphasizing remains: the A/A transformations you envisage are not going to be perfect (no "skyhooks" now, remember?), so preservation or loss of information alone won't distinguish an (intuitively) A/A from an A/D transformation. I think the following reply to this point only muddies the waters:

> I agree that there may be information loss in A/A transformations (e.g., smoothing, blurring or loss of some dimensions of variation), but then the image is simply *not analog in the properties that have been lost*! It is only an analog of what it preserves, not what it fails to preserve.

You can take this line if you like, but notice that the same is true of a *digitized* image -- in your terms, it is "analog" in the information it preserves and not in the information lost. This seems to me to be a very unhappy choice of terminology! Both analog and digitizing transformations must preserve *some* information. If all you're *really* interested in is the quality of being (naturally) information-preserving (i.e. physically invertible), then I'd strongly recommend you just use one of these terms and drop the misleading use of "analog", "iconic", and "digital".

> The "symbol grounding problem" that has been under discussion here concerns the fact that symbol systems depend for their "meanings" on only one of two possibilities: One is an interpretation supplied by human users... and the other is a physical, causal connection with the objects to which the symbols refer. The surprising consequence is that a "dedicated system" -- one that is hard-wired to its transducers and effectors... may be significantly different from the very *same* system as an isolated symbol-manipulating module, cut off from its peripherals ...

With regard to this "symbol grounding problem": I think it's been well-understood for some time that causal interaction with the world is a necessary requirement for artificial intelligence. Recall that in his BBS reply to Searle, Dennett dismissed Searle's initial target -- the "bedridden" form of the Turing test -- as a strawman for precisely this reason. (Searle believes his argument goes through for causally embedded AI programs as well, but that's another topic.) The philosophical rationale for this requirement is the fact that some causal "grounding" is needed in order to determine a semantic interpretation. A classic example is due to Georges Rey: it's possible that a program for playing chess could, when compiled, be *identical* to one used to plot strategy in the Six Day War.
If you look only at the formal symbol manipulations, you can't distinguish between the two interpretations; it's only by virtue of the causal relations between the symbols and the world that the symbols could have one meaning rather than another.

But although everyone agrees that *some* kind of causal grounding is necessary for intentionality, it's notoriously difficult to explain exactly what sort it must be. And although the information-preserving transformations you discuss may play some role here, I really don't see how this challenges the premises of symbolic AI in the way you seem to think it does. In particular you say that:

> The potential relevance of the physical invertibility criterion would only be to cognitive modeling, especially in the constraint that a grounded symbol system must be *nonmodular* -- i.e., it must be hybrid symbolic/nonsymbolic.

But why must the arrangement you envision be "nonmodular"? A system may contain analog and digital subsystems and still be modular if the subsystems interact solely via well-defined inputs and outputs. More importantly -- and this is the real motivation for my terminological objections -- it isn't clear why *any* (intuitively) analog processing need take place at all. I presume the stance of symbolic AI is that sensory input affects the system via an isolable module which converts incoming stimuli into symbolic representations. Imagine a vision sub-system that converts incoming light into digital form at the first stage, as it strikes a grid of photo-receptor surfaces, and is entirely digital from there on in. Such a system is still "grounded" in information-preserving representations in the sense you require. In short, I don't see any *philosophical* reason why symbol-grounding requires analog processing or a non-modular structure.

Anders Weinstein BBN Labs
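[Weinstein's claim above -- that a band-limited analog signal can be converted to digital form and back "without any loss" -- is the sampling theorem. The fragment below is a rough numerical check using Whittaker-Shannon (sinc) interpolation; the frequencies and rates are arbitrary, and with a finite record the reconstruction is only approximate, especially near the endpoints.]

import numpy as np

fs = 100.0                                   # sampling rate, well above Nyquist
t_samples = np.arange(0.0, 1.0, 1.0 / fs)    # the sample instants
t_fine = np.linspace(0.0, 1.0, 2001)         # a stand-in "analog" time axis

def band_limited(t):
    # A signal whose components (5 Hz and 12 Hz) lie below fs/2 = 50 Hz.
    return np.sin(2 * np.pi * 5 * t) + 0.5 * np.cos(2 * np.pi * 12 * t)

samples = band_limited(t_samples)            # the digital representation

# Reconstruct the "analog" signal as a sum of sincs weighted by the samples.
recon = np.array([np.sum(samples * np.sinc(fs * (t - t_samples)))
                  for t in t_fine])

interior = (t_fine > 0.2) & (t_fine < 0.8)   # avoid truncation effects at edges
error = float(np.max(np.abs(recon - band_limited(t_fine))[interior]))
print(error)                                 # small, and shrinks as the record grows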
harnad@mind.UUCP (06/07/87)
aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc., Cambridge, MA writes:

> [regarding invertibility, information preservation and the A/D distinction]: what I think is interesting is not preserving the signal itself but rather the *information* that the signal carries. In this sense, an analog signal conveys only a finite amount of information and it can in fact be converted to digital form and back to analog *without* any loss.

This is an important point and concerns a matter that is at the heart of the symbolic/nonsymbolic issue: What you're saying is appropriate for ordinary communication theory and communication-theoretic applications such as radio signals, telegraph, radar, CDs, etc. In all these cases the signal is simply a carrier that encodes information which is subsequently decoded at the receiving end. But in the case of human cognition this communication-theoretic model -- of signals carrying messages that are encoded/decoded on either end -- may not be appropriate. (Formal information theory has always had difficulties with "content" or "meaning." This has often been pointed out, and I take this to be symptomatic of the fact that it's missing something as a candidate model for cognitive "information processing.")

Note that the communication-theoretic, signal-analytic view has a kind of built-in bias toward digital coding, since it's the "message" and not the "medium" that matters. But what if -- in cognition -- the medium *is* the message? This may well be the case in iconic processing (and the performances that it subserves, such as discrimination, similarity judgment, matching, short-term memory, mental rotation, etc.): It may be the structure or "shape" of the physical signal (the stimulus) itself that matters, not some secondary information or message it carries in coded form. Hence the processing may have to be structure- or shape-preserving in the physical analog sense I've tried to capture with the criterion of invertibility.

> a *digitized* image -- in your terms... is "analog" in the information it preserves and not in the information lost. This seems to me to be a very unhappy choice of terminology! Both analog and digitizing transformations must preserve *some* information. If all you're *really* interested in is the quality of being (naturally) information-preserving (i.e. physically invertible), then I'd strongly recommend you just use one of these terms and drop the misleading use of "analog", "iconic", and "digital".

I'm not at all convinced yet that the sense of iconic and analog that I am referring to is unrelated to the signal-analytic A/D distinction, although I've noted that it may turn out, on sufficient analysis, to be an independent distinction. For the time being, I've acknowledged that my invertibility criterion is, if not necessarily unhappy, somewhat surprising in its implications, for it implies (1) that being analog may be a matter of degree (i.e., degree of invertibility) and (2) even a classical digital system must be regarded as analog to a degree if one is considering a larger "dedicated" system of which it is a hard-wired (i.e., causally connected) component rather than an independent (human-interpretation-mediated) module.
Let me repeat, though, that it could turn out that, despite some suggestive similarities, these considerations are not pertinent to the A/D distinction but, say, to the symbolic/nonsymbolic distinction -- and even that only in the special context of cognitive modeling rather than signal analysis or artificial intelligence in general.

> With regard to [the] "symbol grounding problem": I think it's been well-understood for some time that causal interaction with the world is a necessary requirement for artificial intelligence... The philosophical rationale for this requirement is the fact that some causal "grounding" is needed in order to determine a semantic interpretation... But although everyone agrees that *some* kind of causal grounding is necessary for intentionality, it's notoriously difficult to explain exactly what sort it must be. And although the information-preserving transformations you discuss may play some role here, I really don't see how this challenges the premises of symbolic AI in the way you seem to think it does.

As far as I know, there have so far been only two candidate proposals to overcome the symbol grounding problem WITHOUT resorting to the kind of hybrid proposal I advocate (i.e., without giving up purely symbolic top-down modules): One proposal, as you note, is that a pure symbol-manipulating system can be "grounded" by merely hooking it up causally in the "right way" to the outside world with simple (modular) transducers and effectors. I have conjectured that this strategy will not work in cognitive modeling (and I have given my supporting arguments elsewhere: "Minds, Machines and Searle"). The strategy may work in AI and conventional robotics and vision, but that is because these fields *do not have a grounding problem*! They're only trying to generate intelligent *pieces* of performance, not to model the mind in *all* its performance capacity. Only cognitive modeling has a symbol grounding problem.

The second nonhybrid way to try to ground a purely symbolic system in real-world objects is by cryptology. Human beings, knowing already at least one grounded language and its relation to the world, can infer the meanings of a second one [e.g., ancient cuneiform] by using its internal formal structure plus what they already know: Since the symbol permutations and combinations of the unknown system (i.e., its syntactic rules) are constrained to yield a semantically interpretable system, sometimes the semantics can be reliably and uniquely decoded this way (despite Quine's claims about the indeterminacy of radical translation). It is obvious, however, that such a "grounding" would be derivative, and would depend entirely on the groundedness of the original grounded symbol system. (This is equivalent to Searle's "intrinsic" vs. "derived intentionality.") And *that* grounding problem remains to be solved in an autonomous cognitive model. My own hybrid approach is simply to bite the bullet and give up on the hope of an autonomous symbolic level, the hope on which AI and symbolic functionalism had relied in their attempt to capture mental function.
Although you can get a lot of clever performance by building in purely symbolic "knowledge," and although it had seemed so promising that symbol-strings could be interpreted as thoughts, beliefs, and mental propositions, I have argued that a mere extension of this modular "top-down" approach, hooking up eventually with peripheral modules, simply won't succeed in the long run (i.e., as we attempt to approach an asymptote of total human performance capacity, or what I've called the "Total Turing Test") because of the grounding problem and the nonviability of the two "solutions" sketched above (i.e., simple peripheral hook-ups and/or mediating human cryptology). Instead, I have described a nonmodular hybrid representational system in which symbolic representations are grounded bottom-up in nonsymbolic ones (iconic and categorical). Although there is a symbolic level in such a system, it is not quite the autonomous all-purpose level of symbolic AI. It trades its autonomy for its groundedness. > [W]hy must the arrangement you envision be "nonmodular"? A system > may contain analog and digital subsystems and still be modular if > the subsystems interact solely via well-defined inputs and outputs. I'll try to explain why I believe that a successful mind-model (one able to pass the Total Turing Test) is unlikely to consist merely of a pure symbol-manipulative module connected to input/output modules. A pure top-down symbol system just consists of physically implemented symbol manipulations. You yourself describe a typical example of ungroundedness (from Georges Rey): > it's possible that a program for playing chess could, > when compiled, be *identical* to one used to plot > strategy in the Six Day War. If you look only at the > formal symbol manipulations, you can't distinguish between > the two interpretations; it's only by virtue of the causal > relations between the symbols and the world that the symbols > could have one meaning rather than another. Now consider two cases of "fixing" the symbol interpretations by grounding the causal relations between the symbols and the world. In (1) a "toy" case -- a circumscribed little chunk of performance such as chess-playing or war-games -- the right causal connections could be wired according to the human encryption/decryption scheme: Inputs and outputs could be wired into their appropriate symbolic descriptions. There is no problem here, because the toy problems are themselves modular, and we know all the ins and outs. But none but the most diehard symbolic functionalist would want to argue that such a simple toy model was "thinking," or even doing anything remotely like what we do when we accomplish the same performance. The reason is that we are capable of doing *so much more* -- and not by an assemblage of endless independent modules of essentially the same sort as these toy models, but by some sort of (2) integrated internal system. Could that "total" system be just an oversized toy model -- a symbol system with its interpretations "fixed" by a means analogous to these toy cases? I am conjecturing that it is not. Toy models don't think. Their internal symbols really *are* meaningless, and hence setting them in the service of generating a toy performance just involves hard-wiring our intended interpretations of its symbols into a suitable dedicated system. 
Total (human-capacity-sized) models, on the other hand, will, one hopes, think, and hence the intended interpretations of their symbols will have to be intrinsic in some deeper way than the analogy with the toy model would suggest, at least so I think. This is my proposed "nonmodular" candidate: Every formal symbol system has both primitive atomic symbols and composite symbol-strings consisting of ruleful combinations of the atoms. Both the atoms and the combinations are semantically interpretable, but from the standpoint of the formal syntactic rules governing the symbol manipulations, the atoms could just as well have been undefined or meaningless. I hypothesize that the primitive symbols of a nonmodular cognitive symbol system are actually the (arbitrary) labels of object categories, and that these labels are reliably assigned to their referents by a nonsymbolic representational system consisting of (i) iconic (invertible, one-to-one) transformations of the sensory surface and (ii) categorical (many-to-few) representations that preserve only the features that suffice to reliably categorize and label sensory projections of the objects in question. Hence, rather than being primitive and undefined, and hence independent of interpretation, I suggest that the atoms of cognitive symbol systems are grounded, bottom-up, in such a categorization mechanism. The higher-order symbol combinations inherit the bottom-up constraints, including the nonsymbolic representations to which they are attached, rather than being an independent top-down symbol-manipulative module with its connections to an input/output module open to being fixed in various extrinsically determined ways. > it isn't clear why *any* (intuitively) analog processing need > take place at all. I presume the stance of symbolic AI is that > sensory input affects the system via an isolable module which converts > incoming stimuli into symbolic representations. Imagine a vision > sub-system that converts incoming light into digital form at the > first stage, as it strikes a grid of photo-receptor surfaces, and is > entirely digital from there on in. Such a system is still "grounded" > in information-preserving representations in the sense you require. > In short, I don't see any *philosophical* reason why symbol-grounding > requires analog processing or a non-modular structure. It is exactly this modular scenario that I am calling into question. It is not clear at all that a cognitive system must conform to it. To get a device to be able to do what we can do we may have to stop thinking in terms of "isolable" input modules that go straight into symbolic representations. That may be enough to "ground" a conventional toy system, but, as I've said, such toy systems don't have a grounding problem in the first place, because nobody really believes they're thinking. To get closer to life-size devices -- devices that can generate *all* of our performance capacity, and hence may indeed be thinking -- we may have to turn to hybrid systems in which the symbolic functions are nonmodularly grounded, bottom-up, in the nonsymbolic ones. The problem is not a philosophical one, it's an empirical one: What looks as if it's likely to work, on the evidence and reasoning available? -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
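To make the proposed layering easier to see, here is a deliberately toy sketch (Python with NumPy). Everything in it is an invented stand-in: the eight-element "sensory" vector, the two hard-coded features, and the two labels. In particular, a real categorical representation would have to be *learned* from sampled, sorted and labeled instances (possibly by connectionist means), not wired in by hand; the sketch only shows the shape of the iconic/categorical/symbolic dependency, with the label grounded in the two nonsymbolic stages beneath it.

import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a sensory projection: an 8-element vector, two elements of
# which happen to carry the category-relevant structure.
def sense(kind):
    base = {"circle": [1.0, 0.1], "square": [0.1, 1.0]}[kind]
    return np.array(base + [0.0] * 6) + 0.05 * rng.normal(size=8)

# (1) Iconic representation: an invertible, one-to-one transform of the input.
M = np.eye(8) + 0.1 * rng.normal(size=(8, 8))   # invertible (with probability 1)
def iconic(x):
    return M @ x
def invert_icon(icon):
    return np.linalg.solve(M, icon)

# (2) Categorical representation: a many-to-few filter that keeps only the
# features sufficient for reliable sorting and discards everything else.
def categorical(icon):
    x = invert_icon(icon)
    return (bool(x[0] > 0.5), bool(x[1] > 0.5))

# (3) Symbolic level: arbitrary labels, assigned via the categorical filter.
LABELS = {(True, False): "CIRCLE", (False, True): "SQUARE"}
def label(x):
    return LABELS.get(categorical(iconic(x)), "UNKNOWN")

x = sense("circle")
print(np.allclose(invert_icon(iconic(x)), x))   # the iconic stage is invertible
print(label(x))                                 # the label is grounded in (1) and (2)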
aweinste@diamond.bbn.com.UUCP (06/10/87)
In article <812@mind.UUCP> Stevan Harnad <harnad@mind.UUCP> replies: With regard to physical invertibility and the A/D distinction: > >> a *digitized* image -- in your terms... is "analog" in the >> information it preserves and not in the information lost. This >> seems to me to be a very unhappy choice of terminology! > > For the time being, I've acknowledged that >my invertibility criterion is, if not necessarily unhappy, somewhat >surprising in its implications, for it implies (1) that being analog >may be a matter of degree (i.e., degree of invertibility) and (2) even >a classical digital system must be regarded as analog to a degree ... Grumble. These consequences only *seem* surprising if we forget that you've redefined "analog" in a non-standard manner; this is precisely why I keep harping on your terminology. Compare them with what you're really saying: "physical invertibility is a matter of degree" or "a classical digital system still employs physically invertible representations" -- both quite humdrum. With regard to the symbolic AI approach to the "symbol-grounding problem": > >One proposal, as you note, is that a pure symbol-manipulating system can be >"grounded" by merely hooking it up causally in the "right way" to the outside >world with simple (modular) transducers and effectors. ... I have argued >that [this approach] simply won't succeed in the long run (i.e., as we >attempt to approach an asymptote of total human performance capacity ...) >...In (1) a "toy" case ... the right causal connections could be wired >according to the human encryption/decryption scheme: Inputs and outputs could >be wired into their appropriate symbolic descriptions. ... But none but the >most diehard symbolic functionalist would want to argue that such a simple >toy model was "thinking," ... The reason is that we are capable of >doing *so much more* -- and not by an assemblage of endless independent >modules of essentially the same sort as these toy models, but by some sort of >(2) integrated internal system. Could that "total" system be just an >oversized toy model -- a symbol system with its interpretations "fixed" by a >means analogous to these toy cases? I am conjecturing that it is not. I think your reply may misunderstand the point of my objection. I'm not trying to defend the intentionality of "toy" programs. I'm not even particularly concerned to *defend* the symbolic approach to AI (I personally don't even believe in it). I'm merely trying to determine exactly what your argument against symbolic AI is. I had thought, perhaps wrongly, that you were claiming that the interpretations of systems conceived by symbolic AI must somehow inevitably fail to be "grounded", and that only a system which employed "analog" processing in the way you suggest would have the causal basis required for fixing an interpretation. In response, I pointed out first that advocates of the symbolic approach already understand that causal commerce with the environment is necessary for intentionality: they envision the use of complex perceptual systems to provide the requisite "grounding". So it's not as though the symbolic approach is indifferent to this issue. And your remarks against "toy" systems and "hard-wiring" the interpretations of the inputs are plain unfair -- the symbolic approach doesn't belittle the importance or complexity of what perceptual systems must be able to do.
It is in total agreement with you that a truly intentional system must be capable of complex adaptive performance via the use of its sensory input -- it just hypothesizes that symbolic processing is sufficient to achieve this. And, as I tried to point out, there is just no reason that a modular, all-digital system of the kind envisioned by the symbolic approach could not be entirely "grounded" BY YOUR OWN THEORY OF "GROUNDEDNESS": it could employ "physically invertible" representations (only they would be digital ones), from these it could induce reliable "feature filters" based on training (only these would use digital rather than analog techniques), etc. I concluded that the symbolic approach appears to handle your so-called "grounding problem" every bit as well as any other method. Now comes the reply that you are merely conjecturing that analog processing may be required to realize the full range of human, as opposed to "toy", performance -- in short, you think the symbolic approach just won't work. But this is a completely different issue! It has nothing to do with some mythical "symbol grounding" problem, at least as I understand it. It's just the same old "intelligent-behavior-generating" problem which everyone in AI, regardless of paradigm, is looking to solve. From this reply, it seems to me that this alleged "symbol-grounding problem" is a real red herring (it misled me, at least). All you're saying is that you suspect that mainstream AI's symbol system hypothesis is false, based on its lack of conspicuous performance-generating successes. Obviously everyone must recognize that this is a possibility -- the premise of symbolic AI is, after all, only a hypothesis. But I find this a much less interesting claim than I originally thought -- conjectures, after all, are cheap. It *would* be interesting if you could show, as, say, the connectionist program is trying to, how analog processing can work wonders that symbol-manipulation can't. But this would require detailed research, not speculation. Until then, it remains a mystery why your proposed approach should be regarded as any more promising than any other. Anders Weinstein BBN Labs
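For what it is worth, the "digital feature filters induced from training" in the passage above can at least be given a concrete toy form. The sketch below (Python with NumPy; the 8-bit input code is invented, and a plain perceptron merely stands in for whatever induction procedure one prefers) learns a reliable many-to-few category filter from labeled, fully digitized samples, with no analog stage anywhere. Whether anything of this sort scales up to human-sized performance is, of course, exactly what is in dispute.

import numpy as np

rng = np.random.default_rng(1)

# Digitized "sensory" inputs: 8-bit vectors in which only the first two bits
# actually distinguish the two categories; the rest are noise.
def sample(category):
    x = rng.integers(0, 2, size=8).astype(float)
    x[0], x[1] = (1.0, 0.0) if category == 0 else (0.0, 1.0)
    return x

# Induce a "feature filter" from labeled training data with a perceptron.
w = np.zeros(8)
b = 0.0
for _ in range(200):
    cat = int(rng.integers(0, 2))
    x = sample(cat)
    target = 1.0 if cat == 1 else -1.0
    if target * (w @ x + b) <= 0:      # misclassified: adjust the filter
        w += target * x
        b += target

print(np.round(w, 1))                               # the induced filter's weights
tests = [sample(c) for c in (0, 1)]
print([int(w @ tests[c] + b > 0) for c in (0, 1)])  # should print [0, 1]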
harnad@mind.UUCP (06/11/87)
aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc., Cambridge, MA writes: > There's no [symbol] grounding problem, just the old > behavior-generating problem Before responding to the supporting arguments for this conclusion, let me restate the matter in what I consider to be the right way. There is: (1) the behavior-generating problem (what I have referred to as the problem of devising a candidate that will pass the Total Turing Test), (2) the symbol-grounding problem (the problem of how to make formal symbols intrinsically meaningful, independent of our interpretations), and (3) the conjecture (based on the existing empirical evidence and on logical and methodological considerations) that (2) is responsible for the failure of the top-down symbolic approach to solve (1). >>my [SH's] invertibility criterion is, if not necessarily unhappy, somewhat >>surprising in its implications, for it implies that (1) being analog may >>be a matter of degree (i.e., degree of invertibility) and that (2) even >>a classical digital system must be regarded as analog to a degree ... > > These consequences only *seem* surprising if we forget that you've > redefined "analog" in a non-standard manner... you're really saying: > "physical invertibility is a matter of degree" or "a classical digital > system still employs physically invertible representations" -- both > quite humdrum. You've bypassed the three points I brought up in replying to your challenge to my invertibility criterion for an analog transform the last time: (1) the quantization in standard A/D is noninvertible, (2) a representation can only be analog in what it preserves, not in what it fails to preserve, and, in cognition at any rate, (3) the physical shape of the signal may be what matters, not the "message" it "carries." Add to this the surprising logical consequence that a "dedicated" digital system (hardwired to its peripherals) would be "analog" in its invertible inputs and outputs according to my invertibility criterion, and you have a coherent distinction that conforms well to some features of the classical A/D distinction, but that may prove to diverge, as I acknowledged, sufficiently to make it an independent, "non-standard" distinction, unique to cognition and neurobiology. Would it be surprising if classical electrical engineering concepts did not turn out to be just right for mind-modeling? > I [AW] had thought, perhaps wrongly, that you were claiming that the > interpretations of systems conceived by symbolic AI must somehow > inevitably fail to be "grounded", and that only a system which employed > "analog" processing in the way you suggest would have the causal basis > required for fixing an interpretation. That is indeed what I'm claiming (although you've completely omitted the role of the categorical representations, which are just as critical to my scheme, as described in the CP book). But do make sure you keep my "non-standard" definition of analog in mind, and recall that I'm talking about asymptotic, human-scale performance, not toy systems. Toy systems are trivially "groundable" (even by my definition of "analog") by hard-wiring them into a dedicated input/output system. But the problem of intrinsic meaningfulness does not arise for toy models, only for devices that can pass the Total Turing Test (TTT). [The argument here involves showing that to attribute intentionality to devices that exhibit sub-TTT performance is not justified in the first place.]
The conjecture is accordingly that the modular solution (i.e., hardwiring an autonomous top-down symbolic module to conventional peripheral modules -- transducers and effectors) will simply not succeed in producing a candidate that will be able to pass the Total Turing Test, and that the fault lies with the autonomy (or modularity) of the symbolic module. But I am not simply proposing an unexplicated "analog" solution to the grounding problem either, for note that a dedicated modular system *would* be analog according to my invertibility criterion! The conjecture is that such a modular solution would not be able to meet the TTT performance criterion, and the grounds for the conjecture are partly inductive (extrapolating symbolic AI's performance failures), partly logical and methodological (the grounding problem), and partly theory and data-driven (psychophysical findings in human categorical perception). My proposal is not that some undifferentiated, non-standard "analog" processing must be going on. I am advocating a specific hybrid bottom-up, symbolic/nonsymbolic rival to the pure top-down symbolic approach (whether or not the latter is wedded to peripheral modules), as described in the volume under discussion ("Categorical Perception: The Groundwork of Cognition," CUP 1987). > advocates of the symbolic approach already understand that causal > commerce with the environment is necessary for intentionality: they > envision the use of complex perceptual systems to provide the > requisite "grounding". So it's not as though the symbolic approach > is indifferent to this issue. This is the pious hope of the "top-down" approach: That suitably "complex" perceptual systems will meet for a successful "hook-up" somewhere in the middle. But simply reiterating it does not mean it will be realized. The evidence to date suggests the opposite: That the top-down approach will just generate more special-purpose toys, not a general purpose, TTT-scale model of human performance capacity. Nor is there any theory at all of what the requisite perceptual "complexity" might be: The stereotype is still standard transducers that go from physical energy via A/D conversion straight into symbols. Nor does "causal commerce" say anything: It leaves open anything from the modular symbol-cruncher/transducer hookups of the kind that so far only seem capable of generating toy models, to hybrid, nonmodular, bottom-up models of the sort I would advocate. Perhaps it's in the specific nature of the bottom-up grounding that the nature of the requisite "complexity" and "causality" will be cashed in. > your remarks against "toy" systems and "hard-wiring" the > interpretations of the inputs are plain unfair -- the symbolic > approach doesn't belittle the importance or complexity of what > perceptual systems must be able to do. It is in total agreement > with you that a truly intentional system must be capable of complex > adaptive performance via the use of its sensory input -- it just > hypothesizes that symbolic processing is sufficient to achieve this. And I just hypothesize that it isn't. And I try to say why not (the grounding problem and modularity) and what to do about it (bottom-up, nonmodular grounding of symbolic representations in iconic and categorical representations). 
> there is just no reason that a modular, all-digital system of the > kind envisioned by the symbolic approach could not be entirely > "grounded" BY YOUR OWN THEORY OF "GROUNDEDNESS": it could employ > "physically invertible" representations (only they would be digital > ones), from these it could induce reliable "feature filters" based on > training (only these would use digital rather than analog techniques), > etc. ... the symbolic approach appears to handle your so-called > "grounding problem" every bit as well as any other method. First of all, as I indicated earlier, a dedicated top-down symbol-crunching module hooked to peripherals would indeed be "grounded" in my sense -- if it had TTT-performance power. Nor is it *logically impossible* that such a system could exist. But it certainly does not look likely on the evidence. I think some of the reasons we were led (wrongly) to expect it were the following: (1) The original successes of symbolic AI in generating intelligent performance: The initial rule-based, knowledge-driven toys were great successes, compared to the alternatives (which, apart from some limited feats of Perceptrons, were nonexistent). But now, after a generation of toys that show no signs of converging on general principles and growing up to TTT-size, the inductive evidence is pointing in the other direction: More ad hoc toys is all we have grounds to expect. (2) Symbol strings seemed such hopeful candidates for capturing mental phenomena such as thoughts, knowledge, beliefs. Symbolic function seemed like such a natural, distinct, nonphysical level for capturing the mind. Easy come, easy go. (3) We were persuaded by the power of computation -- Turing equivalence and all that -- to suppose that computation (symbol-crunching) just might *be* cognition. If every (discrete) thing anyone or anything (including the mind) does is computationally simulable, then maybe the computational functions capture the mental functions? But the fact that something is computationally simulable does not entail that it is implemented computationally (any more than behavior that is *describable* as ruleful is necessarily following an explicit rule). And some functions (such as transduction and causality) cannot be implemented computationally at all. (4) We were similarly persuaded by the power of digital coding -- the fact that it can approximate analog coding as closely as we please (and physics permits) -- to suppose that digital representations were the only ones we needed to think about. But the fact that a digital approximation is always possible does not entail that it is always practical or optimal, nor that it is the one that is actually being *used* (by, say, the brain). Some form of functionalism is probably right, but it certainly need not be symbolic functionalism, or a functionalism that is indifferent to whether a mental function or representation is analog or digital: The type of implementation may matter, both to the practical empirical problem of successfully generating performance and to the untestable phenomenological problem of capturing qualitative subjective experience. And some functions (let me again add), such as transduction and (continuous) A/A, cannot be implemented purely symbolically at all. A good example to bear in mind is Shepard's mental rotation experiments.
On the face of it, the data seemed to suggest that subjects were doing analog processing: In making same/different judgments of pairs of successively presented 2-dimensional projections of 3-dimensional, computer-generated, unfamiliar forms, subjects' reaction times for saying "same" when one stimulus was in a standard orientation and the other was rotated were proportional to the degree of rotation. The diehard symbolists pointed out (correctly) that the proportionality, instead of being due to the real-time analog rotation of a mental icon, could have been produced by, say, (1) serially searching through the coordinates of a digital grid on which the stimuli were represented, with more distant numbers taking more incremental steps to reach, or by (2) doing inferences on formal descriptions that became more complex (and hence time-consuming) as the orientation became more eccentric. The point, though, is that although digital/symbolic representations were indeed possible, so were analog ones, and here the latter would certainly seem to be more practical and parsimonious. And the fact of the matter -- namely, which kinds of representations were *actually* used -- is certainly not settled by pointing out that digital representations are always *possible.* Maybe a completely digital mind would have required a head the size of New York State and polynomial evolutionary time in order to come into existence -- who knows? Not to mention that it still couldn't do the "A" in the A/D... > [you] reply that you are merely conjecturing that analog processing > may be required to realize the full range of human, as opposed to "toy", > performance -- in short, you think the symbolic approach just won't > work. But this... has nothing to do with some mythical "symbol > grounding" problem, at least as I understand it. It's just > the same old "intelligent-behavior-generating" problem which everyone > in AI, regardless of paradigm, is looking to solve... All you're > saying is that you suspect that mainstream AI's symbol system > hypothesis is false, based on its lack of conspicuous > performance-generating successes. Obviously everyone must recognize > that this is a possibility -- the premise of symbolic AI is, after > all, only a hypothesis. I'm not just saying I think the symbolic hypothesis is false. I'm saying why I think it's false (ungroundedness) and I'm suggesting an alternative (a bottom-up hybrid). > But I find this a much less interesting claim than I originally > thought -- conjectures, after all, are cheap. It *would* be > interesting if you could show, as, say, the connectionist program > is trying to, how analog processing can work wonders that > symbol-manipulation can't. But this would require detailed research, > not speculation. Until then, it remains a mystery why your proposed > approach should be regarded as any more promising than any other. Be patient. My hypotheses (which are not just spontaneous conjectures, but are based on an evaluation of the available evidence, the theoretical alternatives, and the logical and methodological problems involved) will be tested. They even have a potential connectionist component (in the induction of the features subserving categorization), although connectionism comes in for criticism too. For now it would seem only salutary to attempt to set cognitive modeling in directions that differ from the unprofitable ones it has taken so far. 
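The analog reading of the Shepard result is easy to caricature in code. The sketch below (Python with NumPy; the four-point "form" and the 5-degree increment are arbitrary inventions) matches a rotated probe against a standard by rotating an internal icon in small fixed increments, so the number of increments, the stand-in for reaction time, comes out proportional to the angular disparity. The digital alternatives mentioned above (coordinate search, inference over descriptions) are not implemented here; the point is only that the analog construal is simple and parsimonious, not that it is thereby proven.

import numpy as np

# A 2-D "icon": a few points standing in for one of the computer-generated forms.
icon = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 0.5], [0.5, 1.0]])

def rotate(points, degrees):
    r = np.radians(degrees)
    R = np.array([[np.cos(r), -np.sin(r)], [np.sin(r), np.cos(r)]])
    return points @ R.T

def analog_match(probe, standard, step=5.0, tol=1e-6):
    # Rotate the probe in fixed small increments until it coincides with the
    # standard; the number of increments grows linearly with the disparity.
    current = probe
    for steps in range(int(360 / step) + 1):
        if np.max(np.abs(current - standard)) < tol:
            return steps
        current = rotate(current, step)
    return None

for disparity in (20, 60, 120):
    probe = rotate(icon, -disparity)
    print(disparity, "degrees ->", analog_match(probe, icon), "increments")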
-- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
marty1@houdi.UUCP (M.BRILLIANT) (06/11/87)
In article <828@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes: > aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc., > Cambridge, MA writes: > > > There's no [symbol] grounding problem, just the old > > behavior-generating problem > > ..... There is: > (1) the behavior-generating problem (what I have referred to as the problem of > devising a candidate that will pass the Total Turing Test), (2) the > symbol-grounding problem (the problem of how to make formal symbols > intrinsically meaningful, independent of our interpretations), and (3) ... Just incidentally, what is the intrinsic meaning of "intrinsically meaningful"? The Turing test is an objectively verifiable criterion. How can we objectively verify intrinsic meaningfulness? > .... Add to this the surprising logical consequence that a > "dedicated" digital system (hardwired to its peripherals) would be > "analog" in its invertible inputs and outputs according to my > invertibility criterion, ..... Using "analog" to mean "invertible" invites misunderstanding, which invites irrelevant criticism. Human (in general, vertebrate) visual processing is a dedicated hardwired digital system. It employs data reduction to abstract such features as motion, edges, and orientation of edges. It then forms a map in which position is crudely analog to the visual plane, but quantized. This map is sufficiently similar to maps used in image processing machines so that I can almost imagine how symbols could be generated from it. By the time it gets to perception, it is not invertible, except with respect to what is perceived. Noninvertibility is demonstrated in experiments in the identification of suspects. Witnesses can report what they perceive, but they don't always perceive enough to invert the perceived image and identify the object that gave rise to the perception. If you don't agree, please give a concrete, objectively verifiable definition of "invertibility" that can be used to refute my conclusion. If I am right, human intelligence itself relies on neither analog nor invertible symbol grounding, and therefore artificial intelligence does not require it. By the way, there is an even simpler argument: even the best of us can engage in fuzzy thinking in which our symbols turn out not to be grounded. Subjectively, we then admit that our symbols are not intrinsically meaningful, though we had interpreted them as such. M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
aweinste@Diamond.BBN.COM (Anders Weinstein) (06/12/87)
In article <828@mind.UUCP> Stevan Harnad <harnad@mind.UUCP> writes > >> There's no [symbol] grounding problem, just the old >> behavior-generating problem > There is: >(1) the behavior-generating problem (what I have referred to as the problem of >devising a candidate that will pass the Total Turing Test), (2) the >symbol-grounding problem (the problem of how to make formal symbols >intrinsically meaningful, independent of our interpretations), and (3) >the conjecture (based on the existing empirical evidence and on >logical and methodological considerations) that (2) is responsible for >the failure of the top-down symbolic approach to solve (1). It seems to me that in different places, you are arguing the relation between (1) and (2) in both directions, claiming both (A) The symbols in a purely symbolic system will always be ungrounded because such systems can't generate real performance; and (B) A purely symbolic system can't generate real performance because its symbols will always be ungrounded. That is, when I ask you why you think the symbolic approach won't work, one of your reasons is always "because it can't solve the grounding problem", but when I press you for why the symbolic approach can't solve the grounding problem, it always turns out to be "because I think it won't work." I think we should get straight on the priority here. It seems to me that, contra (3), thesis (A) is the one that makes perfect sense -- in fact, it's what I thought you were saying. I just don't understand (B) at all. To elaborate: I presume the "symbol-grounding" problem is a *philosophical* question: what gives formal symbols original intentionality? I suppose the only answer anybody knows is, in brief, that the symbols must be playing a certain role in what Dennett calls an "intentional system", that is, a system which is capable of producing complex, adaptive behavior in a rational way. Since such a system must be able to respond to changes in its environment, this answer has the interesting consequence that causal interaction with the world is a *necessary* condition for original intentionality. It tells us that symbols in a disconnected computer, without sense organs, could never be "grounded" or intrinsically meaningful. But those in a machine that can sense and react could be, provided the machine exhibited the requisite rationality. And this, as far as I can tell, is the end of what we learn from the "symbol grounding" problem -- you've got to have sense organs. For a system that is not causally isolated from the environment, the symbol-grounding problem now just reduces to the old behavior-generating problem, for, if we could just produce the behavior, there would be no question of the intentionality of the symbols. In other words, once we've wised up enough to recognize that we must include sensory systems (as symbolic AI has), we have completely disposed of the "symbol grounding" problem, and all that's left to worry about is the question of what kind of system can produce the requisite intelligent behavior. That is, all that's left is the old behavior-generating problem. Now as I've indicated, I think it's perfectly reasonable to suspect that the symbolic approach is insufficient to produce full human performance. You really don't have to issue any polemics on this point to me; such a suspicion could well be justified by pointing out the triviality of AI's performance achievements to date. 
What I *don't* see is any more "principled" or "logical" or "methodological" reason for such a suspicion; in particular, I don't understand how (B) could provide such a reason. My system can't produce intelligent performance because it doesn't make its symbols meaningful? This statement has just got things backwards -- if I could produce the behavior, you'd have to admit that its symbols had all the "grounding" they needed for original intentionality. In sum, apart from the considerations that require causal embedding, I don't see that there *is* any "symbol-grounding" problem, at least not any problem that is any different from the old "total-performance generating" problem. For this reason, I think your animadversions on symbol grounding are largely irrelevant to your position -- the really substantial claims pertain only to "what looks like it's likely to work" for generating intelligent behavior. On a more specific issue: > >You've bypassed the three points I brought up in replying to your >challenge to my invertibility criterion for an analog transform the >last time: (1) the quantization in standard A/D is noninvertible, Yes, but *my* point has been that since there isn't necessarily any more loss here than there is in a typical A/A transformation, the "degree of invertibility" criterion cross-cuts the intuitive A/D distinction. Look, suppose we had a digitized image, A, which is of much higher resolution than another analog one, B. A is more invertible since it contains more detail from which to reconstruct the original signal, but B is more "shape-preserving" in an intuitive sense. So, which do you regard as "more analog"? Which does your theory think is better suited to subserving our categorization performance? If you say B, then invertibility is just not what you're after. Anders Weinstein BBN Labs
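The A-versus-B question above can be made concrete with a small sketch (Python with NumPy; the "image" is just a random scan line, and the 8-bit depth and 9-point blur are arbitrary). The finely digitized copy deviates from the original by at most half a quantization step, so it is very nearly invertible; the smooth "analog" blur, intuitively more shape-preserving, discards components that no inversion can recover, since the blur's frequency response has zeros. Which of the two deserves to be called "more analog" is precisely the terminological question at issue.

import numpy as np

rng = np.random.default_rng(2)
original = rng.random(256)                 # stand-in for one scan line of an image

# A: a finely *digitized* copy -- 8-bit quantization, no other degradation.
A = np.round(original * 255) / 255

# B: an "analog" copy -- continuous-valued and smooth, but blurred by a
# 9-point moving average (whose frequency response has zeros, so some
# components of the original are unrecoverable from B even in principle).
B = np.convolve(original, np.ones(9) / 9, mode="same")

# A crude measure of how much of the original each copy still carries:
print("digitized A:", np.max(np.abs(A - original)))   # at most 1/510
print("blurred   B:", np.max(np.abs(B - original)))   # far larger: detail is gone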
marty1@houdi.UUCP (M.BRILLIANT) (06/13/87)
In article <6521@diamond.BBN.COM>, aweinste@Diamond.BBN.COM (Anders Weinstein) writes: > .... > (A) The symbols in a purely symbolic system will always be > ungrounded because such systems can't generate real performance; > ... > It seems to me that .... thesis (A) is the one that makes perfect > sense .... > > ..... I think it's perfectly reasonable to suspect that the > symbolic approach is insufficient to produce full human performance.... What exactly is this "purely" symbolic approach? What impure approach might be necessary? "Purely symbolic" sounds like a straw man: a system so purely abstract that it couldn't possibly relate to the real world, and one that nobody seriously trying to mimic human behavior would even try to build. To begin with, any attempt to "produce full human performance" must involve sensors, effectors, and motivations. Does "purely symbolic" preclude any of these? If not, what is it in the definition of a "purely symbolic" approach that makes it inadequate to pull these factors together? (Why do I so casually include motivations? I'm an amateur actor. Not even a human can mimic another human without knowing about motivations.) M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
harnad@mind.UUCP (06/13/87)
aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc., Cambridge, MA writes: > X has intrinsic intentionality (is "grounded") iff X can pass the TTT. > I thought from your postings that you shared this frankly behavioristic > philosophy... So what could it come to to say that symbolic AI must > inevitably choke on the grounding problem? Since grounding == behavioral > capability, all this claim can mean is that symbolic AI won't be able > to generate full TTT performance. I think, incidentally, that you're > probably right in this claim. However,...To say that your approach > is better grounded is only to say that it may work better (ie. > generate TTT performance); there's just no independent content to the > claim of "groundedness". Or do you have some non-behavioral definition > of intrinsic intentionality that I haven't yet heard? I think that this discussion has become repetitious, so I'm going to have to cut down on the words. Our disagreement is not substantive. I am not a behaviorist. I am a methodological epiphenomenalist. Intentionality and consciousness are not equivalent to behavioral capacity, but behavioral capacity is our only objective basis for inferring that they are present. Apart from behavioral considerations, there are also functional considerations: What kinds of internal processes (e.g., symbolic and nonsymbolic) look as if they might work? and why? and how? The grounding problem accordingly has functional aspects too. What are the right kinds of causal connections to ground a system? Yes, the test of successful grounding is the TTT, but that still leaves you with the problem of which kinds of connections are going to work. I've argued that top-down symbol systems hooked to transducers won't, and that certain hybrid bottom-up systems might. All these functional considerations concern how to ground symbols, they are distinct from (though ultimately, of course, dependent on) behavioral success, and they do have independent content. -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
marty1@houdi.UUCP (06/14/87)
In article <835@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes: > marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel writes: > > > Human visual processing is neither analog nor invertible. > > Nor understood nearly well enough to draw the former two conclusions, > it seems to me. If you are taking the discreteness of neurons, the > all-or-none nature of the action potential, and the transformation of > stimulus intensity to firing frequency as your basis for concluding > that visual processing is "digital," the basis is weak, and the > analogy with electronic transduction strained. No, I'm taking more than that as the basis. I don't have any names handy, and I'm not a professional in neurobiology, but I've seen many articles in Science and Scientific American (including a classic paper titled something like "What the frog's eye tells the frog's brain") that describe the flow of visual information through the layers of the retina, and through the layers of the visual cortex, with motion detection, edge detection, orientation detection, etc., all going on in specific neurons. Maybe a neurobiologist can give a good account of what all that means, so we can guess whether computer image processing could emulate it. > > what is the intrinsic meaning of "intrinsically meaningful"? > > The Turing test is an objectively verifiable criterion. How can > > we objectively verify intrinsic meaningfulness? > > We cannot objectively verify intrinsic meaningfulness. The Turing test > is the only available criterion. Yet we can make inferences... I think that substantiates Weinstein's position: we're back to the behavior-generating problem. > ....: We > know the difference between looking up a meaning in an English/English > dictionary versus a Chinese/Chinese dictionary (if we are nonspeakers > of Chinese): The former symbols are meaningful and the latter are > not. Not relevant. Intrinsically, words in both languages are equally meaningful. > > Using "analog" to mean "invertible" invites misunderstanding, > > which invites irrelevant criticism. > > ..... I have acknowledged all > along that the physically invertible/noninvertible distinction may > turn out to be independent of the A/D distinction, although the > overlap looks significant. And I'm doing my best to sort out the > misunderstandings and irrelevant criticism... Then please stop using the terms analog and digital. > > > Human (in general, vertebrate) visual processing is a dedicated > > hardwired digital system. It employs data reduction to abstract such > > features as motion, edges, and orientation of edges. It then forms a > > map in which position is crudely analog to the visual plane, but > > quantized. This map is sufficiently similar to maps used in image > > processing machines so that I can almost imagine how symbols could be > > generated from it. > > I am surprised that you state this with such confidence. In > particular, do you really think that vertebrate vision is well enough > understood functionally to draw such conclusions? ... Yes. See above. > ... And are you sure > that the current hardware and signal-analytic concepts from electrical > engineering are adequate to apply to what we do know of visual > neurobiology, rather than being prima facie metaphors? Not the hardware concepts. But I think some principles of information theory are independent of the medium. > > By the time it gets to perception, it is not invertible, except with > > respect to what is perceived. 
Noninvertibility is demonstrated in > > experiments in the identification of suspects. Witnesses can report > > what they perceive, but they don't always perceive enough to invert > > the perceived image and identify the object that gave rise to the > > perception.... > > .... If I am right, human intelligence itself relies on neither > > analog nor invertible symbol grounding, and therefore artificial > > intelligence does not require it. > > I cannot follow your argument at all. Inability to categorize and identify > is indeed evidence of a form of noninvertibility. But my theory never laid > claim to complete invertibility throughout..... First "analog" doesn't mean analog, and now "invertibility" doesn't mean complete invertibility. These arguments are getting too slippery for me. > .... Categorization and identification > itself *requires* selective non-invertibility: within-category differences > must be ignored and diminished, while between-category differences must > be selected and enhanced. Well, that's the point I've been making. If non-invertibility is essential to the way we process information, you can't say non-invertibility would prevent a machine from emulating us. Anybody can do hand-waving. To be convincing, abstract reasoning must be rigidly self-consistent. Harnad's is not. I haven't made any assertions as to what is possible. All I'm saying is that Harnad has come nowhere near proving his assertions, or even making clear what his assertions are. M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
harwood@cvl.UUCP (06/14/87)
In article <843@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes: > (... replying to Anders Weinstein ...who wonders "Where's the beef?" in Steve Harnad's conceptual and terminological salad ...; uh - let me be first to prophylactically remind us - lest there be any confusion and forfending that he should perforce of intellectual scruple must needs refer to his modest accomplishments - Steve Harnad is editor of Behavioral and Brain Sciences, and I am not, of course. We - all of us - enjoy reading such high-class stuff...;-) Anyway, Steve Harnad replies to A.W., re "Total Turing Tests", behavior, and the (great AI) "symbol grounding problem": >I think that this discussion has become repetitious, so I'm going to >have to cut down on the words. Praise the Lord - some insight - by itself, worthy of a Pass of the "Total Turing Test." >... Our disagreement is not substantive. >I am not a behaviorist. I am a methodological epiphenomenalist. I'm not a behaviorist, you're not a behaviorist, he's not a behaviorist too ... We are all methodological solipsists hereabouts on this planet, having already, incorrigibly, failed the "Total Turing Test" for genuine intergalactic First Class rational beings, but so what? (Please, Steve - this is NOT a test - I repeat - this is NOT a test of your philosophical intelligence. It is an ACTUAL ALERT of your common sense, not to mention your sense of humor. Please do not solicit BBS review of this thesis...) >... Apart from behavioral considerations, >there are also functional considerations: What kinds of internal >processes (e.g., symbolic and nonsymbolic) look as if they might work? >and why? and how? The grounding problem accordingly has functional aspects >too. What are the right kinds of causal connections to ground a >system? Yes, the test of successful grounding is the TTT, but that >still leaves you with the problem of which kinds of connections are >going to work. I've argued that top-down symbol systems hooked to >transducers won't, and that certain hybrid bottom-up systems might. All >these functional considerations concern how to ground symbols, they are >distinct from (though ultimately, of course, dependent on) behavioral >success, and they do have independent content. >-- > >Stevan Harnad (609) - 921 7771 >{bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad >harnad%mind@princeton.csnet harnad@mind.Princeton.EDU You know what is the real problem with your postings - it's what I would call "the symbol grounding problem". You want to say obvious things in the worst possible way, otherwise say abstract things in the worst possible way. And ignore what others say. Also, for purposes of controversial public discussion, ignore scientific 'facts' (e.g., about neurologic perceptual equivalence), and standard usage of scientific terminology and interpretation of theories. (Not that these are sacrosanct.) It seems to me that your particular "symbol grounding problem" is indeed the sine qua non of the Total Turing Test for "real" philosophers of human cognition. As I said, we are all methodological solipsists hereabouts. However, if you want AI funding from me - I want to see what real computing system, using your own architecture and object code of at least 1 megabyte, has been designed by you. Then we will see how your "symbols" are actually grounded, using the standard, naive but effective denotational semantics for the "symbols" of your intention, qua "methodological epiphenomenalist." David Harwood
marty1@houdi.UUCP (06/14/87)
In article <843@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes: > > Intentionality and consciousness are not equivalent to behavioral > capacity, but behavioral capacity is our only objective basis for > inferring that they are present. Apart from behavioral considerations, > there are also functional considerations: What kinds of internal > processes (e.g., symbolic and nonsymbolic) look as if they might work? > and why? and how? The grounding problem accordingly has functional aspects > too. What are the right kinds of causal connections to ground a > system? Yes, the test of successful grounding is the TTT, but that > still leaves you with the problem of which kinds of connections are > going to work. I've argued that top-down symbol systems hooked to > transducers won't, and that certain hybrid bottom-up systems might. All > these functional considerations concern how to ground symbols, they are > distinct from (though ultimately, of course, dependent on) behavioral > success, and they do have independent content. Harnad's terminology has proved unreliable: analog doesn't mean analog, invertible doesn't mean invertible, and so on. Maybe top-down doesn't mean top-down either. Suppose we create a visual transducer feeding into an image processing module that could delineate edges, detect motion, abstract shape, etc. This processor is to be built with a hard-wired capability to detect "objects" without necessarily finding symbols for them. Next let's create a symbol bank, consisting of a large storage area that can be partitioned into spaces for strings of alphanumeric characters, with associated pointers, frames, anything else you think will work to support a sophisticated knowledge base. The finite area means that memory will be limited, but human memory can't really be infinite, either. Next let's connect the two: any time the image processor finds an object, the machine makes up a symbol for it. When it finds another object, it makes up another symbol and links that symbol to the symbols for any other objects that are related to it in ways that it knows about (some of which might be hard-wired primitives): proximity in time or space, similar shape, etc. It also has to make up symbols for the relations it relies on to link objects. I'm over my head here, but I don't think I'm asking for anything we think is impossible. Basically, I'm looking for an expert system that learns. Now we decide whether we want to play a game, which is to make the machine seem human, or whether we want the machine to exhibit human behavior on the same basis as humans, that is, to survive. For the game, the essential step is to make the machine communicate with us both visually and verbally, so it can translate the character strings it made up into English, so we can understand it and it can understand us. For the survival motivation, the machine needs a full set of receptors and effectors, and an environment in which it can either survive or perish, and if we built it right it will learn English for its own reasons. It could also endanger our survival. Now, Harnad, Weinstein, anyone: do you think this could work, or do you think it could not work? M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
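A rough toy rendering of the proposal above is easy enough to write down (Python; the detection records, the distance threshold, and the relation names are invented stand-ins for whatever a hard-wired front end would actually deliver). It coins an arbitrary token for each detected "object" and links tokens through made-up relation symbols. Whether anything along these lines could scale up, acquire English, or survive in an environment is exactly the question being put.

import itertools

class SymbolBank:
    # A toy symbol bank: it coins an arbitrary token for each detected object
    # and links tokens by whatever relations the front end reports.
    def __init__(self):
        self._counter = itertools.count()
        self.symbols = {}             # token -> what the token was coined for
        self._relation_tokens = {}    # relation name -> its token
        self.links = []               # (token, relation-token, token) triples

    def new_symbol(self, record):
        token = "S%d" % next(self._counter)
        self.symbols[token] = record
        return token

    def link(self, a, relation, b):
        if relation not in self._relation_tokens:
            self._relation_tokens[relation] = self.new_symbol(relation)
        self.links.append((a, self._relation_tokens[relation], b))

# Stand-in for the hard-wired image processor: it emits shape/position records.
detections = [
    {"shape": "blob", "x": 10, "y": 12},
    {"shape": "blob", "x": 11, "y": 13},
    {"shape": "bar", "x": 90, "y": 40},
]

bank = SymbolBank()
tokens = [bank.new_symbol(d) for d in detections]

# Hard-wired primitive relations: nearness in space and sameness of shape.
for (ta, da), (tb, db) in itertools.combinations(zip(tokens, detections), 2):
    if abs(da["x"] - db["x"]) + abs(da["y"] - db["y"]) < 5:
        bank.link(ta, "NEAR", tb)
    if da["shape"] == db["shape"]:
        bank.link(ta, "SAME-SHAPE", tb)

print(bank.links)    # [('S0', 'S3', 'S1'), ('S0', 'S4', 'S1')]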
aweinste@diamond.bbn.com.UUCP (06/14/87)
In article <1163@houdi.UUCP> marty1@houdi.UUCP (M.BRILLIANT) writes: >> (A) The symbols in a purely symbolic system ... > >What exactly is this "purely" symbolic approach? What impure approach >might be necessary? "Purely symbolic" sounds like a straw man ... The phrase "purely symbolic" was just my short label for the AI strategy that Stevan Harnad has been criticizing. Yes this strategy *does* encompass the use of sensors and effectors and (maybe) motivations. Sorry if the term was misleading, I was only using it as pointer; consult Harnad's postings for a fuller characterization.
berleant@ut-sally.UUCP (06/15/87)
It is interesting that some (presumably significant) visual processing occurs by graded potentials without action potentials. Receptor cells (rods & cones), 'horizontal cells' which process the graded output of the receptors, and 'bipolar cells' which do further processing, use no action potentials to do it. This seems to indicate the significance of analog processing to vision. There may also be significant invertibility at these early stages of visual processing in the retina: One photon can cause several hundred sodium channels in a rod cell to close. Such sensitivity suggests a need for precise representation of visual stimuli which suggests the representation might be invertible. Furthermore, the retina cannot be viewed as a module, only loosely coupled to the brain. The optic nerve, which does the coupling, has a high bandwidth and thus carries much information simultaneously along many fibers. In fact, the optic nerve carries a topographic representation of the retina. To the degree that a topographic representation is an iconic representation, the brain thus receives an iconic representation of the visual field. Furthermore, even central processing of visual information is characterized by topographic representations. This suggests that iconic representations are important to the later stages of perceptual processing. Indeed, all of the sensory systems seem to rely on topographic representations (particularly touch and hearing as well as vision). An interesting example in hearing is direction perception. Direction seems to be, as I understand it, found by processing the difference in time from when a sound reaches one ear to when it reaches the other, in large part. The resulting direction is presumably an invertible representation of that time difference. Dan Berleant UUCP: {gatech,ucbvax,ihnp4,seismo,kpno,ctvax}!ut-sally!berleant ARPA: ai.berleant@r20.utexas.edu
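The direction-from-time-difference point lends itself to a small sketch (Python with NumPy; the head width, sample rate, and delay are invented, and whether the brain computes anything like an explicit cross-correlation is a further empirical question). The interaural delay is recovered by cross-correlating the two ears' signals and then converted to an angle by the far-field relation sin(angle) = delay x speed of sound / ear separation; the delay representation is invertible with respect to direction in just the sense under discussion.

import numpy as np

fs = 44100.0                 # sample rate (Hz)
speed_of_sound = 343.0       # m/s
ear_separation = 0.2         # m, a rough head width

# A broadband sound from off to one side: the right ear hears a delayed copy
# of what the left ear hears.
rng = np.random.default_rng(3)
sound = rng.normal(size=2048)
true_delay = 12              # samples
left = sound
right = np.concatenate([np.zeros(true_delay), sound[:-true_delay]])

# Estimate the interaural time difference by cross-correlation over a range
# of candidate lags, then convert the best lag to an angle.
lags = np.arange(-64, 65)
mid_left = left[64:-64]
corr = [float(np.dot(mid_left, right[64 + lag:len(right) - 64 + lag])) for lag in lags]
best_lag = int(lags[int(np.argmax(corr))])
itd = best_lag / fs
angle = np.degrees(np.arcsin(np.clip(itd * speed_of_sound / ear_separation, -1, 1)))
print("estimated delay:", best_lag, "samples; angle:", round(angle, 1), "degrees")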
harnad@mind.UUCP (06/15/87)
In two consecutive postings marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel wrote: > the flow of visual information through the layers of the retina, > and through the layers of the visual cortex, with motion detection, > edge detection, orientation detection, etc., all going on in specific > neurons... Maybe a neurobiologist can give a good account of what > all that means, so we can guess whether computer image > processing could emulate it. As I indicated the last time, neurobiologists don't *know* what all those findings mean. It is not known how features are detected and by what. The idea that single cells are doing the detecting is just a theory fragment, and one that has currently fallen on hard times. Rivals include distributed networks (of which the cell is just a component), or spatial frequency detectors, or coding at some entirely different level, such as continuous postsynaptic potentials, local circuits, architectonic columns or neurochemistry. Some even think that the multiple analog retinas at various levels of the visual system (12 on each side, at last count) may have something to do with feature extraction. One cannot just take current neurophysiological data and replace the nonexistent theory by preconceptions from machine vision -- especially not by way of justifying the machine-theoretic concepts. >> >[SH:] my theory never laid claim to complete invertibility >> >throughout. > > First "analog" doesn't mean analog, and now "invertibility" > doesn't mean complete invertibility. These arguments are > getting too slippery for me... If non-invertibility is essential > to the way we process information, you can't say non-invertibility > would prevent a machine from emulating us. I have no idea what proposition you think you were debating here. I had pointed out a problem with the top-down symbolic approach to mind-modeling -- the symbol grounding problem -- which suggested that symbolic representations would have to be grounded in nonsymbolic representations. I had also sketched a model for categorization that attempted to ground symbolic representations in two nonsymbolic kinds of representations -- iconic (analog) representations and categorical (feature-filtered) representations. I also proposed a criterion for analog transformations -- invertibility. I never said that categorical representations were invertible or that iconic representations were the only nonsymbolic representations you needed to ground symbols. Indeed, most of the CP book under discussion concerns categorical representations. > All I'm saying is that Harnad has come nowhere near proving his > assertions, or even making clear what his assertions are... > Harnad's terminology has proved unreliable: analog doesn't mean > analog, invertible doesn't mean invertible, and so on. Maybe > top-down doesn't mean top-down either... > Anybody can do hand-waving. To be convincing, abstract > reasoning must be rigidly self-consistent. Harnad's is not. > I haven't made any assertions as to what is possible. Invertibility is my candidate criterion for an analog transform. Invertible means invertible, top-down means top-down. Where further clarification is needed, all one need do is ask. Now here is M. B. Brilliant's "Recipe for a symbol-grounder" (not to be confused with an assertion as to what is possible): > Suppose we create a visual transducer... with hard-wired > capability to detect "objects"... Next let's create a symbol bank > Next let's connect the two... 
I'm over my head here, but I don't > think I'm asking for anything we think is impossible. Basically, > I'm looking for an expert system that learns... the essential step > is to make the machine communicate with us both visually and verbally, > so it can translate the character strings it made up into English, so > we can understand it and it can understand us. For the survival > motivation, the machine needs a full set of receptors and > effectors, and an environment in which it can either survive or > perish, and if we built it right it will learn English for its > own reasons. Now, Harnad, Weinstein, anyone: do you think this > could work, or do you think it could not work? Sounds like a conjecture about a system that would pass the TTT. Unfortunately, the rest seems far too vague and hypothetical to respond to. If you want me to pay attention to further postings of yours, stay temperate and respectful as I endeavor to do. Dismissive rhetoric will not convince anyone, and will not elicit substantive discussion. -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
berleant@ut-sally.UUCP (06/15/87)
In article <835@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes: >We cannot objectively verify intrinsic meaningfulness. The Turing test >is the only available criterion. Yes, the Turing test is by definition subjective, and also subject to variable results from hour to hour even from the same judge. But I think I disagree that intrinsic meaningfulness cannot be objectively verified. What about the model theory of logic? Dan Berleant UUCP: {gatech,ucbvax,ihnp4,seismo,kpno,ctvax}!ut-sally!berleant ARPA: ai.berleant@r20.utexas.edu
harnad@mind.UUCP (06/15/87)
berleant@ut-sally.UUCP (Dan Berleant) of U. Texas CS Dept., Austin, Texas has posted this welcome reminder: > the retina cannot be viewed as a module, only loosely > coupled to the brain. The optic nerve, which does the coupling, has a > high bandwidth and thus carries much information simultaneously along > many fibers. In fact, the optic nerve carries a topographic > representation of the retina. To the degree that a topographic > representation is an iconic representation, the brain thus receives an > iconic representation of the visual field. > Furthermore, even central processing of visual information is > characterized by topographic representations. This suggests that iconic > representations are important to the later stages of perceptual > processing. Indeed, all of the sensory systems seem to rely on > topographic representations (particularly touch and hearing as well as > vision). As I mentioned in my last posting, at last count there were 12 pairs of successively higher analog retinas in the visual system. No one yet knows what function they perform, but they certainly suggest that it is premature to dismiss the importance of analog representations in at least one well-optimized system... > Yes, the Turing test is by definition subjective, and also subject to > variable results from hour to hour even from the same judge. > But I think I disagree that intrinsic meaningfulness cannot be > objectively verified. What about the model theory of logic? In earlier postings I distinguished between two components of the Turing Test. One is the formal, objective one: Getting a system to generate all of our behavioral capacities. The second is the informal, intuitive (and hence subjective) one: Can a person tell such a device apart from a person? This version must be open-ended, and is no better or worse than -- in fact, I argue it is identical to -- the real-life turing-testing we do of one another in contending with the "other minds" problem. The subjective verification of intrinsic meaning, however, is not done by means of the informal turing test. It is done from the first-person point of view. Each of us knows that his symbols (his linguistic ones, at any rate) are grounded, and refer to objects, rather than being meaningless syntactic objects manipulated on the basis of their shapes. I am not a model theorist, so the following reply may be inadequate, but it seems to me that the semantic model for an uninterpreted formal system in formal model-theoretic semantics is always yet another formal object, only its symbols are of a different type from the symbols of the system that is being interpreted. That seems true of *formal* models. Of course, there are informal models, in which the intended interpretation of a formal system corresponds to conceptual or even physical objects. We can say that the intended interpretation of the primitive symbol tokens and the axioms of formal number theory is "numbers," by which we mean either our intuitive concept of numbers or whatever invariant physical property quantities of objects share. But such informal interpretations are not what formal model theory trades in. As far as I can tell, formal models are not intrinsically grounded, but depend on our concepts and our linking them to real objects. And of course the intrinsic grounding of our concepts and our references to objects is what we are attempting to capture in confronting the symbol grounding problem. I hope model theorists will correct me if I'm wrong. 
But even if the model-theoretic interpretation of some formal symbol systems can truly be regarded as the "objects" to which they refer, it is not clear that this can be generalized to natural language or to the "language of thought," which must, after all, have Total-Turing-Test scope, rather than the scope of the circumscribed artificial languages of logic and mathematics. Is there any indication that all that can be formalized model-theoretically? -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
harnad@mind.UUCP (06/15/87)
Ken Laws <Laws@Stripe.SRI.Com> on ailist@Stripe.SRI.Com writes: > Consider a "hash transformation" that maps a set of "intuitively > meaningful" numeric symbols to a set of seemingly random binary codes. > Suppose that the transformation can be computed by some [horrendous] > information-preserving mapping of the reals to the reals. Now, the > hash function satisfies my notion of an analog transformation (in the > signal-processing sense). When applied to my discrete input set, > however, the mapping does not seem to be analog (in the sense of > preserving isomorphic relationships between pairs -- or higher > orders -- of symbolic codes). Since information has not been lost, > however, it should be possible to define "relational functions" that > are analogous to "adjacency" and other properties in the original > domain. Once this is done, surely the binary codes must be viewed > as isomorphic to the original symbols rather than just "standing for > them". I don't think I disagree with this. Don't forget that I bit the bullet on some surprising consequences of taking my invertibility criterion for an analog transform seriously. As long as the requisite information-preserving mapping or "relational function" is in the head of the human interpreter, you do not have an invertible (hence analog) transformation. But as soon as the inverse function is wired in physically, producing a dedicated invertible transformation, you do have invertibility, even if a lot of the stuff in between is as discrete, digital and binary as it can be. I'm not unaware of this counterintuitive property of the invertibility criterion -- or even of the possibility that it may ultimately do it in as an attempt to capture the essential feature of an analog transform in general. Invertibility could fail to capture the standard A/D distinction, but may be important in the special case of mind-modeling. Or it could turn out not to be useful at all. (Although Ken Laws's point seems to strengthen rather than weaken my criterion, unless I've misunderstood.) Note, however, that what I've said about the grounding problem and the role of nonsymbolic representations (analog and categorical) would stand independently of my particular criterion for analog; substituting a more standard one leaves just about all of the argument intact. Some of the prior commentators (not Ken Laws) haven't noticed that, criticizing invertibility as a criterion for analog and thinking that they were criticizing the symbol grounding problem. > The "information" in a signal is a function of your methods for > extracting and interpreting the information. Likewise the "analog > nature" of an information-preserving transformation is a function > of your methods for decoding the analog relationships. I completely agree. But to get the requisite causality I'm looking for, the information must be interpretation-independent. Physical invertibility seems to give you that, even if it's generated by hardwiring the encryption/decryption (encoding/decoding) scheme underlying the interpretation into a dedicated system. > Perhaps [information theorists] have too limited (or general!) > a view of information, but they have certainly considered your > problem of decoding signal shape (as opposed to detecting modulation > patterns)... I am sure that methods for decoding both discrete and > continuous information in continuous signals are well studied. I would be interested to hear from those who are familiar with such work. 
It may be that some of it is relevant to cognitive and neural modeling and even the symbol grounding problems under discussion here. -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
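To make the dedicated-invertibility point above concrete, here is a toy sketch in Python (every name, code and value is invented purely for illustration): so long as the decoding table exists only in an interpreter's head, the output is just an arbitrary code; once the inverse is wired into the same dedicated system, the transformation is physically invertible, however discrete the intermediate code may be.

    # Toy "dedicated" transducer: both the encoding and its inverse are part
    # of the device, so the transformation is physically invertible even
    # though the intermediate code is as discrete and binary as can be.
    ENCODE = {0.0: 0b101, 0.5: 0b011, 1.0: 0b110}              # hypothetical hash-like code
    DECODE = {code: value for value, code in ENCODE.items()}   # inverse, wired in

    def transduce(x):
        return ENCODE[x]

    def invert(code):
        return DECODE[code]

    assert all(invert(transduce(x)) == x for x in ENCODE)      # round trip recovers the input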
marty1@houdi.UUCP (M.BRILLIANT) (06/16/87)
In article <849@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes: > .... Invertibility could fail to capture the standard A/D distinction, > but may be important in the special case of mind-modeling. Or it could > turn out not to be useful at all.... So what do you think is essential: (A) literally analog transformation, (B) invertibility, or (C) preservation of significant relational functions? > ..... what I've said about the grounding problem and the role > of nonsymbolic representations (analog and categorical) would stand > independently of my particular criterion for analog; substituting a more > standard one leaves just about all of the argument intact..... Where does that argument stand now? Can we restate it in terms whose definitions we all agree on? > ..... to get the requisite causality I'm looking > for, the information must be interpretation-independent. Physical > invertibility seems to give you that...... I think invertibility is too strong. It is sufficient, but not necessary, for human-style information-processing. Real people forget awesome amounts of detail, we misunderstand each other (our symbol groundings are not fully invertible), and we thereby achieve levels of communication that often, but not always, satisfy us. Do you still say we only need transformations that are analog (invertible) with respect to those features for which they are analog (invertible)? That amounts to limited invertibility, and the next essential step would be to identify the features that need invertibility, as distinct from those that can be thrown away. > Ken Laws <Laws@Stripe.SRI.Com> on ailist@Stripe.SRI.Com writes: > > ... I am sure that methods for decoding both discrete and > > continuous information in continuous signals are well studied. > > I would be interested to hear from those who are familiar with such work. > It may be that some of it is relevant to cognitive and neural modeling > and even the symbol grounding problems under discussion here. I'm not up to date on these methods. But if you want to get responses from experts, it might be well to be more specific. For monaural sound, decoding can be done with Fourier methods that are in principle continuous. For monocular vision, Fourier methods are used for image enhancement to aid in human decoding, but I think machine decoding depends on making the spatial dimensions discontinuous and comparing the content of adjacent cells. M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
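As a small illustration of the kind of decoding mentioned above (a sketch assuming numpy is available; the signal values are made up): the discrete Fourier transform is information-preserving, so the original samples are recoverable to within rounding -- "invertible" in the sense at issue in this thread.

    # The DFT and its inverse form an invertible pair: nothing is lost in
    # going to the frequency domain and back.
    import numpy as np

    signal = np.array([0.0, 1.0, 0.5, -0.3, 0.8, 0.0])
    spectrum = np.fft.fft(signal)            # transform to the frequency domain
    recovered = np.fft.ifft(spectrum).real   # the inverse transform

    assert np.allclose(signal, recovered)    # round trip succeeds to within rounding error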
aweinste@Diamond.BBN.COM (Anders Weinstein) (06/17/87)
In article <849@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes: > As long as the requisite >information-preserving mapping or "relational function" is in the head >of the human interpreter, you do not have an invertible (hence analog) >transformation. But as soon as the inverse function is wired in >physically, producing a dedicated invertible transformation, you do >have invertibility, ... This seems to relate to a distinction between "physical invertibility" and plain old invertibility, another of your points which I haven't understood. I don't see any difference between "physical" and "merely theoretical" invertibility. If a particular physical transformation of a signal is invertible in theory, then I'd imagine we could always build a device to perform the actual inversion if we wanted to. Such a device would of course be a physical device; hence the invertibility would seem to count as "physical," at least in the sense of "physically possible". Surely you don't mean that a transformation-inversion capability must actually be present in the device for it to count as "analog" in your sense. (Else brains, for example, wouldn't count). So what difference are you trying to capture with this distinction? Anders Weinstein BBN Labs
harnad@mind.UUCP (Stevan Harnad) (06/17/87)
marty1@houdi.UUCP (M.BRILLIANT) asks: > what do you think is essential: (A) literally analog transformation, > (B) invertibility, or (C) preservation of significant relational > functions? Essential for what? For (i) generating the pairwise same/different judgments, similarity judgments and matching that I've called, collectively, "discrimination", and for which I've hypothesized that there are iconic ("analog") representations? For that I think invertibility is essential. (I think that in most real cases what is actually physically invertible in my sense will also turn out to be "literally analog" in a more standard sense. Dedicated digital equivalents that would also have yielded invertibility will be like a Rube-Goldberg alternative; they will have a much bigger processing cost. But for my purposes, the dedicated digital equivalent would in principle serve just as well. Don't forget the *dedicated* constraint though.) For (ii) generating the reliable sorting and labeling of objects on the basis of their sensory projections, which I've called, collectively, "identification" or "categorization"? For that I think only distinctive features need to be extracted from the sensory projection. The rest need not be invertible. Iconic representations are one-to-one with the sensory projection; categorical representations are many-to-few. But if you're not talking about sensory discrimination or about stimulus categorization but about, say, (iii) conscious problem-solving, deduction, or linguistic description, then relation-preserving symbolic representations would be optimal -- only the ones I advocate would not be autonomous (modular). The atomic terms of which they were composed would be the labels of categories in the above sense, and hence they would be grounded in and constrained by the nonsymbolic representations. They would preserve relations not just in virtue of their syntactic form, as mediated by an interpretation; their meanings would be "fixed" by their causal connections with the nonsymbolic representations that ground their atoms. But if your question concerns what I think is necessary to pass the Total Turing Test (TTT), I think you need all of (i) - (iii), grounded bottom-up in the way I've described. > Where does [the symbol grounding] argument stand now? Can we > restate it in terms whose definitions we all agree on? The symbols of an autonomous symbol-manipulating module are ungrounded. Their "meanings" depend on the mediation of human interpretation. If an attempt is made to "ground" them merely by linking the symbolic module with input/output modules in a dedicated system, all you will ever get is toy models: Small, nonrepresentative, nongeneralizable pieces of intelligent performance (a valid objective for AI, by the way, but not for cognitive modeling). This is only a conjecture, however, based on current toy performance models and the kind of thing it takes to make them work. If a top-down symbolic module linked to peripherals could successfully pass the TTT that way, however, nothing would be left of the symbol grounding problem. My own alternative has to do with the way symbolic models work (and don't work). The hypothesis is that a hybrid symbolic/nonsymbolic model along the lines sketched above will be needed in order to pass the TTT. 
It will require a bottom-up, nonmodular grounding of its symbolic representations in nonsymbolic representations: iconic ( = invertible with the sensory projection) and categorical ( = invertible only with the invariant features of category members that are preserved in the sensory projection and are sufficient to guide reliable categorization). > I think invertibility is too strong. It is sufficient, but not > necessary, for human-style information-processing. Real people > forget... misunderstand... I think this is not the relevant form of evidence bearing on this question. Sure we forget, etc., but the question concerns what it takes to get it right when we actually do get it right. How do we discriminate, categorize, identify and describe things as well as we do (TTT-level) based on the sensory data we get? And I have to remind you again: categorization involves at least as much selective *non*invertibility as it does invertibility. Invertibility is needed where it's needed; it's not needed everywhere, indeed it may even be a handicap (see Luria's "Mind of a Mnemonist," which is about a person who seems to have had such vivid, accurate and persisting eidetic imagery that he couldn't selectively ignore or forget sensory details, and hence had great difficulty categorizing, abstracting and generalizing; Borges describes a similar case in "Funes the Memorious," and I discuss the problem in "Metaphor and Mental Duality," a chapter in Simon & Sholes' (eds.) "Language, Mind and Brain," Academic Press 1978). > Do you still say [1] we only need transformations that are analog > (invertible) with respect to those features for which they are analog > (invertible)? That amounts to limited invertibility, and the next > essential step would be [2] to identify the features that need > invertibility, as distinct from those that can be thrown away. Yes, I still say [1]. And yes, the category induction problem is [2]. Perhaps with the three-level division-of-labor I've described a connectionist algorithm or some other inductive mechanism would be able to find the invariant features that will subserve a sensory categorization from a given sample of confusable alternatives. That's the categorical representation. -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
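A toy sketch of the three kinds of representation just described may help fix ideas (Python; the input values, the threshold feature and the labels are all invented, and nothing is claimed about how the invariant features are actually found):

    # Iconic -> categorical -> symbolic, in caricature.
    sensory_projection = [0.91, 0.12, 0.85]        # made-up proximal input

    # (1) Iconic representation: one-to-one with the projection (invertible).
    iconic = list(sensory_projection)

    # (2) Categorical representation: many-to-few; keep only a (hypothetical)
    #     invariant feature sufficient to sort this input among confusable
    #     alternatives, and throw the rest away.
    categorical = [x > 0.5 for x in iconic]

    # (3) Symbolic representation: the category label, now free to combine
    #     with other grounded labels in descriptions.
    label = "bright" if sum(categorical) >= 2 else "dim"
    description = ("object", label)                # e.g. "bright object"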
aweinste@Diamond.BBN.COM (Anders Weinstein) (06/18/87)
In article <861@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes: > The atomic terms of which they were >composed would be the labels of categories in the above sense, and hence they >would be grounded in and constrained by the nonsymbolic representations. >They would preserve relations not just in virtue of their syntactic >form, as mediated by an interpretation; their meanings would be "fixed" >by their causal connections with the nonsymbolic representations that >ground their atoms. I don't know how significant this is for your theory, but I think it's worth emphasizing that the *semantic* meaning of a symbol is still left largely unconstrained even after you take account of its "grounding" in perceptual categorization. This is because what matters for intentional content is not the objective property in the world that's being detected, but rather how the subject *conceives* of that external property, a far more slippery notion. This point is emphasized in a different context in the Churchlands' BBS reply to Dretske's "Knowledge and the Flow of Information." To paraphrase one of their examples: primitive people may be able to reliably categorize certain large-scale atmospheric electrical discharges; nevertheless, the semantic content of their corresponding states might be "Angry gods nearby" or some such. Indeed, by varying their factual beliefs we could invent cases where the semantic content of these states is just about anything you please. Semantic content is a holistic matter. Another well-known obstacle to moving from an objective to an intentional description is that the latter contains an essentially normative component, in that we must make some distinction between correct and erroneous classification. For example, we'd probably like to say that a frog has a fly-detector which is sometimes wrong, rather than a "moving-spot-against-a-fixed-background" detector which is infallible. Again, this distinction seems to depend on fuzzy considerations about the purpose or functional role of the concept in question. Some of the things you say also suggest that you're attempting to resuscitate a form of classical empiricist sensory atomism, where the "atomic" symbols refer to sensory categories acquired "by acquaintance" and the meaning of complex symbols is built up from the atoms "by description". This approach has an honorable history in philosophy; unfortunately, no one has ever been able to make it work. In addition to the above considerations, the main problems seem to be: first, that no principled distinction can be made between the simple sensory concepts and the complex "theoretical" ones; and second, that very little that is interesting can be explicitly defined in sensory terms (try, for example, "chair"). I realize the above considerations may not be relevant to your program -- I just can't tell to what extent you expect it to shed any light on the problem of explaining semantic content in naturalistic terms. In any case, I think it's important to understand why this fundamental problem remains largely untouched by such theories. Anders Weinstein BBN Labs
marty1@houdi.UUCP (M.BRILLIANT) (06/20/87)
In article <861@mind.UUCP>, harnad@mind.UUCP writes: > marty1@houdi.UUCP (M.BRILLIANT) asks: > > > what do you think is essential: (A) literally analog transformation, > > (B) invertibility, or (C) preservation of significant relational > > functions? > Let me see if I can correctly rephrase his answer: (i) "discrimination" (pairwise same/different judgments) he associates with iconic ("analog") representations, which he says have to be invertible, and will ordinarily be really analog because "dedicated" digital equivalents will be too complex. (ii) for "identification" or "categorization" (sorting and labeling of objects), he says only distinctive features need be extracted from the sensory projection; this process is not invertible. (iii) for "conscious problem-solving," etc., he says relation-preserving symbolic representations would be optimal, if they are not "autonomous (modular)" but rather are grounded by deriving their atomic symbols through the categorization process above. (iv) to pass the Total Turing Test he wants all of the above, tied together in the sequence described. I agree with this formulation in most of its terms. But some of the terms are confusing, in that if I accept what I think are good definitions, I don't entirely agree with the statements above. "Invertible/Analog": The property of invertibility is easy to visualize for continuous functions. First, continuous functions are what I would call "analog" transformations. They are at least locally image-forming (iconic). Then, saying a continuous transformation is invertible, or one-to-one, means it is monotonic, like a linear transformation, rather than many-to-one like a parabolic transformation. That is, it is unambiguously iconic. It might be argued that physical sensors can be ambiguously iconic, e.g., an object seen in a half-silvered mirror. Harnad would argue that the ambiguity is inherent in the physical scene, and is not dependent on the sensor. I would agree with that if no human sensory system ever gave ambiguous imaging of unambiguous objects. What about the ambiguity of stereophonic location of sound sources? In that case the imaging (i) is unambiguous; only the perception (ii) is ambiguous. But physical sensors are also noisy. In mathematical terms, that noise could be modeled as discontinuity, as many-to-one, as one-to-many, or combinations of these. The noisy transformation is not invertible. But a "physically analog" sensory process (as distinct from a digital one) can be approximately modeled (to within the noise) by a continuous transformation. The continuous approximation allows us to regard the analog transformation as image-forming (iconic). But only the continuous approximation is invertible. "Autonomous/Modular": The definition of "modular" is not clear to me. I have Harnad's definition "not analogous to a top-down, autonomous symbol-crunching module ... hardwired to peripheral modules." The terms in the definition need defining themselves, and I think there are too many of them. I would rather look at the "hybrid" three-layer system and say it does not have a "symbol-cruncher hardwired to peripheral modules" because there is a feature extractor (and classifier) in between. The main point is the presence or absence of the feature extractor. The symbol-grounding problem arises because the symbols are discrete, and therefore have to be associated with discrete objects or classes. Without the feature extractor, there would be no way to derive discrete objects from the sensory inputs. 
The feature extractor obviates the symbol-grounding problem. I consider the "symbol-cruncher hardwired to peripheral modules" to be not only a straw man but a dead horse. M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
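The point about continuous transformations can be put in a few lines (a sketch; the particular functions are arbitrary examples): a monotonic map is one-to-one and hence invertible, a parabolic map collapses distinct inputs and hence is not, and added noise leaves only approximate invertibility.

    import random

    def monotonic(x):
        return 2.0 * x + 1.0                 # one-to-one: invertible

    def parabolic(x):
        return x * x                         # many-to-one: -2 and 2 collide

    def noisy(x):
        return monotonic(x) + random.gauss(0.0, 0.01)   # only approximately invertible

    assert monotonic(-2.0) != monotonic(2.0)   # distinct inputs stay distinct
    assert parabolic(-2.0) == parabolic(2.0)   # information lost: cannot invert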
marty1@houdi.UUCP (M.BRILLIANT) (06/22/87)
In article <6670@diamond.BBN.COM>, aweinste@Diamond.BBN.COM (Anders Weinstein) writes, with reference to article <861@mind.UUCP> harnad@mind.UUCP (Stevan Harnad): > > Some of the things you say also suggest that you're attempting to resuscitate > a form of classical empricist sensory atomism, where the "atomic" symbols > refer to sensory categories acquired "by acquaintance" and the meaning of > complex symbols is built up from the atoms "by description". This approach > has an honorable history in philsophy; unfortunately, no one has ever been > able to make it work. In addition to the above considerations, the main > problems seem to be: first, that no principled distinction can be made > between the simple sensory concepts and the complex "theoretical" ones; and > second, that very little that is interesting can be explicitly defined in > sensory terms (try, for example, "chair"). > I hope none of us are really trying to resuscitate classical philosophies, because the object of this discussion is to learn how to use modern technologies. To define an interesting object in sensory terms requires an intermediary module between the sensory system and the symbolic system. With a chair in the visual sensory field, the system will use hard-coded nonlinear (decision-making) techniques to identify boundaries and shapes of objects, and identify the properties that are invariant to rotation and translation. A plain wooden chair and an overstuffed chair will be different objects in these terms. But the system might also learn to identify certain types of objects that move, i.e., those we call people. If it notices that people assume the same position in association with both chair-objects, it could decide to use the same category for both. The key to this kind of classification is that the chair is not defined in explicit sensory terms but in terms of filtered sensory input. M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1 P.S. Sorry for the double posting of my previous article.
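One way to picture "filtered sensory input" is a feature extractor that discards position and keeps shape. A minimal sketch (the point sets and the distance feature are invented; real vision is nothing this simple): sorted pairwise distances are unchanged by translation and rotation, so two views of the same shape in different places get the same filtered description.

    from itertools import combinations

    def shape_features(points):
        # translation- and rotation-invariant summary of a point set
        return sorted(((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
                      for (ax, ay), (bx, by) in combinations(points, 2))

    chair_here  = [(0, 0), (0, 2), (1, 0)]     # a toy "object" at one location
    chair_there = [(5, 7), (5, 9), (6, 7)]     # the same shape, translated

    assert shape_features(chair_here) == shape_features(chair_there)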
berleant@ut-sally.UUCP (06/22/87)
In article <6670@diamond.BBN.COM> aweinste@Diamond.BBN.COM (Anders Weinstein) writes: >Another well-known obstacle to moving from an objective to an intentional >description is that the latter contains an essentially normative component, >in that we must make some distinction between correct and erroneous >classification. For example, we'd probably like to say that a frog has a >fly-detector which is sometimes wrong, rather than a "moving-spot-against-a- >fixed-background" detector which is infallible. Again, this distinction seems >to depend on fuzzy considerations about the purpose or functional role of the >concept in question. An intriguing example! Maybe it's intrinsic to the fact that inference takes place? The world is fuzzy enough that logical deduction is not going to work infallibly, so every time there is inference, as in categorizing=identifying=classifying a percept, errors in the results are guaranteed some of the time. Thus errors may be expected to occur anywhere in the path from perceptual icon to concept symbol, rather than, say, only at the point on this path where purposes or functions come into play. >Some of the things you say also suggest that you're attempting to resuscitate >a form of classical empiricist sensory atomism, where the "atomic" symbols >refer to sensory categories acquired "by acquaintance" and the meaning of >complex symbols is built up from the atoms "by description". This approach >has an honorable history in philosophy; unfortunately, no one has ever been >able to make it work. Regardless of the form of the "atomic" symbols, complex concepts are built from simpler ones. Conceptual combination allows us to go from 'tomato' and 'juice' to 'tomato juice'. I assume there is no argument that this new category may be acquired, sight unseen, by symbolic processing. Presumably there must be atomic=primitive concepts, however, and where do these come from? It must be by a process different from that usable to acquire the concept 'tomato juice'. What are the alternatives to acquisition "by acquaintance" (I'm not familiar with the term "by acquaintance")? Also, what is meant by the contention that no one has been able to make it work? There is research in AI on 'learning from examples', which is suitable for 'primitive' concepts (ones described by properties rather than other concepts). There is also research, e.g. in the computational linguistics field, on conceptual combination (tomato + juice = tomato juice). I know of no system that does both, nor of any reason why we should ask for one. Dan Berleant UUCP: {gatech,ucbvax,ihnp4,seismo...& etc.}!ut-sally!berleant ARPA: ai.berleant@r20.utexas.edu
berleant@ut-sally.UUCP (Dan Berleant) (06/22/87)
In article <861@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes: >My own alternative has to do with the way symbolic models work (and >don't work). The hypothesis is that a hybrid symbolic/nonsymbolic >model along the lines sketched above will be needed in order to pass >the TTT. It will require a bottom-up, nonmodular grounding of its >symbolic representations in nonsymbolic representations: iconic >( = invertible with the sensory projection) and categorical ( = invertible >only with the invariant features of category members that are preserved >in the sensory projection and are sufficient to guide reliable >categorization). Are you saying that the categorical representations are to be nonsymbolic? The review of human concept representation I recently read (Smith and Medin, Categories and Concepts, 1981) came down so hard on the holistic theory of concept representation that I must admit I skipped the chapter. I remember some statement like 'this approach is easy to criticize' and skipped the rest. The alternative nonsymbolic approach would be the 'dimensional' one. It seems a strongish statement to say that this would be sufficient, to the exclusion of symbolic properties. For example, if one were to say that a property of the concept "human" is "breathes", insisting on a continuous dimension for this property might be iffy. But it might be correct. An iron lung would be lower on the scale than ordinary breathing. Rate and depth of respiration would be a factor -- a rate and depth typical of a mouse would count against identifying X as human. Visible heaving would be higher on the scale than a lack of firm observation. Any opinions out there?? However, the metric hypothesis -- that a concept is sufficiently characterized by a point in a multi-dimensional space -- seems wrong, as experiments have shown. I'm uncomfortable about another aspect of the quoted paragraph. To discuss "invariant features... sufficient to guide reliable categorization" sounds like the "classical" theory (as Smith & Medin call it) of concept representation: Concepts are represented as necessary and sufficient features (i.e., there are defining features, i.e. there is a boolean conjunction of predicates for a concept). This approach has serious problems, not the least of which is the inability of humans to describe these features for seemingly elementary concepts, like "chair", as Weinstein and others point out. I contend that a boolean function (including ORs as well as ANDs) could work, but that is not what was mentioned. An example might be helpful: A vehicle must have a steering wheel OR handlebars. But to remove the OR by saying, a vehicle must have a means of steering, is to rely on a feature which is symbolic, high level, functional, which I gather we are not allowing. Dan Berleant UUCP: {gatech,ucbvax,ihnp4,seismo...& etc.}!ut-sally!berleant ARPA: ai.berleant@r20.utexas.edro
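The disjunctive example above is easy to state as a boolean function over features (a sketch; the feature names are just those from the example):

    # "A vehicle must have a steering wheel OR handlebars" -- a disjunctive
    # feature test is as legitimate as a conjunctive one.
    def could_be_vehicle(features):
        return features.get("steering wheel", False) or features.get("handlebars", False)

    could_be_vehicle({"handlebars": True})        # -> True
    could_be_vehicle({"pedals": True})            # -> False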
berleant@ut-sally.UUCP (Dan Berleant) (06/22/87)
In article <847@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes: (Concerning the derivability of intrinsic meaning via the model theory of logic) >I am not a model theorist, so the following reply may be inadequate, but it >seems to me that the semantic model for an uninterpreted formal system >in formal model-theoretic semantics is always yet another formal >object, only its symbols are of a different type from the symbols of the >system that is being interpreted. That seems true of *formal* models. >Of course, there are informal models, in which the intended interpretation >of a formal system corresponds to conceptual or even physical objects. We >can say that the intended interpretation of the primitive symbol tokens >and the axioms of formal number theory are "numbers," by which we mean >either our intuitive concept of numbers or whatever invariant physical >property quantities of objects share. But such informal interpretations >are not what formal model theory trades in. As far as I can tell, >formal models are not intrinsically grounded, but depend on our >concepts and our linking them to real objects. And of course the >intrinsic grounding of our concepts and our references to objects is >what we are attempting to capture in confronting the symbol grounding >problem. > What I have in mind is this: The more statements you have (that you wish to be deemed correct), the more the possible meanings of the terms will be constrained. To illustrate, consider the statement FISH SWIM. Think of the terms FISH and SWIM as variables with no predetermined meaning -- so that FISH SWIM is just another way of writing A B. What variable bindings satisfy this? Well, many do. We could assign to the variable FISH, the meaning we normally assign to the word "fish", and to the variable SWIM the meaning we normally think of for the word "swim". We could also assign the English meaning of the word "mountains" to the variable FISH, and the English meaning of "erode" to the variable SWIM. Obviously, many assignments to the variables FISH and SWIM would work. Now consider the statement FISH LIVE, where FISH and LIVE are variables. Now there are two statements to be satisfied. The assignment to the variable LIVE restricts the possible assignments to the variable SWIM, since e.g. if LIVE is assigned the English meaning of "live", then SWIM can no longer have the meaning of the English "erode". Of course, we have many many statements in our minds that must be simultaneously satisfied, so the possible meanings that each word name can be assigned are correspondingly restricted. Could the restrictions be sufficient to require such a small amount of ambiguity that the word names could be said to have intrinsic meaning? footnote: This leaves unanswered the question of how the meanings themselves are grounded. Non-symbolically, seems to be the gist of the discussion, in which case logic would be useless for that task even in an 'in principle' capacity since the stuff of logic is symbols. >I hope model theorists will correct me if I'm wrong. But even if the >model-theoretic interpretation of some formal symbol systems can truly >be regarded as the "objects" to which it refers, it is not clear that >this can be generalized to natural language or to the "language of >thought," which must, after all, have Total-Turing-Test scope, rather >than the scope of the circumscribed artificial languages of logic and >mathematics. Is there any indication that all that can be formalized >model-theoretically? 
An interesting question regarding this is, just how much can model theory do to provide intrinsic meaning to (say) language? Nothing? Only in principle? Could it be practically useful? I'm taking the liberty of cross posting this followup to sci.math.symbolic and sci.philosophy.meta (in hopes of increased enlightenment!) Dan Berleant UUCP: {gatech,ucbvax,ihnp4,seismo...& etc.}!ut-sally!berleant ARPA: ai.berleant@r20.utexas.edu
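The pruning effect described above can be simulated by brute force (a toy sketch; the candidate meanings and the background "facts" are invented purely for illustration): each added statement shrinks the set of meaning-assignments that satisfy all the statements at once.

    nouns = ["fish", "mountains", "birds"]
    verbs = ["swim", "erode", "live", "fly"]

    # Invented background "world": which noun-verb pairings are true in it.
    world = {("fish", "swim"), ("fish", "live"), ("birds", "fly"),
             ("birds", "live"), ("mountains", "erode")}

    statements = [("FISH", "SWIM"), ("FISH", "LIVE")]   # the uninterpreted text

    def models(stmts):
        # every assignment of meanings to the variables that makes all the
        # statements come out true in the invented world
        found = []
        for f in nouns:
            for s in verbs:
                for lv in verbs:
                    binding = {"FISH": f, "SWIM": s, "LIVE": lv}
                    if all((binding[subj], binding[pred]) in world
                           for subj, pred in stmts):
                        found.append(binding)
        return found

    # Adding the second statement prunes the admissible interpretations:
    print(len(models(statements[:1])), len(models(statements)))   # 20 versus 9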
aweinste@Diamond.BBN.COM (Anders Weinstein) (06/22/87)
In article <8309@ut-sally.UUCP> berleant@ut-sally.UUCP (Dan Berleant) writes: >> For example, we'd probably like to say that a frog has a >>fly-detector which is sometimes wrong, rather than a "moving-spot-against-a- >>fixed-background" detector which is infallible. Again, this distinction seems >>to depend on fuzzy considerations about the purpose or functional role of the >>concept in question. > > The world is fuzzy enough that logical deduction is not >going to work infallibly, so every time there is inference, as in >categorizing=identifying=classifying a percept, errors in the results >are guaranteed some of the time. Thus errors may be expected to occur >anywhere in the path from perceptual icon to concept symbol, rather than, >say, only at the point on this path where purposes or functions come >into play. Sure, but the philosophical problem is to say why any response should count as an *error* at all. What makes it wrong? I.e. who decides which "concept" -- "fly" or "moving-spot..." -- the frog is trying to apply? The objective facts about the frog's perceptual abilities by themselves don't seem to tell you that in snapping out its tongue at a decoy, it's making a *mistake*. To say this, an outside interpreter has to make some judgement about what the frog's brain is trying to accomplish by its detection of moving spots. And this makes the determination of semantic descriptions a fuzzy matter. It's true that evolutionary considerations might take you some of the way towards defining an objective sense of "purpose" to use here, but I don't think they can eliminate the difficulty entirely, as the purpose of particular adaptations cannot always be expected to be clear or even fully determinate. >>Some of the things you say also suggest that you're attempting to resuscitate >>a form of classical empiricist sensory atomism ... unfortunately, no one has >>ever been able to make [this] work. > > Conceptual combination allows us to go >from 'tomato' and 'juice' to 'tomato juice'. I assume there is no >argument that this new category may be acquired, sight unseen, >by symbolic processing. Presumably there must be atomic=primitive >concepts, however, and where do these come from? It must be >by a process different from that usable to acquire the concept >'tomato juice'. What are the alternatives to acquisition "by >acquaintance" (I'm not familiar with the term "by acquaintance")? Also, >what is meant by the contention that no one has been able to make it >work? Of course *some* concepts can be acquired by definition. However, the "classical empiricist" doctrine is committed to the further idea that there is some privileged set of *purely sensory* concepts and that all non-sensory concepts can be defined in terms of this basis. This is what has never been shown to work. If you regard "juice" as a "primitive" concept, then you do not share the classical doctrine. (And if you do not, I invite you to try giving necessary and sufficient conditions for juicehood.) The phrases "knowledge by acquaintance" and "knowledge by description" come from Bertrand Russell, who thought at the time that we were "directly acquainted" with our sense-data.
adam@gec-mi-at.co.uk (Adam Quantrill) (06/23/87)
In article <6521@diamond.BBN.COM> aweinste@Diamond.BBN.COM (Anders Weinstein) writes: > >To elaborate: I presume the "symbol-grounding" problem is a *philosophical* >question: what gives formal symbols original intentionality? I suppose the >only answer anybody knows is, in brief, that the symbols must be playing a >certain role in what Dennett calls an "intentional system", that is, a system >which is capable of producing complex, adaptive behavior in a rational way. >[] >And this, as far as I can tell, is the end of what we learn from the "symbol >grounding" problem -- you've got to have sense organs. [] It seems to me that the Symbol Grounding problem is a red herring. If I took a partially self-learning program and data (P & D) that had learnt from a computer with 'sense organs', and ran it on a computer without, would the program's output become symbolically ungrounded? Similarly, if I myself wrote P & D without running it on a computer at all, where's the difference? Surely it is possible that I can come up with identical P & D by analysis. Does that make the original P & D running on the computer with 'sense organs' symbolically ungrounded? A computer can always interact via the keyboard & terminal screen (if those are the only 'sense organs'), grounding its internal symbols via people who react to the output, and provide further stimulus. -Adam. /* If at first it don't compile, kludge, kludge again.*/
harnad@mind.UUCP (Stevan Harnad) (06/26/87)
John Cugini <Cugini@icst-ecf.arpa> on ailist@stripe.sri.com writes: > What if there were a few-to-one transformation between the skin-level > sensors (remember Harnad proposes "skin-and-in" invertibility > as being necessary for grounding) and the (somewhat more internal) > iconic representation. My example was to suppose that #1: > a combination of both red and green retinal receptors and #2 a yellow > receptor BOTH generated the same iconic yellow. > Clearly this iconic representation is non-invertible back out to the > sensory surfaces, but intuitively it seems like it would be grounded > nonetheless - how about it? Invertibility is a necessary condition for iconic representation, not for grounding. Grounding symbolic representations (according to my hypothesis) requires both iconic and categorical representations. The latter are selective, many-to-few, invertible only in the features they pick out and, most important, APPROXIMATE (e.g., as between red-green and yellow in your example above). This point has by now come up several times... -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
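The red/green/yellow example can be written out directly (a toy mapping with invented state names): two distinct receptor states collapse onto one label, so the step is many-to-few and cannot be inverted back out to the sensory surface -- which is just what a categorical representation is supposed to do.

    # Many-to-few: the label alone does not determine the receptor state.
    categorize = {"red+green receptors": "yellow", "yellow receptor": "yellow"}

    inverse = {}
    for receptor_state, label in categorize.items():
        inverse.setdefault(label, []).append(receptor_state)

    assert len(inverse["yellow"]) == 2   # two candidate preimages: not invertible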
marty1@houdi.UUCP (M.BRILLIANT) (06/26/87)
In article <914@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes: > Invertibility is a necessary condition for iconic representation, not > for grounding. Grounding symbolic representations (according to my > hypothesis) requires both iconic and categorical representations... Syllogism: (a) grounding ... requires ... iconic ... representation.... (b) invertibility is ... necessary ... for iconic representation. (c) hence, grounding must require invertibility. Why then does harnad say "invertibility is a necessary condition for ..., NOT for grounding" (caps mine, of course)? This discussion is getting hard to follow. Does it have to be carried on simultaneously in both comp.ai and comp.cog-eng? Could harnad, who seems to be the major participant, pick one? M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
harnad@mind.UUCP (Stevan Harnad) (06/26/87)
berleant@ut-sally.UUCP (Dan Berleant) of U. Texas CS Dept., Austin, Texas writes: > Are you saying that the categorical representations are to be > nonsymbolic? The review of human concept representation I recently read > (Smith and Medin, Categories and Concepts, 1981) came down... hard on > the holistic theory of concept representation... The alternative > nonsymbolic approach would be the 'dimensional' one. It seems a > strongish statement to say that this would be sufficient, to the > exclusion of symbolic properties... However, the metric > hypothesis -- that a concept is sufficiently characterized by a point > in a multi-dimensional space -- seems wrong, as experiments have shown. Categorical representations are the representations of purely SENSORY categories, and I am indeed saying that they are to be NONsymbolic. Let me also point out that the theory I am putting forward represents a direct challenge to the Roschian line of category research in which the book you cite belongs. To put it very briefly, I claim that that line of experimental and theoretical work is not really investigating the representations underlying the capacity to categorize at all; it is only looking at the fine tuning of category judgments. The experiments are typically not addressing the question of how it is that a device or organism can successfully categorize the inputs in question in the first place; instead it examines (1) how QUICKLY or EASILY subjects do it, (2) how TYPICAL (of the members of the category in question) subjects rate the inputs to be and (3) what features subjects INTROSPECT that they are using. This completely bypasses the real question of how anyone or anything actually manages to accomplish the categorization at all. Let me quickly add that there is nothing wrong with reaction-time experiments if they suggest hypotheses about the basic underlying mechanism, or provide ways of testing them. But in this case -- as in many others in experimental cognitive psychology -- the basic mechanisms are bypassed and the focus is on fine-tuning questions that are beside the point (or premature) -- if, that is, the objective is to explain how organisms or devices actually manage to generate successful categorization performance given the inputs in question. As an exercise, see where the constructs you mention above -- "holistic," "dimensional," or "metric" representations -- are likely to get you if you're actually trying to get a device to categorize, as we do. There is also an "entry point" problem with this line of research, which typically looks willy-nilly at higher-order, abstract categories, as well as "basic level" object categories (an incoherent concept, in my opinion, except as an arbitrary default level), and even some sensory categories. But it seems obvious that the question of how the higher-order categories are represented is dependent on how the lower-order ones are represented, the abstract ones on the concrete ones, and perhaps all of these depend on the sensory ones. Moreover, often the inputs used are members of familiar, overlearned categories, and the task is a trivial one, not engaging the mechanisms that were involved in their acquisition. In other experiments, artificial stimuli are used, but it is not clear how representative these are of the category acquisition process either. 
Finally, and perhaps most important: In bypassing the problem of categorization capacity itself -- i.e., the problem of how devices manage to categorize as correctly and successfully as they do, given the inputs they have encountered -- in favor of its fine tuning, this line of research has unhelpfully blurred the distinction between the following: (a) the many all-or-none categories that are the real burden for an explanatory theory of categorization (a penguin, after all, be it ever so atypical a bird, and be it ever so time-consuming for us to judge that it is indeed a bird, is, after all, indeed a bird, and we know it, and can say so, with 100% accuracy every time, irrespective of whether we can successfully introspect what features we are using to say so) and (b) true "graded" categories such as "big," "intelligent," etc. Let's face the all-or-none problem before we get fancy... > To discuss "invariant features... sufficient to guide reliable > categorization" sounds like the "classical" theory (as Smith & Medin > call it) of concept representation: Concepts are represented as > necessary and sufficient features (i.e., there are defining features, > i.e. there is a boolean conjunction of predicates for a concept). This > approach has serious problems, not the least of which is the inability > of humans to describe these features for seemingly elementary concepts, > like "chair", as Weinstein and others point out. I contend that a > boolean function (including ORs as well as ANDs) could work, but that > is not what was mentioned. An example might be helpful: A vehicle must > have a steering wheel OR handlebars. But to remove the OR by saying, > a vehicle must have a means of steering, is to rely on a feature which > is symbolic, high level, functional, which I gather we are not allowing. It certainly is the "classical" theory, but the one with the serious problems is the fine-tuning approach I just described, not the quite reasonable assumption that if 100% correct, all-or-none categorization is possible at all (without magic), then there must be a set of features in the inputs that is SUFFICIENT to generate it. I of course agree that disjunctive features are legitimate -- but whoever said they weren't? That was another red herring introduced by this line of research. And, as I mentioned, "the inability of humans to describe these features" is irrelevant. If they could do it, they'd be cognitive modelers! We must INFER what features they're using to categorize successfully; nothing guarantees they can tell us. (If by "Weinstein" you mean "Wittgenstein" on "games," etc., I have to remind you that Wittgenstein did not have the contemporary burden of speaking in terms of internal mechanisms a device would have to have in order to categorize successfully. Otherwise he would have had to admit that "games" are either (i) an all-or-none category, i.e., there is a "right" or "wrong" of the matter, and we are able to sort accordingly, whether or not we can introspect the basis of our correct sorting, or (ii) "games" are truly a fuzzy category, in which membership is arbitrary, uncertain, or a matter of degree. But if the latter, then games are simply not representative of the garden-variety all-or-none categorization capacity that we exercise when we categorize most objects, such as chairs, tables, birds. And again, there's nothing whatsoever wrong with disjunctive features.) Finally, it is not that we are not "allowing" higher-order symbolically described features. 
They are the goal of the whole grounding project. But the approach I am advocating requires that symbolic descriptions be composed of primitive symbols which are in turn the labels of sensory categories, grounded in nonsymbolic (iconic and categorical) representations. > [Concerning model-theoretic "grounding":] The more statements > you have (that you wish to be deemed correct), the more the possible > meanings of the terms will be constrained. To illustrate, consider > the statement FISH SWIM. Think of the terms FISH and SWIM as variables > with no predetermined meaning -- so that FISH SWIM is just another way > of writing A B. What variable bindings satisfy this? Well, many do... > Now consider the statement FISH LIVE, where FISH and LIVE are variables. > Now there are two statements to be satisfied. The assignment to the > variable LIVE restricts the possible assignments to the variable SWIM... > Of course, we have many many statements in our minds that must be > simultaneously satisfied, so the possible meanings that each word name > can be assigned are correspondingly restricted. Could the restrictions be > sufficient to require such a small amount of ambiguity that the word > names could be said to have intrinsic meaning?... footnote: This > leaves unanswered the question of how the meanings themselves are > grounded. Non-symbolically, seems to be the gist of the discussion, > in which case logic would be useless for that task even in an > "in principle" capacity since the stuff of logic is symbols. I agree that there are constraints on the correlations of symbols in a natural language, and that the degrees of freedom probably shrink, in a sense, as the text grows. That is probably the basis of successful cryptography. But I still think (and you appear to agree) that even if the degrees of freedom are close to zero for a natural language's symbol combinations and their interpretations, this still leaves the grounding problem intact: How are the symbols connected to their referents? And what justifies our interpretation of their meanings? With true cryptography, the decryption of the symbols of the unknown language is always grounded in the meanings of the symbols of a known language, which are in turn grounded in our heads, and their understanding of the symbols and their relation to the world. But that's the standard DERIVED meaning scenario, and for cognitive modeling we need INTRINSICALLY grounded symbols. (I do believe, though, that the degrees-of-freedom constraint on symbol combinations does cut somewhat into Quine's claims about the indeterminacy of radical translation, and ESPECIALLY for an intrinsically grounded symbol system.) -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
harnad@mind.UUCP (Stevan Harnad) (06/27/87)
aweinste@Diamond.BBN.COM (Anders Weinstein) of BBN Laboratories, Inc., Cambridge, MA writes: > I don't see any difference between "physical" and "merely theoretical" > invertibility... Surely you don't mean that a transformation-inversion > capability must actually be present in the device for it to count as > "analog" in your sense. (Else brains, for example, wouldn't count). I think this is partly an empirical question. "Physically possible" invertibility is enough for an analog transformation, but actual physical invertibility may be necessary for an iconic representation that can generate all of our discrimination capacities. Avoiding "merely theoretical" invertibility is also part of avoiding any reliance on mediation by our theoretical interpretations in order to get an autonomous, intrinsically grounded system. > the *semantic* meaning of a symbol is still left largely unconstrained > even after you take account of it's "grounding" in perceptual > categorization. This is because what matters for intentional content > is not the objective property in the world that's being detected, but > rather how the subject *conceives* of that external property, a far > more slippery notion... primitive people may be able to reliably > categorize certain large-scale atmospheric electrical discharges; > nevertheless, the semantic content of their corresponding states might > be "Angry gods nearby" or some such. I agree that symbol grounding cannot be based on the "objective property" that's being detected. Categorical representations in my grounding model are approximate. All they do is sort and label the confusable alternatives that have been sampled, using the provisional features that suffice to generate reliable sorting performance according to the feedback that defines "right" and "wrong." There is always a context of confusable alternatives, and which features are used to sort reliably is always a "compared to what?" matter. The exact "objective property" they pick out is never an issue, only whether they can generate reliable asymptotic categorization performance given that sample and those feedback constraints. The representation is indifferent to whether what you are calling "water," is really "twin-water" (with other objective properties), as long as you can sort it "correctly" according to the feedback (say, from the dictates of thirst, or a community of categorizing instructors). As to what people "conceive" themselves to be categorizing: My model is proposed in a framework of methodological epiphenomenalism. I'm interested in what's going on in people's heads only inasmuch as it is REALLY generating their performance, not just because they think or feel it is. So, for example, in criticizing the Roschian approach to categorization in my reply to Dan Berleant I suggested that it was irrelevant what features subjects BELIEVED they were using to categorize, say, chairs; what matters is what features they (or any organism or device in a similar input situation) really ARE using. [This does not contradict my previous point about the irrelevance of "objective properties." "Features" refers to properties of the proximal projection on the device's sense receptors, whereas "properties" would be the essential characteristics of distal objects in the world. Feature detectors are blind to distal differences that are not preserved in the proximal projection.] 
On the other hand, "Angry gods nearby" is not just an atomic label for "thunder" (otherwise it WOULD be equivalent to it in my model -- both labels would pick out approximately the same thing); in fact, it is decomposable, and hence has a different meaning in virtue of the meanings of "angry" and "gods." There should be corresponding internal representational differences (iconic, categorical and symbolic) that capture that difference. > Another well-known obstacle to moving from an objective to an > intentional description is that the latter contains an essentially > normative component, in that we must make some distinction between > correct and erroneous classification. For example, we'd probably > like to say that a frog has a fly-detector which is sometimes wrong, > rather than a "moving-spot-against-a-fixed-background" detector > which is infallible. Again, this distinction seems to depend on fuzzy > considerations about the purpose or functional role of the concept > in question... [In his reply on this point to Dan Berleant, > Weinstein continues:] the philosophical problem is to say why any > response should count as an *error* at all. What makes it wrong? > I.e. who decides which "concept" -- "fly" or "moving-spot..." -- the > frog is trying to apply? The objective facts about the frog's > perceptual abilities by themselves don't seem to tell you that in > snapping out its tongue at a decoy, it's making a *mistake*. To > say this, an outside interpreter has to make some judgement about what > the frog's brain is trying to accomplish by its detection of moving > spots. And this makes the determination of semantic descriptions a > fuzzy matter. I don't think there's any problem at all about what should count as an "error" for my kind of model. The correctness or incorrectness of a label is always determined by feedback -- either ecological, as in evolution and daily nonverbal learning, or linguistic, where it is conventions of usage that determine what we call what. I don't see anything fuzzy about such a functional framework. (The frog's feedback, by the way, probably has to do with edibility, so (i) "something that affords eating" is probably a better "interpretation" of what it's detecting. And, to the extent that (ii) flies and (iii) moving spots are treated indifferently by the detector, the representation is approximate among all three. The case is not like that of natives and thunder, since the frog's "descriptions" are hardly decomposable. Finally, there is again no hope of specifying distal "objective properties" ["bug"/"schmug"] here either, as approximateness continues to prevail.) > Some of the things you say also suggest that you're attempting to > resuscitate a form of classical empiricist sensory atomism, where the > "atomic" symbols refer to sensory categories acquired "by acquaintance" > and the meaning of complex symbols is built up from the atoms "by > description". This approach has an honorable history in philosophy; > unfortunately, no one has ever been able to make it work. In addition > to the above considerations, the main problems seem to be: first, > (1) that no principled distinction can be made between the simple > sensory concepts and the complex "theoretical" ones; and second, > (2) that very little that is interesting can be explicitly defined in > sensory terms (try, for example, "chair")...[In reply to Berleant, > Weinstein continues:] Of course *some* concepts can be acquired by > definition.
However, the "classical empiricist" doctrine is committed > to the further idea that there is some privileged set of *purely > sensory* concepts and that all non-sensory concepts can be defined in > terms of this basis. This is what has never been shown to work. If you > regard "juice" as a "primitive" concept, then you do not share the > classical doctrine. (And if you do not, I invite you to try giving > necessary and sufficient conditions for juicehood.) You're absolutely right that this is a throwback to seventeenth-century bottom-upism. In fact, in the CP book I call the iconic and categorical representations the "acquaintance system" and the symbolic representations the "description system." The only difference is that I'm claiming only to be giving a theory of categorization. Whether or not this captures "meaning" depends (for me at any rate) largely on whether or not such a system can successfully pass the Total Turing Test. It's true that no one has made this approach work. But it's also true that no one has tried. It's only in today's era of computer modeling, robotics and bioengineering that these mechanisms will begin to be tested to see whether or not they can deliver the goods. To reply to your "two main problems": (1) Even an elementary sensory category such as "red" is already abstract once you get beyond the icon to the categorical representation. "Red" picks out the electromagnetic wavelengths that share the feature of lying above one threshold and below another. That's an abstraction. And in exchange for generating a feature-detector that reliably picks it out, you get a label -- "red" -- which can now enter into symbolic descriptions (e.g., "red square"). Categorization is abstraction. As soon as you've left the realm of invertible icons, you've begun to abstract, yet you've never left the realm of the senses. And so it goes, bottom up, from there onward. (2) As to sensory "definitions": I don't think this is the right thing to look for, because it's too hard to find a valid "entry point" into the bottom-up hierarchy. I doubt that "chair" or "juice" are sensory primitives, picked out purely by sensory feature detectors. They're probably represented by symbolic descriptions such as "things you can sit on" and "things you can drink," and of course those are just the coarsest of first approximations. But the scenario looks pretty straightforward: Even though such a description is flexible enough to be revised to include a chair (suitably homogenized) as a juice, or a juice (for a bug?) as a chair, it seems very clear that it is the resources of (grounded) symbolic description that are being drawn upon here in picking out what is and is not a chair, and on the basis of what features. The categories are too interrelated (and approximate, and provisional) for an exhaustive "definition," but provisional descriptions that will get you by in your sorting and labeling -- and, more important, are revisable and updatable, to tighten the approximation -- are certainly available and not hard to come by. "Necessary and sufficient conditions for juicehood," however, are a red herring. All we need is a provisional set of features that will reliably sort the instances as environmental and social feedback currently dictates. Remember, we're not looking for "objective properties" or ontic essences -- just something that will guide reliable sorting according to the contingencies sampled to date.
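A minimal sketch (in Python; the wavelength band, the shape test and the two-entry description table are hypothetical stand-ins) of the grounding story as just described: atomic labels are assigned by sensory feature detectors, and higher-order categories are then picked out by provisional symbolic descriptions composed of those grounded atoms.

    def detect_red(wavelength_nm):
        # categorical representation as an A/D band filter: above one
        # threshold and below another
        return 620.0 < wavelength_nm < 750.0

    def detect_square(width, height):
        return abs(width - height) < 0.05 * max(width, height)

    def atomic_labels(percept):
        """Map a (hypothetical) sensory projection onto grounded atomic labels."""
        labels = set()
        if detect_red(percept["wavelength_nm"]):
            labels.add("red")
        if detect_square(percept["width"], percept["height"]):
            labels.add("square")
        if percept.get("affords_sitting"):
            labels.add("can-sit-on")
        return labels

    # Symbolic descriptions: combinations of grounded atoms.  Provisional
    # and revisable -- tighten or amend them as feedback dictates.
    DESCRIPTIONS = {
        "red square": {"red", "square"},
        "chair": {"can-sit-on"},          # coarsest first approximation
    }

    def categorize(percept):
        atoms = atomic_labels(percept)
        return [name for name, needed in DESCRIPTIONS.items() if needed <= atoms]

    print(categorize({"wavelength_nm": 700.0, "width": 1.0, "height": 1.0}))
    # -> ['red square']
    print(categorize({"wavelength_nm": 480.0, "width": 0.5, "height": 0.9,
                      "affords_sitting": True}))
    # -> ['chair']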
-- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
marty1@houdi.UUCP (M.BRILLIANT) (06/27/87)
In article <917@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes: > ... blurred the distinction between the > following: (a) the many all-or-none categories that are the real burden > for an explanatory theory of categorization (a penguin, after all, be it > ever so atypical a bird, ... is, after all, indeed a bird, and we know > it, and can say so, with 100% accuracy every time, .... > ... and (b) true "graded" categories such as "big," "intelligent," ... > ...... > "games" are either (i) an all-or-none category, i.e., there is a "right" or > "wrong" of the matter, and we are able to sort accordingly, ... > ... or (ii) "games" > are truly a fuzzy category, in which membership is arbitrary, > uncertain, or a matter of degree. But if the latter, then games are > simply not representative of the garden-variety all-or-none > categorization capacity that we exercise when we categorize most > objects, such as chairs, tables, birds.... Now, much of this discussion is out of my field, but (a) I would like to share in the results, and (b) I understand membership in classes like "bird" and "chair." I learned recently that I can't categorize chairs with 100% accuracy. A chair used to be a thing that supported one person at the seat and the back, and a stool had no back support. Then somebody invented a thing that supported one person at the seat, the knees, but not the back, and I didn't know what it was. As far as my sensory categorization was concerned at the time, its distinctive features were inadequate to classify it. Then somebody told me it was a chair. Its membership in the class "chair" was arbitrary. Now a "chair" in my lexicon is a thing that supports the seat and either the back or the knees. Actually, I think I perceive most chairs by recognizing the object first as a familiar thing like a kitchen chair, a wing chair, etc., and then I name it with the generic name "chair." I think Harnad would recognize this process. The class is defined arbitrarily by inclusion of specific members, not by features common to the class. It's not so much a class of objects, as a class of classes.... If that is so, then "bird" as a categorization of "penguin" is purely symbolic, and hence is arbitrary, and once the arbitrariness is defined out, that categorization is a logical, 100% accurate, deduction. The class "penguin" is closer to the primitives that we infer inductively from sensory input. But the identification of "penguin" in a picture, or in the field, is uncertain because the outlines may be blurred, hidden, etc. So there is no place in the pre-symbolic processing of sensory input where 100% accuracy is essential. (This being so, there is no requirement for invertibility.) M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
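The revision Brilliant describes can be put in one toy step (Python; the feature names are hypothetical): the provisional description of "chair" is simply amended when somebody else's labeling of a new, confusable case contradicts it.

    def is_chair_v1(thing):
        # the old provisional description: seat support plus back support
        return thing["supports_seat"] and thing["supports_back"]

    kneeling_chair = {"supports_seat": True, "supports_back": False,
                      "supports_knees": True}

    print(is_chair_v1(kneeling_chair))   # False: the old description excludes it

    # Somebody says "That's a chair," so the description is revised:
    def is_chair_v2(thing):
        return thing["supports_seat"] and (thing["supports_back"]
                                           or thing["supports_knees"])

    print(is_chair_v2(kneeling_chair))   # True: membership fixed by inclusion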
marty1@houdi.UUCP (M.BRILLIANT) (06/29/87)
In article <919@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes: > marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel writes: > ...... > > .... The feature extractor obviates the symbol-grounding > > problem. > > ..... You are vastly underestimating the problem of > sensory categorization, sensory learning, and the relation between > lower and higher-order categories. Nor is it obvious that symbol manipulation > can still be regarded as just symbol manipulation when the atomic symbols > are constrained to be the labels of sensory categories.... I still think we're having more trouble with terminology than we would have with the concepts if we understood each other. To get a little more concrete, how about walking through what a machine might do in perceiving a chair? I was just looking at a kitchen chair, a brown wooden kitchen chair against a yellow wall, in side light from a window. Let's let a machine train its camera on that object. Now either it has a mechanical array of receptors and processors, like the layers of cells in a retina, or it does a functionally equivalent thing with sequential processing. What it has to do is compare the brightness of neighboring points to find places where there is contrast, find contrast in contiguous places so as to form an outline, and find closed outlines to form objects. There are some subtleties needed to find partly hidden objects, but I'll just assume they're solved. There may also be an interpretation of shadow gradations to perceive roundness. Now the machine has the outline of an object in 2 dimensions, and maybe some clues to the 3rd dimension. There are CAD programs that, given a complete description of an object in 3D, can draw any 2D view of it. How about reversing this essentially deductive process to inductively find a 3D form that would give rise to the 2D view the machine just saw? Let the machine guess that most of the odd angles in the 2D view are really right angles in 3D. Then, if the object is really unfamiliar, let the machine walk around the chair, or pick it up and turn it around, to refine its hypothesis. Now the machine has a form. If the form is still unfamiliar, let it ask, "What's that, Daddy?" Daddy says, "That's a chair." The machine files that information away. Next time it sees a similar form it says "Chair, Daddy, chair!" It still has to learn about upholstered chairs, but give it time. That brings me to a question: do you really want this machine to be so Totally Turing that it grows like a human, learns like a human, and not only learns new objects, but, like a human born at age zero, learns how to perceive objects? How much of its abilities do you want to have wired in, and how much learned? But back to the main question. I have skipped over a lot of detail, but I think the outline can in principle be filled in with technologies we can imagine even if we do not have them. How much agreement do we have with this scenario? What are the points of disagreement? M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
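The first two steps of this walkthrough -- neighboring-point contrast, then contiguous contrast points collected into candidate outlines -- can be sketched directly (Python; the toy brightness grid and the contrast threshold are hypothetical, and occlusion, shading and 3D recovery are left out, as in the post):

    THRESHOLD = 0.3   # hypothetical contrast threshold

    def contrast_points(image):
        """image: 2D list of brightness values in [0,1].  Return the set of
        points whose brightness differs sharply from a 4-neighbor."""
        pts = set()
        rows, cols = len(image), len(image[0])
        for r in range(rows):
            for c in range(cols):
                for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        if abs(image[r][c] - image[rr][cc]) > THRESHOLD:
                            pts.add((r, c))
        return pts

    def group_contiguous(points):
        """Group contrast points into 8-connected chains (candidate outlines)."""
        remaining, outlines = set(points), []
        while remaining:
            stack = [remaining.pop()]
            chain = set(stack)
            while stack:
                r, c = stack.pop()
                for rr in (r - 1, r, r + 1):
                    for cc in (c - 1, c, c + 1):
                        if (rr, cc) in remaining:
                            remaining.remove((rr, cc))
                            chain.add((rr, cc))
                            stack.append((rr, cc))
            outlines.append(chain)
        return outlines

    # A bright object against a dark wall:
    image = [
        [0.1, 0.1, 0.1, 0.1, 0.1],
        [0.1, 0.9, 0.9, 0.9, 0.1],
        [0.1, 0.9, 0.9, 0.9, 0.1],
        [0.1, 0.1, 0.1, 0.1, 0.1],
    ]
    print(len(group_contiguous(contrast_points(image))))   # one candidate outline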
smoliar@vaxa.isi.edu (Stephen Smoliar) (06/30/87)
In article <1194@houdi.UUCP> marty1@houdi.UUCP (M.BRILLIANT) writes: > >I was just looking at a kitchen chair, a brown wooden kitchen >chair against a yellow wall, in side light from a window. Let's >let a machine train its camera on that object. Now either it >has a mechanical array of receptors and processors, like the >layers of cells in a retina, or it does a functionally >equivalent thing with sequential processing. What it has to do >is compare the brightness of neighboring points to find places >where there is contrast, find contrast in contiguous places so >as to form an outline, and find closed outlines to form objects. >There are some subtleties needed to find partly hidden objects, >but I'll just assume they're solved. There may also be an >interpretation of shadow gradations to perceive roundness. > I have been trying to keep my distance from this debate, but I would like to insert a few observations regarding this scenario. In many ways, this paragraph represents the "obvious" approach to perception, assuming that one is dealing with a symbol manipulation system. However, other approaches have been hypothesized. While their viability remains to be demonstrated, it would be fair to say that, in the broad scope of perception in the real world, the same may be said of symbol manipulation systems. Consider the holographic model proposed by Karl Pribram in LANGUAGES OF THE BRAIN. As I understand it, this model postulates that memory is a collection of holographic transforms of experienced images. As new images are experienced, the brain is capable of retrieving "best fits" from this memory to form associations. Thus, the chair you see in the above paragraph is recognized as a chair by virtue of the fact that it "fits" other images of chairs you have seen in the past. I'm not sure I buy this, but I'm at least willing to acknowledge it as an alternative to your symbol manipulation scenario. The biggest problem I have has to do with retrieval. As far as I understand, present holographic retrieval works fine as long as you don't have to worry about little things like change of scale, translation, or rotation. If this model is going to work, then the retrieval process is going to have to be more powerful than the current technology allows. The other problem relates to concept acquisition, as was postulated in Brilliant's continuation of the scenario: > >Now the machine has a form. If the form is still unfamiliar, >let it ask, "What's that, Daddy?" Daddy says, "That's a chair." >The machine files that information away. Next time it sees a >similar form it says "Chair, Daddy, chair!" It still has to >learn about upholstered chairs, but give it time. > The difficulty seems to be in what it means to file something away if one's memory is simply one of experiences. Does the memory trace of the chair experience include Daddy's voice saying "chair?" While I'm willing to acknowledge a multi-media memory trace, this seems a bit pat. It reminds me of Skinner's VERBAL BEHAVIOR, in which he claimed that one learned the concept "beautiful" from stimuli of observing people saying "beautiful" in front of beautiful objects. This conjures up a vision of people wandering around the Metropolitan Museum of Art muttering "beautiful" as they wander from gallery to gallery. Perhaps the difficulty is that the mind really doesn't want to assign a symbol to every experience immediately. Rather, following the model of Holland et
al., it is first necessary to build up some degree of reinforcement which assures that a particular memory trace is actually going to be retrieved relatively frequently (whatever that means). In such a case, then, a symbol becomes a fast-access mechanism for retrieval of that trace (or a collection of common traces). However, this gives rise to at least three questions for which I have no answer: 1. What are the criteria by which it is decided that such a symbol is required for fast-access? 2. Where does the symbol's name come from? 3. How is the symbol actually "bound" to what it retrieves? These would seem to be the sort of questions which might help to tie this debate down to more concrete matters. Brilliant continues: >That brings me to a question: do you really want this machine >to be so Totally Turing that it grows like a human, learns like >a human, and not only learns new objects, but, like a human born >at age zero, learns how to perceive objects? How much of its >abilities do you want to have wired in, and how much learned? > This would appear to be one of the directions in which connectionism is leading. In a recent talk, Sejnowski talked about "training" networks for text-to-speech and backgammon . . . not programming them. On the other hand, at the current level of his experiments, designing the network is as important as training it; training can't begin until one has a suitable architecture of nodes and connections. The big unanswered question would appear to be: will all of this scale upward? That is, is there ultimately some all-embracing architecture which includes all the mini-architectures examined by connectionist experiments and enough more to accommodate the methodological epiphenomenalism of real life?
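One toy reading (Python; the similarity measure, the retrieval cutoff, the promotion threshold and the "Daddy says" channel are all hypothetical) of the Holland-style suggestion above: traces accumulate reinforcement as they keep turning up as best fits, and only a sufficiently reinforced trace gets bound to an externally supplied name, which then serves as the fast-access handle. It illustrates the suggestion; it does not pretend to answer the three questions just posed.

    PROMOTION_THRESHOLD = 3   # hypothetical

    class TraceMemory:
        def __init__(self):
            self.traces = []   # each trace: [feature_vector, reinforcement, name-or-None]

        def experience(self, features, heard_name=None):
            best, best_sim = None, 0.0
            for trace in self.traces:
                sim = self._similarity(features, trace[0])
                if sim > best_sim:
                    best, best_sim = trace, sim
            if best is not None and best_sim > 0.8:          # "best fit" retrieval
                best[1] += 1                                 # reinforce the trace
                if heard_name and best[1] >= PROMOTION_THRESHOLD and best[2] is None:
                    best[2] = heard_name                     # bind a symbol to it
                return best[2]                               # may still be None
            self.traces.append([features, 1, None])          # otherwise store a new trace
            return None

        @staticmethod
        def _similarity(a, b):
            return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)

    mem = TraceMemory()
    chair_view = [0.9, 0.2, 0.7]
    label = None
    for _ in range(4):
        label = mem.experience(chair_view, heard_name="chair")
    print(label)   # 'chair': the trace earned its symbol only after reinforcement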
harnad@mind.UUCP (Stevan Harnad) (06/30/87)
marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel asks: > how about walking through what a machine might do in perceiving a chair? > ...let a machine train its camera on that object. Now either it > has a mechanical array of receptors and processors, like the layers > of cells in a retina, or it does a functionally equivalent thing with > sequential processing. What it has to do is compare the brightness of > neighboring points to find places where there is contrast, find > contrast in contiguous places so as to form an outline, and find > closed outlines to form objects... Now the machine has the outline > of an object in 2 dimensions, and maybe some clues to the 3rd > dimension... inductively find a 3D form that would give rise to the > 2D view the machine just saw... Then, if the object is really > unfamiliar, let the machine walk around the chair, or pick it > up and turn it around, to refine its hypothesis. So far, apart from its understandable bias toward current engineering hardware concepts, there is no particular objection to this description of a stereoptic sensory receptor. > Now the machine has a form. If the form is still unfamiliar, > let it ask, "What's that, Daddy?" Daddy says, "That's a chair." > The machine files that information away. Next time it sees a > similar form it says "Chair, Daddy, chair!" It still has to > learn about upholstered chairs, but give it time. Now you've lost me completely. Having acknowledged the intricacies of sensory transduction, you seem to think that the problem of categorization is just a matter of filing information away and finding "similar forms." > do you really want this machine to be so Totally Turing that it > grows like a human, learns like a human, and not only learns new > objects, but, like a human born at age zero, learns how to perceive > objects? How much of its abilities do you want to have wired in, > and how much learned? That's an empirical question. All it needs to do is pass the Total Turing Test -- i.e., exhibit performance capacities that are indistinguishable from ours. If you can do it by building everything in a priori, go ahead. I'm betting it'll need to learn -- or be able to learn -- a lot. > But back to the main question. I have skipped over a lot of > detail, but I think the outline can in principle be filled in > with technologies we can imagine even if we do not have them. > How much agreement do we have with this scenario? What are > the points of disagreement? I think the main details are missing, such as how the successful categorization is accomplished. Your account also sounds as if it expects innate feature detectors to pick out objects for free, more or less nonproblematically, and then serve as a front end for another device (possibly a conventional symbol-cruncher a la standard AI?) that will then do the cognitive heavy work. I think that the cognitive heavy work begins with picking out objects, i.e., with categorization. I think this is done nonsymbolically, on the sensory traces, and that it involves learning and pattern recognition -- both sophisticated cognitive activities. I also do not think this work ends, to be taken over by another kind of work: symbolic processing. I think that ALL of cognition can be seen as categorization. It begins nonsymbolically, with sensory features used to sort objects according to their names on the basis of category learning; then further sorting proceeds by symbolic descriptions, based on combinations of those atomic names.
This hybrid nonsymbolic/symbolic categorizer is what we are; not a pair of modules, one that picks out objects and the other that thinks and talks about them. -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
marty1@houdi.UUCP (M.BRILLIANT) (06/30/87)
In article <937@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes: > marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel asks: > .... > > do you really want this machine to be so Totally Turing that it > > grows like a human, learns like a human, and not only learns new > > objects, but, like a human born at age zero, learns how to perceive > > objects? How much of its abilities do you want to have wired in, > > and how much learned? > > That's an empirical question. All it needs to do is pass the Total > turing Test -- i.e., exhibit performance capacities that are > indistinguishable from ours. If you can do it by building everything > in a priori, go ahead. I'm betting it'll need to learn -- or be able to > learn -- a lot. To refine the question: how long do you imagine the Total Turing Test will last? Science fiction stories have robots or aliens living in human society as humans for periods of years, as long as they live with strangers, but failing after a few hours trying to supplant a human and fool his or her spouse. By "performance capabilities," do you mean the capability to adapt as a human does to the experiences of a lifetime? Or only enough learning capability to pass a job interview? M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
marty1@houdi.UUCP (M.BRILLIANT) (06/30/87)
In article <937@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes: > ... > marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel asks: > > how about walking through what a machine might do in perceiving a chair? > > ... (a few steps skipped here) > > Now the machine has a form. If the form is still unfamiliar, > > let it ask, "What's that, Daddy?" Daddy says, "That's a chair." > > The machine files that information away. Next time it sees a > > similar form it says "Chair, Daddy, chair!" ... > > Now you've lost me completely. Having acknowledged the intricacies of > sensory transduction, you seem to think that the problem of categorization > is just a matter of filing information away and finding "similar forms." I think it is. We've found a set of lines, described in 3 dimensions, that can be rotated to match the outline we derived from the view of a real chair. We file it in association with the name "chair." A "similar form" is some other outline that can be matched (to within some fraction of its size) by rotating the same 3D description. > I think the main details are missing, such as how the successful > categorization is accomplished...... Are we having a problem with the word "categorization"? Is it the process of picking discrete objects out of a pattern of light and shade ("that's a thing"), or the process of naming the object ("that thing is a chair")? > ..... Your account also sounds as if it > expects innate feature detectors to pick out objects for free, more or > less nonproblematically..... You left out the part where I referred to computer-aided-design modules. I think we can find outlines by looking for contiguous contrasts. If the outlines are straight we (the machine, maybe also humans) can define the ends of the straight lines in the visual plane, and hypothesize corresponding lines in space. If hard-coding this capability gives an "innate feature detector" then that's what I want. > ...... and then serve as a front end for another > device (possibly a conventional symbol-cruncher a la standard AI?) > that will then do the cognitive heavy work. I think that the cognitive > heavy work begins with picking out objects, i.e., with categorization. I think I find objects with no conscious knowledge of how I do it (is that what you call "categorization")? Saying what kind of object it is more often involves conscious symbol-processing (sometimes one forgets the word and calls a perfectly familiar object "that thing"). > I think this is done nonsymbolically, on the sensory traces, and that it > involves learning and pattern recognition -- both sophisticated > cognitive activities. If you're talking about finding objects in a field of light and shade, I agree that it is done nonsymbolically, and everything else you just said. > ..... I also do not think this work ends, to be taken > over by another kind of work: symbolic processing..... That's where I have trouble. Calling a penguin a bird seems to me purely symbolic, just as calling a tomato a vegetable in one context, and a fruit in another, is a symbolic process. > ..... I think that ALL of > cognition can be seen as categorization. It begins nonsymbolically, > with sensory features used to sort objects according to their names on > the basis of category learning; then further sorting proceeds by symbolic > descriptions, based on combinations of those atomic names. 
This hybrid > nonsymbolic/symbolic categorizer is what we are; not a pair of modules, > one that picks out objects and the other that thinks and talks about them. Now I don't understand what you said. If it begins nonsymbolically, and proceeds symbolically, why can't it be done by linking a nonsymbolic module to a symbolic module? M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
aweinste@Diamond.BBN.COM (Anders Weinstein) (06/30/87)
In reply to my statement that >> the *semantic* meaning of a symbol is still left largely unconstrained >> even after you take account of its "grounding" in perceptual >> categorization. This is because what matters for intentional content >> is not the objective property in the world that's being detected, but >> rather how the subject *conceives* of that external property, a far >> more slippery notion... Stevan Harnad (harnad@mind.UUCP) writes: > > As to what people "conceive" themselves to be categorizing: My model > is proposed in a framework of methodological epiphenomenalism. I'm > interested in what's going on in people's heads only inasmuch as it is > REALLY generating their performance, not just because they think or > feel it is. I regret the subjectivistic tone of my loose characterization; what people can introspect is indeed not at issue. I was merely pointing out that the *meaning* of a symbol is crucially dependent on the rest of the cognitive system, as shown in Churchland's example: >> ... primitive people may be able to reliably >> categorize certain large-scale atmospheric electrical discharges; >> nevertheless, the semantic content of their corresponding states might >> be "Angry gods nearby" or some such. >> > ... "Angry gods nearby" is not just an atomic label for > "thunder" (otherwise it WOULD be equivalent to it in my model -- both > labels would pick out approximately the same thing); in fact, it is > decomposable, and hence has a different meaning in virtue of the > meanings of "angry" and "gods." There should be corresponding internal > representational differences (iconic, categorical and symbolic) that > capture that difference. "Angry gods nearby" is composite in *English*, but it need not be composite in native, or, more to the point, in the supposed inner language of the native's categorical mechanisms. They may have a single word, say "gog", which we would want to translate as "god-noise" or some such. Perhaps they train their children to detect gog in precisely the same way we train children to detect thunder -- our internal thunder-detectors are identical. Nevertheless, the output of their thunder-detector does not *mean* "thunder". Let me try to clarify the point of these considerations. I am all for an inquiry into the mechanisms underlying our categorization abilities. Anything you can discover about these mechanisms would certainly be a major contribution to psychology. My only concern is with semantics: I was piqued by what seemed to be an ambitious claim about the significance of the psychology of categorization for the problem of "intentionality" or intrinsic meaningfulness. I merely want to emphasize that the former, interesting though it is, hardly makes a dent in the latter. As I said, there are two reasons why meaning resists explication by this kind of psychology: (1) holism: the meaning of even a "grounded" symbol will still depend on the rest of the cognitive system; and (2) normativity: meaning is dependent upon a determination of what is a *correct* response, and you can't simply read such a norm off from a description of how the mechanism in fact performs. I think these points, particularly (1), should be quite clear. The fact that a subject's brain reliably asserts the symbol "foo" when and only when thunder is presented in no way "fixes" the meaning of "foo".
Of course it is obviously a *constraint* on what "foo" may mean: it is in fact part of what Quine called the "stimulus meaning" of "foo", his first constraint on acceptable translation. Nevertheless, by itself it is still way too weak to do the whole job, for in different contexts the positive output of a reliable thunder-detector could mean "thunder", something co-extensive but non-synonymous with "thunder", "god-noise", or just about anything else. Indeed, it might not *mean* anything at all, if it were only part of a mechanical thunder-detector which couldn't do anything else. I wonder if you disagree with this? As to normativity, the force of problem (2) is particularly acute when talking about the supposed intentionality of animals, since there aren't any obvious linguistic or intellectual norms that they are trying to adhere to. Although the mechanics of a frog's prey-detector may be crystal clear, I am convinced that we could easily get into an endless debate about what, if anything, the output of this detector really *means*. The normativity problem is germane in an interesting way to the problem of human meanings as well. Note, for example, that in doing this sort of psychology, we probably won't care about the difference between correctly identifying a duck and mis-identifying a good decoy -- we're interested in the perceptual mechanisms that are the same in both cases. In effect, we are limiting our notion of "categorization" to something like "quick and largely automatic classification by observation alone". We pretty much *have* to restrict ourselves in this way, because, in the general case, there's just no limit to the amount of cognitive activity that might be required in order to positively classify something. Consider what might go into deciding whether a dolphin ought to be classified as a fish, whether a fetus ought to be classified as a person, etc. These decisions potentially call for the full range of science and philosophy, and a psychology which tries to encompass such decisions has just bitten off more than it can chew: it would have to provide a comprehensive theory of rationality, and such an ambitious theory has eluded philosophers for some time now. In short, we have to ignore some normative distinctions if we are to circumscribe the area of inquiry to a theoretically tractable domain of cognitive activity. (Indeed, in spite of some of your claims, we seem committed to the notion that we are limiting ourselves to particular *modules* as explained in Fodor's modularity book.) Unfortunately -- and here's the rub -- these normative distinctions *are* significant for the *meaning* of symbols. ("Duck" doesn't *mean* the same thing as "decoy"). It seems that, ultimately, the notion of *meaning* is intimately tied to standards of rationality that cannot easily be reduced to simple features of a cognitive mechanism. And this seems to be a deep reason why a descriptive psychology of categorization barely touches the problem of intentionality. Anders Weinstein BBN Labs
harnad@mind.UUCP (Stevan Harnad) (07/01/87)
marty1@houdi.UUCP (M.BRILLIANT) of AT&T Bell Laboratories, Holmdel writes: > how long do you imagine the Total Turing Test will last?... By > "performance capabilities," do you mean the capability to adapt as a > human does to the experiences of a lifetime? My Total Turing Test (TTT) has two components, one a formal, empirical one, another an informal, intuitive one. The formal test requires that a candidate display all of our generic performance capacities -- the ability to discriminate, manipulate, identify and describe objects and events as we do, under the same conditions we do, and to generate and respond to descriptions (language) as we do. The informal test requires that the candidate do this in a way that is indistinguishable to human beings from the way human beings do it. The informal component of the TTT is open-ended -- there is no formal constraint on how much is enough. The reason is that I proposed the TTT to match what we do in the real world anyway, in our informal everyday provisional "solutions" to the "other-minds" problem in dealing with one another. Robots should be held to no more or less exacting standards. (This was extensively discussed on the Net last year.) > We've found a set of lines, described in 3 dimensions, that can be > rotated to match the outline we derived from the view of a real chair. > We file it in association with the name "chair." A "similar form" is > some other outline that can be matched (to within some fraction of its > size) by rotating the same 3D description. I agree that that kind of process gets you similarity (and similarity gradients), but it doesn't get you categorization. It sounds as if you're trying to get successful identification using icons. That will only work if the inputs you have to sort are low in confusability or are separated by natural gaps in their variation. As soon as the sorting problem becomes hard, feature-learning becomes a crucial, active process -- not one that anyone really has a handle on yet. Do you really think your stereognostic icons of chairs will be able, like us, to reliably pick out all the chairs from the nonchairs with which they might be confused, using only the kinds of resources you describe here? > Is [categorization] the process of picking discrete objects out of a > pattern of light and shade ("that's a thing"), or the process of naming > the object ("that thing is a chair")? The latter. And reliably saying which confusable things are chairs vs. nonchairs as we do is no mean feat. > I think I find objects with no conscious knowledge of how I do it (is > that what you call "categorization")? Saying what kind of object it is > more often involves conscious symbol-processing Categorization is saying what kind of object it is. And, especially in the case of concrete sensory categories, one is no more conscious of "how" one does this than in resolving figure and ground. And even when we are conscious of *something*, it's not clear that's really how we're doing what we're doing. If it were, we could do cognitive modeling by introspection. (This is one of the reasons I criticize the Rosch/Wittgenstein line on "necessary/sufficient" features: It relies too much on what we can and can't introspect.) Finally, conscious "symbol-processing" and unconscious "symbol-processing" are not necessarily the same thing. Performance modeling is more concerned with the latter, which must be inferred rather than introspected. > I agree that [finding objects in a field of light and shade] > is done nonsymbolically... 
[but] Calling a penguin a bird seems to me > purely symbolic, just as calling a tomato a vegetable in one context, > and a fruit in another, is a symbolic process. Underlying processes must be inferred. They can't just be read off the performance, or our introspections about how we accomplish it. I am hypothesizing that higher-order categories are grounded bottom-up in lower-order sensory ones, and that the latter are represented nonsymbolically. We're talking about the underlying basis of successful, reliable, correct naming here. We can't simply take it as given. (And what we call an object in one context versus another depends precisely on the sample of confusable alternatives that I've kept stressing.) > If [cognition/categorization] begins nonsymbolically, and proceeds > symbolically, why can't it be done by linking a nonsymbolic module to > a symbolic module? Because (according to my model) the elementary symbols out of which all the rest are composed are really the names of sensory categories whose representations -- the structures and processes that pick them out and reliably identify them -- are nonsymbolic. I do not see this intimate interrelationship -- between names and, on the one hand, the nonsymbolic representations that pick out the objects they refer to and, on the other hand, the higher-level symbolic descriptions into which they enter -- as being perspicuously described as a link between a pair of autonomous nonsymbolic and symbolic modules. The relationship is bottom-up and hybrid through and through, with the symbolic component derivative from, inextricably interdigitated with, and parasitic on the nonsymbolic. -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
franka@mmintl.UUCP (Frank Adams) (07/02/87)
In article <917@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes: |Finally, and perhaps most important: In bypassing the problem of |categorization capacity itself -- i.e., the problem of how devices |manage to categorize as correctly and successfully as they do, given |the inputs they have encountered -- in favor of its fine tuning, this |line of research has unhelpfully blurred the distinction between the |following: (a) the many all-or-none categories that are the real burden |for an explanatory theory of categorization (a penguin, after all, be it |ever so atypical a bird, and be it ever so time-consuming for us to judge |that it is indeed a bird, is, after all, indeed a bird, and we know |it, and can say so, with 100% accuracy every time, irrespective of |whether we can successfully introspect what features we are using to |say so) and (b) true "graded" categories such as "big," "intelligent," |etc. Let's face the all-or-none problem before we get fancy... I don't believe there are any truly "all-or-none" categories. There are always, at least potentially, ambiguous cases. There is no "100% accuracy every time", and trying to theorize as though there were is likely to lead to problems. Second, and perhaps more to the point, how do you know that "graded" categories are less fundamental than the other kind? Maybe it's the other way around. Maybe we should try to understand graded categories first, before we get fancy with the other kind. I'm not saying this is the case; but until we actually have an accepted theory of categorization, we won't know what the simplest route is to get there. -- Frank Adams ihnp4!philabs!pwa-b!mmintl!franka Ashton-Tate 52 Oakland Ave North E. Hartford, CT 06108
harnad@mind.UUCP (Stevan Harnad) (07/02/87)
smoliar@vaxa.isi.edu (Stephen Smoliar) Information Sciences Institute writes: > Consider the holographic model proposed by Karl Pribram in LANGUAGES > OF THE BRAIN... as an alternative to [M.B. Brilliant's] symbol > manipulation scenario. Besides being unimplemented and hence untested in what they can and can't do, holographic representations seem to inherit the same handicap as all iconic representations: Being unique to each input and blending continuously into one another, how can holograms generate categorization rather than merely similarity gradients (in the hard cases, where obvious natural gaps in the input variation don't solve the problem for you a priori)? What seems necessary is active feature-selection, based on feedback from success and failure in attempts to learn to sort and label correctly, not merely passive filtering based on natural similarities in the input. > [A] difficulty seems to be in what it means to file something away if > one's memory is simply one of experiences. Episodic memory -- rote memory for input experiences -- has the same liability as any purely iconic approach: It can't generate category boundaries where there is significant interconfusability among categories of episodes. > Perhaps the difficulty is that the mind really doesn't want to > assign a symbol to every experience immediately. That's right. Maybe it's *categories* of experience that must first be selectively assigned names, not each raw episode. > Where does the symbol's name come from? How is the symbol actually > "bound" to what it retrieves? That's the categorization problem. > The big unanswered question...[with respect to connectionism] > would appear to be: will [it] all... scale upward? Connectionism is one of the candidates for the feature-learning mechanism. That it's (i) nonsymbolic, that it (ii) learns, and that it (iii) uses the same general statistical algorithm across problem-types (i.e., that it has generality rather than being ad hoc, like pure symbolic AI) are connectionism's pluses. (That it's brainlike is not among them; nor is it true on current evidence, nor even relevant at this stage.) But the real question is indeed: How much can it really do (i.e., will it scale up)? -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
harnad@mind.UUCP (Stevan Harnad) (07/02/87)
On ailist cugini@icst-ecf.arpa writes: > why say that icons, but not categorical representations or symbols > are/must be invertible? Isn't it just a vacuous tautology to claim > that icons are invertible wrt the information they preserve, but > not wrt the information they lose?... there's information loss (many > to one mapping) at each stage of the game: 1. distal object... > 2. sensory projection... 3. icons... 4. categorical representation... > 5. symbols... do you still claim that the transition between 2 > and 3 is invertible in some strong sense which would not be true of, > say, [1 to 2] or [3 to 4], and if so, what is that sense?... Perhaps > you just want to say that the transition between 2 and 3 is usually > more invertible than the other transitions [i.e., invertibility as a > graded category]? [In keeping with Ken Laws' recommendation about minimizing quotation, I have compressed this query as much as I could to make my reply intelligible.] Iconic representations (IRs) must perform a very different function from categorical representations (CRs) or symbolic representations (SRs). In my model, IRs only subserve relative discrimination, similarity judgment and sensory-sensory and sensory-motor matching. For all of these kinds of task, traces of the sensory projection are needed for purposes of relative comparison and matching. An analog of the sensory projection *in the properties that are discriminable to the organism* is my candidate for the kind of representation that will do the job (i.e., generate the performance). There is no question of preserving in the IR properties that are *not* discriminable to the organism. As has been discussed before, there are two ways that IRs could in principle be invertible (with the discriminable properties of the sensory projection): by remaining structurally 1:1 with it or by going into symbols via A/D and an encryption and decryption transformation in a dedicated (hard-wired) system. I hypothesize that structural copies are much more economical than dedicated symbols for generating discrimination performance (and there is evidence that they are what the nervous system actually uses). But in principle, you can get invertibility and generate successful discrimination performance either way. CRs need not -- indeed cannot -- be invertible with the sensory projection because they must selectively discard all features except those that are sufficient to guide successful categorization performance (i.e., sorting and labeling, identification). Categorical feature-detectors must discard most of the discriminable properties preserved in IRs and selectively preserve only the invariant properties shared by all members of a category that reliably distinguish them from nonmembers. I have indicated, though, that this representation is still nonsymbolic; the IR to CR transformation is many-to-few, but it continues to be invertible in the invariant properties, hence it is really "micro-iconic." It does not invert from the representation to the sensory projection, but from the representation to invariant features of the category. (You can call this invertibility a matter of degree if you like, but I don't think it's very informative. The important difference is functional: What it takes to generate discrimination performance and what it takes to generate categorization performance.)
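The functional contrast drawn above can be put in a few lines (Python; the gain/offset transformation, the mean-intensity feature and the threshold are hypothetical): an iconic representation as an invertible analog of the discriminable sensory projection, versus a categorical representation as a many-to-few filter that keeps only an invariant feature and so cannot be inverted back to the projection.

    GAIN, OFFSET = 2, 5        # a dedicated, hard-wired analog transformation

    def iconic(projection):
        return [GAIN * x + OFFSET for x in projection]

    def invert_iconic(icon):
        return [(x - OFFSET) / GAIN for x in icon]

    def categorical(projection, threshold=100):
        # keep only the invariant feature needed to sort ("mean intensity
        # above threshold?"); many distinct projections collapse onto one value
        return sum(projection) / len(projection) > threshold

    p1 = [40, 180, 140]        # two distinct (discriminable) sensory projections
    p2 = [160, 180, 140]

    print(invert_iconic(iconic(p1)) == p1)    # True: the icon preserves what is discriminable
    print(categorical(p1), categorical(p2))   # True True: the CR sorts both alike,
                                              # so the projection cannot be recovered from it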
Finally, whatever invertibility SRs have is entirely parasitic on the IRs and CRs in which they are grounded, because the elementary SRs out of which the composite ones are put together are simply the names of the categories that the CRs pick out. That's the whole point of this grounding proposal. I hope this explains what is invertible and why. (I do not understand your question about the "invertibility" of the sensory projection to the distal object, since the locus of that transformation is outside the head and hence cannot be part of the internal representation that cognitive modeling is concerned with.) -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU
marty1@houdi.UUCP (M.BRILLIANT) (07/03/87)
In article <958@mind.UUCP>, harnad@mind.UUCP (Stevan Harnad) writes: > On ailist cugini@icst-ecf.arpa writes: > > why say that icons, but not categorical representations or symbols > > are/must be invertible? Isn't it just a vacuous tautology to claim > > that icons are invertible wrt to the information they preserve, but > > not wrt the information they lose?... there's information loss (many > > to one mapping) at each stage of the game ... In Harnad's response he does not answer the question "why?" He only repeats the statement with reference to his own model. Harnad probably has either a real problem or a contribution to the solution of one. But when he writes about it, the verbal problems conceal it, because he insists on using symbols that are neither grounded nor consensual. We make no progress unless we learn what his terms mean, and either use them or avoid them. M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
berleant@ut-sally.UUCP (Dan Berleant) (07/04/87)
In article <956@mind.UUCP> harnad@mind.UUCP (Stevan Harnad) writes: >Episodic memory -- rote memory for input experiences -- has the same >liability as any purely iconic approach: It can't generate category >boundaries where there is significant interconfusability among >categories of episodes. Are you assuming a representation of episodes (more generally, exemplars) that is iconic rather than symbolic? Also, *no* category representation method can generate category boundaries when there is significant interconfusability among categories! Dan Berleant UUCP: {gatech,ucbvax,ihnp4,seismo...& etc.}!ut-sally!berleant ARPA: ai.berleant@r20.utexas.edu
marty1@houdi.UUCP (M.BRILLIANT) (07/05/87)
In article <605@gec-mi-at.co.uk>, adam@gec-mi-at.co.uk (Adam Quantrill) writes: > It seems to me that the Symbol Grounding problem is a red herring. As one who was drawn into a problem that is not my own, let me try answering that disinterestedly. To begin with, a "red herring" is something drawn across the trail that distracts the pursuer from the real goal. Would Adam tell us what his real goal is? Actually, my own real goal, from which I was distracted by the symbol grounding problem, was an expert system that would (like Adam's last example) ground its symbols only in terminal I/O. But that's a red herring in the symbol grounding problem. > ..... If I > took a partially self-learning program and data (P & D) that had learnt from a > computer with 'sense organs', and ran it on a computer without, would the > program's output become symbolically ungrounded? No, because the symbolic data was (were?) learned from sensory data to begin with - like a sighted person who became blind. > Similarly, if I myself wrote P & D without running it on a computer at all, > [and came] up with identical > P & D by analysis. Does that make the original P & D running on the > computer with 'sense organs' symbolically ungrounded? No, as long as the original program learned its symbolic data from its own sensory data, not by having them defined by a person in terms of his or her sensory data. > A computer can always interact via the keyboard & terminal screen, (if > those are the only 'sense organs'), grounding its internal symbols via people > who react to the output, and provide further stimulus. That's less challenging and less useful than true symbol grounding. One problem that requires symbol grounding (more useful and less ambitious than the Total Turing Test) is a seeing-eye robot: a machine with artificial vision that could guide a blind person by giving and taking verbal instructions. It might use a Braille keyboard instead of speech, but the "terminal I/O" must be "grounded" in visual data from, and constructive interaction with, the tangible world. The robot could learn words for its visual data by talking to people who could see, but it would still have to relate the verbal symbols to visual data, and give meaning to the symbols in terms of its ultimate goal (keeping the blind person out of trouble). M. B. Brilliant Marty AT&T-BL HO 3D-520 (201)-949-1858 Holmdel, NJ 07733 ihnp4!houdi!marty1
harnad@mind.UUCP (Stevan Harnad) (07/05/87)
In Article 184 of comp.cog-eng: adam@gec-mi-at.co.uk (Adam Quantrill) of Marconi Instruments Ltd., St. Albans, UK writes: > It seems to me that the Symbol Grounding problem is a red herring. > If I took a partially self-learning program and data (P & D) that had > learnt from a computer with 'sense organs', and ran it on a computer > without, would the program's output become symbolically ungrounded?... > [or] if I myself wrote P & D without running it on a computer at all? This begs two of the central questions that have been raised in this discussion: (1) Can one speak of grounding in a toy device (i.e., a device with performance capacities less than those needed to pass the Total Turing Test)? (2) Could the TTT be passed by just a symbol manipulating module connected to transducers and effectors? If a device that could pass the TTT were cut off from its transducers, it would be like the philosophers' "brain in a vat" -- which is not obviously a digital computer running programs. -- Stevan Harnad (609) - 921 7771 {bellcore, psuvax1, seismo, rutgers, packard} !princeton!mind!harnad harnad%mind@princeton.csnet harnad@mind.Princeton.EDU