[net.ai] Question about HEARSAY-II.

doshi%umn-cs.csnet@csnet-relay.arpa (08/23/84)

I have a question about the HEARSAY-II system [Erman et al. 1980].

What exactly is the HEARSAY-II system required/supposed to do?
i.e., what is the meaning of the phrase:
      "Speech Understanding system"


Honestly, I did go through [Erman et al. 1980] carefully. I can quote the following:

      page 213 : "The HEARSAY-II system....represents both a specific
                  solution to the speech-understanding problem and a
                  general framework for co-ordinating independent
                  processes to achieve cooperative problem solving
                  behaviour."

      page 213 : "The HEARSAY-II reconstructs an intention ....      "

      page 214 : "The HEARSAY-II recognises connected speech in .... "

      page 234 : (this is a footnote)
                 "IBM has been funding work with somewhat different
                  objective... Its stated goals mandate little reliance
                  on the strong syntactic/semantic/task constraints
                  exploited by the DARPA projects. This orientation is
                  usually dubbed SPEECH RECOGNITION as distinguished
                  from SPEECH UNDERSTANDING."

       page 233 : "DARPA speech understanding system performance goals.....
                                -------------                    -----
                  The system should
                      - Accept connected speech
                      - from many
                      - cooperative speakers of the General American Dialect
                      - in a quiet room
                      - using a good-quality microphone
                      - with slight tuning per speaker
                       - requiring only natural adaptation by the user
                       - permitting a slightly selected vocabulary of 1000 words
                      - with a slightly artificial syntax and highly
                        constrained task
                      - providing graceful interaction
                      - tolerating less than 10 % semantic error

                        [this is the only direct reference to `understanding`
                         or `semantics`]

                      - ....... "
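
As far as I can tell, the "general framework for co-ordinating independent
processes" in the first quote refers to HEARSAY-II's blackboard architecture:
knowledge sources that never call each other directly, but only read and post
rated hypotheses on a shared blackboard, with a scheduler deciding who runs
next. Just to fix terminology, here is my own toy sketch of that idea (in
Python); every name and number below is mine, and the real system was of
course nothing this small:

      # Toy blackboard: a shared list of rated hypotheses.
      blackboard = []

      def ks_word_guesser(bb):
          """Knowledge source 1: posts word hypotheses from (pretend)
          acoustic evidence.  Returns True if it changed the blackboard."""
          if not bb:
              bb.extend([{"words": ["are", "many"], "rating": 50},
                         {"words": ["are", "any"],  "rating": 48}])
              return True
          return False

      def ks_syntax(bb):
          """Knowledge source 2: re-rates hypotheses it finds grammatical.
          It knows nothing about KS 1; it only reads/writes the blackboard."""
          changed = False
          for h in bb:
              if h["words"][:2] == ["are", "any"] and h["rating"] < 60:
                  h["rating"] = 60
                  changed = True
          return changed

      # A crude scheduler: keep letting knowledge sources run until nobody
      # changes the blackboard any more, then take the best hypothesis.
      knowledge_sources = [ks_word_guesser, ks_syntax]
      while any(ks(blackboard) for ks in knowledge_sources):
          pass
      print(max(blackboard, key=lambda h: h["rating"]))

That much I think I follow; my question is about what the OUTPUT of such a
system is supposed to mean.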

Let me explain my confusion with examples. Does the system do one of the
following:

      - 1) Accept speech as input, then try to output whatever was (or might
           have been) spoken?

      - 2) Or, accept speech as input and UNDERSTAND it?

Now, 1) above is, I think, speech RECOGNITION. DARPA did not want just that.
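
To make the contrast concrete, here is another toy sketch, again entirely my
own invention (except the sample utterance, which I believe is the running
example in the paper). Recognition just reports a word string; understanding,
as I read the DARPA goals, uses the restricted vocabulary, syntax, and task to
settle on the query the speaker INTENDED, and is judged by semantic error
rather than word-for-word accuracy:

      # Pretend the acoustic front end offers several rival word strings.
      CANDIDATES = [
          ("are many by feigenbaum and feldman", 0.61),
          ("are any by feigenbaum and feldman",  0.59),
          ("bar any buy feigenbaum and feldman", 0.57),
      ]

      def recognize(candidates):
          """RECOGNITION: report the acoustically best word string,
          sensible or not."""
          return max(candidates, key=lambda c: c[1])[0]

      # A tiny, artificial task grammar: the only legal request in this toy
      # task is "are any by <author> and <author>" (a database query).
      AUTHORS = {"feigenbaum", "feldman", "minsky", "newell"}

      def legal_query(words):
          w = words.split()
          return (len(w) == 6 and w[:3] == ["are", "any", "by"]
                  and w[3] in AUTHORS and w[4] == "and" and w[5] in AUTHORS)

      def understand(candidates):
          """UNDERSTANDING (as I read the DARPA goals): use the constrained
          syntax/semantics of the task to settle on the query the speaker
          intended, even if the acoustically best string is wrong."""
          for words, _score in sorted(candidates, key=lambda c: -c[1]):
              if legal_query(words):
                  w = words.split()
                  return {"action": "retrieve", "authors": [w[3], w[5]]}
          return {"action": "reject", "reason": "no legal interpretation"}

      print(recognize(CANDIDATES))   # a word string, possibly wrong
      print(understand(CANDIDATES))  # the retrieval query that was meant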

Then, what is (are) the meaning(s) of UNDERSTAND?

      - If I say "Alligators can fly", should the system repeat this and also
        tell me that it is "not true"; is this called UNDERSTANDING?

      - If I say "I go house", should the system repeat this and also add
        that there is a "grammatical error"; is this called UNDERSTANDING?

      - Or, if HAYES-ROTH claims "I am ERMAN", should the system say
        "No, you are not ERMAN"?  I don't think that HEARSAY was supposed
        to do this (it does not have vision, etc.), but you will agree that
        that is also UNDERSTANDING.  Note that the above claim by
        HAYES-ROTH would be true if:
              - he had changed his last name
              - he was merely QUOTING what ERMAN might have said somewhere
              - etc.

So, could someone (the original authors of HEARSAY-II, perhaps)
respond to the question:
        In light of the above examples, what does it mean to say
        that HEARSAY-II understands speech?

Thank you.

-- raj
   Graduate student
   U. of Minnesota

   CSNET : doshi.umn-cs@csnet-relay


Reference : L. D. Erman, F. Hayes-Roth, V. Lesser, and D. R. Reddy,
            "The Hearsay-II Speech-Understanding System: Integrating
             Knowledge to Resolve Uncertainty."
            ACM Computing Surveys, Vol. 12, No. 2, June 1980.