[comp.theory.info-retrieval] IRList Digest V3 #39

FOX@VTCS1.BITNET ("Edward A. Fox") (11/04/87)
IRList Digest           Tuesday, 3 November 1987      Volume 3 : Issue 39

Today's Topics:
   Discussion - Lexicon development: terms used by library catalog searchers
              - Barriers to library access
   Announcement - CELEX lexical databases
                - ACM CHI '88 Workshop on user interface consistency
                - Position in advanced software development (relating to IR)
   COGSCI - Parsing free word order languages
          - Experience, memory, and reasoning
   CSLI - External Language and Internal Representation
        - Situated Automata

News addresses are
   Internet or CSNET: fox@vtcs1.cs.vt.edu
   BITNET: fox@vtcs1.bitnet

----------------------------------------------------------------------

Date:         Wed, 14 Oct 87 10:25:04 CST
From:         JEFF HUESTIS <C81350JH@WUVMD>
Subject:      LEXICON DEVELOPMENT


I've been doing some keyword work with our catalog (an online KWOC
index has been the only thing ... possible to do at this time
because of other commitments on programmer time) which, of course,
included some word counting for development of a "stoplist".  The
results are, not surprisingly, somewhat different from the distribution
found in the Brown Corpus, the top 400 words from which formed our
original stoplist.  I thought something like this approach would be
useful in doing anything with term-weighting.  However, it would
presumably duplicate the resource you already have in your VTLS data.

[Note: we could generate such statistics, but they may differ from yours
due to the difference between library collections - that might be
interesting to see. - Ed]
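
[Note: a minimal sketch of the kind of word counting described above,
assuming the catalog text is available as a list of strings.  The function
name build_stoplist and the sample titles are invented for illustration
and are not taken from the catalog discussed in this message.]

```python
import re
from collections import Counter

def build_stoplist(records, size):
    """Return the `size` most frequent words across the given text records."""
    counts = Counter()
    for text in records:
        # Lowercase and keep alphabetic runs only; a real catalog would
        # probably need more careful tokenization.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return [word for word, _ in counts.most_common(size)]

# Hypothetical catalog titles, invented for illustration.
titles = [
    "The history of the library",
    "A guide to the history of printing",
    "The art of the book",
]
stoplist = build_stoplist(titles, 2)
print(stoplist)
```

A candidate list like this could then be compared with the top of a
general-purpose frequency list before being adopted.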

On the other hand, we also have several million log records from user
searches in our online catalog.  The subject searches would, like
the catalog data, be skewed toward information retrieval, as compared
with the Brown Corpus, and might be useful as representing the concept
space of real people, as opposed to librarians, computer scientists, and
other information workers.  Again, the value of the data, as I see it,
is primarily related to term weighting.  Let me know if you're
interested in this data, either in raw or condensed form.  . . .
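
[Note: the message does not say which weighting scheme is intended.  As
one possibility, the sketch below derives inverse-frequency weights from
a search log, treating each logged query as a "document".  The function
name idf_weights and the sample queries are assumptions made for
illustration only.]

```python
import math
from collections import Counter

def idf_weights(queries):
    """Map each term to log(N / n_t): N queries in total, n_t containing the term."""
    query_freq = Counter()
    for query in queries:
        # Count each term at most once per query.
        query_freq.update(set(query.lower().split()))
    total = len(queries)
    return {term: math.log(total / n) for term, n in query_freq.items()}

# Hypothetical subject searches, invented for illustration.
log_records = ["history of france", "history of art", "french art", "history"]
weights = idf_weights(log_records)
for term in sorted(weights, key=weights.get, reverse=True):
    print(term, round(weights[term], 3))
```

Terms that searchers use rarely receive high weights, and terms that
appear in most searches receive weights near zero.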

Of course, given that you're working with AILIST, maybe the Brown
Corpus is the most appropriate text for word counting, other than the
digests themselves.  But comparing the different distributions may
still be useful, if someone wants to do it.

[Note: this might be of interest to others - has anyone developed
statistics of this type? Does anyone have comments?  One thing I might
be able to do is to put up a merged word list with frequency info.
from various sources, on one of the CDROMs we will be mastering - let
me know if anyone thinks these statistics might be of value for
weighting experiments. - Ed]
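
[Note: one simple way to compare two word distributions, sketched under
the assumption that each source has already been reduced to a table of
word counts.  The ratio of relative frequencies is only one of several
possible measures; the names and counts below are invented for
illustration.]

```python
from collections import Counter

def relative_freq(counts):
    """Convert raw counts to relative frequencies that sum to 1."""
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

def overrepresented(target, reference, floor=1e-6):
    """Rank target words by (target rel. freq.) / (reference rel. freq.)."""
    t, r = relative_freq(target), relative_freq(reference)
    # Words absent from the reference get a small floor frequency
    # so that the ratio stays finite.
    ratios = {word: t[word] / r.get(word, floor) for word in t}
    return sorted(ratios, key=ratios.get, reverse=True)

# Hypothetical counts, invented for illustration.
catalog = Counter({"the": 50, "history": 30, "proceedings": 20})
general = Counter({"the": 700, "history": 10, "said": 290})
ranking = overrepresented(catalog, general)
print(ranking)
```

Words at the top of the ranking are those most characteristic of the
first source relative to the second.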

------------------------------

Date: Sun, 11 Oct 87 23:48:02 EDT
From: dws@EDDIE.MIT.EDU (Don W. Saklad)
Subject: barriers to library access

Subtle pervasive censorship by library officials at our Boston Public Library
still continues after the appointment of the new director.  Savvy visitors
and users ask for documentation such as annual reports and library system
manuals in order to learn the system and so develop effective and efficient
techniques, as people who have made careers in libraries do.  Reference
services sometimes decline these inquiries and related requests as whimsical
or intrusive.

The point is that the 130 year old public library should conserve and maintain
archives.  Even our library board's public meetings should be accessible,
instead of the intimidation that has discouraged civic interest.  Also,
marketing the library should encourage civic participation through comment,
criticism, praise and suggestions.  BPL public relations seems to follow a
formulaic, paternalistic approach to constituent groups: that through long
experience library officials know what's best for everyone without asking.

------------------------------

Date:     Wed, 7 Oct 87 14:31 N
From:     <CELEX@HNYMPI52>
Subject:  CELEX lexical databases
To:       foxea@vtvax3
X-Original-To:  foxea@vtvax3, CELEX

We think the work of the CELEX project may be of interest to the readers
of your digest, and would be grateful if you could include this short
notice in a future edition.
With thanks, Marcel Bingley
             Gavin Burnage
                           -- CELEX Nijmegen --

================================================================================


              C E L E X   -  CENTRE FOR LEXICAL INFORMATION
              =============================================

  CELEX is a new and rapidly-developing project undertaken by several Dutch
  institutions which aims to provide extensive information on the English
  and Dutch languages for use in many types of research. Detailed
  information on orthography, morphology, phonology, word frequency,
  syntactical categories etc. has been collected and collated from several
  sources and, by means of the ORACLE relational database management
  system, structured to form a highly flexible and wide-ranging source of
  lexical information.

  Newsletters detailing the development and current standing of the first
  stage of the CELEX project (the first stage covers all but semantic
  information, which will be added in the second stage beginning 1989) are
  now available to anyone who is interested. If you have not already been
  placed on our mailing list, then please send your surface and electronic
  addresses to:
                      CELEX@HNYMPI52

------------------------------

Date:         Sat, 17 Oct 87 15:42:53 DNT
From:         Jakob Nielsen  Tech Univ of Denmark <DATJN@NEUVM1>
Subject:      Announcement of ACM CHI'88 Workshop on user interface consistency


                            Call for Participation

                              CHI'88 Workshop on
              COORDINATING USER INTERFACES FOR CONSISTENCY

A limited attendance, invitational workshop on Coordinating User Interfaces
for Consistency is being organized for the ACM CHI'88 Conference in
Washington, DC. The workshop will be held on Monday, May 16, 1988.

The goal of the workshop is to discuss methods for coordinating the design of
user interfaces for consistency and to produce a set of recommendations for
people responsible for this aspect of user interface design. This workshop will
not cover the subject of user interface standards in the sense of how a
standard is arrived at or what standards should be recommended. (The workshop
will focus on HOW user interfaces can be made to look and feel similar
rather than WHAT they should look and feel like.)

One of the most important aspects of usability is consistency in user
interfaces. Consistency should apply both within the individual application
and across complete computer systems and product families. Practical methods
for coordinating user interface design are not well known. Issues of interest
would include:

   * What we mean by consistency
   * User Interface Architectures
   * In-house standards
   * Methods for quality assurance of compliance with consistency rules
   * Methods for coordinating small-scale projects
   * Methods for coordinating large-scale projects
   * Automated methods for checking user interface consistency

Participation in the workshop will be limited to twelve people. Individuals
wishing to attend the workshop may request an invitation by submitting a
two-page position paper. Applicants should also briefly list major projects
in user interface coordination in which they have participated.

Position papers are due no later than
                  March 1, 1988
Four copies, single-spaced, should be sent (use airmail to ensure arrival by
March 1) to:
  Jakob Nielsen
  Department of Computer Science
  Technical University of Denmark
  DK-2800 Lyngby Copenhagen
  DENMARK

  Telephone: International access +45-1-38 23 20
  BITnet:  datJN@NEUVM1
  ArpaNET: datJN%NEUVM1.bitnet@csnet-relay

Please submit position papers in hardcopy.

Notification to invited participants will be mailed by airmail March 15,
1988. Invited participants will also be sent copies of the selected position
papers along with a final agenda for the workshop.

------------------------------

Date: Thu, 15 Oct 87 10:05:26 EDT
Subject: Reply to your message and a job posting
From: j.a.king%dayton.ncr.com@RELAY.CS.NET

 . . .

Enclosed is a job-posting for a position at NCR R&D.  If you know of any
doctoral students or other interested parties, please forward this posting.

 . . .
Thanks.  Jim King  j.a.king@dayton.ncr.com

                               Job Posting


NCR Corporation

Available Position in Advanced Software Development

October 15, 1987


Consulting Analyst - AI

Responsibilities: Programming in the areas of intelligent interface design,
                  information retrieval, planning, knowledge-based systems
                  and other areas of advanced office information systems -
                  through the use of object-oriented techniques.

Preferred Background: Applicant should possess a solid background in UNIX
                  AI workstation environments, specifically Symbolics.
                  Experience with object-oriented programming, e.g. Flavors,
                  Smalltalk, CommonLoops, C++, etc. is required.  Minimum
                  of a B.S. in Computer Science and two years' experience
                  required, graduate level degree preferred.

Location:         NCR Corporate Research and Development Division in Dayton,
                  Ohio.
Contact:          Nelson Hazeltine or James King at (513)-445-1060 or 1090.

------------------------------

Date: Mon, 5 Oct 1987  18:46 EDT
From: Peter de Jong <DEJONG%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU>
Subject: Cognitive Science Calendar [Extract - Ed]

        Date: Friday, 2 October 1987  16:17-EDT
        From: Paul Resnick <pr at ht.ai.mit.edu>
        Re:   AI Revolving Seminar-- Michael Kashket

Thursday 8, October  4:00pm  Room: NE43- 8th floor Playroom

                    The Artificial Intelligence Lab
                        Revolving Seminar Series

                         Order Parser Word A Free-

                         Mike Kashket
                         (kash@oz.ai.mit.edu)

                         MIT AI Lab


Free-word order languages (where the words of a sentence may be spoken
in virtually any order with little effect on meaning) pose a great
problem for traditional natural language parsers.  Standard, rule-based
parsers have operated with some degree of success on fixed-word order
languages (e.g., English), relying on the order between words to
drive the construction of the parse tree.  In order to cover the varying
sequences of free word order, however, these parsers have had to use
grammars that contain one rule for each permutation of a sentence.  The
result was a linguistically uninteresting parse that did not even
represent the basic distinction between the verb's subject and object.
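
[Note: to give a sense of why one rule per permutation scales badly, the
sketch below enumerates the orderings of a four-constituent clause.  The
constituent labels are hypothetical and are not taken from the talk.]

```python
from itertools import permutations
from math import factorial

# Hypothetical constituent labels for a single clause.
constituents = ["SUBJ", "OBJ", "VERB", "ADV"]

# A permutation-based grammar needs one right-hand side per ordering.
rules = ["S -> " + " ".join(order) for order in permutations(constituents)]

print(len(rules))   # number of rules for a four-constituent clause
print(rules[0])     # first rule, in the original order
```

The count grows as n factorial, so a clause with six freely ordered
constituents would already require 720 such rules.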

A shift from rule-based to principle-based parsing seems to be the
answer.  A parser grounded on a linguistically principled theory---in
this case, the recently developed Government-Binding theory---has a
grammar that consists of independent modules, each representing a
different facet of the language.  For order phenomena, two
representations are mandated:  one that encodes linear precedence, and
one that encodes hierarchical, syntactic relations (such as subject and
object).  In this scheme, linear ordering is represented only where it
is syntactically relevant.
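
[Note: a toy illustration of keeping the two representations separate.
This is not the parser described in the talk and does not implement
Government-Binding theory; it assumes each word arrives with an explicit
marker, and the markers and words are invented for illustration.]

```python
def analyze(tokens):
    """tokens: list of (word, marker) pairs in spoken order."""
    roles = {"NOM": "subject", "ACC": "object", "V": "verb"}
    # Representation 1: linear precedence, exactly as spoken.
    precedence = [word for word, _ in tokens]
    # Representation 2: syntactic relations, read off the markers and
    # independent of position (a flat mapping here, not a real tree).
    relations = {roles[marker]: word for word, marker in tokens}
    return precedence, relations

# The same hypothetical sentence in two different word orders.
order_a = [("dog", "NOM"), ("sees", "V"), ("cat", "ACC")]
order_b = [("cat", "ACC"), ("dog", "NOM"), ("sees", "V")]

prec_a, rel_a = analyze(order_a)
prec_b, rel_b = analyze(order_b)
print(rel_a == rel_b)    # same relations
print(prec_a == prec_b)  # different linear order
```

Both orders yield the same subject and object, so no separate rule is
needed for each permutation.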

This new parsing technique should also work for fixed-word order
languages.  Here we take advantage of the parameters of GB theory.
The claim is that, rather than allowing unconstrained differences
between grammars, we can account for the variation among languages of
the world by encoding the grammar for each language with only a simple,
finite list of parameter settings.  For ordering phenomena, there are
two parameters:  the part of speech that identifies the subject and the
object, and whether words or morphemes are involved.

In this talk, I will present an implemented, GB-based parser that
handles Warlpiri, a free-word order aboriginal language of central
Australia.  I will also discuss the promise of this approach for
handling fixed-order languages such as English.


Ngakarnanyarra nyanyi.
(All come.)

------------------------------

Date: Tue, 20 Oct 1987  09:58 EDT
From: Peter de Jong <DEJONG%OZ.AI.MIT.EDU@XX.LCS.MIT.EDU>
Subject: Cognitive Science Calendar [Extract - Ed]

        Date: Monday, 19 October 1987  17:51-EDT
        From: Paul Resnick <pr at ht.ai.mit.edu>
        Re:   AI Revolving Seminar Thursday-- Janet Kolodner


Thursday 22, October  4:00pm  Room: NE43- 8th floor Playroom

                        The Artificial Intelligence Lab
                        Revolving Seminar Series


                         Experience, Memory, and Reasoning

                         Janet Kolodner


    Much of the reasoning people do is based on previous experiences
similar to their current situation.  The process of using a previous
experience to reason about a current one is called case-based
reasoning.  In case-based reasoning, a reasoner remembers a
previous case and then adapts it to fit the current situation.
A reasoner that uses case-based reasoning can take reasoning
shortcuts, avoid previously-made errors, and focus on important
parts of a problem and important knowledge that might otherwise
have been missed.
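
[Note: a minimal sketch of the remember-then-adapt cycle described
above, assuming cases are stored as feature/solution pairs.  The
similarity measure (count of matching features), the adaptation rule,
and the sample cases are all assumptions made for illustration; they are
not drawn from the talk.]

```python
def retrieve(case_base, problem):
    """Return the stored case whose features overlap most with the problem."""
    def overlap(case):
        return sum(1 for k, v in problem.items()
                   if case["features"].get(k) == v)
    return max(case_base, key=overlap)

def adapt(case, problem):
    """Reuse the old solution, overriding slots the new problem specifies."""
    solution = dict(case["solution"])
    for key, value in problem.items():
        if key in solution:
            solution[key] = value
    return solution

# Hypothetical case base, invented for illustration.
case_base = [
    {"features": {"diet": "vegetarian", "season": "summer"},
     "solution": {"main": "grilled vegetables", "servings": 4}},
    {"features": {"diet": "any", "season": "winter"},
     "solution": {"main": "beef stew", "servings": 6}},
]
problem = {"diet": "vegetarian", "season": "summer", "servings": 8}

best = retrieve(case_base, problem)
plan = adapt(best, problem)
print(plan)
```

The old solution is reused as a shortcut, and only the part that differs
from the remembered situation is changed.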

    To build usable case-based reasoning systems on the computer,
we must discover the best ways to make case-based inferences,
how to best organize and retrieve cases in memory, and how to
integrate case-based with other reasoning methods.

    In this talk, I will present several case-based reasoning
methods and present some of the problems involved in developing
case-based problem solving systems.  Examples will come from
several expert and common-sense domains, and examples from
several experimental programs will be shown.

------------------------------

Date: Wed 14 Oct 87 17:42:31-PDT
From: Emma Pease <Emma@CSLI.STANFORD.EDU>
Subject: CSLI Calendar, Oct. 15, 3:3 [Extract - Ed]

              External Language and Internal Representation
                                Pat Hayes
                          (Hayes.pa@xerox.com)
                               October 22

   Language evolved, and is used, for communication between intelligent
   agents.  Internally represented information is used quite differently,
   and different assumptions must be made in thinking about ways of
   encoding it for use inside a mind.  In particular, communication can
   assume an intelligent decoder on the other end but is severely
   constrained by the bandwidth of speech, while internal representations
   seem to have much wider channels of communication available between
   their component parts but must be explicit and detailed to an extent
   that would be inappropriate for a `natural' language.  I will argue
   that general talk of `information' ignores this important distinction
   and is therefore sometimes confusing in discussions of situated
   agency.

------------------------------

Date: Thu 22 Oct 87 09:09:09-PDT
From: Emma Pease <Emma@CSLI.STANFORD.EDU>
Subject: CSLI Calendar, Oct. 22, 3:4 [Extract - Ed]

                  An Introduction to Situated Automata
                         Part I: Basic Concepts
                            Stan Rosenschein
                               October 29

   This is the first of two lectures on the situated-automata approach to
   the analysis and design of embedded systems.  This approach seeks to
   ground our understanding of embedded systems in a rigorous, objective
   analysis of their informational properties, where information is
   modeled mathematically in terms of correlations between states of the
   system and conditions in the environment. In this talk we motivate the
   general framework, present the central mathematical ideas on how
   information is carried in the states of automata, and relate the
   mathematical properties of the model to key theoretical issues in AI
   including the nature of knowledge, its representation in machines, the
   role of syntactic deduction, "nonmonotonic" reasoning, and the
   relation of knowledge and action.  Some general technological
   implications of the approach, including reduced reliance on
   conventional symbolic inference and increased opportunities for
   parallelism, will be discussed.

   The second lecture will describe the application of the
   situated-automata perspective to specific problems arising in the
   design of integrated intelligent agents, including problems of
   perception, planning and action selection, and linguistic
   communication.
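
[Note: one strict reading of "information as correlation" is that a
machine state carries the information that a condition holds if the
condition is true in every environment in which the machine can be in
that state.  The sketch below checks this over a small enumerated set of
situations.  The lectures may formalize the idea differently; the state
names and conditions are invented for illustration.]

```python
def carries(situations, state, condition):
    """True if `condition` holds in every environment paired with `state`."""
    envs = [env for s, env in situations if s == state]
    return bool(envs) and all(condition(env) for env in envs)

# Hypothetical joint (machine state, environment) situations.
situations = [
    ("alarm", {"temp": 95, "door": "open"}),
    ("alarm", {"temp": 102, "door": "shut"}),
    ("idle", {"temp": 60, "door": "open"}),
    ("idle", {"temp": 70, "door": "shut"}),
]

def hot(env):
    return env["temp"] > 90

def door_open(env):
    return env["door"] == "open"

print(carries(situations, "alarm", hot))        # state correlated with heat
print(carries(situations, "alarm", door_open))  # state says nothing about the door
```

On this reading no symbolic inference is needed: the information is a
property of how states and environments co-occur.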

------------------------------

END OF IRList Digest
********************