[comp.theory.info-retrieval] IRList Digest V4 #56

fox@VTODIE.CS.VT.EDU (Edward A. Fox) (11/29/88)

IRList Digest           Monday, 28 November 1988      Volume 4 : Issue 56
     
Today's Topics:
   Query - CD-ROMs of Value
   Discussion - CD-ROM Network Server
              - Pedagogical Models for IR
   Announcement - Workshop on Evaluation of NLP Systems
                - Staffing at NSF
                - Times Collection for IR Research
                - New CSLI Publications
     
News addresses are
   Internet: fox@vtopus.cs.vt.edu
   BITNET: foxea@vtcc1.bitnet (replaces foxea@vtvax3)
     
----------------------------------------------------------------------
     
Date:     Tue, 11 Oct 88 11:16 N
From:     <COR_HVH@HNYKUN52>
Subject:  CD-ROM data of value?

     
Dear Ed,
     
 ...
     
CD-ROM interfaces for microcomputers have reached the point where they
are affordable for small departments and for home use. However, is
there any real use for them yet? Which data CDs are available at this
moment (what is each called, what is on it, is it in a standard format,
where can you get it, what does it cost)?
     
=========
     
An extension to this: what data will be on the Virginia Disk(s)?
     
Best wishes,
     
Hans van Halteren            COR_HVH @ HNYKUN52.BITNET

[Note: Virginia Disc One is still in process - we have added
a Time collection from Cornell via UMass, a Rutgers
collection from Tefko Saracevic, and have almost all of the
LISA collection from P. Willett - we are still waiting for the
last piece of that.  Regarding other CD-ROMs - I think this year
marks the turning point in price and availability - many new
titles are coming out in High Sierra or ISO form at lower and
lower prices. - Ed]
     
------------------------------

Date:     Tue, 11 Oct 88 12:12:20 PDT
From:     PAAAAA7@CALSTATE
Subject:  CD Roms and Such

Mr. Fox;
I was just given a copy of your CD-Rom letter to Dr. Dick Botting,
and thought I would pass along a product I just found out about.
Meridian Data Inc. markets an interface to plug a High Sierra CD-ROM
into a Novell Network. Sounds like something we all could use! If you
are interested, please let me know and I will grab the address for you.
-Rich McGee
Computer Center
CSUSB

------------------------------

Date:    Fri, 30 Sep 88 20:11 PDT
To:      Ed A Fox                             <FOXEA@VTCC1.BITNET>
From:    Marcia Bates                         <IFQ0MJB@UCLAMVS.BITNET>
Subject: Pedagogical models for IR

[Note: see earlier discussion in issues 39, 40, 46 - Ed.]

I just came onto IRLIST, so I do not know the text of the original question
regarding pedagogical models. However, I have some suggestions that may be of
interest.

First, one must ask whether the interest is in searches only within a given
automated system, or in the overall process of retrieving info.  Some of the
most important questions a searcher--whether end user or expert--must decide
in a real-life search are which system to use and whether to use a manual or
online system.  Some of the most serious errors made by librarian-students and
by practitioners involve initiating searches on systems that are quite
inappropriate for the query in hand--looking for statistical data on
bibliographic databases, for example.

 If these broader questions are of interest, then there is a growing
literature both in the online area and in the area of "bibliographic
instruction" in the library field.  IRLIST members are probably more
familiar with the online literature than with BI. For BI, Constance
Mellon's 1987 book Bibliographic Instruction: The Second Generation is a
good place to start.  Some other interesting recent articles dealing with
college students are:

Stoan, Stephen K. "Research in Library Skills..."  College and Research
   Libraries, 45 (Mar 84): 99-109.

Dunn, Kathleen  "Psychological Needs and Source Linkages in Undergraduate
   Information Seeking Behavior."  College and Research Libraries 47
   (Sept 86): 475-481.

Also see the book and articles by Nigel Ford, who has dealt in great detail
with higher education students.

If, on the other hand, interest is in automated systems primarily, or in
the design of search interfaces, a number of my own articles deal with many
of these issues (often dealing with both manual and online systems
simultaneously), with models or model fragments being more or less explicit.
See my:

"Information Search Tactics"  Journal of ASIS 30(July 1979): 205-214.
     Describes classes of search models, classes of tactics, and 29
     particular search tactics.

 "Idea Tactics"  JASIS  30 (Sept 1979): 280-289.
     17 additional tactics.

 "Search Techniques."  Annual Review of Information Science and Technology
     16 (1981): 139-169.
        Reviews lib/info sci literature on psychology of searching to that date

 "Locating Elusive Science Information: Some Search Techniques"  Special
     Libraries 75 (Apr 84): 114-120.
        Drawing on model of the scientific publication cycle, the searcher
        is shown how to locate info thought to be inaccessible.

 "The Fallacy of the Perfect 30-item Online Search."  RQ  24 (Fall 84): 43-50.
        Discusses psychological traps for end users and intermediaries that
        degrade quality of online searches.

 "An Exploratory Paradigm for Online Information Retrieval."  In Intelligent
     Information Systems for the Information Society, ed. by B.C. Brookes.
     Amsterdam: Elsevier, 1986, p. 91-99.
        Model of broad classes of info seeking plus discussion of two major
        types of search in automated systems.

 "Subject Access in Online Catalogs: A Design Model."  JASIS 37 (Nov 86):
     357-376.
        Model incorporating psychological and linguistic factors in design of
        subject search interface for online catalogs.  Drawing upon recent
        research and thinking, departs dramatically from a number of
        traditional assumptions.

  "How to Use Information Search Tactics Online." ONLINE 11 (May 87): 47-54.
     Groups and applies search tactics in online environment to help at
     several stages of search, as well as in cases where too many or too few
     items are retrieved.

  "How to Use Controlled Vocabulary More Effectively in Online Searching."
     ONLINE, Nov. 88, in press.
         Argues that features of various indexing and classification
         systems used in databases must be taken into account in online
         search formulation, i.e., different indexing systems require
         different strategies, so searcher must be able to recognize
         controlled vocabulary types.

See also David Bawden, "Information Systems and the Stimulation of
Creativity"  Journal of Info Science 12 (86): 203-216.

                                 --Marcia J. Bates  ifq0mjb@uclamvs.bitnet
                                   (Graduate School of Library and
                                        Information Science
                                    120 PLB  UCLA
                                    Los Angeles, CA 90024-1520 USA)

------------------------------

Date: Fri, 2 Sep 88 12:19:15 EDT
From: palmer@BURDVAX.PRC.UNISYS.COM
Subject: nl evaluation workshop



                   CALL FOR PARTICIPATION

                        Workshop on
     Evaluation of Natural Language Processing Systems

                          Dec 8-9
           Wayne Hotel, Wayne, PA (Philadelphia)

     There has been much recent interest in the difficult
problem of evaluating natural language systems.  With the
exception of natural language interfaces there are few
working systems in existence, and they tend to be concerned
with very different tasks and use equally different
techniques.  There has been little agreement in the field
about training sets and test sets, or about clearly defined
subsets of problems that constitute standards for different
levels of performance.  Even those groups that have attempted
a measure of self-evaluation have often been reduced to
discussing a system's performance in isolation - comparing
its current performance to its previous performance rather
than to another system.  As this technology begins to move
slowly into the marketplace, the need for useful evaluation
techniques is becoming more and more obvious.  The speech
community has made some recent progress toward developing
new methods of evaluation, and it is time that the natural
language community followed suit.  This is much more easily
said than done and will require a concentrated effort on the
part of the field.

     There are certain premises that should underlie any
discussion of evaluation of natural language processing
systems:

(1)  It should be possible to discuss system evaluation in
     general without having to state whether the purpose of
     the system is "question-answering" or "text
     processing."  Evaluating a system requires the
     definition of an application task in terms of I/O
     pairs which are equally applicable to
     question-answering, text processing, or generation.

(2)  There are two basic types of evaluation: a) "black box
     evaluation" which measures system performance on a
     given task in terms of well-defined I/O pairs; and b)
     "glass box evaluation" which examines the internal
     workings of the system.  For example, glass box
     performance evaluation for a system that is supposed
     to perform semantic and pragmatic analysis should
     include the examination of predicate-argument
     relations, referents, and temporal and causal relations.
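
As an illustration of premise (2a), black box evaluation over I/O
pairs might be sketched as follows.  This is only a minimal sketch
under stated assumptions: the announcement does not define a scoring
rule, so exact string match is assumed here, and the names
black_box_score and toy_system and the sample pairs are hypothetical.

```python
from typing import Callable, Iterable, Tuple


def black_box_score(system: Callable[[str], str],
                    io_pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of I/O pairs for which the system's output equals the
    expected output.  Exact match is an assumption of this sketch; the
    workshop announcement does not prescribe a scoring rule."""
    pairs = list(io_pairs)
    if not pairs:
        raise ValueError("need at least one I/O pair")
    correct = sum(1 for inp, expected in pairs if system(inp) == expected)
    return correct / len(pairs)


def toy_system(question: str) -> str:
    """Hypothetical stand-in for a real system: a fixed lookup table."""
    answers = {"capital of France?": "Paris", "2 + 2?": "4"}
    return answers.get(question, "unknown")


# Made-up I/O pairs defining the application task.
io_pairs = [
    ("capital of France?", "Paris"),
    ("2 + 2?", "4"),
    ("author of Hamlet?", "Shakespeare"),
]

score = black_box_score(toy_system, io_pairs)  # 2 of the 3 pairs match
print(f"black box score: {score:.2f}")
```

Because the harness sees only inputs and outputs, any two systems can
be compared on the same set of pairs whatever their internals; glass
box evaluation would instead need access to intermediate results.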

     Given these premises, the workshop will be structured
around the following three sessions:

  1) Defining "glass box evaluation" and "black box evaluation."
  2) Defining criteria for "black box evaluation."  A Proposal
     for establishing task oriented benchmarks for NLP Systems
     (Session Chair - Beth Sundheim)
  3) Defining criteria for "glass box evaluation."
     (Session Chair - Jerry Hobbs)

Several different types of systems will be discussed,
including question-answering systems, text processing systems
and generation systems.

[Note: too late for sending in papers - sorry for not
getting this out sooner - I have omitted that section since
it is no longer relevant - Ed.]

                       Martha Palmer
                           Unisys
                   Research & Development
                         PO Box 517
                      Paoli, PA 19301
                   palmer@prc.unisys.com
                       (215) 648-7228

------------------------------

Date: Tue, 27 Sep 88 23:41 EDT
From: EHRICH@vtcs1.cs.vt.edu
Subject: NSF staffing

Recently someone asked me where to send something at NSF.  Here is the 
staffing of the CCR (Computer and Computation Research) and IRIS 
(Information, Robotics, and Intelligent Systems) divisions:

CCR:

Director: Peter Freeman
Deputy Director: Helen Gigley

Theory: Errol Lloyd
Computer Architecture: Zeke Zalcstein
Software Engineering: John Gannon
Numeric and Symbolic Computation: Kamal Abdali
Software Systems: Thomas Keenan (returning after October 15)


IRIS:

Director: Y.T. Chien
Deputy Director: Bruce Barnes (surprise, folks?)

Robotics and Machine Intelligence: Ken Laws
Knowledge and Cognitive Systems: Henry Hamburger
Database and Expert Systems: Maria Zemankova
Interactive Systems: Hal Bamford
Information Technology and Organizations: Laurence Rosenberg

------------------------------

Date:     Mon, 10 Oct 88 17:00 EDT
From: krovetz@UMass
Subject:  times collection
     
Ed,
  Just wanted to let you know that I finally did get the old Times
collection from Cornell (thanks to Chris Buckley).  I have four files:
the documents themselves, the queries, a stopword list, and a set of
relevance judgements.  Chris said he tried SMART on it and got an
average performance of 64% precision (averaged over the 25%, 50% and
75% recall levels).  This is pretty high, but the collection is also
rather small (425 documents and 83 queries).
     
-bob
[Note: I am putting that collection and a few others onto
Virginia Disc One - thanks to Bob for his perseverance in
sending the many files over BITNET, till we have them all - Ed.]
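
For readers unfamiliar with the measure quoted above, precision
averaged over the 25%, 50% and 75% recall levels can be sketched for a
single query as below.  This is a hedged sketch, not the SMART
procedure: the message does not say how SMART interpolates, so the
common "best precision at any recall at or above the level" rule is
assumed, and the function names and toy ranking are made up.

```python
def precision_at_recall_levels(ranking, relevant,
                               levels=(0.25, 0.50, 0.75)):
    """Interpolated precision at each recall level for one query.

    Assumption of this sketch: precision at a level is the best
    precision observed at any recall >= that level, or 0.0 if the
    level is never reached."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("query has no relevant documents")
    points = []  # (recall, precision) each time a relevant doc is found
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    return [max((p for r, p in points if r >= level), default=0.0)
            for level in levels]


def three_point_average(ranking, relevant):
    """Mean of the precision values at the three recall levels."""
    values = precision_at_recall_levels(ranking, relevant)
    return sum(values) / len(values)


# Made-up ranking for one query; d8 is relevant but never retrieved.
ranking = ["d3", "d7", "d1", "d9", "d4", "d2"]
relevant = {"d3", "d1", "d4", "d8"}

per_level = precision_at_recall_levels(ranking, relevant)
average = three_point_average(ranking, relevant)
print(per_level, average)
```

A collection-level figure such as the 64% reported would then be the
mean of this per-query value over all queries in the collection.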

------------------------------

Date: Wed, 12 Oct 88 17:13:00 PDT
From: Emma Pease <emma@CSLI.STANFORD.EDU>
Subject: CSLI Calendar, October 13, 4:4 [Extract - Ed.]

 ...
			    NEW PUBLICATIONS

   The following reports have recently been published.  They may be
   obtained, or a full list acquired, by writing to Trudy Vizmanos,
   CSLI, Ventura Hall, Stanford, CA 94305-4115, or
   publications@csli.stanford.edu.

   112. Bare Plurals, Naked Relatives, and Their Kin.
        Dietmar Zaefferer $2.50 

   113. Events and ``Logical Form''.        Stephen Neale $2.00 

   114. Backwards Anaphora and Discourse Structure: Some
        Considerations.        Peter Sells $2.50 

   115. Toward a Linking Theory of Relation Changing Rules in LFG.
        Lori Levin $4.00 

   116. Fuzzy Logic.        L. A. Zadeh $2.50 

   117. Dispositional Logic and Commonsense Reasoning.
        L. A. Zadeh $2.00 

   118. Intention and Personal Policies.      Michael Bratman $2.00 

   119. Propositional Attitudes and Russellian Propositions.
        Robert C. Moore $2.50 

   120. Unification and Agreement.        Michael Barlow $2.50 

   121. Extended Categorial Grammar.     Suson Yoo and Kiyong Lee $4.00

   122. The Situation in Logic---IV: On the Model Theory of Common
        Knowledge.     Jon Barwise $2.00

   123. Unaccusative Verbs in Dutch and the Syntax-Semantics Interface.
        Annie Zaenen $3.00 

   124. What Is Unification? A Categorical View of Substitution,
        Equation and Solution.     Joseph A. Goguen $3.50 

   125. Types and Tokens in Linguistics.    Sylvain Bromberger $3.00 

   126. Determination, Uniformity, and Relevance: Normative Criteria 
        for Generalization and Reasoning by Analogy.
        Todd Davies $4.50 

   127. Modal Subordination and Pronominal Anaphora in Discourse.
        Craige Roberts $4.50 

   128. The Prince and the Phone Booth: Reporting Puzzling Beliefs.
        Mark Crimmins and John Perry $3.50 

   129. Set Values for Unification-Based Grammar Formalisms and Logic
        Programming.    William Rounds $4.00 

   130. Fifth Year Report of the Situated Language Research Program.
        free 

   131. Locative Inversion in Chichewa: A Case Study of Factorization
        in Grammar.     Joan Bresnan and Jonni M. Kanerva $5.00 

   132. An Information-Based Theory of Agreement. 
        Carl Pollard and Ivan A. Sag $4.00

------------------------------
     
END OF IRList Digest
********************