[comp.databases] Review on full text databases

xexeo@cernvax.cern.ch (geraldo xexeo) (05/28/91)

It is time to review the information I got.

There was not much interest, but I think it is fair to share all the
information I have so far.

The subject was: Do you have information about text databases?

I agree that it was too broad, but that is how things start. I am
most interested (but not exclusively) in systems that maintain databases of
structured documents, written in SGML, LaTeX, etc...

I received answers from:
* Liam Quin <lee@sq.sq.com>,
    who told me about lq

* Harlan Stenn <plus5!harlan%plus5.com>,
    who is implementing a system for a newspaper!

* Debbie Hoekman <debbie@iris.claremont.edu>,
    who suggested "Full Text Databases" by Carol Tenopir

* Marvin Beck <GO.MSB@ISUMVS.BITNET>,
    whose boss would like to know about my findings... ;-)

* <klassen@sol.uvic.ca>,
    who reminded me of SPIRES!

* Bruce Atherton <atherton@unixg.ubc.ca>,
    who gave me a good reading list and pointers to conferences, and also to lq.

My thanks to all of you!

And so, here are the references I have, in the same refer-compatible
format used by the HCI bibliography project.

They come from the people listed and from my own findings in the libraries
of CERN and ETH/Zurich (which has a nice computer system). Needless to say,
the suggested references did not overlap with what the libraries hold, so I
will have to request the suggested books through the inter-library
service... Murphy is all around.

Ah! The HCI bibliography is a project that is building a database of articles
on Human-Computer Interaction (yes, Hypertext is there).
FTP to cheops.cis.ohio-state.edu, look in pub/hcibib, OR send a message
to hcibib@cis.ohio-state.edu whose first line is: "Send: index".


OK, here comes the listing:
===========================

(PS: I do not list things that are too obvious, such as the TeX/LaTeX books,
 or some outdated material that was just at hand in the library :-))

%T title, %A author, %E editor, %B book containing the article, %C city,
%I issuer, %D date, %P pages, %G isbn, %S series, %R tech. report,
%X comments, %O other information.
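
For anyone who wants to load this listing into a program, here is a minimal
sketch of a reader for the format described above. It assumes that records are
separated by blank lines, that every field line starts with "%" plus one key
letter, and that a line without "%" continues the previous field. The name
parse_refer is hypothetical, not part of any existing tool.

```python
from collections import defaultdict


def parse_refer(text):
    """Parse refer-style text into a list of records.

    Each record is a dict mapping a key letter (e.g. "T", "A") to the list
    of values given for that key, in order of appearance.
    """
    records = []
    current = defaultdict(list)
    last_key = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # A blank line closes the current record, if any.
            if current:
                records.append(dict(current))
            current = defaultdict(list)
            last_key = None
        elif line.startswith("%") and len(line) >= 2:
            last_key = line[1]
            current[last_key].append(line[2:].strip())
        elif last_key is not None:
            # Assumption: a line without "%" continues the previous field.
            current[last_key][-1] += " " + line
    if current:
        records.append(dict(current))
    return records


if __name__ == "__main__":
    sample = "%T Practical SGML\n%A VAN HERWIJNEN, Eric\n%D 1990\n"
    for record in parse_refer(sample):
        print(record)
```

Repeated keys are kept as separate list items, so several %A lines give
several authors, and the repeated %T lines used in this listing for long
titles can be joined with " ".join(record["T"]).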



%T Full Text Databases 
%A Carol Tenopir

%T Electronic filing and retrieval : developments in full text retrieval
%A John A. T. Pritchard
%C Manchester
%I NCC Publications
%D 1989
%P 175
%G ISBN 0-85012-788-2
                      
%A Van Rijsbergen, C. J.
%T Information Retrieval (Second Edition)
%I Butterworths
%C London/Boston
%D 1979

%A Salton, G.
%T Automatic Text Processing
%I Addison-Wesley
%C Reading, Mass.
%D 1989

%T Text retrieval : the state of the art
%T : proceedings of the Institute of Information Scientists Text Retrieval
%T   Conferences: "The User's Perspective" (1988) and "Text Management" (1989)
%E Peter Gillman
%C London
%I Taylor Graham
%D 1990
%P 208
%G ISBN 0-947568-44-1

%A Salton, G. and McGill, M.
%T Introduction to Modern Information Retrieval
%I McGraw-Hill
%C New York
%D 1983
 
%A  Sanchez de Miguel, A
%T  SGML and SQLtextretrieval  
%D  1990
%R  Technical Report  CERN CN 90-9
%I  CERN - CN division
%C  Geneva
%X  Describes the use of a system at CERN that stores SGML-formatted
%X  text in an Oracle SQL Text Retrieval database.

%A  Ayres, F H
%T  Electronic document delivery; 9, the linkage between bibliographic
%T    and full-text databases - a feasibility study 
%R  Technical report  EUR 10677 EN
%D  1987
%X  I could not find it; it was lost in my library...
%O  It seems to be from the European Community

%A DOUGHERTY, Dale
%T UNIX text processing
%C Indianapolis, Ind.
%I Hayden Books
%D 1987 (repr. 1989)
%P 665

%E KIMBERLEY, Robert
%T Text retrieval : a directory of software. - 2nd ed.
%C Aldershot
%I Gower
%D 1987

                                                                    
%A WORKSHOP ON THE INTRODUCTION TO TEXT PROCESSING SYSTEMS. 1985
%T An introduction to text processing systems : current problems and
%T solutions
%C Dublin
%I Boole Press
%D 1985
%P 120

%A WORKSHOP ON THE INTRODUCTION TO TEXT PROCESSING SYSTEMS. 1984
%T An introduction to text processing systems : lecture notes, Dublin,
%T Ireland
%D 22-23 Oct. 1984
%C Dublin
%I Boole Press
%P 59

%A WILLETT, Peter
%T Parallel database processing : text retrieval and cluster analysis
%T using the DAP
%I Pitman
%C London
%D 1990
%P 173

%A INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE. Le Chesnay
%T Le document electronique
%C Rocquencourt
%I INRIA
%D 1990
%P 216
%R Technical report
%X Not checked
                                             
%A VAN HERWIJNEN, Eric
%T Practical SGML
%C Dordrecht
%I Kluwer
%D 1990
%P 307
%X Good introduction to SGML

%A EUROPEAN CONFERENCE ON TEX FOR SCIENTIFIC DOCUMENTATION. 1986
%T Proceedings, Strasbourg
%D 19-21 June 1986
%C Berlin
%I Springer
%P 204
%X Lost at CERN Library...

%A INTERNATIONAL ORGANIZATION FOR STANDARDIZATION. Geneva
%T Documentation and information. - 3rd ed.
%C Geneva
%D 1988
%P 1021
%X Not checked


%A ULLMAN, Jeffrey D
%T Principles of database and knowledge-base systems, v.2 : The new
%T technologies
%C Rockville, Md.
%I Computer Science Press
%D 1989
%P 504

%A ULLMAN, Jeffrey D
%T Principles of database and knowledge-base systems, v.1
%C Rockville, Md.
%I Computer Science Press
%D 1988
%P 631

%E McALEESE, Ray
%T Hypertext : theory into practice
%C Oxford
%I Blackwell
%D 1989
%P 175

%A SHNEIDERMAN, Ben
%T Hypertext hands-on ! : an introduction to a new way of organizing
%T and accessing information
%C Reading, Mass.
%I Addison-Wesley
%D 1989
%P 165
%X Comes with a disk containing a hypertext system.


       
%S ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION
%S RETRIEVAL
%T SIGIR '89 : proceedings of the annual international conference
%C Cambridge, Mass.
%D 25-28 June 1989
%X Technical papers, from database techniques to hypertext implementations.

%T Protext IV : proceedings of the fourth International
%T Conference on Text Processing Systems
%D 20-22 October 1987
%C Boston, USA
%E J. J. H. Miller
%I Dublin, Ireland : Boole Press
%P 153
%G ISBN 0-906783-80-1 (Hardback). ISBN 0-906783-79-8 (Paperback)
%X There are also PROTEXT I to III, at least

%T Document databases
%A Geoffrey James
%C New York
%I Van Nostrand Reinhold
%D 1985
%G ISBN 0-442-28185-4

%T Text, ConText, and HyperText : writing with and for the computer
%E Edward Barrett
%C Cambridge, Massachusetts (etc.)
%I The MIT Press
%D 1988
%P 368
%S MIT Press series in information systems
%G ISBN 0-262-02275-3

 
%T Hypertext and Hypermedia
%A  Jakob Nielsen
%I Academic Press
%D 1990

%T The graceful integration of text and facsimile in an electronic
%T document delivery system
%B New Trends in Electronic Publishing and Electronic Libraries
%C Essen
%I Gesamthochschulbibliothek Essen
%D 1984
%E Ahmed H. Helal
%E Joachim W. Weiss
%A Thomas Hickey

%T MIDOC: An Integrated Interactive System for Structuring, Editing
%T and Retrieving Documents
%B PROTEXT I
%A J. Courtin
%A I. Kowarski
%A C. Michaux

%T TeX Document Retrieval
%A D. Lucarella
%B PROTEXT I

%T Structured Documents
%E J. Andre
%E R. Furuta
%E V. Quint
%I INRIA
%S The Cambridge Series on Electronic Publishing
%C Cambridge
%D 1989
%G ISBN 0 521 36554 6

%T SPIRES short-reference
%X File available at CERN, from SLAC.

Last remark:
Bruce Atherton says:
>I am doing a lot of work on text databases, but mostly using custom software
>that uses the Vector Space Model to find documents relevant to a query.
>As a result, we don't really care what form the original text is stored in
>as long as we can do keyword extraction.  However, I know that a lot of
>work is also being done with systems that have a more intimate knowledge
>of the text layout.

>Perhaps you should talk to the Consortium for Lexical Research.  They
>have a mailing list, although I have only ever received one piece of mail
>from it.  The contact address is lexical@nmsu.edu and you can also find out
>about the programs of information exchange they are setting up.

>One other suggestion: you may wish to look at the lq system.  It is a
>simple boolean text retrieval package, but it may make a good starting
>point for you.  It is available from cs.toronto.edu under the pub
>directory somewhere.  The file name is something like lq1.10.tar.Z.
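
To make the Vector Space Model that Bruce mentions a bit more concrete, here
is a minimal sketch. It is not his software: the tf-idf weighting, the cosine
ranking, and all function names (tokenize, tf_idf_vectors, cosine, rank) are
assumptions chosen for illustration. Keywords are extracted from plain text,
each document becomes a weighted term vector, and documents are ranked by
their similarity to the query vector.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Crude keyword extraction: lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf_vectors(docs):
    """Return one sparse tf-idf vector (a dict) per document, plus the idf table."""
    counts_per_doc = [Counter(tokenize(doc)) for doc in docs]
    doc_freq = Counter()
    for counts in counts_per_doc:
        doc_freq.update(counts.keys())
    n_docs = len(docs)
    # Assumed weighting: idf = log(N / df) + 1, so common terms still count a little.
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in counts.items()}
        for counts in counts_per_doc
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, document index) pairs, best match first."""
    vectors, idf = tf_idf_vectors(docs)
    # Query terms that occur in no document get weight 0.
    query_vec = {
        term: tf * idf.get(term, 0.0)
        for term, tf in Counter(tokenize(query)).items()
    }
    scores = [(cosine(query_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, reverse=True)


if __name__ == "__main__":
    documents = [
        "SGML structured documents",
        "vector space retrieval of text",
        "hypertext systems",
    ]
    for score, index in rank("text retrieval", documents):
        print(round(score, 3), documents[index])
```

As the quote says, the stored form of the original text does not matter to
this approach as long as keywords can be extracted from it; only tokenize
would need to change for SGML or LaTeX sources.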


            Again, thanks!

                            Geraldo Xexeo