FOXEA@VTCC1 (07/22/88)
IRList Digest           Thursday, 21 July 1988      Volume 4 : Issue 39

Today's Topics:
   Query - Automatic abstracting
         - Pedagogical model for IRS
   Discussion - Soundex algorithm (from AIList)
              - Soundex code (from AIList)
   Call for Papers - ACL European Chapter 1989 Conference
   Announcement - Contents of special issue of IP&M

News addresses are
   Internet or CSNET: fox@vtopus.cs.vt.edu or fox@fox.cs.vt.edu
   BITNET: foxea@vtvax3.bitnet (soon will be foxea@vtcc1)

----------------------------------------------------------------------

From: Gareth Husk <gareth@computing.lancaster.ac.uk>
Date: Thu, 23 Jun 88 11:30:20 bst
Subject: Information Request on automatic abstracting
Newsgroups: comp.theory.info-retrieval
Organization: Department of Computing at Lancaster University, UK.

...

This is a request for information to help with my thesis on automatic
abstracting, which I am in the process of completing.

Does anyone have information, or know of a source, on the work of
professional abstractors? I need to know:

   i)   The number of new academic papers per year.
   ii)  The rate at which abstractors have to work.
   iii) The rate of increase of academic publications.
   iv)  The effects on the quality of abstracts of the increase of load.
   v)   Whether the abstract services are beginning to lag noticeably
        under the load.

Add to this any information that you may consider pertinent.

Please reply by e-mail; if other people require this information, I
will forward a summary of responses.

Gareth Husk

--
" Nine weeks and counting..."
UUCP:  ...!seismo!mcvax!ukc!dcl-cs!gareth
JANET: gareth@uk.ac.lancs.comp

------------------------------

Date: Mon, 18 Jul 88 21:05:24 +0300
From: The Computer in Education Research Lab <C37@TAUNOS>
Subject: pedagogical model for IRS

I am engaged in research and development of a pedagogical model
attached to an IRS, guiding pupils in high schools in searching
activities. I will be grateful for any information about research
concerning searching strategies, IRS user models, IRS in schools, etc.

thanks, avigail oren

[Note: there are a number of articles on searching strategies (Marcia
Bates had an article in JASIS, for example) and Penny Daniels has just
finished a dissertation on user models (does anyone have a current
email or complete postal address for her?). There is a lot more - hope
people will send you some and you will give us a bibliography - Ed]

------------------------------

Date: 11 Jul 88 18:21:16 GMT
From: sundc!netxcom!sdutcher@seismo.css.gov (Sylvia Dutcher)
Subject: Re: Soundex algorithm

[Forwarded from AIList Digest Monday, 18 Jul 1988 Volume 8: Issue 15]

[Note: While I do not generally overlap with AIList, I thought it
worthwhile to let IRList readers know about the discussion below so
they can join in the discussion there, or so we can have a parallel
discussion on IRList. It would be nice for AIList readers to learn
about the various other, often better algorithms. - Ed.]

In article <12520@sunybcs.UUCP> stewart@sunybcs.UUCP (Norman R. Stewart) writes:
>
>     The source I've used for Soundex (developed by the
>Remington Rand Corp., I believe), is
>
>     Huffman, Edna K. (1972) Medical Record Management.
>        Berwyn, Illinois: Physicians' Record Company.

I've written a soundex program based on the rules in Knuth's
_Sorting_and_Searching_. These are also the rules used at the National
Archives to sort census data. These rules differ slightly from the
ones posted by Mr. Stewart. If you don't need to match anyone else's
soundex, the most important rule is to be consistent. I will insert
Knuth's rules below.

>The algorithm is very simple,

1. Retain the first letter of the name, and drop all occurrences
   of a, e, h, i, o, u, w, y in other positions.

>1: Assign number values to all but the first letter of the
>word, using this table
>   1 - B P F V                2 - C S K G J Q X Z
>   3 - D T                    4 - L
>   5 - M N                    6 - R
>   7 - A E I O U W H Y

2. Assign number values as above, except for 7.

>2: Apply the following rules to produce a code of one letter and
>   three numbers.
>   A: The first letter of the word becomes the initial character
>      in the code.
>   B: When two or more letters from the same group occur together
>      only the first is coded.
>   C: If two letters from the same group are separated by an H or
>      a W, code only the first.

3. If two or more letters with the same code were adjacent in the
   original name (before step 1), omit all but the first.

>   D: Group 7 letters are never coded (this does not include the
>      first letter in the word, which is always coded).

4. Convert to the form "letter, digit, digit, digit" by adding
   trailing zeros or dropping rightmost digits.

BTW according to the reference in Knuth's book, this algorithm was
developed by Margaret Odell and Robert Russell in 1922.

>Norman R. Stewart Jr.             *  How much more suffering is
>C.S. Grad - SUNYAB                *  caused by the thought of death
>internet: stewart@cs.buffalo.edu  *  than by death itself!
>bitnet:   stewart@sunybcs.bitnet  *                       Will Durant

--
Sylvia Dutcher                     *  The likeliness of things
NetExpress Communications, Inc.    *  to go wrong is in direct
1953 Gallows Rd.                   *  proportion to the urgency
Vienna, Va. 22180                  *  with which they shouldn't.

------------------------------

Date: Wed, 13 Jul 88 10:35:06 EDT
From: "William J. Joel" <JZEM%MARIST.BITNET@MITVMA.MIT.EDU>
Subject: Re: Soundex algorithm

[Forwarded from AIList Digest Monday, 18 Jul 1988 Volume 8: Issue 15]

[Note: While I do not generally overlap with AIList, I thought it
worthwhile to have an algorithm for those who teach or use Prolog in
their research - Ed]

/* The following is source code for a Soundex algorithm written in */
/* Waterloo Prolog.                                                */
/* William J. Joel                                                 */
/* Marist College                                                  */
/* Poughkeepsie, NY                                                */
/* jzem@marist.bitnet                                              */

key(a,-1).
key(b,1).
key(c,2).
key(d,3).
key(e,-1).
key(f,1).
key(g,2).
key(h,-2).
key(i,0).
key(j,2).
key(k,2).
key(l,4).
key(m,5).
key(n,5).
key(o,-1).
key(p,1).
key(q,2).
key(r,6).
key(s,2).
key(t,3).
key(u,-1).
key(v,1).
key(w,-3).
key(x,2).
key(y,-2).
key(z,2).
soundex(Name,Code)<-
     string(Name,Code1) &
     write(Code1) &
     soundex1(Code1,A.B.C.D.Rem) &
     string(Code,A.B.C.D.nil).

soundex1(Head.Code1,Head.Code)<-
     keycode(Head.Code1,Code2) &
     write(Code2) &
     reduce(Code2,T.Code3) &
     write(T.Code3) &
     eliminate(Code3,Code4) &
     write(Code4) &
     append(Code4,0.0.0.nil,Code).

reduce(X.(-2).X.Rem,List)<-
     reduce(X.Rem,List).
reduce(X.X.Rem,List)<-
     reduce(X.Rem,List).
reduce(X.Y.Z.Rem,X.List)<-
     ^X==Z &
     reduce(Y.Z.Rem,List).
reduce(X.Y.Rem,X.List)<-
     ^X==Y &
     reduce(Y.Rem,List).
reduce(X.nil,X.nil).
reduce(nil,nil).

eliminate(X.Rem,List)<-
     lt(X,0) &
     eliminate(Rem,List).
eliminate(X.Rem,X.List)<-
     gt(X,0) &
     eliminate(Rem,List).
eliminate(nil,nil).

keycode(H.T,N.CodeList)<-
     key(H,N) &
     keycode(T,CodeList).
keycode(nil,nil).

append(Head.Tail,List,Head.NewList)<-
     append(Tail,List,NewList).
append(nil,List,List).

------------------------------

Date: Thu, 23 Jun 88 17:46:32 EDT
From: walker_donald e <walker@FLASH.BELLCORE.COM>
Message-Id: <8806232146.AA07753@flash.bellcore.com>
Subject: ACL European Chapter Call for Papers

                    ACL European Chapter 1989

                         CALL FOR PAPERS

         Fourth Conference of the European Chapter of the
            Association for Computational Linguistics

                         10-12 April 1989

               Centre for Computational Linguistics
    University of Manchester Institute of Science & Technology
                       Manchester, England

This conference is the fourth in a series of biennial conferences on
computational linguistics sponsored by the European Chapter of the
Association for Computational Linguistics. Previous conferences were
held in Pisa (September 1983), Geneva (March 1985) and Copenhagen
(April 1987). Although hosted by a regional chapter, these conferences
are global in scope and participation. The European Chapter represents
a major subset of the parent Association for Computational
Linguistics, and is in its seventh year. The conference is open both
to existing members and non-members of the Association.

Papers are invited on all aspects of computational linguistics,
including but not limited to:

     morphology
     lexical semantics
     computational models for the analysis and generation of language
     speech analysis and synthesis
     computational lexicography and lexicology
     syntax and semantics
     discourse analysis
     machine translation
     computational aids to translation
     natural language interfaces
     knowledge representation and expert systems
     computer-assisted language learning

Authors should send six copies of a 5- to 8-page double-spaced summary
to the Programme Committee at the following address:

     Harold Somers
     Centre for Computational Linguistics
     UMIST
     PO Box 88
     Manchester M60 1QD
     England

It is important that the summary should identify the new ideas in the
paper and indicate to what extent the work is complete and to what
extent it has been implemented. It should contain sufficient
information to allow the programme committee to determine the scope of
the work and its relation to relevant literature. The author's name
and address (including net address if possible) should be clearly
indicated, as well as one or two keywords indicating the general
subject matter of the paper.

Schedule: Summaries must be submitted by 1st October 1988. Authors
will be notified of acceptance by 15th December. Camera-ready copy of
final papers prepared in a double-column format on model paper (which
will be provided) must be received by 28th February 1989, along with a
signed copyright release statement. Papers not received by this date
will not be included in the Conference Proceedings, which will be
published in time for distribution to everyone attending the
conference.

The programme committee will be co-chaired by Harold Somers (UMIST)
and Mary McGee Wood (Manchester University), and will include the
following:

     Christian Boitet (Grenoble)
     Laurence Danlos (Paris)
     Gerald Gazdar (Sussex)
     Jurgen Kunze (Berlin, DDR)
     Michael Moortgat (Leiden)
     Oliviero Stock (Trento)
     Henry Thompson (Edinburgh)
     Dan Tufis (Bucharest)

Local arrangements will also be handled by Somers and Wood. Please
await a further announcement in October for more details.

Exhibits and demonstrations: A programme of exhibits and
demonstrations is planned. Anyone wishing to participate should
contact John McNaught at the above address. Book exhibitors should
contact Paul Bennett, also at the above address.

------------------------------

Date: Thu, 23 Jun 88 19:10:36 cdt
From: radecki@fergvax.unl.edu (Dr. Radecki)
Subject: Contents of special issue of IP&M

As the Guest Editor of the special issue of Information Processing and
Management on "The Potential for Improvements in Commercial Document
Retrieval Systems," I would like to inform interested parties that it
has just been published as Number 3 of Volume 24, and is now available
to both subscribers and non-subscribers. The contents of this special
issue are as follows:

Bernard M. Fry        Editorial: Robert Maxwell and information
Tefko Saracevic       processing
Harold Borko

Tadeusz Radecki       Trends in research on information
                      retrieval--The potential for improvements in
                      conventional Boolean retrieval systems

Peter Smit            Information impediments to innovation of
Manfred Kochen        on-line database vendors

William S. Cooper     Getting beyond Boole

M. E. Maron           Probabilistic design principles for
                      conventional and full-text retrieval systems

Edward A. Fox         Practical enhanced Boolean retrieval:
Matthew B. Koll       Experiences with the SMART and SIRE systems

G. Salton             A simple blueprint for automatic Boolean
                      query processing

Tadeusz Radecki       Probabilistic methods for ranking output
                      documents in conventional Boolean retrieval
                      systems

Jitender S. Deogun    Integration of information retrieval and database
Vijay V. Raghavan     management systems

Robert M. Losee       Integrating Boolean queries in conjunctive
Abraham Bookstein     normal form with probabilistic retrieval models

M. H. Heine           A logic assistant for the database searcher

Danny P. Wallace      Estimating effective display size in online
Bert R. Boyce         retrieval systems
Donald H. Kraft

Michael D. Gordon     The necessity for adaptation in modified
                      Boolean document retrieval systems

David C. Blair        An extended relational document retrieval model

------------------------------

END OF IRList Digest
********************