finin@antares.PRC.Unisys.COM (Tim Finin) (05/20/88)
We need to process bibliographic references, extracting the relevant
information encoded in them. That is, to take a reference like:

    J. W. Wallis and Edward H. Shortliffe. Customizing explanations
    using causal knowledge. In Bruce G. Buchanan and Edward H.
    Shortliffe, editors, Rule Based Expert Systems, Addison-Wesley,
    Reading, MA, 1984.

and to produce a data structure something like:

    ((type bookChapter)
     (author "J. W. Wallis and Edward H. Shortliffe")
     (title "Customizing Explanations Using Causal Knowledge")
     (book (title "Rule-Based Expert Systems")
           (publisher "Addison-Wesley")
           (editor "Bruce G. Buchanan and Edward H. Shortliffe")
           (year "1984")
           (address "Reading, MA")))

Put simply, we want to develop a system that does what BibTeX does, but
in reverse. It should work for references to a variety of document types
(e.g. journal articles, books, technical reports, theses) and a variety
of bibliographic styles. It should keep its domain-independent knowledge
(e.g. "Edward" is a given name, MA can be an abbreviation for
Massachusetts, which is the name of a state, 1984 is a good value for a
year of publication) clearly separate from its domain-dependent
knowledge (e.g. what IJCAI means, that BBN is a company which has a
technical report series). This would ease porting it from one domain
(e.g. AI) to another (e.g. fluid dynamics).

Such a system would probably be an interesting application drawing on
aspects of computational linguistics (e.g. parsing, sub-language theory,
proper name recognition) and knowledge-based expert systems (e.g.
expectation-driven parsing, domain modeling).

I'm interested in getting pointers to any research on systems like this.
I can't recall hearing of any.

Tim

Tim Finin                     finin@prc.unisys.com
Paoli Research Center         ..!{psuvax1,sdcrdcf,cbmvax,bpa}!burdvax!finin
Unisys Corporation            215-648-7446 (o)
PO Box 517, Paoli PA 19301    215-386-1749 (h)