dayuso@BBN.COM ("Damaris M. Ayuso") (06/06/89)
BBN STC Science Development Program
AI Seminar Series Lecture

RESEARCH IN MACHINE TRANSLATION AT CARNEGIE MELLON UNIVERSITY

SERGEI NIRENBURG
School of Computer Science
Carnegie Mellon University

BBN STC, 2nd floor large conference room
10 Moulton St, Cambridge MA, 02138
Friday June 9th, 10:30 AM

I will give an overview of KBMT-89, a knowledge-based machine translation
project at the CMU Center for Machine Translation, which produced a working
prototype MT system. Since it is unrealistic to hope to cover the entire
material in less than an hour, I will then concentrate on one or more
components of the system, as time permits. I would also like to discuss the
lessons we learned from this project about the difficulties and tasks
involved in developing knowledge-based MT systems.

The specifications of our MT system are as follows:

Source languages: English and Japanese
Target languages: English and Japanese
Translation paradigm: Interlingua
Computational architecture: A distributed, coarsely parallel system
Domain of translation: IBM PC installation manuals

The knowledge acquired for the system includes:

* An ontology (domain model) of about 1,500 concepts
* Analysis lexicons: about 800 open-class lexical units of Japanese and
  about 900 such units of English
* Generation lexicons: about 800 open-class lexical units of Japanese and
  about 900 such units of English
* Analysis grammars for English and Japanese
* Generation grammars for English and Japanese
* Specialized syntax <---> semantics structural mapping rules

The underlying formalisms that were developed for use in this system are:

* The knowledge representation system FrameKit
* A language for representing domain models (a semantic extension of
  FrameKit)
* Specialized grammar formalisms, based on Lexical-Functional Grammar
* A language for representing text meanings (the interlingua)

The procedural components of the system include:

* A syntactic parser with a semantic constraint interpreter;
* A semantic mapper for treating additional types of semantic constraints;
* An interactive augmentor for treating residual ambiguities;
* A semantic generator producing syntactic structures of the target
  language, complete with lexical insertion;
* A syntactic generator, producing output strings based on the output of
  the semantic generator.

The support and environment facilities in KBMT-89 include:

* A knowledge acquisition tool for acquiring ontologies and lexicons, ONTOS;
* A knowledge acquisition tool for acquiring grammars; and
* Testing environments for analysis, augmentation and generation.

=================================================================