[comp.ai.digest] KB Machine Translation at CMU - Sergei Nirenburg

dayuso@BBN.COM ("Damaris M. Ayuso") (06/06/89)

		 BBN STC Science Development Program
		      AI Seminar Series Lecture
				   
    RESEARCH IN MACHINE TRANSLATION AT CARNEGIE MELLON UNIVERSITY
				   
                           SERGEI NIRENBURG
                      School of Computer Science
                      Carnegie Mellon University
				   
	       BBN STC, 2nd floor large conference room
		  10 Moulton St, Cambridge MA, 02138
		     Friday June 9th, 10:30 AM


I will give an overview of KBMT-89, a knowledge-based machine translation
project at the CMU Center for Machine Translation that produced a working
prototype MT system. Since it is unrealistic to hope to cover all of the
material in less than an hour, I will then concentrate on one or more
components of the system, as time permits. I would also like to discuss the
lessons we learned from this project about the difficulties and tasks
involved in developing knowledge-based MT systems.

The specifications of our MT system are as follows:

Source languages:                English and Japanese
Target languages:                English and Japanese
Translation paradigm:            Interlingua
Computational architecture:      A distributed, coarsely parallel system
Domain of translation:           IBM PC installation manuals

The knowledge acquired for the system includes:

* An ontology (domain model) of about 1,500 concepts
* Analysis lexicons: about 800 open-class lexical units of Japanese and 
                     about 900 such units of English
* Generation lexicons: about 800 open-class lexical units of Japanese and 
                       about 900 such units of English
* Analysis grammars for English and Japanese
* Generation grammars for English and Japanese
* Specialized syntax <---> semantics structural mapping rules.

The underlying formalisms that were developed for use in this system are:

* The knowledge representation system FrameKit
* A language for representing domain models (a semantic extension of FrameKit)
* Specialized grammar formalisms, based on Lexical-Functional Grammar
* A language for representing text meanings (the interlingua)

The procedural components of the system include:

* A syntactic parser with a semantic constraint interpreter;
* A semantic mapper for treating additional types of semantic constraints;
* An interactive augmentor for treating residual ambiguities;
* A semantic generator producing syntactic structures of the target
  language, complete with lexical insertion;
* A syntactic generator, producing output strings based on the output of
  the semantic generator.
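
The order in which these components hand data to one another can be
sketched as a toy pipeline. This is only an illustration under stated
assumptions, not KBMT-89 code: every function name, the Interlingua class,
the two tiny lexicons and the single imperative sentence pattern are
hypothetical, and the augmentor is modeled as a callback rather than as an
interactive dialogue.

```python
from dataclasses import dataclass, field


@dataclass
class Interlingua:
    """Toy language-neutral meaning: a concept plus its role fillers."""
    concept: str
    roles: dict = field(default_factory=dict)


# Hypothetical two-entry lexicons, for illustration only.
ANALYSIS_LEXICON = {"remove": "REMOVE", "diskette": "DISKETTE"}    # English -> concept
GENERATION_LEXICON = {"REMOVE": "torinozoku", "DISKETTE": "disuketto"}  # concept -> Japanese


def parse(sentence):
    """Syntactic parser stand-in: handles only 'VERB (the) NOUN' imperatives."""
    words = [w for w in sentence.lower().split() if w not in {"the", "a"}]
    verb, obj = words[0], words[1]
    return {"pred": verb, "obj": obj}


def map_to_meaning(source_structure):
    """Semantic mapper stand-in: returns every candidate meaning it can build."""
    concept = ANALYSIS_LEXICON[source_structure["pred"]]
    theme = ANALYSIS_LEXICON[source_structure["obj"]]
    return [Interlingua(concept, {"theme": theme})]


def augment(candidates, choose=lambda cands: cands[0]):
    """Augmentor stand-in: asks `choose` only when ambiguity remains."""
    return candidates[0] if len(candidates) == 1 else choose(candidates)


def generate_semantic(meaning):
    """Semantic generator stand-in: target structure with lexical insertion."""
    return {
        "pred": GENERATION_LEXICON[meaning.concept],
        "obj": GENERATION_LEXICON[meaning.roles["theme"]],
    }


def generate_syntactic(target_structure):
    """Syntactic generator stand-in: linearizes as object, particle, verb."""
    return f'{target_structure["obj"]} o {target_structure["pred"]}'


def translate(sentence):
    """Run analysis, augmentation and generation in sequence."""
    meaning = augment(map_to_meaning(parse(sentence)))
    return generate_syntactic(generate_semantic(meaning))


print(translate("Remove the diskette"))
```

The point of the sketch is that the generation side sees only the
interlingua object, never the source-language structure, which is what
distinguishes the interlingua paradigm from direct source-to-target
transfer.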

The support and environment facilities in KBMT-89 include:

* A knowledge acquisition tool for acquiring ontologies and lexicons, ONTOS;
* A knowledge acquisition tool for acquiring grammars; and
* Testing environments for analysis, augmentation and generation.

=================================================================