[comp.text] An environment for developing CFG descriptions

cso@rose.cis.ohio-state.edu (Dr. Conleth S. O'Connell Jr.) (07/30/90)

We have recently published the following technical report:

Supporting the Development of Grammar Descriptions for Multiple
Applications
Conleth S. O'Connell Jr.
OSU-CISRC-7/90-TR20, July, 1990, 39 pp.

If you would like a copy, you may send the request via email to

strawser@cis.ohio-state.edu

Please include your postal mailing address.


                                ABSTRACT

In computer science, context-free grammars are used extensively to
describe data sets such as manuscript types and programming languages.
The data, or members, contained in a particular set are instances of
the grammar describing that set; documents and programs are examples
of such instances.

Determining the elements comprising instances is the task of
content investigation.  Imposing structure on these elements is the
task of grammar development.  Creating, editing, and manipulating
instances of a grammar is the task of grammar instantiation.  Grammar
instantiation has received much attention with software systems such
as programming environments and compound-document environments.
Content investigation and grammar development have only recently been
recognized as recurring complex tasks.  They have received little
attention because of their newly emerging significance.  This work
focuses on grammar development.

Grammar development produces a grammar description in a particular
notation that contains two types of information: a formal, context-free
grammar and auxiliary information.  Auxiliary information describes
the application of the grammar description.  For example, a grammar
may describe the manuscript type ``article,'' but the auxiliary
information may describe how to format the instances for layout, how
to analyze the sentence structure, or how to exchange documents of
that type.
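
As an illustration only (this is not taken from the report and does not
reflect DeveGram's internal design), the separation described above can
be sketched as a small data structure. The names Production,
GrammarDescription, and ARTICLE_RULES, and the contents of the
auxiliary dictionaries, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Production:
    """One context-free rule: lhs -> rhs[0] rhs[1] ..."""
    lhs: str
    rhs: tuple


@dataclass
class GrammarDescription:
    """A formal grammar plus application-specific auxiliary information."""
    start: str
    productions: list
    # Hypothetical: auxiliary notes keyed by symbol name.
    auxiliary: dict = field(default_factory=dict)

    def nonterminals(self):
        # Every symbol that appears on a left-hand side.
        return {p.lhs for p in self.productions}

    def terminals(self):
        # Every right-hand-side symbol that never appears on a left-hand side.
        nts = self.nonterminals()
        return {s for p in self.productions for s in p.rhs if s not in nts}


# A toy grammar for the manuscript type "article" (assumed for illustration).
ARTICLE_RULES = [
    Production("article", ("title", "body")),
    Production("body", ("paragraph",)),
    Production("body", ("body", "paragraph")),
]

# The same grammar reused with two different kinds of auxiliary information.
formatting = GrammarDescription(
    "article", ARTICLE_RULES, {"title": "centered, bold"})
exchange = GrammarDescription(
    "article", ARTICLE_RULES, {"title": "required element"})
```

Both descriptions share one list of productions, so a change to the
grammar reaches every application, while each application keeps its own
auxiliary information.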

The separation of the general, context-free grammar from the
application-specific, auxiliary information provides the power and
flexibility to generalize problem classes associated with grammar
development.  The formalisms of context-free grammars motivate two
such problem classes: syntactic properties and semantic properties.
The analysis of the development of large grammars motivates two other
problem classes: reusable grammars and multiple notations.

A review of existing software systems reveals that a new,
general-purpose support environment is required for developing
grammar descriptions.  A prototype environment for developing grammar
descriptions, DeveGram, has been designed and implemented.  DeveGram
controls and manages the four problem classes by capturing any
context-free grammar, providing mechanisms for determining properties
about a grammar, capturing auxiliary information, and automatically
generating grammar descriptions in a testbed of different
notations.  DeveGram produces grammar descriptions for a testbed of
software systems differing in syntax and purpose.  The testbed
presently consists of Yacc, SGML, MDL, MANDEN, and BNF.
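
To make the idea of generating several notations from one captured
grammar concrete, here is a minimal sketch, not DeveGram's actual
method. It covers only two of the notations named above (BNF and a
Yacc rules section) and assumes a toy grammar; the function names
group_by_lhs, to_bnf, and to_yacc and the sample GRAMMAR are
hypothetical.

```python
def group_by_lhs(productions):
    """Collect alternatives per left-hand side, keeping first-seen order."""
    grouped = {}
    for lhs, rhs in productions:
        grouped.setdefault(lhs, []).append(rhs)
    return grouped


def to_bnf(productions):
    """Emit BNF: nonterminals in angle brackets, terminals left bare."""
    grouped = group_by_lhs(productions)

    def show(symbol):
        return f"<{symbol}>" if symbol in grouped else symbol

    lines = []
    for lhs, alternatives in grouped.items():
        body = " | ".join(" ".join(show(s) for s in rhs)
                          for rhs in alternatives)
        lines.append(f"<{lhs}> ::= {body}")
    return "\n".join(lines)


def to_yacc(productions):
    """Emit Yacc token declarations followed by the rules section."""
    grouped = group_by_lhs(productions)
    # Symbols that never appear on a left-hand side are treated as tokens.
    tokens = sorted({s for _, rhs in productions
                     for s in rhs if s not in grouped})
    lines = []
    if tokens:
        lines.append("%token " + " ".join(tokens))
    lines.append("%%")
    for lhs, alternatives in grouped.items():
        body = " | ".join(" ".join(rhs) for rhs in alternatives)
        lines.append(f"{lhs} : {body} ;")
    return "\n".join(lines)


# Toy grammar, assumed for illustration: (left-hand side, right-hand side).
GRAMMAR = [
    ("article", ("TITLE", "body")),
    ("body", ("PARAGRAPH",)),
    ("body", ("body", "PARAGRAPH")),
]

print(to_bnf(GRAMMAR))
print(to_yacc(GRAMMAR))
```

Keeping the grammar in one neutral form and writing one small emitter
per notation means that adding a notation does not require touching the
captured grammar.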

-=-
Dr. Conleth S. O'Connell Jr. Department of Computer and Information Science
                                       The Ohio State University
cso@cis.ohio-state.edu        2036 Neil Ave., Columbus, OH USA 43210-1277