[comp.theory.info-retrieval] IRList Digest V4 #29

FOXEA@VTVAX3.BITNET (05/04/88)
IRList Digest           Saturday, 30 April 1988      Volume 4 : Issue 29

Today's Topics:
   Abstracts - Abstracts of interest from Susanne Humphrey

News addresses are
   Internet or CSNET: fox@vtopus.cs.vt.edu
   BITNET: foxea@vtvax3.bitnet

----------------------------------------------------------------------

Date: Fri, 29 Apr 88 21:55:41 EST
From: "Susanne M. HUMPHREY" <humphrey@MCS.NLM.NIH.GOV>
Subject: abstracts of possible interest

[Note: The usual copyright and other restrictions go along with this
submission -- see earlier similar submissions by Susanne for the actual
wording. - Ed.]

AN University Microfilms Order Number ADG87-27184.
AU ABRAMS, DAVID HOWARD.
IN Georgia Institute of Technology D.B. 1987, 220 pages.
TI A REFERENCE MODEL FOR HETEROGENEOUS DATA BASE MANIPULATION AND AN
   EXPERT SYSTEM PROTOTYPE FOR HOMOGENEOUS DATA BASE MANIPULATION.
SO DAI V48(09), SecB.
DE Computer Science.
AB A reference model for heterogeneous database manipulation (HDBM)
   is presented. The model provides a generalized approach for
   allowing applications software to perform data manipulation on
   centralized or distributed heterogeneous database systems. The
   HDBM model has been developed using layering and techniques
   similar to those used in the International Standards
   Organization's reference model for open systems interconnection.

   A working prototype system, based on the HDBM model and called the
   Integrated Data Delivery Expert (IDDE), was developed as a series
   of layers using rapid prototyping methodology and expert systems
   technology. The IDDE serves as a front-end system for manipulating
   data in a distributed, relational, database system environment.
   The IDDE software is developed as four layers: applications
   interface; data presentation services module; data control
   services module; and database services module. The IDDE model was
   tested by retrieval of data from database systems that are
   accessible to it.

   The prototype has been shown to process data retrieval requests
   successfully from the distributed, relational, database system
   environment. Sample IDDE executions, as well as the prototype
   source code, are included.

   The dissertation addresses several future research possibilities
   and implications.

AN University Microfilms Order Number ADG87-26875.
AU CHEN, HSING LUNG.
IN Illinois Institute of Technology Ph.D 1987, 175 pages.
TI OBJECT-ORIENTED ALERTER SYSTEM DESIGN.
SO DAI V48(09), SecB.
DE Computer Science.
AB Database systems are usually 'passive.' Database alerting
   techniques provide a database system with the capability to take
   actions by itself. Hence, the database system with alerting
   techniques can play a more intelligent role. The primary objective
   of this research effort is the development of a distributed
   intelligent database system. This design problem encompasses the
   development of a methodology to decompose complex alerters into
   simple alerters, the investigation of algorithms for allocating
   the simple alerters, and the design of a protocol for ensuring
   that alerters correctly monitor the database. The proposed system
   is useful in office information systems, decision-support systems
   and user-friendly specification of 'chip' expert systems.

   We approach the problem of monitoring database updates by using
   the object-oriented approach. A methodology is proposed to
   decompose a complex alerter into several objects. These objects
   form a tree-structure. Each object can be considered as monitoring
   a virtual database view. If the database view is updated, the
   update message is sent up the tree for further monitoring. The
   top object can check whether the alert condition is met and then
   invoke the alert action.
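
   The tree of monitoring objects described above can be sketched
   roughly as follows. This is only an illustration: the class name
   AlerterNode, the predicate-per-object design, and the stock example
   are assumptions made for this sketch, not details taken from the
   dissertation.

```python
class AlerterNode:
    """One object in the alerter tree; it watches one (virtual) view."""

    def __init__(self, name, predicate, parent=None, action=None):
        self.name = name
        self.predicate = predicate  # is this update relevant to this view?
        self.parent = parent        # next object up the tree, if any
        self.action = action        # alert action, used only by the top object

    def notify(self, update):
        if not self.predicate(update):
            return                      # the update does not affect this view
        if self.parent is not None:
            self.parent.notify(update)  # send the update message up the tree
        elif self.action is not None:
            self.action(update)         # top object: condition met, fire alert


# Hypothetical complex alerter: "stock in warehouse A drops below 10".
alerts = []
top = AlerterNode("low_stock", lambda u: u["qty"] < 10, action=alerts.append)
leaf = AlerterNode("warehouse_A", lambda u: u["warehouse"] == "A", parent=top)

leaf.notify({"item": "bolt", "warehouse": "A", "qty": 4})   # reaches the top
leaf.notify({"item": "nut", "warehouse": "B", "qty": 2})    # stopped at leaf
leaf.notify({"item": "gear", "warehouse": "A", "qty": 50})  # stopped at top
```

   Only the first update passes every object on its path, so only it
   triggers the alert action.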

   An alerter can usually be expressed in terms of an associated
   query on a database view. The conventional approach to
   implementing complex alerters is to construct the database view
   whenever related elementary relations are updated. The response
   time of the view-construction approach is longer than that of the
   object-oriented approach. However, the object-oriented approach
   needs much more space to store customized alerters. A third
   approach we propose combines the two. The performance of the
   three approaches is compared. The combined approach can achieve a
   better time-space tradeoff.

   If the object-oriented approach is applied in a centralized
   computer system, the complex alerter can be invoked correctly. In
   a distributed system, however, actions that should be invoked may
   be missed, or extraneous actions may be invoked. Some
   concurrency-control methods are proposed to ensure that complex
   alerters monitor database updates correctly in distributed
   systems.

   Some allocation algorithms are also proposed to allocate the
   alerting objects so that the response time is minimal under
   object-number constraints.

AN University Microfilms Order Number ADG87-26878.
AU HARR, HENRY MAXIM.
IN Illinois Institute of Technology Ph.D 1987, 160 pages.
TI ABF: AN EXPERT SYSTEM FOR OFFICE AUTOMATION AND AN INTERPRETER FOR
   LEGAL DOCUMENT CONSTRUCTION.
SO DAI V48(09), SecB.
DE Computer Science.
AB The ABF system creates an environment for document construction
   that allows people who have never used a computer before to create
   complex client-specific drafts. Legal experts develop model
   documents and templates in ABF; then law students and paralegal
   personnel process these documents in the same system, answering
   system-generated questions about the client to produce a
   custom-tailored version. The ABF system protects the user from the
   operating system in every conceivable way. It provides facilities
   for copying, storing, formatting, displaying, printing, and
   deleting documents. It allows the user to convert documents to and
   from MS-DOS ASCII files and organize them into libraries. It
   supplies the same editing facilities whenever the user presses a
   key, whether in answering a question, giving a command, or editing
   a document. During the interviewing process the system
   automatically constructs a client data file from answers to
   questions; that file is searched for relevant information any time
   that questions are asked as documents are processed for that
   client. Model documents are ABF programs in disguise; ABF contains
   all the facilities for conditionals, looping, multiple-values, and
   subroutines provided by any modern programming language. In
   addition the ABF user can modify the program while it is running.
   The user can also switch contexts at any time from editing to
   processing a document or issuing a command. If the system finds an
   error, it puts the user in the editor with the cursor pointing to
   the place where the error occurred. Although the system was
   originally designed for use by lawyers, paralegal personnel, and
   law students, it embodies new techniques for automating any office
   that produces documents of a repetitive nature. ABF is a cross
   between an expert system shell and an interpreter, embedded in a
   sheltered microcomputer environment.

AN University Microfilms Order Number ADG87-26879.
AU HSIEH, CHENG-YUAN.
IN Illinois Institute of Technology Ph.D 1987, 146 pages.
TI OFFICE PROCEDURE LANGUAGE: AN OBJECT-ORIENTED APPROACH.
SO DAI V48(09), SecB.
DE Computer Science.
AB This thesis presents the concept and design of a general-purpose
   programming language: Office Procedure Language (OPL). OPL is
   based upon a formal model for the specification of knowledge-based
   information systems, the OPM model. From the database viewpoint,
   the OPM model utilizes database alerting techniques to serve the
   purpose of office activities management. OPL is developed to
   specify an OPM model with an object-oriented approach. An OPL
   program can be translated to the corresponding OPM model. The OPM
   model consists of databases, messages, office activities and
   alerter rules. The general goals of the OPM model are (1) to
   describe the relationships between these office objects; (2) to
   provide a perspective view for the coordination and integration
   of office activities; and (3) to facilitate the protocol analysis
   and the verification of office procedures. OPL is the linguistic
   interface between the OPM model and system programmers. It is a
   high-level language developed to meet the specification
   requirements of the above goals.

   In OPL, the mechanism of database alerting is used to specify the
   knowledge of an Office Information System (OIS). OPL is developed
   for OIS design, where knowledge is expressed as database alerter
   rules. However, due to its generality, it is also applicable to
   the design of general information systems.

AN University Microfilms Order Number ADG87-26518.
AU MYAENG, SUNG HYON.
IN Southern Methodist University Ph.D 1987, 175 pages.
TI THE ROLES OF USER PROFILES IN INFORMATION RETRIEVAL.
SO DAI V48(09), SecB.
DE Computer Science.
AB One difficult problem in information retrieval (IR) is the proper
   interpretation of user queries. It is extremely hard for users to
   express their information needs in a specific yet exhaustive way.
   From a different perspective, user variability in information
   seeking behavior is not well reflected in a query. As an effort to
   alleviate this problem, two theoretical models have been proposed
   to utilize user characteristics maintained in the form of a user
   profile.

   Although the idea of integrating user profiles into an IR system
   is intuitively appealing, and the models seem viable, no research
   to date has established a foundation on the roles of user profiles
   in such a system. Aiming at the investigation of the roles of user
   profiles, therefore, this study first identifies and extends
   various query/profile interaction models to provide a ground on
   which the investigation can be undertaken. From a continuum of
   models characterized based on interaction types, metrics, and
   parameters, nearly 400 models are chosen to investigate the "model
   space."
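
   The abstract does not list the interaction types, metrics, or
   parameters it studies. As a rough illustration of what a
   query/profile interaction model might look like, the sketch below
   treats documents, the query, and the profile as points in a vector
   space and combines the two distances in several ways. The combiner
   names and the choice of Euclidean distance are assumptions made for
   this sketch only.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical ways of combining distance-to-query (dq) with
# distance-to-profile (dp); a smaller combined value ranks higher.
COMBINERS = {
    "query_alone": lambda dq, dp: dq,
    "either": min,                        # close to query OR profile
    "both": max,                          # close to query AND profile
    "sum": lambda dq, dp: dq + dp,
    "product": lambda dq, dp: dq * dp,
}


def rank(docs, query, profile, model):
    """Return document names ordered best-first under the chosen model."""
    combine = COMBINERS[model]
    scored = [
        (combine(distance(vec, query), distance(vec, profile)), name)
        for name, vec in docs.items()
    ]
    return [name for _, name in sorted(scored)]


docs = {"d1": (1.0, 0.0), "d2": (0.0, 1.0), "d3": (0.5, 0.5)}
query, profile = (1.0, 0.0), (0.0, 1.0)

query_only = rank(docs, query, profile, "query_alone")
with_profile = rank(docs, query, profile, "both")
```

   Under the "both" combiner, the document lying between the query
   and the profile moves to the top of the ranking, which the
   query-alone ranking would not do.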

   Following the preparatory work, a series of experiments is
   conducted using an experimental IR system built for this study. In
   recognition that existing measures are not sufficient for the
   evaluation of so many models, new measures are developed based on
   the notion of user satisfaction/frustration. In addition, three
   different criteria are introduced to guide users in making
   judgements on the quality of retrieved items.

   A number of interesting results are produced through the analysis
   of the data obtained from the experiments. It is first shown that,
   regardless of the criterion or metric used, there are always some
   query/profile interaction models that outperform the query-alone
   model. In addition, preferable characteristics for different
   criteria are identified in terms of interaction types, parameters,
   and metrics. To ensure the significance of the results, three
   statistical tests are used for different purposes.

AN This item is not available from University Microfilms International
   ADG05-61333.
AU CHEN, JASON S. J.
IN University of Southern California Ph.D 1987.
TI DISTRIBUTED QUERY OPTIMIZATION IN FRAGMENTED DATABASE SYSTEMS.
SO DAI V48(09), SecB.
DE Engineering, Electronics and Electrical.
AB Join is the most critical operation in distributed query
   optimization. In this thesis, the problem of optimizing multiple
   joins in fragmented database systems on both broadcast and
   nonbroadcast type computer networks is analyzed. Semantic
   information associated with fragments is used to eliminate
   unnecessary processing. Data redundancy is considered.
   Furthermore, we allow more than one physical copy of a fragment to
   be used in a strategy to achieve more parallelism.

   In our proposed approach, the problem of optimizing multiple joins
   is decomposed into two subproblems: the problem of finding a good
   join sequence and the problem of optimizing each two-way join in
   the sequence. A dynamic programming algorithm is developed for
   determining a join sequence. During intermediate steps of the join
   sequence, the join results are kept fragmented to achieve more
   parallelism and allow more local execution. All the partial
   results are assembled at the last two-way join.
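
   A dynamic program over subsets of relations is one common way to
   pick a join sequence, and a minimal sketch of it follows. The cost
   model (sum of intermediate result sizes, with sizes estimated from
   cardinalities and pairwise selectivities) is an assumption made for
   this sketch. It ignores fragmentation, replication, and network
   cost, and is not the algorithm from the dissertation.

```python
from itertools import combinations


def result_size(rels, card, sel):
    """Estimated size of joining all relations in rels."""
    size = 1.0
    for r in rels:
        size *= card[r]
    for a, b in combinations(sorted(rels), 2):
        size *= sel.get((a, b), 1.0)  # 1.0 means no join predicate
    return size


def best_join_sequence(card, sel):
    """Return (cost, order) for the cheapest one-relation-at-a-time sequence.

    card maps relation name -> cardinality; sel maps a sorted pair of
    names -> join selectivity. Cost is the sum of intermediate sizes.
    """
    names = sorted(card)
    best = {frozenset([r]): (0.0, [r]) for r in names}
    for k in range(2, len(names) + 1):
        for subset in combinations(names, k):
            s = frozenset(subset)
            size = result_size(s, card, sel)
            # Try each relation as the last one joined into this subset.
            best[s] = min(
                (best[s - {last}][0] + size, best[s - {last}][1] + [last])
                for last in subset
            )
    return best[frozenset(names)]


card = {"A": 1000, "B": 100, "C": 10}
sel = {("A", "B"): 0.01, ("B", "C"): 0.1}
cost, order = best_join_sequence(card, sel)
```

   In this example, joining B with C first keeps the intermediate
   result small, so the sequence B, C, A is chosen.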

   If the network has broadcast capability, graph models are
   introduced to represent two-way joins. The two-way join
   optimization problems are mapped into equivalent graph
   minimum-weight vertex cover problems. An algorithm based on
   network flow is developed for optimizing two-way joins with
   results fragmented. The two-way join optimization problem with
   results assembled is proved to be NP-hard. For nonbroadcast
   network environments, the problem of optimizing two-way joins
   either with results fragmented or with results assembled is also
   proved to be NP-hard.

   For those NP-hard optimization problems, properties are identified
   to reduce the solution search space. Efficient heuristic
   procedures based on the identified properties are developed for
   suboptimal solutions. Theoretical bounds are provided to ensure
   that the heuristic solutions are within a certain range of the
   optimal solutions.

   Semijoins are also included in our approach. A new operation
   called domain-specific semijoin is introduced which can be
   performed in a fragment-to-fragment manner as opposed to a
   relation-to-relation or relation-to-fragment manner as in the
   application of regular semijoins. For a given query, there is
   always a strategy, using both domain-specific semijoins and
   semijoins, which is at least as good as the best strategy with
   only semijoins. (Copies available exclusively from Micrographics
   Department, Doheny Library, USC, Los Angeles, CA 90089-0182.)
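
   The abstract does not define the domain-specific semijoin, so the
   sketch below shows only a regular semijoin, applied here to two
   fragments for illustration. The function name, the row layout, and
   the sample data are assumptions made for this sketch.

```python
def semijoin(r_rows, s_rows, attr):
    """Return the rows of r_rows that have a join partner in s_rows on attr."""
    keys = {row[attr] for row in s_rows}  # only these values need shipping
    return [row for row in r_rows if row[attr] in keys]


# Hypothetical fragments held at two different sites.
orders_fragment = [
    {"cust": 1, "amount": 10},
    {"cust": 2, "amount": 5},
    {"cust": 7, "amount": 3},
]
customers_fragment = [
    {"cust": 1, "name": "Ames"},
    {"cust": 7, "name": "Byrd"},
]

# Reduce the orders fragment before sending it across the network.
reduced = semijoin(orders_fragment, customers_fragment, "cust")
```

   The order for customer 2 has no partner in the other fragment, so
   it is dropped before any data is shipped for the join itself.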

AN University Microfilms Order Number ADG87-27824.
AU CHEN, ZI-TAN.
IN University of California, Santa Barbara Ph.D 1987, 156 pages.
TI QUADTREES AND QUADTREE SPATIAL SPECTRA IN LARGE SCALE GEOGRAPHIC
   INFORMATION SYSTEMS--THE HIERARCHICAL HANDLING OF SPATIAL DATA.
SO DAI V48(09), SecB, pp2768.
DE Geotechnology.
AB The demand to manipulate large volumes of geographic data is
   growing. Besides conventional maps and statistics, huge volumes of
   geographic data are produced by remote sensing, conventional
   mapping, and auto-cartographic processes. Such data need to be
   manipulated efficiently in very large scale geographic information
   systems. However, current geographic information systems exhibit
   major shortcomings in the efficient handling of spatial data. The
   purpose of this dissertation is to explore the use of quadtrees
   and quadtree spatial spectra to improve spatial data handling
   efficiency.

   Artificial intelligence has the potential to eliminate some of the
   disadvantages of present spatial data handling methods. However,
   the gap between the theory of AI and its practical application in
   spatial data handling is still very wide. Many previous efforts in
   this area, including contextual and syntactic analysis in digital
   image processing, have shown interesting functions, but also have
   serious limitations for applications involving very large spatial
   data files.

   This study is an effort to narrow this gap by presenting several
   related studies at different levels. (1) At a high control level,
   a spatial heuristic search module is proposed. It achieves a
   significant gain in CPU time efficiency by using spatial knowledge
   at an early stage to find a short cut strategy. A heuristic search
   substitutes for the blind search of most current geographic
   information systems. (2) At the data structure level, quadtrees
   are used to represent very large volumes of geographic data,
   including both binary and gray tone images. (3) Finally, a global
   quadtree coordinate system is proposed for uniform data
   representation among different geographic information systems that
   have various sizes and locations.
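
   As a reminder of how a region quadtree represents a binary image,
   here is a minimal sketch. It assumes a square image whose side is a
   power of two, stored as a list of rows. The function names and the
   nested-list encoding are choices made for this sketch, not the
   dissertation's data structure, and gray tone images are not
   covered.

```python
def build_quadtree(img, x=0, y=0, size=None):
    """Return 0 or 1 for a uniform block, else [NW, NE, SW, SE] subtrees."""
    if size is None:
        size = len(img)  # assumes a square, power-of-two image
    first = img[y][x]
    uniform = all(
        img[r][c] == first
        for r in range(y, y + size)
        for c in range(x, x + size)
    )
    if uniform:
        return first     # the whole block collapses to a single leaf
    half = size // 2
    return [
        build_quadtree(img, x, y, half),                # north-west
        build_quadtree(img, x + half, y, half),         # north-east
        build_quadtree(img, x, y + half, half),         # south-west
        build_quadtree(img, x + half, y + half, half),  # south-east
    ]


def count_leaves(node):
    """Number of leaf blocks needed to represent the image."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = build_quadtree(image)
leaves = count_leaves(tree)
```

   Three of the four quadrants are uniform and collapse to single
   leaves, so the 16 pixels are represented by 7 leaves.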

   A form of approximate spatial distribution knowledge, quadtree
   spatial spectra (QTSS), is proposed. It provides the necessary
   spatial knowledge for a spatial heuristic search module. Quadtree
   spatial spectra contain rich spatial distribution information over
   a wide spatial wavelength domain. Generating QTSS is two orders of
   magnitude faster than computing a Fast Fourier Transform (FFT),
   and the storage form of QTSS is very compact.

   Combining these studies, a practical geographic information system
   has been designed. It is tested on real geographic data sets, and
   the results are analyzed and compared with those of current
   systems.

------------------------------

END OF IRList Digest
********************