leff@smu.UUCP (Laurence Leff) (05/17/88)
DEPARTMENT OF COMPUTER SCIENCES
TECHNICAL REPORT CENTER
TAYLOR HALL 2.124
THE UNIVERSITY OF TEXAS AT AUSTIN
AUSTIN, TEXAS 78712-1188
(512) 471-9595
NOTE NEW ELECTRONIC MAIL ADDRESS:
trcenter@cs.utexas.edu
TECHNICAL REPORTS, JANUARY 1988 - APRIL 1988
PREPAYMENT IS REQUIRED.
Please make U.S. bank checks or international money orders payable to
"The University of Texas."
[ ] TR-86-24 (Revision) $1.50
[ ] TR-87-20 (Revision) $1.50
[ ] TR-87-23 (Revision) $2.50
[ ] TR-87-28 (Revision) $1.50
[ ] TR-88-01 $1.50 [ ] TR-88-07 $1.50 [ ] TR-88-13 $5.00
[ ] TR-88-02 $1.50 [ ] TR-88-08 $1.50 [ ] TR-88-14 $1.50
[ ] TR-88-03 $2.00 [ ] TR-88-09 $1.50 [ ] TR-88-15 $1.50
[ ] TR-88-04 $5.00 [ ] TR-88-10 $1.50 [ ] TR-88-16 $1.50
[ ] TR-88-05 $1.50 [ ] TR-88-11 $5.00
[ ] TR-88-06 $1.50 [ ] TR-88-12 $1.50
-------------------------------------------------------------------------------
TR-86-24 IMPLEMENTATION CONCEPTS FOR AN EXTENSIBLE DATA MODEL AND DATA
LANGUAGE
D. S. Batory, T. Y. Leung, and T. E. Wise
February 1988
30 pages
ABSTRACT: Future database systems must feature extensible data
models and data languages in order to accommodate the novel data
types and special-purpose operations that are required by
nontraditional database applications. In this paper, we outline a
functional data model and data language that are targeted for the
semantic interface of GENESIS, an extensible DBMS. The model and
language are generalizations of FQL [Bun82] and DAPLEX [Shi81], and
have an implementation that fits ideally with the modularity
required by extensible database technologies. We explore different
implementations of functional operators and present experimental
evidence that they have efficient implementations. We also explain
the advantages of a functional front-end to not-1NF databases, and
show how our language and implementation are being used to process
queries on both 1NF and not-1NF relations.
TR-87-20 ATOMIC SEMANTICS OF NONATOMIC PROGRAMS
James H. Anderson and Mohamed G. Gouda
March 1988
8 pages
ABSTRACT: We argue that it is possible, and sometimes useful, to
reason about nonatomic programs within the conventional atomic
model of concurrency.
TR-87-23 BUILDING BLOCKS OF DATABASE MANAGEMENT SYSTEMS
D. S. Batory
February 1988
42 pages
ABSTRACT: We present a very simple formalism based on parameterized
types and a rule-based algebra to survey and identify the storage
structure and query processing algorithm building blocks of
database management systems. We demonstrate building block
reusability by showing how different combinations of a few blocks
yield the structures and algorithms of three different systems,
namely System R (centralized), R* (distributed), and GRACE
(database machine). We believe that codifying knowledge of DBMS
implementations is an important step toward a technology that
assembles DBMSs rapidly and cheaply from libraries of prewritten
components.
TR-87-28 PAMMA NONITERATIVE APPROXIMATE SOLUTION METHOD FOR CLOSED
MULTICHAIN QUEUEING NETWORKS
Ching-Tarng Hsieh and Simon S. Lam
March 1988
24 pages
ABSTRACT: Approximate MVA algorithms for separable queueing
networks are based upon an iterative solution of a set of modified
MVA formulas. Although each iteration has a computational time
2
requirement of O(MK ) or less, many iterations are typically needed
for convergence to a solution. (M denotes the number of queues and
K the number of closed chains or customer classes.) We present
some faster approximate solution algorithms that are noniterative.
They are suitable for the analysis and design of communication
networks which may require tens to hundreds, perhaps thousands, of
closed chains to model flow-controlled virtual channels. The basis
of our method is the distribution of a chain's population
proportional to loads to get initial estimates of mean queue
lengths. This is the same basis used in the derivation of
proportional upper bounds for single-chain networks; for a
multichain network, such a proportional distribution leads to
approximations rather than upper bounds of chain throughputs.
Nevertheless, these approximate solutions provide chain through-
puts, mean end-to-end delays, and server utilizations that are
sufficiently accurate for the analysis and design of communication
networks and possibly other distributed systems with a large number
of customer classes. Three PAM algorithms of increasing accuracy
are presented. Two of them have time and space requirements of
2
O(MK). The third algorithm has a time requirement of O(MK ) and a
space requirement of O(MK).
TR-88-01 CONCEPTS FOR A DATABASE SYSTEM COMPILER
D. S. Batory
January 1988
9 pages
ABSTRACT: We propose a very simple formalism based on parameterized
types and a rule-based algebra to explain the storage structures
and algorithms of database management systems. Implementations of
DBMSs are expressed as equations. If all functions referenced in
the equations have been implemented, the software for a DBMS can be
synthesized in minutes at little cost, in contrast to current
methods where man-years of effort and hundreds of thousands of
dollars are required. Our research aims to develop a DBMS
counterpart to today's compiler-compiler techniques.
TR-88-02 ON THE REUSABILITY OF QUERY OPTIMIZATION ALGORITHMS
D. S. Batory
January 1988
23 pages
ABSTRACT: We tackle the problem of software reusability in this
paper as it pertains to an important class of nonrecursive query
optimization algorithms. We demonstrate reusability can be achieved
by 1) imposing standardized interfaces and 2) expressing algorithms
in a DBMS implementation-independent manner. The former is
accomplished by generalizing the notion of query graphs, and the
latter is accomplished by standardizing algorithm definitions in
terms of query graph rewrite rules. Demonstrating algorithm
reusability is an essential step toward a building-blocks
technology for extensible database systems.
TR-88-03 A TAXONOMY OF FAIRNESS AND TEMPORAL LOGIC PROBLEMS FOR PETRI NETS
Rodney R. Howell, Louis E. Rosier, and Hsu-Chun Yen
January 1988
38 pages
ABSTRACT: In this paper, we define a temporal logic for reasoning
about Petri nets. We show the model checking problem for this logic
to be PTIME equivalent to the Petri net reachability problem. Using
this logic and two refinements, we show the fair nontermination
problem to be PTIME equivalent to reachability for several
definitions of fairness. For other versions of fairness, this
problem is shown to be either PTIME equivalent to the boundedness
problem or highly undecidable. In all, 24 versions of fairness are
examined.
TR-88-04 A GENERAL APPROACH TO MULTIPROCESSOR SCHEDULING
Sung Jo Kim
February 1988
155 pages
ABSTRACT: As a variety of general-purpose multiprocessor systems
have been recently designed and built, multiprocessor scheduling is
becoming increasingly important. Multiprocessor scheduling is a
technique to exploit the underlying hardware in a multiprocessor
system so that parallelism existing in an application program can
be fully utilized and interprocessor communication time can be
minimized. Traditionally, most research on multiprocessor
scheduling has focused on the development of specific scheduling
strategies to take advantage of unique characteristics of a
specific multiprocessor system or application program. In this
thesis, we define and characterize scheduling techniques and
related heuristic mapping algorithms which are applicable to a
spectrum of multiprocessor systems and a broad class of application
programs.
The fundamental idea we use is that multiprocessor scheduling can
be regarded as a series of mappings from a computation graph
(representing an application program) to a virtual architecture
graph (representing an optimal architecture for the program) and
eventually to a physical architecture graph (representing a target
multiprocessor system). We propose linear clustering and linear
cluster merging as effectual heuristics. After linear clustering
and merging, the computation graph is transformed into a virtual
architecture graph. This graph represents an optimal architecture
which compromises between two conflicting goals, minimization of
interprocessor communication and maximization of potential
parallelism, and satisfies the other goals, throughput enhancement
and workload balance, relatively well. Then we develop two
efficient scheduling algorithms which map the optimal architecture
graph onto a physical architecture graph which may represent either
a homogeneous or a heterogeneous multiprocessor system. These
algorithms rely not only on local information but also on limited
global information. Finally, we present the result of performance
evaluation of the mapping algorithms on an Intel iPSC with 32
processors and a Sequent Balance with 10 processors.
TR-88-05 SCHEMA VERSIONS AND DAG REARRANGEMENT VIEWS IN OBJECT-ORIENTED
DATABASES
Hyoung-Joo Kim and Henry F. Korth
February 1988
30 pages
ABSTRACT: An important requirement of non-traditional database
applications such as computer aided design, artificial intel-
ligence, and office information systems with multimedia documents
is the support of application evolution. Application evolution
includes evolution of object schemas as well as evolution of
objects in the application.
We provided a framework of schema evolution in [BKKK86, BKKK87] and
the framework was realized in an object-oriented database system,
ORION, at MCC. In this paper, we extend this schema evolution
framework by allowing schema versions and DAG rearrangement views
in object-oriented databases. We present a technique that enables
users to manipulate schema versions explicitly and maintain schema
evolution histories. For completeness, we integrate our model with
the object version model formulated by H.T. Chou and W. Kim [CK86].
We identify new types of view, called DAG rearrangement views, of
composite objects and class hierarchies. We present a set of
operators for defining DAG rearrangement views. We identify sets of
composite object views with the property that queries on the views
are processable on instances of the original composite object
schema. We also discuss how instances would be viewed and
reorganized in DAG rearrangement views of class hierarchies.
TR-88-06 CONCURRENT ACCESS OF PRIORITY QUEUES
V. Nageshwara Rao and Vipin Kumar
February 1988
26 pages
ABSTRACT: The heap is an important data structure used as a
priority queue in a wide variety of parallel algorithms (e.g.,
multiprocessor scheduling, branch-and-bound). In these algorithms,
contention for the shared heap limits the obtainable speedup. This
paper presents an approach to allow concurrent insertions and
deletions on the heap in a shared-memory multiprocessor. Our scheme
has much smaller overhead and gives a much better performance than
a previously reported scheme. The scheme also retains the strict
priority ordering of the serial-access heap algorithms; i.e., a
delete operation returns the best key of all keys that have been
inserted or are being inserted at the time delete is started. Our
experimental results on the BBN Butterfly parallel processor
demonstrate that the use of the concurrent-heap algorithms in
parallel branch-and-bound improves its performance substantially.
TR-88-07 FAST RAY TRACING USING K-D TREES
Donald Fussell and K. R. Subramanian
March 1988
22 pages
ABSTRACT: A hierarchical search structure for ray tracing based on
k-d trees is introduced. This data structure can handle the
variety of surfaces commonly used in computer graphics. Algorithms
to build and traverse this binary data structure are described.
Only regions through which a ray travels are interrogated and the
search proceeds along the path of the ray starting from its origin.
The algorithm also ensures that no object is interrogated more than
once in the search for intersection. Benchmarking results indicate
that when used with axis-aligned bounding parallelepipeds, this
method of ray tracing is one of the fastest known.
TR-88-08 SEPARATION PAIR DETECTION
Donald Fussell and Ramakrishna Thurimella
March 1988
14 pages
ABSTRACT: A separation pair of a biconnected graph is a pair of
vertices whose removal disconnects the graph. The central part of
any algorithm that finds triconnected components is an algorithm
for separation pairs. Recently Miller and Ramachandran have given a
2
parallel algorithm that runs in O(log n) time using O(m)
processors. We present a new algorithm for finding all separation
pairs of a biconnected graph that runs in O(log n) time using O(m)
processors. A direct consequence is a test for triconnectivity of
a graph within the same resource bounds.
TR-88-09 ARITHMETIC CLASSIFICATION OF PERFECT MODELS OF STRATIFIED PROGRAMS
Krzysztof R. Apt and Howard A. Blair
March 1988
14 pages
ABSTRACT: We study here the recursion theoretic complexity of the
perfect (Herbrand) models of stratified logic programs. We show
that these models lie arbitrarily high in the arithmetic hierarchy.
As a byproduct we obtain a similar characterization of the
recursion theoretic complexity of the set of consequences in a
number of formalisms for non-monotonic reasoning. We show that
under some circumstances this complexity can be brought down to
recursive enumerability.
TR-88-10 APPRAISING FAIRNESS IN LANGUAGES FOR DISTRIBUTED PROGRAMMING
Krzysztof R. Apt, Nissim Francez, and Shmuel Katz
March 1988
28 pages
ABSTRACT: The relations among various languages and models for
distributed computation and various possible definitions of
fairness are considered. Natural semantic criteria are presented
which an acceptable notion of fairness should satisfy. These are
then used to demonstrate differences among the basic models, the
added power of the fairness notion, and the sensitivity of the
fairness notion to irrelevant semantic interleavings of independent
operations. These results are used to show that from the
considerable variety of commonly used possibilities, only strong
process fairness is appropriate for CSP if these criteria are
adopted. We also show that under these criteria, none of the
commonly used notions of fairness are fully acceptable for a model
with an n-way synchronization mechanism. The notion of fairness
most often mentioned for Ada is shown to be fully acceptable. For a
model with nonblocking send operations, some variants of common
fairness definitions are appraised, and two are shown to satisfy
the suggested criteria.
TR-88-11 PERFORMANCE MODELLING OF PARALLEL COMPUTATIONS
Ashok K. Adiga
April 1988
165 pages
ABSTRACT: The design of parallel computations involves numerous
decisions which effect execution efficiency. The choice of an
optimum configuration for a computation on a given architecture is
essential for attaining the maximum efficiency in terms of achieved
speedup. Some of the relevant factors in the configuration space of
a computation include the granularity of a task, the communication
model used, choice of dependencies between tasks and the host
architecture on which the application is to be run. In this
dissertation, we present a model for representing parallel
computations which can be used to analyze their performance for
various configurations. Our model is an extended Petri Net with
facilities to model control and data flow mechanisms, as well as
synchronization and communication primitives. A methodology is
developed for representing the execution of a computation on a
given architecture. The methodology consists of viewing the model
as consisting of three distinct submodels (the computation,
architecture and mapping submodels) which have standard interfaces
between them. Specification of a structured methodology enables
the automatic generation of model instances. In addition, it
becomes possible to specify a library from which architectures can
be selected to determine if they are suitable for a given
computation. This modelling technique is then used to study the
performance of computations under variations in their configuration
parameters, including their actual run-time behavior on various
target architectures.
TR-88-12 SPECIFYING AN IMPLEMENTATION TO SATISFY INTERFACE SPECIFICATIONS: A
STATE TRANSITION APPROACH
Simon S. Lam and A. Udaya Shankar
April 1988
17 pages
ABSTRACT: We present a solution to the problem posed by Leslie
Lamport to participants of the Specification Logics session in the
1987 Lake Arrowhead workshop. Formal specifications are given for
a database interface offering serializable access to concurrent
client programs, a two-phase locking implementation of the client
interface, and the physical-database interface accessed by the
implementation. We sketch a proof that the implementation
satisfies the client interface specification, assuming that the
physical-database interface specification holds.
TR-88-13 RELATIONAL DATABASE STRUCTURE FOR STORAGE AND MANIPULATION OF
DEPENDENCY GRAPHS
Sivagnanam Ramasundaram Easwar
April 1988
100 pages
ABSTRACT: This thesis provides a first step towards resolution of
the problem, of converting sequential Fortran programs to parallel,
by capturing the potential parallel computation structure of a
Fortran program in a Relational Database. Parallel languages are
required to fully utilize the Parallel machines that have been
developed. Many Man-years of Sequential Programs (in FORTRAN) have
already been written. Re-writing these programs in some parallel
language would be almost impossible. The Database produced by this
thesis can then be used by other programs, to generate specific
parallel computation structures, appropriate for given environ-
ments.
TR-88-14 PARALLEL HEURISTIC SEARCH OF STATE-SPACE GRAPHS: A SUMMARY OF
RESULTS
Vipin Kumar, K. Ramesh, and V. Nageshwara Rao
April 1988
20 pages
ABSTRACT: This paper presents many different parallel formulations
of the A*/Branch-and-Bound search algorithm. The parallel
formulations primarily differ in the data structures used. Some
formulations are suited only for shared-memory architectures,
whereas others are suited for distributed-memory architectures as
well. These parallel formulations have been implemented to solve
the vertex cover problem and the TSP problem on the BBN Butterfly
parallel processor. Using appropriate data structures, we are able
to obtain fairly linear speedups for as many as 100 processors. We
also discovered problem characteristics that make certain
formulations more (or less) suitable for some search problems.
Since the best-first search paradigm of A*/Branch-and-Bound is very
commonly used, we expect these parallel formulations to be
effective for a variety of problems. Concurrent and distributed
priority queues used in these parallel formulations can be used in
many parallel algorithms other than parallel A*/branch-and-bound.
TR-88-15 AND-PARALLEL EXECUTION OF LOGIC PROGRAMS ON A SHARED MEMORY
MULTIPROCESSOR: A SUMMARY OF RESULTS
Yow-Jian Lin and Vipin Kumar
April 1988
20 pages
ABSTRACT: This paper presents the implementation of an
AND-parallel execution model of logic programs on a shared-memory
multiprocessor. The major features of the implementation are
(i) dependency analysis between literals of a clause is done
dynamically without incurring excessive run-time overhead;
(ii) backtracking is done intelligently at the clause level without
incurring any extra cost for the determination of the backtrack
literal; (iii) the implementation is based upon the Warren Abstract
Machine (WAM), hence retains most of the efficiency of the WAM for
sequential segments of logic programs. Performance results on
Sequent Balance 21000 show that our parallel implementation can
achieve reasonable speedup on dozens of processors.
TR-88-16 PARALLEL DEPTH FIRST SEARCH ON THE RING ARCHITECTURE
Vipin Kumar, V. Nageshwara Rao, and K. Ramesh
April 1988
20 pages
ABSTRACT: This paper presents the implementation and analysis of
parallel depth-first search on the ring architecture. At the heart
of the parallel formulation of depth-first search is a dynamic work
distribution scheme that divides the work between different
processors. The effectiveness of the parallel formulation is
strongly influenced by the choice of the work distribution scheme.
In particular, a commonly used work distribution scheme is found to
give very poor performance on large rings ( > 32 processors). We
present a new work distribution scheme that is better than the work
distribution scheme used by other researchers, and gives good
performance even on large rings (128 processors). We introduce the
concept of iso-efficiency function to characterize the effec-
tiveness of different work distribution schemes.