leff@smu.UUCP (Laurence Leff) (05/17/88)
DEPARTMENT OF COMPUTER SCIENCES TECHNICAL REPORT CENTER TAYLOR HALL 2.124 THE UNIVERSITY OF TEXAS AT AUSTIN AUSTIN, TEXAS 78712-1188 (512) 471-9595 NOTE NEW ELECTRONIC MAIL ADDRESS: trcenter@cs.utexas.edu TECHNICAL REPORTS, JANUARY 1988 - APRIL 1988 PREPAYMENT IS REQUIRED. Please make U.S. bank checks or international money orders payable to "The University of Texas." [ ] TR-86-24 (Revision) $1.50 [ ] TR-87-20 (Revision) $1.50 [ ] TR-87-23 (Revision) $2.50 [ ] TR-87-28 (Revision) $1.50 [ ] TR-88-01 $1.50 [ ] TR-88-07 $1.50 [ ] TR-88-13 $5.00 [ ] TR-88-02 $1.50 [ ] TR-88-08 $1.50 [ ] TR-88-14 $1.50 [ ] TR-88-03 $2.00 [ ] TR-88-09 $1.50 [ ] TR-88-15 $1.50 [ ] TR-88-04 $5.00 [ ] TR-88-10 $1.50 [ ] TR-88-16 $1.50 [ ] TR-88-05 $1.50 [ ] TR-88-11 $5.00 [ ] TR-88-06 $1.50 [ ] TR-88-12 $1.50 ------------------------------------------------------------------------------- TR-86-24 IMPLEMENTATION CONCEPTS FOR AN EXTENSIBLE DATA MODEL AND DATA LANGUAGE D. S. Batory, T. Y. Leung, and T. E. Wise February 1988 30 pages ABSTRACT: Future database systems must feature extensible data models and data languages in order to accommodate the novel data types and special-purpose operations that are required by nontraditional database applications. In this paper, we outline a functional data model and data language that are targeted for the semantic interface of GENESIS, an extensible DBMS. The model and language are generalizations of FQL [Bun82] and DAPLEX [Shi81], and have an implementation that fits ideally with the modularity required by extensible database technologies. We explore different implementations of functional operators and present experimental evidence that they have efficient implementations. We also explain the advantages of a functional front-end to not-1NF databases, and show how our language and implementation are being used to process queries on both 1NF and not-1NF relations. TR-87-20 ATOMIC SEMANTICS OF NONATOMIC PROGRAMS James H. Anderson and Mohamed G. Gouda March 1988 8 pages ABSTRACT: We argue that it is possible, and sometimes useful, to reason about nonatomic programs within the conventional atomic model of concurrency. TR-87-23 BUILDING BLOCKS OF DATABASE MANAGEMENT SYSTEMS D. S. Batory February 1988 42 pages ABSTRACT: We present a very simple formalism based on parameterized types and a rule-based algebra to survey and identify the storage structure and query processing algorithm building blocks of database management systems. We demonstrate building block reusability by showing how different combinations of a few blocks yield the structures and algorithms of three different systems, namely System R (centralized), R* (distributed), and GRACE (database machine). We believe that codifying knowledge of DBMS implementations is an important step toward a technology that assembles DBMSs rapidly and cheaply from libraries of prewritten components. TR-87-28 PAMMA NONITERATIVE APPROXIMATE SOLUTION METHOD FOR CLOSED MULTICHAIN QUEUEING NETWORKS Ching-Tarng Hsieh and Simon S. Lam March 1988 24 pages ABSTRACT: Approximate MVA algorithms for separable queueing networks are based upon an iterative solution of a set of modified MVA formulas. Although each iteration has a computational time 2 requirement of O(MK ) or less, many iterations are typically needed for convergence to a solution. (M denotes the number of queues and K the number of closed chains or customer classes.) We present some faster approximate solution algorithms that are noniterative. They are suitable for the analysis and design of communication networks which may require tens to hundreds, perhaps thousands, of closed chains to model flow-controlled virtual channels. The basis of our method is the distribution of a chain's population proportional to loads to get initial estimates of mean queue lengths. This is the same basis used in the derivation of proportional upper bounds for single-chain networks; for a multichain network, such a proportional distribution leads to approximations rather than upper bounds of chain throughputs. Nevertheless, these approximate solutions provide chain through- puts, mean end-to-end delays, and server utilizations that are sufficiently accurate for the analysis and design of communication networks and possibly other distributed systems with a large number of customer classes. Three PAM algorithms of increasing accuracy are presented. Two of them have time and space requirements of 2 O(MK). The third algorithm has a time requirement of O(MK ) and a space requirement of O(MK). TR-88-01 CONCEPTS FOR A DATABASE SYSTEM COMPILER D. S. Batory January 1988 9 pages ABSTRACT: We propose a very simple formalism based on parameterized types and a rule-based algebra to explain the storage structures and algorithms of database management systems. Implementations of DBMSs are expressed as equations. If all functions referenced in the equations have been implemented, the software for a DBMS can be synthesized in minutes at little cost, in contrast to current methods where man-years of effort and hundreds of thousands of dollars are required. Our research aims to develop a DBMS counterpart to today's compiler-compiler techniques. TR-88-02 ON THE REUSABILITY OF QUERY OPTIMIZATION ALGORITHMS D. S. Batory January 1988 23 pages ABSTRACT: We tackle the problem of software reusability in this paper as it pertains to an important class of nonrecursive query optimization algorithms. We demonstrate reusability can be achieved by 1) imposing standardized interfaces and 2) expressing algorithms in a DBMS implementation-independent manner. The former is accomplished by generalizing the notion of query graphs, and the latter is accomplished by standardizing algorithm definitions in terms of query graph rewrite rules. Demonstrating algorithm reusability is an essential step toward a building-blocks technology for extensible database systems. TR-88-03 A TAXONOMY OF FAIRNESS AND TEMPORAL LOGIC PROBLEMS FOR PETRI NETS Rodney R. Howell, Louis E. Rosier, and Hsu-Chun Yen January 1988 38 pages ABSTRACT: In this paper, we define a temporal logic for reasoning about Petri nets. We show the model checking problem for this logic to be PTIME equivalent to the Petri net reachability problem. Using this logic and two refinements, we show the fair nontermination problem to be PTIME equivalent to reachability for several definitions of fairness. For other versions of fairness, this problem is shown to be either PTIME equivalent to the boundedness problem or highly undecidable. In all, 24 versions of fairness are examined. TR-88-04 A GENERAL APPROACH TO MULTIPROCESSOR SCHEDULING Sung Jo Kim February 1988 155 pages ABSTRACT: As a variety of general-purpose multiprocessor systems have been recently designed and built, multiprocessor scheduling is becoming increasingly important. Multiprocessor scheduling is a technique to exploit the underlying hardware in a multiprocessor system so that parallelism existing in an application program can be fully utilized and interprocessor communication time can be minimized. Traditionally, most research on multiprocessor scheduling has focused on the development of specific scheduling strategies to take advantage of unique characteristics of a specific multiprocessor system or application program. In this thesis, we define and characterize scheduling techniques and related heuristic mapping algorithms which are applicable to a spectrum of multiprocessor systems and a broad class of application programs. The fundamental idea we use is that multiprocessor scheduling can be regarded as a series of mappings from a computation graph (representing an application program) to a virtual architecture graph (representing an optimal architecture for the program) and eventually to a physical architecture graph (representing a target multiprocessor system). We propose linear clustering and linear cluster merging as effectual heuristics. After linear clustering and merging, the computation graph is transformed into a virtual architecture graph. This graph represents an optimal architecture which compromises between two conflicting goals, minimization of interprocessor communication and maximization of potential parallelism, and satisfies the other goals, throughput enhancement and workload balance, relatively well. Then we develop two efficient scheduling algorithms which map the optimal architecture graph onto a physical architecture graph which may represent either a homogeneous or a heterogeneous multiprocessor system. These algorithms rely not only on local information but also on limited global information. Finally, we present the result of performance evaluation of the mapping algorithms on an Intel iPSC with 32 processors and a Sequent Balance with 10 processors. TR-88-05 SCHEMA VERSIONS AND DAG REARRANGEMENT VIEWS IN OBJECT-ORIENTED DATABASES Hyoung-Joo Kim and Henry F. Korth February 1988 30 pages ABSTRACT: An important requirement of non-traditional database applications such as computer aided design, artificial intel- ligence, and office information systems with multimedia documents is the support of application evolution. Application evolution includes evolution of object schemas as well as evolution of objects in the application. We provided a framework of schema evolution in [BKKK86, BKKK87] and the framework was realized in an object-oriented database system, ORION, at MCC. In this paper, we extend this schema evolution framework by allowing schema versions and DAG rearrangement views in object-oriented databases. We present a technique that enables users to manipulate schema versions explicitly and maintain schema evolution histories. For completeness, we integrate our model with the object version model formulated by H.T. Chou and W. Kim [CK86]. We identify new types of view, called DAG rearrangement views, of composite objects and class hierarchies. We present a set of operators for defining DAG rearrangement views. We identify sets of composite object views with the property that queries on the views are processable on instances of the original composite object schema. We also discuss how instances would be viewed and reorganized in DAG rearrangement views of class hierarchies. TR-88-06 CONCURRENT ACCESS OF PRIORITY QUEUES V. Nageshwara Rao and Vipin Kumar February 1988 26 pages ABSTRACT: The heap is an important data structure used as a priority queue in a wide variety of parallel algorithms (e.g., multiprocessor scheduling, branch-and-bound). In these algorithms, contention for the shared heap limits the obtainable speedup. This paper presents an approach to allow concurrent insertions and deletions on the heap in a shared-memory multiprocessor. Our scheme has much smaller overhead and gives a much better performance than a previously reported scheme. The scheme also retains the strict priority ordering of the serial-access heap algorithms; i.e., a delete operation returns the best key of all keys that have been inserted or are being inserted at the time delete is started. Our experimental results on the BBN Butterfly parallel processor demonstrate that the use of the concurrent-heap algorithms in parallel branch-and-bound improves its performance substantially. TR-88-07 FAST RAY TRACING USING K-D TREES Donald Fussell and K. R. Subramanian March 1988 22 pages ABSTRACT: A hierarchical search structure for ray tracing based on k-d trees is introduced. This data structure can handle the variety of surfaces commonly used in computer graphics. Algorithms to build and traverse this binary data structure are described. Only regions through which a ray travels are interrogated and the search proceeds along the path of the ray starting from its origin. The algorithm also ensures that no object is interrogated more than once in the search for intersection. Benchmarking results indicate that when used with axis-aligned bounding parallelepipeds, this method of ray tracing is one of the fastest known. TR-88-08 SEPARATION PAIR DETECTION Donald Fussell and Ramakrishna Thurimella March 1988 14 pages ABSTRACT: A separation pair of a biconnected graph is a pair of vertices whose removal disconnects the graph. The central part of any algorithm that finds triconnected components is an algorithm for separation pairs. Recently Miller and Ramachandran have given a 2 parallel algorithm that runs in O(log n) time using O(m) processors. We present a new algorithm for finding all separation pairs of a biconnected graph that runs in O(log n) time using O(m) processors. A direct consequence is a test for triconnectivity of a graph within the same resource bounds. TR-88-09 ARITHMETIC CLASSIFICATION OF PERFECT MODELS OF STRATIFIED PROGRAMS Krzysztof R. Apt and Howard A. Blair March 1988 14 pages ABSTRACT: We study here the recursion theoretic complexity of the perfect (Herbrand) models of stratified logic programs. We show that these models lie arbitrarily high in the arithmetic hierarchy. As a byproduct we obtain a similar characterization of the recursion theoretic complexity of the set of consequences in a number of formalisms for non-monotonic reasoning. We show that under some circumstances this complexity can be brought down to recursive enumerability. TR-88-10 APPRAISING FAIRNESS IN LANGUAGES FOR DISTRIBUTED PROGRAMMING Krzysztof R. Apt, Nissim Francez, and Shmuel Katz March 1988 28 pages ABSTRACT: The relations among various languages and models for distributed computation and various possible definitions of fairness are considered. Natural semantic criteria are presented which an acceptable notion of fairness should satisfy. These are then used to demonstrate differences among the basic models, the added power of the fairness notion, and the sensitivity of the fairness notion to irrelevant semantic interleavings of independent operations. These results are used to show that from the considerable variety of commonly used possibilities, only strong process fairness is appropriate for CSP if these criteria are adopted. We also show that under these criteria, none of the commonly used notions of fairness are fully acceptable for a model with an n-way synchronization mechanism. The notion of fairness most often mentioned for Ada is shown to be fully acceptable. For a model with nonblocking send operations, some variants of common fairness definitions are appraised, and two are shown to satisfy the suggested criteria. TR-88-11 PERFORMANCE MODELLING OF PARALLEL COMPUTATIONS Ashok K. Adiga April 1988 165 pages ABSTRACT: The design of parallel computations involves numerous decisions which effect execution efficiency. The choice of an optimum configuration for a computation on a given architecture is essential for attaining the maximum efficiency in terms of achieved speedup. Some of the relevant factors in the configuration space of a computation include the granularity of a task, the communication model used, choice of dependencies between tasks and the host architecture on which the application is to be run. In this dissertation, we present a model for representing parallel computations which can be used to analyze their performance for various configurations. Our model is an extended Petri Net with facilities to model control and data flow mechanisms, as well as synchronization and communication primitives. A methodology is developed for representing the execution of a computation on a given architecture. The methodology consists of viewing the model as consisting of three distinct submodels (the computation, architecture and mapping submodels) which have standard interfaces between them. Specification of a structured methodology enables the automatic generation of model instances. In addition, it becomes possible to specify a library from which architectures can be selected to determine if they are suitable for a given computation. This modelling technique is then used to study the performance of computations under variations in their configuration parameters, including their actual run-time behavior on various target architectures. TR-88-12 SPECIFYING AN IMPLEMENTATION TO SATISFY INTERFACE SPECIFICATIONS: A STATE TRANSITION APPROACH Simon S. Lam and A. Udaya Shankar April 1988 17 pages ABSTRACT: We present a solution to the problem posed by Leslie Lamport to participants of the Specification Logics session in the 1987 Lake Arrowhead workshop. Formal specifications are given for a database interface offering serializable access to concurrent client programs, a two-phase locking implementation of the client interface, and the physical-database interface accessed by the implementation. We sketch a proof that the implementation satisfies the client interface specification, assuming that the physical-database interface specification holds. TR-88-13 RELATIONAL DATABASE STRUCTURE FOR STORAGE AND MANIPULATION OF DEPENDENCY GRAPHS Sivagnanam Ramasundaram Easwar April 1988 100 pages ABSTRACT: This thesis provides a first step towards resolution of the problem, of converting sequential Fortran programs to parallel, by capturing the potential parallel computation structure of a Fortran program in a Relational Database. Parallel languages are required to fully utilize the Parallel machines that have been developed. Many Man-years of Sequential Programs (in FORTRAN) have already been written. Re-writing these programs in some parallel language would be almost impossible. The Database produced by this thesis can then be used by other programs, to generate specific parallel computation structures, appropriate for given environ- ments. TR-88-14 PARALLEL HEURISTIC SEARCH OF STATE-SPACE GRAPHS: A SUMMARY OF RESULTS Vipin Kumar, K. Ramesh, and V. Nageshwara Rao April 1988 20 pages ABSTRACT: This paper presents many different parallel formulations of the A*/Branch-and-Bound search algorithm. The parallel formulations primarily differ in the data structures used. Some formulations are suited only for shared-memory architectures, whereas others are suited for distributed-memory architectures as well. These parallel formulations have been implemented to solve the vertex cover problem and the TSP problem on the BBN Butterfly parallel processor. Using appropriate data structures, we are able to obtain fairly linear speedups for as many as 100 processors. We also discovered problem characteristics that make certain formulations more (or less) suitable for some search problems. Since the best-first search paradigm of A*/Branch-and-Bound is very commonly used, we expect these parallel formulations to be effective for a variety of problems. Concurrent and distributed priority queues used in these parallel formulations can be used in many parallel algorithms other than parallel A*/branch-and-bound. TR-88-15 AND-PARALLEL EXECUTION OF LOGIC PROGRAMS ON A SHARED MEMORY MULTIPROCESSOR: A SUMMARY OF RESULTS Yow-Jian Lin and Vipin Kumar April 1988 20 pages ABSTRACT: This paper presents the implementation of an AND-parallel execution model of logic programs on a shared-memory multiprocessor. The major features of the implementation are (i) dependency analysis between literals of a clause is done dynamically without incurring excessive run-time overhead; (ii) backtracking is done intelligently at the clause level without incurring any extra cost for the determination of the backtrack literal; (iii) the implementation is based upon the Warren Abstract Machine (WAM), hence retains most of the efficiency of the WAM for sequential segments of logic programs. Performance results on Sequent Balance 21000 show that our parallel implementation can achieve reasonable speedup on dozens of processors. TR-88-16 PARALLEL DEPTH FIRST SEARCH ON THE RING ARCHITECTURE Vipin Kumar, V. Nageshwara Rao, and K. Ramesh April 1988 20 pages ABSTRACT: This paper presents the implementation and analysis of parallel depth-first search on the ring architecture. At the heart of the parallel formulation of depth-first search is a dynamic work distribution scheme that divides the work between different processors. The effectiveness of the parallel formulation is strongly influenced by the choice of the work distribution scheme. In particular, a commonly used work distribution scheme is found to give very poor performance on large rings ( > 32 processors). We present a new work distribution scheme that is better than the work distribution scheme used by other researchers, and gives good performance even on large rings (128 processors). We introduce the concept of iso-efficiency function to characterize the effec- tiveness of different work distribution schemes.