leff@smu.UUCP (Laurence Leff) (04/24/89)
Bibliography of Technical Reports
Computer Sciences Department, University of Wisconsin
November 1988 - January 1989

For copies, send requests to:

Technical Report Librarian
Computer Sciences Department
University of Wisconsin
1210 W. Dayton Street
Madison, WI 53706 USA

%A Nira Dyn %A Amos Ron %T On Multivariate Polynomial Interpolation %D January 1989 %R TR 820 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X A class of spaces of multivariate polynomials, closed under differentiation, is studied and corresponding classes of well posed Hermite-type interpolation problems are presented. All Hermite-type problems are limits of well posed Lagrange problems. The results are based on a duality between certain spaces of multivariate exponential-polynomials @\fIH@ and corresponding spaces of multivariate polynomials @\fIP@, used by Dyn and Ron (1988) to establish the approximation order of the span of translates of exponential box splines. In the interpolation theory @\fIP@ is the space of interpolating polynomials and @\fIH@ characterizes the interpolation points and the interpolation conditions, both spaces being defined in terms of a set of hyperplanes in \fII\h'-0.07i'R\s-2\us\s+2\d\fR. This geometric approach extends the work of Chung and Yao (1977) on Lagrange interpolation, and also a subset of the Hermite-type problems considered via the Newton scheme by several authors (see Gasca and Maeztu (1982) and references therein). For a different approach to the interpolation problem see Chui and Lai (1988). It is the systematic and unified analysis of a wide class of interpolation problems which is the main contribution of this paper to the study of multivariate polynomial interpolation. 
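The Newton scheme mentioned in the abstract above is easiest to see in one variable, where Lagrange interpolation reduces to divided differences and coalescing points yields Hermite (derivative) conditions in the limit. The following is an illustrative univariate sketch only; the report treats the genuinely multivariate case, and all names here are our own.

```python
# Univariate Newton interpolation via divided differences -- the
# one-variable analogue of the Newton scheme cited in the abstract.
# Illustrative sketch only, not the report's multivariate method.

def divided_differences(xs, ys):
    """Return the Newton coefficients f[x0], f[x0,x1], ..., f[x0..xn]."""
    coef = list(ys)
    n = len(xs)
    for j in range(1, n):
        # Update in place, highest index first, so the lower-order
        # differences are still available when needed.
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate the Newton form at x with Horner-style nesting."""
    acc = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        acc = acc * (x - xs[i]) + coef[i]
    return acc
```

For distinct points the Lagrange problem is well posed and the interpolant reproduces any polynomial of degree below the number of points, mirroring the limit process from Lagrange to Hermite problems described above.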
%A Carl de Boor %A Amos Ron %T On Multivariate Polynomial Interpolation %D January 1989 %R TR 822 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X We provide a map $\(*H \(->\s-2 \(*p\s+2 sub \(*H$ which associates each finite subset \(*H \(sb \C\s-2\us\s+2\d with a polynomial space $\s-2\(*p\s+2 sub \(*H$ from which interpolation to arbitrary data given at the points in $\(*H$ is possible and uniquely so. Among all polynomial spaces $Q$ from which interpolation at $\(*H$ is uniquely possible, our $\s-2\(*p\s+2 sub \(*H$ is of smallest degree. It is also $D$- and scale-invariant. Our map is monotone, thus providing a Newton form for the resulting interpolant. Our map is also continuous within reason, allowing us to interpret certain cases of coalescence as Hermite interpolation. In fact, our map can be extended to the case where, with each $\(*h \(mo \(*H$, there is associated a polynomial space $P sub \(*h$, and, for given smooth $f$, a polynomial $q \(mo Q$ is sought for which $p(D)(f-q)(\(*h)=0$, $\(fa p \(mo P sub \(*h$, $\(*h \(mo \(*H$. We obtain $\s-2\(*p\s+2 sub \(*H$ as the ``scaled limit at the origin'' $(exp sub \(*H ) sub \(da$ of the exponential space $exp sub \(*H$, with frequencies $\(*H$, and base our results on a study of the map $H \(-> H\(da$ defined on subspaces $H$ of the space of functions analytic at the origin. This study also allows us to determine the local approximation order from such $H$ and provides an algorithm for the construction of $H\(da$ from any basis for $H$. %A Mark D. Hill %A Alan Jay Smith %T Evaluating Associativity in CPU Caches %D February 1989 %R TR 823 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X Because of the infeasibility or expense of large fully-associative caches, cache memories are often designed to be set-associative or direct-mapped. 
This paper (1) presents new and efficient algorithms for simulating alternative direct-mapped and set-associative caches, and (2) uses those algorithms to quantify the effect of limited associativity on the cache miss ratio. We introduce a new algorithm, \fIforest simulation\fP, for simulating alternative direct-mapped caches and generalize one, which we call \fIall-associativity simulation\fP, for simulating alternative direct-mapped, set-associative and fully-associative caches. We find that while all-associativity simulation is theoretically less efficient than forest simulation or stack simulation (a commonly-used simulation algorithm), in practice, it is not much slower and allows the simulation of many more caches with a single pass through an address trace. We also provide data and insight into how varying associativity affects the miss ratio. We show: (1) how to use the simulations of alternative caches to isolate the cause of misses; (2) that the principal reason why set-associative miss ratios are larger than fully-associative ones is (as one might expect) that too many active blocks map to a fraction of the sets even when blocks map to sets in a uniform random manner; and (3) that reducing associativity from eight-way to four-way, from four-way to two-way, and from two-way to direct-mapped causes relative miss ratio increases in our data of about 5, 10 and 30 percent respectively, regardless of cache size. %A Joel E. Richardson %A Michael J. Carey %A Daniel T. Schuh %T The Design of the E Programming Language %D February 1989 %R TR 824 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X E is an extension of C++ designed for writing data-intensive systems software (e.g. a DBMS). Several aspects of this programming domain cause difficulty in conventional languages. For example, one must usually write the system code without knowing the types of entities to be manipulated. 
In addition, the entities themselves are persistent, outlasting the program which created them. E addresses these and other problems through a judicious choice of language constructs that significantly ease the programmer's task. Being based on C++, E provides classes and inheritance and then adds generator classes for defining generic container types, iterators for processing streams of values, and a persistent storage class for declaring a database as a collection of language objects. The result is a systems-level programming language which easily expresses the types and operations internal to a DBMS as well as those of the database itself. %A R. E. Kessler %A Miron Livny %T An Analysis of Distributed Shared Memory Algorithms %D February 1989 %R TR 825 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X This paper describes results obtained in a study of algorithms to implement a Distributed Shared Memory in a distributed (loosely coupled) environment. Distributed Shared Memory is the implementation of shared memory across multiple nodes in a distributed system. This is accomplished using only the private memories of the nodes by controlling access to the pages of the shared memory and transferring data to and from the private memories when necessary. We analyze alternative algorithms to implement Distributed Shared Memory, all of them based on the ideas presented in Kai Li's dissertation. The Distributed Shared Memory algorithms are analyzed and compared over a wide range of conditions. Application characteristics are identified which can be exploited by the Distributed Shared Memory algorithms. We show the conditions under which the algorithms analyzed in this paper perform better or worse than the other alternatives. Results are obtained via simulation using a synthetic reference generator. %A Michael J. 
Carey %A Miron Livny %T Conflict Detection Tradeoffs for Replicated Data %D March 1989 %R TR 826 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X Many concurrency control algorithms have been proposed for use in distributed database systems. Despite the large number of available algorithms, and the fact that distributed database systems are becoming a commercial reality, distributed concurrency control performance tradeoffs are still not well understood. In this paper we examine some of these tradeoffs by using a detailed model of a distributed DBMS to study a set of representative algorithms, including several derivatives of the two-phase locking, timestamp ordering, and optimistic approaches to distributed concurrency control. In particular, we examine the performance of these algorithms as a function of data contention for various levels of data replication and "distributedness" of accesses to replicated data. The results provide some interesting insights into how the tradeoffs between early and late conflict detection vary as a function of message cost, and should prove useful to distributed database system designers. %A Thomas Reps %A Thomas Bricker %T Illustrating Interference in Interfering Versions of Programs %D March 1989 %R TR 827 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X The need to integrate several versions of a program into a common one arises frequently, but it is a tedious and time consuming task to merge programs by hand. The program-integration algorithm recently proposed by S. Horwitz, J. Prins, and T. Reps provides a way to create a semantics-based tool for program integration. The integration algorithm is based on the assumption that any change in the behavior, rather than the text, of a program variant is significant and must be preserved in the merged program. 
An integration system based on this algorithm will either automatically combine several different but related variants of a base program, or else determine that the programs incorporate interfering changes. In this paper we discuss how an integration tool can illustrate the causes of interference to the user when interference is detected. Our main technical result is an alternative characterization of the integration algorithm's interference criterion that is more suitable for illustrating the causes of interference. We then propose six methods for an integration system to display information to demonstrate the causes of interference to the user. %A Michael J. Carey %A Rajiv Jauhari %A Miron Livny %T Priority in DBMS Resource Scheduling %D March 1989 %R TR 828 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X In this paper, we address the problem of priority scheduling in a database management system. We start by investigating the architectural consequences of adding priority to a DBMS. Specific priority-based schemes are then proposed for managing DBMS resources, including a priority-based disk scheduling algorithm and two approaches to adding priority to the buffer manager of a DBMS. We study the performance of our proposals through simulation, both individually and in combination. Our simulation results indicate that the objectives of priority scheduling cannot be met by a single priority-based scheduler. They show that, regardless of whether the system bottleneck is the CPU or the disk, priority scheduling on the critical resource must be complemented by a priority-based buffer management policy. 
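One concrete (and purely hypothetical) reading of a priority-based disk scheduler of the kind proposed in TR 828 above: serve pending requests strictly in priority order, and break ties among equal-priority requests by shortest seek distance from the current head position. The tie-break rule and all names below are our own assumptions for illustration, not the report's actual algorithm.

```python
# Sketch of a priority-based disk scheduler: strict priority first,
# shortest-seek tie-break second. The tie-break policy is an
# assumption for illustration, not necessarily that of TR 828.

class PriorityDiskScheduler:
    def __init__(self, head=0):
        self.head = head          # current head position (block number)
        self.pending = []         # outstanding (priority, block) requests

    def submit(self, priority, block):
        self.pending.append((priority, block))

    def next_request(self):
        """Pick the highest-priority request; among ties, the one
        closest to the current head position. Returns None if idle."""
        if not self.pending:
            return None
        top = max(p for p, _ in self.pending)
        block = min((b for p, b in self.pending if p == top),
                    key=lambda b: abs(b - self.head))
        self.pending.remove((top, block))
        self.head = block
        return block
```

As the abstract argues, such a disk-level policy alone is not enough: a high-priority request that misses the buffer pool still competes for frames, which is why the report pairs disk scheduling with priority-based buffer management.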
%A Amos Ron %T Factorization Theorems for Univariate Splines on Regular Grids %D February 1989 %R TR 829 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X For a given univariate compactly supported distribution $\(*f$ we investigate here the space $S(\(*f)$ spanned by its integer translates, the subspace $H(\(*f)$ of all exponentials in $S(\(*f)$ and the kernel $K\s-2\(*f\s+2$ of the associated semi-discrete convolution operator $\(*f\(**$. The paper establishes a variety of results, including the complete structure of $H(\(*f)$ and $K\s-2\(*f\s+2$ and a characterization of splines of minimal support. The main result shows that each $\(*f$ can be expressed as $\(*f=p(\(gr)\(*t\(**M$ where $p(\(gr)$ is a finite difference operator, $\(*t$ is a distribution of smaller support satisfying $H(\(*t)=K\s-2\(*t\s+2$ = {0}, and $M$ is a spline which depends on $H(\(*f)$ but not on $\(*f$ itself, and which in the generic case (termed here ``regular'') is the exponential B-spline associated with $H(\(*f)$. The approach chosen is direct and avoids entirely the Fourier analysis arguments. The fact that a distribution is examined, rather than a function, is essential for the methods employed. %A Barton P. Miller %A Lars Fredriksen %A Bryan So %T An Empirical Study of the Reliability of Operating System Utilities %D March 1989 %R TR 830 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X Operating system facilities, such as the kernel and utility programs, are assumed to be reliable. Recent experience has led us to question the robustness of our operating system utility programs under unusual input streams. Specifically, we were interested in a basic form of testing: whether the utilities would crash or hang (infinite loop) in response to unusual input. We constructed a set of tools to test this form of reliability. Almost 90 utilities were tested on each of six versions of the UNIX operating system. 
Included in this list of systems was one that underwent commercial product testing. As a result of our tests, we were able to crash 24-33% of the utilities that we tested, including commonly used utilities such as editors, shells, and document formatters. These were programs that either terminated abnormally, generating a core dump, or entered infinite loops. We then examined each utility program that crashed and identified the cause. These results were then categorized by the cause of the crash. We describe the causes of the crashes and the programming practices that led to the errors, and make some suggestions on how to avoid these problems in the future. %A Michael J. Carey %A Miron Livny %T Parallelism and Concurrency Control Performance in Distributed Database Machines %D March 1989 %R TR 831 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X While several distributed (or `shared nothing') database machines exist in the form of prototypes or commercial products, and a number of distributed concurrency control algorithms are available, the effect of parallelism on concurrency control performance has received little attention. This paper examines the interplay between parallelism and transaction performance in a distributed database machine context. Four alternative concurrency control algorithms are considered, including two-phase locking, wound-wait, basic timestamp ordering, and optimistic concurrency control. Issues addressed include how performance scales as a function of machine size and the degree to which partitioning the database for intra-transaction parallelism improves performance for the different algorithms. We examine performance from several perspectives, including response time, throughput, and speedup, and we do so over a fairly wide range of system loads. 
We also examine the performance impact of certain important overhead factors (e.g., communication and process initiation costs) on the four alternative concurrency control algorithms. %A David L. Cohrs %A Barton P. Miller %T Specification and Verification of Network Managers for Large Internets %D March 1989 %R TR 832 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X Large internet environments are increasing the difficulty of network management. Integrating increasing numbers of autonomous subnetworks (each with an increasing number of hosts) makes it more difficult to determine if the network managers of the subnetworks will interoperate correctly. We propose a high level, formal specification language, NMSL, as an aid in solving this problem. NMSL has two modes of operation, a descriptive mode and a prescriptive mode. In its descriptive mode, NMSL specifies abstractions for the network components and their instantiations, and verifies the consistency of such a specification. The abstractions include the data objects and processes in a network management system. These abstractions are instantiated on network elements. Network elements are grouped together in the specification of domains of administration. An extension mechanism is provided to allow for the specification of new management characteristics that the basic language cannot express. In its prescriptive mode, NMSL generates configuration information directly from a consistent specification. This information is used to configure network management processes to make their operation consistent with their specifications. Standard management protocols (such as the emerging ISO or IETF standards) can be used to incorporate the configuration information into running management processes. %A Yannis E. Ioannidis %A Timos K. 
Sellis %T Conflict Resolution of Rules Assigning Values to Virtual Attributes %D March 1989 %R TR 833 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X In the majority of research work done on logic programming and deductive databases, it is assumed that the set of rules defined by the user is \fIconsistent\fR, i.e., that no contradictory facts can be inferred by the rules. In this paper, we address the problem of resolving conflicts of rules that assign values to virtual attributes. We devise a general framework for the study of the problem, and we propose an approach that subsumes all previously suggested solutions. Moreover, it suggests several additional solutions, which very often capture the semantics of the data more accurately than the known approaches. Finally, we address the issue of how to index rules so that conflicts are resolved efficiently, i.e., only one of the applicable rules is processed at query time. %A Carl de Boor %A Amos Ron %T The Limit at the Origin of a Smooth Function Space %D March 1989 %R TR 834 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X We present a map $\fIH \(-> \fIH\(da$ that assigns to each finite-dimensional space of smooth functions a homogeneous polynomial space of the same dimension. We discuss applications of this map in the areas of multivariate polynomial interpolation, box spline theory and polynomial ideals. %A James R. Goodman %A Mark D. Hill %A Philip J. Woest %T Scalability and its Application to Multicube %D March 1989 %R TR 835 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X In parallel processing, scalability is an important issue for both applications and systems, with scaling arguments being presented for most large-scale multiprocessors. Even with such widespread interest, the notion of what constitutes a scalable multiprocessor remains ambiguous. 
In this paper we first present a realistic, quantitative definition of scalability based on a multiprocessor's ability to execute a problem in roughly constant time, given that the number of processors scales in proportion to the complexity of the problem. We then use the definition to demonstrate that the Multicube architecture is scalable by showing that (1) sufficient bus bandwidth is provided to support scalable applications, and (2) broadcast invalidations and non-uniform memory accesses are handled efficiently. We discuss, for example, how the propagation of broadcast invalidations may be limited by the use of special \fIpruning\fR caches, and how Multicube's cache coherency hardware and synchronization primitives can be used to limit the degradation from non-uniform memory access patterns. %A Donovan A. Schneider %A David J. DeWitt %T A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment %D April 1989 %R TR 836 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X In this paper we analyze and compare four parallel join algorithms. Grace and Hybrid hash represent the class of hash-based join methods, Simple hash represents a looping algorithm with hashing, and our last algorithm is the more traditional sort-merge. The performance of each of the algorithms with different tuple distribution policies, the addition of bit vector filters, varying amounts of main-memory for joining, and non-uniformly distributed join attribute values is studied. The Hybrid hash-join algorithm is found to be superior except when the join attribute values of the inner relation are non-uniformly distributed and memory is limited. In this case, a more conservative algorithm such as the sort-merge algorithm should be used. The Gamma database machine serves as the host for the performance comparison. %A Jude W. Shavlik %A Gerald F. 
deJong %T Learning in Mathematically-Based Domains: Understanding and Generalizing Obstacle Cancellations %D April 1989 %R TR 837 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X Mathematical reasoning provides the basis for problem solving and learning in many complex domains. A model for applying \fIexplanation-based learning\fP in mathematically-based domains is presented, and an implemented learning system is described. In explanation-based learning, a specific problem's solution is generalized into a form that can be later used to solve conceptually similar problems. The presented system's mathematical reasoning processes are guided by the manner in which variables are cancelled in specific problem solutions. Analyzing the cancellation of \fIobstacles\fP - variables that preclude the direct evaluation of the problem's unknown - leads to the generalization of the specific solution. Two important general issues in explanation-based learning are also addressed, namely generalizing the number of entities in a situation and acquiring efficiently-applicable concepts. %A Amos Ron %A Carl de Boor %A Nira Dyn %T On Two Polynomial Spaces Associated with a Box Spline %D April 1989 %R TR 838 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X The polynomial space @H@ in the span of the integer translates of a box spline @M@ admits a well-known characterization as the joint kernel of a set of homogeneous differential operators with constant coefficients. The dual space @H*@ has a convenient representation by a polynomial space @P@, explicitly known, which plays an important role in box spline theory as well as in multivariate polynomial interpolation. In this paper we characterize the dual space @P@ as the joint kernel of simple differential operators, each one a power of a directional derivative. 
Various applications of this result to multivariate polynomial interpolation, multivariate splines and duality between polynomial and exponential spaces are discussed. %A Giri Narasimhan %T A Note on the Hamiltonian Circuit Problem on Directed Path Graphs %D April 1989 %R TR 839 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X Bertossi and Bonuccelli [2] proved that the Hamiltonian Circuit problem is NP-Complete even when the inputs are restricted to the special class of Undirected Path graphs. The corresponding problem on Directed Path graphs was left as an open problem. We use a characterization of Directed Path graphs due to Monma and Wei [8] to prove that the Hamiltonian Circuit problem and the Hamiltonian Path problem are NP-Complete on Directed Path graphs. %A Wuu Yang %A Susan Horwitz %A Thomas Reps %T Detecting Program Components with Equivalent Behaviors %D April 1989 %R TR 840 %I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN %C MADISON, WI %X The execution behavior of a program component is defined as the sequence of values produced at the component during program execution. This paper presents an efficient algorithm for detecting program components \(mi in \fIone or more\fP programs \(mi that exhibit identical execution behaviors. The algorithm operates on a new graph representation for programs that combines features of static-single-assignment forms and program dependence graphs. The result provides insight into the relationship between execution behaviors and (control and flow) dependences in the program. The algorithm, called the \fISequence-Congruence Algorithm\fP, is applicable to programs written in a language that includes scalar variables and constants, assignment statements, conditional statements, and while-loops. The Sequence-Congruence Algorithm can be used as the basis for an algorithm for integrating program variants.
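On acyclic dependence graphs, the core idea behind detecting components with equivalent behaviors, as in TR 840 above, can be sketched as bottom-up partitioning: two components fall into the same class when they have the same operator and congruent operands. The sketch below is a simplified illustration only; the Sequence-Congruence Algorithm itself also handles control dependences and while-loops, and the graph representation here is our own assumption.

```python
# Partition the nodes of an acyclic operator graph into congruence
# classes: same operator + congruent operands => same class.
# Simplified illustration of the idea behind sequence congruence,
# not the TR 840 algorithm (which also treats loops and control
# dependences).

def congruence_classes(nodes):
    """nodes maps name -> (operator, [operand names]); must be acyclic.
    Returns a map from name to a hashable class key; two nodes are
    congruent iff their keys are equal."""
    key = {}

    def visit(n):
        if n not in key:
            op, args = nodes[n]
            # A node's class is its operator plus the classes (not the
            # names) of its operands, computed bottom-up.
            key[n] = (op, tuple(visit(a) for a in args))
        return key[n]

    for n in nodes:
        visit(n)
    return key
```

Note that operands are compared positionally, so two additions are congruent only if their operands are pairwise congruent in order; commutativity is deliberately not exploited in this sketch.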