[comp.doc.techreports] tr-input/wisc.8889

leff@smu.UUCP (Laurence Leff) (04/24/89)

Bibliography of Technical Reports
Computer Sciences Department
University of Wisconsin
November 1988 - January 1989

For copies, send requests to:

Technical Report Librarian
Computer Sciences Department
University of Wisconsin
1210 W. Dayton Street
Madison, WI 53706 USA


%A Nira Dyn
%A Amos Ron
%T On Multivariate Polynomial Interpolation 
%D January 1989
%R TR 820
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X A class of spaces of multivariate polynomials, closed under
differentiation, is studied and corresponding classes of well posed
Hermite-type interpolation problems are presented.  All Hermite-type
problems are limits of well posed Lagrange problems.
The results are based on a duality between certain spaces of 
multivariate exponential-polynomials
@\fIH@ and corresponding spaces of multivariate polynomials 
@\fIP@, used by Dyn and Ron (1988) to establish
the approximation order of the span of translates of exponential box 
splines.  In the interpolation theory @\fIP@ is the space of
interpolating polynomials and @\fIH@ characterizes the interpolation
points and the interpolation conditions, both spaces being defined in
terms of a set of hyperplanes in
\fII\h'-0.07i'R\s-2\us\s+2\d\fR\.
This geometric approach extends the work of Chung and Yao (1977) on
Lagrange interpolation, and also a subset of the Hermite-type problems
considered via the Newton scheme, by several authors (see Gasca and
Maetzu (1982) and references therein).  For a different approach to the
interpolation problem see Chui and Lai (1988).
It is the systematic and unified analysis of a wide class of
interpolation problems which is the main contribution of this paper to
the study of multivariate polynomial interpolation.

%A Carl de Boor
%A Amos Ron
%T On Multivariate Polynomial Interpolation 
%D January 1989
%R TR 822
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X We provide a map $\(*H \(->\s-2 \(*p\s+2 sub \(*H$ which associates each finite subset
\(*H \(sb \C\s-2\us\s+2\d with a polynomial space $\s-2\(*p\s+2 sub \(*H$ from  which interpolation
to arbitrary data given at the points in $\(*H$ is possible and uniquely
so.
Among all polynomial spaces $Q$ from which interpolation at $\(*H$ is
uniquely possible, our  $\s-2\(*p\s+2 sub \(*H$  is of smallest degree. It is
also  $D$- and  scale-invariant.
Our map is monotone, thus providing a Newton form for the resulting
interpolant. Our map is also continuous within reason, allowing us to
interpret certain cases of coalescence as Hermite interpolation.
In fact, our map can be extended to the case where, with each  $\(*h   \(mo
\(*H$,  there is associated a polynomial space $P sub \(*h$, and, for given smooth $f$, a polynomial $q \(mo Q$ is sought for which 
$p(D)(f-q)(\(*h)=0$,   $\(fa p \(mo  P sub \(*h ,$  $\(*h \(mo \(*H$.
We obtain $\s-2\(*p\s+2 sub \(*H$ as the ``scaled limit at the origin''
$(exp sub \(*H ) sub \(da$ of the exponential space $exp sub \(*H$,
with frequencies $\(*H$, and base our results
on a study of the map $H \(-> H\(da$ defined on subspaces $H$ of the
space of functions analytic at the origin. This study also allows us to
determine the local approximation order from such $H$ and provides an
algorithm for the construction of $H\(da$ from any basis for $H$.

%A Mark D. Hill
%A Alan Jay Smith
%T Evaluating Associativity in CPU Caches 
%D February 1989
%R TR 823
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X Because of the infeasibility or expense of large fully-associative
caches, cache memories are often designed to be set-associative
or direct-mapped.  This paper (1) presents new and efficient algorithms
for simulating alternative direct-mapped and set-associative caches, and
(2) uses those algorithms to quantify the effect of limited associativity
on the cache miss ratio.
We introduce a new algorithm, \fIforest simulation\fP,
for simulating alternative direct-mapped caches and generalize one,
which we call \fIall-associativity simulation\fP, for simulating 
alternative direct-mapped, set-associative and fully-associative caches.
We find that while all-associativity simulation is theoretically less 
efficient than forest simulation or stack simulation (a 
commonly-used simulation algorithm), in practice, it is not much slower
and allows the simulation of many more caches
with a single pass through an address trace.
We also provide data and insight into how varying associativity
affects the miss ratio.  We show:
(1) how to use the simulations of alternative caches to
isolate the cause of misses;
(2) that the principal reason why set-associative miss ratios are larger
than fully-associative ones is (as one might expect) that too many 
active blocks map to a fraction of the sets
even when blocks map to sets in a uniform random manner; and
(3) that reducing associativity from eight-way to four-way, from four-way 
to two-way, and from two-way to direct-mapped
causes relative miss ratio increases in our data of
about 5, 10 and 30 percent, respectively, regardless of cache size.
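The effect of associativity on misses can be reproduced with a toy
simulator (a minimal sketch, not the paper's forest or all-associativity
algorithms; the modulus mapping and LRU replacement are illustrative
assumptions):

```python
from collections import OrderedDict

def simulate_cache(trace, num_sets, assoc):
    """Count misses for a set-associative cache with LRU replacement;
    direct-mapped is the special case assoc == 1."""
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in trace:
        s = sets[block % num_sets]          # blocks map to sets by modulus
        if block in s:
            s.move_to_end(block)            # refresh LRU position on a hit
        else:
            misses += 1
            if len(s) >= assoc:
                s.popitem(last=False)       # evict the least-recently-used block
            s[block] = True
    return misses

# Two blocks that collide in a direct-mapped cache but coexist in a 2-way set:
trace = [0, 64] * 100
print(simulate_cache(trace, 64, 1))   # direct-mapped: every access misses
print(simulate_cache(trace, 32, 2))   # two-way: only the two compulsory misses
```

The same total capacity (64 blocks) is used in both calls, so the
difference is pure conflict misses.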

%A Joel E. Richardson 
%A Michael J. Carey 
%A Daniel T. Schuh
%T The Design of the E Programming Language 
%D February 1989
%R TR 824  
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X E is an extension of C++ designed for writing data-intensive systems software (e.g.
a DBMS).
Several aspects of this programming domain cause difficulty in conventional languages.
For example, one must usually write the system code without knowing the types of
entities to be manipulated.
In addition, the entities themselves are persistent, outlasting the program which created them.
E addresses these and other problems through a judicious choice of language constructs
that significantly ease the programmer's task.
Being based on C++, E provides classes and inheritance and then adds
generator classes for defining generic container types,  iterators for processing
streams of values,  and a persistent storage class for declaring a database 
as a collection of language objects.
The result is a systems-level programming language which easily expresses the types
and operations internal to a DBMS as well as those of the database itself.

%A R. E. Kessler 
%A Miron Livny
%T An Analysis of Distributed Shared Memory Algorithms 
%D February 1989
%R TR 825 
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X This paper describes results obtained in a study of algorithms to
implement a Distributed Shared Memory in a distributed (loosely coupled)
environment.  Distributed Shared Memory is the implementation of shared
memory across multiple nodes in a distributed system.  This is
accomplished using only the private memories of the nodes, by controlling
access to the pages of the shared memory and transferring data to and
from the private memories when necessary.  We analyze alternative
algorithms to implement Distributed Shared Memory, all of them based on
the ideas presented in Kai Li's dissertation.  The Distributed Shared
Memory algorithms are analyzed and compared over a wide range of
conditions.  Application characteristics are identified which can be
exploited by the Distributed Shared Memory algorithms.  We show the
conditions under which the algorithms analyzed in this paper perform
better or worse than the other alternatives.  Results are obtained via
simulation using a synthetic reference generator.
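The page-based scheme this abstract builds on can be sketched as a toy
write-invalidate protocol with a central manager (an illustrative
simplification of one algorithm family; the class, counter, and method
names below are assumptions, not the paper's code):

```python
class DSMManager:
    """Toy write-invalidate protocol for page-based distributed shared memory:
    a central manager tracks each page's owner and the nodes holding read copies."""

    def __init__(self, num_pages):
        self.owner = [0] * num_pages                     # node with the writable copy
        self.copies = [set() for _ in range(num_pages)]  # nodes with read-only copies
        self.transfers = 0                               # pages moved over the network

    def read(self, node, page):
        if node != self.owner[page] and node not in self.copies[page]:
            self.transfers += 1           # fetch a read-only copy from the owner
            self.copies[page].add(node)

    def write(self, node, page):
        if node != self.owner[page]:
            self.transfers += 1           # ownership migrates to the writer
            self.owner[page] = node
        self.copies[page].clear()         # invalidate all outstanding read copies

m = DSMManager(num_pages=4)
m.read(1, 0)    # miss: page copy fetched from node 0
m.read(1, 0)    # hit: node 1 already holds a copy
m.write(2, 0)   # ownership moves to node 2; node 1's copy is invalidated
m.read(1, 0)    # miss again: the copy was invalidated
print(m.transfers)
```

Counting `transfers` under different reference patterns is, in miniature,
the kind of comparison the simulation study performs.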

%A Michael J. Carey
%A Miron Livny
%T Conflict Detection Tradeoffs for Replicated Data 
%D March 1989
%R TR 826
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X Many concurrency control algorithms have been proposed for use in
distributed database systems.  Despite the large number of available
algorithms, and the fact that distributed database systems are becoming
a commercial reality, distributed concurrency control performance
tradeoffs are still not well understood.  In this paper we examine
some of these tradeoffs by using a detailed model of a distributed DBMS
to study a set of representative algorithms, including several derivatives
of the two-phase locking, timestamp ordering, and optimistic approaches
to distributed concurrency control.  In particular, we examine the
performance of these algorithms as a function of data contention for
various levels of data replication and "distributedness" of accesses
to replicated data.  The results provide some interesting insights into 
how the tradeoffs between early and late conflict detection vary as a
function of message cost, and should prove useful to distributed
database system designers.

%A Thomas Reps
%A Thomas Bricker
%T Illustrating Interference in Interfering Versions of Programs 
%D March 1989
%R TR 827
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X The need to integrate several versions of a program into a common
one arises frequently, but it is a tedious and time-consuming task to
merge programs by hand.
The program-integration algorithm recently proposed by
S. Horwitz, J. Prins, and T. Reps provides a way to create a
semantics-based tool for program integration.
The integration algorithm is based on the assumption that any change in
the behavior, rather than the text, of a program variant
is significant and must be preserved in the merged program.
An integration system based on this algorithm will either automatically
combine several different but related variants of a base program, or else
determine that the programs incorporate interfering changes.
In this paper we discuss how an integration tool can illustrate
the causes of interference to the user when interference is detected.
Our main technical result is an alternative characterization of the
integration algorithm's interference criterion that is more suitable
for illustrating the causes of interference.
We then propose six methods for an integration system to display information
to demonstrate the causes of interference to the user.

%A Michael J. Carey
%A Rajiv Jauhari
%A Miron Livny
%T Priority in DBMS Resource Scheduling
%D March 1989
%R TR 828  
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X In this paper, we address the problem of priority scheduling in a database
management system.  We start by investigating the architectural consequences
of adding priority to a DBMS.  Specific priority-based schemes are then
proposed for managing DBMS resources, including a priority-based disk
scheduling algorithm and two approaches to adding priority to the buffer
manager of a DBMS.  We study the performance of our proposals through
simulation, both individually and in combination.  Our simulation results
indicate that the objectives of priority scheduling cannot be met by a
single priority-based
scheduler.  They show that, regardless of whether the system bottleneck is
the CPU or the disk, priority scheduling on the critical resource must be
complemented by a priority-based buffer management policy.
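The flavor of priority-based resource scheduling can be sketched with a
queue that serves the highest-priority pending request first, FIFO within
a level (illustrative only; the paper's disk and buffer-manager
algorithms are not reproduced here):

```python
import heapq

class PriorityScheduler:
    """Serve pending requests highest-priority first, FIFO within a level."""

    def __init__(self):
        self._heap = []
        self._seq = 0   # tie-breaker preserving arrival order within a priority

    def submit(self, priority, request):
        heapq.heappush(self._heap, (-priority, self._seq, request))
        self._seq += 1

    def next_request(self):
        return heapq.heappop(self._heap)[2]

s = PriorityScheduler()
for prio, req in [(1, "batch scan"), (3, "interactive read"),
                  (3, "interactive write"), (2, "update")]:
    s.submit(prio, req)
print([s.next_request() for _ in range(4)])
```

The abstract's point is that applying such a discipline at one resource
alone is not enough; the buffer manager must honor priorities too.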

%A Amos Ron
%T Factorization Theorems for Univariate Splines on Regular Grids 
%D February 1989
%R TR 829
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X For a given univariate compactly supported distribution $\(*f$ we
investigate here the space $S(\(*f)$ spanned by its integer translates,
the subspace $H(\(*f)$ of all exponentials in $S(\(*f)$, and the kernel
$K\s-2\(*f\s+2$ of the associated semi-discrete convolution operator
$\(*f\(**$.  The paper addresses a
variety of results including a complete structure of $H(\(*f)$ and 
$K\s-2\(*f\s+2$ and a characterization of splines of minimal support.  The
main result shows that each $\(*f$ can be expressed as $\(*f=p(\(gr)\(*t\(**M$, where
$p(\(gr)$ is a finite difference operator, $\(*t$ is a distribution of smaller
support satisfying $H(\(*t)=K\s-2\(*t\s+2$ = {0}, and $M$ is a spline which
depends on $H(\(*f)$ but not on $\(*f$ itself, and which in the
generic case (termed here ``regular'') is the exponential B-spline associated
with $H(\(*f)$.
The approach chosen is direct and avoids entirely the Fourier analysis
arguments. The fact that a distribution  is examined, rather than 
a function, is essential for the methods employed.

%A Barton P. Miller
%A Lars Fredriksen
%A Bryan So
%T An Empirical Study of the Reliability of Operating System Utilities 
%D March 1989
%R TR 830
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X Operating system facilities, such as the kernel and utility
programs, are assumed to be reliable.  Recent experience has led us to
question the robustness of our operating system utility programs under
unusual input streams.  Specifically, we were interested in a basic
form of testing: whether the utilities would crash or hang (infinite
loop) in response to unusual input.  We constructed a set of tools to
test this form of reliability.  Almost 90 utilities were tested on
each of six versions of the UNIX operating system.  Included in this
list of systems was one that underwent commercial product testing.  As
a result of our tests, we were able to crash 24-33% of the utilities
that we tested, including commonly used utilities such as editors,
shells, and document formatters.  These were programs that either
terminated abnormally, generating a core dump, or programs that had
infinite loops.
We then examined each utility program that crashed and identified the
cause, categorizing the results by the cause of the crash.  We describe
the causes of the crashes and the programming practices that led to the
errors, and make some suggestions on how to avoid these problems in the
future.
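The style of testing described can be sketched with a small driver that
pipes random bytes to a program and flags signal deaths and timeouts (a
minimal illustration of the idea, not the authors' tool suite; the
demonstration target is a contrived self-killing script, not a real
utility):

```python
import random
import subprocess
import sys

def fuzz(cmd, trials=10, max_len=1000, timeout=5, seed=0):
    """Feed random byte strings to a command's stdin; report inputs that
    crash the program (killed by a signal) or hang it (exceed the timeout)."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, max_len)))
        try:
            proc = subprocess.run(cmd, input=data, capture_output=True,
                                  timeout=timeout)
            if proc.returncode < 0:          # negative => terminated by a signal
                failures.append(("crash", proc.returncode, data))
        except subprocess.TimeoutExpired:    # possible infinite loop
            failures.append(("hang", None, data))
    return failures

# A target that dies on any input, standing in for a crashing utility:
crasher = [sys.executable, "-c",
           "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
print(len(fuzz(crasher, trials=3)))
```

Each reported failure keeps the offending input, so a crash can be
reproduced and the faulty code located, as the paper's methodology
requires.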

%A Michael J. Carey 
%A Miron Livny
%T Parallelism and Concurrency Control Performance in Distributed Database Machines
%D March 1989
%R TR 831
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X While several distributed (or `shared nothing') database machines exist in
the form of prototypes or commercial products, and a number of distributed
concurrency control algorithms are available, the effect of parallelism on
concurrency control performance has received little attention.  This paper
examines the interplay between parallelism and transaction performance in a
distributed database machine context.  Four alternative concurrency control
algorithms are considered, including two-phase locking, wound-wait, basic
timestamp ordering, and optimistic concurrency control.  Issues addressed
include how performance scales as a function of machine size and the degree
to which partitioning the database for intra-transaction parallelism improves
performance for the different algorithms.  We examine performance from several
perspectives, including response time, throughput, and speedup, and we do so
over a fairly wide range of system loads.  We also examine the performance
impact of certain important overhead factors (e.g., communication and process
initiation costs) on the four alternative concurrency control algorithms.

%A David L. Cohrs
%A Barton P. Miller
%T Specification and Verification of Network Managers for Large Internets
%D March 1989
%R TR 832
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X Large internet environments are increasing the difficulty of network
management.  Integrating increasing numbers of autonomous subnetworks
(each with an increasing number of hosts) makes it more difficult to
determine if the network managers of the subnetworks will interoperate
correctly.
We propose a high level, formal specification language, NMSL, as an aid
in solving this problem.  NMSL has two modes of operation, a descriptive
mode and a prescriptive mode.  In its descriptive mode, NMSL specifies
abstractions for the network components and their instantiations, and
verifies the consistency of such a specification.  The abstractions
include the data objects and processes in a network management system.
These abstractions are instantiated on network elements.  Network
elements are grouped together in the specification of domains of
administration.  An extension mechanism is provided to allow for the
specification of new management characteristics that the basic language
cannot express.  In its prescriptive mode, NMSL generates configuration
information directly from a consistent specification.  This information
is used to configure network management processes to make their operation
consistent with their specifications.  Standard management protocols
(such as the emerging ISO or IETF standards) can be used to incorporate
the configuration information into running management processes.

%A Yannis E. Ioannidis 
%A Timos K. Sellis
%T Conflict Resolution of Rules Assigning Values to Virtual Attributes 
%D March 1989
%R TR 833
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X In the majority of research work done on logic programming and deductive
databases, it is assumed that the set of rules defined by the user is
\fIconsistent\fR, i.e., that no contradictory facts can be inferred by the rules.  In this paper, we address the problem of resolving conflicts of
rules that assign values to virtual attributes.  We devise a general
framework for the study of the problem, and we propose an approach that
subsumes all previously suggested solutions.  Moreover, it suggests
several additional solutions, which very often capture the semantics of
the data more accurately than the known approaches.  Finally, we address
the issue of how to index rules so that conflicts are resolved
efficiently, i.e., only one of the applicable rules is processed at query
time.

%A Carl de Boor
%A Amos Ron
%T The Limit at the Origin of a Smooth Function Space 
%D March 1989
%R TR 834
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X We present a map $\fIH \(-> \fIH\(da$ that assigns to  
each finite-dimensional space of smooth functions a homogeneous
polynomial space of the same dimension.  We discuss applications of
this map in the areas of multivariate polynomial interpolation, box
spline theory and polynomial ideals.

%A James R. Goodman
%A Mark D. Hill
%A Philip J. Woest
%T Scalability and its Application to Multicube 
%D March 1989
%R TR 835
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X In parallel processing, scalability is an important issue for
both applications and systems, with scaling arguments being presented
for most large-scale multiprocessors.  Even with such widespread
interest, the notion of what constitutes a scalable multiprocessor
remains ambiguous.
In this paper we first present a realistic, quantitative definition of
scalability based on a multiprocessor's ability to execute a problem in
roughly constant time, given that the number of processors scales in
proportion to the complexity of the problem.  We then use the definition
to demonstrate that the Multicube architecture is scalable by showing
that (1) sufficient bus bandwidth is provided to support scalable
applications, and (2) broadcast invalidations and non-uniform memory
accesses are handled efficiently.  We discuss, for example, how the
propagation of broadcast invalidations may be limited by the use of
special \fIpruning\fR caches, and how Multicube's cache coherency
hardware and synchronization primitives can be used to limit the
degradation from non-uniform memory access patterns.

%A Donovan A. Schneider
%A David J. DeWitt
%T A Performance Evaluation of Four Parallel Join Algorithms in a
Shared-Nothing Multiprocessor Environment 
%D April 1989
%R TR 836
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X In this paper we analyze and compare four parallel join algorithms.
Grace and Hybrid hash represent the class of hash-based join methods,
Simple hash represents a looping algorithm with hashing, and our last
algorithm is the more traditional sort-merge.  The performance of each of
the algorithms with different tuple distribution policies, the addition
of bit vector filters, varying amounts of main memory for joining, and
non-uniformly distributed join attribute values is studied.  The Hybrid
hash-join algorithm is found to be superior except when the join
attribute values of the inner relation are non-uniformly distributed and
memory is limited.  In this case, a more conservative algorithm such as
the sort-merge algorithm should be used.  The Gamma database machine
serves as the host for the performance comparison.
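For readers unfamiliar with the hash-based family, the build/probe core
shared by these algorithms looks roughly like this (an in-memory sketch
with made-up relations; Grace and Hybrid additionally partition both
relations so that each partition's hash table fits in memory):

```python
from collections import defaultdict

def hash_join(inner, outer, inner_key, outer_key):
    """In-memory hash join: build a hash table on the (smaller) inner
    relation, then probe it once with each tuple of the outer relation."""
    table = defaultdict(list)
    for t in inner:                           # build phase
        table[t[inner_key]].append(t)
    result = []
    for s in outer:                           # probe phase
        for t in table.get(s[outer_key], ()):
            result.append(t + s)
    return result

emp  = [(1, "smith"), (2, "jones")]           # (dept_id, name) -- illustrative
dept = [(1, "cs"), (3, "math")]               # (dept_id, dept_name)
print(hash_join(dept, emp, 0, 0))
```

Each relation is scanned exactly once, which is why the hash-based
methods usually beat sort-merge when the build side fits in memory.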

%A Jude W. Shavlik 
%A Gerald F. deJong
%T Learning in Mathematically-Based Domains:  Understanding and
Generalizing Obstacle Cancellations 
%D April 1989
%R TR 837
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X Mathematical reasoning provides the basis for problem solving and learning in
many complex domains.  A model for applying \fIexplanation-based learning\fP
in mathematically-based domains is presented, and an implemented learning
system is described.  In explanation-based learning, a specific problem's
solution is generalized into a form that can be later used to solve
conceptually similar problems.  The presented system's mathematical reasoning
processes are guided by the manner in which variables are cancelled in
specific problem solutions.  Analyzing the cancellation of \fIobstacles\fP -
variables that preclude the direct evaluation of the problem's unknown - leads
to the generalization of the specific solution.  Two important general
issues in explanation-based learning are also addressed, namely,
generalizing the number of entities in a situation and acquiring
efficiently-applicable concepts.

%A Amos Ron
%A Carl de Boor
%A Nira Dyn
%T On Two Polynomial Spaces  Associated with a Box Spline 
%D April 1989
%R TR 838
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X The polynomial space @H@ in the span of the integer translates of a box
spline @M@ admits a well-known characterization as the joint kernel
of a set of homogeneous differential operators with constant coefficients.
The dual space @H*@ has a convenient representation by a polynomial
space @P@, explicitly known, which plays an important role in
box spline theory as well as in multivariate polynomial interpolation.
In this paper we characterize the dual space @P@ as the joint kernel
of simple differential operators, each one a power of a directional derivative.
Various applications of this result to multivariate polynomial
interpolation, multivariate splines and duality between polynomial and
exponential spaces are discussed.

%A Giri Narasimhan
%T A Note on the Hamiltonian Circuit Problem on Directed Path Graphs 
%D April 1989
%R TR 839
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X Bertossi and Bonuccelli [2] proved that the Hamiltonian Circuit
problem is NP-Complete even when the inputs are restricted to the
special class of Undirected Path graphs. The corresponding problem
on Directed Path graphs was left as an open problem. We use a
characterization of Directed Path graphs due to Monma and Wei [8]
to prove that the Hamiltonian Circuit problem and the Hamiltonian
Path problem are NP-Complete on Directed Path graphs.
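For contrast with the NP-completeness result, a brute-force check
(exponential in the number of vertices, which is exactly the cost the
result says is unavoidable in the worst case unless P = NP) is easy to
state:

```python
from itertools import permutations

def has_hamiltonian_circuit(n, edges):
    """Brute-force test for a directed Hamiltonian circuit on vertices 0..n-1.
    Tries every cyclic ordering; feasible only for tiny graphs."""
    adj = set(edges)
    for perm in permutations(range(1, n)):  # fix vertex 0 as the cycle's start
        cycle = (0,) + perm + (0,)
        if all((cycle[i], cycle[i + 1]) in adj for i in range(n)):
            return True
    return False

print(has_hamiltonian_circuit(3, [(0, 1), (1, 2), (2, 0)]))   # a directed triangle
print(has_hamiltonian_circuit(3, [(0, 1), (1, 2)]))           # a path, no cycle
```

The report's contribution is showing that no polynomial-time shortcut
exists even when the input is restricted to directed path graphs.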

%A Wuu Yang
%A Susan Horwitz
%A Thomas Reps 
%T Detecting Program Components with Equivalent Behaviors 
%D April 1989
%R TR 840
%I COMPUTER SCIENCES DEPARTMENT, UNIVERSITY OF WISCONSIN
%C MADISON, WI
%X The execution behavior of a program component is defined as the
sequence of values produced at the component during program execution.
This paper presents an efficient algorithm for detecting program
components \(mi in \fIone or more\fP programs \(mi that exhibit identical
execution behaviors.
The algorithm operates on a new graph representation for programs that combines
features of static-single-assignment forms and program dependence graphs.
The result provides insight into the relationship between
execution behaviors and (control and flow) dependences in the program.
The algorithm, called the \fISequence-Congruence Algorithm\fP, is applicable to
programs written in a language that includes scalar variables and constants,
assignment statements, conditional statements, and while-loops.
The Sequence-Congruence Algorithm can be used as the basis
for an algorithm for integrating program variants.