leff@smu.UUCP (Laurence Leff) (05/26/89)
\documentstyle {article} \setlength{\textwidth}{6.5in} \setlength{\textheight}{9in} \setlength{\topmargin}{0in} \setlength{\oddsidemargin}{0in} \setlength{\evensidemargin}{0in} \begin{document} \def\sp{\hskip 0.5 true cm}
\noindent {\bf UCSC-CRL-88-01}\sp {\bf SIMULATION CHANGES IN A NEW ENVIRONMENT}\\ James L. Murphy\\ March, 1988; 9 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} Computing power continues to increase while cost continues to fall. Processors are faster, memories larger, displays have higher resolution and more colors, software is more portable and machines more easily connected. This paper contrasts the computing resources available to the author in the early seventies with those of the new undergraduate Computer Graphics Laboratory at the University of California at Santa Cruz. The improvements in hardware and software are described and the impact this increased functionality has had on the nature of the simulations now possible is discussed.\\
\noindent {\bf UCSC-CRL-88-02}\sp {\bf SPACE EFFICIENT LEARNING ALGORITHMS}\\ David Haussler\\ September, 1985 (revised March, 1988); 28 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} We extend Valiant's machine learning model by including a space limitation on the learning algorithm. This forces the algorithm to perform some kind of explicit abstraction or generalization from the sample it receives. Space efficient learning algorithms for a restricted class of decision trees and for certain finite valued functions on the real line are presented, each using a different mechanism of abstraction. These algorithms are analyzed using a general technique for establishing performance bounds that is applicable to a wide class of space efficient learning algorithms. This technique is based on a new result on certain types of finite Markov chains, also given here.\\ {\bf Keywords:} machine learning, learnability, pattern recognition, artificial intelligence, induction, abstraction, generalization, analysis of algorithms, Markov chains, decision trees.\\
\noindent {\bf UCSC-CRL-88-03}\sp {\bf A SELF-ORGANIZING PATTERN RETRIEVAL SYSTEM AND ITS APPLICATIONS}\\ Robert A. Levinson\\ March, 1988; 11 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} This paper describes research on the design and application of a Self-Organizing Pattern Retrieval System. This system is a non-traditional form of database for several reasons: 1. The objects (patterns) stored in the database are complex data structures. 2. The system attempts to optimize its internal structure for retrieval efficiency by taking advantage of classificatory information it discovers about the patterns in the database. 3. The system is representation-independent, allowing the use of any representation scheme as long as a small set of I/O routines and comparison functions for the representation are provided. This system was first applied to the organization of molecules and reactions in organic chemistry. A version of the self-organizing database system is being developed to support the identification and characterization of signals in an RF environment. This paper focuses on the information retrieval requirements of such a system and a proposed solution. The primary application of this system is the guidance and control of NASA's SETI (Search for Extra-Terrestrial Intelligence) Microwave Observing Project. Other applications being pursued include the manipulation and induction of conceptual structures and the storage and retrieval of DNA/protein sequences.\\ {\bf Keywords:} expert systems, knowledge representation, machine learning.\\
\noindent {\bf UCSC-CRL-88-04}\sp {\bf PRACTICAL ARBITRARY LOOKAHEAD LR PARSING}\\ Manuel E. Bermudez and Karl M. Schimpf\\ May, 1988; 17 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} We present a practical technique for computing lookahead for an LR(0) parser that progressively attempts single-symbol, multi-symbol, and arbitrary lookahead. The technique determines the amount of lookahead required, and the user is spared the task of guessing it. The class of context-free grammars defined by our technique is a subset of the LR-Regular grammars; we show that, unlike LR-Regular, the problem of determining whether an arbitrary grammar is in the class is decidable. When restricted to k-symbol lookahead, the technique has the power of LALR(k) parsers. It has been successfully used to resolve multi-symbol lookahead conflicts in grammars for FORTRAN, Ada, C, COBOL, and PL/I, and its performance compares favorably with that of two well-known, commercially available parser generators. {\bf General Terms:} Algorithms, Languages, Theory. {\bf Keywords:} Context-free grammar, LALR(k), LR-Regular.\\
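\noindent {\it Illustration} (an editorial example, not drawn from the report): the grammar
\[ S \rightarrow A\,a\,b \mid B\,a\,c, \qquad A \rightarrow x, \qquad B \rightarrow x \]
is not LALR(1): after shifting $x$ the parser must reduce to either $A$ or $B$, and both reductions see the same one-symbol lookahead $a$. Two symbols of lookahead ($a\,b$ versus $a\,c$) resolve the conflict, so a generator that progressively extends its lookahead, as described above, succeeds with $k=2$ without the user having to guess that bound.\\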
\noindent {\bf UCSC-CRL-88-05}\sp {\bf COMPUTING SIGNAL DELAY IN GENERAL RC NETWORKS BY TREE/LINK PARTITIONING}\\ Pak K. Chan and Kevin Karplus\\ September, 1988; 18 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} The linear RC delay model is commonly used in timing simulators for MOS digital circuits. Most simulators only handle tree networks, not arbitrary networks. More precisely, these simulators treat networks as if they were trees. Currently, the alternatives are to invert the node-conductance matrix of the RC network numerically, or to provide bounds on signal delays. In this article, we present an algorithm for computing signal delays in general RC networks using the RC-tree computation as the primary operation. The idea is based on a form of Kron's "branch tearing" (commonly called Tree/Link partitioning): we partition a given network into a spanning tree and link branches. Then, we compute the signal delay of the spanning tree, and update the signal delay as we incrementally add the links back to reconstruct the original network. If m is the number of link branches, this algorithm requires m(m+1)/2 updates and m+1 tree delay evaluations. All the tree delay evaluations involve computing signal delays with the same resistive spanning tree, but with different values for the capacitors. This algorithm could be used to build an efficient RC simulator that handles general RC networks, and is also amenable to parallel implementation.\\
\noindent {\bf UCSC-CRL-88-06}\sp {\bf EQUIVALENCE OF MODELS FOR POLYNOMIAL LEARNABILITY}\\ D. Haussler, M. Kearns, N. Littlestone and M. Warmuth\\ September, 1988; 36 pages (\$5.00)\\ \noindent {\bf ABSTRACT:} In this paper we consider several variants of Valiant's learnability model that have appeared in the literature. We give conditions under which these models are equivalent in terms of the polynomially learnable concept classes they define. These equivalences allow quantitative comparisons of most of the existing theorems in Valiant-style learnability. In addition, we prove that several simplifying assumptions on polynomial learning algorithms can be made without loss of generality. We also give a useful reduction of learning problems to the problem of finding consistent hypotheses, and give comparisons and equivalences between Valiant's model and the prediction learning models.\\
\noindent {\bf UCSC-CRL-88-07}\sp {\bf THE MANAGEMENT OF REPLICATION IN A DISTRIBUTED SYSTEM}\\ Darrell Long\\ Spring, 1988 (Ph.D. Thesis); 161 pages (\$10.00)\\ \noindent {\bf ABSTRACT:} The field of consistency control protocols for replicated data objects has existed for about ten years. Its birth coincides with the advent of distributed data bases and the communications technology required to support them. When data objects are replicated around a computer network, a protocol must be chosen to ensure a consistent view to an accessing process. The replicas of the data object are then said to be mutually consistent. The protocols used to ensure mutual consistency are known as replica control or consistency control protocols. There are several advantages to a distributed system over a single processor system. Among these are increased computing power and the ability to tolerate partial failures due to the malfunction of individual components. The redundancy present in a distributed system has been the focus of much research in the area of distributed data base systems. Another benefit of this natural redundancy, along with the relatively independent failure modes of the processors, is that it allows the system to continue operation even after some of the processors have failed. This can be used to construct data objects that are robust in the face of partial system failures. The focus of this dissertation is the exploitation of the redundancy present in distributed systems in order to attain an increased level of fault tolerance for data objects. The use of replication as a method of increasing fault tolerance is a well-known technique. Replication introduces the additional complexity of maintaining mutual consistency among the replicas of the data object. The protocols that manage the replicated data and provide the user with a single consistent view of that data are studied, and a comprehensive analysis of the fault tolerance provided by several of the most promising protocols is presented. Several techniques are employed, including Markov analysis and discrete event simulation. Simulation is used to confirm and extend the results obtained using analytic techniques. {\bf Keywords:} replication, fault-tolerance, distributed data bases.\\
\noindent {\bf UCSC-CRL-88-08}\sp {\bf FEATURE DISCOVERY IN EMPIRICAL LEARNING}\\ Giulia Pagallo and David Haussler\\ October, 1988; 40 pages (\$5.00)\\ \noindent {\bf ABSTRACT:} We present two new methods to learn a Boolean function from examples. The first algorithm uses an iterative strategy to discover features that are used as tests in a decision tree representation of the target concept. The second method uses a greedy top down approach to form a decision list from the primitive literals. We show empirically that these methods outperform a standard decision tree learning algorithm for learning small random DNF functions, when examples are drawn at random from the uniform distribution.\\
\noindent {\bf UCSC-CRL-88-09}\sp {\bf A HIGH-SPEED IBM PC AT TO IBM 3081/E INTERFACE}\\ Kieth S. Huie (Master Thesis)\\ Spring, 1988; 37 pages (\$5.00)\\ \noindent {\bf ABSTRACT:} An overview of the 3081/E processor is given. Methods for interfacing the processor to a host network are discussed. Design goals are stated. A design is proposed for a high-speed 3081/E interface using string I/O. Procedures for implementing a new software interface are outlined. Applications for this interface are stated. Details of the new system are discussed and improvements upon the previous system are noted.\\
\noindent {\bf UCSC-CRL-88-10}\sp {\bf THE CONSTRUCTION OF SPARSE NULL MATRIX ERROR CORRECTING CODES}\\ Daniel Warren (Master Thesis)\\ December, 1987; 57 pages (\$6.00)\\ \noindent {\bf ABSTRACT:} Error correcting codes are used in many communications systems. In general, the throughput of these systems is limited by the error correcting component of the receiving end. A massively parallel decoding algorithm, called Algorithm B [20], has been developed, which, when implemented in VLSI, could significantly increase the speed of the error correcting component in many communication situations. However, a fully parallel VLSI implementation of Algorithm B requires a linear code with a null matrix that is sparse. (A matrix consisting of a spanning set of vectors from the code null space is called a null matrix.) Unfortunately, little is known about good codes with sparse null matrices. In this paper we present two methods for constructing block codes with sparse null matrices and good error correcting capabilities. Since one construction method does not provide information allowing us to determine analytically the resulting code's error correcting capabilities with sufficient accuracy, we have performed simulation experiments, using Algorithm B on a number of these codes, to assess statistically their error correcting capabilities. Chapter 2 contains a brief comparison between some of the decoding algorithms used in existing communication systems and Algorithm B. This is followed by a short description of Algorithm B and a justification for Algorithm B's need for sparse null matrix codes when implemented in a fully parallel manner. We begin Chapter 3 by giving two upper bounds for the rate of a code that are functions of the sparseness of the code's null matrix. (The rate of a block code is the ratio of the number of information bits per block to the total number of bits per block.) We then argue why, in general, cyclic codes fail to produce null matrices that are very sparse and fail to be very promising in the light of the analyses that provide lower bounds for a code's error correcting ability when used with Algorithm B. The code construction methods are presented in Chapter 4 along with a list of the parameters for many codes constructible by these methods. Chapter 5 contains the details of the simulation experiments and several tables of results.\\
\noindent {\bf UCSC-CRL-88-11}\sp {\bf A REALISTIC MODEL OF REFRACTION FOR COMPUTER GRAPHICS}\\ Forest Kenton Musgrave (Master Thesis)\\ September, 1987; 60 pages (\$6.00)\\ \noindent {\bf ABSTRACT:} A model which takes into account the relationship of the frequency of light to the index of refraction in transparent media is developed as an extension of the distributed ray tracing model. A model of atmospheric rainbows is presented. The problem of representing an approximation of the spectrum of monochromatic colors on a color graphics monitor is discussed, within the context of modelling dispersion.\\ {\bf Keywords:} Dispersion, refraction, stochastic sampling, distributed ray tracing, spectrum, color gamut.\\
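\noindent {\it Illustration} (standard optics, stated as an editorial gloss rather than the thesis's own model): dispersion enters a ray tracer through a wavelength-dependent index of refraction in Snell's law,
\[ n_1(\lambda)\,\sin\theta_1 = n_2(\lambda)\,\sin\theta_2, \qquad n(\lambda) \approx A + \frac{B}{\lambda^2}, \]
where the second expression is Cauchy's empirical approximation for a transparent medium; tracing several sampled wavelengths through the same geometry, each refracting at its own angle, is what spreads white light into a spectrum.\\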
\noindent {\bf UCSC-CRL-88-12}\sp {\bf MULTIPROCESSING USING WORKSTATIONS WITH A SHARED FILE SYSTEM}\\ Brian Frederick Hanks (Master Thesis)\\ September, 1987; 27 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} This thesis describes a method for exploiting large grain parallelism on a network of workstations. Parallelism is provided by a collection of C callable library functions and macro expansions, giving the applications programmer the ability to create, execute, and synchronize processes using synchronous message passing. A program using these library calls consists of a main process and a set of created subprocesses executing on multiple workstations in the network. Significant speedup has been obtained with this library.\\
\noindent {\bf UCSC-CRL-88-13}\sp {\bf AUTOMATIC BIRD SONG RECOGNITION}\\ Edward R. Oswalt (Master Thesis)\\ June, 1988; 65 pages (\$6.00)\\ \noindent {\bf ABSTRACT:} This thesis presents a computer system capable of identifying bird species from recordings of calls or songs typical of the species. Each recording is converted into a list of chirp feature vectors (one vector for each chirp), which is then converted into a single song feature vector. In a sample of 225 recordings of 28 species, a statistical system based on these vectors correctly classifies 125 of the recordings (56\%), and evidence is given that with a larger sample set better performance would occur. Due to the many kinds of variation in bird utterances, finding a single highly accurate and general approach to bird species recognition appears to be more difficult than finding such an approach to human speech recognition. In addition to analysis based on the chirp feature list, spectral analysis based on Fourier and Linear Predictive methods and the physiology of bird song production is discussed.\\
\noindent {\bf UCSC-CRL-88-14}\sp {\bf MOLECULAR AUTOMATA}\\ Don Kimber (Master Thesis)\\ June, 1988; 196 pages (\$10.00)\\ \noindent {\bf ABSTRACT:} We define a class of abstract systems called molecular automata, and investigate the behavior of these systems by exploiting the analogy between them and physical systems. We demonstrate that pattern formation in molecular automata directly corresponds to learning, and describe an implementation of a learning system based on molecular automata. The learning domain chosen is the inference of the statistical structure of symbol sequences produced by Markov sources and presented as input to the molecular automata. The interpretation of each pattern in a molecular automata system as a hypothesized Markov source and the process by which it may be used to predict future symbols is given. By patterns we refer to classes of states which correspond to equivalent or related Markov sources. An information theoretic metric is proposed for performance evaluation. We formally specify a process by which molecular automata systems evolve and demonstrate that the process leads to equilibrium conditions in which the most probable patterns are those with the greatest predictive value with respect to the performance metric. Kirkpatrick [Kirk], Hopfield [Hop82] [Hop85], and others have used a similar approach in the study of optimization problems, and Smolensky [PDP] and Hinton [PDP] have used this approach in the context of learning and pattern recognition. Kirkpatrick and Smolensky have both suggested the analogy of phase transitions but have not presented complete treatments of the analogy. We extend these previous treatments by formally defining the notions of phases (i.e., macrostates or equivalence classes of states) and open systems (systems allowed to exchange resources with their environment at controlled resource costs) and presenting some results relevant to these extensions. In particular, we quantitatively specify the relative probability of various phases in terms of a quantity analogous to free energy in physical systems, and demonstrate the relationship between the temperature, chemical potential (resource cost) and dominant phase. Although our treatment is developed for molecular automata systems, care is taken to develop a general treatment, and consequently some of our results apply to other classes of 'Boltzmann Machines'. Full justification of the statistical mechanical treatment of molecular automata is given. The role and applicability of the analogy between equilibrium physical systems and molecular automata systems is discussed in depth.\\
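\noindent {\it Illustration} (the standard statistical-mechanical relation the abstract alludes to, given as an editorial gloss and not as the thesis's own derivation): in units where Boltzmann's constant is 1, two phases (macrostates) $A$ and $B$ of a system at temperature $T$ occur with relative probability
\[ \frac{P(A)}{P(B)} = e^{-(F_A - F_B)/T}, \qquad F = E - T\,S, \]
so the dominant phase is the one of lowest free energy $F$; for an open system that exchanges a resource at chemical potential $\mu$, the analogous quantity is $F - \mu N$, with $N$ the amount of resource held.\\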
\noindent {\bf UCSC-CRL-88-15}\sp {\bf MOLSURF}\\ Kenneth W. Cluff (Master Thesis)\\ March, 1988; 69 pages (\$6.00)\\ \noindent {\bf ABSTRACT:} The MOLSURF molecular graphics system is a computer program which uses raster graphics techniques to create space filling images of proteins on standard computer systems. Images are composed and manipulated through an interactive front end accessed over a normal terminal, and then rendered to image files or a graphics display device upon command. Effects such as sidelighting, shadows, and highlights are used to increase the three-dimensional information content of the images produced. Chemical properties, either from the original protein data bank file or user created tables, can be displayed on the protein surface via color mapping, and spatial relationships can be similarly indicated using a color mapping to show the proximity of structures. Even with these advanced capabilities, MOLSURF is fast enough to be comfortably used as an interactive system. This thesis describes both the implementation and operation of MOLSURF in detail and is intended to serve as a guide for users of the system. Section two provides a brief description of proteins and protein structure to both show the motivation for programs such as MOLSURF and to provide some necessary background information. Section three then briefly reviews the history and current state of molecular graphics to give an idea of the historical roots of MOLSURF and place it amongst the current generation of molecular graphics systems. The next section lays out the initial design considerations arrived at from a combination of the ideas presented in the literature, experience with other systems, research interests in biochemistry, and knowledge of the capabilities of modern computer graphics techniques. These early decisions guided the course of development that eventually led to the MOLSURF system, which sections five and six describe in detail. Section five discusses the details of the interactive front end, and section six the underlying implementation of the MOLSURF renderer. Included in appendix one is a complete description of the MOLSURF command set.\\
\noindent {\bf UCSC-CRL-88-16}\sp {\bf THE WELL-FOUNDED SEMANTICS FOR GENERAL LOGIC PROGRAMS}\\ Allen Van Gelder, Kenneth A. Ross and John S. Schlipf\\ August, 1988; 13 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} A general logic program (abbreviated to "program" hereafter) is a set of rules that have both positive and negative subgoals. It is common to view a deductive database as a general logic program consisting of rules (IDB) sitting above elementary relations (EDB, facts). It is desirable to associate one Herbrand model with a program and think of that model as the "meaning of the program," or its "declarative semantics." Ideally, queries directed to the program would be answered in accordance with this model. Recent research indicates that some programs do not have a "satisfactory" total model; for such programs, the question of an appropriate partial model arises. We introduce unfounded sets and well-founded partial models, and define the well-founded semantics of a program to be its well-founded partial model. If the well-founded partial model is in fact a total model, we call it the well-founded model, and say informally that the program is "well-behaved." We show that the class of well-behaved programs properly includes previously studied classes of "stratified" and "locally stratified" programs. We also compare our method with other proposals in the literature, including Clark's "program completion," Fitting's 3-valued interpretation of the program completion, and the "stable models" of Gelfond and Lifschitz. A preliminary version of this report appeared in {\sl Seventh ACM Symposium on Principles of Database Systems}, pages 221-230, 1988.\\
\noindent {\bf UCSC-CRL-88-17}\sp {\bf THE ALTERNATING FIXPOINT OF LOGIC PROGRAMS WITH NEGATION}\\ Allen Van Gelder\\ October, 1988; 13 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} Horn clause programs have an intuitive and well-accepted semantics defined by the least fixpoint of an operator that essentially performs modus ponens reasoning. Several early attempts to extend this operator to programs with negative subgoals ran into problems of one sort or another. Two recent proposals to improve matters are named "stable models", due to Gelfond and Lifschitz, and "well-founded partial models", due to Van Gelder, Ross, and Schlipf. Both stable models and well-founded partial models were defined somewhat nonconstructively, in the sense that certain sets could be recognized if presented, but no algorithm to construct them from the program was given. In this paper we introduce and describe the alternating fixpoint of a logic program. The underlying idea is to monotonically build up a set of negative conclusions until the least fixpoint is reached, using a transformation related to the one that defines stable models. From a fixed set of negative conclusions, we can derive the positive conclusions that follow (without deriving any further negative ones), by traditional Horn clause semantics. The union of positive and negative conclusions is called the alternating fixpoint partial model. The name "alternating" was chosen because the transformation runs in two passes; the first pass transforms an underestimate of the set of negative conclusions into an (intermediate) overestimate; the second pass transforms the overestimate into a new underestimate; the composition of the two passes is monotonic. Our main theorem is that the alternating fixpoint partial model is exactly the well-founded partial model. We also show that a system in fixpoint logic, which permits rule bodies to be first order formulas but requires inductive relations to be positive within them, can be transformed straightforwardly into a normal logic program whose alternating fixpoint partial model corresponds to the least fixpoint of the fixpoint logic system. This report will appear in {\sl Eighth ACM Symposium on Principles of Database Systems}, March 1989.\\
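\noindent {\it Illustration} (an editorial sketch in Python of the two-pass construction described above, restricted to ground propositional programs; it is not code from the report, and the rule representation is invented for the example):
\begin{verbatim}
# A rule is (head, positive_subgoals, negative_subgoals); atoms are strings.

def derive(rules, assumed_false):
    """Horn-style closure: a subgoal 'not q' succeeds iff q is assumed false."""
    true_atoms, changed = set(), True
    while changed:
        changed = False
        for head, pos, neg in rules:
            if (head not in true_atoms
                    and all(p in true_atoms for p in pos)
                    and all(n in assumed_false for n in neg)):
                true_atoms.add(head)
                changed = True
    return true_atoms

def alternating_fixpoint(rules, atoms):
    false_atoms = set()                            # underestimate of negative conclusions
    while True:
        over = atoms - derive(rules, false_atoms)  # first pass: overestimate
        under = atoms - derive(rules, over)        # second pass: new underestimate
        if under == false_atoms:                   # least fixpoint reached
            return derive(rules, false_atoms), false_atoms
        false_atoms = under

# Example program: p :- not q.   q :- not p.   r :- not s.
rules = [("p", set(), {"q"}), ("q", set(), {"p"}), ("r", set(), {"s"})]
print(alternating_fixpoint(rules, {"p", "q", "r", "s"}))
# ({'r'}, {'s'}): r is true, s is false, and p, q remain undefined.
\end{verbatim}
The returned pair is the well-founded partial model: the positive conclusions derivable under the final set of negative conclusions, together with that set itself.\\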
\noindent {\bf UCSC-CRL-88-18}\sp {\bf THE RELIABILITY OF REGENERATION-BASED REPLICA CONTROL PROTOCOLS}\\ Darrell D.E. Long, John L. Carroll and Kris Stewart\\ October, 1988; 15 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} The accessibility of vital information can be enhanced by replicating the data on several sites, and employing a consistency control protocol to manage the copies. The most common measures of accessibility include reliability, which is the probability that a replicated data object will remain continuously accessible over a given time period, and availability, which is the steady-state probability that the data object is accessible at any given moment. For many applications, the reliability of a system is a more important measure of its performance than its availability. These applications are characterized by the property that interruptions of service are intolerable and usually involve interaction with real-time processes, such as process control or data gathering where the data will be lost if it is not captured when it is available. The reliability of a replicated data object depends on maintaining a viable set of current replicas. When storage is limited it may not be feasible to simply replicate a data object at enough sites to achieve the desired level of reliability. If new replicas of a data object can be created faster than a system failure can be repaired, better reliability can be achieved by creating new replicas on other sites in response to changes in the system configuration. This technique, known as regeneration, approximates the reliability provided by additional replicas for a modest increase in storage costs. Several strategies for replica maintenance are considered, and the benefits of each are analyzed. While the availability afforded by each of the protocols is quite similar, the reliabilities vary greatly. Formulas describing the reliability of the replicated data object are presented, and closed-form solutions are given for the tractable cases. Numerical solutions, validated by simulation results, are used to analyze the trade-offs between reliability and storage costs. With estimates of the mean time to site failure and repair in a given system, the numerical techniques presented here can be applied to predict the smallest number of replicas needed to provide the desired level of reliability. {\bf Keywords:} replication, fault-tolerance, distributed data bases.\\
\noindent {\bf UCSC-CRL-88-19}\sp {\bf ANALYSIS AND DESIGN OF CMOS MANCHESTER ADDER WITH VARIABLE CARRY-SKIP}\\ Pak K. Chan and Martine D.F. Schlag\\ November, 1988; 22 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} Carry-skip adders compare favorably with other adders for efficient VLSI implementation. Lehman has shown that carry-skip adders with variable size blocks are faster than adders with constant size blocks. In 1967, Majerski suggested that multi-level implementation of the variable size carry-skip adders would provide further improvement in speed. Almost two decades later, Oklobdzija and Barnes developed algorithms for determining the optimal block sizes for one-level and two-level implementation, and a generalization of the results was given by Guyot et al. In these analyses, they assumed all adder cells were implemented with static (restoring) logic. Therefore the ripple-carry propagation delay is linearly proportional to the size of a block. In contrast, a more popular VLSI adder implementation is the Manchester Adder using dynamic (precharge) logic, where the ripple propagation is proportional to the square of the size of a block. We shall examine two variants of the Manchester Adder. We analyze them with RC timing models, which provide us with a unified way of analyzing both CMOS circuits and interconnect. Finally, we apply these timing models to derive a closed-form formula to determine the optimal block sizes for the one-level case. {\bf Keywords:} Manchester adder, carry skip adder, RC timing model, VLSI design.\\
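\noindent {\it Illustration} (a standard RC argument, given as an editorial gloss rather than the report's own analysis): in a dynamic Manchester carry chain of $k$ identical pass stages, each with series resistance $R$ and node capacitance $C$, the Elmore estimate of the delay to the end of the block is
\[ T_{\rm ripple} \;\approx\; \sum_{j=1}^{k} (j\,R)\,C \;=\; \frac{k(k+1)}{2}\,R\,C, \]
which grows as the square of the block size $k$, whereas a static (restoring) cell contributes a roughly fixed delay per stage, giving the linear dependence mentioned above; this difference is why the optimal block sizes differ between the two implementations.\\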
\noindent {\bf UCSC-CRL-88-20}\sp {\bf AN OPTIMAL POLYNOMIAL-TIME ALGORITHM FOR FINDING MINIMUM VERTEX COVERS OF GRAPHS THAT HAVE NO ODD CYCLES THAT SHARE AN EDGE}\\ Robert A. Levinson\\ November, 1988; 15 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} We present an optimal $O(n^{3/2})$ algorithm, "O", for finding minimum vertex covers of graphs that have no odd cycles that share an edge. Optimal polynomial-time algorithms already exist for graphs bounded by degree 2 ($O(n)$) and for bipartite graphs ($O(n^{3/2})$). Thus algorithm O extends the size of the class of graphs for which we know how to find optimal vertex covers in polynomial time. Algorithm O frequently invokes the optimal algorithm for bipartite graphs. Optimal $O(n)$ vertex cover algorithms for weighted cycles and weighted linear graphs are also used by algorithm O. They are also presented. {\bf Keywords:} NP-completeness, vertex cover, bipartite graph.\\
\noindent {\bf UCSC-CRL-88-21}\sp {\bf DEBUGGING CONCURRENT PROGRAMS}\\ Charles E. McDowell and David P. Helmbold\\ November, 1988; 62 pages (\$6.00)\\ \noindent {\bf ABSTRACT:} The main problems associated with debugging concurrent programs are increased complexity, the 'probe effect', non-repeatability and the lack of a synchronized global clock. The probe effect refers to the fact that any attempt to observe the behavior of a distributed system may change the behavior of the system. For some parallel programs, different executions with the same data will produce different results even without any attempt to observe the behavior. Even when the behavior can be observed, the lack of a synchronized global clock in many systems makes the results of the observation difficult to interpret. This paper discusses these and other problems related to debugging concurrent programs, and presents a survey of current techniques used in debugging concurrent programs. Systems using three general techniques are described: traditional or breakpoint style debuggers, event monitoring systems, and static analysis systems. In addition, techniques for limiting, organizing and displaying the large amount of data produced by the debugging systems are discussed. An annotated bibliography of over forty papers is included.\\
\noindent {\bf UCSC-CRL-88-22}\sp {\bf DETECTION OF MULTIPLE FAULTS IN MOS CIRCUITS}\\ Joel Ferguson\\ December, 1988; 23 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} This report characterizes test sets that detect multiple faults in MOS circuits guided by the observation that such circuits are implemented as a network of switches. This leads to a conceptually simple technique for generating multiple-fault test sets. Sufficient conditions for the detection of all multiple faults are given for any switch network, and it is shown that a test set exists meeting these conditions for any irredundant circuit with certain restrictions on input fanout. Optimal test sets are generated for two classes of fanout-free circuits using this procedure. In the cases where these conditions cannot be met, a procedure is given that shows which multiple faults may not be detected by a class of "robust" test sets. For a large class of MOS complex gates, the procedures presented in this paper generate test sets that are superior (100\% multiple-fault coverage with fewer test vectors) to those possible using a gate-level description of the circuit. {\bf Keywords:} MOS faults, test vector generation, multiple faults.\\
\noindent {\bf UCSC-CRL-88-23}\sp {\bf REGENERATION PROTOCOLS FOR REPLICATED OBJECTS}\\ Darrell D.E. Long and Jehan-Francois Paris\\ October, 1988; 8 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} The reliability and availability of replicated data can often be increased by generating new replicas when some become inaccessible due to system malfunctions. This technique has been used in the Regeneration Algorithm, a replica control protocol based on file regeneration. The read and write availabilities of replicated data managed by the Regeneration Algorithm are evaluated and two new regeneration protocols are presented that overcome some of its limitations. The first protocol combines regeneration and the Available Copy approach to improve availability of replicated data. The second combines regeneration and the Dynamic Voting approach to guarantee data consistency in the presence of network partitions while maintaining a high availability. Expressions for the availabilities of replicated data managed by both protocols are derived and found to improve significantly on the availability achieved using extant consistency protocols. {\bf Keywords:} file consistency, fault-tolerant systems, replicated files, mutual exclusion, performance evaluation.\\
\noindent {\bf UCSC-CRL-88-24}\sp {\bf MULTIRESOLUTION DISSOLVE}\\ Charles Stein and Lewis Hitchner\\ December, 1988; 10 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} Dissolves from one digital image to another that exhibit many varied special effects can be achieved using the Multiresolution Dissolve. These effects include cross-dissolves of edge details before coarse image features, cross-dissolves of scene illumination before other image details, and a new smooth dissolve that reduces the double exposed look of standard cross-dissolves. Our method produces new and interesting results when applied to fading between background and titles or images. The Multiresolution Dissolve generates a series of bandpass filtered images of the start and end images and then produces in-between frames of the dissolve sequence by summing the bandpassed images with weights that vary over time and by frequency band. Computational time and space are greatly reduced by use of multiresolution pyramid image processing techniques.\\
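\noindent {\it Illustration} (an editorial gloss on the construction just described, not the authors' exact formulation): if $B_k(S)$ and $B_k(E)$ denote the $k$-th bandpass (pyramid) levels of the start and end images, an in-between frame at time $t$ can be written as
\[ F_t \;=\; \sum_k \Bigl[\, w_k(t)\,B_k(S) + \bigl(1 - w_k(t)\bigr)\,B_k(E) \,\Bigr], \]
where each band has its own weight schedule $w_k(t)$ falling from 1 to 0 over the dissolve; scheduling the high-frequency bands ahead of the low-frequency bands yields the detail-first effects listed above, and a uniform schedule across bands reduces to the standard cross-dissolve.\\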
\noindent {\bf UCSC-CRL-88-25}\sp {\bf THE DISTRIBUTED BIT COMPLEXITY OF THE RING: FROM THE ANONYMOUS TO THE NON-ANONYMOUS CASE}\\ Hans L. Bodlaender, Shlomo Moran and Manfred K. Warmuth\\ November, 1988; 16 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} The distributed bit complexity of an asynchronous network of processors is a lower bound on the worst case bit complexity of computing any non-constant function of the inputs to the processors. This concept attempts to capture the amount of communication required for any "useful" computation on the network. The aim of this kind of research is to characterize networks by their bit complexity. Moran and Warmuth studied the bit complexity of a ring of n processors under the assumption that all the processors in the ring are identical (anonymous), i.e., all processors run the same program and the only parameter to the program is the input of the processor. It was shown that for anonymous rings it takes $\Omega(n \log n)$ bits to compute any non-constant function. We would like to release the assumption that the processors are anonymous by allowing each of the processors to have a distinct identity (which becomes a second parameter to the program). If the set of possible identities grows double exponentially in n, then by a simple reduction to the anonymous case one can show that the lower bound holds as well. In this paper we show that the $\Omega(n \log n)$ bit lower bound for computing any non-constant function holds even if the set of possible identities is very small, that is, $n^{1+\epsilon}$ for any positive $\epsilon$.\\
\noindent {\bf UCSC-CRL-88-26}\sp {\bf PREDICTION PRESERVING REDUCIBILITY}\\ Leonard Pitt and Manfred K. Warmuth\\ November, 1988; 38 pages (\$5.00)\\ \noindent {\bf ABSTRACT:} Consider the following question: Given examples of words accepted and rejected by an unknown automaton, is there an algorithm that in a feasible amount of time will learn to predict which words will be accepted by the automaton? We develop a notion of prediction-preserving reducibility, and show that if DFAs are predictable, then so are all languages accepted in logarithmic space. In particular, the predictability of DFAs implies the predictability of all Boolean formulas. Similarly, predictability of NFAs, PDAs (or CFGs), and alternating DFAs implies predictability of all languages in nondeterministic logspace, LOGCFL, and polynomial time, respectively. We give relationships between the complexity of the membership problem for a class of automata, and the complexity of the prediction problem. Using prediction-preserving reductions, we determine the relative complexity of a number of additional prediction problems. Examples are given of prediction problems that are "prediction-complete" for P, i.e., whose predictability would imply the predictability of all languages accepted in polynomial time. It follows, under the assumption that there exist cryptographically secure pseudorandom bit generators, that all such prediction problems are not polynomially predictable, even under a relaxed definition of predictability. We discuss recent related results of Kearns and Valiant which show that Boolean formulas and DFAs are not polynomially predictable based on certain cryptographic assumptions such as the intractability of factoring Blum integers.\\
\noindent {\bf UCSC-CRL-88-27}\sp {\bf A TABLE OF EFFICIENT TRUNCATED BCH CODES}\\ Paul A. Monte and R. Michael Tanner\\ November, 1988; 30 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} Tanner has presented a transform theory that permits the construction and analysis of a large class of group-invariant codes. A subclass of these codes consists of shortened and punctured BCH codes, in which both information symbols and parity-check symbols have been deleted. A truncated BCH code is constructed such that 1) the BCH bound provides a lower bound on the minimum distance of the code, 2) the code rate and length are easily determined, and 3) the code can be decoded to its BCH design distance using any of the standard decoding algorithms for BCH codes, such as the Berlekamp-Massey algorithm. This paper presents a table of selected truncated binary BCH codes found on fields GF($2^{12}$) and smaller with minimum distances less than 50. The codes included are significantly better than shortened primitive BCH codes; many have parameters that approach or surpass those of the best codes in the tables of MacWilliams and Sloane and of T. Verhoeff. {\bf Keywords:} error-correction, codes, BCH, quasi-cyclic, shortened, group-invariant.\\
\noindent {\bf UCSC-CRL-88-28}\sp {\bf REPRESENTING BOOLEAN FUNCTIONS WITH IF-THEN-ELSE DAGs}\\ Kevin Karplus\\ November, 1988; 24 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} This article describes the use of binary decision diagrams (BDDs) and if-then-else DAGs for representing and manipulating Boolean functions. Two-cuts are defined for binary decision diagrams, and the relationship is exhibited between general if-then-else expressions and the two-cuts of a BDD for the same function. An algorithm for computing all two-cuts of a BDD in $O(n^2)$ time is given. A new canonical form for if-then-else DAGs, analogous to Bryant's canonical form for BDDs, is introduced. The canonical form is based on representing the lowest non-trivial two-cut in the corresponding BDD, while Bryant's canonical form represents the highest two-cut. Expressions in Bryant's canonical form or in the new canonical form are shown to be prime and irredundant. Some applications of if-then-else DAGs to multi-level logic minimization are given, and the Printform transformations for reducing the complexity of if-then-else DAGs are presented. {\bf Keywords:} Boolean functions, Binary decision diagrams, if-then-else trees, if-then-else DAGs, two-cuts, Bryant's canonical form.\\
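\noindent {\it Illustration} (an editorial sketch in Python, not the author's package: a hash-consed node store for the special case in which every test is a single variable, i.e., a BDD; in the report's general if-then-else DAGs the condition may itself be a DAG, and the two-cut and canonical-form machinery is omitted here):
\begin{verbatim}
TRUE, FALSE = "1", "0"          # terminal nodes
_table = {}                     # (var, hi, lo) -> node, so equal subgraphs are shared

def node(var, hi, lo):
    """Unique node testing var; a test whose branches agree is folded away."""
    if hi == lo:
        return hi
    return _table.setdefault((var, hi, lo), (var, hi, lo))

def cofactor(n, var, value):
    """Restrict node n by var := value (variables are ordered alphabetically)."""
    if n in (TRUE, FALSE) or n[0] != var:
        return n
    return n[1] if value else n[2]

def ite(f, g, h):
    """if f then g else h, built by Shannon expansion on the top variable."""
    if f == TRUE:  return g
    if f == FALSE: return h
    if g == h:     return g
    if g == TRUE and h == FALSE: return f
    v = min(n[0] for n in (f, g, h) if n not in (TRUE, FALSE))
    return node(v,
                ite(cofactor(f, v, 1), cofactor(g, v, 1), cofactor(h, v, 1)),
                ite(cofactor(f, v, 0), cofactor(g, v, 0), cofactor(h, v, 0)))

# Example: build x AND (y OR z) with variable order x < y < z.
x = node("x", TRUE, FALSE)
y = node("y", TRUE, FALSE)
z = node("z", TRUE, FALSE)
f = ite(x, ite(y, TRUE, z), FALSE)
\end{verbatim}
Because every node is looked up in the table before it is created, structurally identical subfunctions are stored once, which is what makes the DAG (as opposed to tree) representation compact.\\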
\noindent {\bf UCSC-CRL-88-29}\sp {\bf USING IF-THEN-ELSE DAGs FOR MULTI-LEVEL LOGIC MINIMIZATION}\\ Kevin Karplus\\ November, 1988; 21 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} This article describes the use of if-then-else DAGs for multi-level logic minimization. A new canonical form for if-then-else DAGs, analogous to Bryant's canonical form for binary decision diagrams (BDDs), is introduced. Two-cuts are defined for binary decision diagrams, and a relationship is exhibited between general if-then-else expressions and the two-cuts of a BDD for the same function. The canonical form is based on representing the lowest non-trivial two-cut in the corresponding BDD, instead of the highest two-cut, as in Bryant's canonical form. The definitions of prime and irredundant expressions are extended to if-then-else DAGs. Expressions in Bryant's canonical form or in the new canonical form can be shown to be prime and irredundant. Objective functions for minimization are discussed, and estimators for predicting the area and delay of the circuit produced after technology mapping are proposed. A brief discussion of methods for applying don't-care information and for factoring expressions is included. {\bf Keywords:} logic minimization.\\
\noindent {\bf UCSC-CRL-88-30}\sp {\bf PERFORMANCE BENEFITS OF PARALLELISM IN CACHED DASD CONTROLLERS}\\ Alexandre Brandwajn\\ November, 1988; 16 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} In recent months, several mainframe disk manufacturers have introduced or announced a number of DASD products, including new cached disk controllers, such as the IBM 3990 Model 3 or the Amdahl 6100 Storage Processor. Among the several striking performance features of these new generation controllers is their number of internal transfer paths, and the resulting degree of concurrency of I/O operations. The use of multiple cache access ports is expected to reduce storage path contention, and hence, I/O service time, since a number of cache hits can be served overlapped with the "hidden" portion of staging during which the remainder of the track following the missing record is read into the cache. Such an overlap may be limited to I/Os involving different devices (this seems to be the case in current cache controllers), or it may be extended to include hit-staging overlap for the same logical device. The goal of this note is twofold. First, to present an assessment of the expected performance impact of both limited and extended overlap between staging and cache hits using numerical results obtained from discrete-event simulations. The second goal is to propose an analytical model for path and device contention with overlaps. The accuracy of our model, believed to be novel, is also discussed. {\bf Keywords:} Disk subsystems, cached I/O, multiple cache ports, service overlaps, performance modeling, queueing models.\\
\noindent {\bf UCSC-CRL-88-31}\sp {\bf 0-1 LAWS AND DECISION PROBLEMS FOR FRAGMENTS OF SECOND-ORDER LOGIC}\\ Phokion G. Kolaitis and Moshe Y. Vardi\\ December, 1988; 35 pages (\$5.00)\\ \noindent {\bf ABSTRACT:} The probability of a property of the collection of all finite relational structures is the limit as $n \rightarrow \infty$ of the fraction of structures with n elements satisfying the property, provided the limit exists. It is known that a 0-1 law holds for any property expressible in first-order logic, i.e., the probability of any such property exists and is either 0 or 1. Moreover, the associated decision problem for the probabilities is PSPACE-complete. We investigate here fragments of existential second-order logic in which we restrict the patterns of first-order quantifiers. We focus on the class $\Sigma^1_1$(Ackermann) of existential second-order sentences in which the first-order part belongs to the Ackermann class, i.e., it contains at most one universal first-order quantifier. All properties expressible by $\Sigma^1_1$(Ackermann) sentences are NP-computable and there are natural NP-complete properties, such as SATISFIABILITY, that are expressible by such sentences. We establish that the 0-1 law holds for the class $\Sigma^1_1$(Ackermann) and show that the associated decision problem is NEXPTIME-complete. We also show that the 0-1 law fails for other fragments of existential second-order logic in which the first-order part belongs to certain prefix classes with an unsolvable decision problem.\\
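\noindent {\it Illustration} (an editorial restatement, in symbols, of the definition quoted above): for a property $P$ of finite structures over a fixed relational vocabulary,
\[ \mu(P) \;=\; \lim_{n \rightarrow \infty} \frac{\#\{\mbox{structures with universe } \{1,\ldots,n\} \mbox{ that satisfy } P\}}{\#\{\mbox{structures with universe } \{1,\ldots,n\}\}}, \]
provided the limit exists; a 0-1 law for a logic asserts that $\mu(P) \in \{0,1\}$ for every property $P$ definable in that logic.\\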
\noindent {\bf UCSC-CRL-88-32}\sp {\bf A SATISFIABILITY TESTER FOR NON-CLAUSAL PROPOSITIONAL CALCULUS}\\ Allen Van Gelder\\ June, 1988; 21 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} An algorithm for satisfiability testing in the propositional calculus with a worst case running time that grows at a rate less than $2^{(0.25+\epsilon)L}$ is described, where L can be either the length of the input expression or the number of occurrences of literals (i.e., leaves) in it. This represents a new upper bound on the complexity of non-clausal satisfiability testing. The performance is achieved by using lemmas concerning assignments and pruning that preserve satisfiability, together with choosing a "good" variable upon which to recur. For expressions in conjunctive normal form, it is shown that an upper bound is $2^{0.128L}$. This report appeared in {\sl Information and Computation}, Vol. 79, No. 1, October 1988.\\
\noindent {\bf UCSC-CRL-88-33}\sp {\bf A MODEL OF LEARNING BASED ON THE PRINCIPLE OF COGNITIVE ECONOMY}\\ Aleksandar Milosavljevic\\ December, 1988; 35 pages (\$5.00)\\ \noindent {\bf ABSTRACT:} By the principle of cognitive economy, the goal of learning is the minimization of cognitive effort in saving a set of observations. The simpler the knowledge structure that results from learning, the lower the cognitive effort. A knowledge structure consists of a set of prototypes that capture the structure in the observations. We propose a class of models of unsupervised learning that is based on the principle of cognitive economy. A model of learning from that class is obtained by choosing a particular measure of cognitive effort. There are many possible measures of cognitive effort. For a class of discrete probability distributions and for particular choices of the measure of cognitive effort we obtain models of learning identical to maximum likelihood and Bayesian estimators. We implement a learning algorithm that minimizes a simple measure of cognitive economy. The performance of this algorithm correlates with human performance in a learning experiment. Also, we discover the classes of Alu genetic sequences by applying this algorithm. Finally, we demonstrate its robustness to noisy, irrelevant and missing data.\\
\noindent {\bf UCSC-CRL-88-34}\sp {\bf RELIABILITY OF REPLICATED DATA OBJECTS}\\ Darrell D. E. Long, Jehan-Francois Paris and John L. Carroll\\ December, 1988; 5 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} Improved fault tolerance of many applications can be achieved by replicating data at several sites. This data redundancy requires a protocol to maintain the consistency of the data object in the presence of site failures. The most commonly used scheme is voting. Voting and its variants are unaffected by network partitions. When network partitions cannot occur, better performance can be achieved with available copy protocols. Common measures of dependability include reliability, which is the probability that a replicated object will remain constantly available over a fixed time period. We investigate the reliability of replicated data objects managed by voting, available copy and their variants. Where possible, closed-form expressions for the reliability of the various consistency protocols are derived using standard Markovian assumptions. In other cases, numerical solutions are found and validated with simulation results. {\bf Keywords:} replication, fault-tolerance, distributed data bases.\\
\noindent {\bf UCSC-CRL-88-35}\sp {\bf THE EFFECT OF FAILURE AND REPAIR DISTRIBUTIONS ON CONSISTENCY PROTOCOLS FOR REPLICATED DATA OBJECTS}\\ John L. Carroll and Darrell D. E. Long\\ December, 1988; 13 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} The accessibility of vital information can be enhanced by replicating the data on several sites, and employing a consistency control protocol to manage the copies. Various consistency control schemes have been proposed to ensure that out of date information cannot be accessed or stored. The effect these protocols have on the accessibility of the replicated data can be investigated by simulating the operation of the network and measuring the performance of these consistency control schemes. Several strategies for replica maintenance are considered, and the benefits of each are analyzed. The details of the simulation are discussed. Measurements of the mean time to failure and the availability of the replicated data are compared and contrasted. The sensitivity of the Available Copy and Linear-Dynamic Voting protocols and their variants to common patterns of site failures and distributions of read and write accesses is studied in detail. Constant, Erlang, uniform, and hyperexponential distributions are considered, and the effect the second moments have on the results is outlined. The relative performance of competing protocols is shown to be only marginally affected by non-exponential distributions, validating the robustness of the exponential approximations. This paper is to appear in the {\sl Twenty-Second Annual Simulation Symposium}. {\bf Keywords:} replication, fault-tolerance, distributed data bases.\\
\noindent {\bf UCSC-CRL-88-36}\sp {\bf WHAT SIZE NET GIVES VALID GENERALIZATION?}\\ Eric B. Baum and David Haussler\\ November, 1988; 11 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} We address the question of when a network can be expected to generalize from m random training examples chosen from some arbitrary probability distribution, assuming that future test examples are drawn from the same distribution. Among our results are the following bounds on appropriate sample vs. network size. Assume $0 < \epsilon \leq 1/8$. We show that if $m \geq O((W/\epsilon) \log (N/\epsilon))$ random examples can be loaded on a feedforward network of linear threshold functions with N nodes and W weights, so that at least a fraction $1 - \epsilon/2$ of the examples are correctly classified, then one has confidence approaching certainty that the network will correctly classify a fraction $1 - \epsilon$ of future test examples drawn from the same distribution. Conversely, for fully-connected feedforward nets with one hidden layer, any learning algorithm using fewer than $\Omega(W/\epsilon)$ random training examples will, for some distributions of examples consistent with an appropriate weight choice, fail at least some fixed fraction of the time to find a weight choice that will correctly classify more than a $1 - \epsilon$ fraction of the future test examples.\\
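\noindent {\it Illustration} (an editorial gloss that ignores the constant hidden in the $O(\cdot)$ of the first bound above): for a network with $W = 10{,}000$ weights and $N = 100$ nodes, and accuracy parameter $\epsilon = 0.1$, the bound calls for on the order of
\[ \frac{W}{\epsilon}\,\log\frac{N}{\epsilon} \;=\; 10^{5} \times \log 10^{3} \;\approx\; 7 \times 10^{5} \]
training examples (natural logarithm); roughly, the number of examples should exceed the number of weights by a factor on the order of $1/\epsilon$ times a logarithmic term.\\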
\noindent {\bf UCSC-CRL-88-37}\sp {\bf THE EXPRESSIVE POWER OF WELLFOUNDED FIXPOINT LOGIC}\\ Allen Van Gelder\\ November, 1988; 13 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} Extensions of first order logic by various fixpoint operators have been studied extensively for both infinite and finite structures. Previously studied extensions that provide for a fixpoint operator on nonpositive formulas include inductive fixpoint logic (IFP), well-founded fixpoint logic (WFP), and others. We give a constructive "simulation" that demonstrates that WFP $\subseteq$ IFP (in expressive power) when the closure ordinal of the IFP system is at most $\omega$. Experience so far indicates that most concepts involving nonmonotonic induction have a more concise and natural expression in WFP than in IFP.\\
\noindent {\bf UCSC-CRL-88-38}\sp {\bf C++ AS A SOLUTION TO C'S SHORTCOMINGS}\\ Ira Pohl and Daniel Edelson\\ December, 1988; 20 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} C++ extends a modern, general purpose programming language into the object oriented programming paradigm. It inherits both benefits and conflicts from its parent language. Advantages and disadvantages of traditional C have been described in "A-Z: C Language Shortcomings" (UCSC technical report UCSC-CRL-87-31). This paper assesses C++, examining ways in which it is a better general purpose programming language. We discuss how well C++ supports the object oriented programming paradigm. \S 1 introduces new C++ features. \S 2 discusses ways C++ improves on C. \S 3 examines philosophical and practical problems with C++ that are not present in C. \S 4 assesses C++.\\ {\bf Keywords:} object-oriented programming, language assessment, C, C++.\\
\noindent {\bf UCSC-CRL-88-39}\sp {\bf LEARNING CONJUNCTIVE CONCEPTS IN STRUCTURAL DOMAINS}\\ David Haussler\\ December, 1988; 25 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} We study the problem of learning conjunctive concepts from examples on structural domains like the blocks world. This class of concepts is formally defined and it is shown that even for samples in which each example (positive or negative) is a two-object scene it is NP-complete to determine if there is any concept in this class that is consistent with the sample. We demonstrate how this shows that it is unlikely that this class of concepts is polynomially learnable from random examples alone in the PAC framework of Valiant. On the other hand, we show that for any fixed bound on the number of objects per scene, this class is polynomially learnable if, in addition to providing random examples, we allow the learning algorithm to make subset queries. In establishing this result, we calculate the capacity of the hypothesis space of conjunctive concepts in a structural domain, and use a general theorem of Vapnik and Chervonenkis. This latter result can also be used to estimate a sample size sufficient for heuristic learning techniques that do not use queries. This paper is in part a revised version of Technical Report UCSC-CRL-87-1, UC Santa Cruz, February, 1987.\\
\noindent {\bf UCSC-CRL-88-40}\sp {\bf DESCRIPTIONS OF SOME NUMERICALLY-INTENSIVE COMPUTER APPLICATIONS}\\ Harwood Kolsky and Douglas Hellinger\\ December, 1988; 70 pages (\$6.00)\\ \noindent {\bf ABSTRACT:} As part of a Jointly Defined Effort between IBM and UCSC, a set of numerically-intensive computations was solicited from members of the UCSC faculty and their graduate students. The selected principal investigators were given grants of computer time and other facilities, including the opportunity to use the UCSC home-built set of parallel processors (called SC-2), in exchange for sharing their experiences and giving their recommendations. This report is a collection of descriptions of the projects from various scientific disciplines as they were originally submitted, together with comments concerning parallelization, etc.\\
\noindent {\bf UCSC-CRL-88-41}\sp {\bf CONSTRAINED BACK-PROPAGATION, AN ALGORITHM FOR LEARNING IN ARTIFICIAL NEURAL NETWORKS}\\ Jean-Guilhem Cailton (Master Thesis)\\ June, 1988; 35 pages (\$5.00)\\ \noindent {\bf ABSTRACT:} In the last few years, there has been a strong renewal of interest in the interdisciplinary field of "artificial neural networks". A combination of reasons seems to explain this phenomenon. The appearance of parallel machines at the same time provides efficient implementation and encourages research for a better understanding of massive parallelism. The limitations of "classical" artificial intelligence in the domains of perception and learning stimulate interest in different methods and theoretical frameworks to tackle these problems. The discovery of new learning algorithms overcomes the limitations of earlier models.\\
\noindent {\bf UCSC-CRL-88-42}\sp {\bf THE $(N^2-1)$-PUZZLE and RELATED RELOCATION PROBLEMS}\\ Daniel Ratner and Manfred Warmuth\\ December, 1988; 24 pages (\$4.00)\\ \noindent {\bf ABSTRACT:} The 8-puzzle and the 15-puzzle have been used for many years as a domain for testing heuristic search techniques. From experience it is known that these puzzles are "difficult" and therefore useful for testing search techniques. In this paper we give strong evidence that these puzzles are indeed good test problems. We extend the 8-puzzle and the 15-puzzle to an $n \times n$ board and show that finding a shortest solution for the extended puzzle is NP-hard and is thus believed to be computationally infeasible. We also sketch an approximation algorithm for transforming boards that is guaranteed to use no more than a constant times the minimum number of moves, where the constant is independent of the given boards and their size n. The studied puzzles are instances of planar relocation problems where the reachability question is polynomial but efficient relocation is NP-hard. Such problems are natural robotics problems: A robot needs to efficiently relocate packages in the plane. Our research encourages the study of polynomial approximation algorithms for related robotics problems.\\
\noindent {\bf UCSC-CRL-88-43}\sp {\bf SEARCHING A PARTIALLY-ORDERED KNOWLEDGE BASE OF COMPLEX OBJECTS}\\ Bernard Riff (Master Thesis)\\ December, 1988; 120 pages (\$10.00)\\ \noindent {\bf ABSTRACT:} - NOT AVAILABLE -\\
\end{document}