calton@cs.columbia.edu (Calton Pu) (11/01/88)
[ Please remember to reply to the author of this article for copies of this ]
[ TR -- not to me! --DL ]

Technical Report No. CUCS-381-88

Title: Fine-Grain Scheduling

by: Henry Massalin and Calton Pu
    Department of Computer Science
    Columbia University
    New York, NY 10027

Abstract

We introduce the concept of fine-grain scheduling.  Conventional scheduling makes job assignment an exclusive function of time.  We broaden the meaning of the term ``scheduling'' to include job assignment as a function of any stream of interrupts used as a reference frame, not just timer interrupts.  By fine-grain we mean frequent checking and scheduling actions (e.g., at sub-millisecond intervals), which introduces new flexibility into scheduling.  We have implemented fine-grain scheduling in the Synthesis operating system, based on a software mechanism similar to a phase-locked loop.  Very low context-switch and scheduling costs (a few microseconds on a 68020-based machine) make Synthesis fine-grain scheduling practical.  Interesting applications of fine-grain scheduling include I/O device management, real-time scheduling, highly sensitive adaptive scheduling, and distributed adaptive scheduling.

Send your request to calton@cs.columbia.edu or columbia!calton.  Please include your postal mailing address.  Thanks.
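The phase-locked-loop analogy can be made concrete with a small sketch (my own illustration, not code from the Synthesis kernel): an interrupt stream serves as the reference signal, and the scheduler nudges a job's quantum so that the job's progress tracks the interrupt rate.  All names and constants below are hypothetical.

```python
# Hypothetical sketch of PLL-style fine-grain scheduling (not Synthesis
# code).  The scheduler measures the "phase error" -- how far a job's
# progress lags the reference interrupt count -- and adjusts the job's
# quantum to drive that error toward zero, the way a phase-locked loop
# tracks an input signal.

def adjust_quantum(quantum, error, prev_error, kp=0.1, kd=0.5,
                   min_q=0.1, max_q=1000.0):
    """One correction step: a proportional term on the phase error plus a
    damping term on its change, clamped to a sane range."""
    quantum += kp * error + kd * (error - prev_error)
    return max(min_q, min(max_q, quantum))

# Simulated run: the device raises 5 interrupts per scheduling tick, and
# the job processes work in proportion to the quantum it is granted.
quantum, processed, interrupts, prev_error = 1.0, 0.0, 0, 0.0
for tick in range(200):
    interrupts += 5
    processed += quantum
    error = interrupts - processed          # positive: the job is lagging
    quantum = adjust_quantum(quantum, error, prev_error)
    prev_error = error

print(round(quantum, 2))  # settles near the reference rate of 5 per tick
```

The damping term matters: correcting on the accumulated phase error alone would oscillate forever, since progress is the integral of the quantum.  Checking and correcting at every tick, rather than at coarse timer intervals, is what the abstract means by fine-grain.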
darrell@midgard.ucsc.edu (Darrell Long) (11/08/88)
For those of you interested in replication... The following technical report (UCSC-CRL-88-18) can be ordered from UCSC.  Send your request to:

    Jennifer Madden
    Baskin Center for Computer Engineering and Information Sciences
    University of California
    Santa Cruz, CA 95064

The Reliability of Regeneration-Based Replica Control Protocols

    Darrell D. E. Long
    Computer and Information Sciences
    University of California
    Santa Cruz, CA 95064

    John L. Carroll and Kris Stewart
    Computer Science Division
    San Diego State University
    San Diego, CA 92182

ABSTRACT

The accessibility of vital information can be enhanced by replicating the data on several sites and employing a consistency control protocol to manage the copies.  The most common measures of accessibility include reliability, which is the probability that a replicated data object will remain continuously accessible over a given time period, and availability, which is the steady-state probability that the data object is accessible at any given moment.  For many applications, the reliability of a system is a more important measure of its performance than its availability.  These applications are characterized by the property that interruptions of service are intolerable; they usually involve interaction with real-time processes, such as process control or data gathering, where the data will be lost if it is not captured when it is available.

The reliability of a replicated data object depends on maintaining a viable set of current replicas.  When storage is limited, it may not be feasible simply to replicate a data object at enough sites to achieve the desired level of reliability.  If new replicas of a data object can be created faster than a site failure can be repaired, better reliability can be achieved by creating new replicas on other sites in response to changes in the system configuration.  This technique, known as regeneration, approximates the reliability provided by additional replicas for a modest increase in storage costs.

Several strategies for replica maintenance are considered, and the benefits of each are analyzed.  While the availability afforded by each of the protocols is quite similar, the reliabilities vary greatly.  Formulas describing the reliability of the replicated data object are presented, and closed-form solutions are given for the tractable cases.  Numerical solutions, validated by simulation results, are used to analyze the trade-offs between reliability and storage costs.  With estimates of the mean times to site failure and repair in a given system, the numerical techniques presented here can be applied to predict the minimum number of replicas required to provide the desired level of reliability.
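The effect regeneration has on reliability can be seen even in a deliberately crude Monte Carlo sketch (my own illustration; the report itself derives these quantities analytically and validates them by simulation).  The model, rates, and parameters below are all assumptions for illustration.

```python
import random

# Simplified model: sites fail at rate lam and are repaired at rate mu
# (both exponential).  A repaired site refreshes its copy from a surviving
# replica.  With regeneration enabled, a failure is additionally masked at
# once by creating a new replica on a free spare site, approximating the
# reliability that extra replicas would provide.

def survives(tau, n, spares, lam, mu, regen):
    """One trial: does the object stay continuously accessible until tau?"""
    live, down, free = n, 0, spares
    t = 0.0
    while live > 0:
        total = live * lam + down * mu
        t += random.expovariate(total)
        if t > tau:
            return True                            # mission time reached
        if random.random() < live * lam / total:   # a replica site fails
            live, down = live - 1, down + 1
            if regen and free > 0:                 # regenerate on a spare
                free, live = free - 1, live + 1
        else:                                      # a failed site is repaired
            down, live = down - 1, live + 1
    return False                                   # all replicas were lost

def reliability(tau, trials=2000, **kw):
    """Estimate P(object remains accessible past tau)."""
    return sum(survives(tau, **kw) for _ in range(trials)) / trials

random.seed(42)
with_regen = reliability(10.0, n=2, spares=2, lam=0.2, mu=1.0, regen=True)
without    = reliability(10.0, n=2, spares=2, lam=0.2, mu=1.0, regen=False)
print(with_regen, without)
```

With these (arbitrary) rates, the regenerating configuration survives the mission noticeably more often than the static two-replica one, which is the trade the abstract describes: reliability close to that of additional replicas for only a modest increase in storage.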
darrell@midgard.ucsc.edu (Darrell Long) (10/10/89)
High-Speed Networks and the Internet

    Daniel R. Helman, Darrell D. E. Long
    Baskin Center for Computer Engineering & Information Sciences
    University of California, Santa Cruz

ABSTRACT

So far, much of the work on advanced networks has concentrated on high-speed transmission and the design of low-level packet-switching mechanisms.  Less is known about interfacing and integrating such networks into our existing data and telecommunications systems.  We examine one aspect of this problem: interfacing these networks to existing LAN systems based on standard protocols.  An internetworking structure is proposed and supported with experimental evidence.

If you are interested in receiving a copy, please ask for UCSC-CRL-89-20.  Address correspondence to:

    Ms. Jean McKnight
    Technical Report Librarian
    Baskin Center for Computer Engineering & Information Sciences
    University of California
    Santa Cruz, CA 95064
    Internet: jean@luna.ucsc.edu

The cost of the report is $4.00 to cover duplicating and mailing.  A PostScript version of the report can be obtained via anonymous FTP from Midgard.UCSC.EDU as pub/tr/ucsc-crl-89-20.ps.Z.  If you decide to use FTP, please send a note to Jean so we can estimate how successful electronic TRs are.
darrell@sequoia.ucsc.edu (Darrell Long) (10/16/90)
% I keep a list of ftp'able reports on midgard.ucsc.edu.  If you'd like me to
% add your reports to that list, send me a note.  --DL

The following technical report is available by anonymous FTP from midgard.ucsc.edu (128.114.134.15) as pub/tr/ucsc-crl-89-04.tar.Z.  A printed copy of the report can be obtained by sending $4 to:

    Jean McKnight
    Technical Librarian
    Baskin Center for Computer Engineering & Information Sciences
    Applied Sciences Building
    University of California
    Santa Cruz, CA 95064

Swift: A Storage Architecture for Large Objects

    Luis-Felipe Cabrera
    IBM Almaden Research Center

    Darrell D. E. Long
    University of California, Santa Cruz

ABSTRACT

Managing large objects with high data-rate requirements is difficult for current computing systems.  The increasing disparity between the fastest network transfer rate and the fastest disk transfer rate requires resolution.  We present an architecture, called Swift, that addresses the problem of storing and retrieving large data objects from slower secondary storage at high data rates.  Applications range from visualization of scientific computations to real-time storage and retrieval of color video.  Swift addresses this issue by exploiting the available interconnection capacity and by using several slower storage devices concurrently.

We study the performance characteristics of a local-area instance of Swift using a parametric simulation model.  We consider a system of high-performance workstations connected to multiple storage agents by a high-speed local-area network.  Our simulation shows that Swift compares favorably with other concurrent I/O architectures, such as disk arrays, in terms of maximum aggregate data rate and resource requirements.

Keywords: high-performance storage systems, high data rates, disk striping, high-speed networks, distributed systems, data redundancy, server resiliency.
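The core idea, aggregating the data rates of several slower devices, can be sketched in a few lines (my own illustration of striping, not the Swift implementation; the stripe-unit size and in-memory "agents" are assumptions for the example).

```python
import concurrent.futures

# Hypothetical sketch of striping across storage agents: a large object is
# cut into fixed-size stripe units, assigned round-robin to several agents,
# and the agents are driven concurrently so their transfer rates aggregate.

STRIPE_UNIT = 4096

def stripe(data, n_agents):
    """Assign (offset, unit) pairs to the agents round-robin."""
    layout = [[] for _ in range(n_agents)]
    for i in range(0, len(data), STRIPE_UNIT):
        layout[(i // STRIPE_UNIT) % n_agents].append((i, data[i:i + STRIPE_UNIT]))
    return layout

def store(agents, data):
    """Write each agent's stripe units concurrently, one worker per agent."""
    layout = stripe(data, len(agents))
    with concurrent.futures.ThreadPoolExecutor(len(agents)) as pool:
        list(pool.map(lambda pair: pair[0].update(pair[1]), zip(agents, layout)))

def retrieve(agents, size):
    """Reassemble the object from the stripe units held by the agents."""
    out = bytearray(size)
    for agent in agents:
        for offset, unit in agent.items():
            out[offset:offset + len(unit)] = unit
    return bytes(out)

# In-memory dictionaries stand in for remote storage agents.
agents = [{} for _ in range(3)]
obj = bytes(range(256)) * 100          # a 25,600-byte object
store(agents, obj)
assert retrieve(agents, len(obj)) == obj
```

In a real deployment each worker would issue a network transfer to its agent, so with three agents the aggregate rate approaches three times a single device's rate, which is the behavior the abstract's simulations study.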
darrell@sequoia.ucsc.edu (Darrell Long) (04/20/91)
The following UCSC technical report (UCSC-CRL-91-08) is available via anonymous FTP from midgard.ucsc.edu (128.114.14.6).  The file is pub/tr/ucsc-crl-91-08.ps.Z.  Be sure to use binary mode when doing the transfer.

Exploiting Multiple I/O Streams to Provide High Data-Rates

    Luis-Felipe Cabrera
    Computer Science Department
    IBM Almaden Research Center
    Internet: cabrera@ibm.com

    Darrell D. E. Long
    Computer & Information Sciences
    University of California at Santa Cruz
    Internet: darrell@sequoia.ucsc.edu

ABSTRACT

We present an I/O architecture, called Swift, that addresses the problem of data-rate mismatches between the requirements of an application, the maximum data rate of the storage devices, and the data rate of the interconnection medium.  The goal of Swift is to support integrated continuous multimedia in general-purpose distributed systems.  In installations with a high-speed interconnection medium, Swift will provide high data-rate transfers by using multiple slower storage devices in parallel.  The data rates obtained with this approach scale well when using multiple storage devices and multiple interconnections.  Swift has the flexibility to use any appropriate storage technology, including disk arrays.  The ability to adapt to technological advances will allow Swift to provide for ever-increasing I/O demands.  To address the problem of partial failures, Swift stores data redundantly.

Using the UNIX operating system, we have constructed a simplified prototype of the Swift architecture.  Using a single Ethernet-based local-area network and three servers, the prototype provides write data rates almost three times as fast as access to the local SCSI disk.  When compared with NFS, the Swift prototype provides double the data rate for reads and eight times the data rate for writes.  The data rate of our prototype scales almost linearly in the number of servers and the number of network segments.  Its performance is shown to be limited by the speed of the Ethernet-based local-area network.

We also constructed a simulation model to show how the Swift architecture can exploit advances in storage, communication, and processors, and to locate the components that will limit I/O performance.  In a simulated gigabit-per-second token-ring local-area network, the data rates are seen to scale proportionally to the size of the transfer unit and to the number of storage agents.
darrell@terra.ucsc.edu (Darrell Long) (06/04/91)
The following technical report is available via anonymous FTP from midgard.ucsc.edu and also through electronic mail.

For FTP:

    ftp> cd pub/tr
    ftp> binary
    ftp> get ucsc-crl-91-18.ps.Z
    ftp> quit

For electronic mail:

    % mail reports@midgard.ucsc.edu
    @@ send ucsc-crl-91-18.ps.Z from tr
    ^D

ACCESSING REPLICATED DATA IN A LARGE-SCALE DISTRIBUTED SYSTEM

    by Richard A. Golding

Many distributed applications use replicated data to improve the availability of the data, and to improve access latency by locating copies of the data near their use.  This thesis presents a new family of communication protocols, called quorum multicasts, that provide efficient communication services for replicated data.  Quorum multicasts are similar to ordinary multicasts, which deliver a message to a set of destinations.  The new protocols extend this model by allowing delivery to a subset of the destinations, selected according to distance or expected data currency.  These protocols provide well-defined failure semantics, and can distinguish between communication failure and replica failure with high probability.

The thesis includes a performance evaluation of three quorum multicast protocols.  This required taking several measurements of the Internet to determine distributions for communication latency and failure.  The results indicate that the behavior of recent messages is a useful predictor of the performance of the next.  A simulation study of quorum multicasts, based on the Internet measurements, shows that these protocols provide low latency and require few messages.  A second study that measured a test application running at several sites confirmed these results.
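The subset-delivery idea can be sketched as follows (a hypothetical illustration of the concept, not one of the three protocols evaluated in the thesis; the latency table and `send` callback are assumptions for the example).

```python
# Hypothetical sketch of a quorum multicast: replicas are ranked by
# estimated round-trip latency, and the message is delivered nearest-first,
# expanding to more distant replicas only when closer ones fail to reply.

def quorum_multicast(replicas, latency_est, quorum, send):
    """Collect `quorum` replies, contacting the closest replicas first.
    `send` stands in for the network: it returns True if the replica
    replied.  Returns the responding subset, or None if no quorum exists."""
    ranked = sorted(replicas, key=lambda r: latency_est[r])
    replies = []
    for r in ranked:
        if send(r):
            replies.append(r)
            if len(replies) == quorum:
                return replies
    return None

# Example: five replicas, one of which is unreachable.
latency = {"a": 12, "b": 20, "c": 35, "d": 48, "e": 90}
up = {"a", "c", "d", "e"}
print(quorum_multicast(list(latency), latency, 3, lambda r: r in up))
# -> ['a', 'c', 'd']
```

The real protocols contact replicas in parallel rounds with timeouts, and may rank by expected data currency rather than distance; the sequential loop above just makes the expanding-subset behavior, and the distinction between a failed replica and an unreachable one, easy to see.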