calton@cs.columbia.edu (Calton Pu) (11/01/88)
[ Please remember to reply to the author of this article for copies of this ]
[ TR -- not to me! --DL ]
Technical Report No. CUCS-381-88
Title: Fine-Grain Scheduling
Abstract
We introduce the concept of fine-grain scheduling. Conventional
scheduling makes job assignment an exclusive function of time. We
broaden the meaning of the term ``scheduling'' to include job
assignment as a function of any stream of interrupts serving as a
reference frame, not just timer interrupts. By fine-grain we mean frequent
checking and scheduling actions (e.g. at sub-millisecond
intervals), introducing new flexibility into scheduling. We have
implemented fine-grain scheduling in the Synthesis operating system
based on a software mechanism similar to a phase-locked loop. Very low
context-switch and scheduling overhead (a few microseconds on a
68020-based machine) makes Synthesis fine-grain scheduling practical.
Interesting applications of fine-grain scheduling include I/O device
management, real-time scheduling, highly sensitive adaptive
scheduling, and distributed adaptive scheduling.
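
As a rough illustration (not code from the report), the following
Python sketch shows the flavor of a software phase-locked loop used
for scheduling: an estimate of the inter-interrupt interval is nudged
toward each observed interval, and the consumer job is dispatched at
the tracked rate. All names and constants here are illustrative
assumptions, not Synthesis internals.

class PLLScheduler:
    """Tracks an interrupt stream and paces a job against it."""

    def __init__(self, gain=0.25):
        self.interval = None   # tracked inter-interrupt interval
        self.last = None       # timestamp of the previous interrupt
        self.gain = gain       # loop gain: how fast the estimate adapts

    def on_interrupt(self, now):
        """Called on each interrupt of the reference stream."""
        if self.last is not None:
            observed = now - self.last
            if self.interval is None:
                self.interval = observed
            else:
                # First-order feedback, as in a phase-locked loop:
                # nudge the estimate toward the observed interval.
                self.interval += self.gain * (observed - self.interval)
        self.last = now

    def next_dispatch(self, now):
        """Time at which the consumer job should next be scheduled."""
        return now + (self.interval or 0.0)

# Toy usage: a jittery interrupt stream with true period 1.0.
pll = PLLScheduler()
for t in (0.0, 1.1, 1.9, 3.0, 4.05, 4.95, 6.0):
    pll.on_interrupt(t)
print(round(pll.interval, 2))   # settles near 1.0
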
by: Henry Massalin and Calton Pu
Department of Computer Science
Columbia University
New York, NY 10027
Send your request to calton@cs.columbia.edu or columbia!calton.
Please include your postal mailing address. Thanks.

darrell@midgard.ucsc.edu (Darrell Long) (11/08/88)
For those of you interested in replication...
The following technical report (UCSC-CRL-88-18) can be ordered from UCSC.
Send your request to:
Jennifer Madden
Baskin Center for Computer Engineering and Information Sciences
University of California
Santa Cruz, CA 95064
The Reliability of Regeneration-Based Replica Control Protocols
Darrell D.E. Long
Computer and Information Sciences
University of California
Santa Cruz, CA 95064

John L. Carroll and Kris Stewart
Computer Science Division
San Diego State University
San Diego, CA 92182
ABSTRACT
The accessibility of vital information can be enhanced by
replicating the data on several sites, and employing a
consistency control protocol to manage the copies. The most
common measures of accessibility include reliability, which
is the probability that a replicated data object will remain
continuously accessible over a given time period, and
availability, which is the steady-state probability that the
data object is accessible at any given moment.
For many applications, the reliability of a system is a
more important measure of its performance than its
availability. These applications are characterized by the
property that interruptions of service are intolerable and
usually involve interaction with real-time processes, such
as process control or data gathering, where data will be
lost if it is not captured when it is available.
The reliability of a replicated data object depends on
maintaining a viable set of current replicas. When storage
is limited it may not be feasible to simply replicate a data
object at enough sites to achieve the desired level of
reliability. If new replicas of a data object can be
created faster than a system failure can be repaired, better
reliability can be achieved by creating new replicas on
other sites in response to changes in the system
configuration. This technique, known as regeneration,
approximates the reliability provided by additional replicas
for a modest increase in storage costs.
Several strategies for replica maintenance are
considered, and the benefits of each are analyzed. While
the availability afforded by each of the protocols is quite
similar, the reliabilities vary greatly. Formulas describing
the reliability of the replicated data object are presented,
and closed-form solutions are given for the tractable cases.
Numerical solutions, validated by simulation results, are
used to analyze the trade-offs between reliability and
storage costs. With estimates of the mean times to site
failure and repair in a given system, the numerical
techniques presented here can be applied to predict the
smallest number of replicas required to provide the desired
level of reliability.
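
As a back-of-the-envelope illustration (not the report's model), the
Monte Carlo sketch below treats the number of live replicas as a
birth-death process with exponentially distributed times, and compares
regeneration against waiting for site repair. The rates, the mission
time, and the loss criterion (all live replicas down) are assumptions
chosen only to make the contrast visible.

import random

def survives(n, t_mission, mtbf=1000.0, mttr=50.0, t_regen=2.0,
             regenerate=True):
    """One trial: does the object stay continuously accessible?"""
    k, t = n, 0.0                    # k = replicas currently alive
    while t < t_mission:
        down = n - k
        fail_rate = k / mtbf
        # Regeneration creates a fresh replica on a spare site quickly;
        # otherwise we must wait for the failed site to be repaired.
        recover_rate = down / (t_regen if regenerate else mttr)
        total = fail_rate + recover_rate
        t += random.expovariate(total)
        if t >= t_mission:
            break
        if random.random() < fail_rate / total:
            k -= 1
            if k == 0:
                return False         # object lost: no live replica
        else:
            k += 1
    return True

def reliability(trials=20000, **kw):
    return sum(survives(**kw) for _ in range(trials)) / trials

print(reliability(n=3, t_mission=500.0, regenerate=False))
print(reliability(n=3, t_mission=500.0, regenerate=True))
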
darrell@midgard.ucsc.edu (Darrell Long) (10/10/89)

High-Speed Networks and the Internet

Daniel R. Helman, Darrell D. E. Long
Baskin Center for Computer Engineering & Information Sciences
University of California, Santa Cruz

ABSTRACT

So far much of the work in advanced networks has been concentrated on
high-speed transmission and the design of low-level packet switching
mechanisms. Less is known about interfacing and integrating such
networks into our existing data and telecommunications systems. We
examine one aspect of this problem, interfacing these networks to
existing LAN systems based on standard protocols. An internetworking
structure is proposed and supported with experimental evidence.

If you are interested in receiving a copy, please ask for UCSC-CRL-89-20.
Address correspondence to:

Ms. Jean McKnight
Technical Report Librarian
Baskin Center for Computer Engineering & Information Sciences
University of California
Santa Cruz, CA 95064
Internet: jean@luna.ucsc.edu

The cost of the report is $4.00 to cover duplicating and mailing. A
PostScript version of the report can be obtained via anonymous FTP
from Midgard.UCSC.EDU, pub/tr/ucsc-crl-89-20.ps.Z. If you decide to
use FTP, please send a note to Jean so we can estimate how successful
electronic TRs are.
darrell@sequoia.ucsc.edu (Darrell Long) (10/16/90)
% I keep a list of ftp'able reports on midgard.ucsc.edu. If you'd like me to
% add your reports to that list, send me a note. --DL
The following technical report is available by anonymous FTP from
midgard.ucsc.edu (128.114.134.15) as pub/tr/ucsc-crl-89-04.tar.Z
A printed copy of the report can be obtained by sending $4 to:
Jean McKnight
Technical Librarian
Baskin Center for Computer Engineering & Information Sciences
Applied Sciences Building
University of California
Santa Cruz, CA 95064
Swift: A Storage Architecture for Large Objects
Luis-Felipe Cabrera
IBM Almaden Research Center
Darrell D. E. Long
University of California, Santa Cruz
ABSTRACT
Managing large objects with high data rate requirements is
difficult for current computing systems. The increasing
disparity between the fastest network transfer rate and the
fastest disk transfer rate requires resolution. We present an
architecture, called Swift, that addresses the problem of storing
and retrieving, at high data rates, large data objects from
slower secondary storage. Applications range from visualization
of scientific computations to real-time storage and retrieval of
color video.
Swift addresses this issue by exploiting the available
interconnection capacity and by using several slower storage
devices concurrently. We study the performance characteristics
of a local-area instance of Swift using a parametric simulation
model. We consider a system of high-performance workstations
connected to multiple storage agents by a high-speed local area
network. Our simulation shows Swift compares favorably with
other concurrent I/O architectures, such as disk arrays, in terms
of maximum aggregate data rate and resource requirements.
Keywords: high-performance storage systems, high data rates, disk
striping, high-speed networks, distributed systems, data
redundancy, server resiliency.
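
To make the idea concrete, here is a hedged sketch (not Swift's actual
interface) of striping a large object over several slower storage
agents and reading the stripe units back concurrently to aggregate
their bandwidth. The class, the stripe-unit size, and the in-memory
"agents" are all illustrative assumptions.

from concurrent.futures import ThreadPoolExecutor

STRIPE_UNIT = 64 * 1024  # bytes per stripe unit (an assumed value)

class StripedObject:
    def __init__(self, agents):
        self.agents = agents                 # storage agents (dict-like)

    def write(self, name, data):
        """Scatter stripe units round-robin across the agents."""
        for i in range(0, len(data), STRIPE_UNIT):
            unit_no = i // STRIPE_UNIT
            agent = self.agents[unit_no % len(self.agents)]
            agent[(name, unit_no)] = data[i:i + STRIPE_UNIT]

    def read(self, name, size):
        """Fetch all stripe units in parallel and reassemble."""
        n_units = -(-size // STRIPE_UNIT)    # ceiling division
        def fetch(u):
            return self.agents[u % len(self.agents)][(name, u)]
        with ThreadPoolExecutor(max_workers=len(self.agents)) as pool:
            units = list(pool.map(fetch, range(n_units)))
        return b"".join(units)

# Toy usage with in-memory "agents"; a real deployment would place each
# agent behind the high-speed network the abstract assumes.
agents = [dict() for _ in range(4)]
obj = StripedObject(agents)
payload = bytes(300 * 1024)
obj.write("frame-0", payload)
assert obj.read("frame-0", len(payload)) == payload
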
darrell@sequoia.ucsc.edu (Darrell Long) (04/20/91)

The following UCSC technical report (UCSC-CRL-91-08) is available via
anonymous FTP from midgard.ucsc.edu (128.114.14.6). The file is
pub/tr/ucsc-crl-91-08.ps.Z. Be sure to use binary mode when doing the
transfer.

Exploiting Multiple I/O Streams to Provide High Data-Rates

Luis-Felipe Cabrera
Computer Science Department
IBM Almaden Research Center
Internet: cabrera@ibm.com

Darrell D. E. Long
Computer & Information Sciences
University of California at Santa Cruz
Internet: darrell@sequoia.ucsc.edu

ABSTRACT

We present an I/O architecture, called Swift, that addresses the
problem of data-rate mismatches between the requirements of an
application, the maximum data-rate of the storage devices, and the
data-rate of the interconnection medium. The goal of Swift is to
support integrated continuous multimedia in general purpose
distributed systems. In installations with a high-speed
interconnection medium, Swift will provide high data-rate transfers by
using multiple slower storage devices in parallel. The data-rates
obtained with this approach scale well when using multiple storage
devices and multiple interconnections. Swift has the flexibility to
use any appropriate storage technology, including disk arrays. The
ability to adapt to technological advances will allow Swift to provide
for ever increasing I/O demands. To address the problem of partial
failures, Swift stores data redundantly.

Using the UNIX operating system, we have constructed a simplified
prototype of the Swift architecture. Using a single Ethernet-based
local-area network and three servers, the prototype provides
data-rates that are almost three times as fast as access to the local
SCSI disk in the case of writes. When compared with NFS, the Swift
prototype provides double the data-rate for reads and eight times the
data-rate for writes. The data-rate of our prototype scales almost
linearly in the number of servers and the number of network segments.
Its performance is shown to be limited by the speed of the
Ethernet-based local-area network.

We also constructed a simulation model to show how the Swift
architecture can exploit storage, communication, and processor
advances, and to locate the components that will limit I/O
performance. In a simulated gigabit/second token-ring local-area
network the data-rates are seen to scale proportionally to the size of
the transfer unit and to the number of storage agents.
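
The redundant storage mentioned above suggests a parity scheme in the
spirit of RAID; whether Swift uses exactly single-parity striping is
an assumption here, and the helper below is only a toy illustration of
how one lost stripe unit can be rebuilt from the survivors.

def xor_blocks(blocks):
    """XOR equal-length blocks together to form (or use) a parity block."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data_units = [b"unit-0--", b"unit-1--", b"unit-2--"]   # one stripe
parity = xor_blocks(data_units)

# If one storage agent fails, its unit is the XOR of the surviving
# units and the parity block.
recovered = xor_blocks([data_units[0], data_units[2], parity])
assert recovered == data_units[1]
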
darrell@terra.ucsc.edu (Darrell Long) (06/04/91)
The following technical report is available via anonymous FTP from
midgard.ucsc.edu and also through electronic mail.

For FTP:

    ftp> cd pub/tr
    ftp> binary
    ftp> get ucsc-crl-91-18.ps.Z
    ftp> quit

For electronic mail:

    % mail reports@midgard.ucsc.edu
    send ucsc-crl-91-18.ps.Z from tr
    ^D

ACCESSING REPLICATED DATA IN A LARGE-SCALE DISTRIBUTED SYSTEM

by Richard A. Golding

Many distributed applications use replicated data to improve the
availability of the data, and to improve access latency by locating
copies of the data near to their use. This thesis presents a new
family of communication protocols, called quorum multicasts, that
provide efficient communication services for replicated data. Quorum
multicasts are similar to ordinary multicasts, which deliver a message
to a set of destinations. The new protocols extend this model by
allowing delivery to a subset of the destinations, selected according
to distance or expected data currency. These protocols provide
well-defined failure semantics, and can distinguish between
communication failure and replica failure with high probability.

The thesis includes a performance evaluation of three quorum multicast
protocols. This required taking several measurements of the Internet
to determine distributions for communication latency and failure. The
results indicate that the behavior of recent messages is a useful
predictor for the performance of the next. A simulation study of
quorum multicasts, based on the Internet measurements, shows that
these protocols provide low latency and require few messages. A
second study that measured a test application running at several sites
confirmed these results.
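
As a hedged sketch of the idea (not one of the thesis's three
protocols), the function below contacts the nearest replicas first and
widens the destination set only if the quorum is not reached, which is
the essence of selecting a subset of destinations by distance. The
ranking, retry policy, and names are illustrative assumptions.

def quorum_multicast(replicas, quorum, send, max_rounds=3):
    """replicas: list of (expected_latency, replica_id) pairs.
    send(rid) -> bool models whether a replica replies.
    Returns the set of replicas that replied, or None on failure."""
    ranked = sorted(replicas)                  # nearest replicas first
    replied = set()
    tried = 0
    window = quorum                            # contact just enough
    for _ in range(max_rounds):
        # Contact the next `window` nearest replicas not yet tried.
        for _, rid in ranked[tried:tried + window]:
            if send(rid):
                replied.add(rid)
        tried += window
        if len(replied) >= quorum:
            return replied
        if tried >= len(ranked):
            break
        window = quorum - len(replied)         # expand just enough
    return None                                # quorum not reached

# Toy usage: five replicas, the two nearest are down.
down = {"r1", "r2"}
reps = [(10, "r1"), (12, "r2"), (40, "r3"), (55, "r4"), (70, "r5")]
print(quorum_multicast(reps, quorum=2, send=lambda r: r not in down))
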