calton@cs.columbia.edu (Calton Pu) (11/01/88)
[ Please remember to reply to the author of this article for copies of this ]
[ TR -- not to me! --DL ]
Technical Report No. CUCS-381-88
Title: Fine-Grain Scheduling
Abstract
We introduce the concept of fine-grain scheduling. Conventional
scheduling makes job assignment an exclusive function of time. We
broaden the meaning of the term ``scheduling'' to include job
assignment as a function of any stream of interrupts serving as a
reference frame, not just timer interrupts. By fine-grain we mean frequent
checking and scheduling actions (e.g. at sub-millisecond
intervals), introducing new flexibility into scheduling. We have
implemented fine-grain scheduling in the Synthesis operating system
based on a software mechanism similar to a phase-locked loop. Very low
context-switch and scheduling overhead (a few microseconds on a
68020-based machine) makes Synthesis fine-grain scheduling practical.
Interesting applications of fine-grain scheduling include I/O device
management, real-time scheduling, highly sensitive adaptive
scheduling, and distributed adaptive scheduling.
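
As a rough illustration (not code from the report), the following
Python sketch shows the flavor of a software phase-locked loop used
for scheduling: an estimate of the inter-interrupt interval is nudged
toward each observed interval, and the consumer job is dispatched at
the tracked rate. All names and constants here are illustrative
assumptions, not Synthesis internals.

class PLLScheduler:
    """Tracks an interrupt stream and paces a job against it."""

    def __init__(self, gain=0.25):
        self.interval = None   # tracked inter-interrupt interval
        self.last = None       # timestamp of the previous interrupt
        self.gain = gain       # loop gain: how fast the estimate adapts

    def on_interrupt(self, now):
        """Called on each interrupt of the reference stream."""
        if self.last is not None:
            observed = now - self.last
            if self.interval is None:
                self.interval = observed
            else:
                # First-order feedback, as in a phase-locked loop:
                # nudge the estimate toward the observed interval.
                self.interval += self.gain * (observed - self.interval)
        self.last = now

    def next_dispatch(self, now):
        """Time at which the consumer job should next be scheduled."""
        return now + (self.interval or 0.0)

# Toy usage: a jittery interrupt stream with true period 1.0.
pll = PLLScheduler()
for t in (0.0, 1.1, 1.9, 3.0, 4.05, 4.95, 6.0):
    pll.on_interrupt(t)
print(round(pll.interval, 2))   # settles near 1.0
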
by: Henry Massalin and Calton Pu
Department of Computer Science
Columbia University
New York, NY 10027
Send your request to calton@cs.columbia.edu or columbia!calton.
Please include your postal mailing address. Thanks.

darrell@midgard.ucsc.edu (Darrell Long) (11/08/88)
For those of you interested in replication...
The following technical report (UCSC-CRL-88-18) can be ordered from UCSC.
Send your request to:
Jennifer Madden
Baskin Center for Computer Engineering and Information Sciences
University of California
Santa Cruz, CA 95064
The Reliability of Regeneration-Based Replica Control Protocols
Darrell D.E. Long
Computer and Information Sciences
University of California
Santa Cruz, CA 95064

John L. Carroll and Kris Stewart
Computer Science Division
San Diego State University
San Diego, CA 92182
ABSTRACT
The accessibility of vital information can be enhanced by
replicating the data on several sites, and employing a
consistency control protocol to manage the copies. The most
common measures of accessibility include reliability, which
is the probability that a replicated data object will remain
continuously accessible over a given time period, and
availability, which is the steady-state probability that the
data object is accessible at any given moment.
For many applications, the reliability of a system is a
more important measure of its performance than its
availability. These applications are characterized by the
property that interruptions of service are intolerable and
usually involve interaction with real-time processes, such
as process control or data gathering, where data will be
lost if it is not captured when it is available.
The reliability of a replicated data object depends on
maintaining a viable set of current replicas. When storage
is limited it may not be feasible to simply replicate a data
object at enough sites to achieve the desired level of
reliability. If new replicas of a data object can be
created faster than a system failure can be repaired, better
reliability can be achieved by creating new replicas on
other sites in response to changes in the system
configuration. This technique, known as regeneration,
approximates the reliability provided by additional replicas
for a modest increase in storage costs.
Several strategies for replica maintenance are
considered, and the benefits of each are analyzed. While
the availability afforded by each of the protocols is quite
similar, the reliabilities vary greatly. Formulas describing
the reliability of the replicated data object are presented,
and closed-form solutions are given for the tractable cases.
Numerical solutions, validated by simulation results, are
used to analyze the trade-offs between reliability and
storage costs. With estimates of the mean times to site
failure and repair in a given system, the numerical
techniques presented here can be applied to predict the
smallest number of replicas required to provide the desired
level of reliability.
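
As a back-of-the-envelope illustration (not the report's model), the
Monte Carlo sketch below treats the number of live replicas as a
birth-death process with exponentially distributed times, and compares
regeneration against waiting for site repair. The rates, the mission
time, and the loss criterion (all live replicas down) are assumptions
chosen only to make the contrast visible.

import random

def survives(n, t_mission, mtbf=1000.0, mttr=50.0, t_regen=2.0,
             regenerate=True):
    """One trial: does the object stay continuously accessible?"""
    k, t = n, 0.0                    # k = replicas currently alive
    while t < t_mission:
        down = n - k
        fail_rate = k / mtbf
        # Regeneration creates a fresh replica on a spare site quickly;
        # otherwise we must wait for the failed site to be repaired.
        recover_rate = down / (t_regen if regenerate else mttr)
        total = fail_rate + recover_rate
        t += random.expovariate(total)
        if t >= t_mission:
            break
        if random.random() < fail_rate / total:
            k -= 1
            if k == 0:
                return False         # object lost: no live replica
        else:
            k += 1
    return True

def reliability(trials=20000, **kw):
    return sum(survives(**kw) for _ in range(trials)) / trials

print(reliability(n=3, t_mission=500.0, regenerate=False))
print(reliability(n=3, t_mission=500.0, regenerate=True))
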
darrell@midgard.ucsc.edu (Darrell Long) (10/10/89)

High-Speed Networks and the Internet

Daniel R. Helman, Darrell D. E. Long
Baskin Center for Computer Engineering & Information Sciences
University of California, Santa Cruz

ABSTRACT

So far much of the work in advanced networks has been concentrated on
high-speed transmission and the design of low-level packet switching
mechanisms. Less is known about interfacing and integrating such
networks into our existing data and telecommunications systems. We
examine one aspect of this problem, interfacing these networks to
existing LAN systems based on standard protocols. An internetworking
structure is proposed and supported with experimental evidence.

If you are interested in receiving a copy, please ask for UCSC-CRL-89-20.
Address correspondence to:

Ms. Jean McKnight
Technical Report Librarian
Baskin Center for Computer Engineering & Information Sciences
University of California
Santa Cruz, CA 95064
Internet: jean@luna.ucsc.edu

The cost of the report is $4.00 to cover duplicating and mailing. A
PostScript version of the report can be obtained via anonymous FTP
from Midgard.UCSC.EDU, pub/tr/ucsc-crl-89-20.ps.Z. If you decide to
use FTP, please send a note to Jean so we can estimate how successful
electronic TRs are.
darrell@sequoia.ucsc.edu (Darrell Long) (10/16/90)
% I keep a list of ftp'able reports on midgard.ucsc.edu. If you'd like me to
% add your reports to that list, send me a note. --DL
The following technical report is available by anonymous FTP from
midgard.ucsc.edu (128.114.134.15) as pub/tr/ucsc-crl-89-04.tar.Z
A printed copy of the report can be obtained by sending $4 to:
Jean McKnight
Technical Librarian
Baskin Center for Computer Engineering & Information Sciences
Applied Sciences Building
University of California
Santa Cruz, CA 95064
Swift: A Storage Architecture for Large Objects
Luis-Felipe Cabrera
IBM Almaden Research Center
Darrell D. E. Long
University of California, Santa Cruz
ABSTRACT
Managing large objects with high data rate requirements is
difficult for current computing systems. The increasing
disparity between the fastest network transfer rate and the
fastest disk transfer rate requires resolution. We present an
architecture, called Swift, that addresses the problem of storing
and retrieving, at high data rates, large data objects from
slower secondary storage. Applications range from visualization
of scientific computations to real-time storage and retrieval of
color video.
Swift addresses this issue by exploiting the available
interconnection capacity and by using several slower storage
devices concurrently. We study the performance characteristics
of a local-area instance of Swift using a parametric simulation
model. We consider a system of high-performance workstations
connected to multiple storage agents by a high-speed local area
network. Our simulation shows Swift compares favorably with
other concurrent I/O architectures, such as disk arrays, in terms
of maximum aggregate data rate and resource requirements.
Keywords: high-performance storage systems, high data rates, disk
striping, high-speed networks, distributed systems, data
redundancy, server resiliency.
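
To make the idea concrete, here is a hedged sketch (not Swift's actual
interface) of striping a large object over several slower storage
agents and reading the stripe units back concurrently to aggregate
their bandwidth. The class, the stripe-unit size, and the in-memory
"agents" are all illustrative assumptions.

from concurrent.futures import ThreadPoolExecutor

STRIPE_UNIT = 64 * 1024  # bytes per stripe unit (an assumed value)

class StripedObject:
    def __init__(self, agents):
        self.agents = agents                 # storage agents (dict-like)

    def write(self, name, data):
        """Scatter stripe units round-robin across the agents."""
        for i in range(0, len(data), STRIPE_UNIT):
            unit_no = i // STRIPE_UNIT
            agent = self.agents[unit_no % len(self.agents)]
            agent[(name, unit_no)] = data[i:i + STRIPE_UNIT]

    def read(self, name, size):
        """Fetch all stripe units in parallel and reassemble."""
        n_units = -(-size // STRIPE_UNIT)    # ceiling division
        def fetch(u):
            return self.agents[u % len(self.agents)][(name, u)]
        with ThreadPoolExecutor(max_workers=len(self.agents)) as pool:
            units = list(pool.map(fetch, range(n_units)))
        return b"".join(units)

# Toy usage with in-memory "agents"; a real deployment would place each
# agent behind the high-speed network the abstract assumes.
agents = [dict() for _ in range(4)]
obj = StripedObject(agents)
payload = bytes(300 * 1024)
obj.write("frame-0", payload)
assert obj.read("frame-0", len(payload)) == payload
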
darrell@sequoia.ucsc.edu (Darrell Long) (04/20/91)

The following UCSC technical report (UCSC-CRL-91-08) is available via
anonymous FTP from midgard.ucsc.edu (128.114.14.6). The file is
pub/tr/ucsc-crl-91-08.ps.Z. Be sure to use binary mode when doing the
transfer.

Exploiting Multiple I/O Streams to Provide High Data-Rates

Luis-Felipe Cabrera
Computer Science Department
IBM Almaden Research Center
Internet: cabrera@ibm.com

Darrell D. E. Long
Computer & Information Sciences
University of California at Santa Cruz
Internet: darrell@sequoia.ucsc.edu

ABSTRACT

We present an I/O architecture, called Swift, that addresses the
problem of data-rate mismatches between the requirements of an
application, the maximum data-rate of the storage devices, and the
data-rate of the interconnection medium. The goal of Swift is to
support integrated continuous multimedia in general purpose
distributed systems. In installations with a high-speed
interconnection medium, Swift will provide high data-rate transfers by
using multiple slower storage devices in parallel. The data-rates
obtained with this approach scale well when using multiple storage
devices and multiple interconnections. Swift has the flexibility to
use any appropriate storage technology, including disk arrays. The
ability to adapt to technological advances will allow Swift to provide
for ever increasing I/O demands. To address the problem of partial
failures, Swift stores data redundantly.

Using the UNIX operating system, we have constructed a simplified
prototype of the Swift architecture. Using a single Ethernet-based
local-area network and three servers, the prototype provides
data-rates that are almost three times as fast as access to the local
SCSI disk in the case of writes. When compared with NFS, the Swift
prototype provides double the data-rate for reads and eight times the
data-rate for writes. The data-rate of our prototype scales almost
linearly in the number of servers and the number of network segments.
Its performance is shown to be limited by the speed of the
Ethernet-based local-area network.

We also constructed a simulation model to show how the Swift
architecture can exploit storage, communication, and processor
advances, and to locate the components that will limit I/O
performance. In a simulated gigabit/second token-ring local-area
network the data-rates are seen to scale proportionally to the size of
the transfer unit and to the number of storage agents.
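
The redundant storage mentioned above suggests a parity scheme in the
spirit of RAID; whether Swift uses exactly single-parity striping is
an assumption here, and the helper below is only a toy illustration of
how one lost stripe unit can be rebuilt from the survivors.

def xor_blocks(blocks):
    """XOR equal-length blocks together to form (or use) a parity block."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data_units = [b"unit-0--", b"unit-1--", b"unit-2--"]   # one stripe
parity = xor_blocks(data_units)

# If one storage agent fails, its unit is the XOR of the surviving
# units and the parity block.
recovered = xor_blocks([data_units[0], data_units[2], parity])
assert recovered == data_units[1]
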
darrell@terra.ucsc.edu (Darrell Long) (06/04/91)
The following technical report is available via anonymous FTP from
midgard.ucsc.edu and also through electronic mail.

For FTP:

    ftp> cd pub/tr
    ftp> binary
    ftp> get ucsc-crl-91-18.ps.Z
    ftp> quit

For electronic mail:

    % mail reports@midgard.ucsc.edu
    send ucsc-crl-91-18.ps.Z from tr
    ^D

ACCESSING REPLICATED DATA IN A LARGE-SCALE DISTRIBUTED SYSTEM

by Richard A. Golding

Many distributed applications use replicated data to improve the
availability of the data, and to improve access latency by locating
copies of the data near to their use. This thesis presents a new
family of communication protocols, called quorum multicasts, that
provide efficient communication services for replicated data. Quorum
multicasts are similar to ordinary multicasts, which deliver a message
to a set of destinations. The new protocols extend this model by
allowing delivery to a subset of the destinations, selected according
to distance or expected data currency. These protocols provide
well-defined failure semantics, and can distinguish between
communication failure and replica failure with high probability.

The thesis includes a performance evaluation of three quorum multicast
protocols. This required taking several measurements of the Internet
to determine distributions for communication latency and failure. The
results indicate that the behavior of recent messages is a useful
predictor for the performance of the next. A simulation study of
quorum multicasts, based on the Internet measurements, shows that
these protocols provide low latency and require few messages. A
second study that measured a test application running at several sites
confirmed these results.
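
As a hedged sketch of the idea (not one of the thesis's three
protocols), the function below contacts the nearest replicas first and
widens the destination set only if the quorum is not reached, which is
the essence of selecting a subset of destinations by distance. The
ranking, retry policy, and names are illustrative assumptions.

def quorum_multicast(replicas, quorum, send, max_rounds=3):
    """replicas: list of (expected_latency, replica_id) pairs.
    send(rid) -> bool models whether a replica replies.
    Returns the set of replicas that replied, or None on failure."""
    ranked = sorted(replicas)                  # nearest replicas first
    replied = set()
    tried = 0
    window = quorum                            # contact just enough
    for _ in range(max_rounds):
        # Contact the next `window` nearest replicas not yet tried.
        for _, rid in ranked[tried:tried + window]:
            if send(rid):
                replied.add(rid)
        tried += window
        if len(replied) >= quorum:
            return replied
        if tried >= len(ranked):
            break
        window = quorum - len(replied)         # expand just enough
    return None                                # quorum not reached

# Toy usage: five replicas, the two nearest are down.
down = {"r1", "r2"}
reps = [(10, "r1"), (12, "r2"), (40, "r3"), (55, "r4"), (70, "r5")]
print(quorum_multicast(reps, quorum=2, send=lambda r: r not in down))
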