[comp.archives] [comp.os.research] TR available: A Study of the Reliability of Internet Sites

darrell@sequoia.ucsc.edu (Darrell Long) (09/25/90)

Archive-name: internet-reliability/24-Sep-90
Original-posting-by: darrell@sequoia.ucsc.edu (Darrell Long)
Original-subject: TR available: A Study of the Reliability of Internet Sites
Archive-site: midgard.ucsc.edu [128.114.134.15]
Archive-directory: /pub/tr
Reposted-by: emv@math.lsa.umich.edu (Edward Vielmetti)


The following technical report is available for anonymous FTP from
midgard.ucsc.edu (128.114.134.15) as pub/tr/ucsc-crl-90-46.ps.Z.

A hardcopy can also be obtained by writing:

Jean McKnight
Technical Report Librarian
Baskin Center for Computer Engineering & Information Sciences
Applied Sciences Building
University of California
Santa Cruz, CA  95064

jean@cis.ucsc.edu

Please try to obtain an electronic copy if at all possible.  Also, please do
not ask Jean to e-mail you a copy of the PostScript file.  She does not have
time for a large number of e-mail requests.




		A Study of the Reliability of Internet Sites



		 D. D. E. Long                 J. L. Carroll, C. J. Park
	Computer & Information Sciences          Mathematical Sciences
	   University of California            San Diego State University
	     Santa Cruz, CA 95064                 San Diego, CA 92182

		(408) 459-2616               (619) 594-7242, (619) 594-6171

	     darrell@cis.ucsc.edu           carroll@sdsu.edu, cjpark@sdsu.edu


				  Abstract

	It is often assumed that the failure and repair rates of
	components are exponentially distributed.  This hypothesis is
	testable for failure rates, though the process of gathering
	the necessary data and reducing it to a usable form can be
	difficult.  While no amount of testing can prove that a
	sample is drawn from an exponential distribution, the
	hypothesis that a population distribution is exponential can
	in many cases be rejected with confidence.

	     For this study, data were collected from as many hosts
	as was feasible using only data that could be obtained via
	the Internet with no special privileges or added monitoring
	facilities.  The Internet was used to poll over 100,000
	hosts to determine the length of time that each had been up,
	and again polled after several months to determine average
	host availability.  A surprisingly rich collection of
	information was gathered in this fashion, allowing estimates
	of availability, mean-time-to-failure (MTTF) and
	mean-time-to-repair (MTTR) to be derived.  The measurements
	reported here correspond with common experience and certainly
	fall in the range of reasonable values.
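
The three quantities named above are tied together by the standard
steady-state relation A = MTTF / (MTTF + MTTR).  The following is a
minimal sketch of that relation only; it is not taken from the report,
and the function names and the sample figures (15 days up, 1 day down)
are hypothetical values chosen for illustration.

```python
def availability(mttf, mttr):
    """Steady-state availability A = MTTF / (MTTF + MTTR).

    Both arguments must be in the same time unit (e.g. days).
    """
    if mttf < 0 or mttr < 0 or mttf + mttr == 0:
        raise ValueError("MTTF and MTTR must be non-negative, not both zero")
    return mttf / (mttf + mttr)


def mttr_from_availability(mttf, avail):
    """Solve the same relation for MTTR: MTTR = MTTF * (1 - A) / A.

    Useful when availability is measured directly (e.g. the fraction of
    hosts answering a poll) and MTTF is estimated separately.
    """
    if not 0 < avail <= 1:
        raise ValueError("availability must be in (0, 1]")
    return mttf * (1.0 - avail) / avail


if __name__ == "__main__":
    # Hypothetical figures, not measurements from the report.
    a = availability(15.0, 1.0)
    print("availability:", a)
    print("implied MTTR:", mttr_from_availability(15.0, a))
```

Solving for MTTR in this way assumes the host population is in steady
state; whether the report derives MTTR like this is not stated in the
abstract.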

	     By applying an appropriate test statistic, some of the
	samples were found to have a realistic chance of being drawn
	from an exponential distribution, while others can be
	confidently classed as non-exponential.  With very large sample
	sizes, sufficient evidence could be accumulated to reject
	the exponential hypothesis.  However, for moderately-sized
	samples, it was often not possible to exhibit the deviation
	from exponentiality, lending credence to the common practice
	of assuming that MTTF is exponentially distributed.
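
The abstract does not name the test statistic used.  As one possible
illustration of testing a sample for exponentiality, the sketch below
uses a Kolmogorov-Smirnov distance with the mean estimated from the
sample, and a Monte Carlo p-value (a Lilliefors-style procedure).  This
is a swapped-in technique, not the report's method; the function names,
trial count and seed are assumptions made for the example.

```python
import math
import random


def ks_statistic_exponential(sample):
    """KS distance between the empirical CDF of `sample` and an
    exponential CDF whose mean is estimated from the same sample."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    mean = sum(xs) / n
    if mean <= 0:
        raise ValueError("sample mean must be positive")
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x / mean)
        # Compare against the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def exponentiality_p_value(sample, trials=500, seed=0):
    """Monte Carlo p-value for the hypothesis 'sample is exponential'.

    Because the mean is re-estimated for every simulated sample, the
    statistic does not depend on scale, so simulating with rate 1 is
    sufficient.
    """
    rng = random.Random(seed)
    n = len(sample)
    observed = ks_statistic_exponential(sample)
    hits = 0
    for _ in range(trials):
        simulated = [rng.expovariate(1.0) for _ in range(n)]
        if ks_statistic_exponential(simulated) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic "times to failure", not data from the report.
    exp_like = [rng.expovariate(1 / 15.0) for _ in range(200)]
    narrow = [rng.uniform(9.0, 11.0) for _ in range(200)]
    print("exponential sample p-value:", exponentiality_p_value(exp_like))
    print("narrow uniform sample p-value:", exponentiality_p_value(narrow))
```

A small p-value is evidence against the exponential hypothesis; a large
one only means the sample gave no grounds to reject it, which matches
the caution expressed in the first paragraph of the abstract.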