[comp.os.research] File system accesses

lm@snafu.Eng.Sun.COM (Larry McVoy) (01/12/90)

I'm looking for a recent paper that can give me some idea of how long
files live.  In particular, suppose I were to delay writes for N
seconds.  How many writes would never go to disk?

I think Ousterhout has such a paper; does anyone know for sure?
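
One way to make the question concrete is to replay a write trace and count
the bytes that are deleted or overwritten within N seconds of being written.
The following is a minimal sketch, not taken from any of the cited papers.
The WriteEvent record layout, the fraction_avoided helper, and the sample
numbers are all hypothetical and exist only for illustration.

```python
from typing import Iterable, NamedTuple, Optional


class WriteEvent(NamedTuple):
    """One write in a hypothetical trace (this layout is an assumption)."""
    write_time: float            # seconds since the start of the trace
    nbytes: int                  # number of bytes written
    death_time: Optional[float]  # when the data was deleted or overwritten;
                                 # None if it was still live at end of trace


def fraction_avoided(trace: Iterable[WriteEvent], delay: float) -> float:
    """Fraction of written bytes that die less than `delay` seconds after
    being written, and so would never reach disk if writes were delayed."""
    total = 0
    avoided = 0
    for ev in trace:
        total += ev.nbytes
        if ev.death_time is not None and ev.death_time - ev.write_time < delay:
            avoided += ev.nbytes
    return avoided / total if total else 0.0


# Made-up data, for illustration only.
trace = [
    WriteEvent(0.0, 4096, 2.0),    # temporary file, deleted after 2 s
    WriteEvent(1.0, 8192, 45.0),   # overwritten 44 s after being written
    WriteEvent(5.0, 4096, None),   # still live when the trace ends
]

for n in (1, 30, 60):
    print(n, fraction_avoided(trace, n))
```

Running this over a real trace for a range of N would give the curve the
question asks about.  The sketch counts bytes rather than write calls, and
it treats data that outlives the trace as always reaching disk.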

[ Here's the Ousterhout citation; I'd be interested in others.  --DL ]

@INPROCEEDINGS(ouster:bsd,
	AUTHOR		= "Ousterhout, J. and H. {Da~Costa} and D. Harrison
			and J. Kunze and M. Kupfer and J. Thompson",
	TITLE		= "A Trace-Driven Analysis of the {UNIX}
			4.2 {BSD} File System",
	BOOKTITLE	= "Proceedings of the $10^{\rm th}$ {S}ymposium on
			{O}perating {S}ystems {P}rinciples",
	ADDRESS		= "Orcas Island, Washington",
	YEAR		= 1985,
	MONTH		= dec,
	ORGANIZATION	= "ACM",
	PAGES		= "15--24"
)

---
What I say is my opinion.  I am not paid to speak for Sun, I'm paid to hack.

Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com

siegel@cs.cornell.edu (Alexander Siegel) (01/15/90)

In article <21155@pasteur.Berkeley.EDU> lm@snafu.Eng.Sun.COM (Larry McVoy) writes:
>
>I'm looking for a recent paper that can give me some idea of how long
>files live.  In particular, suppose I were to delay writes for N 
>seconds.  How many writes would never go to disk.  
>
>I think Ousterhout has such a paper, anyone know for sure?

Here is a list of four such papers that I know about, in BibTeX format.
The last one may be very useful to you.

@inproceedings{ochkkt:refpat,
  author="John K. Ousterhout and Herve Da Costa and David Harrison and
John A. Kunze and Mike Kupfer and James G. Thompson",
  title="{A} {T}race-{D}riven {A}nalysis of the {UNIX} 4.2 {BSD}
{F}ile {S}ystem",
  booktitle="Proceedings of the Tenth ACM Symposium on
Operating Systems Principles",
  year=1985,
  pages="15--24",
  organization="ACM",
  month=dec,
  note="Order no. 534850"
}

@techreport{rf:refpat,
  title="{S}hort-{T}erm {F}ile {R}eference {P}atterns in a {UNIX}
{E}nvironment",
  author="Rick Floyd",
  year=1986,
  month=mar,
  institution="University of Rochester",
  number=177
}

@techreport{rf:dirrefpat,
  title="{D}irectory {R}eference {P}atterns in a {UNIX} {E}nvironment",
  author="Rick Floyd",
  year=1986,
  month=aug,
  institution="University of Rochester",
  number=179
}

@techreport{cs:filepatt,
  title="{F}ile {A}ccess {P}atterns",
  author="Carl Staelin",
  institution="Department of Computer Science at Princeton
University",
  year="1988",
  month=sep,
  number="CS-TR-179-88"
}
-- 
Alex Siegel - CS graduate drudge at Cornell
a.k.a. Scimitar;  a.k.a. Phineas Ginn (SCA);  a.k.a. Trash
siegel@cs.cornell.edu   (607)255-1165

carla@helianthus.cs.duke.edu (Carla Ellis) (01/16/90)

>I'm looking for a recent paper that can give me some idea of how long
>files live.  In particular, suppose I were to delay writes for N
>seconds.  How many writes would never go to disk.

Many of the results in the tech report by Rick Floyd 
(cited in an earlier posting from Alexander Siegel)
have appeared recently in a journal article.

@article{fe:dirrefpat,
  title="Directory Reference Patterns in Hierarchical File Systems",
  author="Richard Floyd and Carla Ellis",
  year=1989,
  month=jun,
  journal="IEEE Transactions on Knowledge and Data Engineering",
  volume="1",
  number="2",
  pages="238--247"
  }

guy@CS.UCLA.EDU (Richard Guy) (01/17/90)

Three papers besides Ousterhout's come to mind:

	Kure, Oivind. "Optimization of File Migration in Distributed Systems,"
			UC Berkeley Tech Report UCB/CSD 88/413, April, 1988.
			[Dissertation; 203 pages; ~100 references; uses same
			trace data as Smith's disk cache paper in TOCS 3:3]

	Floyd, Rick.  "Short-Term File Reference Patterns in a UNIX
			Environment," Univ of Rochester, CSD Tech Report TR 177,
			March, 1986.
			[79 pages; stratified by file type, user]

	Floyd, Rick.  "Directory Reference Patterns in a UNIX
			Environment," Univ of Rochester, CSD Tech Report TR 179,
			August, 1986.
			[79 pages; stratified by file type, user]

All three should be readily available from their originating departments.
Kure's work is based on commercial IBM 360/370 systems; Floyd's work is
from an academic 4.2BSD in a CSD environment.

richard
----------
Richard Guy
UCLA Computer Science Department
guy@cs.ucla.edu

verber@pacific.mps.ohio-state.edu (Mark A. Verber) (01/17/90)

@techreport{ filelife ,
Author	= "M. Satyanarayanan",
Title	= "A Study of File Sizes and Lifetimes",
Institution = "Carnegie-Mellon University",
Year	= "1981",
Number	= "CMU-CS-81-114",
Note	= "Study done on TOPS-10 system.  CMU10A?",
Abstract= "

An investigation of the size and lifetime properties of files on the
primary computing facility in the CSD at CMU is presented in this
paper.  Three key issues are examined: the effect of migration on file
characteristics, the effect of file type on file characteristics, and
the correlation between file sizes and lifetimes.  Analytical models
that fit observed data are derived using two alternative techniques."
}
-- 
Mark Verber
System Programmer, Physics Department, Ohio State University
verber@mps.ohio-state.edu, verber@solutions.com, verber@ohstpy.bitnet
(614) 292-8002