[comp.benchmarks] Measuring disk read times for Unix

pfeiffer@nmsu.edu (Joe Pfeiffer) (02/28/91)

I have a need to measure transfer rates and system time required for
disk reads and writes under Unix, for modelling purposes.  The data
must be parameterized by block size.

In order to measure write times, I'm using the O_SYNC flag to flush my data
to disk on each write.  So far so good.
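
For concreteness, the write side of such a test might look roughly like the
sketch below; the file name, block count, and the gettimeofday() timing are
illustrative only, not anyone's actual code.

    #include <sys/time.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        size_t  blksize = 8192;         /* block size under test */
        int     nblocks = 1024;         /* number of writes to time */
        char   *buf = malloc(blksize);
        struct timeval t0, t1;
        double  secs;
        int     fd, i;

        if (buf == NULL)
            exit(1);
        memset(buf, 0xAA, blksize);

        /* O_SYNC makes each write() reach the disk before returning. */
        fd = open("testfile", O_WRONLY | O_CREAT | O_SYNC, 0644);
        if (fd < 0) { perror("open"); exit(1); }

        gettimeofday(&t0, NULL);
        for (i = 0; i < nblocks; i++)
            if (write(fd, buf, blksize) != (ssize_t)blksize) {
                perror("write");
                exit(1);
            }
        gettimeofday(&t1, NULL);

        secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        printf("%d writes of %lu bytes: %.3f sec, %.1f Kbytes/sec\n",
               nblocks, (unsigned long)blksize, secs,
               nblocks * (blksize / 1024.0) / secs);
        close(fd);
        return 0;
    }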

I'm having trouble trying to do the equivalent for reads.  I appear to
be filling up the OS's disk buffers on the first pass through the
test, after which reads come from there.  While an obvious solution of
making the test file so huge it won't all fit in there would help, it
wouldn't be a complete solution.

Any thoughts?

-Joe.

taylor@intellistor.com (Dick Taylor) (02/28/91)

In article <714@opus.NMSU.Edu> pfeiffer@nmsu.edu (Joe Pfeiffer) writes:
>...
>In order to measure write times, I'm using the O_SYNC to flush my data
>to disk on each write.  So far so good.
>
>I'm having trouble trying to do the equivalent for reads.  I appear to
>be filling up the OS's disk buffers on the first pass through the
>test, after which reads come from there.  While an obvious solution of
>making the test file so huge it won't all fit in there would help, it
>wouldn't be a complete solution.
>
>Any thoughts?

There are a number of ways to do it.  The niftiest that I ever found was
to use the mount() and umount() system calls to unmount the filesystem
between each iteration of the benchmark.  One warning, though -- this
stresses those two system calls a lot more heavily than anything else
they're used for.  In some versions of SunOS, in particular, there
are interactions with the update daemon that can crash the system
quite spectacularly (this appears to be fixed in SunOS 4.1 and in
Solbourne's OS/SMP-4.0D).  Other ways to do this basically involve flushing
large amounts of data through the filesystem buffers and hoping that
the unwanted data disappears.  Time-consuming and nasty, if you ask me.
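
As an illustration of that approach (a sketch only, not Dick's actual
harness): the device name, mount point, and test file below are placeholders,
and since the mount(2)/umount(2) syscall arguments differ from system to
system, the sketch simply shells out to the mount and umount commands.

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Throw away cached blocks for the test filesystem by unmounting and
     * remounting it.  Must run as root; /dev/dsk0 and /mnt are placeholders. */
    static void flush_cache(void)
    {
        if (system("umount /mnt") != 0 ||
            system("mount /dev/dsk0 /mnt") != 0) {
            fprintf(stderr, "could not cycle the mount\n");
            exit(1);
        }
    }

    int main(void)
    {
        char buf[8192];
        int  iter, fd;

        for (iter = 0; iter < 10; iter++) {
            flush_cache();              /* every pass starts with a cold cache */

            fd = open("/mnt/testfile", O_RDONLY);
            if (fd < 0) { perror("open"); exit(1); }

            /* ... start timer, read the whole file, stop timer ... */
            while (read(fd, buf, sizeof buf) > 0)
                ;
            close(fd);
        }
        return 0;
    }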

			-- Dick Taylor
			   Intellistor, Inc.

jsadler@misty.boeing.com (Jim Sadler) (02/28/91)

>/ misty:comp.benchmarks / taylor@intellistor.com (Dick Taylor) /  3:30 pm  Feb 27, 1991 /
>In article <714@opus.NMSU.Edu> pfeiffer@nmsu.edu (Joe Pfeiffer) writes:
>>...
>>In order to measure write times, I'm using the O_SYNC to flush my data
>>to disk on each write.  So far so good.
>>
>>I'm having trouble trying to do the equivalent for reads.  I appear to
	.
	.
	.
>
>There are a number of ways to do it.  The niftiest that I ever found was
>to use the mount() and unmount() system calls to unmount the filesystem
>between each iteration of the benchmark.  One warning, though -- this
	.
	.
	.
>			-- Dick Taylor
>			   Intellistor, Inc.
Would it be possible to regen the kernel with the buffer set to the lowest
value, for the purpose of this type of test?  Then you wouldn't have to
pass as much data through to clean out the buffer, or incur the unwanted
overhead of the mount()/umount() way.  Of course, if the overhead of
cleaning out the buffer is the same or greater than umount()/mount(), it's
not such a great idea.

jim

taylor@intellistor.com (Dick Taylor) (03/01/91)

In article <23410002@misty.boeing.com> jsadler@misty.boeing.com (Jim Sadler) writes:
>>/ misty:comp.benchmarks / taylor@intellistor.com (Dick Taylor) /  3:30 pm  Feb 27, 1991 /
>>In article <714@opus.NMSU.Edu> pfeiffer@nmsu.edu (Joe Pfeiffer) writes:
>>>...
>>>In order to measure write times, I'm using the O_SYNC to flush my data
>>>to disk on each write.  So far so good.
>>>
>>
>>There are a number of ways to do it.  The niftiest that I ever found was
>>to use the mount() and unmount() system calls to unmount the filesystem
>>between each iteration of the benchmark.  One warning, though -- this
>	.
>Would it be possible to regen the kernel with the buffer set to the lowest
>value, for the purpose of this type of test?  Then you wouldn't have to
>pass as much data through to clean out the buffer, or incur the unwanted
>overhead of the mount()/umount() way.  Of course, if the overhead of
>cleaning out the buffer is the same or greater than umount()/mount(), it's
>not such a great idea.
>
>jim

As it turns out, mount/unmount is quite quick (relative to typical benchmark
times, that is -- it's well under a second).  If you do your own timing (that
is, if you avoid crutches like /bin/time), there's no impact on program
performance and the impact on run time is minimal.  Any kernel hacking has the
nasty side effect that changing the buffer size can change your results, and
it's tough to shrink the buffers down to a size that can be cleared faster
than the unmount/mount can take place.  (Not to mention the fact that it's
kind of a trick to change this size at all under many memory-mapped file
systems).  What I've always really wanted is a syscall to invalidate either
the whole system cache or a piece of it qualified by file or directory.  I've
always thought it was ironic that system designers wouldn't be caught dead
without a way to invalidate the cache on, say, a processor, but they never
bother to extend the same feature to their users.  It's not terrible or
heinous or anything, just ironic.
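
A small sketch of the do-your-own-timing point: keep the umount/mount outside
the timed interval, so only the reads show up in the measurement.
flush_cache() and read_pass() are hypothetical helpers standing in for the
operations discussed above (there being no cache-invalidation syscall, the
mount cycle does that job).

    #include <sys/time.h>

    extern void flush_cache(void);   /* umount + remount, not timed */
    extern void read_pass(void);     /* the read loop being measured */

    double timed_pass(void)
    {
        struct timeval t0, t1;

        flush_cache();               /* cache-clearing cost stays off the clock */
        gettimeofday(&t0, NULL);
        read_pass();                 /* only this interval is reported */
        gettimeofday(&t1, NULL);

        return (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    }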

			-- Dick Taylor

craig@bacchus.esa.oz.au (Craig Macbride) (03/01/91)

In <714@opus.NMSU.Edu> pfeiffer@nmsu.edu (Joe Pfeiffer) writes:

>I have a need to measure transfer rates and system time required for
>disk reads and writes under Unix, for modelling purposes.  The data
>must be parameterized by block size.

>I'm having trouble trying to do the equivalent for reads.  I appear to
>be filling up the OS's disk buffers on the first pass through the
>test, after which reads come from there.  While an obvious solution of
>making the test file so huge it won't all fit in there would help, it
>wouldn't be a complete solution.

Well, you could always reduce the disk buffers to a very small number, so that
the O/S is forced to flush things a lot.  Of course, this (or making the files
very large) creates a rather artificial situation, which may not be much like
the one you are actually trying to model.  Alternatively, read a number of
different files instead of reading the same one over and over.  Better still,
read from the same or different files in the proportions you'd expect for
whatever situation it is you are modelling.
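
A sketch of that mixed-file idea; the file names and the 70/20/10 access mix
below are invented placeholders, to be replaced by whatever proportions the
modelled workload calls for.

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>

    static const char *files[]  = { "data.big", "data.medium", "data.small" };
    static const int   weight[] = { 70, 20, 10 };   /* access mix, in percent */

    /* Pick a file at random according to the weights above. */
    static const char *pick_file(void)
    {
        int r = rand() % 100, i;

        for (i = 0; i < 3; i++) {
            if (r < weight[i])
                return files[i];
            r -= weight[i];
        }
        return files[0];
    }

    int main(void)
    {
        char buf[8192];
        int  pass, fd;

        srand(1);                       /* fixed seed keeps runs repeatable */
        for (pass = 0; pass < 1000; pass++) {
            fd = open(pick_file(), O_RDONLY);
            if (fd < 0) { perror("open"); exit(1); }
            while (read(fd, buf, sizeof buf) > 0)
                ;                       /* these reads are what gets timed */
            close(fd);
        }
        return 0;
    }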

-- 
 _____________________________________________________________________________
| Craig Macbride, craig@bacchus.esa.oz.au      | Hardware:                    |
|                                              |      The parts of a computer |
|   Expert Solutions Australia                 |        which you can kick!   | 

suitti@ima.isc.com (Stephen Uitti) (03/01/91)

In article <1991Feb28.210620.21205@intellistor.com> taylor@intellistor.com (Dick Taylor) writes:
>In article <23410002@misty.boeing.com> jsadler@misty.boeing.com (Jim Sadler) writes:
>>>/ misty:comp.benchmarks / taylor@intellistor.com (Dick Taylor) /  3:30 pm  Feb 27, 1991 /
>>>In article <714@opus.NMSU.Edu> pfeiffer@nmsu.edu (Joe Pfeiffer) writes:
>>>>...
>>>>In order to measure write times, I'm using the O_SYNC to flush my data
>>>>to disk on each write.  So far so good.
>>>>
>>>There are a number of ways to do it.  The niftiest that I ever found was
>>>to use the mount() and unmount() system calls to unmount the filesystem
>>>between each iteration of the benchmark.  One warning, though -- this
>>	.
>>Would it be possible to regen the kernel with the buffer set to the lowest
>>value, for the purpose of this type of test?
>>
>As it turns out, mount/unmount is quite quick (relative to typical benchmark
>times, that is -- it's well under a second).  

Not on all systems.  Some systems convert free lists to bitmaps
on mount and back on umount.  For large filesystems with lots of
free space, this can take time.

On some systems, the free list, and therefore the file block order, is
randomized by nearly any action.  So, to improve repeatability, one such
test ran mkfs or newfs between runs.

Maybe reading & writing separate files would be better.

Maybe accessing the raw disk would give more useful data.
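
A sketch of the raw-device idea; the device name is a placeholder (raw device
naming varies from system to system), reads generally have to be a multiple
of the sector size, some systems also want the buffer aligned, and root
privilege is usually needed.

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>

    #define BLKSIZE 8192              /* must be a multiple of the sector size */

    static char buf[BLKSIZE];         /* some systems also require alignment */

    int main(void)
    {
        int fd, i;

        /* Reads on the character ("raw") device bypass the buffer cache. */
        fd = open("/dev/rXX0c", O_RDONLY);      /* placeholder raw device name */
        if (fd < 0) { perror("open raw device"); exit(1); }

        /* ... start timer ... */
        for (i = 0; i < 1024; i++)
            if (read(fd, buf, BLKSIZE) != BLKSIZE) {
                perror("read");
                exit(1);
            }
        /* ... stop timer; throughput = 1024 * BLKSIZE / elapsed time ... */

        close(fd);
        return 0;
    }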

Caching, read-ahead, and write-behind are very important to real
throughput.  Disabling the buffer cache (if there is a statically
allocated one) may cause a good OS design to thrash.

Stephen.
suitti@ima.isc.com
"We Americans want peace, and it is now evident that we must be
prepared to demand it.  For other peoples have wanted peace, and
the peace they received was the peace of death." - the Most Rev.
Francis J. Spellman, Archbishop of New York.  22 September, 1940

Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) (03/05/91)

In article <714@opus.NMSU.Edu> pfeiffer@nmsu.edu (Joe Pfeiffer) writes:
Joe>In order to measure write times, I'm using the O_SYNC to flush my data
Joe>to disk on each write.  So far so good.
Joe>
Joe>I'm having trouble trying to do the equivalent for reads...

>>>>> On 27 Feb 91 23:30:49 GMT, taylor@intellistor.com (Dick Taylor) said:
Dick> There are a number of ways to do it.  The niftiest that I ever found was
Dick> to use the mount() and unmount() system calls to unmount the filesystem
Dick> between each iteration of the benchmark.

On the nose, Dick.  SVr4 and SunOS use a lazy pageout method; a file mapped
into memory is paged out (LRU) only if there is a request for more memory
_and_ there is no unallocated RAM (and on umount, of course).  Theoretically,
a system with lots of excess RAM may leave a file paged into RAM
indefinitely.  I suspect the same is true of OSF/1.  Anyone in the know
care to comment?

"Sticky bits?  We don't need no stinkin' sticky bits!" -- M.E.  :-)

	Cheers,
--
Chuck Phillips  MS440
NCR Microelectronics 			chuck.phillips%ftcollins.ncr.com
2001 Danfield Ct.
Ft. Collins, CO.  80525   		...uunet!ncrlnk!ncr-mpd!chuck.phillips

jonathan@cs.pitt.edu (Jonathan Eunice) (03/19/91)

Chuck.Phillips@FtCollins.NCR.COM writes:

   SVr4 and SunOS use a lazy pageout method; a file mapped
   into memory is mapped out (LRU) only if there is a request for more memory
   _and_ there is no unallocated RAM (and umounts, of course).  Theoretically,
   a system with lots of excess RAM may leave a file paged into RAM
   indefinitely.  I suspect the same is true of OSF/1.  Anyone in the know
   care to comment?

I believe this is true of SunOS 4.x, SVr4, OSF/1, and AIX 3.1.