[comp.unix.internals] What does sync

steinar@ifi.uio.no (Steinar Kj{rnsr|d) (12/13/90)

The subject and the question posed may seem trivial, but I have so far
failed to find the answer (I browsed through the 4.3 book by McKusick, Karels
and Quarterman; references to pertinent pages are welcome). The question
arose when a disk vendor presented results from a benchmark whose purpose
was to measure read/write transfer rates for his drive. His scenario
was this:

 - a stand alone BSD (SunOS 4.0.3 I think) box in single user mode
 - no other disk activity in the system

The test program looked something like this:

 - <write a HUUUUGE file and measure the write transfer rate>
 - sync(); sync(); sync();
 - <read the same file back again and measure the read transfer rate>
 
Both the read and the write of the file use normal (asynchronous) I/O
operations, which therefore involve the buffer cache. The disk vendor's
assumption is that the three sync() calls will guarantee that the read pass
of the test will read data off the media, not from the buffer cache, while
I say that although the sync() calls force dirty pages in the buffer cache
to be written to the disk, you have no guarantee that those pages will also
be wiped out from the cache. I find this especially true since there is no
other disk activity in the system while the test is running. What would be
the purpose of really clearing the cache when you have nothing to replace
the cleared pages with?
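
For readers who want to try the test described here, the following is a
minimal C sketch of the two timed passes. The vendor's actual program was
not posted, so the 32 KB transfer size and the names write_pass() and
read_pass() are assumptions made for illustration. Nothing in it evicts the
file from the buffer cache, which is exactly the point in dispute.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK (32 * 1024)   /* assumed transfer size per call */

/* Wall-clock time in seconds. */
static double now(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

/* Write nchunks * CHUNK bytes to path through the normal (cached)
 * write path.  Returns the bytes written, or -1 on error; the
 * elapsed time is stored in *secs. */
long write_pass(const char *path, int nchunks, double *secs)
{
    static char buf[CHUNK];
    long total = 0;
    double t0;
    int i, fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    memset(buf, 'x', sizeof buf);
    t0 = now();
    for (i = 0; i < nchunks; i++) {
        ssize_t n = write(fd, buf, sizeof buf);

        if (n < 0) {
            close(fd);
            return -1;
        }
        total += n;
    }
    close(fd);
    *secs = now() - t0;
    return total;
}

/* Read path back to end of file.  Returns the bytes read, or -1 on
 * error; the elapsed time is stored in *secs.  The data may come
 * from memory rather than from the disk. */
long read_pass(const char *path, double *secs)
{
    static char buf[CHUNK];
    long total = 0;
    ssize_t n;
    double t0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    t0 = now();
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    *secs = now() - t0;
    return n < 0 ? -1 : total;
}
```

A caller would run write_pass(), then sync(), then read_pass(), and divide
bytes by seconds for each pass. If the read rate comes out far above what
the drive can physically deliver, the read pass was served from the cache.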

Who is right?


 +==================================================================+
 !                                                                  !
 !      Steinar Kjaernsroed, 					    !
 !	Dpt. of Informatics,					    !
 !      University of Oslo,					    !
 !      P.O.Box 1080, Blindern,					    !
 !	0316, Oslo 3, 						    !
 !	NORWAY							    !
 !                                                                  !
 !      Phone:             047+2+453460 (work), 		    !
 !      Email:             steinar@ifi.uio.no      (Internet)       !
 !                         ..!mcsun!ifi!steinar 	(UUCP)	    !
 !          							    !
 +==================================================================+

 

heiby@mcdchg.chg.mcd.mot.com (Ron Heiby) (12/14/90)

steinar@ifi.uio.no (Steinar Kj{rnsr|d) writes:

>The disk vendor's assumption is that
>the three sync() calls will guarantee that the read pass of the test will
>read data  off the media, not from the buffer cache, while I say that
>although the sync() calls force dirty pages in the buffer cache to be written
>to the disk, you have no guarantee that those pages also will be wiped out
>from the cache.

>Who is right?

My experience is with System V derived releases, but I find it hard to
believe that BSD would be different in this.  You are right.  All a
sync(2) call does is queue all the dirty pages to be written to the
disk.  As far as I know, there is no guarantee that they actually
*have* been written when the sync(2) call returns, just that they've
all been *queued* to be written.
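
A toy model of that behaviour, assuming sync() only hands dirty buffers to
the disk driver and then returns. The structure and function names are made
up for illustration; this is not how any particular kernel is organised.

```c
#include <stddef.h>

struct wbuf {
    int dirty;    /* modified in memory, not yet handed to the driver */
    int queued;   /* handed to the driver, write not yet finished */
};

/* Queue every dirty buffer for writing and return at once, without
 * waiting for the disk.  Returns the number of buffers queued. */
int queue_sync(struct wbuf *b, size_t n)
{
    size_t i;
    int queued = 0;

    for (i = 0; i < n; i++) {
        if (b[i].dirty) {
            b[i].dirty = 0;
            b[i].queued = 1;
            queued++;
        }
    }
    return queued;
}

/* The "disk" completes at most one queued write per call.
 * Returns the number of writes still outstanding afterwards. */
int disk_tick(struct wbuf *b, size_t n)
{
    size_t i;
    int pending = 0, done = 0;

    for (i = 0; i < n; i++) {
        if (b[i].queued) {
            if (!done) {
                b[i].queued = 0;   /* this write has reached the disk */
                done = 1;
            } else {
                pending++;
            }
        }
    }
    return pending;
}
```

In this model a second queue_sync() called straight after the first finds
nothing to do and returns immediately, while disk_tick() still reports
outstanding writes. Only elapsed time drains the queue.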

Also, there's no reason to have a line of the form
"sync();sync();sync()" in a C program, just as there is no reason to
have a line of the form "sync;sync;sync" in a shell script.  One is
quite sufficient.  The reason one sees three is that the way to shut
down earlier UNIX systems was to change to single-user mode, then (at
single user prompt) type sync three times.  This was not done as:
	# sync;sync;sync
(except by people who didn't understand).  It was done as:
	# sync
	# sync
	# sync
The theory was that by the time the operator typed the third command,
the disk blocks queued for writing by the first had had sufficient
time to actually be written to the disk.  In the former case, with all
three on a single line, there's really no time delay between the first
"sync" and the third, so no chance for the disk blocks to be written.
What saved most people doing it that way was that it took them a
couple of seconds to get up and hit the reset or power switch.
-- 
Ron Heiby, heiby@chg.mcd.mot.com	Moderator: comp.newprod
"Give me voice mail or give me drugs!"/"Mandatory Drug Testing? Just Say NO!!!"

ables@lot.ACA.MCC.COM (King Ables) (12/15/90)

From article <52328@mcdchg.chg.mcd.mot.com>, by heiby@mcdchg.chg.mcd.mot.com (Ron Heiby):
> steinar@ifi.uio.no (Steinar Kj{rnsr|d) writes:
> 
>>The disk vendor's assumption is that
>>the three sync() calls will guarantee that the read pass of the test will
>>read data  off the media, not from the buffer cache, while I say that
>>although the sync() calls force dirty pages in the buffer cache to be written
>>to the disk, you have no guarantee that those pages also will be wiped out
>>from the cache.
> 
> The theory was that by the time the operator typed the third command,
> the disk blocks queued for writing by the first had had sufficient
> time to actually be written to the disk.  In the former case, with all
> three on a single line, there's really no time delay between the first
> "sync" and the third, so no chance for the disk blocks to be written.
> What saved most people doing it that way was that it took them a
> couple of seconds to get up and hit the reset or power switch.

The way I understood it was that when sync() got called, it
put a request in a sync queue to have the kernel flush out
the unwritten disk blocks and then returned.  When a second sync()
call was made, it added its request to the queue, and so on.

The secret was supposed to be that the queue had only two slots
in it, thus a 3rd sync() call couldn't return having successfully
put its request in the queue until the first sync request had
been processed and removed from the queue.  So when the 3rd
one returned, you knew the first one had actually happened.

I never actually looked at the code, but it was the most acceptable
explanation I ever heard.  It was also from a source I didn't have
cause to doubt, but I don't remember where I picked it up now.

For what it's worth... hmmm... where is sync.c, anyway...  ;-)

-----------------------------------------------------------------------------
King Ables                    Micro Electronics and Computer Technology Corp.
ables@mcc.com                 3500 W. Balcones Center Drive
+1 512 338 3749               Austin, TX  78759
-----------------------------------------------------------------------------
We don't inherit the Earth from our parents, we borrow it from our children.

jfh@rpp386.cactus.org (John F Haugh II) (12/15/90)

In article <1635@lot.ACA.MCC.COM> ables@lot.ACA.MCC.COM (King Ables) writes:
>The way I understood it was that when sync() got called, it
>put a request in a sync queue to have the kernel flush out
>the unwritten disk blocks and then returned.  When a second sync()
>call was made, it added its request to the queue, and so on.

Come on, you worked at IBM for a while, you should have checked
this one out.  Tho, I must confess this is one of the most
interesting theories I've ever heard.

>The secret was supposed to be that the queue had only two slots
>in it, thus a 3rd sync() call couldn't return having successfully
>put its request in the queue until the first sync request had
>been processed and removed from the queue.  So when the 3rd
>one returned, you knew the first one had actually happened.

The way it was explained to me was that all the disk blocks would
be put on the device queues and scheduled for scribbling out.
Once the blocks were sorted, any new requests would be added to
the end of the device queue.  Any further requests for disk I/O
would be on the end of the queue and not satisfied until that
sync was completed.  This ignored the fact that the sync command
should still be in the buffer cache ...

Ron Heiby is probably most accurate in that it was the time
needed to type three sync's in a row that saved most people (and
that it was the time needed to walk over to the Big Red Switch
that saved the rest).  The PDP-11/45 I learned on had blinky
lights - when they stopped blinking I'd push down the HALT switch
and cut the juice (except one time when the console switch was in
the wrong position and I had to call someone and explain that the
HALT switch didn't ...).  Now you can just listen for disk clatter,
if you are unfortunate enough to have noisy disks.  I would feel
my RISC System/6000's cabinet to tell when the disks had stopped
moving.

>For what it's worth... hmmm... where is sync.c, anyway...  ;-)

On most System V's it's in sys4.c, with the rest of the bizarre
cruft, like nice(), getuid(), etc.  You can figure out where
the system calls are if you have the kernel libraries.  They
normally live in something like /usr/src/uts/{bzzt}/lib[0-9].a
and can be examined with nm and a good pager.  There is also the
"i" command to ADB for "examining" things. ;-)  SCO Xenix (which
I use at home) puts them in /usr/sys/*/lib*.a, and sys4.o is in
/usr/sys/sys/libsys.a.  The contents of sys[123].c are left to
your imagination, which if suitably warped is probably correct.

Picking up a copy of Bach helps, too.
-- 
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                           Domain: jfh@rpp386.cactus.org
"While you are here, your wives and girlfriends are dating handsome American
 movie and TV stars. Stars like Tom Selleck, Bruce Willis, and Bart Simpson."

jim@segue.segue.com (Jim Balter) (12/16/90)

In article <1635@lot.ACA.MCC.COM> ables@lot.ACA.MCC.COM (King Ables) writes:
>The way I understood it was that when sync() got called, it
>put a request in a sync queue to have the kernel flush out
>the unwritten disk blocks and then returned.  When a second sync()
>call was made, it added its request to the queue, and so on.
>
>The secret was supposed to be that the queue had only two slots
>in it, thus a 3rd sync() call couldn't return having successfully
>put its request in the queue until the first sync request had
>been processed and removed from the queue.  So when the 3rd
>one returned, you knew the first one had actually happened.

If one insists on believing that people are being rational in typing sync
three times, that's a great hypothesis.  However, it has no basis in fact.
While there may be such a sync queue in some kernel somewhere, there isn't one
in any of the common kernels that have had triple syncs directed at them.
(Gee, I hope I haven't violated a non-disclosure agreement by saying so.)

[This paragraph has been placed here to satisfy some incredibly stupid person
who coded inews to reject insufficiently lengthy new articles.
Oh, that wasn't the intent of the check?  Like I said, incredibly stupid.
IMHO, of course.]

shore@mtxinu.COM (Melinda Shore) (12/17/90)

In article <1635@lot.ACA.MCC.COM> ables@lot.ACA.MCC.COM (King Ables) writes:
>From article <52328@mcdchg.chg.mcd.mot.com>, by heiby@mcdchg.chg.mcd.mot.com (Ron Heiby):
>> The theory was that by the time the operator typed the third command,
>> the disk blocks queued for writing by the first had had sufficient
>> time to actually be written to the disk.  
>The secret was supposed to be that the queue had only two slots
>in it, thus a 3rd sync() call couldn't return having successfully
>put its request in the queue until the first sync request had
>been processed and removed from the queue.  So when the 3rd
>one returned, you knew the first one had actually happened.

No, Ron was correct (remember the old pop song "Sync 3 Times on the
Ceiling If You Want Me?").  The convention really was for the purpose
of allowing time for the write.

To the original poster, the reason for doing a sync even if you don't
need buffers is to ensure data integrity;  you do want to maximize the
likelihood that the cache contents are consistent with the state of the
disk in the event of a crash.  /etc/update syncs every 30 seconds, so
it's pretty unusual to need to do a sync in a user program.
-- 
               Hardware brevis, software longa
Melinda Shore                                 shore@mtxinu.com
mt Xinu                              ..!uunet!mtxinu.com!shore

hotte@sunrise.in-berlin.de (Horst Laumer) (12/17/90)

steinar@ifi.uio.no (Steinar Kj{rnsr|d) writes:


>The subject and the question posed may seem trivial, but I have so far
>failed to find the answer (I browsed through the 4.3 book by McKusick, Karels
>and Quarterman; references to pertinent pages are welcome). The question
>arose when a disk vendor presented results from a benchmark whose purpose
>was to measure read/write transfer rates for his drive. His scenario
>was this:

> - a stand alone BSD (SunOS 4.0.3 I think) box in single user mode
> - no other disk activity in the system

>The test program looked something like this:

> - <write a HUUUUGE file and measure the write transfer rate>
> - sync(); sync(); sync();
> - <read the same file back again and measure the read transfer rate>
> 

As stated in the AT&T SysV Programmer, sync(2) is used to write memory
to disk *and* update the superblock. Thus, the succeeding read()
ought to find the file correctly, because the last blocks and the superblock
were flushed to disk. sync(1/1M) is simply the same, but as a stand-alone
binary.

--HL
-- 
============================================================================
Horst Laumer, Kantstrasse 107, D-1000 Berlin 12 ! Bang-Adress: Junk-Food 
INET: hotte@sunrise.in-berlin.de                ! for Autorouters -- me --
UUCP: ..unido!fub!geminix!sunrise.in-berlin.de!hotte

mpledger@cti1.UUCP (Mark Pledger) (12/17/90)

steinar@ifi.uio.no (Steinar Kj{rnsr|d) writes:


>The subject and the question posed may seem trivial, but I have so far
>failed to find the answer (I browsed through the 4.3 book by McKusick, Karels
>and Quarterman; references to pertinent pages are welcome). The question
>arose when a disk vendor presented results from a benchmark whose purpose
>was to measure read/write transfer rates for his drive. His scenario
>was this:

> - a stand alone BSD (SunOS 4.0.3 I think) box in single user mode
> - no other disk activity in the system

>The test program looked something like this:

> - <write a HUUUUGE file and measure the write transfer rate>
> - sync(); sync(); sync();
> - <read the same file back again and measure the read transfer rate>
> 
>Both the read and the write of the file use normal (asynchronous) I/O
>operations, which therefore involve the buffer cache. The disk vendor's
>assumption is that the three sync() calls will guarantee that the read pass
>of the test will read data off the media, not from the buffer cache, while
>I say that although the sync() calls force dirty pages in the buffer cache
>to be written to the disk, you have no guarantee that those pages will also
>be wiped out from the cache. I find this especially true since there is no
>other disk activity in the system while the test is running. What would be
>the purpose of really clearing the cache when you have nothing to replace
>the cleared pages with?

>Who is right?


If memory serves me right, sync() does not CLEAR the file buffers, but
only writes all dirty buffers to disk and clears the dirty bit flag in
the file buffer header structure.  You're right that there is no guarantee
the dirty buffers will be wiped out.  In fact, considering your system
is idle other than your processes, I would tend to believe the file
buffers are still available for use.  Therefore in the vendor's test case,
by updating all file buffers, he guarantees that the reads will come from
the file buffers and not the actual disk.  A subtle improvement indeed!
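
The distinction between "written back" and "removed from the cache" can be
shown with a toy buffer cache. This is a sketch under assumptions: the
struct layout and the names cache_sync() and cache_lookup() are invented
for illustration and are not taken from any kernel source.

```c
#include <stddef.h>

struct cbuf {
    long blkno;   /* disk block held in this buffer */
    int  valid;   /* buffer holds usable data for blkno */
    int  dirty;   /* contents are newer than the copy on disk */
};

/* Write back every dirty buffer: the dirty flag is cleared but the
 * data stays in the cache.  Returns the number of buffers written. */
int cache_sync(struct cbuf *c, size_t n)
{
    size_t i;
    int written = 0;

    for (i = 0; i < n; i++) {
        if (c[i].valid && c[i].dirty) {
            c[i].dirty = 0;   /* valid is deliberately left set */
            written++;
        }
    }
    return written;
}

/* Return 1 if blkno can be served from the cache, 0 if the disk
 * would have to be read. */
int cache_lookup(const struct cbuf *c, size_t n, long blkno)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (c[i].valid && c[i].blkno == blkno)
            return 1;
    return 0;
}
```

After cache_sync() every block that was in the cache is still found by
cache_lookup(), so a read pass that follows touches the disk only for
blocks that were never cached or have since been replaced.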




-- 
Sincerely,


Mark Pledger

--------------------------------------------------------------------------
CTI                              |              (703) 685-5434 [voice]
2121 Crystal Drive               |              (703) 685-7022 [fax]
Suite 103                        |              
Arlington, VA  22202             |              mpledger@cti.com
--------------------------------------------------------------------------

ske@pkmab.se (Kristoffer Eriksson) (12/19/90)

In article <5156@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
>While there may be such a sync queue in some kernel somewhere, there isn't one
>in any of the common kernels that have had triple syncs directed at them.

When I type "sync" once after some disk writing activity on my system,
there is a delay before I get the prompt back. If that delay is not caused
by sync() waiting for disk blocks to be written, then I wonder what it is
caused by. How do your systems act? Do they give you any delay?
-- 
Kristoffer Eriksson, Peridot Konsult AB, Hagagatan 6, S-703 40 Oerebro, Sweden
Phone: +46 19-13 03 60  !  e-mail: ske@pkmab.se
Fax:   +46 19-11 51 03  !  or ...!{uunet,mcsun}!sunic.sunet.se!kullmar!pkmab!ske

heiby@mcdchg.chg.mcd.mot.com (Ron Heiby) (12/20/90)

ske@pkmab.se (Kristoffer Eriksson) writes:

>When I type "sync" once after some disk writing activity on my system,
>there is a delay before I get the prompt back. If that delay is not caused
>by sync() waiting for disk blocks to be written, then I wonder what it is
>caused by. How do your systems act? Do they give you any delay?

I've noticed that, too.  My explanation is that when you type the
first sync command, there are a potentially large number of dirty
buffers, so there is a lot of work that that system call has to do.
Lots of buffers need to be put onto the write queue for your disks.
The second and third sync commands find relatively few dirty buffers.
In fact, if you are in single-user mode, they probably find none.  So,
there isn't much work to be done.  Also, as soon as those blocks start
getting written, your system has a burst of activity in terms of i/o
driver code, including interrupt code, to continue to keep it busy.

On the subject of the "every 30 seconds" sync - we found at Motorola
that as buffer caches increase in size, the amount of time spent once
or twice a minute to flush the entire cache of dirty buffers to disk
was beginning to be noticeable to our customers.  Our current 68K and
88K releases of System V provide tunables to control this.  The
default settings cause 1/60th of the buffer cache to be flushed to
disk each second.  Spreading the load of writing the dirty buffers
gives much smoother performance.
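
One way such an incremental flush could look is sketched below. The
Motorola implementation and its tunables are not described in this thread,
so the function name, the cursor argument and the round-up rule are all
assumptions.

```c
#include <stddef.h>

struct fbuf {
    int dirty;
};

/* Flush one slice of the cache.  Called once per second with
 * slices = 60, it visits every buffer once a minute.  *cursor
 * remembers where the previous call stopped.  Returns the number
 * of dirty buffers written in this slice. */
int flush_slice(struct fbuf *b, size_t n, size_t slices, size_t *cursor)
{
    size_t per_call, i;
    int written = 0;

    if (n == 0 || slices == 0)
        return 0;
    *cursor %= n;                           /* guard against a stale cursor */
    per_call = (n + slices - 1) / slices;   /* round up */
    for (i = 0; i < per_call; i++) {
        struct fbuf *p = &b[*cursor];

        if (p->dirty) {
            p->dirty = 0;                   /* stands in for the disk write */
            written++;
        }
        *cursor = (*cursor + 1) % n;
    }
    return written;
}
```

With 120 buffers and 60 slices each call handles two buffers, so the write
load is spread over the minute and no single call has to flush the whole
cache.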
-- 
Ron Heiby, heiby@chg.mcd.mot.com	Moderator: comp.newprod
"Give me voice mail or give me drugs!"/"Mandatory Drug Testing? Just Say NO!!!"

jim@segue.segue.com (Jim Balter) (12/20/90)

In article <4670@pkmab.se> ske@pkmab.se (Kristoffer Eriksson) writes:
>When I type "sync" once after some disk writing activity on my system,
>there is a delay before I get the prompt back. If that delay is not caused
>by sync() waiting for disk blocks to be written, then I wonder what it is
>caused by. How do your systems act? Do they give you any delay?

sync does a synchronous write of the superblock.  It also does synchronous
reads of the inode blocks because an inode is only a fraction of a block.
Then it puts all the dirty blocks on the disk queues (starting disk I/O if it
isn't in progress) and returns.  In addition, it's possible for the inode
reads or other activity to cause pages of your shell to be kicked out (dirty
ones would have to be synchronously written) and they might have to be read
back in before you see your prompt.

Even if sync waited on dirty block writes, that would be a 1-deep "sync queue",
not the fabled 2-deep queue.

src@scuzzy.in-berlin.de (Heiko Blume) (12/20/90)

ske@pkmab.se (Kristoffer Eriksson) writes:

>In article <5156@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
>>While there may be such a sync queue in some kernel somewhere, there isn't one
>>in any of the common kernels that have had triple syncs directed at them.

>When I type "sync" once after some disk writing activity on my system,
>there is a delay before I get the prompt back. If that delay is not caused
>by sync() waiting for disk blocks to be written, then I wonder what it is
>caused by. How do your systems act? Do they give you any delay?

at least with interactive unix there is a little delay before you get
the prompt back. the delay is caused by the kernel queuing the transfers.
however, those three syncs often 'advertised' might not suffice to fully
flush the dirty buffers in time. e.g. after processing incoming news, there
can be a *lot* of dirty buffers to be written. with slow disks, which are often
used for /usr/spool/news, this can cause the time to flush all buffers to
be well over 5 seconds. so one should have a look at the drives' LEDs to
see if it's safe to hit the red one.
-- 
      Heiko Blume <-+-> src@scuzzy.in-berlin.de <-+-> (+49 30) 691 88 93
                    public source archive [HST V.42bis]:
        scuzzy Any ACU,f 38400 6919520 gin:--gin: nuucp sword: nuucp
                     uucp scuzzy!/src/README /your/home

boyd@necisa.ho.necisa.oz.au (Boyd Roberts) (12/21/90)

In article <5258@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
>Even if sync waited on dirty block writes, that would be a 1-deep "sync queue",
>not the fabled 2-deep queue.

There is no `sync queue'.

I'm so pleased to see that the demise of comp.unix.wizards has resulted
in comp.unix.misinformation being crossposted to comp.unix.*.


Boyd Roberts			boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''

mercer@npdiss1.StPaul.NCR.COM (Dan Mercer) (12/22/90)

In article <5156@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
:
:In article <1635@lot.ACA.MCC.COM> ables@lot.ACA.MCC.COM (King Ables) writes:
:>been processed and removed from the queue.  So when the 3rd
:>one returned, you knew the first one had actually happened.
:
:If one insists on believing that people are being rational in typing sync
:three times, that's a great hypothesis.  However, it has no basis in fact.
:While there may be such a sync queue in some kernel somewhere, there isn't one
:in any of the common kernels that have had triple syncs directed at them.
:(Gee, I hope I haven't violated a non-disclosure agreement by saying so.)
:
:[This paragraph has been placed here to satisfy some incredibly stupid person
:who coded inews to reject insufficiently lengthy new articles.
:Oh, that wasn't the intent of the check?  Like I said, incredibly stupid.
:IMHO, of course.]

Save the bandwidth.  postnews only checks the number of lines
beginning with '>' against the lines not beginning with it.  Yes it's
dumb.  Bypass it by globally changing the leading '>'s.  I've mapped
'g' in vi to do just that 

map g :1,$s/^>/:/^M

-- 
Dan Mercer
NCR Network Products Division      -        Network Integration Services
Reply-To: mercer@npdiss1.StPaul.NCR.COM (Dan Mercer)
"MAN - the only one word oxymoron in the English Language"

jas@llama.Ingres.COM (Jim Shankland) (12/22/90)

In article <1969@necisa.ho.necisa.oz.au> boyd@necisa.ho.necisa.oz.au (Boyd Roberts) writes:
>In article <5258@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
>>Even if sync waited on dirty block writes, that would be a 1-deep "sync queue",
>>not the fabled 2-deep queue.
>
>There is no `sync queue'.

No, no, you're all wrong.

There is a "sync protocol" that goes as follows:

Programmer:  sync <cr>
Kernel:  (Hears you, but doesn't want to be bothered, thus ignores you)
Programmer:  sync <cr>
Kernel:  (Knows you really want those blocks synched, but is busy doing other
stuff.  Puts your request on a queue, to do "when I get around to it.")
Programmer:  sync <cr>
Kernel:  (Now understands you're not going to back down, and it's going to
be in big trouble if it doesn't sync those blocks RIGHT NOW; does so.)

Coincidentally, my mother used to use a similar protocol with me when
I was an adolescent.  Third request always had the I_MEAN_IT_YOUNG_MAN flag,
too.

jas

(Hey, it's no sillier than the other explanations that have been posted.)

pcg@cs.aber.ac.uk (Piercarlo Grandi) (12/23/90)

On 17 Dec 90 12:15:50 GMT, mpledger@cti1.UUCP (Mark Pledger) said:

mpledger> steinar@ifi.uio.no (Steinar Kj{rnsr|d) writes:

	[ ... whether sync guarantees freeing of buffers from the
	cache, or does not, as he thinks ... ]

mpledger> If memory serves me right, sync() does not CLEAR the file
mpledger> buffers, but only writes all dirty buffers to disk and clears
mpledger> the dirty bit flag in the file buffer header structure.  You're
mpledger> right that there is no guarantee the dirty buffers will be
mpledger> wiped out.

Yep. The *only* portable way to make sure that the buffers associated with
a file's blocks are freed is to unmount the filesystem on which the file
resides.

Here is the simple-minded but effective technique I use to measure peak
read/write rates through the filesystem, e.g. for the purpose of
determining the optimal logical interleave.

First a simple program:

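	#include <unistd.h>

	int main(void)
	{
		static char b[32*1024];	/* 64 transfers of 32 KB = 2 MB */
		int i;

		for (i = 0; i < 64; i++)
			if (read(0, b, sizeof b) < 0)	/* or write(1, b, sizeof b) */
				return 1;
		return 0;
	}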


compiled as ./write and ./read, then I choose a spare partition, and I
do

	mkfs /dev/rdsk/0s2 4836:1216 6 156 # or any other parameters

	mount /dev/dsk/0s2 /mnt

	time ./write >/mnt/x
	umount /dev/dsk/0s2	# To flush the ...
	mount /dev/dsk/0s2 /mnt # ... buffer cache
	time ./read </mnt/x

	umount /dev/dsk/0s2
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

jim@segue.segue.com (Jim Balter) (12/23/90)

In article <1969@necisa.ho.necisa.oz.au> boyd@necisa.ho.necisa.oz.au (Boyd Roberts) writes:
>In article <5258@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
>>Even if sync waited on dirty block writes, that would be a 1-deep "sync queue",
>>not the fabled 2-deep queue.
>
>There is no `sync queue'.

Sigh.  I forget that literacy is a thing of the past.  What was it that
Dijkstra said about fluency in one's native language being a requirement
for programmers?

1) I already said that there is no sync queue.  Don't quote me out of context.

2) "Even if ... that would be" is subjunctive.  Look it up.  

3) A "sync queue" is not the same as a sync queue.  From my Random House
   style guide, `Use of quotation marks', note 12:  "To suggest ironic use of
   a word or phrase".

>I'm so pleased to see that the demise of comp.unix.wizards has resulted
>in comp.unix.misinformation being crossposted to comp.unix.*.

So pleased that you decided to contribute to the pollution.

dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) (12/27/90)

In article <PCG.90Dec22171527@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>
>Yep. The *only* portable way to make sure that the buffers associated to
>a file's block are freed is to unmount the filesystem on which the file
>resides.
>

Is unmounting a filesystem an easier way than doing enough reads from 
other parts of the disk to overwrite all of the buffers in the cache?




--
Dave Eisen                          dkeisen@Gang-of-Four.Stanford.EDU
1447 N. Shoreline Blvd.
Mountain View, CA 94043              Anybody have an extra New Year's ticket?
(415) 967-5644                             (I can hope, can't I?)

tif@doorstop.austin.ibm.com (Paul Chamberlain) (12/27/90)

dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) writes:
>Is unmounting a filesystem an easier way than doing enough reads from 
>other parts of the disk to overwrite all of the buffers in the cache?

I thought that it would have occurred to you while you
were typing that, to try and define "enough".

Paul Chamberlain | I do NOT represent IBM.         IBM VNET: sc30661 at ausvm6
512/838-9662     | This is rumored to work now --> tif@doorstop.austin.ibm.com

dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) (12/29/90)

In article <4633@awdprime.UUCP> tif@doorstop.austin.ibm.com (Paul Chamberlain) writes:
>dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) writes:
>>Is unmounting a filesystem an easier way than doing enough reads from 
>>other parts of the disk to overwrite all of the buffers in the cache?
>
>I thought that it would have occurred to you while you
>were typing that, to try and define "enough".


Well, sure. But you know how many buffers there are in the cache and you
know how big each of them is. If the kernel really does use an LRU algorithm
for buffer allocation (as I understand that it does), there really is no
difficulty in figuring out how much needs to be read.



--
Dave Eisen                      	    dkeisen@Gang-of-Four.Stanford.EDU
1447 N. Shoreline Blvd.
Mountain View, CA 94043              Anybody have an extra New Year's ticket?
(415) 967-5644                             (I can hope, can't I?)

gordon@sneaky.UUCP (Gordon Burditt) (12/31/90)

>:If one insists on believing that people are being rational in typing sync
>:three times, that's a great hypothesis.  However, it has no basis in fact.
>:While there may be such a sync queue in some kernel somewhere, there isn't one
>:in any of the common kernels that have had triple syncs directed at them.

There were semi-valid reasons for three sync's on a PDP-11/70 running PWB
UNIX (a variant of UNIX V6).  This was, of course, a long time ago.

(1) The length of time it took us fumble-fingered superusers to type
    "sync<CR>" 3 times ensured enough time for the first sync() to 
    complete.  As far as I know, this depended exclusively on the delay
    introduced by human typing speed.
(2) Due to the placement of the keys on a DecWriter, it was distressingly
    common to type "sync<DEL>" instead of "sync<CR>", (<DEL> being the 
    interrupt character) but you usually wouldn't make the same mistake 
    3 times in a row.
(3) It prevented one guy from typing "sync", then putting one finger on
    the <CR> key, one finger on the other hand on the CPU HALT button (or 
    whatever it was labelled), and pressing them simultaneously.

					Gordon L. Burditt
					sneaky.lonestar.org!gordon

pcg@cs.aber.ac.uk (Piercarlo Grandi) (12/31/90)

On 26 Dec 90 16:54:55 GMT, dkeisen@Gang-of-Four.Stanford.EDU (Dave
Eisen) said:

dkeisen> In article <PCG.90Dec22171527@odin.cs.aber.ac.uk>
dkeisen> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:

pcg> Yep. The *only* portable way to make sure that the buffers
                     ^^^^^^^^
pcg> associated to a file's block are freed is to unmount the filesystem
pcg> on which the file resides.

dkeisen> Is unmounting a filesystem an easier way than doing enough
dkeisen> reads from other parts of the disk to overwrite all of the
dkeisen> buffers in the cache?

The answer is YES. Note the emphasis above.


Less tersely:

Just to state the obvious: "doing enough reads" to swamp the cache is
not guaranteed to work, because whether it swamps the cache depends on
the buffer replacement strategy, and on whether file access is memory
mapped or not, and many other considerations.

For example, assume that you have a 1000-buffer cache and you quickly
read 1001 blocks sequentially. This may work if the cache manager ages
buffer blocks more slowly than you read them in; it may not work if the
cache manager recognizes the sequential access pattern. You can read
1001 blocks at random; this may work if the cache manager is strictly
LRU, but it may not work if the cache manager is not stupid (most are,
though) and recognizes the access pattern.
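
The strictly-LRU case can be checked with a small simulation. It is a
sketch only: the four-slot cache, the timestamp scheme and the lru_ names
are assumptions, and it models none of the smarter replacement policies
mentioned above.

```c
#include <stddef.h>

#define LRU_SLOTS 4   /* tiny cache, for illustration only */

struct lru_cache {
    long blk[LRU_SLOTS];            /* block in each slot, -1 if empty */
    unsigned long used[LRU_SLOTS];  /* time of last reference */
    unsigned long clock;            /* advances on every access */
};

void lru_init(struct lru_cache *c)
{
    size_t i;

    for (i = 0; i < LRU_SLOTS; i++) {
        c->blk[i] = -1;
        c->used[i] = 0;
    }
    c->clock = 0;
}

/* Reference block blkno (which must be >= 0).  Returns 1 on a cache
 * hit and 0 on a miss; on a miss the least recently used slot is
 * replaced. */
int lru_access(struct lru_cache *c, long blkno)
{
    size_t i, victim = 0;

    c->clock++;
    for (i = 0; i < LRU_SLOTS; i++) {
        if (c->blk[i] == blkno) {
            c->used[i] = c->clock;
            return 1;
        }
    }
    for (i = 1; i < LRU_SLOTS; i++)
        if (c->used[i] < c->used[victim])
            victim = i;
    c->blk[victim] = blkno;
    c->used[victim] = c->clock;
    return 0;
}
```

Load blocks 0 to 3, then reference four unrelated blocks, and every later
reference to blocks 0 to 3 misses. That holds only because the victim is
always the least recently used slot; a manager that treats sequential scans
specially could keep the original blocks resident.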

In practice you can rely on the replacement algorithm to be some kind of
LRU, but I, being a fairly silly guy, would not presume to be able to
reverse engineer it and devise a synthetic reference pattern that
frustrates the caching algorithm.

It could be a Master's or even Doctoral research project to write a
program that when run under UNIX exercises the cache manager to reverse
engineer its replacement policy and then calculates a reference pattern
that frustrates it reliably and portably. There are several subtle
problems...

You are welcome to try!

But I'd rather advance one humble suggestion: Unix systems are very weak
as to VM algorithms and tuning tools. It might be much more useful, and
easier, to put in something better than the appallingly deficient VM
algorithms of too many of today's Unix variants (System V.3.2 still does
expansion swaps! And it is proudly documented in Bach's book too!) and
to provide memory profilers to analyze the reference patterns of user
programs and increase locality, so that, for example, more people would
understand how poorly implemented things like the MIT X server and
GNU Emacs are.
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk