[comp.os.misc] Uses for access time, stuff in inodes

daveb@llama.rtech.UUCP ("It takes a clear mind to make it") (05/06/88)

One problem with the access (and modify) time fields is that when you
are running in SV's O_SYNC mode, you end up doing two disk writes on
every write call: one for the data, and one for the inode.  This can be
frightfully expensive.  It would be nice if there were some way for
O_SYNCed files to avoid this: maybe just defer the access/mod time
write until the file is closed.
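
To make the cost concrete, here is a minimal sketch (Python on a Unix-like
system, chosen only for illustration; the helper name `overwrite_in_place`
is hypothetical). It overwrites bytes inside an existing file and reports
which inode fields changed. The size stays the same but the modify time
moves, which is why a synchronous write traditionally has to push the inode
out as well as the data.

```python
import os


def overwrite_in_place(path, offset, payload):
    """Overwrite bytes inside an existing file; report what changed in the inode.

    Returns a pair (size_changed, mtime_changed).
    """
    before = os.stat(path)
    fd = os.open(path, os.O_WRONLY)
    try:
        # Positioned write: no separate lseek call is needed.
        os.pwrite(fd, payload, offset)
    finally:
        os.close(fd)
    after = os.stat(path)
    return (after.st_size != before.st_size,
            after.st_mtime_ns != before.st_mtime_ns)
```

The sketch deliberately leaves O_SYNC out; it only shows that the inode is
dirtied by a write that does not grow the file.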

On another issue, a number of us here have wanted another chunk of info
in the inode for a long time.  We call it the "ratfink" field, which
would contain the real user id of the last process that modified the
file.  This would make it a lot easier to track some things down...

-dB
{amdahl, cpsc6a, mtxinu, ptsfa, sun, hoptoad}!rtech!daveb daveb@rtech.uucp

karl@triceratops.cis.ohio-state.edu (Karl Kleinpaste) (05/06/88)

daveb@llama.rtech.UUCP says...
   It would be nice if there were some way for
   O_SYNCed files to avoid [2 writes/call]: maybe just defer the
   access/mod time write until the file is closed.

I would disagree.  If data blocks are updated but the inode is left
in-core-only, what value have I gained by using O_SYNC?  If the system
crashes right now, with half a megabyte of data written, but no record
of the changes to the inode yet out to disc, O_SYNC has become
useless.

I tend to think that O_SYNC is specifically intended for those folks
who are explicitly willing to put up with the performance hit of 2
physical writes/call.

--Karl

daveb@llama.rtech.UUCP (It takes a clear mind to make it) (05/08/88)

In article <12606@tut.cis.ohio-state.edu> karl@triceratops.cis.ohio-state.edu (Karl Kleinpaste) writes:
>daveb@llama.rtech.UUCP says...
>   It would be nice if there were some way for
>   O_SYNCed files to avoid [2 writes/call]: maybe just defer the
>   access/mod time write until the file is closed.
>
>I would disagree.  If data blocks are updated but the inode is left
>in-core-only, what value have I gained by using O_SYNC?  If the system
>crashes right now, with half a megabyte of data written, but no record
>of the changes to the inode yet out to disc, O_SYNC has become
>useless.
>
>I tend to think that O_SYNC is specifically intended for those folks
>who are explicitly willing to put up with the performance hit of 2
>physical writes/call.

OK, then you write the inode whenever the file size changes, and update
the mod/access time at close.  Maybe O_SYNC would work this way only if
an additional (or alternative) option, say O_FASTSYNC, were provided.

The specific example we have in mind is this.  There is a 1M file lying
around, within which we are doing seeks, reads, and writes to perform
updates (as you might imagine, this is DBMS activity).  We don't give a
hoot about the access/mod time on the file, but we care a lot that the
data actually makes it to the disk.  That extra disk i/o can cut your
throughput in half, which is not acceptable.
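
The update pattern described here can be sketched roughly as follows
(Python on a Unix-like system; `PAGE_SIZE`, `open_db` and `update_page` are
hypothetical names, not anything from an actual DBMS). POSIX later
standardized O_DSYNC, which is close in spirit to the O_FASTSYNC idea: a
write completes once the data is on disk, and metadata such as timestamps
need not be flushed unless it is required to read the data back. The sketch
assumes O_DSYNC may be missing on some platforms and falls back to O_SYNC.

```python
import os

PAGE_SIZE = 2048  # hypothetical page size, chosen only for the example

# Assumption: O_DSYNC is not exposed everywhere, so fall back to O_SYNC.
DATA_SYNC = getattr(os, "O_DSYNC", os.O_SYNC)


def open_db(path, sync_flag=DATA_SYNC):
    """Open an existing file for synchronous in-place updates."""
    return os.open(path, os.O_RDWR | sync_flag)


def update_page(fd, page_no, payload):
    """Overwrite one fixed-size page in place; the file length never changes."""
    if len(payload) != PAGE_SIZE:
        raise ValueError("payload must be exactly one page")
    # Returns the number of bytes written once the kernel reports completion.
    return os.pwrite(fd, payload, page_no * PAGE_SIZE)
```

Whether this actually saves the second disk write depends on the kernel and
the filesystem; the flag only states what the application is willing to
give up.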

It would also be perfectly acceptable to have an additional system call
that explicitly said, "I am extending this file to this length, get the
pages and write the inode to disk."  To a certain extent, this would be
a generalization of the existing BSD ftruncate call.
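
As a rough user-level approximation of such a call (a sketch only;
`extend_file` is a hypothetical name), current POSIX systems let ftruncate
grow a file, and a following fsync forces the new length out to disk. Note
that this typically records the size without reserving data blocks, so it
covers the "write the inode" half of the request more than the "get the
pages" half.

```python
import os


def extend_file(fd, new_length):
    """Grow the file behind fd to new_length and force the change to disk."""
    if new_length < os.fstat(fd).st_size:
        raise ValueError("extend_file never shrinks a file")
    os.ftruncate(fd, new_length)  # the new tail reads back as zero bytes
    os.fsync(fd)                  # flush data and inode, including the size
```

After one such call, later writes that stay within the new length do not
change the file size.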

-dB

{amdahl, cpsc6a, mtxinu, ptsfa, sun, hoptoad}!rtech!daveb daveb@rtech.uucp