[comp.unix.programmer] Determining if an existant file is open

aeusejvn@csunb.csun.edu (Jon Nadelberg) (11/16/90)

We have a program that, while running, maintains a checkpoint file.  This
file is named based on the user's id, and is stored in a specific directory
on the system.

Our problem occurs when someone tries to start up another process under
the same user id while the first one is running. 

The first concern is that the second process, upon seeing the existence
of the checkpoint file created by the first process, thinks that it needs
to "warmstart" from the first file.  It should not do this.  Second,
once the new process does this, it then re-initializes the checkpoint
file, thus corrupting it.

Our thinking is this: if we can determine whether or not the checkpoint
file is currently open by the first process, we can then decide whether
to continue with a warmstart or, if the file is still open, not to
start up a checkpoint file for subsequent processes.
 
Is there a way to check if a file is currently open and being used by
another process?  Is there a way to "lock" a file so that other processes
can not access it?  
 
We are kind of stuck with the implementation staying the way it is.  The only
thing we can do is Band-Aid it, so a redesign of the way it works is
not a practical suggestion at this point.
 
Any help would be appreciated.  
Thank you.

--
------------------------------------------------------------------------
-   Jon Nadelberg                                                      -
-   aeusejvn@csunb.csun.edu                                            -
------------------------------------------------------------------------

rwhite@nusdecs.uucp (0257014-Robert White(140)) (11/17/90)

In article <1990Nov16.023110.1305@csun.edu> aeusejvn@csunb.csun.edu (Jon Nadelberg) writes:
>Is there a way to check if a file is currently open and being used by
>another process?  Is there a way to "lock" a file so that other processes
>can not access it?  

The things that come immediately to mind are:

Use the exclusive flag when calling open(2) -- must be done every time.

Allow multiple opens and have the processes do read and write locks
as appropriate, being sure to always lock before read (you shouldn't
need mandatory locks in this case, just be consistent).  If the file
is locked the file is busy.  fcntl(2) or ioctl(2), I don't remember
which right now.

Do the "put process number in lock file" thing as in uucp.

Any subtle differences between these techniques should be considered
based on the exact internals of the app.  And then combine as many of
these as is reasonable.

Hope it was some help,
Rob.

guy@auspex.auspex.com (Guy Harris) (11/20/90)

>Use the exclusive flag when calling open(2) -- must be done every time.

You mean O_EXCL?  That flag isn't an exclusive-use open flag; it's an
"exclusive create" flag.  It only causes an error if O_CREAT is also set
and the file already exists.
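
To make the distinction concrete, here is a minimal Python sketch (Python's
os.open passes these flags straight through to open(2)).  The helper name
create_exclusively is made up for illustration.  It shows that
O_CREAT|O_EXCL only guards creation: a second exclusive create fails, but
an ordinary open of the now-existing file still succeeds, so the flag
cannot tell you whether someone else has the file open.

```python
import os

def create_exclusively(path):
    """Create path only if it does not already exist.

    Returns an open file descriptor, or None if the file was already
    there.  (Hypothetical helper, not part of any library.)
    """
    try:
        # O_EXCL has its documented effect only together with O_CREAT.
        return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return None
```

A caller that gets None knows only that the file exists, not whether any
process currently has it open.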

chip@tct.uucp (Chip Salzenberg) (11/21/90)

According to rwhite@nusdecs.uucp (0257014-Robert White(140)):
>Do the "put process number in lock file" thing as in uucp.

This approach cannot be used reliably without either (1) having a lock
held forever by a dead process, which I find unacceptable, or (2)
using a kernel locking call, which renders the pid file redundant.

For you doubters: Once you've determined that a lock file is stale and
you want to unlink() it, how do you know the lock file hasn't been
replaced in between the decision to unlink() and the unlink() itself?

I really like kernel record locks for lots of reasons, but the primary
one is that locks disappear automatically when the locking process
dies, even if the death is abnormal (SIGKILL, core dump, anything).
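
A rough Python sketch of locking the resource directly, assuming a system
where Python's fcntl.lockf is available (it wraps fcntl(2) record locks).
The helper name try_lock_checkpoint is hypothetical, and the idea of
applying it to the checkpoint file is my reading of the original
question, not something the poster confirmed.

```python
import fcntl
import os

def try_lock_checkpoint(path):
    """Try to take an exclusive record lock on the whole file.

    Returns an open fd that holds the lock, or None if another process
    already holds it.  (Hypothetical helper name.)
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB: fail at once instead of waiting for the holder.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (BlockingIOError, PermissionError):
        os.close(fd)
        return None
    return fd
```

The process keeps the returned fd open for as long as it owns the
checkpoint; the kernel drops the lock when the fd is closed or the process
dies.  Record locks belong to a process, so this only detects other
processes, not a second open within the same one.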
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
    "I've been cranky ever since my comp.unix.wizards was removed
         by that evil Chip Salzenberg."   -- John F. Haugh II

aeusejvn@Twg-S5.uucp (jon nadelberg) (11/21/90)

Thanks to everyone who answered my question.  We'll probably be able to
get something functioning now.  Much appreciated.




brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/22/90)

In article <274975C4.21C@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
> According to rwhite@nusdecs.uucp (0257014-Robert White(140)):
> >Do the "put process number in lock file" thing as in uucp.
> This approach cannot be used reliably without either (1) having a lock
> held forever by a dead process, which I find unacceptable, or (2)
> using a kernel locking call, which renders the pid file redundant.

It can be used reliably if you put information like pid and date into
the name of the lock file.

> For you doubters: Once you've determined that a lock file is stale and
> you want to unlink() it, how do you know the lock file hasn't been
> replaced in between the decision to unlink() and the unlink() itself?

The old name will never be reused, so there's no danger of replacement.

---Dan

jax@well.sf.ca.us (Jack J. Woehr) (11/23/90)

	Grammar Police, Ma'am! Open Up!

	Ahem ...

	The word y'all are fseeking and not ffinding is:

		extant

	There *is* an English word "existence", but if "existant"
is a relative thereof, it's from the wrong side of the bedsheets.

		Dank U.

-- 
 <jax@well.{UUCP,sf.ca.us} ><  Member, >        /// ///\\\    \\\  ///
 <well!jax@lll-winken.arpa >< X3J14 TC >       /// ///  \\\    \\\/// 
 <JAX on GEnie             >< for ANS  > \\\  /// ///====\\\   ///\\\ 
 <SYSOP RCFB (303) 278-0364><  Forth   >  \\\/// ///      \\\ ///  \\\

afsipmh@cid.aes.doe.CA (Patrick Hertel) (11/24/90)

In article <21795@well.sf.ca.us> jax@well.sf.ca.us (Jack J. Woehr) writes:
>
>	Grammar Police, Ma'am! Open Up!
>
>	Ahem ...
>
>	The word y'all are fseeking and not ffinding is:
>
>		extant
>
>	There *is* an English word "existence", but if "existant"
>is a relative thereof, it's from the wrong side of the bedsheets.
>
>		Dank U.
>

 You are only partly right since you can't use extant in the context of
 "if an extant...". Therefore the word is:

	   existing
-- 
Pat Hertel                 Canadian Meteorological Centre
Analyst/Programmer         2121 N. Service Rd.        % rm God
phertel@cmc.aes.doe.ca     Dorval,Quebec              rm: God non-existent
Environment Canada         CANADA           H9P1J3

rwhite@nusdecs.uucp (0257014-Robert White(140)) (11/25/90)

In article <26251:Nov2119:42:1090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <274975C4.21C@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>> According to rwhite@nusdecs.uucp (0257014-Robert White(140)):
>> >Do the "put process number in lock file" thing as in uucp.
>> For you doubters: Once you've determined that a lock file is stale and
>> you want to unlink() it, how do you know the lock file hasn't been
>> replaced in between the decision to unlink() and the unlink() itself?
>The old name will never be reused, so there's no danger of replacement.

The above is (except for my comment) incorrect.  The application of the
basic rules of computer science reveals the following procedure:

Given:  A singular lock file name (multiple names is a waste of effort).
Given:  This file will have read-and-write access for all pertinent parties.
Given:  Complete file longevity (e.g. never remove it) can simplify the
	following but is not always desirable.
Given:	Using fixed-width data simplifies internal manipulation.

Perform the following:
Search for and open file.
	If file is not found, create it.
	<At this point the "window of vulnerability" exists>
	<as this is not yet the "lock" procedure.>
Lock the file for exclusive Access.
	<Once you own the lock on the file there is no chance of>
	<conflict during lock validation.  The requestors will>
	<be FIFO(ed) by the locking mechanism.  The active lock>
	<may in itself be enough for a given purpose; this would>
	<tie up a lock slot so it may be desirable to continue.>
Recover the process ID of the current locking process.
	If process is current, do nothing.
	Otherwise, replace process ID with this process's ID.
Release Lock and close file.
Perform Actions of program.
Remove file.

Given that the ownership of the file is immaterial, since it is just
as easy to arbitrarily decide on universal writeability as it is to
decide on read and delete permissions, the "delete old file" step is
ineffective and tends to introduce unnecessary windows of vulnerability.

The use of the file lock keeps the above essentially atomic (as in
atomic operations like test-and-set) in its scope of operations.
This trait means that the lock on the file, which is cleaned up on
program exit by the operating system in cases of abnormal exit, is
enough to guarantee exclusivity;  the file lock state becomes a
semaphore (useful for RFS environments where semaphores are not available
across the network link) and the access can be seized and released
by a pool of users without relying on PIDs.

There are a large number of really good variations on the above, each
of which is effective only so long as every program uses the mechanism
every time.  But then again there is nothing you can really do along
these lines to protect yourself from rogue programs.
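
The open / lock / check PID / rewrite / unlock sequence can be sketched
roughly as follows in Python.  This is one possible reading of the
procedure, not a reference implementation: the names pid_alive and
claim_pidfile are invented, the liveness test via kill(pid, 0) is an
assumption that only makes sense when all processes run on the same host,
and the final "remove file" step is left to the caller.

```python
import fcntl
import os

def pid_alive(pid):
    """Best-effort check that a process with this PID exists locally."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)          # signal 0: existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True              # exists, but owned by someone else
    return True

def claim_pidfile(path):
    """Return True if this process now owns the PID file, else False."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o666)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)       # wait for exclusive access
        raw = os.read(fd, 32).strip()
        owner = int(raw) if raw.isdigit() else 0
        if owner != os.getpid() and pid_alive(owner):
            return False                      # recorded process is current
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, b"%10d\n" % os.getpid())  # fixed-width PID record
        return True
    finally:
        os.close(fd)                          # closing releases the lock
```

All validation happens while the record lock is held, so two starters
cannot both decide the old PID is stale and overwrite each other.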

*******************************************************************
Robert C. White Jr.    |   Not some church, and not the state,
Network Administrator  |      Not some dark capricious fate.
National University    |   Who you are, and when you lose,
crash!nusdecs!rwhite   |      Comes only from the things you chose.
(619) 563-7140 (voice) |                             -- me.
*******************************************************************

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/26/90)

Here's a way to simulate safe locks without flock() or lockf().

1. Create wantlock.time.pid, filling in the current time and pid in
whatever format. Put enough information in it that a later process can
detect if you crash. (If the only reason for such a crash is system
failure, and if each system failure is logged in a central file, it's
enough to leave wantlock empty and depend on the timestamps. Otherwise
you should create the file as temp.time.pid, then write the necessary
information, then move it to wantlock.time.pid.)

2. Read the directory. If the first wantlock (in order of time, and then
in order of pid) is yours, go ahead and use the resource. Otherwise
sleep a second and repeat this step. Make sure to remove (ignoring
ENOENT) any file (before yours in time order) whose process has crashed.
This step requires a semblance of intelligence in your directory-reading
mechanism---no matter what other operations are going on, the system
must guarantee that you see every existing file that was created before
you opened the directory, and that you won't keep reading filenames
forever if some other process keeps creating files.

3. When you are finished with the resource, remove the wantlock file.

One strategy for implementing #1 is to create wantlock as a named pipe.
Then #2 opens the pipe in nonblocking mode to see if the process is
still alive.

An alternative way to detect crashes is to clear the locks on each
reboot, but unless the resource is centralized this can be a pain.
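
For concreteness, a rough Python sketch of the ordering part of the three
steps.  It is deliberately incomplete: crash detection and removal of
stale files (the hard part of step 2) are omitted, the nanosecond
timestamp format is my own choice, and all function names are invented
for illustration.

```python
import os
import time

PREFIX = "wantlock"

def request(lockdir):
    """Step 1: create wantlock.<time>.<pid> and return its name."""
    name = "%s.%020d.%d" % (PREFIX, time.time_ns(), os.getpid())
    fd = os.open(os.path.join(lockdir, name),
                 os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    os.close(fd)
    return name

def parse(name):
    """Return (time, pid) for a wantlock file name, else None."""
    parts = name.split(".")
    if len(parts) != 3 or parts[0] != PREFIX:
        return None
    if not (parts[1].isdigit() and parts[2].isdigit()):
        return None
    return int(parts[1]), int(parts[2])

def holder(lockdir):
    """Step 2: the earliest request by time, then pid, or None."""
    tickets = [n for n in os.listdir(lockdir) if parse(n) is not None]
    return min(tickets, key=parse) if tickets else None

def acquire(lockdir, poll=1.0):
    """Wait until our own request is first in line."""
    mine = request(lockdir)
    while holder(lockdir) != mine:
        time.sleep(poll)   # a real version would also reap crashed owners
    return mine

def release(lockdir, name):
    """Step 3: remove our wantlock file."""
    os.unlink(os.path.join(lockdir, name))
```

A caller would do name = acquire(dir), use the resource, then
release(dir, name).  Without the stale-file cleanup this sketch blocks
forever behind a crashed process, which is exactly the case the text
above says must be handled.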

In article <1990Nov25.014317.11660@nusdecs.uucp> rwhite@nusdecs.uucp (0257014-Robert White(140)) writes:
> In article <26251:Nov2119:42:1090@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >In article <274975C4.21C@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
> >> According to rwhite@nusdecs.uucp (0257014-Robert White(140)):
> >> >Do the "put process number in lock file" thing as in uucp.
> >> For you doubters: Once you've determined that a lock file is stale and
> >> you want to unlink() it, how do you know the lock file hasn't been
> >> replaced in between the decision to unlink() and the unlink() itself?
> >The old name will never be reused, so there's no danger of replacement.
> The above is (except for my comment) incorrect.

Uh, you left out what I said before that ``incorrect'' statement: viz.,
if you include appropriate information in the name of the lock file,
Chip's objections disappear. In the situation he describes, the old name
will never be reused, so there's no danger of replacement.

In other words, you take advantage of the kernel's synchronization
mechanisms for directories. This is often easier than using higher-level
lock mechanisms. (The three-step method above is pretty simple.)

> The application of the
> basic rules of computer science reveals the following procedure:
> Given:  A singular lock file name (multiple names is a waste of effort).

What are ``the basic rules of computer science''? Using multiple names
solves the problem, so obviously it isn't a waste of effort. Not that
there's anything wrong with the method you outline, but kernel locks are
not necessary to implement higher-level locks.

> Lock the file for exclusive Access.
> 	<Once you own the lock on the file there is no chance of>
> 	<conflict during lock validation.  The requestors will>
> 	<be FIFO(ed) by the locking mechanism.

Actually, there's no guarantee that the other processes are handled FIFO
by the kernel lock (at least on some UNIX systems). The method I outline
does guarantee FIFO behavior.

---Dan

chip@tct.uucp (Chip Salzenberg) (11/28/90)

According to rwhite@nusdecs.uucp (0257014-Robert White(140)):
>In article <274975C4.21C@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>> According to rwhite@nusdecs.uucp (0257014-Robert White(140)):
>> >Do the "put process number in lock file" thing as in uucp.
>> For you doubters: Once you've determined that a lock file is stale and
>> you want to unlink() it, how do you know the lock file hasn't been
>> replaced in between the decision to unlink() and the unlink() itself?
>
>The above is (except for my comment) incorrect.  The application of the
>basic rules of computer science reveals the following procedure:

Mr. White thus reveals that what he lacks in reading comprehension he
makes up for in pomposity.

Had he read the first part of my article, he would have noticed the
qualification emphasized below:

In article <274975C4.21C@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
> This approach cannot be used reliably without either (1) having a lock
> held forever by a dead process, which I find unacceptable, or (2)
> using a kernel locking call, which renders the pid file redundant.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is excruciatingly obvious that kernel locking calls are sufficient
to provide serial access to a file, thus guaranteeing lock file
integrity without requiring unlinking and linking.  Exclusive access
based on kernel record locking primitives is child's play.

In any case, my point was that creating a lock file, and then updating
it safely using kernel locking primitives, is silly if the resource in
question can be locked directly.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
    "I've been cranky ever since my comp.unix.wizards was removed
         by that evil Chip Salzenberg."   -- John F. Haugh II

det@hawkmoon.MN.ORG (Derek E. Terveer) (11/28/90)

afsipmh@cid.aes.doe.CA (Patrick Hertel) writes:

> You are only partly right since you can't use extant in the context of
> "if an extant...". Therefore the word is:
>	   existing

Huh?  In my dictionary, "extant" is billed as an adjective.  Since it is
modifying the noun "file", you should be able to use it in this context.
Substitute another adjective, like "huge" or "red":

	if a huge file is open
	if a red file is open
	if an extant file is open

Same thing.

derek
-- 
Derek Terveer						det@hawkmoon.MN.ORG
Minnesota Field Hockey Association,  North Central Section
University of Minnesota Women's Lacrosse,  Midwest District