[net.unix] Results of Locking Poll

hokey@plus5.UUCP (Hokey) (10/15/84)

Early in June I sent out a poll on the net to ask people what they wanted
in the way of file and/or record locking for Unix.  I received about 25
responses before I had to leave for Wherever the conference was, which
was also the site of a /usr/group Standards meeting.

I apologize for the delay in posting the results of my study; the effort
was so frustrating that I get quite pissed off every time I think about it.
Anybody who has heard me pitch for mod.std.{c,mumps,unix} will probably
understand why I feel the use of these groups is so important.

Anyway, here we go:

1) Do you ever need a mutual exclusion mechanism?
	This was a trick question.  Everybody who replied said yes.
	Some suggested differentiating between semaphores and database
	locking.  They are right, but the same tool can *occasionally*
	be used for either purpose.

2) Do you prefer file or record locking?  Why?
	12 said File locking,
	 3 said Record locking,
	 3 said either, and
	13 said both, although most said file locks were usually sufficient.

3) Do you ever need (logical) transaction locks instead of a "simple"
file lock?
	17 said Yes
	 3 said No
	11 didn't know what I meant.

	Transaction locks can be thought of as "logical" locks, in that
	they apply over several files.  For example, a logical lock on
	"login" files would cover both /etc/passwd and /etc/group.
	Specifically, a logical lock encompasses all files which are
	part of the current "transaction".  They can be advisory or
	compulsory.  Advisory is fine in a controlled environment.
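
A minimal sketch of an advisory logical lock, assuming 4.2BSD-style flock()
as exposed by Python's fcntl module.  The names logical_lock and is_locked
are hypothetical, and the "passwd" and "group" files are stand-ins created
in a temporary directory, not the real /etc files.  Every cooperating
process must go through the same helper, since nothing here is compulsory.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def logical_lock(paths, exclusive=True):
    """Hold an advisory flock() on every file in `paths` for one transaction."""
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    handles = []
    try:
        # Lock in sorted order so cooperating processes never each hold
        # half of the set while waiting for the other half.
        for path in sorted(paths):
            fh = open(path, "rb")
            handles.append(fh)
            fcntl.flock(fh, mode)
        yield handles
    finally:
        # Closing each file drops the flock() held through it.
        for fh in reversed(handles):
            fh.close()


def is_locked(path):
    """Probe with a second open of `path`; True if an exclusive lock is refused."""
    with open(path, "rb") as probe:
        try:
            fcntl.flock(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        return False


def demo():
    workdir = tempfile.mkdtemp()
    paths = [os.path.join(workdir, name) for name in ("passwd", "group")]
    for path in paths:
        open(path, "w").close()
    with logical_lock(paths):
        during = [is_locked(p) for p in paths]
    after = [is_locked(p) for p in paths]
    return during, after
```

Both stand-in files are covered while the transaction is open and both are
free again once it ends.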

4) Do you wish to be able to have shared read locks as well as exclusive
write locks, or is "exclusive lock or no lock" sufficient?  Why?
	 2 said "All or nothing".  One of these said they only used
	  them for hardware devices, in any event.
	27 said they wanted shared/exclusive locks.  One said semaphores
	  were the answer for them.
	 2 didn't understand the question (their answers made no sense to me)
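
For readers unsure what the question meant, here is a small sketch of the
shared/exclusive distinction, assuming 4.2BSD-style flock() through Python's
fcntl module.  The helper try_lock and the temporary file are hypothetical.
Each open() stands in for a separate process, since flock() locks belong to
the open file rather than to the caller.

```python
import fcntl
import tempfile


def try_lock(fh, mode):
    """Attempt a non-blocking flock(); return True if the lock was granted."""
    try:
        fcntl.flock(fh, mode | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def demo():
    with tempfile.NamedTemporaryFile() as tmp:
        reader_a = open(tmp.name, "rb")
        reader_b = open(tmp.name, "rb")
        writer = open(tmp.name, "ab")
        results = {
            # Any number of shared (read) locks may coexist.
            "first_reader": try_lock(reader_a, fcntl.LOCK_SH),
            "second_reader": try_lock(reader_b, fcntl.LOCK_SH),
            # An exclusive (write) lock is refused while readers hold on.
            "writer_while_read": try_lock(writer, fcntl.LOCK_EX),
        }
        fcntl.flock(reader_a, fcntl.LOCK_UN)
        fcntl.flock(reader_b, fcntl.LOCK_UN)
        results["writer_after_release"] = try_lock(writer, fcntl.LOCK_EX)
        # While the writer holds the file, even a shared lock is refused.
        results["reader_while_write"] = try_lock(reader_a, fcntl.LOCK_SH)
        for fh in (reader_a, reader_b, writer):
            fh.close()
        return results
```

An "exclusive lock or no lock" scheme corresponds to using only LOCK_EX
above, so the second reader would have been turned away as well.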

5) Do you ever do any recursive file operations in which you need a third
type of lock: shared read with the option to convert to exclusive write?
This lock is useful in multiway tree insertion or deletion, where node
changes might propagate up the tree to the root, but there is no need to
prohibit shared read access to a node until exclusive access is required
in order to change its contents.
	19 said Yes,
	 4 said no, and they never would want it,
	 7 said no, but they might want it later, maybe.
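
The third lock type can be sketched in-process as follows.  This is only an
illustration of the semantics, built on Python threads rather than on any
Unix file-locking call; the class UpgradableLock and its method names are
hypothetical.  It assumes that one holder at a time may take the
"read with option to convert" mode alongside ordinary readers, and that the
conversion waits for those readers to drain.

```python
import threading


class UpgradableLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0         # ordinary shared readers
        self._updater = False     # the single "may convert later" holder
        self._exclusive = False   # set while a conversion is pending or held

    def acquire_read(self, timeout=None):
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._exclusive, timeout)
            if ok:
                self._readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_update(self, timeout=None):
        """Shared read access plus the sole right to convert to exclusive."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._updater and not self._exclusive, timeout)
            if ok:
                self._updater = True
            return ok

    def upgrade(self, timeout=None):
        """Convert the update lock to exclusive; caller must hold it."""
        with self._cond:
            self._exclusive = True            # turn away new readers
            ok = self._cond.wait_for(lambda: self._readers == 0, timeout)
            if not ok:
                self._exclusive = False       # give up, let readers back in
                self._cond.notify_all()
            return ok

    def release_update(self):
        with self._cond:
            self._updater = False
            self._exclusive = False
            self._cond.notify_all()
```

In the tree case, an inserter would hold the update mode on a node while
descending and call upgrade() only if a split actually reaches that node.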

That is it.  After I stripped out useless header info, the file containing
these replies is over 60Kbytes in length.  That's about 2K per respondent!

I told the /usr/group committee the results of this study, and the response
was "Too late.  They should have said so earlier.  We already printed the
draft standards and the votes are coming in.  We don't want to change it."
John Bass tells me that shared reads are hard to do under the currently
proposed lockf mechanism (1 process per "area" in the file.  This "area"
can extend to EOF.).  I don't yet see how this is difficult, but John
hasn't told me what the problem is.  Note that this mechanism provides
only all-or-nothing locks, even though shared/exclusive access was the
one point on which almost everybody who responded agreed.

Please use the mod.std groups.  They are a much better way to make your
voice heard than hoping Somebody who reads net.unix or net.lang.c will
take your idea and run with it.  We all have to live with the results.
-- 
Hokey           ..ihnp4!plus5!hokey
		  314-725-9492

rcd@opus.UUCP (Dick Dunn) (10/19/84)

(Original only appeared in net.unix.)
> Early in June I sent out a poll on the net to ask people what they wanted
> in the way of file and/or record locking for Unix.
>...
> John Bass tells me that shared reads are hard to do under the currently
> proposed lockf mechanism (1 process per "area" in the file.  This "area"
> can extend to EOF.).  I don't yet see how this is difficult, but John
> hasn't told me what the problem is.  Note that this mechanism provides
> only all-or-nothing locks, as opposed to shared/exclusive access,

The one-process-per-area restriction seems almost by definition to preclude
shared reading.  Without shared read, it's not clear how useful the locking
could be for databases--which are the most likely major candidate.
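
The one-process-per-area behaviour can be sketched as below.  This is an
approximation, not the draft lockf itself: it uses Python's fcntl.lockf,
which sits on top of fcntl() record locks, and takes only exclusive locks
to mimic the all-or-nothing proposal.  The helper probe_from_child and the
offsets are hypothetical.  A forked child does the probing because these
locks are owned per process.

```python
import fcntl
import os
import tempfile


def probe_from_child(path, start, length):
    """Fork a child that tries to lock the area; True if it was granted."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            with open(path, "r+b") as fh:
                fcntl.lockf(fh, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start)
                status = 0
        except OSError:
            status = 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def demo():
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"\0" * 200)
        tmp.flush()
        with open(tmp.name, "r+b") as owner:
            # Length 0 means "from offset 100 through end of file".
            fcntl.lockf(owner, fcntl.LOCK_EX, 0, 100)
            inside = probe_from_child(tmp.name, 150, 10)
            outside = probe_from_child(tmp.name, 0, 50)
            fcntl.lockf(owner, fcntl.LOCK_UN, 0, 100)
            released = probe_from_child(tmp.name, 150, 10)
    return inside, outside, released
```

The second process is refused anywhere inside the locked area, reader or
not, which is the restriction at issue; areas outside it stay available.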

As to the all-or-nothing aspect, I don't quite see how that can work out.
It would seem that there would be a restriction on permissions before you
can lock a file.  But with an all/nothing lock, you either
	- require write permission and deny processes which only have/need
	  read permission the right to read-lock a file to prevent
	  modification while they're reading it.
	- require only read permission and allow processes to gain
	  exclusive access to files which they don't "own" (in a colloquial
	  sense).

Doesn't sound very good.  Clarification, please?
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Lately it occurs to me what a long, strange trip it's been.

hokey@plus5.UUCP (Hokey) (10/23/84)

There has almost been a small mail discussion going on between me, John
Bass, and a gentleman named Radek (at HP in Germany) regarding these issues.
Here is the letter John recently sent to us.  Some of my comments are
interspersed.

> Date: Sat, 20 Oct 84 12:30:32 pdt
> From: wucs!ihnp4!hpda!dmsd!bass
> To: hpda!hpfcla!hpbbn!hpbbla!radek
> Subject: Re:  shared locking
> Cc: hpda!hpfcla!ihnp4!wucs!gang!plus5!hokey
> 
>>  Date: Fri, 19 Oct 84 03:04:16 pdt
>>  From: hpda!hpfcla!hpbbn!hpbbla!radek
>>  To: hpbbn!hpfcla!hpda!dmsd!bass
>>  Subject: shared locking
>>  Cc: hpbbn!hpfcla!ihnp4!wucs!gang!plus5!hokey
>>  
>>  Hi John,
>>  
>>  About 88% of the people who answered Hokey Stenn's survey
>>  wish to have shared file & record locking, in addition to
>>  exclusive locks.
>>  
>>  As far as I know, you expressed some concerns about the feasibility
>>  of implementing this feature.
>>  
>>  Would you be so kind as to tell me the major issues with that?
>>  
>>  Thanks.
>>  
>>  
>>  Radek Linhart			Hewlett-Packard GmbH
>>  dmsd!hpda!hpfcla!hpbbn!radek	P.O. Box 1430
>>  				D-7030 Boeblingen 1
>>  phone +49 7031 142052		W. Germany
>>  
> 
> Hi Radek,
> 
> 	I don't know anything about Hokey Stenn's questions or responses.
> I do know that when people are told that they MUST/CAN have shared locks the
> response is nearly 100% that they are required ... but when you ask why,
> what for, and how used ... the answers are nearly 100% confusion.
> 
> 	Your request has been the first on this issue in many months.
> There are many issues that say a general locking scheme should
> not have shared locks ... all of which revolve around a combination of
> implementation issues, usage scenarios, and safety/protection of the
> typical MIS Dept programmer.
> 
> 	Lockf record locks are simply SEMAPHORES that use files as the
> resource name space. They are good for handling typical semaphore problems.
> Lockf (or other general semaphore scheme) doesn't replace (or allow building)
> a data base file system using transactional access -- particularly n-way trees
> and multiple keys.
> 
> 	First there are AT LEAST two major forms of shared locks, and several
> minor forms.
> 
> 	1) A process needs to update a record in a database. The record is
> 	to be updated via operator interaction over an unconstrained time
> 	period ... it is necessary that other processes have unrestricted
> 	read access. Thus the interactive process locks the record as
> 	"shared", reads the record, edits it, and rewrites the record
> 	as a critical region. The critical region is normally formed by
> 	relocking the record as non-shared, writing the data, and unlocking
> 	the record. Depending on how the record is processed the write
> 	operation may be a critical region in some systems (i.e., the inode is
> 	locked for the duration of the read/write system call). This is neither
> 	required nor generally implemented for ALL classes of files.
[ This type of operation is often seen in airline reservation systems and
the like.  One solution is to read-lock the record, read and unlock the
record, do the work, write lock the record, and see if the record was
"significantly" changed in the interim.  If not, write and release the
record.  Otherwise, release the write-lock and tell the operator to try
again. ]
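
The read, unlock, work, write-lock, compare sequence described in the
bracketed comment can be sketched like this.  It assumes fixed-width
records and byte-range locks through Python's fcntl.lockf; RECORD_SIZE, the
helper names, and the seat record are all hypothetical, and "significantly
changed" is simplified to "any byte differs".  The demo runs both clerks in
one process, so it exercises the compare step rather than real contention.

```python
import fcntl
import tempfile

RECORD_SIZE = 32  # hypothetical fixed record width, in bytes


def read_record(fh, index):
    """Read-lock one record, copy it out, and unlock straight away."""
    offset = index * RECORD_SIZE
    fcntl.lockf(fh, fcntl.LOCK_SH, RECORD_SIZE, offset)
    try:
        fh.seek(offset)
        return fh.read(RECORD_SIZE)
    finally:
        fcntl.lockf(fh, fcntl.LOCK_UN, RECORD_SIZE, offset)


def update_if_unchanged(fh, index, seen, new):
    """Write-lock the record; write `new` only if it still equals `seen`."""
    offset = index * RECORD_SIZE
    fcntl.lockf(fh, fcntl.LOCK_EX, RECORD_SIZE, offset)
    try:
        fh.seek(offset)
        if fh.read(RECORD_SIZE) != seen:
            return False          # changed in the interim: operator retries
        fh.seek(offset)
        fh.write(new.ljust(RECORD_SIZE)[:RECORD_SIZE])
        return True
    finally:
        fcntl.lockf(fh, fcntl.LOCK_UN, RECORD_SIZE, offset)


def demo():
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"seat 12A: open".ljust(RECORD_SIZE))
        tmp.flush()
        clerk_a = open(tmp.name, "r+b", buffering=0)
        clerk_b = open(tmp.name, "r+b", buffering=0)
        seen_a = read_record(clerk_a, 0)   # both clerks see the seat open
        seen_b = read_record(clerk_b, 0)
        first = update_if_unchanged(clerk_b, 0, seen_b, b"seat 12A: SMITH")
        stale = update_if_unchanged(clerk_a, 0, seen_a, b"seat 12A: JONES")
        final = read_record(clerk_a, 0).rstrip()
        clerk_a.close()
        clerk_b.close()
        return first, stale, final
```

No lock is held while the operator is thinking, which is the point of the
scheme; the cost is that the slower clerk has to start over.
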
> 
> 	2) A process requires multiple data segments to PIECE together
> 	a data item ... it is critical to maintain PHASE with all data
> 	items ... the general ledger, receivables, inventory ... etc update
> 	problem ... AND ... traversal of an n-way tree or complex data base
> 	structure. Here the reader locks all required segments as shared to
> 	maintain order while ferreting out complex data sources. The sharing
> 	is used to allow other readers to access the data ... but to block
> 	writers.
> 
> NOTE that 1 implies a single process can read share lock a resource and
> 	has the unrevokable right to transition to non-shared at will.
> 
> NOTE that 2 implies all processes may lock a read shared resource and
> 	that there is no right to transition to non-shared status.
> 
> 	Thus at minimum there are 4 resource states, unlocked, non-shared,
> update-shared, and read-shared. There is also an interesting
> set of contention states that lead to deadlocks and race conditions.
> The code to implement them is highly system and use dependent and it
> is NOT a portable construct. I have modeled some portions of the code,
> but can not in my own mind rationally address the tradeoffs without
> addressing specific database implementations and usage patterns.
> In effect I feel that any result will be neither general nor portable.
> Most results will lead novice programmers (and most other programmers that
> don't grasp concurrency well) down the garden path to deadlocks and race
> conditions. The non-shared approach generally is brutal enough that
> they think about the major issues or find another approach.
> 
> 	The number 1 request for read-shared locks is to implement
> n-way tree type databases ... or multiply-keyed data bases. In general
> the implementors haven't thought out the impact of their request.
[     ************ If you mean users instead of implementors, I disagree.
The purpose of Users is to keep Implementors in line, and bitch about bad
implementations.  Once these concepts are learned, they're pretty easy to
use. ]
> 
> 	Ponder for an hour or two a b-tree or b+-tree with most upper
> nodes read-share locked due to normal access traffic and traversals.
> For a traversal the topmost node will be locked for the entire traversal.
> Thus the topmost node, and about 10-20% of the remaining nodes, are in
> effect live-locked (never really available for non-shared or update-shared
> access). This is a CRITICAL limitation in that a SIGNIFICANT class of data
> can not be DELETED or ADDED due to implied compaction and expansion of
> nodes. And in data bases with DATA in the nodes (not just in leaves) the
> upper nodes often may be difficult to obtain locks for update.
[ The solution to this problem is to give write lock requests a higher
priority than read locks. ]
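
Giving write requests priority can be sketched in-process as follows.  This
is an illustration with Python threads, not a kernel mechanism; the class
WriterPriorityLock and its methods are hypothetical.  The rule assumed here
is that a new reader is held back whenever any writer is already waiting,
so a steady stream of readers cannot keep a node permanently read-locked.

```python
import threading
import time


class WriterPriorityLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False
        self._waiting_writers = 0

    def writers_waiting(self):
        with self._cond:
            return self._waiting_writers

    def acquire_read(self, timeout=None):
        with self._cond:
            # New readers yield to any writer that is active OR queued.
            ok = self._cond.wait_for(
                lambda: not self._writer and self._waiting_writers == 0,
                timeout)
            if ok:
                self._readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self, timeout=None):
        with self._cond:
            self._waiting_writers += 1
            try:
                ok = self._cond.wait_for(
                    lambda: not self._writer and self._readers == 0, timeout)
            finally:
                self._waiting_writers -= 1
            if ok:
                self._writer = True
            else:
                self._cond.notify_all()   # let held-back readers re-check
            return ok

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def demo():
    lock = WriterPriorityLock()
    outcome = {"first_reader": lock.acquire_read(0)}
    granted = []
    writer = threading.Thread(
        target=lambda: granted.append(lock.acquire_write(5)))
    writer.start()
    deadline = time.monotonic() + 5
    while lock.writers_waiting() == 0 and time.monotonic() < deadline:
        time.sleep(0.01)
    # A reader arriving after the writer queued up is turned away.
    outcome["late_reader"] = lock.acquire_read(0)
    lock.release_read()
    writer.join(5)
    outcome["writer"] = granted == [True]
    lock.release_write()
    return outcome
```

The trade-off is the mirror image of the original problem: under a steady
stream of writers it is the readers who wait.
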
> 
> 	The learning process one-to-one exploring this problem is several
> hours -- with common goals. The learning process in a group with diverse
> goals is difficult -- MANY people AFTER the above explanation still
> don't understand the ramifications of items 1 & 2 and still think of
> a shared lock that is both 1 & 2 at the same time. I wonder how many
> of the 88% responding have a need that can only be handled with a shared
> lock -- and have thought through HOW they would use it and what its semantics
> would be. I would guess only a few.
> 
> 	A poll of how many folks would like a free X vs X+Y will generally
> be highly swayed to X+Y due to basic human nature.
> 
> 	What is generally needed are general semaphores (lockf) for the
> common problems -- and a kernel based transactional filesystem (database)
> for the tougher problems. The split is about 90%++ and less than 10%.
> Lockf was implemented to be both general and portable across many systems
> (including non-unix systems) -- I think lockf achieves that goal.
> Handling the remaining 10% of the application areas is significantly
> application specific or system specific.
> 
> 
> Hope this helps ... have fun
> 
> John Bass
> 
I, as one might guess, would prefer 4.2-style locks to be the standard,
as they are the fastest in operation.  During a phone conversation with
John Bass this afternoon he told me that time is not significant; the
existing lockf() mechanism on most PDP-11s and 68K machines requires
200 nsec of system time and 200 nsec of "real" work.  This is based on
knowledge that there are rarely more than 3-5 processes after the "same"
thing.  I *could* change my mind...
-- 
Hokey           ..ihnp4!plus5!hokey
		  314-725-9492