[comp.databases] Distributed Database Locking

pjs269@tijc02.UUCP (Paul Schmidt ) (03/14/88)

Distributed Database Management Systems and locking

I am doing research into distributed databases for my Master's Thesis
and I have a few comments and questions.

First I feel there are two methods for locking distributed databases.
One is to lock all resources before a transaction is started and the
other is to lock resources as they are needed.  Locking all resources
first gaurantees deadlock prevention, but may lock resources that are
not needed, and keep them locked for longer than they are needed.  By
locking resources when they are needed, one can avoid unnecessary locks
but a deadlock detection scheme would have to be used.  (Resource, in
my view, is any peice of data; a relation, a tuple, or an attribute in
a tuple.)

Let me go over a few of the pros and cons of each method.
	
Locking of all resources at the beginning of the transaction has the
following advantages:  ease of implementation and deadlock prevention.
This will be easier to implement since a deadlock detection algorithm
and transaction rollback scheme does not have to be implemented.
Deadlock is prevented in this scheme.

Locking of all resources at the beginning of each trasaction has the
following disadvantages:  possibility of wasted processing time at a
node, all resources must be locked on transaction invocation, and
unneeded resources may be locked.  Wasted processing time can occur
when a node has locked a resource but is not currently using it.
Not all transactions can be clearly defined when they are invoked so
forcing all resources to be locked at invocation will make some applications
difficult, if not impossible, or may force the locking of unneeded 
resources.

Locking of resources as needed has the following advantages:  Resources
are locked only when being used, and the total transaction does not need to
be known at the start of the transaction.   Since resources are only locked
when they are being used, processing time will not be lost when another
process is waiting on that resource.

Locking of resources as needed has the following disadvantages:  A deadlock
detection and recovery scheme must be implemented, time may be wasted
when a possible deadlock is recovered from, and it is much harder to
implement.

It appears that the first proposal is the simplest way of implementing
a Distributed DBMS but it has some serious drawbacks.  Specifically, it
can hurt the performance of a distributed system by having unneeded
resources locked.  But the second proposal will have problems if deadlock
is detected often and the recovery takes longer than the time we saved.

Questions:

Are there other methods?

What are the distributed deadlock detection schemes?

How can one tell when one method is better than the other?

How do crash recovery and deadlock detection rollback relate?  Are most of
the recovery problems already solved by crash recovery?

Any references and comments on this subject would help further my 
research greatly.


	Paul Schmidt - East Tennessee State University
	502 West Pine Street
	Johnson City, TN 37604

	EMAIL: mcnc!tijc02!pjs269

speegle@im4u.UUCP (Greg Speegle) (03/17/88)

In article <207@tijc02.UUCP> pjs269@tijc02.UUCP (Paul Schmidt        ) writes:
>Distributed Database Management Systems and locking
>
>I am doing research into distributed databases for my Master's Thesis
>and I have a few comments and questions.
>
>First I feel there are two methods for locking distributed databases.

Non-locking mechanisms also exist, such as distributed timestamps.

>Locking of all resources at the beginning of the transaction has the
>following advantages:  ease of implementation and deadlock prevention.
>... transaction rollback scheme does not have to be implemented.

Transaction rollback must still be implemented in case of user aborts.
One of the big advantages (or disadvantages of the lock as needed approach)
is you don't have to check for non-existant deadlocks. Most deadlock 
detection schemes check whenever a deadlock MIGHT exist, thereby checking
when one doesn't exist.
 
>Questions:
>
>Are there other methods?

See above.

>What are the distributed deadlock detection schemes?

There are so many distributed deadlock detection schemes out in the world
that to even try to list them all would take forever. Check Sigmod and 
VLDB for the last 5 years (maybe more!) and you'll find plenty. Be careful,
though, as incorrect algorithms are sometimes published.

>How can one tell when one method is better than the other?

This is an area of current research and depends on what you mean by better.
Some are better in terms of messages sent, others are better in terms of 
time before a deadlock is detected. You'll have to establish your own 
criteria for what is better. Another idea would be to implement different
algorithms and see which worked better. Does anyone know if this has
been tried? I would think it has, but I don't know.

>How do crash recovery and deadlock detection rollback relate?  Are most of
>the recovery problems already solved by crash recovery?

A deadlock detection rollback is a system determined abort of a transaction.
Any changes to the database must be "undone" by the system. However, some
methods of crash recovery are based on the concept of "redo"-ing the 
transaction, while others are based on an "undo" scheme. So the answer
is maybe, depending on the crash recovery algorithm used.

>Any references and comments on this subject would help further my 
>research greatly.

The best reference is the book by Ceri and Pelagatti(sp?) which is called
"Distributed Databases" or something very similar. They cover all of the
basics of concurrency, recovery and design of distributed databases.

>	Paul Schmidt - East Tennessee State University
>	502 West Pine Street
>	Johnson City, TN 37604
>
>	EMAIL: mcnc!tijc02!pjs269

Greg Speegle
-- 
{seismo,ihnp4,pyramid}!ut-sally!speegle		speegle@sally.utexas.edu

john@geac.UUCP (John Henshaw) (03/18/88)

In article <207@tijc02.UUCP>, pjs269@tijc02.UUCP (Paul Schmidt) writes:
> First I feel there are two methods for locking distributed databases.
> [much text deleted]

I'll confine most of my comments to LOCKING strategies. There are of course,
several non-locking approaches, but I assume that you are focusing on
locking only. Some comments further down reference other methods...

The preemptive locking strategy (PRE) has the benefit that deadlock
cannot occur. However, we rejected it as a viable method on "practical
grounds". In particular, PRE requires that the programmer know *all* the
resources that will be required *prior* to the initiation of the
transaction. This is rarely the case. As far as I know, all non-PRE
locking schemes require deadlock detection, which is a pain, but a
necessary one.

PRE will lock resources at the earliest point in time (transaction
initiation) and hold them for the duration of the transaction. This
means that the "growing phase" of the 2-phase locking activity happens
very quickly, and the transaction can proceed without ever being
blocked. This suggests that PRE may hold transaction locks for a SHORTER
time than other schemes, not longer. I suspect that the real answer to
this issue could be resolved via simulation and that the major elements
affecting this would be the probability of conflict between
transactions, and the level of concurrency (number of transactions
executing at once).

You write:
>                                                                    By
> locking resources when they are needed, one can avoid unnecessary locks
> but a deadlock detection scheme would have to be used.

Could you please explain what is meant by an "unnecessary lock"? I
imagine that you really meant to address the *timely* initiation of
locks so that objects (resources) are locked for as short a period as
possible. Otherwise, I will disagree: "If ya gotta lock it, ya gotta
lock it...", hence all locks *are* necessary.

You write:
> Locking of all resources at the beginning of the transaction has the
> following advantages:  ease of implementation and deadlock prevention.
> This will be easier to implement since a deadlock detection algorithm
> and transaction rollback scheme does not have to be implemented.

PRE, while deadlock free, does not free the implementor from transaction
rollback. While it is clear that deadlock requires transaction rollback,
there are other reasons why one may roll back a transaction. For
example, a end-user may deliberately choose to ABORT a transaction, or the
application software may decide, due to various conditions (data
dependencies, data values, etc.) to ABORT. Either case requires rollback.

You write:
> Locking of resources as needed has the following advantages:  Resources
> are locked only when being used, and the total transaction does not need to
> be known at the start of the transaction.   Since resources are only locked
> when they are being used, processing time will not be lost when another
> process is waiting on that resource.

This description is somewhat inexact. Resources, once locked, are locked
until Commit time of the transaction. Having locked a resource/object,
other processes will be blocked accessing that object. (This assumes
that the object has been either written, or is assumed to be updatable
at the time of lock initiation.) You point out a major advantage of
standard 2 phase locking -  that the programmer need not know which
objects/resources will be locked at the Start Transaction juncture. It
might be useful in your analysis to distinguish between the types of
2 phase locking that you have in mind. For example, when a lock is
granted on an object which is read, do you assume that you are going to
update it? or do you convert read locks to write locks at update time?

A reasonably good paper involving this issue is: "The Performance of
Concurrency Control Algorithms for Database Management Systems" by
Michael Carey, and Michael Stonebreaker, 1984, VLDB Publications
(Proceedings of the Tenth International Conference on Very Large
Databases). It introduces a useful simulation framework that you may
find helpful in your efforts. 

You ask:
> Questions:
> 
> Are there other methods?

Sure: tree-locking, timestamp, optimistic, and multiversion are all
written up. I suggest that you check out the library, looking at the
last few years of: "Transactions on Database Systems". If you can get
hold of "Distributed Systems: Volume 2", edited by Wesley Chu, ISBN
0-89006-213-7, chapter 2 talks about concurrency control, and other
chapters address several case studies. Author names that may be useful
are: Bernstein, Carey, Chu, Eager, Ellis, Goodman, Gray, Kung, Moss, 
Obermarck, Sevcik, Stonebreaker, Robertson, and Weiderhold, to name only
a few!

> What are the distributed deadlock detection schemes?

The two I know about are "wait-for-strings", and "wait-for-graphs". Most
systems that do deadlock detection, don't check for deadlock every time
a lock is placed. Instead, a transaction, when blocked, times out, and
then the check is made. In the distributed case, the lifetime of the
average transaction may be quite long, and communication delays compound
the difficulties that this approach already has. This approach of
waiting is done in order to minimize the cost of checking graphs for
cycles - usually a CPU-intensive process.

> How can one tell when one method is better than the other?

Good question. I think the answer to this question depends a lot on your
criteria - something that differs for each situation. Reliability,
robustness, efficiency, correctness, etc., etc., etc.... are all in the
list somewhere. Judging DISTRIBUTED locking schemes is like judging
CENTRALIZED locking schemes, except a few of the base rules/assumptions
are changed.

> How do crash recovery and deadlock detection rollback relate?  Are most of
> the recovery problems already solved by crash recovery?
> 

Judging from your question, you are more interested in Distributed
Transaction Management - not just "distributed locking". The "short"
answer to this question is "yes", but the implementation details are
where the real heart of the matter lies.


Good luck with your work Paul.

-john-
-- 
John Henshaw,			(mnetor, yunexus, utgpu !geac!john)
Geac Computers Ltd.		If we don't pay for education now, are we
Markham, Ontario, Canada, eh?	going to be able to pay for ignorance later?

lchirica@polyslo.UUCP (Laurian Chirica) (03/21/88)

In article <2465@geac.UUCP> john@geac.UUCP (John Henshaw) writes:
>
>In article <207@tijc02.UUCP>, pjs269@tijc02.UUCP (Paul Schmidt) writes:
>> First I feel there are two methods for locking distributed databases.
>> [much text deleted]
>
>I'll confine most of my comments to LOCKING strategies. There are of course,
> [Much more good stuff deleted]

>You write:
>>                                                                    By
>> locking resources when they are needed, one can avoid unnecessary locks
>> but a deadlock detection scheme would have to be used.
>
>Could you please explain what is meant by an "unnecessary lock"? I
>imagine that you really meant to address the *timely* initiation of
>locks so that objects (resources) are locked for as short a period as
>possible. Otherwise, I will disagree: "If ya gotta lock it, ya gotta
>lock it...", hence all locks *are* necessary.

I am sure Paul can answer the question for himself but I volunteer
the following explanation: What if the set of objects to be locked depends 
on some logical condition, say "if condition A then lock set X else lock
set Y."  Locking all resources at the beginning of a transaction require
that we lock both sets X and Y.

> [A lot more good stuff deleted]

I fully agree with all the rest of your article, John.  

One more note, Jim Gray of Tandem (ex-IBM'er of IMS, System R fame and
Transaction Management guru) wrote  a paper sometime ago exposing the
idea of deadlock prevention as a "straw man."  If I remember correctly,
he had analyzed some transaction logs from IBM installations which showed
that the probability of deadlock was so small that it was not worth
the expense of preventing deadlock.  He, essentially said that we are
better off (from a performance standpoint) to let it happen, and rollback
one of the transactios involved in the deadlock.  As you pointed out John,
the rollback is needed anyway for recovery purposes.

-- Laurian

-- 
Laurian M. Chirica  (lchirica@polyslo.UUCP)
Computer Science Department
California Polytechnic State University (CAL POLY)
San Luis Obispo, CA 93407 - (805) 756-1332