[comp.sw.components] Concurrent Priority Queues

wtwolfe@hubcap.clemson.edu (Bill Wolfe) (09/04/89)

This is a review of the article "Concurrent Operations on Priority Queues",
by Douglas W. Jones, which appeared in the January 1989 issue of 
Communications of the ACM (Volume 32, Number 1).  The article describes 
a concurrent implementation of the priority queue structure known as the 
"skew heap".  It begins with the algorithms used for the manipulation of 
the sequential version, and applies a series of transformations which 
ultimately yield algorithms permitting items to be enqueued and dequeued 
concurrently.  The correctness of the algorithms is discussed, and the 
article closes with remarks on the limits of the available concurrency 
and on a workaround which minimizes the problems associated with machines 
on which process creation is expensive.

The skew heap was selected because most other data structures commonly 
used to implement the priority queue require that the root of a tree 
structure remain locked throughout each operation.  The one remaining 
candidate was the implicit heap: the bottom-up-enqueue, top-down-dequeue 
version can encounter deadlocks (although the problem is described as 
possibly being avoidable), and the exclusively top-down version requires 
excessive time for most access patterns.  A reference is given to a 
different scheme for applying concurrency to implicit heaps, but 
without comment.

After the review of other possible data structures, the top-down variant of
the skew heap was selected.  This data structure's manipulation algorithms
give balanced insertions when the inserted item is of lower priority than 
any other item in the subtree, but produce imbalance when the inserted item
is of higher priority than any other item in the subtree; specifically, a 
sequence of insertions in descending order of priority will result in a 
perfectly balanced tree, while a sequence of insertions in increasing order 
of priority will result in a left linear degenerate tree.  This implies that 
operations on the structure are individually O(n), although Tarjan has shown 
that the amortized complexity is Theta(log2 n).  Since at least one of the 
other data structures (the binomial forest) is O(log2 n) (and in the case of 
the dequeuing operation, Omega(log2 n)), this represents an increase in 
unpredictability relative to existing sequential technology. 
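The top-down merge which drives both operations can be sketched in a few 
lines; the following is a minimal sequential sketch in Python (the names 
are mine, not the article's), in which a numerically smaller key stands 
for a higher priority, matching the convention discussed later in this 
review:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def meld(a, b):
    # Top-down skew heap merge: the smaller root wins; children are
    # swapped unconditionally on every step of the descent, which is
    # what yields the amortized Theta(log2 n) bound.
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    # New left subtree = old right subtree melded with b;
    # new right subtree = old left subtree (the swap).
    a.left, a.right = meld(a.right, b), a.left
    return a

def insert(root, key):
    # Insertion is just a meld with a one-node heap.
    return meld(root, Node(key))

def delete_min(root):
    # Dequeue: remove the root, meld its two subtrees.
    return root.key, meld(root.left, root.right)
```

Both operations descend along a single root-to-leaf path, which is what 
makes the lock-coupled concurrent variant described below possible.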

The concurrent version of the top-down skew heap sets locks so that the 
node currently being operated upon is always locked, and the child into 
which descent is about to occur is locked immediately before the lock 
on the current node is released.  This technique is known as 
lock-coupling, and it permits less concurrency than the two alternative 
techniques, the link technique and the give-up technique (all three 
techniques are analyzed in "Concurrent Search Structure Algorithms", 
ACM Transactions on Database Systems, Vol. 13, #1, an overview of which 
appeared in an earlier posting).  A further restriction on concurrency 
exists in that access to the structure is serialized by virtue of the need 
to start by acquiring exclusive access to the structure's root node; this 
contrasts unfavorably with technologies which offer serializability (a 
guarantee that the effect will be equivalent to some serial ordering of 
the presented operations, despite the fact that operations may in fact 
proceed completely in parallel) without resorting to serialization.
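The hand-over-hand discipline can be illustrated with a small sketch 
(Python threads with one lock per node; the tree shape and the function 
name are mine, chosen only for illustration):

```python
import threading

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.lock = threading.Lock()

def leftmost_key(root):
    # Lock-coupling descent: acquire the child's lock BEFORE releasing
    # the current node's lock, so that a later process can never
    # overtake an earlier one along the same root-to-leaf path.
    root.lock.acquire()
    node = root
    while node.left is not None:
        child = node.left
        child.lock.acquire()   # lock the child first ...
        node.lock.release()    # ... only then release the parent
        node = child
    key = node.key
    node.lock.release()
    return key
```

Note that at most two locks are held at any instant, and every process 
must begin by contending for the root's lock, which is precisely the 
serialization complained of above.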

The essential idea is that after beginning at the root, each operation can
descend into the tree concurrently; beyond contention for the root of the
structure, there will not be further contention unless one process happens
to take a path which leads it to a place where some earlier process is still
working (slowly), in which case it is necessary to wait for the earlier 
process to release its lock on the desired node before further descent can
occur.  Processes will eventually reach the bottom of the tree and exit the 
system.

The author implemented the structure and tested it by initially inserting
25 items and then activating 3 producers and 3 consumers.  Insertions were
done in decreasing priority order, which is the ordering that results in
perfectly balanced tree structures (as opposed to the degenerate left 
linear structures which result when increasing priority order is used).  
With this perfectly balanced tree structure, the number of processes 
concurrently active in the structure was usually at least 2 and ranged up 
to 5, demonstrating that the structure does indeed provide some 
opportunities for concurrency.  One problem was that process creation 
was apparently a rather expensive operation on the author's machine; 
this motivated a second version featuring a fixed number of permanently 
active support processes which perform any desired operations upon 
request.  This, of course, results in support process contention 
whenever the queue size and the demand for operations generate more 
work than the fixed number of support processes can handle, and it 
therefore constitutes yet another undesirable limitation upon the 
level of available concurrency. 
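The fixed-pool workaround might be sketched as follows (Python threads 
standing in for processes; a heapq-based queue behind a single condition 
variable stands in for the lock-coupled skew heap, so this sketch 
deliberately serializes more than the article's structure does):

```python
import heapq
import queue
import threading

heap = []
cv = threading.Condition()   # guards the shared heap
requests = queue.Queue()     # operations submitted by clients
results = queue.Queue()      # values handed back by dequeues

def support_process():
    # Permanently active worker: clients pay no process-creation cost
    # per operation, at the price of contention for the fixed pool.
    while True:
        op, arg = requests.get()
        if op == "stop":
            return
        with cv:
            if op == "enq":
                heapq.heappush(heap, arg)
                cv.notify()
            else:  # "deq": wait until an item is available
                while not heap:
                    cv.wait()
                results.put(heapq.heappop(heap))

workers = [threading.Thread(target=support_process) for _ in range(3)]
for w in workers:
    w.start()
for k in (4, 1, 3):
    requests.put(("enq", k))
for _ in range(3):
    requests.put(("deq", None))
drained = sorted(results.get() for _ in range(3))
for _ in workers:
    requests.put(("stop", None))
for w in workers:
    w.join()
```

With three workers and more than three pending operations, requests 
queue up behind the pool, which is exactly the contention limitation 
noted above.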

The algorithms presented were moderately readable.  Two improvements 
could have been made: writing an Exchange procedure rather than simply 
presenting three lines of code without any indication of what was being 
done, and noting explicitly that the algorithm's comparison of 
priorities relied upon a *numerical* comparison in which higher 
priorities were associated with numerically smaller values.  As 
written, the < operation appears to test whether the first operand's 
priority is less than that of the second operand, when in fact the 
reverse is the case.
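In Python terms, the two suggestions amount to something like the 
following (the names higher_priority and exchange are mine; the article 
does not use them):

```python
class Node:
    def __init__(self, priority, left=None, right=None):
        self.priority = priority
        self.left = left
        self.right = right

def higher_priority(p, q):
    # The encoding assigns numerically SMALLER values to MORE urgent
    # items, so a bare "p < q" actually asks whether p is the HIGHER
    # priority; naming the test makes the inversion explicit.
    return p < q

def exchange(node):
    # A named operation in place of an unexplained three-line swap
    # of a node's subtree pointers.
    node.left, node.right = node.right, node.left
```

Neither change affects the algorithm; both merely document intent that 
the reader would otherwise have to reverse-engineer.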

A final technical point to be considered is the reliance of the author's
solution on a specific parameter passing mechanism, namely, pass by 
reference.  This mechanism is hard-wired in Concurrent Pascal-S (the 
author's implementation language); Ada defines its "in out" parameters 
to be independent of the underlying parameter passing mechanism, which 
could be pass by reference, copy-in/copy-out, or any of a number of other 
mechanisms.  This problem would have to be addressed in a generic Ada
implementation by passing instead a pointer to the pointer which is to 
be modified; this would enforce the author's assumptions and enable the 
structure to be implemented correctly, regardless of the actual (compiler- 
and/or machine-dependent) parameter passing mechanism.
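The pointer-to-a-pointer remedy can be mimicked in Python, where a 
one-element list serves as an explicit reference cell (this is only an 
analogy for the Ada access-to-access approach, not the article's code):

```python
def replace_root(cell, new_root):
    # The callee mutates cell[0] rather than rebinding a parameter, so
    # the update reaches the caller no matter how the language chooses
    # to pass the cell itself (by reference, copy-in/copy-out, etc.).
    cell[0] = new_root

root_cell = ["old tree"]
replace_root(root_cell, "new tree")
```

The algorithm then depends only on the mutation of the cell's contents, 
not on any particular parameter passing mechanism.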

Despite its imperfections, however, the concurrent top-down skew heap
does offer a limited amount of concurrency to the implementor of a
concurrent priority queue, and therefore should be a useful structure
until further advances occur in concurrent priority queue technology.


   Bill Wolfe, wtwolfe@hubcap.clemson.edu