wtwolfe@hubcap.clemson.edu (Bill Wolfe) (09/04/89)
This is a review of the article "Concurrent Operations on Priority Queues", by Douglas W. Jones, which appeared in the January 1989 issue of Communications of the ACM (Volume 32, Number 1). The article describes a concurrent implementation of the priority queue structure known as the "skew heap". It begins with the algorithms used for the manipulation of the sequential version, and applies a series of transformations to those algorithms which ultimately result in a set of algorithms which permits the enqueuing and dequeuing of items concurrently. The correctness of the algorithms is discussed, and there are concluding remarks regarding limitations on available concurrency and regarding a workaround which can be applied in order to minimize the problems associated with machines on which process creation is expensive. The skew heap was selected because most other data structures commonly used to implement the priority queue require that the root of a tree structure be locked throughout each operation; the remaining candidate was the implicit heap, versions of which could either encounter deadlocks (the bottom-up enqueue, top-down dequeue version, although the problem is described as possibly being avoidable), or require excessive time for most access patterns (the exclusively top-down version). A reference is given to a different scheme for applying concurrency to implicit heaps, but without comment. After the review of other possible data structures, the top-down variant of the skew heap was selected. This data structure's manipulation algorithms give balanced insertions when the inserted item is of lower priority than any other item in the subtree, but produce imbalance when the inserted item is of higher priority than any other item in the subtree; specifically, a sequence of insertions in descending order of priority will result in a perfectly balanced tree, while a sequence of insertions in increasing order of priority will result in a left linear degenerate tree. This implies that operations on the structure are individually O(n), although Tarjan has shown that the amortized complexity is theta (log2 n). Since at least one of the other data structures (the binomial forest) is O(log2 n) (and in the case of the dequeuing operation, omega (log2 n)), this represents an increase in unpredictability relative to existing sequential technology. The concurrent version of the top-down skew heap involves the setting of locks such that the node currently being operated upon is always locked, and the child to which descent is about to occur is locked immediately prior to the release of the lock held on the current node; this technique is known as lock-coupling, and it permits less concurrency than the two alternate techniques, the link technique and the give-up technique (all three techniques are analyzed in "Concurrent Search Structure Algorithms", ACM Transactions on Database Systems, Vol. 13, #1, an overview of which appeared in an earlier posting). A further restriction on concurrency exists in that access to the structure is serialized by virtue of the need to start by acquiring exclusive access to the structure's root node; this contrasts unfavorably with technologies which offer serializability (a guarantee that the effect will be equivalent to some serial ordering of the presented operations, despite the fact that operations may in fact proceed completely in parallel) without resorting to serialization. The essential idea is that after beginning at the root, each operation can descend into the tree concurrently; beyond contention for the root of the structure, there will not be further contention unless one process happens to take a path which leads it to a place where some earlier process is still working (slowly), in which case it is necessary to wait for the earlier process to release its lock on the desired node before further descent can occur. Processes will eventually reach the bottom of the tree and exit the system. The author implemented the structure and tested it by initially inserting 25 items and then activating 3 producers and 3 consumers. Insertions were done in decreasing priority order, which is the ordering that results in perfectly balanced tree structures (as opposed to the degenerate left linear structures which result when increasing priority order is used). With this perfectly balanced tree structure, the number of processes concurrently active in the structure was usually at least 2 and ranged up to 5, demonstrating that the structure does indeed provide some opportunities for concurrency. There was a problem in that process creation was apparently a rather expensive operation on the author's machine, which resulted in another version which featured a fixed number of support processes which would remain permanently active and perform any desired operations upon request; this, of course, results in support process contention when the queue size and the demand for operations results in more work than the fixed number of support processes can handle, and would therefore constitute yet another undesirable limitation upon the level of available concurrency. The algorithms presented were moderately readable; improvements could have been made by writing an Exchange procedure rather than simply writing the three lines of code without any indication of what was being done, and by noting explicitly that the algorithm's comparison of priorities was relying upon a *numerical* comparison in which increasing priorities were associated with numerically smaller values, thus causing the appearance that the < operation was testing whether the first operand's priority was less than that of the second operand when in fact the reverse was actually happening. A final technical point to be considered is the reliance of the author's solution on a specific parameter passing mechanism, namely, pass by reference. This mechanism is hard-wired in Concurrent Pascal-S (the author's implementation language); Ada defines its "in out" parameters to be independent of the underlying parameter passing mechanism, which could be pass by reference, copy-in/copy-out, or any of a number of other mechanisms. This problem would have to be addressed in a generic Ada implementation by passing instead a pointer to the pointer which is to be modified; this would enforce the author's assumptions and enable the structure to be implemented correctly, regardless of the actual (compiler- and/or machine-dependent) parameter passing mechanism. Despite its imperfections, however, the concurrent top-down skew heap does offer a limited amount of concurrency to the implementor of a concurrent priority queue, and therefore should be a useful structure until further advances occur in concurrent priority queue technology. Bill Wolfe, wtwolfe@hubcap.clemson.edu