[comp.theory] Article of interest in this month's issue of CACM

pugh@CS.UMD.EDU (Bill Pugh) (06/01/90)

There is an article appearing this month (June) in CACM that should be of
interest to many theoretical computer scientists as well as practitioners. The
paper describes a new simple, efficient and practical alternative to balanced
trees. I've enclosed an overview of the paper below.

  Part of the reason for posting this note is that a number people in
the theory community have noted that they routinely ignore CACM. Hopefully,
this can be changed if more good papers are submitted to CACM.

  In addition to the paper in CACM, a number of follow-on papers have been
generated. Files associated with skip lists, such as overviews, postscript
versions of papers and source code libraries are available for anonymous
ftp from mimsy.cs.umd.edu (in the directory pub/skipLists).


	Bill Pugh
	Dept. of Computer Science
	University of Maryland,
	College Park MD 20742
	pugh@cs.umd.edu


----------------------------------------------------------------------

	        CACM, June 1990, pages 668-676
    Skip Lists: a Probabilistic Alternative to Balanced Trees

                        William Pugh,
                Department of Computer Science
           and Institute for Advanced Computer Studies,
              University of Maryland, College Park

			Overview

Skip lists are a data structure that can be used in place of balanced trees.
Skip lists use probabilistic balancing rather than strictly enforced balancing
and as a result the algorithms for insertion and deletion in skip lists are
much simpler and significantly faster than equivalent algorithms for balanced
trees (although only by constant factors).

Binary trees can be used for representing abstract data types such as
dictionaries and ordered lists. They work well when the elements are inserted
in a random order. Some sequences of operations, such as inserting the
elements in order, produce degenerate data structures that give very poor
performance. If it were possible to randomly permute the list of items to be
inserted, trees would work well with high probability for any input sequence.
In most cases queries must be answered on-line, so randomly permuting the input
is impractical. Balanced tree algorithms re-arrange the tree as operations
are performed to maintain certain balance conditions and assure good
performance.

Skip lists are a probabilistic alternative to balanced trees. Skip lists
are balanced by consulting a random number generator. Although skip lists have
bad worst-case performance, no input sequence consistently produces the worst-
case performance (much like quicksort when the pivot element is chosen
randomly). It is very unlikely a skip list data structure will be significantly
unbalanced (e.g., for a dictionary of more than 250 elements, the chance
that a search will take more than 3 times the expected time is less than
one in a million). Skip lists have balance properties similar to that of search
trees built by random insertions, yet do not require insertions to be random.

Balancing a data structure probabilistically is easier than explicitly
maintaining the balance. For many applications, skip lists are a more natural
representation than trees, also leading to simpler algorithms. The simplicity
of skip list algorithms makes them easier to implement and provides significant
constant factor speed improvements over balanced tree and self-adjusting
tree algorithms. Skip lists are also very space efficient. They can easily be
configured to require an average of 1 1/3 pointers per element (or even less)
and do not require balance or priority information to be stored with each node.

bizard@imag.imag.fr (Philippe Bizard) (06/08/90)

In article <9006011620.AA05029@irt.watson.ibm.com> Bill Pugh <pugh%cs.umd.edu@VM1.NoDak.EDU> writes:
>There is an article appearing this month (June) in CACM that should be of
>interest
 [...]
>  Part of the reason for posting this note is that a number people in
>the theory community have noted that they routinely ignore CACM. 
   But the theory community don't ignore Springer Verlag's "Lecture Notes
in Computer Science", do they ? And isn't it the same article which appeared
there in 1989 (in Algorithms & data structures) ? So it seems now that nobody
in the theory community can claim he never heard of these skiplists ;-)
-- 
 Le monde devient complexe, les problemes sont reels, j'essaie d'etre rationnel,
                                            mais le mystere, lui, reste entier. 
Philippe Bizard (2eme kyu)
{mcvax|inria}!imag!bizard      bizard@imag.imag.fr