[net.lang] Bubble sorts, etc.

leichter@yale-comix.UUCP (Jerry Leichter) (06/13/84)

Ah, sorting functions:

	quicksort, if implemented correctly, is NOT O(n^2) on a sorted input
	list.  The "correct" techniques have been known since the original
	article on quicksort (by Tony Hoare) in the late 50's.  However, people
	do still do things the wrong way.  The best reference on quicksort is
	an article by Sedgewick in the CACM about 4-5 years ago; I can find the
	reference if anyone is interested.  It remains true that there exist
	input orderings that make even correctly-done quicksort slow, but they
	don't seem to arise in practice - in the same way that hash table can
	in principle have impossible performance (if everything hashed to the
	same bin) but we don't worry about that.

	quicksort has the best constant factor to go with its O(n log n) of
	any in-place sort known.  Shellsort is comparable, but not quick as
	good.  On the other hand, shellsort MIGHT do better for almost-ordered
	inputs; this is tough to decide since correct quicksort's optimizations
	will interact well with the ordering, too.  Shellsort has the advantage
	of being a somewhat simpler algorithm to implement than a really good
	quicksort.  I don't remember off-hand what the theoretical worst-case
	performance for shellsort is, but I'd guess it, too, is O(n^2).  (There
	is a conjecture that a sort with O(n log n) expected performance that
	has O(n) behavior for some inputs must have O(n^2) behavior for some
	other inputs.)

	Another sort worth knowing is heapsort.  Heapsort has the advantage of
	being a guaranteed O(n log n) algorithm independent of input ordering;
	it always takes the same number of steps.

	There are tons of O(n^2) algorithms, and there is little to choose
	from among them.  They are worth knowing for two reasons:

		For small-enough input sets, it is better to use a simple,
		low-overhead O(n^2) sort than a theoretically-faster high-
		overhead one.  The cutoff for "small enough" is typically
		around 10 or so elements.  Note that such small sets will
		arise in the running of any more-efficient algorithm that
		does a "divide and conquer", such as quicksort:  A good
		implementation of quicksort uses some other sort - typically
		an insertion sort or a bubble sort - when it gets below
		some cutoff.  See Sedgewick for a lot of details on this.

		It is worth teaching these sorts because they are the basis
		of the efficient sorts.  Shellsort, for example, is bubble
		sort "done right".  (It can also be viewed as a form of merge
		sort.)
							-- Jerry
					decvax!yale-comix!leichter leichter@yale