[comp.lang.c] sorting

ok@quintus.UUCP (Richard A. O'Keefe) (11/16/87)

The UNIX library includes qsort(3), and explicitly says that
"qsort is an implementation of the quicker-sort algorithm".
SAS Lattice C for IBM VM/CMS and MVS also claims to use
"quicksort".  Some microcomputer Cs also use some version
of quicksort.

My question is: do these people know something about quicksort
that I don't?  Contrast merge sort and quicksort:

(1')	merge sort does o(NlgN) comparisons in the WORST case.
(2')	merge sort can be coded to take o(N) comparisons when
	the input is already in order.
(3')	merge sort is stable, so it is good for multi-key
	sorting.
(4')	merge sort can be coded easily in a non-recursive
	language, using a stack of o(lgN) pointers.
(5')	merge sort is well adapted to sorting lists or
	permutation vectors (a list-based sketch follows
	these lists).
(6')	to sort N elements, merge sort needs N additional words
	for links, but if you have allocated your records
	dynamically you probably want the links anyway.

(1")	quick sort does o(1.4NlgN) comparisons on the AVERAGE.
	In the worst case it does O(N**2) comparisons.
(2")	quick sort cannot exploit existing order; in fact it
	has to go to some trouble to ignore it.
(3")	quick sort is not stable, so multi-key sorting has to
	be done by writing a combined comparison.
(4")	quick sort can be coded easily in a non-recursive
	language, using a stack of o(2lgN) pointers.
(5")	quick sort is not well adapted to sorting lists.
(6")	quick sort doesn't need the links.  On the other hand,
	if the data items are large, you would be silly to sort
	the array in place, and would sort an array of pointers
	instead, so you are generally paying for the links anyway.
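
To make (3'), (5'), and (6') concrete, here is a minimal sketch of
a stable merge sort over a singly linked list.  This is an
illustration only, not code from any of the libraries mentioned;
the "key" field is a hypothetical payload.

    #include <stddef.h>

    struct node {
        struct node *next;
        const char *key;        /* hypothetical payload */
    };

    /* Merge two sorted lists; taking from `a' on ties keeps the
       sort stable. */
    static struct node *merge(struct node *a, struct node *b,
                              int (*cmp)(const char *, const char *))
    {
        struct node head, *tail = &head;

        while (a != NULL && b != NULL) {
            if (cmp(a->key, b->key) <= 0) {
                tail->next = a;  a = a->next;
            } else {
                tail->next = b;  b = b->next;
            }
            tail = tail->next;
        }
        tail->next = (a != NULL) ? a : b;
        return head.next;
    }

    /* Top-down merge sort; the recursion is only O(lg N) deep. */
    struct node *list_msort(struct node *list,
                            int (*cmp)(const char *, const char *))
    {
        struct node *slow, *fast, *back;

        if (list == NULL || list->next == NULL)
            return list;
        slow = list;  fast = list->next;
        while (fast != NULL && fast->next != NULL) {
            slow = slow->next;          /* slow stops at the midpoint */
            fast = fast->next->next;
        }
        back = slow->next;
        slow->next = NULL;              /* detach the second half */
        return merge(list_msort(list, cmp), list_msort(back, cmp), cmp);
    }

Called as list_msort(head, strcmp), this sorts N nodes without
moving the records themselves; only the links are updated.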

Actual measurements seem to bear this out: a naive implementation
of merge sort on a SUN-3/50 running SunOS 3.2 took 19NlgN usec to
sort an array of N strings using strcmp(), whereas qsort took
31NlgN usec: a ratio of about 1:1.6, even more than the 1:1.44
predicted by the number of comparisons alone.

Have C implementors been hypnotised by the magic word "quick" in
the name of "quick" sort, or am I missing something?  30% overhead
seems a lot to pay just for a name.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (11/17/87)

In article <187@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>The UNIX library includes qsort(3), and explicitly says that
>"qsort is an implementation of the quicker-sort algorithm".
>My question is: do these people know something about quicksort
>that I don't?  Contrast merge sort and quicksort:
>(1')	merge sort does o(NlgN) comparisons in the WORST case.
>(1")	quick sort does o(1.4NlgN) comparisons on the AVERAGE.
>	In the worst case it does O(N**2) comparisons.

The UNIX qsort() implementation does fairly well; its worst case is
unlikely to arise in practice.

>(6')	to sort N elements, merge sort needs N additional words
>	for links, but if you have allocated your records
>	dynamically you probably want the links anyway.
>(6")	quick sort doesn't need the links.  On the other hand,
>	if the data items are large, you would be silly to sort
>	the array in place, and would sort an array of pointers
>	instead, so you are generally paying for the links anyway.

This is the key point.  The interface to qsort() is simpler and much
more useful than one that requires links in the records.  An
alternative would be for qsort() to allocate extra dynamic storage
for linkage, sort on the links only, then make a final pass to
rearrange the original records according to the sorted links (this
is "list merge" sorting and it's one of my favorites).  However,
qsort() does not fail due to inability to obtain dynamic memory (heap).
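
To illustrate that final rearrangement pass (a sketch under my own
assumptions, not Gwyn's actual code): suppose the link-sorting
phase has produced an index array perm[] such that the record
belonging in slot i is rec[perm[i]].  Cycle-following then puts
the records in place using only one spare record of storage:

    #include <stddef.h>

    struct rec { char data[64]; };      /* hypothetical record type */

    /* Rearrange rec[] so that rec[i] ends up holding the old
       rec[perm[i]].  perm[] is consumed: it is left as the
       identity permutation. */
    void apply_perm(struct rec rec[], size_t perm[], size_t n)
    {
        size_t i, j, k;
        struct rec tmp;

        for (i = 0; i < n; i++) {
            if (perm[i] == i)
                continue;
            tmp = rec[i];               /* open a hole at slot i */
            j = i;
            while ((k = perm[j]) != i) {
                rec[j] = rec[k];        /* pull the next record in */
                perm[j] = j;
                j = k;
            }
            rec[j] = tmp;               /* close the cycle */
            perm[j] = j;
        }
    }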

>Have C implementors been hypnotised by the magic word "quick" in
>the name of "quick" sort, or am I missing something?  30% overhead
>seems a lot to pay just for a name.

I doubt the name has anything to do with it; sorting is one of the
few things in the software world that is well understood by most
serious practitioners.  The simplicity of the interface is probably
the main factor, combined with the pre-existence of an implementation
in the UNIX C library.  One is free to implement qsort() any way that
one wishes, so long as it retains the same interface.

Note that neither ANSI C nor POSIX requires qsort(), bsearch(),
or other similarly useful general data structure algorithms.

Possibly interesting side note:  Some time ago I implemented a
utility for sorting binary files, similar to the UNIX "sort"
utility.  I started out using qsort() to sort internal merge runs,
but after several passes through the code the only remaining use
of qsort() was to build an initial heap structure, not to do any
actual sorting.  All my sorting ended up being merge sorts, even
the internal run sorting.  In the near future I plan to package
this suitably for installation as a generic binary sorting
utility, and will post it to one of the source newsgroups.

eager@amd.UUCP (11/17/87)

In article <6684@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>Note that neither ANSI C nor POSIX requires qsort(), bsearch(),

The ANSI Draft Proposed Standard does require the General Utilities library
<stdlib.h> to include qsort() and bsearch().  See section 4.10.5.1 (for 
bsearch) and 4.10.5.2 (for qsort).
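
For reference, the declarations in question, as the standard
eventually published them in <stdlib.h>:

    void *bsearch(const void *key, const void *base, size_t nmemb,
                  size_t size,
                  int (*compar)(const void *, const void *));
    void qsort(void *base, size_t nmemb, size_t size,
               int (*compar)(const void *, const void *));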

-- Mike Eager

gwyn@brl-smoke.UUCP (11/17/87)

In article <4679@amd.AMD.COM> eager@amd.UUCP (mike eager) writes:
>The ANSI Draft Proposed Standard does require the General Utilities library
><stdlib.h> to include qsort() and bsearch().

Oops, I stand corrected.
I was misled by something I read in the POSIX (IEEE 1003.1) Rationale.
I now wonder why it bothers to mention these functions!

atbowler@orchid.UUCP (11/18/87)

In article <6684@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <187@cresswell.quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>
>>Have C implementors been hypnotised by the magic word "quick" in
>>the name of "quick" sort, or am I missing something?  30% overhead
>>seems a lot to pay just for a name.
>
>I doubt the name has anything to do with it; sorting is one of the
>few things in the software world that is well understood by most
>serious practitioners.  The simplicity of the interface is probably
>the main factor, combined with the pre-existence of an implementation
>in the UNIX C library.  One is free to implement qsort() any way that
>one wishes, so long as it retains the same interface.
>
Personally I've always felt that the problem is that computer science
profs are fascinated by the nice elegant algorithm analysis that
quicksort has.  It is pretty so it gets a lot of time in class.
Shellsort on the other hand has no nice analysis and tends to get
short mention, and students get the idea that quicksort is the
only useful memory resident sort.
    In doing the QSORT for the Honeywell DPS-8 series machines I
chose Shellsort for the implementation; the interface is the same.
1) The code for Shellsort is simpler and so easier to get right.
   I don't know how many times I've had to debug somebody's
   quicksort that didn't work on some data set.
2) If Shellsort is coded as per Knuth's recommended gap sequence
   (a sketch follows this list), our empirical measurements show
   that it tends to outperform quicksort for small to medium
   problems (less than 1000 records), which is what we expect the
   QSORT function to be used on anyway.  Larger problems will
   probably be done with some type of external merge sort (i.e.
   you will call the system SORT program).  Shellsort tends to
   win on the small problems because the simpler code reduces the
   overhead (i.e. the constant you multiply the O() by).  It is
   definitely true that as the problem gets large enough quicksort
   wins because of the algorithmic advantage (provided you don't
   get a pathological O(n**2) case), but as I argued above this
   is not the common use of the QSORT() function.
3) Although I don't know of a good Shellsort analysis, it is clear
   that its worst case behaviour is never as bad as quicksort's.
   This is particularly useful since naively coded quicksorts
   often exhibit their worst behaviour on presorted or almost
   sorted data, which is common in real problems.  (Notice that I
   said naively coded.)  It is true that better algorithms to
   pick the pivot element decrease the probability of a
   pathological case, but they also increase the linear factor in
   the running time.
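
For concreteness, here is a minimal Shellsort using Knuth's
recommended gap sequence (h = 3h+1: 1, 4, 13, 40, ...), sorting
ints for brevity.  This is only an illustration, not the
Honeywell implementation:

    #include <stddef.h>

    /* Each pass is an insertion sort over elements h apart. */
    void shellsort(int a[], size_t n)
    {
        size_t h = 1, i, j;
        int v;

        while (h < n / 3)           /* largest gap in the sequence */
            h = 3 * h + 1;
        for (; h >= 1; h /= 3) {    /* ... 13, 4, 1; h == 0 stops */
            for (i = h; i < n; i++) {
                v = a[i];
                for (j = i; j >= h && a[j - h] > v; j -= h)
                    a[j] = a[j - h];
                a[j] = v;
            }
        }
    }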

bard@THEORY.LCS.MIT.EDU (Bard Bloom) (11/22/87)

> My question is: do these people know something about quicksort
> that I don't?  Contrast merge sort and quicksort:

Quicksort -- if you code it just right -- has a very nice inner loop:

   v := a[right];
   i := left - 1;  j := right;
   REPEAT
     REPEAT i := i + 1 UNTIL a[i] >= v;
     REPEAT j := j - 1 UNTIL a[j] <= v;
     a[i] :=: a[j]                    (* exchange *)
   UNTIL j <= i;
   (* the full algorithm undoes the final exchange after the loop *)

Notice that most of the work is simply an increment and a comparison of an
indexed variable against a fixed value v.  Most other sorting algorithms have
a lot more bookkeeping --- diddling two indices doubles the amount of work
--- and have comparisons a[i]<=a[j] which involve indexing twice.  Of course,
you typically have to code this in assembly language or something to get the
most out of it; if you have (e.g.) a subroutine call to compare strings, you
lose a large chunk of the speedup.
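
For the curious, here is that loop rendered as C, as a Hoare-style
partition on ints (a sketch, not any particular library's code;
this variant takes its pivot from the left end, which bounds both
scans so no sentinels are needed):

    /* Returns j such that every a[lo..j] <= every a[j+1..hi]. */
    static int partition(int a[], int lo, int hi)
    {
        int pivot = a[lo];
        int i = lo - 1, j = hi + 1, t;

        for (;;) {
            do i++; while (a[i] < pivot);   /* the tight inner loop */
            do j--; while (a[j] > pivot);
            if (i >= j)
                return j;
            t = a[i];  a[i] = a[j];  a[j] = t;
        }
    }

    void quicksort(int a[], int lo, int hi)
    {
        if (lo < hi) {
            int p = partition(a, lo, hi);
            quicksort(a, lo, p);
            quicksort(a, p + 1, hi);
        }
    }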

Here's the verdict on Quicksort (from Sedgewick's _Algorithms_).  He does not
justify all these points, so check the literature rather than flaming at me.
Try Aho, Hopcroft, and Ullman for the references.

It's been around a long time, and is very well understood.  It works well in
a variety of situations, and consumes fewer resources than any other sorting
method in many situations.  The analysis has been verified by extensive
empirical experience.  A carefully tuned version of Quicksort is likely to
run significantly faster than any other sorting method on most computers.  
The carefully tuned version uses a number of tricks, e.g. switches to another
sorting algorithm (e.g., full decision tree) when the lists get too short.
Often parts are coded in assembly language.

However, Quicksort is very very delicate; if you don't implement it just
right, it slows down a lot.  To make matters worse, usually if you screw up
the quicksort, it still stays a correct sorting algorithm.  It is so touchy
that you should use some simpler algorithm -- shellsort is a good one --
unless you really want to take the time to get your quicksort perfect.  I
don't know what version you were using; was it a carefully tuned version, or
a naive implementation?  (The simplest implementation is O(N^2) on a sorted
list; a smarter one is O(N).)

>   (1')	merge sort does o(NlgN) comparisons in the WORST case.
                                ^^^^^^^
>   (1")	quick sort does o(1.4NlgN) comparisons on the AVERAGE.
>       	In the worst case it does O(N**2) comparisons.


(Comment on the mathematics: You presumably mean O(NlgN).  o(NlgN) means
"asymptotically less than NlgN", which is impossible for any comparison
sorting algorithm.  O(1.4NlgN) is precisely the same thing as O(NlgN); the O
operator ignores constant factors.  Check any Intro Algorithms book on this
point; it's important if you want to analyse algorithms.)
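
(For reference, the standard definitions, in the same notation as
the rest of this thread:

	f(n) = O(g(n))  iff  there exist c > 0 and n0 such that
	                     |f(n)| <= c*g(n) for all n >= n0;

	f(n) = o(g(n))  iff  lim[n->infinity] f(n)/g(n) = 0.

O bounds f above up to a constant factor; o says f is
asymptotically negligible next to g.)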

Most people don't worry about the constant factors, because they vary a lot
with the model of computation.  Most likely your source just said O(NlgN)
time for Mergesort, meaning kNlgN for some value of k which depends on fine
points of your computer architecture.  The same comments go for Quicksort.
Unless you've done some fine analysis, 1" isn't saying much.

> Have C implementors been hypnotised by the magic word "quick" in
> the name of "quick" sort, or am I missing something?  30% overhead
> seems a lot to pay just for a name.

No.  They may have been hypnotized by the 27 years of practical and
theoretical investigation of how quicksort works and how to make it work
well.  They may have seen papers that said "Quicksort is the best" and then
gone out and implemented a decent-enough quicksort, without making sure that
they were tweaking it in just the right way to make it the best.  

Since the people who wrote it for your SunOS presumably had to write their
own code from scratch, they might not have taken the time to search the
(unsorted and fairly hairy) literature to get the fine points right.  If you
can find a machine which has a carefully tweaked qsort, try your comparison there.

-- Bard

ok@quintus.UUCP (Richard A. O'Keefe) (11/24/87)

In article <10491@brl-adm.ARPA>, bard@THEORY.LCS.MIT.EDU (Bard Bloom)
replied to my question about the wide-spread (mis)use of "quick"sort.
He starts by saying that
> Quicksort -- if you code it just right -- has a very nice inner loop:
This is wildly, ludicrously, hilariously IRRELEVANT to generic sorting
algorithms.  What do I mean by a "generic" sorting algorithm?  I mean
one like the qsort() function in the ANSI C draft which takes a user-
supplied function, or one like the sort(BU_CMD) command in the SVID
which takes a user-supplied sort key specification.  In fact, even if
you are sorting strings, the book-keeping in the inner loop is not of
much interest.  

There are about five different flavours of "oh" notation.  The one I
meant was the one where
	f(n) = o(g(n)) if lim[n->infinity] ((f(n)-g(n))/g(n)) = 0
that is, where f(n) is approximately equal to g(n) WHERE THE CONSTANT
FACTOR IS SIGNIFICANT.  I'm afraid this keyboard hasn't got thetas or
omegas, so I left it to common sense to work out what I meant.

Let me give you part of a table from Kurt Mehlhorn's "Data Structures
and Algorithms 1: Sorting and Searching", Springer-Verlag.
The table is on page 66.  I have omitted the column for A-sort, and
the row for run-time (because it is expressed in rather odd units,
and makes the inappropriate assumption that comparisons are cheap).

                  Single
                  maximum      HeapSort      QuickSort     MergeSort
                  selection
# of comparisons +-----------+-------------+-------------+------------+
  worst case     |  0.5 n**2 |  2 n lg n   |  0.5 n**2   |  n lg n    |
  average case   |  0.5 n**2 | ~2 n lg n   | 1.44 n lg n |  n lg n    |
                 +-----------+-------------+-------------+------------+
storage          |  n        |  n          |  n + lg n   |  2n        |
requirement      |  + const  |  + const    |  + const    |  + const   |
                 +-----------+-------------+-------------+------------+
access           |  random   |  random     |  random     | sequential |
                 +-----------+-------------+-------------+------------+
stable?          |  no       |  no         |  no         |  yes       |
                 +-----------+-------------+-------------+------------+

The constant factors MATTER here.  The average case of heapsort,
quicksort, and merge sort is O(NlgN), BUT heapsort is twice as
costly as merge sort and quicksort is 1.44 times as costly.

The "storage requirement" line is misleading.  Suppose you are sorting
/etc/passwd on a PDP-11.  This is a typical application of a sorting
routine; I can count the number of times I have sorted a sequence of
numbers -- other than to test a sorting routine -- in the last eight
years on the fingers of no hands, but I sort files of text every day.
The /etc/passwd file here has 76 lines for a total of 4458 bytes.  The
overhead to enable merge-sort (on a PDP-11) is thus 76*sizeof (short)
bytes, which comes to 3%. Trading 3% more space for 30% to 40% less time
seems like a good bargain to me.  More generally, if you are sorting a
sequence of varying-length strings, you have to sort an array of
pointers if you want to use qsort(), and that's N locations of overhead,
whereas storing links with the strings so you can sort a list costs you,
oh my goodness, exactly N locations of overhead.  At least for
applications where you are sorting variable-sized objects, the alleged
space advantage of "quick" sort turns out to be bogus too.

Bloom quotes Sedgewick as advising that "if one is not willing to invest
the effort to be sure that a Quicksort implementation is not flawed,
Shellsort is a much safer choice."  True.  But also according to
Sedgewick {page 98} "The functional form of the running time of
Shellsort is not even known .... two conjectures are (lg N)**2 * N and
N**1.25."  I don't know what YOU think about a book which recommends a
complicated unanalysed method like Shellsort when there is a trivially
simple method (mergesort) which outperforms it -- not only in order of
magnitude but in constant factor -- but I know what *I* think.

Bloom says
> Most people don't worry about the constant factors, because they
> vary a lot with the model of computation.
When you are counting the number of times that some operation is done,
the model of computation has nothing to do with the case, tra-la.
This may be the heart of the matter:  if "most people don't worry about
the constant factors" that may explain why the qsort() routine in SunOS
3.2 was (31/19 - 1) = 63% slower at sorting an array of strings than a
naive implementation of merge sort in one particular test (using
strcmp() as the comparison function).  I expect the factor to be higher
on VAXen, where procedure calls are much more expensive, and less on
SPARCs, where procedure calls should be cheaper.  I have no reason to
believe that the qsort() routine in SunOS 3.2 is not identical to either
the 4.2BSD or the SYS V.2 one, and I doubt that Bloom has either.  If
the sorting routine in UNIX isn't "carefully tweaked", it ---- well
should be after all this time.  But perhaps nobody cared about constant
factors.  [I hasten to add that in general I have the greatest
admiration for the SUN operating system and for their machines, and
that I expect that they copied what they were told was the best code.]

The fact that the AVERAGE case of "quick" sort does about 1.4 times as
many comparisons as the WORST case of merge sort has been known ever
since quicksort was analysed.  The fact that the cost of calling a
user-supplied comparison routine completely swamps any reasonable
book-keeping has been obvious since the beginning.  The conclusion
which follows from this is that
	if the comparisons done by a sorting routine are
	single in-line machine instructions comparable in cost
	to an add or move,
	    then and only then quicksort is a good method
	    otherwise merge sort beats the pants off quicksort
However, this conclusion seems not to have been drawn.

Over the years I have measured the speeds of heapsort, quicksort,
and merge sort on DEC-10s (using Simula, Pop-2, and Prolog),
PDP-11s and VAX-11s (using C) and 680x0s (using C and Prolog).
There are wide variations between languages and machines, but
the ratio heapsort : quicksort : mergesort :: 2 : 1.4 : 1 holds
reasonably well within each machine/language combination.

I'm afraid that Bard Bloom has not provided evidence to refute this
conclusion, but has merely repeated the Party line.

The questions remain:
    (1) does anyone in net-land know of any references which purport
	to show that quick-sort does fewer comparisons than merge-sort?
    (2) does anyone in net-land know of any studies comparing actual
	implementations of quick-sort and merge-sort where the
	comparisons were not single in-line machine instructions but
	were procedure calls in which quick-sort outperformed merge-sort?
    (3) has anyone got an anecdote about an application which was
	originally coded using merge-sort but had to be recoded using
	quick-sort (or any other in-place sort) to fit into a small
	machine?
    (4) does anyone know of a general-purpose sorting algorithm which
	consistently out-performs merge-sort?

If anyone wants to conduct their own studies, DON'T try your sorting
routine out on an array of numbers.  I suggest random permutations of
random subsets of /usr/dict/words as being more realistic.
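
If it saves anyone some typing, here is a sketch of the test
scaffolding: a qsort()-style comparison function for strings and a
Fisher-Yates shuffle.  File reading and timing are left to the
reader; nothing here is from an existing test suite.

    #include <stdlib.h>
    #include <string.h>

    /* Compare two elements of an array of char pointers. */
    static int cmpstr(const void *a, const void *b)
    {
        return strcmp(*(char *const *)a, *(char *const *)b);
    }

    /* Fisher-Yates shuffle: every permutation equally likely,
       up to the quality of rand(). */
    static void shuffle(char *v[], size_t n)
    {
        size_t i, j;
        char *t;

        if (n < 2)
            return;
        for (i = n - 1; i > 0; i--) {
            j = (size_t)rand() % (i + 1);
            t = v[i];  v[i] = v[j];  v[j] = t;
        }
    }

Read a random subset of /usr/dict/words into an array of strings,
shuffle() it, then time qsort(words, n, sizeof *words, cmpstr)
against the merge sort of your choice.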

I think this discussion does belong in *.lang.c, because presumably
people writing in C care about efficiency and portability, and if it
turns out (a) that qsort() is often implemented as some version of
quick-sort and (b) that quick-sort is as bad as I claim, then a lot
of us will be looking for something better to use.

levy@ttrdc.UUCP (Daniel R. Levy) (11/24/87)

In article <265@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes:
>[ merge sort is better than quick sort; try it and see ]

OK, has anyone out there replaced their UNIX library's qsort() with a merge
sort implementation and found that it speeded up things that use it
considerably?  For that matter, would anyone care to post an example of
this implementation for us to test (and chew on, and flame at :-) ?
-- 
|------------Dan Levy------------|  Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
|         an Engihacker @        |  	<most AT&T machines>}!ttrdc!ttrda!levy
| AT&T Computer Systems Division |  Disclaimer?  Huh?  What disclaimer???
|--------Skokie, Illinois--------|

trt@rti.UUCP (11/24/87)

In article <265@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes:
> In article <10491@brl-adm.ARPA>, bard@THEORY.LCS.MIT.EDU (Bard Bloom)
> replied to my question about the wide-spread (mis)use of "quick"sort.
> He starts by saying that
> > Quicksort -- if you code it just right -- has a very nice inner loop:
> This is wildly, ludicrously, hilariously IRRELEVANT to generic sorting
> algorithms.  What do I mean by a "generic" sorting algorithm?  I mean
> one like the qsort() function in the ANSI C draft ...

Someone such as ok@quintus, who has coded mergesort on a variety of machines
in a variety of languages, should find writing a qsort()-like function
implemented with mergesort a simple exercise.
So show us, don't just tell us, what fools we have been:

	POST A "GENERIC" SORTING FUNCTION IMPLEMENTED WITH MERGESORT

That will benefit mankind far more than volleys of "mine is quicker than yours".
	Tom Truscott
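
For what it's worth, here is what such a function might look like:
a minimal, untuned top-down merge sort behind the qsort()
interface.  It is stable (ties keep the left element first) and
needs an N-element scratch buffer, so unlike qsort() it can fail
when memory is tight.  This is a sketch for discussion, not code
from anyone in this thread.

    #include <stdlib.h>
    #include <string.h>

    static void msort_rec(char *base, char *tmp, size_t n, size_t size,
                          int (*cmp)(const void *, const void *))
    {
        size_t mid, i, j, k;

        if (n < 2)
            return;
        mid = n / 2;
        msort_rec(base, tmp, mid, size, cmp);
        msort_rec(base + mid * size, tmp, n - mid, size, cmp);

        /* Merge the halves into tmp; <= keeps the sort stable. */
        i = 0;  j = mid;  k = 0;
        while (i < mid && j < n) {
            if (cmp(base + i * size, base + j * size) <= 0) {
                memcpy(tmp + k * size, base + i * size, size);
                i++;
            } else {
                memcpy(tmp + k * size, base + j * size, size);
                j++;
            }
            k++;
        }
        if (i < mid) {
            memcpy(tmp + k * size, base + i * size, (mid - i) * size);
            k += mid - i;
        }
        /* Leftover right-half elements are already in place (k == j). */
        memcpy(base, tmp, k * size);
    }

    /* Same interface as qsort() except for the return value:
       0 on success, -1 if the scratch buffer cannot be obtained. */
    int msort(void *base, size_t nmemb, size_t size,
              int (*cmp)(const void *, const void *))
    {
        char *tmp;

        if (nmemb < 2)
            return 0;
        tmp = malloc(nmemb * size);
        if (tmp == NULL)
            return -1;
        msort_rec(base, tmp, nmemb, size, cmp);
        free(tmp);
        return 0;
    }

Sorting an array of string pointers with it is then
msort(words, n, sizeof *words, cmpstr), with a cmpstr() like the
one suggested earlier in the thread.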