[comp.lang.misc] sorting stuff

rang@cs.wisc.edu (Anton Rang) (11/19/90)

One more note on sorting, in two parts.

1.  Radix sort <> Quicksort.

  It would be possible to write a radix sort which had the same
calling interface as qsort(), but the performance would be horrid and
it would generally be pretty silly.

  Why?  A radix sort doesn't work by comparing the keys of records,
and qsort() is only passed a function which compares two records.  In
a sense, a radix sort does a 'partial comparison' and there isn't any
equivalent to that in the qsort() interface (gross oversimplification
here).

  In practice, if the criteria used to compare records is complex, a
radix sort may not be appropriate, unless there is an easy way to make
a key for each record in the form that a radix sort needs (e.g. a
significance ordering on bits).

2.  Measuring time is difficult.

  When people say an algorithm is O(n log n) or whatever, it's
important to remember the units that's measured in.  For instance,
ordinary quicksort is often quoted as being an average-time (n log n)
algorithm (remember that its worst case is n squared!).  This is
measured in comparisons!  If the length of a key is larger than what
you can compare in some fixed time (e.g. 32 bits), your time bound
instantly gets worse.

  When comparing radix sort and other sorts, note that if the number
of passes in the radix sort becomes an O(log n) factor, then the time
to compare two records in the other sorting methods also becomes
O(log n) and you find yourself comparing the radix sorts, O(n log n),
against a sort which is now O(n (log n)^2).  Depending on how you
compare keys, this can change; for instance, comparing two strings can
be done in time proportional to the number of leading equal characters
and this factor changes as you sort.  (I've never seen an analysis of
this.)

	Anton
   
+---------------------------+------------------+-------------+
| Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison |
+---------------------------+------------------+-------------+