[net.lang] A Simple Bubble Sort Function

chris@umcp-cs.UUCP (06/12/84)

(Editorial note: I've moved this to net.lang (from net.sources,
net.followup, and net.flame) because none of the three original
groups is a good place for it.)

Quicksort is not always the fastest sort.  In fact, in general,
it is best on a completely unsorted list (sometimes being O(n))
and worst on a completely sorted list (where it is O(n^2)).

If you are likely to have a completely (or nearly completely)
sorted list, use a Bubble Sort (or perhaps an insertion sort).
If you are likely to have a completely scrambled list, use a
quick sort.  If you have a list in which the items are usually
sorted but occasionally way out of place, you might try a Shell
sort.

For a real example, the TeX Versatec driver uses a Shell sort,
after reading in the list of all the characters on a page.  These
are either mostly sorted, with occasional things a line or two of
characters out of place, or mostly exactly unsorted by a common
distance [this is where the Shell sort wins!] because the pages
are to be printed sideways.  (To clarify a bit, this paragraph
so far would be require the following character order:
    s,a,d,c,a,a,F
    o,r,i,h,r,f,o
     ,e,s,a,e,t,r
    ...
In other words, it reads upward along the columns.)

I tried replacing the Shell sort with various other sorts, but
none of the ones I tried gave quite as good results.  (Part of
it was, no doubt, that the Shell sort code makes particularly
good use of the Vax's addressing modes.)

For another real example, a quicksort seems to work best in
Emacs's startup code in macros.c.  (It used to use a variation
on an insertion sort.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

pedz@smu.UUCP (06/13/84)

#R:umcp-cs:-747100:smu:19700003:000:575
smu!pedz    Jun 13 12:35:00 1984

The radix sort will almost always win out if applied properly and the
list to sort is reasonable large.  All of the sort algorithms
mentioned above are O(n^2) (worst case).  There are several sorts
(mergesort, heap sort, ...) which are guaranteed to be O(n log n)
which will usually over a O(n^2) but not alway becuase of the contant
terms.  With large cases they will thought.  However, some sorts (such
as the bubble sort) are the best sorts in particular applications.
But I still maintain that a radix sort will come out ahead in almost
all cases.

Perry
convex!smu!pedz

stew@harvard.UUCP (06/14/84)

Re Torek's assertion that quicksort is O(n^2) for sorting already sorted
lists:  this embarrassing property can be eliminated by sorting a small
sample (say, the first, last, and middle elements) and choosing the
median.  With this change, sorting a sorted list is back to O(log2(n)).
This was originally suggested by Singleton (CACM 12 (1969), 185-187).

-----------------------
Stew Rubenstein     UUCP: {{ihnp4,allegra}!ima,decvax!genrad!wjh12}!harvard!stew
Harvard Chemistry   ARPA: stew@harvard

nessus@mit-eddie.UUCP (Doug Alan) (06/14/84)

>	From: chris@umcp-cs.UUCP

>	If you are likely to have a completely (or nearly completely)
>	sorted list, use a Bubble Sort (or perhaps an insertion sort).
>	If you are likely to have a completely scrambled list, use a
>	quick sort.  If you have a list in which the items are usually
>	sorted but occasionally way out of place, you might try a Shell
>	sort.

Please, if you ever feel the urge to use a bubble sort, use a straight
insertion sort instead.  It's just as easy, and always faster.

About a quick sort being slow if the list to sort is ordered:  I suppose
you could always shuffle the list before sorting it.  That would only
take O(n) time (though it might be a bad O(n)).  Then you would be
almost guaranteed of the list being scrambled (unless you're unlucky).

Actually I like the "merge sort" the best.  It's so nice and clean and
recursive.  And it's always O(n*log(n)).  The algorithm for a merge sort
(in case you don't already know) is basically: Cut the list in half,
sort each half (recursively), and merge the two halves together.  See
how nice it is!
-- 
				-Doug Alan
				 mit-eddie!nessus
				 Nessus@MIT-MC

				"What does 'I' mean"?

andrew@orca.UUCP (06/15/84)

The continued popularity of the bubble sort is perplexing.

In Knuth's "Sorting and Searching" [1], the definitive work on the
subject, the author invests several pages and quite a bit of math in an
exhaustive mathematical analysis of the bubble sort, and concludes:

	"It took a good deal of work to analyze the bubble sort; and
	although the techniques used in the calculations are
	instructive, the results are disappointing since they tell us
	that the bubble sort isn't really very good at all.  Compared
	to straight insertion, bubble sorting requires a more
	complicated program and takes about twice as long!

	"Some suggestions can be given for improving the bubble sort
	... But none of these refinements lead to an algorithm better
	than straight insertion ...

	"In short, the bubble sort seems to have nothing to recommend
	it, except a catchy name and the fact that it leads to some
	interesting theoretical problems."

[1] Knuth, Donald E., "Sorting and Searching", volume 3 of "The Art of
Computer Programming", Addison-Wesley, 1973.

  -- Andrew Klossner   (decvax!tektronix!orca!andrew)      [UUCP]
                       (orca!andrew.tektronix@rand-relay)  [ARPA]

chris@umcp-cs.UUCP (06/18/84)

To forestall further comments:  yes, I know quicksort can be modified
such that sorting an already sorted list is no longer O(n^2).  (In
fact, the ``qsort'' routine in the C library is already so modified,
I believe.)  However, the point remains that if you know a great
deal about the input data, you can almost always come up with
something better than a ``generic'' sort algorithm.  For instance,
if you know that at most one item is out of place, you shouldn't
use any of the standard sorts.  Just don't claim that some particular
sort is ``the best,'' because there is no single best sort.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

ags@pucc-i (Seaman) (06/18/84)

>  But I wonder if using parallel processing might improve bubble sort's rating.
>  A parallel version would run in O(n).  Is O(log n) possible?  What is
>  the fastest known/possible parallel sorting method?

Bjorn Mossberg of Control Data Corporation presented a paper on "Sorting
on the CYBER 205" at the Symposium on CYBER 205 Applications, held at the
Institute for Computational Studies at Colorado State University, Fort
Collins, Colorado, Aug. 12-13, 1982.

His empirically-derived timing formulas for vectorized versions of the
three sorting algorithms which have been under discussion here (Bubble,
Insertion and Quicksort) are as follows:  (all logs are base e)

1.  Bubble sort

    t(N) = N^2/2 + N(151 log N + 96) cycles

    This is still an O(N^2) algorithm, but faster than the scalar version.

2.  Insertion sort

    t(N) = N^2/4 + 100N cycles (pure vector version)

    This is considerably faster than Bubble for all N.  For sorting
    small lists, the 205 can do somewhat better with a scalar insertion
    sort that takes advantage of the large number of working registers
    on the machine.  This method will sort a list of M elements (M small)
    in approximately

    t(M) = 15M^2/4 + 3M cycles.

    The preferred scheme would therefore be to do the first 27 steps in
    scalar mode, and the rest in vector mode.

3.  Median-of-three Quicksort

    t(N) = 2.5N(36 + log N) cycles

    Note that for all reasonable values of N, the linear term dominates.
    In other words, for all lists small enough to fit into the mass storage
    available at a typical computer installation, the algorithm sorts in
    essentially linear time!  The only known sorting algorithm that can
    compete with Quicksort on the CYBER 205 is Batcher's sort, and only
    for particular values of N.
-- 

Dave Seaman			"My hovercraft is full of eels."
..!pur-ee!pucc-i:ags

liberte@uiucdcs.UUCP (06/19/84)

#R:umcp-cs:-747100:uiucdcs:26400014:000:1077
uiucdcs!liberte    Jun 18 00:00:00 1984

/**** uiucdcs:net.lang / andrew@orca / 10:22 am  Jun 16, 1984 ****/
The continued popularity of the bubble sort is perplexing.

	"In short, the bubble sort seems to have nothing to recommend
	it, except a catchy name and the fact that it leads to some
	interesting theoretical problems."
/* ---------- */

Worth some study:  
	1. the continued popularity of bubble sort and
	2. the perplexity of its popularity - we know so little

Perhaps bubble sort has an intuitive edge.  It is easier to imagine how
it works because it is "flat" and global and items float to their sorted
level.  Unlike the faster, more complex, recursive and hierarchical sorting
schemes, the bubbling of data has great appeal.  Too bad it is slow.

But I wonder if using parallel processing might improve bubble sort's rating.
A parallel version would run in O(n).  Is O(log n) possible?  What is
the fastest known/possible parallel sorting method?


Daniel LaLiberte          (ihnp4!uiucdcs!liberte)
U of Illinois, Urbana-Champaign, Computer Science
{moderation in all things - including moderation}