[comp.lang.modula2] Quicksort vs. Heapsort

mrys@ethz.UUCP (Michael Rys) (09/20/89)

A little bit late, but...

In 1987 a guy called Carlson (I think) improved the Heapsort-Algorithm
by using a binary search for inserting into the sorted list. In this
way Heapsort is always faster than Quicksort for very larg n.

Cheers.../Michael

+---------------------------------------------------------------+
| Michael Rys, V. Conzett Str. 34; CH-8004 Zuerich; Switzerland |
+---------------------------------------------------------------+
| UUCP:  mrys@ethz.UUCP or       EAN:     mrys@ifi.ethz.ch	|
|        mrys@bernina.UUCP       IPSANet: mrys@ipsaint		|
| Voice: +41 1 242 35 87					|
+---------------------------------------------------------------+
-- Wovon man nicht sprechen kann, darueber muss man schweigen. --
       Ludwig Wittgenstein, Tractatus logico-philosophicus

zebolskyd@yvax.byu.edu (09/29/89)

In <2033@ethz.UUCP>, Michael Rys writes:

>In 1987 a guy called Carlson (I think) improved the Heapsort-Algorithm
>by using a binary search for inserting into the sorted list. In this
>way Heapsort is always faster than Quicksort for very larg n.

It is interesting that it took until 1987 for somebody to make that
improvement. I was watching a demonstration program for QuickBasic
on the Macintosh that compares the various methods. The one they called
a heap sort used a linear search to insert. I thought there was a mistake or
bug, because a binary search would have been so much faster. But maybe it
took a graphical display to make it obvious. Anybody who wants to see an
entertaining graphical comparison of search algorithms should find a friend
who has QuickBasic for their Mac. It is even in color on a Mac II.

--Lyle D. Gunderson   zebolskyd@yvax.byu.edu

steve@basser.oz (Stephen Russell) (09/30/89)

In article <828zebolskyd@yvax.byu.edu> zebolskyd@yvax.byu.edu writes:
>In <2033@ethz.UUCP>, Michael Rys writes:
>
>>In 1987 a guy called Carlson (I think) improved the Heapsort-Algorithm
>>by using a binary search for inserting into the sorted list. In this
>>way Heapsort is always faster than Quicksort for very larg n.
>
>It is interesting that it took until 1987 for somebody to make that
>improvement. I was watching a demonstration program for QuickBasic
>on the Macintosh that compares the various methods. The one they called
>a heap sort used a linear search to insert. I thought there was a mistake or
>bug, because a binary search would have been so much faster.

I'm going to feel a real fool if I get this wrong, but ... since when
did _anyone_ use 'linear searching' to add an element to a heap? I
think there is some confusion here. For example, there is no "sorted
list" in a heap, at least in the conventional sense of sorted. A
"linear search" to find the "insertion" point makes no sense at all.

To add an element to a heap with elements h[1] to h[n], you just add
one to n, put the new element at h[n] (for new n), then compare it with
h[n/2].  If h[n] > h[n/2], swap them, then compare h[n/2] with h[n/4], etc.
This maintains the invariant h[k] >= max(h[k*2], h[k*2+1]) for k = 1..n/2,
which is the definition of a heap.

This is obviously a binary search (the divisor doubles at each
iteration), and is the _only_ way to add elements to a heap.

I suspect that some out there are confusing an "insertion sort" with a
"heapsort".

Steve Russell

ags@mentor.cc.purdue.edu (Dave Seaman) (10/01/89)

In article <2585@basser.oz> steve@basser.oz (Stephen Russell) writes:
>I'm going to feel a real fool if I get this wrong, but ... since when
>did _anyone_ use 'linear searching' to add an element to a heap? I
>think there is some confusion here. For example, there is no "sorted
>list" in a heap, at least in the conventional sense of sorted. A
>"linear search" to find the "insertion" point makes no sense at all.

I have never seen the QuickBasic Demo, nor have I heard a description of this
variety of Heapsort before, but I understood immediately what the writer meant.

There is indeed a "linear list" involved, although the elements of the list do
not reside in consecutive locations in memory.  

>To add an element to a heap with elements h[1] to h[n], you just add
>one to n, put the new element at h[n] (for new n), then compare it with
>h[n/2].  If h[n] > h[n/2], swap them, then compare h[n/2] with h[n/4], etc.
>This maintains the invariant h[k] >= max(h[k*2], h[k*2+1]) for k = 1..n/2,
>which is the definition of a heap.

Precisely.  The linear list begins with h[n], h[n/2], h[n/4], ..., and ends
with h[1].  Just imagine a triangular-shaped heap display and notice the path
back to the root.  It zigs and zags, but it is basically a linear list.
It is sorted, except for the last element, which needs to be inserted in the
proper place.  The traditional way to do this has been to step through the
list in linear fashion, as you described.  But, since it is a linear list, it
is quite sensible to do a binary search instead.

>This is obviously a binary search (the divisor doubles at each
>iteration), and is the _only_ way to add elements to a heap.

No.  That is not a binary search.  Consider, for example, the case where n =
137 (after incrementing).  The linear list, in this case, consists of

	h[1], h[2], h[4], h[8], h[17], h[34], h[68], h[137].

The method of comparing h[137] first with h[68], then h[34], etc., is obviously
a linear search.  A binary search would begin by comparing h[137] with h[8]
(the one in the middle of the sorted segment).  Depending on the result of that
comparison, h[137] would next be compared with either h[2] or h[34].

>I suspect that some out there are confusing an "insertion sort" with a
>"heapsort".

Not at all.  The confusion is between "linear search" and "binary search".

-- 
Dave Seaman	  					
ags@seaman.cc.purdue.edu

steve@basser.oz (Stephen Russell) (10/02/89)

In article <4287@mentor.cc.purdue.edu> ags@mentor.cc.purdue.edu (Dave Seaman) writes:
>In article <2585@basser.oz> steve@basser.oz (Stephen Russell) writes:
>>I'm going to feel a real fool if I get this wrong, but ... 

Well, I was wrong, and I'm feeling quite foolish, really ... :-)

Dave's explanation of the technique of performing a binary search to
find the correct insertion point along the path to the new leaf is very
clear. This is a technique I've never seen before, though having been shown
it so clearly by Dave, I'm amazed how `obvious' it is.

Well, we learn something new every day. Thanks, Dave, much appreciated.
(BTW, I found your node's name "mentor" particularly apropos to this
situation.)

zebolskyd@yvax.byu.edu (10/05/89)

Perhaps it would help if I described what I remember seeing in the
QuickBasic demo that did visual sorts.

The algorithm that was (perhaps erroneously) labelled a "Heap Sort" started
out with a nice colorful mess of unsorted elements displayed as vertical
bars of various heights and corresponding colors. One element was then
selected (the criteria for the selection was not apparent, but appeared to
be the leftmost element) and moved to the far left. Then the next unsorted
element was moved to the right of where the first was deposited, and swapped
with it if it was smaller. The next element to be sorted was moved to the
left of the pair, and exchanged with its immediate left neighbor until the
neighbor was smaller. The process continued in this manner, with each unsorted
element being placed at the right edge of what was presumably termed the
"heap" and swapped, one step at a time, until it found its place. Painfully
slow, especially for the latter elements.

How does a _real_ heapsort work?

Lyle D. Gunderson     zebolskyd@yvax.byu.edu

MARKV@UKANVAX.BITNET ("MARK GOODERUM - UNIV. OF KANSAS ACS - MARKV@UKANVAX") (10/05/89)

A 'real' heapsort works in a heap.  The easiest implementation to imagine
is an array.

Heapsort works on the principle of a partially ordered tree.  For each element
n, it's two children are stored as element 2n and 2n+1.  The partailly ordered
tree property states that every element n is greater than its left child 2n
and less than its right child 2n+1.

Ohh, shoot.  Emergency, gotta go, sorry.  Anyone want to pick up where I left
off?

-Mark Gooderum

piet@cs.ruu.nl (Piet van Oostrum) (10/09/89)

In article <INFO-M2%89100509454230@UCF1VM>, MARKV@UKANVAX ("MARK GOODERUM - UNIV. OF KANSAS ACS - MARKV@UKANVAX") writes:
 `A 'real' heapsort works in a heap.  The easiest implementation to imagine
 `is an array.
 `
 `Heapsort works on the principle of a partially ordered tree.  For each element
 `n, it's two children are stored as element 2n and 2n+1.  The partailly ordered
 `tree property states that every element n is greater than its left child 2n
 `and less than its right child 2n+1.
 `
No, <= both, or >=, depending on the direction of the sort.
-- 
Piet van Oostrum, Dept of Computer Science, University of Utrecht
Padualaan 14, P.O. Box 80.089, 3508 TB Utrecht,  The Netherlands.
Telephone: +31-30-531806      Internet: piet@cs.ruu.nl
Telefax:   +31-30-513791      Uucp: uunet!mcsun!hp4nl!ruuinf!piet