[comp.lang.modula2] Heapsort vs. Quicksort

ariel@bimacs.BITNET (Ariel J. Frank) (10/05/89)
Hi net/lang guys. Regarding the following Quicksort vs. Heapsort discussion:

-----------------------------------------------------------------------------

Path: bimacs!barilvm!psuvm!psuvax1!brutus.cs.uiuc.edu!ginosko!uunet!munnari.oz.au!basser!steve
From: steve@basser.oz (Stephen Russell)
Newsgroups: comp.lang.modula2
Subject: Re: Quicksort vs. Heapsort
Message-ID: <2585@basser.oz>
Date: 30 Sep 89 16:31:55 GMT
References: <828zebolskyd@yvax.byu.edu>
Sender: msgs@basser.oz
Organization: Dept of Comp Sci, Uni of Sydney, Australia
Lines: 32

In article <828zebolskyd@yvax.byu.edu> zebolskyd@yvax.byu.edu writes:
>In <2033@ethz.UUCP>, Michael Rys writes:
>
>>In 1987 a guy called Carlson (I think) improved the Heapsort-Algorithm
>>by using a binary search for inserting into the sorted list. In this
>>way Heapsort is always faster than Quicksort for very larg n.
>
>It is interesting that it took until 1987 for somebody to make that
>improvement. I was watching a demonstration program for QuickBasic
>on the Macintosh that compares the various methods. The one they called
>a heap sort used a linear search to insert. I thought there was a mistake or
>bug, because a binary search would have been so much faster.

I'm going to feel a real fool if I get this wrong, but ... since when
did _anyone_ use 'linear searching' to add an element to a heap? I
think there is some confusion here. For example, there is no "sorted
list" in a heap, at least in the conventional sense of sorted. A
"linear search" to find the "insertion" point makes no sense at all.

To add an element to a heap with elements h[1] to h[n], you just add
one to n, put the new element at h[n] (for new n), then compare it with
h[n/2].  If h[n] > h[n/2], swap them, then compare h[n/2] with h[n/4], etc.
This maintains the invariant h[k] >= max(h[k*2], h[k*2+1]) for k = 1..n/2,
which is the definition of a heap.

This is obviously a binary search (the divisor doubles at each
iteration), and is the _only_ way to add elements to a heap.

I suspect that some out there are confusing an "insertion sort" with a
"heapsort".

Steve Russell

Path: bimacs!barilvm!psuvm!psuvax1!gatech!purdue!mentor.cc.purdue.edu!ags
From: ags@mentor.cc.purdue.edu (Dave Seaman)
Newsgroups: comp.lang.modula2
Subject: Re: Quicksort vs. Heapsort
Message-ID: <4287@mentor.cc.purdue.edu>
Date: 1 Oct 89 16:02:05 GMT
References: <828zebolskyd@yvax.byu.edu> <2585@basser.oz>
Reply-To: ags@mentor.cc.purdue.edu (Dave Seaman)
Organization: Purdue University
Lines: 48

In article <2585@basser.oz> steve@basser.oz (Stephen Russell) writes:
>I'm going to feel a real fool if I get this wrong, but ... since when
>did _anyone_ use 'linear searching' to add an element to a heap? I
>think there is some confusion here. For example, there is no "sorted
>list" in a heap, at least in the conventional sense of sorted. A
>"linear search" to find the "insertion" point makes no sense at all.

I have never seen the QuickBasic Demo, nor have I heard a description of this
variety of Heapsort before, but I understood immediately what the writer meant.

There is indeed a "linear list" involved, although the elements of the list do
not reside in consecutive locations in memory.

>To add an element to a heap with elements h[1] to h[n], you just add
>one to n, put the new element at h[n] (for new n), then compare it with
>h[n/2].  If h[n] > h[n/2], swap them, then compare h[n/2] with h[n/4], etc.
>This maintains the invariant h[k] >= max(h[k*2], h[k*2+1]) for k = 1..n/2,
>which is the definition of a heap.

Precisely.  The linear list begins with h[n], h[n/2], h[n/4], ..., and ends
with h[1].  Just imagine a triangular-shaped heap display and notice the path
back to the root.  It zigs and zags, but it is basically a linear list.
It is sorted, except for the last element, which needs to be inserted in the
proper place.  The traditional way to do this has been to step through the
list in linear fashion, as you described.  But, since it is a linear list, it
is quite sensible to do a binary search instead.

>This is obviously a binary search (the divisor doubles at each
>iteration), and is the _only_ way to add elements to a heap.

No.  That is not a binary search.  Consider, for example, the case where n =
137 (after incrementing).  The linear list, in this case, consists of

        h[1], h[2], h[4], h[8], h[17], h[34], h[68], h[137].

The method of comparing h[137] first with h[68], then h[34], etc., is obviously
a linear search.  A binary search would begin by comparing h[137] with h[8]
(the one in the middle of the sorted segment).  Depending on the result of that
comparison, h[137] would next be compared with either h[2] or h[34].

>I suspect that some out there are confusing an "insertion sort" with a
>"heapsort".

Not at all.  The confusion is between "linear search" and "binary search".

--
Dave Seaman
ags@seaman.cc.purdue.edu

-----------------------------------------------------------------------------

The idea for using binary search for insertion into a heap is an
interesting one (cost is now O(loglogn)). This is only however for
comparisons!  You still have to do all the shifts to place it in the
right place (instead of swaps though).

However, this does not help in the regular Heapsort!  The initial heap
construction using siftup (Aho calls it pushdown) is O(nlogn) (really
O(n)) and so is the sort phase itself. None of these phases uses
insertion into a heap - only pushdown! So where is the improvement in
the constant factor (compared to Quicksort)! Any comments?

Any reference to Carlson? Thanks, Ariel.

--
    Ariel J. Frank
    Deputy Chairperson, Dept. of Mathematics and Computer Science
    Bar Ilan University, Ramat Gan, Israel 52100
    Tel: (972-3) 5318407/8

    BITNET: ariel@bimacs (also F68388@barilan)
    ARPA:   ariel%bimacs.bitnet@cunyvm.cuny.edu
    CSNET:  ariel%bimacs.bitnet%cunyvm.cuny.edu@csnet-relay
    UUCP:   uunet!mcvax!humus!bimacs!ariel