ariel@bimacs.BITNET (Ariel J. Frank) (10/05/89)
Hi net/lang guys.

Regarding the following Quicksort vs. Heapsort discussion:

-----------------------------------------------------------------------------
Path: bimacs!barilvm!psuvm!psuvax1!brutus.cs.uiuc.edu!ginosko!uunet!munnari.oz.au!basser!steve
From: steve@basser.oz (Stephen Russell)
Newsgroups: comp.lang.modula2
Subject: Re: Quicksort vs. Heapsort
Message-ID: <2585@basser.oz>
Date: 30 Sep 89 16:31:55 GMT
References: <828zebolskyd@yvax.byu.edu>
Sender: msgs@basser.oz
Organization: Dept of Comp Sci, Uni of Sydney, Australia
Lines: 32

In article <828zebolskyd@yvax.byu.edu> zebolskyd@yvax.byu.edu writes:
>In <2033@ethz.UUCP>, Michael Rys writes:
>
>>In 1987 a guy called Carlson (I think) improved the Heapsort-Algorithm
>>by using a binary search for inserting into the sorted list. In this
>>way Heapsort is always faster than Quicksort for very large n.
>
>It is interesting that it took until 1987 for somebody to make that
>improvement. I was watching a demonstration program for QuickBasic
>on the Macintosh that compares the various methods. The one they called
>a heap sort used a linear search to insert. I thought there was a mistake or
>bug, because a binary search would have been so much faster.

I'm going to feel a real fool if I get this wrong, but ... since when
did _anyone_ use 'linear searching' to add an element to a heap? I
think there is some confusion here. For example, there is no "sorted
list" in a heap, at least in the conventional sense of sorted. A
"linear search" to find the "insertion" point makes no sense at all.

To add an element to a heap with elements h[1] to h[n], you just add
one to n, put the new element at h[n] (for new n), then compare it with
h[n/2]. If h[n] > h[n/2], swap them, then compare h[n/2] with h[n/4], etc.
This maintains the invariant h[k] >= max(h[k*2], h[k*2+1]) for k = 1..n/2,
which is the definition of a heap.
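
The insertion step just described could be sketched roughly as follows.
This is only an illustration, written in Python rather than Modula-2; the
names heap_insert and is_heap and the sample values are assumptions made
for the example, not anything taken from the posts. Slot h[0] is left
unused so that the indices match the h[1]..h[n] convention in the text.

```python
def heap_insert(h, x):
    """Add x to the 1-based max-heap h (h[0] is an unused placeholder)."""
    h.append(x)
    k = len(h) - 1                      # position of the new element (the new n)
    # Walk toward the root, swapping while the child exceeds its parent.
    while k > 1 and h[k] > h[k // 2]:
        h[k], h[k // 2] = h[k // 2], h[k]
        k //= 2


def is_heap(h):
    """Check the invariant: every element is <= its parent h[k // 2]."""
    return all(h[k // 2] >= h[k] for k in range(2, len(h)))


heap = [None]                           # slot 0 unused, indices start at 1
for value in [5, 3, 8, 1, 9, 2]:
    heap_insert(heap, value)

print(heap[1:])       # [9, 8, 5, 1, 3, 2]
print(is_heap(heap))  # True
```

Each pass of the loop compares the new element with exactly one ancestor,
the next one up, so the walk visits h[n/2], h[n/4], ... one at a time.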

This is obviously a binary search (the divisor doubles at each
iteration), and is the _only_ way to add elements to a heap.

I suspect that some out there are confusing an "insertion sort" with a
"heapsort".

Steve Russell

Path: bimacs!barilvm!psuvm!psuvax1!gatech!purdue!mentor.cc.purdue.edu!ags
From: ags@mentor.cc.purdue.edu (Dave Seaman)
Newsgroups: comp.lang.modula2
Subject: Re: Quicksort vs. Heapsort
Message-ID: <4287@mentor.cc.purdue.edu>
Date: 1 Oct 89 16:02:05 GMT
References: <828zebolskyd@yvax.byu.edu> <2585@basser.oz>
Reply-To: ags@mentor.cc.purdue.edu (Dave Seaman)
Organization: Purdue University
Lines: 48

In article <2585@basser.oz> steve@basser.oz (Stephen Russell) writes:
>I'm going to feel a real fool if I get this wrong, but ... since when
>did _anyone_ use 'linear searching' to add an element to a heap? I
>think there is some confusion here. For example, there is no "sorted
>list" in a heap, at least in the conventional sense of sorted. A
>"linear search" to find the "insertion" point makes no sense at all.

I have never seen the QuickBasic Demo, nor have I heard a description of
this variety of Heapsort before, but I understood immediately what the
writer meant. There is indeed a "linear list" involved, although the
elements of the list do not reside in consecutive locations in memory.

>To add an element to a heap with elements h[1] to h[n], you just add
>one to n, put the new element at h[n] (for new n), then compare it with
>h[n/2]. If h[n] > h[n/2], swap them, then compare h[n/2] with h[n/4], etc.
>This maintains the invariant h[k] >= max(h[k*2], h[k*2+1]) for k = 1..n/2,
>which is the definition of a heap.

Precisely. The linear list begins with h[n], h[n/2], h[n/4], ..., and ends
with h[1]. Just imagine a triangular-shaped heap display and notice the
path back to the root. It zigs and zags, but it is basically a linear
list. It is sorted, except for the last element, which needs to be
inserted in the proper place.
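
The path back to the root is easy to list by repeated halving. A minimal
sketch in Python, assuming integer division for n/2; the helper name
root_path is made up for this illustration and does not come from the
posts.

```python
def root_path(n):
    """Indices h[n], h[n/2], h[n/4], ..., h[1], using integer division."""
    path = []
    while n >= 1:
        path.append(n)
        n //= 2
    return path


print(root_path(137))  # [137, 68, 34, 17, 8, 4, 2, 1]
```

The path has about log2(n) entries, and in a valid heap the values stored
at those indices do not decrease as you move toward h[1], which is what
makes it a sorted list apart from the newly added element.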

The traditional way to do this has been to step through the list in linear
fashion, as you described. But, since it is a linear list, it is quite
sensible to do a binary search instead.

>This is obviously a binary search (the divisor doubles at each
>iteration), and is the _only_ way to add elements to a heap.

No. That is not a binary search. Consider, for example, the case where
n = 137 (after incrementing). The linear list, in this case, consists of

    h[1], h[2], h[4], h[8], h[17], h[34], h[68], h[137].

The method of comparing h[137] first with h[68], then h[34], etc., is
obviously a linear search. A binary search would begin by comparing h[137]
with h[8] (the one in the middle of the sorted segment). Depending on the
result of that comparison, h[137] would next be compared with either h[2]
or h[34].

>I suspect that some out there are confusing an "insertion sort" with a
>"heapsort".

Not at all. The confusion is between "linear search" and "binary search".
--
Dave Seaman
ags@seaman.cc.purdue.edu
-----------------------------------------------------------------------------

The idea of using binary search for insertion into a heap is an
interesting one: the cost of finding the insertion point drops to
O(log log n). That holds only for comparisons, however. You still have
to do all the shifts (instead of swaps) to put the new element in the
right place, and there can be O(log n) of those.

Moreover, this does not help the regular Heapsort. The initial heap
construction using siftup (Aho calls it pushdown) is O(n log n) (really
O(n)), and the sort phase itself is O(n log n). Neither phase uses
insertion into a heap, only pushdown. So where is the improvement in
the constant factor (compared to Quicksort)?

Any comments? Any reference to Carlson?

Thanks, Ariel.
--
Ariel J. Frank
Deputy Chairperson, Dept. of Mathematics and Computer Science
Bar Ilan University, Ramat Gan, Israel 52100
Tel: (972-3) 5318407/8
BITNET: ariel@bimacs (also F68388@barilan)
ARPA:   ariel%bimacs.bitnet@cunyvm.cuny.edu
CSNET:  ariel%bimacs.bitnet%cunyvm.cuny.edu@csnet-relay
UUCP:   uunet!mcvax!humus!bimacs!ariel
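
The comparisons-versus-shifts point can be made concrete with a small
sketch of binary-search insertion along the root path. This is written in
Python purely for illustration and is not claimed to be Carlson's method;
the name insert_binary, the (comparisons, moves) return value and the
test heap are all assumptions made for the example.

```python
def insert_binary(h, x):
    """Insert x into the 1-based max-heap h (h[0] unused).

    Binary-searches the ancestors of the new position, then shifts.
    Returns (comparisons, moves) so the two costs can be seen separately.
    """
    h.append(x)
    n = len(h) - 1
    # Ancestors of position n, nearest first: n/2, n/4, ..., 1.
    anc = []
    k = n // 2
    while k >= 1:
        anc.append(k)
        k //= 2
    # Values along anc never decrease toward the root (heap property), so
    # binary search for how many ancestors are strictly smaller than x.
    lo, hi = 0, len(anc)
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if h[anc[mid]] < x:
            lo = mid + 1
        else:
            hi = mid
    # Shift those lo ancestors one step down the path, then place x.
    pos = n
    for a in anc[:lo]:
        h[pos] = h[a]
        pos = a
    h[pos] = x
    return comparisons, lo


# A descending array is a valid max-heap; 136 elements, so the new n is 137.
heap = [None] + list(range(136, 0, -1))
print(insert_binary(heap, 1000))  # (3, 7)
print(heap[1])                    # 1000
```

With n = 137 the first comparison is against h[8], and the new maximum
is placed after 3 comparisons instead of 7, but all 7 ancestors still
have to be moved down one level.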