[comp.sys.mac.hypercard] Sorting gets very slow

man@cs.brown.edu (09/11/89)

I've developed a stack for use as a bibliographic database supporting
BibTeX format.  I'm getting a report from a user that his stack, which
is about 1.2 meg, takes around 20 minutes to sort using HyperCard's
builtin sort routine.  Why is this so slow?  Did they use an n log n
algorithm for the sort?  Is there any way of writing an XCMD to do the
sorting which might work faster?


	--Mark

dan@Apple.COM (Dan Allen) (09/13/89)

HyperCard uses QuickSort to sort cards of a stack.  It should be quite
fast in most cases.  Large stacks obviously take longer, but the length
is mainly determined by the number of cards, not the size of these
cards.

Dan Allen
HyperCard Team

taylorj@yvax.byu.edu (09/21/89)

In message <14843@brunix.UUCP>, Mark <Somebody> complains about slow sorting on
large stacks.

There is no way you can write an XCMD that will sort a stack any faster than
HyperCard.
I recommend that you solve the problem by keeping the stack in sorted order.
When a new card is inserted you can do a binary search (which in large stacks
will actually be faster than HyperCard's built in search) to find the correct
position for the new card and then insert it in that position.  This way you
will never have to sort the stack.

Jim Taylor
Microcomputer Support for Curriculum   |
Brigham Young University               |   Bitnet: taylorj@byuvax.bitnet
101 HRCB, Provo, UT  84602             |   Internet: taylorj@yvax.byu.edu

man@brunix (Mark H. Nodine) (09/22/89)

In article <807taylorj@yvax.byu.edu> taylorj@yvax.byu.edu writes:
>In message <14843@brunix.UUCP>, Mark <Somebody> complains about slow sorting on
>large stacks.
>
>I recommend that you solve the problem by keeping the stack in sorted order.
>When a new card is inserted you can do a binary search (which in large stacks

Unfortunately, this won't work for my stack because I have several different
backgrounds, all of which have a key field with the same name.  When I need
to create a new card, I suppose it is (barely) conceivable that I could go
to the card with the same background that is closest in sorted order to where
I want it to appear, but in general I can't just create it in the right place.

Some of the n log n sorting algorithms become n^2 when the list is in "almost
sorted" order.  My stack is usually "almost sorted", so I wonder it that was
the problem.

	--Mark

taylorj@yvax.byu.edu (09/26/89)

Dan Allen said HyperCard uses a QuickSort routine (and if Dan Allen says so, I
believe it!).  Unlike some sorts, QuickSort's speed does not improve if the
data is already sorted (or mostly sorted).  It shouldn't actually be any slower
than normal, but it won't be any faster either.


Jim Taylor
Microcomputer Support for Curriculum
Brigham Young University
taylorj@byuvax.bitnet

pepke@loligo (Eric Pepke) (09/26/89)

In article <818taylorj@yvax.byu.edu> taylorj@yvax.byu.edu writes:
>Dan Allen said HyperCard uses a QuickSort routine (and if Dan Allen says so, I
>believe it!).  Unlike some sorts, QuickSort's speed does not improve if the
>data is already sorted (or mostly sorted).  It shouldn't actually be any slower
>than normal, but it won't be any faster either.

If I remember correctly, the speed of QuickSort is based on the ability
accurately to guess the median of a range in constant time.  If this is
done perfectly all the time, the algorithm is O(n log n).  If this is done
poorly, the algorithm is O(n^2).  One way of guessing the median which
is often presented as a bad way is to take the first three values and
average them.  In this case, the more sorted the data, the slower the
algorithm would be.

Of course, there are much better ways to guess the median.  However, it 
still remains a best case O(n log n), worst case O(n^2) algorithm.  If you
are clever, you can get the expected case to be O(n log n).

I don't understand why QuickSort is so popular, because there are plenty
of good, easy, stable sorts that give worst case O(n log n) performance and 
best case O(n).  Maybe people are misled by the name, or maybe everybody 
starts from the source to qsort.

Then again, I don't understand why some people buy double-strength pain
reliever that costs three times as much, either.

Eric Pepke                                     INTERNET: pepke@gw.scri.fsu.edu
Supercomputer Computations Research Institute  MFENET:   pepke@fsu
Florida State University                       SPAN:     scri::pepke
Tallahassee, FL 32306-4052                     BITNET:   pepke@fsu

Disclaimer: My employers seldom even LISTEN to my opinions.
Meta-disclaimer: Any society that needs disclaimers has too many lawyers.