[comp.sys.mac.hypercard] Sorting Containers

dan@Apple.COM (Dan Allen) (06/25/89)

Several people have wanted a routine to sort lines of a field or
variable (containers).  Here is a HyperTalk script that uses the
selection sort method of sorting.  Selection sorting is an algorithm of
order n squared, however, for the average random cases where n < 1000,
selection sorting often performs closer to n log n, so it is not bad for
small sort jobs.  Quicksort or Heapsort would be better for large n, but
this is so simple...

This script is a function.  For example, if you wanted to sort field 1
of the current card, you could say:

	put sortContainer(field 1) into field 1

Here is the actual script:

function sortContainer anyContainer -- selection sort
  get anyContainer
  put the number of lines of it into numLines
  repeat with i = 1 to numLines - 1
    put i into k
    put line i of it into x
    repeat with j = i+1 to numLines
      if line j of it < x then
        put j into k
        put line j of it into x
      end if
    end repeat
    put line i of it into line k of it
    put x into line i of it
  end repeat
  return it
end sortContainer

Enjoy!

Dan Allen
HyperCard Team
Apple Computer

kurtzman@pollux.usc.edu (Stephen Kurtzman) (06/26/89)

In article <32656@apple.Apple.COM> dan@Apple.COM (Dan Allen) writes:

>Here is a HyperTalk script that uses the selection sort method of sorting.
>Selection sorting is an algorithm of order n squared, however, for the
>average random cases where n < 1000, selection sorting often performs closer
>to n log n, so it is not bad for small sort jobs.  Quicksort or Heapsort
>would be better for large n, but this is so simple...

>This script is a function.  For example, if you wanted to sort field 1
>of the current card, you could say:

>	put sortContainer(field 1) into field 1

>Here is the actual script:

>function sortContainer anyContainer -- selection sort
>  get anyContainer
>  put the number of lines of it into numLines
>  repeat with i = 1 to numLines - 1
>    put i into k
>    put line i of it into x
>    repeat with j = i+1 to numLines
>      if line j of it < x then
>        put j into k
>        put line j of it into x
>      end if
>    end repeat
>    put line i of it into line k of it
>    put x into line i of it
>  end repeat
>  return it
>end sortContainer

This is picky, but I suspect that an expression such as "line j of it" is
not an O(1) operation in HyperTalk. If we assume some reasonable fixed bound
on line length, then it is probably O(j) since it would have to scan the
string for the j-1 occurrence of a line break and then copy all of the
characters up to the jth line break. What this means is, if my suspicion
is correct and we assume there is an upper limit on line length,
the script above is actually O(n**3), where n is the number of lines in
the container.

Like I said, it was a picky thing. It shouldn't stop anyone from using
the script. Thanks to Dan for posting it.

A point the script supports, which Dan may or may not have been trying to
make, is that it is easy to quickly put together some very functions useful
functions in HyperTalk. The sortContainer script is a great example. It
probably took Dan a couple of minutes because HyperTalk has simple expressions
to access bits and pieces (actually lines, words, and chars) of containers.
In other words, there's little reason wait around for someone to write an
XCMD for a function that can be implemented very quickly in HyperTalk.

stone@rye.cs.unm.edu (Andrew Stone) (06/26/89)

>In other words, there's little reason wait around for someone to write an
>XCMD for a function that can be implemented very quickly in HyperTalk.

Why wait? Wing Eng of Cornell posted ShellSort awhile back. It sorts 
containers several MAGNITUDES faster than a hypertalk script That's
the efficiency difference between a tool and a toy.

andrew

||<<++>>||<<-->>||<<==>>||<<++>>||<<??>>||<<++>>||<<-->>||<<==>>||<<++>>||
!!	   Andrew Stone	            !!        the fictive milieu of	!!
!!         stone@rye.cs.unm.edu	    <> 	      contemporary society!	!!
||<<++>>||<<-->>||<<==>>||<<++>>||<<??>>||<<++>>||<<-->>||<<==>>||<<++>>||

dan@Apple.COM (Dan Allen) (06/27/89)

In article <18102@usc.edu> kurtzman@pollux.usc.edu (Stephen Kurtzman) writes:
>This is picky, but I suspect that an expression such as "line j of it" is
>not an O(1) operation in HyperTalk. If we assume some reasonable fixed bound
>on line length, then it is probably O(j) since it would have to scan the
>string for the j-1 occurrence of a line break and then copy all of the
>characters up to the jth line break. What this means is, if my suspicion
>is correct and we assume there is an upper limit on line length,
>the script above is actually O(n**3), where n is the number of lines in
>the container.

Good point.  This script could probably be optimized massively, but as
mentioned, it is a short (1 screen) script that sorts lines.  I hope
other people can pull out their Knuth "Sorting and Searching" volumes
and improve on this simple example.  I wrote the script in 10 minutes.
I am sure that hours will prove worthwhile.

Dan Allen
Apple Computer

Douglas.Welch@f823.n102.z1.FIDONET.ORG (Douglas Welch) (06/29/89)

Thank you Dan,
 
This is exactly what I was looking for.  I haven't written a sort
procedure in a long time and was not exactly looking forward to creating
one from scratch.
 
BTW... if you want to sort on an item internal to the line  you just need
to alter the line that reads...
 
if line j of it < X then
 
to something like...
 
if char 16 to 30 of line of it < char 16 to 30 of x then ....
 
Leave every thing else alone and it will sort on the 16-30 character of
each line but leave the line intact.
 
Thanks Again...
 
Douglas E. Welch
Van Nuys, CA


--  
Douglas Welch via cmhGate - Net 226 fido<=>uucp gateway Col, OH
UUCP: ...!osu-cis!n8emr!cmhgate!102!823!Douglas.Welch
INET: Douglas.Welch@f823.n102.z1.FIDONET.ORG

dan@Apple.COM (Dan Allen) (07/03/89)

In article <15027.24ACEB9A@cmhgate.FIDONET.ORG> Douglas.Welch@f823.n102.z1.FIDONET.ORG (Douglas Welch) writes:
>Thank you Dan,
> 
>This is exactly what I was looking for.  I haven't written a sort
>procedure in a long time and was not exactly looking forward to creating
>one from scratch.

Thanks for the thanks.  BTW, in my analysis of selection sorting I may
have left the wrong impression...

Selection sorting is on the average an O(n squared) algorithm for
comparisons, O(n ln n) for moves, and O(n squared) in the worst case.
The reason that sometimes selection sorting is quite zippy is that moves
are USUALLY a lot more expensive than comparisons, so on the average
selection sorting works out to be the best of the straight methods
(insertion, selection, exchange or bubblesort).  However, as someone has
already pointed out, the way my posted algorithm stands, the comparisons
are also quite expensive due to the way HyperTalk has to search for each
line linearly, not a performance boost to say the least.

If I get a few free minutes, maybe I'll try coding up QuickSort, which
will be a big win compared to selection sorting.

Dan Allen
Apple Computer

wade@sdacs.ucsd.EDU (Wade Blomgren) (07/04/89)

Here is the Quicksort algorithm from Wirth translated into HyperTalk.
It seems to work pretty well.  When run against the Selection sort
posted previously (by Dan Allen), it is competitive for N < 30 (sometimes
slower, sometimes faster depending on the initial state of the container),
but increasingly faster for larger N.  It sorted 200 more or less
random lines in about 1/5 the time.  You are still probably better
off with an XCMD or XFCN sort routine for large N, due to the inherent 
overhead of HyperTalk, but this is a workable solution for some situations.

Wade Blomgren
UC San Diego
wade@sdacs.ucsd.edu

-------------cut here--------------
This is a translation of an algorithm, not a software release for any
particular purpose or recipient.  Use at your own risk. No warranty 
is expressed or implied.

function quickSort anyContainer
  get anyContainer
  put 1 into s
  put 1 into item 1 of line s of stack
  put the number of lines of it into item 2 of line s of stack
  repeat until s = 0
    put item 1 of line s of stack into l
    put item 2 of line s of stack into r
    subtract 1 from s
    repeat until l >= r
      put l into i
      put r into j
      put line ((l+r) div 2) of it into x
      repeat until i > j
        repeat while line i of it < x
          add 1 to i
        end repeat
        repeat while x < line j of it
          subtract 1 from j
        end repeat
        if i <= j then
          put line i of it into w
          put line j of it into line i of it
          put w into line j of it
          add 1 to i
          subtract 1 from j
        end if
      end repeat
      if i < r then
        add 1 to s
        put i into item 1 of line s of stack
        put r into item 2 of line s of stack
      end if
      put j into r
    end repeat
  end repeat
  return it
end quickSort

-------------cut here--------------

Douglas.Welch@f823.n102.z1.FIDONET.ORG (Douglas Welch) (07/05/89)

Thanks again...  I will be awaiting quicksort with baited breath.
 
Douglas E. Welch
Van Nuys, CA


--  
Douglas Welch via cmhGate - Net 226 fido<=>uucp gateway Col, OH
UUCP: ...!osu-cis!n8emr!cmhgate!102!823!Douglas.Welch
INET: Douglas.Welch@f823.n102.z1.FIDONET.ORG