[comp.lang.c] sort

glenn@stsim.UUCP (Glenn Ford) (06/22/89)

Does anyone have a sort routine (an in-memory sort) faster than quicksort?  If
so, please e-mail it to me.  Thanks.

Glenn Ford
..uunet!ocsmd!stsim!glenn

glenn@ocsmd.ocs.com (glenn ford) (06/27/89)

Does anyone know of a sort routine (in memory) faster than non-recursive
quicksort?  If so, could you please e-mail it to me at the address below.
Thank you.

..uunet!ocsmd!stsim!glenn

kenny@m.cs.uiuc.edu (06/28/89)

Glenn Ford (uunet!ocsmd!stsim!glenn) writes:
>Does anyone have a sort routine (in memory sort) faster than quicksort, of
>so please e-mail it to me, thanks.

Generally speaking, there isn't any; at least, no sort can be faster
than quicksort by more than a constant factor on a von Neumann
machine.  

Depending on what you're doing, it may be to your advantage to code a
custom sort for your particular application; qsort(), while `as fast
as possible' up to the constant factor, is still rather slow (you pay
for the generality with a subroutine call for every comparison, and
lots of pointer de-references).  A non-recursive implementation of
quicksort is probably the way to go; use care in selecting the pivot
element.  You may also find it advantageous to hand-optimize the
generated code.
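
A rough sketch of such a non-recursive quicksort in C (the element type, the
simple last-element pivot, and the fixed stack size are illustrative choices,
not a tuned implementation):

    #define MAXSTACK 64          /* ample: the explicit stack stays logarithmic */

    void quicksort(long a[], long n)
    {
        struct { long lo, hi; } stack[MAXSTACK];
        int top = 0;

        stack[top].lo = 0;
        stack[top].hi = n - 1;
        top++;

        while (top > 0) {
            long lo, hi, i, j, pivot, t;

            top--;
            lo = stack[top].lo;
            hi = stack[top].hi;
            if (lo >= hi)
                continue;                 /* 0 or 1 element: nothing to do */

            /* Partition around a[hi]; choosing the pivot more carefully
             * (e.g. median of three) avoids the quadratic case on
             * already-ordered input. */
            pivot = a[hi];
            i = lo;
            for (j = lo; j < hi; j++)
                if (a[j] < pivot) {
                    t = a[i]; a[i] = a[j]; a[j] = t;
                    i++;
                }
            t = a[i]; a[i] = a[hi]; a[hi] = t;

            /* Push the larger subrange first so the smaller one is sorted
             * next; this keeps the stack depth logarithmic. */
            if (i - lo > hi - i) {
                stack[top].lo = lo;    stack[top].hi = i - 1; top++;
                stack[top].lo = i + 1; stack[top].hi = hi;    top++;
            } else {
                stack[top].lo = i + 1; stack[top].hi = hi;    top++;
                stack[top].lo = lo;    stack[top].hi = i - 1; top++;
            }
        }
    }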

If your lists are close to ordered already, you may do well to use one
of the sorts that can take advantage of the ordering; Shell-Metzner
sort is one of them.  Also, if you're adding items to an
already-sorted list, you do better to sort the additions and then
merge the two lists.
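
For example, a sketch of that merge step (types and names are illustrative
only): sort the additions, then combine the two sorted arrays in one linear
pass.

    /* Merge sorted arrays a[0..na-1] and b[0..nb-1] into out[0..na+nb-1]. */
    void merge(const long a[], long na, const long b[], long nb, long out[])
    {
        long i = 0, j = 0, k = 0;

        while (i < na && j < nb)             /* take the smaller head each time */
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        while (i < na)                       /* one side ran out: copy the rest */
            out[k++] = a[i++];
        while (j < nb)
            out[k++] = b[j++];
    }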

There are lots of applications that are sort-bound; good luck in
achieving a reasonable running time for yours.  Sorry that there isn't
an easy way out from where you are; a good source of ideas is
Volume 3 of Knuth.

| /         o            Kevin Kenny                             (217) 333-5821
|<  /) |  | | |/\        Department of Computer Science           o  ,    o  ,
| \ X_  \/  | | |        University of Illinois                 40 07 N 88 13 W
kenny@cs.uiuc.edu        1304 W. Springfield Ave.       
uunet!uiucdcs!kenny      Urbana, IL   61801                  AD ASTRA PER ARDUA
k-kenny@uiuc.edu
kenny%cs@uiucvmd.bitnet

mpl@cbnewsl.ATT.COM (michael.p.lindner) (06/29/89)

In article <4700040@m.cs.uiuc.edu>, kenny@m.cs.uiuc.edu writes:
> 
> Glenn Ford (uunet!ocsmd!stsim!glenn) writes:
> >Does anyone have a sort routine (in memory sort) faster than quicksort, of
> 
> Generally speaking, there isn't any; at least, no sort can be faster
> than quicksort by more than a constant factor on a von Neumann
> machine.  
	...

Of COURSE a sort can be faster than quicksort.  First of all, quicksort has
problems with data which is already close to being sorted.  There are sorts
which are consistently N log N no matter WHAT shape the data is in, such as
merge sort.  Then there's radix sort, which takes time proportional to N
times the number of bits in the key.  Which sort is best depends on the data.
I refer Glenn to Donald E. Knuth's "Sorting and Searching," considered by
many to be the definitive book on sorting.  If he's just looking for an
implementation, I have some, but I signed an intellectual property agreement
with my employer, so he'll need permission from Bell Labs to get them from
me.  I'm sure someone out there has a radix sort or some such that they'd
share with us.

Mike Lindner
attunix!mpl
AT&T Bell Laboratories
190 River Rd.
Summit, NJ 07901

henry@utzoo.uucp (Henry Spencer) (06/29/89)

In article <4700040@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes:
>>Does anyone have a sort routine (in memory sort) faster than quicksort, of
>>so please e-mail it to me, thanks.
>
>Generally speaking, there isn't any; at least, no sort can be faster
>than quicksort by more than a constant factor on a von Neumann machine.  

There is an important phrase missing here:  "in the worst case".  There
are sort algorithms which are considerably better (linear rather than
linear-log) in the *typical* case.  Or so I am told.
-- 
NASA is to spaceflight as the  |     Henry Spencer at U of Toronto Zoology
US government is to freedom.   | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

aitken@shamesh.cs.cornell.edu (William E. Aitken) (06/30/89)

In article <1989Jun29.155648.28819@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>In article <4700040@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes:
>>>Does anyone have a sort routine (in memory sort) faster than quicksort, of
>>>so please e-mail it to me, thanks.
>>
>>Generally speaking, there isn't any; at least, no sort can be faster
>>than quicksort by more than a constant factor on a von Neumann machine.  
>
>There is an important phrase missing here:  "in the worst case".  There

No.  One of quicksort's problems is its horrendous worst-case behavior:
quadratic.  It's rather easy to find sorts that are faster than quicksort
in the worst case.  Merge sort and Shell sort come to mind.
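
For reference, a compact top-down merge sort sketch (illustrative element
type, caller-supplied scratch buffer); it does N log N work however the
input is ordered:

    /* Sort a[0..n-1]; tmp must have room for n elements. */
    void merge_sort(long a[], long n, long tmp[])
    {
        long half = n / 2, i = 0, j = half, k = 0;

        if (n < 2)
            return;
        merge_sort(a, half, tmp);                 /* sort each half... */
        merge_sort(a + half, n - half, tmp);
        while (i < half && j < n)                 /* ...then merge the halves */
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i < half)
            tmp[k++] = a[i++];
        while (j < n)
            tmp[k++] = a[j++];
        for (k = 0; k < n; k++)
            a[k] = tmp[k];
    }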



William E. Aitken <aitken@cs.cornell.edu>   | On Earth it is considered 
{uw-beaver,rochester,decvax}!cornell!aitken | ill mannered to kill your
Home: (607) 273-7810 Away : (607) 255-9206  | friends while committing suicide
============================================*============= Avon ==============

desnoyer@apple.com (Peter Desnoyers) (06/30/89)

In article <1989Jun29.155648.28819@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
> In article <4700040@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes:
> >>Does anyone have a sort routine (in memory sort) faster than quicksort, of
> >>so please e-mail it to me, thanks.
> >
> >Generally speaking, there isn't any; at least, no sort can be faster
> >than quicksort by more than a constant factor on a von Neumann machine.
> 
> There is an important phrase missing here:  "in the worst case".  There
> are sort algorithms which are considerably better (linear rather than
> linear-log) in the *typical* case.  Or so I am told.

Actually, quicksort is O(N**2) worst case and O(NlogN) typical case. It can
be modified to be O(NlogN) in all cases, at the cost of a steep (but
constant) performance hit. (This modified qsort is only of academic 
interest.)

Depending on how you define "sort", the theoretical bound is either 
O(NlogN) or O(N). If you can only compare two elements and determine 
<, =, or >, then the limit is O(NlogN), which quicksort achieves on average. 
If you can express each element as a string of symbols in a finite 
alphabet you can do a radix sort - e.g. (for purely alphabetic data)

   rsort(list, pos)
     if length of list <= 1 return list
     else
        look at letter number pos of each element, put into 1 of 26 buckets
        (elements that end before letter pos go into a "done" bucket, first)
        rsort each bucket on letter pos+1
        concatenate the sorted buckets, return resulting list
   end rsort

You can determine that the running time of this algorithm is proportional
to the sum of the lengths of the strings to be sorted - i.e. O(N) instead
of O(NlogN). However, you need to know the data representation - you 
can't write a general purpose radix sort.
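
Here is one way that might look in C, sketched for NUL-terminated byte
strings; the 256-way buckets, the scratch allocation, and the recursion on
string position are illustrative choices, not part of the posting above:

    #include <stdlib.h>

    /* Sort the n string pointers in v[] by the bytes from offset pos on.
     * Call as rsort(v, n, 0).  Within one call every string already agrees
     * on bytes 0..pos-1, so strings whose byte at pos is '\0' (bucket 0)
     * are complete and sort first. */
    void rsort(char **v, size_t n, size_t pos)
    {
        size_t count[256] = {0};     /* how many strings have byte c at pos */
        size_t start[256];           /* first output slot of each bucket */
        size_t next[256];
        size_t i, c;
        char **tmp;

        if (n <= 1)
            return;

        for (i = 0; i < n; i++)
            count[(unsigned char)v[i][pos]]++;

        start[0] = 0;                /* prefix sums give bucket positions */
        for (c = 1; c < 256; c++)
            start[c] = start[c - 1] + count[c - 1];

        tmp = malloc(n * sizeof *tmp);   /* distribute into scratch, copy back */
        if (tmp == NULL)
            return;                      /* out of memory: leave range unsorted */
        for (c = 0; c < 256; c++)
            next[c] = start[c];
        for (i = 0; i < n; i++)
            tmp[next[(unsigned char)v[i][pos]]++] = v[i];
        for (i = 0; i < n; i++)
            v[i] = tmp[i];
        free(tmp);

        /* Recurse on each bucket of strings that have not ended yet,
         * looking at the next byte position. */
        for (c = 1; c < 256; c++)
            if (count[c] > 1)
                rsort(v + start[c], count[c], pos + 1);
    }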

The last thing to remember is that O(N**2) isn't always worse than 
O(NlogN). A good practical hash algorithm may take time kN**2, where 
k is really small, compared to a theoretically "better" algorithm 
that takes time KNlogN, where K is really big. Unless the size of your
data set approaches infinity (e.g. census data) the practical algorithm
will be better.

                                      Peter Desnoyers
                                      Apple ATG
                                      (408) 974-4469

djones@megatest.UUCP (Dave Jones) (06/30/89)

In article <4700040@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes:
>>Does anyone have a sort routine (in memory sort) faster than quicksort, of
>>so please e-mail it to me, thanks.
>
>Generally speaking, there isn't any; at least, no sort can be faster
>than quicksort by more than a constant factor on a von Neumann machine.  


Where'd you come up with that?  If you can spare the memory for it, a bucket
sort, with a bucket for each key, is O(n), not O(n log n).  You can make
trade-offs between speed and memory usage with mixed methods.  Besides, there
are considerations of average case vs. worst case.  The relative costs of
internal and external memory access and of key comparison also matter.
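
A sketch of that bucket-per-key idea (a counting sort), assuming the keys are
small non-negative integers; the key range below is an illustrative
assumption:

    #define KEYS 65536                  /* keys assumed to lie in [0, KEYS) */

    /* Sort a[0..n-1] in O(n + KEYS) time and O(KEYS) extra space. */
    void counting_sort(unsigned a[], long n)
    {
        static long count[KEYS];        /* one bucket per possible key value */
        long i, k, out = 0;

        for (k = 0; k < KEYS; k++)
            count[k] = 0;
        for (i = 0; i < n; i++)         /* tally every key: O(n) */
            count[a[i]]++;
        for (k = 0; k < KEYS; k++)      /* replay the keys in order */
            for (i = 0; i < count[k]; i++)
                a[out++] = (unsigned)k;
    }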

The subject of sorting has been (ahem) hashed over perhaps as much as any
other in the field of computers. Why not "open book before engaging mouth"? :-)

bengsig@oracle.nl (Bjorn Engsig) (06/30/89)

From article <1989Jun29.155648.28819@utzoo.uucp> by henry@utzoo.uucp (Henry Spencer):
|There
|are sort algorithms which are considerably better (linear rather than
|linear-log) in the *typical* case.  Or so I am told.

I simply couldn't resist this, although it has nothing at all to do with C.

A few years ago an article in either Comm. of the ACM or Scientific American
(I don't remember which) described the spaghetti sorter, which is linear in
time for any initial distribution.

You take a bunch of (unboiled) spaghetti and cut the strands to lengths
corresponding to the elements you want to sort; this is of course linear in
time.  You then hold all the strands together and stand them vertically on a
table (at worst linear in time), and you can easily pick out the longest,
then the next longest, and so on, again linear in time.  As a result, you
have your data sorted in linear time!

Don't ask me how you implement this in C :-)
-- 
Bjorn Engsig, ORACLE Europe         \ /    "Hofstadter's Law:  It always takes
Path:   mcvax!orcenl!bengsig         X      longer than you expect, even if you
Domain: bengsig@oracle.nl           / \     take into account Hofstadter's Law"

paul@moncam.co.uk (Paul Hudson) (06/30/89)

In article <29474@cornell.UUCP> aitken@shamesh.cs.cornell.edu (William E. Aitken) writes:
   In article <1989Jun29.155648.28819@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
   >In article <4700040@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes:
   >>>Does anyone have a sort routine (in memory sort) faster than quicksort, of
   >>>so please e-mail it to me, thanks.
   >>
   >>Generally speaking, there isn't any; at least, no sort can be faster
   >>than quicksort by more than a constant factor on a von Neumann machine.  
   >
   >There is an important phrase missing here:  "in the worst case".  There

   No.  One of quicksort's problems is its horrendous worst case behavior :
   quadratic.  It's rather easy to find sorts that are faster than quicksort

However, using a median-of-three pivot makes the worst case *extremely*
unlikely.  See Sedgewick, "Algorithms".
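
A sketch of the median-of-three choice (element type and helper name are
illustrative): order a[lo], a[mid], a[hi], then partition around the middle
one.

    /* Put the median of a[lo], a[mid], a[hi] into a[mid] and return mid. */
    static long median_of_three(long a[], long lo, long hi)
    {
        long mid = lo + (hi - lo) / 2;
        long t;

        if (a[mid] < a[lo])  { t = a[mid]; a[mid] = a[lo];  a[lo]  = t; }
        if (a[hi]  < a[lo])  { t = a[hi];  a[hi]  = a[lo];  a[lo]  = t; }
        if (a[hi]  < a[mid]) { t = a[hi];  a[hi]  = a[mid]; a[mid] = t; }
        /* Now a[lo] <= a[mid] <= a[hi]; use a[mid] as the pivot. */
        return mid;
    }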

--
Paul Hudson	 MAIL: Monotype ADG, Science Park, Cambridge, CB4 4FQ, UK.
		PHONE: +44 (223) 420018	  EMAIL: paul@moncam.co.uk,
	;"	  FAX: +44 (223) 420911		 ...!ukc!acorn!moncam!paul
 `"";";"        "/dev/null full: please empty the bit bucket"

brock@tuvie (Inst.f.Prakt.Info 1802) (06/30/89)

In article <950@cbnewsl.ATT.COM> mpl@cbnewsl.ATT.COM (michael.p.lindner) writes:
>In article <4700040@m.cs.uiuc.edu>, kenny@m.cs.uiuc.edu writes:
>> Glenn Ford (uunet!ocsmd!stsim!glenn) writes:
>> >Does anyone have a sort routine (in memory sort) faster than quicksort, of
>Of COURSE a sort can be faster than quicksort.  First of all, quicksort has
[...]
>I'm sure someone out there has a radix sort or some such that they'd share
>with us.

May I refer you to
G. H. Gonnet,
Handbook of Algorithms and Data Structures,
Addison-Wesley, 1984.
There you will find heapsort, Shell sort, radix sort, merge sort, and more,
in both C and Pascal source.

Ulrich

glenn@ocsmd.ocs.com (glenn ford) (06/30/89)

In article <4700040@m.cs.uiuc.edu> kenny@m.cs.uiuc.edu writes:
>
>Glenn Ford (uunet!ocsmd!stsim!glenn) writes:
>>Does anyone have a sort routine (in memory sort) faster than quicksort, of
>>so please e-mail it to me, thanks.
>
>Depending on what you're doing, it may be to your advantage to code a
>custom sort for your particular application; qsort(), while `as fast
>as possible' up to the constant factor, is still rather slow (you pay
>for the generality with a subroutine call for every comparison, and
>lots of pointer de-references).  A non-recursive implementation of
>quicksort is probably the way to go; use care in selecting the pivot
>element.  You may also find it advantageous to hand-optimize the
>generated code.

Let me explain to the world what I am currently doing.  At work we
do a lot of B-tree builds which require a sorted text file as input.
Thus the need for something fast, since the text file to sort can be
up to 250 megabytes with about 12-15 million records.  I currently
run on a 6310 (VAX) rated at about 5 MIPS, where a 780 is about 1 MIP.
The data is totally random and is spread across the alphabet pretty
well.  The only requirements are that the sort be CASE sensitive
(yes, that makes things REAL slow!) and that it support a prefix
offset.  Right now I can sort about 1.5 megabytes per minute.  I use
ONLY quicksort in memory, and if I don't have enough memory I either
a) split the file into multiple smaller files, then sort each and
append to the output file, or b) call sortmerge (VAX sort), which
takes care of those problems.  I usually use case a) unless I have
already split the file.  I only split the original file, and I split
it into 28 separate files.
Now, is there anything that can beat quicksort?

Thanks for all your help.

Glenn Ford
..uunet!ocsmd -->!glenn
            \--->!stsim!glenn

glenn@ocsmd.ocs.com (glenn ford) (07/04/89)

I just wanted to thank EVERYONE who has sent me e-mail about finding
a quicker sort.  Thanks, everyone!

Glenn

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/06/89)

In article <422@siswat.UUCP> buck@siswat.UUCP (A. Lester Buck) writes:
>But this [merge sorting with replacement selection] is definitely
>_not_ what Unix sort does, using qsort(3) for all sorts.

As of SVR2, the UNIX "sort" utility definitely did perform n-way merge
sorting with replacement selection.  It used a binary tree for each of
the internal runs being merged, inserting a fresh record from an
external merge input file after each merged record was output.  A
quicksort algorithm (not qsort() from the C library, however) was used
only to produce the initial internal runs.

When did this change?

buck@siswat.UUCP (A. Lester Buck) (07/08/89)

In article <10488@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <422@siswat.UUCP> buck@siswat.UUCP (A. Lester Buck) writes:
> >But this [merge sorting with replacement selection] is definitely
> >_not_ what Unix sort does, using qsort(3) for all sorts.
> 
> As of SVR2, the UNIX "sort" utility definitely did perform n-way merge
> sorting with replacement selection.  It used a binary tree for each of
> the internal runs being merged, inserting a fresh record from an
> external merge input file after each merged record was output.  A
> quicksort algorithm (not qsort() from the C library, however) was used
> only to produce the initial internal runs.
> 
> When did this change?

I was referring to replacement selection as the method to generate the
initial runs, not as the method for the n-way merge.  But these initial runs
are not "internal," as the original poster's problem called for an external
(tape or disk) sort.  In replacement selection, a selection tree (a
tournament tree of "losers") is maintained over the records in memory; each
time the smallest eligible record is output, a fresh record is read in, and
if its key is smaller than the record just written it is held back for the
next run.  When every record in memory is being held back, the current run
ends and a new run is started.
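
A sketch of replacement selection for run generation, using a binary heap
ordered on (run number, key) rather than a tree of losers; the record I/O
stubs and the in-core capacity are illustrative assumptions, not the
poster's code:

    #include <stdio.h>

    #define MEMORY 4096                 /* records that fit "in core" */

    struct node { long run; long key; };

    static struct node heap[MEMORY];
    static long heapsize;

    /* Does heap node x come before node y?  Lower run first, then lower key. */
    static int before(struct node x, struct node y)
    {
        return x.run < y.run || (x.run == y.run && x.key < y.key);
    }

    static void sift_down(long i)       /* restore heap order below slot i */
    {
        for (;;) {
            long l = 2 * i + 1, r = l + 1, m = i;
            struct node t;

            if (l < heapsize && before(heap[l], heap[m])) m = l;
            if (r < heapsize && before(heap[r], heap[m])) m = r;
            if (m == i)
                return;
            t = heap[i]; heap[i] = heap[m]; heap[m] = t;
            i = m;
        }
    }

    /* Stand-ins for real record input and run output. */
    static int read_record(long *key)
    {
        return scanf("%ld", key) == 1;
    }

    static void write_record(long run, long key)
    {
        printf("run %ld: %ld\n", run, key);
    }

    int main(void)
    {
        long key, i;

        /* Fill memory with the first MEMORY records, all tagged run 1. */
        while (heapsize < MEMORY && read_record(&key)) {
            heap[heapsize].run = 1;
            heap[heapsize].key = key;
            heapsize++;
        }
        for (i = heapsize - 1; i >= 0; i--)
            sift_down(i);

        while (heapsize > 0) {
            long run = heap[0].run, out = heap[0].key;

            write_record(run, out);     /* smallest key still usable in this run */
            if (read_record(&key)) {
                heap[0].key = key;
                /* A key smaller than the record just written cannot go in
                 * the current run; hold it back for the next run instead. */
                heap[0].run = (key < out) ? run + 1 : run;
            } else {
                heap[0] = heap[--heapsize];   /* input exhausted: shrink heap */
            }
            sift_down(0);
        }
        return 0;
    }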

These runs then go into the obvious n-way merge algorithm to be merged at the
highest order possible, limited only by the number of open files.
Replacement selection generates, on the average, runs that are twice as long
as those from a simple in-core sort, thus allowing a lower merge order and
possibly saving one or more input/output passes through all the data.  On partially
sorted data, the runs can be much longer.  Knuth Vol. 3 gives a cute
pictorial representation of the effective doubling of the run length using a
snowplow traveling in a circle during a snow storm.

Before I wrote my own custom tape sort with replacement selection, I was
using the SVR2 sort.  I was sorting ~160 MB tapes.  It clearly was doing a
simple in-core sort, because all the intermediate runs were exactly the same
length (~1.2 MB in my case, for the initial set of runs).  [I have also
glanced at the source since then, and verified that it was doing a quicksort
type sort in core (though not qsort(3), as Doug reminds me).  I do not
remember seeing any support for a replacement selection tree, just a simple
n-way comparison with no memory.]  I also had to make a binary patch to sort
to have it use a huge scratch filesystem, rather than /tmp or whatever by
default.  Only later did I learn of the undocumented switch to add a
temporary space directory.  In addition, different manuals described
different SVR2 sort options for the amount of memory to use, none of which
seemed to do anything on my version.  (The program could malloc() lots of
virtual memory, but we would only want to use the available physical RAM.)

Replacement selection can also be used in the n-way merge of initial runs,
but the savings are marginal and usually not worth the trouble.  For the
initial runs, the order of merge P is effectively the number of records that
will fit in RAM (real RAM, not virtual memory), while for the n-way merge,
the order cannot exceed the number of free file descriptors, which is about
15 in Unix.  Knuth suggests the breakeven in maintaining the comparison tree
is order P ~ 8. For less than this, it is faster to do P-1 comparisons for
each record.  The version of SVR2 sort I employed did not appear to use
replacement selection for any part of the sort/merge (and only merged at
order 10 or 11, if I remember), and all the intermediate merge files were
always the same fixed length (except for the last partial file).


From "File Structures: A Conceptual Toolkit" by Folk and Zoellick (p. 6):

	A few months ago one of us was interviewing an applicant for a
	position that involved working on a large information retrieval
	problem.  Over lunch we told the applicant about the difficulties
	we had finding a sort package suitable for sorting the hundreds
	of millions of records involved in the index structures for
	the system we were working on.  The applicant asked, "I suppose
	you use Quicksort for that?"

	His question can serve as a nice illustration of the difference
	between a typical _data structure_ course and the study of
	_file structures_.  [...]

	Unfortunately, he was also showing us that he had not had the
	opportunity to take a file structure course.  A file consisting
	of hundreds of millions of records is much too big to load into
	RAM for sorting.  The sorting must involve use of secondary storage.
	That means that we are no longer in a _random access_ environment;
	the cost of getting some pieces of information is now enormously
	more expensive than the cost of getting other information.
	[...]

	A later chapter gives careful consideration to the limitations
	of secondary storage devices in the construction of a sorting
	algorithm that is very different than Quicksort.


-- 
A. Lester Buck		...!texbell!moray!siswat!buck

buck@siswat.UUCP (A. Lester Buck) (08/08/89)

In article <770@ocsmd.ocs.com>, glenn@ocsmd.ocs.com (glenn ford) writes:
> Let me explain to the world what I am currently doing.  At our work we
> do alot of B-tree builds which require a sorted text file as input
> to sort.  Thus the need for something fast, since the text file
> to sort can be up to 250 megabytes with about 12-15 million records
> to sort.  I currently run on a 6310 (VAX) that is about 5 mips rating
> with 780 at 1 mip.  Now the data is totally random and is spread
> across the alphabet pretty well.  Only requirements is that it be
> CASE sensitive (yes, it makes things REAL slow then!) and that
> you have a prefix offset.  Now I can sort about 1.5 megabytes per
> minute.  I use ONLY quicksort in memory and if I don't have enough
> memory I either a)split the file into multiple smaller files, then
> sort and append to output file or b) I call sortmerge (VAX sort)
> which takes care of those problems.  I usually use case a unless I
> have allready split the file.  I only split the original file, and
> I split it into 28 seperate files.
> Now, is there anything that can beat quicksort?

Sure.  Once you can't fit everything into memory, you have a sort/merge
problem.  I had the same problem of sorting magnetic tapes full
of fixed length records, up to about 160-180 MBytes.  I only have
one tape drive, so I, of course, have to simulate other tape drives
on disk.  If you do the in-core sort with quicksort, you generate
fixed length runs to merge.  Instead, you should use a replacement
selection sorting algorithm for the in-core sort, which, on the
average, generates runs twice as long.  (Much longer if the data
is nearly sorted, which mine sometimes is.)  This saves a lot of
tape handling in the merge phase.  (Or a lot of seeking on disk.)
I would hope that this is what the VAX sortmerge is doing.  But this is
definitely _not_ what Unix sort does, using qsort(3) for all sorts.
Just remember, if even one merge pass can be eliminated by longer
runs, then that means the entire file (~200 MBytes) needs to be
input one less time.
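
A sketch of the merge phase (the run file names, the one-key-per-line record
format, and the plain linear scan over the run heads are all illustrative
assumptions):

    #include <stdio.h>

    #define NRUNS 8                      /* number of sorted run files to merge */

    int main(void)
    {
        FILE *run[NRUNS];
        long  head[NRUNS];               /* smallest unread key in each run */
        int   live[NRUNS];               /* does this run still have records? */
        int   i, best;

        for (i = 0; i < NRUNS; i++) {
            char name[32];
            sprintf(name, "run%d.txt", i);       /* hypothetical run file names */
            run[i]  = fopen(name, "r");
            live[i] = run[i] != NULL && fscanf(run[i], "%ld", &head[i]) == 1;
        }

        for (;;) {
            best = -1;
            for (i = 0; i < NRUNS; i++)          /* pick the smallest head key */
                if (live[i] && (best < 0 || head[i] < head[best]))
                    best = i;
            if (best < 0)
                break;                           /* every run is exhausted */
            printf("%ld\n", head[best]);         /* emit it, then refill that run */
            live[best] = fscanf(run[best], "%ld", &head[best]) == 1;
        }

        for (i = 0; i < NRUNS; i++)
            if (run[i] != NULL)
                fclose(run[i]);
        return 0;
    }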

I had to write my own routines tuned for my application, but the
algorithms are described very clearly in several places.  I used
Baase, "Computer Algorithms", and Knuth Vol. 3.  A general discussion
in Folk and Zoellick, "File Structures: A Conceptual Toolkit"
covers various sorting problems, in-core and on disk.

-- 
A. Lester Buck		...!texbell!moray!siswat!buck

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/26/89)

In article <422@siswat.UUCP> buck@siswat.UUCP (A. Lester Buck) writes:
>I would hope that this is what the VAX sortmerge is doing.  But this is
>definitely _not_ what Unix sort does, using qsort(3) for all sorts.

This claim was made before, and it was just as false then.  The UNIX
"sort" utility (every version I've seen, and definitely the SVR2 one)
does merge sorting via replacement selection.