staylor@pmdvax.UUCP (Scott G. Taylor) (10/09/89)
I am looking for a better sort(1) which will exploit free memory to increase speed. Actually, I'm looking for one that will do wnything to increase speed. My application involves sorting many (>500,000) records. Has anyone written a sort which hashes or otherwise speeds execution? -Scott -- Scott G. Taylor Pmd Resources (818) 991-0068 {wlbr,mahendo}!snidely!staylor 31230 Cedar Valley Dr. Westlake Village, CA 91361 "Vienoti Latvijai!"
ok@cs.mu.oz.au (Richard O'Keefe) (10/09/89)
In article <110@pmdvax.UUCP>, staylor@pmdvax.UUCP (Scott G. Taylor) writes: > I am looking for a better sort(1) which will exploit free memory > to increase speed. Which sort(1) are you using? The System V sort(1) has an option -y$MEMORY_TO_USE_IN_KILOBYTES I recently posted a small fast sort (called nsort) to comp.sources.misc -- somewhere along the way it lost a "}"; patch coming -- which slurps the entire file into memory, sorts it, and writes it out; it is up to twice as fast as sort(1) even with the -y option because (a) it is using strcmp() rather than a generic key comparison routine (b) it uses a list merge rather than the misleading misnamed "quick"sort. A device you might use is to break your file into chunks, read them in, and sort them using a carefully crafted key comparison routine (nsort is a good place to start), then use sort(1) just to merge the chunks.
davidc@vlsisj.VLSI.COM (David Chapman) (10/11/89)
In article <110@pmdvax.UUCP> staylor@pmdvax.UUCP (Scott G. Taylor) writes: >I am looking for a better sort(1) which will exploit free memory >to increase speed. Actually, I'm looking for one that will do >wnything to increase speed. My application involves sorting >many (>500,000) records. Has anyone written a sort which hashes >or otherwise speeds execution? Try Knuth's Replacement Selection Sort, Vol. 3 p. 256. It's the algorithm we use to sort data (millions of records). It works particularly well when the time spent in comparing keys is small relative to I/O time. When this happens it sorts in almost linear time. Warning: it is not an in-place sort. You need enough disk space to hold a temporary file the size of the data file, and possibly room for a third copy if you don't want to stomp on the original. Sorry, I can't send you source code but if you have questions on how it works I'll be glad to answer them. -- David Chapman {known world}!decwrl!vlsisj!fndry!davidc vlsisj!fndry!davidc@decwrl.dec.com