BOTCHAIR@UOGUELPH.BITNET (Alex Bewley) (09/10/89)
Can anybody out there send me any kind of sorting procedure. I need one badly for a project I am working on. All the books on sorting always seem to be out. Alex Bewley BOTCHAIR@UOGUELPH
MARKV@UKANVAX.BITNET ("MARK GOODERUM - UNIV. OF KANSAS ACS - MARKV@UKANVAX") (09/10/89)
Hmm, sorting can be very implementaion dependant. Are you sorting numbers, strings, etc. Are they a linked list, an array, etc. Are your sorting keys unique or not, etc. You get the idea. However, give me one or more details and I might dig out a procedure or algorithm around. I have a couple of Quicksort routines lying around that aren't too bad and will give you much better performance than the old quick and dirty bubble sorts. (Of course if your sorting LARGE amounts of data I have a nice Heapsort routine I did last semester.) Adios, Mark Gooderum MARKV@UKANVAX markv@kuhub.cc.ukans.edu
ok@cs.mu.oz.au (Richard O'Keefe) (09/10/89)
In article <INFO-M2%89091001300983@UCF1VM>, MARKV@UKANVAX.BITNET ("MARK GOODERUM - UNIV. OF KANSAS ACS - MARKV@UKANVAX") writes: > I have a couple of Quicksort routines lying around that > aren't too bad and will give you much better performance than the old > quick and dirty bubble sorts. *WARNING* If your comparisons are costly (e.g. comparing strings, or almost _anything_ more than a single in-line numeric comparison), Quicksort was known to do 40% more comparisons than Merge sort _when_ _it_ _was_ _invented_. It was invented for a machine whose main memory was something like 512 words (its disc was a massive 16k words), so that minimal extra space was vital. If you have room for one extra word per record (and if you are using variable length records you are almost certainly using a link word of some sort anyway) then the space advantage vanishes. Even with the median-of-three hack, the average case of "Quick" sort is slower than the *worst* case of merge sort. If you are sorting numbers, radix sort is _grossly_ faster than "quick" sort. I have posted some sorting routines in C to the original requester. (I accidentally posted them to MARKV too, sorry 'bout that.) Bubble sort? Surely nobody has ever used bubble sort except as an example in a textbook of how badly it is possible to do it.
heilpern@VAX1.ACS.UDEL.EDU (HEILPERN) (09/12/89)
As I am only in a first level modula course, I can't yet describe the specifics, but here is the simple method of performing a bubble sort, not the best sort in the world, but effective, especially as a learning tool... for (I = 0; I < NUMITEMS-1; I++) for (J = I+1; J < NUMITEMS; J++) if (item[J] 'goesbefore' item[I] 'swapthem') This is written similar to the C construct. I++ means 'I:=I+1', or INC(I). Let me know if this is too vauge... Mark A. Heilpern --M. respond to heilpern@brl rather than as the mailer says!!!
lins@Apple.COM (Chuck Lins) (09/12/89)
I have a highly optimize d quicksort implementation that will appear in "The Modula-2 Software Component Library" Volume 4. As long as you are sorting arrays, it is independent of the data being sorted (since the caller provides the routines necessary for comparison and swapping of items). Link me if your're interested. Chuck Lins -- Chuck Lins | "Exit right to funway." Internet: lins@apple.com | AppleLink: LINS Snail Mail: 20525 Mariani Ave. M/S: 41-K Cupertino, CA 95014 "I like the future - I'm in it." - Firesign Theater
BOTCHAIR@UOGUELPH.BITNET (Alex Bewley) (09/12/89)
Bubble sort is a little bit too slow for my application. I need to sort a linked list in memory. The algorithm I want must not use or require a FIXED array to sort, as the list is dynamic. Lists are not like arrays either, if you go swapping elements in the list pointers become mixed up and you end up trying to sort the interrupt table or screen (which isnt't very good). For the time being I have to use a fixed ARRAY OF POINTERS and then sort the array using a comparison routine that checks the information that the pointers are pointing to, which suprisingly works, but strikes me as being a rather ridiculous way to sort the list. Besides, the array limits the amount of information I can store, which I don't want to do. Alex Just this guy...
MARKV@UKANVAX.BITNET ("MARK GOODERUM - UNIV. OF KANSAS ACS - MARKV@UKANVAX") (09/12/89)
>Bubble sort? Surely nobody has ever used bubble sort ...
Except maybe a non-computerese technical person hacking out quick programs.
My poor friend the ME major couldn't figure out why the simple little program
he wrote to sort 10000 or so lines of a parts list by differnet numbers took
so long... Oh well. "The book showed it so it must be okay..."
But what sort to use as we all (should) know depends on many factors, which
is why answering the original question in anything mor than a vague reply
was difficult. My personal favorite is Heapsort or Binary Insertion (what I
think you refer to as Merge) sorting.
Cordially,
Mark Gooderum
MARKV@UKANVAX
ODX@PSUVM.BITNET (Tim Larson) (09/12/89)
In article <INFO-M2%89091120313809@UCF1VM>, BOTCHAIR@UOGUELPH.BITNET (Alex Bewley) says: > > Bubble sort is a little bit too slow for my application. I need to sort > a linked list in memory. The algorithm I want must not use or > require a FIXED array to sort, as the list is dynamic. Still not quite enough information to tell, but you may not need to sort at all if it's a linked list! That is, as you load each element in the list, simply link it in in-order. When the last element is added, the list is sorted already. I used this technique once in school, when a grade depended on it (we were graded partially on how fast our program was in comparison to the rest of the class). In that case, my program outperformed all the programs that performed sorts. If the list is long, and you can afford a bit more space, this can be made more efficient by using two links and a tree structure. I don't know if this applies to your situation, but maybe there is a germ of an idea you can use here. -Tim Larson odx@psuvm.bitnet
GRANGERG@VTVM1.BITNET (Greg Granger) (09/12/89)
AB>From: Alex Bewley <BOTCHAIR@UOGUELPH> AB>Bubble sort is a little bit too slow for my application. I need to AB>... AB>pointers become mixed up and you end up trying to sort the interrupt AB>... AB>which I don't want to do. Well, I was having the same problem a few months back. I was using TopSpeed Modula-2 to read a directory (queue the filenames of a dir.) and wanted to sort the results. TopSpeed provides a very useful QuickSort and HeapSort with their 'standard' libraries, but these routines quite reasonably expected a indexed list. (Note that the TopSpeed canned sorts require as arguments the list size, a Less than procedure and a Swap procedure, therefore the use of ARRAYs is not mandated, this is not true with all such canned sorts) In any case, what I did was write a module that allowed GETs, PUTs and SWAPs on a heap based indexed list. This allows dynamic array handling, which solved my problem. I have been reworking this module lately to include, unlike types (not `true' arrays), dropping elements, deinit'ing (kill the array and recovering all heap space used), etc... I would be happy to share my source code with you with the following understanding: (A) this is a module being actively tested (deinit'ing is at this time not functional) (B) it's written in TopSpeed Modula-2 with absolutely no attention given to keeping with the Wirth defined standard (ie if TS gave me a special non-standard function that made things easier I used it). (C) Please don't pass the source code around, while I believe that it will help you, it is not finished to the point that I would like to have it floating around with my name on it. ----- To interested other readers. If other are interested in this code, let me know and I will consider releasing it to the list when I have finished testing and debugging. Greg (GRANGERG @ VTVM1)
lins@Apple.COM (Chuck Lins) (09/12/89)
In article <INFO-M2%89091120313809@UCF1VM> Modula2 List <INFO-M2%UCF1VM.bitnet@jade.berkeley.edu> writes: > > Bubble sort is a little bit too slow for my application. I need to sort > a linked list in memory. The algorithm I want must not use or > require a FIXED array to sort, as the list is dynamic. > Lists are not like arrays either, if you go swapping elements in the list > pointers become mixed up and you end up trying to sort the interrupt table > or screen (which isnt't very good). > For the time being I have to use a fixed ARRAY OF POINTERS and then sort > the array using a comparison routine that checks the information > that the pointers are pointing to, which suprisingly works, but strikes me > as being a rather ridiculous way to sort the list. Besides, the array > limits the amount of information I can store, which I don't want to do. > Alex, There are several alternative approaches you can take: 1. use quicksort for lists. Originally developed by Dalia Motzkin, in an article that appeared in Communications of the ACM I think in the mid-70's. A Pascal implementation appears in G.H. Gonnet's book "Handbook of Algorithms and Data Structures", Addison-Wesley, 1983. 2. Use a dynamically allocated array of pointers instead of a fixed size one. An outline of how this could be done follows: MDOULE SortPointers; TYPE ptrData = POINTER TO Data; TYPE arrayOfPtrs = ARRAY [0..100000] OF ptrData; (* or some other very large number for the maximum index *) VAR myArray : POINTER TO arrayOfPtrs; BEGIN (* allocate an array of the size you need *) Allocate(myArray, SIZE(ptrData) * "number of pointers in the array"); (* fill the array with the pointers *) (* sort it *) (* do whatever else you need to do with the sorted data *) (* get rid of the array of pointers when done *) Deallocate(myArray, SIZE(ptrData) * "number of pointers"); END SortPointers. The only limitation depends on the compiler - will it allow a very large index or does it limit the data space used for arrays (e.g., 32/64K limits on arrays that appear in microcomputer Modula and Pascal compilers). -- Chuck Lins | "Exit right to funway." Internet: lins@apple.com | AppleLink: LINS Snail Mail: 20525 Mariani Ave. M/S: 41-K Cupertino, CA 95014 "I like the future - I'm in it." - Firesign Theater
lins@Apple.COM (Chuck Lins) (09/12/89)
Binary Insertion sort and Merge sort are _NOT_ the same thing. They are different algorithms. Wirth shows both algorithms in "Algorithms and Data Structures". Merge sort requires 2N space (e.g., to sort 1000 element array you need a 2000 element array). Binary Insertsion sort is an in-pace sort needing only a single temporary. HeapSort is a good algorithm that guarantees sorting in O(n log n) time regardless of the input. Though, "most" of the time Quicksort will outperform it. Quicksort's main problem is that is has a worst-case asymptotic complexity of O(n**2) which typically occurs when the array is already sorted in either ascending or descending sequence. Optimizations can reduce the chances of poor performance in the worst case but cannot eliminate it. -- Chuck Lins | "Exit right to funway." Internet: lins@apple.com | AppleLink: LINS Snail Mail: 20525 Mariani Ave. M/S: 41-K Cupertino, CA 95014 "I like the future - I'm in it." - Firesign Theater
ok@cs.mu.oz.au (Richard O'Keefe) (09/13/89)
In article <34666@apple.Apple.COM>, lins@Apple.COM (Chuck Lins) writes: > Binary Insertion sort and Merge sort are _NOT_ the same thing. They are > different algorithms. Wirth shows both algorithms in "Algorithms and Data > Structures". Merge sort requires 2N space (e.g., to sort 1000 element array > you need a 2000 element array). Binary Insertsion sort is an in-pace sort > needing only a single temporary. It is certainly true that Binary Insertion sort and Merge sort are not the same. It is seriously misleading to say that Merge sort requires 2N space. The truth of the matter is that Merge sort requires N (pointers or integers) *in addition to the data being sorted*, so that if you are sorting records that contain 100N bytes Merge sort needs 102N bytes total. Even more to the point: if you want to sort a *LIST* then the pointer fields Merge sort needs are already there, and Merge sort doesn't need any more. (It does need a stack with O(lg N) pointers, but so does "Quick" sort.) If you have N dynamically allocated objects, you are already paying N pointers to refer to them; if you put those pointers in the objects themselves and link them into a list, you (a) won't have trouble with fixed-size arrays of pointers and (b) will be able to use Merge sort with no extra overheads. Binary Insertion sort does O(N**2) moves, which means that it is seldom a good idea. Merge sort costs O(N.lg N) for moves/links as well as comparisons.
diamond@csl.sony.co.jp (Norman Diamond) (09/13/89)
In article <INFO-M2%89091000074411@UCF1VM> Modula2 List <INFO-M2%UCF1VM.bitnet@jade.berkeley.edu> writes: > Can anybody out there send me any kind of sorting procedure. I need one > badly for a project I am working on. All the books on sorting always seem > to be out. If the library's copies are always checked out then buy one. They're too big to fit through the wires going to your terminal. Anyway, you might take a course some day in algorithms or databases. -- -- Norman Diamond, Sony Corporation (diamond@ws.sony.junet) The above opinions are inherited by your machine's init process (pid 1), after being disowned and orphaned. However, if you see this at Waterloo or Anterior, then their administrators must have approved of these opinions.
MARKV@UKANVAX ("MARK GOODERUM - UNIV. OF KANSAS ACS - MARKV@UKANVAX") (09/15/89)
Good points. Here are some quick notes I looked up last night to add to this discussion. Quick sort has a worse case time of O(n^2) but an average case of O(n log n). Heapsort is both worst and average case O(n log n) times. Quicksorts average times are usually a little better than Heapsort. I should be noted that if often times if you take a set and randomly distribute the order the elements, then you get a random distribution that Quicksort can do in n log n quicker than most algorithms. It should be noted that ANY algorithm for sorting an arbitrary set using only s LessThan comparison can not be quicker than n log n time. If something special is known about the keys, like they are all unique, then times can be improved. If the keys are all unique and can be reduced to a contiguous, ordered sequence, then you can sort in O(n) time. For now, Mark