neal@druhi.ATT.COM (USENET News Administrator) (12/02/89)
I typed in the sort function from the reference manual and was very disappointed with it. First of all, it uses a bubble sort, which may be the most overused bad example around. As Knuth said in 1973, "In short, the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems." Second, it operates on lists. It seems that doing GETs from lists is much slower than doing GETs from arrays. Just changing the way that the length of the input is calculated allows the same algorithm to run much faster given an array as an argument. I implemented an insertion sort algorithm which is faster still, especially for data which is already nearly in order, but it still takes around 2 minutes to sort 50 random numbers (without using FAST). Another thing I want is a way to re-order whole arrays based on a single column. It might be best to generate the inversion of the column and then make one pass through the array to re-order it. Another possibility is to change the comparison method in the sort program to be user-specified and sort a list of vectors (a taste of object-oriented programming! Too bad we can't overload primitive operators like "<"...) It would seem that a natural way to speed things up would be to develop a machine-language implementation of the inner loop of the program. Have any of these ideas been implemented and published? I note that EduCalc is listing a new book on tricks for the HP-28 which advertises a sorting program. Does anyone know if it is any good? Thanks for any help! If no one has a better solution, I'm willing to post my insertion sort program. -Neal McBurnett, neal@druhi.att.com or att!druhi!neal
ge@kunivv1.sci.kun.nl (Ge' Weijers) (12/04/89)
neal@druhi.ATT.COM (USENET News Administrator) writes: >I typed in the sort function from the reference manual and was very >disappointed with it. First of all, it uses a bubble sort, which >may be the most overused bad example around. As Knuth said in 1973, >"In short, the bubble sort seems to have nothing to recommend it, except >a catchy name and the fact that it leads to some interesting theoretical >problems." >I implemented an insertion sort algorithm which is faster still, especially >for data which is already nearly in order, but it still takes >around 2 minutes to sort 50 random numbers (without using FAST). Well, insertion sort is only better than bubble sort by a constant factor. I'll try to write a sort routine that sorts 50 numbers in 30 seconds or so. This should be possible. (I hope) >It would seem that a natural way to speed things up would be to develop >a machine-language implementation of the inner loop of the program. Only after you've used a better algorithm. I'll get back on this. >-Neal McBurnett, neal@druhi.att.com or att!druhi!neal Ge' Weijers, ge@cs.kun.nl Ge' Weijers Internet/UUCP: ge@cs.kun.nl Faculty of Mathematics and Computer Science, (uunet.uu.net!cs.kun.nl!ge) University of Nijmegen, Toernooiveld 1 6525 ED Nijmegen, the Netherlands tel. +3180612483 (UTC-2)
neal@druhi.ATT.COM (Neal D. McBurnett) (12/05/89)
in article <553@kunivv1.sci.kun.nl>, ge@kunivv1.sci.kun.nl (Ge' Weijers) says: > Well, insertion sort is only better than bubble sort by a constant factor. Thanks for the correction of any misperception I might have spread. I realize that for big sorting tasks one clearly wants an algorithm better than O(N^2) (both bubble and insertion sorts are O(N^2)). On the other hand, it is important to tailor a sorting algorithm to the hardware and application at hand. If we are interested in getting a routine which can quickly sort 10 or 20 numbers we will be less interested in the n^2 term than in the constant factor. If we want to sort 100 numbers we are much more interested in the n^2 term. If we're sorting large arrays it is also desireable to do the sorting in place, and avoid having to use a lot of extra memory. After some more thought, I have concluded that there are at least two promising avenues for the HP28 architecture. One is to take advantage of the "sigmaMAX" or "sigmaMIN" operation and implement a selection sort. Unfortunately, the POS function doesn't work on arrays, and sigmaMIN doesn't work on lists. What a shame. It may make sense to keep two copies of the data, one in an array for finding the minimum value, and one in a list for finding out where that minumum value is. The other idea is to do the sorting on the stack. It seems that ROLL on the stack is faster than a PUT into an array even for a set of data 100 elements long. If I were to choose a good algorithm for a large array, it would probably be one of heapsort, quicksort or shell's sort, depending on which one was easiest to implement. -Neal McBurnett, neal@druhi.att.com or att!druhi!neal