ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/09/90)
Date: 02-07-90 (17:50) Number: 2869 (Echo) To: ALL Refer#: NONE From: ZAFAR ESSAK Read: (N/A) Subj: LEARN TO SORT Status: PUBLIC MESSAGE This is not meant to detract from Ian Green's Modula-2 discussions where he has been discussing SORTS, but rather may shed more light on the topic; at least from a novice sorter's viewpoint. I would also like to apologize at the outset for the lack of references to the results of the Forth Sort contest that was held on GEnie. I would appreciate a note from anyone who recalls the name(s) of the files that pertain to that contest. I have looked over the various files related to sorts on the BCFB and found the descriptions of sorts by Norm Arnold in SORT_WIN.ZIP most helpful. To begin using and understanding SORTS I needed a generalized approach that would allow applying the SORT methods to a variety of data elements. This requires the use of a few variables and deferred execution words. Vector variables were chosen for the deferred words. There is an obvious performance penalty in using a generalized approach such as this but I felt it would give me an opportunity to understand the various SORT methods as well as allow experimentation on different data elements. The most striking, though simple, find was discovered through experimenting with consecutive SORTS. Data elements containing several fields were first sorted on one field and then sorted on a second field. Of the three sorts tried (BUBBLE, SHELL, and QWIK) only the BUBBLE Sort preserved the ordering of the first sort within the second sort. Unfortunately, the BUBBLE sort is by far the slowest of the three, taking five times longer than the QWIK sort and twice as long as the SHELL sort. The next few messages contain the Forth code used. This is essentially Forth-83 Standard which has implications in the DO...+LOOP where a negative integer is used. A simple example of an array sort was used to initially test the functioning of the sorts. This is illustrated with the code but requires a RANDOM number generator, several of which are available on the board. --- * Via Qwikmail 2.01 NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/09/90)
Date: 02-07-90 (17:50) Number: 2870 (Echo) To: ALL Refer#: NONE From: ZAFAR ESSAK Read: (N/A) Subj: LEARN TO SORT Status: PUBLIC MESSAGE \ SORTS.FLY Part 1 - Generalizing the Sort Function (( For a sort utility to have flexibility some words have deferred actions. Essentially, when sorting a collection of data elements, two elements are compared at a time. Depending on the results of the comparison the placement of two elements in the data collection may require that they be exchanged. The comparison test itself may need to be modified to suit the data elements. Also the exchange routine will need to be modified for the particular data collection. The following assumes the data elements are fixed length and uses the space at PAD when comparing and exchanging elements. )) ONLY FORTH ALSO DEFINITIONS DECIMAL VARIABLE DATA.MAX# ( --adr) \ maximum number of data elements VARIABLE DATA.WIDTH ( --adr) \ width of each data element VARIABLE ~DATA@ ( --adr) \ Vector for method to fetch data : DATA@ ( adr,i--) ~DATA@ @EXECUTE ; VARIABLE ~DATA! ( --adr) \ Vector for method to store data : DATA! ( adr,i--) ~DATA! @EXECUTE ; (( The following word uses PAD as the buffer and accesses different parts of the buffer for temporary storage of different data elements. )) : data.buf ( n--adr) DATA.WIDTH @ * PAD + ; : EXCHANGE ( n1,n2--) \ n1,n2 are indexes to the data elements 2DUP >R >R >R >R 1 data.buf R> DATA@ 2 data.buf R> DATA@ 2 data.buf R> DATA! 1 data.buf R> DATA! ; (( In the following comparison of data elements the flag returned shall be -1, 0, or 1 to indicate if the data at adr1 is less than, equal to, or greater than the data at adr2, respectively. )) VARIABLE ~SORT? ( --adr) \ Vector action for comparison test : SORT? ( adr1,adr2--?) ~SORT? @EXECUTE ; (( Various SORT algorhythms require the following variables be set by any applications that call the sort. DATA.MAX# DATA.WIDTH ~DATA@ ~DATA! ~SORT? )) \ The following is a sort of an array of 2 byte data elements VARIABLE startdata ( --adr) : CELL@ ( adr,i--) 2* startdata @ + @ SWAP ! ; : CELL! ( adr,i--) >R @ R> 2* startdata @ + ! ; : CELL.COMPARE ( adr1,adr2--?) SWAP @ SWAP @ - ; : CELL.SORT ( --) 2 DATA.WIDTH ! ['] CELL@ ~DATA@ ! ['] CELL! ~DATA! ! ['] CELL.COMPARE ~SORT? ! ; Part 2... --- * Via Qwikmail 2.01 NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/09/90)
Date: 02-07-90 (17:50) Number: 2871 (Echo) To: ALL Refer#: NONE From: ZAFAR ESSAK Read: (N/A) Subj: LEARN TO SORT Status: PUBLIC MESSAGE \ SORTS.FLY Part 2 - Different Sort Methods (( The BUBBLE SORT compares the first element with the second element. If necessary they are exchanged. Then the second element is compared with the third, etc. Thus, the largest element will be moved to the end of the list while most others only move down one position. The result is an unsorted list with a length one less than the previous list. The list size is decreased and the sort repeated until the list size is zero. Consecutive sorts will preserve the effects of prior sorts. )) : BUBBLE ( --) 1 DATA.MAX# @ 1- ?DO I 0 ?DO I DUP 1+ 0 data.buf 1 data.buf 2OVER 2OVER ROT DATA@ SWAP DATA@ SORT? 0> IF EXCHANGE ELSE 2DROP THEN LOOP -1 +LOOP ; (( The SHELL SORT is similar to the bubble sort except that instead of comparing each element with the next element it is compared with an element further down the list. However, it does not preserve the effects of prior sorts when consecutive sorts are performed. When the sort is started the gap between the numbers being compared is set equal to half the length of the list. When the routine can go through the list without trading the gap is halved and the routine repeated. When the gap equals zero the sort is finished. )) VARIABLE gap ( --adr) VARIABLE traded ( --adr) \ flag to indicate if elements exchanged : SHL ( --) gap @ 0= IF EXIT THEN gap @ 2/ gap ! 1 DATA.MAX# @ 1- gap @ - ?DO traded OFF I 0 ?DO I DUP gap @ + 0 data.buf 1 data.buf 2OVER 2OVER ROT DATA@ SWAP DATA@ SORT? 0> IF EXCHANGE traded ON ELSE 2DROP THEN LOOP traded @ 0= ?LEAVE -1 +LOOP RECURSE ; : SHELL.SORT ( --) DATA.MAX# @ gap ! SHL ; (( The QWIK SORT is much faster than either the Bubble or Shell sorts. Furthermore, it does not preserve the effects of prior sorts when consecutive sorts are performed. The middle element of the unsorted list is used to divide the list into two piles. The highest pile is in turn divided into two and this process repeated until the top pile is a collection of sorted elements. Then attention is focused on the preceeding pile to sort it. This is repeated until all the "piles" have been sorted. )) : QSORT| ( l,r--) \ QWIK SORT by Dan Phillips 2DUP < NOT IF 2DROP EXIT THEN 2DUP + 2/ 0 data.buf SWAP DATA@ 2DUP BEGIN SWAP BEGIN 1 data.buf OVER DATA@ 1 data.buf 0 data.buf SORT? 0< WHILE 1+ REPEAT SWAP BEGIN 1 data.buf OVER DATA@ 1 data.buf 0 data.buf SORT? 0> WHILE 1- REPEAT 2DUP > NOT IF 2DUP EXCHANGE SWAP 1+ SWAP 1- THEN 2DUP > UNTIL -ROT SWAP RECURSE RECURSE ; : QSORT ( --) 0 DATA.MAX# @ 1- QSORT| ; \ Part 3... --- * Via Qwikmail 2.01 NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/09/90)
Date: 02-07-90 (17:50) Number: 2872 (Echo) To: ALL Refer#: NONE From: ZAFAR ESSAK Read: (N/A) Subj: LEARN TO SORT Status: PUBLIC MESSAGE \ SORTS.FLY Part 3 - Testing the Sort Methods \ Test the cell sort INCLUDE \FORTH\RANDOM.TXT \ Random number generator 100 DATA.MAX# ! CELL.SORT \ Initialize Sort Function Variables CREATE testdata ( --adr) DATA.MAX# @ DATA.WIDTH @ * ALLOT testdata startdata ! : printdata ( --) CR DATA.MAX# @ 0 ?DO ENOUGH? IF LEAVE THEN I 2* testdata + @ . LOOP ; : BTEST ( --) DATA.MAX# @ 0 ?DO random I 2* testdata + ! LOOP printdata CR ." Go " CR BUBBLE printdata CR ; : STEST ( --) DATA.MAX# @ 0 ?DO random I 2* testdata + ! LOOP printdata CR ." Go " CR SHELL.SORT printdata CR ; : QTEST ( --) DATA.MAX# @ 0 ?DO random I 2* testdata + ! LOOP printdata CR ." Go " CR QSORT printdata CR ; (( Test times: BTEST = 9.76 secs on Turbo XT STEST = 5.16 secs on Turbo XT QTEST = 2.68 secs on Turbo XT )) \ SORTS.FLY The End. --- * Via Qwikmail 2.01 NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/10/90)
Date: 02-08-90 (09:25) Number: 2874 (Echo) To: ZAFAR ESSAK Refer#: 2869 From: IAN GREEN Read: 02-08-90 (09:25) Subj: LEARN TO SORT Status: PUBLIC MESSAGE Sorting is a bit more subtle than you seem to realize. In my comm program, I use two sorts, one is the bubble sort where I sort the phone book, and the other is a stack oriented partition sort. Granted the bubble sort is slow on unordered random data, data that is already partially sorted can be sorted much faster with the bubble sort because it is able to exploit the partial ordering. Stack oriented sorts like the Quick Sort, cannot exploit this and thus are not all that suitable to partially ordered data like real data-bases. My phone book (in my comm program) is 500 entries, rather small compared to major data-bases, but still representative. Since I modify at most 2 or 3 entries, it makes sense to use the bubble sort since the data is already sorted from the last modification. Because it is always partially ordered at each modification, the logic of algorithm choice becomes obvious. My apologies for not having a Forth example for you. Ian Green NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/10/90)
[I'm porting this message in case anyone on the c.l.f side of the world would like to have access to any of the files mentioned in this post. I can arrange to have them uploaded to simtel20 (takes approx. 1 week) if you drop me a note at one of the addresses at the end of this message. -Doug] Date: 02-08-90 (13:44) Number: 2875 (Echo) From: JACK BROWN Read: 02-08-90 (09:25) Subj: LEARN TO SORT Status: PUBLIC MESSAGE >I would also like to apologize at the outset for the lack of references >to the results of the Forth Sort contest that was held on GEnie. I >would appreciate a note from anyone who recalls the name(s) of the >files that pertain to that contest. I have looked over the various >files related to sorts on the BCFB and found the descriptions of sorts >by Norm Arnold in SORT_WIN.ZIP most helpful. > Zafar... I did a ZIPpy dir scan on BCFB " ZIP SORT " and came up with the following references to sorting Forth. DVD&KNKR.ZIP 5527 10-17-89 Entry in the Forth Dimensions sort contest FIGSORT.ZIP 4470 11-09-89 Sort contest entry from H. Hansen RADXSORT.ZIP 2853 10-22-89 DK Elvey's Forth Radix sort. Contest entry. SORTCONT.ZIP 10814 07-24-89 Instructions for FIG's SORT contest by Dennis Ruffer. SORTDEMO.ZIP 12928 01-04-88 Multi-Forth SortDemo&Gloss--Five Diff Sorts 4THSORT.ZIP 2361 12-14-87 Some FORTH sorts (fig-FORTH) BATCHER.ZIP 1961 12-13-87 Batcher's Sort from Forth Dim (F83) SORT_WIN.ZIP 9318 09-27-88 Norm Arnold's Contest Winning Sorts. VISISORT.ZIP 9318 10-04-88 Visible sort routines - bubble, shell, kwik QSORT312.ZIP 40960 02-14-88 Good Sort Package for DOS (used by WORDNDX) SORT-OUT.ZIP 156155 08-02-89 Henning Hansen's wonderful sorting demo. Written in LMI Forth. Includes source for for sorts and windowing package. SORTLM.ZIP 2255 12-15-87 Adaption of Will Baden's Quicksort & SWORDS SORTZ80.ZIP 3211 12-15-87 Compact/fast sort for Z80 by S. Vance SORTURF.ZIP 1844 03-15-88 Sort and execution arrays for UR/Forth QSORT322.ZIP 41522 10-01-89 Great quick sort utility NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/11/90)
Date: 02-09-90 (07:29) Number: 2876 (Echo) To: IAN GREEN Refer#: 2874 From: STEVE PALINCSAR Read: NO Subj: LEARN TO SORT Status: PUBLIC MESSAGE How does the selection sort compare to the bubble sort in your application? In my (*very*) limited experience I found the selection sort to be considerably faster (between 2 & 4 times faster) in sorting 80 character long strings quantity 100-250. Selection sort is also, according to Knuth, a stable method, preserving the original order for identical keys. It's also extremely easy to code and understand -- far more so than the bubble sort. ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/11/90)
Date: 02-09-90 (16:04) Number: 2877 (Echo) To: IAN GREEN Refer#: 2874 From: ZAFAR ESSAK Read: NO Subj: LEARN TO SORT Status: PUBLIC MESSAGE Ian, Your assumption that the tests I did were on random lists is absolutely correct. Your comments on partly ordered lists resulting in admirable performances for the BUBBLE Sort got me to review the code. The following contains a minor revision to exit as soon as no exchanging of items is required. \ The BUBBLE Sort VARIABLE gap ( --adr) VARIABLE traded ( --adr) \ flag to indicate if elements exchanged : pair.sort ( n1,n2--) \ : pair.sort ( n1,n2--) 0 data.buf 1 data.buf \ 2DUP 1 data.buf SWAP DATA@ 2OVER 2OVER \ 0 data.buf SWAP DATA@ ROT DATA@ SWAP DATA@ SORT? 0> \ 0 data.buf 1 data.buf SORT? 0> IF EXCHANGE traded ON \ IF EXCHANGE traded ON ELSE 2DROP THEN ; \ ELSE 2DROP THEN ; : BUB ( --) 1 DATA.MAX# @ gap @ - ?DO traded OFF I 0 ?DO I DUP gap @ + pair.sort LOOP traded @ 0= ?LEAVE -1 +LOOP ; : BUBBLE ( --) 1 gap ! BUB ; \ The SHELL Sort : SHL ( --) gap @ DUP 0= IF DROP EXIT THEN 2/ gap ! BUB RECURSE ; : SHELL.SORT ( --) DATA.MAX# @ gap ! SHL ; This provides for some better factoring as well as performance. I have also gone through an early (1986) issue of Forth Dimensions and have added the BATCHER'S Sort to the set. Here it is. (( BATCHER'S SORT is slightly slower, but more robust than Qwik sort and has consistent sorting times. Depending on the input data, in some cases Batcher's sort may be faster than Qwik sort. Unfortunately, it too does not preserve the effects of prior sorts when consecutive sorts are performed. Batcher's sort consists of three nested loops. The innermost loop compares pairs of keys that are completely independent. Thus parallel computing could be applied for very fast sorting. From Forth Dimensions Nov/Dec 86 p 39 by John Konopka The names of the variables were chosen to be consistent with the description of the algorithm given by Knuth. )) CREATE 2** ( --adr) 1 , 2 , 4 , 8 , 16 , 32 , 64 , 128 , 256 , 512 , 1024 , 2048 , 4096 , 8192 , 16384 , : 2**N ( n -- 2**n ) 2* 2** + @ ; VARIABLE DD ( --adr) VARIABLE NN ( --adr) VARIABLE PP ( --adr) VARIABLE QQ ( --adr) VARIABLE RR ( --adr) VARIABLE TT ( --adr) : SELECT-T ( --) NN @ 15 0 DO DUP I 2**N <= IF DROP I LEAVE THEN LOOP 1- 14 MIN TT ! ; : INNER-LOOP ( --) NN @ DD @ - 0 ?DO I PP @ AND RR @ = IF I DUP DD @ + pair.sort THEN LOOP ; : Q-TEST ( --) QQ @ PP @ <> IF QQ @ PP @ - DD ! QQ @ 2/ QQ ! PP @ RR ! 0 THEN ; : QRD-SET ( --) TT @ 2**N QQ ! RR OFF PP @ DD ! ; : BATCHER ( --) DATA.MAX# @ NN ! SELECT-T TT @ 2**N PP ! BEGIN QRD-SET QQ @ BEGIN INNER-LOOP Q-TEST UNTIL PP @ 2/ DUP PP ! 0= UNTIL ; --- * Via Qwikmail 2.01 NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/12/90)
Date: 02-10-90 (08:33) Number: 2880 (Echo) To: ZAFAR ESSAK Refer#: 2877 From: IAN GREEN Read: NO Subj: LEARN TO SORT Status: PUBLIC MESSAGE Even though I am not much of a Forther, I am still willing to help Forth programmer's make the best of what they are doing. The added exit in the program will indeed improve the performance dramatically, especially in partially ordered data. If you saw my earlier discourse on sorts in the Modula-2 conference, you may have noticed how the use of a guard could improve the bubble sort even more. The reason is simple, if the zero-eth element of the array is set to the minimum value of the data type in question, then one of the two compares can be effectively eliminated by virtue of the fact that the bubbling process is now guaranteed to terminate at the top of the array. Later I plan to post so additional sorts in the Modula-2 conference so keep your eyes open. Ian Green NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/12/90)
Date: 02-10-90 (08:39) Number: 2881 (Echo) To: STEVE PALINCSAR Refer#: 2876 From: IAN GREEN Read: NO Subj: LEARN TO SORT Status: PUBLIC MESSAGE In the real world, the selection sort is so inefficient it isn't funny. Real databases are massive. Consider Brokerage XYZ whose has an army of analysists working on finding the next LBO candidate. If we needed to extract from the database some information and sort it so that it could be readily browsed and furthur parsed, huge amounts of sorting and selecting is required. It therefor behooves you to use the fastest algorithms available in order to get so 'performance' out of the query lest the client get impatient. Combined with insider information it is then practicle to make hundreds of millions of dollars, all tax free (at least under present rules). You don't need lots of money, all you need is smart software and a few well placed friends. Ian Green NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/13/90)
Date: 02-11-90 (08:41) Number: 2884 (Echo) To: IAN GREEN Refer#: NONE From: STEVE PALINCSAR Read: NO Subj: SORTING ALGORITHMS] Status: PUBLIC MESSAGE As a matter of fact, even as august an authority as Knuth says that for some sorts of data, the selection sort happens to be the best one available. But besides that, your assumption that "the real world" consists only of massive databases simply takes my breath away. As it happens, the application that I translated KNuth into Forth for is a real-world application for a task I do at work. Every month I do a search on a Library of Congress bibliographic database on their inhouse SCORPIO system and I download a list of their reports added since the previous issue. (That's usually between 60 and 100 or so items.) The entries are downloaded in report number order. My program strips off the unnecessary material from each citation, and puts the cites in author title order. In fact, I think the requirement to order the data came about because one of the editors of the publication secretly wished to eliminate the column, and thought that the amount of manual effort required to do so would cause me to simply abandon the effort. Ha! Using this program it takes me about 30 seconds to process the data. As it so happens, she's taking over production of the column, and was astounded at how easy it was... Incidentally, if you'll stop to consider the situation, you'll see that the bubble sort is less efficient than the selection sort - so by your own criteria you oughtn't be using it! But as we both know, there are times in the real world that you need to sort relatively small numbers of items, and in such cases more complex algorithms may be even slower than the simple ones. Even Knuth says so... ;-) ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/16/90)
Date: 02-13-90 (14:57) Number: 2901 (Echo) To: ZAFAR ESSAK Refer#: 2898 From: STEVE PALINCSAR Read: NO Subj: SELECTION SORT Status: PUBLIC MESSAGE To quote from Knuth (The Art of Computer Programming, vol. 3, Sorting and Searching), p. 139: . "5.2.3 Sorting by Selection . An important famly of sorting techniques is based on the idea of repeated selection. The simplest selection method is perhaps the following: i) Find the smallest key; transfer the corresponding record to the output area; then replace the key by the value "" (which is assumed to be higher than any actual key). ii) Repeat step (i). This time the second smallest key will be selected, since the smallest key has been replaced by . iii) Continue repeating step (i) until N records have been selected. . The above method involves N - 1 comparisons... There is an obvious way to improve upon the situation, avoiding the use of : we can take the selcted value and move it into its proper position by exchanging it with the record currently occupying that position. Then we need not consider that position again in future selections. This idea yields our first selection sorting algorithm." . [He goes on to describe Algorithm S (Straight selection sort) which I won't quote.] He goes on, on p. 141, to say: . "It is interesting to compare Algorithm S to the bubble sort... Bubble sorting does less comparisons than straight selection and it may seem to be preferable; but in fact Program 5.2.2B [the bubble sort] is more than twice as slow as Progam S! Bubble sorting is handicapped by the fact that it does so many exchanges, while straight selection involves very little data movement." . To quote from Robert Sedgewick's _Algorithms_ (p. 95): "Despite its simplicity, selection sort has quite an important application: it is the method of choice for sorting files with very large recoords and small keys." Here'ss his code (inn Pascal): procedure selection; var i,j,min,t: integer; begin for i:=1 to N do begin min:=i; for j:=1+1 to N do if a[j]<a[min] then min:=j; t:=a[min]; a[min]:=a[i]; a[i]:=t end; end; . Sedgewick describes the algorithm as follows: "find the smallest element in the array and exchange it with the element in the first position, then find the second smallest element and exchange it with the element in the second position, continuing in this way until the entire array is sorted. ...This is among the simplest of sorting methods, and it will work very well for small files. Its running time is proportional to N2. ...Still selection sort is quite attractive for sorting (say) a thousand 1000-word records on one-word keys." (p. 95 for all the Sedgewick quotes & code) ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/19/90)
Date: 02-17-90 (13:52) Number: 2918 (Echo) To: IAN GREEN Refer#: 2881 From: DON MADSON Read: NO Subj: LEARN TO SORT Status: PUBLIC MESSAGE Except for a list that's already sorted the selection sort performs better than the bubble sort. The bubble sort (with a flag) only performs well on an ordered list. If the list is already ordered why sort it? When a new record is added insert it where it belongs and the list is still sorted. If a record is deleted close the gap. I suppose one option is to include every possible sort in your program and before sorting your database perform an analysis of that data to determine which sorting procedure to call. Or ... just include one that performs well most of the time. There is a good chapter on sorting in Wirth's book "Algorithms + Data Structures = Programs." --- ~ EZ-Reader 1.14 ~ ><><><><><><><><><><><><><><<>><><> ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/22/90)
Date: 02-19-90 (10:10) Number: 2933 (Echo) To: DON MADSON Refer#: 2918 From: IAN GREEN Read: NO Subj: LEARN TO SORT Status: PUBLIC MESSAGE Well, in my application where I used the bubble sort, I decided to use that algorithm because of the data-structure I chose. It is simply an array of records. Now since it is an array (to be sure a simple data structure), insertion and deletion is trivial; to insert just add a record to the first 'blank' entry, to delete, change an entry to 'blank'. Then after editing is completed the bubble sort quickly 'cleans' up the array as the bulk of it is still ordered. The blank entry is designed to represent the highest possible key value so it naturally moves to the end of the list where it belongs. Simple and efficient. Ian Green NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/22/90)
Date: 02-19-90 (22:24) Number: 2934 (Echo) To: STEVE PALINCSAR Refer#: NONE From: ZAFAR ESSAK Read: NO Subj: SELECTION SORT Status: PUBLIC MESSAGE Thanks for the descriptions. I have also been looking at the SORT_OUT.ZIP that contains the program by Henning Hansen and Niels Jergen Christensen. That is an awesome collection of sorts complete with an executable program that demonstrates a large number of sorts incluing a couple of their own. Unfortunately, for novices of the sort like me, there are no descriptive narratives of the different sorts. The list of sorts contained in that file are: \ Bubblesort \ Shakersort \ Shakersort with a flag \ Shakersort with interval reduction \ Shuttlesort - Sifting \ Straight Insertionsort \ Insertionsort, assymmetric search \ Insertionsort with bisection search \ Selectionsort \ Selectionsort, stable \ Stacksort (Selectionsort w. stack) \ Stacksort, stable \ Heapsort - Treesort \ Shellsort, dim. incr. insertion \ Shellsort with stack-seletion \ Splicesort \ Splicesort with insertion \ Splicesort with selection \ Batchersort \ Mergesort, binary subdivision \ Mergesort, natural \ Quicksort, first pivot element \ Quicksort, random pivot element \ Quicksort, middle pivot element \ Quicksort, median-of-three pivot \ Quicksort-partition + other sort It will be awhile before I have enough time to explore this variety of sorts. --- * Via Qwikmail 2.01 NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/25/90)
Date: 02-23-90 (06:44) Number: 2950 (Echo) To: BILL MCCARTHY Refer#: 2943 From: STEVE PALINCSAR Read: 02-23-90 (18:12) Subj: SELECTION SORT Status: PUBLIC MESSAGE No, adding a few unordered records wasn't my application. My stuff comes down in item number order, which corresponds roughly to when the records were created -- actually, when the bibliographers & analysts applied for the number, I think, not when they got the job finished! -- and I have to rearrange them into author title order. Since the selection sort handles my data sets in under 2 seconds, it's hard to see how _any_ improvement in sort methodology would produce a meaningful improvement in run time! OTOH, the insertion sort also got a very good write-up in both Knuth & Sedgewick, and I do intend one of these days to take a look at it. BTW, nice to hear from you again! How's it going? ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/27/90)
Date: 02-25-90 (09:15) Number: 2965 (Echo) To: IAN GREEN Refer#: NONE From: STEVE PALINCSAR Read: NO Subj: DO LOOPS IN FORTH Status: PUBLIC MESSAGE Ian, you're probably looking for something like this: : LOOP_DEMONSTRATION ( n2 n1 -- ) \ That is, both the inner and outer loop limits are on the stack \ before this word executes. You could put them there by fetching \ from a couple of variables, or whatever other means you like. 1 DO ( Takes n1 from param stack, puts in on Rstack ) 1 DO ( Takes n2 from param stack, puts it on Rstack ) <SOMETHING> ( whatever you want to execute ) LOOP LOOP ; The "FOR-NEXT" construct in Forth is written as DO - LOOP. ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (02/28/90)
Date: 02-26-90 (22:34) Number: 2971 (Echo) To: STEVE PALINCSAR Refer#: NONE From: BILL MCCARTHY Read: NO Subj: SHELL SORT Status: PUBLIC MESSAGE I uploaded HS-SHELL.ZIP which contains both a selection and shell sort in HS/Forth. On a 16 Mhz 386, I got the following timings (seconds) sorting random 40 character strings. Items Selection Shell 100 .44 .11 200 1.58 .21 300 3.61 .43 400 6.37 .65 500 9.98 .82 You mentioned trying insertion sort. That method is awful except when adding to a sorted set of when the elements of the set are all near their final positions (such as a cleanup to a minimum partition quicksort). ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (03/06/90)
Category 3, Topic 37 Message 31 Sun Mar 04, 1990 F.SERGEANT [Frank] at 21:52 CST To Steve & Ian, SP> : LOOP_DEMONSTRATION ( n2 n1 -- ) SP> \ That is, both the inner and outer loop limits are on the stack SP> \ before this word executes. You could put them there by fetching SP> \ from a couple of variables, or whatever other means you like. SP> 1 DO ( Takes n1 from param stack, puts in on Rstack ) SP> 1 DO ( Takes n2 from param stack, puts it on Rstack ) SP> <SOMETHING> ( whatever you want to execute ) SP> LOOP SP> LOOP ; Nope, the above will not work (unless n1 is 1). The 2nd & subsequent passes thru the outer loop will have no limit available on the stack for the inner loop. You need a DUP after the 1st DO and a DROP after the last LOOP. Also, I keep seeing these "1"s as beginning indexes for your loops in these messages. Presumably you want to say something like 10 STARS and have your word STARS print ten of 'em. If you define STARS with a built-in index of 1, you'd have to say 11 STARS to get 10 of 'em. : STARS ( #-of-stars-desired - ) 0 DO STAR LOOP ; or : STARS ( 1-more-than-#-of-stars-desired - ) 1 DO STAR LOOP ; On the other hand, you might want to use a built-in index of 1 when you plan to use the index inside of the loop, eg for addressing a one based array. I would think the more common situation would be to use a zero based array, in which case you would still use a starting index of 1. Any questions? (By the way, I don't use DO LOOP anymore, just FOR NEXT.) -- Frank ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (03/08/90)
Date: 03-06-90 (18:12) Number: 3005 (Echo) To: FRANK SERGEANT Refer#: NONE From: IAN GREEN Read: NO Subj: ALL SORTS OF SORTS Status: PUBLIC MESSAGE As you explained it, I am a bit confused. In Modula-2, I can use what is called a subrange to index arrays, consequentky I often use indicies whick are more meaningful, such as the following: Sales: ARRAY [1970..1990] OF REAL; If I pass that array to what is called an open array procedure, 1970 becomes 0 (the arrays internally are zero indexed). Consequently I am oriented to zero-based arrays, simply because I make such extensive use of open-array parameters. Now you mentioned that you use FOR-NEXT over DO-LOOP so could you illustrate examples of nested loops that use the control variables inside, say in a shell-sort of an array. Ian Green NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (03/12/90)
Category 3, Topic 37 Message 33 Sun Mar 11, 1990 F.SERGEANT [Frank] at 10:41 CST Ian, you asked about nested FOR-NEXT loops. Suppose we have a two dimensional array of quarterly net profit figures for the years 1970 thru 1975. Suppose further that we made very small profits (so they will fit in 16-bits). We'll hard-code the values to start with. CREATE PROFITS ( year 1stQ 2ndQ 3rdQ 4thQ ) ( 1970) 10 , 5 , 3 , 7 , ( each entry takes 2 bytes) ( 1971) 8 , 9 , -3 , -4 , ( each year takes 8 bytes) ( 1972) 2 , 3 , 4 , 8 , ( 1973) 6 , 6 , 6 , 6 , ( 1974) 5 , 3 , 3 , -7 , ( 1975) 12 , 11 , 9 , 10 , We can address the array various ways. Suppose we want to say 3rdQ 1972 PROFIT and thereby place the number 4 on the stack. We could do it with the following definitions: 1 CONSTANT 1stQ 2 CONSTANT 2ndQ 3 CONSTANT 3rdQ 4 CONSTANT 4thQ : PROFIT ( quarter year - n) 1970 - 8 * ( quarter offset-to-start-of-the-year) 1- SWAP 2* + ( offset-to-quarter) PROFITS ( offset-to-quarter starting-addr-of-the-array) + @ ; ( profit-for-the-quarter) In memory, if you were to dump the contents of the array, on a PC with Intel-like byte order, you would see HEX PROFITS DUMP DUMP DUMP ( one way to do it in Pygmy) XXXX 0A 00 05 00 03 00 07 00 08 00 09 00 FD FF FC FF XXXX 02 00 03 00 04 00 08 00 06 00 06 00 06 00 06 00 XXXX 05 00 03 00 03 00 F9 FF 0C 00 0B 00 09 00 0A 00 (The XXXXs stand for the actual addresses where the array is stored.) Now let's step thru the entire array using FOR-NEXT loops. : PRINT-PROFITS ( -) CR ." Quarterly Profits for 6 Years" CR CR 1970 ( starting-year) 5 FOR ( we'll do this outer loop 6 times) 1 ( starting-quarter) 3 FOR ( we'll do this inner loop 4 times) ( year quarter) 2DUP SWAP ( y q q y) PROFIT ( y q profit) . ( y q ) ( year old-quarter) 1+ ( year new-quarter) NEXT ( old-year old-quarter) DROP ( old-year) 1+ ( new-year) CR ( so each year will get a separate line) NEXT DROP ( ) ; Notice that I have used nested loops, but I haven't used the built in "control variables" named 'I' and perhaps 'J'. But, I have used control variables that are kept on the stack. The above should work in Pygmy ver 1.2 and with most Forths that have FOR- NEXT. My latest thinking is that I prefer 5 FOR to do it 5 times, not 6, so I have changed my copy of Pygmy to do it the new way. Robert Berkey thinks we should call the new word ?FOR. Pygmy has the word 'I' that returns the value of the built-in index in the current loop. This index counts down. : TST1 5 FOR I . NEXT ; TST1 would print 5 4 3 2 1 0 : TST2 5 FOR I . NEXT ; TST2 would print 4 3 2 1 0 Also you can't directly get to the built-in index of outer loops from within inner loops (but, you can get it from the outer loop while just in the outer loop and save it on the stack for later use by the inner loop). This facility could be built into Pygmy (named 'J' 'K' etc) just as it is built in to some DO-LOOP implementations. Because of the downward counting nature, you might not always want to use 'I' in a FOR-NEXT loop. Here is a re-write using DO-LOOP where J accesses the outer loop's index: : PRINT-PROFITS2 ( -) CR ." Quarterly Profits for 6 Years" CR CR 1976 1970 DO ( we'll do this outer loop 6 times) 5 1 DO ( we'll do this inner loop 4 times) I J PROFIT ( profit) . ( ) LOOP CR ( so each year will get a separate line) LOOP ( ) ; I've presented all of this in hopes of making it clear how it all works rather than to convince you that FOR-NEXT is better than DO-LOOP. Even I have to admit that the example with DO-LOOP looks more straight forward. I just immediately switched to FOR-NEXT loops and never use DO-LOOP loops any more. I'm pointing this out as a curiosity as I don't fully understand why I've done it, but am content with the situation. Naturally I haven't tested any of the above examples. -- Frank ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
ForthNet@willett.UUCP (ForthNet articles from GEnie) (03/14/90)
Date: 03-12-90 (10:12) Number: 3020 (Echo) To: FRANK SERGEANT Refer#: 3017 From: IAN GREEN Read: NO Subj: ALL SORTS OF SORTS Status: PUBLIC MESSAGE Well, you're latest examples have clarified one aspect of Forth; it lacks standardization of even the most basic control structure. To this end it appears as though translation of existing code may not be too easy to achieve as the looping control structures do not behave in a regular and predictable mannor. Oh well. Ian Green NET/Mail : British Columbia Forth Board - Burnaby BC - (604)434-5886 ----- This message came from GEnie via willett through a semi-automated process. Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'