ljdickey@water.waterloo.edu (Lee Dickey) (02/24/88)
A few weeks ago someone posted an article asking about an efficient ways to calculate the mode of an array of numbers. I guess I'll stick my neck out and suggest this algorithm. I don't know if this is best, but I think this works: NUB <- ( ( X iota X ) = iota rho X ) /x <- ,x FREQ <- +/ NUB jot.= X Z <- 1 take ( FREQ = max/ FREQ ) / NUB Of course in the last line it would be better to replace the "1 take" by "MIDDLE", where MIDDLE is the function that selects the middle element of a sorted vector, but I think that is another topic... Sorry, but I lost the name and address of the original poster, so can not send mail. I would be interested in seeing other solutions. -- L. J. Dickey, Faculty of Mathematics, University of Waterloo. ljdickey@water.waterloo.edu ljdickey@watdcs.BITNET ljdickey@water.UUCP ...!uunet!water!ljdickey
jaxon@uicsrd.csrd.uiuc.edu (03/01/88)
For a large array, the outer product in the above solution can be slow. A solution that makes fewer comparisons is: $ MODE is MODE V;#CT;FREQ;NUB;T [1] #CT is 0 note Need transitivity of = [2] V is V[gradeup V] [3] NUB is 1,(1 drop V) neq (^1 drop V) [4] FREQ is T - ^1 drop 0,T is NUB/iota rho T [5] MODE is (FREQ=max/FREQ)/NUB/V [6] MODE is MODE[floor .5 times ''rho rho MODE] $ After obtaining the FREQuency counts in line 4 you might wish to select the MODE a different way, Line[5] selects all of them. Iverson's Window Reduction operator could be used on lines [3] and [4]. Since gradeup is o(n log n) this algorithm is faster in the limit cases, but probably won't beat Prof. Dickey over short vectors. P.S. $ = del, ^ = high minus, # = quad. Greg Jaxon (jaxon@uicsrd.uiuc.edu)
zeke@fornax.UUCP (Zeke Hoskin) (03/01/88)
The nice thing about outer-product-based mode is that it lends itself to a one-liner which you can print on a T shirt and tell people you are dressed in the latest mode. I guess a shirt is an outer product in its own right.... The bad evil nasty thing is that it is an O(Nsquared) algorithm. Here is a faster one: X <- X[gradeup X <- ,X] bool <- X =/= 1 drop X,somevalueotherthangreatestX ends <- bool / iota rho bool lengths <- ends - minus1 drop 0, ends (in origin 1) pos <- ends [ lengths iota gratest reduc lengths] mode <- X[pos]
ljdickey@water.waterloo.edu (Lee Dickey) (03/08/88)
In article <49700001@uicsrd.csrd.uiuc.edu> jaxon@uicsrd.csrd.uiuc.edu writes: > ... >For a large array, the outer product in the above solution >can be slow. A solution that makes fewer comparisons is: > ... I think that Greg is right, as is David Lauer, who observed that ..."+/NUB jot.= X" can produce an immensely large intermediate array... David's solution, quite similar to Greg's is : Z <- MODE X X <- X[gradeup X] # SORT X I <- X not_eq 1 shift X # FIND BOUNDARIES BETWEEN GROUPS OF LIKE NUMBERS I <- ( neg 1 drop I),1 # DEAL WITH EDGE CONDITIONS NUB <- I/X # FIND UNIQUE NUMBERS H <- I/ iota rho I # H <- H-0, neg 1 drop H # FIND LENGTH OF RUN (IE: FREQUENCY) I <- H iota (max /H) # FIND MOST FREQUENT (SMALLEST IF SEVERAL) Z <- NUB[I] # LOOK-UP MOST FREQUENT X VALUE David's address is: David Laur <DMLAUR@PUCC.bitnet> -- L. J. Dickey, Faculty of Mathematics, University of Waterloo. ljdickey@waterloo.edu ljdickey@watdcs.BITNET ljdickey@water.UUCP ...!uunet!water!ljdickey