jimmy@pyramid.pyramid.com (Jimmy Aitken) (05/09/89)
Since perl was posted to comp.sources*, I've posted this request here.
I couldn't think of a more appropriate newsgroup so here goes.

I've got a perl program that requires an array to be sorted ignoring
case and non alpha-numeric characters. i.e. The equivalent of
'sort -fd'. Currently I do it by:

    @sorted=sort fieldsort @list;

    sub fieldsort {
        local($x=$a, $y=$b);
        $x =~ tr/A-Z/a-z/;
        $x =~ tr/ -@{-}//;
        $y =~ tr/A-Z/a-z/;
        $y =~ tr/ -@{-}//;
        $x lt $y ? -1 : $x gt $y ? 1 : 0;
    }

To my mind this is ugly and uses about twice the user time compared to
a simple 'sort @list'. Can anyone come up with a cleaner/faster way
of doing this operation? Thanks for any help.

jimmy
--
   -m------        Jimmy Aitken
  ---mmm-----      On Loan from: Pyramid Technology Ltd., U.K.
 -----mmmmm---     To: Pyramid Technology Corp, U.S.A
-------mmmmmmm-    {uunet, decwrl}!pyramid!jimmy
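
For anyone trying this on a current perl, here is a minimal sketch of the same compare-time approach, assuming Perl 5. Two things to watch in the code above: in Perl 5, tr with an empty replacement list deletes nothing unless the /d modifier is given, and the range ' -@' also covers the digits 0-9. The helper name fold_key is hypothetical, and keeping digits is an assumption based on the stated goal of ignoring only non alpha-numeric characters.

```perl
use strict;
use warnings;

# Hypothetical helper: build the comparison key for one string.
sub fold_key {
    my ($s) = @_;
    $s = lc $s;               # fold upper case to lower case
    $s =~ tr/a-z0-9//cd;      # delete every character that is not a letter or digit
    return $s;
}

# Comparator for sort; sort itself sets $a and $b.
# Both keys are still rebuilt on every comparison, as in the original.
sub fieldsort { fold_key($a) cmp fold_key($b) }

my @list   = ('b-eta', 'Alpha', 'GAMMA!', 'a_lpha 2');
my @sorted = sort fieldsort @list;
print join("\n", @sorted), "\n";
```

This only tidies the comparator; it does nothing about the cost of normalising both strings on every comparison.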
jgreely@previous.cis.ohio-state.edu (J Greely) (05/09/89)
In article <69461@pyramid.pyramid.com> jimmy@pyramid.pyramid.com (Jimmy Aitken) writes:
>I've got a perl program that requires an array to be sorted ignoring
>case and non alpha-numeric characters. i.e. The equivalent of
>'sort -fd'.

The *fastest* way may be to actually use "sort -fd", but that violates
the kitchen sink principle of perl, so here goes.

>sub fieldsort {
>    local($x=$a, $y=$b);
>    $x =~ tr/A-Z/a-z/;
>    $x =~ tr/ -@{-}//;
>    $y =~ tr/A-Z/a-z/;
>    $y =~ tr/ -@{-}//;
>    $x lt $y ? -1 : $x gt $y ? 1 : 0;
>}

The biggest problem with this is that you're munging both strings for
each comparison, which is where your time is wasted. One
ugly-but-faster way to do the job would be to pre-chew the array, like
this:

# create sorting array, appending position in original array
# (so we can later move the real array into matching order)
#
@sort_list = ();
for ($line=0;$line <= $#list; $line++) {
    $temp = $list[$line];
    $temp =~ tr/A-Z/a-z/;
    $temp =~ tr/ -@{-}//;
    $temp .= "A$line"; # choose something not in any string. We *know*
                       # there's no capital A's anymore. You might prefer
                       # something like DEL, that would sort after anything
    push(@sort_list,$temp);
}
@sort_list = sort(@sort_list);
#
# loop through the sorted copy, replacing the hacked version with
# the original
#
for ($line=0;$line <= $#sort_list;$line++) {
    ($temp,$num) = split(/A/,$sort_list[$line]);
    $sort_list[$line] = $list[$num];
}
# sort_list now contains the original array, correctly sorted

This isn't perfect, and it *is* ugly, but it basically works. Having
typed this, I would probably just break down and call sort, but it's a
nice exercise.
-=-
J Greely (jgreely@cis.ohio-state.edu; osu-cis!jgreely)
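
The pre-chewing idea above (build each key once, sort, then map back to the originals) can be sketched for a current perl as follows. This is a sketch assuming Perl 5, not the code from the post: it sorts array indices by a precomputed key instead of appending a marker to each string. The names fold_key and sort_folded are hypothetical, and keeping digits in the key is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: lower-case a string and keep only letters and digits.
sub fold_key {
    my ($s) = @_;
    $s = lc $s;
    $s =~ tr/a-z0-9//cd;      # /c complements the set, /d deletes the matches
    return $s;
}

# Return the arguments sorted by folded key, computing each key exactly once.
sub sort_folded {
    my @list = @_;
    my @keys = map { fold_key($_) } @list;    # one key per element
    # Sort the indices by key; equal keys keep their original relative order.
    my @order = sort { $keys[$a] cmp $keys[$b] or $a <=> $b } 0 .. $#list;
    return @list[@order];                     # original strings, in sorted order
}

my @list = ('b-eta', 'Alpha', 'GAMMA!', 'a_lpha 2');
print join("\n", sort_folded(@list)), "\n";
```

Sorting indices means no separator character has to be chosen, so there is no need to know which characters cannot occur in a key.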