[comp.sources.d] folded-sorting

merlyn@intelob.intel.com (Randal L. Schwartz @ Stonehenge) (05/10/89)

In article <69461@pyramid.pyramid.com>, jimmy@pyramid (Jimmy Aitken) writes:
| Since perl was posted to comp.sources*, I've posted this request here.
| I couldn't think of a more appropriate newsgroup so here goes.
| 
| I've got a perl program that requires an array to be sorted ignoring
| case and non alpha-numeric characters. i.e. The equivalent of
| 'sort -fd'. Currently I do it by:
| 
| @sorted=sort fieldsort @list
| 	|
| 	|
| sub fieldsort {
|     local($x=$a, $y=$b);
|     $x =~ tr/A-Z/a-z/;
|     $x =~ tr/ -@{-}//;
|     $y =~ tr/A-Z/a-z/;
|     $y =~ tr/ -@{-}//;
|     $x lt $y ? -1 : $x gt $y ? 1 : 0;
| }
| 
| To my mind this is ugly and uses about twice the user time comapred to
| a simple 'sort @list.'  Can anyone come up with a cleaner/faster way
| of doing this operation?  Thanks for any help.

OK, I'll take a wild stab at it (since I keep saying how nifty Perl
is...).  Here's my idea: create an aux assoc array that has your list
as values and the folded strings as keys.  Then, sort the keys, and
extract the values in order.  To wit:
################################################## snip snip
# WARNING: UNTESTED CODE AHEAD # DO NOT USE IN MISSION-CRITICAL SOFTWARE :-) #
# ensure %aux is clean: (if you have 3.0, use "%aux=();")
for $i (keys(aux)) {
	delete $aux{$i};
}

# now stuff %aux with values
for $i (@list) {
	$folded_i = $i;
	$folded_i =~ tr/A-Z/a-z/;
	$folded_i =~ tr/ -@{-}/ /; # you were missing space here
	$aux{$folded_i} .= "\377" . $i; # presumes no DELs in @list
}

@sorted = ();
# now sort and peel
for $i (sort keys(aux)) {
	@spl = split(/\377/,$aux{i});
	shift(@spl); # first entry always null, so toss it,
	push(@sorted,shift(@spl)) while $#spl >= $[; # and copy the rest
}
################################################## snip snip
There.  That should do it.  If not, yell at me.  This sort should be
stable, since anything in order in the original array will still
pretty much be that way.
-- 
***** PLEASE NOTE THE NEW ADDRESS *****
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095===\
{ on contract to BiiN, Hillsboro, Oregon, USA, until 14 May 1989     }
{ <merlyn@intelob.intel.com> ...!uunet!tektronix!biin!merlyn         }
{ or try <merlyn@agora.hf.intel.com> after 15 May 1989               }
\=Cute quote: "Welcome to Oregon... home of the California Raisins!"=/