wengland@stephsf.stephsf.com (Bill England) (11/26/90)
What is the best method for discovering if a string is in a given set of strings? For example say you want to verify that the string 'WA' is a valid two char state abreviation? I can think of at least four methods ... Method Description 1 Search a string that contains all two letter state names; Here you would prefix and suffix the state var and then search the string. Something like ; ... $state = '~AK~AL~AR~AZ~CA~CO~...'; $search_string = '~'.$prove_state.'~'; if( /$search_string/ eq $state) { ... } 2 Build an associative array %state so that you have; $state{'AK'}="Alaska"; $state{'AL'}="Alabama"; $state{'AR'}="Arkansas"; $state{'AZ'}="Arizona"; $state{'CA'}="California"; ... Then test for the existance of $state{$prove_state}. ( Has disadvantage of extra bagage if you don't need the states full name. ) 3 Build a simple array and then search with a foreach @state = ('AK','AL','AR','AZ','CA', ...); foreach $st(@state){ last if $st eq $prove_state; } 4 The same as 3 above but, use a binary search instead of a linear search. Which of the above would be the best to use in perl? Is there a better way ??? +-------- | Bill England | wengland@stephsf.COM | * * H -> He +24Mev * * * ... Oooo, we're having so much fun making itty bitty suns * * *
chris@utgard.uucp (Chris Anderson) (11/29/90)
In article <443@stephsf.stephsf.com> wengland@stephsf.stephsf.com (Bill England) writes: > What is the best method for discovering if a string is in > a given set of strings? For example say you want to verify > that the string 'WA' is a valid two char state abreviation? > > I can think of at least four methods ... [methods deleted] > 2 Build an associative array %state so that you have; > $state{'AK'}="Alaska"; > $state{'AL'}="Alabama"; > $state{'AR'}="Arkansas"; > $state{'AZ'}="Arizona"; > $state{'CA'}="California"; > ... > Then test for the existance of $state{$prove_state}. > ( Has disadvantage of extra bagage if you don't need the > states full name. ) Naah. You don't need the extra baggage: foreach $one (@all_the_states) { $states{$one}++; } or undef %states; @states{@all_the_states} = (); Either way, you don't have the baggage and you can find the name easily and quickly. Associative arrays are great! Chris -- +---------------------------------------------------------------+ | Chris Anderson, QMA, Inc. utgard!chris@csusac.csus.edu | | My employer doesn't listen to me... why should you? | +---------------------------------------------------------------+
tchrist@convex.COM (Tom Christiansen) (11/30/90)
In article <1990Nov28.160822.21351@utgard.uucp> chris@utgard.uucp (Chris Anderson) writes: > undef %states; > @states{@all_the_states} = (); > >Either way, you don't have the baggage and you can find the name >easily and quickly. Associative arrays are great! Yes, they're great, but the recent patches have made the quoted usage of questionable utility. This I find highly unfortunate, since it's so very useful. The problem is Larry made it so you can have a key defined, but whose value is undefined. This means strange things like this happen: @colors = ( 'red', 'blue', 'green' ); @seen{@colors} = (); print "defined\n" if defined $seen{'blue'}; print "undefined\n" unless defined $seen{'blue'}; That now prints undefined, although it used to print defined. You can still say for (@seen) { $colors{$_}++; } with an explicit loop, but I really like the @foo{@bar} = () usage. I can't figure out WHY you want entries in an associative arrays with undefined values. Does doing that really make sense?? Larry addressed this in <10334@jpl-devvax.JPL.NASA.GOV> on 12 Nov 90 19:22:14 GMT, in a message entitled "Re: defined() on an undef array element created via slice." After granting that correct interpretation of this is a judgment call, he pointed out that it's nice to distinguish between @foo{"bar"} = (); and @foo{"bar"} = (''); (I still get nervous with that usage because despite the parens, novices will think it a scalar assignment as there's only one element in the slice.) But what really bothers me is that @seen{@colors} = (''); makes red defined and blue not, but they're both valid keys, and this isn't what I call following the principle the least surprise. @seen{@colors} = (1..$#colors); is a hack that will get you by, but I still prefer the prior behavior. Maybe I just got too used to it. --tom
markb@agora.uucp (Mark Biggar) (12/03/90)
In article <443@stephsf.stephsf.com> wengland@stephsf.stephsf.com (Bill England) writes: > > What is the best method for discovering if a string is in > a given set of strings? For example say you want to verify > that the string 'WA' is a valid two char state abreviation? > > I can think of at least four methods ... > > Method Description > > 1 Search a string that contains all two letter state names; > Here you would prefix and suffix the state var and > then search the string. Something like ; > ... > $state = '~AK~AL~AR~AZ~CA~CO~...'; > $search_string = '~'.$prove_state.'~'; > if( /$search_string/ eq $state) { ... } > If you search set is small this may be the bets way, but use index() instead of of an regular expression, like so: if (index($state, "~$prove_state~") > = 0) Besides your if condition is wrong, it should be: if ($state =~ /~$prove_state~/) > 2 Build an associative array %state so that you have; > $state{'AK'}="Alaska"; > $state{'AL'}="Alabama"; > $state{'AR'}="Arkansas"; > $state{'AZ'}="Arizona"; > $state{'CA'}="California"; > ... > Then test for the existance of $state{$prove_state}. > ( Has disadvantage of extra bagage if you don't need the > states full name. ) > if you don't need the full names just set the associated array value to 1. this is a very good way to go if the search set is generated by your program. If you have a very large constant search set using a dbm file for it may be the best way to go. > 3 Build a simple array and then search with a foreach > @state = ('AK','AL','AR','AZ','CA', ...); > foreach $st(@state){ > last if $st eq $prove_state; > } > This is only good if you need the array for other reasons, such as an array of month names that you wish to index by month number. > 4 The same as 3 above but, use a binary search instead of > a linear search. > Same problems as with 3. > Which of the above would be the best to use in perl? > > Is there a better way ??? That pretty much covers it. -- Perl's maternal uncle Mark Biggar
wengland@stephsf.stephsf.com (Bill England) (12/13/90)
In article <1990Dec3.035541.24184@agora.uucp> markb@agora.uucp (Mark Biggar) writes: >In article <443@stephsf.stephsf.com> wengland@stephsf.stephsf.com (Bill England) writes: > [...] >if you don't need the full names just set the associated array value to 1. >this is a very good way to go if the search set is generated by your >program. If you have a very large constant search set using a dbm file >for it may be the best way to go. > >-- >Perl's maternal uncle >Mark Biggar Thanks for the ideas. I've attached the resulting routine to the end of this article. I'm not entirely comforatable with referancing an array that has been undef'ed yet. ;-) ----- vrfy_state.pl ----- package state_check; ## # Copyright(c) 1991, Stephen Software Systems, Inc. # This software may be modified and copied under the terms of # the GNU Public License. # # State Check Library # Given a USA $state determine if it is correct. # Return a postmaster approved two letter abreviation. # $state may be either states full name or two letter abreviation. # # WSE Created Wed Nov 28 14:08:25 PST 1990 # # Send problems, suggestions, errors, and # heresys to support@stephsf.COM. ## %st_name = split(/!/, 'ALASKA!AK!ALABAMA!AL!ARKANSAS!AR!ARIZONA!AZ!CALIFORNIA!CA!COLORADO!CO!CONNECTICUT!CT!DELAWARE!DE!FLORIDA!FL!GEORGIA!GA!HAWAII!HI!IOWA!IA!IDAHO!ID!ILLINOIS!IL!INDIANA!IN!KANSAS!KS!KENTUCKY!KY!LOUISIANA!LA!MASSACHUSETTS!MA!MARYLAND!MD!MAINE!ME!MICHIGAN!MI!MINNESOTA!MN!MISSOURI!MO!MISSISSIPPI!MS!MONTANA!MT!NEBRASKA!NB!NORTH CAROLINA!NC!NORTH DAKOTA!ND!NEW HAMPSHIRE!NH!NEW JERSEY!NJ!NEW MEXICO!NM!NEVADA!NV!NEW YORK!NY!OHIO!OH!OKLAHOMA!OK!OREGON!OR!PENNSYLVANIA!PA!RHODE ISLAND!RI!SOUTH CAROLINA!SC!SOUTH DAKOTA !SD!TENNESSEE!TN!TEXAS!TX!UTAH!UT!VIRGINIA!VA!VERMONT!VT!WASHINGTON!WA!WISCONSIN!WI!WEST VIRGINIA!WV!WYOMING!WY'); %st_abr = split(/( )/,'AK AL AR AZ CA CO CT DE FL GA HI IA ID IL IN KS KY LA MA MD ME MI MN MO MS MT NB NC ND NH NJ NM NV NY OH OK OR PA RI SC SD TN TX UT VA VT WA WI WV WY '); ## # State_Check # In: usa_state (two letter abreviation, or full state name ) # Out: () || ST(Usa two letter state abreviation) ## sub main'State_Check { local($state) = @_; $state =~ y/a-z/A-Z/; return $state if defined($st_abr{$state}); return $st_name{$state} if defined($st_name{$state}); return (); } -- +- Bill England, wengland@stephsf.COM -----------------------------------+ | * * H -> He +24Mev | | * * * ... Oooo, we're having so much fun making itty bitty suns * | |__ * * ___________________________________________________________________|
tchrist@convex.COM (Tom Christiansen) (12/14/90)
From the keyboard of wengland@stephsf.stephsf.com (Bill England): :sub main'State_Check { : local($state) = @_; : $state =~ y/a-z/A-Z/; : : return $state if defined($st_abr{$state}); : return $st_name{$state} if defined($st_name{$state}); : return (); :} : Hmm... shouldn't that last one be: return ''; # or return undef or since explicit returns cost more than implicit ones: ''; # or undef The null _scalar_ value in perl is '' (defined) or undef (undefined), whereas () is an empty list. It probably will work out ok as is, but it's one of those scalar/vector confusion things that always make me at least a little nervous, like saying @x[1] when you mean $x[1]. --tom -- Tom Christiansen tchrist@convex.com convex!tchrist "With a kernel dive, all things are possible, but it sure makes it hard to look at yourself in the mirror the next morning." -me