[comp.lang.perl] address sniffing hack

vixie@decwrl.dec.com (Paul A Vixie) (07/27/90)

Knowing you folks, you'll probably come up with seventeen better ways
to do this.  And I'm sure Randal has a one-liner for it.  But, meanwhile,
feed your syslog into this script and see what you get.  We use it for
name server testing.  You can also feed it news articles.

#! /usr/local/bin/perl

# sniff addresses out of an input text.
#
# original: vixie@decwrl, 21jul90

%addrs = ();

while (<>) {
        while (/@\w+(\.\w+)+/) {
                $h = $&;
                $h =~ s/^@//;
                $h =~ tr/A-Z/a-z/;
##              print "-> $h\n";
                $addrs{$h} = '';
                $_ = $';
        }
}

while (($_) = each %addrs) {
        print "$_\n";
}

exit 0;
--
Paul Vixie
DEC Western Research Lab	<vixie@wrl.dec.com>
Palo Alto, California		...!decwrl!vixie
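
For readers on a current Perl 5, the loop above can be sketched without
the manual "$_ = $'" step, since a /g match in a while condition resumes
where the previous match ended.  This is only a sketch under assumptions:
the sub name sniff_hosts and the sample lines are made up for
illustration, and the output is sorted, which the original script does
not do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect the unique, lowercased host parts that follow an "@" in the
# given lines.  sniff_hosts is a hypothetical name, not from the post.
sub sniff_hosts {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        # /g in a while condition resumes after the previous match,
        # so there is no need to reassign $_ = $' by hand.
        while ($line =~ /\@(\w+(?:\.\w+)+)/g) {
            $seen{ lc $1 } = 1;
        }
    }
    return sort keys %seen;
}

# Made-up sample input standing in for syslog or news text.
my @sample = (
    "mail from <alice\@Foo.Example.COM> to bob\@bar.example.org\n",
    "relay carol\@foo.example.com, no host in user\@localhost\n",
);
print "$_\n" for sniff_hosts(@sample);
```

As in the original, a name with no dot after the "@" (user@localhost in
the sample) is skipped, because the pattern requires at least one
".word" component.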

flee@guardian.cs.psu.edu (Felix Lee) (07/27/90)

> # sniff addresses out of an input text.

Here's my one-line equivalent...

$/='@'; <>; grep(/^\w+(\.\w+)+/ && !$u{$&}++ && print("$&\n"), <>);

It's curious that this one line came to mind without thought.  Writing
a while loop would have taken more effort.  I feel like I'm stuck
using particular idioms, writing in a specific dialect of perl.
Randal would probably have used an "s///eg" expression.  Hmm.

s/@(\w+(\.\w+)+)/$u{$1}++ || print("$1\n")/eg while (<>);

Not the type of code I'd like to maintain.  Oh well.
--
Felix Lee	flee@cs.psu.edu
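
The first one-liner above leans on $/, the input record separator: with
$/ set to '@', every record after the first begins with whatever
followed an "@" in the input.  A spelled-out sketch of that idea for a
current Perl 5 might look like the following.  The sub name
hosts_by_record, the in-memory filehandle, and the sample text are
assumptions made for illustration.  Like the one-liner, it does not
lowercase and it keeps first-seen order.

```perl
use strict;
use warnings;

# hosts_by_record is a hypothetical helper name; it reads the given
# string through an in-memory filehandle instead of <>.
sub hosts_by_record {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    local $/ = '@';            # records now end at each "@"
    my $skipped = <$fh>;       # discard everything up to the first "@"
    my (%seen, @hosts);
    while (my $rec = <$fh>) {
        # Each remaining record starts right after an "@".
        if ($rec =~ /^(\w+(?:\.\w+)+)/ && !$seen{$1}++) {
            push @hosts, $1;
        }
    }
    close $fh;
    return @hosts;
}

# Made-up sample text.
print "$_\n" for hosts_by_record("a\@foo.com, b\@bar.org, c\@foo.com\n");
```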

merlyn@iwarp.intel.com (Randal Schwartz) (07/27/90)

In article <VIXIE.90Jul26183407@volition.pa.dec.com>, vixie@decwrl (Paul A Vixie) writes:
| Knowing you folks, you'll probably come up with seventeen better ways
| to do this.  And I'm sure Randal has a one-liner for it.  But, meanwhile,
| feed your syslog into this script and see what you get.  We use it for
| name server testing.  You can also feed it news articles.
| 
| #! /usr/local/bin/perl
| 
| # sniff addresses out of an input text.
| #
| # original: vixie@decwrl, 21jul90
| 
| %addrs = ();
| 
| while (<>) {
|         while (/@\w+(\.\w+)+/) {
|                 $h = $&;
|                 $h =~ s/^@//;
|                 $h =~ tr/A-Z/a-z/;
| ##              print "-> $h\n";
|                 $addrs{$h} = '';
|                 $_ = $';
|         }
| }
| 
| while (($_) = each %addrs) {
|         print "$_\n";
| }
| 
| exit 0;

OK, here's an untested version, convoluted as always:

while (<>) {
	s#@(\w+(\.\w+)*)#{($h=$1)=~y/A-Z/a-z/;$h{$h}++;$&;}#eg;
}
print join("\n", sort keys %h), "\n";

Not quite a one liner.  I was coming up with some weird ones with
split and join and grep while working on this one, but this one seems
to be the clearest (ha!).

Something like

grep(s/^@// && y/A-Z/a-z/, split(/(@\w+(\.\w+)+)/, join("",<>)))

gives you all the hostnames, but they're not uniq-ified.  Maybe
somebody else can hack that down.  It's too late in the evening for me
to think.  (And now that I look at that, it doesn't handle other weird
@ combos either.  So maybe that's the wrong approach entirely...)

Just another puzzled Perl hacker, trying to code for the book instead,
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/
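
The slurp-everything approach in the post above leaves the hostnames
un-uniq-ified.  One possible way to finish it on a current Perl 5 is a
/g match in list context filtered through a "seen" hash.  This is a
sketch rather than anyone's posted solution: the sub name uniq_hosts and
the sample text are made up, and it swaps split for a list-context match
so that only the captured hostnames come back.

```perl
use strict;
use warnings;

# uniq_hosts is a hypothetical name; it takes the whole input as one
# string, as join("", <>) would produce.
sub uniq_hosts {
    my ($text) = @_;
    my %seen;
    # A /g match in list context returns every capture; with a single
    # capturing group, that is one hostname per match.  lc folds case,
    # and the grep keeps only the first occurrence of each name.
    return grep { !$seen{$_}++ }
           map  { lc $_ }
           $text =~ /\@(\w+(?:\.\w+)+)/g;
}

# Made-up sample text.
my $text = "From: x\@Foo.Example.COM\nCc: y\@foo.example.com, z\@bar.example.org\n";
print join("\n", uniq_hosts($text)), "\n";
```

Using lc in a separate map step also sidesteps a trap in the grep
version: y/A-Z/a-z/ returns the number of characters it changed, so a
hostname that is already all lowercase yields 0 and would be filtered
out.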