stealth@engin.umich.edu (Mike Pelletier) (06/04/91)
For my first trick... I'm an utter perl novice, so tone your responses appropriately. ;-) I'm working on a mail log parser, and I'm a bit perplexed as to how to deal with lines that have inconsistant field types. For instance: Jun 3 17:58:10 engin sendmail[12345]: AA01695: to=<spool.mu.edu!uunet!papaya.bbn.com!pineapple.bbn.com@engin.umich.edu>, delay=11:55:01, stat=Deferred Jun 3 17:58:10 engin sendmail[12345]: SmtpIn/Out NULL for DATA ---- The first one has a queue id, AA01695, whereas the second one just has the information right there, associated with the previous one only by the bracketed process-id (which I *didn't* make up -- it actually is there in the log file...) Any ideas how to handle this in perl? Does anyone have a mail log parser that they could send to me for educational perusal? Thanks for your help, folks! -- Mike Pelletier | "Wind & waves are breakdowns in the commitment of The University of Michigan | getting from here to there, but they are the con- College of Engineering | ditions for sailing. Not something to eliminate, Student/Systems Admin | but something to dance with."
tchrist@convex.COM (Tom Christiansen) (06/04/91)
From the keyboard of stealth@engin.umich.edu (Mike Pelletier): :For my first trick... :I'm an utter perl novice, so tone your responses appropriately. ;-) :I'm working on a mail log parser, and I'm a bit perplexed as to how :to deal with lines that have inconsistant field types. For instance: : :Jun 3 17:58:10 engin sendmail[12345]: AA01695: to=<spool.mu.edu!uunet!papaya.bbn.com!pineapple.bbn.com@engin.umich.edu>, delay=11:55:01, stat=Deferred : :Jun 3 17:58:10 engin sendmail[12345]: SmtpIn/Out NULL for DATA : :---- :The first one has a queue id, AA01695, whereas the second one just has the :information right there, associated with the previous one only by the :bracketed process-id (which I *didn't* make up -- it actually is there in :the log file...) : :Any ideas how to handle this in perl? Does anyone have a mail log parser :that they could send to me for educational perusal? Thanks for your help, :folks! I've never seen sendmail log an entry without an AA\d+ string. To parse out something that's part variable fields and part not, you could probably best use a regexp, like ($mon, $day, $time, $host, $program, $pid, $extra) = /^(\w+)\s+(\d+)\s+(\d+:\d+:\d+)\s+(\w+)\s+(\w+)\s+(.*)/; and then do with $extra as you like. Maybe my \w's should be \S's. I have a script for parsing sendmail syslog files. It may well not do what you want. It was one of my earlier perl script, so is a bit odd looking (but aren't they all?). I'll enclose it. I keep meaning to write one that shows me all transactions to or from a particular user. One of these days... --tom #!/usr/local/bin/perl # # ssl -- summarize sendmail syslog # # original by Tom Christiansen <tchrist@convex.com> # probably in 88 or 89 # # further hacking by # Paul O'Neill # Coastal Imaging Lab # Oregon State University # # 18 jun 90 # fix various bugs # add sorted output # # still more hacking by tchrist, sometime in 91 # # summarize sendmail syslog # # flags mean: # # -o outbound mail only # -i inbound mail only # -t suppress printing of totals # -e print strange lines to stderr # -m reduce to local mbox is possible ($program = $0) =~ s%.*/%%; while ($ARGV[0] =~ /^-/) { $ARGV[0] =~ s/^-//; foreach $flag ( split (//,$ARGV[0]) ) { if ( 'oitem' !~ /$flag/ ) { printf stderr "unknown flag: %s\n", $flag; die "usage: $program [-oitem] [syslog_file ...]\n"; } die "$0: '$flag' flag already set\n" if ($flags{$flag}++); } shift; } if ( !$flags{'o'} && !$flags{'i'} && !$flags{'t'}) { $flags{'o'}++; $flags{'i'}++; } do hash_passwd() if $flags{'m'}; while (<>) { if (/: [A-Z][A-Z](\d+): from=(.*), size=(\d+)/) { # next unless $flags{'t'} || $flags{'o'}; ($id, $user, $size) = ($1, $2, $3); #print "$user\n"; $user =~ s/.*<(.*)>/$1/; # get rid of <> #print "$user\n"; $user =~ tr/A-Z/a-z/; # canonical lc #print "$user\n"; if ($flags{'m'}) { $ouser = $user; #print " $user\n"; $user = do strip($user); #print " $user\n"; $user = $ouser if ! $known{$user}; #print " $user\n"; #print " $known{$user}\n"; } $from_user_size{$user} += $size; $id_size{$id} = $size; $from_user_count{$user}++; $total_from++; } elsif (/: [A-Z][A-Z](\d+): to=(.*), delay=/) { # next unless $flags{'t'} || $flags{'i'}; $id = $1; for (split(/,/, $2)) { s/.*<(.*)>/$1/; # $to = $flags{'m'} ? do strip($_) : $_; $to = $_; if ($flags{'m'}) { $oto = $to; #print " $to\n"; $to = do strip($to); #print " $to\n"; $to = $oto if ! $known{$to}; #print " $to\n"; #print " $known{$to}\n"; } $to =~ tr/A-Z/a-z/; # printf "adding %d bytes to %s from %s\n", $id_size{$id},$to,$user; if (!$to) { die "to no one: $_\n"; } $to_user_size{$to} += $id_size{$id}; $to_user_count{$to}++; $total_to++; } } else { study; next if /message-id/; next if /locked/; next if /alias database (auto|)rebuilt/; #next if /aliases/; next if /rebuilding alias database/; print stderr if $flags{'e'}; $errors++; } } printf stderr "Error lines: %d\n", $errors if $errors && ($flags{'e'}) && !($flags{'t'}); if ($flags{'i'}) { printf "To: %d\n", $total_to unless $flags{'t'};; @loop = keys(to_user_size); foreach $user (sort tosort @loop) { printf "%4d message%s %7d bytes %s\n", $to_user_count{$user}, $to_user_count{$user} != 1 ? "s" : " ", $to_user_size{$user}, $user; } } if ($flags{'o'}) { printf "From: %d\n", $total_from unless $flags{'t'};; @loop = keys(from_user_size); foreach $user (sort fromsort @loop) { printf "%4d message%s %7d bytes %s\n", $from_user_count{$user}, $from_user_count{$user} != 1 ? "s" : " ", $from_user_size{$user}, $user; } } sub tosort { ($to_user_count{$b} - $to_user_count{$a})* 10000000 + $to_user_size{$b} - $to_user_size{$a}; } sub fromsort { ($from_user_count{$b} - $from_user_count{$a}) * 10000000 + $from_user_size{$b} -$from_user_size{$a}; } sub strip { local($foo) = shift(@_); #print "$foo\n"; $foo =~ s/@.*//; $foo =~ s/.*!//; $foo =~ s/\s*\(.*\)//; $foo =~ tr/A-Z/a-z/; return $foo; } sub hash_passwd { chop($yp = `/bin/domainname`) if -x '/bin/domainname'; $passwd = $yp ? 'ypcat passwd |' : '/etc/passwd'; open passwd || die "$program: can't open $passwd: $!\n"; while (<passwd>) { /^(\w+):[^:]+:(\d+):.*/; ($who,$uid) = ($1, $2); # $uid = 'zero' if ! $uid; # kludge for uid 0 $uid = 'zero' if ($uid == 0) && $who; # $uid = 'zero' if defined($uid); $known{$who} = $uid; #print "$who $uid $known{$who}\n"; } close passwd; #print "SPECIEALLLL -- $known{''}\n"; } -- Tom Christiansen tchrist@convex.com convex!tchrist "Perl is to sed as C is to assembly language." -me