prastowo@qucis.queensu.CA (Bambang Nurcahyo Prastowo) (05/10/91)
Could any perl guru help me rewrite the following sh script in perl ? The script is a mail sorter that cleans incoming mails (simplifies headers and removes duplicated blank lines) and save them in mailboxes named after the first words of email addresses appear in "From:...". Bambang Prastowo prastowo@qucis.queensu.ca #! /bin/sh home=/grad/prastowo/MAIL inmail=$home/inmail.$$ outmail=$home/outmail.$$ header=$home/header.$$ body=$home/body.$$ neat=$home/neat.$$ sender=$home/sender.$$ echo=/bin/echo grep=/bin/grep egrep=/bin/egrep rm=/bin/rm sed=/bin/sed cat=/bin/cat awk=/bin/awk tr=/bin/tr $cat > $inmail $sed -n '1,/^$/p' $inmail > $header $sed -n '/^$/,$p' $inmail > $body trap "$rm -f $header $inmail $body $sender $neat" 0 ( $egrep '^From |^From:|^Date:|^Subject:' $header $echo $awk '/^Date:/||/^From:/||/^Subject:/ {printf(">")}{print}' $body $echo _____________________________________ $echo ) | awk '/^[ \t]*$/{if(!b){print "";b=1}}/[^ \t]/{print $0;b=0}'> $neat # keep mails in folders by the 1st words of senders' addresses. sd=`$sed -n 's/^From://p' $header | $sed 's/.*<//;s/>.*//' | $awk '{print $1}' | sed 's/[@%\.\!].*$//' | $tr a-z A-Z ` $cat $neat >> $home/$sd exit 0 -- Bambang Nurcahyo Prastowo prastowo@qucis.queensu.ca prastowo@qucis.bitnet
lwall@jpl-devvax.jpl.nasa.gov (Larry Wall) (05/11/91)
In article <2083@qusunc.queensu.CA> prastowo@qucis.queensu.CA (Bambang Nurcahyo Prastowo) writes: : Could any perl guru help me rewrite the following sh script in perl ? : The script is a mail sorter that cleans incoming mails (simplifies : headers and removes duplicated blank lines) and save them in mailboxes : named after the first words of email addresses appear in "From:...". Sure. It'd go something like this: ------------------------------------------------------ #!/usr/bin/perl $MAILDIR = "/grad/prastowo/MAIL"; $/ = ''; # paragraph mode $* = 1; # enable multiline matching for ^ # parse header into associative array %hdr = ('FROM', split(/^([-\w]+):[ \t]*/, <>)); # open appropriate file $sd = $hdr{'From'}; $sd =~ s/^.*<\s*(.*)>.*$/$1/; # handle <> form $sd =~ s/[\s@%.!].*//; # delete trailing stuff $sd =~ tr/a-z/A-Z/; # canonicalize to upper case open(STDOUT, ">>$MAILDIR/$sd") || die "Can't open $sd: $!\n"; # write the header (pick your order) print $hdr{'FROM'} if $hdr{'FROM'}; print 'From: ', $hdr{'From'} if $hdr{'From'}; print 'Date: ', $hdr{'Date'} if $hdr{'Date'}; print 'Subject: ', $hdr{'Subject'} if $hdr{'Subject'}; print "\n"; # now do each remaining paragraph while (<>) { $* = 0; s/^\n+//; # delete extra blank lines $* = 1; s/^(Date|From|Subject):/>$1:/g; print; } print "_____________________________________\n\n"; ------------------------------------------------------ I translated it fairly literally, including the fact that addresses like foo!bar@fiddle save to file FOO rather than BAR. I chose to process the message in paragraph mode because it makes it easy to find extra newlines (they show up at the front of the paragraph). The only tricky think is that you have to turn on and off whether ^ matches the beginning of each line in the string or just the beginning of the string. This could fairly easily be restructured to handle multiple messages on the input stream by putting the header code into the final loop and executing it whenever the current paragraph looks like a header. Larry Wall lwall@netlabs.com