prastowo@qucis.queensu.CA (Bambang Nurcahyo Prastowo) (05/10/91)
Could any perl guru help me rewrite the following sh script in perl ?
The script is a mail sorter that cleans incoming mails (simplifies
headers and removes duplicated blank lines) and save them in mailboxes
named after the first words of email addresses appear in "From:...".
Bambang Prastowo prastowo@qucis.queensu.ca
#! /bin/sh
home=/grad/prastowo/MAIL
inmail=$home/inmail.$$
outmail=$home/outmail.$$
header=$home/header.$$
body=$home/body.$$
neat=$home/neat.$$
sender=$home/sender.$$
echo=/bin/echo
grep=/bin/grep
egrep=/bin/egrep
rm=/bin/rm
sed=/bin/sed
cat=/bin/cat
awk=/bin/awk
tr=/bin/tr
$cat > $inmail
$sed -n '1,/^$/p' $inmail > $header
$sed -n '/^$/,$p' $inmail > $body
trap "$rm -f $header $inmail $body $sender $neat" 0
(
$egrep '^From |^From:|^Date:|^Subject:' $header
$echo
$awk '/^Date:/||/^From:/||/^Subject:/ {printf(">")}{print}' $body
$echo _____________________________________
$echo
) | awk '/^[ \t]*$/{if(!b){print "";b=1}}/[^ \t]/{print $0;b=0}'> $neat
# keep mails in folders by the 1st words of senders' addresses.
sd=`$sed -n 's/^From://p' $header | $sed 's/.*<//;s/>.*//' |
$awk '{print $1}' | sed 's/[@%\.\!].*$//' |
$tr a-z A-Z `
$cat $neat >> $home/$sd
exit 0
--
Bambang Nurcahyo Prastowo prastowo@qucis.queensu.ca
prastowo@qucis.bitnetlwall@jpl-devvax.jpl.nasa.gov (Larry Wall) (05/11/91)
In article <2083@qusunc.queensu.CA> prastowo@qucis.queensu.CA (Bambang Nurcahyo Prastowo) writes: : Could any perl guru help me rewrite the following sh script in perl ? : The script is a mail sorter that cleans incoming mails (simplifies : headers and removes duplicated blank lines) and save them in mailboxes : named after the first words of email addresses appear in "From:...". Sure. It'd go something like this: ------------------------------------------------------ #!/usr/bin/perl $MAILDIR = "/grad/prastowo/MAIL"; $/ = ''; # paragraph mode $* = 1; # enable multiline matching for ^ # parse header into associative array %hdr = ('FROM', split(/^([-\w]+):[ \t]*/, <>)); # open appropriate file $sd = $hdr{'From'}; $sd =~ s/^.*<\s*(.*)>.*$/$1/; # handle <> form $sd =~ s/[\s@%.!].*//; # delete trailing stuff $sd =~ tr/a-z/A-Z/; # canonicalize to upper case open(STDOUT, ">>$MAILDIR/$sd") || die "Can't open $sd: $!\n"; # write the header (pick your order) print $hdr{'FROM'} if $hdr{'FROM'}; print 'From: ', $hdr{'From'} if $hdr{'From'}; print 'Date: ', $hdr{'Date'} if $hdr{'Date'}; print 'Subject: ', $hdr{'Subject'} if $hdr{'Subject'}; print "\n"; # now do each remaining paragraph while (<>) { $* = 0; s/^\n+//; # delete extra blank lines $* = 1; s/^(Date|From|Subject):/>$1:/g; print; } print "_____________________________________\n\n"; ------------------------------------------------------ I translated it fairly literally, including the fact that addresses like foo!bar@fiddle save to file FOO rather than BAR. I chose to process the message in paragraph mode because it makes it easy to find extra newlines (they show up at the front of the paragraph). The only tricky think is that you have to turn on and off whether ^ matches the beginning of each line in the string or just the beginning of the string. This could fairly easily be restructured to handle multiple messages on the input stream by putting the header code into the final loop and executing it whenever the current paragraph looks like a header. Larry Wall lwall@netlabs.com