[comp.sources.d] C News "upact" in Perl 3.0

chip@ateng.com (Chip Salzenberg) (11/19/89)

The C News "expire" program doesn't update the third (min) field of the
active file.  The authors of C News consider this field a hack.  However,
they do provide a separate shell script, "upact", that updates the min
field according to the actual contents of the spool directory.  While the
distributed upact works, it's _very_ slow.

Enter Perl 3.0.  This wondrous language now includes opendir, readdir and
closedir.  Since I just got Perl 3.0 running on my system, I figured I
should check out its new features.

The result:  A upact script written in Perl 3.0.  It's not exactly a work
of art, but it's a lot faster than the shell script.  On my system, a 20MHz
'386 running SCO Xenix/386, it processes my 495-line active file in one
minute and twenty seconds of real time.

Shar and enjoy.

#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  upact
# Wrapped by chip@ateng on Sat Nov 18 14:59:21 1989
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'upact' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'upact'\"
else
echo shar: Extracting \"'upact'\" \(2393 characters\)
sed "s/^X//" >'upact' <<'END_OF_FILE'
Xeval 'exec /bin/perl -S $0 ${1+"$@"}'
X	if $running_under_some_shell;
X$prog = $0;
X
X# Update 3rd field (minimum art. #) of a 4-field active file.
X
X$config = "/usr/lib/news/bin/config" unless $config = $ENV{'NEWSCONFIG'};
Xopen(config) || die "can't open $config: $!";
Xwhile (<config>) {
X	if (/^(NEWS[A-Z]+)=\${\1-(.*)}/) {
X		$var = $1;
X		$value = $2 unless $value = $ENV{$var};
X		$NEWSCTL    = $value if $var eq "NEWSCTL";
X		$NEWSBIN    = $value if $var eq "NEWSBIN";
X		$NEWSARTS   = $value if $var eq "NEWSARTS";
X		$NEWSPATH   = $value if $var eq "NEWSPATH";
X		$NEWSUMASK  = $value if $var eq "NEWSUMASK";
X		$NEWSMASTER = $value if $var eq "NEWSMASTER";
X		$NEWSCONFIG = $value if $var eq "NEWSCONFIG";
X	}
X}
Xclose(config);
X
X$ENV{'PATH'} = "$NEWSCTL/bin:$NEWSBIN/expire:$NEWSBIN:$NEWSPATH";
Xumask($NEWSUMASK);
X
Xchdir($NEWSCTL) || die "$prog: can't chdir to $NEWSCTL: $!";
X
Xopen(active, "active") || die "can't open active: $!";
X$_ = <active>;
Xchop;
Xclose(active);
X@F = split(/[ \t]+/);
Xdie "$prog: active file has other than 4 fields" unless $#F == 3;
X
X# lock news system
X$lock = "$NEWSCTL/LOCK";
X$ltemp = "$NEWSCTL/L.$$";
Xopen(ltemp,">$ltemp");
Xprint ltemp "$$\n";
Xclose(ltemp);
X
X@KILL = ($ltemp);
X
X$SIG{'HUP'} = 'gotsig';
X$SIG{'INT'} = 'gotsig';
X$SIG{'TERM'} = 'gotsig';
X
Xsub gotsig {
X	local ($sig) = @_;
X	do leave("got SIG$sig\n");
X}
X
Xsub leave {
X	local (@MSG) = @_;
X	print stderr @MSG if ($#MSG >= 0);
X	unlink(@KILL);
X	exit(0);
X}
X
Xwhile (1) {
X	if (link($ltemp, $lock)) {
X		push(@KILL, $lock);
X		last;
X	}
X	sleep 30;
X}
X
Xopen(active, "active") || do leave("can't open active: $!");
Xopen(newactive, ">active.new") || do leave("can't create active.new: $!");
X
Xwhile (<active>) {
X	chop;
X	($group, $max, $min, $fourth) = split(/[ \t]+/);
X	($dir = $group) =~ y#.#/#;
X	$newsdir = "$NEWSARTS/$dir";
X	$min = (-d $newsdir) ? (do dirlow($newsdir)) : "";
X	$min = $max + 1 unless $min;
X	printf newactive "%s %010d %010d %s\n", $group, $max, $min, $fourth;
X}
X
Xclose(active);
Xclose(newactive);
X
X# replace active, carefully
X
Xunlink "active.old";
Xrename("active", "active.old");
Xrename("active.new", "active");
X
Xdo leave();
X
X# find the lowest file number in the directory.
X
Xsub dirlow {
X	local ($dir) = @_;
X	local ($low, $f);
X
X	opendir(dir,$dir) || return "";
X	$low = "";
X	while ($f = readdir(dir)) {
X		next unless $f =~ /^[0-9]+$/;
X		$low = $f if ($low eq "") || ($f < $low);
X	}
X	closedir(dir);
X	return $low;
X}
END_OF_FILE
if test 2393 -ne `wc -c <'upact'`; then
    echo shar: \"'upact'\" unpacked with wrong size!
fi
# end of 'upact'
fi
echo shar: End of shell archive.
exit 0
-- 
You may redistribute this article only to those who may freely do likewise.
Chip Salzenberg at A T Engineering;  <chip@ateng.com> or <uunet!ateng!chip>
    "Did I ever tell you the Jim Gladding story about the binoculars?"

henry@utzoo.uucp (Henry Spencer) (11/19/89)

In article <2565B57C.10978@ateng.com> chip@ateng.com (Chip Salzenberg) writes:
>The C News "expire" program doesn't update the third (min) field of the
>active file.  The authors of C News consider this field a hack.  However,
>they do provide a separate shell script, "upact", that updates the min
>field according to the actual contents of the spool directory.  While the
>distributed upact works, it's _very_ slow.
>
>Enter Perl 3.0...

Uh, in any modern C News distribution you will find updatemin.c, which does
the same thing in C.  It's not part of the default configuration because it
does require the directory library.
-- 
A bit of tolerance is worth a  |     Henry Spencer at U of Toronto Zoology
megabyte of flaming.           | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

chip@ateng.com (Chip Salzenberg) (11/21/89)

According to henry@utzoo.uucp (Henry Spencer):
>Uh, in any modern C News distribution you will find updatemin.c, which does
>the same thing in C.  It's not part of the default configuration because it
>does require the directory library.

Uh...  I knew that.   :-)

Oh well, it was still fun writing the Perl version.  And it's rather
impressive (to me, anyway) how easy it is to get reasonable speeds from
a semi-compiled language like Perl.
-- 
You may redistribute this article only to those who may freely do likewise.
Chip Salzenberg at A T Engineering;  <chip@ateng.com> or <uunet!ateng!chip>
    "Did I ever tell you the Jim Gladding story about the binoculars?"

tale@pawl.rpi.edu (David C Lawrence) (11/22/89)

According to henry@utzoo.uucp (Henry Spencer):
Henry> Uh, in any modern C News distribution you will find
Henry> updatemin.c, which does the same thing in C.

In <25682EB0.24425@ateng.com> chip@ateng.com (Chip Salzenberg) writes:
Chip> Uh...  I knew that.   :-)

I didn't know it.  I'm glad your posting prompted Henry's note, for
two reasons.

A) It munged my active file by always setting $min to 1 + $max.
B) updatemin is faster anyway.  It fixed the damage done by the perl
upact pretty quickly. :-)

Now I am also quite content to run updatemin nightly from cron after
expire.

The problem with upact was in dirlow().  $min never got set.  I don't
understand exactly what was failing, but many printf's later I tracked
it down to the fact that though $low was being set in the "while"
control structure, it's value was still "" when it exited:

sub dirlow {
        local ($dir) = @_;
        local ($low, $f);

        opendir(dir,$dir) || return "";
        $low = "";
        while ($f = readdir(dir)) {
                printf "%s\n", $f;
                next unless $f =~ /^[0-9]+$/;
                $low = $f if ($low eq "") || ($f < $low);
        }
        closedir(dir);
        return $low;
}

This is in perl 3.0.1.2, patchlevel 6, GCC 1.35 compiled on a Sun 3,
SunOS 4.0.3 system.

Dave
-- 
 (setq mail '("tale@pawl.rpi.edu" "tale@ai.mit.edu" "tale@rpitsmts.bitnet"))

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (11/23/89)

In article <empty.stack.4D9D@rpi.edu> tale@pawl.rpi.edu (David C Lawrence) writes:
: B) updatemin is faster anyway.  It fixed the damage done by the perl
: upact pretty quickly. :-)
: 
: The problem with upact was in dirlow().  $min never got set.  I don't
: understand exactly what was failing, but many printf's later I tracked
: it down to the fact that though $low was being set in the "while"
: control structure, it's value was still "" when it exited:
...
: This is in perl 3.0.1.2, patchlevel 6, GCC 1.35 compiled on a Sun 3,
: SunOS 4.0.3 system.

The preliminary suspect is the gcc optimizer.  It doesn't misbehave when
I take the -O out.  More later.

Larry Wall
lwall@jpl-devvax.jpl.nasa.gov

chip@ateng.com (Chip Salzenberg) (11/23/89)

According to tale@pawl.rpi.edu (David C Lawrence), my version of upact in
Perl munged his active file by always setting $min to 1 + $max.

Well, I have it on good authority (mine) that the posted script works.
I use it here on this system nightly and it's great.

I suspect that the problem is that your Perl doesn't know how to readdir()
correctly.  Before blaming the script, it might be good to double-check
your Perl with a small test program.
-- 
You may redistribute this article only to those who may freely do likewise.
Chip Salzenberg at A T Engineering;  <chip@ateng.com> or <uunet!ateng!chip>
    "Did I ever tell you the Jim Gladding story about the binoculars?"

allbery@NCoast.ORG (Brandon S. Allbery) (11/23/89)

As quoted from <empty.stack.4D9D@rpi.edu> by tale@pawl.rpi.edu (David C Lawrence):
+---------------
| sub dirlow {
|         local ($dir) = @_;
|         local ($low, $f);
| 
|         opendir(dir,$dir) || return "";
|         $low = "";
>----------------^^
|         while ($f = readdir(dir)) {
|                 printf "%s\n", $f;
|                 next unless $f =~ /^[0-9]+$/;
|                 $low = $f if ($low eq "") || ($f < $low);
>------------------------------------^^^^^---------^^^^^^
|         }
|         closedir(dir);
|         return $low;
| }
+---------------

"eq" applies to strings; "<" applies to numbers.  Best not to mix them; as the
Perl manual suggests, you can get unpleasant surprises.

++Brandon
-- 
Brandon S. Allbery    allbery@NCoast.ORG, BALLBERY (MCI Mail), ALLBERY (Delphi)
uunet!hal.cwru.edu!ncoast!allbery ncoast!allbery@hal.cwru.edu bsa@telotech.uucp
*(comp.sources.misc mail to comp-sources-misc[-request]@backbone.site, please)*
*Third party vote-collection service: send mail to allbery@uunet.uu.net (ONLY)*
expnet.all: Experiments in *net management and organization.  Mail me for info.

chip@ateng.com (Chip Salzenberg) (11/24/89)

According to allbery@NCoast.ORG (Brandon S. Allbery):
[From my Perl "upact"]:
>|         $low = "";
>>----------------^^
>|         while ($f = readdir(dir)) {
>|                 printf "%s\n", $f;
>|                 next unless $f =~ /^[0-9]+$/;
>|                 $low = $f if ($low eq "") || ($f < $low);
>>------------------------------------^^^^^---------^^^^^^
>|         }
>
>"eq" applies to strings; "<" applies to numbers.  Best not to mix them;
>as the Perl manual suggests, you can get unpleasant surprises.

You _can_ get unpleasant surprises if you are not careful to use the correct
comparison operator(s).  However, in this case, you do not get unpleasant
anything.  The code is correct.

Remember that the filenames come back as strings.  Note the regular
expression for acceptable filenames: all digits.  Therefore the numeric test
is valid.  Further, note that the test ($low eq "") cannot succeed if any
numeric files have been found.

Anyone who doubts the correctness of this code is invited to E-Mail me for
further explanation.
-- 
You may redistribute this article only to those who may freely do likewise.
Chip Salzenberg at A T Engineering;  <chip@ateng.com> or <uunet!ateng!chip>
    "Did I ever tell you the Jim Gladding story about the binoculars?"

tale@pawl.rpi.edu (David C Lawrence) (11/24/89)

In <256AF479.21036@ateng.com> chip@ateng.com (Chip Salzenberg) writes:
Chip> I suspect that the problem is that your Perl doesn't know how to
Chip> readdir() correctly.  Before blaming the script, it might be
Chip> good to double-check your Perl with a small test program.

I'm sorry you felt like my posting was an affront on your programming
ability, Chip, but I am quite certain that any such idea was inferred
and not implied.  If you had read what I said, $low was being
correctly set in the loop.  This couldn't happen if my readdir() is
broken, which it is not.  I can blame the script, because if I hadn't
run the script it wouldn't have munged my file.  The fact that the
script is coded fine, which is why I _did_ spend time tracking down
what was happening, does not change the damage done.

Larry suspects the problem is caused by the GCC optimizer.  In mail, J
Greely had this to say, cited with permission:

J> Patchlevel 6 tickled to life an old bug with return and local
J> variables.  I'm trying to clear it up with Larry, but my examples are
J> twisted (you have to have a local, a while, *and* a return, and the
J> body of the while can't be empty, and *must* refer to a local, and
J> when the moon is full...).

Once again, I am sorry that you felt I was making a slur on your
programming ability.  I was not.

Dave
-- 
 (setq mail '("tale@pawl.rpi.edu" "tale@ai.mit.edu" "tale@rpitsmts.bitnet"))