schwartz@psuvax1.cs.psu.edu (Scott Schwartz) (12/14/89)
Just for fun I compared the speed of perl 2.0 with perl 3.0 using the
pi computing demo by W. Kebsch <nixpbe!kebsch> (minus the reporting of
intermediate results).  The results, on a Sun4/280:

psuvax1% perl-2 pi.pl 200
pi.pl-1.2, digits: 200, terms: 333, elements: 53
3.
1415 9265 3589 7932 3846 2643 3832 7950 2884 1971
6939 9375 1058 2097 4944 5923 0781 6406 2862 0899
8628 0348 2534 2117 0679 8214 8086 5132 8230 6647
0938 4460 9550 5822 3172 5359 4081 2848 1117 4502
8410 2701 9385 2110 5559 6446 2294 8954 9303 8196
[u=15.7833 s=.8 cu=0.0166667 cs=.116667]

psuvax1% perl-3 pi.pl 200
pi.pl-1.2, digits: 200, terms: 333, elements: 53
3.
1415 9265 3589 7932 3846 2643 3832 7950 2884 1971
6939 9375 1058 2097 4944 5923 0781 6406 2862 0899
8628 0348 2534 2117 0679 8214 8086 5132 8230 6647
0938 4460 9550 5822 3172 5359 4081 2848 1117 4502
8410 2701 9385 2110 5559 6446 2294 8954 9303 8196
[u=22.6333 s=1.05 cu=0.0166667 cs=0.0666667]
--
Scott
lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (12/16/89)
In article <1808@uvaarpa.virginia.edu> schwartz@psuvax1.cs.psu.edu writes:
: Just for fun I compared the speed of perl 2.0 with perl 3.0
: using the pi computing demo by W. Kebsch <nixpbe!kebsch>
: (minus the reporting of intermediate results.)
:
: psuvax1% perl-2 pi.pl 200
: [u=15.7833 s=.8 cu=0.0166667 cs=.116667]
: psuvax1% perl-3 pi.pl 200
: [u=22.6333 s=1.05 cu=0.0166667 cs=0.0666667]
This doesn't surprise me at all. I've done absolutely no optimization
on math operations, and the changes to the run-time system to allow
arrays to be passed around more freely (it's more of a stack machine
now) could certainly adversely affect some of the operations done in pi.pl.
However, I'm better poised now to be able to make a perl-to-C translator.
In fact, I've worried very little about performance for 3.0. My intent
was to get the interface into a stable configuration, and then worry
about performance. Patches which enhance performance cause few problems,
but patches which change the interface can bring on lots of headaches.
I'd be more interested in comparisons of text processing performance.
You'll probably find that some tasks are a lot faster, some are a little
faster, and some are a little slower. Hopefully a net gain.
Larry
tchrist@convex.COM (Tom Christiansen) (12/18/89)
>I'd be more interested in comparisons of text processing performance.
>You'll probably find that some tasks are a lot faster, some are a little
>faster, and some are a little slower.  Hopefully a net gain.

Here are some timings on text handling.  The program is a quickie to
extract all termcap entries that match the command arguments, inspired
by Larry's lib/termcap.pl Tgetent() routine.  Here are the relevant data:

% wc /etc/termcap
    2235    8179  102598 /etc/termcap
% grep -n 'wyse50[:|]' /etc/termcap
2060:ye|w50|wyse50|Wyse 50:\
% cat gent
$| = 1;
$\ = "\n";
for $arg (@ARGV) {
    do gentry($arg);
}
sub gentry {
    local ($entry) = @_;
    $TERMCAP = '/etc/termcap';
    open(TERMCAP) || die "can't open $TERMCAP: $!\n";
    while (<TERMCAP>) {
        next if /^#/;
        next if /^\t/;
        next unless /^(\S*\|)?${entry}[|:]/;
        chop;
        while (chop eq '\\') {
            $_ .= <TERMCAP>;
            chop;
        }
        $_ .= ':';
        s/:\t*:/:/g;
        print;
    }
}

C1 timings: (32 meg)

c-120% time perl2 gent wyse50 wyse50 wyse50 > /dev/null
10.5u 1.9s 0:13 89% 0+0k 33+0io 60pf+0w
c-120% time perl3 gent wyse50 wyse50 wyse50 > /dev/null
6.9u 0.6s 0:08 89% 0+8k 0+1io 67pf+0w

C2 timings: (128 meg)

c-220% time perl2 gent wyse50 wyse50 wyse50 > /dev/null
2.4u 0.7s 0:03 82% 0+0k 25+0io 46pf+0w
c-220% time perl3 gent wyse50 wyse50 wyse50 > /dev/null
1.6u 0.1s 0:01 92% 0+0k 7+0io 51pf+0w

Extended precision C2 timings:

c-220% /bin/time -e perl2 gent wyse50 wyse50 wyse50 > /dev/null
    3.884330 real    2.448687 user    0.760707 sys
c-220% /bin/time -e perl3 gent wyse50 wyse50 wyse50 > /dev/null
    2.737149 real    1.709143 user    0.164598 sys

That means that for THIS application on THESE architectures and THESE
configurations, perl3 runs in 2/3 the user time that perl2 does on both
architectures, and in just 1/3 and 1/5 the system time on the c1 and c2
respectively.  (The difference in system time ratios MAY be because the
c1 was running ConvexOS 7.1 while the c2 had version 8.0.)  That's a
pretty nice overall performance increase in my book.
Oddly enough, on a diskless sun/350 with 4 meg of memory, there was little
variance in user time (5%) but a 50% speedup in system time between perl2
and perl3.

Another interesting note is that if you combine these two statements:

    next if /^#/;
    next if /^\t/;

into

    next if /^#/ || /^\t/;

then your user time goes from 1.7 to 2.0 on the c2 for perl3, but if you
instead make them:

    next if /^[#\t]/;

it doesn't change.  Interesting optimizations going on here somewhere.

What kinds of ratios do people get for other machines?

--tom

DISCLAIMER:  These timings should not be construed to be official
benchmarks from my employer, whom I do not represent in this capacity.
They are presented only to illustrate ratios between perl2 and perl3.

    Tom Christiansen    {uunet,uiucdcs,sun}!convex!tchrist
    Convex Computer Corporation    tchrist@convex.COM
        "EMACS belongs in <sys/errno.h>: Editor too big!"
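[For what it's worth, the two spellings really do accept the same lines; the difference Tom measured is only in how much work each line costs (two pattern matches ORed together versus a single character-class match). A minimal sketch of that equivalence, my own and not from the posts, using made-up termcap-style sample lines:]

```perl
#!/usr/bin/perl
# Hypothetical check (not from the original posts): the ORed pair of
# anchored matches and the single character-class match skip exactly
# the same lines.
@lines = ("# a comment\n", "\t:co#80:li#24:\n", "ye|w50|wyse50|Wyse 50:\\\n");

foreach (@lines) {
    $ored  = (/^#/ || /^\t/) ? 1 : 0;   # two separate regexp matches
    $class = /^[#\t]/        ? 1 : 0;   # one character-class match
    die "forms disagree on: $_" unless $ored == $class;
}
print "both forms skip the same lines\n";
```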
schwartz@psuvax1.cs.psu.edu (Scott Schwartz) (12/18/89)
In article <4047@convex.UUCP> tchrist@convex.COM (Tom Christiansen) writes:
>Oddly enough, on a diskless sun/350 with 4 meg of memory, there was little
>variance in user time (5%) but a 50% speedup in system time between perl2
>and perl3.

On my machine, a Sun4/280 w/ 32M, the termcap entry for wyse50 is matched
by an adds entry close to the beginning.  I ran it for a tvi955 instead,
which is near the end.  Here are four trials of each...

psuvax1% time perl-3 xxx.pl tvi955 > /dev/null
1.160u 0.240s 0:01.63 85% 0+560k 0+0io 0pf+0w
1.060u 0.330s 0:01.46 95% 0+554k 0+0io 0pf+0w
1.120u 0.330s 0:01.75 82% 0+561k 1+0io 0pf+0w
1.120u 0.270s 0:01.55 89% 0+555k 2+0io 0pf+0w

psuvax1% time perl-2 xxx.pl tvi955 > /dev/null
1.040u 0.530s 0:01.74 90% 0+473k 1+0io 0pf+0w
1.230u 0.390s 0:01.79 90% 0+473k 0+0io 0pf+0w
1.190u 0.420s 0:01.74 92% 0+469k 0+0io 0pf+0w
1.160u 0.460s 0:02.98 54% 0+463k 0+0io 0pf+0w

I think this benchmark is not computationally expensive enough to give
good results.  One second of runtime tells nothing, really.
--
Scott Schwartz <schwartz@shire.cs.psu.edu>
"More mips; cheaper mips; never too many." -- John Mashey
tchrist@convex.COM (Tom Christiansen) (12/18/89)
In article <1989Dec18.032836.16434@psuvax1.cs.psu.edu> schwartz@psuvax1.cs.psu.edu (Scott Schwartz) writes:
>I think this benchmark is not computationally expensive enough
>to give good results.  One second of runtime tells nothing, really.

Basically all true.  That's why I picked an entry 2000 lines into the
file and then passed it the same argument 3 times, which you didn't do --
you made it only look once.  I chose it because it did a variety of text
things, like regular expression matching and substitutions and
concatenation.  I made it run on a big file and go through several passes
of the same file to make it run long enough to make the results useful.
I think a bigger problem with it is that we don't all have the same
termcap files.

Here's another termcap benchmark, which exercises split() and associative
arrays.  It also shows about a 1/3 speedup going from perl2 to perl3.
All runs produce this output:

    saw 1365 entries on 2235 lines, 15 duplicates

Here are the timings (both machines had 128meg):

c1% time perl2 tcount.pl < /etc/termcap > /dev/null
4.7u 0.7s 0:05 96% 0+6k 0+1io 150pf+0w
c1% time perl3 tcount.pl < /etc/termcap > /dev/null
3.2u 0.4s 0:03 96% 0+8k 0+2io 132pf+0w
c2% time perl2 tcount.pl < /etc/termcap > /dev/null
1.4u 0.3s 0:01 94% 0+0k 1+1io 137pf+0w
c2% time perl3 tcount.pl < /etc/termcap > /dev/null
0.9u 0.2s 0:01 94% 0+0k 0+1io 116pf+0w

And this was the program:

#!/usr/bin/perl
while (<>) {
    $lines++;
    next if /^[#\s]/;
    chop;
    s/:.*//;
    split(/\|/);
    for (@_) {
        $count++;
        $seen{$_}++;
    }
}
@keys = keys(seen);
printf "saw %d entries on %d lines, %d duplicates\n",
    $count, $lines, $count - $#keys;

Scott may not like it either because it also runs too quickly.  Anybody
want to post a better benchmark?  I'm having trouble finding something
that'll actually run for a long time.  My cfman program does, but it's
totally unsuitable as a benchmark because everyone has different man
pages.
--tom

    Tom Christiansen    {uunet,uiucdcs,sun}!convex!tchrist
    Convex Computer Corporation    tchrist@convex.COM
        "EMACS belongs in <sys/errno.h>: Editor too big!"
flee@shire.cs.psu.edu (Felix Lee) (12/18/89)
Tom Christiansen <tchrist@convex.COM> wrote:
> Anybody want to post a better benchmark?  I'm having trouble finding
> something that'll actually run for a long time.

You guys aren't really seriously into text processing, are you. :-)

Here's timings for a perl script that counts word frequencies.

% time perl-2 wf.pl /etc/termcap >/dev/null
13.3u + 0.9s = 0:15 (95%); (0k+864k)/92k (0+0)io (0f+80r)pg+0sw
% !!
13.4u + 0.7s = 0:14 (98%); (0k+872k)/92k (0+0)io (0f+79r)pg+0sw
% !!
13.3u + 0.8s = 0:14 (100%); (0k+872k)/92k (0+0)io (0f+79r)pg+0sw
% time perl-3 wf.pl /etc/termcap >/dev/null
18.6u + 1.0s = 0:20 (95%); (0k+944k)/84k (0+0)io (0f+73r)pg+0sw
% !!
18.7u + 0.9s = 0:20 (95%); (0k+944k)/84k (0+0)io (0f+72r)pg+0sw
% !!
18.7u + 0.9s = 0:20 (94%); (0k+944k)/84k (0+0)io (0f+73r)pg+0sw

This is on a Sun-4.  /etc/termcap is 146k, about 32000 total words,
about 2000 different words, average word length is 3 chars.

If you want worse behavior, try /usr/dict/words.  About 24000 words,
every one unique, average length 7 chars.  I get 103.0u for perl-2 and
158.2u for perl-3.

If you eliminate the simple arithmetic in the script, perl-3 performs a
little better, but still worse than perl-2.

Here's the script.

#!/usr/bin/perl
# Count word frequency.
while (<>) {
    foreach $k (split(/[^a-zA-Z]+/)) {
        $k =~ tr/A-Z/a-z/, ++$freq{$k} if ($k);
    }
}
foreach $k (sort downfreq keys(freq)) {
    printf "%5d %s\n", $freq{$k}, $k;
}
sub downfreq {
    ($freq{$b} - $freq{$a}) || ($a gt $b);
}
--
Felix Lee    flee@shire.cs.psu.edu    *!psuvax1!flee
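[Felix's downfreq is the standard sort-by-count idiom: a numeric difference sorts descending by frequency, and a string comparison breaks ties. A tiny self-contained sketch of the same comparator shape, mine and with made-up counts rather than termcap data:]

```perl
#!/usr/bin/perl
# Hypothetical data (not from the post) run through the same comparator
# shape as wf.pl's downfreq: descending count first, names on ties.
%freq = ('the', 5, 'cat', 3, 'mat', 1);    # old-style list assignment
sub downfreq { ($freq{$b} - $freq{$a}) || ($a gt $b); }
foreach $k (sort downfreq keys(%freq)) {
    printf "%5d %s\n", $freq{$k}, $k;
}
# prints:
#     5 the
#     3 cat
#     1 mat
```

[Note that `$a gt $b` yields only 1 or the empty string, never -1, so words with equal counts compare as equal and land in whatever order sort happened to see them.]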
lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (12/19/89)
In article <1989Dec18.112735.4443@psuvax1.cs.psu.edu> flee@shire.cs.psu.edu (Felix Lee) writes:
: Here's timings for a perl script that counts word frequencies.
: % time perl-2 wf.pl /etc/termcap >/dev/null
: 13.3u + 0.9s = 0:15 (95%); (0k+864k)/92k (0+0)io (0f+80r)pg+0sw
: % !!
: 13.4u + 0.7s = 0:14 (98%); (0k+872k)/92k (0+0)io (0f+79r)pg+0sw
: % !!
: 13.3u + 0.8s = 0:14 (100%); (0k+872k)/92k (0+0)io (0f+79r)pg+0sw
: % time perl-3 wf.pl /etc/termcap >/dev/null
: 18.6u + 1.0s = 0:20 (95%); (0k+944k)/84k (0+0)io (0f+73r)pg+0sw
: % !!
: 18.7u + 0.9s = 0:20 (95%); (0k+944k)/84k (0+0)io (0f+72r)pg+0sw
: % !!
: 18.7u + 0.9s = 0:20 (94%); (0k+944k)/84k (0+0)io (0f+73r)pg+0sw
:
:
: This is on a Sun-4. /etc/termcap is 146k, about 32000 total words,
: about 2000 different words, average word length is 3 chars.
:
: If you want worse behavior, try /usr/dict/words. About 24000 words,
: every one unique, average length 7 chars. I get 103.0u for perl-2 and
: 158.2u for perl-3.
:
: Here's the script.
:
: #!/usr/bin/perl
: # Count word frequency.
: while (<>) {
:     foreach $k (split(/[^a-zA-Z]+/)) {
:         $k =~ tr/A-Z/a-z/, ++$freq{$k} if ($k);
:     }
: }
: foreach $k (sort downfreq keys(freq)) {
:     printf "%5d %s\n", $freq{$k}, $k;
: }
: sub downfreq {
:     ($freq{$b} - $freq{$a}) || ($a gt $b);
: }
This particular script is exercising almost none of the constructs that
were sped up in perl 3, and several of the constructs that were slowed
down.
In particular, the sorting is probably a little slower for a couple of
reasons. First, subroutine calls run a little slower due to the code
to handle array returns. Second, associative array references are a
bit slower due to the check for dbm arrays, and making sure associative
arrays don't create themselves when checked by the "defined" function.
The foreach is also a bit slower due to allowing for nested references
to the same array.
Disclaimer: the above is merely well-informed speculation. Profiling might
well pinpoint some other culprit.
Larry
flee@shire.cs.psu.edu (Felix Lee) (12/19/89)
Larry Wall <lwall@jpl-devvax.JPL.NASA.GOV> wrote:
> This particular script is exercising almost none of the constructs
> that were sped up in perl 3, and several of the constructs that were
> slowed down.

Yes, I suspected as much, especially the associative array references.

I offered that script as an example of a typically expensive text
application.  Most perl-as-a-report-generator applications aren't going
to be nearly as expensive, but will do much the same thing.  Assocs are
more useful than normal arrays for report generation; they're natural
for tabulating data.  (That's why the reverse-cat in awk is so slow:
awk has assocs, not regular arrays.)

But anyway, the enhancements in version 3 are great for all those
writing serious applications (like newsreaders:-) in Perl.
--
Felix Lee    flee@shire.cs.psu.edu    *!psuvax1!flee