tchrist@convex.COM (Tom Christiansen) (12/20/90)
After reading a comment from Barry Shein on his having written a complete
wc program in just 6 lines of Snobol in alt.religion.computers, I set out
to do so in perl. Well, I'm down to 7 lines of perl, but without using
comma-kludges, I can't quite trim off one more line. I've only thought
about it for 5 minutes or so, I do admit. Maybe I'm missing something
obvious.
#!/usr/bin/perl -n
$tchars += $chars += length;
$twords += $words += s/(\S+)/$&/g;
next unless eof;
printf "%8d %8d %8d %s\n", $., $words, $chars, ($ARGV eq '-'?'':$ARGV);
$tlines += ($.+0) + ($. = !reset wc);
next unless $files++ && eof();
printf "%8d %8d %8d %s\n", $tlines, $twords, $tchars, "total";
--tom
--
Tom Christiansen tchrist@convex.com convex!tchrist
"With a kernel dive, all things are possible, but it sure makes it hard
to look at yourself in the mirror the next morning." -me
tchrist@convex.COM (Tom Christiansen) (12/20/90)
I save a LOT of time if I make the part that counts words throw out the
string instead of save it:
$twords += $words += s/\S+//g;
I go from 1.7 seconds on /etc/termcap to 1.0 seconds. It's still
a long ways from the C wc's 0.25 seconds, but not too shabby either.
Anybody got a cleverer (and faster) algorithm?
I sometimes wish I could just count the number of occurrences without
having to make a new string. Not very often, though.
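For what it's worth, later Perls can do this. A minimal sketch, assuming Perl 5 (the `count_words` helper name is made up for illustration): a global match in list context returns every match, and assigning that list to an empty list in scalar context yields the number of matches, leaving the string untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: count runs of non-whitespace in a string
# without modifying the string.
sub count_words {
    my ($line) = @_;
    # The list-context match returns every match; the assignment to (),
    # evaluated in scalar context, gives the number of elements matched.
    my $count = () = $line =~ /\S+/g;
    return $count;
}

my $line = "the quick  brown fox\n";
printf "%d words, line is still %d chars long\n",
    count_words($line), length $line;
```

Whether this beats s/\S+//g on speed would need measuring; the point is only that no replacement string gets built.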
--tom
--
Tom Christiansen tchrist@convex.com convex!tchrist
"With a kernel dive, all things are possible, but it sure makes it hard
to look at yourself in the mirror the next morning." -me
tchrist@convex.COM (Tom Christiansen) (12/21/90)
From the keyboard of bzs@world.std.com (Barry Shein):
:Imagine, in perl, if you could insert any expression midway into a
:pattern so whenever whitespace was hit you could increment $words
:right there, let's say "/(\S+)@$words++@/" was a pattern which
:incremented $words every time a space run was found, that's a common
:thing in snobol. Your example is darn close, tho, at least the loop
:has been eliminated via the use of /g, that was the spirit of the
:thing.
:
:Just various ways to find mapcar nirvana...
(Barry, are you sure you're not just leading me on? :-)
Permit me to introduce you to an eval in another guise, the /e modifier:
[this one even works -- the last one had a bug. :-(]
0 #!/usr/bin/perl -n
1 $chars += length;
2 s/\S+/$words++/eg;
3 next unless eof;
4 printf "%8d %8d %8d %s\n", $., $words, $chars, ($ARGV eq '-'?'':$ARGV);
5 $tlines += $.; $twords += $words; $tchars += $chars; reset 'wc'; $. = 0;
6 next unless $files++ && eof();
7 printf "%8d %8d %8d %s\n", $tlines, $twords, $tchars, "total";
While evals are pretty neat, this one does slow things down a lot. Putting
it in the line loop like that makes this program run 3.5 times longer
than with line two as simply:
$words += s/\S+//g;
Better to count when all done in this case, but there are lots of more
complex and interesting things you can do with /e easily. Let's say you
also want to create a count of all the distinct words and then print them
out in descending numeric order by count at the end of each file:
#!/usr/bin/perl -n
s/\S+/$saw{$&}++/eg;
next unless eof;
sub down { $saw{$b} <=> $saw{$a}; }
for (sort down keys %saw) { printf "%8d %s\n", $saw{$_}, $_; };
Is that the kind of thing you are looking for, Barry?
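The same /e tally can be sketched as a self-contained script under `use strict`, assuming Perl 5. The `word_counts` helper and the sample sentence are invented for illustration; the idea is only that the replacement side of s///e is run as code once per match, so the hash increment happens for every word.

```perl
use strict;
use warnings;

# Hypothetical helper: tally each distinct word in a block of text.
sub word_counts {
    my ($text) = @_;
    my %saw;
    # /e evaluates the replacement as Perl code for every match, so the
    # increment runs once per word. The substituted text is thrown away
    # because $text is a private copy of the argument.
    $text =~ s/(\S+)/$saw{$1}++/eg;
    return \%saw;
}

my $saw = word_counts("to be or not to be\n");

# Descending by count, ties broken alphabetically so the order is stable.
for my $w (sort { $saw->{$b} <=> $saw->{$a} || $a cmp $b } keys %$saw) {
    printf "%8d %s\n", $saw->{$w}, $w;
}
```

Using the captured $1 rather than $& is a choice made here because $& has historically carried a speed penalty in Perl 5; the tally is the same either way.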
We've strayed pretty far from religion here (I think:-), so I'm redirecting
followups back into comp.lang.perl.
--tom
--
Tom Christiansen tchrist@convex.com convex!tchrist
"With a kernel dive, all things are possible, but it sure makes it hard
to look at yourself in the mirror the next morning." -me
tchrist@convex.COM (Tom Christiansen) (12/21/90)
Shorter still:
0 #!/usr/bin/perl -n
1 $chars += length, $words += s/\S+//eg, next unless eof;
2 printf "%8d %8d %8d %s\n", $., $words, $chars, ($ARGV eq '-'?'':$ARGV);
3 $tlines += $.; $twords += $words; $tchars += $chars;reset 'wc'; $. = 0;
4 printf "%8d %8d %8d %s\n", $tlines, $twords, $tchars, "total" if $files++ && eof();
Thanks to Randal for squishing up the last few lines.
--tom
--
Tom Christiansen tchrist@convex.com convex!tchrist
"With a kernel dive, all things are possible, but it sure makes it hard
to look at yourself in the mirror the next morning." -me
merlyn@iwarp.intel.com (Randal L. Schwartz) (12/22/90)
In article <1990Dec21.110952.27897@convex.com>, tchrist@convex (Tom Christiansen) writes:
| 3 $tlines += $.; $twords += $words; $tchars += $chars;reset 'wc'; $. = 0;
I don't like the 'wc' in there. Here's a weirder way, that actually says
what you are doing more clearly:
($tlines, $twords, $tchars, $., $words, $chars) =
($tlines + $., $twords + $words, $tchars + $chars, 0, 0, 0);
(With short enough variable names, this fits on one line easily.)
@a[3,2,1,0] = ("hacker,","Perl","another","Just"); print "@a"
--
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/
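The accumulate-and-reset list assignment can be tried in isolation. A small sketch, assuming Perl 5 under `use strict`: the numbers are made up, and an ordinary variable `$lines` stands in for `$.` so the snippet runs without reading any input.

```perl
use strict;
use warnings;

# Running totals and per-file counters; the values are invented.
my ($tlines, $twords, $tchars) = (10, 50, 300);
my ($lines,  $words,  $chars)  = (3, 12, 80);

# The whole right-hand list is evaluated before any assignment happens,
# so the totals see the old per-file values even though those same
# variables are zeroed by this one statement.
($tlines, $twords, $tchars, $lines, $words, $chars) =
    ($tlines + $lines, $twords + $words, $tchars + $chars, 0, 0, 0);

print "$tlines $twords $tchars | $lines $words $chars\n";
# prints "13 62 380 | 0 0 0"
```

This spells out each counter by name, so nothing depends on `reset 'wc'` clearing every variable whose name starts with w or c.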