tchrist@convex.COM (Tom Christiansen) (12/20/90)
After reading a comment from Barry Shein on his having written a complete wc program in just 6 lines of Snobol in alt.religion.computers, I set out to do so in perl. Well, I'm down to 7 of perl, but without using comma-kludges, I can't quite trim off one more line. I've only thought about it for 5 minutes or so, I do admit. Maybe I'm missing something obvious. #!/usr/bin/perl -n $tchars += $chars += length; $twords += $words += s/(\S+)/$&/g; next unless eof; printf "%8d %8d %8d %s\n", $., $words, $chars, ($ARGV eq '-'?'':$ARGV); $tlines += ($.+0) + ($. = !reset wc); next unless $files++ && eof(); printf "%8d %8d %8d %s\n", $tlines, $twords, $tchars, "total"; --tom -- Tom Christiansen tchrist@convex.com convex!tchrist "With a kernel dive, all things are possible, but it sure makes it hard to look at yourself in the mirror the next morning." -me
tchrist@convex.COM (Tom Christiansen) (12/20/90)
I save a LOT of time if I make the part that counts words throw out the string instead of save it: $twords += $words += s/\S+//g; I go from 1.7 seconds on /etc/termcap to 1.0 seconds. It's still a long ways from the C wc's 0.25 seconds, but not too shabby either. Anybody got a cleverer (and faster) algorithm? I sometimes wish I could just count the number of occurrences without having to make a new string. Not very often, though. --tom -- Tom Christiansen tchrist@convex.com convex!tchrist "With a kernel dive, all things are possible, but it sure makes it hard to look at yourself in the mirror the next morning." -me
tchrist@convex.COM (Tom Christiansen) (12/21/90)
From the keyboard of bzs@world.std.com (Barry Shein): :Imagine, in perl, if you could insert any expression midway into a :pattern so whenever whitespace was hit you could increment $words :right there, let's say "/(\S+)@$words++@/" was a pattern which :incremented $words every time a space run was found, that's a common :thing in snobol. Your example is darn close, tho, at least the loop :has been eliminated via the use of /g, that was the spirit of the :thing. : :Just various ways to find mapcar nirvana... (Barry, are you sure you're not just leading me on? :-) Permit me to introduce you to an eval in another guise, the /e modifier: [this one even works -- the last one had a bug. :-(] 0 #!/usr/bin/perl -n 1 $chars += length; 2 s/\S+/$words++/eg; 3 next unless eof; 4 printf "%8d %8d %8d %s\n", $., $words, $chars, ($ARGV eq '-'?'':$ARGV); 5 $tlines += $.; $twords += words; $tchars += $chars; reset 'wc'; $. = 0; 6 next unless $files++ && eof(); 7 printf "%8d %8d %8d %s\n", $tlines, $twords, $tchars, "total"; While evals are pretty neat, this one does slow things down a lot. Putting it in the line loop like that makes this program run 3.5 times longer than with line two as simply: $words += s/\S+//g; Better to count when all done in this case, but there are lots of more complex and interesting things you can do with /e easily. Let's say you also want to create a count of all the distinct words and then print them out in descending numeric order by count at the end of each file: #!/usr/bin/perl -n s/\S+/$saw{$&}++/eg; next unless eof; sub down { $saw{$b} <=> $saw{$a}; } for (sort down keys %saw) { printf "%8d %s\n", $saw{$_}, $_; }; Is that the kind of thing you are looking for, Barry? We've strayed pretty from religion here (I think:-), so I'm redirecting followups back into comp.lang.perl. --tom -- Tom Christiansen tchrist@convex.com convex!tchrist "With a kernel dive, all things are possible, but it sure makes it hard to look at yourself in the mirror the next morning." -me
tchrist@convex.COM (Tom Christiansen) (12/21/90)
Shorter stil: 0 #!/usr/bin/perl -n 1 $chars += length, $words += s/\S+//eg, next unless eof; 2 printf "%8d %8d %8d %s\n", $., $words, $chars, ($ARGV eq '-'?'':$ARGV); 3 $tlines += $.; $twords += $words; $tchars += $chars;reset 'wc'; $. = 0; 4 printf "%8d %8d %8d %s\n", $tlines, $twords, $tchars, "total" if $files++ && eof(); Thanks to Randal for squishing up the last few lines. --tom -- Tom Christiansen tchrist@convex.com convex!tchrist "With a kernel dive, all things are possible, but it sure makes it hard to look at yourself in the mirror the next morning." -me
merlyn@iwarp.intel.com (Randal L. Schwartz) (12/22/90)
In article <1990Dec21.110952.27897@convex.com>, tchrist@convex (Tom Christiansen) writes: | 3 $tlines += $.; $twords += $words; $tchars += $chars;reset 'wc'; $. = 0; I don't like the 'wc' in there. Here's a weirder way, that actually says what you are doing more clearly: ($tlines, $twords, $tchars, $., $words, $chars) = ($tlines + $., $twords + $words, $tchars + $chars, 0, 0, 0); (With short enough variable names, this fits on one line easily.) @a[3,2,1,0] = ("hacker,","Perl","another","Just"); print "@a" -- /=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\ | on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III | | merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn | \=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/