[comp.lang.perl] what should length

emv@ox.com (Ed Vielmetti) (01/10/91)

I had some code that looked like

body: {
    @obody = <STDIN>;
}

which seemed to work just fine and quick too.  Now I want to find out
how many bytes are in this array.  The obvious (?) solution to me was
    $obodylen = length(@obody);
but that gets me "3", probably the size of some symbol table entry.

The solution I now have does

body: while (<STDIN>) {
    push(@obody,$_) ;
    $obodylen += length($_);
}

seems to work fine, but in the tradition of perl hacking, there's
probably a better way.

--Ed
emv@ox.com

tchrist@convex.COM (Tom Christiansen) (01/10/91)

From the keyboard of emv@ox.com (Ed Vielmetti):

:I had some code that looked like
:body: {
:    @obody = <STDIN>;
:}
:which seemed to work just fine and quick too.  Now I want to find out
:how many bytes are in this array.  The obvious (?) solution to me was
:    $obodylen = length(@obody);
:but that gets me "3", probably the size of some symbol table entry.

Remembering that @foo in a scalar context yields the number of elements,
I'll bet you that you read in between 100 and 999 elements in the array.
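
To make that concrete, here is a minimal sketch. The 150-line array is made-up data standing in for whatever <STDIN> returned, and the variable names $count, $digits and $bytes are illustrative only. It shows that taking length() of the array in scalar context measures the digits of the element count, not the bytes in the lines.

```perl
# Hypothetical data: 150 identical lines standing in for the real input.
my @obody = ("some line of text\n") x 150;

my $count  = @obody;           # array in scalar context: number of elements
my $digits = length($count);   # length of that number taken as a string

# What was actually wanted: the sum of the lengths of the elements.
my $bytes = 0;
$bytes += length($_) for @obody;

print "elements: $count, length of count: $digits, total bytes: $bytes\n";
```

Any element count from 100 through 999 gives a 3 here, which is where the guess about the array size comes from.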

:The solution I now have does
:body: while (<STDIN>) {
:    push(@obody,$_) ;
:    $obodylen += length($_);
:}

:seems to work fine, but in the tradition of perl hacking, there's
:probably a better way.

A better way?  No, but in the tradition of perl hacking there are a googol
of *other* ways. :-)  The way you have there looks pretty good, since
you've got each line to grab the length of as you're going.

If I had to write an array length function the way you've described it,
I'd probably use something like:

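    sub alen {  # usage: $n = &alen(*ary); where @ary exists
        local(*a) = @_;             # alias @a to the caller's array via its typeglob
        local($len,$_);
        grep($len += length, @a);   # grep is used only for its side effect on $len
        $len;                       # last expression is the return value
    }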
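
For readers on Perl 5 or later, the same idea can be sketched with an array reference instead of a typeglob. This is an assumption about a newer perl than the one discussed in this thread, and the name alen_ref and the sample data are hypothetical, not part of the original post.

```perl
use strict;
use warnings;

# Sum the lengths of all elements of the array referred to by $aref.
sub alen_ref {
    my ($aref) = @_;
    my $len = 0;
    $len += length($_) for @$aref;
    return $len;
}

my @ary = ("ab\n", "cde\n");     # 3 + 4 characters
my $total = alen_ref(\@ary);     # pass a reference, not a copy of the array
print "$total\n";
```

Passing a reference avoids copying the array, which is the same thing the typeglob alias achieves above.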

--tom

merlyn@iwarp.intel.com (Randal L. Schwartz) (01/10/91)

In article <EMV.91Jan9211727@crane.aa.ox.com>, emv@ox (Ed Vielmetti) writes:
|     $obodylen = length(@obody);
| but that gets me "3", probably the size of some symbol table entry.

My wild guess is that @obody has between 100 and 999 entries in it,
and you are looking at the length of the number of entries as
converted to a string!  Am I right?

| The solution I now have does
| 
| body: while (<STDIN>) {
|     push(@obody,$_) ;
|     $obodylen += length($_);
| }
| 
| seems to work fine, but in the tradition of perl hacking, there's
| probably a better way.

If you don't like that, but you don't mind concat-ing the whole mess
once just to throw it away, try:

body: {
	@obody = <STDIN>;
	$obodylen = length(join("",@obody));
}

If you think that obody is huge, try something like:

body: {
	@obody = <STDIN>;
	$obodylen = 0;
	for (@obody) {
		$obodylen += length;
	}
}

Yeah, actually, that's close to your solution.  You could also do it
with a grep() (my formerly favorite operator, until I got tired of it
:-) as in:

body: {
	@obody = <STDIN>;
	$obodylen = 0;
	grep($obodylen += length, @obody);
}

which probably isn't any faster than the previous code.  (Larry?)

print join("", "Just a", "nother P", "erl ha", "cker,")
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (01/11/91)

In article <1991Jan10.063409.7185@iwarp.intel.com> merlyn@iwarp.intel.com (Randal L. Schwartz) writes:
: If you think that obody is huge, try something like:
: 
: body: {
: 	@obody = <STDIN>;
: 	$obodylen = 0;
: 	for (@obody) {
: 		$obodylen += length;
: 	}
: }
: 
: Yeah, actually, that's close to your solution.  You could also do it
: with a grep() (my formerly favorite operator, until I got tired of it
: :-) as in:
: 
: body: {
: 	@obody = <STDIN>;
: 	$obodylen = 0;
: 	grep($obodylen += length, @obody);
: }
: 
: which probably isn't any faster than the previous code.  (Larry?)

It's pretty much a dead heat.

I ran both five times on /etc/termcap and got:

Slurp + foreach:
2.3u 0.8s 0:10 29% 335+1119k 23+0io 30pf+0w
2.2u 0.6s 0:08 36% 220+1134k 22+0io 68pf+0w
2.4u 0.6s 0:06 44% 161+1128k 21+0io 5pf+0w
2.1u 0.8s 0:10 29% 163+1130k 23+0io 5pf+0w
2.3u 0.6s 0:06 43% 163+1164k 22+0io 5pf+0w

Slurp + grep:
2.3u 0.7s 0:06 51% 164+1164k 23+0io 6pf+0w
2.2u 0.7s 0:05 57% 161+1118k 23+0io 6pf+0w
2.3u 0.7s 0:05 55% 160+1101k 21+0io 5pf+0w
2.2u 0.7s 0:06 44% 161+1162k 23+0io 5pf+0w
2.1u 0.8s 0:06 44% 160+1115k 23+0io 5pf+0w

Now, just so you know what kind of overhead you pay for the slurp, here's
the original solution that just iterates over the lines and accumulates
lengths into a variable:
0.9u 0.2s 0:03 30% 206+134k 23+0io 63pf+0w
0.8u 0.1s 0:01 63% 157+137k 21+0io 5pf+0w
0.9u 0.2s 0:02 50% 159+139k 21+0io 5pf+0w
0.9u 0.1s 0:02 51% 156+138k 21+0io 5pf+0w
0.9u 0.1s 0:01 52% 157+139k 21+0io 5pf+0w

Moral: don't slurp files if you don't need to.

Larry
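
A comparison along these lines can be repeated with the core Benchmark module. What follows is only a sketch under stated assumptions: it needs Perl 5.8 or later for in-memory filehandles, it reads made-up data from a string rather than /etc/termcap, the streaming version does not keep the lines in an array, and the names slurp_sum and stream_sum are hypothetical. The numbers it prints will depend on the machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input held in memory so the sketch needs no external file.
my $data = "a line of made-up text\n" x 5000;

# Read every line into an array first, then add up the lengths.
sub slurp_sum {
    open(my $fh, '<', \$data) or die "open: $!";
    my @obody = <$fh>;
    my $len = 0;
    $len += length($_) for @obody;
    return $len;
}

# Add up the lengths one line at a time without keeping the lines.
sub stream_sum {
    open(my $fh, '<', \$data) or die "open: $!";
    my $len = 0;
    while (my $line = <$fh>) {
        $len += length($line);
    }
    return $len;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, { slurp => \&slurp_sum, stream => \&stream_sum });
```

Both subs return the same total, so any difference in the table reflects only the cost of building the array.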