[comp.lang.perl] I'm lazy, can this be done somehow???

harless@sdd.hp.com (Mike Harless) (01/26/91)

What I'd like to do is suck up a file into an array, 
read another file to see how many lines I've already read (if any),
and then grep strings out of what's new.  The file of lines may have been 
zero'd out since the last time I read it, so I can't just start
reading after skipping so many lines.

For example, if I've read a file into the array @lines, and then found
out that I've already looked at the first 400 lines last time I ran, I'd 
like to do something like:

	$[ = 400 ;
	@found = grep(/whatever/, @lines) ;

and only have grep work on the lines after the first 400.  I know that
I could do this by using an index into @lines, but was trying to do
things elegantly.  I've heard rumors that using indices into arrays isn't 
the perl way! :-)  Is there something simple that I missed?


				...Mike

weisberg@hpcc01.HP.COM (Len Weisberg) (01/29/91)

Setting $[ does not set a pointer into the array.
Rather, it changes the "names" (loosely speaking) by which you refer to
elements of the array.

The following debug session should illustrate:

  DB<1> @ary=('a', 'b', 'c', 'd', 'e', 'f', 'g');

  DB<2> p $[
0
  DB<3> p $ary[2];
c
  DB<4> $[=4;

  DB<9> p $ary[2];

  DB<10> p $ary[6];
c
  DB<11> q


- Len Weisberg - HP Corp Computing & Services - weisberg@corp.HP.COM

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (01/29/91)

In article <1991Jan26.154656.4794@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
: As quoted from <1991Jan26.012646.4937@sdd.hp.com> by harless@sdd.hp.com (Mike Harless):
: +---------------
: | 	$[ = 400 ;
: | 	@found = grep(/whatever/, @lines) ;
: | 
: | and only have grep work on the lines after the first 400.  I know that
: | I could do this by using an index into @lines, but was trying to do
: | things elegantly.  I've heard rumors that using indices into arrays isn't 
: | the perl way! :-)  Is there something simple that I missed?
: +---------------
: 
: Use a slice:
: 
: 	@found = grep(/pattern/, @lines[$start .. $#lines]);
: 
: (Note the @.  @name[...] and $name[...] are different.)
: 
: Although I do admit that I've thought $[ should work like this.

$[ only changes the offset to the first element.  It doesn't affect the
contents of the array at all.

Another way to get rid of the first 400 lines is to say

	splice(@lines,0,400);
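
A minimal sketch of the difference between the slice and the splice, assuming a small made-up @lines and a hypothetical $start holding the number of lines already seen. The slice hands grep a copy of the tail and leaves @lines alone; splice removes the head of @lines in place.

```perl
use strict;
use warnings;

# Made-up data for illustration: ten short "lines".
my @lines = map { "line $_" } 1 .. 10;
my $start = 4;    # hypothetical count of lines already processed

# Slice: grep sees only elements $start .. $#lines; @lines is unchanged.
my @from_slice = grep { /^line (?:5|10)$/ } @lines[$start .. $#lines];
my $count_after_slice = scalar @lines;

# splice: drops the first $start elements from @lines itself.
splice(@lines, 0, $start);
my @from_splice = grep { /^line (?:5|10)$/ } @lines;
my $count_after_splice = scalar @lines;
```

Both greps find the same lines; the choice comes down to whether the rest of the program still needs the lines that were skipped.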

In general, though, problems like this are best handled by recording
the byte offset to start up again and then seeking to the correct location.
Slurping in the front of the file when you aren't interested in it is bound
to be somewhat boring.
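
The byte-offset approach might be sketched as follows. The sub name new_lines and the state-file layout (a single line holding the offset) are assumptions made for illustration, not anything from the thread. The sketch treats a file that is now shorter than the saved offset as having been zeroed; it cannot detect a file that was zeroed and has since grown past the old offset.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines added to $log since the byte
# offset recorded in $state, then record the new offset.
sub new_lines {
    my ($log, $state) = @_;

    # Read the offset saved by the previous run, if there was one.
    my $offset = 0;
    if (open(my $st, '<', $state)) {
        my $saved = <$st>;
        close($st);
        $offset = $1 if defined $saved && $saved =~ /^(\d+)/;
    }

    open(my $fh, '<', $log) or die "can't open $log: $!";

    # A file shorter than the saved offset was presumably zeroed out,
    # so start again from the beginning.
    my $size = -s $fh;
    $offset = 0 if $offset > $size;

    # Whence 0 means "relative to the start of the file".
    seek($fh, $offset, 0) or die "can't seek in $log: $!";
    my @lines = <$fh>;
    my $end   = tell($fh);
    close($fh);

    # Remember where to resume next time.
    open(my $out, '>', $state) or die "can't write $state: $!";
    print $out "$end\n";
    close($out);

    return @lines;
}
```

A caller would then write something like @found = grep(/whatever/, new_lines($log, $state)), with both file names supplied by the caller.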

Another tack I've taken on huge log files is to search backwards from the end
of the file.  Do that scan in large hunks until you get to a point before where
you want to be, then scan forward normally for the line to start on.  This
presumes there's some kind of ordered key, such as the date in the news
history file.
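
A rough sketch of that backward scan, under two assumptions that are not in the thread: every line begins with a key that sorts correctly as a plain string, and the lines are already in ascending key order. The sub name find_start and the hunk size parameter are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the byte offset of the first line whose
# leading key is ge $target, stepping back from the end $hunk bytes at a time.
sub find_start {
    my ($fh, $target, $hunk) = @_;
    my $pos = -s $fh;

    # Step backwards until the first whole line after $pos sorts before
    # the target, or until the start of the file is reached.
    while ($pos > 0) {
        $pos = $pos > $hunk ? $pos - $hunk : 0;
        seek($fh, $pos, 0) or die "can't seek: $!";
        my $partial = <$fh> if $pos > 0;    # skip the line we landed inside
        my $line = <$fh>;
        last if defined $line && $line lt $target;
    }

    # Scan forward normally from that point for the line to start on.
    seek($fh, $pos, 0) or die "can't seek: $!";
    my $skipped = <$fh> if $pos > 0;
    while (1) {
        my $at   = tell($fh);
        my $line = <$fh>;
        return $at if !defined $line || $line ge $target;
    }
}
```

If no line reaches the target, the offset returned is the end of the file, so a subsequent read simply finds nothing new.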

Larry

evans@decvax.DEC.COM (Marc Evans) (01/29/91)

In article <1991Jan26.012646.4937@sdd.hp.com>, harless@sdd.hp.com (Mike Harless) writes:
|> 
|> 
|> What I'd like to do is suck up a file into an array, 
|> read another file to see how many lines I've already read (if any),
|> and then grep strings out of what's new.  The file of lines may have been 
|> zero'd out since the last time I read it, so I can't just start
|> reading after skipping so many lines.
|> 
|> For example, if I've read a file into the array @lines, and then found
|> out that I've already looked at the first 400 lines last time I ran, I'd 
|> like to do something like:
|> 
|> 	$[ = 400 ;
|> 	@found = grep(/whatever/, @lines) ;
|> 
|> and only have grep work on the lines after the first 400.  I know that
|> I could do this by using an index into @lines, but was trying to do
|> things elegantly.  I've heard rumors that using indices into arrays isn't 
|> the perl way! :-)  Is there something simple that I missed?

Shouldn't you be able to do this using the range construct, as in:

	@found = grep(/whatever/,@lines[400 .. $#lines]);

- Marc
-- 
===========================================================================
Marc Evans - WB1GRH - evans@decvax.DEC.COM  | Synergytics     (603)635-8876
      Unix and X Software Contractor        | 21 Hinds Ln, Pelham, NH 03076
===========================================================================

allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (01/31/91)

As quoted from <11210@jpl-devvax.JPL.NASA.GOV> by lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall):
+---------------
| In article <1991Jan26.154656.4794@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
| : Although I do admit that I've thought $[ should work like this.
| 
| $[ only changes the offset to the first element.  It doesn't affect the
| contents of the array at all.
+---------------

I don't want it to change the contents, just to change the "name" of the first
element.  Consider "$[ = -400":  its meaning would be "the first element of
an array should be called `-400'".  The implementation would be that $[ would
be subtracted from the index provided in the program to get the *real* index.

Of course, this would be more useful as a per-array attribute instead of a
global.
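
The per-array idea can be approximated without touching the global $[ at all. This is only a sketch: the sub name element_at and the explicit $base argument are hypothetical, standing in for the per-array attribute described above.

```perl
use strict;
use warnings;

# Hypothetical helper: look up $index in @$array as if the first element
# were called $base, by subtracting $base to get the real index.
sub element_at {
    my ($array, $base, $index) = @_;
    my $real = $index - $base;

    # Names that fall outside the array yield undef rather than wrapping
    # around (a negative real index would otherwise count from the end).
    return undef if $real < 0 || $real > $#$array;
    return $array->[$real];
}

my @ary = ('a', 'b', 'c', 'd', 'e', 'f', 'g');
my $with_base_4    = element_at(\@ary, 4, 6);          # real index 2
my $with_base_neg  = element_at(\@ary, -400, -400);    # real index 0
my $below_the_base = element_at(\@ary, 4, 2);          # real index -2
```

With a base of 4, the name 6 refers to 'c', the same element Len's debugger session printed for $ary[6], and the name 2 refers to nothing.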

++Brandon
-- 
Me: Brandon S. Allbery			    VHF/UHF: KB8JRR on 220, 2m, 440
Internet: allbery@NCoast.ORG		    Packet: KB8JRR @ WA8BXN
America OnLine: KB8JRR			    AMPR: KB8JRR.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery    Delphi: ALLBERY

allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (01/31/91)

As quoted from <1180011@hpcc01.HP.COM> by weisberg@hpcc01.HP.COM (Len Weisberg):
+---------------
| Setting $[ does not set a pointer into the array.
| Rather, it changes the "names" (loosely speaking) by which you refer to
| elements of the array.
| 
|   DB<4> $[=4;
+---------------

When did the values stop being restricted to 0 and 1?  *This* is exactly what
I was asking for, myself.

++Brandon

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (02/01/91)

In article <1991Jan30.235116.17942@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
: As quoted from <1180011@hpcc01.HP.COM> by weisberg@hpcc01.HP.COM (Len Weisberg):
: +---------------
: | Setting $[ does not set a pointer into the array.
: | Rather, it changes the "names" (loosely speaking) by which you refer to
: | elements of the array.
: | 
: |   DB<4> $[=4;
: +---------------
: 
: When did the values stop being restricted to 0 and 1?  *This* is exactly what
: I was asking for, myself.

Uh, it never stopped being restricted to 0 and 1 because it never started.
It was always that way.

Larry

weisberg@hpcc01.HP.COM (Len Weisberg) (02/01/91)

Brandon S. Allbery writes:
> |   DB<4> $[=4;
> 
> When did the values stop being restricted to 0 and 1?  *This* is exactly what
> I was asking for, myself.

The manpage (actually the info file) says:

$[
     The index of the first element in an array, and of the first
     character in a substring.  Default is 0, but you could set it to 1
     to make *perl* behave more like `awk' (or Fortran) when
     subscripting and when evaluating the `index()' and `substr()'
     functions.  (Mnemonic: `[' begins subscripts.)


This mentions 0 and 1 as the two most noteworthy values, but doesn't really
say that it is restricted to those values.  If you were misled, others probably
have been too.  Maybe a slight rewording would be helpful:


$[ 
     The index of the first element in an array, and of the first
     character in a substring.  The value is an integer with default
     of 0.  You could set it to 1 to make ...


- Len Weisberg - HP Corp Computing & Services - weisberg@corp.HP.COM