[comp.lang.perl] Reading a file into a string

avi@taux01.nsc.com (Avi Bloch) (01/28/91)

How do I read an entire file into a string, other than the obvious way of
reading it line by line and concatenating the lines?

Please email since I don't usually have the time to read this newsgroup.

Thanks

Avi Bloch
National Semiconductor (Israel)
avi@taux01.nsc.com
-- 
	Avi Bloch
National Semiconductor (Israel)
6 Maskit st. P.O.B. 3007, Herzlia 46104, Israel		Tel: (972) 52-522263
avi@taux01.nsc.com

lixj@acf4.NYU.EDU (Xiaojian Li) (01/30/91)

>
>How do I read an entire file into a string other than  the obvious way of
>reading it line by line and concatenating it.
>
>Please email since I don't usually have the time to read this newsgroup.
>
>Thanks
>
>Avi Bloch
>National Semiconductor (Israel)
>avi@taux01.nsc.com
>-- 
>	Avi Bloch

I am new to Perl (just two days old :-). Well, I have to say the manual
is very hard, or impossible, to read for a non-expert. Anyway, I did find
by accident that the following might solve your problem:

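#!/bin/perl
open(TMP, $ARGV[0]) || die "can't open $ARGV[0]: $!\n";
@array = <TMP>;       # put the whole file into an array, one line per element.
                      # I don't know if this is correct, but it seems to work.
                      # Correct me if I am wrong.
close(TMP);
$string = join('', @array);   # glue the lines back into one string; no chop,
                              # which would strip the newlines
print $string;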

tchrist@convex.COM (Tom Christiansen) (01/30/91)

From the keyboard of avi@taux01.nsc.com (Avi Bloch):
:How do I read an entire file into a string other than  the obvious way of
:reading it line by line and concatenating it.

I answered in mail, but for the record: 

1) $foo = `cat $file`;

2) open (FILE, $file);
   undef $/;
   $foo = <FILE>;
   close(FILE);


The cat method costs a bit more, but both are more costly 
than iterating over the file a line at a time.
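
Method 2 can be wrapped in a subroutine so that the change to $/ does not
leak into the rest of the program. This is a minimal sketch, assuming Perl 5
(lexical filehandles, three-argument open, File::Temp); the name slurp() and
the scratch file are made up for illustration and are not from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# slurp() is a hypothetical helper name, not something from the thread.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!";
    local $/;                 # $/ is undef only inside this sub; restored on return
    my $contents = <$fh>;     # with $/ undefined, one read returns the whole file
    close($fh);
    return $contents;
}

# Usage: write a small scratch file, then read it back in one go.
my ($out, $path) = tempfile();
print $out "line one\nline two\n";
close($out);

my $foo = slurp($path);
print length($foo), " bytes read\n";
unlink $path;
```

Using local rather than a bare undef means any later line-by-line reads in
the same program still see the normal newline separator.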

--tom
--
"Hey, did you hear Stallman has replaced /vmunix with /vmunix.el?  Now
 he can finally have the whole O/S built-in to his editor like he
 always wanted!" --me (Tom Christiansen <tchrist@convex.com>)

tneff@bfmny0.BFM.COM (Tom Neff) (01/30/91)

Sorry to requote all 5, but Allah forbid someone gets this out of order :-)

In article <11235@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
|In article <97130172@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
|: In article <1991Jan30.005821.13013@convex.com> tchrist@convex.COM (Tom Christiansen) writes:
|: >1) $foo = `cat $file`;
|: >
|: >2) open (FILE, $file);
|: >   undef $/;
|: >   $foo = <FILE>;
|: >   close(FILE);
|: 
|: 3)  open (FILE, $file); $foo = ''; 
|:     while (!eof(FILE)) { read(FILE, $x, 32767); $foo .= $x;};
|: 
|: You can use bigger numbers than 32767...
|
|You sure can, including the size of the file:
|
|4) open(FILE, $file); read(FILE, $foo, -s $file);
|
|or better in two ways,
|
|5) open(FILE, $file); sysread(FILE, $foo, -s FILE);
|
|I sincerely doubt that any other construct is going to beat that, 
|timewise.  ...

However, 4) and 5) don't work on pipes, named or unnamed.  1) works on 
named pipes but not unnamed.  Only 2) and 3) work everywhere.
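
The reason 3) keeps working on pipes is that it never asks for the size of
its input; it just reads until read() reports end of file. A sketch of that
loop, assuming Perl 5 and using an unnamed pipe inside one process; the name
read_all() is hypothetical, and the return-value checks are an addition to
the one-liner above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# read_all() is a hypothetical helper name; the loop is method 3 with
# the return value of read() checked.
sub read_all {
    my ($fh) = @_;
    my ($data, $chunk) = ('', '');
    while (1) {
        my $n = read($fh, $chunk, 32767);
        die "read failed: $!" unless defined $n;
        last if $n == 0;          # 0 means end of file (the writer closed the pipe)
        $data .= $chunk;
    }
    return $data;
}

# Usage: an unnamed pipe has no meaningful size for -s, yet the loop drains it.
pipe(my $reader, my $writer) or die "pipe failed: $!";
print $writer "sent through a pipe\n";
close($writer);                   # closing the write end lets the reader see EOF
my $foo = read_all($reader);
close($reader);
print $foo;
```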

-- 
Annex Canada now!  We need the room,    \)      Tom Neff
    and who's going to stop us.         (\      tneff@bfmny0.BFM.COM

tneff@bfmny0.BFM.COM (Tom Neff) (01/30/91)

In article <1991Jan30.005821.13013@convex.com> tchrist@convex.COM (Tom Christiansen) writes:
>1) $foo = `cat $file`;
>
>2) open (FILE, $file);
>   undef $/;
>   $foo = <FILE>;
>   close(FILE);

3)  open (FILE, $file); $foo = ''; 
    while (!eof(FILE)) { read(FILE, $x, 32767); $foo .= $x;};

You can use bigger numbers than 32767...

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (01/31/91)

In article <97130172@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
: In article <1991Jan30.005821.13013@convex.com> tchrist@convex.COM (Tom Christiansen) writes:
: >1) $foo = `cat $file`;
: >
: >2) open (FILE, $file);
: >   undef $/;
: >   $foo = <FILE>;
: >   close(FILE);
: 
: 3)  open (FILE, $file); $foo = ''; 
:     while (!eof(FILE)) { read(FILE, $x, 32767); $foo .= $x;};
: 
: You can use bigger numbers than 32767...

You sure can, including the size of the file:

4) open(FILE, $file); read(FILE, $foo, -s $file);

or better in two ways,

5) open(FILE, $file); sysread(FILE, $foo, -s FILE);

I sincerely doubt that any other construct is going to beat that, timewise.
(Solutions using syscall(SYS_mmap...) not allowed.  :-)
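
For reference, a sketch of 5) with the return value checked, assuming Perl 5
and a regular file. The scratch file and variable names are made up for the
example. Presumably the two improvements are that sysread bypasses stdio
buffering and that -s on the open handle avoids a second lookup by name, but
that reading is an assumption, not something stated above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a 1000-byte scratch file to read back.
my ($out, $path) = tempfile();
print $out "x" x 1000;
close($out);

open(my $fh, '<', $path) or die "can't open $path: $!";
binmode($fh);
my $size = -s $fh;                       # stat the open handle, as in method 5
my $foo  = '';
my $got  = sysread($fh, $foo, $size);    # one system call for the whole file
die "sysread failed: $!" unless defined $got;
warn "short read: wanted $size, got $got\n" if $got != $size;
close($fh);
unlink $path;
print "read $got of $size bytes\n";
```

sysread is allowed to return fewer bytes than requested, so comparing $got
with $size is worth the extra line.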

Larry

rbj@uunet.UU.NET (Root Boy Jim) (01/31/91)

In article <97130173@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
>In article <11235@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>|In article <97130172@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
>|: In article <1991Jan30.005821.13013@convex.com> tchrist@convex.COM (Tom Christiansen) writes:

The question is how to slurp a string into a file. Er, vice versa.

LW>|You sure can, including the size of the file:
>|
>|4) open(FILE, $file); read(FILE, $foo, -s $file);
>|
>|or better in two ways,
>|
>|5) open(FILE, $file); sysread(FILE, $foo, -s FILE);
>|
>|I sincerely doubt that any other construct is going to beat that, 
>|timewise.  ...
>
TN>However, 4) and 5) don't work on pipes, named or unnamed.  1) works on 
>named pipes but not unnamed.  Only 2) and 3) work everywhere.

OK, so Larry was being too literal. Suppose we replace -s FILE with $MAXINT?
Sysread will get whatever it can. However, this begs the question,
how does $foo get allocated? By sending the third arg to malloc?
In that case, perhaps one should set a limit on how much they slurp
and die/complain otherwise.
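
A caller-imposed limit could look something like the following. This is only
a sketch, assuming Perl 5.8 or later (for the in-memory filehandle used in
the demo); slurp_limited() and its $limit parameter are hypothetical names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# slurp_limited() and $limit are hypothetical names chosen for this sketch.
sub slurp_limited {
    my ($fh, $limit) = @_;
    my ($data, $chunk) = ('', '');
    while (1) {
        my $n = read($fh, $chunk, 8192);
        die "read failed: $!" unless defined $n;
        last if $n == 0;                      # end of input
        $data .= $chunk;
        die "input exceeds $limit bytes\n" if length($data) > $limit;
    }
    return $data;
}

# Usage: read from an in-memory filehandle so the example needs no files.
my $text = "abc" x 10;                        # 30 bytes of sample input
open(my $fh, '<', \$text) or die "can't open in-memory handle: $!";
my $foo = slurp_limited($fh, 1024);
close($fh);
print length($foo), "\n";
```

Because the check runs after each chunk is appended, the sub can overshoot
the limit by up to one chunk before it dies.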

This would also be a good time to ask about args to ioctls. I have
been precreating them out of paranoia:
$sgtty = 'ioekfl'; ioctl(FH,$TIOCGETP,$sgtty);
Is this necessary? In either case, why?

(6) $"=''; @str = <FILE>; $str = "@str";
(7) @str = <FILE>; $str = join(//,"@str");

Horseshoe mode .signature: perl -e 'print "Just another $0 hacker,"'
-- 

	Root Boy Jim Cottrell <rbj@uunet.uu.net>
	Close the gap of the dark year in between

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (01/31/91)

In article <120644@uunet.UU.NET> rbj@uunet.UU.NET (Root Boy Jim) writes:
: OK, so Larry was being too literal.

I'm like that sometimes.  It started at an early age, when my mom told me
to hop in the bathtub.

: Suppose we replace -s FILE with $MAXINT?
: Sysread will get whatever it can. However, this begs the question,
: how does $foo get allocated? By sending the third arg to malloc?

Noises of affirmation.

: In that case, perhaps one should set a limit on how much they slurp
: and die/complain otherwise.

Perhaps one should, as long as they aren't Perl.  Any limit Perl might set
would be too small for some people and too large for others.

: This would also be a good time to ask about args to ioctls. I have
: been precreating them out of paranoia:
: $sgtty = 'ioekfl'; ioctl(FH,$TIOCGETP,$sgtty);
: Is this necessary? In either case, why?

Never necessary on a BSD-derived system, and seldom necessary on other systems.
$TIOCGETP encodes the required length of $sgtty on BSD systems, so it
preallocates that much for you.  On non-BSD systems it preallocates 256
bytes, so you only have to preallocate it if you're expecting more info
back than that.  It also adds a sentinel byte at the end so it can die
gracefully when you blow it.

fcntl works the same way.

syscall does nothing for you by way of pre-extension.

: (6) $"=''; @str = <FILE>; $str = "@str";
: (7) @str = <FILE>; $str = join(//,"@str");
: 
: Horseshoe mode .signature: perl -e 'print "Just another $0 hacker,"'

That may not do what you expect.

You might be interested to know that 4.0 will allow you to assign to $0 to
modify what ps sees.  (Suggested by Tom Christiansen.)

die "shucks" if $] < 4.000;
$0 = 'Just another Perl hacker,';
print grep(s/.*Just/Not just/, `ps`);

Larry

tchrist@convex.COM (Tom Christiansen) (01/31/91)

From the keyboard of rbj@uunet.UU.NET (Root Boy Jim):
:This would also be a good time to ask about args to ioctls. I have
:been precreating them out of paranoia:
:$sgtty = 'ioekfl'; ioctl(FH,$TIOCGETP,$sgtty);
:Is this necessary? In either case, why?

I believe ioctl() works like syscall(), in that you need to pre-extend the
string so the kernel has a place to write what it needs.  This is
different from sysread, which knows how much it'll need.

BTW, I like to do stuff like this:

    $sgttyb_t = 'c4 s';
    sub sgttyb {
        # list context: unpack the buffer passed in; scalar context: pack the values
        wantarray ? unpack($sgttyb_t, $_[0]) : pack($sgttyb_t, @_);
    }

You could then say to init:

    $sgttyb = &sgttyb();

But don't forget the parens.  This time they matter.  You don't
want the previous guy's old @_.
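
A round trip through such a sub might look like this. It is a sketch in
Perl 5 syntax, assuming the sub unpacks its first argument in list context;
the field values are arbitrary numbers, not real terminal settings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sgttyb_t = 'c4 s';      # four signed chars followed by one signed short

# Scalar context packs a list into a buffer; list context unpacks a buffer.
sub sgttyb {
    return wantarray ? unpack($sgttyb_t, $_[0]) : pack($sgttyb_t, @_);
}

my $buf    = sgttyb(13, 13, 127, 21, 0);   # scalar context: pack (made-up values)
my @fields = sgttyb($buf);                 # list context: unpack
print length($buf), " bytes: @fields\n";
```

The same call site syntax does both jobs, which is the point of checking
wantarray.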


:(6) $"=''; @str = <FILE>; $str = "@str";

It seems a tad wasteful to my efficiency-maniacal self to allocate all
those array members just to cat them together.  At least undef it or let
it go out of local scope.


:(7) @str = <FILE>; $str = join(//,"@str");

If you didn't set $" to null, you'll get the "wrong" answer there, 
i.e. extra spaces by default.  Skip the quotes if you want.

:Horseshoe mode .signature: perl -e 'print "Just another $0 hacker,"'

"Just another /tmp/perl-e029116 hacker,"?  Well, it IS different.

--tom
--
"Hey, did you hear Stallman has replaced /vmunix with /vmunix.el?  Now
 he can finally have the whole O/S built-in to his editor like he
 always wanted!" --me (Tom Christiansen <tchrist@convex.com>)

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (01/31/91)

In article <1991Jan31.012957.25993@convex.com> tchrist@convex.COM (Tom Christiansen) writes:
: From the keyboard of rbj@uunet.UU.NET (Root Boy Jim):
: :(7) @str = <FILE>; $str = join(//,"@str");
: 
: If you didn't set $" to null, you'll get the "wrong" answer there, 
: i.e. extra spaces by default.  Skip the quotes if you want.

If you skip the quotes you get the wrong answer too.  The value of // is
likely to be 1.
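
What 7) was presumably reaching for is an empty string as the separator and
the array passed unquoted. A small sketch, in Perl 5 syntax, with a literal
list standing in for the lines read from FILE:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @str = ("first line\n", "second line\n");   # stand-in for @str = <FILE>

my $joined = join('', @str);   # empty-string separator, array passed unquoted
my $spaced = "@str";           # interpolation inserts $" (a space by default)

print length($joined), " vs ", length($spaced), "\n";
```

The interpolated version is one character longer here because of the space
inserted between the two elements.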

Larry

rbj@uunet.UU.NET (Root Boy Jim) (01/31/91)

In article <11242@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>In article <120644@uunet.UU.NET> rbj@uunet.UU.NET (Root Boy Jim) writes:
>: In that case, perhaps one should set a limit on how much they slurp
>: and die/complain otherwise.
>
>Perhaps one should, as long as they aren't Perl.  Any limit Perl might set
>would be too small for some people and too large for others.

Good point. I meant that if one is really going to slurp a file,
one should set one's own limit, or iterate, etc.

>: Horseshoe mode .signature: perl -e 'print "Just another $0 hacker,"'
>
>That may not do what you expect.

It's a joke. Close only counts in horseshoes, and my solution
is only close. Here's another one. Lottery mode .signature:

	print pack('c',32+int rand 95) while ++$z % 26

Although I think it'll take a random number generator that produces more
random bits, and an infinite number of camels, to print the magic slogan.
However, I did manage to produce some famous initials towards the end
of the sixth iteration, along with my own, at the very end.
-- 

	Root Boy Jim Cottrell <rbj@uunet.uu.net>
	Close the gap of the dark year in between