[comp.lang.perl] Binary fixed-length records

dan@kfw.COM (Dan Mick) (04/24/91)

Often, I want to dump a file full of fixed-length records.  The one way I've
been using is

#!/usr/local/bin/perl

$stbuf_def="ccCCCCCCSSII"; 		# unpack codes for struct
$streclen=20;				# length of above in bytes
undef $/;				# slurp in the whole file
$_ = <>;				# into $_

$inputlen = length($_);
$offset = 0;

while ($inputlen > 0) {
	$rec = substr($_, $offset, $streclen);

	<unpack and print $rec>

	$inputlen -= $streclen;
	$offset += $streclen;
}


...but it strikes me that 1) the "slurp in whole file" will fail someday,
and 2) the copy in the substr() can't be efficient.

Anyone willing to take a crack?  (I'm aware that the surrounding goo
might not be as efficient as possible; what I'm mostly worried about are
the two points above).

If I could get the book, I would...

merlyn@iwarp.intel.com (Randal L. Schwartz) (04/24/91)

In article <1991Apr23.202307.2454@kfw.COM>, dan@kfw (Dan Mick) writes:
| Often, I want to dump a file full of fixed-length records.  The one way I've
| been using is
| 
| #!/usr/local/bin/perl
| 
| $stbuf_def="ccCCCCCCSSII"; 		# unpack codes for struct
| $streclen=20;				# length of above in bytes
| undef $/;				# slurp in the whole file
| $_ = <>;				# into $_
| 
| $inputlen = length($_);
| $offset = 0;
| 
| while ($inputlen > 0) {
| 	$rec = substr($_, $offset, $streclen);
| 
| 	<unpack and print $rec>
| 
| 	$inputlen -= $streclen;
| 	$offset += $streclen;
| }
| 
| 
| ...but it strikes me that 1) the "slurp in whole file" will fail someday,
| and 2) the copy in the substr() can't be efficient.
| 
| Anyone willing to take a crack?  (I'm aware that the surrounding goo
| might not be as efficient as possible; what I'm mostly worried about are
| the two points above).

Yeah, do something like:

$stbuf_def="ccCCCCCCSSII"; 		# unpack codes for struct
$streclen = length(pack($stbuf_def, ())); # length of above in bytes

while (read(STDIN, $buf, $streclen) == $streclen) {
	@fields = unpack($stbuf_def, $buf); # process $buf
}

This presumes that you don't *really* need the full power of the <>
construct... if you do, it'd be a bit trickier.
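
For anyone who wants to try the read() loop without a real data file, here
is a minimal sketch in present-day Perl. The helper name read_records and
the sample values are made up for illustration and are not from the posts
above; only the template string is. It assumes a perl new enough to open a
filehandle on an in-memory string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Record layout from the thread: two signed chars, six unsigned chars,
# two unsigned shorts, two unsigned ints.
my $template = "ccCCCCCCSSII";

# Pack one dummy record of zeros to learn the record length on this
# machine instead of hard-coding it.
my $reclen = length(pack($template, (0) x 12));

# read_records is a hypothetical helper: it reads whole records from an
# already-open filehandle and returns a list of array refs, one per record.
sub read_records {
    my ($fh) = @_;
    binmode($fh);
    my @records;
    my $buf;
    # read() returns the number of bytes read, 0 at end of file, and
    # undef on error; anything short of a full record ends the loop.
    while ((read($fh, $buf, $reclen) || 0) == $reclen) {
        push @records, [ unpack($template, $buf) ];
    }
    return @records;
}

# Demonstration: an in-memory "file" holding two records plus 3 stray bytes.
my $data = pack($template, -1, 2, 3 .. 8, 9, 10, 11, 12)
         . pack($template, (7) x 12)
         . "xyz";
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";
my @records = read_records($fh);
close($fh);

print scalar(@records), " records of $reclen bytes each\n";
print join(" ", @$_), "\n" for @records;
```

Only one record is held in the buffer at a time, which addresses both the
slurp and the substr() worries; the stray bytes at the end are simply never
unpacked.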

| If I could get the book, I would...

(What stops you? :-)

print pack("c*",unpack("c*","Just another Perl hacker,"))
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/

lwall@jpl-devvax.jpl.nasa.gov (Larry Wall) (04/24/91)

In article <1991Apr23.202307.2454@kfw.COM> dan@kfw.com (Dan Mick) writes:
: Often, I want to dump a file full of fixed-length records.  The one way I've
: been using is
: 
: #!/usr/local/bin/perl
: 
: $stbuf_def="ccCCCCCCSSII"; 		# unpack codes for struct
: $streclen=20;				# length of above in bytes
: undef $/;				# slurp in the whole file
: $_ = <>;				# into $_
: 
: $inputlen = length($_);
: $offset = 0;
: 
: while ($inputlen > 0) {
: 	$rec = substr($_, $offset, $streclen);
: 
: 	<unpack and print $rec>
: 
: 	$inputlen -= $streclen;
: 	$offset += $streclen;
: }
: 
: 
: ...but it strikes me that 1) the "slurp in whole file" will fail someday,
: and 2) the copy in the substr() can't be efficient.
: 
: Anyone willing to take a crack?  (I'm aware that the surrounding goo
: might not be as efficient as possible; what I'm mostly worried about are
: the two points above).

Recalling that read actually uses stdio for efficiency, and desiring to
emulate the <> semantics, which may be overkill, we get:

    #!/usr/local/bin/perl

    $stbuf_def = "ccCCCCCCSSII"; 		# unpack codes for struct
    $streclen = length(pack($stbuf_def, 0));	# length of above in bytes

    @ARGV = '-' unless @ARGV;

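    foreach $file (@ARGV) {
        open(IN, $file) || do { warn "Can't open $file: $!\n"; next; };
        # stop at a short read so a truncated final record is not unpacked
        while (read(IN, $_, $streclen) == $streclen) {
            @fields = unpack($stbuf_def, $_);
            print "@fields\n";
        }
        close(IN);
    }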
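
The same multi-file idea can be wrapped in a subroutine, roughly as below.
This is only a sketch in present-day Perl: dump_files and its callback
argument are hypothetical names chosen for illustration, and the decision
to warn about leftover bytes (rather than die, or pad the record) is an
assumption, not something the thread specifies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $template = "ccCCCCCCSSII";                      # layout from the thread
my $reclen   = length(pack($template, (0) x 12));   # bytes per record

# dump_files is a hypothetical helper: it walks a list of file names the
# way <> would ('-' or an empty list means standard input), calls $callback
# with the unpacked fields of each complete record, and returns the number
# of records processed.
sub dump_files {
    my ($callback, @files) = @_;
    @files = ('-') unless @files;
    my $count = 0;
    foreach my $file (@files) {
        my $fh;
        if ($file eq '-') {
            $fh = \*STDIN;
        } elsif (!open($fh, '<', $file)) {
            warn "Can't open $file: $!\n";
            next;
        }
        binmode($fh);
        my ($buf, $got);
        while ($got = read($fh, $buf, $reclen)) {
            if ($got < $reclen) {
                # Assumed policy: report a truncated final record, skip it.
                warn "$file: ignoring $got trailing bytes\n";
                last;
            }
            $callback->(unpack($template, $buf));
            $count++;
        }
        close($fh) unless $file eq '-';
    }
    return $count;
}

# Example call: print every record of every file named on the command line.
# dump_files(sub { print "@_\n" }, @ARGV);
```

Passing the per-record work in as a callback keeps the file handling in one
place; the example call is left commented out so that loading the sketch
does not block waiting on standard input.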

Larry

dan@kfw.COM (Dan Mick) (04/26/91)

Ah.  Forgot completely about read().  

(And answers from both Book authors!  Wow.  Thanks.)