[comp.lang.perl] unpack and endianness

thoth@reef.cis.ufl.edu (Gilligan) (10/05/90)

  Is there any way to specify the endianness of the data you are
unpacking?  I am writing a perl script to decode Moria save files (for
unspecified purposes :) and unfortunately for me the data is written
in Intel order (LSB first).  I can not use unpack (as far as I know)
on the data from our Suns (MSB first).  If I could get unpack to
perform the byte shuffling for me it would squish my code by a factor
of two.
  Does anyone know how to do this?

  Could it be as simple as

  read(STDIN, $_, 2*2);
  ($exp, $maxexp) = unpack("s-2", $_);

  Also, what is the filehandle for <>?  I hate having to type
"moriainterp.perl < moria.dc".
--
"Until it's on my desk, it's vaporware"  (`it' is the NeXT)

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (10/05/90)

In article <THOTH.90Oct4162237@reef.cis.ufl.edu> thoth@reef.cis.ufl.edu (Gilligan) writes:
: 
:   Is there any way to specify the endianness of the data you are
: unpacking?  I am writing a perl script to decode Moria save files (for
: unspecified purposes :) and unfortunately for me the data is written
: in Intel order (LSB first).  I can not use unpack (as far as I know)
: on the data from our Suns (MSB first).  If I could get unpack to
: perform the byte shuffling for me it would squish my code by a factor
: of two.
:   Does anyone know how to do this?
: 
:   Could it be as simple as
: 
:   read(STDIN, $_, 2*2);
:   ($exp, $maxexp) = unpack("s-2", $_);

There's no way to do this with unpack on a big-endian machine at the moment.
The best you can do, as far as I can see, is to say

	sub swab {
	    $_[0] =~ s/([\0-\377])([\0-\377])/$2$1/g;
	}

	read(STDIN, $_, 2*2);
	&swab($_);
	($exp, $maxexp) = unpack("s2", $_);

:   Also, what is the filehandle for <>, I hate having to type
: "moriainterp.perl < moria.dc".

It's ARGV, but you can't do a read() on it and keep the other semantics of <>.  
You want something more like open(STDIN,shift).

Larry
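
For readers on later Perls: the "v" template (unsigned 16-bit,
little-endian) and, from Perl 5.10 on, the "<" modifier on "s" do the
byte shuffling inside unpack itself.  A minimal sketch, assuming the two
16-bit fields from the question; the sample bytes are made up for
illustration and are not taken from a real Moria save file.

```perl
use strict;
use warnings;

# Hypothetical sample: two 16-bit values stored LSB first (Intel order).
# 0x1234 is stored as 34 12, and -2 (0xFFFE) as FE FF.
my $buf = "\x34\x12\xFE\xFF";

# The byte-swapping approach from the post: exchange each pair of bytes.
sub swab {
    $_[0] =~ s/([\0-\377])([\0-\377])/$2$1/g;
}

# "s<" reads signed 16-bit little-endian on any host (Perl 5.10 and later).
my ($exp, $maxexp) = unpack("s<2", $buf);

# "v" reads unsigned 16-bit little-endian.
my ($uexp, $umaxexp) = unpack("v2", $buf);

# After swab the copy is big-endian, so "n" (unsigned 16-bit, MSB first) reads it.
my $copy = $buf;
swab($copy);
my ($bexp, $bmaxexp) = unpack("n2", $copy);

printf "%d %d\n", $exp, $maxexp;
printf "%d %d\n", $uexp, $umaxexp;
printf "%d %d\n", $bexp, $bmaxexp;
```

The signed and unsigned readings differ only for the second value,
which is why the choice between "s<" and "v" matters for fields that
can go negative.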

cameron@usage.csd.oz (Cameron Simpson,foo) (10/06/90)

From article <9821@jpl-devvax.JPL.NASA.GOV>, by lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall):
| In article <THOTH.90Oct4162237@reef.cis.ufl.edu> thoth@reef.cis.ufl.edu (Gilligan) writes:
| :Is there any way to specify the endianness of the data you are unpacking?
| 
| There's no way to do this with unpack on a big-endian machine at the moment.
| The best you can do, as far as I can see, is to say
| 	sub swab {
| 	    $_[0] =~ s/([\0-\377])([\0-\377])/$2$1/g;
| 	}
[...]

Um, why can't I write
 	sub swab {
 	    $_[0] =~ s/(.)(.)/$2$1/g;
 	}

- Cameron Simpson, cameron@spectrum.cs.unsw.oz.au

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (10/07/90)

In article <885@usage.csd.unsw.oz.au> cameron@spectrum.cs.unsw.oz.au (Cameron Simpson) writes:
: From article <9821@jpl-devvax.JPL.NASA.GOV>, by lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall):
: | 	sub swab {
: | 	    $_[0] =~ s/([\0-\377])([\0-\377])/$2$1/g;
: | 	}
: [...]
: 
: Um, why can't I write
:  	sub swab {
:  	    $_[0] =~ s/(.)(.)/$2$1/g;
:  	}

Because . doesn't match \n.  [\0-\377] is the most efficient way to match
everything currently.  Maybe \e should match everything.

And \E would of course match nothing.   :-)

Larry
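
The difference shows up as soon as a byte in the data happens to be
0x0A.  A small sketch, using a made-up 4-byte buffer, that compares the
two substitutions; the /s variant is an addition that assumes Perl 5 or
later, where /s lets . match a newline.

```perl
use strict;
use warnings;

# Hypothetical 4-byte buffer whose first byte happens to be 0x0A (newline).
my $data = "\x0A\x01\x02\x03";

# With a bare ".", the newline is skipped and the wrong pair is swapped.
my $dot = $data;
$dot =~ s/(.)(.)/$2$1/g;

# The explicit class matches every byte value, newline included.
my $class = $data;
$class =~ s/([\0-\377])([\0-\377])/$2$1/g;

# Assumption: Perl 5 or later, where /s makes "." match newline as well.
my $dot_s = $data;
$dot_s =~ s/(.)(.)/$2$1/gs;

print unpack("H*", $dot),   "\n";
print unpack("H*", $class), "\n";
print unpack("H*", $dot_s), "\n";
```

With the bare "." the pairing slips by one byte, so every later field in
the buffer would be read out of alignment.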

cameron@usage.csd.oz (Cameron Simpson,foo) (10/07/90)

From article <9847@jpl-devvax.JPL.NASA.GOV>, by lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall):
| In article <885@usage.csd.unsw.oz.au> cameron@spectrum.cs.unsw.oz.au (Cameron Simpson) writes:
| : From article <9821@jpl-devvax.JPL.NASA.GOV>, by lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall):
| : | 	    $_[0] =~ s/([\0-\377])([\0-\377])/$2$1/g;
| : 
| : Um, why can't I write
| :  	    $_[0] =~ s/(.)(.)/$2$1/g;
| 
| Because . doesn't match \n.

Goes back and Rs the FM more closely. Oh the embarrassment.

| [\0-\377] is the most efficient way to match
| everything currently.  Maybe \e should match everything.

Nah. What's wrong with [^]?

| And \E would of course match nothing.   :-)

Correspondingly, []. Haven't tried it yet (does so). Ouch. The ] at the
start of the pattern is taken as part of the range. A feature. Well, it
was a nice idea...
	- Cameron Simpson
	  cameron@spectrum.cs.unsw.oz.au
	  "To every problem there is a simple, obvious, wrong solution."

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (10/08/90)

In article <887@usage.csd.unsw.oz.au> cameron@spectrum.cs.unsw.oz.au (Cameron Simpson) writes:
> From article <9847@jpl-devvax.JPL.NASA.GOV>, by lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall):
> | [\0-\377] is the most efficient way to match
> | everything currently.  Maybe \e should match everything.
> Nah. What's wrong with [^]?

How do you specify ] in a [^...] expression, then? You mention the same
problem with []...

---Dan
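
The existing rule already answers this: a ] that comes first in a class,
or first after the ^, is a literal member of the class.  That is the
same behavior Cameron ran into with [].  A short sketch of that rule;
the test strings are arbitrary.

```perl
use strict;
use warnings;

# A ] placed first in a class is a literal member, not the closing bracket.
my $in_class = ("]" =~ /^[]a]$/)  ? 1 : 0;   # class containing ] and a
my $negated  = ("]" =~ /^[^]a]$/) ? 1 : 0;   # everything except ] and a
my $other    = ("b" =~ /^[^]a]$/) ? 1 : 0;   # b is in the negated class

print "$in_class $negated $other\n";
```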

worley@compass.com (Dale Worley) (10/08/90)

   X-Name: Cameron Simpson,foo

   Um, why can't I write
	   sub swab {
	       $_[0] =~ s/(.)(.)/$2$1/g;
	   }

Basically, because it doesn't work.  This swaps every pair of bytes,
but doesn't swap the byte pairs, etc.  The goal is to turn ABCD into
DCBA.

Dale Worley		Compass, Inc.			worley@compass.com
--
The great unsolved problem of feminism is to decide whether we need to
make men more like women, or women more like men.
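
The ABCD-to-DCBA case applies to 4-byte quantities; for the 2-byte
fields in the original question, swapping within each pair is the whole
job.  A sketch contrasting the two operations on a made-up 4-byte
string, plus the "V" template (unsigned 32-bit, little-endian) as one
way to read such a value directly.

```perl
use strict;
use warnings;

my $word = "ABCD";

# Pairwise swap: what the swab regex does to four bytes.
(my $pairs = $word) =~ s/([\0-\377])([\0-\377])/$2$1/g;

# Full reversal: what a 32-bit endian conversion needs.
my $full = reverse $word;    # scalar context reverses the string

# "V" reads an unsigned 32-bit little-endian value without manual swapping.
my $n = unpack("V", "\x78\x56\x34\x12");

print "$pairs $full\n";
printf "%08x\n", $n;
```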

dale@convex.com (Dale Lancaster) (10/10/90)

In <9847@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:

>Because . doesn't match \n.  [\0-\377] is the most efficient way to match
>everything currently.  Maybe \e should match everything.

>And \E would of course match nothing.   :-)

>Larry

I would rather see \* match everything and \e be ESC as it is in
other utilities. And maybe \!* matches nothing? :-)

dml

cimarron@erewhon.postgres.Berkeley.EDU (Cimarron D. Taylor </>) (10/10/90)

 | From: dale@convex.com (Dale Lancaster)
 | Newsgroups: comp.lang.perl
 | Subject: Re: unpack and endianness
 | Date: 9 Oct 90 17:20:43 GMT
 | 
 | I would rather see \* match everything and \e be ESC as it is in
 | other utilities. And maybe \!* matches nothing? :-)
 | 
 | dml

	But doesn't \* already mean "turn off the special meaning of *".
	i.e.  match only an asterix?

	Cimarron Taylor
	Electronics Research Laboratory / POSTGRES project
	University of California, Berkeley
	cimarron@postgres.berkeley.edu
	

mdb@ESD.3Com.COM (Mark D. Baushke) (10/10/90)

On 9 Oct 90 17:20:43 GMT, dale@convex.com (Dale Lancaster) said:

Dale> In <9847@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV
Dale> (Larry Wall) writes: 

>Because . doesn't match \n.  [\0-\377] is the most efficient way to match
>everything currently.  Maybe \e should match everything.

>And \E would of course match nothing.   :-)

>Larry

Dale> I would rather see \* match everything and \e be ESC as it is in
Dale> other utilities. And maybe \!* matches nothing? :-)

Dale> dml

Please do not create any non-alphanumeric metacharacters. It would
break the quoting of a pattern that might contain metacharacters using
          $pattern =~ s/(\W)/\\$1/g;
-- 
Mark D. Baushke
mdb@ESD.3Com.COM
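
A sketch of the quoting idiom at work, with a made-up pattern string.
It relies on backslash followed by a non-alphanumeric always meaning
"that literal character", which is the property a new metacharacter
such as \* would break.  The quotemeta call is shown only for
comparison and assumes a Perl that provides it.

```perl
use strict;
use warnings;

# Hypothetical pattern text that should be matched literally.
my $pattern = "a.b*c";

# The idiom from the post: backslash every non-word character.
(my $quoted = $pattern) =~ s/(\W)/\\$1/g;

# Built-in equivalent for this ASCII input (an assumption about the Perl in use).
my $builtin = quotemeta($pattern);

my $hit  = ("a.b*c" =~ /^$quoted$/) ? 1 : 0;   # the literal text matches
my $miss = ("axbbc" =~ /^$quoted$/) ? 1 : 0;   # . and * no longer act as metacharacters

print "$quoted $builtin $hit $miss\n";
```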

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (10/10/90)

In article <MDB.90Oct10082856@kosciusko.ESD.3Com.COM> mdb@ESD.3Com.COM (Mark D. Baushke) writes:
: On 9 Oct 90 17:20:43 GMT, dale@convex.com (Dale Lancaster) said:
: 
: Dale> In <9847@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV
: Dale> (Larry Wall) writes: 
: 
: >Because . doesn't match \n.  [\0-\377] is the most efficient way to match
: >everything currently.  Maybe \e should match everything.
: 
: >And \E would of course match nothing.   :-)
: 
: >Larry
: 
: Dale> I would rather see \* match everything and \e be ESC as it is in
: Dale> other utilities. And maybe \!* matches nothing? :-)
: 
: Dale> dml
: 
: Please do not create any non-alphanumeric metacharacters. It would
: break the quoting of a pattern that might contain metacharacters using
:           $pattern =~ s/(\W)/\\$1/g;

'Sides, there's millions of scripts out there that already use \* to mean
a literal *.

And \e doesn't mean ESC to me, it means \.  What utilities does it mean
ESC in?  /etc/termcap uses \E, which is close.

I HAVE considered making \a mean \007 (since K&R2 has it), but there's
gotta be a limit somewhere.

[Strange, that never stopped you before.]

Oh, shaddap!

Larry

epeterso@houligan.encore.com (Eric Peterson) (10/11/90)

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:

| And \e doesn't mean ESC to me, it means \.  What utilities does it mean
| ESC in?  /etc/termcap uses \E, which is close.

Well, GNU Emacs does this, if you can call something its size a
"utility" :-)

Eric
--
       Eric Peterson <> epeterson@encore.com <> uunet!encore!epeterson
   Encore Computer Corp. * Ft. Lauderdale, Florida * (305) 587-2900 x 5208
Why did Constantinople get the works? Gung'f abobql'f ohfvarff ohg gur Ghexf.

dan@kfw.COM (Dan Mick) (10/12/90)

In article <CIMARRON.90Oct10011141@erewhon.postgres.Berkeley.EDU> cimarron@erewhon.postgres.Berkeley.EDU (Cimarron D. Taylor </>) writes:
>	But doesn't \* already mean "turn off the special meaning of *".
>	i.e.  match only an asterix?

What's an asterix?  Why would you want perl to match a cartoon character?