[comp.lang.perl] sort

ccount@athena.mit.edu (Craig A Counterman) (02/01/90)

I am a convert to perl, thanks to the Perl Reference Guide, and a2p
and s2p.  I used the programs to convert a series of awk and sed
scripts to perl, and the ref. card to understand the code and fine
tune the result.  It works great.

One thing I'd like to do but don't yet see how (if it's possible), is
to sort the lines of a file based on a given field.  My actual
application would be a bit more complex, but that's the basic idea.

sort() would sort a single array, but I think I need it to also
generate an index array I could then use to reference the other arrays.
(I'd read and split each line in the file into a series of arrays, one
column per array and vice versa, do some processing, and then want to
sort on one of the arrays).

Thanks,
Craig

merlyn@iwarp.intel.com (Randal Schwartz) (02/02/90)

In article <1990Feb1.050145.21183@athena.mit.edu>, ccount@athena (Craig A Counterman) writes:
| I am a convert to perl, thanks to the Perl Reference Guide, and a2p
| and s2p.  I used the programs to convert a series of awk and sed
| scripts to perl, and the ref. card to understand the code and fine
| tune the result.  It works great.
| 
| One thing I'd like to do but don't yet see how (if it's possible), is
| to sort the lines of a file based on a given field.  My actual
| application would be a bit more complex, but that's the basic idea.
| 
| sort() would sort a single array, but I think I need it to also
| generate an index array I could then use to reference the other arrays.
| (I'd read and split each line in the file into a series of arrays, one
| column per array and vice versa, do some processing, and then want to
| sort on one of the arrays).

Ahh yes, but what's *in* a single array?!  All sorts of things (pun
intended).

If your field is defined by fixed character position, say, the 14th
through 20th characters of the line, try this:

	@whole = <STDIN>; # snarf everything in
	# the sub must return negative, zero, or positive (integer field assumed)
	sub byfield14to20 { substr($a, 13, 7) - substr($b, 13, 7); }
	print sort byfield14to20 @whole;

If your field is defined by separators, it's a little more difficult.
You can either split each line each time you test it (bad for lotsa
data), or split it once and cache it, then sort an "indexing" array,
like so (presuming you want to sort on the fourth field):

	@whole = <STDIN>;
	for (@whole) {
		@x = split;
		push(@f4, $x[3]);
	}
	# now $f4[n] is the fourth field of $whole[n]
	sub byfourthfield { $f4[$a] > $f4[$b] ? 1 : $f4[$a] < $f4[$b] ? -1 : 0; }
	@indices = sort byfourthfield 0..$#whole;
	# now @indicies has the pointers into @whole for the sorted array
	for (@indicies) {
		print $whole[$_];
	}
	# larry? is that the same as 'print @whole[@indices];'?

A little tricky, but if you take it through step-by-step, you'll see
what I'm doing there.  If you get stuck, write me back, and I'll do it
in detail.

Just another Perl hacker, of sorts,
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/
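
For anyone trying the index-array approach on a current Perl, here is a minimal sketch. It assumes whitespace-separated lines with a numeric fourth field, uses made-up sample data in place of STDIN, and relies on the `<=>` operator, which did not exist when this was posted. A sort comparator has to return a negative, zero, or positive value, which `<=>` does directly.

```perl
use strict;
use warnings;

# Sample lines stand in for <STDIN>; the data is invented for illustration.
my @whole = (
    "pear   x y 30 z\n",
    "apple  x y 4  z\n",
    "cherry x y 12 z\n",
);

# Split each line once and cache its fourth field.
my @f4 = map { (split ' ', $_)[3] } @whole;

# Sort the indices, not the lines; <=> yields -1, 0 or 1.
my @indices = sort { $f4[$a] <=> $f4[$b] } 0 .. $#whole;

# An array slice pulls the lines out in sorted order.
my @sorted = @whole[@indices];
print @sorted;
```

The same index list can be reused to slice any other per-column arrays that were built in parallel with `@whole`.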

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (02/02/90)

In article <1990Feb1.163539.10576@iwarp.intel.com> merlyn@iwarp.intel.com (Randal Schwartz) writes:
: 	# now @indicies has the pointers into @whole for the sorted array
: 	for (@indicies) {
: 		print $whole[$_];
: 	}
: 	# larry? is that the same as 'print @whole[@indices];'?

Assuming you haven't diddled the $, variable, and that you have enough
memory to throw the whole array on the stack, and that you spell the
name of the array consistently, yes.

And bearing in mind that it might be more efficient to use the sort program.

: Just another Perl hacker, of sorts,

Now that he's collecting examples for the book, he's also a sorter of hacks.

Larry

merlyn@iwarp.intel.com (Randal Schwartz) (02/02/90)

In article <6963@jpl-devvax.JPL.NASA.GOV>, lwall@jpl-devvax (Larry Wall) writes:
| And bearing in mind that it might be more efficient to use the sort program.

Ahh yes, it didn't even occur to me that the question might be solved
by a *standard* Unix utility!  Arrggh.  I've been rewriting for c.u.q
too long!

| 
| : Just another Perl hacker, of sorts,
| 
| Now that he's collecting examples for the book, he's also a sorter of hacks.

Cute.

@a=split(/(\d)/,"4Hacker,2another3Perl1Just");shift(@a);%a=@a;print "@a{1..4}";

frech@mwraaa.army.mil (Norman R. Frech CPLS) (06/28/90)

I am sorting an array of values which represent random record numbers which
I am going to extract from a large file.  The problem is sort sorts the
values as alpha ?? and not as numeric, i.e. 3278 comes before 334.  Is there
a way of forcing sort to treat the array as numeric in the sort?

***portion of code follows ***

#generate 50 random record numbers of file.in in sorted order

$rc = `wc -l file.in`;
$i = 0;
for (1..50) {
$i = $i + 1;
@pickval[$i] = int(rand($rc)) + 5;
}
@picksort = sort @pickval;
$i = 0;
for (1..50) {
$i = $i + 1;
print @picksort[$i],"\n";
}

Norm Frech < frech@mwraaa.army.mil >

merlyn@iwarp.intel.com (Randal Schwartz) (06/28/90)

In article <1990Jun27.204253.27285@uvaarpa.Virginia.EDU>, frech@mwraaa (Norman R. Frech CPLS) writes:
| I am sorting an array of values which represent random record numbers which
| I am going to extract from a large file.  The problem is sort sorts the
| values as alpha ?? and not as numeric, i.e. 3278 comes before 334.  Is there
| a way of forcing sort to treat the array as numeric in the sort?

sub bynumeric {$a - $b;}

@sortedarray = sort bynumeric @array;

print qq/Just another Perl hacker,/
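
A small sketch of the difference on a current Perl, using invented record numbers. The default `sort` compares strings; a comparison block with `<=>` (an operator added after this exchange) compares numbers.

```perl
use strict;
use warnings;

my @pickval = (3278, 334, 12, 1005);    # sample record numbers, made up

my @as_strings = sort @pickval;                  # default: string comparison
my @as_numbers = sort { $a <=> $b } @pickval;    # numeric comparison

print "@as_strings\n";
print "@as_numbers\n";
```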

frech@mwraaa.army.mil (Norman R. Frech CPLS) (08/22/90)

Due to great stupidity I lost one of my programs, which contained a
sort of an array of integers.  Naturally, I need the logic, and my sort
is sorting on the character string representations.  I tried the 
subroutine in the perl.man with no success.  Would someone please 
send me the logic for a numeric sort of an array.  Thanks...

Norm Frech <frech@mwraaa.army.mil>

merlyn@iwarp.intel.com (Randal Schwartz) (08/22/90)

In article <1990Aug21.202005.26275@uvaarpa.Virginia.EDU>, frech@mwraaa (Norman R. Frech CPLS) writes:
| Due to great stupidity I lost one of my programs, which contained a
| sort of an array of integers.  Naturally, I need the logic, and my sort
| is sorting on the character string representations.  I tried the 
| subroutine in the perl.man with no success.  Would someone please 
| send me the logic for a numeric sort of an array.  Thanks...

sub by_the_numbers {
	$a - $b;
}

@a = (3,5,7,9,11,13,15,16,14,12,10,8,6,4,2);

@sorta = sort by_the_numbers @a;

print "@sorta\n";

How's that?

print "Just another Perl (book) hacker,"

Andrew.Vignaux@comp.vuw.ac.nz (Andrew Vignaux) (08/22/90)

In article <1990Aug21.224327.20194@iwarp.intel.com>,
merlyn@iwarp.intel.com (Randal Schwartz) writes:
>sub by_the_numbers {
>	$a - $b;
>}

The original poster asked about sorting integers so this answer is
right.  However, there is a possible trap.  I couldn't work out why

	@a = (50.1, 50.4, 50.2);
	@sorta = sort by_the_numbers @a;
	print "@sorta\n";

sometimes didn't work for me (well actually, imagine a LARGE list with
a few |element1 - element2| < 0.5)

"sort" wants integers (<, ==, >) 0 not reals!

Is there a better/stronger/faster "float" comparison function than

	sub by_the_numbers { $a > $b ? 1 : $a < $b ? -1 : 0; }

especially if $a and $b are indexes into an associative array?
	e.g  ... $foo{$a} > $foo{$b} ? ...


Here's an interesting sorter (extracted so my one-liner fits ;-)  I
think you can tell that perl is using qsort "as time goes by".

	$s=(localtime(time))[0]; sub n { ($a - $b) * $s; }

print grep(s/.*\t//,sort n grep($_=++$i/-50."\t$_",split(/\n*/,<<JAPH)));
,rekcah lreP rehtona tsuJ
JAPH

Andrew
-- 
Domain address: Andrew.Vignaux@comp.vuw.ac.nz
Lament:		Why do my one-liners never fit in one line? :-(
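
A sketch of a float-safe comparison on a current Perl, assuming the `<=>` operator is available. Because `<=>` returns exactly -1, 0 or 1, no fractional difference is left to be rounded away. The hash `%foo` and its contents are invented to illustrate sorting keys by their associated values.

```perl
use strict;
use warnings;

# Values closer together than 0.5, as in the post above.
my @a = (50.4, 50.1, 50.2);
my @sorted = sort { $a <=> $b } @a;

# Sorting the keys of an associative array by their (float) values.
my %foo = (x => 2.5, y => 2.25, z => 10);
my @keys_by_value = sort { $foo{$a} <=> $foo{$b} } keys %foo;

print "@sorted\n";
print "@keys_by_value\n";
```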

merlyn@iwarp.intel.com (Randal Schwartz) (08/22/90)

In article <1990Aug22.111018.3329@comp.vuw.ac.nz>, Andrew.Vignaux@comp (Andrew Vignaux) writes:
| 	$s=(localtime(time))[0]; sub n { ($a - $b) * $s; }
| 
| print grep(s/.*\t//,sort n grep($_=++$i/-50."\t$_",split(/\n*/,<<JAPH)));
| ,rekcah lreP rehtona tsuJ
| JAPH

Wow!  A piece of code that really *does* fail based on time-of-day!
(And in Perl... I'm heartbroken. :-)

| Lament:		Why do my one-liners never fit in one line? :-(

You don't use long enough lines. :-)

P.S. If you don't see it, think about what happens at 0 seconds into
the minute.

print "Just another Perl [book] hacker,"

Mike.McManus@FtCollins.NCR.com (Mike McManus) (08/28/90)

In article <1990Aug22.111018.3329@comp.vuw.ac.nz> Andrew.Vignaux@comp.vuw.ac.nz (Andrew Vignaux) writes:
>
>   In article <1990Aug21.224327.20194@iwarp.intel.com>,
>   merlyn@iwarp.intel.com (Randal Schwartz) writes:
>   >sub by_the_numbers {
>   >	$a - $b;
>   >}
>
>   The original poster asked about sorting integers so this answer is
>   right.
...
>   Is there a better/stronger/faster "float" comparison function than
>
>	   sub by_the_numbers { $a > $b ? 1 : $a < $b ? -1 : 0; }
>
>   especially if $a and $b are indexes into an associative array?
>	   e.g  ... $foo{$a} > $foo{$b} ? ...

Interestingly enuff, I had need for a similar sorting routine today, but with a
twist: I want to sort the indices of an associative array that are of the form
"A0, A1, A2, ..., A9, A10, ..."  Of course an alphabetic sort returns "A0, A10,
A11, A19, ..., A1, ...", not what I want!  

Any simple solutions?  Thanks!
--
Disclaimer: All spelling and/or grammar in this document are guaranteed to be
            correct; any exseptions is the is wurk uv intter-net deemuns,.

Mike McManus                        Mike.McManus@FtCollins.NCR.COM, or
NCR Microelectronics                ncr-mpd!mikemc@ncr-sd.sandiego.ncr.com, or
2001 Danfield Ct.                   uunet!ncrlnk!ncr-mpd!garage!mikemc
Ft. Collins,  Colorado              
(303) 223-5100   Ext. 378
                                    

tchrist@convex.COM (Tom Christiansen) (08/29/90)

In article <MIKE.MCMANUS.90Aug28165252@mustang.FtCollins.NCR.com> Mike.McManus@FtCollins.NCR.com (Mike McManus) writes:
>Interestingly enuff, I had need for a similar sorting routine today, but with a
>twist: I want to sort the indices of an associative array that are of the form
>"A0, A1, A2, ..., A9, A10, ..."  Of course an alphabetic sort returns "A0, A10,
>A11, A19, ..., A1, ...", not what I want!  
>
>Any simple solutions?  Thanks!

Is this simple enough for the sort function?

    sub bynum { substr($a,$[+1,10) > substr($b,$[+1,10); }

--tom
--
 "UNIX was never designed to keep people from doing stupid things, because 
  that policy would also keep them from doing clever things." [Doug Gwyn]

merlyn@iwarp.intel.com (Randal Schwartz) (08/29/90)

In article <MIKE.MCMANUS.90Aug28165252@mustang.FtCollins.NCR.com>, Mike.McManus@FtCollins (Mike McManus) writes:
| Interestingly enuff, I had need for a similar sorting routine today, but with a
| twist: I want to sort the indices of an associative array that are of the form
| "A0, A1, A2, ..., A9, A10, ..."  Of course an alphabetic sort returns "A0, A10,
| A11, A19, ..., A1, ...", not what I want!  
| 
| Any simple solutions?  Thanks!

Well, two come to mind.  The first one is pretty trivial source code:

sub by_the_numbers_mostly { substr($a,1,999) > substr($b,1,999) ? 1 : -1; }

However, I haven't tested this for speed (it's doing a lot of work in
those substr's over and over and over again).  What you might want to
do is build a parallel array:

@foo=('A10'..'A19','A0'..'A9');  # not in order, for demo

grep(s/^.//,@fookey = @foo);

sub byfookey { $fookey[$a] > $fookey[$b] ? 1 : -1; }

@sortfoo = @foo[sort byfookey $[..$#foo];

print "@sortfoo";

For large arrays, I believe this would win.  For small arrays, the
first would probably win (I haven't tested that... if someone has a
few more minutes than me, go ahead and please let us know).

print "Just another Perl hacker,"
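
The parallel-array idea can be sketched for a current Perl as follows. It assumes every key is one letter followed by digits, and it uses `<=>`, which postdates this post. The numeric part of each key is extracted once, and the indices are sorted on it.

```perl
use strict;
use warnings;

my @foo = ('A10' .. 'A19', 'A0' .. 'A9');    # not in order, for demo

# Strip the leading letter once per element, not once per comparison.
my @fookey = map { substr($_, 1) } @foo;

# Sort the indices on the cached numeric keys, then slice the original.
my @sortfoo = @foo[ sort { $fookey[$a] <=> $fookey[$b] } 0 .. $#foo ];

print "@sortfoo\n";
```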

merlyn@iwarp.intel.com (Randal Schwartz) (08/30/90)

In article <105536@convex.convex.com>, tchrist@convex (Tom Christiansen) writes:
| Is this simple enough for the sort function?
| 
|     sub bynum { substr($a,$[+1,10) > substr($b,$[+1,10); }

Nope nope nope.  I made that same mistake once.  Think about
what this returns... either "1" or "0", not "1" or "-1".  Arrrgh. :-)

You want something like I already posted involving a test and a 1 or
-1 return.

print pack("c*",(32..127)[42,85,83,84,0,65,78,79,84,72,69,82,0,48,69,82,76,0,72,65,67,75,69,82,12])
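
To make the 1-or-0 versus 1-or-minus-1 point concrete, here is a small sketch. The helper name `three_way` is hypothetical. It builds a three-way result out of two true/false tests, which is the shape of value `sort` expects from its comparison routine.

```perl
use strict;
use warnings;

# Hypothetical helper: a true/false test yields 1 or an empty (zero) value,
# so subtracting the two tests gives -1, 0 or 1.
sub three_way {
    my ($x, $y) = @_;
    return ($x > $y) - ($x < $y);
}

my @results = map { three_way(@$_) } ([1, 2], [2, 2], [3, 2]);
my @sorted  = sort { three_way($a, $b) } (10, 9, 100, 1);

print "@results\n";
print "@sorted\n";
```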

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (08/30/90)

In article <1990Aug29.191454.23527@iwarp.intel.com> merlyn@iwarp.intel.com (Randal Schwartz) writes:
: In article <105536@convex.convex.com>, tchrist@convex (Tom Christiansen) writes:
: | Is this simple enough for the sort function?
: | 
: |     sub bynum { substr($a,$[+1,10) > substr($b,$[+1,10); }
: 
: Nope nope nope.  I made that same mistake once.  Think about
: what this returns... either "1" or "0", not "1" or "-1".  Arrrgh. :-)
: 
: You want something like I already posted involving a test and a 1 or
: -1 return.

I just had the weirdest thought.  The ne and != operators should maybe
return -1 or +1 when the operands aren't equal.

Larry

merlyn@iwarp.intel.com (Randal Schwartz) (08/30/90)

In article <9337@jpl-devvax.JPL.NASA.GOV>, lwall@jpl-devvax (Larry Wall) writes:
| I just had the weirdest thought.  The ne and != operators should maybe
| return -1 or +1 when the operands aren't equal.

And when Larry has weird thoughts, *I* get to write them down. :-)
Yeah.  >>patch29, right?  Along with -A, -C, -M filetests and "DATA"
filehandle.  Or did you not announce that yet?  Oops... too late. :-)

print "Just another Perl [book] hacker,"

Andrew.Vignaux@comp.vuw.ac.nz (Andrew Vignaux) (09/01/90)

In article <1990Aug29.191454.23527@iwarp.intel.com>
merlyn@iwarp.intel.com (Randal Schwartz) writes:
: In article <105536@convex.convex.com>, tchrist@convex (Tom Christiansen) writes:
: | Is this simple enough for the sort function?
: | 
: |     sub bynum { substr($a,$[+1,10) > substr($b,$[+1,10); }
: 
: Nope nope nope.  I made that same mistake once.  Think about
: what this returns... either "1" or "0", not "1" or "-1".  Arrrgh. :-)

Actually, the Berkeley qsort (well let's be precise -- the qsort on
our MORE/bsd 4.3+ boxes running on hp300s) still manages to sort if
you use this comparison routine!

In article <9337@jpl-devvax.JPL.NASA.GOV>, lwall@jpl-devvax.JPL.NASA.GOV
(Larry Wall) writes:
> I just had the weirdest thought.  The ne and != operators should maybe
> return -1 or +1 when the operands aren't equal.
> 
> Larry

This is a great idea!  In conjunction with || returning the last value
evaluated means sort functions can turn into:

    sub sort_it { $key1{$a} ne $key1{$b} || -(&fun2($a) != &fun2($b)); }

where &fun2() can return floats!

Unfortunately, it will break some of my scripts.  E.g

    if (($sort_rank != 0) + ($sort_articles != 0) + ($sort_size != 0) > 1) {
	print STDERR "$program: only one of -r, -a, -s should be supplied\n";
	exit (1);
    }

where $sort_* could be -1, 0, +1.  But I can fix those.

Possible problems:
    +	it looks strange.  Take another look at sort_it!  Does it sort
          things the right way?

    +	people will have to re-think how to use comparison operators
          because "!=" does not mean "!( == )" E.g negating an
          expression (ala De Morgan) will take some thought

    +	-(a != b)  <=>  (-a != -b)  which is new!

    +	it'll screw up my C programming :-)

I realise it isn't a democracy, but you've got my vote.

Andrew
-- 
Domain address: Andrew.Vignaux@comp.vuw.ac.nz

jv@mh.nl (Johan Vromans) (09/01/90)

In article <9337@jpl-devvax.JPL.NASA.GOV>, lwall@jpl-devvax.JPL.NASA.GOV
(Larry Wall) writes:
> I just had the weirdest thought.  The ne and != operators should maybe
> return -1 or +1 when the operands aren't equal.

Please don't. It will break lots of existing scripts. Moreover, perl
is already weird enough. I can imagine lots of perl novice users
shifting back to grep/sed/nawk once they discover that != is not the
same anymore as ! == .

I hesitate to say so, but I think this deserves a new built-in
function, e.g. 'order($a,$b)' and 'lexorder($a,$b)'. The latter can also
take internationalization issues into account, so you can have quick
and dirty string comparisons using eq ne gt ge le lt, and formally
correct (hence a bit slower) ones using lexorder.

	Johan
-- 
Johan Vromans				       jv@mh.nl via internet backbones
Multihouse Automatisering bv		       uucp: ..!{uunet,hp4nl}!mh.nl!jv
Doesburgweg 7, 2803 PL Gouda, The Netherlands  phone/fax: +31 1820 62911/62500
------------------------ "Arms are made for hugging" -------------------------

allbery@NCoast.ORG (Brandon S. Allbery KB8JRR/KT) (09/01/90)

As quoted from <9337@jpl-devvax.JPL.NASA.GOV> by lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall):
+---------------
| In article <1990Aug29.191454.23527@iwarp.intel.com> merlyn@iwarp.intel.com (Randal Schwartz) writes:
| : In article <105536@convex.convex.com>, tchrist@convex (Tom Christiansen) writes:
| : | Is this simple enough for the sort function?
| : | 
| : |     sub bynum { substr($a,$[+1,10) > substr($b,$[+1,10); }
| : 
| : Nope nope nope.  I made that same mistake once.  Think about
| : what this returns... either "1" or "0", not "1" or "-1".  Arrrgh. :-)
| : 
| : You want something like I already posted involving a test and a 1 or
| : -1 return.
| 
| I just had the weirdest thought.  The ne and != operators should maybe
| return -1 or +1 when the operands aren't equal.
+---------------

Not necessary.  One change should do it (untested, I have neither the space
nor the time to bring up Perl on ncoast):

	sub bynum { substr($a,$[+1,10) - substr($b,$[+1,10); }

The -1/+1 idea is interesting, however.  (Another doohickey for the Swiss Army
Chainsaw?  ;-)

++Brandon
-- 
Me: Brandon S. Allbery			    VHF/UHF: KB8JRR/KT on 220, 2m, 440
Internet: allbery@NCoast.ORG		    Delphi: ALLBERY
uunet!usenet.ins.cwru.edu!ncoast!allbery    America OnLine: KB8JRR

flee@guardian.cs.psu.edu (Felix Lee) (09/02/90)

>I just had the weirdest thought.  The ne and != operators should maybe
>return -1 or +1 when the operands aren't equal.

Call it something else, like "cmp" and "<=>" and I'll take it.  I've
had vague yearnings for a comparison operator for a long time.
--
Felix Lee	flee@cs.psu.edu
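
With operators spelled `cmp` and `<=>`, as they exist in current Perl, a two-key sort routine can be sketched like this. The hashes and their contents are invented for illustration. Since `||` returns the first nonzero result, the second key is consulted only when the first compares equal.

```perl
use strict;
use warnings;

# Invented data: a string key and a float key for each name.
my %key1 = (ann => 'red', bob => 'blue', cat => 'red', dan => 'blue');
my %size = (ann => 2.5,   bob => 7,      cat => 1.25,  dan => 0.5);

# Primary key by string order, ties broken by numeric order.
my @sorted = sort {
    $key1{$a} cmp $key1{$b} || $size{$a} <=> $size{$b}
} keys %key1;

print "@sorted\n";
```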

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (09/02/90)

In article <Fra5m=?1@cs.psu.edu> flee@guardian.cs.psu.edu (Felix Lee) writes:
: >I just had the weirdest thought.  The ne and != operators should maybe
: >return -1 or +1 when the operands aren't equal.
: 
: Call it something else, like "cmp" and "<=>" and I'll take it.  I've
: had vague yearnings for a comparison operator for a long time.

I like those, much though I loathe 3-character operators.  They feel right
and are easy to remember.

While we're redesigning the language, I've been considering another problem.
Subroutines right now have no way to determine what package they were
called from, which makes it difficult to install variables into the correct
package.  In addition, the debugger needs to know the file and line number
a routine was called from.  I propose a function "caller", which does this:

	($package, $file, $line) = caller;

This would be easy to implement--I already keep a pointer to the current
statement, and the current statement contains this info.  It would merely
entail making the saved current statement pointer available to the
subroutine.
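
A minimal sketch of how this looks in current Perl, where `caller` exists with this three-element return in list context. The package name `Reporter` and the sub name `where_from` are made up for the example.

```perl
use strict;
use warnings;

package Reporter;

# Report where the current subroutine was called from.
sub where_from {
    my ($package, $file, $line) = caller;
    return ($package, $file, $line);
}

package main;

my @info = Reporter::where_from();
print "called from package $info[0], file $info[1], line $info[2]\n";
```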

Other items sneaking in at the last moment.  Filetests -M, -A and -C will
return the file's age in days (possibly fractional) at the time the script
started.  This will make it much easier to write middle-of-the-night
skulkers.

The tr/// function now has modifiers c, d and s.  c complements the searchlist,
d deletes any characters in searchlist WITHOUT a replacement in replacementlist,
and s squashes multiple contiguous occurrences of replacementlist characters
to one occurrence.
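
A short sketch of the three modifiers as they behave in current Perl, with made-up input strings:

```perl
use strict;
use warnings;

# c: complement the searchlist, so everything that is NOT a-z is replaced.
(my $c = "hello, world") =~ tr/a-z/*/c;

# d: delete searchlist characters that have no partner in the replacement list.
(my $d = "abcde") =~ tr/a-e/AB/d;

# s: squash runs of the same replacement character down to one.
(my $s = "aabbccdd") =~ tr/a-c/x-z/s;

print "$c\n$d\n$s\n";
```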

Chip Salzenberg sent me a complete patch to add System V IPC (msg, sem and
shm calls), so I added them.  If that bothers you, you can always undefine
them in config.sh.  :-)

Lessee...  Oh yeah.  There's a scalar() pseudo-function call that merely
supplies a scalar context in the middle of a list.  I know you can do it
by saying OP . "", but it's better documentation and more efficient.  So
you can say things like

	local($nextline) = scalar(<STDIN>);

I don't see a need for an operator to supply an array context.
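
A sketch of what scalar() buys you, written for current Perl. An in-memory filehandle stands in for STDIN here so the example runs unattended; that substitution is an assumption of the sketch, not part of the proposal. Without scalar(), the list assignment would read every remaining line and keep only the first.

```perl
use strict;
use warnings;

# In-memory filehandle in place of STDIN.
my $text = "first\nsecond\nthird\n";
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

# The parentheses give list context; scalar() makes <$in> read one line only.
my ($nextline) = scalar(<$in>);

# The remaining lines are still unread.
my @rest = <$in>;
close $in;

print "next: $nextline";
print "left: ", scalar(@rest), " lines\n";
```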

I'm also supplying sysread and syswrite as direct hooks to the read
and write system calls, for those times when you just have to get past
standard I/O.

Some of these aren't even implemented yet, but I know I can get them done
by the time the book comes out...  :-)

Holler quick if any of this seems like a tragic mistake.  I already listened
to you on <=> and cmp.  I'm not unreasonable *all* the time.

Larry

jand@kuling.UUCP (Jan Djärv) (09/02/90)

Larry Wall writes:

> I just had the weirdest thought.  The ne and != operators should maybe
> return -1 or +1 when the operands aren't equal.
>
No, no, please don't do that. It would be very confusing if != and
! == didn't mean the same thing. It would probably break old programs as well.

Besides, such a feature would only be useful in sort sub's (right ?) and
it isn't that hard to write them correctly.

	Jan D.

tneff@bfmny0.BFM.COM (Tom Neff) (09/04/90)

In article <9384@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>Other items sneaking in at the last moment.  Filetests -M, -A and -C will
>return the file's age in days (possibly fractional) at the time the script
>started.  This will make it much easier to write middle-of-the-night
>skulkers.

Why when the script started?  Why not their age NOW?  That way you could
not only write middle-of-the-night skulkers, but usable Perl daemons to
check for things at intervals.

If the objection is to repeated time() calls, I suggest the tradeoff is
well worth it.

If the objection is nonetheless sustained, how about making the "$NOW"
variable used by -[MAC] modifiable by the programmer.

-- 
"Just the fax, ma'am."    o..oo    Tom Neff
    -- John McClane       .oo..    tneff@bfmny0.BFM.COM

worley@compass.com (Dale Worley) (09/04/90)

   X-Name: Felix Lee

   Call it something else, like "cmp" and "<=>" and I'll take it.  I've
   had vague yearnings for a comparison operator for a long time.

Yeah, that'd be both neat and useful.

Dale Worley		Compass, Inc.			worley@compass.com
--
I try to make everyone's day a little more surreal.

schwartz@groucho.cs.psu.edu (Scott Schwartz) (09/05/90)

Dale Worley writes:

      Call it something else, like "cmp" and "<=>" and I'll take it.  I've
      had vague yearnings for a comparison operator for a long time.

   Yeah, that'd be both neat and useful.


I sent a note to Larry suggesting this, but here it is again...
Why not "@" for cmp?  i.e.

	$a @ $b 

asks where $a is at in relation to $b.  Because the @ is whitespace
delimited and is looking for strings, it doesn't clash with the @array
token.

tchrist@convex.COM (Tom Christiansen) (09/06/90)

In article <9384@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>Other items sneaking in at the last moment.  Filetests -M, -A and -C will
>return the file's age in days (possibly fractional) at the time the script
>started.  This will make it much easier to write middle-of-the-night
>skulkers.

The "at the time of the script" part bothers me a bit; I'd MUCH rather it
were their "instantaneous" age.

When I first saw those operators, I expected them to return a time I could
pass to &ctime(), but instead they give me age.  If I want the real time,
I guess I could use (stat(FILE))[9] for mtime and so on.  Certainly these
could be easily implemented as subroutines; I'd venture to guess, Larry,
that you consider them common enough operations to make them builtins.

Speaking of such things, I've found myself writing code like this:

    if ((stat($tmp))[9] <= (stat($orig))[9]) {

And wishing I could use something like this instead

    if ($tmp -nt $orig) 

where "-nt" is a binary operator that returns whether the
first operand (either a FILE or a $file) is younger than 
the second.  

The Korn shell has these three interesting built-in tests:

 file1 -nt file2
      True if file1 exists and is newer than file2.
 file1 -ot file2
      True if file1 exists and is older than file2.
 file1 -ef file2
      True if file1 and file2 exist and refer to the same file.

I would guess the first two compare mtimes and the last 
one compares (dev,ino) pairs.

At least the first two seem common enough to be operators.

--tom

urlichs@smurf.sub.org (Matthias Urlichs) (09/06/90)

In comp.lang.perl, article <15819@bfmny0.BFM.COM>,
  tneff@bfmny0.BFM.COM (Tom Neff) writes:
< In article <9384@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
< >Other items sneaking in at the last moment.  Filetests -M, -A and -C will
< >return the file's age in days (possibly fractional) at the time the script
< >started.  This will make it much easier to write middle-of-the-night
< >skulkers.
< 
< Why when the script started?  Why not their age NOW?  That way you could
< not only write middle-of-the-night skulkers, but usable Perl daemons to
< check for things at intervals.
< 
On the other hand, other useable Perl daemons might get started once a
day/night/week, and may do some lengthy tasks (like analyzing log files, which
is kind of what Perl seems to be meant for originally (other than writing an
RN replacement of course ;-) )) before looking at the date of the next file.
In that case, the script will presumably be started at the same time every
[interval] and testing the age of files will not leave any "windows" which
could conceivably skew the statistics.

Anyway, it's trivial to convert between one way and the other via the time
function and a division by 24*60*60...

< If the objection is nonetheless sustained, how about making the "$NOW"
< variable used by -[MAC] modifiable by the programmer.
< 
...or by saying "$NOW = time", of course, if Larry decides to implement it
that way.
-- 
Matthias Urlichs -- urlichs@smurf.sub.org -- urlichs@smurf.ira.uka.de
Humboldtstrasse 7 - 7500 Karlsruhe 1 - FRG -- +49+721+621127(Voice)/621227(PEP)
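
The conversion between the two readings can be sketched as follows for current Perl, where `-M` is measured from the script start time held in `$^T`. The temporary file exists only to give the example something to stat.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so there is something to examine.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "x\n";
close $fh;

# Age in days relative to script start (what -M reports).
my $age_at_start = -M $path;

# Age in days as of this instant, from the modification time.
my $mtime   = (stat $path)[9];
my $age_now = (time - $mtime) / (24 * 60 * 60);

printf "at start: %.6f days, now: %.6f days\n", $age_at_start, $age_now;
```

A long-running daemon that prefers the second reading can also assign `$^T = time` before each round of tests, which moves the reference point that `-M`, `-A` and `-C` use.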

jmc@eagle.inesc.pt (Miguel Casteleiro) (01/31/91)

In article <1991Jan30.195436.8645@agate.berkeley.edu> raymond@math.berkeley.edu (Raymond Chen) writes:
<In article <1991Jan30.181924.47@eagle.inesc.pt>, jmc@eagle (Miguel Casteleiro) writes:
<>[T]he sorting order will be (at least for the portuguese):
<>
<>a A B b c d e f ...
<>
<>If there is an easy way to do this, please let me know.  
<
<# This is a standard trick.
<
<# You only need to do this part once.
<$portuguese_order = "aABbcdef";

This won't work for me as it will simply redefine a new ascii order.
What I need is a sort that will "see" 'a', 'A' and 'B' with the same
sorting value if the words are different (not counting these characters),
and will "see" 'a' < 'A' < 'B' when the words are equal (again, not
counting these characters).

<$ascii_order = 
<   pack("c" . length($portuguese_order), 1 .. length($portuguese_order));
<eval 'sub port2sort 
<      { foreach (@_) {  tr/'.$portuguese_order.'/'.$ascii_order.'/; } }';
<eval 'sub sort2port
<      { foreach (@_) {  tr/'.$ascii_order.'/'.$portuguese_order.'/; } }';
<
<# and here's how you use it:
<
<@words = ("a", "A", "B", "c", "b");

If @words = ("ao", "A", "B", "c", "b");
the sort will be wrong.
The correct sort should be: "A" "B" "ao" "b" "c"

<&port2sort(@words);		# convert to intermediate format
<@sorted_words = sort @words;	# sort the intermediate format
<&sort2port(@sorted_words);	# convert back
<
<print join(":", @sorted_words);
<
< [ A fun 'Japh' example deleted ]
--
                                                                      __
 Miguel Casteleiro at                                            __  ///
 INESC, Lisboa, Portugal.        "News: so many articles,        \\\/// Only
 Email: jmc@eagle.inesc.pt        so little time..."              \XX/ Amiga
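
One way to read the requirement is as a two-pass comparison: first compare the words with 'a', 'A' and 'B' treated as the same letter, then break ties using the full a < A < B order. The sketch below, for current Perl, is built on that reading. The full alphabet in `$order` and the `%class` table are assumptions extended from the fragment quoted above, and the helper names are hypothetical.

```perl
use strict;
use warnings;

# Assumed collation sequence, extended from the fragment in the post.
my $order = 'aABbcdefghijklmnopqrstuvwxyz';
my %rank;
@rank{ split //, $order } = (0 .. length($order) - 1);

# Assumption: 'a', 'A' and 'B' count as one letter in the first pass.
my %class = (A => 'a', B => 'a');

# Encode a word as rank bytes so that plain cmp follows $order.
# Only characters present in $order are handled in this sketch.
sub encode {
    my ($word) = @_;
    return join '', map { chr($rank{$_}) } split //, $word;
}

# First-pass key: fold the equivalent letters together, then encode.
sub primary {
    my ($word) = @_;
    my $folded = join '',
        map { exists $class{$_} ? $class{$_} : $_ } split //, $word;
    return encode($folded);
}

my @words  = ("ao", "A", "B", "c", "b");
my @sorted = sort {
    primary($a) cmp primary($b) || encode($a) cmp encode($b)
} @words;

print join(":", @sorted), "\n";
```

On the sample list this gives the order asked for in the post: A, B, ao, b, c.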