rock@warp.Eng.Sun.COM (Bill Petro) (05/15/91)
How could I do this using either the sort command, or perl or awk? I have the following data in this format - I want to sort it on the right most column. The columns are not tab delimited, and some of the columns have arbitrary numbers of words (specifically the third column from the left, column c). Sort assumes that you start numbering on the left, and count from 0. How would you start with the right most column, column e? 0 1 2 3 4 a b c c e FOO BAR ACE CORPORATION SUNNYVALE 2.00 FOO BAR ACER COMPUTED COMPANY MILPITAS 20.00 FOO BAR APOLLO COMPUTER, INC. CHELMSFORD 1.00 FOO BAR APPLE COMPUTER, INC. CUPERTINO 8.00 FOO BAR BOEING TUKWILA 53.00 FOO BAR BOEING COMPUTER SERVICES EDDYSTONE 2.00 FOO BAR CITIBANK N. A. ANDOVER 4.00 FOO BAR CITIBANK NORTH AMERICA LONG ISLAND CITY 26.00 Thanks! -- Bill Petro {decwrl,hplabs,ucbvax}!sun!Eng!rock "UNIX for the sake of the kingdom of heaven" Matthew 19:12
tchrist@convex.COM (Tom Christiansen) (05/15/91)
From the keyboard of rock@warp.Eng.Sun.COM (Bill Petro): :How could I do this using either the sort command, or perl or awk? : :I have the following data in this format - I want to sort it on the :right most column. The columns are not tab delimited, and some of the :columns have arbitrary numbers of words (specifically the third column :from the left, column c). Sort assumes that you start numbering on the :left, and count from 0. How would you start with the right most :column, column e? : : FOO BAR ACE CORPORATION SUNNYVALE 2.00 : FOO BAR ACER COMPUTED COMPANY MILPITAS 20.00 How about? sort -t^A +0.60n Assumptions: no control-A's in data the sort should be numeric the numbers don't begin until column 61 --tom -- Tom Christiansen tchrist@convex.com convex!tchrist "So much mail, so little time."
felps@convex.com (Robert Felps) (05/15/91)
In <13320@exodus.Eng.Sun.COM> rock@warp.Eng.Sun.COM (Bill Petro) writes: >How could I do this using either the sort command, or perl or awk? >I have the following data in this format - I want to sort it on the >right most column. The columns are not tab delimited, and some of the >columns have arbitrary numbers of words (specifically the third column >from the left, column c). Sort assumes that you start numbering on the >left, and count from 0. How would you start with the right most >column, column e? > 0 1 2 3 4 > a b c c e > FOO BAR ACE CORPORATION SUNNYVALE 2.00 > FOO BAR ACER COMPUTED COMPANY MILPITAS 20.00 > FOO BAR APOLLO COMPUTER, INC. CHELMSFORD 1.00 > FOO BAR APPLE COMPUTER, INC. CUPERTINO 8.00 > FOO BAR BOEING TUKWILA 53.00 > FOO BAR BOEING COMPUTER SERVICES EDDYSTONE 2.00 > FOO BAR CITIBANK N. A. ANDOVER 4.00 > FOO BAR CITIBANK NORTH AMERICA LONG ISLAND CITY 26.00 I don't know of any way to sort based on right to left. But you might try $ sort +0.60bn -0.70bn data Which sorts columns 60 thru 70 ignoring blanks and using numeric values to sort on. >Thanks! >-- > Bill Petro {decwrl,hplabs,ucbvax}!sun!Eng!rock >"UNIX for the sake of the kingdom of heaven" Matthew 19:12 Hope it helps, Robert Felps I do not speak for felps@convex.com Convex Computer Corp Convex and I seldom Product Specialist 3000 Waterview Parkway speak for myself. Tech. Assistant Ctr Richardson, Tx. 75080 VMS? What's that? 1(800) 952-0379
tchrist@convex.COM (Tom Christiansen) (05/15/91)
From the keyboard of felps@convex.com (Robert Felps): :I don't know of any way to sort based on right to left. But you might try You reverse all the fields, sort on first field, then reverse them again. This might screw up fixed column stuff though. --tom -- Tom Christiansen tchrist@convex.com convex!tchrist "So much mail, so little time."
ziegast@eng.umd.edu (Eric W. Ziegast) (05/15/91)
Off the top of my head I did: #!/usr/bin/perl # Read in data from STDIN (or specified files) in to @lines @lines = (<>); # Subroutine to numerically compare last space-separated fields. sub rightmost { local(@a) = split(" ",$a); local(@b) = split(" ",$b); $a[$#a] <=> $b[$#b]; } foreach (sort rightmost @lines) { print; } After feeding it the following as standard input: FOO BAR ACE CORPORATION SUNNYVALE 2.00 FOO BAR ACER COMPUTED COMPANY MILPITAS 20.00 FOO BAR APOLLO COMPUTER, INC. CHELMSFORD 1.00 FOO BAR APPLE COMPUTER, INC. CUPERTINO 8.00 FOO BAR BOEING TUKWILA 53.00 FOO BAR BOEING COMPUTER SERVICES EDDYSTONE 2.00 FOO BAR CITIBANK N. A. ANDOVER 4.00 FOO BAR CITIBANK NORTH AMERICA LONG ISLAND CITY 26.00 I got back: FOO BAR APOLLO COMPUTER, INC. CHELMSFORD 1.00 FOO BAR ACE CORPORATION SUNNYVALE 2.00 FOO BAR BOEING COMPUTER SERVICES EDDYSTONE 2.00 FOO BAR CITIBANK N. A. ANDOVER 4.00 FOO BAR APPLE COMPUTER, INC. CUPERTINO 8.00 FOO BAR ACER COMPUTED COMPANY MILPITAS 20.00 FOO BAR CITIBANK NORTH AMERICA LONG ISLAND CITY 26.00 FOO BAR BOEING TUKWILA 53.00 There's probably a better way of doing this (isn't there always?), but this is probably the most direct. -- +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+ | Eric W. Ziegast Internet: ziegast@eng.umd.edu | | Univ. of Merryland Phonenet: Eric@[301.405.3689] | +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
merlyn@iwarp.intel.com (Randal L. Schwartz) (05/15/91)
In article <13320@exodus.Eng.Sun.COM>, rock@warp (Bill Petro) writes: | I have the following data in this format - I want to sort it on the | right most column. The columns are not tab delimited, and some of the | columns have arbitrary numbers of words (specifically the third column | from the left, column c). Sort assumes that you start numbering on the | left, and count from 0. How would you start with the right most | column, column e? | | 0 1 2 3 4 | | a b c c e | | FOO BAR ACE CORPORATION SUNNYVALE 2.00 | FOO BAR ACER COMPUTED COMPANY MILPITAS 20.00 | FOO BAR APOLLO COMPUTER, INC. CHELMSFORD 1.00 | FOO BAR APPLE COMPUTER, INC. CUPERTINO 8.00 | FOO BAR BOEING TUKWILA 53.00 | FOO BAR BOEING COMPUTER SERVICES EDDYSTONE 2.00 | FOO BAR CITIBANK N. A. ANDOVER 4.00 | FOO BAR CITIBANK NORTH AMERICA LONG ISLAND CITY 26.00 Well, here's a terribly inefficient one, good enough for small amounts of data: ################################################## #!/usr/bin/perl sub bylastcol { @a = split(/\s+/, $a); @b = split(/\s+/, $b); pop(@a) <=> pop(@b); } print sort bylastcol <>; ################################################## This is inefficient because the "last column" is computed and recomputed over and over again on each compare. A better way would be to compute it once and cache it: ################################################## #!/usr/bin/perl sub once { return $n if defined($n = $once{$_[0]}); @a = split(/\s+/, $_[0]); $once{$_[0]} = pop(@a); } sub bylastcol { &once($a) <=> once($b); } print sort bylastcol <>; ################################################## An even better way requires the cooperation of the calling routine to make a normal array, rather than an associative array, as in: ################################################## #!/usr/bin/perl sub byaux { $aux[$a] <=> $aux[$b]; } @data = <>; for (@data) { @a = split(/\s+/, $_); push(@aux, pop(@a)); } print @data[sort byaux $[..$#data]; ################################################## print "Just another Perl hacker," -- /=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\ | on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III | | merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn | \=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/
suhre@ondine.dsd.trw.com (Maurice E. Suhre) (05/17/91)
In article <13320@exodus.Eng.Sun.COM> rock@warp.Eng.Sun.COM (Bill Petro) writes: > >How could I do this using either the sort command, or perl or awk? > >I have the following data in this format - I want to sort it on the >right most column. [Most of data file deleted for brevity] > FOO BAR CITIBANK N. A. ANDOVER 4.00 > FOO BAR CITIBANK NORTH AMERICA LONG ISLAND CITY 26.00 > I haven't seen an awk solution yet, so here's one. The strategy is to generate a line which has the last field duplicated in front and uses a different field separator. I selected the "plus" sign. There needs to be some character which is not found in the data file. #! /bin/sh # awk '{print $NF "+" $0}' | sort -n -t+ +0 -1 ,t1 | awk -F+ '{ print $2}' -- Maurice Suhre suhre@trwrb.dsd.trw.com
sherman@unx.sas.com (Chris Sherman) (05/18/91)
In <13320@exodus.Eng.Sun.COM> rock@warp.Eng.Sun.COM (Bill Petro) writes: >How could I do this using either the sort command, or perl or awk? >I have the following data in this format - I want to sort it on the >right most column. > 0 1 2 3 4 > a b c c e > FOO BAR ACE CORPORATION SUNNYVALE 2.00 > FOO BAR ACER COMPUTED COMPANY MILPITAS 20.00 I hate to be a spoil sport, but how about from the unix prompt: % sort -n +.60b Maybe you could do something like this inside perl... I'm not sure how though, and still keep it really fast. -- Chris Sherman .................... sherman@unx.sas.com | ,-----------------------------------------' / Q: How many IBM CPU's does it take to execute a job? | A: Four; three to hold it down, and one to rip its head off.
jpr@jpradley.jpr.com (Jean-Pierre Radley) (05/23/91)
>From the keyboard of felps@convex.com (Robert Felps): >:I don't know of any way to sort based on right to left. But you might try > With awk, $NF is the last field. So pipe the file through awk, prepending the last field ($NF) to the whole line ($0), separated by something wierd, say ^B. Pipe this to sort, then pipe to a sed to ditch characters up to and including the ^B. As I think of it, the first awk could be a sed filter too. Jean-Pierre Radley Unix in NYC jpr@jpr.com jpradley!jpr CIS: 72160,1341
dattier@vpnet.chi.il.us (David W. Tamkin) (05/24/91)
jpr@jpradley.jpr.com (Jean-Pierre Radley) wrote in <1991May22.191522.19709@jpradley.jpr.com>: | >From the keyboard of felps@convex.com (Robert Felps): | >:I don't know of any way to sort based on right to left. But you might try | With awk, $NF is the last field. So pipe the file through awk, prepending | the last field ($NF) to the whole line ($0), separated by something wierd, | say ^B. Pipe this to sort, then pipe to a sed to ditch characters up to and | including the ^B. | As I think of it, the first awk could be a sed filter too. You know, if you're looking to replace the original file with a sorted one, you could do the whole thing in ex. Let's say your field separator is a space: ex - file << \,, %s/.* \(.*\)/\1 &/ %!sort %s/[^ ]* // wq ,, David Tamkin PO Box 7002 Des Plaines IL 60018-7002 dattier@vpnet.chi.il.us GEnie:D.W.TAMKIN CIS:73720,1570 MCIMail:426-1818 708 518 6769 312 693 0591 "Parker Lewis Can't Lose" mailing list: flamingo-request@esd.sgi.com (relay) flamingo-request@ddsw1.mcs.com (digest)