[comp.unix.questions] Shell Programming Question - sort

marwood@ncs.dnd.ca (Gordon Marwood) (03/03/90)

I am trying to sort (using "sort" in Ultrix) based on the last two
characters in a line (which are numeric).  There are a variable number
of characters in a line, and these last two characters are preceded by a
space, there are also a variable number of spaces in a line, so the
number of fields will be variable if space is used as the field
separator.  None of my available texts gives me a clue as to whether a
sort can be done based on the last field in a line, regardless of the
number of fields in the line.  Is there any "sort" option that can 
do this ?

Gordon Marwood
Internet: marwood@ncs.dnd.ca

maart@cs.vu.nl (Maarten Litmaath) (03/03/90)

In article <751@ncs.dnd.ca>,
	marwood@ncs.dnd.ca (Gordon Marwood) writes:
)I am trying to sort (using "sort" in Ultrix) based on the last two
)characters in a line (which are numeric).  There are a variable number
)of characters in a line, and these last two characters are preceded by a
)space, [...]

sed 's/\(.*\) \(..\)$/\2 \1/' | sort | sed 's/\(..\) \(.*\)/\2 \1/'
--
  "Belfast: a sentimental journey to the Dark Ages - Crusades & Witchburning
  - Europe's Lebanon - Book Now!" | maart@cs.vu.nl,  uunet!mcsun!botter!maart

emv@math.lsa.umich.edu (Edward Vielmetti) (03/03/90)

In article <751@ncs.dnd.ca> marwood@ncs.dnd.ca (Gordon Marwood) writes:

   I am trying to sort (using "sort" in Ultrix) based on the last two
   characters in a line (which are numeric).  There are a variable number
   of characters in a line, and these last two characters are preceded by a
   space, there are also a variable number of spaces in a line, so the
   number of fields will be variable if space is used as the field
   separator.  None of my available texts gives me a clue as to whether a
   sort can be done based on the last field in a line, regardless of the
   number of fields in the line.  Is there any "sort" option that can 
   do this ?

Here's how I'd do it in perl:

perl -e '@all=<>;sub l {substr($a,-3,2)-substr($b,-3,2);}; print sort l @all;'

If I had to do it without perl, I'd use awk to copy the last two 
characters on the line to the beginning, sort on those, and then
sed them out.  Glad I don't have to do that anymore.
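
A minimal sketch of that awk/sort/sed route, assuming every line ends in
a space followed by exactly two digits, as in the original question. The
function name sort_by_last_two is made up for illustration.

```shell
# sort_by_last_two: hypothetical helper, not from the thread.
# Reads the named files (or stdin) and writes the sorted lines to stdout.
sort_by_last_two() {
    # 1. awk:  prepend the last two characters plus a space to each line
    # 2. sort: order numerically on that leading two-digit key
    # 3. sed:  strip the three characters added in step 1
    awk '{ print substr($0, length($0) - 1) " " $0 }' "$@" |
        sort -n |
        sed 's/^.. //'
}

printf '%s\n' 'pear and plum 12' 'fig 03' 'a b c d 21' | sort_by_last_two
```

On that sample input the lines come out in the order "fig 03",
"pear and plum 12", "a b c d 21", regardless of how many fields each
line has.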

Just ano...no, you don't want to hear that,

--Ed

Edward Vielmetti, U of Michigan math dept.

brad@SSD.CSD.HARRIS.COM (Brad Appleton) (03/07/90)

Sorry to post this to the net, but I couldn't successfully send mail!

In article <751@ncs.dnd.ca> Gordon Marwood writes:
>I am trying to sort (using "sort" in Ultrix) based on the last two
>characters in a line (which are numeric).  There are a variable number
>of characters in a line, and these last two characters are preceded by a
>space, there are also a variable number of spaces in a line, so the
>number of fields will be variable if space is used as the field
>separator.  None of my available texts gives me a clue as to whether a
>sort can be done based on the last field in a line, regardless of the
>number of fields in the line.  Is there any "sort" option that can 
>do this ?
>
>Gordon Marwood
>Internet: marwood@ncs.dnd.ca

If I knew what you were trying to achieve (with sort) I might be able
to provide better assistance! All I can suggest is:

1) see if you can get to the last field using $NF in awk

2) (probably better than #1 ...)

I believe that BSD Unix should have a command called "reverse" which
reverses the characters in each line of a file. Run your file through
reverse, then sort, then back through reverse (this may not work 
depending upon how fancy a sort you need to do).

	reverse file | sort [options] | reverse [-] > outfile

Just in case you don't have reverse... It is easy to write! The following
did just fine for me on Xenix. It is not a superlatively written piece of
code, but it performs a simple job, and it works, on Xenix anyway :-)

------------------cut here----------------------cut here----------------
/**
* reverse.c -- C source to reverse the characters in the lines of one
*              or more files.
*
*  NAME
*      reverse -- reverse the characters in each line of input
*
*  SYNOPSIS
*      reverse  [-]|[ filename ... ]
*
*  DESCRIPTION
*      Reverse will reverse each line of input and print the resultant line on
*      the standard output. If "-" is given as a filename, then input is 
*      taken from the standard input. Actually, "-" may be listed as one
*      of several filenames and, at that time, stdin will be used for input
*      (after the previous files) and then will continue with the remaining
*      files. I have not tried this out however!
*
*  Created Mar '89 by Brad Appleton
*
*  Mar 8 '90, Brad Appleton -- made some minor additions of #defines and
*                              subroutines in order to not require my own
*                              personal .h files
**/

#ifndef TRUE
#define  TRUE  1
#define  FALSE 0
#endif

#define  USE_STDIN "-"
#define  LINE_LEN  512		/* make this as big as you need */

#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>   /* strlen(), strcmp() */

/* ckopen -- open file; check for success */
FILE *ckopen( const char *filename, const char *filemode )
{
   FILE *fp;

   if ( (fp = fopen( filename, filemode )) == NULL ) {
     fprintf( stderr, "reverse: unable to open %s\n", filename );
     exit( 2 );
   }/* if */

   return  (fp);
}/* ckopen() */


/* reverse -- reverse the chars in a string (but not the newline) */
void  reverse( char *str )
{
  char  hold;
  int   i, j, append = FALSE, len = (int) strlen( str );

  if ( len > 0  &&  str[ len -1 ] == '\n' )  {
    str[ --len ] = '\0';
    append = TRUE;
  } /* if new-line */

  for ( i = 0, j = (len -1 ) ; i <= j ; i++, j-- )  {
      /* swap( str[i], str[j] ) */
    hold   = str[i];
    str[i] = str[j];
    str[j] = hold;
  }/* for */

  if ( append )    str[ len ] = '\n';

}/* reverse() */


int main( int argc, char *argv[] )
{
  int  i;
  char line[ LINE_LEN ];
  FILE *infile;

  if ( argc == 1 )  {  /* print usage and exit */
    fprintf( stderr, "usage:  reverse  [-]|[filename ...]\n" );
    exit( 1 );
  }/* if no args */

  /* process each file in the order given on the command line */
  for ( i = 1 ; i < argc ; i++ )  {
    if ( ! strcmp( argv[i], USE_STDIN ) )
      infile = stdin;
    else
      infile = ckopen( argv[i], "r" );

    while ( fgets( line, LINE_LEN, infile ) != NULL )  {
      reverse( line );
      fputs( line, stdout );
    }/* while */

    if ( infile != stdin )  fclose( infile );   /* don't leak file handles */
  }/* for each arg */

  exit( 0 );
}/* main */
----------------finish cut---------------------finish cut-----------------------

+=-=-=-=-=-=-=-=-= "... and miles to go before I sleep." -=-=-=-=-=-=-=-=-=-+
|  Brad Appleton                       |  Harris Computer Systems Division  |
|                                      |  2101  West  Cypress  Creek  Road  |
|      brad@ssd.csd.harris.com         |  Fort  Lauderdale, FL  33309  USA  |
|     ... {uunet | novavax}!hcx1!brad  |  MailStop 161      (305) 973-5007  |
+=-=-=-=-=-=-=-=- DISCLAIMER: I said it, not my company! -=-=-=-=-=-=-=-=-=-+

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (03/08/90)

In article <3190@hcx1.SSD.CSD.HARRIS.COM> brad@SSD.CSD.HARRIS.COM (Brad Appleton) writes:
: I believe that BSD Unix should have a command called "reverse" which
: reverses the characters in each line of a file. Run your file through
: reverse, then sort, then back through reverse (this may not work 
: depending upon how fancy a sort you need to do).
: 
: 	reverse file | sort [options] | reverse [-] > outfile

Kinda hard to sort on a numeric field that way tho, which I think he
wanted...
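
To see why, here is a small sketch of the reverse/sort/reverse idea on
three sample lines. reverse_lines is a hypothetical awk stand-in for the
"reverse" program, and reverse_sort is likewise a made-up name; the plain
(non-numeric) sort is forced into the C locale so the result is
predictable.

```shell
# reverse_lines: hypothetical stand-in for "reverse", written in awk.
# Reverses the characters of each input line.
reverse_lines() {
    awk '{ s = ""
           for (i = length($0); i > 0; i--) s = s substr($0, i, 1)
           print s }' "$@"
}

# reverse_sort: reverse each line, sort as plain text, reverse back.
reverse_sort() {
    reverse_lines "$@" | LC_ALL=C sort | reverse_lines
}

printf '%s\n' 'a 12' 'b 03' 'c 21' | reverse_sort
```

The reversed lines are "21 a", "30 b" and "12 c", so the sort compares
the units digit before the tens digit and the result comes back as
"c 21", "a 12", "b 03", which is not numeric order.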

: Just in case you dont have reverse... It is easy to write!

It's even easier than a 97 line C program:

    perl -ne 'chop; print reverse "\n", split(//);'

But probably not as efficient.  Especially if you include the time to
install perl.   :-)

Larry Wall
lwall@jpl-devvax.jpl.nasa.gov

justin@reed.bitnet (the specific heat) (03/10/90)

Gordon Marwood says:
>I am trying to sort (using "sort" in Ultrix) based on the last two
>characters in a line (which are numeric).  There are a variable number
>of characters in a line, and these last two characters are preceded by a
>space, there are also a variable number of spaces in a line, so the
>number of fields will be variable if space is used as the field
>separator.  None of my available texts gives me a clue as to whether a
>sort can be done based on the last field in a line, regardless of the
>number of fields in the line.  Is there any "sort" option that can 
>do this ?

The way to do this using standard unix tools which comes most quickly to my
mind is:

$ awk '{ print $NF " " $0}' filename | sort -n | sed 's/[^ ]* //'

The first command prepends the last field to the line, and the last undoes
that.  The output goes to stdout, so redirect it to a file if you want to keep
it.  Brad Appleton suggests using reverse(1STAT) which has, on our machine, a
-f (reverse by fields) option.  So you'd do

$ reverse -f <filename | sort -n | reverse -f

I tried this and got output with the words separated by tabs where they were
originally separated by spaces.  Larry Wall suggests, as always, the perl
solution:

$ perl -ne 'chop; print reverse "\n", split(//);'

I couldn't get this to work, no matter where I put the filename.  I may have
a defective version of perl.  The chief advantage of the first solution is that
it uses well-documented tools which will be present on any unix box you'll ever
use.
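
As a rough illustration, the awk | sort | sed pipeline from the top of
this post can be wrapped in a function and its output sent to a file.
The function name sort_by_last_field and the scratch file names are
assumptions made for this sketch.

```shell
# sort_by_last_field: hypothetical wrapper around the awk | sort | sed
# pipeline; reads the named files (or stdin) and writes to stdout.
sort_by_last_field() {
    awk '{ print $NF " " $0 }' "$@" |   # prepend the last field as a key
        sort -n |                       # numeric sort on that key
        sed 's/[^ ]* //'                # drop the key again
}

# Sample data in a scratch file (path chosen only for illustration).
tmp=${TMPDIR:-/tmp}/lastfield.$$
printf '%s\n' 'one two 40' 'three 07' 'four five six 15' > "$tmp"

# Write the result to a second file rather than back onto the input.
sort_by_last_field "$tmp" > "$tmp.sorted"
cat "$tmp.sorted"
rm -f "$tmp" "$tmp.sorted"
```

This prints "three 07", "four five six 15", "one two 40".  The result is
written to a separate file because redirecting onto the input file would
truncate it before awk gets to read it.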
--
                         JUSTIN@REED.BITNET (or) [tektronix,ogicse]!reed!justin
   Member HASA, [/bin/]sed (sOCIETY OF emacs dEVOTEES), Church of the SubGenius
  (in BOB we trust), ROCOCO, the Illuminati and any other absurdities i can get
                                    my grubby paws on (suggestions appreciated)