[comp.unix.questions] sorting

luke@mtung.ATT.COM (S. Luke Jones) (08/03/90)

Here's a brain teaser for you UNIX tool-and-shell gurus out there.

My news-reader presents newsgroups in the order they appear in
my .newsrc file.  How can I sort that file by subject?  (I don't want to do
it by hand because it seems like there are two or three new newsgroups a
day.)
If I just sort (no options) my .newsrc I get something like
	comp.lang.c++:
	comp.zillions.of.other.things:
	comp.std.c++:
	sci.random.stuff:
	rec.c++:
	talk.bizarre:
	talk.politics.oop.c++:
and so forth.  The articles on (C++) are not clumped together the way
I want them.  But because the c++ appears at a different depth in the
name of each newsgroup,
	sort -t. +5 -6 +4 -5 +3 -4 +2 -3 +1 -2 .newsrc > puke
won't work either.  What am I missing?  Surely there's got to be an
easy way to do this.  Suggestions?
-- 
Luke Jones,   luke@mtung.att.com,  ...!att!mtung!luke,   phone 201/957-2733
Notworked Software Laboratory, Computer Systems Division, Bell Laboratories
Disclaimer: the opinions are mine but I had to sign away my rights to them.
Quote:  "System Test? That's for *idiots*!" (paraphrasing Insp. H. Calahan)

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/05/90)

In article <2943@mtung.ATT.COM> luke@mtung.ATT.COM (S. Luke Jones) writes:
>Surely there's got to be an easy way to do this.

Surely you jest.  You're asking for an artificial intelligence program here.

montnaro@spyder.crd.ge.com (Skip Montanaro) (08/05/90)

In article <13485@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:

   In article <2943@mtung.ATT.COM> luke@mtung.ATT.COM (S. Luke Jones) writes:
   >Surely there's got to be an easy way to do this.

   Surely you jest.  You're asking for an artificial intelligence program here.

I don't know. The following short shell script seems to do what Luke wanted
(cluster all the C++ groups together, regardless of their spot in the Usenet
tree).

Skip (montanaro@crdgw1.ge.com)

#!/bin/sh

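newsrc=$1

# Pass any "options" lines through unchanged, ahead of the sorted groups.
egrep '^options' "$newsrc"

# Turn dots into tabs, reverse the fields, sort, then undo both steps.
egrep -v '^options' "$newsrc" | \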
sed -e 's/\./	/g' -e 's/! /!/' -e 's/: /:/' | \
awk '{for (i=NF; i>0; i--) printf("%s\t", $i); printf("\n"); }' | \
sort | \
awk '{for (i=NF; i>0; i--) printf("%s\t", $i); printf("\n"); }' | \
sed -e 's/	/./g' -e 's/!/! /' -e 's/:/: /' | \
sed -e 's/\.$//'

--
Skip (montanaro@crdgw1.ge.com)

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/06/90)

In article <MONTNARO.90Aug5102632@spyder.crd.ge.com> montanaro@crdgw1.ge.com (Skip Montanaro) writes:
>In article <13485@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>   Surely you jest.  You're asking for an artificial intelligence program here.
>I don't know. The following short shell script seems to do what Luke wanted
>(cluster all the C++ groups together, regardless of their spot in the Usenet
>tree).

It only orders by the last member of the hierarchy, which is not very helpful
(as you can see by running a large .newsrc through this process).  For example,
foo.c++.bugs would be collected with other *.bugs, not with other c++ groups.
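
A small sketch of the behaviour Doug describes, for anyone who wants to see it on a handful of names. The function name reverse_sort and the four sample groups are made up for illustration; it sorts bare group names (not full .newsrc lines) by their components in reverse order, which is the idea behind the script above rather than a copy of it.

```shell
# Hypothetical helper (not from the thread): sort newsgroup names read from
# standard input by their dot-separated components taken in reverse order.
reverse_sort() {
    awk -F. '{
        key = $NF
        for (i = NF - 1; i >= 1; i--)
            key = key "." $i
        print key "\t" $0      # reversed name, a tab, then the original name
    }' |
    LC_ALL=C sort |            # byte-wise sort on the reversed name
    cut -f2-                   # drop the key, keep the original name
}

printf '%s\n' comp.lang.c++ foo.c++.bugs comp.sys.bugs comp.std.c++ | reverse_sort
```

With these four names the output is foo.c++.bugs, comp.sys.bugs, comp.lang.c++, comp.std.c++: the two *.bugs groups end up together and foo.c++.bugs is separated from the groups whose last component is c++.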

jon@jonlab.UUCP (Jon H. LaBadie) (08/11/90)

In article <13492@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <MONTNARO.90Aug5102632@spyder.crd.ge.com> montanaro@crdgw1.ge.com (Skip Montanaro) writes:
> >In article <13485@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> >   Surely you jest.  You're asking for an artificial intelligence program here.
> >I don't know. The following short shell script seems to do what Luke wanted
> >(cluster all the C++ groups together, regardless of their spot in the Usenet
> >tree).
> 
> It only orders by the last member of the hierarchy, which is not very helpful
> (as you can see by running a large .newsrc through this process).  For example,
> foo.c++.bugs would be collected with other *.bugs, not with other c++ groups.

While I see little value in the original poster's desires, Doug's comment
on Skip's solution is easily taken care of without AI-type code.  To wit:

	cut -d: -f1 .newsrc |

	awk -F. '
	{
		for (i = NF; i >= 1; i--)
			printf("%s ", $i)
		print ""
	}' |

	paste -d: - .newsrc |

	sort |

	cut -d: -f2- > puke

And should you not have cut (a Xenix-related deficiency) let me know
and I'll provide an awk/shell script to mimic it.  Or perhaps there
is a ???.sources version around.

Jon

-- 
Jon LaBadie
{att, princeton, bcr}!jonlab!jon
{att, attmail, bcr}!auxnj!jon

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (08/19/90)

In article <13492@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <MONTNARO.90Aug5102632@spyder.crd.ge.com> montanaro@crdgw1.ge.com (Skip Montanaro) writes:
> > In article <13485@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> > > Surely you jest.  You're asking for an artificial intelligence program here.
> > I don't know. The following short shell script seems to do what Luke wanted
    [ sorts by last part of newsgroup name ]
> foo.c++.bugs would be collected with other *.bugs, not with other c++ groups.

Hmmm. Perhaps remove the tail from *.misc, then collect foo.bar.* under
foo.bar whenever the latter exists. Does this do the trick?

---Dan
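
One possible reading of Dan's suggestion, sketched in sh and awk. Everything here is an assumption rather than something posted in the thread: the name cluster_sort, the choice to work on bare group names, and the rule that a group is filed under its shortest ancestor that is itself in the list (after a trailing .misc is dropped). The cluster heads are then ordered by reversed components, as in the earlier scripts.

```shell
# Hypothetical helper (not from the thread): cluster newsgroup names read from
# standard input under an existing ancestor group, then sort the clusters by
# the reversed components of that ancestor.
cluster_sort() {
    awk '
    {
        name[NR] = $0
        stem[NR] = $0
        sub(/\.misc$/, "", stem[NR])    # foo.bar.misc counts as foo.bar
        exists[stem[NR]] = 1
    }
    END {
        for (n = 1; n <= NR; n++) {
            # Find the shortest prefix of this name that is itself a group.
            k = split(stem[n], part, ".")
            head = ""
            for (i = 1; i <= k; i++) {
                head = (i == 1) ? part[i] : head "." part[i]
                if (head in exists)
                    break
            }
            # Build the sort key from the cluster head, components reversed.
            m = split(head, h, ".")
            key = h[m]
            for (i = m - 1; i >= 1; i--)
                key = key "." h[i]
            print key "\t" name[n]
        }
    }' |
    LC_ALL=C sort |     # order by cluster key, then by full name
    cut -f2-            # drop the key
}

printf '%s\n' foo.c++.bugs comp.sys.bugs foo.c++.misc comp.std.c++ | cluster_sort
```

On this sample the output is comp.sys.bugs, foo.c++.bugs, foo.c++.misc, comp.std.c++, so foo.c++.bugs now sits with the c++ groups. Without a foo.c++ or foo.c++.misc group in the list it would fall back to sorting under bugs, which is the limitation Doug pointed out.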

wmark@wb3ffv.ampr.org (Mark Winsor) (04/13/91)

I have two files, in no particular order, that have numbers in common.
Is there any way to sort the data in one file into the order of the other?
One of these files can't be sorted because it is a pointer file and needs
the record offsets to stay the same.

Ex:

	 File 1         File 2         File 2 as I need it
	 ------         ------         -------------------
	 100            500            100
	 500            100            500
	 500            300            300
	 500
	 300
	 300

Anybody have any ideas? (awk, shell, perl (which I'll get if I have to), or
                        even a C algorithm)

Thanks,

Mark S. Winsor
ProVAR, Inc.
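
A minimal awk sketch of one way to approach this, under some assumptions that the post does not confirm: the number is the first whitespace-separated field of each line, each number occurs once in file 2, and every number in file 2 also occurs in file 1. The function name reorder_like is made up. File 1 is only read, so its record offsets are untouched.

```shell
# Hypothetical helper (not from the thread): print the lines of file 2 in the
# order in which their keys first appear in file 1.
# Usage: reorder_like file1 file2
reorder_like() {
    awk '
    NF == 0 { next }                        # ignore blank lines
    NR == FNR { line[$1] = $0; next }       # first input (file 2): remember each line by key
    ($1 in line) && !seen[$1]++ {           # second input (file 1): first sighting of a key
        print line[$1]
    }' "$2" "$1"
}
```

Something like `reorder_like file1 file2 > file2.new` would then write the reordered copy to a new file (the file names are placeholders). Keys that appear only in file 2 are silently dropped, so a real script would probably want to check for that.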