[comp.binaries.ibm.pc.d] Script to use with combine

bwwilson@lion.waterloo.edu (Bruce Wilson) (03/17/89)

Hi,
Here's a little script I find very handy for grabbing stuff from
c.b.i.p.  Just cut it out, put it somewhere in your path and 
make it executable (chmod +x).  

-----Start of apply-----
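#! /bin/sh
command=$1
shift
# Loop while arguments remain.  $1 is left unquoted on the next line so
# that a quoted pattern such as 'moria*' is expanded here.
while test $# -gt 0
do ${command} $1
shift
done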
-----End of apply-----

Here are some examples of its use:

apply combine *.uue              /* runs combine on all the .uue files */
apply 'public arc -t' *.arc      /* test all the arc files */
apply combine 'moria*' 'dis*'    /* runs combine on moria* then dis* */
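
On a shell that has functions, the same idea can be sketched as below.
This is only an illustration: apply_each is a made-up name, and echo
stands in for combine or arc so that the sketch runs anywhere.

```sh
#!/bin/sh
# apply_each CMD ARG...: run CMD once for each remaining argument.
# (apply_each is a hypothetical name used only for this sketch.)
apply_each() {
    cmd=$1
    shift
    for arg in "$@"
    do
        # $cmd and $arg are deliberately unquoted: a multi-word command
        # is split into words, and a quoted pattern is expanded here.
        $cmd $arg
    done
}

apply_each echo one two three
apply_each 'echo checking' a b
```

The second call shows why the command is left unquoted: 'echo checking'
arrives as one argument and is split into a command plus its first
argument only when it is run.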

And for those who don't know the combine script:

-----cut here(beg)-----
#! /bin/sh
cat $* | sed '/^END/,/^BEGIN/d' | uudecode
-----cut here(end)-----
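
To see what the sed range in combine does, here is a small sketch on
made-up input.  It assumes each posted part wraps its uuencoded data
between lines starting with BEGIN and END, as the sed pattern implies.
join_parts is a hypothetical name, and uudecode is left off so the
joined text can be inspected.

```sh
#!/bin/sh
# join_parts [FILE...]: concatenate the parts (or read stdin) and delete
# everything from each END marker through the next BEGIN marker.
join_parts() {
    cat "$@" | sed '/^END/,/^BEGIN/d'
}

# Two invented parts, concatenated the way "cat $*" would deliver them.
printf '%s\n' \
    'Subject: demo part 1/2' 'BEGIN--cut here' 'begin 644 demo' 'AAAA' \
    'END--cut here' 'signature one' \
    'Subject: demo part 2/2' 'BEGIN--cut here' 'BBBB' 'end' \
    'END--cut here' 'signature two' | join_parts
```

The lines before the first BEGIN survive, which is harmless because
uudecode skips ahead to the "begin" line.  The last END has no BEGIN
after it, so sed deletes from there to the end of the input.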

Have fun,
bruce

bruce Wilson               | "what you don't spend you don't have to earn..."
bwwilson@lion.waterloo.edu |    from Blake (a film by Bill Mason)

arwillms@crocus.waterloo.edu (Allan Willms) (03/21/89)

In article <12424@watdragon.waterloo.edu> bwwilson@lion.waterloo.edu (Bruce Wilson) writes:
>Here's a little script I find very handy for grabbing stuff from
>c.b.i.p.  Just cut it out, put it somewhere in your path and 
>make it executable (chmod +x).  
[apply script deleted]
>Here are some examples of it's use:
>apply combine *.uue              /* runs combine on all the .uue files */
>apply 'public arc -t' *.arc      /* test all the arc files */
>apply combine 'moria*' 'dis*'    /* runs combine on moria* then dis* */

This will do the same thing:
--
#!/bin/csh
foreach file ( ${argv[2-]} )
$1 $file
end
--
I usually just do it from the command line since it isn't that much to
type.  For example:

foreach file (*.uue)
combine $file
end

or

foreach file (*.zoo)
zoo v $file
end

etc...

Of course, this assumes you are using csh.

dhesi@bsu-cs.UUCP (Rahul Dhesi) (03/22/89)

In article <12536@watdragon.waterloo.edu> arwillms@crocus.waterloo.edu (Allan
Willms) writes:
>foreach file (*.zoo)
>zoo v $file
>end

Try:

     zoo Lv *.zoo
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi
                    ARPA:  dhesi@bsu-cs.bsu.edu

rusty@cadnetix.COM (Rusty) (03/23/89)

In article <12424@watdragon.waterloo.edu> bwwilson@lion.waterloo.edu (Bruce Wilson) writes:
>Hi,
>Here's a little script I find very handy for grabbing stuff from
>c.b.i.p.  Just cut it out, put it somewhere in your path and ...

(Before we begin, I apologize for posting 12k of stuff here, but
I HOPE that it will be useful to folks.  Once I get it totally 
finished (?) I plan to sharc it and submit it to all the source
groups I currently deal with.  There are some assumptions about
how multiple archives are done, but these scripts have worked
on comp/binaries/ibm/pc and comp/sources/x, apparently just fine.)

Well, what with all the talking about scripts and such, I have decided
that it is time to post the stuff I've been working on.  It is in 2 
'modules', if you will, one which automatically grabs the articles
from our news server every night (previously posted, but it's small
enough I've included it here), and the bigger one which will take
multiple shar and uuencoded archives and unshar or uudecode them.

Currently the scripts do not handle single-file archives, but I'm
working on it.  (Actually, it is a pretty simple change to the 
scripts, but I just finished these last night and wanted to post it
while I'm thinking about it.)

These scripts are hereby unleashed upon the world.  You may not sell
them (If you find somebody stupid enough to pay money for these, 
I want to meet them!), you may not call them your own.  (And, 
considering the hacki-ness of these things, I probably should 
not let you call them mine! :-).  But, in all cases, 'shar' and
enjoy (pun intended, I think).  I disclaim all responsibility
for your results.  Do NOT use these on newsgroups where you have any
feeling that a virus may be propagated, as the un-shar portion does
not do any security checking.  

Well, enough of that.  Here it is.  First, since I don't have a 
shar-generator, it's in poor-man's shar (sorry about that! anybody
have a shar-generator?).

One other thing this is missing: it currently does not keep a copy of
the 'header' for the first posting.  That feature will also show
up when I add single-file handling.

I keep this in ~/bin/news.un-news-er:
#! /bin/csh
# this script is a test script to be run in a subdir in which you wish to
# convert news articles into their uudecoded or unshar'd (as appropriate)
# pieces.  Output goes into a directory called 'unpacked', compressed copies
# of the original articles go into 'unpacked/backup'.  A list of all 
# subject lines is maintained in 'unpacked/subjects', and a list of 
# sorted subject lines is kept in 'unpacked/subjects.sorted'.
#   '.subjects' and '.subjects.sorted' are lists for currently-
# not-unpacked files.

# first, let's only deal with the newsgroups which use volume numbers, which
# currently are:
#	comp/sources/games
#	comp/sources/misc
#	comp/sources/x
#	comp/binaries/ibm/pc
# this list will be stored in ~/.news.autodearc  (actually, I am only
#   doing comp/binaries/ibm/pc at the moment)

# algorithm is as follows:
#   for all articles in the directory
#	get all subjects into 'subjects' and
#			 then sort them into 'subjects.sorted'
#	for all subject lines with the word 'part' near 
#			the end of the line and
#	   <number1>/<number2> follows the word 'part' and number1 == number2
#		find first file which belongs to this set 
#		find out what sort of archive it is.
#		if shar archive, then
#			handle multiple-file shar archive
#		else if uuencode archive then
#			handle multiple-file uuencode archive
#	end for
#	for all subject lines without the word 'part' near the end of the line
#		If it is a shar archive then
#			handle single-file shar archive
#		else if it is uuencode archive then
#			handle single-file uuencode archive
#		else append subject line to file to mail to person 
#
# scriptdir is where to find the scripts for this stuff.  There is 
#	probably a way to find out where we were run from and use 
#	THAT path, but I don't know how.  Anybody care to enlighten me?
#set echo ; set verbose
set SCRIPTDIR = '~/bin'
set MAILFILE = '~/news.autodearc.results'
set subdirlist = `egrep -v '(^#)' ~/.news.autodearc`
echo results of automatic de-mailing: >$MAILFILE
foreach subdir ($subdirlist)
    pushd $subdir
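# move aside anything with these names that exists but is not a directory
    if( -e .COMPRESSED.STUFF && ! -d .COMPRESSED.STUFF ) mv .COMPRESSED.STUFF COMPRESSED.STUFF.WHATSIT
    if( ! -e .COMPRESSED.STUFF ) mkdir .COMPRESSED.STUFF
    if( -e .TOTAKE && ! -d .TOTAKE ) mv .TOTAKE TOTAKE.WHATSTHIS
    if( ! -e .TOTAKE ) mkdir .TOTAKE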
   set nonomatch ; rm .subjects* ; unset nonomatch
   egrep '(^Subject)' * |  sort -f -3.0 > .subjects.sorted
   awk -f $SCRIPTDIR/news.unnews.awk1 .subjects.sorted > .subjects.todo
   set SUBLIST = `awk -F: '{print $1}' .subjects.todo`
# the above awk script returns the list of files which have 'part x/x' in them
   set SETLIST = ''
   foreach SET ($SUBLIST)
      set ZZZ = "(^$SET)"
      set LINE = `egrep $ZZZ .subjects.todo`
      set ARCNAME = `echo $LINE | awk -F: '{print $3}'`
      @ NFILES   = `echo $LINE | awk '{print $(NF-1)}'`
      @ ENDLINE = `echo $LINE | awk '{print $NF}'`
      @ STARTLINE = $ENDLINE + 1 - $NFILES 
#WARNING - the following 2 lines should only be ONE line.
      set FILELIST = `awk "NR<=$ENDLINE {print}" .subjects.sorted | \
		awk "NR>=$STARTLINE {print}" | awk -F: '{print $1}'`
# see if this is a uuencoded mess, below returns true if it is
      set FIRSTFILE = `echo $FILELIST | awk '{print $1}'`
      if ( "` egrep '(^begin *[0-9]+)' $FIRSTFILE`" != '' ) then
	 echo "uuencoded data: $LINE" >> $MAILFILE
	 pushd .TOTAKE
#	 $SCRIPTDIR/news.unnews.multiple.uuencode $FILELIST
	 if ( -e .trashit. ) rm .trashit. 
	 foreach SHARC ($FILELIST)
	    cat ../$SHARC |sed '/^BEGIN/,/^END/d' >>.trashit.
	 end
	 uudecode .trashit. && rm .trashit.
	 popd
	 mv $FILELIST .COMPRESSED.STUFF 
#	   it could be argued that compressing uuencoded stuff is useless.
#	   It probably is a waste of time, but let's do it anywho.
	 pushd .COMPRESSED.STUFF ; compress $FILELIST ; popd
# see if it is a shar archive
      else if ("`egrep '(^#! *\/bin)' $FIRSTFILE`" != '' ) then
	 echo "shar archive  : $LINE" >> $MAILFILE
	 mkdir .TOTAKE/$ARCNAME
	 pushd .TOTAKE/$ARCNAME
	 foreach SHARC ($FILELIST)
#WARNING - the following 2 lines should only be ONE line.
	    cat ../../$SHARC |awk '/(^#! *\/bin)/,/(^exit)/ {print}' |\
			sh >> unshar.report
	 end
	 popd
	 mv $FILELIST .COMPRESSED.STUFF 
	 pushd .COMPRESSED.STUFF ; compress $FILELIST ; popd
      else
	 echo "not recognized: $LINE" >> $MAILFILE
      endif
    end
    popd
end
echo END of results >> $MAILFILE
cat $MAILFILE | mail  $USER

END of news.un-news-er
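
The STARTLINE/ENDLINE arithmetic above relies on the parts of one set
sitting on consecutive lines of .subjects.sorted, so a set of NFILES
parts ending at line ENDLINE starts at line ENDLINE + 1 - NFILES.  A
rough sh sketch of that step follows.  part_files is a hypothetical
name, sed -n stands in for the pair of awk filters, and the sample
subject lines are invented.

```sh
#!/bin/sh
# part_files NFILES ENDLINE: read sorted "file:Subject: ..." lines on
# stdin and print the file names of the set that ends at line ENDLINE.
part_files() {
    nfiles=$1
    endline=$2
    startline=$(( endline + 1 - nfiles ))
    # keep only lines startline..endline, then take the file name field
    sed -n "${startline},${endline}p" | awk -F: '{print $1}'
}

printf '%s\n' \
    '10:Subject: v02i009: bar' \
    '11:Subject: v02i010: foo, part 1/2' \
    '12:Subject: v02i011: foo, part 2/2' | part_files 2 3
```

With a two-part set ending on line 3, the range is lines 2 through 3,
so the sketch prints the article names 11 and 12.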

Now, here is the news.unnews.awk1 file:
# this awk script will return only those lines which have 'part' near the
# end of the line and which also have 2 numbers following the 'part' 
# (separated by either 'of' or '/') where those 2 numbers are equal.
# It also adds to the end of the line  the number of files making up
# the set and the line number this line appears in the file.
/(part|Part|PART) *[0-9]+\/[0-9]+ *$/ {  
	# match Part<num>/<num>$,  now get last number
	#STRNG = "/[0-9]*"
	# TEST1 = match( $NF, STRNG )
	I = length ( $NF )
	STRNG = $NF
	while ( substr(STRNG,I,1) != "/" ) I--
	LASTNUM = substr( $NF, I + 1 )
	#TEST1 = match($NF, "[0-9]*")
	I = 1
	while ( substr(STRNG,I,1) != "/" ) I++
	J = 0
	found = 0
	while (found == 0) {
	   J = J + 1
	   # stop at the first digit of the last field
	   if ( substr(STRNG,J,1) ~ /[0-9]/ ) found = 1
	   }
	FIRSTNUM =substr( $NF, J ,I - J )
	if( FIRSTNUM == LASTNUM )
	   print $0 " " LASTNUM " " NR
	}

/((part|Part|PART) *[0-9]+\/[0-9]+\) *$)/ {  
	# match Part<num>/<num>)$,  now get last number
	#STRNG = "/[0-9]*"
	# TEST1 = match( $NF, STRNG )
	I = length ( $NF ) - 1
	STRNG = $NF
	while ( substr(STRNG,I,1) != "/" ) I--
	LASTNUM = substr( $NF, I + 1, length($NF) - I - 1)
	#TEST1 = match($NF, "[0-9]*")
	I = 1
	while ( substr(STRNG,I,1) != "/" ) I++
	J = 0
	found = 0
	while (found == 0) {
	   J = J + 1
	   # stop at the first digit of the last field
	   if ( substr(STRNG,J,1) ~ /[0-9]/ ) found = 1
	   }
	FIRSTNUM =substr( $NF, J ,I - J )
	if( FIRSTNUM == LASTNUM )
	   print $0 " " LASTNUM " " NR
	}

/(part|Part|PART) +[0-9]+ +of +[0-9]+ *$/ {
        # found Part<num> of <num>$, now get last number
	if( $NF == $(NF-2) )
	   print $0 " " $NF " " NR
	}

END of news.unnews.awk1
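
For comparison, the two slash rules can be sketched more compactly with
awk's match() and split() built-ins.  This assumes a POSIX awk, which
may be newer than the awk the script above was written for.  last_parts
is a hypothetical name, the sample subjects are invented, and the
'part <num> of <num>' form is not covered.

```sh
#!/bin/sh
# last_parts: read subject lines on stdin and print those ending in
# "part N/N" or "part N/N)" with equal numbers, followed by N and the
# input line number.
last_parts() {
    awk '
    /(part|Part|PART) *[0-9]+\/[0-9]+\)? *$/ {
        # locate the trailing "<digits>/<digits>" and split it at "/"
        match($0, /[0-9]+\/[0-9]+\)? *$/)
        split(substr($0, RSTART, RLENGTH), n, "/")
        # adding 0 forces a numeric value and drops a trailing ")"
        if (n[1] + 0 == n[2] + 0)
            print $0 " " (n[2] + 0) " " NR
    }'
}

printf '%s\n' \
    'a:Subject: foo part 1/3' \
    'b:Subject: foo part 3/3' \
    'c:Subject: bar (Part 2/2)' | last_parts
```

One difference from the original: the numbers are compared as numbers
here, so '03/3' would count as a final part.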


Here is my .news.autodearc file:
#comp/sources/x
#comp/sources/games
#comp/sources/unix
comp/binaries/ibm/pc
END of .news.autodearc
(Notice that lines beginning with # are ignored)
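
The comment filtering is a single egrep -v in the script.  A minimal sh
sketch of the same step is below.  list_groups is a hypothetical name,
and skipping blank lines is an addition that the original does not
make.

```sh
#!/bin/sh
# list_groups: read a group list on stdin and print the active entries,
# dropping lines that begin with # and (an addition) blank lines.
list_groups() {
    grep -v '^#' | grep -v '^[[:space:]]*$'
}

printf '%s\n' '#comp/sources/x' '' 'comp/binaries/ibm/pc' | list_groups
```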


that should do it for the automatic de-newser program.

Here is the automatic get-news stuff:

First, we have ~/bin/get.new.sources:
#! /bin/csh
echo Running Get.new.sources > /dev/console
chdir /usr/rusty/News
foreach relpath (`cat /usr/rusty/.news.autoget`)
  echo ====starting  $relpath ===== > /dev/console
  if ( -e $relpath/.oldest ) then
    @ OLDEST = `/bin/cat $relpath/.oldest`
  else
    set OLDEST = `/bin/ls $relpath`
    @ i = $#OLDEST
    if( 0 < $i ) then
       while( -d $relpath/$OLDEST[$i] )
         @ i -= 1
       end
       @ OLDEST = $OLDEST[$i]
    else
       @ OLDEST = 0
    endif
  endif
  echo oldest is $OLDEST > /dev/console
  set MYLIST = `/usr/ucb/rsh cadnetix ls -1 /usr2/spool/news/$relpath`
#  echo /usr2/spool/news/$relpath : >/dev/console
#  echo $MYLIST > /dev/console
#  echo $relpath : >/dev/console
#  /bin/ls $relpath >/dev/console
  foreach file ($MYLIST)
     if (! -d $relpath/$file ) then
       @ fnum = $file
       if ( $fnum > $OLDEST ) then 
         echo copying $relpath/$file > /dev/console
# WARNING - the xx below should be your news server, or whatever you
#		must do to get the files from your news server
         /usr/ucb/rcp -p xx:/usr2/spool/news/$relpath/$file  ~/News/$relpath
         set OLDEST=$file
         echo $OLDEST > $relpath/.oldest
       endif
     endif
  end
  echo Oldest is now $OLDEST > /dev/console
end
echo Get.new.sources is done > /dev/console

END of ~/bin/get.new.sources
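
The heart of the script above is a high-water mark: only articles
numbered above the value stored in .oldest get copied.  A rough sh
sketch of just that test follows.  newer_articles is a hypothetical
name, and where the original skips directories with -d, this sketch
skips any name that is not a plain number.

```sh
#!/bin/sh
# newer_articles OLDEST NAME...: print each NAME that is a plain number
# greater than OLDEST, in the order given.
newer_articles() {
    oldest=$1
    shift
    for name in "$@"
    do
        case $name in
            ''|*[!0-9]*) continue ;;   # not an article number, skip it
        esac
        if [ "$name" -gt "$oldest" ]; then
            echo "$name"
        fi
    done
}

newer_articles 100 99 100 101 .oldest 205
```

Here only 101 and 205 are printed.  The real script would copy each of
them and write the number back to .oldest after every copy.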

Here is ~/bin/get.new.sources.csh:
csh /usr/rusty/bin/get.new.sources
END of ~/bin/get.new.sources.csh
 (The purpose of this is to run get.new.sources under the cshell,
	and to run the entire mess as user RUSTY rather than
	as root, since I use crontab to accomplish the automagic
	portion of this.  See the crontab entries, below.)

Here is ~/bin/news-un-news-er.csh:
csh ~/bin/news.un-news-er
END


Here are my crontab entries (remember, it's run by root!)
0 5 * * * su rusty < /usr/rusty/bin/get.new.sources.csh >/dev/console
0 6 * * * su rusty < /usr/rusty/bin/news-un-news-er.csh >/dev/console
END of crontab entries.


Here is my ~/.news.autoget:
comp/sources/games
comp/sources/misc
comp/sources/unix
comp/sources/x
comp/sources/bugs
alt/sources
comp/binaries/ibm/pc
END of ~/.news.autoget



SO, there it is.  One of these days I will get a shar-maker and
release it the right way.  Probably after I get this thing to do
single files.
-----
Rusty Carruth  UUCP:{uunet,boulder}!cadnetix!rusty  DOMAIN: rusty@cadnetix.com
Cadnetix Corp. (303) 444-8075x241 \  5775 Flatiron Pkwy. \ Boulder, Co 80301
Radio: N7IKQ    'home': P.O.B. 461 \  Lafayette, CO 80026

rusty@cadnetix.COM (Rusty) (03/25/89)

In article <7203@cadnetix.COM> rusty@cadnetix.COM (Rusty) writes:
<<<lots of script and stuff>>>
>	    cat ../$SHARC |sed '/^BEGIN/,/^END/d' >>.trashit.

Oops.  This is NOT the line I was using, I just grabbed
it from someone else's 'combine' script they posted.

Guess what?  That sed script does exactly the opposite of
what you want.  (The sed command appears 2 times, I think,
in the stuff I posted).  Here is what you want that line to
look like (similar changes need to be made to the other line):

   cat ../$SHARC |awk '/(^BEGIN)/,/(^END)/ {print}'| \
	egrep -v '(^BEGIN)|(^END)' >>.trashit.

(this should only be a single line)
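
As a quick check of the corrected line, here is a sketch in sh on
invented input.  between_markers is a hypothetical name, and the marker
lines are assumed to start with BEGIN and END as in the patterns above.

```sh
#!/bin/sh
# between_markers [FILE...]: keep the lines from BEGIN through END, then
# drop the two marker lines themselves.
between_markers() {
    awk '/^BEGIN/,/^END/ {print}' "$@" | grep -E -v '^(BEGIN|END)'
}

printf '%s\n' \
    'Subject: demo part 1/1' 'BEGIN--cut here' 'begin 644 demo' 'AAAA' \
    'END--cut here' 'signature' | between_markers
```

Only the 'begin 644 demo' and 'AAAA' lines come out.  The match is case
sensitive, so the lower-case begin line of the uuencoded data is kept.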

Sorry about that.  That's what I get for not testing it.

The single-file handler (for uuencoded files only) has been
added now, and seems to work fine.  As soon as I get
a shell-archive builder I'll post it to comp.sources.misc.

Rahul, should I send it to you, too?
-----
Rusty Carruth  UUCP:{uunet,boulder}!cadnetix!rusty  DOMAIN: rusty@cadnetix.com
Cadnetix Corp. (303) 444-8075x241 \  5775 Flatiron Pkwy. \ Boulder, Co 80301
Radio: N7IKQ    'home': P.O.B. 461 \  Lafayette, CO 80026