[alt.sources] Undigestifier for C News

tale@cs.rpi.edu (Dave Lawrence) (12/04/89)

In <2427@taux01.UUCP> Amos Shapir <amos@taux01.UUCP> offered the
following undigestifying script for reposting component articles
locally:
 
> #!/bin/sh
> #Undigestify digests and post individual articles
> TMP=/tmp/undig$$
> trap "rm $TMP" 0 1 2 15
> cat $* > $TMP
> 
> ed - $TMP << 'END'
> g/^Subject: /?^$?c\
> End-Of-Article\
> /usr/lib/news/inews -h << 'End-Of-Article'\
> Distribution: local\
> Approved: netnews\
> .
> g,/usr/lib/news/inews ,/^Newsgroups: /t.
> ?^End-Of-Article?+1;$d
> 1;/^End-Of-Article/d
> w
> q
> END
> sh $TMP

Well, as anyone who has C News can tell you, using inews is painful
for our poor machines, especially if we are going to cram a score of
sub-articles from comp.sys.mac.digest through it in a loop.  So I
reworked it from scratch for people who want to burst digests with
their C News system.
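
For readers unfamiliar with the batch format mentioned in the README
below: each article in an rnews batch is preceded by a "#! rnews N"
line, where N is the byte count of the article that follows.  The
following is only a minimal sketch of that framing, not the code in the
archive; wrap_article and the sample article are hypothetical and exist
purely for illustration.

```shell
#!/bin/sh
# Sketch only: frame one article as an rnews batch entry.
# wrap_article is a hypothetical helper, not part of undig.
wrap_article() {
    # $1 = file holding one complete article (headers, blank line, body)
    # wc may pad its output with blanks; the arithmetic expansion
    # reduces it to a bare number.
    size=$(( $(wc -c < "$1") ))
    printf '#! rnews %d\n' "$size"
    cat "$1"
}

art=/tmp/art$$
trap 'rm -f "$art"' 0
# A made-up article: two headers, the blank separator line, one body line.
printf 'Subject: test\nFrom: someone@example.com\n\nHello.\n' > "$art"
wrap_article "$art"
```

Several entries framed this way can simply be concatenated into one
batch file, which is what lets a whole digest be handed over in one go
instead of invoking inews once per sub-article.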

[ Yes, I know some newsreaders, such as NN, can burst digests
themselves and give a sub-article all the flexibility of a regular
article: marking it, killing it, replying directly, etc.
For the readers that can't, or don't do it quite right, this is a nice
thing to have.  (GNUS, for example, will use RMAIL-summary to burst
digests, but it isn't well-integrated with the rest of GNUS.) ]

Dave

#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  README undig undig.awk
# Wrapped by usenet@rpi on Mon Dec  4 01:02:53 1989
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'README' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'README'\"
else
echo shar: Extracting \"'README'\" \(3281 characters\)
sed "s/^X//" >'README' <<'END_OF_FILE'
XSome simple code to burst digests into their component articles and
Xrepost them locally.
X
XWARNING: Don't use this if you forward all distributions. People
Xare obviously forwarding "Distribution: local" postings or the newgroup
Xfor alt.swedish.chef.bork.bork.bork would never have made it to all of
Xthe sites that saw it.  Your neighbours will probably hate you if you
Xforward the burst digests to them without prior consent.
X
XTo use:
X
X	1. edit undig, the shell script, to point to your (g|n)awk and
X	   to where you are going to put the awk script.  V7 awk won't do;
X           at least, not on our Sun system[1].
X
X	2. edit undig.awk, the awk script, and change the variables 'intro',
X	   'pathhost' and 'subjform' if you want.  If you change the newline
X	   count in the intro string, change 'nlinintro' to be the correct
X	   count.
X
X	2a. To provide an alternate distribution mechanism, if you really
X	   want to pass local postings, you can change another variable
X           in the BEGIN block in the awk script.  Make it something like
X	   "Distribution: undigest\n" and then make sure you don't
X	   feed the undigest distribution class.
X
X	3. edit your news sys file and add a line like this:
X
X	undig:comp.risks,comp.sys.mac.digest,comp.ai.nlang-know-rep,\
X	      comp.ai.vision/all,!local::/usenet/bin.server/etc/undig
X
X        4. Now just put the sh script where you indicated in the sys file and
X 	   the awk script where you indicated in the sh script.
X
XHow it works:
X
X	It takes a digest article and breaks it apart on the article
X	separators (lines of hyphens).  When it gets to the end of the
X	headers for a sub-article, various headers are substituted from
X	the parent article (eg, Newsgroups:) while others are retained
X	(eg, From: and Subject:). When it gets to the end of the sub-
X	article, indicated by the hyphen line, it prepends an rnews batch
X	header line and a Lines: count for the article.  All of the
X	articles are dumped into one batch this way and left in the
X	in.coming spool space to be sucked up by relaynews[2].
X
X	WARNING: Don't use this on comp.simulation.  Inadequate care is
X	taken in that digest to distinguish between article separators and
X	arbitrary lines of hyphens.
X
XI'd cover this with the GPL, but that's overkill for this.  It wasn't
Xall that much effort.  It is disclaimed of any warranty and you are
Xfree to modify and redistribute it as you want.
X
XDave Lawrence   <tale@rpi.edu>
XNovember 1989
X
X[1] When I first wrote it, it used no new awk features.  GAWK could handle
Xit fine then; /bin/awk on a Sun3/OS 4.0.3 had all sorts of weird reactions
Xto it.  Compiled it internally fine, but processing the digest gave it
Xfunky problems.  The results would vary, too, depending on whether I
Xredirected output through a pipe or just let it spew out the batch to the
Xterminal (where I could post-view it in my xterm history).
X
X[2] Originally, it just piped the batch straight to relaynews.  Hah! 
XWhat a boner.  The sys entry was called by the relaynews that brought in
Xthe digest and it waited around until the process it called returned, so
Xit could be sure everything happened as it should.  In the meantime, the
Xrelaynews that got called by the script waited for a lock, which the
Xformer relaynews was holding.  Instant comatose relay system.
END_OF_FILE
if test 3281 -ne `wc -c <'README'`; then
    echo shar: \"'README'\" unpacked with wrong size!
fi
# end of 'README'
fi
if test -f 'undig' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'undig'\"
else
echo shar: Extracting \"'undig'\" \(705 characters\)
sed "s/^X//" >'undig' <<'END_OF_FILE'
X#! /bin/sh
X# undig -- burst digests into component articles for local posting.
X# Written by Dave Lawrence <tale@rpi.edu>
X
X# =()<. ${NEWSCONFIG-@<NEWSCONFIG>@}>()=
X. ${NEWSCONFIG-/usenet/bin.server/config}
XPATH=$NEWSCTL/bin:$NEWSBIN/relay:$NEWSBIN:$NEWSPATH ; export PATH
X
X# v7 awk won't do.  need [gn]awk.
Xawk=/usr/local/bin/gawk
Xscript=$NEWSCTL/awk/undigest
X
Xwbatch=${NEWSARTS}/in.coming/undig$$batch
Xnbatch=${NEWSARTS}/in.coming/0$$     # should be safe enough to avoid collision
X
Xintr="rm -f $wbatch $nbatch; echo 'Uh oh ... trouble bursting a digest.' | mail $NEWSMASTER; exit 1"
Xtrap "$intr" 0 1 2 3 15
X
Xcat | $awk -f $script > $wbatch
X
Xif [ -s $wbatch ]; then
X  mv $wbatch $nbatch
Xfi
X
Xtrap 0
Xexit 0
END_OF_FILE
if test 705 -ne `wc -c <'undig'`; then
    echo shar: \"'undig'\" unpacked with wrong size!
fi
chmod +x 'undig'
# end of 'undig'
fi
if test -f 'undig.awk' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'undig.awk'\"
else
echo shar: Extracting \"'undig.awk'\" \(4130 characters\)
sed "s/^X//" >'undig.awk' <<'END_OF_FILE'
X#! /usr/local/bin/gawk -f
X# A [gn]awk script to burst digests into batches for relaynews
X# Abandoned attempt at making it V7 compatible because of stupid bugs in V7 awk
X# Dave Lawrence <tale@turing.cs.rpi.edu>  -- Nov 1989
X
XBEGIN {
X  # line to add to beginning of Table Of Contents article (optional)
X  intro     = "\n[ The preceding digest has been locally divided into its component articles. ]\n";
X  nlinintro = 2; # newlines in intro, so line count is right.
X  # name of pseudo-host to add to Path: (optional)
X  pathhost  = "rpiburst";
X  # Subject format for articles which have none.  %s will be parent subject.
X  subjform  = "Subject: (None -- partial burst of %s)\n";
X  boa       = 1; # non-zero if not reached body (leading blank lines+headers)
X  inheaders = 1; # non-zero if end of headers not reached for article
X  seq       = 0; # article sequence number, for unique message-id
X  dist      = "Distribution: local\n"; # sending this would be very antisocial
X}
X
XNF == 0 {
X  if (inheaders) {
X    inheaders = 0; boa = 0;
X    line = -1;			                 # so line count ends up right
X    umsgid = "Message-ID: <" seq "-" msgid "\n"; # (hopefully) unique id
X    # make sure sub-articles have Subject:s
X    subject = ((seq && !header[seq,"Subject:"]) ? \
X                   sprintf(subjform, substr(header[0,"Subject:"], 10,
X                                     length(header[0,"Subject:"])- 9)) : "");
X    size += length(umsgid) + length(subject) + length(dist);
X    article = article umsgid subject dist;
X    if (seq)
X      for (ind in header) {
X        split(ind,part,SUBSEP);
X        if (part[1] == 0 && !((seq,part[2]) in header)) {
X	  # Carry over any of the original headers, excepting those that
X	  # were (chronologically) purposefully excluded later (physically)
X          # usually this leave Approved:, Newsgroups: and Path:
X          size += length(header[0,part[2]]) + 1;
X	  article = article header[0,part[2]] "\n";
X	}
X      }
X    if (!seq) {  # this must be the Table Of Contents article
X      # add our explanation of what's happened to the digest
X      article = article intro;
X      size = size + length(intro);
X      line += nlinintro;
X    }
X  }
X  if (!boa) { # it's just a blank line as part of an article
X    article = article "\n";
X    size++; line++;
X  }
X  next;
X}
X
X/^----------------------------------------------------------------+$|^------------------------------$/ {
X#/^------------------------------+$/ {
X  lines=sprintf("Lines: %d\n",line);
X  size += length(lines);
X  printf("#! rnews %d\n%s%s", size, lines, article);
X  size = 0; article = ""; boa = 1; # reset everything for next article
X  seq++;
X  next;
X}
X
Xboa == 0 { # regular article lines being added
X  line++;
X  size += length + 1;
X  article = article $0 "\n";
X  next;
X}
X
X{ inheaders = 1; # if we're here, we must be in the headers
X  # print all headers except Lines: & Message-ID:, but turn >From: into From:
X  # some digests also include "Newsgroups:" occasionally in component
X  # articles, so make sure we use the original line for all of them.
X  # Also nuke Distribution: so it doesn't futz with the whole thing
X  # and rebroadcast the split digest to everyone.
X  if ( $1 == "Lines:" || $1 == "Distribution:" ||
X      ( $1 == "Newsgroups:" && seq )) next;
X  if ( $1 == "Message-ID:" ) {
X    if (!seq) msgid = substr($2,2,length($2)-1); 
X    next;
X  }
X  # change Path:, but we don't need to.  "rpi!rpi!..." looks weird though. :-)
X  if ( $1 == "Path:" )
X    if (!seq) $2 = pathhost "!" $2; else next;
X  # So helpful of mail to quote this for us.
X  if ( $1 == ">From:" ) $1 = "From:";
X  article = article $0 "\n";
X  size += length + 1;
X  # headers not to carry over to burst out articles
X  # Reply-To: usually list request address
X  # Sender:   usually "daemon" or "news"
X  # Organization: organization ("The Internet") not generally same
X  # Path:      IcK.  Use the path for the original article
X  # Lines:, Message-ID: and Newsgroups: are wrong and have already been
X  #   zapped
X  if ( $1 == "Reply-To:" || $1 == "Sender:" || $1 == "Organization:" ) {
X    if (seq) header[seq,$1] = $0;
X  } else
X    header[seq,$1] = $0;
X  next;
X}
END_OF_FILE
if test 4130 -ne `wc -c <'undig.awk'`; then
    echo shar: \"'undig.awk'\" unpacked with wrong size!
fi
chmod +x 'undig.awk'
# end of 'undig.awk'
fi
echo shar: End of shell archive.
exit 0
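
Before adding a digest group to the sys file, it may help to check how
many pieces a saved digest would be burst into.  The sketch below is not
part of the archive: count_parts is a hypothetical helper, and its idea
of a separator (a line consisting solely of 30 or more hyphens) is a
simplifying assumption; the pattern in undig.awk is more specific.
Since undig.awk emits one article each time it reaches a separator, the
count includes the Table Of Contents piece.

```shell
#!/bin/sh
# Sketch only: count the pieces a digest would be burst into.
# count_parts is a hypothetical helper, not part of the archive.
count_parts() {
    # Assumption: a separator is a line made up solely of hyphens,
    # at least 30 of them.  Shorter runs of hyphens are ignored.
    awk '
        /^-+$/ && length($0) >= 30 { parts++ }
        END { print parts + 0 }
    ' "$@"
}

sep=$(printf '%30s' '' | tr ' ' '-')      # build a 30-hyphen line
digest=/tmp/digest$$
trap 'rm -f "$digest"' 0
{
    # A made-up digest: contents, one sub-article, closing trailer.
    printf 'Subject: Sample Digest\n\nToday: one item\n'
    printf '%s\n' "$sep"
    printf 'Subject: one item\n\nA short rule:\n-----\nmore text\n'
    printf '%s\n' "$sep"
    printf 'End of Sample Digest\n'
} > "$digest"
count_parts "$digest"
```

A count that looks too high for a given digest is a hint that it uses
hyphen lines for something other than separating articles, which is the
problem the README warns about for comp.simulation.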