[comp.unix.shell] Parsing a news article with source in it

bill@polygen.uucp (Bill Poitras) (10/10/90)

What I am trying to do is pipe a news article through a shell script
(preferably /bin/sh) so that it parses the header, extracts the filename
from the "Archive: ..." line, and then writes the rest of the file,
from the '!# /bin/sh' line to the end, to the filename specified in the
"Archive: ..." line.  I can't seem to get the shell script to read up to
the !# /bin/sh line using <<.  There are basically two tasks:
	1) Parsing the header, extracting the filename in the Archive line
	2) Writing the rest of the file to the filename gotten from #1

Does such a script exist?  Any help would be greatly appreciated.


+-----------------+---------------------------+-----------------------------+
| Bill Poitras    | Polygen Corporation       | {princeton mit-eddie        |
|     (bill)      | Waltham, MA USA           |  bu sunne}!polygen!bill     |
|                 |                           | bill@polygen.com            |
+-----------------+---------------------------+-----------------------------+

tchrist@convex.COM (Tom Christiansen) (10/12/90)

In article <838@redford.UUCP> bill@redford.UUCP (Bill Poitras(X258)) writes:
>What I am trying to do is pipe a news article through a shell script
>(preferably /bin/sh) so that it parses the header, extracts the filename
>from the "Archive: ..." line, and then writes the rest of the file,
>from the '!# /bin/sh' line to the end, to the filename specified in the
>"Archive: ..." line.  I can't seem to get the shell script to read up to
>the !# /bin/sh using <<.  There are basically two tasks:
>	1) Parsing the header, extracting the filename in the Archive line
>	2) Writing the rest of the file to the filename gotten from #1

>Does such a script exist?  Any help would be greatly appreciated.

It does now. :-)

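#!/usr/bin/perl -n
# Header: from line 1 through the first blank line.
if (1 .. /^$/) {
  # On the Archive: line, reopen STDOUT onto the named file.
  die "can't write to $1: $!" if /^Archive:\s*(\S+)/ && !open(STDOUT, ">$1");
} elsif (/^!#\s*\/bin\/sh/ .. eof) {
  # Body: copy from the '!# /bin/sh' line through end of input.
  print;
}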

I'd actually feel more comfortable if I had some sample input, but
I believe this matches your spec.
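
Since the original request preferred /bin/sh, the same logic can be
sketched as a shell function wrapping awk.  This is a minimal sketch, not
a tested replacement: the function name extract_article is hypothetical,
and accepting "#! /bin/sh" in addition to the "!# /bin/sh" spelling used
in the request is an assumption.  Unlike the perl version, it exits with
an error if no Archive: line was seen in the header.

```shell
#!/bin/sh
# extract_article (hypothetical name): read a news article on stdin,
# take the output file name from the "Archive:" header line, and copy
# everything from the shell line to the end of input into that file.
extract_article() {
    awk '
        # The first blank line ends the header.
        !body && /^$/ { body = 1; next }

        # In the header, remember the first word after "Archive:".
        !body && /^Archive:/ {
            sub(/^Archive:[ \t]*/, "")
            name = $1
            next
        }

        # In the body, start copying at the shell line.  Both spellings
        # are accepted; that is an assumption, not part of the request.
        body && !copying && /^(!#|#!)[ \t]*\/bin\/sh/ {
            if (name == "") {
                print "extract_article: no Archive: line in header" | "cat 1>&2"
                exit 1
            }
            copying = 1
        }

        # Once copying, every line (including the shell line) goes to the file.
        copying { print > name }
    '
}
```

It would be used as "extract_article < article", or as the last stage of
a pipe from the newsreader.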


--tom
--
 "UNIX was never designed to keep people from doing stupid things, because 
  that policy would also keep them from doing clever things." [Doug Gwyn]

emv@math.lsa.umich.edu (Edward Vielmetti) (10/13/90)

In article <107116@convex.convex.com> tchrist@convex.COM (Tom Christiansen) writes:

   In article <838@redford.UUCP> bill@redford.UUCP (Bill Poitras(X258)) writes:
   >What I am trying to do is pipe a news article through a shell script
   >(preferably /bin/sh) so that it parses the header, extracts the filename
   >from the "Archive: ..." line, and then writes the rest of the file,

   I'd actually feel more comfortable if I had some sample input, but
   I believe this matches your spec.

Try comp.archives or comp.unix.sources as a sample spec; go for the
Archive-name: header.  Make sure you create whatever directories are
necessary along the way.

There's a program "rkive" from comp.sources.misc which does this as well,
and a few others of their ilk, but to my knowledge nothing in perl (yet).
Doing it right, and making it bullet-proof enough for automated
execution, appears to be non-trivial: not intrinsically hard, but with a
lot of potential error conditions that would need to be dealt with sensibly.
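
As an illustration of the kind of checks involved, here is a minimal
/bin/sh sketch keyed on the Archive-name: header that creates missing
directories and refuses obviously unsafe names.  It is a sketch under
stated assumptions, not how rkive works: the function names
save_by_archive_name and is_safe_path are hypothetical, the body is
assumed to start at a "#! /bin/sh" line, and only a few of the possible
error conditions are handled.

```shell
#!/bin/sh
# Sketch only: save_by_archive_name and is_safe_path are hypothetical names.

# Accept only relative paths with no ".." component, so that an article
# cannot write outside the current directory.
is_safe_path() {
    case $1 in
        '' | /* | .. | ../* | */.. | */../*) return 1 ;;
    esac
    return 0
}

save_by_archive_name() {
    name=
    # Read the header one line at a time, stopping at the first blank line.
    # The rest of stdin is left unread for the sed command further down.
    while IFS= read -r line; do
        case $line in
            '') break ;;
            Archive-name:*)
                # Keep the first whitespace-delimited word after the colon.
                name=$(printf '%s\n' "$line" |
                    sed -e 's/^Archive-name:[[:space:]]*//' -e 's/[[:space:]].*$//') ;;
        esac
    done

    if ! is_safe_path "$name"; then
        echo "save_by_archive_name: missing or unsafe Archive-name: '$name'" >&2
        return 1
    fi

    # Prefix with ./ so a name starting with "-" is not taken as an option.
    target=./$name

    # Create any leading directories, e.g. "rkive" for "rkive/part01".
    mkdir -p "$(dirname "$target")" || return 1

    # Copy from the "#! /bin/sh" line (assumed body start) to end of input.
    sed -n '/^#![[:space:]]*\/bin\/sh/,$p' > "$target"

    # An empty result means the shell line was never found.
    if [ ! -s "$target" ]; then
        echo "save_by_archive_name: no '#! /bin/sh' line found" >&2
        rm -f "$target"
        return 1
    fi
}
```

Run as "save_by_archive_name < article" from the directory that should
hold the archive tree.  Overwriting an existing file, disk-full errors,
and multi-part postings are among the conditions this sketch does not
address.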

--Ed

Edward Vielmetti, U of Michigan math dept <emv@math.lsa.umich.edu>
moderator, comp.archives