[news.misc] trying to automatically archive news

kak@stc-auts.UUCP (Kris Kugel) (01/08/88)

I think that there are some "automatic archive" systems, maybe
somebody from one of them can give me some suggestions.

We keep a lot of articles around, but the size is swamping us.
For one thing, backing up that disk partition is a pain.
Anyhow, we'd like to be able to back the stuff up onto tape, but
still have it accessible in a useful fashion.

Our current scheme is (external feed) > ..spool/news > ..spool/oldnew > tape

Our system administrator here suggested that we might look
at logging significant information in a database,
so that we can look up both new and old articles that
bear on whatever our current problems are, and have it tell us
where to go to find each article.

rough example:
DatabasePrompt> search sources:vt100 
    A new vt100termcap --> tape AA12:comp/sources/misc/112
    VT100tool: vt100emulation for Suns --> news:comp/sources/unix/325
    vtem: curses based vt100 emulation --> oldnews:comp/sources/unix/250
    . . .

We would like rnews, expire, and <the backup-news-to-tape-utility>
to all update the database, so that the contents and locations of
the articles listed in the database are always correct.
None of us has time to write reviews of the articles, so these entries
need to be made automatically.
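
A minimal sketch of the kind of index described above, in Python. Every
name here (Entry, ArticleIndex, add, move, search) is hypothetical and is
not part of the news software; it also assumes that an article's path
under the spool stays the same as it moves from news to oldnews to tape,
and it does plain keyword matching rather than the "sources:vt100" field
syntax of the rough example.

```python
# Hypothetical sketch: none of these names come from the news software.
from dataclasses import dataclass


@dataclass
class Entry:
    subject: str    # taken from the article's Subject: header
    location: str   # e.g. "news", "oldnews", or a tape label like "tape AA12"
    path: str       # e.g. "comp/sources/unix/325"


class ArticleIndex:
    def __init__(self):
        # Keyed by path, assumed not to change when the article is moved.
        self._entries = {}

    def add(self, subject, path, location="news"):
        """Record a newly arrived article (the rnews step)."""
        self._entries[path] = Entry(subject, location, path)

    def move(self, path, new_location):
        """Record that expire or the tape backup relocated an article."""
        self._entries[path].location = new_location

    def search(self, keyword):
        """Case-insensitive substring match on the subject or the path."""
        key = keyword.lower()
        return [
            f"{e.subject} --> {e.location}:{e.path}"
            for e in self._entries.values()
            if key in e.subject.lower() or key in e.path.lower()
        ]
```

The point of the design is that each of the three programs only has to
make one call: rnews calls add, while expire and the backup utility call
move, so no human-written summary is needed.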

Any knowledgeable comments from out there?

	Kris A. Kugel
	Storage Tek:    ...{ uunet!nbires, hao, ihnp4 }!stcvax!stc-auts!kak

barnett@vdsvax.steinmetz.ge.com (Bruce G. Barnett) (01/12/88)

In article <236@stc-auts.UUCP> kak@stc-auts.UUCP (Kris Kugel) writes:
|I think that there are some "automatic archive" systems, maybe
|somebody from one of them can give me some suggestions.
|	Kris A. Kugel
|	Storage Tek:    ...{ uunet!nbires, hao, ihnp4 }!stcvax!stc-auts!kak

You may wish to look at Chuq's program, which is supplied as a
standard utility with the news software. It is called keepnews or
savenews (depending on whether you go by the filename or the comments :-).

I have made a few changes to this, and I use this program to keep
track of 300+ MBytes of old news.

One of the problems you'll notice is that you wait a lot when you try
to examine directories with 1000+ files in them.

savenews stores the files in a newsgroup/hash/article fashion.
In addition, it extracts the Subject: line and stores that in the
LOGS/newsgroup file. You can cd to the LOGS directory and type

	grep -i vt100 *source*

and get (results edited greatly :-):

comp.sources.d:comp.sources.d/87-02/842.oakhill VT100 emulation on Sun
comp.sources.d:comp.sources.d/87-03/677.linus   Re: VT100 emulation on Sun
mod.sources:mod.sources/86-01/1348.panda        vtem - a VT100 emulator
net.sources:net.sources/85-11/1634.cbosgd       Re: Sysline on a vt100

You can then go to the appropriate directory to retrieve the article.
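
For anyone who wants the same lookup from a script, here is a rough
Python sketch of the search. It assumes one log file per newsgroup with
one "path subject" entry per line, which is only inferred from the edited
output above; search_logs and its name_filter parameter are hypothetical
names, and the match is a plain substring test rather than a grep
regular expression.

```python
import os


def search_logs(log_dir, keyword, name_filter="source"):
    """Return grep-style 'logfile:line' hits, ignoring case in keyword."""
    hits = []
    key = keyword.lower()
    for name in sorted(os.listdir(log_dir)):
        # Rough equivalent of the *source* glob: skip other log files.
        if name_filter not in name:
            continue
        with open(os.path.join(log_dir, name), errors="replace") as fh:
            for line in fh:
                if key in line.lower():
                    hits.append(f"{name}:{line.rstrip()}")
    return hits
```

Calling search_logs("LOGS", "vt100") would then play the role of the grep
command, with the first field of each hit giving the path to retrieve.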

I made some changes that make it easier to archive the files to tape.
Instead of a hash value, I use the month and year of the posting (as
the above example shows).  Besides letting me sort the log files in
chronological order, it also allows me to archive to tape the older
news, keeping the latest news. I also prefer having all of the pieces
of an N-part posting in one directory.
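
The month-and-year layout can be sketched as follows. This is not the
actual savenews patch, just an illustration in Python under stated
assumptions: archive_path and is_older_than are hypothetical names, the
article file name (such as 842.oakhill) is taken as given, and the
posting date is read from the value of the article's Date: header.

```python
from email.utils import parsedate_to_datetime


def archive_path(newsgroup, date_header, article_name):
    """Build 'newsgroup/YY-MM/article' from the value of a Date: header."""
    posted = parsedate_to_datetime(date_header)
    return f"{newsgroup}/{posted:%y-%m}/{article_name}"


def is_older_than(path, cutoff):
    """True if the YY-MM part of path sorts before cutoff, e.g. '87-06'.

    Plain string comparison is enough here because YY-MM sorts
    chronologically, but only while all years fall in the same century.
    """
    return path.split("/")[1] < cutoff
```

Because the directory name sorts chronologically, picking what goes to
tape reduces to comparing that one path component against a cutoff.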

Let me know if you want the patches to savenews.




-- 
	Bruce G. Barnett 	<barnett@ge-crd.ARPA> <barnett@steinmetz.UUCP>
				uunet!steinmetz!barnett