[net.news.notes] detecting problems with notesfile

z@rocksvax.UUCP (Jim Ziobro) (09/19/85)

We have been having some problems with notesfiles being corrupted.  When
this happens the daily nfarchive job dies and any notesfile after the
corrupted one never gets expired.  A few days of this will result in an
overflow of the disk.  I would suggest everyone add a line to their
system startup file:

ls -l /usr/spool/notes/.locks			>/dev/console

This will at least flag problems at startup.

In the meantime I am trying to locate the source of the corruption.  Our
version:
main.c:
$Header: main.c,v 1.5 84/04/08 01:20:25 salkind Exp $

It appears the response chain gets corrupted.  At least this is what debugging
the compress routine shows.  What is strange though is that the notesfile
still seems to be readable!  This has been hitting us about once a week.  It
seems to have started after I increased the expiration time to 19 days.
I have the suspicion that it has something to do with the strange way notes
handles more than 25 responses to a base note.  Simply creating a test
notesfile with >25 responses doesn't bring out the problem.  Help!?!?

Do I have the latest version?  Lou, are you listening?
-- 
//Z\\
James M. Ziobro
Ziobro.Henr@Xerox.ARPA
{rochester,amd,sunybcs,allegra}!rocksvax!z

hartmann@siemens.UUCP (09/19/85)

I have been experiencing the same problem recently with our nfarchive.
The archiver tells me what it is doing in a cronlog file, so every
morning when I come in, I look at the cronlog file to make sure the
archiver ran to completion.  I came in one morning to find that the archiver
had bombed out while it was archiving net.abortion.  I looked in 
net.abortion and found that the comp.* files were still there (created
by the archiver), the notesfile was closed, and there was a lock file
for it.  I opened the notesfile so it could be read, but didn't know
what to do with the rest of the mess.  I looked everywhere I could
for the source of the problem, but wasn't getting anywhere. 
This kept happening to me for about a week.  I'd come in to find that
the archiver bombed out on net.abortion again, the notesfile was
closed, and there was a lock file for it again.  At this point, 
I sent a mail message to Ray Essick asking for help.  Needless to say,
the removal of net.abortion solved all my problems.

But, just this week, I discovered that the archiver bombed out again on
net.wanted.  After going through the mess with net.abortion, I decided
to just leave the notesfile in the state it was in, comp.* files present
and closed.  I was hoping to hear from Ray Essick before I tackled the
problem again.  Much to my surprise, the next morning I came in to find
that net.wanted was fixed.  There were no comp.* files, the notesfile
was open, AND the archiver had run successfully on it the previous 
evening!  What happened?  I don't know, but I DO KNOW that if I run
into this problem again, I'll leave the notesfile be and hope it
fixes itself again.  If it doesn't, then I'll worry about it.


Terri Hartmann
Siemens Research and Technology Lab
Princeton, NJ
princeton!siemens!hartmann

rs@mirror.UUCP (09/23/85)

The latest version of notes is 1.7.

This is the "main branch" that Ray Essick distributes.

If you -- or ANYONE ELSE -- want to upgrade and can't get
in touch with Essick, I will be happy to help out with sending
back tapes with bits written on them.

Notes1.7 lets you save a note to a piped program, has a newer
nfmail that understands Berkeley mail "ignore" declarations,
and a few other nice features.  You probably also want to get
my modifications that handle "moderated" notesfiles and semi-
automatic appending of signatures.

--
Rich $alz	{mit-eddie, ihnp4!inmet, wjh12, cca, datacube} !mirror!rs
Mirror Systems	2067 Massachusetts Ave.
617-661-0777	Cambridge, MA, 02140

larry@extel.UUCP (09/25/85)

Newsinput here (11/70) regularly dumps its guts.  I suspect the problem
is due to fields in things like message-ids longer than notes allows.
Scanf is used to tear these fields apart, and its format puts no limit on the
length of a field.  It seems the stack gets trashed in newsinput.c, which
then leaves a lock file around and sometimes (though less often) corrupts
one of the newsgroups.  I can get newsinput to dump readily by manually
having it digest the news articles in the news system's copy of the offending
newsgroup.
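
For anyone who wants to see the kind of limiter I mean, here is a minimal
sketch.  It is not taken from the notes source: read_id() and the 32-byte
IDLEN are made up for illustration, and the real field sizes in newsinput.c
may well differ.  The point is only that a width in the scanf format stops
the copy before it runs off the end of the buffer.

```c
#include <stdio.h>

#define IDLEN 32    /* hypothetical buffer size, NOT the value notes uses */

/*
 * Copy the first whitespace-delimited token of `line` into `id`,
 * which must hold at least IDLEN bytes.  The width in "%31s" must be
 * kept equal to IDLEN - 1: it caps the copy at 31 characters plus the
 * terminating NUL, so an over-long field is truncated instead of
 * overrunning the buffer.  A bare "%s" has no such cap.
 * Returns 1 if a token was read, 0 otherwise.
 */
int read_id(const char *line, char id[IDLEN])
{
    id[0] = '\0';
    return sscanf(line, "%31s", id) == 1;
}
```

With this, a message-id longer than the buffer comes back cut short rather
than trashing whatever sits next to it on the stack.  Whether truncating is
acceptable or the article should be rejected outright is a separate
decision.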

One little trick I use is to delete all lock files before running nfarchive.
This is a little dangerous but so far has not caused any problems.

When I get a "failed gethrec" error from nfarchive I resort to removing
the entire group.

Larry
ihnp4!extel!larry