[news.software.nn] Less than meets the eye

hankin@sauron.osf.org (Scott Hankin) (02/23/90)

    Lately (since moving to 6.3.10?)  I have  noticed a discrepancy between
    what nn tells me in  terms of how many articles  are in a given  group,
    and how  many articles are  actually available and accessible.  Today I
    noticed one group  which supposedly had over  2000 articles, but  I was
    only shown  about 40.  nnadmin  stated  that it thought that there were
    the larger number as well, but the data file had other ideas.  I had to
    recollect in order  to  get things  to  match up.   This  has  happened
    several times on many different groups.  Can anyone shed  some light on
    this situation?

- Scott
------------------------------
Scott Hankin  (hankin@osf.org)
Open Software Foundation

hankin@sauron.osf.org (Scott Hankin) (02/23/90)

    Curiouser and curiouser.  It seems that, on closer examination, many of
    my data files are getting zeroed as well.  What would cause this?

- Scott

------------------------------
Scott Hankin  (hankin@osf.org)
Open Software Foundation

hankin@sauron.osf.org (Scott Hankin) (02/23/90)

    The final straw!  Doing a reinit of the entire  database (an admittedly
    tiresome and time  consuming activity)  does  NOT restore these  zeroed
    data  files!   They must be  recollected individually  to restore  them
    completely.  Now all I have to do is determine which data files contain
    either no data (relatively easy) or less data than they should (can you
    say, "Check each group manually in nnadmin?")  What fun!

- Scott
------------------------------
Scott Hankin  (hankin@osf.org)
Open Software Foundation
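
The "relatively easy" half of that job, finding the data files that contain no data at all, can be done mechanically rather than by checking each group in nnadmin. What follows is a minimal sketch, assuming the database keeps its per-group data files as ordinary files somewhere under a single directory. The function name and the directory layout are hypothetical and are not taken from the nn documentation.

```python
import os

def find_zero_length_files(db_root):
    """Return a sorted list of zero-length regular files under db_root.

    db_root is a hypothetical path to the directory holding the data
    files; adjust it to wherever the database actually lives.
    """
    empty = []
    for dirpath, _dirnames, filenames in os.walk(db_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)
```

Calling find_zero_length_files on the database directory gives the list of files to recollect. It only catches files that are completely empty; a file holding less data than it should would still need a comparison against the article counts that nnadmin reports.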

chuq@Apple.COM (Chuq Von Rospach) (02/23/90)

hankin@sauron.osf.org (Scott Hankin) writes:
>    Curiouser and curiouser.  It seems that, on closer examination, many of
>    my data files are getting zeroed as well.  What would cause this?

File system full?

-- 

Chuq Von Rospach   <+>   chuq@apple.com   <+>   [This is myself speaking]

I don't know what's scarier: President Reagan saying he had no inkling of 
his aides doing anything illegal, or an ex-president who uses the word inkling.

hankin@sauron.osf.org (Scott Hankin) (02/23/90)

chuq@Apple.COM (Chuq Von Rospach) writes:

>hankin@sauron.osf.org (Scott Hankin) writes:
>>    Curiouser and curiouser.  It seems that, on closer examination, many of
>>    my data files are getting zeroed as well.  What would cause this?

>File system full?

    No, plenty of space - I'm beginning to think I may back out of patch #10...

- Scott
------------------------------
Scott Hankin  (hankin@osf.org)
Open Software Foundation

bob@csispt.UUCP (Bob Finch) (02/24/90)

hankin@sauron.osf.org (Scott Hankin) writes:
>    The final straw!  Doing a reinit of the entire  database (an admittedly
>    tiresome and time  consuming activity)  does  NOT restore these  zeroed
>    data  files!   They must be  recollected individually  to restore  them
>    completely.

About four weeks ago I encountered what sounds like the same problem
after upgrading from 6.3.1 to 6.3.10.  After initializing the entire
database, about 10% of the data files were 0 length.  Initializing
again resulted in about the same number of 0 length data files, but in
different groups.

I spent about a day tracing through initializing the database in
nnmaster with sdb, but I never caught it zeroing a data file.  Since
then, nnmaster has run without problems.

I never found what caused the problem, and why it did not occur while
running nnmaster under sdb. I suspect, however, that if I tried to
reinitialize the database, the problems would reoccur.

-- Bob Finch      bob%csispt.UUCP@unicorn.wwu.edu
   AlphaSoft      +1 (206) 671-6214

storm@texas.dk (Kim F. Storm) (02/27/90)

hankin@sauron.osf.org (Scott Hankin) writes:
>    The final straw!  Doing a reinit of the entire  database (an admittedly
>    tiresome and time  consuming activity)  does  NOT restore these  zeroed
>    data  files!   They must be  recollected individually  to restore  them
>    completely.


bob@csispt.UUCP (Bob Finch) writes:
>About four weeks ago I encountered what sounds like the same problem
>after upgrading from 6.3.1 to 6.3.10.  After initializing the entire
>database, about 10% of the data files were 0 length.  Initializing
>again resulted in about the same number of 0 length data files, but in
>different groups.

I have now heard of this problem from several sites that have upgraded
to 6.3.10, and of course I am deeply worried about this, since I would
very much like this problem to be fixed before release 6.4 goes out.

Since the problem did not seem to occur with release 6.3.6, I have
tried to trace what has changed from 6.3.6 to 6.3.10 in the way
articles are collected.

The strange thing is that except for fixing a few (very) minor bugs,
nothing has changed -- except the nntp module, which was rewritten for
the unofficial release 6.3.7 (and is still used in 6.4).

I therefore have to conclude that this is a problem with NNTP, but I
really do not understand why!  The changes we made to nntp.c in 6.3.7
were aimed at NOT losing articles, which could happen with the old code
if the server died.  So in our attempts to make things better, they
may have become worse!

I have had indications that NNTP is indeed the cause of the problems:
either the NNTP server was sending empty files to the client (i.e.
nnmaster), or the client somehow truncated the files itself (I don't
know which).

I don't know whether this is the only (or the real) cause of the problems.
The nntp module explicitly checks for empty articles and treats an
empty article as non-existent; maybe that is the problem?
If the nntp server says that the article is indeed there, but it is
empty, then maybe the client should try to fetch it again?
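
The retry idea can be sketched roughly as follows. This is only an illustration in Python, not the code in nntp.c. The fetch callable, the EmptyArticleError exception, and the attempts and delay parameters are all hypothetical names introduced here. The sketch assumes the fetch routine can tell "the server says the article does not exist" (None) apart from "the server says it is there but sent nothing" (an empty body).

```python
import time

class EmptyArticleError(Exception):
    """Hypothetical error: the server claims the article exists but keeps sending it empty."""

def fetch_with_retry(fetch, article_id, attempts=3, delay=0.0):
    """Fetch one article, retrying while the body comes back empty.

    fetch(article_id) is an assumed callable returning None for a
    missing article and bytes (possibly empty) for one the server
    says is present.
    """
    for attempt in range(attempts):
        body = fetch(article_id)
        if body is None:
            return None              # genuinely missing: retrying will not help
        if body:
            return body              # non-empty article: done
        if attempt + 1 < attempts:
            time.sleep(delay)        # present but empty: pause, then ask again
    # Still empty after every attempt: report it rather than treating
    # the article as non-existent.
    raise EmptyArticleError(article_id)
```

Raising an error instead of returning "no article" lets the caller leave the group's existing data alone and try again on the next collection run, rather than recording the article as missing.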

Somebody also told me that nn does not work with NNTP 1.5.7, but I
have not heard of these problems since, so I don't know whether it is
true or not!  Maybe this could be (part of) the explanation?

I would really like to collect some exact information about this
problem.  If you have installed 6.3.7 or later, would you please send
me the following information, even if you have NOT seen the above
problem:

------------------- CUT HERE and RETURN TO ME -------------------

TICK IF YES:

( )	Have you seen the "zero files" problem?

( )	Do you use NNTP 1.5.5 ?
( ) 	Do you use NNTP 1.5.7 ?
( )	If either is yes, does the master access news via NNTP?

( )	Do you use NFS?
( )	If yes, does the database reside on the machine where the
	master runs?

( )	Do you have NETWORK_DATABASE defined?


PLEASE SPECIFY:

Release: nn 6.3.__

Operating System:
s- file used:   

Hardware:
m- file used:

COMMENTS/OBSERVATIONS YOU THINK MIGHT BE USEFUL:



----------------------------------------------------------------------

Thank you
-- 
Kim F. Storm        storm@texas.dk        Tel +45 429 174 00
Texas Instruments, Marielundvej 46E, DK-2730 Herlev, Denmark
	  No news is good news, but nn is better!