[news.software.b] trn bugs

del@thrush.mlb.semi.harris.com (Don Lewis) (08/09/90)

The first one isn't mine and I haven't been able to reproduce it, but ...

 Looks like if you do a '[' at the top -- you get assertion failed -- and
 it dumps.


The other problem manifests itself with cross posted articles.  Since
mthreads processes one newsgroup at a time, if trn happens to stumble
across a crossposted article that has only been processed in one
newsgroup, it barfs on the Xref line because it has an article number
greater than the maximum article number in the active2 file entry for
the other newsgroup.  This also gives you the pleasure of reading
the article twice.  The barf message looks like:
  Corrupt Xref line!!!  4498 --> comp.sys.sun(1..4461)
I'm tempted to hack the thread building stuff from mthreads into
C New's relaynews.  It looks like most of the information it needs
is already hanging around.
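
A minimal sketch of the kind of range check that appears to be firing
here, assuming each Xref entry has the form "group:number" and that the
reader knows a minimum and maximum article number per group.  The struct,
enum, and function names are hypothetical and are not taken from the trn
source.  The point is that a number above the known maximum could be
reported as "not yet threaded" rather than as a corrupt line:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record of the article range known for one group. */
struct group_range {
    const char *name;
    long min_art;
    long max_art;
};

/* Possible outcomes of checking a single Xref entry. */
enum xref_status { XREF_OK, XREF_NOT_YET_THREADED, XREF_BAD };

/*
 * Check one "group:number" token from an Xref line against the range
 * recorded for that group.  A number above max_art is assumed to mean
 * the database has not caught up yet, not that the line is corrupt.
 */
enum xref_status check_xref_entry(const char *entry,
                                  const struct group_range *g)
{
    const char *colon = strrchr(entry, ':');
    long artnum;
    size_t namelen;

    if (colon == NULL || sscanf(colon + 1, "%ld", &artnum) != 1)
        return XREF_BAD;                 /* no parsable article number */

    namelen = (size_t)(colon - entry);
    if (strlen(g->name) != namelen || strncmp(entry, g->name, namelen) != 0)
        return XREF_BAD;                 /* entry is for another group */

    if (artnum < g->min_art)
        return XREF_BAD;                 /* below the known range */
    if (artnum > g->max_art)
        return XREF_NOT_YET_THREADED;    /* newer than the database */
    return XREF_OK;
}
```

With the numbers from the message above (4498 against a range of 1..4461),
this check would return XREF_NOT_YET_THREADED.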
--
Don "Truck" Lewis                      Harris Semiconductor
Internet:  del@mlb.semi.harris.com     PO Box 883   MS 62A-028
Phone:     (407) 729-5205              Melbourne, FL  32901

del@thrush.mlb.semi.harris.com (Don Lewis) (08/09/90)

In article <1990Aug9.052017.2137@mlb.semi.harris.com> del@thrush.mlb.semi.harris.com (Don Lewis) writes:
>The first one isn't mine and I haven't been able to reproduce it, but ...
>
> Looks like if you do a '[' at the top -- you get assertion failed -- and
> it dumps.

Someone else here just got bit, but he claims it is caused by '{'.


Also, I had trn stop and tell me it was checking my .newsrc when I got
to the end of a group.  It pegged the CPU for about 30 seconds (on a Sun
3/60).  After it was done thinking, it prompted me for the next group
without saying anything else.

del@thrush.mlb.semi.harris.com (Don Lewis) (08/10/90)

In article <1990Aug9.074726.2922@mlb.semi.harris.com> del@thrush.mlb.semi.harris.com (Don Lewis) writes:
>In article <1990Aug9.052017.2137@mlb.semi.harris.com> del@thrush.mlb.semi.harris.com (Don Lewis) writes:
>>The first one isn't mine and I haven't been able to reproduce it, but ...
>>
>> Looks like if you do a '[' at the top -- you get assertion failed -- and
>> it dumps.
>
>Someone else here just got bit, but he claims it is caused by '{'.
>
>
I've been able to reproduce the problem.  It is caused when you try to
back up into a missing article with { or [.  The assertion in art.c
fails, and trn kicks you out.

The following thread diagrams provoke this behavior:

  (1)+-(1)
     |-(1)--(1)
     \-(1)+-(1)
          |-(1)
          |-(1)
          \-(1)
 -( )--(1)
       ^^^ if you are here and use { or [ trn breaks
========================
 -( )--(1)
       ^^^ likewise, here
 -(1)
  ^^^ and here

davison%drivax@uunet.uu.net (Wayne Davison) (08/12/90)

First thing, Don, please *mail* me bug reports, ok?  And please include
complete installation details, such as what hardware/os you're running,
and simple things like the fact that you're running the nntp version of
trn, rather than "normal" trn.  This will greatly aid me in debugging.
Thank you.

As for the '[' / '{' bug, I've found and fixed a problem in the artopen
==> nntpopen interaction in the rn code.  This should clear up the problem
with '[' and '{' (I hope!).  I'll notify Stan of the problem and the fix,
but it will rarely affect main-stream rn users.

The bug with reading cross-posted articles while the database is in the
middle of updating is slightly more of a problem.  I've fixed the rejection
of the xref line as corrupted, but I have to contemplate the ramifications a
bit more.

Thanx for your assistance.
-- 
Wayne Davison            \  /| / /|\/ /| /(_)     davison%drivax@uunet.uu.net
davison@drivax.UUCP     (_)/ |/ /\|/ / |/  \         ...!uunet!drivax!davison
                           (W   A  Y   N   e)

wengland@stephsf.stephsf.com (Bill England) (08/18/90)

In article <1990Aug10.045538.11435@mlb.semi.harris.com> del@thrush.mlb.semi.harris.com (Don Lewis) writes:
>I've been able to reproduce the problem.  It is caused when you try to
>back up into a missing article with { or [.  The assertion in art.c
>fails, and trn kicks you out.
>
>The following thread diagrams provoke this behavior:
>

   I've tried the example above, but have not been able to crash it yet.
   The system is SCO-ODT using the Microsoft Compiler.

   ------
   Another Problem (Bug?)

   The thread files suffix a .th onto the file corresponding to the
   newsgroup.  When a newsgroup has a name that is too long for the
   filesystem, mthreads cannot build a thread file for it.
   Example:  "sco.opendesktop"  I'm sure that there are others.

 +--------
 |  Bill England
 |  Stephen Software Systems, Inc.,   Tacoma Wa.
 |  wengland@stephsf.com              +1 206 564 2122
 |
  * *      H -> He +24Mev
 * * * ... Oooo, we're having so much fun making itty bitty suns *
  * *

tale@turing.cs.rpi.edu (David C Lawrence) (08/19/90)

   From: wengland@stephsf.stephsf.com (Bill England)

   The thread files suffix a .th onto the file corresponding to the
   newsgroup.  When a newsgroup has a name that is too long for the
   filesystem, mthreads cannot build a thread file for it.
   Example:  "sco.opendesktop"  I'm sure that there are others.

While this could be fixed on SYSV by consuming yet another inode per
group to put the threads file in a subdirectory of the group, or by
just not putting a ".th" extension on the thread file (which doesn't
quite work when you have, for example, comp.unix and
comp.unix.questions), you could work around it now by letting trn keep its
databases in the spool directory of each group as a .threads file.
This is a configuration parameter.

Another possibility, and one that can save lots of inodes too, is just
to put it all in a flat filesystem.  It shouldn't be terribly
difficult to come up with a quick algorithm that could hash a news
group name into fourteen character names.  I'd even be willing to do
it if Wayne wants.
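
A rough sketch of what such a hash might look like, assuming a 14-character
limit and using the 32-bit FNV-1a hash purely as an illustration.  The
function names and the name layout (5 characters of the group name, an
underscore, 8 hex digits) are hypothetical, and the sketch makes no attempt
to detect or resolve collisions:

```c
#include <stdint.h>
#include <stdio.h>

#define HASHED_NAME_LEN 14

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;

    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/*
 * Build a 14-character file name for a newsgroup: the first 5 characters
 * of the name (dots and slashes replaced by '_', short names padded with
 * '_'), one '_', then the hash as 8 hex digits.
 * out must have room for HASHED_NAME_LEN + 1 bytes.
 */
void hashed_name(const char *group, char *out)
{
    const char *p = group;
    int i;

    for (i = 0; i < 5; i++) {
        char c = *p ? *p++ : '_';
        out[i] = (c == '.' || c == '/') ? '_' : c;
    }
    out[5] = '_';
    /* Writes exactly 8 hex digits plus the terminating NUL. */
    snprintf(out + 6, 9, "%08lx", (unsigned long)fnv1a(group));
}
```

The readable prefix is only a convenience for someone listing the
directory; the hash digits are what distinguish one group from another.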

Kim Storm uses the flat filesystem approach for nn.  Well, almost
flat.  He has a GROUPS file in the database directory and a DATA
subdirectory which holds the group data in two separate files for each
group.  The identifying portion of the file name is its line number,
zero based, in the GROUPS file.
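
To illustrate the idea only, here is a small sketch of looking up a
group's zero-based line number in a GROUPS-style file.  It assumes one
group name at the start of each line, which may not match nn's real file
format, and group_index is a hypothetical name:

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the zero-based line number of `group` in an open GROUPS-style
 * file (one group name at the start of each line), or -1 if the group
 * is not listed.  Lines are assumed to be shorter than 1024 bytes.
 */
long group_index(FILE *fp, const char *group)
{
    char line[1024];
    long n = 0;
    size_t len = strlen(group);

    rewind(fp);
    while (fgets(line, sizeof line, fp) != NULL) {
        /* Match the whole name, not just a prefix of a longer one. */
        if (strncmp(line, group, len) == 0 &&
            (line[len] == '\n' || line[len] == ' ' ||
             line[len] == '\t' || line[len] == '\0'))
            return n;
        n++;
    }
    return -1;
}
```

The number returned would then serve as the identifying part of the data
file name, so the file names stay short however long the group name is.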

Incidentally, in another article Mark Moraes flat out said what I was
implying before (I know, why bother implying when you can say it flat
out) -- it would be really great if some sort of common database,
maintained by a single daemon, could be used for these newsreaders.
--
   (setq mail '("tale@cs.rpi.edu" "tale@ai.mit.edu" "tale@rpitsmts.bitnet"))

urlichs@smurf.sub.org (Matthias Urlichs) (08/20/90)

In news.software.b, article <9S&%XL#@rpi.edu>,
  tale@turing.cs.rpi.edu (David C Lawrence) writes:
<    From: wengland@stephsf.stephsf.com (Bill England)
< 
<    The thread files suffix a .th onto the file corresponding to the
<    newsgroup.  When a newsgroup has a name that is too long for the
<    filesystem, mthreads cannot build a thread file for it.
< [...]  you could work around it now by letting trn keep its
< databases in the spool directory of each group as a .threads file.

Or you could put the "th" extension in front of the newsgroup name instead of
after. Or you can capitalize the name. Or...

But I hate to put anything other than raw articles onto my News disks.
They don't even have a lost+found directory -- what for? ;-)

< Another possibility, and one that can save lots of inodes too, is just
< to put it all in a flat filesystem. 

Lots of inodes? The 700 newsgroups at this site consume 700 for the threads
files and another 170 for the directories (find this by "du /news/threads |
 wc -l"), so I suppose this doesn't matter.
On the other hand, I hate directories with 700+ files in them.

< [hashing newsgroups names into 14 characters]
You could use their position in the active file.
(Hashing is a bad idea because there may be collisions. What would you do?)

If the active file gets mangled/sorted/whatever, rename the database files.
(You did store the real newsgroup name in there, didn't you? ;-)

< Incidentally, in another article Mark Moraes flat out said what I was
< implying before (I know, why bother implying when you can say it flat
< out) -- it would be really great if some sort of common database,
< maintained by a single daemon, could be used for these newsreaders.

Correct. It'd also be great if the news unspooler could maintain that database
instead of relying on a separate daemon. (In that case, it would make sense to
have a common database for all news groups. Opening a database file,
searching for the position to enter the information, writing it, updating the
database's pointers, and closing the database, might be prohibitive if you'd
have to do it for each article. C News is very fast, and it should stay that
way.)


-- 
Matthias Urlichs -- urlichs@smurf.sub.org -- urlichs@smurf.ira.uka.de
Humboldtstrasse 7 - 7500 Karlsruhe 1 - FRG -- +49+721+621127(Voice)/621227(PEP)

davison@dri.com (Wayne Davison) (08/21/90)

David C Lawrence (tale@turing.cs.rpi.edu) wrote:
> [Some good discussion on solutions to adding the .th suffix on long thread
> filenames when your operating system only supports 14-character names.]

I'll just add some details to what David said:  putting the .thread file in
each group's spool directory is the most portable option (if you've got the
space on the spool partition).  If you need a separate thread hierarchy,
you can make a simple change in common.h and rebuild the database; change:

	#define SUFFIX ".th"
to:
	#define SUFFIX "/thread"

to tell mthreads to create an extra directory per group to put the thread
file in.  This will eat up more space, but you won't have to worry about
long group names.
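
As a sketch of why the "/thread" suffix sidesteps the name-length limit:
assuming the thread file path is formed by turning the dots in the group
name into slashes and appending SUFFIX, the last path component becomes
the fixed string "thread" whatever the group is called.  The function
thread_path and its thread_root argument are hypothetical and not copied
from mthreads:

```c
#include <stdio.h>
#include <string.h>

#define SUFFIX "/thread"    /* the alternative to ".th" */

/*
 * Build "<thread_root>/<group with dots as slashes><SUFFIX>" in buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int thread_path(const char *thread_root, const char *group,
                char *buf, size_t buflen)
{
    size_t i, glen = strlen(group);
    char *g;
    int n = snprintf(buf, buflen, "%s/%s%s", thread_root, group, SUFFIX);

    if (n < 0 || (size_t)n >= buflen)
        return -1;

    /* Convert dots to slashes in the group portion only, so that a
     * suffix such as ".th" would be left untouched. */
    g = buf + strlen(thread_root) + 1;
    for (i = 0; i < glen; i++)
        if (g[i] == '.')
            g[i] = '/';
    return 0;
}
```

For a root of /usr/spool/threads and the group sco.opendesktop this gives
/usr/spool/threads/sco/opendesktop/thread.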

> Another possibility, and one that can save lots of inodes too, is just
> to put it all in a flat filesystem.  It shouldn't be terribly
> difficult to come up with a quick algorithm that could hash a news
> group name into fourteen character names.  I'd even be willing to do
> it if Wayne wants.

I'd certainly support such a solution, as long as there was some good way
to avoid duplicate hash names.

> Kim Storm uses the flat filesystem approach for nn.  Well, almost
> flat.  He has a GROUPS file in the database directory and a DATA
> subdirectory which holds the group data in two separate files for each
> group.  The identifying portion of the file name is its line number,
> zero based, in the GROUPS file.

It's even less flat now -- the database has been further divided into 100-group
subdirectories to avoid creating a directory with 1200 files in it (due to the
slow file searching that results from having a lot of files around).
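
A sketch of that kind of split, assuming each group is identified by its
zero-based index and that groups are bucketed 100 to a subdirectory.  The
"DATA/<bucket>/<index>" layout and the name data_path are made up for
illustration and are not nn's actual naming scheme:

```c
#include <stdio.h>

#define GROUPS_PER_DIR 100

/*
 * Build a path of the form "DATA/<index / 100>/<index>" in buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int data_path(long index, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "DATA/%ld/%ld",
                     index / GROUPS_PER_DIR, index);

    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

With 1200 groups this yields 12 subdirectories of 100 entries each, in
place of one directory holding all 1200.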

> it would be really great if some sort of common database,
> maintained by a single daemon, could be used for these newsreaders.

Yes, it would be nice.  Kim Storm and I are keeping in touch on this
subject, but it isn't too high on the priorities right now.
-- 
 \  /| / /|\/ /| /(_)     Wayne Davison
(_)/ |/ /\|/ / |/  \      davison@dri.com
   (W   A  Y   N   e)     ...!uunet!drivax!davison