[news.software.b] Comments on C news

dww@stl.stc.co.uk (David Wright) (06/21/89)

In article <2228@vicom.COM> lmb@vicom.COM (Larry Blair) writes:
#Major gripe:
#The log file.  ... The log file is examined to create site statistics
#that are posted (at least in the Bay Area).  It's not just that the format
#was changed; most of the useful information was removed.  

This would be a problem for us too - I look at the 'report.awk' output
daily, it shows me if all is well, and how the news is flowing.

I would not object too strongly to having to edit the awk script to
recognise a few new or even changed messages, but if C news doesn't
generate the usual error, duplicate and news routing info, that's a real
reason for me not to use it.

Regards,        "None shall be enslaved by poverty, ignorance or conformity"
        David Wright           STL, London Road, Harlow, Essex  CM17 9NA, UK
dww@stl.stc.co.uk <or> ...uunet!mcvax!ukc!stl!dww <or> PSI%234237100122::DWW

henry@utzoo.uucp (Henry Spencer) (06/21/89)

In article <2228@vicom.COM> lmb@vicom.COM (Larry Blair) writes:
>Not so good things:
>
>The fact that default mode is to allow newgroups to be executed.  The config
>doesn't even give you a choice and the documentation doesn't state how to
>disable it [just change "newgroups" /usr/lib/newsbin/ctl]...

The documentation is admittedly not all it could be.  "build" attempts to
hit the high spots, not to address every possible need of everyone (that's
one of the reasons why it builds shell files instead of just charging in
and doing it -- so you can overrule it).  We think this is a sensible default.

>The fact that
>by default news is always spooled with deferred execution [maybe there's a
>good reason for this]...

Efficiency is always an issue for us.  There are provisions for running it
immediately, although "build" doesn't know about them.

>Some of the questions in the config are unanswerable
>by even an experienced admin [is your rindex fast?].

Nevertheless, said questions are (a) significant, and (b) impossible to
figure out automatically.

>Major gripe:
>
>The log file.  The documentation states a goal of not modifying files that
>programs will look at.  The log file is examined to create site statistics
>that are posted (at least in the Bay Area)...

We consider the log file an aspect of the implementation rather than the
user interface, I'm afraid.  Yes, things that examine it will need fixing.

>It's not just that the format
>was changed; most of the useful information was removed.  Just how did they
>decide what to put in the log? ...

By our opinion of what was useful.  We don't appreciate multi-megabyte log
files, which are all too common nowadays if you use verbose log formats.
You should have seen what it was like before I talked Geoff into adding
some of the current information...

>log is broken to the point of worthlessness; I'll stick to 2.11.17+ until
>I get the time to rewrite the logger.  When I do, I'll post the changes...

Please don't expect the changes to get into the official release.  We
really do feel strongly about terse logging.

>Another thing I noticed is that the spooler won't spool the incoming batch
>if space is short.  On some systems [ours], /usr/spool/uucp and /usr/spool/
>news are on the same filesystem.  This means that spooling the incoming
>batch doesn't increase the space used (when the uucp D. file goes away).

But *not* spooling it *does* increase the space available.  That's the
point.  The space-checking stuff is not intended to routinely let you run
right up against the limit; the correct solution to that situation is to
buy more disks.  The software is aimed at averting disaster if a disk fills
up temporarily.

>Btw, is my rindex fast?  I'm running SunOS 3.5 and will go to 4.0 sometime.
>What about the ANSI-compatible questions?

The answers I use on utzoo (SunOS 3.2) are fast rindex, ANSI-compatible
everything except ldiv and stdlib.h.
-- 
You *can* understand sendmail, |     Henry Spencer at U of Toronto Zoology
but it's not worth it. -Collyer| uunet!attcan!utzoo!henry henry@zoo.toronto.edu

peter@ficc.uu.net (Peter da Silva) (06/21/89)

In article <1989Jun20.211939.7835@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes:
> Please don't expect the changes to get into the official release.  We
> really do feel strongly about terse logging.

I have found the current 2.14 log file to be vital to solving news problems.
I haven't seen what C news does, but from the descriptions in this message
chain I expect that I won't be using it for a while. Please reconsider... at
least make it an option. I want to be able to look in the log file and see:

	Each message.
	Where it came from.
	What newsgroups it was in.
	What the subject was.
	Whether it was duplicate.
	Where it was forwarded to.
	And any errors that occurred.
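
As a rough illustration of the kind of question such a log has to answer,
here is a minimal sketch that tallies accepted and rejected articles per
neighbor.  The three-field log layout (neighbor, status character,
message-ID) and the name tally_by_neighbor are assumptions made up for this
example; neither C news nor B news is claimed to log in this form.

```shell
#!/bin/sh
# Sketch only: the log layout below is an assumption for illustration,
# not the actual C news or B news format.
#   field 1 = neighbor the article arrived from
#   field 2 = status: "+" accepted, "-" rejected (e.g. a duplicate)
#   field 3 = message-ID

# tally_by_neighbor: read log lines on stdin and print one line per
# neighbor as "neighbor accepted rejected", sorted by neighbor name.
tally_by_neighbor() {
	awk '
		$2 == "+" { acc[$1]++; seen[$1] = 1 }
		$2 == "-" { rej[$1]++; seen[$1] = 1 }
		END {
			for (n in seen)
				printf "%s %d %d\n", n, acc[n], rej[n]
		}
	' | sort
}
```

Fed a log in that layout (tally_by_neighbor < log), it prints one line per
neighbor, which is about the minimum needed to spot a feed that sends
mostly duplicates.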
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.

Business: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.
Personal: ...!texbell!sugar!peter, peter@sugar.hackercorp.com.

lmb@vicom.COM (Larry Blair) (06/22/89)

= henry@utzoo.uucp (Henry Spencer):
> me:

>The fact that default mode is to allow newgroups to be executed.

=We think this is a sensible default.

Vehement disagreement here.  Only a masochist would allow some bozo to
change all of his moderated groups to unmoderated (or vice-versa) or
create alt.flame.weemba.nice ad infinitum.

>The fact that
>by default news is always spooled with deferred execution [maybe there's a
>good reason for this]...

=Efficiency is always an issue for us.  There are provisions for running it
=immediately, although "build" doesn't know about them.

I'd like a combination of both.  Would an rnews.immed with "newsspool -i &"
work?  Actually, how about a daemon that stats the news spool directory
every minute?

=We consider the log file an aspect of the implementation rather than the
=user interface, I'm afraid.  Yes, things that examine it will need fixing.

Not possible with the current output.

>It's not just that the format
>was changed; most of the useful information was removed.  Just how did they
>decide what to put in the log? ...

=By our opinion of what was useful.  We don't appreciate multi-megabyte log
=files, which are all too common nowadays if you use verbose log formats.
=Please don't expect the changes to get into the official release.  We
=really do feel strongly about terse logging.

I appreciate the need to avoid the humongous log that B news produces, but
you left out some very important things.  Unless you save the path for
duplicate articles, there is no way to tell where they are coming from.

I have added a few things to the log that will allow reasonable traffic
statistics to be created.  The overall increase in log size is less than
10% for accepted articles and perhaps 40%-80% on the duplicates.  I will
post the changes, along with a modified version of Erik Fair's awk script,
once I have run it all long enough to have confidence.  Btw, the changes
are only a few lines, so if Henry doesn't want to add them, it won't be
that difficult to re-add them after every release.

>Another thing I noticed is that the spooler won't spool the incoming batch
>if space is short.  On some systems [ours], /usr/spool/uucp and /usr/spool/
>news are on the same filesystem.  This means that spooling the incoming
>batch doesn't increase the space used (when the uucp D. file goes away).

=But *not* spooling it *does* increase the space available.  That's the
=point.

True.  But if the batch were passed directly to relaynews without spooling in
this case (assuming that spool/uucp and spool/news are on different
filesystems), the uucp free space would increase and the news wouldn't
be lost.  Maybe there should be a way to specify an alternate spool
area on a different filesystem in case of no space in the primary.
-- 
Larry Blair   ames!vsi1!lmb   lmb@vicom.com

davidsen@sungod.crd.ge.com (William Davidsen) (06/22/89)

In article <1989Jun20.211939.7835@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:

| >log is broken to the point of worthlessness; I'll stick to 2.11.17+ until
| >I get the time to rewrite the logger.  When I do, I'll post the changes...
| 
| Please don't expect the changes to get into the official release.  We
| really do feel strongly about terse logging.

  I am all for a terse logfile, since I'm always out of disk on most
machines, but while I believe that terse logging should be AVAILABLE,
and probably even the DEFAULT, there are sites which need more
information for debugging or resource allocation.

  I think Cnews philosophy should be clarified here... I assumed that
you released this software because you want it to be useful. If you
don't intend to include any extensions written by other people as part
of the official release, unless you find them useful at your site, it
would be useful to know that now, and perhaps someone will collect all
of the extensions if they're not going to be included in the official
version.

  Cnews looks like a great package (ask me in two days) but it will need
some extensions before it is as useful as B news. With TMNN going
through another major revision phase a lot of us will be using C news
and extending it to meet the needs at our sites.

  I have no problems with the idea of not accepting outside extensions
to your software, but if this becomes widely used it will be getting
extensions and it's desirable to have them organized.
	bill davidsen		(davidsen@crdos1.crd.GE.COM)
  {uunet | philabs}!crdgw1!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

henry@utzoo.uucp (Henry Spencer) (06/23/89)

In article <2277@vicom.COM> lmb@vicom.COM (Larry Blair) writes:
>=Efficiency is always an issue for us.  There are provisions for running it
>=immediately, although "build" doesn't know about them.
>
>I'd like a combination of both.  Would an rnews.immed with "newsspool -i &"
>work?  Actually, how about a daemon that stats the news spool directory
>every minute?

Uh, why?  Either you run newsrun periodically, or you ask newsspool to run
it every time a batch comes in.  Or both.  I don't see the utility of the
"&", which will foul up trouble reporting.  And I don't see any point to
checking every minute, which can in any case be achieved by running newsrun
frequently.

>I appreciate the need to avoid the humongous log that B news produces, but
>you left out some very important things.  Unless you save the path for
>duplicate articles, there is no way to tell where they are coming from.

Do remember that duplicate articles are normal in some situations; for
example, Toronto deliberately has redundant feeds, which means duplicates,
in quantity, are to be expected.  The log does report which neighbor they
came from; our experience is that information back beyond that is seldom
useful (and it's very bulky).

>... if the batch were passed directly to relaynews without spooling in
>this case (assuming that spool/uucp and spool/news are on different
>filesystems), the uucp free space would increase and the news wouldn't
>be lost...

I don't understand -- if there is adequate free space in spool/news,
incoming articles will not get rejected anyway.  If there isn't, then
there is nothing that can be done about it.  The borderline cases,
where spool/uucp and spool/news are on the *same* filesystem and the
space freed up by the uucp files is just enough for the articles to be
filed (unlikely, they are usually bulkier once unbatched), frankly seem
to me to come under the heading of brinkmanship.  I repeat, the space
checks are meant as disaster mitigation, not to routinely let you run
safely with disks on the brink of overflow.
-- 
NASA is to spaceflight as the  |     Henry Spencer at U of Toronto Zoology
US government is to freedom.   | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

henry@utzoo.uucp (Henry Spencer) (06/23/89)

In article <940@crdgw1.crd.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
>  I think Cnews philosophy should be clarified here... I assumed that
>you released this software because you want it to be useful. If you
>don't intend to include any extensions written by other people as part
>of the official release, unless you find them useful at your site, it
>would be useful to know that now...

We already include things that are useless to us and (in our opinion) very
nearly useless to anybody.  We do not reject outside contributions out of
hand.  We do, however, reserve the right to be selective, given our intense
dislike for the feature-of-the-month club.  Chatty log files are one thing
we thoroughly loathe, on the grounds that they are very seldom useful enough
to pay for themselves.
-- 
NASA is to spaceflight as the  |     Henry Spencer at U of Toronto Zoology
US government is to freedom.   | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

lmb@vicom.COM (Larry Blair) (06/23/89)

In article <1989Jun22.174603.10483@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
=In article <2277@vicom.COM> lmb@vicom.COM (Larry Blair) writes:
=>
=>I'd like a combination of both.  Would an rnews.immed with "newsspool -i &"
=>work?  Actually, how about a daemon that stats the news spool directory
=>every minute?
=
=Uh, why?  Either you run newsrun periodically, or you ask newsspool to run
=it every time a batch comes in.  Or both.  I don't see the utility of the
="&", which will foul up trouble reporting.  And I don't see any point to
=checking every minute, which can in any case be achieved by running newsrun
=frequently.

The reason I'd like to background the newsspool (actually, the newsrun) is
so that my UUXQT will be able to go on to the next X. file without waiting
for the whole batch to be unbatched.  I'm worried about what happens if I
background it, though; it is reading the parent's stdin.

As far as running newsrun every minute, that would use several orders of
magnitude more cpu than just stat'ing the directory.  I intend to implement
this sort of daemon.
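
A minimal sketch of such a polling daemon, assuming the incoming directory
holds nothing but pending batches.  The names spool_has_batches and
watch_spool and the 60-second default are hypothetical, not part of C news:

```shell
#!/bin/sh
# Sketch only: function names, the 60-second interval, and the
# assumption that every entry in the directory is a pending batch
# are illustrative, not taken from C news.

# spool_has_batches DIR: succeed if DIR contains at least one entry.
spool_has_batches() {
	# ls -A lists everything except "." and ".."; non-empty output
	# means something is waiting.
	[ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# watch_spool DIR CMD [INTERVAL]: poll DIR forever and run CMD only
# when there is work, so an idle pass costs little more than a
# directory read.
watch_spool() {
	dir=$1 cmd=$2 interval=${3:-60}
	while :
	do
		if spool_has_batches "$dir"
		then
			"$cmd"
		fi
		sleep "$interval"
	done
}
```

It might be started as: watch_spool "$NEWSARTS/in.coming" newsrun
If the directory also holds housekeeping files or subdirectories, the
emptiness test would need to filter those out.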

=Do remember that duplicate articles are normal in some situations; for
=example, Toronto deliberately has redundant feeds, which means duplicates,
=in quantity, are to be expected.  The log does report which neighbor they
=came from; our experience is that information back beyond that is seldom
=useful (and it's very bulky).

We have 4 full feeds.  Every one of those feeds has an exclusion on what
they feed to us for some of the sites upstream from them.  The South Bay
is a veritable rat's nest of crossfeeds.  The way that we are all able to
determine what to exclude is from the log.
-- 
Larry Blair   ames!vsi1!lmb   lmb@vicom.com

bill@twwells.com (T. William Wells) (06/23/89)

In article <1989Jun22.174603.10483@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
:                                                   I repeat, the space
: checks are meant as disaster mitigation, not to routinely let you run
: safely with disks on the brink of overflow.

Ah, but I do!

To give some input: I run a system where there is, at most, 8M free.
Given the large allocation units on my system, a whole day's feed
will often use it all up. The way I've set this up is to prevent
newsrun from running when there is less than 1M of free space, as a
last ditch measure. But I run expire whenever I have less than 2M
free; this is checked every hour.

It works. I don't run out of disk space, but I do get a whole feed.
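
The two thresholds can be sketched as a pair of tests.  This is an
illustration only: the function names are hypothetical, and the free-space
figure (in kilobytes) is assumed to be obtained elsewhere, for instance
from df, whose output format varies between systems.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers using the 1M and 2M figures from
# the post, expressed in kilobytes.

# may_run_newsrun FREE_KB: succeed unless free space is under 1M.
# This is the last-ditch guard that keeps newsrun from unbatching.
may_run_newsrun() {
	[ "$1" -ge 1024 ]
}

# should_expire FREE_KB: succeed when free space is under 2M.
# This is the check that would be made once an hour.
should_expire() {
	[ "$1" -lt 2048 ]
}
```

With 1500K free, both tests succeed: newsrun may still run, and the next
hourly check would start expire.  With 500K free, only should_expire
succeeds, so batches wait until expire has reclaimed some space.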

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

brian@radio.utoronto.ca (Brian Glendenning) (06/24/89)

In article <1989Jun23.104001.284@twwells.com> bill@twwells.com (T. William Wells) writes:
   will often use it all up. The way I've set this up is to prevent
   newsrun from running when there is less than 1M of free space, as a
   last ditch measure. But I run expire whenever I have less than 2M
   free; this is checked every hour.


Maybe I can mention my grotty little trick. I have news batches come
in on a different file system than /news. I expire when I need to from
newsrun, i.e. I drive the expire from newsrun, I don't run expire from
crontab:

*** newsrun	Sat Jun 24 01:12:03 1989
--- newsrun.orig	Sat Jun 24 01:12:38 1989
***************
*** 7,13 ****
  PATH=$NEWSCTL/bin:$NEWSBIN/input:$NEWSBIN/relay:$NEWSBIN:$NEWSPATH ; export PATH
  umask $NEWSUMASK
  
- explevel=0
  here="$NEWSARTS/in.coming"
  cd $here
  
--- 7,12 ----
***************
*** 74,91 ****
  			rm -f $f
  			continue		# ugh
  		fi
! 		# Try to automatically clean up enough space if necessary.
! 		while test " `spacefor $batchsize articles`" -le 0
! 		do
! 			explevel=`expr $explevel + 1`
! 			if [ $explevel -gt 5 ]
! 			then
! 				echo '!!! Expiry Fails !!!' | mail $NEWSMASTER
! 				exit 1
! 			fi
! 			divactive $explevel < $NEWSCTL/explist.master > $NEWSCTL/explist
! 			$NEWSBIN/expire/doexpire
! 		done
  
  		# Decompress if necessary.
  		text=nruntmp.$$
--- 73,82 ----
  			rm -f $f
  			continue		# ugh
  		fi
! 		if test " `spacefor $batchsize articles`" -le 0
! 		then
! 			exit 0
! 		fi
  
  		# Decompress if necessary.
  		text=nruntmp.$$

divactive just divides times in explist format, i.e.:

#! /bin/sh
# Divide times (no dashes and not a comment) by $1 in Cnews explist files.

while read group status time archive
do
	if [ `expr $group : '#'` -eq 0 -a `expr $time : '.*-.*'` -eq 0 ]
	then
		echo $group $status `expr $time / $1` $archive
	else
		echo $group $status $time $archive
	fi
done
exit 0

While the above may not be the most elegant technique in the world,
it has done a reasonably good job of expiry-on-demand in a reasonably
small news partition (~20M) since we installed Cnews alpha a year or
so ago. Certainly it has kept the disk from blowing up occasionally
like it used to in the good old days. (Caveat - I just installed the
new Cnews today, and the above is reworked from what I had been
running, there may be problems that haven't shown up yet. However the
general idea should still be ok).
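
To see how quickly the loop in the patch tightens expiry, the sketch below
prints a single expiry time divided by each level from 1 to 5, using the
same truncating integer division that expr performs.  The name
expiry_schedule is hypothetical, and the time is assumed to be a whole
number.

```shell
#!/bin/sh
# Sketch only: expiry_schedule is a hypothetical helper that mirrors
# the "time / explevel" step for the five levels the patched newsrun
# tries before giving up.

# expiry_schedule TIME: print TIME divided by each level 1..5.
expiry_schedule() {
	level=1
	while [ "$level" -le 5 ]
	do
		# Integer division truncates, as expr does.
		echo "level $level: $(( $1 / level ))"
		level=$(( level + 1 ))
	done
}
```

expiry_schedule 14 gives 14, 7, 4, 3 and 2.  Short times reach 0 quickly
(1 / 2 truncates to 0), so very small entries in explist.master may
deserve a look; how expire treats a zero is not covered here.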

Now we get to see if I get a double signature :-)
--
	  Brian Glendenning - Radio astronomy, University of Toronto
brian@radio.astro.utoronto.ca uunet!utai!radio!brian  glendenn@utorphys.bitnet
-- 
	  Brian Glendenning - Radio astronomy, University of Toronto
brian@radio.astro.utoronto.ca uunet!utai!radio!brian  glendenn@utorphys.bitnet

henry@utzoo.uucp (Henry Spencer) (06/25/89)

In article <1989Jun23.104001.284@twwells.com> bill@twwells.com (T. William Wells) writes:
>:                                                   I repeat, the space
>: checks are meant as disaster mitigation, not to routinely let you run
>: safely with disks on the brink of overflow.
>
>Ah, but I do!

Well, I respect your bravery, but your warranty is void due to abuse. :-)
Making everything do the right thing while observing strict space limits
is *much* harder than making it fail more-or-less gracefully when an
approximate space limit is approached.  We did the latter, not the former.
-- 
NASA is to spaceflight as the  |     Henry Spencer at U of Toronto Zoology
US government is to freedom.   | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

davecb@yunexus.UUCP (David Collier-Brown) (06/28/89)

In article <2228@vicom.COM> lmb@vicom.COM (Larry Blair) writes:
| Major gripe:
| The log file.  ... The log file is examined to create site statistics
| that are posted (at least in the Bay Area).  It's not just that the format
| was changed; most of the useful information was removed.  

  Well, I'm not quite as convinced.  As long as the system records
the information somewhere, it's just a "database problem" to put
composite reports & statistics together from a bunch of files,
including the log.

--dave (I'd be annoyed if the log was empty, though (;-)) c-b

-- 
David Collier-Brown,  | davecb@yunexus, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave 
Willowdale, Ontario,  | Joyce C-B:
CANADA. 223-8968      |    He's so smart he's dumb.

bill@twwells.com (T. William Wells) (07/22/89)

A couple of points:

In article <2228@vicom.COM> lmb@vicom.COM (Larry Blair) writes:
:                                                             The fact that
: by default news is always spooled with deferred execution [maybe there's a
: good reason for this].

Actually, there's an easy fix for this: look in the input directory
of the source and you will find rnews.batch and rnews.immed. Check it
out.

:                         Some of the questions in the config are unanswerable
: by even an experienced admin [is your rindex fast?].

I had no problem with installation questions, but then again I don't
have rindex.

: Another thing I noticed is that the spooler won't spool the incoming batch
: if space is short.  On some systems [ours], /usr/spool/uucp and /usr/spool/
: news are on the same filesystem.  This means that spooling the incoming
: batch doesn't increase the space used (when the uucp D. file goes away).

This is not true. Remember that disk files are allocated in fixed
size chunks; for example, using the stats for the feed on my system:

	block size      percent over incoming
	512             14
	1024            27
	2048            51

If you are short on disk space, you may very well want to defer
processing the batches.
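
The effect is easy to reproduce with a little arithmetic.  The sketch below
(alloc_overhead is a hypothetical helper, and the file sizes in the usage
line are invented) rounds each file up to a whole number of blocks and
reports how far the allocated total exceeds the raw byte count.  It ignores
inodes and directory entries, so it does not attempt to reproduce the
figures in the table above.

```shell
#!/bin/sh
# Sketch only: alloc_overhead is a hypothetical helper.  It models
# nothing but rounding each file up to whole blocks.

# alloc_overhead BLOCKSIZE SIZE...: print the percentage by which the
# space allocated for files of the given byte sizes exceeds the sum
# of those sizes (integer arithmetic, truncated).
alloc_overhead() {
	bs=$1
	shift
	total=0
	alloc=0
	for size in "$@"
	do
		blocks=$(( (size + bs - 1) / bs ))	# round up to whole blocks
		total=$(( total + size ))
		alloc=$(( alloc + blocks * bs ))
	done
	[ "$total" -gt 0 ] || return 1		# nothing to measure
	echo $(( (alloc - total) * 100 / total ))
}
```

For instance, alloc_overhead 512 100 600 compares 1536 allocated bytes with
700 bytes of data and prints 119.  Many small articles in large blocks push
the figure up, which is the trend the table shows.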

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

lmb@vicom.COM (Larry Blair) (07/23/89)

Now that I've had a chance to play with C news, I have a few comments and one
major gripe.

Good things:

The flexible expire.  The easy aliasing and nonjunking of undesired groups.
The passing on of junk.  I particularly like the way sendbatches works,
since it lets me set up a multifeed batch with "uux -l" very easily.

Not so good things:

The fact that default mode is to allow newgroups to be executed.  The config
doesn't even give you a choice and the documentation doesn't state how to
disable it [just change "newgroups" /usr/lib/newsbin/ctl].  The fact that
by default news is always spooled with deferred execution [maybe there's a
good reason for this].  Some of the questions in the config are unanswerable
by even an experienced admin [is your rindex fast?].

Major gripe:

The log file.  The documentation states a goal of not modifying files that
programs will look at.  The log file is examined to create site statistics
that are posted (at least in the Bay Area).  It's not just that the format
was changed; most of the useful information was removed.  Just how did they
decide what to put in the log?  There's no groups listed.  The duplicates
lines don't save the path, making it impossible to tune your feeds.  The
control messages aren't differentiated from regular news postings.  The
log is broken to the point of worthlessness; I'll stick to 2.11.17+ until
I get the time to rewrite the logger.  When I do, I'll post the changes,
since Geoff and Henry have said that they may never release a new version.

Another thing I noticed is that the spooler won't spool the incoming batch
if space is short.  On some systems [ours], /usr/spool/uucp and /usr/spool/
news are on the same filesystem.  This means that spooling the incoming
batch doesn't increase the space used (when the uucp D. file goes away).

Overall, I'd like to use C news.  I had hoped that Eric would get his
stuff together, but the latest round has convinced me that TMNN will always
be risky.  Henry and Geoff have taken great pains to try to get it right
the first time.  If they had been a little less closed with their beta
tests, they might have gotten it perfect.

Btw, is my rindex fast?  I'm running SunOS 3.5 and will go to 4.0 sometime.
What about the ANSI-compatible questions?
-- 
Larry Blair   ames!vsi1!lmb   lmb@vicom.com