[news.sysadmin] The Requested Presentation of Quota Based News Control

webber@klinzhai.rutgers.edu (Webber) (07/05/87)

In article <153@tmsoft.UUCP>, mason@tmsoft.UUCP writes:
> ...                                I would suggest you explain again your
> quota idea; then let it lie fallow.  Someone may pick up on it & it may
> save the net (if you want to get credited with saving the net, you should
> probably re-post every 6 months or so, so everyone knows it's your idea).

Ok.  Below I present how the quota based system would work and the
answers to those objections that I am aware of.  Anyone who gives much 
thought to it will realize that the net is not in eminent danger of
following my plan, so it should be sufficient to send me your objections
via mail.  If I find one overwhelming, I will gladly post to the net a
message giving you credit for straightening me out and apologizing for
the confusion that my ignorance caused.  I will keep an updated copy of this
message available for request by mail and for occasional reposting until
it no longer seems relevant to the net.  It is because of the existence
of this alternative that I believe one can legitimately oppose the conversion
of the net to all-moderation in the face of the net's limited bandwidth.

I view the entire net flow as one continuous stream of messages.  So far,
nearly every aspect of the message has been used as a basis for determining
whether or not that message will be passed on or taken except for
whether or not there is room for it.  On the other hand, when calculating the
expense of handling the net flow for any given site, whether it is disk
storage, communication bandwidth, or cpu that is the bottleneck, the
expense is always a function of the sum of the sizes of the messages
regardless of the other aspects of the message.

The closest the net has come so far to using quotas has been that some
systems impose limits on the size of messages due to problems uucp has/had
with large individual messages.  Since that quota arose at a time when 
nearly everyone had the same problem, it worked out rather well.

There are two places one could contemplate putting quotas, either on
the originator of the message or on the channel through which the
message flows.  I believe that placing a quota on the channel is the
simplest way to handle the problem.  Indeed, once a quota is imposed
on the channel, the technical problem of too much flow is solved and
the question of the nature of the flow can be handled at a pace that
is appropriate to the evolution of a new media.

Currently there is already a quota on all the channels, in that there
is a maximum flow that each can handle.  However, since the channels
have other uses than news, news must be prevented from dominating each
of these channels.  The amount of news flow that each channel (link between
neighboring nodes) can handle comfortably differs from channel to channel.

One question that arises is how do you find out the size of a message.
There are two places that that can be done.  It can be done by remotely
querying the site from which the message is to be transferred or it
can be done by keeping a running total of the size of the messages transferred
so far and stopping the transfer once they reach some cutoff (this total
could be calculated either locally or remotely).  Clearly it is not necessary
for both sites on the link to agree on the quota for that channel, but it
would certainly make it easier to handle for both of them.
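
A minimal sketch of the running-total approach, assuming the sending side
already knows each article's size in bytes.  The function name, the
(message id, size) pairs, and the choice to stop at the first article that
would push the total past the cutoff are all hypothetical; none of this is
taken from existing news software.

```python
def select_within_quota(articles, quota_bytes):
    """Split a batch into (sent, held) using a running byte total.

    `articles` is a list of (msg_id, size_in_bytes) pairs in the order
    they would be transferred.  Transfer stops at the first article that
    would push the running total over `quota_bytes`; that article and
    everything after it are held for a later session.
    """
    sent = []
    held = []
    total = 0
    for index, (msg_id, size) in enumerate(articles):
        if total + size > quota_bytes:
            # Cutoff reached: hold this article and the rest of the batch.
            held = list(articles[index:])
            break
        sent.append((msg_id, size))
        total += size
    return sent, held


if __name__ == "__main__":
    batch = [("a", 400), ("b", 500), ("c", 300), ("d", 100)]
    sent, held = select_within_quota(batch, 1000)
    print("sent:", sent)
    print("held:", held)
```

Note that the sketch looks only at sizes and arrival order; nothing about
an article's group or content enters the decision.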

The question arises of what will happen to the net when sites start refusing
to carry more than x bytes of news per day.

If communication bandwidth were all of the problem, then news would simply
pile up at various sites until it eventually got transferred.  However,
at the current rate of 2 meg per day, it is not difficult to imagine
that limitations on disk space would quickly dominate the situation.

Clearly if there is no room for something where it is and no place to put
it, it is not going to last long.  It is interesting to note that what
happens happens on the remote machine and not on the local machine.  Thus
it is the remote machine that will set policy on which of the unsent messages
will be expired (unless someone wants to take on a considerably larger
implementation task than just the one I am advocating).  

There are two options here: 1) delete some of the messages already in the
queue and 2) stop adding new messages to the queue.
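
The two options can be sketched as policies on a size-limited outgoing
queue.  This is only an illustration under stated assumptions: the class
name, the policy names "drop_oldest" and "refuse_new", and the choice of
oldest-first eviction for option 1 are hypothetical, not a description of
any existing news implementation.

```python
from collections import deque


class OutgoingQueue:
    """Size-limited spool of (msg_id, size) pairs awaiting transfer."""

    def __init__(self, capacity_bytes, policy):
        if policy not in ("drop_oldest", "refuse_new"):
            raise ValueError("unknown policy: " + policy)
        self.capacity = capacity_bytes
        self.policy = policy
        self.items = deque()   # oldest article at the left
        self.used = 0          # bytes currently queued
        self.dropped = []      # ids of articles that were lost

    def offer(self, msg_id, size):
        """Try to queue an article; return True if it was queued."""
        if size > self.capacity:
            # Can never fit, whatever the policy.
            self.dropped.append(msg_id)
            return False
        if self.policy == "refuse_new" and self.used + size > self.capacity:
            # Option 2: stop adding new messages to the queue.
            self.dropped.append(msg_id)
            return False
        # Option 1: delete queued messages (oldest first) to make room.
        while self.used + size > self.capacity:
            old_id, old_size = self.items.popleft()
            self.used -= old_size
            self.dropped.append(old_id)
        self.items.append((msg_id, size))
        self.used += size
        return True


if __name__ == "__main__":
    for policy in ("drop_oldest", "refuse_new"):
        queue = OutgoingQueue(1000, policy)
        for msg_id, size in [("a", 600), ("b", 300), ("c", 400)]:
            queue.offer(msg_id, size)
        print(policy, [m for m, _ in queue.items], queue.dropped)
```

Either way something is lost; the policies differ only in whether the
older or the newer article is the one that goes.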

Both the first and the second option mean that some people will miss some 
messages since some sites will be able to circulate much more news than
others.  However,  given the number of messages available and the bandwidth 
restrictions of the links in the net, this is inevitable and already the case.

Currently those people who are choosing to receive only a subset of the net 
are doing so based on group name.  This means that site administrators must
take responsibility for decisions such as whether or not it is a proper
utilization of their resources to carry a group whose discussion topic is
a computer that they don't and never plan to own or a hobby such as
birdwatching, or job offerings from competitors, or the pros and cons of
abortion, or the philosophical aspects of the sciences.  If one attempted
to justify the groups one was transmitting on the basis of their content,
I doubt if there would be more than 4 groups that could manage country wide
distribution.  However, there is another aspect to these groups beside their
content and that is the morale of the participants in the various discussion
groups.  From a morale point of view, each of the groups is justified (and
many more groups as well).

Let us look at the various kinds of posting and consider the significance of
them getting `dropped on the floor.'

A request for information:  almost inevitably, the queries being
handled on the net are answerable elsewhere.  The public libraries
in the United States are adequate for handling many of them.
Contacting neighboring universities and colleges would result in
finding people who could handle most of the rest.  It is also worth
noting that if the questioner had been watching the traffic of the
group much, it would not be difficult to identify a few experts to 
whom a direct computer mail query could be sent.  Of course, most people
have anecdotes of questions that only the net could answer; these
are interesting because it is so rare and notable.  Certainly, the 
worst that would happen from dropping most queries for information on
the floor is that some people would learn how to use libraries and
browse professional journals.

Answer to the request for information:  clearly anyone seriously interested
in helping the person making the request would send them direct mail
to make sure that they noticed the answer.  However, most people like to
use queries for information as an opportunity to place that information
before a larger audience (this message is an example).  Thus, it doesn't
matter to the sender who specifically recieves the message just as long
as it gets wide enough distribution so that it has some possibility of
generating some action somewhere.

Blanket postings of information:  as noted above, no one is really maintaining
that every person who logs into a unix system should be given an opportunity
to see their message, just that the message is of general interest and
so they saw no reason to restrict its distribution anymore than the
net already does.  Thus if it doesn't make it to some places, the world will
not come to an end.

Part n of m:  this kind of message is rare, i.e., most people expect their
messages to be able to stand on their own basis as opposed to being
one of a series that is useless unless you collect the whole series.
However, there is one notable exception: large source postings.

One question of interest is just how big does a source posting have to be
before it will cause more trouble by trying to be sent as a single message
than it will cause by getting separated from its other pieces.  It would
seem to me that communication has improved enough over the past few years
that it would be worthwhile investigating the question of how long different
links would take to transfer an x byte file (not due to the rate of
the link but due to the strategy the link uses to handle errors in 
transmission).

Another question is: just what is happening when someone tries to post
a very large source or binary to the net?  Clearly such sources and
binaries were not meant to be read and hence are not intended as communication
between people.  Instead, they are intended to be used.  If a neighboring
node has a program that my node can use, then it makes sense to
establish an ad hoc link to transfer it (perhaps, even in rare cases 
transferring it piecewise via direct mail).  Thus, one could imagine
an announcement of an 800k source being made on the net and then it
actually move through the net as a chain letter (site to site NOT user
to user).  For that matter, floppy disks and mag tapes through the regular
post have been underutilized.  A source that isn't worth the postage and
media to the receiver probably isn't worth posting to the net either.

Rather than writing monolithic programs that do only one thing, it would
make more sense to post to the net small general purpose utilities that
other people could read and use within their own code.

While admittedly, the above is from the viewpoint of someone who can
write their own sources, I would maintain that it is less than totally
clever for a person who cannot program to use a source or binary that
they received off the net.  

In summary, you could say that I see the difference between mail and
news to be in the case of mail you know exactly who you want to send
to and the system tries to offer you as much support as possible in
getting your message to the recipient, whereas in news you really
don't know if there is anyone out there interested in your message and
the system distributes your message according to what links are interested
in transferring how many blind messages.  In neither case do the links
attempt to intrude by judging the contents of the messages, although the
systems have quite different behavior based on the intended usage.

Thus, I do not see any problem being generated from the use of quotas
to manage net news due to occasional loss of messages.  Indeed, I
see it as actually encouraging a more responsible usage of the media
in conjunction with making joining news less of a problem for individual
site managers.  Neither do I see quotas as causing any implementation
problems.  I await enlightenment.

> OR
> You can program your quota system and get Rutgers to use it, then
> on the basis of whatever is wonderful about it at Rutgers, convince your
> net neighbours, then the state, then the east coast, and by then you can
> probably convince the rest of us.

I by far think that this would be the best way to handle things, but
Rutgers is not a site where the flow has gotten so bad that people are
advocating the closure of unmoderated groups and the general control of
the flow through moderation.  Thus, implementing it at Rutgers would 
show that the code works, but would not address the advisability of actually
using it.  No one has so far maintained that the idea would be difficult to 
code in a manner that would integrate with the rest of the net.  Instead,
the discussion has always rested on the advisability of using it even
if it already existed.  

I believe I have adequately addressed all the issues that have formed
a basis for objection in the past as summarized above.  I have been
addressing this issue off and on since February and over that time
my understanding of the problems of implementing a quota based system
within the structure of Usenet has grown.  In the past I have
stressed the notion that the net would adapt to the bandwidth it found
by reducing the number of postings and that this would occur by having
quotas push further and further back into the system until individual
sites were rationing the postings from their own users.  It now strikes
me that there would be little motivation for this since once the quotas
are in place, the pressure for changing the Usenet setup will be decentralized
and take a wholly unpredictable course (although I have not yet extrapolated
a future that would be worse than the currently expected one of increasing
use of moderated groups and group-name based all or nothing flow decisions).

----- BOB (webber@aramis.rutgers.edu ; rutgers!aramis.rutgers.edu!webber)

allbery@ncoast.UUCP (07/07/87)

The problems with sources via non-net channels are:

(1) Mail is often even more limited than news.  This includes the VERY
common occurrence of mailers that choke on messages longer than 60K; this
cannot be solved by saying "replace the mailer" because the companies that
provide the mailers don't supply either source or alternatives.  (When's the
last time you tried to get Microsoft to change something in Xenix for you? --
remember that Xenix is the most widely used UN*X variant.)

(2) Mailing media is wonderful -- assuming that there is a standard.  Alas,
my Pixel won't read IBM PC floppies (although there may be an upgrade for
it -- the software is 3 years old).  And there doesn't seem to be much (if any)
compatibility in cartridge tape drives.  (Not to mention $30 for ONE cart!)
Under these circumstances, the use of UUCP channels is an advantage --
as witness the AT&T Toolchest, which is UUCP-ONLY distribution.  Remember
that not all machines use the same backup methods, many machines don't HAVE
9-track tape drives and there isn't enough justification for spending mega $$$
for an add-on drive.

Frankly, the volume of the Usenet will continue to grow no matter what.  The
benefits of moderation are the cutting of chaff and the chance to let the
truly useful stuff through in its place.  (It does have its potential pit-
falls, but the first time I get a program source -- personal, not c.s.m
stuff; I *do* write my own -- will be the last and I'll switch to another
network.)

The biggest PROBLEM with moderation is that a moderator who is highly
opinionated can savage the net.  Not as much with the sources groups --
but if the moderator of, say, rec.arts.movies.reviews, disliked a particular
genre of movies enough to pan reviews of those movies or to suppress favorable
reviews of them, I think I can speak for the net in saying that that moderator
is not the best one for that group.  (NOTE:  I am not actually saying anything
about the moderator of rec.arts.movies.reviews; I simply chose a newsgroup
whose subject is basically opinion.)

The proper solution to this is to have moderators chosen by the net at large;
candidates would have to have a history of fairness in the field they would
be moderating (say 6 mos. to 1 year).  Of course, it has its own problems
as well:  moderator elections as popularity contests, or "pro-`sf'"/"pro-
`sci-fi'" lobbies (to select a recent argument on the net), etc.  Still, it
would help to ensure that moderators were "straight".  (BTW, as "ombudsman"
you failed this test, Bob:  mapping all the binaries groups to "talk.bizarre"
is a good way to get a large percentage of the net riled at you.  And what
are PC users to do given that there are umpty-dump different compilers for
different languages around, and *none* standard?  Rot, perhaps, while the
smug people with compilers STANDARD on their machines sneer at them?)

This is the same problem that ALWAYS comes up -- when a {network, country,
etc.} reaches a critical point, there is suddenly NO working system of
government possible.  SOMEONE is guaranteed to be upset no matter what.
Moderators are intended to judge by quality, quotas by quantity; in a system
which is intended to transmit information, quality seems to be the more
important consideration.  This is the reason that moderators have always been
preferred over quotas, which will pass false or misleading information (or
out-and-out garbage, such as pet care techniques in rec.autos) if it comes in
before correct information, and will reject correct information if it comes in
after the quota has been met.

I think I've been vociferous enough by now.  Hopefully, I've managed to
explain the reasons behind the current system and to explore some possibilities
for alternatives; I leave it to you, the net at large, to decide what the best
solution(s) are.

++Brandon
-- 
Brandon S. Allbery, moderator of comp.sources.misc and comp.binaries.ibm.pc
ncoast Public Access UN*X, +1 216 781 6201 -- we have alt.all (email for info)

aXcess Company		    cbosgd			   \
6615 Center St. #A1-105	    {ames,harvard,mit-eddie}!necntc > !ncoast!allbery
Mentor, OH 44060-4101	    {well,ihnp4,pyramid}!hoptoad   /
+1 216 974 9210		    necntc!ncoast!allbery@harvard.harvard.edu

mjl@tropix.UUCP (Mike Lutz) (07/07/87)

WEBBERnews, with quotas, controls volume by artificial constriction of
the transmission path. There is no attempt to discriminate between gold
and dross.  Veteran flamers can simply fire off volley after volley of
identical articles so as to increase the probability that their
"contributions" will make it through.

Moderated groups control volume by discrimination.  The quality in
these groups is orders of magnitude above that in the unmoderated ones;
an advantage which, as a *reader*, I appreciate.

A note on reference works: will someone please give BOB a dictionary,
so that he can improve his spelling (recieves), his grammar (especially
agreement in number), and his vocabulary (eminent != imminent)?  Maybe
he'll become engrossed in the study of English and leave the rest of us
in peace?

A vote for C news, and a vote against WEBBERnews.

Mike Lutz GCA/Tropel tropix!mjl

sob@academ.UUCP (Stan Barber) (07/10/87)

First, I thank Bob for rehashing this. It makes much more sense than
the scatter of ideas I had previously read. Here are my thoughts.

In article <285@klinzhai.rutgers.edu> webber@klinzhai.rutgers.edu (Webber) writes:
>Currently those people who are choosing to receive only a subset of the net 
>are doing so based on group name.  This means that site administrators must
>take responsibility for decisions such as whether or not it is a proper
>utilization of their resources to carry a group whose discussion topic is
>a computer that they don't and never plan to own or a hobby such as
>birdwatching, or job offerings from competitors, or the pros and cons of
>abortion, or the philosophical aspects of the sciences.  If one attempted
>to justify the groups one was transmitting on the basis of their content,
>I doubt if there would be more than 4 groups that could manage country wide
>distribution.  However, there is another aspect to these groups beside their
>content and that is the morale of the participants in the various discussion
>groups.  From a morale point of view, each of the groups is justified (and
>many more groups as well).

This seems to imply that you'd rather see messages restricted in some
arbitrary manner rather than a subjective one. I suppose this gets back
to your perception of the "old days" of usenet when people were able
to handle ALL the messages regardless of their perceived "worth". Your
approach also removes the need for moderation since an arbitrary 
method would be used to restrict flow (a quota of messages or bytes or
whatever). I submit that a similar method is in fact in use at some
low-capacity sites today. You alluded to this fact in other parts of 
this article relating to disk usage and some one-time limitation of UUCP.
It is my understanding that the whole rationale behind the move toward 
moderation is to provide a qualitative method of limiting traffic versus a
quantitative one such as yours. In theory, BOTH could exist and all sites
could use either method (or both). After all each site should be free to
manage its resources as it sees fit. The cooperative nature of usenet
allows the community to benefit, but the community should not place
restrictions on the individual sites.

The BAD thing about both systems is that some information is LOST.
In the quantitative system, the "value" of the lost information cannot
be measured. In the qualitative system, the "value" is the main consideration.
As you are well aware, I am an advocate of the qualitative system.


>Thus, I do not see any problem being generated from the use of quotas
>to manage net news due to occasional loss of messages.  Indeed, I
>see it as actually encouraging a more responsible usage of the media
>in conjunction with making joining news less of a problem for individual
>site managers.  Neither do I see quotas as causing any implementation
>problems.  I await enlightenment.

I would like to believe that all readers and posters would come to value 
usenet in such a way that they use it responsibly. I think that many people
do, but some do not. I somehow doubt those people would come to value
usenet more if they knew about quotas, but I would love to be proved 
wrong. Those that use the net responsibly already operate under a 
self-imposed quota and would probably never be affected by a quota
system if one were created. The end result would probably be 
a group of "quota-busters" similar to your proposal to bypass
the moderation system.


>I believe I have adequately addressed all the issues that have formed
>a basis for objection in the past as summarized above.  I have been
>addressing this issue off and on since February and over that time
>my understanding of the problems of implementing a quota based system
>within the structure of Usenet has grown.  In the past I have
>stressed the notion that the net would adapt to the bandwidth it found
>by reducing the number of postings and that this would occur by having
>quotas push further and further back into the system until individual
>sites were rationing the postings from their own users.  It now strikes
>me that there would be little motivation for this since once the quotas
>are in place, the pressure for changing the Usenet setup will be decentralized
>and take a wholly unpredictable course (although I have not yet extrapolated
>a future that would be worse than the currently expected one of increasing
>use of moderated groups and group-name based all or nothing flow decisions).
>
>----- BOB (webber@aramis.rutgers.edu ; rutgers!aramis.rutgers.edu!webber)

The future is what you make it, so if you think the current system is bad,
you need to come up with the new one. If you can't simulate it and 
make quantitative comparisons, you aren't going to win support to
your cause. As many people have said, usenet has no central authority
to speak of, so you can make noises about how bad it is and propose
a solution (as you have), but without a demonstration, it will hold
little influence on the network as a whole.

People may not like the current system, but just complaining is not
an answer. The most helpful thing they can do is make suggestions for 
SPECIFIC changes _AND_ generate the programming necessary to make this
happen. You can see this in many examples: Larry Wall's rn, the nntp package,
C news, and so on. So, BOB, I encourage you to put your suggestions into a 
working configuration and let folks see it in action. 




-- 
Stan	     uucp:{killer,rice,hoptoad}!academ!sob     Opinions expressed here
Olan         domain:sob@tmc.edu                            are ONLY mine &
Barber       CIS:71565,623   BBS:(713)790-9004               noone else's.

andrews@ubc-red.uucp (Jamie Andrews) (07/10/87)

     Clearly the only way to implement Webber news
is by using Junker.............












(:-) :-) No I'm not serious you boneheads!!!!! (-: (-:)
--Jamie.

heiby@mcdchg.UUCP (Ron Heiby) (07/15/87)

It seems that with each passing article, Webber makes another inch of
progress on the long hard trek to understanding.  This one even seems
to make some sense, if we understand the premises behind it, which we
are lately beginning to.

Webber (webber@klinzhai.rutgers.edu) writes:
> I view the entire net flow as one continuous stream of messages.
Yes, it can be looked at in this way.  Unfortunately for the quota scheme,
the stream is not a homogeneous flow.  There are fish in the stream that
we want to be able to catch, while skimming off the flotsam.  Also, the
news doesn't really flow, it must be *pumped*.  Pumping takes energy (money).

> So far,
> nearly every aspect of the message has been used as a basis for determining
> whether or not that message will be passed on or taken except for
> whether or not there is room for it.
This is not so.  Several months before certain newsgroups were renamed into
the "talk" hierarchy, I took some action on my own.  I decided that my
machine did not have the disk or phone capacity to carry the articles in
those groups.  (Yes, I know that this isn't exactly what you mean.  I, with
the users of my system, had decided that we didn't want to work at pumping
the trash.)  Webber's plan seems to require sites to stop skimming the trash
and work at pumping it.  He seems to believe that we would be happier if we
went ahead and pumped some trash, but also pumped less in total.  This is
his major fallacy.  Most sites would rather pump less in total BY NOT PUMPING
THE TRASH.

> Clearly if there is no room for something where it is and no place to put
> it, it is not going to last long.
So, if there is some non-trash "upstream" that hasn't got here yet, it may
"evaporate".  Funny how the poster of the article doesn't have any idea of
how many sites actually receive the article.  Too bad.  But, as we soon read,
this is no great loss.

> Currently those people who are choosing to receive only a subset of the net 
> are doing so based on group name.  This means that site administrators must
> take responsibility for decisions such as whether or not it is a proper
> utilization of their resources to carry a group whose discussion topic is
> a computer that they don't and never plan to own or a hobby such as
> birdwatching, or job offerings from competitors, or the pros and cons of
> abortion, or the philosophical aspects of the sciences.
It is the job of the site administrator to do exactly that.  This is the second
major fallacy that Webber maintains.

At this point, Webber lists several types of article and describes why he
does not believe that letting them get 'dropped on the floor' is a bad thing.
These include:
> A request for information
> Answer to the request for information
> Blanket postings of information
> Part n of m
  and Large source code
I guess this means that most of comp, sci, and news can be lost with no
problem, leaving us with rec, soc, and talk.  This is not the network that
my management and I are interested in supporting.  I justify my whole
involvement based on exchanging information with others and getting "free"
software.  I *cannot* justify the expenditure by saying that, "It is a fun
way to B.S. cross-country with my buddies."  I am confident that there are
people on the net who spend 98% of their time in the rec, soc, and talk
groups and can find some way of justifying that time.

> Rather than writing monolithic programs that do only one thing, it would
> make more sense to post to the net small general purpose utilities that
> other people could read and use within their own code.
We are now exposed to the Webber approach to software design.  Maybe he'd
be willing to pay people to write their donated code in such a way that he
would find more useful.  Good grief!

Now, since (according to Webber) we don't really know who, if anyone, is
going to get our article and care anything about it.  And, since when we
send mail, we do know:
> Thus, I do not see any problem being generated from the use of quotas
> to manage net news due to occasional loss of messages.  Indeed, I
> see it as actually encouraging a more responsible usage of the media
> in conjunction with making joining news less of a problem for individual
> site managers.  Neither do I see quotas as causing any implementation
> problems.  I await enlightenment.
Some of what Webber says in support of his statements does make some sense.
There *are* an awful lot of postings for information that could be got more
cheaply elsewhere.  There *are* an awful lot of "followups" that should have
been "replies".  The proposed "quota" system is not the answer, because it
is not a selective filter.  As I said above, I want to reduce the pumping by
not pumping trash, as opposed to turning off the pumps after X hours of use.

> I have been
> addressing this issue off and on since February and over that time
> my understanding of the problems of implementing a quota based system
> within the structure of Usenet has grown.
Absolutely!  I expect you to have sufficient understanding by this Fall
at your current rate of improvement.  Keep up the good work!  :-)

Webber's statements about propagating the "logjam" of quota back to the
originating site just means that people who tend to answer more or post
more information than they request experience the majority of the logjam.
Also, since not everyone is going to implement this scheme (There are still
some 2.9 sites, right?), the logjam isn't going to propagate all the way up,
but is going to "pool" at those sites not running WEBBERnews, where it will
finally expire waiting for the logjams downstream to clear up.
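
The pooling effect can be illustrated with a toy model, offered only as a
rough sketch under stated assumptions: sites form a single chain, traffic is
counted in bytes per day, a quota of None means the outgoing link is
unlimited, backlog sits on the sending side of a limited link, and nothing
expires.  The function name and parameters are hypothetical.

```python
def simulate_chain(daily_input, link_quotas, days):
    """Return the backlog in bytes held at each sending site after `days`.

    `link_quotas[i]` is the daily byte quota on the link leaving site i
    (None means unlimited).  Site 0 receives `daily_input` new bytes of
    news each day; every other site receives whatever the site before it
    managed to pass along that day.
    """
    backlog = [0] * len(link_quotas)
    for _ in range(days):
        incoming = daily_input
        for i, quota in enumerate(link_quotas):
            available = backlog[i] + incoming
            # An unlimited link passes everything; a limited one passes
            # at most its quota and leaves the rest queued at the sender.
            passed = available if quota is None else min(available, quota)
            backlog[i] = available - passed
            incoming = passed
    return backlog


if __name__ == "__main__":
    # 2 MB/day of news; only the middle link enforces a 1 MB/day quota.
    print(simulate_chain(2_000_000, [None, 1_000_000, None], 3))
```

In this model the unsent news accumulates only at the site feeding the
limited link, rather than spreading back toward the originating site.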

Webber asks for specific problems with his proposals.  I believe that I
(and others) have given them.  If I have misunderstood his remarks or have
not been clear myself, I'd appreciate receiving clarifications or requests
for clarifications so that our understanding can increase.

I apologize for the slightly disparaging remark above and for pushing
the water analogy beyond the limits of good taste.
-- 
Ron Heiby, heiby@mcdchg.UUCP	Moderator: comp.newprod & comp.unix
"Small though it is, the human brain can be quite effective when used properly"