[net.news] mailing lists vs. newsgroups: facts

chuqui@nsc.UUCP (Chuq Von Rospach) (09/07/85)

In article <3500005@ccvaxa> preece@ccvaxa.UUCP writes:
>In what sense does a mailing list do a better job?  (1) It is less
>visible to new readers, since it isn't just there to be browsed on
>every site. (2) The traffic still has to be passed along the route
>to each reader, as mail.  In some cases that will mean MORE net traffic
>than if the notes had been passed as news.

>I wonder how significant that is.

Oh, I do so hate to put a damper on an argument, but let's try using facts
for once and see what happens...

The following formula shows the number of readers needed on a mailing list
for a newsgroup conversion to break even:
    list_readers = (sites_on_net*efficiency)/(increase*average_hops)

The derivation of that formula is at the bottom of this article for those
that want to check my math. The definitions are:

    o sites_on_net -- the number of sites a message in this newsgroup
    is distributed to.  I'll use 1950 based on the following:  I assume
    2200 sites on the net. I assume 5% of those sites are local networks
    and transfer cost is 'free'.  5% of the sites turn the group off for
    some reason.  That leaves you about 1950 sites.

    o increase -- the factor by which readership increases when
    converted to a newsgroup (1 is no increase, 2 is doubled, 4 is
    quadrupled, etc...). For the best case, let's assume readership
    quadruples; for the worst case, it merely doubles.

    o efficiency -- the efficiency advantage of news transport over
    mail which is shown as (1-%reduction_for_efficiency). No batching
    saves you 0%, batching with no compression is about 35%, and full
    compression is about 55-65%.  The variance between worst case and
    best case is an estimate of the number of sites running various
    batching schemes, and worst case could (theoretically) be as low as
    0%, but let's use the range 35% to 65%.  Because news feeds tend to
    be shorter distances than a lot of mail feeds, add another 10%.
    Worst case is then (1-.45) or .55 and best is (1-.75) or .25.

    o average hops -- the number of hops, on average, that a message in
    a mailing list needs to travel from the list to the recipient.
    Based on my two large mailing lists I've run (lan-news last year,
    nuke-winter this year) the average number of hops from my site to the
    person on the list is about 4. Let's use 3 for a best case and 5 for a
    worst case.

    o many mailing lists (mail.feminists, for instance) use intermediary
    distribution points to reduce the number of total hops. Mail.feminists
    has something like 200 people on it, but a lot of messages are sent out
    to sites that redistribute them further to keep the load down. This
    feature allows a list to support a lot more users before hitting the
    breakeven point.

    o large mailing lists can be digested, thereby reducing a lot of
    mail overhead by shipping fewer but larger messages, which also puts
    off the breakeven point (this could also be done by a mod.all group)

Best case breakeven then becomes (1950*.25)/(4*3) or 40 people on the list.
Worst case breakeven is (1950*.55)/(5*2) or 107 people on the list.
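
A minimal Python sketch for checking that arithmetic. The function name
breakeven_readers and its argument names are made up here to mirror the
formula; the inputs are the estimates listed in the definitions above.

```python
def breakeven_readers(sites_on_net, efficiency, increase, average_hops):
    """Mailing-list size at which mail and news cost the same per reader."""
    return (sites_on_net * efficiency) / (increase * average_hops)

# Best case for news: full compression (.25), readership quadruples, 3 hops.
best = breakeven_readers(1950, 0.25, 4, 3)
# Worst case for news: batching only (.55), readership doubles, 5 hops.
worst = breakeven_readers(1950, 0.55, 2, 5)

print(f"best case:  about {best:.1f} readers")
print(f"worst case: about {worst:.1f} readers")
```

Both results land within a reader of the figures quoted above (roughly 40
and 107).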

In general, it looks like when a list hits somewhere between 50 and 75
readers it makes sense to turn it into either a moderated group (if
content regulation is of interest) or a net.all group (if you want a
free-for-all). 

===== Caveats =====
    o Volume tends to be higher on a newsgroup. Also, there tends to be a
    higher amount of garbage because of the loss of moderation. If there is
    a reason to keep the garbage out, a moderator ought to be used with a
    mod.all group or the mailing list ought to be maintained.

    o hop_count_cost assumes netwide traffic. Certain sites (ihnp4 and
    other major mail gateways) would see higher traffic patterns
    because of a mailing list, leaf sites would see lower.

    o Many of those numbers are estimates. Your mileage may vary,
    especially the mailing list -> newsgroup audience increase. It may
    actually be as low as 1:1, and as high as infinity -- we have no
    data to work on.  average_hops varies with how well connected the hub
    of a mailing list is, but even if it only talks to ihnp4 the
    average path isn't much worse than 5.

    o With the exception of the fudge factor in the news efficiency, the
    increased cost of a long distance hop over a local hop is ignored.
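
Since the audience increase is the least certain input, a short sketch of
how the breakeven point moves with it may be useful. The other inputs are
held at mid-range values picked purely for illustration (efficiency factor
.40, 4 hops); those are assumptions, not measurements, and the function
name is hypothetical.

```python
def breakeven_readers(sites_on_net, efficiency, increase, average_hops):
    """Mailing-list size at which mail and news cost the same per reader."""
    return (sites_on_net * efficiency) / (increase * average_hops)

# Assumed mid-range inputs, for illustration only:
# 1950 sites, efficiency factor 0.40, 4 hops on average.
table = {inc: breakeven_readers(1950, 0.40, inc, 4) for inc in (1, 2, 4, 8)}

for inc, readers in table.items():
    print(f"increase {inc}x -> breakeven near {readers:.0f} readers")
```

Each doubling of the assumed increase halves the breakeven point, so a 1:1
audience pushes it to several times the quadrupled-audience figure.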

=== breakeven formula generation ===

A hop_count_cost is considered to be the total_hops/list_readers

For a mailing list, total_hops can be defined as (average_hops * list_readers)
so the hop_count_cost becomes (average_hops * list_readers)/list_readers
or average_hops.

For a newsgroup, total_hops is defined as the number of sites on the net.
list_readers needs to be extrapolated from the number of readers on the
mailing list, and we throw in a fudge factor because transfer by batching
in news is more efficient than shipping mail. The formula becomes:

    (sites_on_net*efficiency)/(list_readers*increase)

Setting those two equations equal to each other, we can find the breakeven
point. The formula is:
    average_hops = (sites_on_net*efficiency)/(list_readers*increase)

which becomes
    list_readers = (sites_on_net*efficiency)/(increase*average_hops)

and you solve for the number of readers that need to be on the list for a 
conversion to a newsgroup to break even.
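
The derivation can be sanity-checked numerically: at the breakeven list
size the two per-reader costs should match, with mail cheaper below it and
news cheaper above it. This is a sketch only; the function names and the
sample inputs (efficiency .40, increase 3, 4 hops) are assumptions chosen
for illustration.

```python
def mail_cost_per_reader(average_hops, list_readers):
    total_hops = average_hops * list_readers
    return total_hops / list_readers  # simplifies to average_hops

def news_cost_per_reader(sites_on_net, efficiency, list_readers, increase):
    return (sites_on_net * efficiency) / (list_readers * increase)

def breakeven_readers(sites_on_net, efficiency, increase, average_hops):
    return (sites_on_net * efficiency) / (increase * average_hops)

# Illustrative inputs, not measurements.
sites, eff, inc, hops = 1950, 0.40, 3, 4
n = breakeven_readers(sites, eff, inc, hops)

mail = mail_cost_per_reader(hops, n)
news = news_cost_per_reader(sites, eff, n, inc)
print(f"breakeven at {n:.0f} readers: mail {mail:.2f}, news {news:.2f}")

# Half the breakeven size favors mail; double it favors news.
mail_wins_small = mail_cost_per_reader(hops, n / 2) < news_cost_per_reader(sites, eff, n / 2, inc)
news_wins_large = news_cost_per_reader(sites, eff, n * 2, inc) < mail_cost_per_reader(hops, n * 2)
```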

=== final disclaimer ===

Putting together this article I have finally figured out why so few people
bother with facts while arguing on the net. It took me about 2 hours to put
the math together and a lot of thinking (in other words, work...) It is a
lot easier to play with supposition and opinion, and I guess we get lazy
after a while...

chuq

-- 
Chuq Von Rospach nsc!chuqui@decwrl.ARPA {decwrl,hplabs,ihnp4}!nsc!chuqui

An uninformed opinion is no opinion at all. If you don't know what you're
talking about, please try to do it quietly.

preece@ccvaxa.UUCP (09/11/85)

> In article <3500005@ccvaxa> preece@ccvaxa.UUCP writes:
> >In what sense does a mailing list do a better job?  (1) It is less
> >visible to new readers, since it isn't just there to be browsed on
> >every site. (2) The traffic still has to be passed along the route
> >to each reader, as mail.  In some cases that will mean MORE net traffic
> >than if the notes had been passed as news.
> 
> >I wonder how significant that is.

> chuqui@nsc replies:
> Oh, I do so hate to put a damper on an argument, but let's try using
> facts for once and see what happens...

> ... detailed calculation showing that mailing lists are more
> efficient up to a crossover point somewhere between 40 and 100...

> Putting together this article I have finally figured out why so few
> people bother with facts while arguing on the net. It took me about 2
> hours to put the math together and a lot of thinking (in other words,
> work...) It is a lot easier to play with supposition and opinion, and I
> guess we get lazy after a while...
----------
Thanks for the work.  I didn't have the energy or the numbers to do it.
On the other hand, it says just what I said, only in a little more
detail: "In some cases [distribution as a mailing list] will mean
MORE net traffic...".  The assumptions used in the calculation seem
pretty reasonable.  On the other hand, a number of people have
recently suggested that backbone sites should have more say in voting
than leaf sites, and they are the sites whose traffic is more heavily
affected by the changeover (if I ran a mailing list with 50 people
getting it, the link from me through uiucdcs and ihnp4 would get all
50 copies of every distribution and every response, as opposed to one
copy in news form).  The assumptions specifically weight all sites
evenly. Actually, I would have guessed the breakeven would be
a little HIGHER than chuqui calculated it to be, but that's no
big deal.
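
The uneven per-link load can be sketched by counting copies on each link.
Everything here is hypothetical except the shared first hops named above:
the reader sites ("siteNN") are invented, and all 50 readers are assumed
to be reached through ihnp4.

```python
from collections import Counter

# Hypothetical delivery paths for a 50-reader list hosted at ccvaxa.
paths = [["ccvaxa", "uiucdcs", "ihnp4", f"site{i:02d}"] for i in range(50)]

def mail_link_load(paths):
    """Copies carried per link when each reader gets a separate message."""
    load = Counter()
    for path in paths:
        for link in zip(path, path[1:]):
            load[link] += 1
    return load

def news_link_load(paths):
    """News sends one copy over a link no matter how many readers lie beyond.
    Only links on the paths above are counted, not the rest of the net."""
    return Counter({link: 1 for path in paths for link in zip(path, path[1:])})

mail = mail_link_load(paths)
news = news_link_load(paths)
shared = ("ccvaxa", "uiucdcs")
print(f"{shared}: {mail[shared]} copies as mail, {news[shared]} as news")
```

The averaged formula hides this: the leaf links carry one copy either way,
while the shared links near the hub carry every copy of the mailing.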

It's hard for me to believe that anything doesn't get read by at
least 40 people, net-wide.

-- 
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece