[comp.mail.misc] CompuServe backlog; mail servers

karl_kleinpaste@cis.ohio-state.edu (12/13/90)

Mail-based archive servers are a disease which should be stamped out
wherever they are found.

A whole slew of CompuServe users appears to have discovered the
wonders of bitftp@pucc.princeton.edu in the last week or so, not to
mention not a few other similar addresses.  Then there's the real
humans out there who are sending multi-megabyte blortfuls in to
themselves and their pals.

The pipe feeding as far as the gateway host here is T1, of course; but
CompuServe is not IP-connected, and it's just a 9600bps straw going
into CompuServe itself.  And we've only got B+ Protocol, not something
known for raw throughput capacity.  Effective throughput is more like
4800bps.
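
To put the 4800bps figure in perspective, here is a rough sketch of the arithmetic. It assumes 8 bits per byte, a decimal megabyte, and that protocol overhead is already reflected in the "effective" rate; the function name is illustrative only.

```python
def transfer_seconds(size_bytes: int, effective_bps: float) -> float:
    """Seconds needed to push size_bytes through a link at effective_bps.

    Assumes 8 bits per byte; framing and protocol overhead are taken to be
    already accounted for in the "effective" rate.
    """
    return size_bytes * 8 / effective_bps


one_megabyte = 1_000_000  # assumed decimal megabyte
minutes = transfer_seconds(one_megabyte, 4800) / 60
print(f"1 MB at 4800bps: about {minutes:.0f} minutes")
```

Under those assumptions each megabyte holds the line for close to half an hour, so a single multi-megabyte transfer can occupy it for hours.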

The MX record has been shifted to giza.cis.ohio-state.edu instead of
saqqara.cis.ohio-state.edu, so that Saqqara trades a lot of additional
mindless NFS work for no complex sendmail efforts at all.  (We
cross-mount our UUCP area.)

Nonetheless, we're struggling to keep the load on Saqqara within
double digits, UUCP logins are routinely timing out after 60 seconds
because /bin/login can't get enough done, and doing backups on it is
effectively impossible with the load so high.  I'm going to try to
balance filesystems again tonight to try to ease the abuse on one
drive, but I don't think it'll help much.

There has been discussion from time to time about the difficulties of
MBASes (mail-based archive servers) using NZIC (non-zero incremental
cost) links.  The "cost" of being an NZIC link is not necessarily
$-related.  The $-cost of the CompuServe link is essentially zero,
unless one counts my time-as-$ as the admin keeping it afloat; the
hassle-cost, admin-cost, and mail-delay-cost of the link have gotten
really, REALLY high.

We are considering, for the sake of gateway sanity, aggressively
blowing away anything that clearly comes from or is going to an
archive server.  This will require some administrative nonsense that I
don't like, because I _really_ don't like peeking in other people's
mail.  But we have to get control of the machines again.

MBASes: Just Say No, because You Don't Know Anything About The Links
Between Hither And Yon.

--karl

tneff@bfmny0.BFM.COM (Tom Neff) (12/13/90)

In article <KARL.90Dec12194010@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:
>A whole slew of CompuServe users appears to have discovered the
>wonders of bitftp@pucc.princeton.edu in the last week or so, not to
>mention not a few other similar addresses.  Then there's the real
>humans out there who are sending multi-megabyte blortfuls in to
>themselves and their pals.

A description file (NTTARZ.UUE) was recently uploaded to UNIX Forum's
Data Libraries on CompuServe.  That is the kind of thing that can get
you increased business in a hurry.

>The pipe feeding as far as the gateway host here is T1, of course; but
>CompuServe is not IP-connected, and it's just a 9600bps straw going
>into CompuServe itself.  And we've only got B+ Protocol, not something
>known for raw throughput capacity.  Effective throughput is more like
>4800bps.

I am afraid that as the world continues to discover
Internet/UUCP/Usenet, traffic will only go up.  CIS or someone had
better plan for the future.

>We are considering, for the sake of gateway sanity, aggressively
>blowing away anything that clearly comes from or is going to an
>archive server.  This will require some administrative nonsense that I
>don't like, because I _really_ don't like peeking in other people's
>mail.  But we have to get control of the machines again.

Why not just blow away anything over 10K?  Surely small BITFTP transfers
are no more objectionable than other kinds of mail.  It's the 17 part
XDungeon transfers that suck.  Just make the rule that nothing bigger
than 10K gets gated.  You don't have to know what's in the mail to know
what size it is.
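
A minimal sketch of such a size rule, assuming "10K" means 10 * 1024 bytes and that the gateway can see the raw message as bytes. The names are hypothetical; this is not how any actual gateway software was configured.

```python
MAX_GATED_BYTES = 10 * 1024  # "10K" from the post, read as 10 KiB (assumption)


def should_gate(raw_message: bytes, limit: int = MAX_GATED_BYTES) -> bool:
    """Return True if the message is small enough to pass through the gateway.

    Only the length is examined; the contents are never inspected.
    """
    return len(raw_message) <= limit
```

The check looks only at size, so it does not require reading anyone's mail.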

>MBASes: Just Say No, because You Don't Know Anything About The Links
>Between Hither And Yon.

Not necessarily so.  This here is a leaf site off UUNET which is on the
Internet.  The only extra net.cost I incur for a BITFTP above and beyond
what some tweed jacketed prof runs up on his Sun at Enchilada State is
the single UUCP link, which I pay for myself.

-- 
Thank God for atheism!        8=8=8=8          Tom Neff / tneff@bfmny0.BFM.COM

jpr@jpradley.jpr.com (Jean-Pierre Radley) (12/14/90)

In article <KARL.90Dec12194010@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:
>Mail-based archive servers are a disease which should be stamped out
>wherever they are found.
>
>A whole slew of CompuServe users appears to have discovered the
>wonders of bitftp@pucc.princeton.edu in the last week or so, not to
>mention not a few other similar addresses.  Then there's the real
>humans out there who are sending multi-megabyte blortfuls in to
>themselves and their pals.

> ... stuff deleted ...

>We are considering, for the sake of gateway sanity, aggressively
>blowing away anything that clearly comes from or is going to an
>archive server.  This will require some administrative nonsense that I
>don't like, because I _really_ don't like peeking in other people's
>mail.  But we have to get control of the machines again.

I dig it Karl, the overload problem is getting to you. So you post your
message to this group. How will anyone reading your wailings and flailings
help resolve the problems?  Have you sent this message to John James or Pete
Holsberg? Shall I?

Who's sending multi-megabyte blortfuls? CIS mailboxes can contain
multi-megabytes?
-- 

 Jean-Pierre Radley	    NYC Public Unix	jpr@jpr.com	CIS: 72160,1341

jpr@jpradley.jpr.com (Jean-Pierre Radley) (12/14/90)

In article <KARL.90Dec12194010@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:
>Mail-based archive servers are a disease which should be stamped out
>wherever they are found.

Meanwhile, back at the ranch, note the path that the above-referenced article
took:

Path: jpradley!uupsi!rpi!zaphod.mps.ohio-state.edu!usc!cs.utexas.edu!tut.cis.ohio-state.edu!cis.ohio-state.edu!karl_kleinpaste

From your cis, you get to one of your Pyramids, and then the stuff has to go
via texas and california to get to another of your Pyramids before moving on to
Troy?

-- 

 Jean-Pierre Radley	    NYC Public Unix	jpr@jpr.com	CIS: 72160,1341

les@chinet.chi.il.us (Leslie Mikesell) (12/14/90)

In article <KARL.90Dec12194010@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:

>The pipe feeding as far as the gateway host here is T1, of course; but
>CompuServe is not IP-connected, and it's just a 9600bps straw going
>into CompuServe itself.  And we've only got B+ Protocol, not something
>known for raw throughput capacity.  Effective throughput is more like
>4800bps.

Hmmm, that means that whoever asked for it is going to be paying
CIS $24/hour for that same effective 4800bps to download to
their own machine - CIS will get even more if the files are going
into the public download areas to be retrieved by several people.

>We are considering, for the sake of gateway sanity, aggressively
>blowing away anything that clearly comes from or is going to an
>archive server.  This will require some administrative nonsense that I
>don't like, because I _really_ don't like peeking in other people's
>mail.  But we have to get control of the machines again.

Why don't you just attach a header pointing people to the things available
from uunet's 900 number or your own anon uucp, either of which would be
cheaper than the CIS on-line charge if the particular thing they want
can be found there.  If you really have to peek in the messages you
could respond to any directory request headed toward an archive server
with your own info message.

It seems like it would be fair to ask CIS to provide whatever resources
you need to keep running anyway, since if you are keeping a 9600 baud
line busy around the clock that should generate at least $5-600/day
for them in connect time as people download it.
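
The $5-600/day estimate follows from the $24/hour figure above. A small sketch of that arithmetic, assuming every hour the line spends carrying data is matched by one billed hour of downloading on the CIS side (the function name is illustrative):

```python
def daily_connect_revenue(rate_per_hour: float, busy_hours: float = 24.0) -> float:
    """Connect-time revenue for one day, given an hourly rate and busy hours."""
    return rate_per_hour * busy_hours


print(daily_connect_revenue(24.0))  # 24 hours at $24/hour
```

If several people download the same file, the figure scales up accordingly.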

>MBASes: Just Say No, because You Don't Know Anything About The Links
>Between Hither And Yon.

Easy for you to say.  What's the alternative for the rest of us? 

Les Mikesell
  les@chinet.chi.il.us

AMillar@cup.portal.com (Alan DI Millar) (12/14/90)

> We are considering, for the sake of gateway sanity, aggressively
> blowing away anything that clearly comes from or is going to an
> archive server.  This will require some administrative nonsense that I
> don't like, because I _really_ don't like peeking in other people's
> mail.  But we have to get control of the machines again.

I understand your point of view.  I also happen to be one of these
Leeches Of The Electronic Society who uses BITFTP (and don't forget the
LISTSERVs which access SIMTEL20).  As such, I have a vested interest in
archive servers.

May I make a suggestion?  If the line cost is not the issue, but machine load,
is there a way to _delay_ large mail messages from archive servers instead
of blowing them away?  Can you put them in some sort of low priority queue
to be processed at off-hours or something like that?  

Even if it took five days for an archive server reply to get to its
requester, I know I would still love it more than NO access.
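
One way to picture the delay idea is a two-level priority queue in which bulky archive-server mail is only released during off-hours. This is a sketch under assumptions: the 10 KiB threshold, the 01:00 to 05:59 window, and all names are hypothetical, and a real mailer would need persistent queues.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

LARGE_BYTES = 10 * 1024   # hypothetical threshold for "large"
OFF_HOURS = range(1, 6)   # hypothetical off-peak window, 01:00 to 05:59


@dataclass(order=True)
class Queued:
    priority: int                       # 0 = normal mail, 1 = deferred bulk
    seq: int                            # preserves arrival order within a level
    message: bytes = field(compare=False)


class DeferringQueue:
    def __init__(self) -> None:
        self._heap = []
        self._seq = count()

    def put(self, message: bytes, from_archive_server: bool) -> None:
        """Queue a message; large archive-server mail goes to the slow lane."""
        bulk = from_archive_server and len(message) > LARGE_BYTES
        heapq.heappush(self._heap, Queued(1 if bulk else 0, next(self._seq), message))

    def get(self, hour: int):
        """Return the next message deliverable at this hour, or None."""
        if not self._heap:
            return None
        if self._heap[0].priority == 1 and hour not in OFF_HOURS:
            return None  # only deferred bulk mail remains; wait for off-hours
        return heapq.heappop(self._heap).message
```

Ordinary mail always drains first; the bulk replies wait, but they are not thrown away.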

- Alan Millar    AMillar@cup.portal.com

bob@MorningStar.Com (Bob Sutterfield) (12/17/90)

In article <KARL.90Dec12194010@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:
   Mail-based archive servers are a disease which should be stamped
   out wherever they are found.
   ...
   MBASes: Just Say No, because You Don't Know Anything About The
   Links Between Hither And Yon.

Amen.  Say it with feeling, brother!

   A whole slew of CompuServe users appears to have discovered the
   wonders of bitftp@pucc.princeton.edu in the last week or so, not to
   mention not a few other similar addresses.
   ...
   We are considering, for the sake of gateway sanity, aggressively
   blowing away anything that clearly comes from or is going to an
   archive server.  This will require some administrative nonsense
   that I don't like, because I _really_ don't like peeking in other
   people's mail.

That's the lesser evil because

   But we have to get control of the machines again.

is a system and gateway maintainer's first responsibility.  You might
consider implementing a filter to replace the message bodies of the
irresponsible abusers' traffic with a brief explanation of why they
can't take advantage of your service any more.  That way at least they
won't be able to plead ignorance, complain about broken servers, or
bleat about censorship.

Such a filter can probably be expressed as a Perl one-liner, hidden
away in a Sendmail ruleset somewhere.  It's all just line noise :-)
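
For illustration only, here is the same idea sketched in Python rather than Perl, using the standard library's email package. It assumes plain single-part messages; the notice text and function name are made up, and hooking it into a mailer is left out entirely.

```python
from email import message_from_string

# Hypothetical wording; the gateway admin would supply the real text.
NOTICE = (
    "The body of this message was removed at the gateway because\n"
    "archive-server traffic is overloading the link.\n"
)


def replace_body(raw: str, notice: str = NOTICE) -> str:
    """Keep the original headers but swap the body for a short explanation."""
    msg = message_from_string(raw)
    msg.set_payload(notice)
    return msg.as_string()
```

The headers survive, so the recipient still sees who sent the message and what it was about.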

karl_kleinpaste@cis.ohio-state.edu (12/18/90)

FYI: We gained substantially on the CServe-bound mail queue over the
weekend.  Unfortunately, at the current rate of gain, it would take a
week to clear completely.  So tomorrow morning, we're bringing up
SneakerNet: I'm going to dump a dd(1) tape of the current backlog and
drive it over to CServe, where my counterpart is going to inhale them
The Hard Way for delivery.

"Never underestimate the bandwidth of a
 station wagon loaded with magtapes."
	--Unknown, but possibly Bob Sutterfield (Bob?)

jpr@jpradley.jpr.com writes:
>  I dig it Karl, the overload problem is getting to you. So you post your
>  message to this group. How will anyone reading your wailings and flailings
>  help resolve the problems?  Have you sent this message to John James or Pete
>  Holsberg?

On the latter, yes, I have.  And I'll go check CServe's UNIX Forum in
about half an hour to see what's up.

On the former, I posted the note here both for the information content
so people know why things are delayed, as well as to editorialize on
the evils of MBASes.  As a result of the posting, I am informed in
private mail that
	[a] Uunet's mail servers will not respond to any but uunet
	    customers.
	[b] Brian Kantor & Co are working on a mail server which
	    will respond only if there is an IP addr for the origin;
	    or the address is local-to-UCSD-uucp-host!user; or the
	    address is uunet!one-machine-hop!user.
I find both of these to be done The Right Way.  I find
bitftp@pucc.princeton.edu to be done The Wrong Way.
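
A rough sketch of the three conditions in [b], as one possible reading of them and not the actual UCSD code. It assumes "has an IP addr" can be approximated by a hostname lookup, that a two-hop bang path whose first hop is in a given set counts as local to the UCSD UUCP host, and that uunet!host!user is the only accepted three-hop form. All names are hypothetical.

```python
import socket
from typing import Callable, Set


def has_ip_address(host: str) -> bool:
    """True if the host name resolves to an IP address."""
    try:
        socket.gethostbyname(host)
        return True
    except OSError:
        return False


def may_serve(
    origin: str,
    local_uucp_hosts: Set[str],
    resolves: Callable[[str], bool] = has_ip_address,
) -> bool:
    """Decide whether a request from this origin address gets a reply."""
    if "@" in origin:
        # Domain-style address: serve only if the host has an IP address.
        host = origin.rsplit("@", 1)[1]
        return resolves(host)
    hops = origin.split("!")
    if len(hops) == 2:
        # host!user, where host is a direct UUCP neighbor.
        return hops[0] in local_uucp_hosts
    if len(hops) == 3:
        # uunet!one-machine-hop!user
        return hops[0] == "uunet"
    return False
```

The resolver is passed in as a parameter so the rule can be exercised without touching the network.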

(In fairness, I may be picking on Princeton too much.  When the queue
exploded 10 days ago, and we poked at things to learn what it was that
was happening to us, bitftp figured prominently.  I no longer know for
sure who all the guilty parties are.)

>  Who's sending multi-megabyte blortfuls? CIS mailboxes can contain
>  multi-megabytes?

CServe subscriber mailboxes can contain up to 100 messages, each up to
50Kbytes.  Or so I was told last time I asked.  This doesn't prevent
people from _trying_ to send multi-megabyte blortfuls, with one of two
effects:
	[1] The recipient downloads/deletes things as fast as they
	    come in;
	[2] The excess stuff starts bouncing back at the originator
	    due to a "mailbox full" error.  (And yes, there are
	    glitches where it inadvertently returns to the From: on
	    this error rather than the Sender:.  CServe folk are aware
	    of the problem and will fix it eventually.)
Guess what the latter does to the gateway machine...

>  Path: jpradley!uupsi!rpi!zaphod.mps.ohio-state.edu!usc!cs.utexas.edu!
>	 tut.cis.ohio-state.edu!cis.ohio-state.edu!karl_kleinpaste

>  From your cis, you get to one of your Pyramids, and then the stuff
>  has to go via texas and california to get to another of your
>  Pyramids before moving on to Troy?

Zaphod is Dave Alden's Sun4 (SS1+, I believe) in Math.  It's currently
got a hammerlock on the #1 slot in the Top 1000 Influence List,
running 11 percentage points above uunet in news propagation.  I have
no affiliation with that machine, though we exchange a feed, and I'm
surprised that Tut didn't feed the article direct to Zaphod before
Fletcher et al got a hold of it.

--karl

mrm@sceard.Sceard.COM (M.R.Murphy) (12/19/90)

In article <KARL.90Dec17233700@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:
[...]
>"Never underestimate the bandwidth of a
> station wagon loaded with magtapes."
>	--Unknown, but possibly Bob Sutterfield (Bob?)

I heard this as "the fastest way to deliver data across the country is to fill
a truck with magtapes and drive it" in the early 1970s. Is it worth calculating
to see if this might be the case? I think it was Tom Reidel that said it.

[...]
>
>On the former, I posted the note here both for the information content
>so people know why things are delayed, as well as to editorialize on
>the evils of MBASes.  As a result of the posting, I am informed in
>private mail that
>	[a] Uunet's mail servers will not respond to any but uunet
>	    customers.
>	[b] Brian Kantor & Co are working on a mail server which
>	    will respond only if there is an IP addr for the origin;
>	    or the address is local-to-UCSD-uucp-host!user; or the
>	    address is uunet!one-machine-hop!user.
>I find both of these to be done The Right Way.  I find
>bitftp@pucc.princeton.edu to be done The Wrong Way.

I recognize the problems (evils:-) of MBASes. I think that a major part of the
problem is trying to put two pounds in a one pound sack. That is, to try to
have a low-bandwidth link as the sole path between two high traffic generating
entities is likely to be a loser. The solutions are:

  1) generate less traffic,
  2) raise the link speed, or
  3) make more paths.

Number one contains the "kiss off MBASes, they stink 'cause they flood my
site" solution. Removing the service is probably the same as pulling the plug
on a machine because it is too noisy. Other ways to do this are to raise the
price or slow the turnaround or limit the flow.

Number two is probably the simplest solution. It may cost money, but it provides
new capabilities and a new toy. As always, hopefully it is someone else's money
and effort :-) :-(

Number three is the artery bypass solution. If a single path is too clogged, as
even a high speed link will become if the traffic increases, then new paths for
information flow will really help. This also contains the brian@ucsd.edu and
uunet.uu.net providing mail server solutions. (Easy for me to say, we're
connected to both :-) I think this is the preferable solution, but I'm biased.

This brings up what I think is likely to become a Real Problem. Check the UUCP
maps. See how often uunet is in a path. They become a real limiting resource
as a transport path. They caused, through their excellent service, a network
that was distributed to become hierarchical. That is, to send to a site across
town, the message first goes through Virginia. This creates a burden that is
rather large when the cross-town message is a biggie. It would seem that it
might be getting close to time to regionalize uunet. What I mean by this is
have uunet distribute its service geographically such that (simplifying to just
four regions, and in UUCPmapspeak)

uunet-north	.uu.net(LOCAL)
uunet-north=uunet
uunet-south	.uu.net(LOCAL)
uunet-south=uunet
uunet-east	.uu.net(LOCAL)
uunet-east=uunet
uunet-west	.uu.net(LOCAL)
uunet-west=uunet

and have sites in the west hook to uunet-west and such, and have uunet choose
best internal routing such that a west-west transfer stays in the west and a
west-east transfer doesn't need to go south for the winter. Did I get the
simplified example right? Correct it, please.

WRT mail, this can be accomplished to some degree by having uunet provide
nameservice and MX's pointing to regional centers that handle mail forwarding.
This brings up the problem that uunet could probably not survive financially
on nameservice and MX alone. It is too valuable a resource to kiss it off.
Would regionalization as above help solve what probably will become a problem?
Is it already this way internally in .uu.net? Does anyone care?

BTW, I know that uunet is now its own, and that this is none of my business
except as a concerned customer who likes the service.

[...]
>	[2] The excess stuff starts bouncing back at the originator
>	    due to a "mailbox full" error.  (And yes, there are
>	    glitches where it inadvertently returns to the From: on
>	    this error rather than the Sender:.  CServe folk are aware
>	    of the problem and will fix it eventually.)
>Guess what the latter does to the gateway machine...

I'd hope that eventually meant quicker than between trips to the bank to
deposit receipts. It would seem that the problems of cis.ohio-state.edu
can be attributed to being the sole, slow-link path 'twixt two large
entities. Cut the link and don't think about other solutions :-)
Or just provide service for air-letters and don't provide parcel post.

Now back to bickering about header rewriting :-)
-- 
Mike Murphy  mrm@Sceard.COM  ucsd!sceard!mrm  +1 619 598 5874

jbaltz@cunixf.cc.columbia.edu (Jerry B. Altzman) (12/21/90)

In article <KARL.90Dec17233700@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:
>	[b] Brian Kantor & Co are working on a mail server which
>	    will respond only if there is an IP addr for the origin;
>	    or the address is local-to-UCSD-uucp-host!user; or the
>	    address is uunet!one-machine-hop!user.

Perhaps the way to do this right is the way LISTSERV does it. Although not
infallible, an MBAS may keep a log file for a user, and not permit him/her
to order more than XXXK/day, which could certainly slow it down a bit.
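
A minimal sketch of such a per-user daily cap, assuming the server can identify the requester and knows the size of each reply. The limit is left as a parameter because the post gives no figure; the class and method names are hypothetical, and this is not LISTSERV's actual mechanism.

```python
from collections import defaultdict
from datetime import date


class DailyQuota:
    """Track bytes ordered per user per day and refuse requests over the cap."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit = limit_bytes
        self._used = defaultdict(int)  # (user, day) -> bytes ordered so far

    def allow(self, user: str, size: int, day: date) -> bool:
        """Record the request and return True if it fits in today's quota."""
        key = (user, day)
        if self._used[key] + size > self.limit:
            return False
        self._used[key] += size
        return True
```

A refused request is not recorded, so a user who asks for one oversized file can still fetch smaller ones the same day.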

DISCLAIMER: This isn't Columbia. This is me. Columbia is them.

>--karl
//jbaltz
--
jerry b. altzman  "I didn't do it, and when I did I wasn't"      212 854 8058
jbaltz@columbia.edu   "there, and you deserved it."      jauus@cuvmb (bitnet)
NEVIS::jbaltz (HEPNET)                    ...!rutgers!columbia!jbaltz (bang!)

karl_kleinpaste@cis.ohio-state.edu (12/22/90)

mrm@sceard.Sceard.COM writes:
   >"Never underestimate the bandwidth of a
   > station wagon loaded with magtapes."
   >	--Unknown, but possibly Bob Sutterfield (Bob?)

   I heard this as "the fastest way to deliver data across the country
   is to fill a truck with magtapes and drive it" in the early 1970s.
   Is it worth calculating to see if this might be the case? I think
   it was Tom Reidel that said it.

I don't know if it's the original citation, but Craig Ward informs me
in private mail that there's a slightly different wording of this
maxim in Tanenbaum's _Computer Networks_, 2nd Ed., p. 57.  I checked,
and it's there.  Thanx to Craig for the ref.

   [Reducing traffic] contains the "kiss off MBASes, ..."

I don't really want to do that, and we didn't (except by accident,
when uux locking broke down under the load, and mail crashed and
burned [smail diagnostic: "uux failed: -1"]:-).

As much as I dislike MBASes, they have their place.  The problem is
that most (in fact, all but a very very few) MBASes are insensitive to
resource abuse.  Kantor's solution is elegant and resource-(ab)use-
sensitive.

   [Raising link speed] is probably the simplest solution. It may cost money,

We're working on that, in several directions.  Just this instant,
there's little to be done directly.  Ask any corporate entity to do
something that will cost money, and watch inertia take over.  It'll
happen, but it'll take time.

   [Making more paths] is the artery bypass solution.

Already implemented.  We're running two lines to keep ahead of the
(still much higher than usual) flow.

   Now back to bickering about header rewriting :-)

Oh, fun. :-)

--karl

lws@comm.wang.com (Lyle Seaman) (12/29/90)

mrm@sceard.Sceard.COM (M.R.Murphy) writes:

>I heard this as "the fastest way to deliver data across the country is to fill
>a truck with magtapes and drive it" in the early 1970s. Is it worth calculating
>to see if this might be the case? I think it was Tom Reidel that said it.

Well, if we update the example a bit, at 16 Gigabytes/pound for optical
disks, a one-ton pickup truck full crossing the country in three days has
a bandwidth of 126 MBytes/sec.  Which is pretty good, but the delay is 
awful.  Do you know how long it takes to write 32 Terabytes of data? 
(Actually, the cost isn't too bad either.)
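
The figures above can be checked with a short calculation. This sketch assumes "one-ton" means 2000 pounds of payload and 1024 MBytes per Gigabyte, which is the combination that reproduces the quoted 126 MBytes/sec; the names are illustrative.

```python
GB_PER_POUND = 16       # from the post
PAYLOAD_POUNDS = 2000   # "one-ton" read as a US short ton (assumption)
TRANSIT_DAYS = 3        # from the post
MB_PER_GB = 1024        # assumption; reproduces the quoted figure


def truck_payload_gb(pounds: int = PAYLOAD_POUNDS) -> int:
    """Total data carried, in Gigabytes."""
    return GB_PER_POUND * pounds


def truck_bandwidth_mb_per_sec(pounds: int = PAYLOAD_POUNDS,
                               days: float = TRANSIT_DAYS) -> float:
    """Average bandwidth of the trip in MBytes/sec."""
    seconds = days * 24 * 60 * 60
    return truck_payload_gb(pounds) * MB_PER_GB / seconds


print(truck_payload_gb(), "GB carried")
print(round(truck_bandwidth_mb_per_sec()), "MBytes/sec")
```

The payload comes to 32,000 GB, matching the 32 Terabytes mentioned, and the bandwidth is computed over the transit time only, ignoring the time needed to write and read the disks.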

-- 
Lyle                  Wang           lws@capybara.comm.wang.com
508 967 2322     Lowell, MA, USA     Source code: the _ultimate_ documentation.