[fa.tcp-ip] MILNET/ARPANET performance

tcp-ip@ucbvax.ARPA (05/29/85)

From: Mark Crispin <Crispin@SUMEX-AIM.ARPA>

Folks -

     I have spent a good bit of time feeling out Telnet
performance to TOPS-20 sites on MILNET, ARPANET, and Canada's
DRENET.  I have observed that performance between Milnet and
ARPANET is, in a word, terrible.  There are frequent echo delays
of over a minute in duration.  By comparison, ARPANET to DRENET
performance is considerably more tolerable.

     In a number of instances, Telnet performance from an
unloaded TOPS-20 system on ARPANET to another unloaded TOPS-20
system on Milnet has been terrible enough to make serious work
nearly impossible, while access to the Milnet TOPS-20 from the
Milnet TAC was smooth and quite usable.  At times, the delays
have been long enough for the Telnet user program to declare the
connection dead.

     This is a guess, but I believe that the gateways are
throwing out a lot of packets.  Unless they've changed it, all
three networks still attempt reliable delivery of all 1822
messages, so TCP's reliable delivery is in theory not resorted to.
It is probably traffic-related, since TCP performance between
ARPANET and DRENET is tolerable in spite of the slow lines at
DRENET.

     Telnet is a worst-case test of this, due to its
character-at-a-time nature.  I wonder if the TCP retransmission
parameters need tuning depending upon whether the connection is on a
reliable network (e.g. 1822) or is going through a gateway.
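
For readers who have not tuned these before, the "retransmission
parameters" in question are usually a smoothed round-trip estimate plus
the constants applied to it.  The small standalone C sketch below shows
the RFC-793-style calculation; the constants and RTT samples are
invented for illustration and are not anyone's actual implementation.

	/*
	 * Hypothetical sketch of the usual "retransmission parameters":
	 * a smoothed round-trip estimate (SRTT) and the ALPHA/BETA knobs
	 * applied to it.  A path through a gateway might plausibly want a
	 * larger BETA and minimum timeout than a path within one reliable
	 * 1822 net.  Constants and samples here are invented.
	 */
	#include <stdio.h>

	#define ALPHA	0.875	/* smoothing gain (assumed) */
	#define BETA	2.0	/* variance factor (assumed) */
	#define MINRTO	1.0	/* lower bound on timeout, seconds (assumed) */

	int main()
	{
		double srtt = 1.0;	/* initial RTT estimate, seconds */
		double rto;
		double samples[] = { 0.3, 0.4, 2.5, 3.0, 2.8 };  /* made up */
		int i;

		for (i = 0; i < 5; i++) {
			srtt = ALPHA * srtt + (1.0 - ALPHA) * samples[i];
			rto = BETA * srtt;
			if (rto < MINRTO)
				rto = MINRTO;
			printf("sample %.1fs  srtt %.2fs  rto %.2fs\n",
			    samples[i], srtt, rto);
		}
		return 0;
	}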

-- Mark --
-------

tcp-ip@ucbvax.ARPA (05/29/85)

From: Doug Kingston <dpk@BRL.ARPA>

In the dark, no one can hear you scream...

I will add my voice to the list of those who have been silent in
the past about the lousy MILNET/ARPANET gatewaying service provided
by the swamped BBN 11/03 gateways.  Supposedly they are to be upgraded
to Butterflys to solve this, but how long must we wait...  And will
it really solve the problem?

			Using the network for real work,
					-Doug-

tcp-ip@ucbvax.ARPA (05/29/85)

From: David Roode <ROODE@SRI-NIC.ARPA>

Wasn't the goal of the MILNET/ARPANET mailbridges merely to provide
mail service?  People can use a MILNET TAC rather than ARPANET host in
the first place if they want to access a MILNET host--they are located
in more and more locations, and access is essentially added
wherever it is requested.  An awful lot of people continue to use
an ARPANET TAC merely because they happen to know its phone number.
I bet there are people on this list who do not know they can
FTP a list of MILNET TAC dialup numbers (those currently
operational) off of the host SRI-NIC.ARPA with the pathname
NETINFO:TAC-PHONES.LIST  .    In fact, it might be interesting
to see what the effect on load would be if everyone who could used
a TAC on the proper network.
-------

tcp-ip@ucbvax.ARPA (05/29/85)

From: Mark Crispin <Crispin@SUMEX-AIM.ARPA>

There are those of us who are "homed" on both Milnet and ARPANET,
and periodically need to telnet from hosts to a host on the other
network, especially when software is being sloshed around all over
the place.  This isn't random hacking, either, this is real honest
to goodness Internet business.

It is NOT acceptable to tell us that the Milnet gateways are only
for mail.  It is NOT acceptable to tell us to use a TAC until such
time as enough TAC dialups can be guaranteed.  Of a number of Milnet
TACs in this area, the only one with posted dialups is the SRI TAC,
which has all lines busy about 50% of the time.

I have local ARPANET host access, so I don't care much about local
ARPANET TAC's, but I should note that as far as I know there aren't
any public 1200 baud dialups on the local ARPANET TAC at Stanford
(and I'm not aware of any others).
-------

tcp-ip@ucbvax.ARPA (05/29/85)

From: "J. Noel Chiappa" <JNC@MIT-XX.ARPA>

	I feel that I really ought to say a few words in defense of the
gateway maintainers at BBN, who I think are possibly being unjustly
maligned. (I'll try to keep this short, but it is a complex topic.
Please excuse cryptic references; I'm not trying to write a paper!)

	I'm not so sure that the real problem is in their gateways. I
don't have any exact performance figures for their gateways, but my
long experience with LSI11 gateways and MOS indicates to me that
gateways built with that technology can run at over 200 packets/second,
way fast enough to sink an IMP. I don't know if their gateways go quite
that fast, but they can probably handle packets fast enough to swamp
the ARPANet.
	I'm also not sure how much the limited number of buffers in an
LSI11 matters. When the Stanford LSI11 gateway I maintain was upgraded
to use memory mapping and have lots of buffers, the performance was
not greatly improved. (Adding something called RFNM counting did improve
it, but the BBN gateways have had that for a long time.)

	I would point at two possible causes for the problems. The
first is that the ARPANet itself is simply not designed to handle the
style of traffic load that gateways present, and I wouldn't be
surprised if it is overloaded anyway. (I've heard some comments from
BBN people that indicate it is.) I don't have any load measurements
from before the conversion to TCP (~1980) but I wouldn't be surprised
if it was up from then. Perhaps someone in BBN could look up some
figures? For aficionados of fine details, there is also a problem
called 'resource blocking' that active hosts run into, which there is
no way for host software to guard against. It results in all outbound
traffic freezing for 15 seconds.
	Also, there are a limited number of gateways between the two
nets; the largest share of the load is handled by 3, the ones at DCEC,
ISI and BBN. 'Well', you say, 'no problem, the IMP's work fine with the
same number of connections. Why not the gateways?' The answer is that
the IMPs cooperate among themselves much more closely, and in addition
have control over the rate at which traffic is let INTO THE NETWORK!
IMP's can always refuse to take packets from the hosts if the resources
to deal with them are not available. Gateways have no such control; they
get given the packets and have to deal with them as best they can.

	This leads on to the final point, which Mark alluded to in
his comment about 'throwing out a lot of packets'. This is precisely
what an overloaded gateway does, and in fact it is about the only
defense mechanism it has. Needless to say, this results in terrible
performance; in addition, network resources are wasted delivering the
packet to the point at which it is discarded.
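
To make that concrete: a gateway's buffer pool is finite, and discarding
arrivals that find it full is essentially the whole mechanism.  The toy
C program below (buffer count and rates invented, not the BBN gateway
code) shows how a sustained arrival rate above the departure rate turns
into a steady stream of discards, each of which already consumed network
resources getting to the gateway.

	/*
	 * Toy model of an overloaded gateway: a bounded buffer pool with
	 * drop-on-full as the only defense.  All figures are invented.
	 */
	#include <stdio.h>

	#define NBUF	8	/* assumed buffer pool size */

	int main()
	{
		int queued = 0, dropped = 0, forwarded = 0, t;

		for (t = 0; t < 100; t++) {
			int arrivals = 2, a;	/* 2 arrive, 1 departs per tick */

			for (a = 0; a < arrivals; a++) {
				if (queued < NBUF)
					queued++;	/* buffered for forwarding */
				else
					dropped++;	/* only recourse: discard */
			}
			if (queued > 0) {
				queued--;
				forwarded++;
			}
		}
		printf("forwarded %d packets, dropped %d\n", forwarded, dropped);
		printf("(the work spent delivering the dropped ones was wasted)\n");
		return 0;
	}
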
	Sad to say, Mark, adjusting the timers will probably not help
much. The problem is that any retransmission algorithm is guessing
based on incomplete information; things will always be non-optimal (and
there's probably a Shannon theorem that proves it). You'll either have
lots of waits, or waste lots of resources retransmitting when you don't
need to (making things worse by using those resources).

	What the network really needs to deal with these problems are
better congestion and traffic control (the ability to regulate the
traffic flow in the system better), and a lot more information passed
back to the hosts to allow them to make optimal use of the network.
	These are all just symptoms of a deeper truth, which is that
building really big packet switched networks is still an emerging
technology. Understanding of problems and proposals of new mechanisms
to handle them are appearing, but there is still a way to go.

	Noel
-------

tcp-ip@ucbvax.ARPA (05/29/85)

From: Ron Natalie <ron@BRL.ARPA>

What's even worse than swamping the 11/03 gateways is the rather
inane approach to EGP routing.  All EGP packets go through the one
EGP speaking gateway because the gateways don't communicate the
information gleaned from EGP to the MIL-ARPA bridges.  From what
I hear, the problem is DDN-PMO dragging their feet on the
matter.

In addition, the EGP gateway for MILNET is busted and trashes
a good number of packets going through it.

-Ron

tcp-ip@ucbvax.ARPA (05/29/85)

From: "J. Noel Chiappa" <JNC@MIT-XX.ARPA>

	Ron, I'm not sure I completely believe that one either,
although there is some truth to it.

	(To explain to the rest of the list what (I think) he is
alluding to, the routing protocol the BBN gateways use among
themselves, GGP, is somewhat deficient (not really its fault since it
is an ancient protocol) and cannot advise gateways of the existence of
routes that do not pass through the gateway sending the information.
To make that a little plainer, consider the concrete example of the
MIT ARPANet gateway communicating routing info via EGP with a gateway
on the ARPANet at BBN; the BBN gateway has no way, inside GGP, of
letting a gateway on the ARPAnet at ISI know that it can get directly
to MIT by going straight to the MIT ARPANet gateway. All traffic from
ISI to MIT must go via the BBN gateway.)

	It is true that this will tend to clog up the network as
a whole by sending such packets through the ARPANet twice when once
would have done. (Solving this requires replacing GGP. As I understand
it, there was a definite decision to do this in the context of the
Butterfly upgrade. I gather the schedule for that has slipped;
I'm not sure where the responsibility lies. You can argue about whether
that was a wise decision, as opposed to spending resources in upgrading
the LSI11 gateways as a bandaid.)

	However, traffic from the MILNET to the ARPANet should not be
affected directly by this problem. It is true that the network
provides no way for a host (or gateway) to pick the optimum
MILNET/ARPANet gateway from the set available; this is because the
ARPANet looks like an atomic network at the IP routing level, when in
fact as we all know it is a set of links. For this reason it is
important for hosts to set the default ARPANet/MILNet gateway by hand,
using outside knowledge of the ARPANet topology to pick the optimal
one.
	Fixing THAT problem in some non-kludge way is yet another large
unattacked issue.

	Noel
-------

tcp-ip@ucbvax.ARPA (05/29/85)

From: "Frank J. Wancho" <WANCHO@SIMTEL20.ARPA>

David,

You have a point.  Certainly MILNET host users should use MILNET TACs
where available.  (But, given a choice between a local call to an
ARPANET TAC and a long distance call by whatever service, including
Autovon, FTS, or even WATS, to a MILNET TAC, which would you choose?)

However, I *thought* the underlying Internet philosophy is to provide
"full" interconnectivity between networks.  I see no reason for
inadequate gateways to be excused on the pretense that they are for
mail only.

The problem is more pervasive when you consider the subnets with their
own inadequate gateways, all following the ARPA/MILNET model.  If the
ARPA/MILNET gateways are to be restricted to mail only, then using
them as "proven" developed models for other gateways is misleading, to
put it mildly.

There is something wrong, and just because it is more visible with
ARPA/MILNET "mail" gateways doesn't make it any less of a problem.

--Frank

tcp-ip@ucbvax.ARPA (05/30/85)

From: CLYNN@BBNA.ARPA

Mark, et al.,

	I have also experienced many lost packets and delays
between Milnet (ISI) and Arpanet (BBN), but I was trying to FTP
800 page data files.  They always seemed to timeout.  I have
also spent a lot of time trying to make retransmissions and
flow control work a little better than they did (on TOPS20s).

	The statement that TCP reliable delivery is not resorted
to is, even in theory, false.  The Arpanet and Milnet
are both reliable 1822 networks, with a nominal limit of 8
outstanding packets between Imp/port pairs.  The gateways
redirect hosts to send packets to the gateway nearest to the
sending host.

	To see the problem, consider the following diagram.

	Milnet Imp  ---Imp--- Milnet ---Imp----  Milnet Imp  -- ISIA
	    |					      |
	  BBN GW				   ISI GW
	    |					      |
BBNA --	Arpanet Imp  --Imp--- Arpanet --Imp----  Arpanet Imp

Traffic from ISIA to BBNA goes to the local imp, through the ISI GW,
across the Arpanet, to BBNA; traffic to ISI goes through the BBN GW,
cross country via the Milnet to ISIA.  The transit time through either
net is (more or less) proportional to number of hops.  Thus it takes
longer to go from the BBN GW to ISIA (via Milnet) than from BBNA to
the BBN GW (or from the ISI GW to BBNA (via Arpanet) than from ISIA to
the ISI GW), the points where the 1822 flow control is applied.
Consequently, BBNA can reliably send packets to the BBN GW faster than
the gateway can reliably get them to ISIA -- even if there is NO other
traffic in either net.  Eventually, the packets at the gateway will
build up and the gateway will have to discard the excess packets
(sending a source quench back to the host).  I.e., assume BBNA to BBN
GW is 50ms or 8 packets per 50 ms = 160 packets per second; BBN GW to
ISIA is about 300ms or 8pkt/300ms = 26 packets per second; thus 134
packets per second down the drain.  (Note that simply switching to
faster processors, e.g., a butterfly, will not help.)
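
Charlie's arithmetic can be checked mechanically.  The trivial C program
below just restates the figures from the paragraph above (8 outstanding
messages, 50 ms and 300 ms legs); nothing in it is new information.

	/*
	 * With at most 8 messages outstanding per Imp/port pair, the
	 * sustainable rate over a leg is 8 / RTT.  The 50 ms and 300 ms
	 * figures are the assumptions from the text above.
	 */
	#include <stdio.h>

	int main()
	{
		double window = 8.0;		/* 1822 outstanding-message limit */
		double near_rtt = 0.050;	/* BBNA to BBN GW, seconds */
		double far_rtt = 0.300;		/* BBN GW to ISIA via Milnet, seconds */

		printf("host can offer      %6.1f packets/sec\n", window / near_rtt);
		printf("gateway can move    %6.1f packets/sec\n", window / far_rtt);
		printf("excess (discarded)  %6.1f packets/sec\n",
		    window / near_rtt - window / far_rtt);
		return 0;
	}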

	What is needed is NOT adjustment of retransmission parameters,
what IS needed is end to end flow control algorithms that work, and
some specific guidelines to those who are implementing the protocols.

	There are a few things that could be done to relieve this
particular problem.  The gateways could be programmed to redirect the
hosts to the gateway nearest the destination (so called "destination
routing" which the gateway crew is investigating).
	It isn't simple, and requires knowledge in the gateways of
	many things about topologies and delays between pairs of
	hosts -- a long way from the "stateless" gateway originally
	described.
One can also get busy and figure out how to do flow control.
	We have added code to our TOPS20s for flow control: it
	closes windows, uses estimated baud rates to limit
	outstanding packets (instead of just filling a window),
	limits the number of packets retransmitted when one is lost,
	and it both sends and processes source quenches.  (A rough
	sketch of this kind of rate limiting appears below.)  Even if
	this all works, it may not help much until most of the other
	hosts take similar actions.
My solution to the FTP problem was to tell it to use a source route
through the ISI GW.
	It worked because the data was all flowing in one direction
	and because the TCP will automatically invert a received
	source route option (the FTP server didn't have to be changed).
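
The rate-limiting idea mentioned above can be sketched in a few lines of
C.  This is a hypothetical illustration, not the actual TOPS-20 changes;
every name and constant in it is invented.

	/*
	 * Hypothetical sketch: bound the packets in flight by an estimated
	 * path rate instead of simply filling the offered window, and cut
	 * the bound when a source quench arrives.  Invented names throughout.
	 */
	#include <stdio.h>

	struct conn {
		int	window_pkts;	/* window offered by the peer, in packets */
		int	inflight;	/* packets sent but not yet acked */
		double	est_rate;	/* estimated path rate, packets/sec */
		double	est_rtt;	/* estimated round-trip time, seconds */
	};

	/* How many packets may be outstanding right now? */
	int send_limit(struct conn *c)
	{
		int by_rate = (int)(c->est_rate * c->est_rtt);	/* rate * RTT */

		if (by_rate < 1)
			by_rate = 1;
		return by_rate < c->window_pkts ? by_rate : c->window_pkts;
	}

	/* Reaction to an ICMP source quench: halve the rate estimate. */
	void source_quench(struct conn *c)
	{
		c->est_rate /= 2.0;
		if (c->est_rate < 1.0)
			c->est_rate = 1.0;
	}

	int main()
	{
		struct conn c = { 64, 0, 160.0, 0.3 };

		printf("limit before quench: %d packets\n", send_limit(&c));
		source_quench(&c);
		printf("limit after quench:  %d packets\n", send_limit(&c));
		return 0;
	}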

Charlie

tcp-ip@ucbvax.ARPA (05/30/85)

From: Steve Aliff <Aliff@MIT-MULTICS.ARPA>

You're right.  The original brain-damaged idea was to limit gateways to
mail only.  (Although I saw several iterations with limited Telnet,
etc.) That idea seems to have been abandoned, and rightfully so, in
favor of full inter-net gateways.  I can think of several applications,
and even more user environments, where leaving one's favorite terminal
niche to find a dial-up terminal to access a TAC doesn't come close to
being a working solution. Let's find and fix the real problem and not
bring up ghastly ideas from the past.

That's the longest flame I've had recently. Apologies to all innocents
caught in the crossfire.

tcp-ip@ucbvax.ARPA (05/30/85)

From: Chris Torek <chris@gyre>

I might take this opportunity to note that many 4.2BSD sites are
retransmitting packets once every second, no matter what the actual
round trip ack time is; this doesn't help gateway load at all.

There is a bit of code in /sys/netinet/tcp_output.c that looks like this:

		if (SEQ_GT(tp->snd_nxt, tp->snd_max))
			tp->snd_max = tp->snd_nxt;
		
		/*
		 * Time this transmission if not a retransmittion and not
		 * currently timing anything.
		 */
		if (SEQ_GT(tp->snd_nxt, tp->snd_max) && tp->t_rtt == 0) {
			tp->t_rtt = 1;
			tp->t_rtseq = tp->snd_nxt - len;
		}

The second SEQ_GT is guaranteed to fail, thus nothing is ever timed; and
the retransmits happen at the maximum rate (1/second).

The code should be changed to:

		if (SEQ_GT(tp->snd_nxt, tp->snd_max)) {
			tp->snd_max = tp->snd_nxt;
			/*
			 * Time this transmission (it's not a retransmission)
			 * unless we're already timing something.
			 */
			 if (tp->t_rtt == 0) {
				tp->t_rtt = 1;
				tp->t_rtseq = tp->snd_nxt - len;
			}
		}

(Note, Berkeley has fixed this.)  I hope most 4.2 arpa sites are reading
this. . . .

Chris

tcp-ip@ucbvax.ARPA (05/30/85)

From: CERF@USC-ISI.ARPA


Folks,

Gateway performance IS important.  Especially for DoD where the
whole point of internet was to capitalize on connectivity
wherever it could be found; in a crisis, the traffic goes where it
can.

I think the gateway performance has been decreasingly
satisfactory as the level of traffic has built up.  Clearly, the
character-echoplex requirement exacerbates matters a good deal,
and the 8 messages outstanding rule on the ARPANET and MILNET
make the problem more severe since traffic gets throttled below
the TCP/IP level as a result (the new END/END protocol in the
IMPs should help some).

Are there any hard data about gateway throughput - Dave Mills
always seems to have his hands on measurement information - how
about it, Dave?

Can BBN say anything about higher capacity gateways under
development?

Before we tar the LSI-11/03 gateways, let's try to find out where
the bottleneck is - for all I know it is other than the gateway
itself.  I remember that in the Ft.  Bragg Packet Radio
experiments we found that 8 messages outstanding were the real
bottleneck and quickly went to line at a time application support
to reduce the packet rate.  This was particularly acute at Bragg
because nearly every application ran on the SAME host and the 8
message limit applied between that host (ISID) and the gateway
qua host on ARPANET.

Vint

tcp-ip@ucbvax.ARPA (05/30/85)

From: ljs@bbnccv


The 8 message limit in the Arpanet and Milnet is a major problem for
gateways.  Often in our daily statistics I have seen ARPANET (or MILNET)
gateways dropping a high percentage of packets received (20%-30%) at fairly low
throughputs (50-70 packets per second), while other gateways on faster
and non-blocking networks can pass 200 packets per second with
no dropping at all.  A quick look at the daily ARPANET log often shows 
that the ARPANET (or MILNET) IMPs were blocking their interfaces during
this period.

This says that the processing power of the LSI-11 gateway is not the problem,
at least up to 200 packets per second.  Lack of buffers in the LSI-11 is
a problem, however, since short periods of interface blocking could be
smoothed over by a greater buffering capacity.  There is a project underway
to provide more buffers for the LSI-11.  We are developing a new multiprocessor
gateway which will provide even more buffers and processing
power, in addition to a new interior routing algorithm and a better algorithm
to distribute EGP information internally.  This project is being funded by
DARPA, and to my knowledge the DDN PMO has made no commitment to switch.

The new end-to-end algorithm in the IMPs will improve the situation
considerably, since the IMP will no longer block the entire interface
just because one connection is blocked.  

In addition, there are plans to put EGP in all of the mailbridges (after
the memory upgrade).  This should reduce the EGP-related problems that
MILNET sites have been seeing.

Linda Seamonson

tcp-ip@ucbvax.ARPA (05/31/85)

From: Ron Natalie <ron@BRL.ARPA>

The goal of the MILNET/ARPANET gateways is to interconnect the two nets.
These are the only authorized ways of getting packets from hosts on
the MILNET side of the DDN backbone to hosts on the ARPANET side.  The
reason they are called mail bridges is hopefully obsolete.  Originally
certain paranoid elements in DOD thought that those experimental people
on the ARPANET were going to do something to their network, so after spending
years having an internetwork system developed, they decided that they were
going to partition the  two halves, with the exception of mail.  These
gateways were going to be a kludge that examined the TCP port number to
allow only Mail packets to go through.

Most people have probably realized that this idea is not great.  Especially
those of us on the MILNET side who need to talk to the rest of the world.
It is apparent with a little thought that it is a whole lot easier to make
a nuisance out of yourself with mail than anything else, so the
blocking gateways would not help.  My personal view is that the gateways
should remain full IP gateways, and in the case of a problem or national
emergency someone at the NOC presses the "destruct gateways" button and
partitions the net.

I don't think that the TACs are loading down the gateways.  TAC's aren't
that efficient, they just don't make that many packets.  The prime TAC
loads are the silly people who are using KERMIT through them, but most
of these people stay on their own side of the chasm.  The big load, as
always is mail.  The fact that these gateways are pretty much the same
as they were two years ago, and the net load has increased dramatically
is a significant factor.  In addition, ever since the EGP cutover, they
don't route as efficiently as they used to.  Moreover, the entire
ARPANET/MILNET IMP complex is getting into trouble.  More and more traffic
is being pumped through it but the trunk capacity is not being increased
as rapidly.

-Ron

tcp-ip@ucbvax.ARPA (06/02/85)

From: Murray.pa@Xerox.ARPA

I think "adjusting the timers" would help more than you give it credit
for. From my experience, the single biggest problem on large networks is
retransmitting too soon and too often.

Most code gets debugged in a local environment that doesn't have
gateways dropping packets because the next phone line (or net or..) is
overloaded. People tighten down the timers to make things "work better".

Unfortunately, the sociology of this problem doesn't help to get it
fixed. If you increase your timeouts, you don't get any positive
rewards. It's only when almost everybody does it that anybody will get
the benefits. Even then, the finks that don't cooperate get as much
benefit as everybody else.

tcp-ip@ucbvax.ARPA (06/02/85)

From: Lixia Zhang <Lixia@MIT-XX.ARPA>

I would support Noel's viewpoint that "adjusting the timer will not help
much" in an overloaded net.  Consider the following arguments:

- Timers, in general, are not shorter than a normal round-trip delay,
  even in the case as you mentioned, "most code gets debugged in a local
  environment".

- Therefore in most cases, if there is a series of timeouts, it is
  started by jammed or lost packets.  This means that somewhere the
  net is being overloaded by the CURRENTLY offered data traffic.

- The window size = outstanding data = traffic load offered to the net.

- Therefore without reducing the window size
                                 -> no reduction on network load
                                          -> no help to the overloaded net.

- It is true that retransmitting too soon and too often will further damage
  the situation, but simply adjusting the timer to hold up retransmission
  longer will NOT resolve the congestion.
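
A small numeric illustration of the argument (all figures invented): in
steady state a connection offers roughly window / round-trip time worth
of traffic.  The retransmission timer does not appear in that expression,
so lengthening it cannot by itself reduce the load; shrinking the window
can.

	/* Offered load is window / RTT; the timer never enters into it. */
	#include <stdio.h>

	int main()
	{
		double rtt = 0.5;	/* round-trip time, seconds (assumed) */
		double windows[] = { 8.0, 4.0, 2.0 };	/* outstanding packets */
		int i;

		for (i = 0; i < 3; i++)
			printf("window %2.0f pkts -> offered load %4.1f pkts/sec\n",
			    windows[i], windows[i] / rtt);
		return 0;
	}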

Lixia
-------

tcp-ip@ucbvax.ARPA (06/03/85)

From: POSTEL@USC-ISIF.ARPA


Folks:

I think that adjusting your timers may still have a big effect on your
own performance.  Looking at the numbers Dave Mills forwarded from the
Gateway monitoring data collected by BBN, one can see that the typical
gateway is receiving about 5 to 10 datagrams per second (maybe 20
datagrams per second during the peak hour of the week).  If one is
sending retransmissions at the rate of one per second then one is
contributing about 10% to 20% of the load on the gateway (maybe only
5% at the gateway's busiest time).  I think these numbers are still
big enough that one's own traffic is not totally lost in the vast sea
of traffic contributed by others.  I think there is not as much going
on in the network as we commonly assume, and I think that one still has
a little bit of leverage on influencing the destiny of one's own
datagrams.
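
The percentages follow directly from the quoted rates; a trivial C check
(rates taken from the paragraph above, nothing else assumed):

	#include <stdio.h>

	int main()
	{
		double my_rate = 1.0;	/* my retransmissions per second */
		double gw_rates[] = { 5.0, 10.0, 20.0 };  /* gateway datagrams/sec */
		int i;

		for (i = 0; i < 3; i++)
			printf("gateway at %4.1f dgrams/sec: my retransmissions"
			    " are about %4.1f%% of its load\n",
			    gw_rates[i], 100.0 * my_rate / gw_rates[i]);
		return 0;
	}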

--jon.
-------

tcp-ip@ucbvax.ARPA (06/03/85)

From: MILLS@USC-ISID.ARPA

In response to the message sent  Sat 1 Jun 85 22:33:56-EDT from Lixia@MIT-XX.ARPA

Lixia,

A large number of hosts have been observed here using initial retransmission
timeouts in the one-to-two second range, which has been repeatedly noted as
being too short (see RFC-889). When a couple of these WMWs gang up on a
busy gateway, instant congestion occurs and doesn't go away until the
hosts time out the ACK for their SYN, usually a minute or so. The SYNfull
gateway meanwhile is dropping lots of packets for other clients, who
themselves are ratcheting the retransmission-timeout estimate upwards.
The system is obviously unstable, even when the gateway was comfortably
underloaded to begin with. All it takes is a pulse of traffic sufficient
to topple the gateway over its buffer limit.

In other words, your argument has great merit; however the assumption that
retransmission timeouts are always longer than the roundtrip time is not
correct for many players in this circus.
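
One common remedy for the failure mode Dave describes is a conservative
initial timeout plus exponential backoff, so a host that guessed too
short at least stops pounding a congested gateway.  A hypothetical C
sketch follows; the constants are assumptions, not figures taken from
RFC-889.

	/*
	 * Back the retransmission timeout off exponentially, up to a cap.
	 * Initial value and cap are invented for illustration.
	 */
	#include <stdio.h>

	int main()
	{
		double rto = 3.0;	/* initial timeout, seconds (assumed) */
		double max_rto = 60.0;	/* cap, seconds (assumed) */
		int attempt;

		for (attempt = 1; attempt <= 6; attempt++) {
			printf("retransmission %d after %.0f seconds\n",
			    attempt, rto);
			rto *= 2.0;		/* exponential backoff */
			if (rto > max_rto)
				rto = max_rto;
		}
		return 0;
	}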

Dave
-------

tcp-ip@ucbvax.ARPA (06/03/85)

From: "J. Noel Chiappa" <JNC@MIT-XX.ARPA>

	How true that one can ruin it for all. One definite facet of
congestion control (when we eventually implement it) is server penalization
of hosts that don't obey the rules. Gotta have some feedback in the
system to encourage people to fix lossage.

	Noel
-------

tcp-ip@ucbvax.ARPA (06/03/85)

From: imagen!geof@SU-SHASTA.ARPA


To summarize the last few messages:

	1. Currently, many hosts retransmit too often.  This is
	  a major source of congestion, which can be alleviated by
	  forcing hosts to use better algorithms for their timeouts,
	  including (but not limited to) longer initial timeouts.

	2. After we do this (if we can do this), congestion in the
	  network will still be a problem, which, according to
	  Lixia Zhang's and Noel Chiappa's arguments, can only be
	  solved by controlling the entry of packets into the internet.

Clearly item 1 is important, and easier to carry out.  Item 2 is an
equally valid problem.

- Geof Cooper