[mod.protocols.tcp-ip] EGP trouble

johnsson@DECWRL.DEC.COM (Richard Johnsson) (06/19/86)

I have recently noticed (I can't say if it recently happened :-) that
my EGP process is having disagreements with the EGP core gateways on
MILNET. I seem to acquire a neighbor and load the routes from it. A few
minutes later the EGP process reports bad checksums and after three
minutes drops the neighbor and switches to the other one.

Needless to say, this is causing us some grief, as the routes keep
coming and going every few minutes. I have several questions:

1. Is there something funny going on with the EGP core on MILNET?

2. Are there EGP core gateways on MILNET other than AERONET-GW and
   BBN-MINET-A-GW?

3. Is there newer/better EGP code for BSD Unix systems than what I have?
   I'm running Paul Kirton's EGP User Process with documentation dated
   23-Aug-84 which I fetched (I believe from ISI) in September 1984.

Richard Johnsson <johnsson@decwrl.dec.com>

mullen@nrl-css.UUCP (06/19/86)

In reference to yesterday's message from johnsson@decwrl.DEC.COM,
we recently started seeing the same thing here at NRL-CSS.ARPA.
We're unable to get beyond MILNET most of the time.

Could we perhaps have an acknowledgement that someone appropriate
is looking into the problem?  I've reported it to hostmaster@nic
and milnetmgr@ddn1, but I'm not sure those are the right people
to notify.

Thanks.

	Preston Mullen
	Computer Science and Systems Branch (Code 7590)
	Naval Research Laboratory
	Washington DC 20375-5000
	202 767-3507

brescia@BBNCCV.ARPA.UUCP (06/19/86)

     Could we perhaps have an acknowledgement that someone appropriate
     is looking into the problem?  I've reported it to hostmaster@nic
     and milnetmgr@ddn1, but I'm not sure those are the right people
     to notify.

When there's a problem with the core gateway system, or you suspect gateway
routing, you should call the NOC (Network Operations Center) at 800-492-4992
(or in Massachusetts, 617-497-2900).  Be prepared to name names (or net
addresses) of unreachable nets or hosts.

The tcp-ip mailing list is a bit slow for reporting current operational
problems.

BBN gateway people are looking into the lack of routing info or EGP
availability.  The first info is that both BBN-MINET-A-GW (26.1.0.40) and
AERONET-GW (26.8.0.65) have been up continuously on MILNET since Monday 13:30
(minet) and Tuesday 18:17 (aero).  The third EGP gateway at YUMA-GW
(26.3.0.75) has been down since Wednesday 0900 while the site is changing
power service.

	Mike Brescia
	617-497-3662

johnsson@DECWRL.DEC.COM (Richard Johnsson) (06/19/86)

On a suggestion from bruce@Think.COM I changed MAXPACKETSIZE from 576
to 1006 and things got a lot better. No more checksum complaints and
I'm able to hang on to my neighbor once acquired.

In looking at a trace of the routing activity, there seems to be a lot
of bouncing around. Several networks I checked on get new metrics about
every 4 minutes. Usually the gateway remains the same but the metric
bounces between 4 and 5 or 5 and 6.

Although my immediate problem is solved, it looks like things are still
not completely healthy.

Richard

bnsw@MITRE-BEDFORD.ARPA.UUCP (06/19/86)

Due to the havoc EGP has been wreaking on routing info and the user confusion
factor, I have turned off EGP.  We are a Milnet site with a subnet.  I also
noticed that we received additional routing info for our subnet that
identified bbn-milnet-gw, arpa-milnet-gw, sri-milnet-gw, sac-milnet-gw, and
isi-milnet-gw as gateways to our subnet.  This info is useless from our side,
and it is not removed when the other routes are zapped after the EGP bad
checksums and cease of the core gateway.  (Just to provide more info for
solving the problem...)

Can a status report be sent out to this distro list when the problem has
been fixed?  I haven't seen any help.

Thanks,
Barbara Seeber-Wagner

karels@MONET.BERKELEY.EDU (Mike Karels) (06/19/86)

It sounds as if the milnet has re-discovered a bug in Kirton's egp.
The same thing happened on the arpanet some months ago; I think
it was discussed on egp-people.  The problem was that routing packets
grew larger than the receive buffer, resulting in truncated packets
that won't checksum.  The simple "fix" is to increase the definition
MAXPACKETSIZE to a "large" value; I used 2048, then added code
to detect truncation.
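
The failure mode described above can be illustrated with a small sketch.
This is not Kirton's code or the actual fix; all function names here are
hypothetical, the message is a dummy payload with a checksum field assumed
at octets 4-5, and the "declared" length is assumed to be known from
somewhere else (for example the IP header). It only shows why a message
cut off at a 576-octet buffer fails the standard Internet checksum, and
how a length comparison can report truncation instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Buffer sizes quoted in this thread; every other name is hypothetical. */
#define OLD_MAXPACKETSIZE 576
#define NEW_MAXPACKETSIZE 1006

enum rx_status { RX_OK, RX_TRUNCATED, RX_BAD_CHECKSUM };

/* Internet checksum: 16-bit one's complement of the one's complement sum. */
uint16_t inet_cksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((buf[i] << 8) | buf[i + 1]);
    if (i < len)                       /* odd length: pad with a zero octet */
        sum += (uint32_t)(buf[i] << 8);
    while (sum >> 16)                  /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Build a dummy message (len >= 6) with its checksum stored at octets 4-5. */
void build_message(uint8_t *msg, size_t len)
{
    uint16_t ck;

    memset(msg, 0x01, len);            /* placeholder payload, not real EGP */
    msg[4] = 0;
    msg[5] = 0;
    ck = inet_cksum(msg, len);
    msg[4] = (uint8_t)(ck >> 8);
    msg[5] = (uint8_t)(ck & 0xff);
}

/* Model a datagram read into a fixed buffer: excess octets are dropped. */
size_t recv_into(uint8_t *dst, size_t dst_size,
                 const uint8_t *msg, size_t msg_len)
{
    size_t n = msg_len < dst_size ? msg_len : dst_size;

    memcpy(dst, msg, n);
    return n;
}

/* A correct message checksums to zero when the checksum field is included. */
int checksum_ok(const uint8_t *buf, size_t len)
{
    return inet_cksum(buf, len) == 0;
}

/* Report truncation explicitly rather than as a misleading checksum error. */
enum rx_status check_message(const uint8_t *buf, size_t got, size_t declared)
{
    if (got < declared)
        return RX_TRUNCATED;
    if (!checksum_ok(buf, got))
        return RX_BAD_CHECKSUM;
    return RX_OK;
}
```

With an 800-octet message, reading into an OLD_MAXPACKETSIZE buffer leaves
576 octets that no longer checksum, while a NEW_MAXPACKETSIZE buffer holds
the whole message and the checksum verifies. A larger buffer only postpones
the problem if updates keep growing, which is presumably why an explicit
truncation check is worth having as well.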

I have a handful of other bug fixes and tracing hooks for Kirton's egp,
but some of it isn't very well tested.  Also, it includes minor
modifications for 4.3BSD, which now leaves the IP header on ICMP packets
as it does for other raw IP protocols.  I can make these changes available
when it's cleaned up a bit.

		Mike

brescia@BBNCCV.ARPA (Mike Brescia) (06/19/86)

EGP Checksums?  There was a problem with this in February, and a fix announced
March 3, in the egp-people list (EGP-PEOPLE@bbnccv).

---- begin bug fix message ----

To: egp-people@BBNCCV.ARPA
Cc: tmallory@BBNCCV.ARPA
Subject: EGP Checksum errors
Date: 03 Mar 86 17:29:23 EST (Mon)
From: tmallory@BBNCCV.ARPA

Most, if not all, users of the Kirton EGP code on the Arpanet have seen
bad EGP checksum errors in recent weeks.  The immediate source of the
problem seems to be the following line (located with grep):

defs.h:#define MAXPACKETSIZE 576

....

For now, redefining MAXPACKETSIZE to 1006 should take care of the
immediate problem for Arpanet and Milnet sites ...

---- end of bug fix message ----


If your site is running EGP and you are not now receiving the EGP-PEOPLE
(egp-people@bbnccv) mailing list, I invite you to register your local
distribution list (e.g. egp-people@your-site) or yourself for the list.  Send
a note to egp-people-request@bbnccv.

	Mike