[comp.protocols.tcp-ip] An even worse week for EGP....

narten@PURDUE.EDU (Thomas Narten) (01/22/88)

The trouble with bogus networks slipping into the core tables has
apparently returned. Has anyone fingered a culprit?

In my seemingly endless explorations of the EGP code in gated, I have
discovered yet another tidbit. These comments apply directly to egpup
as well, from which gated claims ancestry.

1) Totally bogus nets are slipping into the core tables. By bogus, I
mean neither class A, B or C. When encountered during the processing
of an EGP network reachability update, one cannot determine how many
bytes the address is supposed to be. Gated chokes when it receives
them in updates and tosses the partially processed update.

2) EGP detects the "problem" with the data, and assumes that the
entire packet is bad (even though it installs all the routes up to the
bogus one). After 4 such updates in a row, gated sends a cease command
to the gateway it was peering with, and goes back into neighbor
acquisition state. Meanwhile, any routes you are advertising to the
core get marked unreachable for 60 seconds.

This has happened at our site many times (1-2 dozen) since last
Thursday (i.e., the problem doesn't seem to be going away).

3) If you are running Kirton's EGP, or an old version of gated, none
of these events will be logged anywhere. In other words, if it is
happening to you, you probably aren't even aware of it.

4) The code in gated, egpup and the BBN core gateways apparently all
allow non class A, B, C nets to slip into updates [which side of the
fence do fuzzballs and vendor gateways sit on??] The problem is no
doubt code of the following type:

if (is_classA_net(net)) then
    /* do class A stuff */
else if (is_classB_net(net)) then
    /* do class B stuff */
else
    /* must be a class C net */
    /* do class C stuff */
endif
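
A minimal sketch of the alternative, assuming a hypothetical helper
net_bytes() (not a routine from gated or egpup): test each class
explicitly and treat anything else as an error, so the caller never has
to guess how many bytes the net number occupies.

```c
/* Hypothetical helper, for illustration only.
 * Takes the first octet of an IP address and returns the number of
 * network-number bytes (1, 2 or 3), or -1 if the address is not
 * class A, B or C. */
int net_bytes(unsigned char first_octet)
{
    if ((first_octet & 0x80) == 0x00)
        return 1;               /* class A: 0xxxxxxx */
    if ((first_octet & 0xC0) == 0x80)
        return 2;               /* class B: 10xxxxxx */
    if ((first_octet & 0xE0) == 0xC0)
        return 3;               /* class C: 110xxxxx */
    return -1;                  /* class D/E: no implicit default */
}
```

Note that a first octet of 207 still passes as class C here, so a check
of this kind only rejects numbers outside classes A, B and C.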

For egpup users, I strongly urge that the following patch be applied
to rt_egp.c. It will prevent egp from sending out reachability updates
for non-A,B,C nets.

*** rt_egp.c	Thu Jan 21 20:02:20 1988
--- /usr/src/local/etc/egp/rt_egp.c	Mon Nov 10 19:05:38 1986
***************
*** 128,135 ****
  
  		if( in_isa( current_gw)) n_bytes = 3;
  		else if( in_isb( current_gw)) n_bytes = 2;
! 		else if (in_isc( current_gw)) n_bytes = 1;
! 		else return(ERROR);
  		bcopy( (u_char *)&current_gw+4-n_bytes, nrp, n_bytes);
  		nrp += n_bytes;
  
--- 128,134 ----
  
  		if( in_isa( current_gw)) n_bytes = 3;
  		else if( in_isb( current_gw)) n_bytes = 2;
! 		else n_bytes = 1;
  		bcopy( (u_char *)&current_gw+4-n_bytes, nrp, n_bytes);
  		nrp += n_bytes;
  
***************
*** 152,159 ****
  							/* copy net addr */
  	    if( in_isa( net_pt->net)) n_bytes = 1;
  	    else if( in_isb( net_pt->net)) n_bytes = 2;
! 	    else if (in_isc( net_pt->net)) n_bytes = 3;
! 	    else return(ERROR);
  	    bcopy(&net_pt->net, nrp, n_bytes);
  	    nrp += n_bytes;
  	}					/* end for each net */
--- 151,157 ----
  							/* copy net addr */
  	    if( in_isa( net_pt->net)) n_bytes = 1;
  	    else if( in_isb( net_pt->net)) n_bytes = 2;
! 	    else n_bytes = 3;
  	    bcopy(&net_pt->net, nrp, n_bytes);
  	    nrp += n_bytes;
  	}					/* end for each net */

For gated users, the same basic patch applies. Look in the file
rt_egp.c, routine rt_NRnets(). I would supply the patch, but I am
running a beta test version of gated that's different from everyone
else's.

I am skeptical that the above fixes really get at the heart of the
problem. Some of the nets that are appearing apparently don't really
exist, but they are technically valid Internet addresses.

Thomas

Mills@UDEL.EDU (01/22/88)

Thomas,

The fuzzbugs do not currently perform a sanity check on IP net numbers
appearing in EGPspeak or hellospeak messages; however, they do sport
a martian filter that tosses packets with unsane IP addresses should
they appear. You make a good point that the martian filter should
also be expressed in a sanity check for routing data.

Today the situation I complained about last Sunday happened again,
with nets 207-something figuring very prominently in the noise. I tried
to catch a MILNET gateway in the act, but without success before the
wickedness evaporated from the core tables.

Dave

CLYNN@G.BBN.COM (01/22/88)

I also noticed strange network numbers a couple of months ago.  When I
asked about it, I was told that the bogus numbers were coming from a
gateway which had a noisy line to the net.  Maybe the protocol being
used didn't have a checksum, maybe it was turned off, maybe it wasn't
"strong enough".  Once the information gets into a table, it would be
propagated.  I didn't get the impression that the problem had a very
high priority; the "only" adverse effect would be to use a slot in
routing tables, and since it was a bogus net, nobody should be using
it, and they should time out.
	I suspect that, given the dynamic and evolving nature of the
Internet, any code which assumes "otherwise <something valid>" will
cause us problems before too long.  E.g., the new internet multicast
addresses, the new IP and TCP options that we would like to create,
etc.

Charlie

mcc@ETN-WLV.EATON.COM (Merton Campbell Crockett) (01/23/88)

Dave, Thomas:

Could you be more specific about the 207-something net that is figuring in the
noise?  I understand DIA has been performing some tests for the DoDIIS community
and could be using MILNET for some of the testing.  What better test data
could you find than unclassified source listings, which are stored in directory
[207,7] on DoDIIS IAS, if you wanted to test FTP between systems?

Something of a knee-jerk reaction on my part since the device handler and line
interface module software that I deliver to DIA and the DoDIIS community is
normally stored in the following directories:

	[  7,7]	Generation Command Files, Task Images
	[107,7]	Object Modules
	[207,7] Source Listings, Task Image Maps
	[307,7]	Source Modules

I've given you the full list of directories in case any of it might correspond
with the addresses you're seeing.

Merton Campbell Crockett
AN/GYQ-21(V) Program
EATON Corporation
Information Management Systems

Mills@UDEL.EDU (01/23/88)

Merton,

The typical "207 net" has a second octet which is close to 255 (decimal),
like 255, 254, 253 and 252. I even tried decoding it in ASCII, but that
didn't parse well. I then went after the NSFNET gateways, since these
dudes now squawk upwards of 120 nets and may expose scaling bugs if any non-core
gateway does; however, these gateways are on ARPANET. I have not yet
uncovered definitive evidence that would implicate MILNET dudes, although
what evidence there is points that way.

Dave

Mills@UDEL.EDU (01/23/88)

Charlie,

Those protocols known to me (EGP, RIP, hellospeak) all have reasonably
competent checksums. At one time GGP had no checksums, but has since
evolved to have them (Mike B, is my head cross-threaded?).
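
For what it is worth, here is a sketch of a 16-bit one's complement
checksum, which is how the EGP specification describes its checksum.
The function name ones_sum16() is made up for illustration, and an even
message length is assumed. A sum of this kind catches most line noise,
but it cannot detect two 16-bit words trading places.

```c
#include <stddef.h>

/* Illustrative only: one's complement of the one's complement sum of
 * the 16-bit words in p[0..len-1]. len is assumed to be even. */
unsigned short ones_sum16(const unsigned char *p, size_t len)
{
    unsigned long sum = 0;
    size_t i;

    /* Add up the message as big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((unsigned long)p[i] << 8) | p[i + 1];

    /* Fold any carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (unsigned short)(~sum & 0xFFFF);
}
```

Because addition is commutative, a message with the words 0xCFFF and
0x0102 in either order yields the same checksum.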

Dave