root@topaz.RUTGERS.EDU (Charles Hedrick) (12/06/86)
There are two issues raised in this discussion: routed as a gateway-to-gateway routing protocol, and routed as a method by which non-gateway hosts discover what gateway to use when they want to talk to another net. Potentially, one might use routed among the gateways, and some other method for normal hosts to find a gateway. Or vice versa. Thomas Narten's message is evidence that I did not distinguish these two issues carefully enough.

For smallish networks connected by high-speed media, routed is probably OK as a protocol for gateways to keep each other up to date about the topology of the network. My original assertion was that it was not appropriate in the general case. Unfortunately, I don't recall all of the problems. But here is one. It's called the "counting to infinity problem". Consider the following set of networks and gateways. Letters are gateways. Numbers are networks. The numbers under the gateways are the hop count they show to get to network 3.

    A    1     B    2    C   3
    X----------X---------X--------
    2          1         0

Now suppose that gateway C goes down. For a while, A and B will continue to believe, and advertise, the old routes. But eventually they will time out. Here is what will happen when B realizes that the route via C is no longer current.

    A    1     B    2    C   3
    X----------X--------- --------
    2          3

Since A has continued to advertise a hop count of 2, B thinks it can route via A. At some point, A will realize that its old route is no longer any good. But B is now advertising a hop count of 3. So we get

    A    1     B    2    C   3
    X----------X--------- --------
    4          3

Etc. While this is happening, any packet destined for network 3 is bounced back and forth between A and B until its time to live goes to zero.

There are other similar problems that can happen if you have larger numbers of gateways. Some of them result in long delays before new routing patterns stabilize. Others result in no convergence at all.
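The A/B exchange above can be played out in a few lines. This is only a sketch of the example, not routed's code: the function name, the alternating update order, and the cutoff of 16 (the value conventionally treated as "unreachable" in RIP-style protocols; the post does not name one) are all assumptions.

```python
# Hypothetical simulation of the A-B-C example; not taken from routed itself.
INFINITY = 16  # assumed "unreachable" metric; the post does not name a value


def simulate(max_rounds=20):
    """Return the (A, B) hop counts to network 3 after each update, once C is down."""
    a, b = 2, 1  # hop counts before the failure, as in the first diagram
    history = []

    # B times out its route via C and falls back on A's stale advertisement.
    b = min(a + 1, INFINITY)
    history.append((a, b))

    for _ in range(max_rounds):
        # A times out its old route; its only neighbour, B, still offers one.
        a = min(b + 1, INFINITY)
        history.append((a, b))
        # B hears A's worse metric and raises its own to match.
        b = min(a + 1, INFINITY)
        history.append((a, b))
        if a >= INFINITY and b >= INFINITY:
            break  # both gateways finally agree network 3 is unreachable
    return history


if __name__ == "__main__":
    for a, b in simulate():
        print(f"A={a:2d}  B={b:2d}")
```

The first two entries of the history are the (2, 3) and (4, 3) states shown in the diagrams. The remaining entries are the "Etc.": the counts climb two at a time until both reach the cutoff, and until then packets for network 3 keep bouncing between A and B.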
Perhaps someone who is more familiar with the theory of these algorithms can recall some of the more complex failure modes.

Here are some other problems. Note that these often are not significant for a network constructed entirely of Ethernets. But for heterogeneous networks, they are. (Our network has links to the Arpanet, NSFnet, and one of our internal links is very slow.)

  - routed uses hop counts. This is only a good idea if all of the links are roughly the same speed.
  - it has no way to help coordinate load sharing among parallel routes of comparable goodness
  - it has no way to specify default gateways. If you have a "generally competent" exterior gateway, e.g. an Arpanet gateway, you may not want to propagate the whole list of networks accessible from the Arpanet through your routed setup. Sites with more than one Arpanet gateway are more likely to want their internal routing protocol just to find them the "nearest" Arpanet gateway.

Again, let me say that I do not object to routed. I can imagine a number of places where it would be perfectly reasonable. I am simply arguing that it is not general enough for someone who wants to build a commercial gateway that will be useful in as many situations as possible.

Let us now assume that our gateways have adopted some manner of communication among themselves, so that they always know the topology of the network. This may even be routed, if the network is such that routed doesn't result in problems.

The other issue is how a host that wants to talk to another network should figure out which gateway to talk to. I claim that routed is not a good solution for this because it exists only on Unix. Since there are other methods that will work on every host, one would think that such methods should be preferred to ones that work only on Unix. (I'm less concerned with this issue for gateway to gateway protocols.
You might well decide that all of your gateways are going to be BSD systems, and thus feel safe in using a BSD protocol among them. That doesn't bother me. But I think it is a bad idea to assume that all of your non-gateway hosts are going to be running Unix.)

I think routing redirects are probably the best method, if the implementation is done right. I believe 4.3's is. Thomas Narten was concerned that redirects don't help if the gateway goes down, since there is then nothing to issue the redirect. First, I think the more common situation (at least with our gateways) is for communications facilities to go down. In that case, the gateway is still alive, and can issue redirects. But even if a gateway goes down and stays down, 4.3 should be able to handle it. Should the current gateway go down, 4.3 will time out. However, before actually killing the connection, it will try recomputing the route. If I understand the code correctly, this will go back and start the whole route-finding process again (i.e. starting with the default gateway or gateways), and should come up with a new route if one still exists. This is not instant, but neither is a solution based on routed, since you have to wait for the routing tables to get updated.

He commented that proxy ARP also has problems adjusting to gateways that go down. I agree, but note that I recommend proxy ARP primarily for use with 4.2. It is the only method I know to allow unmodified 4.2 systems to work with subnets. It doesn't seem much worse than anything else for more general route-finding on 4.2. As far as I can tell from the sources, routed on 4.2 isn't all that good at dynamic route adjustment either. Unfortunately, under 4.2 there is no ioctl to change the gateway address for an existing route. So routed deletes the old route and creates a new one. This means that old connections still have references to the old route, and will time out.
I confess that I am somewhat biased by the fact that gateway crashes are not an issue for me. I am concerned primarily about handling reconfigurations of the network. We generally manage to do this in such a way that the old gateway is still around for a while to issue redirects to the new one. If we had gateways going down a lot, then I might find myself looking for new methods. Because we use dedicated gateways (Cisco and the original Stanford design on which the Cisco gateways are based), we don't have to worry about Unix system crashes. As far as I know, we have never had a crash of the older gateway; once we got some initial problems out of the way [most of them due to design problems in hardware from other vendors], we have not had any crashes in the Cisco machines either. Even if they did crash, they would reboot fast enough to avoid breaking connections that went through them.
gds@sri-spam.istc.sri.com (The lost Bostonian) (12/07/86)
Regarding load sharing, the BBN TCP/IP does load sharing among multiple routes to a destination net or host. This can cause problems if the gateways do not agree that the path of least resistance should be taken. The host does not listen to redirects until the load is sufficiently high on the old path of least resistance. (Some people on the tcp-ip list are suggesting that loading factors should be figured into internet gateway routing mechanisms, which is probably a good idea.)

Regarding changing the gateway to destinations in routing tables, it seems like a relatively easy thing to do (in the code in /sys/net/route.c, once you have hashed on the destination, you could replace the gateway with the new gateway). Why this hasn't been done is a good question (maybe someone at Berkeley will answer it). Some route-specific information would need to be changed, such as the refcnt and flags.

--gregbo
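The difference between deleting and recreating a route and replacing its gateway in place can be illustrated with a toy routing table. This is a rough sketch under stated assumptions, not the code in /sys/net/route.c: the `Route` and `Connection` classes, the function names, and the dictionary standing in for the kernel's hash table are all hypothetical.

```python
# Toy model only; the names and the dict-based table are assumptions.
class Route:
    """Hypothetical routing-table entry."""

    def __init__(self, dest, gateway):
        self.dest = dest
        self.gateway = gateway
        self.refcnt = 0  # number of connections holding this entry


class Connection:
    """Keeps a reference to the route entry it was opened with."""

    def __init__(self, route):
        self.route = route
        route.refcnt += 1

    def next_hop(self):
        return self.route.gateway


def delete_and_recreate(table, dest, new_gateway):
    """Drop the old entry and add a new one, as described for routed on 4.2."""
    table.pop(dest, None)
    table[dest] = Route(dest, new_gateway)


def replace_in_place(table, dest, new_gateway):
    """Look the entry up by destination and overwrite its gateway field."""
    table[dest].gateway = new_gateway


# Case 1: delete and recreate. The open connection keeps the old entry.
table = {"net3": Route("net3", "gw-old")}
conn = Connection(table["net3"])
delete_and_recreate(table, "net3", "gw-new")
stale = conn.next_hop()

# Case 2: replace in place. The open connection sees the new gateway.
table2 = {"net3": Route("net3", "gw-old")}
conn2 = Connection(table2["net3"])
replace_in_place(table2, "net3", "gw-new")
fresh = conn2.next_hop()

print(stale, fresh)
```

In the first case the connection still sends to the old gateway even though the table has a new entry; in the second it follows the change. A real implementation would also have to deal with the flags and reference counts mentioned above, which this sketch ignores.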
ks@pur-ee.UUCP (Kirk Smith) (12/07/86)
In article <7541@topaz.RUTGERS.EDU> root@topaz.RUTGERS.EDU (Charles Hedrick) writes:
>Here are some other problems. Note that these often are not
>significant for a network constructed entirely of Ethernets. But for
>heterogenous networks, they are. (Our network has links to the
>Arpanet, NSFnet, and one of our internal links is very slow.)
> - routed uses hop counts. This is only a good idea if all of
>   the links are roughly the same speed.

In 4.3 BSD, a per-interface metric can be set to increase the "hop count" for slow interfaces. This feature can be used when a high speed and a low speed link are available. Under normal operation, the high speed link would be used. When it is unavailable, the low speed link would be used.

> - it has no way to help coordinate load sharing among parallel
>   routes of comparable goodness

True. This would only be a problem, though, if the parallel routes were both "low" speed. A single "high" speed route is enough to support typical network traffic, without sharing with another "high" speed route.

> - it has no way to specify default gateways. If you have a
>   "generally competent" exterior gateway, e.g. an Arpanet
>   gateway, you may not want to propagate
>   the whole list of networks accessible from the Arpanet
>   through your routed setup. Sites with more than one
>   Arpanet gateway are more likely to want their internal
>   routing protocol just to find them the "nearest" Arpanet
>   gateway.

If a routed is started with the "-g" flag, that machine is assumed to be an "external" gateway. It will advertise a default route. If you have multiple Arpanet gateways, that is no problem. Routed will pick the "nearest" OPERATIONAL Arpanet gateway.

I am not trying to promote routed as the perfect solution to all problems, but it seems to work quite well for us. Many of the perceived problems with routed are just not true. Other solutions, such as Proxy ARP and route redirects, are not as complete as routed.
Gateways going down are real problems that need to be dealt with. If we used routing algorithms that could not find a route when one exists, our users would be quite upset.

On the other hand, routed has not been implemented by all machines that run TCP/IP. But there is no reason that it couldn't be implemented. And it certainly does not preclude those machines from running on the same internet, with less than optimal routing capabilities. At this time, there seems to be no serious alternative for us.

	Kirk Smith
	Purdue Engineering
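The fallback behaviour described for per-interface metrics comes down to choosing the lowest-metric route that is still usable. A minimal sketch, assuming made-up link names and metric values and a cutoff of 16 for "unreachable" (none of these come from the post or from routed's sources):

```python
# Illustrative only: link names, metric values and the cutoff are assumptions.
INFINITY = 16  # assumed "unreachable" metric


def best_route(candidates):
    """Return the usable candidate with the lowest metric, or None if none is usable."""
    usable = [c for c in candidates if c["up"] and c["metric"] < INFINITY]
    return min(usable, key=lambda c: c["metric"], default=None)


links = [
    {"name": "fast-link", "metric": 1, "up": True},
    {"name": "slow-link", "metric": 3, "up": True},  # metric raised by hand
]

normal = best_route(links)["name"]    # the fast link wins while it is up

links[0]["up"] = False                # the fast link fails
fallback = best_route(links)["name"]  # traffic moves to the slow link

links[1]["up"] = False                # both links down
none_left = best_route(links)         # no route remains

print(normal, fallback, none_left)
```

The same selection rule would cover the default-route case: if several external gateways each advertise a default route, the one with the lowest metric that is still being heard from is the "nearest" operational one.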
dricej@drilex.UUCP (Craig Jackson) (12/09/86)
In article <5003@pur-ee.UUCP> ks@pur-ee.UUCP (Kirk Smith) writes:
>In article <7541@topaz.RUTGERS.EDU> root@topaz.RUTGERS.EDU (Charles Hedrick) writes:
>>Here are some other problems. Note that these often are not
>>significant for a network constructed entirely of Ethernets. But for
>>heterogenous networks, they are. (Our network has links to the
>>Arpanet, NSFnet, and one of our internal links is very slow.)
>
>> - it has no way to help coordinate load sharing among parallel
>>   routes of comparable goodness
>
>True. This would only be a problem, though, if the parallel routes were
>both "low" speed. A single "high" speed route is enough to support typical
>network traffic, without sharing with another "high" speed route.

This assumes that you aren't involved in really large scale computing. We are looking at at least one application involving up to 1000 users. These users would typically be terminal-io-bound (not much computation). If those users come in over Ethernet (using terminal servers, for example), a single Ethernet would be pushed to its limits. Since we'd want another one for redundancy, anyway, it would be nice to have a routing algorithm which would share the load.

(This discussion originally started with terminal servers, I believe.) I know that terminal servers generally don't come with two Ethernet interfaces. However, I know of one (non-TCP/IP) that does, and it would be nice if they all did.

BTW, at least one manufacturer has proposed solving this application with terminal servers, so this is a real example.

>	Kirk Smith
>	Purdue Engineering

And now for some inews food.
-- 
Craig Jackson
UUCP: {harvard!axiom,linus!axiom,ll-xn}!drilex!dricej
BIX:  cjackson