hedrick@TOPAZ.RUTGERS.EDU (Charles Hedrick) (01/20/87)
We are just now beginning to look at making use of redundant routes, to provide some extra reliability in our network. For our internal routing, we have dedicated gateways that handle most of the traffic. (2 Cisco gateways, plus a home-brew gateway that is similar in technology to Cisco's.) There is no redundancy among those gateways. However we also have various Unix machines with multiple interfaces. If we set them all up to act as gateways, we could survive any gateway being down. However I'm somewhat unclear how to maintain our hosts' routing tables. Unix seems to be set up to get routing information through both routed and routing redirects. Unfortunately, it seems that these two techniques interact unfavorably.
Initially, I had the idea that redirects alone might do the job. But that clearly can't work. If the current gateway goes down, there's nobody to issue a redirect to someplace else. 4.3 makes this somewhat better by killing the current route when a TCP connection is about to time out. But that doesn't help us with NFS, which uses only UDP. The same problem applies to our current kludge of using proxy ARP.
I am beginning to come to the view that there is no real alternative to routed or something like that. However the vanilla routed still has potential problems. If a gateway issues a host redirect, the kernel makes an entry in the routing table that routed doesn't know about. Should the gateway named in the redirect go down, routed will not know that it should remove that host route.
The only complete solution I have seen is Cornell's gated. If you ignore its support for EGP and HELLO (which are not relevant in our situation), it can be thought of as a souped-up routed. Unlike routed, it uses a raw socket which gets copies of every redirect received by the kernel. Thus it is able to maintain a model of exactly what routes the kernel has. Presumably this would allow gated to remove routes that no longer apply.
Some folks here are reluctant to use gated on every one of our workstations. We have an aversion to regularly activating daemons on diskless Suns. (Although we don't have any real problems with them, swapping over the network provides a certain incentive to minimize the number of programs that get swapped in at regular intervals.) There is also a feeling that gated is large enough that we are probably not going to understand what it is doing. However I'm beginning to think that its approach is inevitable.
I have thought of one alternative, but I'm not sure that it has enough advantages to be worth coding. That is a program that monitors routed traffic and keeps track of what gateways are up. It would not do anything with the contents of the packets -- just remember what gateways are currently sending them. The program would guarantee that there is always a default route that points to a gateway that is up. It would periodically examine the routes in the kernel (using code stolen from netstat, presumably), and kill any that involved gateways that are no longer up. One might also look at the use count field, and get rid of routes that haven't been used for a certain period of time. Presumably this program would be smaller than gated, and I think I would also be more likely to understand exactly what it is doing. But I'm not sure I want to add yet another routing daemon. (The only approach I can think of other than monitoring the gateways' routing protocol is to ping each gateway periodically. That would work, of course, but it would create more network traffic.)
I'm reasonably convinced that any system acting as an actual gateway should run gated. I'd be curious to hear comments from places that have been using dynamic routing for some time. By the way, I am willing to assume that all gateways participate in routing using routed. (Cisco now supports routed.)
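
A minimal sketch, in Python, of the "keep track of what gateways are up" half of the monitoring program described above. Everything named here is an assumption for illustration: the class GatewayTracker, the 90-second timeout (roughly three missed broadcasts if gateways announce every 30 seconds), and the injectable clock. It records only who was heard from and when, never the packet contents.

```python
import time


class GatewayTracker:
    """Remember when each gateway was last heard sending routing traffic."""

    def __init__(self, timeout=90.0, clock=time.monotonic):
        # timeout: seconds of silence after which a gateway counts as down
        # (an assumed value, not taken from routed or gated).
        self.timeout = timeout
        self.clock = clock
        self.last_heard = {}  # gateway address -> time of last routing packet

    def heard_from(self, gateway):
        """Call once per routing packet received; the payload is ignored."""
        self.last_heard[gateway] = self.clock()

    def is_up(self, gateway):
        seen = self.last_heard.get(gateway)
        return seen is not None and self.clock() - seen <= self.timeout

    def live_gateways(self):
        """Gateways heard from within the timeout, in sorted order."""
        return sorted(gw for gw in self.last_heard if self.is_up(gw))
```

In a real daemon, heard_from would presumably be called with the source address of each broadcast received on the routed UDP port (520); that socket handling is left out of the sketch.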
DCP@QUABBIN.SCRC.SYMBOLICS.COM (David C. Plummer) (01/20/87)
    Date: Tue, 20 Jan 87 01:07:40 est
    From: hedrick@topaz.rutgers.edu (Charles Hedrick)
    I'd be curious to hear comments from places that have been using dynamic routing for some time. By the way, I am willing to assume that all gateways participate in routing using routed. (Cisco now supports routed.)
Users of the Chaosnet protocol have been using dynamic routing for nearly 10 years now. Chaosnet is MIT AI memo number 628. I think it is online at MIT someplace, but can't find it. (Snit: it's always bothered me that IP didn't address this issue from the start.) A very brief description of Chaos routing follows, so those wishing to type D(elete) now can go ahead.
Chaosnet only has 255 subnets (which is a problem for very large configurations, and therefore this may not prove useful for IP, but I had ideas back in '81 or so on how to extend this kind of routing to IP. JNC may remember some of them.) With only 256 subnets, keeping a full subnet routing table was not hard. Periodically (every 15 seconds) a multiple interface bridge would broadcast a routing packet on each interface. The routing packet contained several pairs: a subnet number and a "cost" to get to that subnet. Periodically (every 4 seconds) each machine would increment all costs in the table by 1. When processing a routing packet, an entry would get replaced if the cost in the routing packet is smaller than the cost in the current entry.
If you are interested in my extensions to IP / hierarchical addressing schemes, I can try to dig them up.
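
The table update just described can be sketched in a few lines of Python. This follows only the rules stated in the message (age every entry by 1, replace an entry when an advertised cost is smaller); the names Route and ChaosRoutingTable are hypothetical, and details the message does not give, such as any per-link cost added on receipt or an upper bound on cost, are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Route:
    bridge: int  # address of the bridge that advertised this route
    cost: int    # current cost to reach the subnet via that bridge


class ChaosRoutingTable:
    def __init__(self):
        self.routes = {}  # subnet number -> Route

    def age(self):
        """Run every 4 seconds: every stored cost goes up by 1."""
        for route in self.routes.values():
            route.cost += 1

    def process_packet(self, bridge, pairs):
        """Handle one routing packet: a list of (subnet, cost) pairs."""
        for subnet, cost in pairs:
            current = self.routes.get(subnet)
            # Replace only when the advertised cost beats the stored one.
            if current is None or cost < current.cost:
                self.routes[subnet] = Route(bridge, cost)
```

The point of the aging step is that a route through a bridge that has stopped broadcasting keeps getting more expensive, so a fresh advertisement from any other bridge eventually undercuts it and takes over.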
JNC@XX.LCS.MIT.EDU.UUCP (01/21/87)
For the 59th time, an RFC (RFC816) in the 'Internet Protocol Implementors Guide' discusses exactly how to use Redirects, and how to figure out that your gateway is dead. The issue was addressed years ago in detail, but apparently nobody bothers to read the specs. I won't bother to waste my time pointing out that having hosts listening to routing protocols is a terrible idea; nobody ever believes me.
Also, RIP is a piece of junk. It doesn't even work with the current EGP, let alone with any follow-on. It's too bad that a) it was around before any other IGP was, and b) it was in the Berkeley system, because now we'll never get rid of it.
Noel
hedrick@TOPAZ.RUTGERS.EDU.UUCP (01/21/87)
Noel: I appreciate your advice, but really, the people you should be flaming at are not me, but Berkeley and the various vendors. My problem is very different than Berkeley's: I have to deliver reliable service given products that actually exist.
RFC816 says basically
- somehow you keep track of what gateways are up
- when you want to talk to somebody, try a gateway, and depend upon redirects to get the actual address
- when a gateway goes down, just get rid of routes that use it. This will result in trying another random gateway, which will again tell you if there is a better one.
For keeping track of what gateways are up, RFC816 mentions
- depend upon the network to tell you when a packet isn't delivered.
- ping them regularly
- depend upon the upper layers [for us, TCP and NFS] to tell you when a route no longer works. When a route stops working, assume that its gateway is down.
Now, let's look at how much of that advice I can actually use. I have mostly Suns. This means I have 4.2 networking. I know that's horrible, but there isn't a lot of real 4.3 in the marketplace yet. I'm not in a position to implement my own, or even to port 4.3 to the Sun. In a 4.2 world, none of the suggestions in RFC816 look very attractive. Ethernet obstinately refuses to tell me that it is unable to deliver a packet. The one implementation that tried to ping all of the gateways it knew about [TOPS-20] was roundly condemned by all. And 4.2 has no feedback from the upper layers to the lower ones. [Indeed even in 4.3, it's not clear whether the feedback is good enough that you can really depend upon it in practice. I'd like to hear from anybody who has experience in this area. If you just use the route command to set up several default gateways, will 4.3 really manage to keep up communications, using just redirects and hints from TCP? Note that for many of my users, the main application they are using through the gateway is NFS, which is UDP-based.]
Since none of these methods seems very attractive, I proposed the next best thing that I could think of: watching the routing traffic between the gateways. I don't propose to look in the packets. I'm just trying to keep track of what gateways are sending them. If you hate routed, imagine that I am watching Cisco's IGRP traffic (which is by no means impossible). Given this, I thought I was proposing something that was very much in the spirit of RFC816. I proposed a daemon that would do the following:
- keep track of what gateways are up, by listening to their routing traffic
- periodically scan the routes in the kernel
- when it finds a route that uses a dead gateway, remove the route. In Unix, this means that traffic will revert to the default gateway, which will then redirect it to a better one if there is any. This is exactly what RFC816 says.
- it would manage the default routing entry, to make sure that it always points to a gateway that is up.
What I asked was whether anyone has experience with such a thing, and can advise me on whether it is really worth doing this instead of just using routed. I also mentioned some practical problems that would have to be solved before we could really rely on routed.
I really have been trying to follow your advice. I have tried to avoid using routed. I find that I am moving that way, because my connections with NSFnet, and use of various random Unix machines as gateways, provide little choice. Cisco, who supplies our core gateways, had tried to avoid routed as well, but has finally caved in to the inevitable. It would help a lot if there were a standard defining something better. When the current efforts in this direction result in something, I'll be happy to push the vendors we deal with to implement it. I'd be happy to join you in a campaign lobbying vendors for implementations that follow the RFC's. But please don't yell at me. I'm just trying to do something reasonable with the tools I have.
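
The periodic scan in the proposal above might look roughly like this in Python. It is only a sketch over an in-memory dictionary standing in for the kernel routing table; reading and changing the real kernel routes is not shown. The function name sweep, the "default" key, and the ordered candidates list are all assumptions made for the example.

```python
DEFAULT = "default"  # assumed key for the default route in the table


def sweep(routes, live_gateways, candidates):
    """Return a cleaned copy of routes (destination -> gateway).

    Routes through gateways that are not live are dropped, so traffic
    falls back to the default route. The default route is kept if its
    gateway is live; otherwise it moves to the first live candidate.
    """
    live = set(live_gateways)
    kept = {
        dest: gw
        for dest, gw in routes.items()
        if dest != DEFAULT and gw in live
    }
    current = routes.get(DEFAULT)
    if current in live:
        kept[DEFAULT] = current
    else:
        for gw in candidates:
            if gw in live:
                kept[DEFAULT] = gw
                break
    return kept
```

If no candidate is live the result simply has no default route; what a real daemon should do in that case is a policy question the sketch does not try to answer.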
walsh@HARVARD.HARVARD.EDU.UUCP (01/28/87)
    to deliver a packet. The one implementation that tried to ping all of the gateways it knew about [TOPS-20] was roundly condemned by all.
The BBN VAX networking code pinged gateways. You don't need to ping X if you're getting acks back for any connection actively using gateway X as a first hop. You also don't need to ping if you're not currently using the gateway.
    Since none of these methods seems very attractive, I proposed the next best thing that I could think of: watching the routing traffic between the gateways.
Using an Ethernet board in snoopy mode sounds awfully inefficient.
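
The two conditions for skipping a ping can be written as a small predicate. This Python sketch is not the BBN code; the function name needs_ping and the 30-second quiet_limit are made up for illustration.

```python
def needs_ping(in_use, seconds_since_ack, quiet_limit=30.0):
    """Decide whether a gateway should be pinged right now.

    in_use: True if some route currently uses the gateway as first hop.
    seconds_since_ack: age of the newest ack seen on any connection going
        through the gateway, or None if no ack has been seen at all.
    quiet_limit: assumed threshold after which acks count as stale.
    """
    if not in_use:
        return False  # an unused gateway need not be checked
    if seconds_since_ack is None:
        return True   # in use, but nothing has vouched for it yet
    return seconds_since_ack > quiet_limit
```

Note that acks only help for TCP-style traffic; a gateway used purely for UDP (NFS, say) would never produce them and would be pinged every time under this rule.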
hedrick@TOPAZ.RUTGERS.EDU.UUCP (01/29/87)
Routed uses broadcasts, so snooping on the gateways doesn't require promiscuous mode.