narten@PURDUE.EDU.UUCP (02/10/87)
This note is prompted by the observation that lots of nameservers still contact usc-isib (10.3.0.52), even though the machine no longer exists. A related problem that is appearing a lot more now has to do with reactions in general to ICMP dest unreachable messages.

Now that name server traffic is picking up, it really hurts to see UDP implementations ignoring ICMP errors. It is not uncommon to see half a dozen (or more) packets sent to some remote nameserver even though the first packet causes an ICMP dest unreachable to be returned. In some cases this is not too serious (but still undesirable), since the message comes from a gateway one hop away with a LAN in between. Other times, the message comes from some distant gateway on the other side of the ARPANET.

The problem with letting UDP see these errors lies with the stateless nature of datagram delivery at that level. With TCP, there has to be a connection block that the error packet can be matched up against. Hence, TCP can do something intelligent (but often doesn't). With UDP, the packet gets sent with no guarantees about delivery. Furthermore the user process might well be sending data to several destinations via the same "socket", and it is not clear how to return errors to the user.

I see three basic approaches.

1) Do nothing, a favorite among current implementations.

2) Pass errors back to the user process. This is hard to do, since the kernel may well have no idea of what process sent the packet. In some cases, the kernel would have to keep a log of all UDP packets it sends in order to pass back errors to the user.

3) Cache errors in the routing tables for short periods of time. This can be done by adding a flag to route table entries that says "Not really reachable". That way, the user would not get an error on the first packet, but the retransmission of that packet could cause the route lookup to note that the destination is unreachable and the user could be informed.
This would be a significant improvement because now the user process could elect to use a different address as the primary. Furthermore, the user process could elect not to use the "bad route" for a long time (say 30 minutes or several hours), long after the record of the unreachable message has been flushed from routing tables.

This has the desired effect of:

1) Processes using UDP can get feedback about unreachable destinations.

2) It doesn't drastically change the semantics of the UDP interface. E.g. the user is not notified asynchronously or forced to ask explicitly whether a route works. On return from the sendpacket() routine, a flag could be returned. In addition, sending packets to an unreachable destination doesn't have to mean that the packet didn't get sent, it just means "I got a dest unreachable a while ago. It is not likely that the packet will get there". The user can choose to ignore this (though he/she really shouldn't).

3) The length of time dest unreachable messages are cached can be (and probably needs to be) an adjustable parameter. It may well be that caching such a message for 10-30 seconds would be sufficient to cut down on the number of useless packets sent, yet would not keep users from reaching hosts that were down but just came back up.

4) Programs don't have to rely on timeouts to decide that a host or list of hosts is unreachable. This will often give users a quicker response.

Comments?

Thomas
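The third approach (a short-lived "not really reachable" flag consulted on each send) can be sketched roughly as follows. This is only an illustration in user-space Python, not kernel code: the names UnreachableCache, note_unreachable, is_flagged and send_packet are hypothetical, the 30 second default is just one value from the 10-30 second range mentioned above, and a plain dictionary stands in for the routing table.

```python
import time


class UnreachableCache:
    """Hypothetical stand-in for a 'not really reachable' flag on route entries."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds      # adjustable caching period
        self.clock = clock          # injectable so the cache can be tested
        self._expires = {}          # destination -> time the flag lapses

    def note_unreachable(self, dest):
        # Called when an ICMP dest unreachable arrives for this destination.
        self._expires[dest] = self.clock() + self.ttl

    def is_flagged(self, dest):
        expiry = self._expires.get(dest)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            # The flag has aged out; forget it so hosts that came back are retried.
            del self._expires[dest]
            return False
        return True


def send_packet(cache, dest, payload, transmit):
    """Send the packet regardless, and return True if dest was recently unreachable."""
    flagged = cache.is_flagged(dest)
    transmit(dest, payload)  # the flag is advisory; the packet still goes out
    return flagged
```

The caller sees nothing on the first packet, gets the flag back on a retransmission once the error has been recorded, and can then decide to switch to another address.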
hedrick@TOPAZ.RUTGERS.EDU.UUCP (02/11/87)
You suggest that the kernel should remember destination unreachable messages, and not bother to try again for some time. The problem with this is that there are often transient routing problems. If you try again, things might actually work. Until the core gets more reliable, I would rather retry. Indeed, for a while we intentionally broke our TCP code so that it would keep trying when it got destination unreachable, instead of aborting the connection. This helped us keep connections up to certain hosts.
brescia@CCV.BBN.COM.UUCP (02/11/87)
> (you can ignore an 'unreachable' because it may indicate a transient
> routing problem) [paraphrased - mb]

You really ought to be advocating looking at the subcodes returned by ICMP destination unreachable, because you can usually trust the 'host dead' type returned from some gateways when trying to talk to hosts on arpanets. Yes, you would do well to ignore ICMP net unreachable if you suspect routing flurries (often the case nowadays).

With UDP domain lookups however, could you not use that as an indication to try another address, even if you keep retransmitting to the original one? You need not worry about "bothering" too many servers in this case, because the 'unreachable' is a response which tells you that you did not reach that server.

Also, you should be explicit in your reasoning about the 'port unreachable' subcode. Do you mean to try again because the server was too busy and did not get another server listen up again, or give up because there is not now nor will there ever be a server at that host (because the service host changed)?

I think you should use the broadcast approach for connection setup, since you supposedly don't care which of the equivalent servers you reach. If, for example, you try to contact one from the set { A, B, C }, and you get an unreachable from A, try B next, and only forget A if the reply code was 'host dead'.

Of course, your implementation on the arpanet (AHIP) interface does recognize arpanet host-dead messages, doesn't it?

mike
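The { A, B, C } rule above could look something like the sketch below. The code values are the ICMP destination unreachable codes from RFC 792; everything else is an assumption made for illustration: the function name react is hypothetical, and code 1 (host unreachable) is used as a stand-in for the 'host dead' indication, which is a simplification.

```python
# ICMP destination unreachable codes (RFC 792).
NET_UNREACHABLE = 0
HOST_UNREACHABLE = 1      # treated here as the 'host dead' case (an assumption)
PROTOCOL_UNREACHABLE = 2
PORT_UNREACHABLE = 3


def react(servers, current, code):
    """Return (server to try next, list of servers still worth remembering).

    Any unreachable moves on to the next equivalent server, but the current
    one is only forgotten when the code says the host itself is dead.
    """
    i = servers.index(current)
    next_server = servers[(i + 1) % len(servers)]
    if code == HOST_UNREACHABLE:
        remembered = [s for s in servers if s != current]
    else:
        # Net unreachable may be a routing flurry, and port unreachable is
        # ambiguous, so keep the server on the list for later.
        remembered = list(servers)
    return next_server, remembered
```

For example, react(["A", "B", "C"], "A", NET_UNREACHABLE) moves on to B but keeps A in the list, while the same call with HOST_UNREACHABLE drops A.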
karels%okeeffe@UCBVAX.BERKELEY.EDU.UUCP (02/11/87)
ICMP unreachable messages are reported to users of UDP sockets in 4.3 if and only if the socket is "connected"; that is, if the remote address is bound as well as the local address. Otherwise, it is unreasonable to report errors even though the local address matches that in an ICMP error message. The error may well refer to a datagram other than the most recently sent, in which case it is likely to be confusing at best.

This is used in the UNIX resolver code to detect the absence of a local server; it depends on receiving the "port unreachable" error. On the other hand, the same binding causes late messages from one server to be discarded after "connecting" to the next of a series of choices. This isn't a problem in the standard installation, with only one server choice (the server on the same host).

The UNIX nameserver does not take advantage of ICMP error returns, in part because it runs multi-threaded, processing other requests while awaiting a reply to a recursive query. However, recent additions to the BIND server will enable it to measure response time of multiple servers for a domain. It will then choose the fastest server, which will not include one that was recently unreachable if there are alternatives.

Recent questions about the ordering of root servers in BIND configuration files are no longer interesting. Current servers use the configuration file to reach the root servers initially, which they then query about the root domain. That information is then used as long as it is valid.

Mike
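The "connected" UDP socket behaviour described above can be tried from user space with something like the following. This is a sketch using Python's standard socket module rather than the 4.3 resolver code, and the function name probe_udp is hypothetical. Whether the "port unreachable" actually surfaces as an error depends on the operating system and on any filtering in the path, so the timeout case has to be handled as well.

```python
import socket


def probe_udp(host, port, timeout=1.0):
    """Send one datagram on a connected UDP socket and report what came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        # connect() fixes the remote address, so an ICMP error can be
        # matched to this socket and reported to the caller.
        s.connect((host, port))
        try:
            s.send(b"ping")
            s.recv(512)
            return "reply"
        except (ConnectionRefusedError, ConnectionResetError):
            # A port unreachable came back; some platforms report it as a reset.
            return "refused"
        except socket.timeout:
            # No reply and no error seen within the timeout.
            return "timeout"
```

A resolver could call probe_udp("127.0.0.1", 53) and treat "refused" as a sign that no local server is running, without waiting for a full timeout.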
Mills@LOUIE.UDEL.EDU.UUCP (02/12/87)
Charlie, I know the TOPS-20s have been sorting ICMP messages to the right processes for years, since that's where I got the idea to do the same thing in the fuzzballs. Having said that, it's too bad the users at the top of the TOPS-20 protocol stack don't see the information itself - say in TELNET or FTP. Dave
brady@DCN9.ARPA.UUCP (02/12/87)
I came in on this discussion a little late, so pardon me if I'm a little off topic...

> The problem with this is that there are often transient routing
> problems. If you try again, things might actually work. Until
> the core gets more reliable, I would rather retry. Indeed for a
> while we intentionally broke our TCP code so that it would keep
> trying when it got destination unreachable, instead of aborting
> the connection. This helped us keep connections up to certain
> hosts.

If you adopt this practice, you negate the purpose of the message. So why is it sent in the first place? In the long run, ignoring control messages like these could undermine any sort of development on the internet, particularly in relation to gateway to gateway communications.

It may seem that some benefit is gained in certain instances from ignoring unreachable messages. But if there is to be a "standard" protocol, such a change would have to be beneficial (or at least non-detrimental) in the majority of cases. I believe that in most cases, the control messages are a necessary factor in the control of needless congestion across an already strained internet.

-Sean