[comp.protocols.tcp-ip] connection between ARP and ICMP...

mws%beta@LANL.GOV (Mitchel W Sukalski) (01/06/89)

I'm in a slight dilemma here, and I was wondering if anybody out in
netland can help me?  I'm looking for documentation (RFCs, IENs, etc.)
that validates the behavior I'm going to describe.

I have a router with two network interfaces.  I'm trying to connect from
machine A, on one of the networks, to machine B, on the other network.
The router does not have an ARP entry for machine B, and when it receives
the first packet from machine A, it sends a "host unreachable" destination
unreachable ICMP message to machine A.  Simultaneously, it generates
an ARP request for machine B.

Some of the folks I have talked to have stated that it is the duty of 
the receiving machine, machine A in this case, to ignore the host
unreachable messages and use a timeout mechanism instead (reporting the
error condition upon timeout, perhaps).  As I see it, that makes sense
for an active connection (to avoid reacting to a transient network problem),
but upon opening a connection, an application, such as telnet,
should take heed of the error immediately and let the user take
the necessary recovery action.

In any case, I have been unable to find any documentation that says the
above behavior is within specification.  I have found ample statements to
the effect that the packet can be dropped, and an ARP request generated;
then, it is up to the transport protocol to resend the packet.  If the
ARP module has not gotten a reply, then the router could send a host
unreachable message.

I'd appreciate any insights or references.

Thanks in advance,

Mitch Sukalski
Communications Group, C-4
Los Alamos National Laboratory

hedrick@geneva.rutgers.edu (Charles Hedrick) (01/06/89)

I think it's very weird to say "host unreachable" because you don't
have an ARP table entry.  In my view unreachable should mean that you
really can't get to the host, not that your internal cache needs
refreshing.  There's no question that it is a bug for hosts to break a
connection when they get an unreachable while a connection is open.
There may be a transient routing problem that will clear up shortly.
The recommended approach is to save the error message but keep
retrying.  If you end up timing out, then you print the error based on
the most recent unreachable.  Thus the unreachable does accomplish
something: it gives you some information about what is wrong, which is
useful in generating an intelligent error message.  But you sure don't
want to break the connection the first time you get an unreachable.
(Doing so is a common bug in TCP/IP implementations, however.)  I've
heard claims that when you are first setting up a connection, you
should pay attention to unreachables.  A user who is in the middle of
a connection likely has time invested in his context and is willing to
wait in hopes of continuing.  But when you're first trying a
connection, you don't, and it's silly to make people wait for a minute
to time out of a connection attempt when you know that there's no
route available to the destination.  That sounds sort of plausible to
me, but I think I might still wait for a timeout.  In my opinion, both
the gateway and the host are behaving suboptimally in your example.
I'm not sure they are flatly wrong, but I'd try to get them changed.
I think the host should ignore the unreachables, or use them only as a
basis for error messages.  And I think the gateway shouldn't issue
them for a missing ARP table entry.  (Indeed probably it shouldn't
issue them at all, because of the number of buggy host
implementations.)
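
To make the "save the error but keep retrying" approach concrete, here is
a minimal Python sketch.  It is only an illustration, not code from any
real TCP implementation: the function connect_with_soft_errors, the
callback send_syn, and the string values it returns are all hypothetical
names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ConnectResult:
    connected: bool
    error: Optional[str] = None


def connect_with_soft_errors(
    send_syn: Callable[[], Optional[str]],
    max_tries: int = 5,
) -> ConnectResult:
    """Try to open a connection, treating ICMP unreachables as soft errors.

    send_syn() is a hypothetical callback that sends one connection
    attempt and returns:
      "ok"            if the peer answered,
      an error string (e.g. "host unreachable") if an ICMP error came back,
      None            if nothing came back before the retransmit timer fired.
    """
    last_error = None
    for _ in range(max_tries):
        outcome = send_syn()
        if outcome == "ok":
            return ConnectResult(True)
        if outcome is not None:
            # Remember the most recent unreachable, but do not give up yet.
            last_error = outcome
    # Only after all retries fail is the saved error reported to the user.
    return ConnectResult(False, last_error or "connection timed out")
```

In the scenario from the original question, the first attempt draws a
"host unreachable" because of the ARP miss, and a later retransmission
gets through once the gateway has resolved machine B.  With this logic
the connection still opens; the saved message is used only if every
retry fails.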

subbu@hpindda.HP.COM (MCV Subramaniam) (01/07/89)

Mitchel,

I think an ICMP unreachable message should not be generated if the gateway
can find a route to the destination. If there is no routing entry by which
the gateway can access the destination, then the gateway should not ARP for
the destination at all, but just send the Unreachable message to the source.
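
Roughly, the decision could be sketched in Python as follows.  This is
an illustration under simplifying assumptions: the function
gateway_action, the list-of-prefixes routing table, and the returned
action strings are hypothetical, and a real gateway would also handle
next hops and default routes.

```python
import ipaddress


def gateway_action(dst, routes, arp_cache):
    """Decide what to do with a packet addressed to dst.

    routes    -- list of directly attached network prefixes (strings)
    arp_cache -- dict mapping IP address strings to hardware addresses
    """
    addr = ipaddress.ip_address(dst)
    has_route = any(addr in ipaddress.ip_network(net) for net in routes)
    if not has_route:
        # No routing entry: do not ARP at all, just report the failure.
        return "send ICMP unreachable"
    if dst in arp_cache:
        return "forward"
    # A route exists but the hardware address is unknown: ARP for it,
    # and do not send an unreachable just because the cache is empty.
    return "send ARP request"
```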

Hope this helps.

-Subbu

slevy@UC.MSC.UMN.EDU ("Stuart Levy") (01/07/89)

A year or so ago on this list, someone suggested that "Host Unreachable"
should be returned when there was no response after some number of ARP
retries.  On a network where there's no direct indication that a packet
was received, this seems as good a method as any to detect dead hosts.
I don't believe it would be a layering violation, any more than translating
ARPAnet "Host Dead" status into "Host Unreachable" is -- it's just implementing
a well-defined network layer service.

On the other hand, to send "Host Unreachable" immediately just because there's
no ARP entry in the cache is clearly broken.  It should make some attempt
to contact the host before saying this.

	Stuart Levy
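
A minimal Python sketch of the retry-then-unreachable idea follows.  It
is an illustration only: resolve_or_unreachable, the callback
send_arp_request, and the default retry count of 3 are assumptions made
for this example, and the loop is synchronous for simplicity, whereas a
real gateway would queue or drop the packet while ARP is pending.

```python
def resolve_or_unreachable(dst, arp_cache, send_arp_request, max_retries=3):
    """Return ("forward", hw_addr) or ("icmp_host_unreachable", None).

    send_arp_request(dst) is a hypothetical callback that broadcasts one
    ARP request and returns the hardware address from the reply, or None
    if no reply arrived before its timeout.
    """
    if dst in arp_cache:
        return ("forward", arp_cache[dst])
    for _ in range(max_retries):
        hw_addr = send_arp_request(dst)
        if hw_addr is not None:
            arp_cache[dst] = hw_addr  # cache the answer for later packets
            return ("forward", hw_addr)
    # Only after every ARP retry has gone unanswered is the host
    # declared unreachable.
    return ("icmp_host_unreachable", None)
```

A cache miss by itself never produces the ICMP message here; only a host
that stays silent through all of the retries does.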