mo@maximo.UUCP (10/27/87)
Hmm, a recent note explained how, with 802.5, TCP may have to request that the lower levels occasionally "re-ARP", and suggested that this is somehow inherently evil. I would like to present an alternative view.

Another way to consider the problem is that at some point, for reasons it cannot know about, TCP decides correctly that the path to its destination is failing. There needs to be a way for TCP to register a complaint with the lower levels that it isn't happy with the level of service it's getting and would like the lower levels to "try harder." One could argue that the lower levels should always "try their hardest," but their connectionless nature often precludes them from getting enough feedback to really evaluate the effectiveness of their efforts. So, if TCP could say - "The path to host XY.Z.Z.Y seems to be screwed - please do anything you can to remedy the situation," several useful scenarios become possible. Among them are redundantly reliable local cables.

The current IP and localnet architectures make it very difficult to get improved local reliability by the simple procedure of laying two cables (whatever that means) and installing two interfaces in each machine. In the simple case, the two cables essentially MUST have separate IP network numbers (or at least, separate subnetwork numbers), and if one cable fails, all the TCP connections will die because the interfaces, not the hosts, have the Internet addresses and there is no cleverness in the middle to reroute traffic.

The next approach is to introduce a "virtual local cable driver" which sits atop the multiple interfaces that you want to consider the same Internet network. The idea is that the indirect driver can then choose which interface to use based on delivery success. In ring networks with back-channel non-delivery information, this can work well. With Ethernets, this is very difficult. One simple approach is to just send the packet on BOTH wires!! This is a tremendous test of your hosts' reassembly and redundant segment discard code. It also causes the network to use twice as much CPU time as it would otherwise. If, on the other hand, we could get some feedback from above that indicated we are having path problems, then we can re-ARP on alternate cables (assuming the cache keeps wire affinity information) and pick up before TCP starts dropping connections. This scenario generalises to other link media like "dialup" connections through digital PBX's and ISDN networks as well.

Maybe the real point is that error recovery and control is link-specific, and the procedures can often keep things going in the face of serious problems. But currently in most implementations, the low-level link drivers do not get enough information on link quality from the modules which are in the best position to know about it on a global scale. Link drivers clearly know something about the link, but the global information may be crucial for some kinds of error recovery, particularly for purely datagram links.

Currently, this kind of feedback is considered a "layering violation" by some. I suggest that either this notion of layering is wrong, or people have a very stilted view of the interaction between layers.

-Mike O'Dell
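[Illustrative aside, not part of the original post.] The "virtual local cable driver" idea can be sketched roughly as follows, assuming an ARP cache that remembers which wire each entry was learned on. Every name here (`VirtualCableDriver`, `path_complaint`, the fake `arp` exchange) is hypothetical and exists only to show the shape of the feedback; it is not any real driver's interface.

```python
class VirtualCableDriver:
    """Hypothetical indirect driver sitting atop several interfaces
    that are all treated as the same IP network."""

    def __init__(self, interfaces):
        self.interfaces = list(interfaces)
        # ARP cache with wire affinity: ip -> (hardware address, interface)
        self.arp_cache = {}

    def arp(self, ip, interface):
        # Stand-in for a real ARP exchange on the given wire (assumption).
        return f"hw-{ip}-via-{interface}"

    def resolve(self, ip):
        # First use: ARP on the first cable and remember the wire.
        if ip not in self.arp_cache:
            iface = self.interfaces[0]
            self.arp_cache[ip] = (self.arp(ip, iface), iface)
        return self.arp_cache[ip]

    def path_complaint(self, ip):
        """Called from above (e.g. by TCP) when the path to ip seems to
        be failing: re-ARP on the next cable and switch affinity."""
        _, old_iface = self.resolve(ip)
        nxt = (self.interfaces.index(old_iface) + 1) % len(self.interfaces)
        new_iface = self.interfaces[nxt]
        self.arp_cache[ip] = (self.arp(ip, new_iface), new_iface)
        return new_iface


driver = VirtualCableDriver(["cable0", "cable1"])
driver.resolve("10.0.0.5")          # learned on cable0
driver.path_complaint("10.0.0.5")   # TCP complains; driver re-ARPs on cable1
```

The point of the sketch is only that the complaint travels downward as a hint; the driver, not TCP, decides what "try harder" means for its link.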
braden@VENERA.ISI.EDU (10/27/87)
Mike O'Dell writes:
Another way to consider the problem is that at some point,
for reasons it cannot know about, TCP decides correctly that
the path to its destination is failing. There needs to be
a way for TCP to register a complaint with the lower levels that
it isn't happy with the level of service it's getting and would
like the lower levels to "try harder." One could argue that
the lower levels should always "try their hardest," but their
connectionless nature often precludes them from getting enough
feedback to really evaluate the effectiveness of their efforts.
So, if TCP could say - "The path to host XY.Z.Z.Y seems to be
screwed - please do anything you can to remedy the situation,"
several useful scenarios become possible. Among them are
redundantly reliable local cables.
...
Currently, this kind of feedback is considered a "layering
violation" by some. I suggest that either this notion of layering
is wrong, or people have a very stilted view of the interaction between
layers.
-Mike O'Dell
Actually, this particular "creative" layer "violation" is very much a part
of the long-accepted requirements for a well-designed TCP/IP
implementation. It is explicitly discussed in one of the "Dave Clark
Five" papers, entitled "Fault Isolation and Recovery" (RFC816). It is
unfortunately true that there are some TCP/IP implementations extant
in the Internet which do not have this important feature; however, the
requirement was clearly laid out in Dave's paper. You don't have to
apologize for it.
Bob Braden
minshall@OPAL.BERKELEY.EDU (10/27/87)
> Hmm, a recent note explaining how with 802.5, TCP may have to
> request that the lower levels occasionally "re-ARP" and that
> this is somehow inherently evil. I would like to present an
> alternative view.

Interestingly enough, this is an area where the source routing scheme of IBM/802.5 is superior to the "proxy-ARP" routers (and maybe to the TransLan ether bridge schemes). With the latter approaches, if a bridge/router which is not within "local broadcast" range fails, then there is no way for the local TCP to request that the path be redetermined. In the source route (802.5 variety) scheme, the local TCP merely causes a new route to be determined via a new "all rings broadcast" XID.

However, I do agree with Mike O'Dell that some method of improving "local reliability" is a very desirable goal (and is one reason I like token rings, given that they pass back "delivered" and/or "addressee unknown" indications).

The flow of information, though (as the token ring case points out), needs to go both ways. The upper layers need to be able to say "hey, end-to-end seems to have fallen apart", and the lower layers need to be able to say "hey, your local packet hasn't made it because of reason XXX". The 802.2/x standards address the latter issue; you get it from the MAC layer in 802.5 (and maybe 802.4), and from the LLC layer with 802.3 IFF you are a Type 2 802.2 user (Type 2 ==> link level acks, "flow control", etc.).

802.2/x doesn't mention the first problem at all (allowing the upper layers to say "hey, there is some end-to-end problem"). It might be an interesting addition.

Greg Minshall
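[Illustrative aside, not part of the original post.] The two-way flow described here might be sketched as a link object with one call in each direction. This is a toy model under stated assumptions: `Link`, `DeliveryStatus` and `end_to_end_trouble` are hypothetical names, and the counter merely stands in for whatever path redetermination the link supports (for example a new all-rings broadcast).

```python
from enum import Enum


class DeliveryStatus(Enum):
    DELIVERED = "delivered"
    ADDRESSEE_UNKNOWN = "addressee unknown"
    NO_INDICATION = "no indication"  # e.g. a plain datagram link


class Link:
    """Hypothetical link layer with feedback in both directions."""

    def __init__(self, reports_delivery):
        self.reports_delivery = reports_delivery
        self.known_stations = set()
        self.rediscoveries = 0

    def send(self, station, packet):
        # Upward flow: tell the caller what happened to the local packet,
        # if the medium can report it at all.
        if not self.reports_delivery:
            return DeliveryStatus.NO_INDICATION
        if station in self.known_stations:
            return DeliveryStatus.DELIVERED
        return DeliveryStatus.ADDRESSEE_UNKNOWN

    def end_to_end_trouble(self, station):
        # Downward flow: the upper layer reports an end-to-end problem.
        # The counter stands in for redetermining the path (assumption).
        self.rediscoveries += 1
        return self.rediscoveries


ring = Link(reports_delivery=True)
ring.known_stations.add("A")
status = ring.send("B", b"hello")   # DeliveryStatus.ADDRESSEE_UNKNOWN
ring.end_to_end_trouble("A")        # upper layer asks for a fresh path
```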
hedrick@TOPAZ.RUTGERS.EDU (Charles Hedrick) (10/27/87)
Actually, 4.3 already has a procedure whereby TCP can notify the lower levels that a connection is failing. It's not obvious to me that this is a layering violation. It seems to me that the TCP layer is the only one that can know when things are timing out, and that having it notify the layer that knows what to do is perfectly appropriate. What would be wrong would be for the TCP code to directly manipulate the routing database.

Currently this notification has an effect only for routes that are going through gateways. It marks the route as down. In my view it would be perfectly appropriate that if a route is local, the ARP entry should be killed. Indeed I have considered adding such code, and may yet do so.

Our main problem with this mechanism is that not all applications use protocols that can detect failure in this way. E.g. the domain system does not. Of course named does retry, but the retries are done at the application level. Unless we have every program that does this use an ioctl to notify the system that a route is failing, we can't depend upon this mechanism. This wouldn't really be a layering violation, but it would be ugly.

I'm still not sure what the right solution is, but we now have enough redundancy in our network that it is worth coming up with one, and I am committed to doing so soon. I can't simply run routed on each machine, because that will cause synchronized paging on diskless machines.
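[Illustrative aside, not part of the original post.] The behaviour described above, plus the proposed extension for local routes, can be sketched as a single notification handler. This is not the 4.3BSD code; `notify_path_failing` and the dictionary layouts are assumptions made up for the example.

```python
def notify_path_failing(dest, routes, arp_cache):
    """Hypothetical hook the transport layer calls when a connection
    to dest keeps timing out. The transport never edits the routing
    database itself; it only reports, and this layer decides."""
    route = routes.get(dest)
    if route is None:
        return "no-route"
    if route["gateway"] is not None:
        # Behaviour described for gatewayed routes: mark the route down.
        route["up"] = False
        return "route-marked-down"
    # Proposed extension for local routes: kill the ARP entry so the
    # next packet forces a fresh ARP.
    arp_cache.pop(dest, None)
    return "arp-entry-killed"


routes = {
    "far-host": {"gateway": "gw1", "up": True},
    "near-host": {"gateway": None, "up": True},
}
arp_cache = {"near-host": "08:00:20:01:02:03"}

notify_path_failing("far-host", routes, arp_cache)
notify_path_failing("near-host", routes, arp_cache)
```

An application-level retrier such as a name server would have to call the same hook itself (the ioctl mentioned above) for the mechanism to cover it.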
jr@lf-server-2.BBN.COM (John Robinson) (10/28/87)
In article <8710261735.AA02484@uunet.UU.NET> mo@maximo.UUCP writes:
>Currently, this kind of feedback is considered a "layering
>violation" by some. I suggest that either this notion of layering
>is wrong, or people have a very stilted view of the interaction between
>layers.
>
> -Mike O'Dell

For what it's worth, ISO (hence ANSI) have a standard called multi-link which allows two or more point-to-point links to function as a single data link layer entity, so the notion that there should be a way to provide backup or parallel paths at layer 2 is not at all foreign to the layer-conscious. X.75 and now X.25 incorporate multi-link as well.

Anyone want to work on the "multi-link" extensions to 802.2? (Only half :-)

--
/jr
jr@bbn.com or jr@bbn.uucp