Leonard@arizona.edu (Aaron Leonard) (06/08/90)
We have experienced what you might call negative synergy
between DECnet traffic being routed thru a cisco and
then between level II routers.
The case in which the difficulty emerges is somewhat
complicated to explain, but quite reproducible, so please bear
with me.
[ UAZHE0 46.437   ]                              [ UAZHE4 46.365 ]
[ level II router ]                              [ endnode       ]
         |                                               |
[ MAGGIE 50.204   ]--( large bushy bridged    )--[ CIRRUS 50.140 ]
[ level II router ]  ( ethernet <128.196.128> )  [ endnode       ]
                                  |
                         [ PANCHO 50.222  ]
    ( repeatered enet )--[ cisco AGS II   ]--( large repeatered       )
    ( <128.196.28>    )  [ level I router ]  ( ethernet <128.196.120> )
             |                                            |
     [ ECEVAX 50.111 ]                             [ DOC 50.231 ]
     [ endnode       ]                             [ endnode    ]
In the above picture, consider any endnode to be representative of a
large number of topologically identical endnodes. We use the IP
subnet terminology simply as a means of identifying Ethernets. (All
connections above are ethernet; all DECnet nodes but PANCHO are
VAXen.)
Note that this is an unusual configuration, in that we have nodes
in multiple DECnet areas residing on the same ethernet (on
128.196.128.) (This is necessitated by the peculiarities of HEPnet
routing.)
We have found that traffic flows quickly (1) amongst all nodes in
this network EXCEPT for the case where DOC or ECEVAX (or any other
node in subnets 128.196.120 and 128.196.28) tries to communicate
with any node in area 46. In that case, the traffic flow is
consistently an order of magnitude slower (2). These results
have been verified by many tests using a large number of
pairs of nodes.
The traffic flow in the too-slow case is as follows: area 50 endnode
to PANCHO (AGS), PANCHO to MAGGIE (level II router for area 50),
MAGGIE to UAZHE0 (lev II router for area 46), UAZHE0 to area 46
endnode. Note that the lev II routers are NOT the bottleneck;
traffic that flows between e.g. CIRRUS and UAZHE4, which passes
thru MAGGIE and UAZHE0, but not thru PANCHO, is quick. Note also
that PANCHO by itself is not the bottleneck, as traffic between
e.g. CIRRUS and DOC is quick. The only case where traffic is slow
is where inter-area traffic is routed thru PANCHO.
Notes.
All traffic tests were run using DEC's DTSEND utility, via the following
command sequence:
$ MC DTSEND
Test: DATA/PRINT/STATISTICS/TYPE=ECHO/SIZE=500/SECONDS=60/NODE=node
This utility tests task-to-task NSP throughput between DECnet
phase IV nodes.
(1) "Quick" data flow is in the range of 400Kbps to 1.2Mbps.
(2) "Slow" data flow is in the range of 40Kbps to 100Kbps.
tinkelman@ccavax.camb.com (06/08/90)
In article <21919@megaron.cs.arizona.edu>, Leonard@arizona.edu (Aaron Leonard)
described a problem involving slow throughput between certain pairs of nodes
(in a picture, partially reproduced below).

I can't offer a coherent explanation of the differences you reported, but I
do want to comment on one of your examples, namely the throughput between
endnodes CIRRUS and UAZHE4:

> [ UAZHE0 46.437   ]                              [ UAZHE4 46.365 ]
> [ level II router ]                              [ endnode       ]
>          |                                               |
> [ MAGGIE 50.204   ]--( large bushy bridged    )--[ CIRRUS 50.140 ]
> [ level II router ]  ( ethernet <128.196.128> )  [ endnode       ]
...
> Note that the lev II routers are NOT the bottleneck;
> traffic that flows between e.g. CIRRUS and UAZHE4, which passes
> thru MAGGIE and UAZHE0, but not thru PANCHO, is quick.

My comment is that DECnet will use the intermediate level II routers only
to help the two end nodes find each other. Once the circuit between them
is established, CIRRUS and UAZHE4 will communicate directly with each other.
This means there will be no intermediate DECnet routing *and* max size
Ethernet packets can be used.
--
Bob Tinkelman, Cambridge Computer Associates, Inc., 212-425-5830
bob@camb.com or ...!{uupsi,uunet}!camb.com!bob
aaron@dragoon.telcom.arizona.edu (Aaron Leonard) (06/09/90)
In article <25934.266f66c4@ccavax.camb.com>, tinkelman@ccavax.camb.com
(Bob Tinkelman) corrects an assumption I made in my earlier posting
concerning poor inter-area routing performance. I had implied that
traffic between two endnodes in different areas on the same Ethernet
will pass between the area routers.

|> > Note that the lev II routers are NOT the bottleneck;
|> > traffic that flows between e.g. CIRRUS and UAZHE4, which passes
|> > thru MAGGIE and UAZHE0, but not thru PANCHO, is quick.
|> Bob:
|> My comment is that DECnet will use the intermediate level II routers only
|> to help the two end nodes find each other. Once the circuit between them
|> is established, CIRRUS and UAZHE4 will communicate directly with each other.
|> This means there will be no intermediate DECnet routing *and* max size
|> Ethernet packets can be used.
|> --

Excellent! We're on to something here. Bob is right - the DECnet traffic
between CIRRUS and UAZHE4 will indeed short-circuit the area routers and
travel directly over the ethernet.

And this result, in fact explains why the performance is so poor when the
cisco router is entered into the loop: because the short-circuiting of
same-Ethernet circuits between different-area nodes ONLY HAPPENS when the
nodes are END NODES! When the set-up path is:

  Area-A-endnode-on-Eth1 ->
  Area-A-lvl-I-rtr ->
  Area-A-lvl-II-rtr-on-Eth2 ->
  Area-B-lvl-II-rtr-on-Eth2 ->
  Area-B-lvl-II-endnode-on-Eth2,

the short circuit between "Area-A-lvl-I-rtr" and "Area-B-lvl-II-endnode"
is never made. Rather, all traffic between the endnodes will continue
to flow thru every single router in the loop. (Oh, for an icmp redirect!)

I verified that this is not a problem with the cisco; rather, an identical
path traced thru a VAX/VMS router produced identical results - so by
definition, the cisco is routing correctly!

This brings up, then, another question: if all my DECnet connections into my ciscos are to direct-attached ethernets, then is there any reason at all to run DECnet routing on the ciscos? (Assume here that equal-cost path splitting is not a topological possibility.) Why not just bridge the whole ball of wax? In this (admittedly pathological) case, at least, bridging will produce better throughput and reduce host load, right?
tinkelman@ccavax.camb.com (06/09/90)
In my prior article <25934.266f66c4@ccavax.camb.com>, I should have made an
additional observation. I guess I forgot, because it didn't bear on your
_immediate_ performance related question. But it does bear on the future.
*** The pictured configuration will `soon' be illegal. DECnet/OSI ***
*** (Phase V) will not allow multiple areas on the same Ethernet. ***
With Phase V you probably will have all the nodes in your picture in the
same area, and therefore avoid the `extra' hops of area routers. Packets
will flow NodeOnEnet1-Cisco1-Cisco2-NodeOnEnet2 with no `extra' hops at
DECnet area routers.
I said `probably' in the above. You could keep the nodes on each physical
Ethernet in separate DECnet areas if the Cisco boxes will be able to act
as DECnet Phase V Level 2 routers. (Will they be able to do that?) You
could also maintain separate DECnet areas on the two LANs *and* bridge
them, if you have the bridges filter all the appropriate DECnet level 2
routing multicasts. This latter configuration _should_ work, though I'm
not sure if DEC will say it's supported.
Despite the two alternatives in the preceding paragraph, I still think
that unless there is some very strong (and strange?) *technical* reason
not to do so, you will find it better to go to a single DECnet area.
(DEC's position is certainly that DECnet areas should reflect network
topology, not administrative responsibilities.)
--
Bob Tinkelman, Cambridge Computer Associates, Inc., 212-425-5830
bob@camb.com or ...!{uupsi,uunet}!camb.com!bob
kph@dustbin.cisco.com (Kevin Paul Herbert) (06/12/90)
>And this result, in fact explains why the performance is so poor when the
>cisco router is entered into the loop: because the short-circuiting of
>same-Ethernet circuits between different-area nodes ONLY HAPPENS when the
>nodes are END NODES!

Yes, this is quite correct. Depending on software version, end-nodes keep
either a cache of the nodes on the same cable, or the previous hop to get
to a node they have been in contact with. Old (pre-VMS V5.0) nodes do the
former; new ones do the latter.

When a DEC system originates a packet onto the network, it sets a bit in
the message header which means "originated on this cable". If a router
switches a packet on to a different circuit, it clears this bit - if it is
staying on the same cable, it doesn't touch the bit.

When a receiving end-system gets a message with the bit set, it creates a
cache entry indicating that messages to that node can be sent directly,
bypassing the designated router for that LAN. If it gets a message without
the bit set, it either (for old software) does nothing, or (for new
software) looks at the MAC address of the previous source, and caches that
for use as a route to the source.

Routers do not contain this cache. The DEC philosophy is that routers
should rely only on information learned via routing protocols, and not
make any decisions based on data path optimizations.

>This brings up, then, another question: if all my DECnet connections
>into my ciscos are to direct-attached ethernets, then is there any
>reason at all to run DECnet routing on the ciscos?

DEC end-systems produce a large amount of background traffic (periodic
hello messages). If you run routing, this information is basically
collected into a single control message sent to other routers. If you run
bridging, all of the hellos get bridged. This is considerable traffic if
you have a lot of end-systems.

Kevin
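
The "originated on this cable" bit and the end-node cache described above
can be sketched as a toy model. This is only an illustration of the
behaviour as stated in this thread, not DEC's implementation: the names
Packet, EndNode, on_cable, route and demo are all made up here, and the
per-hop same_circuit flags are simply read off the picture in the first
posting.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Packet:
    src: str        # originating DECnet node
    dst: str        # destination DECnet node
    prev_hop: str   # node that last transmitted the packet on the cable
    on_cable: bool  # hypothetical name for the "originated on this cable" bit


def originate(src, dst):
    """The originating system sets the bit."""
    return Packet(src, dst, prev_hop=src, on_cable=True)


def route(pkt, router, same_circuit):
    """A router keeps the bit if the packet stays on the same cable and
    clears it if the packet is switched to a different circuit."""
    return replace(pkt, prev_hop=router,
                   on_cable=pkt.on_cable and same_circuit)


@dataclass
class EndNode:
    name: str
    new_software: bool = True   # assumed to mean VMS V5.0 or later
    cache: dict = field(default_factory=dict)

    def receive(self, pkt):
        if pkt.on_cable:
            # Sender is on our cable: talk to it directly from now on.
            self.cache[pkt.src] = pkt.src
        elif self.new_software:
            # Remember the previous hop as the route back to the sender.
            self.cache[pkt.src] = pkt.prev_hop
        # Old software does nothing when the bit is clear.

    def next_hop(self, dst, designated_router):
        return self.cache.get(dst, designated_router)


def demo():
    uazhe4 = EndNode("UAZHE4")
    # CIRRUS -> MAGGIE -> UAZHE0 -> UAZHE4: every hop stays on one Ethernet.
    p = originate("CIRRUS", "UAZHE4")
    p = route(p, "MAGGIE", same_circuit=True)
    p = route(p, "UAZHE0", same_circuit=True)
    uazhe4.receive(p)
    # DOC -> PANCHO -> MAGGIE -> UAZHE0 -> UAZHE4: PANCHO changes Ethernets.
    q = originate("DOC", "UAZHE4")
    q = route(q, "PANCHO", same_circuit=False)
    q = route(q, "MAGGIE", same_circuit=True)
    q = route(q, "UAZHE0", same_circuit=True)
    uazhe4.receive(q)
    return uazhe4
```

In this model UAZHE4 ends up replying to CIRRUS directly, but keeps
replying to DOC through a router, because the bit was cleared when PANCHO
moved the packet between Ethernets. Routers hold no such cache, so nothing
here ever lets PANCHO skip the level II routers.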