achille@cernvax.UUCP (achille) (02/22/89)
We are having strange problems with our Domain internet. Our topology is as follows: a) a Domain ring (net 1) and b) an Ethernet (net 3001).

Until a couple of weeks ago we had a single router between the ring and the Ethernet, with no problems as far as we could see. Then we got an additional Ethernet board for a second dn3000. The idea was to have 2 routers in parallel between the ring and the Ethernet, to provide redundancy in case one of them fails. At almost the same time we got some nodes running sr10, installed some new Ethernet boards for tcp/ip on user nodes, and got a lot more users working on Ethernet-based nodes.

The problem is that sometimes, from the ring, you can see only part of the Ethernet nodes (lcnode -from //anethnode), but if you creep onto //anethnode and do an lcnode from there, you see the nodes that were missing before. It looks as if one of the 2 routers has gone crazy and the nodes that don't show up in 'lcnode -from ...' are trying to use it for routing. However, I checked some time ago and saw that when a router goes down, all other nodes notice the change quite quickly (a few seconds to 1 minute), so that does not explain it and we are really stuck!

We are running the routers under 9.7, and there are a few sr10 nodes on both the ring and the Ethernet (some dn10k). The Ethernet is NOT just for Domain: there are some 700 other tcp/ip hosts on it, plus DECnet hosts.

Does anybody know of problems with multiple Domain routers, and/or incompatibilities with other Ethernet machines, or anything similar?

Thanks in advance,
Achille Petrilli, Cray and PWS operations