[comp.dcom.lans] Problems with PCRoute dropping packets

cjroehrig@watdragon.uwaterloo.ca (Chris J. Roehrig) (06/28/91)

We are having problems with PCRoute dropping packets between our campus
backbone (thickwire) and our local thinwire subnet.  Pings to the router
from the campus side are losing over 70% of the packets.
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We started with a stripped-down 4.77MHz PC eight months ago and everything
worked fine until a few months ago, when the campus ethernet started having
problems of its own.  These were traced to incompatibilities between
a new 802.3 fiber hub and old DEC repeaters.  The repeaters were replaced
and the campus net is fine, but now we are having problems, even though we
are now using a 25MHz '386!!!!


I can establish reliable connections through the PCRoute PC (audiogate) to
machines on the same physical cable, and to some machines on other subnets
but not to the vast majority of the campus machines (see figure).  This 
would lead one to believe there is some connectivity problem with these 
segments, but this does not appear to be the case (why? I'll explain in a
sec...)

    
     CS dept.                                    ... = other segments
     backbone
     ---------                ...    ...
    /        \                  \   /
 dragon       --------------- fiber hub
                              /                  Physics backbone
             ------------------------------------------
              |       |                 |            |
           repeater  astro            audiogate    repeater
             /                 audionet |             \
           ...                 -----------             ...
                               |     
                            audiolab

For instance, I can reliably ping from audiolab to astro and vice-versa.
I CANNOT ping reliably from audiolab to dragon or vice-versa.  I CANNOT
ping reliably from dragon to audiogate.  Now for the weird part: I CAN
reliably ping from astro to dragon and vice-versa.  This is not just an
anomaly;  every machine I've tried can reliably ping to every other machine
except when it goes through or to audiogate; only a few relatively local 
machines like astro are successful at that. 
   By reliable, I mean 0% packet loss for 50 64-byte packets.  If it's not
reliable, it's DRAMATICALLY not reliable: 60-70% packet loss.
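
The ping observations above can be cross-checked mechanically.  The
following is only a sketch: the hop sets are read off the ASCII figure
(which may not be accurate), and the labels "physics", "hub", and "cs"
are shorthand made up here for the Physics backbone, the fiber hub, and
the CS backbone.  It looks for hops, or pairs of hops, that appear on
every unreliable path but are never found together on a reliable one.

```python
from itertools import combinations

# Hops per tested path, as read off the figure (an assumption, not a survey).
PATHS = {
    ("audiolab", "astro"): {"audionet", "audiogate", "physics"},
    ("audiolab", "dragon"): {"audionet", "audiogate", "physics", "hub", "cs"},
    ("dragon", "audiogate"): {"cs", "hub", "physics", "audiogate"},
    ("astro", "dragon"): {"physics", "hub", "cs"},
}

# True = 0% loss, False = 60-70% loss, as reported in the post.
RELIABLE = {
    ("audiolab", "astro"): True,
    ("audiolab", "dragon"): False,
    ("dragon", "audiogate"): False,
    ("astro", "dragon"): True,
}

def suspects(size):
    """Hop combinations of the given size that lie on every failing path
    and are never all present on a healthy path."""
    failing = [PATHS[p] for p, ok in RELIABLE.items() if not ok]
    healthy = [PATHS[p] for p, ok in RELIABLE.items() if ok]
    common = set.intersection(*failing)
    return [
        combo
        for combo in combinations(sorted(common), size)
        if not any(set(combo) <= h for h in healthy)
    ]

single = suspects(1)   # no single hop explains the failures
pairs = suspects(2)    # audiogate combined with the hub or the CS backbone
print(single, pairs)
```

Under these assumptions no single hop is to blame on its own; the
failures only show up when traffic involves audiogate AND crosses the
fiber hub, which matches the pattern described above.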

This leads me (as well as the network gods here) to conclude that:

    a) it's not a problem with the campus network; it must be the router.

Okay, so I try brand-new cards (8-bit WD3008E Plus's as Elite's) in a 
brand-new 386: no improvement.  The network guys come and check out our 
AUI drop and transceiver: they're fine.
I disconnect our subnet (audionet) from the routing PC (audiogate) and run 
NCSA Telnet on the audiogate PC using the same WD3008 on the same thickwire 
drop connected to the Physics backbone and PRESTO: pings to it work just fine!

This leads me to believe that:
    b) it's not the transceiver or thickwire drop.
    c) it's not the PC.
    d) it's not the Ethernet cards.

Could it be the PCRoute configuration?  The gross configuration must be ok;
otherwise it just wouldn't work at all.  It appears to manage routing
tables just fine (looking at the syslog stuff).  I'm using the stock
ether-ether executable v2.1; the PCROUTE.LOG file is given below.

If it's not the configuration, in light of a), b), c), and d), it must
be the PCRoute program itself.   I've heard from someone who did some digging 
in the PCRoute code that there may be problems with its mapping of IP address
to physical address when there are multiple gateways hanging from the net
it is connected to; something about its hashing function.  Does anyone
know about this?
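
To make the rumoured failure mode concrete, here is a purely
hypothetical sketch.  It is NOT PCRoute's code; the table size, the
hash, and the 192.0.2.x addresses are all invented for illustration.
It shows how a small direct-mapped IP-to-Ethernet table with a weak
hash could let two neighbours keep evicting each other, so that packets
to one of them get dropped while its address is re-resolved.

```python
class DirectMappedArpCache:
    """Hypothetical one-entry-per-slot IP -> MAC table (not PCRoute's)."""

    def __init__(self, slots=8):
        self.slots = [None] * slots

    def _index(self, ip):
        # Invented weak hash: last octet modulo the table size.
        return int(ip.split(".")[-1]) % len(self.slots)

    def learn(self, ip, mac):
        # A colliding entry silently overwrites the previous one.
        self.slots[self._index(ip)] = (ip, mac)

    def lookup(self, ip):
        entry = self.slots[self._index(ip)]
        if entry is not None and entry[0] == ip:
            return entry[1]
        return None  # miss: the sender would have to re-ARP

cache = DirectMappedArpCache(slots=8)
cache.learn("192.0.2.1", "00:00:00:00:00:01")  # made-up gateway A
cache.learn("192.0.2.9", "00:00:00:00:00:09")  # made-up gateway B, same slot
print(cache.lookup("192.0.2.1"), cache.lookup("192.0.2.9"))
```

In this toy model the second gateway evicts the first, so lookups for
the first miss until it is learned again.  Whether anything like this
happens inside PCRoute is exactly the open question.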

I'm not sure that the picture above is entirely accurate; I don't have my
hands on a campus network map yet.  But I do know that the Physics backbone
is connected through a fiber hub to the main campus and also has a bunch of
repeaters hanging off of it.

So has anyone heard of any problems like this?  I've heard that PCRoute
was a very reliable program and I'd sure be disappointed if we couldn't
fix this...

Here's the PCROUTE.LOG file:

******* PCroute starting *******
Interface 1 (ethernet)                <-- thickwire; campus side
    Address   129.97.129.26
    NetMask   255.255.254.0
    Flags     0000H
    Metric    0001H
    The Ethenet Address 0000H
    The Ethenet Address C088H
    The Ethenet Address 7519H
Interface 2 (ethernet)                <-- thinwire; audionet side
    Address    129.97.248.1
    NetMask   255.255.254.0
    Flags     0000H
    Metric    0001H
    The Ethenet Address 0000H
    The Ethenet Address C0DEH
    The Ethenet Address 9419H
STATIC ROUTES
Forwarding BOOTP requests to               0.0.0.0
Logging messages to SYSLOGD on host    129.97.248.2
Logging level 0008H
Logging mask 0000H
******* PCroute closing log file *******
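
As a sanity check on the gross configuration, the values in the log can
be decoded with a few lines of Python.  This is a sketch only: it
assumes each "Ethenet Address" line is one 16-bit word printed most
significant byte first, which the log itself does not state.  The
helper names are made up here.

```python
import ipaddress

def network_of(addr, mask):
    """Network that an interface address/netmask pair belongs to."""
    return ipaddress.ip_network(f"{addr}/{mask}", strict=False)

def mac_from_words(words):
    """Join three 16-bit hex words (e.g. 'C088H') into a MAC string.
    Assumes high byte first within each word."""
    octets = []
    for word in words:
        value = int(word.rstrip("Hh"), 16)
        octets += [value >> 8, value & 0xFF]
    return ":".join(f"{o:02X}" for o in octets)

campus = network_of("129.97.129.26", "255.255.254.0")
audionet = network_of("129.97.248.1", "255.255.254.0")
syslog_host = ipaddress.ip_address("129.97.248.2")

print(campus, audionet)                  # the two interface subnets
print(campus.overlaps(audionet))         # they should not overlap
print(syslog_host in audionet)           # syslog host sits on audionet
print(mac_from_words(["0000H", "C088H", "7519H"]))
print(mac_from_words(["0000H", "C0DEH", "9419H"]))
```

With that byte order the two interfaces come out as 129.97.128.0/23 and
129.97.248.0/23, which do not overlap, and both hardware addresses
start with 00:00:C0, which is at least consistent with two Western
Digital cards.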


------------------------------------------------------------------------
Chris Roehrig
Audio Research Group
University of Waterloo, CANADA