[net.lan] 3Com Multibus Ethernet board

mann@Navajo.ARPA (11/14/84)

I've discovered a documentation bug that may cause some grief for anyone
who uses 3Com Multibus ethernet boards.  These boards rely on software to
implement part of the exponential backoff algorithm.  Whenever a collision
is detected, the board generates a JAM interrupt and waits for a random
number to be written into the backoff register by software.  It then delays
the specified number of slot times and tries again.

Section 4.4 of the manual explains how to choose the random numbers:
"a uniformly distributed random integer greater than or equal to zero
and less than 2^k, where k is either the number of retransmission
attempts for the packet being transmitted or 10, whichever is less."
Section 4.2 says: "software must write the two's complement of the
number of slot times to delay into MEBACK."  These statements would
lead one to believe that it is legitimate to write a 0 into MEBACK.  In
fact, following this procedure, one would choose to delay 0 slots and
write a 0 into MEBACK 50% of the time on the first collision, 25% on
the second, etc.  The problem is that the Ethernet board interprets a
zero as "delay 65536 slot times"!  (This was determined by timing how
long it took to try transmitting again: about 3.36 seconds, which matches
65536 slot times of 51.2 microseconds each.)  Thus, it
seems the driver must write -(s+1) into MEBACK, where s is the number
of slot times to delay, determined as above.
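
For anyone rechecking a driver, here is a rough sketch in C of the
computation described above.  It is not taken from the V kernel or from
3Com's code.  The function names are made up for illustration, the
16-bit width of MEBACK is an assumption inferred from the 65536-slot
behavior, and the source of random bits is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Slot times to delay: an integer in [0, 2^k), k = min(attempts, 10).
 * `rnd` is assumed to carry uniformly random low-order bits. */
unsigned backoff_slots(unsigned attempts, unsigned rnd)
{
    unsigned k = attempts < 10 ? attempts : 10;
    return rnd & ((1u << k) - 1u);
}

/* What the manual's wording suggests: the two's complement of s.
 * This yields 0 when s == 0, which the board treats as 65536 slots. */
uint16_t meback_documented(unsigned s)
{
    return (uint16_t)(0u - s);
}

/* The workaround proposed in this article: write -(s+1), which is
 * never zero for any s the backoff rule can produce (0..1023). */
uint16_t meback_workaround(unsigned s)
{
    assert(s <= 1023u);
    return (uint16_t)(0u - (s + 1u));
}
```

With these definitions, s = 0 gives 0x0000 under the documented rule
and 0xFFFF under the workaround.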

This sort of bug is likely to go unnoticed for a very long time since
collisions are quite rare on an ethernet.  In fact, the 3Com driver in
the V kernel was in use for over a year with several bugs in collision
handling.  These were not detected until recently, when they showed up
as occasional unexpected timeouts during a stress test of our file
server.  (The timeout was less than 3 seconds, so a single 65536-slot
backoff was enough to trip it.)

People who maintain ethernet drivers may want to recheck their code.

	--Tim