[net.bugs.4bsd] Problem with DEUNA/Ethernet

neil@ux.cs.man.ac.uk (The Root of all our Problems) (03/07/86)

This message is empty.

neil@ux.cs.man.ac.uk (Neil Todd) (03/07/86)

Sorry about the previous empty message.

Can somebody out there give me a few words of wisdom :-

I have a Vax 8600 running 4.3 Beta and two 750s running 4.2. They are
all on an ethernet, connected using DEUNAs.

Traffic between the 750s is fine. But the 8600 will run ok for about
an hour or so and then the Ethernet from the 8600 effectively dies -
the 750s remain ok. I've tried both the BBN and INET versions of the
code but with little change.  I've switched in the debugging and this
announces that the 8600 thinks that there are checksum errors. We put
in a new DEUNA just in case the old one was duff, but this didn't
make any difference. We've run the full diagnostics on the whole
machine, but they didn't reveal anything.

Hedrick@topaz recently posted something to "Fix TCP hangs and slow
transfers", but after examining the code it is not relevant to 4.2 or
4.3.

So the question is - "What do I do now?"

BTW, is there a way of completely resetting the networking code without
rebooting? This would at least allow me to recover the ethernet when
this problem occurs.


				Neil Todd


JANET:-		neil@uk.ac.man.cs.ux		* Dept of Computer Science
UUCP:-		mcvax!ukc!man.cs.ux!neil	* University of Manchester
ARPA:-		neil%uk.ac.man.cs.ux@ucl.cs	* Oxford Road
PHONE:-		(+44) 61 273 7121 Ext 5018	* Manchester M13 9PL

pjk@ascvax.UUCP (Pat Keziah) (03/14/86)

neil@ux.cs.man.ac.uk writes in <34@ux.cs.man.ac.uk>:
> .
> .
> .
> 
> I have a Vax 8600 running 4.3 Beta and two 750s running 4.2. They are
> all on an ethernet, connected using DEUNAs.
> .
> .
> .
> 
> Traffic between the 750s is fine. But the 8600 will run ok for about
> an hour or so and then the Ethernet from the 8600 effectively dies -
> the 750s remain ok. [. . . a sentence deleted. . . ]
> I've switched in the debugging and this
> announces that the 8600 thinks that there are checksum errors.
> 
> .
> .
> .

There is a bug in the UDP checksum generated by the 4.2 BSD kernel:
the length field covered by the checksum (ui_len) is left in host
byte order instead of network byte order.
The 4.2 kernel defaults to not checking UDP checksums, thus the 4.2
systems are able to talk to each other fine.  [Un?]Fortunately,
4.3 includes the fix for the outgoing checksums and checks them on
input.  We were able to get our 4.2 VAX to work with the PC/IP software
from MIT (which seems to handle checksums correctly) by adding
a line to the code in the routine udp_output (source file ==
/sys/netinet/udp_usrreq.c).  The resulting code in udp_output that
builds the UDP header and checksum looks like this:

	ui = mtod(m, struct udpiphdr *);
	ui->ui_next = ui->ui_prev = 0;
	ui->ui_x1 = 0;
	ui->ui_pr = IPPROTO_UDP;
	ui->ui_len = len + sizeof (struct udphdr);
	ui->ui_src = inp->inp_laddr;
	ui->ui_dst = inp->inp_faddr;
	ui->ui_sport = inp->inp_lport;
	ui->ui_dport = inp->inp_fport;
	ui->ui_ulen = htons((u_short)ui->ui_len);
|	ui->ui_len = htons((u_short)ui->ui_len);   /* the line we added */

	/*
	 * Stuff checksum and output datagram.
	 */
	ui->ui_sum = 0;
	ui->ui_sum = in_cksum(m, sizeof (struct udpiphdr) + len);

The line that we added was line 194 in our sources.
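
A minimal user-space sketch (not kernel code) of why that one line
matters, assuming the usual UDP pseudo-header from RFC 768. The names
cksum, udp_dgram, udp_cksum and the vax_host_len flag are made up for
illustration; setting the flag stores the pseudo-header length low byte
first, which is what a little-endian VAX would sum without the htons().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DATA 64   /* arbitrary payload bound for this sketch */

/* Internet checksum (RFC 1071): one's-complement sum of big-endian
 * 16-bit words, complemented. */
static uint16_t cksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)buf[0] << 8) | buf[1];
        buf += 2;
        len -= 2;
    }
    if (len == 1)                        /* pad an odd trailing byte */
        sum += (uint32_t)buf[0] << 8;
    while (sum >> 16)                    /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Hypothetical description of one datagram, for illustration only. */
struct udp_dgram {
    uint8_t        src[4], dst[4];       /* IP addresses */
    uint16_t       sport, dport;         /* ports, host order */
    const uint8_t *data;
    uint16_t       dlen;
};

/* Store v in network (big-endian) byte order; returns bytes written. */
static size_t put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
    return 2;
}

/* Checksum over pseudo-header + UDP header + data.
 * vax_host_len != 0 mimics the unpatched sender: the pseudo-header
 * length is stored low byte first instead of in network order. */
static uint16_t udp_cksum(const struct udp_dgram *d, int vax_host_len)
{
    uint8_t  buf[12 + 8 + MAX_DATA];
    uint16_t ulen = (uint16_t)(8 + d->dlen);   /* UDP header + data */
    size_t   n = 0;

    assert(d->dlen <= MAX_DATA);

    /* pseudo-header: src, dst, zero, protocol, UDP length */
    memcpy(buf + n, d->src, 4); n += 4;
    memcpy(buf + n, d->dst, 4); n += 4;
    buf[n++] = 0;
    buf[n++] = 17;                       /* IPPROTO_UDP */
    if (vax_host_len) {
        buf[n++] = (uint8_t)(ulen & 0xff);
        buf[n++] = (uint8_t)(ulen >> 8);
    } else {
        n += put_be16(buf + n, ulen);
    }

    /* UDP header: ports, length, checksum field zeroed */
    n += put_be16(buf + n, d->sport);
    n += put_be16(buf + n, d->dport);
    n += put_be16(buf + n, ulen);
    n += put_be16(buf + n, 0);

    memcpy(buf + n, d->data, d->dlen); n += d->dlen;
    return cksum(buf, n);
}
```

For a 4-byte payload, udp_cksum(&d, 0) and udp_cksum(&d, 1) come out
different, so a receiver that verifies UDP checksums would reject what
the unpatched sender emits, while a receiver that skips the check
would never notice.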

You should be able to make this change to your 4.2 BSD systems
and clear up the ethernet problem (and charge on to the next crisis).

Good Luck,
-- 

Pat Keziah				Ampex Switcher Company
{hao,boulder,avsdS}!ascvax!pjk		10604 West 48th Avenue
					Wheatridge,  CO  80033
					USA
					(303)423-1300 x226