[net.unix-wizards] deuna driver problems in 4.23

jason%ucbopal.CC@ucb-vax.ARPA (Jason Venner) (10/01/85)

On several of my deuna machines,  I get the following message
de0:buffer unavailable
This is unrelated to the load on the machine (as far as I can tell,
I get them on idle machines and on busy machines).

There is another problem that sometimes rears its ugly head.

The machine becomes unable to transmit,  even though it is able to
receive.
The only toggle I have for this condition is a program that sends a
large number of udp packets in a lump.

Any suggestions?

jason.
jason@ucbjade.berkeley.edu
jason%ucbjade@berkeley.{edu,arpa,csnet}
jason@ucbjade.BITNET
{tektronix,ihnp4,decvax,dual,sun,cbosg}!ucbvax!ucbjade!jason

smb@ulysses.UUCP (Steven Bellovin) (10/07/85)

> 
> On several of my deuna machines,  I get the following message
> de0:buffer unavailable
> This is unrelated to the load on the machine (as far as I can tell,
> I get them on idle machines and on busy machines).
> 
> There is another problem that sometimes rears its ugly head.
> 
> the machine becomes unable to transmit,  even though it is able to
> receive.
> The only toggle I have for this condition is a program that sends a
> large number of udp packets in a lump.

We have similar problems on our 750s with 3Com boards.  Sometimes, for no
apparent reason, the board will stop transmitting packets.  A (modified)
netstat -i shows 50 packets queued for output, with lots of drops indicated.
Ruptime on the affected machine will show everyone else up, indicating
packets are being received.  This has never happened with our DEUNAs, or
on our 780s.  One possible clue:  on one machine, we have the 3Com board
on the same UBA as a KMC with line unit (for RJE); when the 3Com board hangs,
the KMC hangs as well.  But a DEUNA and a DMR-11 on the same UNIBUS remain
alive and well.

Suggestions, anyone?  I'm contemplating a sanity timer on the transmit interrupt
to reinitialize the board when a transmission has taken too long.


		--Steve Bellovin

karl@osu-eddie.UUCP (Karl Kleinpaste) (10/08/85)

> > On several of my deuna machines,  I get the following message
> > de0:buffer unavailable
> > This is unrelated to the load on the machine (as far as I can tell,
> > I get them on idle machines and on busy machines).
> > ...
> > the machine becomes unable to transmit,  even though it is able to
> > receive.
> 
> We have similar problems on our 750s with 3Com boards.  Sometimes, for no
> apparent reason, the board will stop transmitting packets...
> 
> Suggestions, anyone?  I'm contemplating a sanity timer...

This is exactly what I had to do to a DEQNA (Q-bus) Ethernet board
which we use in a PDP-11/73.  When I talked with an engineer at DEC
who worked on the beast, he said it was due to a fault in a
transmitter chip which DEC uses but doesn't manufacture; it seems it
gets stuck sometimes, believing itself to be transmitting when it isn't.
A very simple sanity timer goes off every 2 seconds and asks
	if (xmit_active && current_time > xmit_start_time + <somevalue>)
			reset the board;
Or something to that effect.  DEUNAs are supposed to be similar, but
not identical, of course.
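
Spelled out in C, that check might look roughly like the sketch below.
All of the names here (struct xmit_watchdog, wd_xmit_start, wd_xmit_done,
wd_check) are made up for illustration and are not taken from any real
driver; the actual board reset is left to a caller-supplied function, and
the timeout is whatever <somevalue> turns out to suit the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-interface watchdog state (illustration only). */
struct xmit_watchdog {
    bool xmit_active;          /* a transmit was started and has not completed */
    unsigned long xmit_start;  /* tick count when that transmit was started */
    unsigned long timeout;     /* ticks a transmit may take before we give up */
    unsigned long resets;      /* number of resets performed so far */
};

/* Call when a packet is handed to the board. */
void wd_xmit_start(struct xmit_watchdog *w, unsigned long now)
{
    assert(w != NULL);
    w->xmit_active = true;
    w->xmit_start = now;
}

/* Call from the transmit-complete interrupt. */
void wd_xmit_done(struct xmit_watchdog *w)
{
    assert(w != NULL);
    w->xmit_active = false;
}

/*
 * Call periodically (for example every 2 seconds).  If a transmit has
 * been outstanding for more than w->timeout ticks, invoke reset_board
 * (if one was supplied), log the event, and return true.
 * The unsigned subtraction keeps the comparison valid if the tick
 * counter wraps around.
 */
bool wd_check(struct xmit_watchdog *w, unsigned long now,
              void (*reset_board)(void))
{
    assert(w != NULL);
    if (w->xmit_active && now - w->xmit_start > w->timeout) {
        if (reset_board != NULL)
            reset_board();
        w->xmit_active = false;
        w->resets++;
        printf("Enet reset\n");
        return true;
    }
    return false;
}
```

Keeping a reset count next to the console message makes it easy to see
how often the board is getting stuck.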

Fortunately, when resetting it like this, I don't have to re-send any
setup packets; it manages to remember that much on its own.  But
I also print a note that says something like "Enet reset" on the
console when it happens; it helps us keep track of how often the beast
gets stuck like this.  There is apparently no `real' solution, unless
you can get DEC to deal with their supplier.

The problem may be similar with the 3Coms (do DEC and 3Com have the
same supplier?  I wish I knew who they were), so the same solution should
work pretty well there, too.
-- 
Karl Kleinpaste

gail@calmasd.UUCP (Gail B. Hanrahan) (10/14/85)

In re: "de0: buffer unavailable" messages --

Our poor little 730 (which runs an old 4.2 binary from mt xinu)
developed this problem late last week.  It does not seem to have
affected the running of the system (used only for driving
printers).  The DEC CE came out and swapped DEUNAs; the new
DEUNA produced the same error message, which leads me to think
that something else must be going on.

Since it is a binary-only machine (no source at all at this
site, though Calma does have a source license in Milpitas),
I can't go hacking on the DEUNA driver.  Any other suggestions
would be appreciated.

-- 

Gail Bayley Hanrahan
Calma Company, San Diego
{ihnp4,decvax,ucbvax}!sdcsvax!calmasd!gail

pc@ukc.UUCP (R.P.A.Collinson) (10/18/85)

The deuna seems to be a UNIBUS-hungry beast. Recently, Quantime in
London had this problem and I suggested that the interface be moved
further up the UNIBUS nearer the processor. There was a tape and disc
drive between it and the end of the bus.

This was a guess at the solution and based on no real knowledge. 

However, it did seem to solve the problem.

The tape and discs still work, by the way.