[comp.sys.pyramid] Pyramid and Sun net problems?

berger@datacube.UUCP (11/16/87)

We are running OSx 4.0 on a 90x (with no data cache). We seem to be
experiencing real performance problems, primarily with the Ethernet.

We mostly have a network of Sun 3s, with two Suns as servers. There is
cross NFS mounting between the Suns and the Pyramid. The Suns are running
SunOS 3.4.

The biggest problem is when a user has a home directory on the Pyramid
and is logged into a sun, even if they are working on a Sun filesystem,
they get error messages such as:

NFS getattr failed for server datacube: RPC: (unknown error code)
NFS server datacube not responding still trying
NFS write failed for server datacube: RPC: Timed out
NFS write error 60 on host datacube
NFS getattr failed for server datacube: RPC: Timed out
NFS lookup failed for server datacube: RPC: Timed out
NFS readdir failed for server datacube: RPC: Timed out
NFS lookup failed for server datacube: RPC: Unable to receive

These messages appear regularly whenever the Sun is dealing with
a file on a Pyramid file system via NFS, or when the Sun user's home
directory is on a Pyramid file system via NFS.

The most serious consequence is that some programs (in particular FRAME,
a WYSIWYG editor on the Sun) will hang during a write, at times jamming
the whole Sun workstation.

Any suggestions?
				Bob Berger 

Datacube Inc. Systems / Software Group	4 Dearborn Rd. Peabody, Ma 01960
VOICE:	617-535-6644;	FAX: (617) 535-5643;  TWX: (710) 347-0125
UUCP:	berger@datacube.COM,  rutgers!datacube!berger, ihnp4!datacube!berger
	{cbosgd,cuae2,mit-eddie}!mirror!datacube!berger

steve@BRILLIG.UMD.EDU (Steve D. Miller) (11/17/87)

   The Suns send out huge (ballpark 4K) UDP packets when doing NFS reads and
writes.  These packets get fragmented at the IP level into three
back-to-back large IP packets; if the Pyramid Ethernet controller or
software is slow, it could end up dropping one or more of those IP packets.
This can be controlled by using the rsize and wsize options when you do a
cross-mount.  (Setting these to 2K should help.)  There should be something
in the Sun mount(8) man entry about this, and I would hope that the option
appears on the Pyramid side, too.
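
   To put rough numbers on the fragmentation argument, here is a minimal
sketch of the arithmetic. It assumes a standard 1500-byte Ethernet MTU, a
20-byte IP header with no options, and an 8-byte UDP header, and it ignores
the RPC/NFS header bytes that ride inside the UDP payload. The function
name ip_fragments is made up for illustration.

```python
import math

ETHERNET_MTU = 1500   # assumed: largest IP packet per Ethernet frame, in bytes
IP_HEADER = 20        # assumed: IPv4 header without options
UDP_HEADER = 8        # UDP header, carried once per datagram

def ip_fragments(udp_payload, mtu=ETHERNET_MTU):
    """Return how many IP packets are needed to carry one UDP datagram."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil((udp_payload + UDP_HEADER) / per_fragment)

# Compare the ballpark 4K transfer size with smaller rsize/wsize settings.
for size in (1024, 2048, 4096):
    print(size, ip_fragments(size))
```

   Under these assumptions a 4K transfer needs three back-to-back IP
packets, a 2K transfer needs two, and a 1K transfer fits in one. Fewer
back-to-back packets gives a slow receiver fewer chances to drop one and
force the whole request to be retransmitted.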

   This same sort of thing happens between Sun-2s with 3Com Ethernet cards
and Sun-3s.  Fiddling with these settings might not help, but if you
haven't tried, you should.

   Good luck.

	-Steve

Spoken: Steve Miller    Domain: steve@mimsy.umd.edu    UUCP: uunet!mimsy!steve
Phone: +1-301-454-1516  USPS: UMIACS, Univ. of Maryland, College Park, MD 20742

bob@aargh.cis.ohio-state.edu (Bob Sutterfield) (11/18/87)

In article <122700002@datacube> berger@datacube.UUCP writes:
>
>We are running osx 4.0 on a 90x (with no data cache). We seem to be 
>experience real performance problems primarily with the ethernet.
>...
>The biggest problem is when a user has a home directory on the Pyramid
>and is logged into a sun, even if they are working on a Sun filesystem,
>they get error messages such as:
>
>NFS server datacube not responding still trying
>NFS write failed for server datacube: RPC: Timed out
>...

Sounds pretty familiar.  We had such problems all spring and half the
summer.  Traced our Ethernet backbone (suspecting length/late collision
problems) and cleaned it up a lot in the process.  Spent a lot of time
watching packets fly with a LANalyzer.  Really thought it was our
problem.

All the users were getting pretty irate at us, thinking the system was
crashing and bouncing 20 times a day.

Then one day, while talking to RTOC about another problem, I happened
to notice that once again, all my Pyramid-based X clients' connections
to my Sun had timed out and winked away, and once again swore pretty
vehemently at our network.

The RTOC person said "Oh yeah, that's a known problem.  You see it?
OK, we'll ship you a PTF and fix you right up."  Sure enough, next
morning I installed their PTF, rebooted, and we haven't had such
problems since.

So, I'd suggest you describe your problems to RTOC, because they just
might be able to do something about it for you.

While you have them on the phone, suggest (as I have) that they change
their fix distribution policy so as to send out PTFs that solve
problems of some designated severity level, to all users of that
software, on a regular basis.  This would seem a better idea than
customers wasting a couple of months fooling around trying to solve
major problems, when all along it was a known bug with a known fix.

As I said, I have mentioned this to RTOC, our local sales rep, our
local tech support person, and anybody at any level inside Pyramid
who would listen.  Maybe if we get more votes, it might help.

"And if twenty people walk in singing the first few bars... they'll
think it's a movement!"
-=-
 Bob Sutterfield, Department of Computer and Information Science
 The Ohio State University; 2036 Neil Ave. Columbus OH USA 43210-1277
 bob@ohio-state.{arpa,csnet} or ...!cbosgd!osu-cis!bob
 soon: bob@cis.ohio-state.edu