[comp.sys.sequent] Problems with NFS

urmel@cosmo.uucp (Markus Hess) (09/26/90)

I'm working as a system administrator and systems programmer for a publishing
company and I'm responsible for a lot of UN*X computers, including Sun
and Sequent Balance machines.

Last week I updated the Suns to the new OS version 4.1 and encountered
a problem with NFS:

	The Sequent, running DYNIX 3.0.12.1 and operating as server
	for the Suns, now causes NFS errors (NFS server ... not responding,
	still trying :-( ) on filesystems mounted on the Suns.
	The transfer rate has slowed dramatically, to about 10 KB per second.
	With SunOS 4.0.x we had no problems. I think the Sequent NFS
	version used in DYNIX 3.0.12.1 is out of date, but I don't know
	for sure.

	Judging from the netstat(1N) output on the Sequent, the problem
	seems to be that the Sequent drops a large number of
	IP fragments because of timeouts.
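
A rough sketch of why dropped fragments hurt so much. It assumes the common
8 KB NFS transfer size over UDP and a 1500-byte Ethernet MTU, neither of
which is stated above; the 128-byte RPC/NFS header allowance and the 5%
fragment loss rate are purely illustrative guesses, and the function names
are made up. Each NFS reply is split into several IP fragments, and if any
one of them is missing when reassembly times out, the whole reply is lost
and must be retransmitted.

```python
import math

ETHERNET_MTU = 1500      # bytes of IP packet carried per Ethernet frame
IP_HEADER = 20           # bytes, assuming no IP options
UDP_HEADER = 8           # bytes
RPC_NFS_OVERHEAD = 128   # ASSUMPTION: rough allowance for RPC + NFS headers


def fragments_per_datagram(nfs_payload, mtu=ETHERNET_MTU):
    """Number of IP fragments needed to carry one NFS reply over UDP."""
    # Fragment payloads must be a multiple of 8 bytes (except the last one).
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    datagram = UDP_HEADER + RPC_NFS_OVERHEAD + nfs_payload
    return math.ceil(datagram / per_fragment)


def datagram_survival(nfs_payload, fragment_loss):
    """Chance that every fragment of one reply arrives (independent losses)."""
    n = fragments_per_datagram(nfs_payload)
    return (1.0 - fragment_loss) ** n


if __name__ == "__main__":
    # ASSUMPTION: 5% of fragments dropped by the receiver, for illustration.
    for size in (8192, 4096, 1024):
        print(size, fragments_per_datagram(size),
              round(datagram_survival(size, 0.05), 3))
```

Under these assumptions a smaller transfer size means fewer fragments per
reply, so a single dropped fragment costs less. Whether that helps in this
particular case is untested.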

Has anyone else seen this problem before? Can anyone help
or give a hint on how to solve this very annoying problem?
Thanks for any help in advance.

Markus

se@comp.lancs.ac.uk (Steve Elliott) (09/27/90)

In article <5996@balu.UUCP> urmel@cosmo.uucp (Markus Hess) writes:
>
>Last week I updated the Suns to the new OS version 4.1 and encountered
>a problem with NFS:
>
>	The Sequent, running DYNIX 3.0.12.1 and operating as server
>	for the Suns, now causes NFS errors (NFS server ... not responding,
>	still trying :-( ) on filesystems mounted on the Suns.
>	The transfer rate has slowed dramatically, to about 10 KB per second.
>	With SunOS 4.0.x we had no problems. I think the Sequent NFS
>	version used in DYNIX 3.0.12.1 is out of date, but I don't know
>	for sure.
>
>
>Has anyone else seen this problem before? Can anyone help
>or give a hint on how to solve this very annoying problem?
>Thanks for any help in advance.
>
>Markus

I was just about to ask a similar question myself. Several of my
users have reported problems when working on a Sun SparcStation
with user files NFS mounted from a Symmetry S81, DYNIX 3.0.15.
It got so bad I eventually rang up Sequent for advice. 
The conversation went like this:
Me: "We're getting NFS error messages on our Sequent"
Sequent: "What other machines?"
Me: "Sun SparcStations"
Sequent: "I knew you were going to say that...."

Apparently SparcStations throw out Ethernet packets at such a speed
that the Sequent can't keep up with them.
I was told that old SCED boards might be adding to the problem, but that
even with the most up-to-date SCEDs there will still be a problem.
The guy I spoke to suggested that we tweak our Suns to transmit
Ethernet packets more slowly! He added that the only real solution is
for Sequent to develop a faster driver.

Steve
-- 
se@uk.ac.lancs.comp
Department of Computing, Engineering Building, 
University of Lancaster, Bailrigg, Lancaster, LA1 4YR, UK
PHONE: +44 524 65201 ext 3783.

ables@lot.ACA.MCC.COM (King Ables) (09/28/90)

From article <1032@dcl-vitus.comp.lancs.ac.uk>, by se@comp.lancs.ac.uk (Steve Elliott):
> 
> Apparently SparcStations throw out Ethernet packets at such a speed
> that the Sequent can't keep up with them.

If true, this is very interesting.  I saw the same thing happen a few years
ago.

We had a Balance 8000 and a bunch of Sun-2s (I told you it was a few years
ago!) on our net.  The day we got Sun-3s and put them on the net, the
Balance ethernet interface started hanging (this was before the NFS port,
so this was just plain old rlogin/rsh access).  You could reboot and it
would fix it for a while, but soon it would hang again.  Eventually the
SCED board had a new rev. released and all was ok again.  The story then
was the same.  The Sequent hardware just wasn't able to keep up with
the Sun hardware (I think this instance was back-to-back packets which
hadn't been done before).

Let me preface the following criticism with some praise, though.  Sequent
was VERY cooperative and basically busted their butts to solve the problem
for us.  They sent people down a couple of times to look at it (the fact
that a couple of other customers in the area were also having the same problem
probably didn't hurt, though).  And in all the time I ever dealt with them
from a tech. support standpoint, I found them to be far and away better than
any other computer company I've ever dealt with before or since.

The thing that worried me was they obviously weren't designing their interfaces
according to published specs.  They seemed to be looking at what was out there
and designing to coexist with it.  When somebody made a breakthrough of speed
or capacity which was still within the spec., it fouled them up.  This doesn't
seem real bright.  And from this recent problem, it sounds like they still 
may be doing this.  Seems like after being burned once they wouldn't do that
again (fool me once, shame on you, fool me twice, shame on me, that sort of
thing).
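
For a sense of what "within the spec" allows, here is a small sketch using
the standard 10 Mbit/s Ethernet framing numbers (8 bytes of preamble and
start delimiter, a 96-bit-time interframe gap, frames of 64 to 1518 bytes).
The function name is made up for illustration, and nothing here describes
what the SCED board or either Sun actually does; it only shows the
back-to-back frame rates a fully compliant sender is permitted to reach.

```python
BIT_RATE = 10000000        # 10 Mbit/s Ethernet
PREAMBLE_BYTES = 8         # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BITS = 96   # minimum idle time between frames, in bit times


def max_frames_per_second(frame_bytes):
    """Upper bound on back-to-back frames per second for a given frame size."""
    bits_on_wire = (PREAMBLE_BYTES + frame_bytes) * 8 + INTERFRAME_GAP_BITS
    return BIT_RATE / bits_on_wire


if __name__ == "__main__":
    for size in (64, 1518):   # minimum and maximum Ethernet frame sizes
        print(size, int(max_frames_per_second(size)))
```

With minimum-size frames the gap between the end of one frame and the start
of the next is only 9.6 microseconds, so a receiver built around what slower
senders happened to do has very little margin once a faster sender appears.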

Obviously having to SLOW DOWN your Suns is not a good solution.  I expect
Sequent will be working on a solution for you, but if it's BAD (Broken As
Designed) it may not come quickly.

Do other companies have this kind of trouble?  I flashed on the posting since
I've seen it from Sequent before, but I've never seen or heard of it
happening with other vendors.

On the other hand, it's the ONLY significant problem I've ever seen out
of Sequent, so from that standpoint it's not so bad.
-----------------------------------------------------------------------------
King Ables                    Micro Electronics and Computer Technology Corp.
ables@mcc.com                 3500 W. Balcones Center Drive
+1 512 338 3749               Austin, TX  78759
-----------------------------------------------------------------------------
We don't inherit the Earth from our parents, we borrow it from our children.

pen@lysator.liu.se (Peter Eriksson) (09/29/90)

ables@lot.ACA.MCC.COM (King Ables) writes:

>We had a Balance 8000 and a bunch of Sun-2s (I told you it was a few years
>ago!) on our net.  The day we got Sun-3s and put them on the net, the
>Balance ethernet interface started hanging (this was before the NFS port,
>so this was just plain old rlogin/rsh access).  You could reboot and it
>would fix it for a while, but soon it would hang again.  Eventually the
>SCED board had a new rev. released and all was ok again.  The story then
>was the same.  The Sequent hardware just wasn't able to keep up with
>the Sun hardware (I think this instance was back-to-back packets which
>hadn't been done before).

Hmm... This sounds a little like a problem we're experiencing right now with
our Sequent Balance 8000. Occasionally the ethernet seems to get stuck in one
direction (i.e., the Sequent can transmit packets, but everything sent to it
ends up in /dev/null). At first we thought the transceiver was faulty, but
now we're not so sure anymore. It got better when we replaced it with a newer
one (INMAC Clear Signal), but there are signs that the problem might recur. Most
of the time things work just fine though (with Sun-3s and SPARCs - no NFS
though, Dynix 2.1.1 doesn't support it...)

/Peter

--
Peter Eriksson                                              pen@lysator.liu.se
Lysator Computer Club                             ...!uunet!lysator.liu.se!pen
University of Linkoping, Sweden                               "Seize the day!"