[comp.protocols.nfs] NFS Problems

jarvis@psych.toronto.edu (Brian Jarvis) (02/19/91)

I'm in need of some advice...

I've got a number of machines, some lowly 4.77 MHz 8086's, others are
20 MHz 386's, all equipped with WD8003e controllers and Beame & Whiteside
NFS software (Ver. 2.10).

I've found that, for optimal through-put, the PC's should be configured
to read in blocks of 4096 bytes, but writing should only be done in
512 byte blocks, the smallest possible.  With these settings, reading
files across the network from the server can be done at about 10K/s, but
writing still chugs at a mere 0.3K/second.  If I raise the write block
parameter to 4096 bytes, this number shrinks to 0.1K/second.  These
tests were taken when traffic on the network was relatively light.

The way the ethernet is installed in our building is somewhat brain-damaged,
I do confess.  It's easy for me to criticize:  I wasn't here when it was
installed!  B{)  I do, however, have to live with it...

The above numbers came from my PC on the 4th floor of our building.  The
packets from the server have to travel along the ground floor (where the
server is located), through a repeater, up to the 4th floor to a router
which then pushes packets out onto 5 different loops.  I'm on one of
these loops.  Other machines connected to the network between the repeater
and the server do not have any difficulty getting excellent through-put,
writing and reading.  Machines after the repeater and router are all
equally sluggish.  All of this would seem to point to problems with
the repeater and router; I haven't yet been able to plant a machine between
the two to see if I can nail it down further, but we'll see...

The part that confuses me most is that telnet and rlogin operations are
not affected.  I can read news and mail from the server with bwktel and
NCSA telnet from my same PC without the slightest hint of sluggishness.
"ftp" through-put exceeds 50K/second.  Go figure.

Now the questions:

1)  Does anyone have a clue where I should be concentrating my efforts?
    I've been spending quite a bit of time checking all items I could,
    but it's clear now that I'm only thrashing.  Pointers are welcome.

2)  Has this ever happened to anyone else?  How did they fix it?

3)  Does anyone have pointers to any sort of preferred network monitoring
    software or hardware?  I've been playing with "NFSWATCH 3.0".  Is
    there any sort of recommended ethernet sniffer (the cheaper the better)
    that I could acquire to help debug this and other ethernet problems?

4)  Is anyone out there actually using Beame & Whiteside NFS software?
    Sometimes I get the impression we're the only ones... B{/

Thanks!

Brian

-- 

Brian A. Jarvis,          Rm. 4026, Sidney Smith Hall, Dept. of Psychology,
jarvis@psych.toronto.edu  University of Toronto, Toronto, Ontario, Canada
System Administrator      M5S 1A1  (416) 978-3948

kdenning@pcserver2.naitc.com (Karl Denninger) (02/20/91)

In article <1991Feb18.165842.12709@psych.toronto.edu> jarvis@psych.toronto.edu (Brian Jarvis) writes:
>I'm in need of some advice...
>
>I've got a number of machines, some lowly 4.77 MHz 8086's, others are
>20 MHz 386's, all equipped with WD8003e controllers and Beame & Whiteside
>NFS software (Ver. 2.10).
>
>I've found that, for optimal through-put, the PC's should be configured
>to read in blocks of 4096 bytes, but writing should only be done in
>512 byte blocks, the smallest possible.  With these settings, reading
>files across the network from the server can be done at about 10K/s, but
>writing still chugs at a mere 0.3K/second.  If I raise the write block
>parameter to 4096 bytes, this number shrinks to 0.1K/second.  These
>tests were taken when traffic on the network was relatively light.

No!  You have problems with your routers or bridges!  You should write and
read in 4096-byte blocks.... if you can't, someone's dropping ethernet 
packets...
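
To put the reported figures in perspective, here is a rough back-of-the-envelope sketch of how long each write request is taking. It assumes that "K" means 1024 bytes and that the client issues one write at a time and waits for the reply; neither is stated in the original post, and the function name is made up for illustration.

```python
# Rough arithmetic on the throughput figures quoted above.
# ASSUMPTIONS: 1K = 1024 bytes, and the client sends one write request
# at a time, waiting for each reply before sending the next.

def seconds_per_write(block_bytes, kbytes_per_sec):
    """Average wall-clock time spent on each write request."""
    return block_bytes / (kbytes_per_sec * 1024)

for block, rate in ((512, 0.3), (4096, 0.1)):
    t = seconds_per_write(block, rate)
    print(f"{block:4d}-byte writes at {rate} K/s: about {t:.1f} s per request")
```

Times on the order of seconds per request are far longer than a healthy Ethernet round trip, which fits the idea that most of the time is spent waiting out retransmission timeouts rather than moving data.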

Also, your numbers are way too low.  I can EASILY beat or equal LANMAN 
performance over either Ethernet or Token ring...  We wouldn't be running it
here if the performance wasn't up to par....

8086's are a lot slower, but the 386 systems here all can perform this well.

>The above numbers came from my PC on the 4th floor of our building.  The
>packets from the server have to travel along the ground floor (where the
>server is located), through a repeater, up to the 4th floor to a router
>which then pushes packets out onto 5 different loops.  I'm on one of
>these loops.  Other machines connected to the network between the repeater
>and the server do not have any difficulty getting excellent through-put,
>writing and reading.  Machines after the repeater and router are all
>equally sluggish.  All of this would seem to point to problems with
>the repeater and router; I haven't yet been able to plant a machine between
>the two to see if I can nail it down further, but we'll see...

Yow.  Find out where you're dropping packets on the floor and fix it!  It
sounds like your router is not up to snuff, but it could be a badly terminated
segment in there somewhere as well....

>The part that confuses me most is that telnet and rlogin operations are
>not affected.  I can read news and mail from the server with bwktel and
>NCSA telnet from my same PC without the slightest hint of sluggishness.
>"ftp" through-put exceeds 50K/second.  Go figure.

UDP traffic (which is what NFS runs over) stresses the network, since it is
a "blast and forget" type of thing.  You find out later that the packet
didn't get there, and have to retransmit the whole request.  In the extreme
case, you'll never get all the packets and you will get error messages.  TCP
connections are stateful and connection-oriented: the receiver acknowledges
data as it arrives and the sender retransmits only what was lost.  Thus
telnet, rlogin and ftp survive where UDP will fail to work properly.

I've seen problems with 4KB write and read sizes when using a Sun to route
between subnets with SunOS 4.0.3.  4.1 and later seem better able to handle
the load.  Repeaters shouldn't be a problem unless there are termination
difficulties.
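
A small sketch of why a lossy path punishes 4096-byte writes harder than 512-byte ones: a UDP datagram larger than the Ethernet MTU is split into several IP fragments, and losing any one fragment loses the whole request. The RPC/NFS header size and the 20% frame-loss rate below are illustrative assumptions, not measurements from this network, and the function names are made up for the example.

```python
import math

ETHERNET_MTU = 1500      # bytes of IP packet carried per Ethernet frame
IP_HEADER = 20           # bytes, assuming no IP options
UDP_HEADER = 8           # bytes
RPC_NFS_OVERHEAD = 150   # ASSUMPTION: rough size of the RPC + NFS write headers

# Data carried by each IP fragment, rounded down to a multiple of 8 bytes.
FRAGMENT_PAYLOAD = (ETHERNET_MTU - IP_HEADER) // 8 * 8   # 1480

def fragments_needed(write_size):
    """Number of Ethernet frames needed for one NFS write request over UDP."""
    datagram = UDP_HEADER + RPC_NFS_OVERHEAD + write_size
    return math.ceil(datagram / FRAGMENT_PAYLOAD)

def delivery_probability(write_size, frame_loss):
    """Chance that every fragment of one request survives independent loss."""
    return (1.0 - frame_loss) ** fragments_needed(write_size)

for size in (512, 4096):
    n = fragments_needed(size)
    p = delivery_probability(size, 0.20)   # ASSUMPTION: 20% frame loss
    print(f"{size:4d}-byte write: {n} frame(s), "
          f"{p:.0%} chance the request arrives intact")
```

Under these assumptions a 512-byte write fits in one frame while a 4096-byte write needs three sent back to back, so the larger request is both more likely to lose a piece and more likely to overrun a router that cannot keep up with back-to-back frames.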

>Now the questions:
>
>1)  Does anyone have a clue where I should be concentrating my efforts?
>    I've been spending quite a bit of time checking all items I could,
>    but it's clear now that I'm only thrashing.  Pointers are welcome.

Check the router.  I bet it can't handle the speed of transmission of the
packets.  Also have a FULL TDR scan run on all the cable itself.  You may
find a bad terminator or tap.

>2)  Has this ever happened to anyone else?  How did they fix it?

See above :-)  We gave up here (over 150 seats on thinnet) and are installing
10BaseT equipment.  Regardless of how careful we were, there were always
problems with the wiring while we were on thin cable.....

>4)  Is anyone out there actually using Beame & Whiteside NFS software?
>    Sometimes I get the impression we're the only ones... B{/

We use it, and LOVE it.  It's the best implementation out there.  I've tried
FTP's, Sun's, and B&W.  I wouldn't consider using anything less.  Carl Beame
should be canonized (well, perhaps that's a little strong, but you get the
idea) :-)

--
Karl Denninger - AC Nielsen, Bannockburn IL (708) 317-3285
kdenning@nis.naitc.com

"The most dangerous command on any computer is the carriage return."
Disclaimer:  The opinions here are solely mine and may or may not reflect
  	     those of the company.