[comp.protocols.nfs] SUN NFS Load ? Problem

beame@maccs.dcss.mcmaster.ca (Carl Beame) (04/18/91)

	I have a customer who has an interesting NFS problem. The person
is copying several thousand files a day from a Novell server to a SUN NFS
server. About 8 out of every 1000 files are corrupted by zeros in the first
512 bytes.

	We tracked the problem down and found that all corrupted files occurred
when the following NFS calls were made:

                     PC sends                      Sun replies
                   =================             ===================
                   Create file                     --------------
 (timeout retrans) Create file (same XID)          OK (created)
                   Write 512 bytes data (<>0)      OK (written)
                   --------------------            OK (created)
                   Write 4096 bytes data           OK (written)
                         . . .

	What seems to be happening is that the second create, even though
it has the same XID as the first, is executed after the first write of
data. The SUN thus treats the file as having a 512 byte hole at the
beginning, since the file was created again (and presumably truncated)
after the first write, and the next write starts at offset 512.
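
The sequence in the trace can be reproduced with a toy model. This is
only a sketch under stated assumptions: ToyServer and replay_trace are
hypothetical names, the "server" is an in-memory dictionary rather than
a real NFS server, and it assumes that a create on an existing file
truncates it to zero length.

```python
class ToyServer:
    """In-memory stand-in for a server with no duplicate request cache."""

    def __init__(self):
        self.files = {}

    def create(self, name):
        # Assumption: a create on an existing file truncates it to zero length.
        self.files[name] = bytearray()

    def write(self, name, offset, data):
        buf = self.files[name]
        if len(buf) < offset:
            # Writing past the end of the file leaves a zero-filled hole.
            buf.extend(b"\x00" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data


def replay_trace():
    """Replay the order of operations seen in the trace."""
    server = ToyServer()
    server.create("f")                      # one copy of the create is served
    server.write("f", 0, b"\x01" * 512)     # first write, non-zero data
    server.create("f")                      # the other copy is served late
    server.write("f", 512, b"\x02" * 4096)  # next write starts at offset 512
    return bytes(server.files["f"])
```

In this model the returned contents begin with 512 zero bytes followed
by the 4096 bytes of the second write, which matches the corruption
described.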

	This is a very, very slow SUN running SunOS 3.5. Does anyone
know if there is anything this person can do to stop this occurring?
(Buying more horsepower is NOT a possible solution).


- Carl Beame
Beame & Whiteside Software Ltd.
Beame@McMaster.CA
(416) 648-6556

thurlow@convex.com (Robert Thurlow) (04/20/91)

In <280D9593.23762@maccs.dcss.mcmaster.ca> beame@maccs.dcss.mcmaster.ca (Carl Beame) writes:
[about getting zeros in the first blocks of files.]
>	This is a very, very slow SUN running SunOS 3.5. Does anyone
>know if there is anything this person can do to stop this occurring?
>(Buying more horsepower is NOT a possible solution).

Upgrade the OS!  SunOS 3.5 dates back to before people started using
the duplicate request cache.  The duplicate create is getting dealt
with after the first write because separate NFS daemons are not aware
of that possibility.  The only problem is what OS to upgrade to.  I'd
think 4.0.3 would be pretty good if they feel nervous about making the
leap to 4.1.1.  They could also double or treble the mount timeouts
on the client side if they have control over that, at the cost of
performance hits when packets really do get lost.
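
For illustration, the idea behind a duplicate request cache can be
sketched roughly as below. This is not the SunOS implementation; the
names DupRequestCache and demo, the (client, xid) key, and the size
limit are all assumptions made for the example, and a real server
also has to cope with requests that are still in progress.

```python
from collections import OrderedDict


class DupRequestCache:
    """Remember replies to recent requests, keyed by (client, xid)."""

    def __init__(self, max_entries=128):
        self.max_entries = max_entries  # hypothetical size limit
        self.replies = OrderedDict()

    def call(self, client, xid, handler):
        key = (client, xid)
        if key in self.replies:
            # Retransmission: return the saved reply, do not execute again.
            return self.replies[key]
        reply = handler()
        self.replies[key] = reply
        if len(self.replies) > self.max_entries:
            self.replies.popitem(last=False)  # evict the oldest entry
        return reply


def demo():
    """Create, write, then a duplicate create with the same XID."""
    files = {}
    cache = DupRequestCache()

    def create():
        files["f"] = bytearray()  # assumed to truncate an existing file
        return "OK (created)"

    def write_first_block():
        files["f"][0:512] = b"\x01" * 512
        return "OK (written)"

    cache.call("pc", 1, create)             # create
    cache.call("pc", 2, write_first_block)  # first write
    cache.call("pc", 1, create)             # duplicate create, same XID
    return bytes(files["f"])
```

With the cache in front of the handlers, the duplicate create gets
the saved reply and the first 512 bytes written survive.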

Rob T
--
Rob Thurlow, thurlow@convex.com
An employee and not a spokesman for Convex Computer Corp., Dallas, TX