[comp.unix.wizards] NFS reliability

jeff@tc.fluke.UUCP (11/13/86)

In article <1823@rlvd.UUCP> mike@louis.UUCP () writes:

>Recently we have been doing a study of NFS fileservers and we have
>come across unreliability in NFS (i.e. writing something to a remote
>file and finding something different when reading it back) when the
>server was under extreme load. Now we are starting to notice the same
>behaviour on our existing Sun fileservers. 

>The question is, have others noticed this and does anyone know why
>it happens? And, of course, does anyone know how to stop it?

>Mike Woods

We have also experienced a (rare) phenomenon which does seem limited to
NFS files: a file may have a (logical) block's worth of data written as nulls.
It appears to happen under some ill-defined conditions of simultaneous access.
We've only seen it with /usr/spool/mail files and RCS *,v files (ouch!).
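
One way to check a suspect file for this symptom is to scan its contents
for block-aligned runs of NUL bytes. What follows is a minimal sketch in C,
not a tool from this thread: the function names are hypothetical, and the
block size is an assumption the caller must supply (8192 is only a guess at
a typical logical block).

```c
#include <stddef.h>

/* Return 1 if the blksz bytes starting at p are all NUL, else 0. */
static int block_is_null(const unsigned char *p, size_t blksz)
{
    size_t i;

    for (i = 0; i < blksz; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/*
 * Count the whole, block-aligned chunks of buf that consist entirely
 * of NUL bytes.  A trailing partial block is ignored.  blksz is an
 * assumption supplied by the caller (for example 8192); it is not a
 * value taken from the report above.
 */
size_t count_null_blocks(const unsigned char *buf, size_t len, size_t blksz)
{
    size_t off, n = 0;

    if (blksz == 0)
        return 0;
    for (off = 0; off + blksz <= len; off += blksz)
        if (block_is_null(buf + off, blksz))
            n++;
    return n;
}
```

For a text file such as a mailbox or an RCS ,v file, any nonzero count is
suspicious, since such files should not contain NUL bytes at all.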

We've suffered another you-don't-get-back-what-you-wrote bug, but it is
NOT limited to NFS.  (If you serve diskless clients, they will
also see panic: ifree's.)  Turns out to be a problem with the Xylogics 450
disk controller under heavy load; we fixed it by putting each disk on a
separate controller.
-- 
        Jeff Stearns       (206) 356-5064
        John Fluke Mfg. Co.
        P.O. Box C9090  Everett WA  98043  
        {uw-beaver,decvax!microsof,ucbvax!lbl-csam,allegra,ssc-vax}!fluke!jeff

lear@rutgers.RUTGERS.EDU (eliot lear) (11/17/86)

Hi..

Supposedly, some of the reliability problems of NFS could be due to
the fact that NFS turns off some forms of error checking.  If this
is the case, one enhancement would be to do better error checking.
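
As an illustration of what "better error checking" might look like, here is
a minimal sketch of an end-to-end check done by the application itself:
compute a checksum over each block before writing it, keep the value, and
compare after reading the block back. The choice of the 16-bit one's
complement Internet checksum is an assumption made for the example (the
post does not say which checks are meant), and the function names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * 16-bit one's complement checksum over len bytes, treating the data
 * as big-endian 16-bit words and padding an odd trailing byte with
 * zero.  The running sum is folded on every step so it cannot overflow.
 */
uint16_t inet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    if (len & 1) {
        sum += (uint32_t)data[len - 1] << 8;
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    return (uint16_t)~sum;
}

/*
 * Return 1 if the block read back still has the checksum recorded at
 * write time, 0 if it does not.
 */
int block_matches(const unsigned char *data, size_t len, uint16_t expected)
{
    return inet_checksum(data, len) == expected;
}
```

A block that comes back as all nulls checksums to 0xFFFF, so a check like
this would flag it unless the original data happened to have that same
checksum.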

eliot lear

-- 

[lear@rutgers.rutgers.edu]
[{pyramid|seismo|ihnp4}!rutgers!lear]

henry@utzoo.UUCP (Henry Spencer) (11/25/86)

> We've suffered another you-don't-get-back-what-you-wrote bug...
> ...  Turns out to be a problem with the Xylogics 450
> disk controller under heavy load; we fixed it by putting each disk on a
> separate controller.

It would be nice to see some *details* on this, either here or in Sun-Spots.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry