bernhold@qtp.ufl.edu (David E. Bernholdt) (12/30/88)
Here is a brief description of our local system:

    4 3/280's acting as file servers
    60 3/50's distributed ~15 per server
    ~2 MB user files distributed among the servers, and NFS mounted
    by *all* servers.

The problem: If one of the servers goes down, it effectively stops all
of the others from working reasonably. The problem seems to be that NFS
requests queue up for the dead disks and are never served, and then
legitimate requests get stuck behind the "bad" requests. The main
culprit (we think) is the call to quota made in the login sequence.
Thus we can't start up a new shell on a 3/50 or anywhere else.

The questions: Is anyone else running a system this large and
interconnected? If so, do you have these problems? Is there something
we can do to make these machines reasonably independent if one goes
down, but still maintain the homogeneity (*all* usr partitions mounted
by *all* servers, etc.) on the rest of them?

Please e-mail and I will summarize if there is any response.

Thanks in advance...
-- 
David Bernholdt
bernhold@qtp.ufl.edu
bernhold@ufpine.bitnet
Quantum Theory Project
University of Florida
Gainesville, FL 32611
904/392 6365