morrison@cs.cornell.edu (Eliot Morrison) (12/21/88)
We are running OS 3.5 on a large but fairly homogeneous site, which consists mainly of five 3/2?0 servers, each serving 7-10 diskless 3/50 or 3/60 clients. For many months we have been plagued by a recurring problem that seems to be related to yellow pages.

Observation has convinced us that the crash of a yellow pages server while a client is in an active state sometimes causes the client to suffer a strange sort of brain damage: some processes requiring file descriptors subsequently fail. Our most salient examples are rsh, which simply returns after the initial handshake because the client-side connect failed, and lpr, which is unable to get a lock. In each case the client kernel seems to believe that no more descriptors are available, a supposition that is belied by pstat, which reveals available resources. More strangely, invoking ypset fixes the problem, even though yellow pages is behaving normally by then.

My question is simply whether anyone has observed similar behavior, and if so, whether the problem has gone away under 4.x.

Thanks,
Eliot Morrison