dietrich@cernvax.UUCP (dietrich wiegandt) (09/14/90)
Hardware: VAX8530
Software: ULTRIX 3.1

Hello,

recently our system got into an almost unusable state, complaining that the
gnode table was full. Running pstat -i revealed that in our table of 1392
entries, 875 entries belonged to the same user, and the rest did not suffice
to run a decent service for the remaining 60 or so logged-on users.

The user with all these busy entries in the gnode table was NOT logged on,
and there was no process on the system belonging to him. Of course he
claimed to be totally innocent.

Any idea by what manipulation we might have got in such a state? We don't
know for how long we have been running with a gnode table threatening to
overflow, so the problem may have arisen days before we noticed it.

We have a rather special environment with many machines from different
manufacturers and lots of NFS mounts (including possibly this user's files)
to remote machines going on.

Any hints would be very much appreciated.

Dietrich Wiegandt
CERN CN Division
grr@cbmvax.commodore.com (George Robbins) (09/17/90)
In article <2731@cernvax.UUCP> dietrich@cernvax.UUCP (dietrich wiegandt) writes:
> Hardware: VAX8530
> Software: ULTRIX 3.1
>
> recently our system got into an almost unusable state complaining

I've seen gnode table full messages when a disk drive goes offline or
otherwise goes wacky. Once this happened when a management type was pushing
buttons, and several times when the HSC decided to run ILDISK (in-line error
analysis) due to drive errors. The HSC support works nicely, but doesn't
seem all that rugged.

> We have a rather special environment with many machines from different
> manufacturers and lots of NFS mounts (including possibly this user's files)
> to remote machines going on.

It seems possible that you might see the same kind of problem if you have
lots of NFS mounts and run into network delays or problems, but this is
only guessing....

--
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)
saus@bijou.media.mit.edu (Mark Sausville) (09/19/90)
In article <2731@cernvax.UUCP> dietrich@cernvax.UUCP (dietrich wiegandt) writes:
> From: dietrich@cernvax.UUCP (dietrich wiegandt)
> Newsgroups: comp.unix.ultrix,comp.unix.internals
> Keywords: gnode table overflow
> Date: 14 Sep 90 10:03:15 GMT
> Followup-To: comp.unix.ultrix
>
> Hardware: VAX8530
> Software: ULTRIX 3.1
>
> Any idea by what manipulation we might have got in such a state? We don't
> know for how long we have been running with a gnode table threatening to
> overflow, so the problem may have arisen days before we noticed it.
>
> We have a rather special environment with many machines from different
> manufacturers and lots of NFS mounts (including possibly this user's files)
> to remote machines going on.
>
> Any hints would be very much appreciated.
>
> Dietrich Wiegandt
> CERN CN Division

A patch to version 3.1 exists (available from the support center)
which purports to fix a panic related to gnodes, NFS mounts and quotas.
We had a problem similar to yours caused by an application which
seemed to corrupt gnodes.
After some investigation (the gnode code is way hairy), we decided
to host the application elsewhere. It is clear to me that it's possible
to wedge gnodes over NFS, but since I couldn't find a simple way
to recreate the problem, I didn't pursue it with DEC.
I would suggest that you try to correlate with some application. In
our case, it was an ethertalk file service serving files which were
NFS mounted on the ethertalk server.
Mark.
Mark Sausville MIT Media Laboratory
617-253-0325 Room E15-354
Fax: 617-258-6264 20 Ames Street
saus@media-lab.media.mit.edu Cambridge, MA 02139
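
The per-user tally from the original post, and the suggestion to correlate
the growth with some application, could be sketched roughly as follows. This
is only an illustration under stated assumptions: the thread does not show
the column layout of pstat -i, so the sketch assumes the owner uid of each
in-use gnode has already been extracted into a plain list, one uid per
entry. The function names, the uid values and the earlier snapshot are
hypothetical; only the figures 1392 and 875 come from the thread.

```python
from collections import Counter


def tally_by_owner(uids):
    """Count in-use table entries per owner uid (one uid per entry)."""
    return Counter(uids)


def share(count, table_size):
    """Fraction of the whole table held by `count` entries."""
    return count / table_size


def growth(before, after):
    """Owners whose entry count rose between two snapshots, largest rise first."""
    deltas = {uid: after[uid] - before.get(uid, 0) for uid in after}
    risers = [(uid, d) for uid, d in deltas.items() if d > 0]
    return sorted(risers, key=lambda pair: -pair[1])


TABLE_SIZE = 1392   # table size reported in the thread
SUSPECT_UID = 4711  # hypothetical uid standing in for the unnamed user

# Later snapshot: 875 entries for one user (from the thread), plus a few
# made-up entries for another uid so that the tally has something to compare.
later = tally_by_owner([SUSPECT_UID] * 875 + [100] * 5)

# Earlier snapshot: entirely made up, to show how two samples are compared.
earlier = tally_by_owner([SUSPECT_UID] * 300 + [100] * 5)

suspect_share = share(later[SUSPECT_UID], TABLE_SIZE)
left_for_others = TABLE_SIZE - later[SUSPECT_UID]
risers = growth(earlier, later)

print(f"uid {SUSPECT_UID} holds {suspect_share:.1%} of the table")
print(f"{left_for_others} entries left for everyone else")
print("growing owners:", risers)
```

Taking such a tally at regular intervals and noting which applications were
running when one owner's count jumps is one possible way to do the
correlation suggested above; it does not by itself explain why the entries
are never released.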