osyjm@caesar.cs.montana.edu (Jaye Mathisen) (04/23/91)
HP9000/350SRX, new install of HP-UX 7.0.5.

I'm seeing frequent (every few days) hangs of my machine with the following
on the console:

Failed kernel selftest. Failures:
Cannot allocate file system buffers

And it repeats itself every 2 minutes. I don't have a wildly different
kernel config from when I was running 6.0. I do use the -async option
in my exports file for NFS, but I doubt that has anything to do with it.

Any suggestions on kernel params to change? Or a clue as to what might be
going on?
--
Jaye Mathisen,sysmgr 410 Roberts Hall,Dept. of Computer Science
Montana State University,Bozeman MT 59717 PHONE: (406) 994-{4780,3931}
bob@hpuerca.HP.COM (Bob Poulsen) (04/26/91)
>>I'm seeing frequent (every few days) hangs of my machine with the following
>>on the console:
>>Failed kernel selftest. Failures:
>>Cannot allocate file system buffers
>>And it repeats itself every 2 minutes. I don't have a wildly different
>>kernel config from when I was running 6.0. I do use the -async option
>>in my exports file for NFS, but I doubt that has anything to do with it.
>>Any suggestions on kernel params to change? Or a clue as to what might be
>>going on?
>>--

A couple of things to try:

1. Increase the value of the kernel parameter 'nbuf' (number of file
   system buffers).

2. If this is a cluster server, increase the number of cluster server
   processes. This can be accomplished at boot time with /etc/clusterconf
   or at run time with '/etc/csp -a'.

I have no specific data on HP-UX 7.05 and NFS.

>> Jaye Mathisen,sysmgr 410 Roberts Hall,Dept. of Computer Science
>> Montana State University,Bozeman MT 59717 PHONE: (406) 994-{4780,3931}

Bob Poulsen
Hewlett-Packard
North American Response Center
Normal disclaimer.
mike@hpcupt1.cup.hp.com (Michael Saboff) (04/30/91)
> I'm seeing frequent (every few days) hangs of my machine with the following
> on the console:
>
> Failed kernel selftest. Failures:
> Cannot allocate file system buffers
>
> And it repeats itself every 2 minutes. I don't have a wildly different
> kernel config from when I was running 6.0. I do use the -async option
> in my exports file for NFS, but I doubt that has anything to do with it.
>
> Any suggestions on kernel params to change? Or a clue as to what might be
> going on?

Sorry for the length, but I want to explain what is happening as well as
how to fix it.

The problem is with the diskless subsystem and its tuneables. This message
is output as part of the regular (every 2 minutes) self test of the
diskless subsystem. There is a small pool of file system buffers that is
used for diskless requests. This pool is different from the normal "nbuf"
pool in that allocation from this pool can be done from an interrupt
handler. What's happening is that the selftest code cannot allocate one of
these buffers. On your system this pool is too small, so diskless requests
are not being handled, and that is the hang you see.

The kernel tuneable in question is called "dskless_fsbufs". This tuneable
is usually set to be the same as "serving_array_size". If your system is a
server, this is probably 200. I suspect that your system is a cluster
client and that serving_array_size is very small (0-10). Change
serving_array_size to be 20 or so (either directly or via other kernel
tuneables it depends on, like maxusers). Check /etc/master for default
settings.

Michael Saboff
Member of Filesystem Project - HP-UX Kernel Lab
mike@hpda.hp.com
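
The pool exhaustion described in the last reply can be illustrated with a toy model. This is only a sketch in Python, not HP-UX kernel code: the class FixedBufferPool and the functions selftest and run are hypothetical names invented for the illustration, and the idea that each outstanding request holds exactly one buffer is a simplifying assumption.

```python
class FixedBufferPool:
    """Toy model of a small, fixed-size buffer pool (not HP-UX kernel code)."""

    def __init__(self, size):
        self.size = size      # plays the role of a tuneable such as dskless_fsbufs
        self.in_use = 0

    def allocate(self):
        # Never waits: an allocation made from an interrupt handler cannot
        # sleep, so an empty pool is reported as an immediate failure.
        if self.in_use >= self.size:
            return False
        self.in_use += 1
        return True

    def release(self):
        if self.in_use > 0:
            self.in_use -= 1


def selftest(pool):
    """Take one buffer and give it back; report a failure if none is free."""
    if not pool.allocate():
        return "Cannot allocate file system buffers"
    pool.release()
    return "ok"


def run(pool_size, outstanding_requests):
    """Hold one buffer per outstanding request (an assumption), then self-test."""
    pool = FixedBufferPool(pool_size)
    for _ in range(outstanding_requests):
        pool.allocate()
    return selftest(pool)


if __name__ == "__main__":
    print(run(pool_size=10, outstanding_requests=10))
    print(run(pool_size=20, outstanding_requests=10))
```

With ten requests outstanding, a pool of 10 leaves nothing for the self test, while a pool of 20 still has free buffers. That is the reasoning behind raising the pool size rather than nbuf.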