[comp.sys.sun] SunOS bug list/4.1 problems with locking

kdenning@ksun.naitc.com (07/24/90)

A couple of things that I'd like some answers to:

1) We've run into a really strange problem with file locks.  From time to
   time we'll see messages of the form "fcntl: no record locks available"
   followed (usually) by "rpc.lockd: out of lock".  At the time this occurs
   there is nothing unusual running on the system.  This is a shared services
   machine, and serves email and the automount master file databases for our
   entire network.  Needless to say, these messages are not nice to see at
   all!  The shared directories, however, are not normally written into by
   the client systems, thus I don't believe that the clients have anything to
   do with this directly.

2) Some programs inexplicably drop core, including sendmail (!) and smail3
   (!!).  There are references to this being environment-length sensitive in
   some other newsgroups; does anyone have hard evidence on this, or even
   better yet, a fix?

Sun's support has been less than stellar on these issues; we would like to
be able to feel good about the systems, but right now my 386/ix machine is
more reliable!

Secondly, we (and I'm sure lots of other people) would REALLY like to have
a list of fixed bugs in a nice compendium somewhere.  I don't care how
it's distributed, electronically or otherwise, but the patches and the
bugs they repair would be most appreciated.  It appears that these are not
normally distributed to users.

Thanks for the assistance!

Karl Denninger	AC Nielsen
kdenning@ksun.naitc.com
(708) 317-3285

casper@fwi.uva.nl (Casper H.S. Dik) (07/25/90)

In article <10194@brazos.Rice.edu>, kdenning@ksun.naitc.com writes:

|> 1) We've run into a really strange problem with file locks.  From time to
|>    time we'll see messages of the form "fcntl: no record locks available"
|>    followed (usually) by "rpc.lockd: out of lock".

I've seen this problem too and am about to file a bug report. If you have
source to the program that uses the locks, there's an easy workaround.

The bug is caused (presumably) by the lock client code.  If you test for
the existence of a nonexistent lock with fcntl/lockf, rpc.lockd never
relinquishes the file descriptor referring to the 'locked' file.

The workaround is to do a lockf(fd,F_ULOCK,size) after a
lockf(fd,F_TEST,size) returns 0.  Internally, i.e. in libc, lockf(F_TEST)
is translated to an equivalent call to fcntl (F_GETLK).

If you don't have the source there are some things you can do:

a) Complain, complain, COMPLAIN.
b) Write a wrapper around rpc.lockd (with the real binary renamed to
   rpc.lockd.org) to increase the soft limit on file descriptors to 256:
       #!/bin/csh -f
       limit desc 256
       exec /usr/etc/rpc.lockd.org
c) Try the following:
   1. Use ofiles/fstat to find the inodes of the files kept open by rpc.lockd.
   2. Use ncheck to find the filenames.
   3. Use the program below (on an NFS client) compiled with -DZAPLOCK and the
      files found in step 2 as arguments. This will cause rpc.lockd to forget
      about those files.
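
For anyone who would rather do the wrapper from (b) in C than in csh, here is
a minimal sketch.  It assumes a system with getrlimit/setrlimit and
RLIMIT_NOFILE; the function names are invented for this example.  Unlike
"limit desc 256" it only ever raises the soft limit, never lowers it.

```c
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: make sure the soft limit on open descriptors is at
 * least `wanted`, capped at the hard limit.  Returns 0 on success, -1 on
 * failure.
 */
int raise_fd_soft_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
        return -1;
    if (rl.rlim_cur >= wanted)
        return 0;       /* already high enough, leave it alone */

    /* The soft limit may not exceed the hard limit. */
    rl.rlim_cur = (wanted > rl.rlim_max) ? rl.rlim_max : wanted;
    return setrlimit(RLIMIT_NOFILE, &rl);
}

/*
 * Hypothetical helper: raise the limit, then replace this process with the
 * real daemon.  Returns -1 only if something failed; on success execv()
 * does not return.
 */
int exec_with_fd_limit(const char *path, char *const argv[], rlim_t wanted)
{
    if (raise_fd_soft_limit(wanted) < 0)
        return -1;
    execv(path, argv);
    return -1;          /* reached only when execv() itself failed */
}
```

A wrapper's main() would then just call
exec_with_fd_limit("/usr/etc/rpc.lockd.org", argv, 256) and report an error
if it returns.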

This is a program to demonstrate the bug:

#include <errno.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

main (argc,argv)
int argc;
char **argv;

{
	int fd;

	if (argc < 2) {
		(void) fprintf(stderr,"Usage: %s filename ...\n", *argv);
		exit(1);
	}
	while(*++argv) {
		if ((fd = open(*argv,O_RDWR)) < 0) {
			perror(*argv);
			continue;
		}
		/*
		 * This is what triggers the bug: testing for a lock on a
		 * file that has none makes rpc.lockd keep the file open.
		 */
		if (lockf(fd,F_TEST,0) < 0) {
			perror("TESTLK");
			close(fd);
			continue;
		}
#if ZAPLOCK
		/* workaround: an explicit unlock makes rpc.lockd forget the file */
		if (lockf(fd,F_ULOCK,0) < 0) {
			perror("UNLOCK");
		}
#endif
		close(fd);
	}
	exit(0);
}
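
Since lockf(F_TEST) ends up as fcntl(F_GETLK) inside libc, programs that call
fcntl directly should be able to hit (and work around) the same thing.  A
sketch of the equivalent test and unlock, assuming the POSIX struct flock
interface; the two function names are made up for the example:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask whether another process holds a lock that would
 * conflict with an exclusive lock from the current offset to end of file.
 * Returns 1 if such a lock exists, 0 if the region is free, -1 on error.
 */
int region_is_locked(int fd)
{
    struct flock fl;

    fl.l_type = F_WRLCK;        /* "could I take an exclusive lock?" */
    fl.l_whence = SEEK_CUR;     /* lockf works from the current offset */
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 = through end of file */

    if (fcntl(fd, F_GETLK, &fl) < 0)
        return -1;
    return fl.l_type != F_UNLCK;    /* F_UNLCK means nothing conflicts */
}

/*
 * Hypothetical helper: the fcntl counterpart of lockf(fd,F_ULOCK,0), i.e.
 * the extra unlock used as the workaround.  Returns 0 or -1.
 */
int release_region(int fd)
{
    struct flock fl;

    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_CUR;
    fl.l_start = 0;
    fl.l_len = 0;

    return fcntl(fd, F_SETLK, &fl);
}
```

Calling release_region() after region_is_locked() returns 0 mirrors the
-DZAPLOCK path of the program above.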


|> Secondly, we (and I'm sure lots of other people) would REALLY like to have
|> a list of fixed bugs in a nice compendium somewhere.

There's the customer-distributed buglist.  As a non-American user I have
no access to the Online Bugs Database.  (Wish it were accessible over the
Internet, like rlogin odb.sun.com -l guest)

My main problem is tmpfs. This really is a (last minute?) hack.  (Can't
use lockf/flock on a tmpfs, can't make pipes on tmpfs, sticky bit on
directories won't work (it was invented for /tmp!))

Casper H.S. Dik				VCP/HIP: +31205922022
University of Amsterdam     |		casper@fwi.uva.nl
The Netherlands             |		casper%fwi.uva.nl@hp4nl.nluug.nl