[comp.bugs.sys5] Lockd is broken.

khh@root.co.uk (Keith Holder) (01/14/91)

I am trying to put diskless workstation support into V.4 and have come across
a problem with the NFS locking daemon, lockd. It seems that when a client
crashes and then reboots, the server does not release the locks previously 
held by the client. The problem shows up because the server still replies to
the client on the old connection. The client, which is listening on the newly
established connection, never sees the reply that the file is still locked, so
it issues a new request after a timeout period. This continues until the
server fills its own internal lock table.
	Looking into the problem I have noticed some anomalies. The code
for dealing with a client crash/recovery is missing (compared to SunOS 4.0).
Actually, I am being polite here: the code has been removed, with some vague
comment about the kernel handling all the locks.
The locking code (both kernel and lockd) uses the `sysid' field of the lock
data structure. Does anyone know what this should be set to? Currently it
contains the address of a data buffer, which changes at random on each
repeated lock request.
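
To make the sysid question concrete, here is a minimal sketch in C of the
behaviour one would expect, assuming that sysid is meant to be a stable
per-client identifier. That is an assumption, not something confirmed by the
sources. Every name in it (struct lock_entry, struct lock_table, lock_add,
lock_release_sysid, lock_count, MAX_LOCKS) is hypothetical and is not taken
from the V.4 or SunOS code, and it does no conflict checking between
different owners.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOCKS 8   /* hypothetical table size, kept small for illustration */

/* Hypothetical lock record; only the idea of a sysid field is from the post. */
struct lock_entry {
    int  in_use;
    long sysid;   /* ASSUMPTION: stable identifier for the client machine */
    long pid;     /* process id on the client */
    long start;   /* start of the locked byte range */
    long len;     /* length of the locked byte range */
};

struct lock_table {
    struct lock_entry slots[MAX_LOCKS];
};

/*
 * Record a lock. A request that matches an existing entry exactly is
 * treated as a retransmission and does not take a new slot.
 * Returns 0 on success, -1 if the table is full.
 */
int lock_add(struct lock_table *t, long sysid, long pid, long start, long len)
{
    struct lock_entry *free_slot = NULL;
    size_t i;

    assert(t != NULL);
    for (i = 0; i < MAX_LOCKS; i++) {
        struct lock_entry *e = &t->slots[i];
        if (!e->in_use) {
            if (free_slot == NULL)
                free_slot = e;      /* remember the first free slot */
            continue;
        }
        if (e->sysid == sysid && e->pid == pid &&
            e->start == start && e->len == len)
            return 0;               /* same owner, same range: a retry */
    }
    if (free_slot == NULL)
        return -1;                  /* table full */

    free_slot->in_use = 1;
    free_slot->sysid = sysid;
    free_slot->pid = pid;
    free_slot->start = start;
    free_slot->len = len;
    return 0;
}

/*
 * Drop every lock owned by one client, as a server would want to do on
 * learning that the client has rebooted. Returns the number released.
 */
int lock_release_sysid(struct lock_table *t, long sysid)
{
    int released = 0;
    size_t i;

    assert(t != NULL);
    for (i = 0; i < MAX_LOCKS; i++) {
        if (t->slots[i].in_use && t->slots[i].sysid == sysid) {
            t->slots[i].in_use = 0;
            released++;
        }
    }
    return released;
}

/* Count the entries currently held in the table. */
int lock_count(const struct lock_table *t)
{
    int n = 0;
    size_t i;

    assert(t != NULL);
    for (i = 0; i < MAX_LOCKS; i++)
        n += t->slots[i].in_use ? 1 : 0;
    return n;
}
```

With a stable sysid, a retried request matches the existing entry and a
single lock_release_sysid() call clears a rebooted client's locks. If the
sysid is different on every request, as with the buffer address described
above, neither match can happen: each retry takes a fresh slot until
lock_add() reports that the table is full.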
	The reason I need this to work is that /sbin/mount and the file
system specific mount programs all create and use lock files.
-------------------------------------------
Keith Holder, UniSoft Ltd (khh@root44.co.uk)