[comp.protocols.appletalk] CAP6.0 -- incompatible lock daemons?

lance@luna.dpl.scg.hac.com (Lance Telepnev) (05/15/91)

Is anyone out there having problems with CAP6.0 running on 4.1.1?

I have it running on a sun3 running 4.1.1 and it seems to work just fine
till someone tries to access a file on a mounted partition that is not local
to the machine AND THAT machine is NOT running 4.1.1, rather 4.0.3.
The mac hangs forever forcing me to reboot the mac.  The macintosh typically
hangs after about the same amount of time while trying to copy/access either
a large file or a small one.

Does that mean that I have to upgrade all the machines to 4.1.1 before
everything will work correctly?  Or is there some sort of work around for this?

I suspect a problem with incompatible lock daemons between operating systems.
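
A rough way to check that suspicion from the shell is to see whether both
machines even register the locking services with the portmapper. This is only
a sketch: the function name and host names are made up for illustration, and
it assumes "rpcinfo -p" lists RPC program numbers in the first column
(100021 for the lock manager, 100024 for the status monitor).

```shell
# Hypothetical helper: reads `rpcinfo -p` output on stdin and succeeds
# only if both the lock manager (program 100021, nlockmgr) and the
# status monitor (program 100024, status) are registered.
has_lock_services() {
    awk '$1 == 100021 { lockd = 1 }
         $1 == 100024 { statd = 1 }
         END { exit !(lockd && statd) }'
}

# Usage (host names are placeholders):
#   rpcinfo -p nfs-server | has_lock_services || echo "lockd/statd missing"
#   rpcinfo -p cap-host   | has_lock_services || echo "lockd/statd missing"
```

This only shows that the daemons are registered, not that the two versions
actually interoperate.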

Any comments or suggestions would be greatly appreciated!

		-Lance Telepnev
*********************************************************************
| Lance Telepnev     | Hughes Aircraft Company      |    // Amiga   |
| lance@luna.hac.com | Space & Communications Group |  \X/ there is |
| ph: 213-414-6225   | P.O. Box 92919               | no substitute |
| pg: 213-352-1611   | Los Angeles, CA  90009-2919  |               |
*********************************************************************

cwilson@clapton.austek.oz (Chris Wilson) (05/15/91)

In article <14881@hacgate.UUCP> lance@luna.dpl.scg.hac.com (Lance Telepnev) writes:
>
>Is anyone out there having problems with CAP6.0 running on 4.1.1?
>
>I have it running on a sun3 running 4.1.1 and it seems to work just fine
>till someone tries to access a file on a mounted partition that is not local
>to the machine AND THAT machine is NOT running 4.1.1, rather 4.0.3.
>The mac hangs forever forcing me to reboot the mac.  The macintosh typically
>hangs after about the same amount of time while trying to copy/access either
>a large file or a small one.
>
Probably the server machine doesn't do the correct sort of locking
using lockd. The fix is to comment out the following define in m4.setup:

# lockf - "afp: byte range locking using unistd.h"
define([X_LOCKF],1)

I had a similar problem where the NFS server was a machine running an
oldish version of Ultrix and doing that fixed the problem. You can
then run gen.makes, recompile and see if it works.
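
The edit described above could be scripted roughly as follows. This is a
sketch, not part of CAP: the function name is invented here, and it assumes
that a leading "#" starts a comment in m4.setup (as the "# lockf - ..." line
suggests) and that the define appears exactly as quoted.

```shell
# Hypothetical helper: comment out the X_LOCKF define in a CAP m4.setup
# file, keeping the original as <file>.orig.
disable_lockf() {
    setup=${1:-m4.setup}
    cp "$setup" "$setup.orig" || return 1
    # Prefix the define with "# "; every other line is copied unchanged.
    sed 's/^define(\[X_LOCKF],1)/# &/' "$setup.orig" > "$setup"
}

# Usage, from the directory holding m4.setup:
#   disable_lockf m4.setup
#   ...then run gen.makes and recompile as described above.
```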

Chris
ACSnet:	cwilson@austek.oz	Internet:  cwilson@austek.oz.au

vturner@nmsu.EDU (Turner) (05/15/91)

We had the same trouble, but noticed it from the Sun side.  Whenever anyone
mounted their filespace (again, non-local, older than SunOS 4.1.1), we would get
blasted with RPC errors on the console of the remote Sun.  Our solution was to
apply Sun patch 100075-06 to the affected machines, which fixed the problem for
us.  Patching CAP seems like a better idea, and since some admins on our net
are still having problems (they're too busy to patch their OSes), I'll
try the CAP change suggested.  In case that doesn't work, I have
included the README file for the Sun patch.

Hope this helps,

Vaughan
                     VaughAn Turner     Internet: vturner@nmsu.edu
     Networking/Workstation Support     Box 30001, Dept. 3AT
    Computer Center, Networking/WSC     Las Cruces, New Mexico
        New Mexico State University     88003-0001
               Bitnet: vturner@nmsu     UUCP: ucbvax!nmsu.edu!vturner
  Work: (505) 646-4244     FAX: (505) 646-5278      Home: (505) 522-3653

     Home Address: 1115 Larry Drive     Las Cruces, New Mexico 88001-5457

"...the first rule of engineering [is] to work with Earth's natural forces,
never against them."
                                                     "Earth"  by David Brin

----included file follows----
Patch-ID# 100075-06
Keywords: lockd, rpc.lockd, rpc.statd, file locking
Synopsis: lockd problems in 4.1.1, 4.1 and 4.0.3
Date: 7/Feb/91

SunOS release: 4.1.1, 4.1, 4.0.3, 4.0.3c
 
Unbundled Product:
 
Topic: rpc.lockd jumbo patch
 
BugId's fixed with this patch: 1044565 1045700 1046001 1045996 1045995

Architectures for which this patch is available: sun4 sun4c Sun3 and Sun3x

Problem Description:

	PROBLEMS FIXED BY 100075-06 PATCH

	i) Fixed problems where locks were getting lost on a heavily
	loaded system, particularly when using shared locks or test
	lock calls.

	ii) Fixed problem where pc-nfs applications were failing with
	"rpc.lockd: unable to unlock a lock" and "rpc.lockd: unable to 
	set a lock."

	iii) Fixed problem with automatic upgrade and downgrade of 
	locks.

	iv) Fixed problem with client reboot (L1 A) and locks not being
	recovered after that.

	v) Fixed problem with signal interrupting lock calls and wrong
	error code returned.

	vi) Fixed problem with infinite retry of lock on unlinked files.

	vii) Fixed problem with local blocking shared locks not being
	granted a lock when one is available.

	viii) Fixed problem with u-area overwrite when doing test lock.

	ix) Fixed problem where messages like "klm_lockmgr: unlock denied?!"
	and "lock-manager: RPC error: .." appeared under normal
	operations.

	x) Additionally this patch allows the kernel lockf debugging code
	to be turned on and off dynamically by setting/unsetting the
        variable lock_debug_on using the following command:-

                # adb -w -k /vmunix /dev/mem
                physmem XXX
                lock_debug_on/W 1
                ^D

        PROBLEMS FIXED BY 100075-05 PATCH

	i) Fixed problem with running out of file descriptors and
	getting RPC TIMEOUT errors, seen when running large number
	of diskless clients.

	ii) Fixed problem where the fd and fd structure were not released
	when doing test lock, eventually running out of file
	descriptors. This problem would occur when running WP and
	quitting out of the window.

	iii) Fixed problem where rpc.lockd core dumps after a large
	amount of time when running "fame" application.

	iv) Fixed problem where restarting lockd on a client results
	in the server not being able to communicate with the new lockd
	as it has an old client handle that is associated with a now
	invalid port number.

	v) Fixed problem where upgrade from a read lock to a write lock 
	is allowed when remote read locks are outstanding.


INSTALL:


For 4.1.1 and 4.1:

Rename the original files before installing the patches
 
 mv /sys/sunX/OBJ/kern_descrip.o /sys/sunX/OBJ/kern_descrip.o.FCS
 mv /sys/sunX/OBJ/klm_lockmgr.o /sys/sunX/OBJ/klm_lockmgr.o.FCS
 mv /sys/sunX/OBJ/ufs_lockf.o /sys/sunX/OBJ/ufs_lockf.o.FCS
 mv /usr/etc/rpc.lockd /usr/etc/rpc.lockd.FCS
 

      Place the new "*.o" files in the OBJ directory /sys/sunX/OBJ
      for your "arch -k" type and SunOS release.

         example: 
		from the command line type:  arch -k 
		if the return was sun3x
		AND your SunOS release is 4.1 (can be checked in /etc/motd)
		cp 4.1/sun3x/*.o /sys/sun3x/OBJ/
	end example.

      Place the new rpc.lockd in /usr/etc
      chown root /usr/etc/rpc.lockd ; chgrp staff /usr/etc/rpc.lockd
      chmod 755 /usr/etc/rpc.lockd

      Rebuild and install a new kernel and reboot.



For 4.0.3:


Rename the original files before installing the patches
 
 mv /sys/sunX/OBJ/kern_descrip.o /sys/sunX/OBJ/kern_descrip.o.FCS
 mv /sys/sunX/OBJ/klm_lockmgr.o /sys/sunX/OBJ/klm_lockmgr.o.FCS
 mv /sys/sunX/OBJ/ufs_lockf.o /sys/sunX/OBJ/ufs_lockf.o.FCS
 mv /sys/sunX/OBJ/vfs_io.o /sys/sunX/OBJ/vfs_io.o.FCS
 mv /sys/sunX/OBJ/klm_kprot.o /sys/sunX/OBJ/klm_kprot.o.FCS
 mv /sys/sunX/OBJ/ufs_vnodeops.o /sys/sunX/OBJ/ufs_vnodeops.o.FCS
 mv /usr/etc/rpc.lockd /usr/etc/rpc.lockd.FCS
 

      Place the new "*.o" files in OBJ directory in /sys/sunX/OBJ
      Place the new rpc.lockd in /usr/etc
      chown root /usr/etc/rpc.lockd ; chgrp staff /usr/etc/rpc.lockd
      chmod 755 /usr/etc/rpc.lockd
      Add the following line to /sys/sunX/conf/files

ufs/ufs_lockf.c   standard

      Rebuild and install a new kernel and reboot.
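
The rename step in both install procedures above can be scripted. The
following is a minimal sketch and not part of the Sun patch: the function
name is made up, and it simply refuses to proceed if a file is missing or
a .FCS backup already exists (for instance from an earlier patch revision).

```shell
# Hypothetical helper: rename each named file to <name>.FCS.
save_fcs() {
    # First pass: verify everything before touching anything.
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
        if [ -f "$f.FCS" ]; then
            echo "backup already exists: $f.FCS" >&2
            return 1
        fi
    done
    # Second pass: do the renames.
    for f in "$@"; do
        mv "$f" "$f.FCS" || return 1
    done
}

# Usage for the 4.1.1/4.1 file list, with ARCH taken from `arch -k`:
#   ARCH=`arch -k`
#   save_fcs /sys/$ARCH/OBJ/kern_descrip.o /sys/$ARCH/OBJ/klm_lockmgr.o \
#            /sys/$ARCH/OBJ/ufs_lockf.o /usr/etc/rpc.lockd
```

Checking every file first means a failure leaves the system untouched
rather than half renamed.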

paul@cs.ed.ac.uk (Paul Anderson) (05/17/91)

In article <14881@hacgate.UUCP>, lance@luna.dpl.scg.hac.com (Lance Telepnev) writes:
> 
> Is anyone out there having problems with CAP6.0 running on 4.1.1?
> I have it running on a sun3 running 4.1.1 and it seems to work just fine
> till someone tries to access a file on a mounted partition that is not local
> ........
> Does that mean that I have to upgrade all the machines to 4.1.1 before
> everything will work correctly?  Or is there some sort of work around for this?
> 
> I suspect a problem with incompatible lock daemons between operating systems.

Try "broken" rather than "incompatible". Flock had lots of problems
under SunOS until recently. I had to disable all locking in CAP to run it
successfully under 4.0.3. I believe that 4.1.1 contains lots of locking fixes
and things should work now, but I haven't been brave enough to put the
locking back in my version of CAP yet.

You might be able to get a patch tape from Sun, but the answer is probably
an upgrade to 4.1.1.

-- 
Paul Anderson                      JANET: paul@uk.ac.ed.lfcs
LFCS, Dept. of Computer Science    UUCP:  ..!mcvax!ukc!lfcs!paul	
University of Edinburgh            ARPA:  paul%lfcs.ed.ac.uk@nsfnet-relay.ac.uk
Edinburgh EH9 3JZ, UK.             Tel:   031-650-5193