[comp.sys.sun] close

ji@cs.columbia.edu (John Ioannidis) (03/16/89)

What's wrong with the following program fragment?

                fd = open("/dev/xy0g", 2);      /* fd gets set to 3 */
                ...
                if (close(fd) < 0)
                  perror("closing");

Nothing, one would say. But on a 3/280, with a Xylogics 451 and a NEC
drive hanging off of it, that close() returns -1 and errno is set to 1
(EPERM, "Not owner"). This doesn't happen if I open the corresponding
character-special file, and it doesn't happen for a SCSI drive (shoeboxes
connected to 3/60s) and it doesn't happen under 3.4.  Furthermore,
examining the kernel structures reveals that the file has been correctly
closed, fd's, file and vnode tables, the works, have been correctly
updated, and just somehow somewhere the carry bit is being set. A quick
(<1 day) inspection of the sources revealed nothing. Has anybody else
experienced this? Is it a known bug?

adTHANKSvance,

/ji

#include <appropriate disclaimers>

In-Real-Life: John Ioannidis
E-Mail-To: <ji@cs.columbia.edu> (preferred), or <ji@walkuere.altair.fr>
P-Mail-To: GIP-Altair, Dom de Voluceau BP105, Rocquencourt 78153 Le Chesnay, FR
V-Mail-To: +33 1 39635227, +33 1 39635417

[[ Someone reported a similar problem in a device driver that he wrote.
See v7n153.  --wnl ]]

berliner@uunet.uu.net (Brian Berliner) (03/30/89)

In Sun-Spots Digest v7n201, John Ioannidis <ji@cs.columbia.edu> writes:
>What's wrong with the following program fragment?
>
>                fd = open("/dev/xy0g", 2);      /* fd gets set to 3 */
>                ...
>                if (close(fd) < 0)
>                  perror("closing");
>
>Nothing, one would say. But on a 3/280, with a Xylogics 451 and a NEC
>drive hanging off of it, that close() returns -1 and errno is set to 1
>(EPERM, "Not owner").

The problem can be found in the snode layer.  This is a layer in the VFS
structure that encapsulates device files (FIFO, BLK, and CHR) so that they
"appear" as if they are always local, regardless of their original
locality (ufs or nfs).

This is a very subtle bug in how the snode layer interacts with the ufs
layer (local file system) or nfs layer (which interacts with the ufs layer
on the server, potentially).  The bug is that when you do the final
close(), vno_close() does a VN_RELE() which (in this case) calls
spec_inactive().

spec_inactive() calls spec_fsync().  spec_fsync() gets the current times
of the underlying device and updates them to be the times that are stored
in the snode.  VOP_SETATTR() is called to update the times, but little
does anyone know, if VOP_SETATTR() finally gets down to ufs_setattr() (as
it does for most nfs implementations), ufs_setattr() will not bother
updating the times for the file if the current user is not the owner of
the file or the super-user.  And the real catch is that to determine this,
a call to suser() may be made which can set u.u_error if the caller is not
the super-user.

The crux of the bug is that u.u_error has already been updated in
vno_close() before the call to VN_RELE(), which can change it out from
under vno_close()!

It doesn't say so in the note, but I am assuming that you are running this
as yourself (not the super-user) and the device inode is owned by someone
other than yourself (probably root).  There are really a couple of bugs
here:

	1. vno_close() should not assume that the VN_RELE() call
	   will not muck with u.u_error
	2. block device times are not being properly updated on the disk
	   when opened by someone other than the owner or the super-user
	3. spec_fsync() must "know" that the current user must own
	   the file or be the super-user in order to update the times
	   correctly for the file.

I did not bother fixing (1), but the fix is obvious (save the error around
the VN_RELE() call).

I fixed numbers (2) and (3) by having spec_fsync() optionally muck the
current user ID to match that of the file to sync out the new times.

For a source fix, try this.  Add these lines just before the spec_fsync()
function in specfs/spec_vnodeops.c:

> /*
>  * Changes the uid in the credentials structure to match that
>  * of the file when updating the file times.
>  */
> int spec_ownertimes = 1;

And, change the VOP_SETATTR() call in spec_fsync() to the following code:

> 		/*
> 		 * XXX - To update the times, one must be the
> 		 * current owner of the device.  However, to
> 		 * get to this point, the user must have had
> 		 * permission to open the device, so if necessary,
> 		 * we munge the uid in a copy of the credentials
> 		 * temporarily to allow the times to be properly	
> 		 * updated.
> 		 */
> 		if (spec_ownertimes && vatmp->va_uid != cred->cr_uid) {
> 			register struct ucred *ncred;
> 
> 			ncred = crdup(cred);
> 			ncred->cr_uid = vatmp->va_uid;
> 			VOP_SETATTR(realvp, vap, ncred);
> 			crfree(ncred);
> 		} else {
> 			VOP_SETATTR(realvp, vap, cred);
> 		}

This allows the change to be "patched" out by setting spec_ownertimes to 0.

Good luck!

Brian Berliner
Prisma, Inc.
UUCP:		uunet!prisma!berliner
Internet:	berliner@prisma.com