ji@cs.columbia.edu (John Ioannidis) (03/16/89)
What's wrong with the following program fragment? fd = open("/dev/xy0g", 2); /* fd gets set to 3 */ ... if (close(fd) < 0) perror("closing"); Nothing, one would say. But on a 3/280, with a Xylogics 451 and a NEC drive hanging off of it, that close() returns -1 and errno is set to 1 (EPERM, "Not owner"). This doesn't happen if I open the corresponding character-special file, and it doesn't happen for a SCSI drive (shoeboxes connected to 3/60s) and it doesn't happen under 3.4. Furthermore, examining the kernel structures reveals that the file has been correctly closed, fd's, file and vnode tables, the works, have been correctly updated, and just somehow somewhere the carry bit is being set. A quick (<1 day) inspection of the sources revealed nothing. Has anybody else experienced this? Is it a known bug? adTHANKSvance, /ji #include <appropriate disclaimers> In-Real-Life: John Ioannidis E-Mail-To: <ji@cs.columbia.edu> (preferred), or <ji@walkuere.altair.fr> P-Mail-To: GIP-Altair, Dom de Voluceau BP105, Rocquencourt 78153 Le Chesnay, FR V-Mail-To: +33 1 39635227, +33 1 39635417 [[ Someone reported a similar problem in a device driver that he wrote. See v7n153. --wnl ]]
berliner@uunet.uu.net (Brian Berliner) (03/30/89)
In Sun-Spots Digest v7n201, John Ioannidis <ji@cs.columbia.edu> writes: >What's wrong with the following program fragment? > > fd = open("/dev/xy0g", 2); /* fd gets set to 3 */ > ... > if (close(fd) < 0) > perror("closing"); > >Nothing, one would say. But on a 3/280, with a Xylogics 451 and a NEC >drive hanging off of it, that close() returns -1 and errno is set to 1 >(EPERM, "Not owner"). The problem can be found in the snode layer. This is a layer in the VFS structure that encapsulates device files (FIFO, BLK, and CHR) so that they "appear" as if they are always local, regardless of their original locality (ufs or nfs). This is a very subtle bug in how the snode layer interacts with the ufs layer (local file system) or nfs layer (which interacts with the ufs layer on the server, potentially). The bug is that when you do the final close(), vno_close() does a VN_RELE() which (in this case) calls spec_inactive(). spec_inactive() calls spec_fsync(). spec_fsync() gets the current times of the underlying device and updates them to be the times that are stored in the snode. VOP_SETATTR() is called to update the times, but little does anyone know, if VOP_SETATTR() finally gets down to ufs_setattr() (as it does for most nfs implementations), ufs_setattr() will not bother updating the times for the file if the current user is not the owner of the file or the super-user. And the real catch is that to determine this, a call to suser() may be made which can set u.u_error if the caller is not the super-user. The crux of the bug is that u.u_error has already been updated in vno_close() before the call to VN_RELE(), which can change it out from under vno_close()! It doesn't say so in the note, but I am assuming that you are running this as yourself (not the super-user) and the device inode is owned by someone other than yourself (probably root). There are really a couple of bugs here: 1. vno_close() should not assume that the VN_RELE() call will not muck with u.u_error 2. block device times are not being properly updated on the disk when opened by someone other than the owner or the super-user 3. spec_fsync() must "know" that the current user must own the file or be the super-user in order to update the times correctly for the file. I did not bother fixing (1), but the fix is obvious (save the error around the VN_RELE() call). I fixed numbers (2) and (3) by having spec_fsync() optionally muck the current user ID to match that of the file to sync out the new times. For a source fix, try this. Add these lines just before the spec_fsync() function in specfs/spec_vnodeops.c: > /* > * Changes the uid in the credentials structure to match that > * of the file when updating the file times. > */ > int spec_ownertimes = 1; And, change the VOP_SETATTR() call in spec_fsync() to the following code: > /* > * XXX - To update the times, one must be the > * current owner of the device. However, to > * get to this point, the user must have had > * permission to open the device, so if necessary, > * we munge the uid in a copy of the credentials > * temporarily to allow the times to be properly > * updated. > */ > if (spec_ownertimes && vatmp->va_uid != cred->cr_uid) { > register struct ucred *ncred; > > ncred = crdup(cred); > ncred->cr_uid = vatmp->va_uid; > VOP_SETATTR(realvp, vap, ncred); > crfree(ncred); > } else { > VOP_SETATTR(realvp, vap, cred); > } This allows the change to be "patched" out by setting spec_ownertimes to 0. Good luck! Brian Berliner Prisma, Inc. UUCP: uunet!prisma!berliner Internet: berliner@prisma.com