[comp.bugs.4bsd] Shared file descriptors

gil@taux01.UUCP ( Gil Shwed) (05/30/89)

Description:

	Sharing of file descriptors between parent and child processes does
not work correctly. When two or more processes read simultaneously
using the same file descriptor (as results from fork()),
the file offset is not incremented properly.
As a result, the same blocks are read by several processes,
and other blocks are skipped.

This bug exists on *ALL* the Berkeley-derived systems I've checked,
which include: VAXen, ALL Suns, Sequent DYNIX, Gould, CCI TAHOE, and others.

Repeat-By:
	Compile the following program:
	main() {
		char buf[8192];
		int n;

		fork();
		while((n = read(0, buf, 8192)) > 0)
			write(1, buf, n);
		exit(0);
	}
	Then, run:

		a.out < /vmunix > out

	Now, run:
		cmp /vmunix out

	They should be the same, but...

Explanation:

	The routine rwuio() in the file sys/sys_generic.c has the following
	structure:

	count = uio->uio_resid;		/* Take count to read/write */
	uio->uio_offset = fp->f_offset;	/* (a) Offset in file */
	.
	.
					/* (b) perform actual i/o */
	u.u_error = (*fp->f_ops->fo_rw)(fp, rw, uio);
	u.u_r.r_val1 = count - uio->uio_resid;
	fp->f_offset += u.u_r.r_val1;	/* (c) Update file offset */


	When the first process enters this routine, it tries to read
	from position 0 (a), enters the file-type-specific function (ino_rw),
	and blocks until the block is brought in from disk.
	Then the second process gets the CPU and performs the same
	operation: it enters read()->rwuio() and takes the file offset (a),
	which has *NOT* been modified to reflect the first read.
	This results in reading the same block twice!
	Furthermore, when the two processes get their (same) block,
	they update the file offset (c) *TWICE*.
	So... the next read will skip a block.

	In the example, the output file will contain the following
	data blocks (numbered by their block number in the original file):
		0, 0, 2, 2, ...., n, n
	Instead of:
		0, 1, 2, 3, ...., n-1, n


Fix:
	The problem should be fixed by locking the inode before
	taking the offset (This is the way SystemV does it).
	This is something of an architectural problem, since rwuio()
	is not supposed to know the file type and/or recognize
	ILOCK().
	So, I leave the right to fix this bug to the people who
	wrote it that way in the first place...


-- Have Fun!
-- Gil Shwed

Or one of the following:
gil%taux01@nsc.com
gil@hujifh.bitnet
gil@humus.huji.ac.il
gil@batata.bitnet

gwyn@smoke.BRL.MIL (Doug Gwyn) (05/30/89)

In article <1770@taux01.UUCP> gil%taux01@nsc.COM (Gil Shwed) writes:
>		fork();
>		while((n = read(0, buf, 8192)) > 0)
>			write(1, buf, n);

This program has a race condition between the two processes,
even in the absence of kernel bugs.

>	The problem should be fixed by locking the inode before
>	taking the offset (This is the way SystemV does it).

Yeah, it's a problem, isn't it.  We begin to see why genuine
concurrent programming constructs are useful.

jfh@rpp386.Dallas.TX.US (John F. Haugh II) (05/31/89)

In article <1770@taux01.UUCP> gil%taux01@nsc.COM (Gil Shwed) writes:
>	main() {
>		char buf[8192];
>		int n;
>
>		fork();
>		while((n = read(0, buf, 8192)) > 0)
>			write(1, buf, n);
>		exit(0);
>	}
>	Then, run:
>
>		a.out < /vmunix > out
>
>	Now, run:
>		cmp /vmunix out
>
>	They should be the same, but...

No, they shouldn't.

You are assuming the read-write pair is atomic, which it isn't.  It
would be quite possible for either process to step in between a
read and write and copy another block.  A possible scenario is

proc 1:READ()  ... wait ...  WRITE()
proc 2:       READ() WRITE()

In this situation, input block 0 would be output block 1, and so
on.

I'll leave the more pathological cases to your imagination.  This
does not appear to contradict the claim that there is a bug, but
it does question your methods ...
-- 
John F. Haugh II                        +-Button of the Week Club:-------------
VoiceNet: (512) 832-8832   Data: -8835  | "AIX is a three letter word,
InterNet: jfh@rpp386.Cactus.Org         |  and it's BLUE."
UucpNet : <backbone>!bigtex!rpp386!jfh  +--------------------------------------