[net.unix-wizards] r+ on fopen.

gwyn%brl-vld@sri-unix.UUCP (09/15/83)

From:      Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

Your anonymous hacker should read the manual; there is nothing wrong
with fopen( , "r+" ).

"When a file is opened for update, both input and output may be done
on the resulting stream.  However, output may not be directly followed
by input without an intervening fseek or rewind, and input may not be
directly followed by output without an intervening fseek, rewind, or an
input operation which encounters end-of-file."
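
The rule quoted above can be exercised with a short C sketch. It is not taken from any poster's code: the function name update_demo, the file contents "abcdef", and the scratch path are assumptions made for illustration. A seek of zero bytes from the current position is enough to separate input from output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Rewrites the middle of `path` in place through one "r+" stream.
 * On success returns 0 and stores the final contents in `out`
 * (which must hold at least 7 bytes); returns -1 on any I/O error.
 * update_demo is a hypothetical name used only for this sketch. */
int update_demo(const char *path, char *out)
{
    char head[4] = "";
    FILE *fp;
    size_t n;

    assert(path != NULL && out != NULL);

    /* Start from a known six-byte file. */
    if ((fp = fopen(path, "w")) == NULL) return -1;
    if (fputs("abcdef", fp) == EOF) { fclose(fp); return -1; }
    if (fclose(fp) == EOF) return -1;

    if ((fp = fopen(path, "r+")) == NULL) return -1;

    if (fread(head, 1, 3, fp) != 3) goto fail;     /* input: "abc"           */
    if (fseek(fp, 0L, SEEK_CUR) != 0) goto fail;   /* input -> output switch */
    if (fwrite("XY", 1, 2, fp) != 2) goto fail;    /* overwrites "de"        */
    if (fseek(fp, 0L, SEEK_SET) != 0) goto fail;   /* output -> input switch */
    n = fread(out, 1, 6, fp);                      /* reread the whole file  */
    out[n] = '\0';
    if (fclose(fp) == EOF) return -1;
    return (n == 6 && strcmp(head, "abc") == 0) ? 0 : -1;

fail:
    fclose(fp);
    return -1;
}
```

On a stdio that follows the manual, a call such as update_demo("rplus_rule.tmp", buf) should leave "abcXYf" in both the file and buf.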

Lepreau%utah-20@sri-unix.UUCP (09/19/83)

From:  Jay Lepreau <Lepreau@utah-20>

This is indeed a long-standing bug, to my knowledge first fixed
by John Demco (ubc-vision!demco), and later this year by Guy Harris.
Both messages are below, as each has info the other doesn't.

>From harpo!decvax!microsof!uw-beave!ubc-visi!demco Thu Feb 17 17:54:32 1983
Subject: Re: read/write same file with one open
Newsgroups: net.lang.c

I have found a bug in the fseek() function of the stdio package as
distributed by Berkeley (3/9/81). It can result in improper flushing with a
stream file which has been opened for reading and writing (by specifying
"r+", "w+", or "a+"). I don't know if the problem exists on System III or V.

First, here is some background information about the stdio package. If a
stream is opened for read/write, a flag bit called _IORW is set in the _iob
structure for the file. Two other flag bits, _IOWRT and _IOREAD, indicate
whether the file is currently being written or read. Changing from reading
to writing or vice versa can only be done after reading an end of file, or
calling fseek() or rewind(). In these cases the package tries to make sure
that neither the _IOWRT nor the _IOREAD bit is set, so you can subsequently
do either a getc() or putc() and the package will enter reading or writing
mode correctly.

The problem is that fseek() called while in read mode can return with the
file still in read mode. Subsequent putc()'s will not be written. A quick
fix is to change

	resync = offset&01;

to

	if (!(iop -> _flag & _IORW))
		resync = offset & 01;
	else
		resync = 0;

Can someone tell me what this "resync" business is all about? It looks like
an attempt to keep the file's buffer aligned on an even byte offset into the
file, but it won't do that in all cases.



>From harpo!seismo!rlgvax!guy Sun Aug 14 22:50:11 1983
Subject: Standard I/O streams open for reading and writing
Newsgroups: net.unix-wizards,net.bugs.4bsd

The Standard I/O library for releases of UNIX from V7 on includes a feature
to permit you to open a stream for reading and writing.  This feature is
documented in USG UNIX (System III, System V).  There is one minor bug in
the V7 implementation, and an additional major bug in the 4.?BSD implementation.

The first bug is that if your "umask" takes away your own read or write
permission, you may not use the "w+" or "a+" modes.  The problem is that
these modes require "stdio" to do a "creat" to create the file, and then to
close and reopen the file in order to have it open for reading and writing
(USG doesn't have to do this, but interestingly enough the S3 "stdio" does
it anyway), so if your umask takes away read or write permissions the reopen
fails.  The fix is to change the lines

	f = creat(file, 0666);
	if (rw && f>=0) {
		close(f);
		f = open(file, 2);
	}

to

	if (rw) {
		int m = umask(0);
		f = creat(file, 0666);
		if (f>=0) {
			close(f);
			f = open(file, 2);
			chmod(file, 0666 & ~m);
		}
		umask(m);
	} else
		f = creat(file, 0666);

in "endopen.c" in V7, and in "fopen.c" and "freopen.c" in 4.?BSD.

The more serious bug is due to the way the 4.?BSD "stdio" handles "fseek".
If you have a stream open for reading and writing, you must do an "fseek"
if you want to switch from reading to writing or vice versa (this is documented
in the USG manuals, so it's not a buried feature).  However, if you do an
"fseek" to an odd-byte boundary on 4.?BSD on a stream where the last operation
was a read, it seeks to the previous byte and does a "getc" to move to the next
byte.  This is not a problem if the stream is open only for reading or only for
writing (I presume it was put in there for efficiency; PDP-11 UNIX moves data
from kernel to user or vice versa much more efficiently if it is on a word
boundary), but it makes it impossible to put an "fseek" after a read and
immediately before a write (because the "fseek" may do a read before you get
a chance to do your write).  The code in "fseek.c" to handle seeks on a
stream on which the last operation was a read has a section which reads:

		if (iop->_flag & _IORW) {
			iop->_ptr = iop->_base;
			iop->_flag &= ~_IOREAD;
		}

A line "resync = 0;" should be added after the "iop->_flag &= ~_IOREAD;".  This
way, "fseek" will behave as it did if the stream is only open for reading, but
will not do the "resync" if it is open for reading and writing.

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

-------

SHAWN%mit-ml@sri-unix.UUCP (09/19/83)

From:  Shawn F. McKay <SHAWN@mit-ml>


I would have a lot of interest in a fix as well. I have been 'told' about
the bug myself, and used 2 sets of 'fopen' calls when I needed to
read/write (or just used 'open/close'). But I would love to be able to
use the 'r+' stuff.

			Thanks,
			  Yours In Hacking(c),
			       -Shawn

(c) CopyRight 1983, All Rights Reserved,

jeff%aids-unix@sri-unix.UUCP (09/19/83)

From:  Jeff Dean <jeff@aids-unix>

No, I am not the anonymous hacker, but there really is a problem with
read/write files under 4.1 bsd.

To see the problem, fopen a file in "r+", fgets a line, fseek to an odd
location, and then fwrite something there (and then fclose the file).
Nothing has been written!  Repeat this, fseeking to an even location, and
now the fwrite works correctly.
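
The steps above can be turned into a small check, sketched here in C. The function name write_after_seek_lands, the file contents, and the marker byte '#' are assumptions chosen for illustration; only the sequence of calls (fopen "r+", fgets, fseek, fwrite, fclose) comes from the description.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if a byte written at `offset` after a read and an fseek
 * reaches the file, 0 if it was lost, -1 on an I/O error.
 * Hypothetical helper, for illustration only. */
int write_after_seek_lands(const char *path, long offset)
{
    static const char text[] = "line one\nline two\n";   /* 18 bytes */
    char line[32];
    FILE *fp;
    int c;

    assert(path != NULL && offset >= 0 && offset < (long)(sizeof text - 1));

    /* Start from a known file. */
    if ((fp = fopen(path, "w")) == NULL) return -1;
    if (fputs(text, fp) == EOF) { fclose(fp); return -1; }
    if (fclose(fp) == EOF) return -1;

    /* The reported sequence: fopen "r+", fgets, fseek, fwrite, fclose. */
    if ((fp = fopen(path, "r+")) == NULL) return -1;
    if (fgets(line, sizeof line, fp) == NULL
        || fseek(fp, offset, SEEK_SET) != 0
        || fwrite("#", 1, 1, fp) != 1) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) == EOF) return -1;

    /* Read the byte back through a fresh stream. */
    if ((fp = fopen(path, "r")) == NULL) return -1;
    c = (fseek(fp, offset, SEEK_SET) == 0) ? getc(fp) : EOF;
    fclose(fp);
    return c == '#';
}
```

Calling it once with an odd offset (say 11) and once with an even one (say 12) gives a quick regression test: a correct stdio returns 1 for both, while the behavior described here would correspond to 0 for the odd offset.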

According to the manual, an fseek is necessary to switch between read and
write.  However, there is a problem with fseek which causes it to
incorrectly switch modes.  Under certain conditions, an fseek to an odd
location causes a seek to ( location - 1 ), followed by a getc (!).
Unfortunately, getc puts the buffer back into read mode, causing fseek to
exit with the buffer still in read mode.

Placing an fflush in the program will not solve the problem.  fseek is
leaving the buffer in read mode; fflush has no effect on a buffer in read
mode.

An obvious solution to this problem is to simply remove the offending code
(all code dealing with the variable "resync").  Is there some reason that
this "resync" code is there? (a PDP11-ism ?)

	jd

alt@BRL.ARPA (09/21/83)

From:  Howard Alt <alt@BRL.ARPA>


It seems that one of our people was trying to read and write
to the same file, seeking back and forth, etc.  It wasn't writing
the file.  I solved the problem by opening 2 file descriptors to the
same file, one for read, the other for append, and rewinding the
append one.  Then, I used ftell on the read descriptor to seek into the
write descriptor and write.  It is a kluge.
A person who wishes to remain unknown did some testing, and here is
what he had to say about it...

------------BEGIN FORWARDED MESSAGE
To: alt hakim
Subject: Seek and you shall die

	It appears that opening files for reading and writing 
simultaneously is not a working feature. I beat on the problem
for a while with a little test program and noted the following
behavior (see /usr/XXX/test.c):

	If one seeks forward an even number of bytes and then
	seeks backwards an even number of bytes and then writes
	on the file, things appear to work correctly. Things
	also seem to work correctly if both seeks are odd. If
	one seek is odd and the other even the write to the file 
	has no effect.

All of this is fairly useless; the program did fairly (random)
bizarre things after only minor modification. I think the bottom
line is: "read/write files are broken (QED)".

				We must consult the wizards,
				   the anonymous hacker...
------------END FORWARDED MESSAGE


Well, has anyone had the same experience?  I really have no desire
to dig through the fopen, fseek, etc code to find out what is wrong...
someone must have fixed it.
Please send me the fix (or even speculate on what it could be)
		Thanks,
				Howard.
	

guy@rlgvax.UUCP (Guy Harris) (09/23/83)

If this was under V7, I'd be somewhat surprised as the differences between
System III "stdio" (where the documentation claims it works) and V7 "stdio"
are minuscule.  Under 4.?BSD, however, the description (odd vs. even seek
offsets) seems to fit a bug in the 4.?BSD "fseek".  The fix follows:

*** /tmp/,RCSt1011421	Fri Sep 23 00:45:21 1983
--- /tmp/,RCSt2011421	Fri Sep 23 00:45:22 1983
***************
*** 36,41
  		if (iop->_flag & _IORW) {
  			iop->_ptr = iop->_base;
  			iop->_flag &= ~_IOREAD;
  		}
  		p = lseek(fileno(iop), offset-resync, ptrname);
  		iop->_cnt = 0;

--- 36,42 -----
  		if (iop->_flag & _IORW) {
  			iop->_ptr = iop->_base;
  			iop->_flag &= ~_IOREAD;
+ 			resync = 0;
  		}
  		p = lseek(fileno(iop), offset-resync, ptrname);
  		iop->_cnt = 0;

The problem is that if you are switching from reading a stream to writing it,
or vice versa, you must do an "fseek" between the last read and the first
write (this is documented in the System III documentation).  However, on a
stream where the last operation was a read, if told to seek to an odd offset
the 4.?BSD "fseek" will seek to that offset - 1 and then read the next
character.  I presume this was done because copies between kernel and user
space on a PDP-11 are more efficient if both buffers are aligned on word
boundaries (so check 2.?BSD for this as well!).  Unfortunately, this means
that the sequence:

	read
	seek to odd boundary
	write

becomes:

	read
	seek to even boundary
	read one character
	write

which violates the rule that an "fseek" must separate reads and writes.  The
fix merely turns off the seek to even boundary and read on streams open for
reading and writing.  Given this fix, I believe that "[rwa]+" on 4.?BSD will
work as reliably as it does on V7 and System III, so everybody stick this
fix in and then we can rewrite our code to use "[rwa]+".
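
The two sequences above can be modeled with a toy state machine in C. This is a sketch, not the 4.?BSD source: the TOY_ flag values, the struct, and every function name are hypothetical, and the rule that a putc in read mode is silently dropped is taken from the symptoms reported in this thread rather than from the library code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag values; the real ones are defined in <stdio.h>. */
enum { TOY_IOREAD = 01, TOY_IOWRT = 02, TOY_IORW = 0200 };

struct toy_stream {
    int  flag;   /* mode bits                 */
    long pos;    /* logical position in file  */
};

/* Reading one character puts the stream into read mode. */
static void toy_getc(struct toy_stream *s)
{
    s->flag |= TOY_IOREAD;
    s->pos++;
}

/* Model assumption: a write lands only if the stream is not in read mode. */
static int toy_putc(struct toy_stream *s)
{
    if (s->flag & TOY_IOREAD)
        return 0;                    /* write is lost */
    s->flag |= TOY_IOWRT;
    s->pos++;
    return 1;
}

/* fseek after a read; `patched` selects whether "resync = 0" is applied. */
static void toy_fseek(struct toy_stream *s, long offset, int patched)
{
    int resync = (int)(offset & 01);

    if (s->flag & TOY_IORW) {
        s->flag &= ~TOY_IOREAD;
        if (patched)
            resync = 0;              /* the one-line fix */
    }
    s->pos = offset - resync;        /* lseek(fd, offset - resync, ...) */
    if (resync)
        toy_getc(s);                 /* re-enters read mode */
}

/* Runs read, seek, write on an update stream; returns 1 if the write lands. */
int toy_write_lands(long offset, int patched)
{
    struct toy_stream s = { TOY_IORW, 0 };

    toy_getc(&s);
    toy_fseek(&s, offset, patched);
    assert(s.pos == offset);         /* the position is right either way */
    return toy_putc(&s);
}
```

In this model toy_write_lands(3, 0) returns 0 and toy_write_lands(3, 1) returns 1, while an even offset such as 4 returns 1 with or without the patch, which matches the odd/even pattern reported earlier in the thread.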

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy