siva@bally.Bally.COM (Siva Chelliah) (11/07/90)
I have an interesting (?) question.

Program 1 :

    #include "stdio.h"
    char buf[7];
    main ()
    {
      int i=1;
      FILE *fp;
      fp = fopen("temp.dat","ab");
      for (i = 0;i<5;i++){
        sprintf(buf,"line %d",i);
        fwrite(buf,sizeof(buf),1,fp);
      }
      fclose(fp);
    }

temp.dat :

    line 0line 1line 2line 3line 4

fread should update the pointer, so that I should be able to do a read or
a write after that. Right?

Program 2 :

    #include "stdio.h"
    char buf[7] = "line 4";
    char tbuf[7];
    main ()
    {
      int i=1;
      FILE *fp;
      fp = fopen("temp.dat","r+b");
      fread(tbuf,sizeof(tbuf),1,fp);
      printf("tbuf = %s\n",tbuf);   /* this worked. I got line 0 */
      fwrite(buf,sizeof(buf),1,fp);
      fclose(fp);
    }

temp.dat :

    line 0line 1line 2line 3line 4line 0line 4

Can you believe this? This happened when I used IBM RT, AIX 2.0.
When I used Microsoft C 5.1 (DOS 3.3), nothing changed in temp.dat.
When I used fseek before fwrite, it worked. I do not remember reading
anywhere that I should do a fseek before fread/fwrite. Is that a bug in the
compiler or in my head? Please help.

Siva
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/07/90)
In article <402@bally.Bally.COM>, siva@bally.Bally.COM (Siva Chelliah) writes:
> I do not remember reading
> anywhere that I should do a fseek before fread/ fwrite. Is that a bug in the
> compiler or in my head ? Please help.

It's in your head. Basically the model is that a stream can be in one of
three states:
    undetermined
    reading
    writing
When you read from a stream, it should not be in 'writing' state.
When you write to a stream, it should not be in 'reading' state.
fseek() and rewind() put a stream back into 'undetermined' state.
This has been the case at least since V7 stdio, maybe longer.
Check your documentation again; it may be under "fopen" where the
meaning of r+ is explained.

--
The problem about real life is that moving one's knight to QB3
may always be replied to with a lob across the net. --Alasdair Macintyre.
chris@mimsy.umd.edu (Chris Torek) (11/07/90)
In article <402@bally.Bally.COM> siva@bally.Bally.COM (Siva Chelliah) writes:
>When I used fseek before fwrite , it worked. I do not remember reading
>anywhere that I should do a fseek before fread/ fwrite. Is that a bug in the
>compiler or in my head ?

Put that way, the answer has to be `in your head'. :-)

ANSI standard X3.159-1989 says that you (the programmer) must call fseek
or rewind or fsetpos before switching from reading to writing or vice
versa. (In fact, the wording refers to `a successful seek operation',
suggesting that not only must you call fseek or fsetpos or rewind, but
also that if the seek fails, you may not change I/O direction.)

Incidentally:

>char tbuf[7];
> fread(tbuf,sizeof(tbuf),1,fp);
> printf("tbuf = %s\n",tbuf);

This is a bug waiting to happen. (In this test program, of course, the
fread returns 1, having read 7 bytes, of which the last is a '\0'
character, so it is okay here, sort of.) It is dangerous to print a
`string' that has been read in via fread or read, since neither is
guaranteed to store a '\0' at the end. (It is also dangerous to ignore
return values, but. . . .)
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
gwyn@smoke.brl.mil (Doug Gwyn) (11/07/90)
In article <402@bally.Bally.COM> siva@bally.Bally.COM (Siva Chelliah) writes:
- fp = fopen("temp.dat","r+b");
- fread(tbuf,sizeof(tbuf),1,fp);
- fwrite(buf,sizeof(buf),1,fp);
-line 0line 1line 2line 3line 4line 0line 4
-Can you believe this?
Sure.
-When I used fseek before fwrite, it worked. I do not remember reading
-anywhere that I should do a fseek before fread/fwrite.
The exact requirement is spelled out quite explicitly in the C standard,
section 4.9.5.3. I won't bore you with the technical reasons, but this
was not an oversight nor necessarily sloppiness on the part of your C vendor.
hagins@dg-rtp.dg.com (Jody Hagins) (11/08/90)
In article <402@bally.Bally.COM>, siva@bally.Bally.COM (Siva Chelliah) writes:
|>
|> I have an interesting (?) question.

[ program 1 deleted ]

|> fread should update the pointer, so that I should be able to do a read or
|> a write after that. Right ?

Yes.

|> Program 2 :
|>
|> #include "stdio.h"
|> char buf[7] = "line 4";
|> char tbuf[7];
|> main ()
|> {
|> int i=1;
|> FILE *fp;
|> fp = fopen("temp.dat","r+b");
|> fread(tbuf,sizeof(tbuf),1,fp);
|> printf("tbuf = %s\n",tbuf); /* this worked . I got line 0 */
|> fwrite(buf,sizeof(buf),1,fp);
|> fclose(fp);
|> }
|>
|> temp.dat :
|>
|> line 0line 1line 2line 3line 4line 0line 4
|>
|> Can you believe this? This happened when I used IBM RT, AIX 2.0
|> When I used Microsoft C 5.1(DOS 3.3 ) , nothing changed in temp.dat .
|> When I used fseek before fwrite , it worked. I do not remember reading
|> anywhere that I should do a fseek before fread/ fwrite. Is that a bug in the
|> compiler or in my head ? Please help.

Would you believe, in your head?

The following is a quote from "C A Reference Manual" by Harbison and Steele
pertaining to fopen().

"When a file is opened for update ('+' is present in the type string),
the resulting stream may be used for both input and output. However,
an output operation may not be followed by an input operation without
an intervening call to fseek() or rewind(), and an input operation may
not be followed by an output operation without an intervening call to
fseek() or rewind() or an input operation that encounters end-of-file"

Hope this helps!

|>
|> Siva

-Jody
hagins@gamecock.rtp.dg.com
kpv@ulysses.att.com (Phong Vo[drew]) (11/08/90)
In article <14384@smoke.brl.mil>, gwyn@smoke.brl.mil (Doug Gwyn) writes:
- In article <402@bally.Bally.COM> siva@bally.Bally.COM (Siva Chelliah) writes:
- - fp = fopen("temp.dat","r+b");
- - fread(tbuf,sizeof(tbuf),1,fp);
- - fwrite(buf,sizeof(buf),1,fp);
- -line 0line 1line 2line 3line 4line 0line 4
- -Can you believe this?
-
- Sure.
-
- -When I used fseek before fwrite, it worked. I do not remember reading
- -anywhere that I should do a fseek before fread/fwrite.
-
- The exact requirement is spelled out quite explicitly in the C standard,
- section 4.9.5.3. I won't bore you with the technical reasons, but this
- was not an oversight nor necessarily sloppiness on the part of your C vendor.

However, one may argue that the sloppiness is in the C standard.
The standard, in this case, basically just documents the behavior of
stdio without considering that this is a bad design that arose from
a bad implementation. It is ugly to have to call fseek before
switching modes. There are other uglinesses (e.g., inconsistent
interfaces) in stdio that could have been avoided too. One may say that
the standard failed in that respect. This is sad considering that the
standard did go a long way to invent a new C language.

Phong Vo
dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (11/12/90)
In <13992@ulysses.att.com> kpv@ulysses.att.com (Phong Vo[drew]) writes:

     The standard, in this case, basically just documents the behavior of
     stdio without considering that this is a bad design that arose from
     a bad implementation. It is ugly to have to call fseek before
     switching modes.

I believe the requirement to call fseek (etc.) when switching arises
out of the need to make stdio fast. Due to buffering, alternating
reads and writes can confuse each other. The only way the stdio
library could automatically protect you against this would be for it to
explicitly test for internal state before every read and write. E.g.,
within fread, we should have:

     if (my_state == DOING_WRITE) {
        .. resync buffer ..
        my_state = DOING_READ;
        .. rest of fread ..
     }

I suppose we should consider ourselves lucky we are even allowed to do
both reads and writes on the same data stream:

     I had the blues because I had no shoes
     Until upon the street
     I met a man whose feet were stuck in Pascal.
--
Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP: oliveb!cirrusl!dhesi
cechew@bruce.cs.monash.OZ.AU (Earl Chew) (11/13/90)
In <2677@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
>In <13992@ulysses.att.com> kpv@ulysses.att.com (Phong Vo[drew]) writes:
> The standard, in this case, basically just documents the behavior of
> stdio without considering that this is a bad design that arose from
> a bad implementation. It is ugly to have to call fseek before
> switching modes.

I think that this is true.

>I believe the requirement to call fseek (etc.) when switching arises
>out of the need to make stdio fast. Due to buffering, alternating

This is not the case. The main obstacle to switching between reads and
writes is:

1. the behaviour of early implementations of stdio
2. subsequent casting of (1) in concrete by ANSI-C

There is a need to make stdio fast --- but this does not prohibit arbitrary
switching between read and write modes.

Most implementations of stdio buffer data between calls to read(2) and
write(2). Thus the cost of making a system call is only incurred every
BUFSIZ bytes. Intermediate data is transferred directly to the buffer.

The main impediment to switching modes in many implementations of stdio is
the use of a single buffer pointer (usually _ptr). This single pointer
functions as a read pointer when reading and a write pointer when writing,
allowing quick access to the buffer. Calls to a buffer fill or flush
function are only made when the pointer reaches some high water mark. Thus
(getc(fp); putc(0, fp)) or (putc(0, fp); getc(fp)) will misbehave,
especially when the pointer is in the middle of the buffer.

It is possible to perform an automatic mode switch if *two* pointers are
used: a reading pointer and a writing pointer.

>reads and writes can confuse each other. The only way the stdio
>library could automatically protect you against this would be for it to
>explicitly test for internal state before every read and write. E.g.,
>within fread, we sould have:
>     if (my_state == DOING_WRITE) {
>        .. resync buffer ..
>        my_state = DOING_READ;
>        .. rest of fread ..
>     }

Some implementations of stdio do this anyway to prevent users from hanging
themselves:

     if (my_state == DOING_WRITE) {
        ... error ...
     }
     ... rest of fread ...

In these cases, there already is a guard on the fread() code, so replacing
`... error ...' with `... resync buffer ...' is possible without loss in
performance for the normal case.

I am unsure whether ANSI-C prohibits stdio implementations from automatic
switching, but it is clear that if such a feature were to be implemented, its
use would make the application non-conforming.

In any event, use of the separate read and write pointers allows runtime
checking to ensure that an explicit switch is made between read and write
modes, even if automatic switching is not implemented (ie it is possible to
trap {getc(fp); putc(0, fp);} or {putc(0, fp); getc(fp);}).

Earl
--
Earl Chew, Dept of Computer Science, Monash University, Australia 3168
EMAIL: cechew@bruce.cs.monash.edu.au PHONE: 03 5655447 FAX: 03 5655146
----------------------------------------------------------------------
chris@mimsy.umd.edu (Chris Torek) (11/13/90)
In article <2677@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com
(Rahul Dhesi) writes:
>I believe the requirement to call fseek (etc.) when switching arises
>out of the need to make stdio fast. Due to buffering, alternating
>reads and writes can confuse each other. The only way the stdio
>library could automatically protect you against this would be for it to
>explicitly test for internal state before every read and write.

Although this is (effectively) the reason the V7 Unix stdio and all its
descendents (and, presumably, whatever predecessor eventually became
the USG stdio and thence the System V stdio, though I have not looked
closer than determining that the SVR3 stdio was absolutely horrid
inside) ... where was I? Oh yes, the reason most Unix stdios do not
check. Right.

Your average out-of-the-box Unix stdio has, for efficiency, two
particular state variables in each FILE. One is a pointer into a
current buffer, and the other is a count. For `getc' operations, if
the count is positive, one decrements it and fetches through the
pointer, which is then incremented. For `putc' operations, if the
count is positive, one decrements it and stores through the pointer,
which is then incremented. This means that buffered I/O, which
typically stores somewhere between 512 and 65536 characters in each
buffer, can handle somewhere between 511 and 65535 `calls' to `getc' or
`putc' within an inline macro expansion. Unfortunately, it also means
that

    fp = fopen("foo", "w+");
    ...
    putc(' ', fp);
    c = getc(fp);

tends to `get' a random value (whatever happened to be in the current
buffer).

This particular `feature' is easy to fix without sacrificing
efficiency. Instead of carrying one count and one pointer, stdio can
carry *two* counts (and, as it turns out, one pointer). The current
read or write state is then stored implicitly in the two counts (as
well as explicitly elsewhere, of course). The following extracts from
my <stdio.h> should give you the idea.

    /*
     * Stdio buffers.
     */
    struct __sbuf {
        unsigned char *_base;
        int _size;
    };

    /*
     * Stdio state variables.
     *
     * The following always hold:
     *
     *  if (_flags&(__SLBF|__SWR)) == (__SLBF|__SWR),
     *      _lbfsize is -_bf._size, else _lbfsize is 0
     *  if _flags&__SRD, _w is 0
     *  if _flags&__SWR, _r is 0
     *
     * This ensures that the getc and putc macros (or inline functions) never
     * try to write or read from a file that is in `read' or `write' mode.
     * (Moreover, they can, and do, automatically switch from read mode to
     * write mode, and back, on "r+" and "w+" files.)
     *
     * _lbfsize is used only to make the inline line-buffered output stream
     * code as compact as possible.
     *
     * _ub, _up, and _ur are used when ungetc() pushes back more characters
     * than fit in the current _bf, or when ungetc() pushes back a character
     * that does not match the previous one in _bf. When this happens,
     * _ub._base becomes non-nil (i.e., a stream has ungetc() data iff
     * _ub._base!=NULL) and _up and _ur save the current values of _p and _r.
     */
    typedef struct __sFILE {
        unsigned char *_p;  /* current position in (some) buffer */
        int _r;             /* read space left for getc() */
        int _w;             /* write space left for putc() */
        short _flags;       /* flags, below; this FILE is free if 0 */
        short _file;        /* fileno, if Unix descriptor, else -1 */
        struct __sbuf _bf;  /* the buffer (at least 1 byte, if !NULL) */
        int _lbfsize;       /* 0 or -_bf._size, for inline putc */

        /* operations */
        void *_cookie;      /* cookie passed to io functions */
    #if __STDC__ || c_plusplus
        int (*_read)(void *_cookie, char *_buf, int _n);
        int (*_write)(void *_cookie, const char *_buf, int _n);
        fpos_t (*_seek)(void *_cookie, fpos_t _offset, int _whence);
        int (*_close)(void *_cookie);
    #else
        int (*_read)();
        int (*_write)();
        fpos_t (*_seek)();
        int (*_close)();
    #endif

        /* separate buffer for long sequences of ungetc() */
        struct __sbuf _ub;  /* ungetc buffer */
        unsigned char *_up; /* saved _p when _p is doing ungetc data */
        int _ur;            /* saved _r when _r is counting ungetc data */

        /* tricks to meet minimum requirements even when malloc() fails */
        unsigned char _ubuf[3]; /* guarantee an ungetc() buffer */
        unsigned char _nbuf[1]; /* guarantee a getc() buffer */

        /* separate buffer for fgetline() when line crosses buffer boundary */
        struct __sbuf _lb;  /* buffer for fgetline() */

        /* Unix stdio files get aligned to block boundaries on fseek() */
        int _blksize;       /* stat.st_blksize (may be != _bf._size) */
        int _offset;        /* current lseek offset */
    } FILE;

    extern FILE __sF[];

    #define __SLBF 0x0001   /* line buffered */
    #define __SNBF 0x0002   /* unbuffered */
    #define __SRD  0x0004   /* OK to read */
    #define __SWR  0x0008   /* OK to write */
        /* RD and WR are never simultaneously asserted */
    #define __SRW  0x0010   /* open for reading & writing */
    #define __SEOF 0x0020   /* found EOF */
    #define __SERR 0x0040   /* found error */
    #define __SMBF 0x0080   /* _buf is from malloc */
    #define __SAPP 0x0100   /* fdopen()ed in append mode */
    #define __SSTR 0x0200   /* this is an sprintf/snprintf string */
    #define __SOPT 0x0400   /* do fseek() optimisation */
    #define __SNPT 0x0800   /* do not do fseek() optimisation */
    #define __SOFF 0x1000   /* set iff _offset is in fact correct */
    #define __SMOD 0x2000   /* true => fgetline modified _p text */

    [much deleted]

    /*
     * The __sfoo macros are here so that we can
     * define function versions in the C library.
     */
    #define __sgetc(p) (--(p)->_r < 0 ? __srget(p) : (int)(*(p)->_p++))
    #ifdef __GNUC__
    static __inline int __sputc(int _c, FILE *_p) {
        if (--_p->_w >= 0 || (_p->_w >= _p->_lbfsize && (char)_c != '\n'))
            return (*_p->_p++ = _c);
        else
            return (__swbuf(_c, _p));
    }
    #else
    /*
     * This has been tuned to generate reasonable code on the vax using pcc
     */
    #define __sputc(c, p) \
        (--(p)->_w < 0 ? \
            (p)->_w >= (p)->_lbfsize ? \
                (*(p)->_p = (c)), *(p)->_p != '\n' ? \
                    (int)*(p)->_p++ : \
                    __swbuf('\n', p) : \
                __swbuf((int)(c), p) : \
            (*(p)->_p = (c), (int)*(p)->_p++))
    #endif
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
karl@ima.isc.com (Karl Heuer) (11/14/90)
In article <3337@bruce.cs.monash.OZ.AU> cechew@bruce.cs.monash.OZ.AU (Earl Chew) writes:
>The main obstacle to switching between reads and writes is:
>1. the behaviour of early implementations of stdio
>2. subsequent casting of (1) in concrete by ANSI-C

X3J11 did not freeze this behavior. They declined to correct it (and quite
properly so, if there was no existing practice), but the fix is a valid
conforming extension. It would even be possible for some other standard,
like POSIX, to require it.

>I am unsure whether ANSI-C prohibits stdio implementations from automatic
>switching, but it clear that if such a feature were to be implemented, its
>use would make the application non-conforming.

True (as does, say, the use of "isatty()"). But if the vendors add it now,
it might be required behavior by the time C-2001 is done.

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
jon@jonlab.UUCP (Jon H. LaBadie) (11/16/90)
Despite all the discussion on this topic, I do not see the need for
the programmer to indicate a switch from reading to writing and vice
versa. I mean I know it is needed, but I do not understand why.

If in the stdio buffer I have the following;

    Mary had a big sheep.  Supercalifragalisticexpalidocious ...
    ^

With my pointer (read in this case) on the 'M', after I fread 23 bytes,
so my buffer and pointer are such:

    Mary had a big sheep.  Supercalifragalisticexpalidocious ...
                           ^

what is wrong with fwrite'ing "Jack and Jill" on top of Super...?
I.e. what is critical about returning to some ground zero state before
making a transition?

Jon
--
Jon LaBadie
{att, princeton, bcr, attmail!auxnj}!jonlab!jon
les@chinet.chi.il.us (Leslie Mikesell) (11/16/90)
In article <27633@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>[...] though I have not looked
>closer than determining that the SVR3 stdio was absolutely horrid
>inside)

It seems real strange to me that when you use setvbuf, the first putc()
will trigger a write() (on AT&T 3b2 & 386 SysVr3.2 anyway). This kind
of defeats the purpose of requesting the buffer, doesn't it?

Anyway, the nicest thing about stdio is that you are not obligated to
use it. The only thing difficult at all to do using your own buffering
is an equivalent to fprintf(). Has anyone built something like sprintf
that can be limited to a fixed buffer size and maintains state so you
can pick up where you quit on the last pass? It might return either
the number of characters placed in the buffer (if they all fit) or a
negative number indicating the buffer was filled and you need to call
again to get the rest.

Les Mikesell
les@chinet.chi.il.us
cechew@bruce.cs.monash.OZ.AU (Earl Chew) (11/17/90)
In <880@jonlab.UUCP> jon@jonlab.UUCP (Jon H. LaBadie) writes:
>If in the stdio buffer I have the following;
>    Mary had a big sheep.  Supercalifragalisticexpalidocious ...
>    ^
>With my pointer (read in this case) on the 'M', after I fread 23 bytes,
>so my buffer and pointer are such:
>    Mary had a big sheep.  Supercalifragalisticexpalidocious ...
>                           ^
>what is wrong with fwrite'ing "Jack and Jill" on top of Super...?
>I.e. what is critical returning to some ground zero state before
>making a transition?

With `traditional' stdio implementations, the FILE will still be in `READING'
mode --- despite the fact that the fwrite() (or putc, etc) may `apparently'
succeed. However, when the buffer is exhausted, you will find that no write(2)
is performed (ie the fact that the buffer is dirty is not recorded) because of
the `READING' mode, and you will lose the data you wrote.

Earl
--
Earl Chew, Dept of Computer Science, Monash University, Australia 3168
EMAIL: cechew@bruce.cs.monash.edu.au PHONE: 03 5655447 FAX: 03 5655146
----------------------------------------------------------------------