nowlin@ihuxy.UUCP (06/04/87)
I've run into a problem with the Fread() gemdos macro. Using Megamax I wrote a piece of code to buffer line-by-line input. The code reads files in 8K blocks. There's no problem as long as the call to Fread() requests a block smaller than the rest of the file being read. For example, if I read 1000 bytes from a 3000 byte file, Fread() returns 1000 and there are exactly 1000 characters stored in the buffer specified. If there are only 300 bytes in the file and I request 1000, Fread() will return 301 and the 301st byte is a ^Z. Since ^Z is the EOF character in gemdos, it appears that Fread() is actually reading one character past/into the end of file and sticking that character at the end of the input buffer.

Please don't reply with workaround solutions to this. I've already worked around it. I just wanted to see if anybody else has run into this problem and if they've found a real solution for it.

In most cases this isn't much of a problem, but it's something to watch out for when comparing files. It looks like the files are actually starting to diverge from each other when in fact one has ended. These are two entirely different things, and this bug makes the latter trickier to detect.

I'm also not sure if this is only related to Megamax or is a bug in the gemdos call itself. Megamax seems to think it's gemdos. I'm not sure, since Megamax has to provide the gemdos() library call that's used to invoke the lower level routines, and that could be where the problem is occurring. Has anyone seen this problem with any of the other C development systems?

Jerry Nowlin
(...!ihnp4!ihuxy!nowlin)
apratt@atari.UUCP (Allan Pratt) (06/05/87)
in article <1987@ihuxy.ATT.COM>, nowlin@ihuxy.ATT.COM (Jerry Nowlin) says:
> If
> there are only 300 bytes in the file and I request 1000 Fread() will return
> 301 and the 301st byte is a ^Z. Since ^Z is the EOF character in gemdos It
> appears that Fread() is actually reading one character past/into the end of
> file and sticking that character at the end of the input buffer.

^Z doesn't mean EOF to GEMDOS: it means EOF to some braindamaged programs. Programs like Mince, for instance. If there is a ^Z at the end of your file, it is considered a character just like any other character to GEMDOS. It's up to the library to interpret ^Z as EOF.

There is a remark in the GEMDOS documentation to the effect that "Some applications use ^Z to mark EOF in text files." The intent of this was that you should write your programs to be liberal: when dealing with a text file, don't be surprised to see ^Z, but don't be surprised not to. Personally, I prefer no ^Z, because you already know exactly how many bytes there are in a file. But as far as GEMDOS is concerned, files are untyped: there is no concept of a "text" file versus a "binary file" -- all files are just a collection of bytes.

Some history: ^Z came to mean EOF because in CP/M, files were allocated in clusters of multiples of 128 bytes. In the directory entry for a file, there was no indication of *exactly* how many bytes were in the file. This didn't matter for programs (binary files), but it did matter for text files. So ^Z was used as the end-of-text marker in text files. MuShDOS preserved this braindamage, even though it didn't need it because it DOES keep track of EXACTLY how many bytes were written to a file. GEMDOS doesn't even document any special treatment of ^Z, but some programs use it to mark EOF in text files (e.g. Mince).

To address Mr. Nowlin's remarks specifically: how do you know there are 300 bytes left in the file? Maybe there were 300 text bytes left and one EOF byte (^Z).
Here's one way to find out how many bytes are REALLY left in the file:

    long bytes_left_in_file(fd)
    int fd;
    {
        long pos = Fseek(0L,fd,1);      /* get current offset */
        long end = Fseek(0L,fd,2);      /* seek to end, get offset */
        Fseek(pos,fd,0);                /* seek back to pos */
        return end-pos;
    }

/----------------------------------------------\
| Opinions expressed above do not necessarily  |  -- Allan Pratt, Atari Corp.
| reflect those of Atari Corp. or anyone else. |     ...lll-lcc!atari!apratt
\----------------------------------------------/
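For readers who want to try the same seek-to-end trick away from an ST, here is a minimal sketch using standard C stdio in place of the GEMDOS Fseek() call. The function name bytes_left_in_stream is made up for illustration, and the sketch assumes a seekable stream opened in binary mode; it is an analogue of the routine above, not a GEMDOS binding.

```c
#include <stdio.h>

/* Portable stdio analogue of bytes_left_in_file() (hypothetical name).
 * Assumes fp is seekable and was opened in binary mode ("rb" or similar).
 * Returns the number of bytes between the current position and the end
 * of the stream, or -1 if any seek/tell call fails. */
long bytes_left_in_stream(FILE *fp)
{
    long pos, end;

    pos = ftell(fp);                          /* get current offset */
    if (pos < 0L || fseek(fp, 0L, SEEK_END) != 0)
        return -1L;

    end = ftell(fp);                          /* offset of end of file */
    if (end < 0L || fseek(fp, pos, SEEK_SET) != 0)
        return -1L;                           /* could not seek back */

    return end - pos;
}
```

If this reports 301 where 300 were expected, the extra byte is really stored in the file and was not invented by the read call.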
apratt@atari.UUCP (Allan Pratt) (06/05/87)
in article <749@atari.UUCP>, apratt@atari.UUCP (Allan Pratt) says:
>
> ^Z doesn't mean EOF to GEMDOS: it means EOF to some braindamaged programs.
> Programs like Mince, for instance.
>

Okay, sorry, I didn't mean braindamaged programs; just those written for CP/M and preserving that silly (my opinion) convention through MS-DOS and GEMDOS.
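The "be liberal" advice about ^Z can be sketched in a few lines of plain C. This is only one possible reading of it: the helper name text_length and the choice to stop at the first ^Z (rather than only a trailing one) are assumptions for illustration, and it should be applied only to data you already know is text.

```c
#include <stddef.h>

#define CTRL_Z 0x1A   /* ASCII SUB, the CP/M-style end-of-text marker */

/* Return the number of text bytes in buf[0..n-1]: everything before the
 * first ^Z if one is present, otherwise all n bytes. The buffer itself
 * is not modified. (Hypothetical helper, not part of any GEMDOS library.) */
size_t text_length(const char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (buf[i] == CTRL_Z)
            return i;         /* text ends just before the marker */
    }
    return n;                 /* no marker: the whole buffer is text */
}
```

When comparing two text files, running each buffer through a helper like this before the comparison makes a file with a trailing ^Z and one without it compare as equal.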
braner@batcomputer.UUCP (06/07/87)
When I buffer I/O myself (see the source code for my uu*code, MORE, etc.; oops, those are in AL, not C! In C: my version of microEMACS) I first read the file attributes (using Setdta() and Fsfirst()) and isolate the file size. Later, when reading the file in chunks (of 4.5 or 9 or 18K) I keep track of the amount remaining to be read. In the last call to Fread() I only ask for the amount that I know is there!

(Why all the trouble? - 'cause this is _much_ faster than using the byte-by-byte UNIX-like fopen(),fread()...)

- Moshe Braner
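The size-first approach described in this post can be sketched roughly as follows. This is not Moshe Braner's code: it swaps in standard C stdio (fseek/ftell/fread) for Setdta(), Fsfirst() and Fread() so it can be run anywhere, and the names read_in_chunks, CHUNK and the handle callback are assumptions made up for the example.

```c
#include <stdio.h>

#define CHUNK 8192L   /* illustrative block size; the post uses 4.5K to 18K */

/* Read a whole seekable binary stream in CHUNK-sized blocks, never asking
 * for more bytes than are known to remain. If handle is non-NULL it is
 * called once per block. Returns the total bytes read, or -1 on error. */
long read_in_chunks(FILE *fp, void (*handle)(const char *, size_t))
{
    static char buf[CHUNK];
    long remaining, total = 0L;

    /* Learn the file size up front (stdio stand-in for Fsfirst()). */
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    remaining = ftell(fp);
    if (remaining < 0L || fseek(fp, 0L, SEEK_SET) != 0)
        return -1L;

    while (remaining > 0L) {
        /* The last request is trimmed to exactly what is left. */
        size_t want = (size_t)(remaining < CHUNK ? remaining : CHUNK);
        size_t got = fread(buf, 1, want, fp);

        if (got != want)
            return -1L;       /* short read: size changed or I/O error */
        if (handle)
            handle(buf, got);
        remaining -= (long)got;
        total += (long)got;
    }
    return total;
}
```

Because the final request never exceeds the known remainder, the caller does not depend on how the read call behaves at end of file, which is the point of the bookkeeping.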