[comp.sys.ibm.pc] Major bug in all

b-davis@utah-cs.UUCP (02/07/87)

; Here is a probable bug (or feature) in MS-DOS.  Does anyone know a
; work around (without closing the first file)?  Would Gordon Letwin at
; Microsoft care to comment?  This bug appears on PC-DOS 2.0, PC-DOS 3.0,
; and MS-DOS 3.1.  (Unix has no problems with this algorithm.)

; The test goes like this:
; 	Create a file with name 'xxx'.  Call this file FD1.
;	Write 80 bytes to FD1.
;	Open the file named 'xxx' a second time.  Call this file FD2.
;		Note that NO errors have happened yet.
;	Try to read 80 bytes from FD2.  No bytes are read.
;		Note that NO error is reported.
;	If in symdeb push to a new shell.  See that the file 'xxx' has
;		been created but has a size of 0.
;	Exit the program.  See that the file 'xxx' is now 80 bytes long.
; If FD1 is closed at POINT A (see source) then FD2 will perform the read.
; If FD1 is closed at POINT B (see source) then the read of FD2 still fails
;	even though the disk has been updated before the read happens.

xtest	segment	para public 'prog'
	assume	cs:xtest,ds:xtest

	org	100h
start:	jmp	main

; int fd1, fd2;
; char buffer[80];
; int a, b;
fd1	dw	0
fd2	dw	0
buffer	db	80 dup (0)
a	dw	0
b	dw	0
xxx	db	"xxx", 0

main:
	; shrink this program's memory block to 4096 paragraphs (64K)
	; so symdeb has room to push a shell
	mov	bx, 4096
	mov	ah, 4ah
	int	21h
	; create a file
	; fd1 = open("xxx", O_TRUNC | O_CREAT | O_RDWR | O_BINARY, 0600);
	mov	dx, offset xxx
	mov	cx, 0
	mov	ah, 3ch
	int	21h
	mov	fd1, ax
	; write out 80 bytes
	; a = write(fd1, buffer, sizeof(buffer));
	mov	bx, fd1
	mov	dx, offset buffer
	mov	cx, size buffer
	mov	ah, 40h
	int	21h
	mov	a, ax
; POINT A
; If these three lines are included then all works well.
;	mov	bx, fd1
;	mov	ah, 3eh
;	int	21h
	; open the file a second time
	; fd2 = open("xxx", O_RDWR | O_BINARY, 0);
	mov	dx, offset xxx
	mov	al, 2
	mov	ah, 3dh
	int	21h
	mov	fd2, ax
; POINT B
; If these three lines are included then the read still fails.
;	mov	bx, fd1
;	mov	ah, 3eh
;	int	21h
	; read the data from the second file
	; b = read(fd2, buffer, sizeof(buffer));
	mov	bx, fd2
	mov	dx, offset buffer
	mov	cx, size buffer
	mov	ah, 3fh
	int	21h
	mov	b, ax
	; Here a = 80 and b = 0.  Shouldn't b = 80 also?
	; return to system
	mov	al, 0
	mov	ah, 4ch
	int	21h
	ret	
xtest	ends
	end	start
;--------------------------- End of Test -----------------------
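
For comparison, the C calls shown in the comments of the listing above can
be run more or less as written on a POSIX system.  What follows is only a
minimal sketch under that assumption: the function name reopen_and_read and
the scratch path argument are made up for illustration, and O_BINARY is
dropped because POSIX has no such flag.  On a Unix system the second read
would be expected to return all 80 bytes.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Runs the create / write / reopen / read sequence from the post against
   `path` (a scratch file chosen by the caller).  Returns the byte count of
   the read through the second descriptor, or -1 if an earlier step fails. */
long reopen_and_read(const char *path)
{
    char buffer[80] = {0};
    int fd1, fd2;
    ssize_t b;

    assert(path != NULL);

    /* fd1 = open("xxx", O_TRUNC | O_CREAT | O_RDWR, 0600); */
    fd1 = open(path, O_TRUNC | O_CREAT | O_RDWR, 0600);
    if (fd1 < 0)
        return -1;

    /* a = write(fd1, buffer, sizeof(buffer)); */
    if (write(fd1, buffer, sizeof buffer) != (ssize_t)sizeof buffer) {
        close(fd1);
        return -1;
    }

    /* fd2 = open("xxx", O_RDWR, 0);  fd1 is deliberately still open. */
    fd2 = open(path, O_RDWR);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    /* b = read(fd2, buffer, sizeof(buffer)); */
    b = read(fd2, buffer, sizeof buffer);

    close(fd2);
    close(fd1);
    unlink(path);               /* remove the scratch file */
    return (long)b;
}
```

Calling reopen_and_read("/tmp/xxx") gives the value the post calls b.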


-- 
Brad Davis	{ihnp4, decvax, seismo}!utah-cs!b-davis	
		b-davis@utah-cs.ARPA
One drunk driver can ruin your whole day.

rap@oliveb.UUCP (02/07/87)

In article <4274@utah-cs.UUCP> b-davis@utah-cs.UUCP (Brad Davis) writes:
|; POINT A
|; If these three lines are included then all works well.
|;	mov	bx, fd1
|;	mov	ah, 3eh
|;	int	21h
|	; open the file a second time
|	; fd2 = open("xxx", O_RDWR | O_BINARY, 0);
|	mov	dx, offset xxx
|	mov	al, 2
|	mov	ah, 3dh
|	int	21h
|	mov	fd2, ax
|; POINT B
|; If these three lines are included then the read still fails.
|;	mov	bx, fd1
|;	mov	ah, 3eh
|;	int	21h
|	; read the data from the second file

Not a bug.  You just didn't account  for  DOS  buffers.  The
file  may be written to at POINT A, but the disk isn't.  When
you issue the function 3eh call at POINT A the  buffers  are
flushed  and  the file closed.  The disk gets written when
the buffers are flushed.

I think that if you look at your example now you can see why
POINT A behaves differently from POINT B.
-- 

					Robert A. Pease
{hplabs|fortune|idi|ihnp4|tolerant|allegra|glacier|olhqma}!oliveb!olivej!rap

rassilon@mit-eddie.UUCP (02/08/87)

In article <4274@utah-cs.UUCP> b-davis@utah-cs.UUCP (Brad Davis) writes:
>Here is a probable bug (or feature) in MS-DOS.  Does anyone know a
>work around (without closing the first file)?  Would Gordon Letwin at
>Microsoft care to comment?  This bug appears on PC-DOS 2.0, PC-DOS 3.0,
>and MS-DOS 3.1.  (Unix has no problems with this algorithm.)
>
>The test goes like this:
>	Create a file with name 'xxx'.  Call this file FD1.
>	Write 80 bytes to FD1.
>	Open the file named 'xxx' a second time.  Call this file FD2.
>		Note that NO errors have happened yet.
>	Try to read 80 bytes from FD2.  No bytes are read.
>		Note that NO error is reported.
>	If in symdeb push to a new shell.  See that the file 'xxx' has
>		been created but has a size of 0.
>	Exit the program.  See that the file 'xxx' is now 80 bytes long.

I don't see any problem with this.  Remember, in most computers files are
not written to unless the buffer is full (which I don't think 80 bytes
fills) or you close the file.  From your code it appears that you are trying
to read the file before closing FD1.  Yes, the file was created when you
originally opened it.  This is necessary to ensure that there is room on
the disk for at least one block and that the disk/directory you are
attempting to write to actually exists.

One way of getting around this without closing the file is to do a 'flush'.
This, in essence, means checking to see if the buffer is empty and, if not,
writing its contents to the disk.  How to force output in assembly I don't
know.  In TURBO Pascal I simply use the command FLUSH(filename).

					-- Rassilon
					   (rassilon@eddie.mit.edu)

kneller@ucsfcgl.UUCP (02/08/87)

In article <4274@utah-cs.UUCP> b-davis@utah-cs.UUCP (Brad Davis) writes:
>; Here is a probable bug (or feature) in MS-DOS.  Does anyone know a
>; work around (without closing the first file)?  Would Gordon Letwin at
>; Microsoft care to comment?  This bug appears on PC-DOS 2.0, PC-DOS 3.0,
>; and MS-DOS 3.1.  (Unix has no problems with this algorithm.)
>
>; The test goes like this:
>; 	Create a file with name 'xxx'.  Call this file FD1.
>;	Write 80 bytes to FD1.
>;	Open the file named 'xxx' a second time.  Call this file FD2.
>;		Note that NO errors have happened yet.
Right -- the file exists in the directory entry.  No problem here.

>;	Try to read 80 bytes from FD2.  No bytes are read.
>;		Note that NO error is reported.
What do you mean NO error?  You ask for 80 bytes and read 0.  That says
the read failed.  Read returns the number of bytes read.

>;	If in symdeb push to a new shell.  See that the file 'xxx' has
>;		been created but has a size of 0.
>;	Exit the program.  See that the file 'xxx' is now 80 bytes long.
File hadn't been closed.  Until it is closed, the size stored in the
directory is *not* the number of bytes written to the file.  Imagine
the disk overhead if the directory was updated each time you wrote to
a file.

>; If FD1 is closed at POINT A (see source) then FD2 will perform the read.
Yes, close the file, the directory entry is updated and the *next* open
will get the file size information from the directory and it will be
correct.

>; If FD1 is closed at POINT B (see source) then the read of FD2 still fails
>;	even though the disk has been updated before the read happens.
Yes, but the file size is read when the file is opened!  And it was zero then!

[ source deleted ]

The crux of the matter is that the open causes the file size to be read
from the directory entry.  If you have two separate file handles, one
for reading and one for writing, they have separate file pointers.  When
you write to the file with one handle, neither the other file pointer
nor other file size get changed.  (When I say "file pointer", I mean
the number that tells the position in the file in bytes).
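
The explanation above can be turned into a small toy model.  This is a
sketch of the behaviour as described in this post, not of how DOS is
actually implemented: the names dir_entry, handle, toy_open, toy_write,
toy_read, toy_close and toy_test are all hypothetical, and the "file" has
sizes and positions but no data.

```c
#include <assert.h>

/* Hypothetical stand-ins; none of these names come from DOS itself. */
struct dir_entry { long size; };     /* size recorded in the directory */

struct handle {
    struct dir_entry *dir;           /* entry this handle was opened on */
    long size;                       /* private copy taken at open time */
    long pos;                        /* private file pointer            */
};

static struct handle toy_open(struct dir_entry *d)
{
    struct handle h = { d, d->size, 0 };   /* size is snapshotted here */
    return h;
}

static long toy_write(struct handle *h, long count)
{
    assert(count >= 0);
    h->pos += count;
    if (h->pos > h->size)
        h->size = h->pos;            /* only this handle's copy grows */
    return count;
}

static long toy_read(struct handle *h, long count)
{
    long left = h->size - h->pos;
    long n = count < left ? count : left;  /* stop at the cached size */
    if (n < 0)
        n = 0;
    h->pos += n;
    return n;
}

static void toy_close(struct handle *h)
{
    h->dir->size = h->size;          /* directory entry updated on close */
}

/* close_at: 'A' closes FD1 at POINT A, 'B' at POINT B, 0 never closes it
   before the read.  Returns what the read through FD2 reports. */
long toy_test(int close_at)
{
    struct dir_entry xxx = { 0 };    /* create: zero-length entry */
    struct handle fd1 = toy_open(&xxx);
    struct handle fd2;

    toy_write(&fd1, 80);
    if (close_at == 'A')
        toy_close(&fd1);
    fd2 = toy_open(&xxx);            /* snapshots whatever size is recorded */
    if (close_at == 'B')
        toy_close(&fd1);             /* too late for fd2's snapshot */
    return toy_read(&fd2, 80);
}
```

In this model toy_test('A') reads 80 bytes while toy_test('B') and
toy_test(0) read none, which matches the three cases in the original test.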

If you are simply trying to read and write from the same file, open the
file with read/write access and position the file pointer back to the
beginning of the file with the "lseek" function call (int 21h, ah=42h)
before doing the read.  Then you don't have to close the file before
reading.
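
In C terms the single-handle approach might look like the sketch below.
It is written against the POSIX open/write/lseek/read calls rather than
int 21h directly, and write_rewind_read and its path argument are
illustrative names only.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Writes 80 bytes through one read/write descriptor, seeks back to the
   start, and reads them back through the same descriptor.  Returns the
   number of bytes read, or -1 if a step fails or the data differs. */
long write_rewind_read(const char *path)
{
    char out[80], in[80];
    long result = -1;
    int fd;

    assert(path != NULL);
    memset(out, 'x', sizeof out);
    memset(in, 0, sizeof in);

    fd = open(path, O_TRUNC | O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, out, sizeof out) == (ssize_t)sizeof out
        /* The file pointer now sits at offset 80; move it back to 0. */
        && lseek(fd, 0, SEEK_SET) == 0) {
        ssize_t n = read(fd, in, sizeof in);
        if (n == (ssize_t)sizeof in && memcmp(in, out, sizeof in) == 0)
            result = (long)n;
    }

    close(fd);
    unlink(path);                    /* remove the scratch file */
    return result;
}
```

Because only one handle is involved, there is no second file pointer or
second size to fall out of step.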

One more thing.  If you want to force DOS to update the directory entry
so subsequent opens will work, use "dup" (int 21h, ah=45h) on the file
handle and close the duplicate handle.  Then you won't have to reopen
the file.
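
The handle sequence for that trick is short.  The sketch below shows it
with the C dup() and close() calls; commit_by_dup and dup_commit_demo are
hypothetical names.  The directory-update side effect is the DOS behaviour
described above; on a POSIX system closing a duplicate does not force
anything to disk (fsync() is the call for that), so there the sketch only
shows that the original handle stays usable afterwards.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Duplicate the handle and close the duplicate; fd itself stays open.
   Returns 0 on success, -1 on failure. */
int commit_by_dup(int fd)
{
    int copy = dup(fd);
    if (copy < 0)
        return -1;
    return close(copy);
}

/* Writes 80 bytes, "commits" via dup/close, then writes one more byte
   through the original descriptor.  Returns the resulting file size,
   or -1 if any step fails. */
long dup_commit_demo(const char *path)
{
    char buffer[80] = {0};
    struct stat st;
    long result = -1;
    int fd;

    assert(path != NULL);
    fd = open(path, O_TRUNC | O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, buffer, sizeof buffer) == (ssize_t)sizeof buffer
        && commit_by_dup(fd) == 0
        && write(fd, buffer, 1) == 1      /* fd is still open and usable */
        && fstat(fd, &st) == 0)
        result = (long)st.st_size;

    close(fd);
    unlink(path);                         /* remove the scratch file */
    return result;
}
```

The same two steps in assembly would be int 21h with ah=45h on the handle,
then ah=3eh on the handle that call returns.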
-----
	Don Kneller
UUCP:	...ucbvax!ucsfcgl!kneller
ARPA:	kneller@cgl.ucsf.edu
BITNET:	kneller@ucsfcgl.BITNET

johnl@ima.UUCP (02/08/87)

In article <4274@utah-cs.UUCP> b-davis@utah-cs.UUCP (Brad Davis) writes:
>; Here is a probable bug (or feature) in MS-DOS.  ...
>; The test goes like this:
>; 	Create a file with name 'xxx'.  Call this file FD1.
>;	Write 80 bytes to FD1.
>;	Open the file named 'xxx' a second time.  Call this file FD2.
>;		Note that NO errors have happened yet.
>;	Try to read 80 bytes from FD2.  No bytes are read.
>;		Note that NO error is reported.
It's a feature.  That is, it's supposed to work that way.  As far as I can
tell, when you create a file, DOS writes a directory entry that is zero
length.  When you close the file, DOS rewrites the directory entry.  If
you try to read the file in the meantime, you lose.  This points out the
fact that in its heart of hearts DOS is more like CP/M than like Unix, even
though on the surface it gets more Unix-like in each revision.

Perhaps now that MS-net provides an environment in which multiple users are
reading and writing files all the time, MS will make the semantics a little
more reasonable, but I wouldn't count on it.  Unix is fairly unusual in the
way that it lets any old program read a file while it is being written.  It's
more common not to let anybody read the file until it is written and closed.

(By the way, this seems to me more like a nit than a major bug.)
-- 
John R. Levine, Javelin Software Corp., Cambridge MA +1 617 494 1400
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
Where is Richard Nixon now that we need him?

gemini@homxb.UUCP (02/08/87)

All this brouhaha about what is either a DOS bug, or a DOS
shortcoming.

In article <10056@cgl.ucsf.edu.ucsfcgl.UUCP>, kneller@socrates.ucsf.edu (Don Kneller%Langridge) writes:
> File hadn't been closed.  Until it is closed, the size stored in the
> directory is *not* the number of bytes written to the file.  Imagine
> the disk overhead if the directory was updated each time you wrote to
> a file.

No overhead, the directory is (or should be) in a buffer, too.  Note that
UNIX does exactly this (actually updates the inode, not the directory).
Otherwise, commands like "tail -f" wouldn't work.  I'm tempted to
call this DOS behavior a bug although it could be a (mis)feature if it is
documented.  It will certainly be called a bug when multiprocessing
DOS arrives (if ever).
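
The Unix side of this is easy to check.  The sketch below assumes a POSIX
system and an ordinary local filesystem; size_seen_by_second_fd is a
made-up name.  A second descriptor is opened while the file is still
empty, in the way "tail -f" would hold it, and then asked for the size
after the writer has written but not closed.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Opens a reader on an empty file, writes 80 bytes through a separate
   writer descriptor, and returns the size the reader's fstat() reports
   while the writer is still open.  Returns -1 if any step fails. */
long size_seen_by_second_fd(const char *path)
{
    char buffer[80] = {0};
    struct stat st;
    long result = -1;
    int writer, reader;

    assert(path != NULL);
    writer = open(path, O_TRUNC | O_CREAT | O_WRONLY, 0600);
    if (writer < 0)
        return -1;

    reader = open(path, O_RDONLY);        /* file is zero length here */
    if (reader >= 0) {
        if (write(writer, buffer, sizeof buffer) == (ssize_t)sizeof buffer
            && fstat(reader, &st) == 0)   /* writer has not been closed */
            result = (long)st.st_size;
        close(reader);
    }

    close(writer);
    unlink(path);                         /* remove the scratch file */
    return result;
}
```

Both descriptors refer to the same inode, so the size is not a per-open
snapshot.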

Rick Richardson, PC Research, Inc. (201) 922-1134, (201) 834-1378 @ AT&T-CP
..!ihnp4!castor!{rer,pcrat!rer} <--Replies to here, not to homxb!gemini, please.

kneller@ucsfcgl.UUCP (02/09/87)

In article <479@ima.UUCP> johnl@ima.UUCP (John R. Levine) writes:
>In article <4274@utah-cs.UUCP> b-davis@utah-cs.UUCP (Brad Davis) writes:
>>; Here is a probable bug (or feature) in MS-DOS.  ...
>>; 	[ ...]
>It's a feature.  That is, it's supposed to work that way.  As far as I can
>tell, when you create a file, DOS writes a directory entry that is zero
>length.  When you close the file, DOS rewrites the directory entry.  If
>you try to read the file in the meantime, you lose.

You *can* read the file if you opened the file with read/write access and
position the file pointer back to the beginning of the file with "lseek".
-----
	Don Kneller
UUCP:	...ucbvax!ucsfcgl!kneller
ARPA:	kneller@cgl.ucsf.edu
BITNET:	kneller@ucsfcgl.BITNET

pinkas@mipos3.UUCP (02/09/87)

In article <4274@utah-cs.UUCP> b-davis@utah-cs.UUCP (Brad Davis) writes:
>; Here is a probable bug (or feature) in MS-DOS.  Does anyone know a
>; work around (without closing the first file)?  Would Gordon Letwin at
>; Microsoft care to comment?  This bug appears on PC-DOS 2.0, PC-DOS 3.0,
>; and MS-DOS 3.1.  (Unix has no problems with this algorithm.)
>
>; The test goes like this:
>; 	Create a file with name 'xxx'.  Call this file FD1.
>;	Write 80 bytes to FD1.
>;	Open the file named 'xxx' a second time.  Call this file FD2.
>;		Note that NO errors have happened yet.
>;	Try to read 80 bytes from FD2.  No bytes are read.
>;		Note that NO error is reported.
>;	If in symdeb push to a new shell.  See that the file 'xxx' has
>;		been created but has a size of 0.
>;	Exit the program.  See that the file 'xxx' is now 80 bytes long.
>; If FD1 is closed at POINT A (see source) then FD2 will perform the read.
>; If FD1 is closed at POINT B (see source) then the read of FD2 still fails
>;	even though the disk has been updated before the read happens.

I don't have my DOS manuals in front of me so I'm not sure what the three
lines of assembly at points A and B are, but here is my explanation of what
is happening.

When you open file FD1, DOS (and your C compiler) create an internal
buffer, usually 128 or 256 bytes long.  This buffer is in memory only.  The
file is created on disk, but the 80 bytes are still in the buffer.  Thus,
when you open FD2, it can open the file but will return EOF with the first
attempt to read.  Thus, there is no error reported.  Note that with Lattice
C (and Turbo Pascal), the input buffer is filled at the time that open is
called.  Thus, you get an EOF immediately.

The call to read() simply copies characters out of the buffer and into the
user-supplied memory, refilling the internal buffer as needed.  Since EOF
was detected when the file was opened, nothing will ever be read.

This is the defined and correct behavior for C.  If Unix is writing the
bytes out to disk before giving the second shell control, it is flushing.

To force a file to be written to disk (i.e. make the disk as up to date as
possible), use the fflush() routine.  My Unix manual entry says (copied
without permission, but this is almost identical to every C compiler's
entry on this topic):

SYNTAX
     #include <stdio.h>

     fflush(stream)
     FILE *stream;

DESCRIPTION
     Fflush causes any buffered data for the named output stream
     to be written to that file.  The stream remains open.

I assume whatever C compiler you are using under MS-DOS has a similar call.
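
The C-library buffering described above can be seen with two stdio
streams on the same file.  This is only a sketch of that layer (the
original test called int 21h directly, with no C library in between), and
stdio_reader_sees is a hypothetical name.  The explicit setvbuf() call
makes the stream fully buffered so the 80 bytes stay in memory until
flushed.

```c
#include <assert.h>
#include <stdio.h>

/* Writes 80 bytes to `path` through a fully buffered stream, optionally
   calls fflush(), then reads through a second stream while the first is
   still open.  Returns the number of bytes the reader got, or -1 if a
   stream could not be opened. */
long stdio_reader_sees(const char *path, int flush_first)
{
    char buffer[80] = {0};
    FILE *out, *in;
    size_t n;

    assert(path != NULL);
    out = fopen(path, "wb");
    if (out == NULL)
        return -1;
    setvbuf(out, NULL, _IOFBF, BUFSIZ);   /* full buffering, made explicit */

    fwrite(buffer, 1, sizeof buffer, out);  /* lands in out's buffer only */
    if (flush_first)
        fflush(out);                      /* push the buffer to the file */

    in = fopen(path, "rb");
    if (in == NULL) {
        fclose(out);
        remove(path);
        return -1;
    }
    n = fread(buffer, 1, sizeof buffer, in);

    fclose(in);
    fclose(out);                          /* fclose flushes what is left */
    remove(path);                         /* remove the scratch file */
    return (long)n;
}
```

With flush_first set to 0 the reader would be expected to see an empty
file; with it set to 1 the reader should get all 80 bytes.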

BTW close() automatically flushes the buffer.  Since exit() (called
implicitly by falling off the end of main()) calls close() for every file
still open, a file might not be written to disk until after the program
terminates.  (This appears to be happening in your case.)

-Israel

P.S.  When stating that there is a bug in DOS, please let people know what
software you are using when this bug occurs.  For example, it would help if
I knew what C compiler you were using.
-- 
----------------------------------------------------------------------
UUCP:	{amdcad,decwrl,hplabs,oliveb,pur-ee,qantel}!intelca!mipos3!pinkas
ARPA:	pinkas%mipos3.intel.com@relay.cs.net
CSNET:	pinkas%mipos3.intel.com

campbell@maynard.UUCP (02/12/87)

In article <2368@homxb.UUCP> gemini@homxb.UUCP (Rick Richardson) writes:
>
>No overhead, the directory is (or should be) in a buffer, too.  Note that
>UNIX does exactly this (actually updates the inode, not the directory).
>Otherwise, commands like "tail -f" wouldn't work.  I'm tempted to
>call this DOS behavior a bug although it could be a (mis)feature if it is
>documented.  It will certainly be called a bug when multiprocessing
>DOS arrives (if ever).

One of the many reasons I hate MS-DOS is Microsoft's

    Rule Number One:
	"Once a bug, always a feature."

There are many good business reasons for this, and I suspect Microsoft
hates it worse than we do, but it's reality.  Note that in a sense,
multiprocessing DOS is here, since file servers serve multiple clients
concurrently.  Microsoft's MS-NET file server has the bug too (not surprising,
since it just calls DOS to do file system work).  I designed an MS-NET
file server for a client, and we discovered this the annoying way:
we believed the specs (ha!), implemented the "correct" behavior, and
discovered that it conflicted with what Microsoft's server and raw
MS-DOS actually do.  Solution:  break our server, too.  Grumble.

    Rule Number Two:
	"The IBM PC and PC-DOS have no bugs.  They're all features."

-- 
Larry Campbell                                The Boston Software Works, Inc.
Internet: campbell@maynard.uucp             120 Fulton Street, Boston MA 02109
uucp: {alliant,wjh12}!maynard!campbell              +1 617 367 6846
ARPA: campbell%maynard.uucp@harvisr.harvard.edu      MCI: LCAMPBELL