[comp.lang.c] questions about a backup program for the MS-DOS environment

jhallen@wpi.wpi.edu (Joseph H Allen) (05/02/90)

In article <1990Apr25.125806.20450@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:
>In article <255@uecok.UUCP> dcrow@uecok (David Crow -- ECU Student) writes:
>> [...]
>>      - possibly a faster copying scheme.  the following is the
>>         code I am using to copy from one file to another:
>>
>>            do
>>            {
>>               n = fread(buf, sizeof(char), MAXBUF, infile);
>>               fwrite(buf, sizeof(char), n, outfile);
>>            } while (n == MAXBUF);        /* where MAXBUF = 7500 */
>>
>Try:
>     while ((n = fread(buf, sizeof(char), BUFSIZ, infile)) != 0)
>		fwrite(buf, sizeof(char), n, outfile);
>
>By using BUFSIZ instead of your own buffer length you get a buffer size
>equal to what the fread and fwrite routines use.  

No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
(no wonder it's so slow :-)

Try (in small or tiny model):

#include <dos.h>
#include <alloc.h>	/* farmalloc() lives here in Turbo C */

char far *buffer=farmalloc(65024);
unsigned n;
int readfile;	/* Open handle (use _open() ) */
int writefile;	/* Open handle (use _open() ) */

do
 {
 _BX=readfile;		/* Handle */
 _CX=65024;		/* Count */
 _DX=FP_OFF(buffer);	/* Offset of buffer */
 _DS=FP_SEG(buffer);	/* Segment of buffer */
 _AH=0x3f;
 geninterrupt(0x21);	/* Read */
 __emit__(0x73,2,0x2b,0xc0);	/* Clear AX if error.  This codes to:
					jnc over
					sub ax,ax
				    over:
				*/
 _DS=_SS;		/* Restore data segment */

 n=_AX;			/* Get amount actually read */
 if(!n) break;		/* If we're done */

 _CX=n;
 _BX=writefile;
 _DX=FP_OFF(buffer);
 _DS=FP_SEG(buffer);
 _AH=0x40;
 geninterrupt(0x21);	/* Write */
 _DS=_SS;
 } while(n==65024);
-- 
jhallen@wpi.wpi.edu (130.215.24.1)

bright@Data-IO.COM (Walter Bright) (05/03/90)

In article <12459@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
<In article <1990Apr25.125806.20450@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:
<<In article <255@uecok.UUCP> dcrow@uecok (David Crow -- ECU Student) writes:
<<<      - possibly a faster copying scheme.  the following is the
<<<         code I am using to copy from one file to another:
<<<            do
<<<            {  n = fread(buf, sizeof(char), MAXBUF, infile);
<<<               fwrite(buf, sizeof(char), n, outfile);
<<<            } while (n == MAXBUF);        /* where MAXBUF = 7500 */
<<Try:
<<     while ((n = fread(buf, sizeof(char), BUFSIZ, infile)) != 0)
<<		fwrite(buf, sizeof(char), n, outfile);
<<
<<By using BUFSIZ instead of your own buffer length you get a buffer size
<<equal to what the fread and fwrite routines use.  
<No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
<(no wonder it's so slow :-) Try (in small or tiny model):
<	[asm example deleted]

There is no point in going to asm to get high speed file copies. Since it
is inherently disk-bound, asm buys you nothing (unless tiny code size is
the goal). Here's a C version that you'll find is as fast as any asm code
for files larger than a few bytes (the trick is to use large disk buffers):


#include <fcntl.h>		/* O_RDONLY, O_WRONLY */
#include <io.h>			/* open, creat, read, write, lseek, close on DOS compilers */
#include <stdio.h>		/* remove, SEEK_END */
#include <stdlib.h>		/* malloc, free */

#if Afilecopy
int file_copy(from,to)
#else
int file_append(from,to)
#endif
char *from,*to;
{	int fdfrom,fdto;
	int bufsiz;

	fdfrom = open(from,O_RDONLY,0);
	if (fdfrom < 0)
		return 1;
#if Afileappe
	/* Open R/W by owner, R by everyone else	*/
	fdto = open(to,O_WRONLY,0644);
	if (fdto < 0)
	{   fdto = creat(to,0);
	    if (fdto < 0)
		goto err;
	}
	else
	    if (lseek(fdto,0L,SEEK_END) == -1)	/* to end of file	*/
		goto err2;
#else
	fdto = creat(to,0);
	if (fdto < 0)
	    goto err;
#endif

	/* Use the largest buffer we can get	*/
	for (bufsiz = 0x4000; bufsiz >= 128; bufsiz >>= 1)
	{   register char *buffer;

	    buffer = (char *) malloc(bufsiz);
	    if (buffer)
	    {   while (1)
		{   register int n;

		    n = read(fdfrom,buffer,bufsiz);
		    if (n == -1)		/* if error		*/
			break;
		    if (n == 0)			/* if end of file	*/
		    {   free(buffer);
			close(fdto);
			close(fdfrom);
			return 0;		/* success		*/
		    }
		    n = write(fdto,buffer,(unsigned) n);
		    if (n == -1)
			break;
		}
		free(buffer);
		break;
	    }
	}
err2:	close(fdto);
	remove(to);				/* delete any partial file */
err:	close(fdfrom);
	return 1;
}

jhallen@wpi.wpi.edu (Joseph H Allen) (05/03/90)

In article <2484@dataio.Data-IO.COM> bright@Data-IO.COM (Walter Bright) writes:
>In article <12459@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
><In article <1990Apr25.125806.20450@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:
><<In article <255@uecok.UUCP> dcrow@uecok (David Crow -- ECU Student) writes:
><<By using BUFSIZ instead of your own buffer length you get a buffer size
><<equal to what the fread and fwrite routines use.  
><No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
><(no wonder it's so slow :-) Try (in small or tiny model):
><	[asm example deleted]
>There is no point in going to asm to get high speed file copies. Since it
>is inherently disk-bound, asm buys you nothing (unless tiny code size is
>the goal). Here's a C version that you'll find is as fast as any asm code
>for files larger than a few bytes (the trick is to use large disk buffers):
> [better C example deleted]

I didn't use asm to make the code itself fast.  The only reason I did it was
so that you can use 64K buffers in small/tiny model.  Now if only there were a
farread and farwrite call... 

I guess you can just compile the program in large model to have this same
effect (by habit I don't tend to use the large models).
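
For illustration, here is a minimal sketch of what such a farread() might look
like, built on Turbo C's intdosx() so that the compiler, rather than hand-written
register pokes, takes care of loading and restoring DS.  The name and the return
convention are just assumptions for the sketch; a farwrite() would be identical
with AH=40h.

#include <dos.h>

unsigned farread(int handle, void far *buf, unsigned count)
{
	union REGS r;
	struct SREGS s;

	segread(&s);			/* start from the current segment registers */
	r.h.ah = 0x3f;			/* DOS function 3Fh: read from handle */
	r.x.bx = handle;
	r.x.cx = count;
	r.x.dx = FP_OFF(buf);		/* DS:DX -> caller's far buffer */
	s.ds   = FP_SEG(buf);
	intdosx(&r, &r, &s);		/* INT 21h with DS taken from s */

	return r.x.cflag ? 0 : r.x.ax;	/* 0 on error, else bytes actually read */
}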

Interestingly, this aspect of the copy program is one place where I think DOS
is sometimes faster than UNIX.  I suspect that many UNIX versions of 'cp' use
block-sized buffers. Doing so makes overly pessimistic assumptions about the
amount of physical memory you're likely to get.  

Of course, since DOS doesn't buffer writes, it often ends up being slower
anyway (since it has to seek to the FAT so often).  'copy *.*' would be much,
much faster if only DOS were just a wee bit smarter...
-- 
jhallen@wpi.wpi.edu (130.215.24.1)

darcy@druid.uucp (D'Arcy J.M. Cain) (05/03/90)

In article <12459@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
>In article <1990Apr25.125806.20450@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:
>>     while ((n = fread(buf, sizeof(char), BUFSIZ, infile)) != 0)
>>		fwrite(buf, sizeof(char), n, outfile);
>
>No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
>(no wonder it's so slow :-)
>
>Try (in small or tiny model):
>
>#include <dos.h>
> [ ... A bunch of assembler code shoe-horned into a C program]

Double yuck.  Not only does this depend on a specific hardware platform,
a specific OS, and a specific vendor's compiler, but even on 2 of that
compiler's 6 possible memory models.  I bet someone would be hard-pressed
to modify this code to make it less portable than it is.

-- 
D'Arcy J.M. Cain (darcy@druid)     |   Government:
D'Arcy Cain Consulting             |   Organized crime with an attitude
West Hill, Ontario, Canada         |
(416) 281-6094                     |

cline@cheetah.ece.clarkson.edu (Marshall Cline) (05/04/90)

In article <2484@dataio.Data-IO.COM> bright@Data-IO.COM (Walter Bright) writes:
>In article <12459@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
>>In article <1990Apr25.125806.20450@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:
>>>In article <255@uecok.UUCP> dcrow@uecok (David Crow -- ECU Student) writes:
>>>>      - possibly a faster copying scheme.  the following is the
>>>>         code I am using to copy from one file to another:
>>>>            do
>>>>            {  n = fread(buf, sizeof(char), MAXBUF, infile);
>>>>               fwrite(buf, sizeof(char), n, outfile);
>>>>            } while (n == MAXBUF);        /* where MAXBUF = 7500 */
>>>Try:
>>>     while ((n = fread(buf, sizeof(char), BUFSIZ, infile)) != 0)
>>>		fwrite(buf, sizeof(char), n, outfile);
>>>
>>>By using BUFSIZ instead of your own buffer length you get a buffer size
>>>equal to what the fread and fwrite routines use.  
>>No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
>>(no wonder it's so slow :-) Try (in small or tiny model):
>>	[asm example deleted]
>There is no point in going to asm to get high speed file copies. Since it
>is inherently disk-bound, asm buys you nothing (unless tiny code size is
>the goal). Here's a C version that you'll find is as fast as any asm code
>for files larger than a few bytes (the trick is to use large disk buffers):
[example deleted]

Note that Walter used read()/write() as opposed to fread()/fwrite().
In Turbo-C, it's even faster (probably 2 to 3 times faster!) to use
_read() and _write(), since read()/write() can end up
deleting/inserting ^M if the files are in text mode.
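
As a small, contrived illustration of the point (the function name and the 16K
buffer are arbitrary choices, not anything from the posts above): open the
handle with O_BINARY so no translation happens at all, or call _read(), which
goes straight to DOS.

#include <fcntl.h>
#include <io.h>

static char buf[0x4000];

long count_bytes(char *name)
{
	long total = 0;
	int n;
	int fd = open(name, O_RDONLY | O_BINARY);	/* binary mode: no CR-LF fixups */

	if (fd < 0)
		return -1L;
	while ((n = _read(fd, buf, sizeof buf)) > 0)	/* raw, untranslated read */
		total += n;
	close(fd);
	return total;					/* byte count, or -1 if open failed */
}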

I found fread()/fwrite() in Turbo-C to be embarrassingly & surprisingly
slow.  After tracing them, I found they resolved to loops of
fgetc()/fputc() rather than the seemingly more obvious read()/write(),
apparently since fgetc()/fputc() know about buffering.  However it
wouldn't be hard to fix fread()/fwrite() to do it `right'!

Marshall
--
===============================================================================
Marshall Cline/ECE Department/Clarkson University/Potsdam NY 13676/315-268-3868
 cline@sun.soe.clarkson.edu, bitnet: BH0W@CLUTX, uunet!clutx.clarkson.edu!bh0w
  Career search in progress: ECE faculty. Research oriented. Will send vita.
===============================================================================

sra@ecs.soton.ac.uk (Stephen Adams) (05/04/90)

In article <12459@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:

   In article <1990Apr25.125806.20450@druid.uucp> darcy@druid.UUCP (D'Arcy J.M. Cain) writes:
   >In article <255@uecok.UUCP> dcrow@uecok (David Crow -- ECU Student) writes:
   >> [...]
   >>      - possibly a faster copying scheme.  the following is the
   >>         code I am using to copy from one file to another:
   >>
   >>            do
   >>            {
   >>               n = fread(buf, sizeof(char), MAXBUF, infile);
   >>               fwrite(buf, sizeof(char), n, outfile);
   >>            } while (n == MAXBUF);        /* where MAXBUF = 7500 */
   >>
   >Try:
   >     while ((n = fread(buf, sizeof(char), BUFSIZ, infile)) != 0)
   >		fwrite(buf, sizeof(char), n, outfile);
   >
   >By using BUFSIZ instead of your own buffer length you get a buffer size
   >equal to what the fread and fwrite routines use.  

   No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
   (no wonder it's so slow :-)

   Try (in small or tiny model):


	... 30 lines of *heavily* machine dependent C ...

To suggest replacing 5 (or 2) lines of working C that will run on
anything that runs C with 30 lines of `assembly code' that runs only
on a PC, with a specific memory model and C compiler is lunacy.

Especially as it is completely unnecessary.

The most important things are:

	+ the buffer size
	+ avoiding needless copying

The bigger the buffer, the fewer times you go round the loop.  I would
suggest using the open/read/write/close functions instead of stdio.h
for copying files.  This is because stdio does its own buffering:

	input -> infile's buffer -> your buf -> outfile's buffer -> output

with read/write *you* do the buffering:

	input -> your buffer -> output

Use a large buffer, preferably one whose length is a multiple of the disk's
block size.  It will go as fast as the 30-line wonder.  And if it doesn't
work, you stand a chance of debugging it.
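
As a rough sketch of that scheme (assuming a POSIX-style environment; the
names and the 16K figure are arbitrary choices, not anything the posters
specified):

#include <unistd.h>

#define COPYBUF (512 * 32)		/* 16K: a multiple of a typical 512-byte block */

int copy_fd(int infd, int outfd)
{
	static char buffer[COPYBUF];	/* input -> your buffer -> output, no extra stdio copy */
	int n;

	while ((n = read(infd, buffer, sizeof buffer)) > 0)
		if (write(outfd, buffer, n) != n)
			return -1;	/* short or failed write */

	return n;			/* 0 at end of file, -1 on a read error */
}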

bright@Data-IO.COM (Walter Bright) (05/05/90)

In article <CLINE.90May3130316@cheetah.ece.clarkson.edu> cline@sun.soe.clarkson.edu (Marshall Cline) writes:
<Note that Walter used read()/write() as opposed to fread()/fwrite().
<In Turbo-C, it's even faster (probably 2 to 3 times faster!) to use
<_read() and _write(), since read()/write() can end up
<deleting/inserting ^M if the files are in text mode.
<I found fread()/fwrite() in Turbo-C to be embarrassingly & surprisingly
<slow.  After tracing them, I found they resolved to loops of
<fgetc()/fputc() rather than the seemingly more obvious read()/write(),
<apparently since fgetc()/fputc() know about buffering.  However it
<wouldn't be hard to fix fread()/fwrite() to do it `right'!

But I wasn't using Turbo C; I was using Zortech C!  In Zortech C, read()/
write() resolve directly to calls to MS-DOS and do no CR-LF translation.
I always felt it was silly to have them do such translation, thus creating
3 levels of I/O instead of 2.  If you want CR-LF translation, use fgetc/fputc.

BTW, in the Zortech libraries fread/fwrite memcpy the bytes directly from/to
the I/O buffer; they are not loops around fgetc/fputc.
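
For what it's worth, here is a toy sketch of that block-copying idea.  This is
not Zortech's source; the MYFILE structure and all the names are invented for
the illustration.

#include <io.h>		/* read() on DOS compilers */
#include <string.h>	/* memcpy() */

typedef struct {
	int fd;		/* underlying file handle */
	char *ptr;	/* next unread byte in the stdio buffer */
	unsigned cnt;	/* unread bytes left in the stdio buffer */
} MYFILE;

unsigned block_fread(char *dest, unsigned want, MYFILE *fp)
{
	unsigned got = 0;

	if (fp->cnt) {			/* drain whatever is already buffered */
		unsigned take = fp->cnt < want ? fp->cnt : want;

		memcpy(dest, fp->ptr, take);
		fp->ptr += take;
		fp->cnt -= take;
		got = take;
	}
	while (got < want) {		/* big remainder: bulk read()s, no per-byte loop */
		int n = read(fp->fd, dest + got, want - got);

		if (n <= 0)
			break;		/* end of file or error */
		got += n;
	}
	return got;			/* bytes actually delivered */
}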

e89hse@rigel.efd.lth.se (05/08/90)

In article <2682@ecs.soton.ac.uk>, sra@ecs.soton.ac.uk (Stephen Adams) writes:
>>In article <255@uecok.UUCP> dcrow@uecok (David Crow -- ECU Student) writes:
>>>      - possibly a faster copying scheme.  the following is the
>>>         code I am using to copy from one file to another:
>>>
>>>            do
>>>            {
>>>               n = fread(buf, sizeof(char), MAXBUF, infile);
>>>               fwrite(buf, sizeof(char), n, outfile);
>>>            } while (n == MAXBUF);        /* where MAXBUF = 7500 */
>>>
>>Try:
>>     while ((n = fread(buf, sizeof(char), BUFSIZ, infile)) != 0)
>>		fwrite(buf, sizeof(char), n, outfile);
>>
>>By using BUFSIZ instead of your own buffer length you get a buffer size
>>equal to what the fread and fwrite routines use.  
>
>No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
>(no wonder it's so slow :-)
>
>   Try (in small or tiny model):
>
>
>	... 30 lines of *heavily* machine dependent C ...
>
 Why machine dependent and why machine code?  Just replace 7500 (MAXBUF) with
something more normal like 0x4000, and fread and fwrite with read and write
respectively (and open with O_BINARY).  If you wanna make it faster, just
increase the buffer size.
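
Spelled out, that change amounts to something like this (the handle names are
mine, and the handles are assumed to have been opened with O_BINARY; error
handling omitted):

#include <io.h>

#define MAXBUF 0x4000			/* was 7500 */

static char buf[MAXBUF];

void copy(int infd, int outfd)		/* handles from open(..., ... | O_BINARY) */
{
	int n;

	do
	{
		n = read(infd, buf, MAXBUF);
		if (n > 0)
			write(outfd, buf, n);
	} while (n == MAXBUF);
}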


 Henrik Sandell