[comp.benchmarks] Sparcstation 2 Write I/O

eddjp@edi386.UUCP ( Dewey Paciaffi ) (02/14/91)

I've found an anomaly I can't explain on my borrowed Sparcstation 2.
I've been running the Byte Magazine Benchmarks and the file I/O times
were surprisingly low, around 270 kb / sec. on reads and writes.

In attempting to isolate any problem with the bench program, I shut off
the write portion of the test, and read an old test file. The results
were in the 1.2 - 1.4 MB/sec range. Turning this around I changed the
program to write only, and still averaged about 270 kb/sec.

To be completely sure that the benchmark program didn't have some latent
bug that I hadn't found, I ran the following test and again can only 
write about 270 - 290 kb/sec. The system is quiet except for this test.

Could someone verify these results or (hopefully) refute them? Remember
that this is a loaner machine, and I'm not sure how well tuned it may
be. The disk comes up Synchronous, and remains so as far as I can tell.

======== Cut Here ========== Cut Here =========== Cut Here =============

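#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BUFFSIZE 1024

int main(void)
{
	int file;
	int c = 0;
	char buffer[BUFFSIZE];

	/* O_CREAT needs a third argument giving the mode of the new file */
	file = open("./dummy0", O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if ( file == -1 )
	{
		perror("Open");
		return 1;
	}

	/* write 2000 blocks of BUFFSIZE bytes, stopping at the first error */
	while ( c++ != 2000 )
		if ( (write(file,buffer,BUFFSIZE)) == -1 )
		{
			perror("Write");
			break;
		}

	close(file);
	return 0;
}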
	
-- 
Dewey Paciaffi           ...!uunet!edi386!eddjp

dan@rna.UUCP (Dan Ts'o) (02/16/91)

In article <152@edi386.UUCP> eddjp@edi386.UUCP ( Dewey Paciaffi ) writes:
>I've found an anomaly I can't explain on my borrowed Sparcstation 2.
>I've been running the Byte Magazine Benchmarks and the file I/O times
>were surprisingly low, around 270 kb / sec. on read and writes.

>#include <fcntl.h>
>
>#define BUFFSIZE 1024

	I think your write size is too small. Reads probably benefit greatly
from the read-ahead of the UNIX buffer cache. But your writes in buffers of
1024 will not and you'll incur lots of rotational latency (waiting for the
disk to come around again). Try a buffer of 32768 or more.

					Dan

ajudge@maths.tcd.ie (Alan Judge) (02/16/91)

In <152@edi386.UUCP> eddjp@edi386.UUCP ( Dewey Paciaffi ) writes:
>I've found an anomaly I can't explain on my borrowed Sparcstation 2.
>I've been running the Byte Magazine Benchmarks and the file I/O times
>were surprisingly low, around 270 kb / sec. on read and writes.

Just for reference, I tried using dd with a bs=1024k (big) to create and read
100Mb files.  Machine was a plain SS2, with a Hitachi 1.2Gb disk.

I got about 1.5 Mb/s writing, and 2.1Mb/s reading.
-- 
Alan Judge   ajudge@maths.tcd.ie  a.k.a. amjudge@cs.tcd.ie +353-1-772941 x1782

"COBOL is the revenge of some witch burned in Salem, [...]"  (Bill Davidsen)

root@lingua.cltr.uq.OZ.AU (Hulk Hogan) (02/18/91)

eddjp@edi386.UUCP ( Dewey Paciaffi ) writes:
>In attempting to isolate any problem with the bench program, I shut off
>the write portion of the test, and read an old test file. The results
>were in the 1.2 - 1.4 MB range. Turning this around I changed the program
>to write only, and still averaged about 270 kb.

We have an SS2 in the server config (internal 207MB disk and an external
669MB (which doesn't have the same config as my Wren 6's, but may well be
one for all I know, as Sun's disks are called SUN0207 & SUN0669 rather than
Wren-X), a 150MB tape and CD-ROM).  The Sun chappie who installed it said
that the internal 207MB disks are rather slow.  This may explain your results.

>Could someone verify these results or (hopefully) refute them? Remember
>that this is a loaner machine, and I'm not sure how well tuned it may
>be. The disk comes up Synchronous, and remains so as far as I can tell.

The /var/adm/messages file contains the lines...
|Feb 18 11:26:30 beefcake vmunix: esp0:        Target 3 now Synchronous at 4.0 mb/s max transmit rate
|Feb 18 11:26:30 beefcake vmunix: esp0:        Target 0 now Synchronous at 3.334 mb/s max transmit rate

So, they both appear (at least) capable of doing Sync SCSI.

/\ndy
-- 
Andrew M. Jones,  Systems Programmer, 	Internet: andy@lingua.cltr.uq.oz.au
Centre for Lang. Teaching & Research, 	Phone (Australia):  (07) 365 6915
University of Queensland,  St. Lucia, 	Phone (World):    +61  7 365 6915
Brisbane,  Qld. AUSTRALIA  4072 	Fax: 		  +61  7 365 7077

"No matter what hits the fan, it's never distributed evenly....."

eddjp@edi386.UUCP ( Dewey Paciaffi ) (02/18/91)

In article <1991Feb16.145828.12539@maths.tcd.ie> ajudge@maths.tcd.ie (Alan Judge) writes:
-In <152@edi386.UUCP> eddjp@edi386.UUCP ( Dewey Paciaffi ) writes:
->I've found an anomaly I can't explain on my borrowed Sparcstation 2.
->I've been running the Byte Magazine Benchmarks and the file I/O times
->were surprisingly low, around 270 kb / sec. on read and writes.
-
-Just for reference, I tried using dd with a bs=1024k (big) to create and read
-100Mb files.  Machine was a plain SS2, with a Hitachi 1.2Gb disk.
-
-I got about 1.5 Mb/s writing, and 2.1Mb/s reading.

I've also gotten similar results with dd and mkfile. Unfortunately all of my
programs use the write system call, and it only does about 270 kb/sec on
a quiet system, using SunOS 4.1.1. My 386 does about half as well, and
my RS/6000 gets about 3 MB/sec on the writes on a quiet system...

-- 
Dewey Paciaffi           ...!uunet!edi386!eddjp

eddjp@edi386.UUCP ( Dewey Paciaffi ) (02/18/91)

In article <1085@rna.UUCP> dan@rna.UUCP (Root Beer) writes:
-In article <152@edi386.UUCP> eddjp@edi386.UUCP ( Dewey Paciaffi ) writes:
->I've found an anomaly I can't explain on my borrowed Sparcstation 2.
->I've been running the Byte Magazine Benchmarks and the file I/O times
->were surprisingly low, around 270 kb / sec. on read and writes.
-
->#include <fcntl.h>
->
->#define BUFFSIZE 1024
-
-	I think your write size is too small. Reads probably benefit greatly
-from the read-ahead of the UNIX buffer cache. But your writes in buffers of
-1024 will not and you'll incur lots of rotational latency (waiting for the
-disk to come around again). Try a buffer of 32768 or more.
-
-					Dan

It was my understanding that the write system call returns as soon as the 
kernel has the data to be written in its buffers. I chose 1024 because
it's the size of a file system block and also the size of a buffer cache
block, the buffer cache being made up of a number of these blocks. I
had tried sizes up to 8192 without any observable difference.

I have learned today however that file I/O does not go through the
buffer cache in Sparcs, but rather through the VM system page buffers.
I don't know what effect this has on the way that the write system
call operates.


-- 
Dewey Paciaffi           ...!uunet!edi386!eddjp

eddjp@edi386.UUCP ( Dewey Paciaffi ) (02/18/91)

In article <1991Feb18.054746.6723@lingua.cltr.uq.OZ.AU> root@lingua.cltr.uq.OZ.AU (Hulk Hogan) writes:
>eddjp@edi386.UUCP ( Dewey Paciaffi ) writes:
->In attempting to isolate any problem with the bench program, I shut off
->the write portion of the test, and read an old test file. The results
->were in the 1.2 - 1.4 MB range. Turning this around I changed the program
->to write only, and still averaged about 270 kb.
-
-We have an SS2 in the server config (internal 207MB disk and an external
-669MB (which doesn't have the same config as my Wren 6's, but well be for
-all I know, as Sun's disks are called SUN0207 & SUN0669 rather than Wren-X),
-a 150MB tape and CD-ROM.  The Sun chappie who installed it said that the
-internal 207MB disks are rather slow.  This may explain your results.
-

I've obtained the same results on the Internal drive and the External 
669, which comes up with the "Sync" message at boot time...


-- 
Dewey Paciaffi           ...!uunet!edi386!eddjp

ajudge@maths.tcd.ie (Alan Judge) (02/19/91)

In <155@edi386.UUCP> eddjp@edi386.UUCP ( Dewey Paciaffi ) writes:
>I've also gotten similar results  dd and mkfile. Unfortunately all of my
>programs use the write system call, and it only does about 270 kb on
>a quiet system, using SunOs 4.1.1. My 386 does about half as good, and
>my RS/6000 gets about 3 MB on the writes on a quiet system...

The main difference, I think, is not write vs dd, but the size of the
buffer used.  I wrote a program yesterday (for an entirely different
reason) that generates a whole series of 500k files using a single
write for each.  I didn't time it too well, but it was getting at least
1Mb/s.  (I think that the slow down vs dd is because I was generating lots
of separate files, with the attendant updates of inodes and directories,
rather than one large file.)
-- 
Alan Judge   ajudge@maths.tcd.ie  a.k.a. amjudge@cs.tcd.ie +353-1-772941 x1782

"If we don't succeed, then we run the risk of failure."  -- Dan Quayle

eddjp@edi386.UUCP ( Dewey Paciaffi ) (02/21/91)

In article <1991Feb19.101559.8441@maths.tcd.ie> ajudge@maths.tcd.ie (Alan Judge) writes:
>In <155@edi386.UUCP> eddjp@edi386.UUCP ( Dewey Paciaffi ) writes:
>>I've also gotten similar results  dd and mkfile. Unfortunately all of my
>
>The main difference, I think, is not write vs dd, but the size of the
>buffer used.

After having received several suggestions to increase the buffer size
on the write test I was running on the Sparc2, I did just that. Here are
the results along with those for an RS/6000 Model 320.

==============================================================================

          SunOS 4.1.1 Write System call, 10 MB file

     Buffer Size      Avg time in Seconds      KB / Second
     -----------      -------------------      -----------

        1024                37.55                 272.7
        2048                37.43                 273.6
        4096                37.05                 276.4
        8192                 8.36                1224.9  * Note change
       16384                 8.36                1224.9
       32768                 8.43                1214.7
       65536                 8.71                1175.7



            AIX 3.1 Write System call, 10 MB file

     Buffer Size      Avg time in Seconds      KB / Second
     -----------      -------------------      -----------

        1024                 5.57                1838.5
        2048                 6.62                1546.8
        4096                 6.20                1648.9
        8192                 6.21                1648.9 
       16384                 5.35                1914.4
       32768                 5.35                1914.0
       65536                 9.18                1115.5

==============================================================================

The tests were run 6 times at each buffer size and averaged. The Sparc2
tests were run on an external 669 MB drive, and similar results
were obtained on a second Sparc2, using its internal drive. The RS/6000
tests were run on an internal 320 MB IBM drive.
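
A measurement of this kind can be sketched roughly as follows. This is only
an illustration, not the program behind the table above: the function name
write_throughput_kbs, the use of gettimeofday, and timing the close along
with the writes are all assumptions, and it makes one pass per call where
the posted figures average six.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Write `total` bytes to `path` with write(2), `bufsize` bytes per call,
 * and return the observed throughput in KB/second (-1.0 on error).
 * Hypothetical helper, for illustration only.
 */
double write_throughput_kbs(const char *path, size_t bufsize, size_t total)
{
	struct timeval t0, t1;
	size_t done = 0;
	double secs;
	char *buf;
	int fd;

	assert(bufsize > 0);

	buf = malloc(bufsize);
	if (buf == NULL)
		return -1.0;
	memset(buf, 0, bufsize);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1) {
		free(buf);
		return -1.0;
	}

	gettimeofday(&t0, NULL);
	while (done < total) {
		/* the last chunk may be shorter than bufsize */
		size_t want = total - done < bufsize ? total - done : bufsize;
		ssize_t n = write(fd, buf, want);

		if (n <= 0) {
			close(fd);
			free(buf);
			return -1.0;
		}
		done += (size_t)n;
	}
	close(fd);
	gettimeofday(&t1, NULL);
	free(buf);

	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	if (secs <= 0.0)
		secs = 1e-6;	/* guard against a zero interval on tiny files */

	return (total / 1024.0) / secs;
}
```

Calling it with a 10 MB total (10UL * 1024 * 1024) once for each buffer size
from 1024 to 65536 gives one row per size in the same layout as the table.
The KB / Second column is simply 10240 divided by the time in seconds, so
37.55 seconds works out to 272.7.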

The remarkable change in throughput on the Sparc at 8k buffers surprised me a
little. Apparently the Standard I/O package takes this into account though,
because a 10 MB file written out in 1024-byte blocks with fputs returns in
about 8.5 seconds.
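
The stdio effect can be sketched like this: the library collects the small
writes in its own buffer and only calls write(2) when that buffer fills.
This is a minimal sketch under assumptions, not the test actually run: the
name stdio_write_file is made up, the buffer is set explicitly to 8192 bytes
with setvbuf rather than relying on whatever default a given system uses,
and fwrite stands in for fputs so the data need not be a C string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write `total` bytes to `path` in fwrite() calls of `chunk` bytes each,
 * through an explicit 8192-byte stdio buffer.
 * Returns 0 on success, -1 on error. Hypothetical helper.
 */
int stdio_write_file(const char *path, size_t chunk, size_t total)
{
	static char iobuf[8192];	/* must outlive the stream; not reentrant */
	char data[1024];
	size_t done = 0;
	FILE *fp;

	assert(chunk > 0 && chunk <= sizeof data);
	memset(data, 'x', sizeof data);

	fp = fopen(path, "wb");
	if (fp == NULL)
		return -1;

	/* setvbuf must be called before any other operation on the stream */
	if (setvbuf(fp, iobuf, _IOFBF, sizeof iobuf) != 0) {
		fclose(fp);
		return -1;
	}

	while (done < total) {
		size_t want = total - done < chunk ? total - done : chunk;

		if (fwrite(data, 1, want, fp) != want) {
			fclose(fp);
			return -1;
		}
		done += want;
	}

	/* fclose flushes whatever is still sitting in iobuf */
	return fclose(fp) == 0 ? 0 : -1;
}
```

With chunk set to 1024 the caller still hands over 1K at a time, but the
kernel should see far fewer and larger write calls, which would fit the
8.5 second figure above.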

Other than that, the RS/6000 just looks a little faster...
-- 
Dewey Paciaffi           ...!uunet!edi386!eddjp

law@super.ORG (Jeffrey A Law) (02/21/91)

In article <PCG.91Feb19200520@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>On 18 Feb 91 12:42:25 GMT, eddjp@edi386.UUCP ( Dewey Paciaffi ) said:
>eddjp> In article <1085@rna.UUCP> dan@rna.UUCP (Root Beer) writes:
>dan> In article <152@edi386.UUCP> eddjp@edi386.UUCP ( Dewey Paciaffi ) writes:
>
>eddjp> I've found an anomaly I can't explain on my borrowed Sparcstation 2.
>eddjp> I've been running the Byte Magazine Benchmarks and the file I/O times
>eddjp> were surprisingly low, around 270 kb / sec. on read and writes.
[ lots of interesting, but in this case useless dribble deleted ]

I've been watching this thread for a little while and nobody has 
thought to really question the benchmark's accuracy (byte/README):

"BSD4v2 only: The time functions in the memory and and [sic] file
access tests appear to be incorrect"

Which just happens to be right!  fstime.c has loops that look like:

while(!sigalarm) {
	if ((read(fd,buf,count)) <= 0) {
		if (errno == EINVAL) {
			lseek(fd, 0l, 0);	/* rewind at EOF */
		}
		else {
			if (errno != EINTR) {
				perror("you lose");
				return (-1);
			}
			else stop_count();
		}
	}
	++nblocks;
}

Well, under SunOS and 4.3 BSD a read just returns 0 at EOF
(and sometimes I've seen errno set to EINTR).  So loops written like
the one above will essentially scan over the file only once.  The simple
fix is to save the value returned from the read and properly handle the 
case where the read/write returned zero but did not set errno.
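
A loop along those lines might look like the sketch below. It is not the
actual fstime.c patch: the function name read_blocks, the max_blocks limit
(added so the loop can be exercised without a timer), and the empty-file
guard are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* In the benchmark a SIGALRM handler would set this; here it stays 0. */
static volatile sig_atomic_t sigalarm = 0;

/*
 * Read `count`-byte blocks from `path` until sigalarm is set or
 * `max_blocks` blocks have been read, rewinding whenever read() returns 0.
 * Returns the number of blocks read, or -1 on error. Hypothetical helper.
 */
long read_blocks(const char *path, size_t count, long max_blocks)
{
	char buf[8192];
	long nblocks = 0;
	int at_start = 1;	/* nothing read since the last rewind */
	int fd;

	assert(count > 0 && count <= sizeof buf);

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -1;

	while (!sigalarm && nblocks < max_blocks) {
		ssize_t n = read(fd, buf, count);	/* keep the return value */

		if (n > 0) {
			++nblocks;
			at_start = 0;
		} else if (n == 0) {
			/* EOF is reported as 0 with errno untouched */
			if (at_start)
				break;		/* empty file: avoid spinning */
			if (lseek(fd, 0L, SEEK_SET) == -1) {
				close(fd);
				return -1;
			}
			at_start = 1;
		} else if (errno != EINTR) {
			close(fd);
			return -1;
		}
		/* on EINTR, fall through and re-check sigalarm */
	}

	close(fd);
	return nblocks;
}
```

The point is only that errno is consulted when read returns -1, and a return
of 0 is treated as end of file in its own right.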

The lessons to be learned here are:

1) Read the documentation.  That would have at least made you wonder
   a little more about a possible real problem, rather than trying to 
   explain it away with caches, read sizes and other factors.

2) If results seem way out of line with other data something is probably
   very wrong in your data gathering or generation.  Question the validity
   of the data before trying to explain it away.

>I think that a better test, even if it is a system test and not just an
>IO subsystem test, is to use bonnie. Remember: bonnie is a benchmark
>that also tests cache effectiveness if any, in its random IO section.
So true.
Jeff


-- 
1987: We set standards, not Them. Your standard windowing system is NeUWS.
1989: We set standards, not Them. You can have X, but the UI is OpenLock.
1990: Why are you buying all those workstations from Them running Motif?

eddjp@edi386.UUCP ( Dewey Paciaffi ) (02/22/91)

In article <43414@super.ORG> law@super.ORG (Jeffrey A Law) writes:
- In article <152@edi386.UUCP> eddjp@edi386.UUCP ( Dewey Paciaffi ) writes:
-
-> I've found an anomaly I can't explain on my borrowed Sparcstation 2.
-> I've been running the Byte Magazine Benchmarks and the file I/O times
-> were surprisingly low, around 270 kb / sec. on read and writes.
-[ lots of interesting, but in this case useless dribble deleted ]
-
-I've been watching this thread for a little while and nobody has 
-thought to really question the benchmark's accuracy (byte/README):
-
-"BSD4v2 only: The time functions in the memory and and [sic] file
-access tests appear to be incorrect"
-
-Which just happens to be right!  fstime.c has loops that look like:
-
-while(!sigalarm) {
-	if ((read(fd,buf,count)) <= 0) {
-		if (errno == EINVAL) {
-			lseek(fd, 0l, 0);	/* rewind at EOF */

-[Further code deleted]-

The above actually caused a failure under AIX 3.1, causing me to repair that
area of the code some months ago. This is not the question.  You can write 
a very simple program that contains a simple loop writing a 1K buffer
to disk until you write 10MB, and it will take about 37 seconds. Any buffer
size less than 8K (or somewhere between 4K and 8K ) will take about 37 
seconds to write 10MB. I have run this on two Sparc2 systems in different
areas of the country and they both respond the same.  Use the csh builtin
'time'. Use the Sys5 'time'. Use Ext. and Int. drives. I've tried them
all after breaking the problem down to its most basic component, and
I've found Sparc2 systems running about 270 kb/sec consistently when writing
buffers < 8K in size. Is this significant ??  I don't know :-).

-The lessons to be learned here are:
-
-1) Read the documentation.  That would have at least made you wonder
-   a little more about a possible real problem, rather than trying to 
-   explain it away with caches, read sizes and other factors.

Which I did. And, with the results that I posted yesterday, it appears
that the write system call transfer speed varies precisely and dramatically
with the size of the buffer being written. I'm trying to explain a result,
not explain it away. I'm also offering it for refutation. I'm interested
in accuracy. Actually, I thought I'd uncovered a kernel problem with
the Write System Call.

-2) If results seem way out of line with other data something is probably
-   very wrong in your data gathering or generation.  Question the validity
-   of the data before trying to explain it away..

Which is why I wrote a program that does nothing more than transfer data
to the disk and check for errors. I posted that program. Has anyone been 
able to find my error ( which I haven't discounted by any means ),
corroborate my results, or obtain better results when transferring buffers
which are less than 8K in size ?
-- 
Dewey Paciaffi           ...!uunet!edi386!eddjp