[comp.unix.sysv386] Disk benchmark

shawn@jdyx.UUCP (Shawn Hayes) (10/09/90)

   The company I work for is looking for a 386 based Unix for a new product.
One of the requirements for any system we use is that it have very good disk
access capabilities.  To judge the capabilities of AIX 1.2 and OS/2 (our first
candidates) a benchmark was written that does random and sequential reads
and writes to a file.  

I'd like for anyone who has a 386 based Unix system to run the included
benchmark and post (or email) the results so we have some basis for comparison
with any other Unixes.  The benchmark is very short.  The only requirement
is that you have a C compiler and a little over two megabytes of disk space
as the file that's used will be 2 megabytes in size.

To compile the benchmark:    cc -o disktest disktest.c sm0run.c

and to run the benchmark:

touch DISKTST                     # create the test file
time disktest SYNC                # run the benchmark and force all writes to
                                  # disk

then record the results

If you wish you can remove the DISKTST file, recreate it and rerun the
benchmark in NOSYNC mode which will result in all writes being postponed
until the OS is ready to post them.  (time disktest NOSYNC)

The results that we received were 1.11 minutes for the SYNC test on AIX 1.2
and about 20 seconds on OS/2 using the same hardware.





------------------------cut here----------------------------------
#!/bin/sh
# This is c, a shell archive (shar 3.32)
# made 10/08/1990 15:19 UTC by root@amber
# Source directory /u/shar/disktest
#
# existing files WILL be overwritten
# This format requires very little intelligence at unshar time.
# "echo" and "sed" will be needed.
#
# This shar contains:
# length  mode       name
# ------ ---------- ------------------------------------------
#   2623 -rw-r--r-- disktest.c
#   3506 -rw-r--r-- sm0run.c
#   1665 -rw-r--r-- sm0run.h
#
# ============= disktest.c ==============
echo "x - extracting disktest.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > disktest.c &&
X
X#include "sm0run.h"
X
Xchar *FileName  = "DISKTST";
Xchar Record[70] = { "123456789012345678901234567890123456789012345678901234567890123456789" };
Xchar BlankRecord[4096];
Xunsigned long next = 1;
X
X
Xmain(argc,argv)
Xint argc;
Xchar *argv[];
X{
X    int      i;
X    int      j;
X    int      Read;
X    int      Wrote;
X    int      Display;
X    int      SyncMode;
X    int      NumLoops = 50;
X    char     Buf[100];
X    FileType Handle;
X
X    Display  = TRUE;
X    SyncMode = SYNC;
X    if( argc >=2 )
X    {
X        if( strncmp("NO",argv[1],2) == 0 )
X            SyncMode = NOSYNC;
X        if( strncmp("no",argv[1],2) == 0 )
X            SyncMode = NOSYNC;
X    }
X    if( argc >=3 )
X        NumLoops = atoi(argv[2]);
X    if( argc >=4 )
X        Display = FALSE;
X
X    printf("Using %s mode.\n", SyncMode==SYNC ? "SYNC" : "NOSYNC" );
X
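X    Handle = GenericOpen( FileName, SyncMode );
X    if( Handle < 0 )
X    {
X        /* open() does not create the file; it must already exist */
X        printf("\nError opening %s (create it first with touch)\n", FileName);
X        exit(1);
X    }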
X
X    /*******************/
X    /*   Create File   */
X    /*******************/
X    printf("Creating file\n");
X    GenericSeek( Handle, 0L );
X    for( i=0; i<560; i++ )
X    {
X        Wrote = GenericWrite(Handle, (void *) BlankRecord,
X                             4096, SyncMode );
X        if (Wrote != 4096)
X        {
X            printf("\nError writing to file\n");
X            exit(1);
X        }
X        if( (i%100)==0 )
X            if( Display )
X                printf("%d\n",i);
X    }
X
X    /****************************/
X    /*   Perform Random Reads   */
X    /****************************/
X    printf("Performing Random Reads\n");
X    for( i=0; i<NumLoops; i++ )
X    {
X        GenericSeek( Handle, (long)RndRec() * sizeof(Record) );
X        Read = GenericRead(Handle, (void *) Buf,
X                           sizeof(Record));
X        if (Read != sizeof(Record))
X        {
X            printf("\nError reading file\n");
X            exit(1);
X        }
X        if( (i%100)==0 )
X            if( Display )
X                printf("%d\n",i);
X    }
X
X    /*****************************/
X    /*   Perform Random Writes   */
X    /*****************************/
X    printf("Performing Random Writes\n");
X    for( i=0; i<NumLoops; i++ )
X    {
X        GenericSeek( Handle, (long)RndRec() * sizeof(Record) );
X        Wrote = GenericWrite(Handle, (void *) Record,
X                             sizeof(Record), SyncMode );
X        if (Wrote != sizeof(Record))
X        {
X            printf("\nError writing to file\n");
X            exit(1);
X        }
X        if( (i%100)==0 )
X            if( Display )
X                printf("%d\n",i);
X    }
X    printf("Finished.\n");
X}
X
Xint RndRec()
X{
X    next = next * 1103515245 + 12345;
X    return( (unsigned int)(next/65536)%32768 );
X}
SHAR_EOF
# ============= sm0run.c ==============
echo "x - extracting sm0run.c (Text)"
sed 's/^X//' << 'SHAR_EOF' > sm0run.c &&
X/*
X*************************************************************************
X*                              sm0run.c                                 *
X*************************************************************************
X*                                                                       *
X*    Description: This module deals with queue calls, hopefully         *
X*                 generically between unix and os/2.                    *
X*                                                                       *
X*                COPYRIGHT 1990 SCIENTIFIC-ATLANTA, INC.                *
X*                            ALL RIGHTS RESERVED                        *
X*************************************************************************
X*    Rev      Date     Author   Change Description                      *
X*   -----   --------   ------   --------------------------------------- *
X*    1.0    03/21/90            Original Release                        *
X*************************************************************************
X*/
X
X#include <sys/types.h>
X#include <errno.h>
X#include "sm0run.h"
X
X#define  Q_PERM_ALL 00666
X#define  BAD_VALUE     -1
X#define  GOOD_VALUE     0
X
X
X
X
X
X
X
X/*
X***********************************
X*                                 *
X*       Generic routines          *
X*                                 *
X***********************************
X*/
X#if 0
X
XFileType GenericOpen(FileName)
X  char       *FileName;
X{
X    return(fopen(FileName, "r+b"));
X}
X
X
Xint GenericFailure(Handle)
X  FileType    Handle;
X{
X    return(Handle == NULL);
X}
X
X
Xvoid GenericClose(Handle)
X  FileType    Handle;
X{
X    fclose(Handle);
X}
X
X
Xint GenericRead(Handle, Buffer, BufLen)
X  FileType    Handle;
X  void       *Buffer;
X  int         BufLen;
X{
X    return(fread(Buffer, 1, BufLen, Handle));
X}
X
X
Xint GenericWrite(Handle, Buffer, BufLen)
X  FileType    Handle;
X  void       *Buffer;
X  int         BufLen;
X{
X    return(fwrite(Buffer, 1, BufLen, Handle));
X}
X
X
Xvoid GenericSeek(Handle, Position)
X  FileType    Handle;
X  long        Position;
X{
X    int         err   ;
X
X    err = fseek(Handle, Position, 0);
X    if (err) printf("fseek err=%d; Position=%ld\n", err, Position ) ;
X}
X
Xvoid GenericFlush( File1, File2, File3 )
X  FileType    File1, File2, File3;
X{
X    fflush( File1 );
X    fflush( File2 );
X    fflush( File3 );
X}
X
X#else
X
X#include <fcntl.h>
X#define O_BINARY 0
X
XFileType GenericOpen( FileName, SyncMode )
X  char       *FileName;
X  int         SyncMode;
X{
X    if( SyncMode == NOSYNC )
X        return(open(FileName, O_RDWR | O_BINARY ));
X    else
X        return(open(FileName, O_RDWR | O_BINARY | O_SYNC ));
X}
X
X
Xint GenericFailure(Handle)
X  FileType    Handle;
X{
X    return(Handle < 0);    /* open() returns -1 on failure; 0 is a valid descriptor */
X}
X
X
Xvoid GenericClose(Handle)
X  FileType    Handle;
X{
X    close(Handle);
X}
X
X
Xint GenericRead(Handle, Buffer, BufLen)
X  FileType    Handle;
X  void       *Buffer;
X  int         BufLen;
X{
X    return(read(Handle, Buffer, BufLen));
X}
X
X
Xint GenericWrite(Handle, Buffer, BufLen, SyncMode)
X  FileType    Handle;
X  void       *Buffer;
X  int         BufLen;
X  int         SyncMode;
X{
X    return(write(Handle, Buffer, BufLen));
X}
X
X
Xvoid GenericSeek(Handle, Position)
X  FileType    Handle;
X  long        Position;
X{
X   long        err   ;
X
X    err = lseek(Handle, Position, 0);
X    if (err==-1L) printf("lseek err=%ld; Position=%ld\n", err, Position ) ;
X}
X
Xvoid GenericFlush( File1, File2, File3 )
X  FileType    File1, File2, File3;
X{
X/*    fsync( File1 );
X    fsync( File2 );
X    fsync( File3 );
X*/}
X
X#endif
X
X
X/* end of file */
SHAR_EOF
# ============= sm0run.h ==============
echo "x - extracting sm0run.h (Text)"
sed 's/^X//' << 'SHAR_EOF' > sm0run.h &&
X/*  File: sm0run.h for the IBM-6000 Model 320 */
X
X#include    <stdio.h>
X#include    <stdlib.h>
X#include    <fcntl.h>       /*  needed for OS/2 and AIX  */
X
X#ifndef TRUE
X#define     TRUE        1
X#endif
X#ifndef FALSE
X#define     FALSE       0
X#endif
X#define     forever     1
X
Xtypedef     unsigned char       BYTE    ;
Xtypedef     unsigned short      BOOL    ;
Xtypedef     unsigned long       ULONG   ;
Xtypedef     int                 logical ;
Xtypedef     int                 FileType;
X
X#define     DONTWAIT            0L
X#define     WAITFOREVER        -1L
X#define     MAXQUEUES           10
X#define     EMPTY               0
X#define     FULL                100
X#define     MAXBUFFERS          600
X#define     MAXBUFSIZ           70
X
X#define     SYNC                -1
X#define     NOSYNC               0
X
Xint         Comm_Read_String(); /* (char *StrBuffer, char Terminator,
X                                 int MaxLength, long TimeOut, int PortID); */
Xint         Comm_Write_String   (char *StrBuffer, int MaxLength, int PortID);
X
Xint         Read_Queue          (int WhoAmI, int *WhoIsMsgFrom, char *Buffer,
X                                 int *Length, long TimeOut);
Xint         Write_Queue         (int WhoAmI, int WhoIsMsgTo, char *Buffer,
X                                 int Length);
X
Xvoid        Timer_Reset         ();
Xlong        Timer_Now           ();
Xvoid        sleep               (long Time);
X
XFileType    GenericOpen         ();
Xint         GenericFailure      ();
Xvoid        GenericClose        ();
Xint         GenericRead         ();
Xint         GenericWrite        ();
Xvoid        GenericSeek         ();
Xvoid        GenericFlush        ();
X
X/* end of file */
SHAR_EOF
exit 0





Please post your results.   Thanks.

Shawn Hayes

shawn@jdyx.UUCP (Shawn Hayes) (10/13/90)

   Dick Dunn was entirely correct in his posting.  I am looking to test both
the SYNC and NOSYNC modes.  The NOSYNC times could also be affected by buffer
size if the buffer is under 2 megabytes.  What I would like to find is either
a version of UNIX that has better disk performance capabilities (perhaps by
putting the inode and the file data at the same point on the disk) or another
way of accessing/updating the data that avoids the inode update penalty. 

   I suspect that the two updates required in Unix explain why OS/2 can
give a performance of up to 3 times what AIX 1.2 shows.  If anyone knows of a
method of improving file performance or of a Unix that gives increased file
performance over AIX please speak up.  I'd really rather work on a Unix 
system than OS/2 but disk performance is critical for our application.

Thanks.

Shawn

bruce@segue.segue.com (Bruce Adler) (10/14/90)

In article <1990Oct13.032355.3176@jdyx.UUCP> shawn@jdyx.UUCP (Shawn Hayes) writes:
>   I suspect that the two updates required in Unix explain why OS/2 can
>give a performance of up to 3 times what AIX 1.2 shows.  If anyone knows of a
>method of improving file performance or of a Unix that gives increased file
>performance over AIX please speak up.  I'd really rather work on a Unix 
>system than OS/2 but disk performance is critical for our application.

I know zip about os/2 so forgive me if these questions sound stupid.  

It sounds like os/2 merely implements buffer cache write through but 
unix implements true synchronous file updates.  If you're appending to a 
file does os/2 really update the directory entry after every block 
write?  

On os/2 if you crash your system before closing the file do you still 
have all of your data (except perhaps the last block) intact after you 
reboot (including the proper timestamps on the directory entries)?  

If not, then the os/2 scheme is only usable when doing updates in place 
(i.e.  you have to pre-allocate the whole file). (This doesn't seem
like a "fair" comparison to unix's robust file system with synchronous
writes.)

If you're dealing with pre-allocated and initialized database files, 
then on unix you should put your database in a separate partition and 
use the raw-disk interface.  On my machine reading the raw disk (I would 
have tested writing but I don't have an empty partition) via:

	dd if=/dev/rdsk/0s4 of=/dev/null bs=1k count=560
        
takes less than 3 seconds (note: I'm running ISC Unix not AIX).  But to 
use the raw disk interface efficiently you have to do your own buffer 
cache management (be sure to plock() your process).  Now put this all in 
a separate process add crash recovery, shared memory, record locking, 
mutual exclusion, threads and/or an RPC mechanism and I believe this is 
what most commercial, multi-user, high performance database systems do 
(or should be doing) on unix.  

If you're doing something that's time critical or so large and complex 
that you're worried about performance on a 33mhz 386 then it's a waste 
to try to implement your own database system.  Buy a real database 
system.  

If all you're really doing is some flat file record manipulation, or 
what we used to call "data-entry" then you may as well use dos rather 
than os/2 or unix.  So what is it you're really trying to do other than 
avoid having to work on os/2?  :-) 

-- 
bruce@segue.com, ism.isc.com!segue!bruce, aero.org!segue!bruce

shawn@jdyx.UUCP (Shawn Hayes) (10/15/90)

     Well, what we're trying to do is essentially what those commercial
multi-user high performance database systems do (i.e., we will have multiple
programs accessing the database), and it is critical for a low-end (386) system
to write directly to the disk.  As far as pre-allocation goes, that's probably
the way we would do it.  We would have pre-allocated all of the files that
are needed.
    I've tried modifying the benchmark to write/read from the raw-disk
interface, but I had some problems with it.  I kept getting phys-io errors
in my tests.  If you've got a sample of code that handles raw-disk io I'd be
interested in seeing it.  Maybe I can figure out what went wrong in my test
of the raw-disk interface.

I'm also going to include my response to Dick Dunn which will help answer a
few questions about what is needed.



 

 
 
   I'm sorry about leaving everyone in the dark about this benchmark. 
I've been working on this off and on for the last six months so I start to
assume everyone understands what's going on. :>   
 
    What my company is looking at doing is making a system that is
portable at least across variants of the Unix OS, if not to other operating
systems.  One portion of this system is a database that will probably
consist of multiple files with multiple keys that must be read/updated by
multiple programs or by a single program that acts as the database
manager.
 
   In either case there is another computer that sends the data to us.  
Before the system can acknowledge the other computer the transaction MUST
be posted to disk, otherwise our system and the other one would get out of
sync during a power failure.  For that reason the SYNC parameter is
needed.
 
   Now, some of you are thinking why not use a UPS system and the NOSYNC 
parameter.  Well, that will work for a larger system, but our marketing
people want us to be able to sell this system to anyone so for small
systems a UPS is out of the question.  That means that whatever operating
system we use must support a write-through cache. 
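
One way to meet the "posted to disk before acknowledging" requirement
without opening the file with O_SYNC might be to issue ordinary writes and
then call fsync() once per transaction.  The sketch below assumes a
POSIX-style system that provides fsync(); the names open_store and
post_record are made up for illustration and are not part of the posted
benchmark.  Whether this beats O_SYNC depends on the OS, and how far fsync()
actually pushes the data (controller cache, drive cache) is system-dependent
as well.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) the data file WITHOUT O_SYNC. */
static int open_store(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0600);
}

/*
 * Hypothetical helper: write one record at a byte offset and report
 * success only after fsync() says the file has been flushed.
 * Returns 0 when it should be safe to acknowledge the sender, -1 otherwise.
 */
static int post_record(int fd, off_t offset, const void *rec, size_t len)
{
    assert(rec != NULL);

    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    if (write(fd, rec, len) != (ssize_t)len)
        return -1;
    if (fsync(fd) != 0)          /* flush file data (and metadata) to disk */
        return -1;
    return 0;
}
```

A caller would acknowledge the other computer only when post_record()
returns 0.  If one transaction touches several records, the writes could be
issued first and followed by a single fsync(), which may cost less than one
synchronous write per record.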
 
 
Shawn Hayes

cpcahil@virtech.uucp (Conor P. Cahill) (10/16/90)

In article <1990Oct15.091904.5439@jdyx.UUCP> shawn@jdyx.UUCP (Shawn Hayes) writes:
>    I've tried modifying the benchmark to write/read from the raw-disk      
>interface, but I had some problems with it.  I kept getting phys-io errors
>in my tests.  If you've got a sample of code that handles raw-disk io I'd be

I didn't look at the benchmark that you posted, but to read/write from a
raw device all you need to do is write the data in whole blocks to whole
block boundaries.  For most systems the block size is 512.  For some (like
a gould powernode) it is 1K.  If you always write it in multiples of 1K
you satisfy both systems.

There is also an upper limit for the size of a single i/o on the raw device
on some systems (Perkin-Elmer has a limit of around 400K).

The part about "write to whole block boundaries" means that a given write
must begin at the start of a disk block.  You can't do the following:

	lseek(fd,5,0);
	write(fd,block,blocksize);

Unless, of course, the block size for that device is 5.
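
For records that do not line up with block boundaries (such as 70-byte
records at arbitrary offsets), a common workaround is read-modify-write:
read the whole block(s) covering the record, patch the bytes in memory, and
write the whole block(s) back.  The following is only a sketch under that
assumption; block_start and aligned_update are illustrative names, the block
size is passed in by the caller, and a real raw device may impose further
constraints (for example the maximum transfer size mentioned above).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Round a byte offset down to the start of the block that contains it. */
static off_t block_start(off_t offset, size_t block_size)
{
    return offset - (offset % (off_t)block_size);
}

/*
 * Update len bytes at an arbitrary byte offset using only whole-block,
 * block-aligned transfers.  Returns 0 on success, -1 on failure.
 */
static int aligned_update(int fd, off_t offset, const void *data, size_t len,
                          size_t block_size)
{
    off_t  first, span_end;
    size_t span;
    char  *buf;
    int    rc = -1;

    assert(block_size > 0 && len > 0 && offset >= 0);

    /* First block touched, and the end of the last block touched. */
    first    = block_start(offset, block_size);
    span_end = block_start(offset + (off_t)len + (off_t)block_size - 1,
                           block_size);
    span     = (size_t)(span_end - first);

    buf = malloc(span);
    if (buf == NULL)
        return -1;

    /* Read the covering blocks, patch them, write them back in place. */
    if (lseek(fd, first, SEEK_SET) == (off_t)-1)  goto done;
    if (read(fd, buf, span) != (ssize_t)span)     goto done;
    memcpy(buf + (offset - first), data, len);
    if (lseek(fd, first, SEEK_SET) == (off_t)-1)  goto done;
    if (write(fd, buf, span) != (ssize_t)span)    goto done;
    rc = 0;

done:
    free(buf);
    return rc;
}
```

With a 512-byte block, a 7-byte update at offset 508 crosses a block
boundary, so the sketch transfers two whole blocks (bytes 0 through 1023)
rather than seeking to byte 508.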


-- 
Conor P. Cahill            (703)430-9247        Virtual Technologies, Inc.,
uunet!virtech!cpcahil                           46030 Manekin Plaza, Suite 160
                                                Sterling, VA 22170 

jeff@uf.msc.umn.edu (Jeff Turner) (10/16/90)

In article <> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>For most systems the block size is 512.  For some (like
>a gould powernode) it is 1K.  If you always write it in multiples of 1K
>you satisfy both systems.

Then there are the CRAYs, large IBMs, and others that use 4KB blocks.
The best bet is to use a multiple of BUFSIZ.
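
As a rough illustration of "use a multiple", a transfer size can be rounded
up to the next multiple of whatever unit is chosen.  The helper name
round_up_to is made up for this sketch.  Note that BUFSIZ (from <stdio.h>)
is the stdio buffer size and is not guaranteed to match a given device's
block size, so treating it as the unit is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Round n up to the next multiple of unit (unit must be non-zero). */
static size_t round_up_to(size_t n, size_t unit)
{
    assert(unit > 0);
    return ((n + unit - 1) / unit) * unit;
}

/* Size of a buffer holding one record, padded to a multiple of BUFSIZ. */
static size_t padded_record_size(size_t record_len)
{
    return round_up_to(record_len, (size_t)BUFSIZ);
}
```

For example, round_up_to(70, 512) gives 512 and round_up_to(1024, 512)
gives 1024.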

-Jeff
---
Jeff Turner                  Minnesota Supercomputer Center, Inc.
(612) 626-0544               1200 Washington Avenue South
jeff@uh.msc.umn.edu          Minneapolis, Minnesota  55415

johnk@opel.COM (John Kennedy) (10/17/90)

Here's some numbers from SYNC mode:

real 1m 40.36s
user 0m  0.03s
sys  0m  4.48s

This was running on:

AMI 386-33, 64K cache, 4M ram
ISC 2.2
Adaptec 1542B, Wren V (~650M)

filesystem size was...

/usr      (/dev/dsk/0s3    ):   268880 blocks    52040 i-nodes
/usr           :	 Disk space: 131.28 MB of 273.00 MB available (48.09%).

Hope this helps.  I would like to see some other numbers.

John
-- 
John Kennedy                     johnk@opel.COM
Second Source, Inc.
Annapolis, MD