[comp.sys.ibm.pc] C compiler query

erinadv@gpu.utcs.toronto.edu (Greg Nenych) (05/03/88)

Hello everyone.  I've been using Microsoft C 4.0 since December 1986.
Since then, I've been very happy with the product until recently when
I discovered some rather nasty compiler bugs that I have not been able
to work around.  I also feel that performance could be improved
quite a bit.  Anyway, it's time to upgrade the compiler and I would
appreciate any help and advice I can get from the net.

I won't bother going through all the details of the bugs; it's an old
version and won't be fixed anyway.  Bugs do not bother me too much, as
long as I can work around them in a portable manner.  However, I was
recently shocked when a program I wrote in awk actually ran FASTER than
my C code!  All the compiler people tout how many Dhrystones/Whetstones
they can crunch out but they rarely boast how great their I/O libraries
really are.  If you'll bear with me, perhaps you can compile the code
below and we can compare results using gawk as a common reference.

I wrote a program in awk to pick out columns 7 to 17 in a large database
file.  I was using the MKS Toolkit version 2.2c which, according to
Gerry Wheeler (wheels@mks.UUCP), was compiled with Turbo C (plus some of
their own proprietary libraries). I was happy with the speed but thought
I'd make it even faster by writing it in C.  I was sadly disappointed.
The code generated by Microsoft C (small model, optimized for speed) was
slower!!!

Here are a few benchmarks I devised.  The first creates a 10,000-line
file with 70 characters per line (not counting the CR/LF pairs):

---- MAKETEST.AWK ----  invoke with     awk -f maketest.awk > testfile

BEGIN {
    for (i=0; i < 10000; i++) {
        for (j=0; j < 7; j++) {
            printf("1234567890")
        }

        printf("\n");
    }
    exit(0)
}

---- MAKETEST.C ---- compile & invoke with    maketest > testfile

#include <stdio.h>

#define NOLINES 10000

main()
{
    int i,j;

    for (i=0; i < NOLINES; i++) {
        for (j=0; j < 7; j++)
            printf("1234567890");

        printf("\n");
    }
}

----------
All programs were run three times to make sure the results would be
consistent.  I also provided times for GNU awk since many of you
do not have the MKS stuff.  (Also, I believe gawk was compiled with
MSC 5.0)

               Microsoft C         MKS awk          GNU awk
------------------------------------------------------------
ELAPSED:         7:27.4            11:25.4          10:36.0
ROM time:        0:58.6             0:29.8           0:18.6
DOS time:        3:55.5             0:06.8           0:18.6
User time:       2:33.3            10:48.8           9:25.3

MSC was the leader in this test.  Now for the one that surprised me:

---- GETCOL.AWK ---- invoke with     awk -f getcol.awk testfile > tstfile2

{
print substr($0, 7, 11)
}

---- GETCOL.C ---- Compile & invoke with    getcol testfile > tstfile2

#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <string.h>     /* strlen() */
#define EOSPOS 17
#define START 6

char *progname;

main(argc, argv)
int argc;
char *argv[];
{
    FILE *fp;
    char inbuf[BUFSIZ];

    progname = argv[0];

    if (argc == 1)
        fp = stdin;
    else
        if ((fp = fopen(argv[1], "r")) == NULL) {
            fprintf(stderr, "%s: cannot open %s for read\n", progname, argv[1]);

            exit(1);
        }

    while(fgets(inbuf, BUFSIZ, fp) != NULL) {

        if (strlen(inbuf) < EOSPOS)
            break;

        inbuf[EOSPOS] = '\0';

        printf("%s\n", &inbuf[START]);
    }
    (void) fclose(fp);
    exit(0);
}

               Microsoft C         MKS awk          GNU awk
------------------------------------------------------------
ELAPSED:         3:39.4             3:08.2           4:59.5
ROM time:        0:53.1             0:32.5           0:50.6
DOS time:        1:11.2             0:05.2           0:13.7
User time:       1:35.1             2:30.6           3:55.2
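
One caveat when comparing the two GETCOL versions: they only produce
identical output when every input line has at least 17 characters, as in
the test file above.  GETCOL.C stops at the first shorter line, whereas
awk's substr($0, 7, 11) simply prints whatever part of the range exists.
The following is a minimal sketch of a C filter that follows the awk
behaviour.  It is not part of the benchmark and none of the timings above
apply to it; the names get_columns and filter_columns are made up for
illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define START  6    /* zero-based index of column 7 */
#define EOSPOS 17   /* zero-based index just past column 17 */

/*
 * Copy columns 7..17 of `line` into `out` (capacity `outsz`), in the
 * manner of awk's substr($0, 7, 11).  A trailing CR/LF is not counted
 * as part of the line.  Returns the number of characters copied, which
 * is 0 for lines shorter than 7 characters.
 */
size_t get_columns(const char *line, char *out, size_t outsz)
{
    size_t len, n;

    assert(line != NULL && out != NULL && outsz > 0);

    len = strcspn(line, "\r\n");        /* length without terminator */
    if (len <= START) {
        out[0] = '\0';
        return 0;
    }
    n = (len < EOSPOS ? len : EOSPOS) - START;
    if (n > outsz - 1)                  /* never overrun the caller's buffer */
        n = outsz - 1;
    memcpy(out, line + START, n);
    out[n] = '\0';
    return n;
}

/*
 * Filter every line of `in` to `out`.  Like GETCOL.C, this assumes no
 * input line is longer than BUFSIZ - 1 characters.  Returns the number
 * of lines written.
 */
long filter_columns(FILE *in, FILE *out)
{
    char inbuf[BUFSIZ];
    char field[EOSPOS - START + 1];
    long lines = 0;

    while (fgets(inbuf, sizeof inbuf, in) != NULL) {
        get_columns(inbuf, field, sizeof field);
        fputs(field, out);
        putc('\n', out);
        lines++;
    }
    return lines;
}
```

A main() that opens the file as GETCOL.C does and then calls
filter_columns(fp, stdout) would give a drop-in replacement.  It uses
fputs/putc instead of printf, so it would also not be a like-for-like
test of the printf path measured above.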


I have narrowed my choices down to two.  The first is upgrading to MSC 5.1
(for $225 Canadian), which will give me a super debugging environment to
help me find more compiler bugs :-), a graphics library, OS/2 support, plus
a lot of other nifty things.  However, the upgrade costs almost as much as
a 5.1 compiler off the shelf.  (A good price here is $375.)

My other choice is TURBO C v 1.5 (?) which will set me back about $85
Canadian.  I don't know much about this one but have noticed that more and
more developers of well-known products have made the switch to Turbo C.

At this point, I have not made up my mind as to what I shall do.
Perhaps those of you who are serious C programmers can help.  What I
want is a good compiler that produces tight, clean code, conforms to
ANSI C, has a GOOD UNIX library implementation, supports multiple memory
models, does not tie you to an integrated environment, and has a good
debugging facility (either PD or vendor supported).  OS/2 support is not
really needed but is a nice plus.

I do not want to start a compiler war on the net. (Put away your flame
throwers! :-)  If you have the time and just over 1 Meg of disk space,
please compile and run the (unaltered) code I have provided and also do a
run of the awk code with the gawk that was recently posted.  Although I am
specifically interested in Microsoft C 5.1 and Turbo C, it would be
interesting to see how others do.  PLEASE SEND E-MAIL TO THE ADDRESS BELOW.
I will summarize to the net if I get enough replies.

Thanks in advance.

By the way... you can use this code to fine-tune your disk performance
by altering the BUFFERS=?? in your CONFIG.SYS file.  I managed to cut the
execution time of the second benchmark by 50 - 60 seconds after only a
little bit of trial and error and have already noticed a speedup in the
other applications I run.
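
For anyone who has not touched it before, the setting is a single line
in CONFIG.SYS.  The number below is only a placeholder to show the
syntax, not a recommended value; the best figure depends on your disk
and available memory, so try a few and rerun the benchmark each time.
The change takes effect after a reboot.

```
BUFFERS=20
```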
-- 
Greg Nenych
University of Toronto Computing Services - Erindale College
{allegra,cbosgd,decvax,ihnp4,utcsri,utzoo}!utgpu!utcseri!nenych