erinadv@gpu.utcs.toronto.edu (Greg Nenych) (05/03/88)
Hello everyone. I've been using Microsoft C 4.0 since December 1986.
Since then, I've been very happy with the product until recently when
I discovered some rather nasty compiler bugs that I have not been able
to work around. I also feel that performance could be improved
quite a bit. Anyway, it's time to upgrade the compiler and I would
appreciate any help and advice I can get from the net.
I won't bother going through all the details of the bugs; it's an old
version and won't be fixed anyway. Bugs do not bother me too much, as
long as I can work around them in a portable manner. However, I was
recently shocked when a program I wrote in awk actually ran FASTER than
my C code! All the compiler people tout how many Dhrystones/Whetstones
they can crunch out but they rarely boast how great their I/O libraries
really are. If you'll bear with me, perhaps you can compile the code
below and we can compare results using gawk as a common reference.
I wrote a program in awk to pick out columns 7 to 17 in a large database
file. I was using the MKS Toolkit version 2.2c which, according to
Gerry Wheeler (wheels@mks.UUCP), was compiled with Turbo C (plus some of
their own proprietary libraries). I was happy with the speed but thought
I'd make it even faster by writing it in C. I was sadly disappointed.
The code generated by Microsoft C (small model, optimized for speed) was
slower!!!
Here are a few benchmarks I devised. The first is to create a 10,000
line file with 70 characters/line (not including the CR LF pairs):
---- MAKETEST.AWK ---- invoke with awk -f maketest.awk > testfile
BEGIN {
    for (i=0; i < 10000; i++) {
        for (j=0; j < 7; j++) {
            printf("1234567890")
        }
        printf("\n");
    }
    exit(0)
}
---- MAKETEST.C ---- compile & invoke with maketest > testfile
#include <stdio.h>
#define NOLINES 10000
main()
{
    int i,j;
    for (i=0; i < NOLINES; i++) {
        for (j=0; j < 7; j++)
            printf("1234567890");
        printf("\n");
    }
}
----------
All programs were run three times to make sure the results would be
consistent. I also provided times for GNU awk since many of you
do not have the MKS stuff. (Also, I believe gawk was compiled with
MSC 5.0)
              Microsoft C      MKS awk      GNU awk
------------------------------------------------------------
ELAPSED:         7:27.4        11:25.4      10:36.0
ROM time:        0:58.6         0:29.8       0:18.6
DOS time:        3:55.5         0:06.8       0:18.6
User time:       2:33.3        10:48.8       9:25.3
MSC was the leader in this test. Now for the one that surprised me:
---- GETCOL.AWK ---- invoke with awk -f getcol.awk testfile > tstfile2
{
    print substr($0, 7, 11)
}
---- GETCOL.C ---- Compile & invoke with getcol testfile > tstfile2
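#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <string.h>     /* strlen() */
#define EOSPOS 17       /* 0-based index just past column 17 */
#define START 6         /* 0-based index of column 7 */
char *progname;
main(argc, argv)
int argc;
char *argv[];
{
    FILE *fp;
    char inbuf[BUFSIZ];
    progname = argv[0];
    /* Read from stdin when no file name is given. */
    if (argc == 1)
        fp = stdin;
    else
        if ((fp = fopen(argv[1], "r")) == NULL) {
            fprintf(stderr, "%s: cannot open %s for read\n", progname, argv[1]);
            exit(1);
        }
    /* Print columns 7 to 17 of each line; stop at the first short line. */
    while(fgets(inbuf, BUFSIZ, fp) != NULL) {
        if (strlen(inbuf) < EOSPOS)
            break;
        inbuf[EOSPOS] = '\0';
        printf("%s\n", &inbuf[START]);
    }
    (void) fclose(fp);
    exit(0);
}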
              Microsoft C      MKS awk      GNU awk
------------------------------------------------------------
ELAPSED:         3:39.4         3:08.2       4:59.5
ROM time:        0:53.1         0:32.5       0:50.6
DOS time:        1:11.2         0:05.2       0:13.7
User time:       1:35.1         2:30.6       3:55.2
I have narrowed my choices down to 2: Upgrading to MSC 5.1 (for $225
Canadian) which will give me a super debugging environment to help me find
more compiler bugs :-), a graphics library, OS/2 support, plus a lot of
other nifty things. However, the upgrade costs almost as much as a 5.1
compiler off the shelf. (A good price here is $375.)
My other choice is TURBO C v 1.5 (?) which will set me back about $85
Canadian. I don't know much about this one but have noticed that more &
more developers of well known products have made the switch to Turbo C.
At this point, I have not made up my mind as to what I shall do.
Perhaps those of you who are serious C programmers can help. What I
want is a good compiler that produces tight, clean code, conforms to
ANSI C, has a GOOD UNIX library implementation, supports multiple memory
models, does not tie you to an integrated environment, and has a good
debugging facility (either PD or vendor supported). OS/2 support is not
really needed but is a nice plus.
I do not want to start a compiler war on the net. (Put away your flame
throwers! :-) If you have the time and just over 1 Meg of disk space,
please compile and run the (unaltered) code I have provided and also do a
run of the awk code with the gawk that was recently posted. Although I am
specifically interested in Microsoft C 5.1 and Turbo C, it would be
interesting to see how others do. PLEASE SEND E-MAIL TO THE ADDRESS BELOW.
I will summarize to the net if I get enough replies.
Thanks in advance.
By the way... you can use this code to fine-tune your disk performance
by altering the BUFFERS=?? in your CONFIG.SYS file. I managed to cut the
execution time of the second benchmark by 50 - 60 seconds after only a
little bit of trial and error and have already noticed a speedup in the
other applications I run.
--
Greg Nenych
University of Toronto Computing Services - Erindale College
{allegra,cbosgd,decvax,ihnp4,utcsri,utzoo}!utgpu!utcseri!nenych