erinadv@gpu.utcs.toronto.edu (Greg Nenych) (05/03/88)
Hello everyone. I've been using Microsoft C 4.0 since December 1986. I was very happy with the product until recently, when I discovered some rather nasty compiler bugs that I have not been able to work around. I also feel that performance could be improved quite a bit. Anyway, it's time to upgrade the compiler, and I would appreciate any help and advice I can get from the net.

I won't bother going through all the details of the bugs; it's an old version and won't be fixed anyway. Bugs do not bother me too much, as long as I can work around them in a portable manner. However, I was recently shocked when a program I wrote in awk actually ran FASTER than my C code! All the compiler people tout how many dhrystones/whetstones they can crunch out, but they rarely boast about how great their I/O libraries really are. If you'll bear with me, perhaps you can compile the code below and we can compare results using gawk as a common reference.

I wrote a program in awk to pick out columns 7 to 17 in a large database file. I was using the MKS Toolkit version 2.2c which, according to Gerry Wheeler (wheels@mks.UUCP), was compiled with Turbo C (plus some of their own proprietary libraries). I was happy with the speed but thought I'd make it even faster by writing it in C. I was sadly disappointed. The code generated by Microsoft C (small model, optimized for speed) was slower!!!

Here are a few benchmarks I devised.
The first is to create a 10,000 line file with 70 characters/line (not including the CR LF's).

---- MAKETEST.AWK ---- invoke with awk -f maketest.awk > testfile

BEGIN {
    for (i=0; i < 10000; i++) {
        for (j=0; j < 7; j++) {
            printf("1234567890")
        }
        printf("\n");
    }
    exit(0)
}

---- MAKETEST.C ---- compile & invoke with maketest > testfile

#include <stdio.h>

#define NOLINES 10000

main()
{
    int i,j;

    for (i=0; i < NOLINES; i++) {
        for (j=0; j < 7; j++)
            printf("1234567890");
        printf("\n");
    }
}

----------

All programs were run three times to make sure the results would be consistent. I also provided times for GNU awk since many of you do not have the MKS stuff. (Also, I believe gawk was compiled with MSC 5.0.)

             Microsoft C       MKS awk        GNU awk
------------------------------------------------------------
ELAPSED:       7:27.4          11:25.4        10:36.0
ROM time:      0:58.6           0:29.8         0:18.6
DOS time:      3:55.5           0:06.8         0:18.6
User time:     2:33.3          10:48.8         9:25.3

MSC was the leader in this test. Now for the one that surprised me:

---- GETCOL.AWK ---- invoke with awk -f getcol.awk testfile > tstfile2

{ print substr($0, 7, 11) }

---- GETCOL.C ---- compile & invoke with getcol testfile > tstfile2

#include <stdio.h>

#define EOSPOS 17
#define START 6

char *progname;

main(argc, argv)
int argc;
char *argv[];
{
    FILE *fp;
    char inbuf[BUFSIZ];

    progname = argv[0];
    if (argc == 1)
        fp = stdin;
    else if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "%s: cannot open %s for read", progname, argv[1]);
        exit(1);
    }
    while(fgets(inbuf, BUFSIZ, fp) != NULL) {
        if (strlen(inbuf) < EOSPOS)
            break;
        inbuf[EOSPOS] = '\0';
        printf("%s\n", &inbuf[START]);
    }
    (void) fclose(fp);
    exit(0);
}

             Microsoft C       MKS awk        GNU awk
------------------------------------------------------------
ELAPSED:       3:39.4           3:08.2         4:59.5
ROM time:      0:53.1           0:32.5         0:50.6
DOS time:      1:11.2           0:05.2         0:13.7
User time:     1:35.1           2:30.6         3:55.2

I have narrowed my choices down to 2: Upgrading to MSC 5.1 (for $225 Canadian) which will give me a super debugging environment to help me find more compiler
bugs :-), a graphics library, OS/2 support, plus a lot of other nifty things. However, the upgrade costs almost as much as a 5.1 compiler off the shelf. (A good price here is $375.) My other choice is TURBO C v 1.5 (?) which will set me back about $85 Canadian. I don't know much about this one but have noticed that more & more developers of well known products have made the switch to Turbo C.

At this point, I have not made up my mind as to what I shall do. Perhaps those of you who are serious C programmers can help. What I want is a good compiler that produces tight, clean code, conforms to ANSI C, has a GOOD UNIX library implementation, supports multiple memory models, does not tie you to an integrated environment, and has a good debugging facility (either PD or vendor supported). OS/2 support is not really needed but is a nice plus.

I do not want to start a compiler war on the net. (Put away your flame throwers! :-) If you have the time and just over 1 Meg of disk space, please compile and run the (unaltered) code I have provided and also do a run of the awk code with the gawk that was recently posted. Although I am specifically interested in Microsoft C 5.1 and Turbo C, it would be interesting to see how others do.

PLEASE SEND E-MAIL TO THE ADDRESS BELOW. I will summarize to the net if I get enough replies. Thanks in advance.

By the way... you can use this code to fine-tune your disk performance by altering the BUFFERS=?? in your CONFIG.SYS file. I managed to cut the execution time of the second benchmark by 50 to 60 seconds after only a little bit of trial and error, and have already noticed a speedup in the other applications I run.

-- 
Greg Nenych
University of Toronto Computing Services - Erindale College
{allegra,cbosgd,decvax,ihnp4,utcsri,utzoo}!utgpu!utcseri!nenych