sundar@ai.mit.edu (Sundar Narasimhan) (12/29/90)
Hi: For some time now I've been chasing a bug in a rather large
program of mine. Turns out on every other architecture but the
Pyramid the process size of this program turns out to be half as
it is on the Pyramid. We've tried replacing the default version of
malloc() but it doesn't help.

I used a test program (appended below) to produce the following test
results. The program repeatedly allocates (calls malloc) chunks of a
specified size and then keeps track of the difference between
successive addresses returned, less the chunk size (the "error").

1. Pyramid:

Results: chunksize 200 times 15000 minerror 60 maxerror 60 avg 60.000000
Results: chunksize 220 times 15000 minerror 40 maxerror 40 avg 39.999996
Results: chunksize 240 times 15000 minerror 20 maxerror 20 avg 19.999998
Results: chunksize 250 times 15000 minerror 10 maxerror 10 avg 9.999999
Results: chunksize 255 times 15000 minerror 5 maxerror 5 avg 5.000000
Results: chunksize 256 times 15000 minerror 4 maxerror 4 avg 4.000000
Results: chunksize 257 times 15000 minerror 287 maxerror 287 avg 287.000000
Results: chunksize 260 times 15000 minerror 284 maxerror 284 avg 283.999969
Results: chunksize 280 times 15000 minerror 264 maxerror 264 avg 264.000000
Results: chunksize 300 times 15000 minerror 244 maxerror 244 avg 244.000000
Results: chunksize 1024 times 15000 minerror 32 maxerror 32 avg 32.000000
Results: chunksize 1025 times 15000 minerror 1055 maxerror 1055 avg 1055.000000
Results: chunksize 512 times 15000 minerror 32 maxerror 32 avg 32.000000
Results: chunksize 513 times 15000 minerror 543 maxerror 543 avg 543.000000
Results: chunksize 514 times 15000 minerror 542 maxerror 542 avg 542.000000

2. Sun:

Results: chunksize 220 times 15000 minerror 12 maxerror 8252 avg 12.587772
Results: chunksize 280 times 15000 minerror 8 maxerror 8488 avg 8.565371
Results: chunksize 250 times 15000 minerror 14 maxerror 8326 avg 14.811788
Results: chunksize 200 times 15000 minerror 8 maxerror 8392 avg 8.829922
Results: chunksize 300 times 15000 minerror 12 maxerror 8396 avg 12.634709
Results: chunksize 511 times 15000 minerror 9 maxerror 8601 avg 9.699780
Results: chunksize 512 times 15000 minerror 8 maxerror 8600 avg 8.699780
Results: chunksize 513 times 15000 minerror 15 maxerror 8591 avg 16.087006

Note that on the Pyramid, just past a chunksize of 256 we start
incurring almost 100% overhead. (This exactly explains our
observations with our program -- it is now almost twice as large as
it should be when it runs). For obvious reasons, this is a BAD thing.

I'd appreciate it if someone can suggest an explanation/obvious
fixes. I would include the exact hw/OS version on the Pyramid if I
thought that would help.

And here is the program used to produce the test results. Thanks in
advance for all your help.
-Sundar

--------------------------------
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    char **test;
    int chunksize;
    int i, j, notimes, count;
    long error, minerror = 0, maxerror = 0, toterror = 0;
    float avg;

    if (argc != 3) {
        fprintf(stderr, "usage: memtest chunksize notimes\n");
        exit(1);
    }

    chunksize = atoi(argv[1]);
    notimes = atoi(argv[2]);

    if (chunksize <= 0) {
        fprintf(stderr, "chunksize must be > 0\n");
        exit(1);
    }
    if (notimes <= 1) {
        fprintf(stderr, "notimes must be > 1\n");
        exit(1);
    }

    /* Table holding the address returned by each allocation. */
    if ((test = (char **)calloc(notimes, sizeof(char *))) == NULL) {
        fprintf(stderr, "first calloc failed\n");
        exit(1);
    }

    for (i = 0; i < notimes; i++) {
        if ((test[i] = (char *)malloc(sizeof(char) * chunksize)) == NULL) {
            fprintf(stderr, "%d'th malloc failed (chsize=%d)\n", i, chunksize);
            break;
        }
    }
    if (i < 2) {
        fprintf(stderr, "need at least two successful allocations\n");
        exit(1);
    }

    /* error = gap between consecutive chunks beyond the requested size */
    for (j = 1, count = 0; j < i; j++) {
        error = (long)(test[j] - test[j-1]) - chunksize;
        count++;
        if (j == 1 || error < minerror) minerror = error;
        if (j == 1 || error > maxerror) maxerror = error;
        toterror += error;
    }
    avg = (float)toterror / (float)count;

    printf("Results: chunksize %d times %d minerror %ld maxerror %ld avg %f\n",
           chunksize, notimes, minerror, maxerror, avg);
    printf("waiting...\n");
    getchar();    /* hold the process open, e.g. to inspect its size */
    return 0;
}
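
The Pyramid figures above happen to fit a simple size-class model, and
a short sketch makes the pattern easier to see. The model here is an
assumption fitted to the posted numbers only, not something taken from
Pyramid's malloc() source: each request is rounded up to the next power
of two, with 4 extra bytes for blocks up to 256 bytes and 32 extra
bytes for larger ones. The function names (next_pow2, model_gap,
model_error) are hypothetical, chosen for illustration.

```c
#include <stddef.h>

/* Smallest power of two that is >= n (for n > 0). */
size_t next_pow2(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * ASSUMED model, fitted to the posted Pyramid results and nothing else:
 * distance between consecutive chunks = power-of-two block size plus
 * 4 bytes (blocks up to 256) or 32 bytes (larger blocks).
 */
size_t model_gap(size_t chunksize)
{
    size_t block = next_pow2(chunksize);
    return block + (block <= 256 ? 4 : 32);
}

/* The "error" figure memtest would print if the model held. */
long model_error(size_t chunksize)
{
    return (long)(model_gap(chunksize) - chunksize);
}
```

Under this model a 256-byte request costs 260 bytes, while a 257-byte
request costs 544, which is where the jump in minerror from 4 to 287
comes from. The same rule reproduces the other posted Pyramid values
(for example 543 at chunksize 513 and 1055 at chunksize 1025).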
romain@salt.pyramid.com (Romain Kang) (01/04/91)
| Hi: For some time now I've been chasing a bug in a rather large
| program of mine. Turns out on every other architecture but the
| Pyramid the process size of this program turns out to be half as
| it is on the Pyramid.

The default version of malloc() attempts to align "large" chunks of
memory for optimal hardware performance. Recognizing situations like
yours, there is a special libmalloc.a in the att universe that
attempts different space/time tradeoffs. (Both ucb and the att default
malloc() are variants of the Caltech powers-of-two fast allocator.
What was the alternate malloc() you were using?)

Unfortunately for you, libmalloc.a is not currently supported in the
ucb universe. If you need it, call Customer Support. In the mean
time, the following makefile will build a ucb-usable version of att
libmalloc.a (and don't tell anyone you got it from me).

Re-running your Sun numbers, I get:

Results: chunksize 220 times 15000 minerror 4 maxerror 4 avg 4.000000
Results: chunksize 280 times 15000 minerror 4 maxerror 8 avg 4.029335
Results: chunksize 250 times 15000 minerror 6 maxerror 6 avg 6.000000
Results: chunksize 200 times 15000 minerror 4 maxerror 12 avg 4.799253
Results: chunksize 300 times 15000 minerror 4 maxerror 4 avg 4.000000
Results: chunksize 511 times 15000 minerror 5 maxerror 13 avg 5.031735
Results: chunksize 512 times 15000 minerror 4 maxerror 12 avg 4.031735
Results: chunksize 513 times 15000 minerror 7 maxerror 15 avg 7.126409

========================================================================
#
# Use att libmalloc.a in the ucb universe
#
M_OBJS = malloc.r mem.r assert.o

libmalloc.a: $(M_OBJS)
	ar r $@ $(M_OBJS)
	ranlib $@

malloc.r:
	att ld -r -o $@ -u _malloc -lmalloc

mem.r:
	att ld -r -o $@ -u _memcpy -u _memset -lc

#
# Can't ld -u __assert, because it would pull in abort(),
# which pulls in att stdio, which is incompatible with ucb <stdio.h>
#
assert.o: /.attlib/libc.a
	att ar x /.attlib/libc.a $@

mtest: mtest.o libmalloc.a
	cc -o mtest mtest.o libmalloc.a

clean:
	rm -f $(M_OBJS) mtest mtest.o

clobber:
	rm -f $(M_OBJS) libmalloc.a

--
"Eggheads unite! You have nothing to lose but your yolks!"
	-Adlai Stevenson
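
For scale, the figures posted in this thread can be expressed as a
percentage of the bytes actually requested. The sketch below only does
that arithmetic on the posted numbers, for the chunk sizes that appear
in both the Pyramid default malloc() results and the libmalloc.a
re-run. The names overhead_percent, struct row, posted and
print_overheads are hypothetical, chosen for illustration.

```c
#include <stdio.h>

/* Overhead as a percentage of the bytes actually requested. */
double overhead_percent(double error, double chunksize)
{
    return 100.0 * error / chunksize;
}

/* One row per chunk size present in both sets of posted results. */
struct row {
    int chunksize;
    double default_error;    /* Pyramid default malloc(), first post */
    double libmalloc_error;  /* avg from the libmalloc.a re-run */
};

const struct row posted[] = {
    {200,  60.0, 4.799253},
    {220,  40.0, 4.000000},
    {250,  10.0, 6.000000},
    {280, 264.0, 4.029335},
    {300, 244.0, 4.000000},
    {512,  32.0, 4.031735},
    {513, 543.0, 7.126409},
};

/* Print both overhead percentages for every posted chunk size. */
void print_overheads(void)
{
    size_t i;

    for (i = 0; i < sizeof posted / sizeof posted[0]; i++)
        printf("chunksize %d: default %.1f%%, libmalloc %.1f%%\n",
               posted[i].chunksize,
               overhead_percent(posted[i].default_error, posted[i].chunksize),
               overhead_percent(posted[i].libmalloc_error, posted[i].chunksize));
}
```

At chunksize 513, for instance, the default allocator's 543 wasted
bytes come to roughly 106% of the request, against about 1.4% in the
libmalloc.a re-run.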