corwin@bsu-cs.UUCP (Paul Frommeyer) (05/09/87)
Sorry for multiple newsgroups, and major apologies if someone has already noticed this bug (we but recently joined Usenet), but this seems serious:

Several other students and we were working on a software engineering project involving the display of 39 or more colors on a VT240. We were writing in VAX C under VMS 4.5. Our program used several lists to keep track of all the data it needed for window size, current display, etc. Our lists were interconnected, and in setting up list elements for new windows we used extensive pointer indirection in calls to malloc(). What we found was, to say the least, interesting.

Third-level pointer indirection, such as list1->link2->link3->data, worked fine in a call such as

    list1->link2->link3 = (struct lista *) malloc (sizeof(struct lista));

Analysis in VAX DEBUG would reveal that link3 had a value, say, of 55080. Now, if we attempted another call to malloc like

    pointer1->link2->link3 = (struct lista *) malloc... etc.

there was no problem. This would give link3 a value of, say, 55200 if struct lista was 120 bytes long. But at one point we needed to do

    pointer1->ptr2->ptr3->ptr4->ptr5->ptr6 = (struct lista *)... etc.

When we tried this, all hell broke loose! Malloc calls the system service for allocating more virtual memory, in addition to performing some VAX C housekeeping chores. Now, we would expect ptr6 to have a value of 55200, assuming malloc was giving contiguous memory, which it seems to do. What we never expected was to get a value BETWEEN 55080 and 55200. But that's exactly what happened! When malloc was called with more than 3 levels of pointer indirection, why, it just crashed! It would give a value usually about 4 bytes more than the last byte for the structure pointed to by link3, say about 55084. Malloc was overlapping virtual memory!! It took one of our team members a day to figure out what was going wrong! Nowhere in any of the VAX C manuals that we read does it caution against this!
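
To make the pattern concrete, here is a minimal sketch in present-day C of the three-level allocation described above, together with a helper for testing whether two allocated blocks overlap. The layout of struct lista and the names new_node, build_chain and blocks_overlap are assumptions for illustration only; the original program's declarations are not shown, and this sketch does not claim to reproduce the reported behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node layout; the real struct lista is not shown in the post. */
struct lista {
    struct lista *link2;
    struct lista *link3;
    int data;
};

/* Allocate one node and clear its links so they never hold garbage. */
struct lista *new_node(void)
{
    struct lista *n = (struct lista *) malloc(sizeof(struct lista));
    if (n != NULL) {
        n->link2 = NULL;
        n->link3 = NULL;
        n->data = 0;
    }
    return n;
}

/* Return 1 if the byte ranges [a, a+size) and [b, b+size) overlap. */
int blocks_overlap(const void *a, const void *b, size_t size)
{
    uintptr_t pa = (uintptr_t) a;
    uintptr_t pb = (uintptr_t) b;
    assert(size > 0);
    return pa < pb + size && pb < pa + size;
}

/* Build list1->link2->link3 one level at a time.
   Returns the head node, or NULL if any allocation fails. */
struct lista *build_chain(void)
{
    struct lista *list1 = new_node();
    if (list1 == NULL)
        return NULL;
    list1->link2 = new_node();
    if (list1->link2 == NULL) {
        free(list1);
        return NULL;
    }
    list1->link2->link3 = new_node();
    if (list1->link2->link3 == NULL) {
        free(list1->link2);
        free(list1);
        return NULL;
    }
    return list1;
}
```

Calling blocks_overlap(h->link2, h->link2->link3, sizeof(struct lista)) on a chain h returned by build_chain() gives a direct yes/no answer to the question we were reading off by hand in the debugger.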

Noncontiguous allocation might be expected, but overlapping addresses was definitely dirty pool on the part of the computer. After much discussion we concluded that the fault lay in the compiler; it was not generating correct VAX-11 native code to do multiple indirection. We don't know what it IS doing, really, but we can see that it doesn't work properly. It doesn't seem that the fault could be at the system service level, but it is impossible to tell because we cannot see the generated object code (our installation does not possess a disassembler).

Digital might say this is a feature to keep programmers from using excessive indirection, but what, pray tell, is wrong with excessive indirection, especially when it allows manipulation of many interconnected lists? Is there anyone out there who has encountered a similar problem? Does Unix C do this, too? We thought it might be the VAX hardware, but further thought seemed to point to the compiler.

There are already more bugs than a bait store in the VAX C I/O library routines for fseek() and lseek(), but those are at least documented in the VAX C manual. Would anyone at Digital care to comment on this? Are there any fixes? We're both graduating and job hunting, so getting hold of us may be tricky.

Anyway, beware deep pointer indirection in VAX C. It's a killer.

Paul Frommeyer ({ihnp4,seismo}!{pur-ee,iuvax}!corwin)
Russ Walker ({ihnp4,seismo}!{pur-ee,iuvax}!mnp)
Computer Science Dept.
Ball State University
Muncie, IN 47306

corwin@bsu-cs.UUCP (Paul Frommeyer) (05/11/87)
(Sorry, had to move this from comp.lang.c++ to comp.lang.c and repost...)

Several other students and we were working on a software engineering project involving the display of 39 or more colors on a VT240. We were writing in VAX C under VMS 4.5. Our program used several lists to keep track of all the data it needed for window size, current display, etc. Our lists were interconnected, and in setting up list elements for new windows we used extensive pointer indirection in calls to malloc(). What we found was, to say the least, interesting.

Third-level pointer indirection, such as list1->link2->link3->data, worked fine in a call such as

    list1->link2->link3 = (struct lista *) malloc (sizeof(struct lista));

Analysis in VAX DEBUG would reveal that link3 had a value, say, of 55080. Now, if we attempted another call to malloc like

    pointer1->link2->link3 = (struct lista *) malloc... etc.

there was no problem. This would give link3 a value of, say, 55200 if struct lista was 120 bytes long. But at one point we needed to do

    pointer1->ptr2->ptr3->ptr4->ptr5->ptr6 = (struct lista *)... etc.

When we tried this, all hell broke loose! Malloc calls the system service for allocating more virtual memory, in addition to performing some VAX C housekeeping chores. Now, we would expect ptr6 to have a value of 55200, assuming malloc was giving contiguous memory, which it seems to do. What we never expected was to get a value BETWEEN 55080 and 55200. But that's exactly what happened! When malloc was called with more than 3 levels of pointer indirection, why, it just crashed! It would give a value usually about 4 bytes more than the last byte for the structure pointed to by link3, say about 55084. Malloc was overlapping virtual memory!! It took one of our team members a day to figure out what was going wrong! Nowhere in any of the VAX C manuals that we read does it caution against this!
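
One precondition of the six-level assignment is worth spelling out: in C, every pointer to the left of the final -> must already point at an allocated node before the assignment is made. The sketch below, in present-day C, walks a chain and allocates any node that is missing along the way. The struct node layout (a single `next` link standing in for ptr2 through ptr6) and the name ensure_depth are assumptions made for illustration; nothing here is taken from the original program, and whether this precondition had any bearing on the behaviour reported above cannot be determined from the post.

```c
#include <stdlib.h>

/* Hypothetical node: a single `next` link stands in for ptr2..ptr6. */
struct node {
    struct node *next;
    int data;
};

/* Walk `depth` links from head, allocating any node that is missing.
   Returns the node reached, or NULL if head is NULL or malloc fails. */
struct node *ensure_depth(struct node *head, int depth)
{
    struct node *cur = head;
    int i;

    for (i = 0; cur != NULL && i < depth; i++) {
        if (cur->next == NULL) {
            cur->next = (struct node *) malloc(sizeof(struct node));
            if (cur->next != NULL) {
                /* Initialise the new node so later walks see a clean end. */
                cur->next->next = NULL;
                cur->next->data = 0;
            }
        }
        cur = cur->next;
    }
    return cur;
}
```

With a head node in hand, ensure_depth(&head, 5) returns the node five links down, creating the intermediate nodes only if they do not exist yet; a second call with the same arguments returns the same node without allocating again.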

Noncontiguous allocation might be expected, but overlapping addresses was definitely dirty pool on the part of the computer. After much discussion we concluded that the fault lay in the compiler; it was not generating correct VAX-11 native code to do multiple indirection. We don't know what it IS doing, really, but we can see that it doesn't work properly. It doesn't seem that the fault could be at the system service level, but it is impossible to tell because we cannot see the generated object code (our installation does not possess a disassembler).

Digital might say this is a feature to keep programmers from using excessive indirection, but what, pray tell, is wrong with excessive indirection, especially when it allows manipulation of many interconnected lists? Is there anyone out there who has encountered a similar problem? Does Unix C do this, too? We thought it might be the VAX hardware, but further thought seemed to point to the compiler.

There are already more bugs than a bait store in the VAX C I/O library routines for fseek() and lseek(), but those are at least documented in the VAX C manual. A limitation on indirection should be, too. Would anyone at Digital care to comment on this? Are there any fixes? We're both graduating and job hunting, so getting hold of us may be tricky.

Anyway, beware deep pointer indirection in VAX C. It's a killer.

Paul Frommeyer ({ihnp4,seismo}!{pur-ee,iuvax}!corwin)
Russ Walker ({ihnp4,seismo}!{pur-ee,iuvax}!mnp)
Computer Science Dept.
Ball State University
Muncie, IN 47306