taylorr@glycine.cs.unc.edu (Russell Taylor) (06/07/90)
We are running OS 3.2.2 on an IRIS 4D/240GTX. I ran a program and got the proverbial 'Bus error (core dumped)' message. The catch is that when I run dbx and look for the error, it tells me that the error occurred IN malloc():

...Source (of malloc.c) not available...

There are several calls to malloc() in the code. There have been successful calls before this call is made. All calls are passed constant references, and this code compiles and runs correctly on a variety of other machines (VAX, sun 4, DecStation).

Is there a known bug (and hopefully fix) for this?

Thanks,
Russell Taylor
taylorr@cs.unc.edu
cycy@isl1.ri.cmu.edu (Scum) (06/08/90)
In article <14525@thorin.cs.unc.edu>, taylorr@glycine.cs.unc.edu (Russell Taylor) writes:
| We are running OS 3.2.2 on an IRIS 4D/240GTX. I ran a program and
| got the proverbial 'Bus error (core dumped)' message. The catch is that
| when I run dbx and look for the error, it tells me that the error occurred
| IN malloc():
| There are several calls to malloc() in the code. There have been
| successful calls before this call is made. All calls are passed constant
| references, and this code compiles and runs correctly on a variety of other
| machines (VAX, sun 4, DecStation).
|
| Is there a known bug (and hopefully fix) for this?

Try linking with the malloc library. Just use -lmalloc as an argument when you are linking; this will provide an alternative version of malloc which seems to work better. This has solved the problem for us in similar cases. Good luck.

--
-- Chris. (cycy@isl1.ri.cmu.edu)
"People make me pro-nuclear." -- Margarette Smith
paquette@cpsc.ucalgary.ca (Trevor Paquette) (06/11/90)
In article <14525@thorin.cs.unc.edu>, taylorr@glycine.cs.unc.edu (Russell Taylor) writes:
>
> We are running OS 3.2.2 on an IRIS 4D/240GTX. I ran a program and
> got the proverbial 'Bus error (core dumped)' message. The catch is that
> when I run dbx and look for the error, it tells me that the error occurred
> IN malloc():
>
> ....Source (of malloc.c) not available...
>
> There are several calls to malloc() in the code. There have been
> successful calls before this call is made. All calls are passed constant
> references, and this code compiles and runs correctly on a variety of other
> machines (VAX, sun 4, DecStation).
>
> Is there a known bug (and hopefully fix) for this?
>
> Thanks,
> Russell Taylor
> taylorr@cs.unc.edu

In the files that use malloc, add the following:

#include <malloc.h>

Then, when compiling, add '-lmalloc' to your list of libraries. I have had this problem before and this cleared it up.

Note: this does not 'fix' the problem; you are now using a different malloc.

Trev
--
Trevor Paquette    ICBM: 51'03"N/114'05"W
{ubc-cs,utai,alberta}!calgary!paquette
paquette@cpsc.ucalgary.ca
No man is a failure who has friends
I accept the challenge, body and soul, to seek the knowledge of the ones of old
 - engraved on the Kersa Blade of Esalon
yohn@tumult.asd.sgi.com (Mike Thompson) (06/12/90)
In article <14525@thorin.cs.unc.edu>, taylorr@glycine.cs.unc.edu (Russell Taylor) writes:
>
> We are running OS 3.2.2 on an IRIS 4D/240GTX. I ran a program and
> got the proverbial 'Bus error (core dumped)' message. The catch is that
> when I run dbx and look for the error, it tells me that the error occurred
> IN malloc():
>
> ...Source (of malloc.c) not available...
>
> There are several calls to malloc() in the code. There have been
> successful calls before this call is made. All calls are passed constant
> references, and this code compiles and runs correctly on a variety of other
> machines (VAX, sun 4, DecStation).
>
> Is there a known bug (and hopefully fix) for this?
>
> Thanks,
> Russell Taylor
> taylorr@cs.unc.edu

I cannot guarantee that there are no bugs in malloc (I assume you are getting malloc from libc), but I don't know of any (besides performance problems when allocating many memory areas). But I have seen many, many user programs that bomb in malloc because the user code overran the memory allocated by a call to malloc. Calling malloc(strlen(s)) and then copying s is a classic way to get into trouble (the user forgets that strlen does not account for the trailing null character) -- there are many other possibilities.

Since malloc(3X) -- the malloc in /usr/lib/libmalloc.a -- aligns requests to eight-byte boundaries and malloc(3C) aligns only to four bytes, switching to libmalloc may help, if only in that it gives the caller a little more unrequested rounding space, which masks the overrun. This may be what's happening with the malloc calls on your Vaxen, etc.

Now if your program does make many calls to malloc, it is usually best to link with libmalloc. The two mallocs do have slightly different behavior -- libmalloc will return a null pointer when asked for zero bytes and will ignore a null pointer on free; libc malloc will not touch the just-freed space until (at least) the next call to malloc/free. Usually these behaviors are not a concern.

Mike Thompson
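A minimal sketch of the strlen() pitfall described above, and the usual way around it. The helper name copy_string is hypothetical and does not come from the thread; it only illustrates allocating one extra byte for the terminator.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the thread): return a heap copy of s,
 * or NULL if the allocation fails. The caller must free() the result.
 */
char *copy_string(const char *s)
{
    size_t len = strlen(s);        /* length WITHOUT the trailing '\0' */
    char *copy = malloc(len + 1);  /* +1 leaves room for the '\0' */

    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len + 1);      /* copies the terminator as well */
    return copy;
}
```

Writing malloc(len) here instead would leave the copy one byte short, so the terminator would land just past the block, which is the kind of overrun that can damage the allocator's bookkeeping and surface later as a crash inside malloc().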
dwatts@ki.UUCP (Dan Watts) (06/12/90)
In article <1990Jun10.211156.16153@calgary.uucp> paquette@cpsc.ucalgary.ca (Trevor Paquette) writes:
>In article <14525@thorin.cs.unc.edu>, taylorr@glycine.cs.unc.edu (Russell Taylor) writes:
>>
>> We are running OS 3.2.2 on an IRIS 4D/240GTX. I ran a program and
>> got the proverbial 'Bus error (core dumped)' message. The catch is that
>> when I run dbx and look for the error, it tells me that the error occurred
>> IN malloc():
> < stuff deleted >
> Note: this does not 'fix' the problem.. you are now using a different
> malloc.

My experience has been that this error is caused by a program writing outside the bounds of a malloc'd area. This causes the hidden memory management headers to get corrupted.

A quick hack would be to add some constant pad to all mallocs. Try adding 128 bytes and see if that does it. Since my mistakes are usually in writing one byte too much, a pad of 16 works ok. Note that this hasn't solved the code problem, it's just defensive programming. You might also notice failures in printf() for the same reason.

I usually track this down by putting calls to malloc() in other places in the code and trying to find the _bad_ code by seeing which ones work and which ones die (remember to free() the malloc()'d memory just after getting it).

--
Dan Watts, Ki Research, Inc.
CompuServe: >INTERNET:uunet.UU.NET!ki!dwatts
UUCP: ...!uunet!ki!dwatts
New Dimensions In Network Connectivity
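The padding hack above might look roughly like this. The names padded_malloc and MALLOC_PAD are hypothetical, and the pad size of 16 is just the value mentioned in the post; as the post says, this only hides a small overrun and does not fix the code that causes it.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pad size; the post suggests trying 16 or 128 bytes. */
#define MALLOC_PAD 16

/*
 * Hypothetical wrapper (not from the thread): allocate size bytes plus
 * MALLOC_PAD spare bytes, so that a small overrun lands in the pad
 * instead of in the allocator's own bookkeeping.
 */
void *padded_malloc(size_t size)
{
    if (size > SIZE_MAX - MALLOC_PAD)  /* avoid wrapping around */
        return NULL;
    return malloc(size + MALLOC_PAD);
}
```

If the crash disappears once every malloc() call goes through such a wrapper, that is a strong hint that the program writes past the end of one of its blocks.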
swed@aerospace.aero.org (Gregory D. Swedberg) (06/12/90)
Are you ever calling realloc? The IRIS does not seem to implement it correctly: when reallocing to a larger size, it seems to just return the original pointer rather than a pointer to a new, larger block. I have had to give up on realloc on the IRIS.
mds@mds.sgi.com (Mark D. Stadler) (06/13/90)
In article <62083@sgi.sgi.com> yohn@tumult.asd.sgi.com (Mike Thompson) writes:
>In article <14525@thorin.cs.unc.edu>, taylorr@glycine.cs.unc.edu (Russell Taylor) writes:
>>
>> We are running OS 3.2.2 on an IRIS 4D/240GTX. I ran a program and
>> got the proverbial 'Bus error (core dumped)' message. The catch is that
>> when I run dbx and look for the error, it tells me that the error occurred
>> IN malloc():
>> ...
>> There are several calls to malloc() in the code. There have been
>> successful calls before this call is made. All calls are passed constant
>> references, and this code compiles and runs correctly on a variety of other
>> machines (VAX, sun 4, DecStation).
>> ...
>> Is there a known bug (and hopefully fix) for this?
>
>I cannot guarantee that there are no bugs in malloc (I assume you are
>getting malloc from libc), but I don't know of any (besides performance
>problems when allocating many memory areas). But I have seen many,
>many user programs that bomb in malloc because the user code overran
>the memory allocated by a call to malloc. malloc(strlen(s)) and
>copying s is a classic way to get into trouble (user forgets that
>strlen does not account for the trailing null character) -- there are
>many other possibilities.
>
>Since malloc(3X) -- the malloc in /usr/lib/libmalloc.a -- aligns
>requests to eight-byte boundaries and malloc(3C) aligns only to
>four bytes, switching to libmalloc may help, if only in that it gives
>the caller a little more unrequested rounding space, which masks the overrun.
>

i've examined a number of malloc() problems throughout the last 7 years or so, and have always traced the problem back to the application... there are a couple of good reasons that malloc() usage problems are masked on a machine and libmalloc basis.

first of all, i know that a number of VMS programs have malloc problems once they are ported to unix. the VMS malloc rounds the request up to the nearest multiple of 512 (page size). then it skips the next virtual page. this turns out to be a great debug tool since you get core dumps when you hit the next page instead of quietly corrupting some other data structure. unfortunately, the granularity is only at the page level, so small problems are masked and only surface in other environments. VAX unix may act similarly, but i don't know for sure.

the traditional libc malloc approach uses a linked list scheme where the next pointers are embedded in the memory arena. if you overwrite a chunk of malloced memory, you corrupt the linked list and the next call to malloc() will traverse into the boonies.

the libmalloc approach keeps the pointers into the memory arena in a separate area and therefore, if you overwrite a chunk of malloced memory, you may corrupt some other data structure that doesn't really matter anyway... (at least not at the time). since the next pointers are saved from corruption, malloc() won't dump core. but you still have a problem lurking out there somewhere.

i think i'd stick to the old malloc() and narrow the problem down more. if you mask this symptom, you will make it even more difficult to isolate a problem further down the road.

--
mds [aka Mark D Stadler mds@sgi.com ...!uunet!sgi!mds (415)335-1327]
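One way to narrow such a problem down, rather than mask it, is to put known guard bytes after each block and check them later. The sketch below is a swapped-in illustration of that general technique, not the IRIX, VMS, or libmalloc implementation; every name in it (guarded_malloc, guarded_check, guarded_free, GUARD_LEN, GUARD_BYTE) is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8      /* hypothetical number of guard bytes */
#define GUARD_BYTE 0xA5   /* hypothetical fill pattern */

/* Header kept in front of each block; the union keeps user data aligned. */
typedef union {
    size_t size;          /* size the caller asked for */
    long double ld;
    long long ll;
    void *ptr;
} guard_header;

/* Allocate size bytes followed by GUARD_LEN bytes of GUARD_BYTE. */
void *guarded_malloc(size_t size)
{
    guard_header *h;

    if (size > SIZE_MAX - sizeof *h - GUARD_LEN)
        return NULL;
    h = malloc(sizeof *h + size + GUARD_LEN);
    if (h == NULL)
        return NULL;
    h->size = size;
    memset((unsigned char *)(h + 1) + size, GUARD_BYTE, GUARD_LEN);
    return h + 1;
}

/* Return 1 if the guard bytes after p are intact, 0 if they were overwritten. */
int guarded_check(const void *p)
{
    const guard_header *h = (const guard_header *)p - 1;
    const unsigned char *guard = (const unsigned char *)p + h->size;
    size_t i;

    for (i = 0; i < GUARD_LEN; i++)
        if (guard[i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Release a block obtained from guarded_malloc(). */
void guarded_free(void *p)
{
    if (p != NULL)
        free((guard_header *)p - 1);
}
```

Calling guarded_check() at suspect points in the program (or just before guarded_free()) reports an overrun close to where it happened, instead of at some later, unrelated malloc() call. It only catches writes that land in the guard bytes directly after a block.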
krk@cs.purdue.EDU (Kevin Kuehl) (06/13/90)
In article <789@ki.UUCP> dwatts@ki.UUCP (Dan Watts) writes:
>in other places in the code and try to find the _bad_ code by seeing which
>ones work, and which ones die (note to free() the malloc()'d mem just

Another thing you can do (if you are fortunate enough to have access to a Sun4) is to use the `malloc_debug(2)' on a Sun4. This is one of the greatest tools I have ever used. On every call to malloc, it checks the heap and verifies that it is not corrupted. If it is corrupted, the program dumps core so you can find it.

This would be great to have under Irix, don't you think? Whatd'ya say at SGI? I would really appreciate this feature.

Kevin
krk@cs.purdue.edu
..!{decwrl,ucbvax,gatech}!purdue!krk
vjs@rhyolite.wpd.sgi.com (Vernon Schryver) (06/13/90)
In article <10822@medusa.cs.purdue.edu>, krk@cs.purdue.EDU (Kevin Kuehl) writes:
> In article <789@ki.UUCP> dwatts@ki.UUCP (Dan Watts) writes:
>
> Another thing you can do (if you are fortunate enough to have access
> to a Sun4) is to use the `malloc_debug(2)' on a Sun4. This is one of
> the greatest tools I have ever used. On every call to malloc, it
> checks the heap and verifies that it is not corrupted [...]
>
> This would be great to have under Irix, don't you think? Whatd'ya say
> at SGI? I would really appreciate this feature.

We've shipped versions of malloc that did this in the distant past. I don't think the malloc(3) shipped on 4D's in the last 2-3 years had this.

However, consider the mallopt() function and the following paragraph, cut from a window displaying the IRIX 3.3 malloc(3X) man page:

| M_DEBUG  Turns debug checking on if value is not equal to 0, otherwise
|          turns debug checking off. When debugging is on, each call to
|          malloc and free causes the entire malloc arena to be scanned and
|          checked for consistency. This option may be invoked at any
|          time. Note that when debug checking is on, the performance of
|          malloc is reduced considerably.

There have been internal discussions about possibly enhancing this feature in a future release to make it slower and more paranoid. (This would be good.)

If you complain enough, you might convince the powers that be to ship some neat "memory-leak" tools developed in the window system wars.

It is worth noting that the semantics of libmalloc and ancient malloc differ slightly. Particularly sloppy code has trouble with libmalloc.

Vernon Schryver
vjs@sgi.com
taylorr@pooh.cs.unc.edu (Russell Taylor) (06/13/90)
I traced the problem down by using the Saber-C product we recently got for our Suns. The problem (as many people responded) was that I was doing strange things to memory that had been gotten via calls to malloc(). The Saber-C environment checked for the strangeness and showed me right where it was happening. Once I fixed it, the problem went away. Thank you all for your suggestions! Russell Taylor
yohn@tumult.asd.sgi.com (Mike Thompson) (06/14/90)
In article <75367@aerospace.AERO.ORG>, swed@aerospace.aero.org (Gregory D. Swedberg) writes:
>
> Are you ever calling realloc? The IRIS does not seem to
> implement it correctly, when reallocing to a larger size it seems
> to just return the original pointer rather than a pointer to a new
> larger block. I have had to give up on realloc on the IRIS.

Are you implying that realloc isn't returning a large enough buffer for your (new) request?

The whole purpose of realloc is to avoid copying data around whenever possible. To that end, realloc will check to see if it can grow the current buffer to satisfy the request and just pass back the same (grown) buffer. If there isn't enough room to grow the current buffer, a new buffer will be allocated, the data copied, and the old buffer released.

If you think that realloc is returning the same buffer and it hasn't grown the buffer adequately, please call the customer support hot line immediately with details.

Mike Thompson
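Because realloc may either grow the block in place or move it, portable code should not depend on which of the two happens. A minimal sketch of the usual calling pattern, using a hypothetical helper named grow_buffer that is not from the thread:

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not from the thread): resize *bufp to new_size
 * bytes. Returns 0 on success, -1 on failure. On failure the old buffer
 * is still valid and *bufp is unchanged.
 */
int grow_buffer(char **bufp, size_t new_size)
{
    char *tmp;

    if (new_size == 0)      /* zero-size realloc is not portable */
        return -1;

    tmp = realloc(*bufp, new_size);
    if (tmp == NULL)
        return -1;          /* old block is untouched */

    *bufp = tmp;            /* may or may not equal the old pointer */
    return 0;
}
```

Getting the same pointer back after growing is therefore normal behaviour, not evidence of a bug. The real hazard is keeping other copies of the old pointer, which become invalid whenever the block does move.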