eric@snark.UUCP (Eric S. Raymond) (10/08/88)
(Cross-posted to comp.lang.c because the point made has more general
relevance)

In <4025@pbhyf.pacbell.com>, rob@pbhyf.PacBell.COM (Rob Bernardo) writes:
> I'm one of the elm 2.2 developers and I've run across a nasty
> bug that has existed in elm since the 1.5 version (at least).
> [describes stepping through looking for a buffer corruption point]
> What makes this even more bizarre is that different pieces of evidence
> point in different directions for the location of the overrun array:

Overrun static buffers don't tend to produce quite this level of confusion;
the bug search strategy you described would have caught the problem, I think,
if that's what was going on. An overrun *automatic* buffer, on the other
hand, can trash stack frames, leaving just this kind of apparently
contradictory evidence -- because the pc can land in the middle of a
buffer-altering routine on return from the enclosing subroutine. It's amazing
how often this results in silent corruption and later bugs, instead of a nice
clean core dump.

If you have sdb, get to a point where the bug is presenting, then do a check
on each automatic buffer in a routine above your break point on the execution
stack. Odds are you'll find one full...

This is the most insidious kind of C bug there is, bar none. Even malloc()
braindamage isn't as nasty, because you can replace malloc() with a version
that does more checking. I've learned the hard way to check on this whenever
I see a bug involving magic changes in buffers within routines that shouldn't
access them.
-- 
Eric S. Raymond (the mad mastermind of TMN-Netnews)
UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP
Post: 22 S. Warren Avenue, Malvern, PA 19355
Phone: (215)-296-5718
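
The check on automatic buffers described in the post can also be built into
the code instead of done by hand in the debugger. What follows is only a
minimal sketch of one way to do that, not anything taken from elm or from the
post: the buffer is given guard zones on both sides, and the guards are tested
afterwards. Every name here (guarded_buf, guarded_init, guarded_data,
guarded_intact) and the sizes and fill byte are assumptions made up for the
illustration. The guards and the usable area share one array, so an overrun in
the demonstration lands in a guard zone and not in a neighbouring stack frame.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GUARD_LEN  8     /* guard bytes on each side (arbitrary choice) */
#define USABLE_LEN 32    /* size of the buffer the caller may use */
#define GUARD_BYTE 0xA5  /* arbitrary fill pattern for the guard zones */

/* Hypothetical helper type: usable buffer with a guard zone on each side. */
struct guarded_buf {
    unsigned char mem[GUARD_LEN + USABLE_LEN + GUARD_LEN];
};

/* Fill both guard zones with the pattern and zero the usable area. */
void guarded_init(struct guarded_buf *g)
{
    assert(g != NULL);
    memset(g->mem, GUARD_BYTE, sizeof g->mem);
    memset(g->mem + GUARD_LEN, 0, USABLE_LEN);
}

/* Return a pointer to the usable area (USABLE_LEN bytes). */
char *guarded_data(struct guarded_buf *g)
{
    assert(g != NULL);
    return (char *)g->mem + GUARD_LEN;
}

/* Return 1 if both guard zones still hold the pattern, 0 otherwise. */
int guarded_intact(const struct guarded_buf *g)
{
    size_t i;

    assert(g != NULL);
    for (i = 0; i < GUARD_LEN; i++) {
        if (g->mem[i] != GUARD_BYTE)
            return 0;    /* underrun: low guard was modified */
        if (g->mem[GUARD_LEN + USABLE_LEN + i] != GUARD_BYTE)
            return 0;    /* overrun: high guard was modified */
    }
    return 1;
}
```

In use, a routine would declare a struct guarded_buf as an automatic variable,
call guarded_init() on entry, hand guarded_data() to whatever fills the
buffer, and test guarded_intact() before returning. This only catches
overruns shorter than GUARD_LEN bytes, and only at the points where the test
is made; it narrows the search and prevents nothing.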