[comp.lang.c] nasty evil pointers

brian@bucc2.UUCP (03/07/88)

  We lose a lot of time to pointers running off all over the place.
We are using Microsoft C 5.0 under MESS-DOS, and that means no control
of memory. We've tried using things like _nullcheck() and _heapwalk(), but
they don't help much. It would be nice if we could check every pointer as
it was used... something like

void pointer_validate(p, n, f)
  void *p;
  int n;
  char *f;

  That would be called with a pointer, __LINE__, and __FILE__. If the pointer
pointed, say, into the operating system or the text space, the function
would print a message and exit(). Otherwise it would return. Writing
such a function is not too difficult. However, getting it called is. It
would be nice if we had a utility program like ctrace we could run our
code through that would put in calls to this function. Unfortunately, this
would have to have considerable knowledge of the C language to work...
In simple cases like

void foo(ip)
  int *ip;
  {
  printf("%d", *ip);
  }

would become:

void foo(ip)
  int *ip;
  {
  pointer_validate((void *) ip, __LINE__, __FILE__);
  printf("%d", *ip);
  }

  however, something like:

char *bar(bp, ip, sp)
  int *bp, *ip;
  char *sp;
  {
  register int i;
  char *s;

  if (sp)
    s = sp;
  else if (!(s = malloc(50)))
    return(NULL);

  for (i = *bp; i < 50; i += *ip)
    s[i] = 'X';

  return(s);
  }


  is another story entirely. The program would have to put in braces
in the if statement, and do something to the for like:

char *bar(bp, ip, sp)
  int *bp, *ip;
  char *sp;
  {
  register int i;
  char *s;

  pointer_validate((void *) sp, __LINE__, __FILE__);
  if (sp)
    {
    pointer_validate((void *) sp, __LINE__, __FILE__);
    s = sp;
    }
  else if (!(s = malloc(50)))
    return(NULL);

  pointer_validate((void *) bp, __LINE__, __FILE__);
  for (i=*bp; i<50; pointer_validate((void *) ip, __LINE__, __FILE__), i += *ip)
    s[i] = 'X';

  return(s);
  }

  I know these aren't good examples; it's 5:00 AM... anyway, you get the idea
of the difficulties involved here. Without source to the compiler it could
be a major project. Does anybody know a way this could be done, perhaps with
something like lex? We don't have lex on the PC, but if we could come up
with source from lex on the unix system and move it to the PC, wouldn't it
work? (I don't know anything about lex)
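
A minimal sketch of what such a checking function might look like. It is
written with ANSI prototypes rather than the old-style declaration above, and
it assumes the program registers a single valid address range by hand; the
names pointer_set_region() and pointer_is_valid() are hypothetical helpers,
not part of Microsoft C or any other library. A real DOS version would
compare against segment bounds instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bounds of the one region treated as valid: [lo, hi). */
static uintptr_t valid_lo, valid_hi;

/* Record [base, base + size) as the valid region. */
void pointer_set_region(const void *base, size_t size)
{
    assert(base != NULL && size > 0);
    valid_lo = (uintptr_t)base;
    valid_hi = valid_lo + size;
}

/* Return 1 if p lies inside the registered region, else 0. */
int pointer_is_valid(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= valid_lo && a < valid_hi;
}

/* Report where the bad pointer was used and exit; otherwise just return. */
void pointer_validate(const void *p, int line, const char *file)
{
    if (!pointer_is_valid(p)) {
        fprintf(stderr, "%s:%d: bad pointer %p\n", file, line, (void *)p);
        exit(EXIT_FAILURE);
    }
}
```

Splitting the test (pointer_is_valid) from the action (print and exit) keeps
the range check usable on its own, for instance from a debugger.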

...............................................................................

  When the going gets weird, the weird turn pro.

  Brian Michael Wendt       UUCP: {cepu,ihnp4,uiucdcs,noao}!bradley!brian
  Bradley University        ARPA: cepu!bradley!brian@seas.ucla.edu
  (309) 691-5175            ICBM: 40 40' N  89 34' W

brian@bucc2.UUCP (03/08/88)

> /* Written  5:10 am  Mar  7, 1988 by bucc2.UUCP!brian in bucc2:comp.lang.c */
> /* ---------- "nasty evil pointers" ---------- */
>   We lose a lot of time to pointers running off all over the place.
> We are using Microsoft C 5.0 under MESS-DOS, and that means no control
> of memory. We've tried using things like _nullcheck() and _heapwalk(), but
> they don't help much. It would be nice if we could check every pointer as
> it was used... something like
> 
> void pointer_validate(p, n, f)
>   void *p;
>   int n;
>   char *f;

  After I posted my earlier note, I thought of a way to make this problem
somewhat simpler... The following macro can be defined:

#define VAL(p)	(pointer_validate((p), __LINE__, __FILE__), (p))

  Then all that needs to be done is to replace pointer dereferences with
*VAL(ptr), for example:

*expr			=>	*VAL(expr)
expr->expr2		=>	VAL(expr)->expr2

  Of course, the program still needs to know something about C to tell pointer
dereferences from multiplication, and then there are more complex cases like:

**expr			=>	*VAL(*VAL(expr))

  But this problem is much simpler. So does anyone know how to solve this
simpler problem of putting a VAL macro around every pointer dereference?
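
For illustration, here is a small hand-rewritten example of what the output
of such a tool could look like. The checker is a stand-in that only rejects
NULL, and the validate_calls counter is an assumption added purely to make
the cost visible: because VAL names its operand twice, a nested operand is
checked more than once.

```c
#include <assert.h>
#include <stddef.h>

static int validate_calls;   /* counts checks, for illustration only */

/* Stand-in checker: a real one would test address ranges, not just NULL. */
void pointer_validate(const void *p, int line, const char *file)
{
    (void)line;
    (void)file;
    validate_calls++;
    assert(p != NULL);
}

#define VAL(p) (pointer_validate((p), __LINE__, __FILE__), (p))

struct node { int value; struct node *next; };

/* n->next->value  becomes  VAL(VAL(n)->next)->value */
int second_value(struct node *n)
{
    return VAL(VAL(n)->next)->value;
}

/* **pp  becomes  *VAL(*VAL(pp)) */
int deref_twice(int **pp)
{
    return *VAL(*VAL(pp));
}
```

In deref_twice() the inner VAL(pp) appears in both halves of the outer comma
expression, so pointer_validate() runs three times for two dereferences.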


backstro@silver.bacs.indiana.edu (Dave White) (03/10/88)

In article <13100003@bucc2> brian@bucc2.UUCP writes:
>
>It would be nice if we could check every pointer as
>it was used...
>
>
>If pointer
>pointed, say, into the operating system or the text space, the function
>would print a message and exit(). Otherwise it would return.
>
>Without source to the compiler it could
>be a major project. Does anybody know a way this could be done, perhaps with
>something like lex?
>
Lattice C 3.0 (and, I suppose, more recent versions) allows the size of
the stack and the size of the heap to be specified at runtime -- one
uses an argument of the form
   =ssize/heapsize
which the runtime startup routine digests before calling main().
Given such a beast, Lattice can provide isxxxx() functions that
determine whether a passed pointer points within an expected space.
However, the programmer must call them explicitly.

This makes it possible to determine when a wild pointer endangers the
machine.  It has the disadvantage that the proper heapsize might need to
be fudged, as well as the stacksize....  developing code for 8086's and
friends is painful enough...

The sanest way to do what you want automatically is to build it into the
compiler itself.  Because compilers frequently call internal functions not made
available to the user, and because of the difficulty of cases like the
one you mention, it just isn't practical to write a pre-processor for
this -- you'd be halfway to rewriting the compiler!
>---
>  When the going gets weird, the weird turn pro.
No, they just bring out an Optimizing Turbo 5.0 version :-)
--
backstro@silver.bacs.indiana.edu

null@bsu-cs.UUCP (Patrick Bennett) (03/10/88)

In article <1159@silver.bacs.indiana.edu>, backstro@silver.bacs.indiana.edu (Dave White) writes:
> In article <13100003@bucc2> brian@bucc2.UUCP writes:
> >
> >It would be nice if we could check every pointer as
> >it was used...
> >
> >
> >If pointer
> >pointed, say, into the operating system or the text space, the function
> >would print a message and exit(). Otherwise it would return.
> >

OS/2 provides this...  Different memory areas can have varying levels of
protection.  When this protection is broken, whether by attempting to modify,
read, whatever (depends on the protection) the program is promptly aborted but
with appropriate error info...

Although I personally don't have or use OS/2 I obtained this information from
the OS/2 Programmer's Guide by Ed Iacobucci (The leader of the IBM OS/2
Design Team)

Sounds great to me...

-- 
----
Patrick Bennett     UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!null

jackm@devvax.JPL.NASA.GOV (Jack Morrison) (03/11/88)

In article <13100003@bucc2> brian@bucc2.UUCP writes:
>
>It would be nice if we could check every pointer as
>it was used...

Funny how often I see things in this newsgroup that can be answered 
with two words:

	try Ada.

Even funnier because I still prefer using C. 

-- 
Jack C. Morrison	Jet Propulsion Laboratory
(818)354-1431		jackm@jpl-devvax.jpl.nasa.gov
"The paycheck is part government property, but the opinions are all mine."

nevin1@ihlpf.ATT.COM (00704a-Liber) (03/12/88)

In article <13100003@bucc2> brian@bucc2.UUCP writes:
>It would be nice if we could check every pointer as it was used...

This seems to be more of a job for the OS than for the language.  If you try to
check every single use of pointers, you end up with a language like Pascal
:-).  I don't mind the checking of pointers as long as it doesn't interfere
with the efficiency that those of us who program in C have come to love :-).
Also, if the pointer checks find an error, I would like it to transfer control
to an error-handling section of my program so that I could either try to
recover or go down *gracefully* (saving calculated data, etc.).
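
One way to get that kind of controlled recovery in plain C is setjmp/longjmp.
The following is only a sketch under assumptions: the checker is a stand-in
that rejects NULL, and recovery_point, recovery_armed and checked_read() are
hypothetical names invented for the example.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf recovery_point;   /* where a failed check resumes */
static int     recovery_armed;   /* nonzero while recovery_point is usable */

/* Stand-in check: on failure, jump to the handler instead of exiting. */
void pointer_validate(const void *p, int line, const char *file)
{
    (void)line;
    (void)file;
    if (p == NULL) {
        assert(recovery_armed);
        longjmp(recovery_point, 1);
    }
}

/* Returns 0 and stores *p in *out on success, -1 if the check failed. */
int checked_read(const int *p, int *out)
{
    if (setjmp(recovery_point) != 0) {
        /* Error path: save calculated data, close files, then report. */
        recovery_armed = 0;
        return -1;
    }
    recovery_armed = 1;
    pointer_validate(p, __LINE__, __FILE__);
    *out = *p;
    recovery_armed = 0;
    return 0;
}
```

The jump is only safe while the function that called setjmp() is still
active, which is why the flag is cleared before checked_read() returns.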
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

steve@endor.harvard.edu (Kaufer - Lopez - Pratap) (03/13/88)

In article <1529@devvax.JPL.NASA.GOV> jackm@devvax.JPL.NASA.GOV (Jack Morrison) writes:
>In article <13100003@bucc2> brian@bucc2.UUCP writes:
>>
>>It would be nice if we could check every pointer as
>>it was used...
>
>Funny how often I see things in this newsgroup that can be answered 
>with two words:
>	try Ada.
>
>Even funnier because I still prefer using C. 
>
>-- Jack C. Morrison	Jet Propulsion Laboratory
---
Well, if you are still using C, you should really try an interpreter.
The Saber-C interpreter, from Saber Software, Inc., provides 
full run-time checking of pointers (and many other nasty C things....) 
More info can be obtained by e-mail or phone...

	Stephen Kaufer
	Saber Software, Inc.
	saber@harvard.harvard.edu
	30 JFK Street, Cambridge, MA.  02138
	(617) 876-7636

--- 
Note: As the address indicates, I work for Saber.
Having said that, comments about biased reporting to /dev/null!

brian@bucc2.UUCP (03/14/88)

> /* Written 5:54 pm Mar 11,1988 by ihlpf.ATT.COM!nevin1 in bucc2:comp.lang.c */
> In article <13100003@bucc2> brian@bucc2.UUCP writes:
> >It would be nice if we could check every pointer as it was used...
> 
> This seems to be more of a job for the OS than for the language.  If you try
> to check every single use of pointers, you end up with a language like Pascal
> :-).  I don't mind the checking of pointers as long as it doesn't interfere
> with the efficiency that those of us who program in C have come to love :-).
> Also, if the pointer checks find an error, I would like it to transfer control
> to an error-handling section of my program so that I could either try to
> recover or go down *gracefully* (saving calculated data, etc.).
> -- 
>  _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194

  Of course we don't plan to ship products with pointer checking code! The
whole idea of checking the pointers is as a tool to locate the problems that
are causing the bad pointers in the first place. The pointer checking code
would be used only in the module debugging phase. It would be removed before
the project reached alpha test.
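
A sketch of how the checks could be compiled out instead of removed by hand.
The flag name CHECK_POINTERS is an assumption; any build-time macro would do.
The checker is again a stand-in that only rejects NULL.

```c
#include <assert.h>
#include <stddef.h>

static int checks_run;   /* counts checks, for illustration only */

void pointer_validate(const void *p, int line, const char *file)
{
    (void)line;
    (void)file;
    checks_run++;
    assert(p != NULL);
}

#ifdef CHECK_POINTERS   /* hypothetical flag, set only in debugging builds */
#define VAL(p) (pointer_validate((p), __LINE__, __FILE__), (p))
#else
#define VAL(p) (p)      /* shipping build: no call, no overhead */
#endif

int first_element(const int *a)
{
    return *VAL(a);
}
```

Built without the flag, first_element() compiles to a plain dereference and
pointer_validate() is never called.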

  You're right, this is a job of the OS. Sadly, we don't even have an
operating system on the machines we use, all we have is MS-DOS. (no smiley
faces)


pablo@polygen.uucp (Pablo Halpern) (03/15/88)

In article <13100004@bucc2> brian@bucc2.UUCP writes:
>> /* Written  5:10 am  Mar  7, 1988 by bucc2.UUCP!brian in bucc2:comp.lang.c */
...
>> void pointer_validate(p, n, f)
>>   void *p;
>>   int n;
>>   char *f;
>
>  After I posted my earlier note, I thought of a way to make this problem
>somewhat simpler... The following macro can be defined:
>
>#define VAL(p)	(pointer_validate((p), __LINE__, __FILE__), (p))
>
>  Then all that needs to be done is to replace pointer dereferences with
>*VAL(ptr), for example:

Close, but no cigar.  Side effects will kill you.  For example:

	a = *p++;

would be replaced by:

	a = *VAL(p)++;

which would, in turn, be expanded to:

	a = *(pointer_validate((p), __LINE__, __FILE__), (p))++;

This is semantically incorrect, since the comma operator returns an rvalue
which is not a legal operand for the ++ operator.  Also, if the pointer
expression causes potential side effects, as in:

	a = *func();

the side effect will be carried out twice by the macro.  How about the
following declaration for pointer_validate():

void *pointer_validate(p, n, f)
	void *p;
	int n;
	char *f;

Where the return value of pointer_validate is p.  This has the problem
that the return type is no longer the same as the type of p and must be
cast.  This requires any filter that inserts the pointer_validate calls
to understand not only C syntax, but symbol table information as well.
Oh, well, keep trying :-(
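
To make the trade-off concrete, here is a sketch of the returning variant.
The macro VALT and the helper functions are hypothetical, and the checker is
a stand-in that only rejects NULL. Each operand is evaluated once, but the
rewriting tool has to supply the pointer type for the cast.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in checker that hands its argument back to the caller. */
void *pointer_validate(void *p, int line, const char *file)
{
    (void)line;
    (void)file;
    assert(p != NULL);
    return p;
}

/* The type argument is the symbol-table knowledge the filter would need. */
#define VALT(type, p) ((type)pointer_validate((p), __LINE__, __FILE__))

static int func_calls;
static int storage[3] = { 10, 20, 30 };

int *func(void)
{
    func_calls++;
    return storage;
}

/* a = *p++;  the ++ moves inside, so it is applied to p, not to an rvalue */
int read_and_advance(int **pp)
{
    int *p = *pp;
    int a = *VALT(int *, p++);
    *pp = p;
    return a;
}

/* a = *func();  func() is now called exactly once */
int read_from_func(void)
{
    return *VALT(int *, func());
}
```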

Pablo Halpern		|	mit-eddie \
Polygen Corp.		|	princeton  \ !polygen!pablo  (UUCP)
200 Fifth Ave.		|	bu-cs      /
Waltham, MA 02254	|	stellar   /

terry@wsccs.UUCP (terry) (03/20/88)

In article <2315@bsu-cs.UUCP>, null@bsu-cs.UUCP (Patrick Bennett) writes:
} In article <13100003@bucc2> brian@bucc2.UUCP writes:
} >
} >It would be nice if we could check every pointer as
} >it was used...
} >
} >
} >If pointer
} >pointed, say, into the operating system or the text space, the function
} >would print a message and exit(). Otherwise it would return.

Before going further, I would like to point out that the quote above is
referring to a language feature, NOT an operating system feature.

} OS/2 provides this...  Different memory areas can have varying levels of
} protection.  When this protection is broken, whether by attempting to modify,
} read, whatever (depends on the protection) the program is promptly aborted but
} with appropriate error info...
} 
} Although I personally don't have or use OS/2 I obtained this information from
} the OS/2 Programmer's Guide by Ed Iacobucci (The leader of the IBM OS/2
} Design Team)

What I think Brian was saying was a reference to the _kind_ of message he
received, not that he didn't receive it.  Anyone who's gotten the message
'segmentation fault - core dumped' realizes that he has reached outside his
address space.  I think the simplest method of determining where the problem
occurred would be in adb, since you have the arguments and file header info
readily available.

If I'm wrong, and Brian was suggesting that the compiler report an error if you
have a pointer pointing into the noplace (and this would be better resolved by
the linker, in that it could be an external pointer), I would have to take
exception.  This kind of checking would imply that 1) only constant pointers
could accurately be checked, or 2) that checking should occur at assignment
time.  The first of these is inadequate, as constant pointers seldom break,
and the second is ridiculous, in that the assignment value would have to
be checked at assignment time, leading to a number of problems in type-
checking, speed of execution, etc.  If one truly felt compelled to _DO_
something about it, you could write grungy code like:

extern void *_begincs;		/* linker would define these*/
extern void *_endcs;
extern void *_beginds;
extern void *_endds;

void *
savemefrommyself( arg)
void *arg;
{
	if( arg >= _begincs && arg <= _endcs)
		return( arg);
	if( arg >= _beginds && arg <= _endds)
		return( arg);

	printf( "Your pointer points to the noplace\n");
	exit( 1);
}

main()
{
	char *p, *q;
	.
	.
	.
	p = (char *)savemefrommyself( (void *)q);
	.
	.
	.
}

Personally, I'd prefer it if the compiler didn't automatically write this for
me.  It also leaves open the problem of "what if it's supposed to point into
the data segment and it points into the code segment instead".  I admit, that
with enough dorking around you could probably kludge something that would make
the wrong assumption only 30% of the time, but why?  Eventually, you'd end up
with it breaking something like

	func()
	{
		char *q;


		q=(char *)NULL;	/* breaks here*/
		.
		.
		.
		if( errcond1) q = "Error condition 1";
		if( errcond2) q = "Error condition 2";
		.
		.
		.
		if( q == (char *)NULL)
			printf( "Operation complete");
		else printf( "Error: %s", q);
	}

Whereas if it were implemented in adb, you could do something like:

	$ a.out
	Segmentation fault - core dumped
	$ adb a.out core
	_ $C
	  main() xxxxxxxxxxx
	  func() xxxxxxxxxxx
		 zzzzzz  yyyyyy
		address yyyyyy prior to data segment
	_

Which is much more informative, and prettier, too.


| Terry Lambert           UUCP: ...{ decvax, ihnp4 }                          |
| @ Century Software          : ...utah-cs!uplherc!sp7040!obie!wsccs!terry    |
| SLC, Utah                                                                   |
|                   These opinions are not my companies, but if you find them |
|                   useful, send a $20.00 donation to Brisbane Australia...   |
| 'There are monkey boys in the facility.  Do not be alarmed; you are secure' |

karl@haddock.ISC.COM (Karl Heuer) (03/23/88)

In article <338@wsccs.UUCP> terry@wsccs.UUCP (terry) writes:
>> >It would be nice if we could check every pointer as it was used...
>[If necessary you could define a pointer-verifying function, and use]
>	p = (char *)savemefrommyself( (void *)q);
>
>Personally, I'd prefer it if the compiler didn't automatically write this for
>me.

I'd prefer the best of both worlds: make it an option.

>Eventually, you'd end up with it breaking something like [assigning NULL]

Well, that's because of an imperfection in the verifying function.  The value
NULL is legal to copy, but illegal to dereference or perform arithmetic on.
You need to have two separate verification routines.  (It still wouldn't be
perfect; the posted function only checked for completely wild pointers.  A
better implementation would catch array overflow, too.)
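
A sketch of the two-routine idea, under the assumption that validity means
lying inside one hand-registered range. The names set_region(), ok_to_copy()
and ok_to_deref() are hypothetical, and this version still does nothing about
array overflow within the range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uintptr_t region_lo, region_hi;   /* assumed valid range [lo, hi) */

void set_region(const void *base, size_t size)
{
    assert(base != NULL);
    region_lo = (uintptr_t)base;
    region_hi = region_lo + size;
}

static int in_region(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= region_lo && a < region_hi;
}

/* Copying or assigning a pointer: NULL is a legal value. */
int ok_to_copy(const void *p)
{
    return p == NULL || in_region(p);
}

/* Dereferencing or doing arithmetic: NULL must be rejected as well. */
int ok_to_deref(const void *p)
{
    return p != NULL && in_region(p);
}
```

With this split, an assignment such as q = (char *)NULL passes the copy check,
while a later *q would fail the dereference check.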

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint