[comp.lang.c] legal but questionable pointer mixing

karl@ima.isc.com (Karl Heuer) (01/22/91)

Summary: some pointer-mixing is semantically incorrect but isn't detected by
the compiler; other pointer-mixing is correct, but the compiler (with all
warnings enabled) finds it questionable.  In both cases, extending the
compiler with a |#pragma| could be a win.


I recently mentioned in comp.std.c that I have been% routinely compiling C
programs with "gcc -Wall -Wcast-qual -Wwrite-strings".  The last two options
are only useful to people who use the |const| qualifier religiously, and even
then one can still get spurious warnings on correct code.  In particular, the
implementation of |strchr()| requires a conversion from its |const|-qualified
parameter to its non-|const|-qualified result.

The immediate question is: how can the code be written so as to suppress the
warning message?  My answer: I don't.  In a sense the warning is appropriate;
gcc is simply making the correct observation that |strchr()| is a bad function
(in the sense that |q=strchr(p, p[0])| is equivalent to a dequalifying cast,
but the compiler doesn't know it).
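
A minimal sketch of the situation, for illustration only.  The names
|my_strchr| and |launder| are hypothetical (chosen to avoid colliding with
the library), and the loop is one plausible way to write the function, not
any particular library's source.  The cast on the first |return| is the one
that -Wcast-qual complains about.

```c
#include <stddef.h>

/*
 * my_strchr is a hypothetical stand-in for the library strchr(), written
 * only to show where the dequalifying cast occurs.
 */
char *my_strchr(char const *s, int c)
{
	for (;; s++) {
		if (*s == (char)c)
			return (char *)s;	/* -Wcast-qual warns here */
		if (*s == '\0')
			return NULL;
	}
}

/*
 * p[0] always matches at p itself, so my_strchr(p, p[0]) hands back p
 * with the const stripped, and no cast appears at the call site.
 */
char *launder(char const *p)
{
	return my_strchr(p, p[0]);
}
```

A caller holding only a |char const *| can thus obtain a |char *| to the same
storage without writing a cast, which is why the warning inside the function
is not entirely spurious.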

A similar problem exists with the |void *| functions like |memcpy()|:
	int *ip0, *ip1, *ip2;  double *dp0;  size_t sz;
	ip0 = memcpy(ip1, ip2, sz);  /* RIGHT */
	dp0 = memcpy(ip1, ip2, sz);  /* WRONG */
Regardless of whether or not you use casts on |void *| conversions, the two
statements will look equally correct to the compiler; if one generates a
warning&, so will the other.  Nevertheless, the first is correct and the second
is not.  Again, it's the moral equivalent of an implied cast.

Logically, |strchr()| is an overloading of the two functions
	char const *strchr_r(char const *, int);
	char *strchr_w(char *, int);
and |memcpy()| is an overloading of an infinite family that includes
	int *memcpy_i(int *, int const *, size_t);
	double *memcpy_d(double *, double const *, size_t);
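
These family members can be approximated by hand today with thin typed
wrappers.  The following is only a sketch, assuming a C99 compiler for
|static inline|; the bodies simply forward to the library functions, and only
the four members named above are covered.

```c
#include <string.h>

/* Read-only flavor: const in, const out. */
static inline char const *strchr_r(char const *s, int c)
{
	return strchr(s, c);
}

/* Writable flavor: accepts only a pointer the caller may already modify. */
static inline char *strchr_w(char *s, int c)
{
	return strchr(s, c);
}

/* Two members of the memcpy() family; sz is a byte count, as for memcpy(). */
static inline int *memcpy_i(int *dst, int const *src, size_t sz)
{
	return memcpy(dst, src, sz);
}

static inline double *memcpy_d(double *dst, double const *src, size_t sz)
{
	return memcpy(dst, src, sz);
}
```

With these declarations in scope, |dp0 = memcpy_i(ip1, ip2, sz)| draws an
incompatible-pointer diagnostic, and passing a |char const *| to |strchr_w()|
draws a discarded-qualifier diagnostic.  The cost is one wrapper per type,
which is exactly the limitation of a finite family.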

So, I claim the first question we should be asking is: how can we tell the
compiler about the correct *usage* of these functions?

No doubt the problem has been solved in other languages.  C++ has explicit
overloading, which solves the |strchr()| problem but allows only a finite
subset of the |memcpy()| family to be handled.  It would seem that the best
way to handle it in C is with either$ a |#pragma| or a magic keyword attached
to the declaration in the header file.  Something like:

	/*
	 * The |#pragma| means: |strchr()| arg1 (formally a |char const *|) can
	 * be either |char *| or |char const *|, and the result should be
	 * treated as the same type.
	 */
	#pragma sametype strchr arg1, result
	extern char *strchr(char const *, int);

	/*
	 * The |#pragma|s mean: |memcpy()| arg1 (formally a |void *|) and arg2
	 * (formally |void const *|) can be of any pointer type, but they must
	 * be identical except for possible |const|-qualification of arg2, and
	 * the result should be treated as the same type as arg1 (the
	 * destination, which is what |memcpy()| returns).
	 */
	#pragma sametype memcpy arg1, unconst(arg2)
	#pragma sametype memcpy arg1, result
	extern void *memcpy(void *, void const *, size_t);

Judging from my (limited) experience with gcc hacking, it should not be
particularly difficult to implement something along these lines.
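
For comparison, a later revision of the language (C11) added |_Generic|, which
can approximate the |strchr()| half of the proposal without a |#pragma|.  This
is a sketch under that assumption; the macro name |STRCHR_SAMETYPE| is
hypothetical and is not part of any library.

```c
#include <string.h>

/*
 * STRCHR_SAMETYPE is a hypothetical name.  It selects the result type
 * from the type of arg1, which is the effect asked of
 * "#pragma sametype strchr arg1, result".  Requires a C11 compiler.
 */
#define STRCHR_SAMETYPE(s, c) \
	_Generic((s), \
		char *:       strchr((s), (c)), \
		char const *: (char const *)strchr((s), (c)))
```

Given |char const *p|, the expression |STRCHR_SAMETYPE(p, 'x')| has type
|char const *|, so assigning it to a |char *| draws the usual
discarded-qualifier diagnostic.  The |memcpy()| family still escapes this
approach, since |_Generic| must list each type explicitly.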

Now, back to the original question.  Given the source code to |strchr()|
and the assurances guaranteed by the |#pragma|, a human can prove that the
conversion from |char const *| to |char *| within |strchr()| is safe.  If the
compiler could come up with this proof on its own, it would be justified in
automatically suppressing the warning.

Since theorem-proving is unsolvable in general, the compiler should at the
very least be assisted by another |#pragma| from the user, say
	#pragma provably-okay cast-qual
	return ((char *)s);
meaning "although the following statement is apparently dangerous, it can
be proved that the problematic conditions (in this case, that the source
pointer was originally from a |const| object and the destination pointer will
eventually be used to modify it) will never occur."  Ideally, the compiler
should then attempt to prove the assertion, and possibly emit the warning "you
may be right, but I can't seem to prove it."  (This also applies to certain
cases of "variable may be used without being set"; and a similar |#pragma|
could be used to assert loop invariants.)

In practice, though, embedding this kind of theorem-proving into the current
generation of compilers is probably not worth the effort (unless an important
and philanthropic client, such as the U.S. government, thinks they need it).
As a much simpler approximation, I would propose
	#pragma disable-warning cast-qual
	return ((char *)s);
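
Later releases of gcc provide something close to this in the form of
|#pragma GCC diagnostic|.  The following sketch assumes a gcc recent enough to
accept the |push| and |pop| forms inside a function body; the function name
|dequalify| is hypothetical.

```c
/*
 * dequalify is a hypothetical helper that isolates the one cast the
 * programmer has vouched for.  The pragmas silence -Wcast-qual for the
 * return statement only, then restore the previous warning state.
 */
char *dequalify(char const *s)
{
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wcast-qual"
	return (char *)s;
#pragma GCC diagnostic pop
}
```

Like the proposed |disable-warning|, this asserts nothing that the compiler
checks; it only records that the warning was considered and dismissed at this
one spot.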

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
________
% I use the past tense because, now that I've seen more recent documentation,
  I've just added -Wpointer-arith -Wshadow -Wstrict-prototypes to the list.
& It would probably be silent, though the X3J11 appendix suggests that
  implicit |void *| conversion could be a Common Warning.
$ In this article I'm concentrating on the semantics.  The exact syntax
  (including whether it's spelled |#pragma| or |__frobnicate|) is open.