[comp.std.c] Why no arithmetic on void *

bengsig@dk.oracle.com (Bjorn Engsig) (01/22/91)

Could someone please explain if arithmetic on void * (with the same semantics
as on char *) was discussed when ANSI C was made, and why it was not included.
-- 
Bjorn Engsig, ORACLE Corporation, E-mail: bengsig@oracle.com, bengsig@oracle.nl

            "Stepping in others footsteps, doesn't bring you ahead"

steve@taumet.com (Stephen Clamage) (01/23/91)

bengsig@dk.oracle.com (Bjorn Engsig) writes:

>Could someone please explain if arithmetic on void * (with the same semantics
>as on char *) was discussed when ANSI C was made, and why it was not included.

If you want the semantics of char*, you should declare a char*.

Void* was an invention of the Committee to cover the case of "pointer to
something of unknown type".  This allows a guaranteed portable way to
pass pointers to an arbitrary object to a routine which can make use of
such a thing, and a way to store pointers to an arbitrary object.

When you add/subtract 1 to/from a pointer, the pointer is incremented/
decremented by the size of the object it points to.  Since the size of the
object pointed to by a void* is by definition unknown, pointer arithmetic
is illegal.

Example: If you have
	void *p = &mydata;
why would you want the expression (p+1) to point one char past the start
of mydata?  If for some reason that is what you want, you can write
((char*)p+1).  Or if you really want to do pointer arithmetic with p
without a lot of casts, you can write
	char  *p = (char*)&mydata;
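
A minimal sketch of the cast described above, for illustration only. The name byte_offset is made up here and is not part of any library; the point is just that the arithmetic is done on char*, and the result converts back to void* without a cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the address n bytes past p.
   The addition is performed on char*, because p + n is not
   defined when p has type void*. */
void *byte_offset(void *p, size_t n)
{
    assert(p != NULL);
    return (char *)p + n;   /* char* converts back to void* implicitly */
}
```

For an int array `data`, `byte_offset(data, sizeof(int))` would yield the same address as `&data[1]`.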

-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

barmar@think.com (Barry Margolin) (01/23/91)

In article <561@taumet.com> steve@taumet.com (Stephen Clamage) writes:
>Void* was an invention of the Committee to cover the case of "pointer to
>something of unknown type".  This allows a guaranteed portable way to
>pass pointers to an arbitrary object to a routine which can make use of
>such a thing, and a way to store pointers to an arbitrary object.

However, it would be nice if void* could also be used as a way to pass
pointers to an array of arbitrary objects.

>When you add/subtract 1 to/from a pointer, the pointer is incremented/
>decremented by the size of the object it points to.  Since the size of the
>object pointed to by a void* is by definition unknown, pointer arithmetic
>is illegal.

This issue came up recently in another newsgroup, I think in the context of
qsort().  The arguments to qsort() include a void* representing the array,
and a size_t containing sizeof(each array element).  Internally, however,
qsort() must cast the void* to a char* in order to increment it.  It seems
to me that ANSI C only goes halfway in providing a generic pointer, because
generic pointer arithmetic must be done by casting to a non-generic type,
namely char*.  I opined that it would make sense for void* arithmetic to
operate in the units that sizeof returns, i.e. define that sizeof void ==
1.  For compatibility sizeof char would also be defined to be 1, but such
use could be deprecated.  Future C standards might then be able to get rid
of the notion that char is the fundamental unit (e.g. systems with bit
addressing could implement void* as bit pointers and sizeof would return
the size in bits, or systems with 16-bit characters but 8-bit bytes could
be supported without needing to resort to wchar_t).
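
To make the qsort() situation concrete, here is a rough sketch of a routine with a qsort()-style interface. The names generic_min and compare_ints are hypothetical, chosen for illustration; the routine receives a void* base and an element size, and has to go through char* to step from one element to the next.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical qsort()-style helper: return a pointer to the smallest
   of nmemb elements, each size bytes long, starting at base. */
void *generic_min(const void *base, size_t nmemb, size_t size,
                  int (*compar)(const void *, const void *))
{
    const char *bytes = base;   /* void* has no arithmetic; char* does */
    const char *best = bytes;
    size_t i;

    assert(nmemb > 0 && size > 0);
    for (i = 1; i < nmemb; i++) {
        const char *elem = bytes + i * size;   /* address of element i */
        if (compar(elem, best) < 0)
            best = elem;
    }
    return (void *)best;
}

/* Example comparison function for int elements. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}
```

A caller with `int v[5]` would write `int *m = generic_min(v, 5, sizeof v[0], compare_ints);` with no casts on either side of the call.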

My point is that char* is doing double duty, and it seems like void* was
invented to take over the portion of char*'s role that has nothing to do
with character data.
--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

chip@tct.uucp (Chip Salzenberg) (01/23/91)

According to barmar@think.com (Barry Margolin):
>However, it would be nice if void* could also be used as a way to pass
>pointers to an array of arbitrary objects.

Of course it can.

Aside: In normal usage, what we call "pointer to an array" is usually
"pointer to the first element of the array."  C _allows_ a declaration
of a genuine "pointer to an array," but such a pointer is typically no
more useful than a pointer to the first element.

In any case, a |void*| can represent all of the above pointer types.

>It seems to me that ANSI C only goes halfway in providing a generic
>pointer, because generic pointer arithmetic must be done by casting
>to a non-generic type, namely char*.

That's one interpretation.  I see it as leveraging off existing
practice.  Remember that |void*| and |char*| must have identical
representations and that sizeof(char) must always equal one.  So
|char*| is just as generic a pointer as |void*|.

Given these facts, the utility of |void*| lies not in its similarities
to |char*|, but in its differences: conversions without casts and no
allowance for arithmetic.
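
A small sketch of the "conversions without casts" difference. Both function names are hypothetical. The first takes a generic pointer, so any object pointer can be passed as is; the second takes a byte pointer, so callers holding some other pointer type must cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper taking a generic pointer: returns 1 if the
   first n bytes at p are all zero, otherwise 0. */
int all_zero(const void *p, size_t n)
{
    const unsigned char *b = p;   /* void* -> object pointer: no cast */
    size_t i;

    assert(p != NULL);
    for (i = 0; i < n; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}

/* The same helper declared with a byte pointer parameter instead. */
int all_zero_bytes(const unsigned char *p, size_t n)
{
    return all_zero(p, n);        /* object pointer -> void*: no cast */
}
```

Given `long zeros[3]`, the call `all_zero(zeros, sizeof zeros)` compiles cleanly, while the byte-pointer version needs `all_zero_bytes((const unsigned char *)zeros, sizeof zeros)`.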

>Future C standards might then be able to get rid of the notion that char
>is the fundamental unit ...

I see what you're driving at -- but such a language wouldn't be C any
more.  Many existing programs would break.  The assumption
"sizeof(char)==1" is hard-coded into innumerable calls to fread() and
other library functions.
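
For illustration, the kind of call in question looks roughly like the first wrapper below. Both wrapper names are made up for this sketch; only fread() itself is the real library function. The second wrapper spells the element size out, which behaves identically as long as sizeof(char) is 1.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrappers: both read up to n chars from fp into buf. */

/* Element size hard-coded as 1, relying on sizeof(char) == 1. */
size_t read_chars_hardcoded(char *buf, size_t n, FILE *fp)
{
    assert(buf != NULL && fp != NULL);
    return fread(buf, 1, n, fp);
}

/* Element size spelled out instead of assumed. */
size_t read_chars_spelled_out(char *buf, size_t n, FILE *fp)
{
    assert(buf != NULL && fp != NULL);
    return fread(buf, sizeof buf[0], n, fp);
}
```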
-- 

Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
          perl -e 'sub do { print "extinct!\n"; }   do do()'

gwyn@smoke.brl.mil (Doug Gwyn) (01/24/91)

In article <1238@dkunix9.dk.oracle.com> bengsig@dk.oracle.com (Bjorn Engsig) writes:
>Could someone please explain if arithmetic on void * (with the same semantics
>as on char *) was discussed when ANSI C was made, and why it was not included.

The main purpose of void and void* was to provide additional compile-time
safety checking.  This would be largely defeated if pointer arithmetic
had been allowed on void*.  Note that such arithmetic would have had to
have been specially defined for that one type, since it does not fit the
pointer-arithmetic model given for additive operators.  (If it HAD been
made consistent, we would have had to declare that sizeof(void)==0 and
then void* arithmetic would not have had the same behavior as char*
arithmetic anyway.  So it would have had to violate the model, which is
obviously extremely undesirable.)

Note also that there is absolutely no need for such a wart, since you
can perform the desired pointer arithmetic after casting the void* to
char*.

gwyn@smoke.brl.mil (Doug Gwyn) (01/24/91)

In article <1991Jan22.211212.14692@Think.COM> barmar@think.com (Barry Margolin) writes:
>However, it would be nice if void* could also be used as a way to pass
>pointers to an array of arbitrary objects.

There is essentially no difference between "object" and "array of objects"
in this context.  To access an object, first the void* must be converted
to a pointer to the desired type.  That could be a pointer to an array
type, if desired.
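
As a sketch of that conversion step (function names are hypothetical, and the fixed length of 3 is just an assumption for the example): the same void* parameter can be converted either to a pointer to an array type or to a pointer to the element type before the data is accessed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the caller passes the address of an int[3]
   through a generic pointer; sum its elements. */
int sum_triple(void *p)
{
    int (*triple)[3] = p;   /* convert to pointer-to-array-of-3-int */
    int total = 0;
    size_t i;

    assert(p != NULL);
    for (i = 0; i < 3; i++)
        total += (*triple)[i];
    return total;
}

/* The same data reached through a pointer to the first element. */
int sum_ints(void *p, size_t n)
{
    int *first = p;         /* convert to pointer-to-int */
    int total = 0;
    size_t i;

    assert(p != NULL);
    for (i = 0; i < n; i++)
        total += first[i];
    return total;
}
```

With `int a[3] = {1, 2, 3};`, both `sum_triple(&a)` and `sum_ints(a, 3)` visit the same three ints.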

>[proposes] sizeof void == 1.

No, that would never be accepted, since (I believe) it is clear to most
X3J11 members that if we were to give an object meaning to void, then we
would have to require that sizeof(void)==0.  I was Point Of Contact for
the zero-sized object special interest group, and determined as a side
exercise just what would have to be specified in the standard were void
to acquire object meaning (similar to my proposal on changes needed to
support a "short char" type to designate sizeof units).

>For compatibility sizeof char would also be defined to be 1, but such
>use could be deprecated.  Future C standards might then be able to get rid
>of the notion that char is the fundamental unit ...

I argued for such a specification, prompted by the Japanese proposal for
"long char", but ultimately the explicit multibyte-character specifications
were adopted rather than the obviously cleaner solution of a basic type
whose size may be other than that of a char.  I won't rehash the pros and
cons here, but will note that it is unlikely this will ever be cleaned up,
because of the international momentum behind kludgier (multibyte character)
approaches.

However, trying to give "storage_unit*" semantics to void* is the wrong
approach.  "short char" or some such explicit object type would be the
technically appropriate method to provide for potentially sub-char
addressing.  As the standard is currently organized, void definitely
must not be given object meaning; therefore it would be mighty hard to
give void* the semantics you seek.

>My point is that char* is doing double duty, and it seems like void* was
>invented to take over the portion of char*'s role that has nothing to do
>with character data.

It is true that char (not just char *) has been overloaded with both
"character" and "byte" meanings.  This was, as noted above, a deliberate
decision of X3J11, although one which I disliked.  As the library section
of the standard evolved, it became clear that all "generic object"
pointer parameters to library routines such as memcpy() should be changed
to void* rather than char*.  This doesn't mean, however, that X3J11
believed that void* is the proper way to write "byte pointer"; char* is
also a "byte pointer", and the latter is the form one should use for
byte-oriented address arithmetic.  (Again, I'm not happy with that, but
that IS the position taken for the standard.)
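
A rough sketch of the split described above, using a made-up memcpy()-like routine (copy_bytes is a hypothetical name, not the library function, and it assumes the two regions do not overlap). The interface uses void* for the "generic object" parameters, while the byte-oriented address arithmetic inside is done with char*.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memcpy()-style routine: copy n bytes from src to dest.
   Assumes the two regions do not overlap. */
void *copy_bytes(void *dest, const void *src, size_t n)
{
    char *d = dest;          /* byte pointers for the arithmetic */
    const char *s = src;

    assert(dest != NULL && src != NULL);
    while (n-- > 0)
        *d++ = *s++;
    return dest;
}
```

Callers can pass the address of any object, for example `copy_bytes(&dst, &src, sizeof src)` for two doubles, without casting.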

gwyn@smoke.brl.mil (Doug Gwyn) (01/24/91)

In article <561@taumet.com> steve@taumet.com (Stephen Clamage) writes:
>Void* was an invention of the Committee to cover the case of "pointer to
>something of unknown type".

I would rather say, "pointer to object of irrelevant type".

gwyn@smoke.brl.mil (Doug Gwyn) (01/26/91)

In article <279D93D3.5EE7@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>"sizeof(char)==1" is hard-coded into innumerable calls to fread() and
>other library functions.

I had to figure out the ramifications when I wrote my "short char" proposal.
Actually there are enough problems with use of fread() etc. that I don't
think much more work would be needed to accommodate sizeof(char)>1 than
is already necessary to properly include <stdio.h>, use size_t, etc.  Also
note that in any implementation where it was decided to make sizeof(char)
remain 1, the existing usage would not be a problem.  For quite some time
I had been writing code that did not assume sizeof(char)==1, just in case.
Now that the standard requires sizeof(char)==1, I don't think there is any
real chance that it will ever change, so one can safely assume that
constraint.