[comp.lang.c++] classes with no data members

rlb@polari.UUCP (rlb) (05/31/88)

In the July, 1987 "The C++ Programming Language":

Chapter 7, section 7, for such a small section, contains quite a lot of
facts about the language.  The concept of a class that has no data
(as opposed to function) members is a stumbling-block for me.  Actually,
I thought I understood it until this section pointed out that you can have
a non-NULL pointer to such a class.  I have some questions:

a)  What the heck should a pointer to such a class point to?

b)  Should the "sizeof" such a class be zero?

c)  If the "sizeof" such a class is zero and I use "new" to create an
    object of that class, should the amount of free store decrease?

d)  The proposed ANSI standard does not guarantee whether its storage 
    allocators return NULL or not for objects of size zero.  Should C++
    guarantee the behavior of the default implementation of "new" when
    asked to allocate zero bytes?

e)  If I declare an "auto" array of objects of such a class, should it
    consume any stack space?

Notice that the above questions say "should" and not "does"; I am not really
interested in what particular implementations do, but rather what the language
definition is (or should be).
-Ron Burk

ark@alice.UUCP (06/03/88)

In article <464@polari.UUCP>, rlb@polari.UUCP writes:
 
> Chapter 7, section 7, for such a small section, contains quite a lot of
> facts about the language.  The concept of a class that has no data
> (as opposed to function) members is a stumbling-block for me.  Actually,
> I thought I understood it until this section pointed out that you can have
> a non-NULL pointer to such a class.  I have some questions:
 
> a)  What the heck should a pointer to such a class point to?

It should point somewhere.

Less flippantly, consider a class T with no data members
and a class S derived from T.  Then a T* may indeed point
into the middle of the S that contains it.  Such a pointer
is completely meaningful.  Or consider a class with no data
members but with virtual functions.  Such a class contains
some information even though there's no way to get to it
directly.
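
The first case can be sketched in present-day C++. This is an illustration only: the names Header, T, S and base_pointer_lies_within are hypothetical, and the extra base Header is an assumption added so that the T sub-object need not sit at the start of S. Where exactly the T* lands is left to the compiler; the sketch checks only that it is non-null and falls within the storage of the enclosing S.

```cpp
#include <cassert>

// Hypothetical classes for illustration; none of these names come from the thread.
struct Header { int tag; };           // a non-empty base, listed first
struct T { void hello() {} };         // no data members, only a function
struct S : Header, T { int value; };  // S contains a T sub-object

// Returns true if the T* obtained from an S points within the S's storage.
bool base_pointer_lies_within(S& s) {
    T* pt = &s;  // implicit derived-to-base conversion
    assert(pt != nullptr);
    const char* begin = reinterpret_cast<const char*>(&s);
    const char* end = begin + sizeof(S);
    const char* where = reinterpret_cast<const char*>(pt);
    return begin <= where && where <= end;
}

// Usage: expected to pass on a conforming compiler.
void check_base_pointer() {
    S s;
    assert(base_pointer_lies_within(s));
}
```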

> b)  Should the "sizeof" such a class be zero?

Maybe, maybe not.  It certainly won't be zero if the class
has any virtual functions.  Even if it doesn't, cfront may
pad it with an extra byte to avoid breaking the C compiler.

> c)  If the "sizeof" such a class is zero and I use "new" to create an
>     object of that class, should the amount of free store decrease?

Maybe, maybe not.  It may well be that the allocator imposes an
overhead for each object even if the size of the object
is zero.

> d)  The proposed ANSI standard does not guarantee whether its storage 
>     allocators return NULL or not for objects of size zero.  Should C++
>     guarantee the behavior of the default implementation of "new" when
>     asked to allocate zero bytes?

Probably not, because in practice it may be forced to rely on
the underlying C implementation.

> e)  If I declare an "auto" array of objects of such a class, should it
>     consume any stack space?

If and only if sizeof such an object is non-zero.
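
For comparison, answers (b) and (e) can be checked against C++ as it was later standardized, which requires every complete object to have a nonzero size. The sketch below says nothing about what cfront did in 1988; Empty, EmptyVirtual and the helper functions are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

struct Empty { void f() {} };                 // no data members
struct EmptyVirtual { virtual void f() {} };  // no data members, one virtual function

// Standard C++ gives every complete object a nonzero size.
static_assert(sizeof(Empty) >= 1, "an empty class still has a size");
static_assert(sizeof(EmptyVirtual) >= 1, "so does one with a virtual function");

// An automatic array of 8 such objects occupies 8 * sizeof(Empty) bytes.
std::size_t bytes_for_auto_array() {
    Empty items[8];
    return sizeof(items);
}

// Usage: both checks are expected to pass on a conforming compiler.
void check_sizes() {
    assert(bytes_for_auto_array() == 8 * sizeof(Empty));
    assert(bytes_for_auto_array() > 0);
}
```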

rlb@polari.UUCP (rlb) (06/10/88)

In article <10399@sol.ARPA>, crowl@cs.rochester.edu (Lawrence Crowl) writes:
> between them without having pointers that point somewhere?  In essence, by
> using a NULL pointer, you have taken away my ability to do:
> 
>     p = new empty ;
>     q = new empty ;
>     if ( p != q ) ....
> 
Good point, which I had not thought of.  This seems to prove that "new"
must allocate space for an object of size zero, even if the host system
"malloc" does not.

To sum up, an implementation that wants "empty" classes to occupy no
space must generate a unique address for each such object (whether automatic
or static) and must require the "new" operator to return unique addresses
for objects whose sizeof is zero.  Such an implementation is a bit iffy: since
the language manual doesn't specify the behaviour of "new" with a zero
argument, user versions may break the implementation.
The alternative appears vastly easier and I assume all existing implementations
take it: simply waste a byte of storage for each empty class.
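
The "waste a byte" alternative is what standard C++ eventually required, so the quoted test can be written as a small sketch. Empty and the function names are hypothetical; the check relies on both objects being alive at the same time.

```cpp
#include <cassert>

struct Empty {};  // hypothetical empty class, as in the quoted example

// Allocates two Empty objects and reports whether their addresses differ.
bool two_news_are_distinct() {
    Empty* p = new Empty;
    Empty* q = new Empty;
    const bool distinct = (p != q);
    delete p;
    delete q;
    return distinct;
}

// Usage: expected to pass in standard C++.
void check_distinct_allocations() {
    assert(two_news_are_distinct());
}
```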

dupuy@douglass.columbia.edu (Alexander Dupuy) (06/11/88)

In article <7943@alice.UUCP>, ark@alice.UUCP writes:
> 
> The unsatisfying part is the statement that a pointer to an empty
> class must point *somewhere*.  I don't see why it has to point *anywhere*.


In article <593@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>
>It doesn't have to point anywhere, since nothing will be there to
>be pointed to.  But pointers to two such objects should be distinguishable,
>so you can't just use (foo*)0 for all of them.  In fact, all
>pointers of any type should be pairwise distinguishable.
>
>Did I miss the part where people said WHY they would want to have
>pointers to empty structures?  And why they could not begrudge a byte
>or two of wasted space to accomplish this feat?
>
>
>		Dave J.

On a system with demand-paged virtual memory, it is quite easy to have pointers
which are distinguishable, but which don't point to anything which exists (in
memory).  Just have an empty page (in BSS, of course, so it doesn't take up
disk space, either) which is used only as the target of pointers to empty
classes, structs, and anything else which has zero size.

If you have more than PAGESIZE zero sized structures, just reserve more empty
pages.  If you're feeling adventurous, you could even try using invalid pages
(whatever they may be on your particular machine) so that you don't even waste
usable virtual memory space.
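
A rough sketch of the address-pool idea follows. It is an approximation under stated assumptions: a plain static array stands in for the reserved page (so, unlike an unmapped page, it does occupy memory), the pool size of 4096 is arbitrary, and the names in namespace zso are hypothetical. The bytes are never read or written; only their addresses are handed out.

```cpp
#include <cassert>
#include <cstddef>

namespace zso {
const std::size_t kPoolSize = 4096;  // assumed page size, purely illustrative
static char pool[kPoolSize];         // static storage, BSS on typical systems
static std::size_t next_slot = 0;

// Hands out the next unused address, or a null pointer when the pool is exhausted.
void* allocate() {
    if (next_slot == kPoolSize) return nullptr;
    return &pool[next_slot++];
}
}  // namespace zso

// Usage: two successive requests yield two different, non-null addresses.
void check_pool_addresses_differ() {
    void* a = zso::allocate();
    void* b = zso::allocate();
    assert(a != nullptr && b != nullptr && a != b);
}
```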

"I gotta whole lotta nuttin..."

@alex
-- 
inet: dupuy@columbia.edu
uucp: ...!rutgers!columbia!dupuy

deb@svax.cs.cornell.edu (David Baraff) (06/12/88)

>In article <10399@sol.ARPA>, crowl@cs.rochester.edu (Lawrence Crowl) writes:
> between them without having pointers that point somewhere?  In essence, by
> using a NULL pointer, you have taken away my ability to do:
> 
>     p = new empty ;
>     q = new empty ;
>     if ( p != q ) ....
> 

If 'p' and 'q' have no data in their classes (only functions),
is there any difference between p and q? That is, other than looking
at the addresses (i.e. p != q), is there any way to tell p and q apart,
in a functional or semantic sense? If not, then the above isn't really
a problem.

Though perhaps if we start discussing derived types, some differences
could arise...

	David Baraff
	deb@svax.cs.cornell.edu

karl@haddock.ISC.COM (Karl Heuer) (06/18/88)

Assume the existence of a C++ compiler that allows zero-sized objects (ZSOs),
i.e. one that does not insert a shim in an empty class.

Several people have stated the opinion that any two object pointers should be
distinguishable, and that calling `new empty' twice should yield unequal
(hence non-NULL) values.  I'm not convinced.

Given `int a[N]', &a[N] is a valid pointer that doesn't point anywhere.  That
is, it's guaranteed that &a[N] can be stored in a pointer variable, but
dereferencing it is undefined.  And it's not guaranteed to be distinct from
other pointers: &a[N] may well be equal to &b[0].  I think ZSO pointers are in
the same category.  (After all, it's just the limiting case N=0.)
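
The one-past-the-end pointer can be illustrated with a short sketch. N = 4 and the function names are assumptions made for the example. The pointer is formed and subtracted but never dereferenced, and the sketch deliberately does not compare it with the address of any other object, since that result is not something to rely on.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t N = 4;  // illustrative length

// Forms the one-past-the-end pointer of a local array and measures its
// distance from the start of the array.
std::ptrdiff_t distance_to_end() {
    int a[N] = {0, 0, 0, 0};
    int* begin = &a[0];
    int* end = a + N;  // valid to form and compare, undefined to dereference
    return end - begin;
}

// Usage: the distance equals the element count.
void check_distance_to_end() {
    assert(distance_to_end() == static_cast<std::ptrdiff_t>(N));
}
```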

Also, consider `empty x[2]'.  Surely &x[0] and &x[1] must compare equal?  If
so, why should the situation be different if I declare `empty x0, x1'?  Or if
I call `new empty' twice?

Btw, in ANSI C, malloc(0) either gives you a ZSO or a NULL pointer.  In C++,
new(0) apparently hasn't been well-defined.  (This is a serious flaw, in view
of the legality of user-defined new.  The language definition should either
specify the behavior, or explicitly state that it's undefined.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

rlb@polari.UUCP (rlb) (06/21/88)

}Prologue:
I believe I'll have one last whack at the minor, but interesting case of
"empty classes".  A note of interest: dpANS rationale refers to "the general
principle of not providing for 0-length objects".  Hmmmm....

}Discussion:
In article <4614@haddock.ISC.COM>, karl@haddock.ISC.COM (Karl Heuer) writes:
> Assume the existence of a C++ compiler that allows zero-sized objects (ZSOs),
> i.e. one that does not insert a shim in an empty class.
> 
> Several people have stated the opinion that any two object pointers should be
> distinguishable, and that calling `new empty' twice should yield unequal
> (hence non-NULL) values.  I'm not convinced.
I am.
> 
> Given `int a[N]', &a[N] is a valid pointer that doesn't point anywhere.  That
> is, it's guaranteed that &a[N] can be stored in a pointer variable, but
> dereferencing it is undefined.  And it's not guaranteed to be distinct from
> other pointers: &a[N] may well be equal to &b[0].  I think ZSO pointers are in
> the same category.  (After all, it's just the limiting case N=0.)
The Walking Lint stumbles a bit here :-).  It certainly is guaranteed to be
distinct from a possibly quite large number of other pointers -- those that
point to a[0] through a[N-1].  It is true that a pointer to &a[N] cannot be
dereferenced legally and you may personally choose to define that as "doesn't
point anywhere" but:  A) that definition does not remove the restrictions on
the pointer (must play well with other pointers into the same object :-), and
B) this is all just an analogy (not quite a correct one, I think) for the
subject under discussion, therefore we proceed...
> 
> Also, consider `empty x[2]'.  Surely &x[0] and &x[1] must compare equal?  If
> so, why should the situation be different if I declare `empty x0, x1'?  Or if
> I call `new empty' twice?
I assume your posting is based on the feeling (which I also have) that it is
pretty damn hard to come up with a realistic example in which it *matters*
whether or not two pointers to different objects of the same empty class are
distinguishable.  It is much easier to note that it could be "surprising"
that two separately declared objects have addresses which compare equal.  To
use your own example, &x[0] and &x[1] definitely must NOT compare equal,
because the expression "&x[1] - &x[0]" must evaluate to the integer "1".
Else you must clutter the language definition up with a great many "except
for zero-sized objects".
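
The array argument can be checked against standard C++, where an empty class has a nonzero size and array elements are therefore one element apart. This is a sketch with hypothetical names; it shows today's rule, not what a zero-size implementation would do.

```cpp
#include <cassert>
#include <cstddef>

struct Empty {};  // hypothetical empty class

// Distance, in elements, between the two members of a two-element array.
std::ptrdiff_t element_gap() {
    Empty x[2];
    return &x[1] - &x[0];
}

// Whether the two elements have different addresses.
bool elements_distinct() {
    Empty x[2];
    return &x[0] != &x[1];
}

// Usage: both checks are expected to pass on a conforming compiler.
void check_array_of_empty() {
    assert(element_gap() == 1);
    assert(elements_distinct());
}
```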
> 
> Btw, in ANSI C, malloc(0) either gives you a ZSO or a NULL pointer.  In C++,
> new(0) apparently hasn't been well-defined.  (This is a serious flaw, in view
> of the legality of user-defined new.  The language definition should either
> specify the behavior, or explicitly state that it's undefined.)
I can't argue with calling it a flaw, but I can't really call it serious,
because then what stronger word would be left for the fact that several
areas of the grammar itself (e.g. "new" with initializers, class
member initializers, etc.) are entirely up in the air?  Catastrophic?
This is, of course, deliberate rabble-rousing; the non-alarmist view is that
    Language Definition == (AT&T implementation + Reverse Engineering)
I've watched enough episodes of Green Acres to know that when everyone around
you is using logic like this, ya better just go along with it:-).
> 
> Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
A couple of postings have appeared to the effect "Machine architecture X plus
clever trick Y allows me to create a separate address space for pointers to
zero-sized objects."  This is at least theoretically interesting: such an
arrangement seems to allow a C++ implementation where empty objects consume
no space (but only if "new" were well-defined and defined in favor of
this).  In practice, it doesn't seem likely that any implementor would ever
go to the trouble.

}Epilogue:
I conclude that "empty classes" are nasty little beasts that are not
worth the minor convenience of having the compiler create the virtually-
always-in-practice-needed dummy member on the programmer's behalf.  C seems
quite sensible to eschew zero-sized objects.  While "The C++ Programming
Language" does not seem to be a bad language definition, it's given me a lot
of respect for just how good a job K&R did on their first edition.

bobd@bloom.UUCP (Bob Donaldson) (06/22/88)

In article <18223@cornell.UUCP>, deb@svax.cs.cornell.edu (David Baraff) writes:
> 
> If 'p' and 'q' have no data in their classes (only functions),
> is there any difference between p and q? That is, other than looking
> at the addresses (i.e. p != q), is there any way to tell p and q apart,
> in a functional or semantic sense? If not, then the above isn't really
> a problem.

If we are getting into semantics, here, it is certainly possible (and
possibly helpful :-)) to distinguish between two DIFFERENT yet otherwise
indistinguishable instances of a type.  The fact that they are different
instances itself may be of some value, even if no difference would arise
from choosing one over the other for a given purpose.
----------------------------------------------------------------
Bob Donaldson              ...!ut-emx!juniper!radian!bobd
Radian Corporation             ...!sun!texsun!radian!bobd
PO Box 201088       
Austin, TX  78703       (512) 454-4797

Views expressed are my own, not necessarily those of my employer.

nevin1@ihlpf.ATT.COM (00704a-Liber) (06/24/88)

In article <423@bloom.UUCP> bobd@bloom.UUCP (Bob Donaldson) writes:

>If we are getting into semantics, here, it is certainly possible (and
>possibly helpful :-)) to distinguish between two DIFFERENT yet otherwise
>indistinguishable instances of a type.  The fact that they are different
>instances itself may be of some value, even if no difference would arise
>from choosing one over the other for a given purpose.

We have three choices:

(For purposes of this example p and q are empty objects)

(1)	&p NEVER equals &q (distinguishable)
(2)	&p ALWAYS equals &q (indistinguishable)
(3)	&p should never be compared to &q (comparison is undefined)

I am leaning towards case 3 for a number of reasons.

I have not seen any realistic examples where either case 1 is preferred
over case 2 or vice-versa.

p and q cannot legally be dereferenced; I find very little meaning in
taking the address of nothing.

If p and q are of different classes then it may be very hard (without
allocating space) to guarantee that case 1 still holds.

If case two is true, then there is *never* any point in comparing their
addresses; therefore, there is no point in defining their addresses to be
equal in the first place (it is an unnecessary restriction).


Unless someone can produce some evidence to the contrary, it seems to me
that there is no purpose to taking the address of an empty object.


A solo posting by:
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				You are in a little maze of twisty
 /  / _ , __o  ____		 email paths, all different.
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

sher@sunybcs.uucp (David Sher) (06/24/88)

The key to why you might want to take the address of empty classes:
What if the empty class has a subclass that is not empty.  For example:
class Type {
public:
    // shared by every subtype; returns a string literal, hence const
    const char * question() { return "Why?"; }
    };

class Wierd : Type {
    int i; ...
    };

Now let's say Type was a container type for a large variety of subtypes
that only share a few properties (inherited from Type) such as the
question function above.  This is about the only reason you'd want an
empty class in the first place, as a source of inherited functions.
Then say you are manipulating an array of Type.  You definitely want
them to compare unequal since some of them will have content.  Also
otherwise you break the semantics of arrays in C as another poster
pointed out.  I can not see where you gain by having their addresses
compare equal, except to save marginal amounts of space or to simplify
the compiler (how does it simplify the compiler?).  I see only problems
by making pointers to empty objects null.
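
A sketch of this situation, under some assumptions: inheritance is made public so that a pointer to a subtype converts to a Type*, the collection holds pointers rather than objects, and Plain is a hypothetical second subtype with no data of its own. The pointers to the three objects are expected to compare unequal in standard C++.

```cpp
#include <cassert>

class Type {
public:
    const char* question() const { return "Why?"; }
};

// Public inheritance is assumed here so that a Wierd* converts to a Type*.
class Wierd : public Type {
public:
    int i = 0;
};

class Plain : public Type {};  // hypothetical subtype with no data members

// Collects three different objects as Type* and compares the pointers pairwise.
bool handles_are_distinct() {
    Wierd w;
    Plain p;
    Type t;
    Type* items[] = { &w, &p, &t };
    return items[0] != items[1] && items[1] != items[2] && items[0] != items[2];
}

// Usage: expected to pass on a conforming compiler.
void check_handles() {
    assert(handles_are_distinct());
}
```
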
-David Sher
ARPA: sher@cs.buffalo.edu	BITNET: sher@sunybcs
UUCP: {rutgers,ames,boulder,decvax}!sunybcs!sher

ok@quintus.uucp (Richard A. O'Keefe) (06/25/88)

In article <5103@ihlpf.ATT.COM> nevin1@ihlpf.UUCP (00704a-Liber,N.J.) writes:
>Unless someone can produce some evidence to the contrary, it seems to me
>that there is no purpose to taking the address of an empty object.

There are programming languages (SETL springs to mind) which have an
"atom" data type, the objects of which have *no* properties other than
their identities.  (Don't confuse them with Lisp atoms.)  I have also
seen this sort of thing used in VDM work.  I think that it would be
occasionally useful to have (typed) objects consuming no (real) store
-- type checking should ensure that pointers to null objects are not
compared with pointers to non-null objects, unless the programmer goes
out of his way to shoot himself in the foot -- which have no properties
other than identity, so that one doesn't absent-mindedly slide into
using arithmetic properties.  Using 'new' to generate a null object in
order to obtain a new token seems a lot cleaner to me than incrementing
a global integer variable, even if that's the implementation.

I would almost go so far as to say that the only reason for having empty
objects at all would be to take their "addresses".
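
The token idea might look like the following in C++. Atom, new_token and same_token are hypothetical names, and std::unique_ptr is used only to keep the sketch leak-free. Identity is the object's address and nothing else.

```cpp
#include <cassert>
#include <memory>

// Hypothetical "atom" type: no data, no operations beyond identity.
class Atom {
public:
    Atom() = default;
    Atom(const Atom&) = delete;             // copying would blur identity
    Atom& operator=(const Atom&) = delete;
};

// Each call yields a fresh token; its address is its only property.
std::unique_ptr<Atom> new_token() {
    return std::unique_ptr<Atom>(new Atom);
}

// Two references name the same token exactly when the addresses match.
bool same_token(const Atom& a, const Atom& b) {
    return &a == &b;
}

// Usage: a token equals itself and differs from any other live token.
void check_tokens() {
    std::unique_ptr<Atom> red = new_token();
    std::unique_ptr<Atom> blue = new_token();
    assert(same_token(*red, *red));
    assert(!same_token(*red, *blue));
}
```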

strouckn@nvpna1.UUCP (Louis Stroucken 42720) (06/27/88)

In article <475@polari.UUCP> rlb@polari.UUCP (rlb) writes:
[...]>}Prologue:
>I assume your posting is based on the feeling (which I also have) that it is
>pretty damn hard to come up with a realistic example in which it *matters*
>whether or not two pointers to different objects of the same empty class are
>distinguishable.  

What about:

p = new empty;
q = new empty;

// ...

if ( p == q ) {
	delete p;
} else {
	delete p;
	delete q;
}

(or something like it. I want a syntax-free language :-)

May I delete an empty object twice? If not, I prefer them to be distinct.

Louis Stroucken

karl@haddock.ISC.COM (Karl Heuer) (06/28/88)

In article <475@polari.UUCP> rlb@polari.UUCP (rlb) writes:
>In article <4614@haddock.ISC.COM>, karl@haddock.ISC.COM (Karl Heuer) writes:
>>Given `int a[N]', &a[N] is a valid pointer that doesn't point anywhere.
>>That is, it's guaranteed that &a[N] can be stored in a pointer variable, but
>>dereferencing it is undefined.  And it's not guaranteed to be distinct from
>>other pointers: &a[N] may well be equal to &b[0].  I think ZSO pointers are
>>in the same category.  (After all, it's just the limiting case N=0.)
>
>The Walking Lint stumbles a bit here :-).  It certainly is guaranteed to be
>distinct from a possibly quite large number of other pointers -- those that
>point to a[0] through a[N-1].

You're misparsing me.  I'm not saying that it compares unequal to everything;
I'm saying that &a[N] has a `flaw' which valid pointers do not: &a[N] might
compare equal to &b[0].

>It is true that a pointer to &a[N] cannot be dereferenced legally and you may
>personally choose to define that as "doesn't point anywhere" but: A) that
>definition does not remove the restrictions on the pointer (must play well
>with other pointers into the same object :-),

I'm not sure I understand why we're concerned with other pointers into the
same object.  I thought the question was whether &x == &y should be forbidden
for two distinct ZSOs.

>and B) this is all just an analogy (not quite a correct one, I think) for the
>subject under discussion, therefore we proceed...

As I said, take the limiting case N=0.  An array of size zero is one type of
ZSO; I would expect its properties to be similar to both an empty class and a
nonempty array.

>>Also, consider `empty x[2]'.  Surely &x[0] and &x[1] must compare equal?  If
>>so, why should the situation be different if I declare `empty x0, x1'?  Or
>>if I call `new empty' twice?
>
>...  To use your own example, &x[0] and &x[1] definitely must NOT compare
>equal, because the expression "&x[1] - &x[0]" must evaluate to the integer 1.

&x[1] is x+1, which is (empty *)((char *)x + sizeof(empty)), which is &x[0].
&x[1]-&x[0] is the difference in bytes divided by the size of the type, which
is 0/0 (indeterminate).  Looks like ZSO-pointer subtraction is undefined.
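
The arithmetic can be modelled without a compiler that actually supports ZSOs. The sketch below is a model only: element_difference is a hypothetical helper that applies the rule "element difference = byte difference / element size" and reports failure when the element size is zero, which is the 0/0 case described above.

```cpp
#include <cassert>
#include <cstddef>

// Computes byte_diff / elem_size into result.
// Returns false when elem_size is zero, where the quotient has no meaning.
bool element_difference(std::ptrdiff_t byte_diff, std::size_t elem_size,
                        std::ptrdiff_t& result) {
    if (elem_size == 0) return false;  // the hypothetical ZSO case
    result = byte_diff / static_cast<std::ptrdiff_t>(elem_size);
    return true;
}

// Usage: an 8-byte gap between 4-byte elements is 2 elements;
// a 0-byte gap between 0-byte elements has no defined answer.
void check_element_difference() {
    std::ptrdiff_t d = 0;
    assert(element_difference(8, 4, d) && d == 2);
    assert(!element_difference(0, 0, d));
}
```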

>Else you must clutter the language definition up with a great many "except
>for zero-sized objects".

I don't think this can be avoided -- except by not implementing ZSOs, which is
of course what we see in practice.  For good reasons, it seems.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint