[comp.lang.c++] >this< as a reference

rpk@rice-chex.ai.mit.edu (Robert Krajewski) (10/16/90)

Now that there is no need to assign to >this<, wouldn't it have
been cleaner if >this< were a reference, and not a pointer ?
----
Robert P. Krajewski
Internet: rpk@ai.mit.edu ; Lotus: robert_krajewski.lotus@crd.dnet.lotus.com

bs@alice.att.com (Bjarne Stroustrup) (10/16/90)

rpk@rice-chex.ai.mit.edu (Robert Krajewski @ MIT Artificial Intelligence Laboratory) writes:

 > Now that there is no need to assign to >this<, wouldn't it have
 > been cleaner if >this< were a reference, and not a pointer ?

Probably, but `this' was invented/introduced about 4 years before references.

jimad@microsoft.UUCP (Jim ADCOCK) (10/18/90)

>Now that there is no need to assign to >this<, wouldn't it have
>been cleaner if >this< were a reference, and not a pointer ?

I think so, but if one had made "this" a reference [in which case call
it "self"] then one should have changed other parts of the language to
return references too.  For example, in that case "new" should return
a reference rather than a pointer too.  Then, if one is making use
of references central in the language, one probably needs to make 
references assignable, like other OOP languages.

Not using references for everything has given C++ some indirect advantages,
though.  Most other OOPLs always access objects via reference, using 
reference semantics.  This hinders their ability to offer extended 
"primitive" classes such as class complex or String classes.  The other
languages thus also don't embed objects, but only embed a reference to the
"embedded" object, leading to more fragmentation, longer construction
times, additional indirections, etc.  Most other OOPLs also don't allow
objects on stack or static, but only on the heap.  So, overall, I think
the C++ approach has fared pretty well for the language, even if the
duality of pointer/reference semantics everywhere leads to some confusion.
One programming convention that one can consider to cut down on the 
confusion is to use pointers to refer to non-primitive objects -- things
that follow "reference semantics", and use references for "primitive" objects --
things like complex and String that follow value semantics.
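
A minimal sketch of the value/reference distinction described above, in
present-day C++. The names Complex, Signal and SignalByPointer are
hypothetical and serve only as illustration:

```cpp
#include <cassert>
#include <complex>

typedef std::complex<double> Complex;

// Value semantics with embedding: both Complex members live inside
// the Signal object itself, so no extra indirection is needed.
struct Signal {
    Complex gain;
    Complex offset;
};
static_assert(sizeof(Signal) >= 2 * sizeof(Complex),
              "embedded members occupy space inside the enclosing object");

// The same data reached through pointers, roughly as in a language
// where every object is accessed by reference.
struct SignalByPointer {
    Complex* gain;
    Complex* offset;
};

// Copying a value type produces an independent object.
bool value_copy_is_independent() {
    Signal a;
    a.gain = Complex(1.0, 0.0);
    a.offset = Complex(0.0, 0.0);
    Signal b = a;                 // copies both embedded members
    b.gain = Complex(2.0, 0.0);   // changes b only
    return a.gain == Complex(1.0, 0.0);
}

// Copying the pointer-based type copies only the pointers, so both
// copies still designate the same Complex objects.
bool pointer_copy_is_shared() {
    Complex g(1.0, 0.0);
    Complex o(0.0, 0.0);
    SignalByPointer a = { &g, &o };
    SignalByPointer b = a;        // copies the two pointers only
    *b.gain = Complex(2.0, 0.0);  // visible through a as well
    return *a.gain == Complex(2.0, 0.0);
}
```

The first function shows why complex-like types are comfortable to pass
around by value; the second shows the sharing that comes with reference
semantics.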

To my mind, perhaps the saddest legacy left over from "C" days is the
automatic conversion of an array to a pointer to its first member.  Thus,
when presented with a Foo*, one doesn't really know if one has access to
an isolated Foo object, the start of an array of Foos, or a pointer into 
an array of Foos.  To my mind this is a great weakness in the C++ type
system, and leads to many aliasing problems, and to compilers having to
pessimize their code.  Too late to fix it now, though....
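
A short sketch of the ambiguity described above, assuming a hypothetical
struct Foo. The second function uses the ordinary C++ reference-to-array
parameter form, shown only for contrast; it is not offered as a fix for
the general problem:

```cpp
#include <cassert>
#include <cstddef>

struct Foo { int value; };

// With a Foo* parameter the callee cannot tell whether it was handed
// a lone Foo, the start of an array, or a pointer into the middle of
// one, so the element count must be passed separately and trusted.
int sum_via_pointer(const Foo* p, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) total += p[i].value;
    return total;
}

// A reference to an array keeps the extent N in the type, so no
// conversion to Foo* takes place and the count cannot be wrong.
template <std::size_t N>
int sum_via_array_ref(const Foo (&items)[N]) {
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) total += items[i].value;
    return total;
}

int demo_pointer() {
    Foo foos[3] = { {1}, {2}, {3} };
    return sum_via_pointer(foos, 3);   // foos converts to &foos[0]
}

int demo_array_ref() {
    Foo foos[3] = { {1}, {2}, {3} };
    return sum_via_array_ref(foos);    // N is deduced as 3
}
```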

scp@acl.lanl.gov (Stephen C. Pope) (10/19/90)

on 17 Oct 90 17:54:19 GMT,
jimad@microsoft.UUCP (Jim ADCOCK) said:

Jim> To my mind, perhaps the saddest legacy left over from "C" days is the
Jim> automatic conversion of an array to a pointer to its first member.  Thus,
Jim> when presented with a Foo*, one doesn't really know if one has access to
Jim> an isolated Foo object, the start of an array of Foos, or a pointer into 
Jim> an array of Foos.  To my mind this is a great weakness in the C++ type
Jim> system, and leads to many aliasing problems, and to compilers having to
Jim> pessimize their code.  Too late to fix it now, though....

Do check out Doug Lea's submission to the ANSI committee concerning
array references.  He deals with just this issue, in a manner which
enhances the safety of using arrays tremendously.

stephen pope
advanced computing lab, lanl
scp@acl.lanl.gov

dl@g.g.oswego.edu (Doug Lea) (10/20/90)

> Do check out Doug Lea's submission to the ANSI committee concerning
> array references.  He deals with just this issue, in a manner which
> enhances the safety of using arrays tremendously.

Thanks, Stephen, but I haven't submitted it yet, because I'd like to
get more informal feedback about it before doing so. Toward that end,
I'll post a draft/request-for-comments on comp.std.c++.

 
--
Doug Lea  dl@g.oswego.edu || dl@cat.syr.edu || (315)341-2688 || (315)443-1060
|| Computer Science Department, SUNY Oswego, Oswego, NY 13126 
|| Software Engineering Lab, NY CASE Center, Syracuse Univ., Syracuse NY 13244

pcg@aber-cs.UUCP (Piercarlo Grandi) (10/21/90)

In article <58304@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:

  >Now that there is no need to assign to >this<, wouldn't it have
  >been cleaner if >this< were a reference, and not a pointer ?
  
  I think so, but if one had made "this" a reference [in which case call
  it "self"] then one should have changed other parts of the language to
  return references too.  For example, in that case "new" should return
  a reference rather than a pointer too.

This looks very bizarre to me. Are you under the impression that references
are kind of like pointers? I do not think so. I think they are very
different. I guess that the new operator should continue to return pointers
even if there are references around.

  Then, if one is making use of references central in the language, one
  probably needs to make references assignable, like other OOP languages.

Here I think that you are falling prey to nominalism. There are things
called references in C++ and in say Simula67 or Eiffel. They are completely
different beasts, even if the name is the same. The Simula 67 and Eiffel
references are the direct equivalents of pointers in C++.

C++ references are like Algol 68 constant references, i.e. aliases. Pointers
are another way of doing aliases, but they have one level of indirection more
than C++ references (they are references to references).
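
The alias-versus-pointer distinction can be sketched in a few lines of
C++. The function names are hypothetical; the behaviour shown is that of
ordinary C++ references and pointers:

```cpp
#include <cassert>

// A reference is another name for an existing object: it has the
// same address as its referent and writes go straight through.
bool reference_is_alias() {
    int i = 100;
    int& j = i;      // j names the same object as i
    j = 7;           // assigns to i
    return i == 7 && &j == &i;
}

// A pointer is itself an object holding an address, so it can be
// made to designate something else. Assigning through a reference
// never rebinds it; it copies into the referent.
bool pointer_is_reseatable() {
    int a = 1;
    int b = 2;
    int* p = &a;     // p designates a
    p = &b;          // reseat: p now designates b, a is untouched
    *p = 20;         // writes to b
    int& r = a;      // r is bound to a for its whole lifetime
    r = b;           // copies b's value into a; r still names a
    return a == 20 && b == 20 && &r == &a && p == &b;
}
```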

  Not using references for everything has given C++ some indirect advantages,
  though.  Most other OOPLs always access objects via reference, using 
  reference semantics.

Read "pointers" throughout... Of course C++ has the advantage that to define
composites it can use contiguity and not just pointers. This is a *large*
advantage, and makes it clear that things like derivation are only justified
in the context of a pointer based OO language, which C++ is not.

  This hinders their ability to offer extended "primitive" classes such as
  class complex or String classes.

In no way I am aware of, frankly. Unless you are speaking of efficiency. But
Smalltalk is a pointer based OO language and it does provide fine
granularity (not "primitive" -- they are composite) classes, slowly.

  So, overall, I think the C++ approach has fared pretty well for the
  language, even if the duality of pointer/reference semantics everywhere
  leads to some confusion.

No confusion at all for those who have studied Algol 68 or BLISS:

	C++				Algol 68

	const int c = 10;		INT c = 10;

	int i = 100;			INT i := 100;
	int &j = i;			REF INT j = i;

	int *p = &i;			REF INT p := i;
	int *p = new int;		REF INT p := HEAP INT;

Note that three of the Algol 68 lines above are abbreviations:

	INT i := 100;		REF INT i = LOC INT; i := 100;
	REF INT p := i;		REF REF INT p = LOC REF INT; p := i;
	REF INT p := HEAP INT;	REF REF INT p = LOC REF INT; p := HEAP INT;

Explanation: '=' in Algol 68 defines a constant. 'i' above is a constant of
type 'reference to integer', and its value is a reference (returned by the
local reference allocator LOC) to some memory place. ':=' in Algol 68 is an
abbreviation for '=' (when in a declaration), and adds an implicit extra REF
and a call to the local reference generator.

In a sense in C all declarations imply one REF and LOC, except that 'const'
takes away the ref, '&' takes away the LOC, and '*' adds an extra REF.

	Algol 68			C++

	TYPE x = value;			const TYPE x = value;

	REF TYPE x = LOC TYPE;		TYPE x;
	REF TYPE x = y;			TYPE &x = y;

	REF REF TYPE x = LOC REF TYPE;	TYPE *x;
-- 
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

jimad@microsoft.UUCP (Jim ADCOCK) (10/23/90)

In article <2058@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
|In article <58304@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
|
|  >Now that there is no need to assign to >this<, wouldn't it have
|  >been cleaner if >this< were a reference, and not a pointer ?
|  
|  I think so, but if one had made "this" a reference [in which case call
|  it "self"] then one should have changed other parts of the language to
|  return references too.  For example, in that case "new" should return
|  a reference rather than a pointer too.
|
|This looks very bizarre to me. Are you under the impression that references
|are kind of like pointers? I do not think so. I think they are very
|different. I guess that the new operator should continue to return pointers
|even if there are references around.

References are kind of like pointers, in that in common use both allow a 
way to access and send messages to objects:

object.doSomething();  // versus:

object->doSomething();

The point I was trying to make is that in a language that supports two ways of
sending messages, either one of the two techniques is central, or you duplicate
every feature of the language everywhere.  If you want to make the 
"reference" style of programming central, then "this" [self] is a "reference", 
and so are other common features of the language, such as return values of new. 
If you want to make the "pointer" style of programming central, then you leave
the language the way it is.
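
A small sketch of the two styles side by side, assuming a hypothetical
class Counter. It only illustrates how the pointer-valued `this' and
`new' interact with '.' syntax in the language as it stands:

```cpp
#include <cassert>

// Hypothetical class for illustration.
class Counter {
public:
    Counter() : count_(0) {}
    // `this` is a Counter*; returning *this yields a Counter&,
    // which lets callers keep chaining with '.'.
    Counter& bump() { ++count_; return *this; }
    int count() const { return count_; }
private:
    int count_;
};

// "Reference" style: everything is reached with '.'.
int dot_style() {
    Counter c;
    return c.bump().bump().count();
}

// "Pointer" style: new yields a Counter*, so '->' is used until the
// pointer is explicitly dereferenced and bound to a reference.
int arrow_style() {
    Counter* p = new Counter;
    p->bump();
    Counter& self = *p;      // switch to '.' style by hand
    self.bump().bump();
    int result = p->count();
    delete p;
    return result;
}
```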

|C++ references are like Algol 68 constant references, i.e. aliases. Pointers
|are another way of doing aliases, but they have one level of indirection more
|than C++ references (they are references to references).

Agreed.  If C++ references are constant references, this leaves the 
possibility open for references that are not constant -- ie they are 
re-assignable.  If one had re-assignable references in C++, then one could
program using a syntax almost identical to most other OOPLs.

|  Not using references for everything has given C++ some indirect advantages,
|  though.  Most other OOPLs always access objects via reference, using 
|  reference semantics.
|
|Read "pointers" throughout... Of course C++ has the advantage that to define
|composites it can use contiguity and not just pointers. This is a *large*
|advantage, and makes it clear that things like derivation are only justified
|in the context of a pointer based OO language, which C++ is not.

Hm, I don't get this, you say that contiguity is a great advantage, but then
state [without justification] that derivation is only justified in the context
of a pointer based OO language, which C++ is not.  So you seem to be stating
that you think contiguity is good, but prohibits derivation?  Can you 
explain this position?

|  This hinders their ability to offer extended "primitive" classes such as
|  class complex or String classes.
|
|In no way I am aware of, frankly. Unless you are speaking of efficiency. But
|Smalltalk is a pointer based OO language and it does provide fine
|granularity (not "primitive" -- they are composite) classes, slowly.

Primitives need to be created on the stack or in globals.  Most OOPLs provide
for a very limited set of built-in primitives that follow value semantics,
and require all programmer created types be non-primitive classes that 
follow reference semantics, are created on the heap....  Conversely, C++
allows one to add to the set of built-in primitive types with additional
types of "simple" objects that follow value semantics: complex, iostream,
iter classes, etc. are some examples of these extended primitive classes.
I do not define primitive as non-composite, but rather as classes that are
relatively simple and follow value semantics, rather than reference semantics.

|  So, overall, I think the C++ approach has fared pretty well for the
|  language, even if the duality of pointer/reference semantics everywhere
|  leads to some confusion.
|
|No confusion at all for those who have studied Algol 68 or BLISS:
|
|	C++				Algol 68
|
|	const int c = 10;		INT c = 10;
|
|	int i = 100;			INT i := 100;
|	int &j = i;			REF INT j = i;
| ....

Experience with Algol may help one understand C++ references, but it in 
no way helps one remember which of the . or -> syntax one needs to use when
dealing with someone else's classes, nor whether a reference or a pointer is
expected as a parameter, or is returned from a routine, or ....:

    object = something->subpart()->doSomething(someParam);
    object = &(something->subpart()->doSomething(*someParam));
    object = *(something->subpart()->doSomething(&someParam));
    object = *(something.subpart()->doSomething(someParam));
    object = *(something.subpart->doSomething(*someParam));
    object = something.subpart.doSomething(&someParam);

Unfortunately, so many reasonable conventions exist [including not using
*any* conventions] that a C++ programmer runs into a different set
of usage conventions for each and every set of libraries used.
Thus some complain that C++ is not an object oriented language, but a set
of tools for programmers to build their own object-oriented language on top of.
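
To make the convention problem concrete, here is a sketch with two
made-up library functions that do the same job. The names scale_a and
scale_b, and the conventions attributed to each, are assumptions for
illustration and do not come from any real library:

```cpp
#include <cassert>

struct Param  { int n; };
struct Result { int n; };

// Hypothetical library A: pointers in, pointer out.
Result* scale_a(Result* out, const Param* p) {
    out->n = p->n * 2;
    return out;
}

// Hypothetical library B: references in, reference out.
Result& scale_b(Result& out, const Param& p) {
    out.n = p.n * 2;
    return out;
}

// The caller has to remember which convention each library follows
// and sprinkle '&' and '*' accordingly.
int call_both() {
    Param param = { 21 };
    Result r1 = { 0 };
    Result r2 = { 0 };
    Result copy_a = *scale_a(&r1, &param);  // take addresses, then dereference
    Result copy_b = scale_b(r2, param);     // no decoration at all
    return copy_a.n + copy_b.n;
}
```

Nothing in the call syntax of scale_b tells the reader that r2 may be
modified, which is one reason some code bases prefer the pointer form
for output parameters.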

pcg@cs.aber.ac.uk (Piercarlo Grandi) (10/26/90)

In article <58470@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:

jimad> In article <2058@aber-cs.UUCP> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:

pcg> In article <58304@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:

pcg> This looks very bizarre to me. Are you under the impression that
pcg> references are kind of like pointers? I do not think so. I think
pcg> they are very different. I guess that the new operator should
pcg> continue to return pointers even if there are references around.

jimad> References are kind of like pointers, in that in common use both
jimad> allow a way to access and send messages to objects:
jimad> object.doSomething();  // versus:
jimad> object->doSomething();

Well, you can use . also to access members of an object value, not just
a reference to it. That is a weak reason to say that references are kind
of like pointers. They are indeed completely different, because their
algebra is completely different -- for example, as they are now, you cannot compare
references for equality, but you can compare pointers for equality.
Indeed, as they are now, there are no values of type 'TYPE &', while
there are values of type 'TYPE *'.
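
The difference in "algebra" can be sketched as follows. The function
names are hypothetical; the point is only that == applied to references
compares the referents, while == applied to pointers compares identity:

```cpp
#include <cassert>

// Applying == to two references compares the objects they name,
// not the references themselves.
bool references_compare_referents() {
    int a = 5;
    int b = 5;
    int& ra = a;
    int& rb = b;
    return ra == rb;            // true: equal values, distinct objects
}

// Applying == to two pointers compares the addresses they hold.
bool pointers_compare_identity() {
    int a = 5;
    int b = 5;
    int* pa = &a;
    int* pb = &b;
    return pa != pb && *pa == *pb;
}

// To ask whether two references name the same object, one has to
// step over to pointer values by taking addresses.
bool identity_needs_addresses() {
    int a = 5;
    int& r1 = a;
    int& r2 = a;
    return &r1 == &r2;
}
```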

jimad> Not using references for everything has given C++ some indirect
jimad> advantages, though.  Most other OOPLs always access objects via
jimad> reference, using reference semantics.

pcg> Read "pointers" throughout... Of course C++ has the advantage that
pcg> to define composites it can use contiguity and not just pointers.
pcg> This is a *large* advantage, and makes it clear that things like
pcg> derivation are only justified in the context of a pointer based OO
pcg> language, which C++ is not.

jimad> Hm, I don't get this, you say that contiguity is a great
jimad> advantage, but then state [without justification] that derivation
jimad> is only justified in the context of a pointer based OO language,
jimad> which C++ is not.  So you seem to be stating that you think
jimad> contiguity is good, but prohibits derivation?  Can you explain
jimad> this position?

Ok, here we go: derivation in the form of prefixing was invented in
Simula 67. Simula 67 was designed to be strictly Algol 60 compatible.

To achieve this a decision was made to have an obviously distinct syntax
and implementation for the bolted on OO aspects. Thus the decision to
declare all objects using the ref(TYPE) syntax. This decision was
probably justified but had three regrettable consequences, or maybe design
goals:

1) pointers are in effect only introduced in the OO part of the
language. The Algol 60 part stays pointerless.

2) not only can pointers point only to objects, all objects can be
accessed only thru pointers.

3) all objects must live on the heap.

All these are consequences of the desire to have Simula 67 as a very
strict extension to Algol 60, which has neither pointers, nor
composition nor a heap; the base language is left unchanged. They need
not apply to languages not based on Algol 60. For example C++ is based
on C, which being a descendant of Algol 68 in many ways, already has
structures, pointers, and the heap as well as the stack.

Now, in an OO language like Simula 67 (or Smalltalk or Eiffel, but these
do not have the excuse of being based on Algol 60 and designed for
strict orthogonality of extension to it) where all objects must be
accessed via pointers creating composite objects can be inefficient,
because it implies a pointer and indirection for each subobject (not a
problem for Lispers of course :->).

Thus Simula 67 prefixing, which allows physical contiguity of the fields
of the subobjects. Unfortunately derivation as prefixing has also been
given as a regrettable bonus an effect in the algebra of the interfaces
of the classes involved, not just in their implementation -- not only is
their representation merged, but also their interface, i.e. accessor
functions and operations.

This is regrettable, because it confuses the algebra of interfaces with
that of implementations, binding the two so strictly.

In C++ we have a looser and better, but not yet optimal, situation. We
can have normal composition, that implies merging of implementation but
not of interface, and derivation, in which both are merged.

I have already argued in other articles that the algebras of
implementations and interfaces should be strictly distinct. In
particular I do not like derivation because it ties the merging of
interfaces with the merging of implementations, while IMNHO we can find
more interesting mechanisms for merging of implementations.  For
example, delegation in one form or another, does achieve merging of
interfaces without requiring merging the implementations.
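
A rough C++ sketch of the three arrangements discussed above. The classes
Engine, Car, TurboEngine and EngineProxy are hypothetical, and the
hand-written forwarding in EngineProxy is only one simple reading of
"delegation", assumed here for illustration:

```cpp
#include <cassert>

class Engine {
public:
    Engine() : rpm_(0) {}
    void start() { rpm_ = 800; }
    int rpm() const { return rpm_; }
private:
    int rpm_;
};

// Composition: Engine's representation is embedded contiguously in
// Car, but Engine's interface is not part of Car's interface.
class Car {
public:
    void drive() { engine_.start(); }
    int engine_rpm() const { return engine_.rpm(); }
private:
    Engine engine_;
};

// Derivation: representation and interface are merged together.
class TurboEngine : public Engine {};

// Delegation by forwarding: the interface is mirrored, while the
// representation stays in a separate object reached by pointer.
class EngineProxy {
public:
    explicit EngineProxy(Engine* target) : target_(target) {}
    void start() { target_->start(); }
    int rpm() const { return target_->rpm(); }
private:
    Engine* target_;
};

int composition_demo() {
    Car car;
    car.drive();
    return car.engine_rpm();
}

int derivation_demo() {
    TurboEngine turbo;
    turbo.start();              // inherited interface
    return turbo.rpm();
}

int delegation_demo() {
    Engine shared;
    EngineProxy proxy(&shared);
    proxy.start();              // forwarded to the separate Engine
    return shared.rpm();
}
```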

In Simula 67 prefixing can be grudgingly accepted because there is
no other way to have contiguity, given that the OO part of the language
is strictly pointer based to differentiate it from the Algol 60 part.

This is no longer true in C++, which is based on C, and where not only is
C less limited than Algol 60, but strict extension is also not a
goal. Yet we still have to use prefixing for merging interfaces, but it is
plain to the naked eye that it is a rotten technology (complexities
abound), and there is no need to tie it to contiguity and vice versa.

jimad> I do not define primitive as non-composite, but rather classes that are
jimad> relatively simple and follow value semantics, rather than reference semantics.

This is self-serving... Uhm.

jimad> Experience with Algol may help one understand C++ references, but it in 
jimad> no way helps one remember which of the . or -> syntax one needs to use when
jimad> dealing with someone else's classes, nor whether a reference or a pointer is
jimad> expected as a parameter, or is returned from a routine, or ....:

This is only because the C++ references were added as an afterthought. If
C had been designed along the Algol 68 lines, with a unified concept of
references, and a clear distinction between object values and reference
values, there would be no such problems.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk