[comp.lang.c++] Reference Semantics

benson@odi.com (Benson Margulies) (05/05/89)

I've found a use for reference variables that refer via the null
pointer. In the starkest of terms,

       example& t = *(example *)0;

I'd like to clarify the official language philosophy in this area, and
I hope to persuade the designers that this is a good use.

Consider an interface like:

gadget& find_a_gadget (char * gadget_name, int& error);

What should this do if the named gadget cannot be found, and error
signalling out-of-line is undesirable?

It seems natural to me for find_a_gadget to do the following:

	error = ENOGADGET;
	return *(gadget *)0;

Furthermore, the caller of find_a_gadget is welcome to assign the
result of find_a_gadget to reference variable r, and test

	if(&r == (gadget *)0) ...

to detect a failure to get one.

So I would like to see the definition of the language state that it is
permissible to say:

	type& var = *(type *)0;

with the condition that once this is done, the only legitimate action
with var is to take its address and compare it. Along with this, 
the definition would have to state the reasonableness of:

	return *(type 0);

in a function of type type&.

What I would like the definition of the language NOT to state is that
the above declaration is invalid and that an implementation can do
whatever it wants with it. 

One might object to my usage, claiming that I should be returning a
pointer. But it looks ugly to have a protocol defined with some
reference returns and some pointer returns. IMHO, reference returns
(and references) make for much more legible code, so it would be a
shame to have to avoid them to handle errors.
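
For comparison, the pointer-returning alternative mentioned above might
be sketched roughly as follows, in present-day C++. The gadget struct,
the two-entry table, and the value of ENOGADGET are assumptions made up
for illustration, and the name parameter is taken as const char * so
that string literals can be passed; only the function name and the
error parameter come from the interface shown earlier.

```cpp
#include <cstring>

// Hypothetical gadget type; the post does not define it.
struct gadget {
    const char* name;
    int size;
};

const int ENOGADGET = 1;  // assumed value, for illustration only

// A tiny fixed table standing in for whatever store real code would search.
static gadget gadget_table[] = {
    { "widget", 10 },
    { "sprocket", 20 },
};

// Pointer-returning variant: a null pointer signals "not found".
// error is set to 0 on success and to ENOGADGET on failure.
gadget* find_a_gadget(const char* gadget_name, int& error)
{
    for (gadget& g : gadget_table) {
        if (std::strcmp(g.name, gadget_name) == 0) {
            error = 0;
            return &g;
        }
    }
    error = ENOGADGET;
    return nullptr;
}
```

A caller tests the returned pointer (or error) before using ->, which
is the syntax the reference-returning protocol is trying to avoid.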

As an additional suggestion, it occurs to me that the following might
be a good extension of the language:

	type& ref = initial_ref;

	&ref = new_reference;

&foo on the LHS has no current semantics, so defining it to mean
rebind the reference conflicts with nothing. This allows somewhat
neater code for a variety of cases. One in particular is the desire to
iterate over an array of objects. Currently, one can write:

	for(int x = 0; x < limit; x ++) {
	    foo& r = array[x];
	    ...
        }

better, IMHO, would be

	for(int x = 0; x < limit; x ++) {
	    &r = array[x];
	    ...
        }

or even:

	for(foo& r = array[0]; &r < &array[limit]; &r ++) {
	    ...
	}
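
The form that works in the current language (a fresh reference
declared inside the loop body) can be sketched in a self-contained
way. The struct foo with a single value member and the function total
are made up for illustration; only the loop shape comes from the text
above.

```cpp
#include <cstddef>

// Hypothetical element type; the post only names it foo.
struct foo {
    int value;
};

// Sums the value members, binding a fresh reference on each iteration
// because an existing reference cannot be rebound.
int total(foo* array, std::size_t limit)
{
    int sum = 0;
    for (std::size_t x = 0; x < limit; x++) {
        foo& r = array[x];  // new reference each time through the loop
        sum += r.value;
    }
    return sum;
}
```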

thank you for your kind attention,

-- 
Benson I. Margulies

shopiro@alice.UUCP (Jonathan Shopiro) (05/05/89)

In article <315@odi.ODI.COM>, benson@odi.com (Benson Margulies) writes:
> I've found a use for reference variables that refer via the null
> pointer. In the starkest of terms,
> 
>        example& t = *(example *)0;
> 
> I'd like to clarify the official language philosophy in this area, and
> I hope to persuade the designers that this is a good use.
> 
> Consider an interface like:
> 
> gadget& find_a_gadget (char * gadget_name, int& error);
> 
> What should this do if the named gadget cannot be found, and error
> signalling out-of-line is undesirable?
> 
> It seems natural to me for find_a_gadget to do the following:
> 
> 	error = ENOGADGET;
> 	return *(gadget *)0;
> 
> So I would like to see the definition of the language state that it is
> permissible to say:
> 
> 	return *(type 0);
You meant      *(type*)0;
> 
> in a function of type type&.

Of course this more-or-less works in the current AT&T compiler,
and I don't see any reason why it should not work in any other
implementation (but the _only_ meaningful operation is to take
the address of a zero-referenced object).  However, I would
recommend strongly against using this as a programming style.
The problem is it's just too brittle.  Suppose you had

	Type&	f();	// a function that might return a zero reference

	Type	t = f();  // note t is not a reference

When f() decides to return a zero reference, the program will most likely
crash, because the generated code will try to use the reference returned
from f() to initialize t.  In other words, the only way to safely use f()
is

	Type&	tr = f();
	if (&tr != 0) {
		// use the referenced object
	}

and all other uses of f() are unsafe.  If you must return a reference
to an object, you can encode the idea that this object is the result
of a failed operation in its state, or (here's a gross hack for you)

	extern Type	failedOperationInstanceOfType;
	Type&	tr = f();  // f can return a reference to the above
	if (&tr != &failedOperationInstanceOfType) {
		// use the referenced object
	}

At least in this case the program won't crash if you try to copy the
failure object.
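
A fuller sketch of the failure-object idea might look like the
following. The single value member, the realObject instance, the
bool parameter on f, and the helper failed() are all assumptions
added so that the fragment is complete; they are not part of the
fragment above.

```cpp
// Hypothetical Type with one data member, for illustration only.
struct Type {
    int value;
};

// The distinguished failure object; callers compare addresses against it.
Type failedOperationInstanceOfType = { -1 };

// Stand-in for an object that a successful call would refer to.
static Type realObject = { 42 };

// Assumed stand-in for f(): returns the failure object when asked to fail.
Type& f(bool succeed)
{
    return succeed ? realObject : failedOperationInstanceOfType;
}

// True when the reference denotes the failure object.
bool failed(const Type& t)
{
    return &t == &failedOperationInstanceOfType;
}
```

Writing Type t = f(false); here copies the failure object rather than
reading through a zero reference, so the careless caller gets a
recognizable value instead of a crash.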

> 
> As an additional suggestion, it occurs to me that the following might
> be a good extension of the language:
> 
> 	type& ref = initial_ref;
> 
> 	&ref = new_reference;
> 
> &foo on the LHS has no current semantics, so defining it to mean
> rebind the reference conflicts with nothing.

But then &ref as an lvalue would have a completely different meaning
from &ref as an rvalue.  Too confusing.
-- 
		Jonathan Shopiro
		AT&T Bell Laboratories, Warren, NJ  07060-0908
		research!shopiro   (201) 580-4229

diamond@diamond.csl.sony.junet (Norman Diamond) (05/15/89)

In article <9310@alice.UUCP> shopiro@alice.UUCP (Jonathan Shopiro) writes:

>If you must return a reference
>to an object, you can encode the idea that this object is the result
>of a failed operation in its state, or (here's a gross hack for you)
>
>	extern Type	failedOperationInstanceOfType;
>	Type&	tr = f();  // f can return a reference to the above
>	if (&tr != &failedOperationInstanceOfType) {
>		// use the referenced object
>	}

Doesn't look like a gross hack to me.  Looks like a perfectly correct,
well structured solution, given the premises that a reference must be
returned and that exceptional cases cannot be handled out-of-band.

--
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.co.jp@relay.cs.net)
  The above opinions are my own.   |  Why are programmers criticized for
  If they're also your opinions,   |  re-implementing the wheel, when car
  you're infringing my copyright.  |  manufacturers are praised for it?

benson@odi.com (Benson Margulies) (05/16/89)

Here are some more thoughts on this:

1) I ask myself, why have both pointers and references in the
language? There seem to be three differences as described by js:

   a) references obviate the need for the -> syntax.
   b) references can "never" be null, obviating
	error checks.
   c) references can never change their binding.

Of the three, (a) is clearly an advantage. (b) might be an advantage,
but I have more to say on the point later. (c), IMHO, is an
inconvenience, leading to lots of extra {} blocks. (Now, in Lisp, it
is quite typical to have language constructs that require an extra
block. But in C++/C, it's pretty inconsistent with the rest of the
language. (Unless, of course, someone is planning dynamically sized
arrays.))

Now for (b). I argue that (b) is false.  C++, like C, is supposed to
be close to the machine.  Therefore, (IMHO) it is inappropriate for
implementations to emit code at runtime to ensure that a reference is
never assigned to null. Consider this code fragment:

	foo * foo_pointer;

... arbitrarily complex code ....

       foo& FOO = *foo_pointer;

For compilers to toss an extra dereference into here, let alone a
check against the null pointer, would be inappropriate.  Therefore, in
any case where you are the implementor of a user-callable interface
with a reference parameter, you ALREADY have to check for null to make
safe code (unless, of course, you prefer "core dumped" as a
diagnostic message).

Since the existing language does not, in fact, prevent null
references, and seems incapable of protecting against them, I (putting
on my best, if somewhat dusty from idleness, language lawyer wig)
claim that the language ought to explicitly describe their semantics,
and that it should demand that in all implementations one must be able
to test the address of a reference against 0.

As for (c), I still believe that 

	&ref = new_pointer;

would be very useful, and more in the C general style. But I have no 
argument like the above to help me on this one.


-- 
Benson I. Margulies

ark@alice.UUCP (Andrew Koenig) (05/17/89)

In article <323@odi.ODI.COM>, benson@odi.com (Benson Margulies) writes:

> As for (c), I still believe that 
> 
> 	&ref = new_pointer;
> 
> would be very useful, and more in the C general style.

I won't get into the argument of whether or not such a thing
would be useful.  I will merely point out that this particular
syntax can't work because it already has a meaning in some
contexts.

For example:

	struct T {
		int& operator&();
	};

	void f(T& ref)
	{
		&ref = 3;
	}

This is valid in all current versions of C++.
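
To make the effect visible, the same example can be filled out as
below. The stored member and the body of operator& are assumptions
added for illustration; the declaration of operator& and the function
f are as given above.

```cpp
// A class whose overloaded unary & returns a reference to an int member,
// so the expression &ref is an lvalue of type int.
struct T {
    int stored;                          // hypothetical member, not in the post
    int& operator&() { return stored; }
};

void f(T& ref)
{
    &ref = 3;   // calls ref.operator&() and assigns 3 through the returned int&
}
```

After f(t) the member t.stored holds 3, so &ref on the left-hand side
already means something for such a class.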
-- 
				--Andrew Koenig
				  ark@europa.att.com

mat@mole-end.UUCP (Mark A Terribile) (05/18/89)

> 1) I ask myself, why have both pointers and references in the
> language? There seem to be three differences as described by js:

There are a couple of good reasons, centering around the value that references
have when passing arguments.  If we want to be able to write functions
(including operator functions) that accept objects rather than pointers as
written, but that actually pass pointers (for reference semantics or for
efficiency) we seem to be pretty well stuck to references.
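
As a small illustration of that point, here is a sketch of an operator
function whose operands are written as objects but passed by
reference. The Complex struct is a made-up example type, not something
taken from the discussion.

```cpp
// Hypothetical value type, for illustration only.
struct Complex {
    double re;
    double im;
};

// Operands are written as objects at the call site (a + b) but passed by
// reference, so a and b are not copied and no & or -> is needed.
Complex operator+(const Complex& a, const Complex& b)
{
    Complex sum = { a.re + b.re, a.im + b.im };
    return sum;
}
```

With pointer parameters the call would have to be spelled &a + &b,
which cannot be declared as an overloaded operator at all, since at
least one operand must be of class type.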

-- 

 (This man's opinions are his own.)
 From mole-end				Mark Terribile