[comp.lang.c++] Null references

jgro@lia (Jeremy Grodberg) (02/14/91)

I've been looking through the ARM to try to find out how null pointers and
references work together, and I can find neither any guarantees nor any
statements that the behaviour is undefined.  Here is a very simple example
of what I want (it is of course too simple to be useful).

#define NULL 0

int* square(int& x)
{
   if (&x != NULL) 
     x *= x;

   return &x;
}

main()
{
  int* p = NULL;
  int& death = *square(*p);  // I want a *guarantee* that this will not crash

  death = 0; // Of course, this causes a run-time crash
}


In other words, I want a guarantee that pointers are not dereferenced
when initializing references.   This is in fact the observed behavior of
CFront 2.0.  It is useful in some circumstances when you want to 
preserve object semantics with references, but occasionally have a null
object reference.  For example, suppose I want to implement a look-up
table, and my find() function needs to return an object.  It is much
more efficient to return an object reference, but I would like to return
NULL if I can't find the requested object.  I currently have no guarantee
that this will work, and my only other solution is to return an object
pointer.  This makes references less useful, and burdens the rest of my
code with creating temporary pointers to hold the returned result so I can
check it against NULL before converting it to a reference.  There are other
situations in which null references substantially clean up the code.
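
A minimal sketch of the pointer-returning alternative described above, for
comparison.  Entry, find_entry and set_value are hypothetical names made up
for illustration; nothing here comes from the poster's actual code.  It shows
the temporary pointer being tested against null before a reference is bound.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical types and names, for illustration only.
struct Entry {
    const char* key;
    int value;
};

// Returns a pointer to the matching entry, or a null pointer if there is none.
Entry* find_entry(Entry* table, std::size_t count, const char* key)
{
    for (std::size_t i = 0; i < count; ++i)
        if (std::strcmp(table[i].key, key) == 0)
            return &table[i];
    return 0;
}

// Caller side: test the pointer first, and only then bind a reference.
// Returns true if the key was found and its value was updated.
bool set_value(Entry* table, std::size_t count, const char* key, int value)
{
    Entry* p = find_entry(table, count, key);   // the temporary pointer
    if (p == 0)
        return false;                           // not found; no reference is formed
    Entry& e = *p;                              // p is known to be non-null here
    e.value = value;
    return true;
}
```

The reference e is only ever initialized from an object, at the cost of the
extra pointer variable the post complains about.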

Since we already have all sorts of special case treatment
of null pointers, and since we are guaranteed that the address (&) of
a reference is the same as the address of the object which it refers to,
it doesn't seem like such an extra burden to allow the creation of 
null references.

If the ANSI committee thinks this guarantee is too hard to make, and/or
just isn't worth the effort, then I would like a statement to that effect
added to the standard, so that we can all know this is non-portable,
and so that people realize that null references might be allowed by the
compiler.  For example, if the initialization of "death" in my example
is successful, some people will take that to mean that "death" refers
to a valid object, and be totally confused when the simple assignment
statement causes a crash.  The statement that null references have 
undefined behaviour will help alert people to that situation.


-- 
Jeremy Grodberg      "I don't feel witty today.  Don't bug me."
jgro@lia.com          

dag@control.lth.se (Dag Bruck) (02/15/91)

In article <1991Feb13.224743.1123@lia> jgro@lia.com (Jeremy Grodberg) writes:
>...  For example, suppose I want to implement a look-up
>table, and my find() function needs to return an object.  It is much
>more efficient to return an object reference, but I would like to return
>NULL if I can't find the requested object.

I guess this is the most common problem with references, and that it
will be solved (in most cases) when we get exception handling implemented.

Instead of returning a NULL reference, you would throw NOT_FOUND to
indicate when the lookup routine cannot find the object.
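
A rough sketch of what that could look like once exception handling is
available.  Entry, NotFound, find_or_throw and value_or are hypothetical names
chosen for illustration (NotFound stands in for NOT_FOUND); this is one
possible shape under those assumptions, not a definitive design.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical types and names, for illustration only.
struct Entry {
    const char* key;
    int value;
};

struct NotFound {};   // plays the role of NOT_FOUND

// Always returns a reference bound to a real Entry; a failed lookup is
// reported by throwing rather than by a null reference.
Entry& find_or_throw(Entry* table, std::size_t count, const char* key)
{
    for (std::size_t i = 0; i < count; ++i)
        if (std::strcmp(table[i].key, key) == 0)
            return table[i];
    throw NotFound();
}

// Caller side: use the reference directly, and handle the miss separately.
int value_or(Entry* table, std::size_t count, const char* key, int fallback)
{
    try {
        return find_or_throw(table, count, key).value;
    } catch (const NotFound&) {
        return fallback;
    }
}
```

The caller never needs a temporary pointer, and every reference it sees
refers to an object.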

I find the notion that (1) references are always bound to something,
and (2) references are never re-bound, very useful.  I must admit that
I have used NULL references too, when I ran into trouble.

Dag Bruck
--
Department of Automatic Control		E-mail: dag@control.lth.se
Lund Institute of Technology
P. O. Box 118				Phone:	+46 46-104287
S-221 00 Lund, SWEDEN			Fax:    +46 46-138118

jimad@microsoft.UUCP (Jim ADCOCK) (02/20/91)

In article <1991Feb13.224743.1123@lia> jgro@lia.com (Jeremy Grodberg) writes:
|I've been looking through the ARM to try to find out how null pointers and
|references work together, and I can find neither any guarantees nor any
|statements that the behaviour is undefined.  Here is a very simple example
|of what I want (it is of course too simple to be useful).
|
|#define NULL 0
|
|int* square(int& x)
|{
|   if (&x != NULL) 
|     x *= x;
|
|   return &x;
|}
|
|main()
|{
|  int* p = NULL;
|  int& death = *square(*p);  // I want a *guarantee* that this will not crash
|
|  death = 0; // Of course, this causes a run-time crash
|}
|
|
|In other words, I want a guarantee that pointers are not dereferenced
|when initializing references.   This is in fact the observed behavior of
|CFront 2.0.  It is useful in some circumstances when you want to 
|preserve object semantics with references, but occasionally have a null
|object reference.  For example, suppose I want to implement a look-up
|table, and my find() function needs to return an object.  It is much
|more efficient to return an object reference, but I would like to return
|NULL if I can't find the requested object.  I currently have no guarantee
|that this will work, and my only other solution is to return an object
|pointer.  This makes references less useful, and burdens the rest of my
|code with creating temporary pointers to hold the return results so I can
|check it against NULL before converting it to a reference.  There are other
|situations in which null references substantially clean up the code.
|
|Since we already have all sorts of special case treatment
|of null pointers, and since we are guaranteed that the address (&) of
|a reference is the same as the address of the object which it refers to,
|it doesn't seem like such an extra burden to allow the creation of 
|null references.
|
|If the ANSI committee thinks this guarantee is too hard to make, and/or
|just isn't worth the effort, then I would like a statement to that effect
|added to the standard, so that we can all know this is non-portable,
|and so that people realize that null references might be allowed by the
|compiler.  For example, if the initialization of "death" in my example
|is successful, some people will take that to mean that "death" refers
|to a valid object, and be totally confused when the simple assignment
|statement causes a crash.  The statement that null references have 
|undefined behaviour will help alert people to that situation.

I agree with your desire that C++ support null references.  I asked for
clarification on this issue on comp.std.c++ some time ago, and I think
the issue has been discussed by the ansification committee, and I think
the answer is that they decided *not* to explicitly support null references.
Perhaps someone on the committee can confirm or deny this?

The case *for* null references includes:

*they are the equivalent of null pointers
*they allow references to be used in many situations where pointers
	historically were.
*compilers can't in practice afford to generate code to prevent
	null references in most cases
*in a few obscure situations null references won't work correctly unless
	compiler support is added.
*in those few situations, if a null pointer is converted to a null reference,
	extremely obscure bugs will result.

The case *against* null references includes:

*in a few obscure situations, disallowing them will allow faster, smaller
	code to be generated.
*not allowing null references might allow more flexibility in porting C++
	to future "object-oriented" CPU architectures. 

wmm@world.std.com (William M Miller) (02/25/91)

jimad@microsoft.UUCP (Jim ADCOCK) writes:
> I agree with your desire that C++ support null references.  I asked for
> clarification on this issue on comp.std.c++ some time ago, and I think
> the issue has been discussed by the ansification committee, and I think
> the answer is that they decided *not* to explicitly support null references.
> Perhaps someone on the committee can confirm or deny this?

Let me clarify the organization of X3J16 before I answer this question
directly.  The committee has a number of informal subcommittees called
"working groups," dealing with issues like proposed extensions, libraries,
environments, etc.  These working groups have no authority beyond
recommending courses of action to the full committee; only the committee as
a whole can make decisions regarding what is and is not in the standard.

The issue of null references was discussed by the "core language" working
group, both via email and at the October meeting.  (The core language WG
deals with interpretation of the existing specification and with minor
modifications, while the extensions WG considers proposals for more major
changes.)  In the course of these discussions, the core language WG decided
1) that the current definition, embodied in E&S, does *not* allow null
references, and 2) not to recommend a change.

The relevant section of E&S is 8.4.3, page 153, where it says, "A variable
declared to be a T&, that is 'reference to type T' (section 8.2.2), must be
initialized by an object of type T or by an object that can be converted
into a T."  The result of applying unary * to a null pointer is not an
object, so it clearly cannot be used as an initializer for a reference.
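
As an illustration of that reading, here is a sketch of the thread's square()
example reworked so that a reference is only ever initialized by an object.
The pointer-taking signature and the helper square_and_report are assumptions
made for illustration; they are not something the working group proposed.

```cpp
// Pointer-taking variant of square(): the null test happens on the pointer,
// before any dereference.
int* square(int* x)
{
    if (x != 0)
        *x *= *x;       // dereference only when x points at an object
    return x;
}

// Hypothetical caller: binds a reference only after the null check.
// Returns false, leaving result untouched, when p is null.
bool square_and_report(int* p, int& result)
{
    int* q = square(p);
    if (q == 0)
        return false;   // no object, so no reference is formed
    int& alive = *q;    // initialized by an object of type int
    result = alive;
    return true;
}
```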

The core language WG decided not to recommend a change to this policy for
several reasons, of which the two strongest were:

1) If the rule were relaxed to allow null references, code that currently
legitimately relies upon the language's guarantee that a reference is
connected to an object would no longer be safe.  Although it is true that
most implementations do not currently check for violations of the rule, it's
certainly reasonable to think that a debugging environment would do so and
find possible errors; if null references are permitted, no such checking
could be done.

2) References are conceptually aliases, not pointers.  Although they have
traditionally been implemented as pointers, there is no need for them to be;
a globally-optimizing compiler might well replace a number of references
with direct addressing.  If the nature of a reference is strictly a name for
an object, a "null reference" is an oxymoron.  It confuses the concept and
the implementation.

The full committee has not considered the issue specifically.  Since the
committee voted to accept the language as defined in the base document,
along with the minor changes introduced by E&S, action is only required for
a change to the definition; the core language WG's decision meant that no
such proposal was presented to the committee.

-- William M. Miller, Glockenspiel, Ltd.
   vice chair, X3J16
   wmm@world.std.com

jimad@microsoft.UUCP (Jim ADCOCK) (02/26/91)

In article <1991Feb24.215757.19083@world.std.com> wmm@world.std.com (William M Miller) writes:
>jimad@microsoft.UUCP (Jim ADCOCK) writes:
|> I agree with your desire that C++ support null references.  I asked for
|> clarification on this issue on comp.std.c++ some time ago, and I think
|> the issue has been discussed by the ansification committee, and I think
|> the answer is that they decided *not* to explicitly support null references.
|> Perhaps someone on the committee can confirm or deny this?
|
|The issue of null references was discussed by the "core language" working
|group, both via email and at the October meeting.  (The core language WG
|deals with interpretation of the existing specification and with minor
|modifications, while the extensions WG considers proposals for more major
|changes.)  In the course of these discussions, the core language WG decided
|1) that the current definition, embodied in E&S, does *not* allow null
|references, and 2) not to recommend a change.
|
|The relevant section of E&S is 8.4.3, page 153, where it says, "A variable
|declared to be a T&, that is 'reference to type T' (section 8.2.2), must be
|initialized by an object of type T or by an object that can be converted
|into a T."  The result of applying unary * to a null pointer is not an
|object, so it clearly cannot be used as an initializer for a reference.
|
|The core language WG decided not to recommend a change to this policy for
|several reasons, of which the two strongest were:
|
|1) If the rule were relaxed to allow null references, code that currently
|legitimately relies upon the language's guarantee that a reference is
|connected to an object would no longer be safe.  Although it is true that
|most implementations do not currently check for violations of the rule, it's
|certainly reasonable to think that a debugging environment would do so and
|find possible errors; if null references are permitted, no such checking
|could be done.

What you're saying, then, is that use of null references will be an
unconstrained error -- programmers are not allowed to use them,
but compilers are not required to diagnose them.  This "unconstrained
error" situation comes up in a few places in the ANSI-C specs too, but
I think it's generally a bad idea.  I believe that we should try to
keep to a situation where program examples fall into one of three
categories: 1) strictly conforming, 2) implementation dependent, or
3) constrained error.  Thus, I'd suggest a slight change from considering
null references "unconstrained errors" to "implementation dependent."

In practice, few programmers accept the idea that their program is
"illegal" when compilers accept it and generate "correct" code for it
without complaint.  Better, then, to declare such programs
"implementation dependent" so that we can warn programmers that such
programs may not compile on some compiler some day in the future.

|2) References are conceptually aliases, not pointers.  Although they have
|traditionally been implemented as pointers, there is no need for them to be;
|a globally-optimizing compiler might well replace a number of references
|with direct addressing.  If the nature of a reference is strictly a name for
|an object, a "null reference" is an oxymoron.  It confuses the concept and
|the implementation.

The basic problem I see with this approach is that it denies people the
use of references to solve problems with pointers.  See the issues
coming from the ANSI-C numerical group regarding restricted pointers
for one such example of people trying to solve problems with "C" pointers.

I propose then, that two problems of references need to be "fixed" so 
that they can be generally used to work around the problems with "C"
pointers.

1) A non-const, re-initable reference be introduced.  As Bjarne has pointed
out, this would require the introduction of new syntax to represent
reference re-initting -- introducing a new ":=" operator to represent
reference re-initting would be one such solution.  Then, the present
const references could continue to be const references that would allow
no possibility of re-initting.  You could even continue to make the
present const reference the default if you allowed the ~const syntax
to explicitly declare references that are re-assignable.

2) Allow overloadable operator dot so that reference syntax can be used
where, in fact, people are implementing smart references -- even though 
today they might mistakenly call them "smart pointers."
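
To make the two proposals concrete, here is a rough library-level
approximation.  Rebindable is a hypothetical class invented for illustration:
its rebind() member stands in for the proposed ":=" syntax, and because
operator dot cannot be overloaded, member access has to go through get() or
the conversion operator.  It is a sketch under those assumptions, not the
language change being proposed.

```cpp
// Hypothetical class, for illustration only.
template <class T>
class Rebindable {
public:
    explicit Rebindable(T& target) : target_(&target) {}

    // Stands in for the proposed ":=": changes which object is referred to.
    void rebind(T& target) { target_ = &target; }

    // Assigning a T writes through to the referred-to object,
    // as it would with a built-in reference.
    Rebindable& operator=(const T& value) { *target_ = value; return *this; }

    // Without an overloadable operator dot, access goes through get()
    // or the conversion to T& below.
    T& get() const { return *target_; }
    operator T&() const { return *target_; }

private:
    T* target_;   // only ever set from a T&, so it always points at an object
};
```

Note that copying one Rebindable to another copies the binding rather than
the value, which is exactly the kind of ambiguity the separate ":=" syntax is
meant to avoid.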