[net.lang] levelheight: automatic "nil pointer" handling

jans@mako.UUCP (Jan Steinman) (01/24/85)

In article <6292@boring.UUCP> steven@boring.UUCP (Steven Pemberton) writes:
>I believe that programming languages can go much further in the help they
>give programmers than they do at present. For instance, it is possible to
>statically guarantee that nil pointers are never dereferenced (imagine the
>bugs that that would prevent!) but I know of no language that supports it.

Ada supports this:

|Not only can we explicitly create objects designated by access values, but we
|can also destroy them.  Such objects will no longer be accessible if all
|references to them are removed, such as:
|
|	MY_PACKET := new BUFFER;	-- a BUFFER object is created
|	MY_PACKET := null;		-- the original object is unreachable
|
|Furthermore, these dynamic objects will be destroyed once we leave the scope
|of the access objects.  For example:
|
|	declare
|		POINTER : BUFFER_POINTER;
|	begin
|		...
|		POINTER := new BUFFER;	-- create a BUFFER object
|		...
|	end;				-- BUFFER object is destroyed

From "Software Engineering With Ada", Grady Booch, p.105.
-- 
:::::: Jan Steinman		Box 1000, MS 61-161	(w)503/685-2843 ::::::
:::::: tektronix!tekecs!jans	Wilsonville, OR 97070	(h)503/657-7703 ::::::

ndiamond@watdaisy.UUCP (Norman Diamond) (01/25/85)

> > For instance, it is possible to statically guarantee that nil pointers are
> > never dereferenced (imagine the bugs that that would prevent!) but I know of
> > no language that supports it.
> 
> Ada supports this:  [quoted from Booch]
> 
> |Not only can we explicitly create objects designated by access values, but we
> |can also destroy them.  Such objects will no longer be accessible if all
> |references to them are removed, such as:
> |
> |	MY_PACKET := new BUFFER;	-- a BUFFER object is created
> |	MY_PACKET := null;		-- the original object is unreachable
> |
> |Furthermore, these dynamic objects will be destroyed once we leave the scope
> |of the access objects.  For example:
> |
> |	declare
> |		POINTER : BUFFER_POINTER;
> |	begin
> |		...
> |		POINTER := new BUFFER;	-- create a BUFFER object
> |		...
> |	end;				-- BUFFER object is destroyed

The "answer" has nothing to do with the problem.  Of course objects that are
no longer referenced can be de-allocated.  Furthermore, it is done dynamically.

The first netter's suggested problem was to diagnose programs that dereference
nil pointers.  This too can be done dynamically.  He suggested that it can be
done statically.  I find that very hard to believe.

Can a compiler do static checking and generate code to de-allocate objects, as
in the second netter's quotation?  I find that very hard to believe, too.
Aside from this disbelief, the two issues have little in common.

-- Norman Diamond

UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!watdaisy!ndiamond
CSNET: ndiamond%watdaisy@waterloo.csnet
ARPA:  ndiamond%watdaisy%waterloo.csnet@csnet-relay.arpa

"Opinions are those of the keyboard, and do not reflect on me or higher-ups."

ech@spuxll.UUCP (Ned Horvath) (01/28/85)

It is most certainly NOT possible to statically (i.e. at compile time) check
for nil-pointer dereferencing.  To use a somewhat classical argument, suppose
I have a subroutine that emulates a turing machine for one step each time it
is called; if the TM halts, it sets a given pointer to a particular value,
otherwise it leaves it nil.

Obviously I could statically analyze a program containing such a subroutine
for nil-pointer dereferencing only if my compiler had the capability to solve
the halting problem.

Similar arguments (left as an exercise) can be advanced to demonstrate that
static array bounds checking is impossible in principle, or that one cannot
identify unreachable code.

To take a more mundane example, lest the TM Halting problem seem contrived,
imagine a program to search for local optima by relaxation methods.  Such
a program may or may not converge depending on its input -- which is 
presumably not available to the compiler.  A variable which is set when
convergence is reached has a value the compiler certainly cannot predict.

=Ned=

markb@sdcrdcf.UUCP (Mark Biggar) (01/29/85)

The problem of statically determining if a program will ever try to
dereference a nil pointer is exactly equivalent to the Turing
halting problem.  In general any problem of the form "Does a program
ever do X" (except for trivial things like "Does the program execute
its first statement") is unsolvable.

In the case of nil pointer deferencing this can be shown as follows:

Assume you have a program T that that will statically check that a given
input program will never dereference a nil pointer.  Now, modify any
program X which passes the test T in the following way:  at each possible
point in program X where X could halt (just before the end of main, before
and exit or abort, etc.) add a dereference to a nil pointer making a new
program X'.  X' now has the property that if it passes test T it never halts
and if it fails test T it does halt.  Thus, T can be used to solve the Turing
halting problem.  As this is impossible, no such program T can exist.

Mark Biggar
{allegra,burdvax,cbosgd,hplabs,ihnp4,akgua,sdcsvax}!sdcrdcf!markb

steven@boring.UUCP (02/01/85)

In article <611@spuxll.UUCP> ech@spuxll.UUCP writes:
> It is most certainly NOT possible to statically (i.e. at compile time) check
> for nil-pointer dereferencing.  To use a somewhat classical argument, suppose
> I have a subroutine that emulates a turing machine for one step each time it
> is called; if the TM halts, it sets a given pointer to a particular value,
> otherwise it leaves it nil.
You can use the same argument to prove that strong typing is impossible.

> Similar arguments (left as an exercise) can be advanced to demonstrate that
> static array bounds checking is impossible in principle, or that one cannot
> identify unreachable code.
There also exist schemes to statically check array bounds.

So many people have claimed that statically checking nil dereferencing is
impossible, that I think I'd better show how it can be done.

The aim is to statically guarantee all pointer dereferences. To this end we
define a new language. 

We introduce a type of pointer that has no nil value, so that (as long as
there are no problems with unassigned variables, which is NOT part of the
scheme) we know that we can always dereference values of this new type.

Then, for variables that may also be nil we introduce a type that says just
that: values of this type may also be nil.

However, values of this second type may NOT be dereferenced, because they
may be nil. Before you dereference such a value, you are forced by the type
system to see if it is nil or a real pointer.

Now, to make this more concrete, here is a whole program to sort a list of
numbers by making a binary tree. I've made the syntax up as I've gone along:

type treenode = record value: integer; left, right: tree end;

type realtree = pointer to treenode; (*there is no nil for this type*);

type tree = optional realtree; (*there is nil for this type*)

var root: tree; (*note: may be nil, since it is of type tree*)
var i: integer;

begin (*program; procedures follow*)
     root:=nil;

     read(i);
     while i>=0 do
          insert(i, root);
          read(i)
     od;
     printtree(root)
end.

procedure insert(i: integer; var t: tree); (*note t may be nil*)
begin
     with t select
     (nil): t:=new(i, nil, nil);
     (rt: realtree):
          if i<rt^.value (*we may dereference rt, since it is a realtree*)
          then insert(i, rt^.left)
          else insert(i, rt^.right)
     end
end;

procedure printtree(t: tree); (*t may be nil*)
begin
     with t select
     (nil): (*do nothing*);
     (rt: realtree):
          printtree(rt^.left); (*again, we may dereference rt*)
          write(rt^.value);
          printtree(rt^.right)
     end
end

The type system thus guarantees that if this compiles, all pointer
dereferences are safe.

Note that it would be fairly easy to make a pre-processor for Pascal or C
that did these checks, and just generated code like "if t=nil" for the "with
t select" statements.

Steven Pemberton, CWI, Amsterdam; steven@mcvax.

ech@spuxll.UUCP (Ned Horvath) (02/04/85)

Your example of "static checks" of nil dereferencing make, rather than
refute, the argument that nil-dereferencing cannot be checked for statically.
What you have accomplished is to enforce the convention that any potentially
nil-valued pointer must be explicitly AND DYNAMICALLY checked for a nil-value
prior to dereferencing it.  This reduces to the same thing; what difference
does it make if the compiler forces the programmer to insert nil-checking
code, or does it automatically?  Either way, the check MUST be done
dynamically, and that is all the reductio ad absurdum (via TM halting) was
meant to show.

Strong typing is not at all the same thing; it is possible to rewrite your
example so that 'insert' and 'print' take only non-empty trees as arguments,
for example.  If we did that without modifying the main program, that would
correctly generate a compiler (or linker) exception due to type mis-match.
Static checking for a type mismatch is eminently feasible.

To summarize: static analysis suffices to determine WHERE TO PUT code to
check for things like nil values and array-index-out-of-bounds, but the
checks themselves MUST, in general, be made at run time.

There really isn't anything to argue about here: can a compiler recognize
(and enforce or generate protection against) potentially inconvenient run-time
conditions?  Sure, no argument.  Is it in the nature of some of those
conditions that they MUST be checked for at run time?  I think we have
established that as well.

=Ned=