[comp.lang.misc] REFs or aliases?

Chris.Holt@newcastle.ac.uk (Chris Holt) (10/29/90)

All this discussion about Pointers considered Harmful seems to come
down to the following:

In article <1990Oct26.155937.29185@maths.nott.ac.uk> anw@maths.nott.ac.uk (Dr A. N. Walker) writes:
>	But in almost every programming language, in addition to values
>of type INT, I have to deal with values of type REF INT (integer variables
>in most languages).  Yes, it complicates life, but it's usually thought
>to be worth it.

That's the point ( :-); is it really worth it?  What constraints
should be placed on REFs so that they don't allow arbitrary firewall
failures (as in another thread)?  What operations should be allowed
on REFs?  Arithmetic seems dangerous; but if you don't have even that,
then pointers have no advantage over aliasing (and when you don't
need aliasing, you don't need pointers).

> Extending to REF REF INT (pointers), and other REF types,
>generalises and simplifies the construct.  It's just a shame that so many
>programmers fail to appreciate this (often because their first language
>managed to muddy the waters).

Like Algol 68?

-----------------------------------------------------------------------------
 Chris.Holt@newcastle.ac.uk      Computing Lab, U of Newcastle upon Tyne, UK
-----------------------------------------------------------------------------
 "Who overcomes by reason sole hath overcome but half his foe..."

mjs@hpfcso.HP.COM (Marc Sabatella) (11/01/90)

>>       int array[100];                 int array[100];
>>       int *p;                         int i;
>>       extern int *bar();              extern int bar();
>>
>>       foo ()                          foo ()
>>       {                               {
>>               p = bar();                      i = bar();
>>               *p = 42;                        array[i] = 42;
>>       }                               }
>
>Yes, but the problem is not whether foo() in isolation is more efficient
>but whether the _program_ as a whole is more efficient.  In this case,
>in order for these two programs to have the same semantics, the base of
>array[] must be added into p.  Whether this add is done in foo() or in
>bar(), it still takes the same time to do.

Nonsense.  The base of array[] can be "added into p" statically:

	extern int array[100];

	int *bar ()
	{
		return array+8;
	}

Your points about aliasing are well taken (you've repeated them often enough);
I only take issue with the claim that there can *never* be an efficiency
advantage to pointers.  In an earlier post, I mentioned that the abstract they
can provide (virtual machine addresses) are useful to writers of loaders.  You
replied in mail that you had written a loader using only arrays, which is fine,
but it does not change the fact the the abstraction of an actual virtual
address can be useful (your array implementation merely substitued integer
indices for pointers to achive the same abstraction).

Certainly some implementations of pointers can cause aliasing problems in
optimizers; no one disputes that.  This does not change the inherent usefulness
of the abstraction, or the fact that there are cases where pointers can give
you a win (especially in an implementation with more restricted aliasing).

jlg@lanl.gov (Jim Giles) (11/02/90)

From article <8960021@hpfcso.HP.COM>, by mjs@hpfcso.HP.COM (Marc Sabatella):
> [...]
> Nonsense.  The base of array[] can be "added into p" statically:
> 
> 	extern int array[100];
> 
>       int *bar ()
>       {
>               return array+8;
>       }

So can the array base.  Array[8] is a static location.  To be sure,
your _very_ contrived example prevents a compiler from detecting this
without interprocedural analysis.  But, how many codes do you have
that really are structured this way?  Do you recommend this structure?
Most importantly: do you really think that you have proved you point
if restructuring the code will still allow arrays to be as fast as
your pointers?

> [...]
> I only take issue with the claim that there can *never* be an efficiency
> advantage to pointers.  [...]

I still stand by that claim.  A contrived example in which pointers
only provide a savings if a procedure is called in order to return a
static offset isn't really a very good counter-example.  If it's a
static offset, why don't you inline it (as a macro even - if you insist
on the modularity of making a call here)?  If it's not a static offset,
then, as I've said before, it doesn't really save anyway.

> [...]                              This does not change the inherent usefulness
> of the abstraction, or the fact that there are cases where pointers can give
> you a win (especially in an implementation with more restricted aliasing).

It is _that_ claim that I've repeatedly challenged people to prove.
I want an example of a case where pointers really constitute a gain
over the alternatives.  Your only answer so far has been to provide
a code which is structured upside-down.  In other words, the
pointers aren't needed for the functionality but to make up for bad
programming style.  Perhaps that's enough of a reason to include
pointers - but I don't think so.  Especially since pointers
encourange bad style as often (or more often) than they make up for
it.

J. Giles

Chris.Holt@newcastle.ac.uk (Chris Holt) (11/02/90)

This whole discussion of pointers (REFs) seems too concerned with the
wrong things to me (but then, I'm more concerned with semantics; people
worried about efficiency can hit "n" now).

Given an arbitrary domain, e.g. sets of integers, it is always
possible to introduce a monadic operation f that extends that
domain.  So, given a value such as {1,3,4}, we can apply f to
it to obtain a new value f{1,3,4}; and we can apply f to that
new value to obtain f(f{1,3,4}).  We also introduce an inverse
operation f' such that f'(f(x)) = x.  [Obviously, one might
spell f as REF, and f' as DEREF or whatever.]

The question is, why bother?  Forget about machine addresses and
locations and pointers right now; the point is that introducing
such an additional operation complicates the domain, and I can't
see any redeeming social value in so doing (at least that has been
presented yet).  But then, I prefer programs to be as simple as
possible...
-----------------------------------------------------------------------------
 Chris.Holt@newcastle.ac.uk      Computing Lab, U of Newcastle upon Tyne, UK
-----------------------------------------------------------------------------
 "He either fears his fate too much, or his programming tools are small..."

barmar@think.com (Barry Margolin) (11/02/90)

Here's an example of code that uses pointers in a way that I don't think
arrays can substitute.  Something like this appeared in code that a friend
wrote several years ago:

foo() {

char auto_buffer[BUFSIZ];
char* bufp;
int in_size;

if ((in_size = input_size()) < BUFSIZ)
	bufp = auto_buffer;
else
	bufp = malloc (in_size);

get_input (bufp);
...

if (bufp != auto_buffer)
	free (bufp);

}

The code was written this way because it is generally more efficient to use
stack-allocated data than heap data, but it can be difficult or inefficient
to make stack-allocated buffers big enough for all uses (some systems have
limits on stack frame size).  BUFSIZ is set big enough for most of the
common cases, but in extreme cases the code goes to the heap.

Most of the alternatives to pointers can't deal will with situations where
the reference can either by an automatic or heap object, decided at
runtime.
--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

gudeman@cs.arizona.edu (David Gudeman) (11/03/90)

In article  <1990Nov1.204843.18771@newcastle.ac.uk> Chris.Holt@newcastle.ac.uk (Chris Holt) writes:
]
]Given an arbitrary domain, e.g. sets of integers, it is always
]possible to introduce a monadic operation f that extends that
]domain...
]
]The question is, why bother?...

That question can't be answered in the general form you presented,
because the answer depends on the specific domain and the specific
function that is being introduced.  E.g.: Why bother introducing
the negation operator into the set of counting numbers?  Answer, it
gives you solutions to equations of the form "a + i = 0" where "a > 0"
is some constant and "i" is a variable.
-- 
					David Gudeman
Department of Computer Science
The University of Arizona        gudeman@cs.arizona.edu
Tucson, AZ 85721                 noao!arizona!gudeman

pcg@cs.aber.ac.uk (Piercarlo Grandi) (11/04/90)

On 1 Nov 90 20:48:43 GMT, Chris.Holt@newcastle.ac.uk (Chris Holt) said:

Holt> Given an arbitrary domain, e.g. sets of integers, it is always
Holt> possible to introduce a monadic operation f that extends that
Holt> domain.  So, given a value such as {1,3,4}, we can apply f to
Holt> it to obtain a new value f{1,3,4}; and [ ... so on ... ]

Holt> The question is, why bother?  Forget about machine addresses and
Holt> locations and pointers right now;

Ah, but they are of the essence!  Machines actually work like that and
moving around elements of the codomain of f may be cheaper than moving
around elements of its domain. You can also, given the fact that you
have for other reasons to deal with f and its codomain, encode
information in the codomain (hash linking, BIBOP style tagging, ...).

Holt> the point is that introducing such an additional operation
Holt> complicates the domain, and I can't see any redeeming social value
Holt> in so doing (at least that has been presented yet).  But then, I
Holt> prefer programs to be as simple as possible...

It does complicate things (my original remark, please :->), but the
redeeming social value is that it may be more cost effective, *at the
appropriate (low) abstraction level*. That's a social, rather than
mathematical value:

Holt> This whole discussion of pointers (REFs) seems too concerned with
Holt> the wrong things to me (but then, I'm more concerned with
Holt> semantics; people worried about efficiency can hit "n" now).

I do agree that pointers are never necessary, and actually complicate
life, but they allow you to conserve space and time. Machine level
efficiency does matter, or else we would be using relational database
languages to write operating systems (not such a far fetched idea,
actually -- I have considered it...).

A lower constant factor of time or space complexity (because pointers do
not buy you more) is not interesting to the mathematician, but it is to
people that sign cheques.  I think that they are concerned with the
right things too.

As to me, I am concerned about both efficiency and semantics, and which
compromises between the two are feasible and cost effective under which
conditions.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

diamond@tkou02.enet.dec.com (diamond@tkovoa) (11/05/90)

In article <1990Nov2.002843.28959@Think.COM> barmar@think.com (Barry Margolin) writes:

>Here's an example of code that uses pointers in a way that I don't think
>arrays can substitute.  Something like this appeared in code that a friend
>wrote several years ago:
>char auto_buffer[BUFSIZ]; char* bufp; int in_size;
>if ((in_size = input_size()) < BUFSIZ)
>	bufp = auto_buffer;
>else
>	bufp = malloc (in_size);
>get_input (bufp);
>...
>if (bufp != auto_buffer)
>	free (bufp);

PL/I had CONTROLLED variables 26 years ago too.  Although this is another
use of an ugly example, it remains a valid existence proof.  The generated
machine code is full of pointers and gotos (as for all programs), but the
source code isn't.

Uh, Mr. Margolin, you justified this code by the relative speed of stack
accesses vs. heap accesses.  Is that true even when both accesses are done
by dereferencing your bufp variable?  (And the equivalent in PL/I,
dereferencing an entity that is transparent to the programmer.)
-- 
Norman Diamond, Nihon DEC    diamond@tkov50.enet.dec.com
                                    (tkou02 is scheduled for demolition)
We steer like a sports car:  I use opinions; the company uses the rack.