[net.lang] Assignment-by-reference - a query

kelvin@arizona.UUCP (Kelvin Nilsen) (01/07/86)

i am designing a language for use in programming data communications
applications such as file transfer, network prototyping, logon scripts,
more sophisticated scripts, ...

although its forefathers would probably disavow any association with this
beast, it has strong similarities to both SNOBOL and ICON.

one feature which is particularly dangerous is the idea of "assignment-by-
reference."  any assignment of the form (a <- b) simply gives 'a' a pointer
to the same data that 'b' is pointing to.  so in the following sequence of
code:
		b <- 3;
		a <- b;
		b <- 5;
		write(a);

the number 5 is written to stdout.  of course, :-) this is often not desirable 
behavior.  more traditional behavior could be obtained by rewriting the code:

		b <- 3;
		a <- `b;	/* the grave accent causes a 
					copy of 'b' to be made
				*/
		b <- 5;
		write(a);

the proposed semantics for this weirdness are such that each time a new item
joins a group of aliased variables, the entire group takes on the value 
represented by the right-hand-side of the assignment.  so if we append to 
the first version of the above code, the statements:

		c <- 7;
		a <- c;

now 'a', 'b', and 'c' all represent the number 7, and changing any one of
them will change all of them.  To back variables out of this shared 
relationship, nullify them one by one.
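
a rough model of these group semantics, written in Python rather than in
the language itself, might look like the sketch below.  everything in it is
an assumption made for illustration: the names AliasEnv, Group, set_value,
alias, nullify and read are hypothetical, and it guesses that assigning a
literal to an already-aliased variable overwrites the shared value in place
(which is what the first example above implies).

```python
class Group:
    """One shared value plus the names of the variables aliased to it."""

    def __init__(self, value):
        self.value = value
        self.members = set()


class AliasEnv:
    """Toy model of assignment-by-reference with alias groups (hypothetical)."""

    def __init__(self):
        self.groups = {}  # variable name -> Group

    def set_value(self, name, value):
        """Model of `name <- literal`: overwrite the group's value in place."""
        group = self.groups.get(name)
        if group is None:
            group = Group(value)
            group.members.add(name)
            self.groups[name] = group
        else:
            group.value = value

    def alias(self, target, source):
        """Model of `target <- source`: target's whole group joins source's."""
        src = self.groups[source]
        dst = self.groups.get(target)
        if dst is src:
            return
        members = dst.members if dst is not None else {target}
        for name in list(members):
            self.groups[name] = src
            src.members.add(name)

    def nullify(self, name):
        """Back one variable out of its group."""
        group = self.groups.pop(name)
        group.members.discard(name)

    def read(self, name):
        return self.groups[name].value


env = AliasEnv()
env.set_value("b", 3)
env.alias("a", "b")
env.set_value("b", 5)
print(env.read("a"))                    # 5

env.set_value("c", 7)
env.alias("a", "c")
print([env.read(n) for n in "abc"])     # [7, 7, 7]
```

in this model the grave-accent copy would be a set_value on a fresh variable
using the value read from the other one, so no group is shared.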

implementation of this relies on several levels of indirection existing
between a variable and the data that it represents.  this is more costly
than what is done for standard compiled languages, but typical(?) of high
level languages such as ICON and SNOBOL.

motivation for this is several fold.  in ICON, there are inconsistencies
in the treatment of structured objects and scalars. for example, in the 
code:
		t := s
		s[5] := 'c'

the variable 't' will feel the effect of the second line's assignment if
's' represents a list, but will not if 's' is a string.  my intention is
to unify such behavior.

also, as CommSpeak was designed with real-time programming on microcomputers
in mind, it is strongly typed.  Assignment-by-reference was proposed in hopes
of buying back some of the flexibility of late binding and deferred evaluation
which are available to varying degrees in both ICON and SNOBOL.

my quandary:
	have any other languages been plagued by this same blunder?  
		any papers describing the experiences, feelings of the
		language designers in retrospect?
	does anyone have a suggestion for revised semantics of assignment 
		that might satisfy the same needs with less "astonishing
		results" to the unsuspecting programmer.

thanks for your thoughts,
kelvin nilsen

kurt@fluke.UUCP (Kurt Guntheroth) (01/14/86)

The CLU language (Barbara Liskov et al.) at least describes this all formally
and gives a name to the objects that are assigned by copying vs. those that
are assigned by copying a reference.  You can, or once could, obtain the CLU
Reference Manual from Springer-Verlag, in the Lecture Notes in Computer
Science series.

ka@hropus.UUCP (Kenneth Almquist) (01/19/86)

> Any assignment of the form (a <- b) simply gives 'a' a pointer to the
> same data that 'b' is pointing to.  So in the following sequence of
> code:
>		b <- 3;
>		a <- b;
>		b <- 5;
>		write(a);
>
> the number 5 is written to stdout.  Of course, :-) this is often not
> desirable behavior.

The problem is that assignment does not have consistent semantics.
Let us assume that, in the first assignment, the reference to 3
creates a new object which is set to the value of 3.  After the
assignment, b points to this object.  After the second assignment,
a also points to this object.  Now we come to the third assignment.
I gather that the language starts to create a new object which it
plans to set to the value of 5, but then notices that b already points
to an integer object and decides to simply modify this object rather
than going to the trouble of creating a new object.  This is not what
the programmer was expecting the language to do!  The solution is
simple.  You already require that an assignment of the form (a <- b)
simply give 'a' a pointer to the object pointed to by b.  There is no
justification for making the semantics of assignment change when the
right hand side is a more complex expression than a simple variable.
Require that any assignment of the form (a <- expression) simply give
'a' a pointer to the object computed by the expression.
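
To see the rule in a running language: Python's ordinary assignment
happens to behave this way, so it can serve as a stand-in here.  This is
only an illustration in a different language, not a description of the
language under discussion.

```python
# Every assignment makes the name on the left refer to the object
# produced by the expression on the right.
b = 3        # b refers to the object 3
a = b        # a refers to the same object as b
b = 5        # b now refers to a different object; a is not affected
print(a)     # 3

# Modifying a structured object in place is a different operation from
# assignment, and is visible through every name that refers to it.
s = [1, 2, 3]
t = s
s[0] = 9
print(t)     # [9, 2, 3]
```

Under this rule the first example prints 3, and sharing is only observable
through operations that modify an object in place.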

As an aside, the traditional way to handle numbers in an "assignment-
by-reference" language is to have each number refer to a fixed object.
Thus the assignment "a <- 5" would set 'a' to point to the object '5'
rather than creating a new object.  Naturally there is no reason for
the implementation to actually set aside storage to hold the object
'5'.  This approach gives different results than the one given above
if you permit pointer comparison.  It also gives different results if
you allow numbers to be modified, but I would recommend not providing
such a feature.
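
A small sketch of the difference, in Python, assuming a hypothetical boxed
number class Num and two hypothetical helpers: fresh, which allocates a new
object each time a literal is evaluated, and fixed, which always hands back
the one object standing for that number.

```python
class Num:
    """Hypothetical boxed integer, for illustration only."""

    def __init__(self, value):
        self.value = value


def fresh(value):
    """Each evaluation of a literal creates a new object."""
    return Num(value)


_pool = {}


def fixed(value):
    """Each number refers to one fixed object, created on first use."""
    if value not in _pool:
        _pool[value] = Num(value)
    return _pool[value]


a, b = fresh(5), fresh(5)
c, d = fixed(5), fixed(5)

print(a.value == b.value, a is b)   # True False
print(c.value == d.value, c is d)   # True True
```

With fixed objects, a statement like `c.value = 6` would change what every
later use of 5 means, which is the reason for not allowing numbers to be
modified.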

> Motivation for this is several fold.  In ICON, there are inconsistencies
> in the treatment of structured objects and scalars. for example, in the 
> code:
>		t := s
>		s[5] := 'c'
>
> the variable 't' will feel the effect of the second line's assignment if
> 's' represents a list, but will not if 's' is a string.

The previous example used integers, and it is pretty clear that there
is no reason for users to modify integers.  In the case of structured
objects, the situation is more complex.  You could insist that users
copy structured objects every time they want to modify them, but if
you do this circular structures cannot be constructed, and as
structures get more complex they get more difficult to copy.
Therefore you have to provide means by which structured objects may be
modified.  Character strings are structured objects, but their
structure is simple and unchanging; thus it is *not* clear that you
need to provide functions for modifying character strings.

In the example given, the confusion is caused by the fact that s[5]
refers to a substring of s if s is a string, but it does not refer to
a sublist of s if s is a list.  Thus the semantics of s[5] vary
depending upon whether s is a string or a list, *regardless* of which
side of the ":=" the s[5] appears.  This could be fixed by making a
string be a list of characters; I expect that this was not done in
ICON for reasons of efficiency.
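
Python draws the same distinction, so it can illustrate the point (it is
not ICON, and it indexes from 0 where ICON indexes from 1, so s[5] above
corresponds to index 4 below).  The variable names are made up for the
example.

```python
s_str = "hello"
s_list = ["h", "e", "l", "l", "o"]

# Indexing a string yields another string (a one-character substring).
print(type(s_str[4]) is type(s_str))      # True

# Indexing a list yields an element, not a sublist.
print(type(s_list[4]) is type(s_list))    # False

# With a string represented as a list of characters, element assignment
# is visible through every reference, exactly as for any other list.
t = s_list
s_list[4] = "c"
print("".join(t))                         # hellc
```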

It may still be reasonably claimed that the assignment
	s[5] := "c"
should assign the string "c" to the result of the expression s[5].
Since the expression s[5] generates a new string, which is not bound
to any variable, this assignment should generate a run time error.
But then ICON has never claimed to be an assignment-by-reference
language.  You can convert it to one by disallowing the above
statement and providing a similar capability in a subroutine which
generates a new character string:
	s := replace_character(s, 5, "c")
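
A minimal sketch of such a subroutine, in Python.  Only the name
replace_character and its argument order come from the line above; the
1-based position and the error checks are assumptions.

```python
def replace_character(s, position, ch):
    """Return a new string like s, with the character at the given
    1-based position replaced by ch.  The original s is not changed."""
    if not 1 <= position <= len(s):
        raise IndexError("position out of range")
    if len(ch) != 1:
        raise ValueError("ch must be a single character")
    return s[:position - 1] + ch + s[position:]


s = "hello"
t = s                              # t refers to the same string as s
s = replace_character(s, 5, "c")   # s now refers to a new string
print(s)                           # hellc
print(t)                           # hello
```

Python strings take the approach described here: the direct form
s[4] = "c" raises a TypeError at run time, and the only way to get the
changed string is to build a new one.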

					Kenneth Almquist
					ihnp4!houxm!hropus!ka