[comp.lang.misc] side effects

gb17+@andrew.cmu.edu (George Thomas Baggott) (07/22/89)

In a truly functional language (i.e. no side effects at all), a function
for altering a large data structure (adding an entry to a dictionary,
for example) would be written as:

	dictionary_add(dictionary, entry) returns dictionary;
	/* return a copy of dictionary with entry added to it */

This is very costly, however.  If side effects were allowed, you could
do it at a much smaller cost:

	dictionary_add(dictionary, entry);
	/* add entry to dictionary */

Is there some way to have a language with no side effects, yet still be
able to solve this class of problems in a practical way?  If not what is
the minimum amount of comprimise that must be made?


George

gudeman@arizona.edu (David Gudeman) (07/24/89)

In article  <MYltLT200VsiM0U3Yx@andrew.cmu.edu> gb17+@andrew.cmu.edu (George Thomas Baggott) writes:
>In a truly functional language (i.e. no side effects at all), a function
>for altering a large data structure (adding an entry to a dictionary,
>for example) would be written as:
>
>	dictionary_add(dictionary, entry) returns dictionary;
>	/* return a copy of dictionary with entry added to it */
>
>This is very costly, however....
>Is there some way to have a language with no side effects, yet still be
>able to solve this class of problems in a practical way?...

There are ways to design data structures such that your function
doesn't really copy the entire dictionary.  The most common method is
probably this:

Assume the dictionary is represented by a reference block:

struct rblock {        /* reference block for a hashtable */
  ...
  struct hashtable *t; /* a hash table containing the entries */
  struct rblock *p     /* the updated version of this hash table */
  struct backup *b;    /* a description of what was changed to make this
                        * table outdated, NULL means this table is current */
  }

struct rblock *dictionary_add_internal(old_dict, entry)
{
  old_dict->p = /* a way to reverse the change that is about to be made */
  update(old_dict->t, entry);
  new_dict = copy(old_dict);
  new_dict->p = new_dict->b = NULL;
  return new_dict;
}

Obviously a reference to an entry now has to make sure that the rblock
is current, and if not, then it has to follow the chain of updated
entries -- keeping track of significant updates -- to the current
table.   Also obviously, this still uses more storage than a
non-applicative dictionary would need, but you don't have to copy the
whole table, and unneeded rblocks can usually be garbage-collected.

Alternatively, you can write a smart compiler that recognizes if the
program never references the old dictionary again -- a very common
case, and usually possible for a compiler to recognize.  Then you can
usually avoid all the busy-work and just update the dictionary in
place.  There is a very large body of literature on this subject, in
addition to papers on implementation of functional languages, you
might look for references to "applicative" languages and data
structures.
-- 
					David Gudeman
Department of Computer Science
The University of Arizona        gudeman@arizona.edu
Tucson, AZ 85721                 {allegra,cmcl2,ihnp4,noao}!arizona!gudeman

andyt@visix.UUCP (Andy Turk) (07/25/89)

In article <MYltLT200VsiM0U3Yx@andrew.cmu.edu>, gb17+@andrew.cmu.edu (George Thomas Baggott) writes:
> Is there some way to have a language with no side effects, yet still be
> able to solve this class of problems in a practical way?  If not what is
> the minimum amount of comprimise that must be made?
> 
> George

It would be  really nice to be able  to write programs that obey  some
reasonable set of semantics and are efficient to boot.  Unfortunately,
this doesn't happen very often.

#define SOAPBOX
This happens because when a language  is designed with  a semantics in
mind, the semantics  usually caters to  the mathematician  rather than
the machine.  Things that are easy  for the mathematician (e.g., sets,
closures, and mappings) are very  difficult  for  the idiot savants we
call    computers.   That's why   it's   easy to   write a declarative
description  of a  dictionary that  runs really  S  L O  W.   If  only
programs ran at the Speed of Mathematics (tm).

One possible solution  is to build new  machines that are  designed to
work  well with our  current  notions of semantics.  Many  strange and
wonderful architectures have  resulted from this kind  of exploration.
However, they seem to  be very specific in  the types of problems that
they solve.

A better idea is to work  on the mathematical end  of things so we can
design efficient Von Neuman type languages that have a strong semantic
foundation.

#undefine SOAPBOX

As far as  the dictionary example is concerned,  you   always want  to
modify (rather than copy) the state of the dictionary when you  can do
so without breaking the semantics of the program.

If there is just one reference  to the dictionary during the execution
of your program, then it's ok to modify it.  If there is more than one
reference, then  you need to  copy it.  (If  there are  no references,
then get rid of it).

If your program were written in the Most Perfect  Language (MPL), then
you should be able to combine the semantics of MPL  with the semantics
of your program to determine  which implementation of dictionary_add()
is best.

-------------------------------------------------------------------------------
Andrew K. Turk
visix!andyt@uunet.uu.net

        "I don't know what happened to my face." -- Dizzy Gillespie
-------------------------------------------------------------------------------

rro@bizet.CS.ColoState.Edu (Rod Oldehoeft) (07/28/89)

In article <MYltLT200VsiM0U3Yx@andrew.cmu.edu> gb17+@andrew.cmu.edu (George Thomas Baggott) writes:
>In a truly functional language (i.e. no side effects at all), a function
>for altering a large data structure (adding an entry to a dictionary,
>for example) would be written as:
>
>	dictionary_add(dictionary, entry) returns dictionary;
>	/* return a copy of dictionary with entry added to it */
>
>This is very costly, however.  If side effects were allowed, you could
>do it at a much smaller cost:
>
>	dictionary_add(dictionary, entry);
>	/* add entry to dictionary */
>
>Is there some way to have a language with no side effects, yet still be
>able to solve this class of problems in a practical way?  If not what is
>the minimum amount of comprimise that must be made?
>
>
>George

Single assignment semantics requires that an aggregate value be copied
if one wishes to modify an element of the value.  However, an
implementatiion may observe the last use of a value and update it in place.

We have done this for SISAL with good results, although there are still
instances where recognition does not take place.  

A Postscript version of David Cann's Ph.D. dissertation with all the
gory detail on this subject is available via anonymous ftp from
handel.CS.ColoState.EDU (129.82.102.16).  See file TR.89-108.ps in the
directory ~ftp/pub/SISAL.

Rod Oldehoeft                    Email: rro@CS.ColoState.EDU
Computer Science Department      Voice: 303/491-5792
Colorado State University        Fax:   303/491-2293
Fort Collins, CO  80523

morrison@accuvax.nwu.edu (Vance Morrison ) (07/29/89)

I have been thinking about the problem of programming languages for quite
a while now and the problem of aliasing is a interesting one.

The basic problem is that when a person writes code, he is often implicitly
assuming he has exclusive access to all objects in the his namespace and
that each object is unique (unaliased).  When a language allows you to
violate this principle, subtle bugs are possible (and are all too common
in real life).  

Pure functional languages take the view of solving the problem by simply
not permitting aliasing.  When an object it created, it is given a value
and that value never changes (referential transparency).  Given this rather
strong restriction it is amazing that a great deal of nontrivial code can
be written in this manner.

Now what happens in these languages is that every operation has to create
a new operation (since it can't store the result in any old ones). If we
implement this in a simple way, the resulting code is EXTREMELY inefficient.

The way around this is to view the code as simply a description of what
is to be done, not how to do it.  Thus the compiler should recognise that
certain variables are no longer uses and thus can be reused to hold new
values (without the overhead of throwing it away, garbage collecting and
reallocating it).  Any viable functional language does this to some extent
and it is a research topic as to how far it can go.

The nice part about doing things this way (that is programming in a functional
language and having it be converted to an efficient imperitive equivalent)
is that all of the nice features of function languages (simpler to understand
(no aliasing) program correctness proofs).  

The problem with this approach is that there is ALOT of code that really
cant be programmed with a function language (as far as I can tell) and a
bigger set that can be, but not efficiently.  For example function languages
really have no concept of a concurrent database, or resource allocation or
other operating system concepts.   And because aliasing is not allowed, you
cannot make structures have aliasing implicit in their structure (any
data structure with a loop, or things like queues, ADG's etc).  (Actually
you can make these structures but only by simulating a pointer and doing
it that way, but pointers are just what we are trying to avoid).

This is a major problem with functional languages in my option, so I believe
prohibiting aliasing completely is too strong a condition.  Instead, I believe
a lesser constraint is better.  In particular my present idea is that a
language should guarentee 'effective exclusive use'.  What that means is
that as far as the programmer is concerned every variable name is unique and
he has exclusive use over it.  Now this is 'effective' exclusive use because
the compiler may not give you excluseive use if you don't need it.  (for 
example, from the programmers point of view the complete tree is locked, 
however the compiler only locks the part of the tree the program is currently
working on).  

The key to making this work is for the programmer to specify to the compilier
(by sutable language constructs) the properties that he whats at what place
in the code.  Using the code and the required properties the compiler can
determine where exclusive use is necessary (actually it determines where it
is not necessary and thus can optimize).  

These ideas are rough, but I believe they are workable.  I am willing to
listen to anyone's view on the subject.


Vance