corvara@ztivax.UUCP (Dr Gerd Venzl) (05/27/91)
Some months ago, somewhere in the literature, I found a statement like
"Functional programming is not well suited for algorithms of graph
theory as these usually make frequent use of side effects."

When I tried to write down a purely functional formulation of some
manipulations on binary decision diagrams, I found that indeed to be
quite a difficult and unnatural task. Are there really fundamental
reasons for these difficulties to arise, or is my vision and
understanding of graph algorithms just too imperative? Are there any
published attempts to give functional formulations of graph
algorithms? Any opinions from netland?

Please reply by email; I will summarize if there is sufficient
feedback.

Torsten Roessel
Siemens AG, Munich (FRG)
InterNet: roessel@ztivax.siemens.com
reddy@reddy.cs.uiuc.edu (Uday S. Reddy) (05/29/91)
In article <5332@ztivax.UUCP> corvara@ztivax.UUCP (Dr Gerd Venzl) writes:
> Some months ago somewhere in the literature I found a statement like
> "Functional programming is not well suited for algorithms of graph
> theory as these usually make frequent use of side effects."
I am curious where exactly you found that quote.
In any case, the reason graph algorithms cannot be efficiently
implemented in functional languages is that they involve "shared
updates". Depth-first search, for instance, involves updating tags at
vertices to keep track of which vertices have been visited.
"Modifying" a tag means making a new graph which is the same as the
old one except that one of its tags is changed.
The simplest way to handle these in existing pure functional languages
is to (abstractly) represent a graph as a finite map

    Vertex -> List(Edge)
Such maps can be implemented, in turn, by balanced search trees of
some kind to allow for (nondestructive) updating. Of course, if you
are using an "impure" functional language with assignments, you can
use them to get what you want. But, you should be careful not to
create awful side effects.
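[The finite-map representation can be sketched in a few lines. The
following is a toy illustration only, not code from any of the posts:
it uses plain association lists instead of balanced trees (so "update"
is linear rather than logarithmic), and all names are invented here.]

```sml
(* A graph as a finite map Vertex -> list of successor vertices.
   Updating is nondestructive: addEdge builds a new map that shares
   most of its structure with the old one. *)
type vertex = int
type graph = (vertex * vertex list) list   (* adjacency map *)

fun succs (g : graph) (v : vertex) : vertex list =
    case List.find (fn (u, _) => u = v) g of
        SOME (_, es) => es
      | NONE => []

fun addEdge (g : graph) (v : vertex, w : vertex) : graph =
    (v, w :: succs g v) :: List.filter (fn (u, _) => u <> v) g

(* Purely functional depth-first search: the "visited" tags of the
   imperative version become an explicit accumulator that is threaded
   through the recursion instead of updated in place. *)
fun dfs (g : graph) (v : vertex) (visited : vertex list) : vertex list =
    if List.exists (fn u => u = v) visited then visited
    else List.foldl (fn (w, vis) => dfs g w vis)
                    (v :: visited)
                    (succs g v)
```

[The point of the sketch is the last function: every "update" of the
visited set returns a new value, so the set must be passed around
explicitly, which is exactly the inconvenience under discussion.]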
Researchers are working on notations for expressing sequentiality
constraints so that compilers can update data structures in place.
See the proceedings of the upcoming FPCA, for example.
Uday Reddy
--
Uday Reddy
Department of Computer Science
University of Illinois
1304 West Springfield Avenue
Urbana, IL 61801.
tmb@ai.mit.edu (Thomas M. Breuel) (05/30/91)
In article <REDDY.91May28195125@reddy.cs.uiuc.edu> reddy@reddy.cs.uiuc.edu (Uday S. Reddy) writes:
> Researchers are working on notations for expressing sequentiality
> constraints so that compilers can update data structures in place.
> See the proceedings of the upcoming FPCA, for example.
I think this is the wrong approach. Using data structures based on
reverse diffs (as in RCS), an update involving side effects:

    val new_graph = (Graph.update_add_edge(graph,from,to); graph)

is no more efficient (except for possibly a small, constant time
overhead) than a more functional formulation:

    val new_graph = Graph.with_edge(graph,from,to)
Therefore, the following claim doesn't quite hold:
> > Some months ago somewhere in the literature I found a statement like
> > "Functional programming is not well suited for algorithms of graph
> > theory as these usually make frequent use of side effects."
> I am curious where exactly you found that quote.
> In any case, the reason graph algorithms cannot be efficiently
> implemented in functional languages is that they involve "shared
> updates".
Since both "updates" and lookups can be carried out in constant time
even in a purely functional setting using data structures based on
reverse diffs, this is not true.
> Such maps can be implemented, in turn, by balanced search trees of
> some kind to allow for (nondestructive) updating. Of course, if you
> are using an "impure" functional language with assignments, you can
> use them to get what you want. But, you should be careful not to
> create awful side effects.
The main problem is that most purely functional languages do not
provide data structures based on reverse diffs as primitives. If you
don't have side-effects in your language and only have data structures
like "list", "tuple", and "vector" built in, you lose, of course. (But,
by analogy, if you don't have "vector" in your language, you don't
get constant time lookups, so the fault is not with the functional
programming paradigm, but with the absence of the right primitive
data type.)
However, in a language like SML, you can isolate the side effects to
a small, abstract module and program your graph algorithms happily,
side-effect free, and efficiently.
Thomas.
jwb@cepmax.ncsu.edu (John W. Baugh Jr.) (05/30/91)
Thomas M. Breuel writes about "reverse diffs":

> I think this is the wrong approach. Using data structures based on
> reverse diffs (as in RCS), an update involving side effects:
>
>     val new_graph = (Graph.update_add_edge(graph,from,to); graph)
>
> is no more efficient (except for possibly a small, constant time
> overhead) than a more functional formulation:
>
>     val new_graph = Graph.with_edge(graph,from,to)

I've never heard of reverse diffs -- could you please give some
examples and/or references? Also, can reverse diffs deal effectively
with other structured data types (like matrices: constant access &
constant update)?

John Baugh
jwb@cepmax.ncsu.edu
reddy@reddy.cs.uiuc.edu (Uday S. Reddy) (05/30/91)
In article <TMB.91May29174602@volterra.ai.mit.edu> tmb@ai.mit.edu (Thomas M. Breuel) writes:
> > Researchers are working on notations for expressing sequentiality
> > constraints so that compilers can update data structures in place.
> > See the proceedings of the upcoming FPCA, for example.
> I think this is the wrong approach. Using data structures based on
> reverse diffs (as in RCS), an update involving side effects:
>     val new_graph = (Graph.update_add_edge(graph,from,to); graph)
> is no more efficient (except for possibly a small, constant time
> overhead) than a more functional formulation:
>     val new_graph = Graph.with_edge(graph,from,to)
----------
I just forgot how touchy this subject is. I cannot get into a long
debate at this time, but some clarifications.
I don't recall recommending "side effects" of any kind. The work I
referred to in the above quote uses pure functional languages.
Additional sequentiality annotations are added so that compilers can
implement them efficiently.
I wasn't referring to the work of Swarup, Ireland and me, which also
happens to appear in the upcoming FPCA. Our work uses a different
approach with explicit assignments. But, it too does not involve
"side effects".
If you believe there is no problem in dealing with updates in
functional languages, take a standard textbook on algorithms and
write the interesting algorithms in a functional language using your
favorite techniques. ("interesting" means using updates in an
interesting way). Then answer the following questions:
- Are the data structures represented as abstractly as in the
imperative programs?
- Do these data structures use the language's storage management or do
you need to manage the storage explicitly?
- Are the algorithms as efficient as the imperative versions?
- Are they as "convenient" as the imperative versions? For example,
do they involve passing an inordinate number of arguments while the
corresponding imperative versions don't?
If you are happy with your answers to all these questions, publish
your results!
--
Uday Reddy
Department of Computer Science
University of Illinois
1304 West Springfield Avenue
Urbana, IL 61801.
tmb@ai.mit.edu (Thomas M. Breuel) (05/30/91)
In article <1991May29.222935.13101@ncsu.edu> jwb@cepmax.ncsu.edu (John W. Baugh Jr.) writes:

[Reddy, suggesting the addition of "sequence" primitives to allow a
compiler to generate in-place updates rather than copying when
modifying an array, even in a purely functional language]

> > I think this is the wrong approach. Using data structures based on
> > reverse diffs (as in RCS), an update involving side effects [...]
> > is no more efficient (except for possibly a small, constant time
> > overhead) than a more functional formulation.
>
> I've never heard of reverse diffs--could you please give some examples
> and/or references? Also, can reverse diffs deal effectively with other
> structured data types (like matrices: constant access & constant update)?

At the end, you will find some old SML code (sorry about the style; it
was just a quick initial hack for playing around with the idea) that
should illustrate the point. The primitives you can use to operate on
INTDICT arrays are completely functional: sub, assoc, length, and
generate. From the programmer's point of view, there is no way to
modify an INTDICT, although internally the implementation uses side
effects for efficiency.

The code gives you constant-time update and access to the latest
version of the array. If you try to update older versions, you pay
more overhead. For converting code containing side effects to a
functional style, it doesn't matter that updates of old versions take
longer, since code written using side effects will by necessity only
access and update the latest version of an array. For other purposes,
there are more complicated, tree-based mechanisms for keeping diffs
that give you log-time update and log-time retrieval for old versions.
The technique can be applied to any of the data structures used in
imperative languages.
As an efficiency hack, and to assert that you only ever want to use
the latest version of an array (as is the case for converted
imperative programs), you could also simply tag old versions as
completely invalid and raise an exception when an attempt is made to
use them. I suspect this would be equivalent to the "sequence"
primitives that Reddy mentions, but it does not require any language
constructs beyond what the functional subset of languages like SML
already offers. For other tricks you can play with reverse diffs, look
at the RCS documentation.

Thomas.

    signature VALUETYPE = sig
        type T
        val makestring : T -> string
    end

    signature INTDICT = sig
        type T
        type dict
        val generate : (int -> T) -> int -> dict
        val sub      : dict * int -> T
        val length   : dict -> int
        val assoc    : dict * int * T -> dict
        val print    : dict -> unit
    end

    functor ReverseDiffArray(T:VALUETYPE):INTDICT = struct
        type T = T.T
        val makestring = T.makestring

        datatype rdiff = BASE of (T array)
                       | DIFF of (int * T * rdiff ref)
        type dict = rdiff ref

        nonfix sub

        fun generate f n = ref(BASE(ArrayU.generate f n))

        fun sub(ref(BASE(x)), i:int) = Array.sub(x,i)
          | sub(ref(DIFF(key,value,next)), i:int) =
                if i = key then value else sub(next,i)

        fun length(ref(BASE(x))) = Array.length(x)
          | length(ref(DIFF(_,_,x))) = length(x)

        fun assoc(last:dict as ref(BASE(x)), key:int, value:T) =
                let val old  = Array.sub(x,key)
                    val new  = (Array.update(x,key,value); ref(BASE(x)))
                    val back = ref(DIFF(key,old,new))
                in last := !back; new end
          | assoc(last:dict, key:int, value:T) =
                let val n = length(last)
                    val x = Array.array(length(last), sub(last,0))
                    fun loop i = if i >= n then ()
                                 else (Array.update(x,i,sub(last,i)); loop(i+1))
                in loop 0; Array.update(x,key,value); ref(BASE(x)) end

        fun print(a:dict) =
                let fun p(x) = output(std_out, x)
                    val n = length(a)
                    fun loop i = if i >= n then ()
                                 else (p(makestring(sub(a,i)));
                                       if i < n-1 then p " " else ();
                                       loop(i+1))
                in p "<"; loop 0; p ">\n" end
    end

    structure IID = ReverseDiffArray(struct
        type T = int
        val makestring : T -> string = makestring
    end);

    - open IID;
    open IID
    - val a = generate (fn x => x) 10;
    val a = ref (BASE prim?) : dict
    - print a;
    <0 1 2 3 4 5 6 7 8 9>
    val it = () : unit
    - val b = assoc(a,5,55);        <-- constant time update
    val b = ref (BASE prim?) : dict
    - print b;
    <0 1 2 3 4 55 6 7 8 9>
    val it = () : unit
    - print a;
    <0 1 2 3 4 5 6 7 8 9>
    val it = () : unit
    - val c = assoc(b,6,66);        <-- constant time update
    val c = ref (BASE prim?) : dict
    - print c;
    <0 1 2 3 4 55 66 7 8 9>
    val it = () : unit
    - print b;
    <0 1 2 3 4 55 6 7 8 9>
    val it = () : unit
    - print a;
    <0 1 2 3 4 5 6 7 8 9>
    val it = () : unit
    - val d = assoc(a,3,33);        <-- causes full copy (could be more efficient)
    val d = ref (BASE prim?) : dict
    - print d;
    <0 1 2 33 4 5 6 7 8 9>
    val it = () : unit
    - print a;
    <0 1 2 3 4 5 6 7 8 9>
    val it = () : unit
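[The "tag old versions invalid" variant mentioned in the post above
can be sketched in a few lines. This is an illustration of the idea
under the same single-threaded-usage assumption, not Breuel's code;
all names here are invented.]

```sml
(* Single-threaded persistent array: each update destructively writes
   the underlying array and invalidates the old handle.  Touching a
   stale handle raises an exception -- the dynamically checked analogue
   of the "sequence" primitives discussed earlier in the thread. *)
datatype 'a version = LIVE of 'a array | STALE
exception Stale

type 'a dict = 'a version ref

fun fromList xs = ref (LIVE (Array.fromList xs))

fun sub (ref (LIVE a), i) = Array.sub (a, i)
  | sub (ref STALE, _)    = raise Stale

fun assoc (d as ref (LIVE a), i, v) =
      (Array.update (a, i, v);   (* in-place update *)
       d := STALE;               (* old handle can no longer be used *)
       ref (LIVE a))
  | assoc (ref STALE, _, _) = raise Stale
```

[Compared with the reverse-diff version, there is no overhead for
keeping diffs at all; the price is that accessing any version other
than the latest is a runtime error rather than a slow lookup.]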
wg@opal.cs.tu-berlin.de (Wolfgang Grieskamp) (06/02/91)
tmb@ai.mit.edu (Thomas M. Breuel) writes:

> [INTDICT "reverse-diff" implementation, cf. original posting]

Nice demonstration of the "clean" use of references in SML. I believe
updatable arrays should be built in or implemented in a foreign
language anyway, though, and a compiler can do the job slightly better
than the SML programmer if it implements reference counting and/or
performs sharing analysis. In the first case (RC), the generated code
might check at run time whether the object to update is "exclusive",
i.e. only one reference to it exists. If so, no "diffs" have to be
created. In the second case, static analysis detects that a given
object is always exclusive in some context.

In any case, one has to find a sequentialization order which minimizes
sharing. Consider an expression which swaps two values of an array:

    assoc(assoc(A,j,A sub i),i,A sub j)

This may be expressed by the "parallel" let (I hope "and" does
something like this in SML; I don't have the manual at hand):

    let val xi = A sub i
        and A1 = assoc(A,j,xi)
        and xj = A sub j
        and A2 = assoc(A1,i,xj)
    in A2 end

Textual sequentialization leads to the result that A is shared during
the first call to assoc. But a simple heuristic can find the desired
ordering (i.e. first performing the sub applications, then the assoc
applications). Let each parameter of a function be marked as either
"destructive" (the function might update the parameter in place) or
"non-destructive" (the function definitely will not). This information
is pre-given for primitive functions and is inherited by user-defined
functions. Naturally, "assoc" is destructive in its array argument,
and sub is not (either pre-given or by analysis). The above parallel
let will then be sequentialized by performing first the non-destructive
sub applications and then the destructive assoc applications.

To conclude, selective update is not _such_ an efficiency penalty in
functional languages, provided your compiler really takes care of it.
Unfortunately, most functional language compilers I know of won't (except the OPAL compiler). -- Wolfgang Grieskamp wg@opal.cs.tu-berlin.de tub!tubopal!wg wg%opal@DB0TUI11.BITNET
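[Grieskamp's desired ordering can also be written out directly as an
ordinary sequential let: both reads happen before either update, so a
compiler with reference counting or sharing analysis could perform the
updates in place. The sketch below is illustrative only; it mimics the
sub/assoc interface with SML's immutable Vector, whose update copies
but whose semantics match the discussion.]

```sml
(* Swap two array elements in the sharing-minimizing order:
   non-destructive reads first, then the (potentially in-place)
   updates.  Vector.update returns a fresh vector, standing in for
   the nondestructive assoc of the thread. *)
fun sub (a, i) = Vector.sub (a, i)
fun assoc (a, i, v) = Vector.update (a, i, v)

fun swap (A, i, j) =
    let val xi = sub (A, i)       (* reads before any update *)
        val xj = sub (A, j)
        val A1 = assoc (A, j, xi) (* now A may be updated in place *)
        val A2 = assoc (A1, i, xj)
    in A2 end
```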