[comp.lang.prolog] Minimum Spanning Tree with '=='

tarau@ouareau.iro.umontreal.ca (Paul Tarau) (05/10/88)

The following program finds minimum spanning trees
in O(E) "Prolog time", (E = number of edges).
The program uses the impure '==' operation to deal with 
connected components.

% MinSpanTree is a minimum spanning tree of an undirected graph
% represented by its list of Edges of the form: "edge(Cost,From,To)"
% with From<To, sorted by cost (in ascending order),
% and having as vertices natural numbers in 1..NbOfVertices.

mst(NbOfVertices,Edges,MinSpanTree):- 
	% makes an array of connected components
	functor(Components,'$',NbOfVertices), 
	mst(NbOfVertices,Components,Edges,MinSpanTree).

mst(1,_,_,[]):-!.		% no more vertices left
mst(N,Cs,[edge(Cost,V1,V2)|Es],T):-
	N>1,
	arg(V1,Cs,C1),		% C1,C2 are the components of V1,V2
	arg(V2,Cs,C2),
	(	C1==C2->	% Both endpoints are already in the same
		T1=T,		% connected component - skip the edge
		N1 is N 	
	;	C1=C2,		% Put endpoints in the same component
		T=[edge(Cost,V1,V2)|T1],	% Take the edge
		N1 is N-1			% Count a new vertex
	),
	mst(N1,Cs,Es,T1).

test(MinSpanTree):-
  mst(
      % Number of vertices
        5,
      % sorted list of edges (by cost)
        [
	  edge(10,1,2),edge(20,4,5),edge(30,1,4),
	  edge(40,2,5),edge(50,3,5),edge(60,2,3),
	  edge(70,1,3),edge(80,3,4),edge(90,1,5)
	],
	MinSpanTree
      ).

If sorting by cost is not provided, it may take O(E*log(E)).
If the number of vertices is not provided it takes O(E) to
find it as the biggest vertex on the list of edges.
Unification may have chains of variables representing connected
components to follow, but that is C or machine-language time...

The point is is that the '==' hack seems very useful, and
this kind of algorithm works fine on many related problems.
In this example logical variables seem to work basically
as equivalence classes, and unification means replacing
two of them by their union. With this interpretation, the
program is sound, but I do not know at this time how to
to force '==' inherit this kind of soundness in the general case.

Does someone have an elegant (meta-)logical semantics for '==' ?

	Paul Tarau

ok@quintus.UUCP (05/13/88)

In article <544@mannix.iros1.UUCP>, tarau@ouareau.iro.umontreal.ca
(Paul Tarau) presented a cunning little procedure for computing a
minimum spanning tree.  There are some problems:
    -	using integers as vertex labels is a bit of a pain
    -	functor(Components,'$',NbOfVertices) is used, which is likely
	to limit NbOfVertices to a small value (maybe ~250, but I've
	seen otherwise nice Prologs with much lower limits)
    -	there may be many minimum spanning trees of a graph; it would
	be nice to have a relation which can confirm that an alleged
	mst is one.

The key hack in his code is the use of variable-variable unification
as an implementation of the UNION-FIND problem.  For every edge in
the input, two FINDs are done, and for every edge in the output, one
UNION is done.

The snag with this is that it is possible to build up variable chains
of arbitrary length in Prolog.  You have to try rather hard, but you
can do it.  For example, to build a chain of length 5,

	_ = ''(A,B,C,D,E,F),	% A @< B @< ... @< F
	E = F,
	...
	A = B,

This gives us a chain F->E->...->B->A having five links.  Letting V be
the number of vertices in the graph, and E be the number of edges, it
is thus possible for the UNIONs and FINDs to take O(V.(E+V)) time.
This is a pretty unlikely worst case, but this behaviour _has_ been
observed in a benchmark.  There are methods which could be used to
reduce this worst case, but they'd push the constant factor up.  Any
Prolog implementors out there using a nontrivial UNION-FIND for
managing variable-variable unification?

tarau@ouareau.iro.umontreal.ca (Paul Tarau) (05/16/88)

In the article <962@cresswell.quintus.UUCP>, ok@quintus.UUCP 
(Richard A. O'Keefe) writes:

> In article <544@mannix.iros1.UUCP>, tarau@ouareau.iro.umontreal.ca
> (Paul Tarau) presented a cunning little procedure for computing a
> minimum spanning tree.  There are some problems:
>    -	using integers as vertex labels is a bit of a pain
>    -	functor(Components,'$',NbOfVertices) is used, which is likely
>	to limit NbOfVertices to a small value (maybe ~250, but I've
>	seen otherwise nice Prologs with much lower limits)

An easy way to get rid of 'functor' and its 250 or other arbitrary limit
is to represent vertices by free variables:

% mst(NbOfVertices,Edges,MinSpanTree)

mst(1,_,[]):-!.		% No more vertices left
mst(N,[edge(Cost,V1-C1,V2-C2)|Es],T):-
	N>1,
	(	C1==C2->	% Both endpoints are already in the same
		T1=T,		% connected component - skip the edge
		N1 is N 	
	;	C1=C2,		% Put endpoints in the same component
		T=[edge(Cost,V1,V2)|T1],	% Take the edge
		N1 is N-1			% Count a new vertex
	),
	mst(N1,Es,T1).

:- dynamic test/1.	% necessary in Quintus Prolog to 
			% accept an unlimited number of variables

test(MinSpanTree):-
  mst(
      % Number of vertices
        5,
      % sorted list of edges (by cost)
        [
	  edge(10,V1,V2),edge(20,V4,V5),edge(30,V1,V4),
	  edge(40,V2,V5),edge(50,V3,V5),edge(60,V2,V3),
	  edge(70,V1,V3),edge(80,V3,V4),edge(90,V1,V5)
	],
	MinSpanTree
      ),
      numbervars(MinSpanTree,0,_). % just for a nice output

>    -	there may be many minimum spanning trees of a graph; it would
>	be nice to have a relation which can confirm that an alleged
>	mst is one.

A simple way to do it is to have a sorting algorithm which
generates alternative orderings by cost. This may be also
useful in related problems, as multiple solutions come from
multiple orderings. The procedure 'mst/3' can be seen as
just filtering this stream of solutions.

> The key hack in his code is the use of variable-variable unification
> as an implementation of the UNION-FIND problem.  For every edge in
> the input, two FINDs are done, and for every edge in the output, one
> UNION is done.
> The snag with this is that it is possible to build up variable chains
> of arbitrary length in Prolog.
> ......
> Any Prolog implementors out there using a nontrivial UNION-FIND for
> managing variable-variable unification?

The UNION-FIND problem can be solved for variable-variable unification
(at implementation language level) by making a tree instead of a
chain of variable references.

There are two operations:
- weight balancing (by maintaining the size of the tree)
- path compression (by making each variable on a path point to the root).

As explained in R.Sidgewick, Algorithms, Adison-Wesley, 1983, p.402-405
this is an almost O(E) operation for a structure with E edges.

At a first look the necessary WAM hacking seems minimal,
as only the 'unify_variable Yn' instruction is involved.

Paul Tarau

ok@quintus.UUCP (Richard A. O'Keefe) (05/20/88)

In article <545@mannix.iros1.UUCP>, tarau@ouareau.iro.umontreal.ca (Paul Tarau) writes:
> In the article <962@cresswell.quintus.UUCP>, ok@quintus.UUCP 
> (Richard A. O'Keefe) writes:
> > Any Prolog implementors out there using a nontrivial UNION-FIND for
> > managing variable-variable unification?
> 
> The UNION-FIND problem can be solved for variable-variable unification
> (at implementation language level) by making a tree instead of a
> chain of variable references.
> 
> There are two operations:
> - weight balancing (by maintaining the size of the tree)
> - path compression (by making each variable on a path point to the root).
> 
> As explained in R.Sidgewick, Algorithms, Adison-Wesley, 1983, p.402-405
> this is an almost O(E) operation for a structure with E edges.
> 
> At a first look the necessary WAM hacking seems minimal,
> as only the 'unify_variable Yn' instruction is involved.
> 

(1) That's Sedgewick with two "e"s and one "i";
    there is now a second edition of the book;
    Tarjan's papers are, I think, clearer.
    There is also Rem's algorithm, which is very pretty.

(2) The data-structures required for the nearly-linear UNION-FIND are
    non-trivial, and would dramatically increase the space needed for
    variables.

(3) It is not the case that only unify_variable_Yn would be involved.
    To start with, because the data structure would change, every
    dereferencing operation would be affected (and would become slower).
    Further, backtracking would have to undo the UNIONs, so trailing
    and failing would be affected.

(4) We return to the question: are there any Prolog implementors out there
    who have thought it worthwhile to use a non-trivial UNION-FIND?