markb@sdcrdcf.UUCP (Mark Biggar) (04/13/88)
I have a problem where I need to do a topological sort on the nodes of a directed graph. This would be simple except that the graphs can contain cycles. It is the nature of the problem then each node has only ONE edge leaving it, but may have several edges leading to it. This means that any connected subgraph can have at most one cycle in it. I wish to treat these cycles as single nodes in the sorted list. So if I have a list of directed edges like: [d(e,a),d(b,c),d(f,b),d(a,c),d(d,e),d(c,e)] I get the result: [[e,a,c],b,f,d] (or some other permutation that is a valid topological sort) The graphs I am working with may have several disjoint pieces (each of which may contain a cycle). Mark Biggar {allegra,burdvax,cbosgd,hplabs,ihnp4,akgua,sdcsvax}!sdcrdcf!markb markb@rdcf.sm.unisys.com
ok@quintus.UUCP (Richard A. O'Keefe) (04/14/88)
In article <5224@sdcrdcf.UUCP>, markb@sdcrdcf.UUCP (Mark Biggar) writes: > I have a problem where I need to do a topological sort on the nodes of a > directed graph. This would be simple except that the graphs can contain > cycles. It is the nature of the problem then each node has only ONE > edge leaving it, but may have several edges leading to it. This means > that any connected subgraph can have at most one cycle in it. I wish to > treat these cycles as single nodes in the sorted list. > > So if I have a list of directed edges like: > > [d(e,a),d(b,c),d(f,b),d(a,c),d(d,e),d(c,e)] > > I get the result: > > [[e,a,c],b,f,d] (or some other permutation that is a valid topological sort) > > The graphs I am working with may have several disjoint pieces (each of which > may contain a cycle). > There are two parts to the question: how do you do this at all, and how do you do it in Prolog. The Prolog part isn't terribly interesting, there is code for topological sort and union/find already drifting around. So I'll concentrate on the first part. Such a graph is a sort of forest, except that some of the "trees" have a ring as their root instead of a single node. The best method I can think of is to adapt the usual Depth-First-Search method for determining strongly connected components. for Node in nodes(Graph) do visited(Node) := false od Order := [] for Node in nodes(Graph) do if not visited(Node) then Stack := [] % all the nodes reachable from Node N := Node while not visited(N) and there is an arc N->P do visited(N) := true Stack := [N|Stack] N := P od if visited(N) then % either we've found a cycle, or a previously processed path if member(N, Stack) then % it's a cycle Cycle := [] repeat Cycle,Stack := [head(Stack)|Cycle], tail(Stack) until head(Cycle) = N Order := [[Cycle]|Order] fi else % we've found a plain root visited(N) := true Stack := [N|Stack] fi Order := append(Stack, Order) fi od Order := reverse(Order) Interestingly enough, we can use ordinary lists for Stack, Cycle, and Order. If the member(N, Stack) test takes k steps, so what? It must have taken at least k steps to make Stack in the first place, so using ordinary member/2 won't spoil the asymptotic efficiency. The trick is representing visited() and the collection of arcs. That's what will make this an O(V.lgV) method in Prolog (V being the number of vertices) rather than an O(V) method. Caveat: I haven't tested this code.
markb@sdcrdcf.UUCP (Mark Biggar) (04/22/88)
In article <5224@sdcrdcf.UUCP> I write: >I have a problem where I need to do a topological sort on the nodes of a >directed graph. This would be simple except that the graphs can contain >cycles. It is the nature of the problem then each node has only ONE >edge leaving it, but may have several edges leading to it. This means >that any connected subgraph can have at most one cycle in it. I wish to >treat these cycles as single nodes in the sorted list. > >So if I have a list of directed edges like: > >[e(e,a),e(b,c),e(f,b),e(a,c),e(d,e),e(c,e)] > >I get the result: > >[[e,a,c],b,f,d] (or some other permutation that is a valid topological sort) > >The graphs I am working with may have several disjoint pieces (each of which >may contain a cycle). First off, thanks to everyone you answered my article. Given the restrictions on the graphs as stated above, the graphs can be thought of as a forest of trees some of which are rooted in a set of cycles. So the basic idea is to prune the trees from the leaves to the roots and then handle the cycles. So my first shot at it looked like: topsort(List,Slist) :- leaves(List,List,Nodes,Rest), % separate out the leaves (Nodes = [] -> % if there weren't any... cycles(Rest,Slist) % do the cycles ; % else topsort(Rest,Tlist), % topsort the rest of append(Tlist,Nodes,Slist) % and append it on the front ). leaves(_,[],[],[]). % empty list has no leaves leaves(List,[e(Tail,Head)|Edges],[Tail|Nodes],Rest) :- not(member(e(_,Tail),List)), % a leaf if no one points at it leaves(List,Edges,Nodes,Rest). % find the rest leaves(List,[Edge|Edges],Nodes,[Edge|Rest]) :- leaves(List,Edges,Nodes,Rest). % skip over ono-leaves This ignores handling the cycles for now. This implementation has two problem with: the first is that it is wrong in that it looses the root nodes of trees which are not rooted in a cycle. Second I don't like the append. The first problem has two reasonable solutions: 1) Notice and handle root nodes as the last edge leading to that root node is removed. But that requires upto two extra passes over the list of nodes for each edge that is removed, one to see if the head of the edge is a root node and one to see if this edge is the last one pointing at that root node. So I rejected this solution. 2) Trees that are not rooted in a cycle can be made rooted. In the original data edges like e(a,a) are not possible, so I can make one pass over the original data and add such an edge for each unrooted tree and then handle them at the same time I handle the cycles. The append can be gotten rid of by noticing that I don't have to collect the leaves in bunches, I can prune tham one at a time and still get a valid topsort. And by borrowing "the build this list up as I pass the arguments down" trick (from the non-naive reverse) I get the following: topsort([],[]). topsort(Edges, Nodes) :- root_trees(Edges,Edges,RootedEdges), prune_leaves(RootedEdges,SortedLeaves,[],Residue), cycles(Residue,SortedLeaves,Nodes). root_trees(_,[],[]). root_trees(Edges,[e(T,H)|Rest],[e(T,H)|RootedRest]) :- member(e(H,_),Edges), % not a root root_trees(Edges,Rest,RootedRest). % so skip it root_trees(Edges,[e(T,H)|Rest],[e(H,H),e(T,H)|RootedRest]) :- root_trees(Edges,Rest,RootedRest). % add simple cycle prune_leaves(Edges,OutList,InLIst,Residue) :- first_leaf(Edges,Edges,Leaf,Others), prune_leaves(Others,OutList,[Leaf|InList],Residue). prune_leaves(Edges,Leafs,Leafs,Edges). first_leaf(Edges,[e(T,H)|Rest],Leaf,[e(T,H)|Others]) :- member(e(_,T),Edges), first_leaf(Edges,Rest,Leaf,Others). first_leaf(Edges,[e(T,H)|Rest],T,Rest). % note first_leaf is meant to fail on first_leaf(_,[],_,_). % which of course means I didn't find any leaves cycles([],X,X). cycles([e(T,T)|Rest],IList,[T|OList]) :- cycles(Rest,IList,OList). % handle the simple cycles cycles([e(T,H)|Rest],IList,[[H|Nodes]|OList]) :- prune_leaves(Rest,Nodes,[],Residue), % I,ve broken the cycle and cycles(Residue,IList,OList). % can use prune_leaves to % get the rest of the cycle I would like any advice I can get on how to make this better. Mark Biggar {allegra,burdvax,cbosgd,hplabs,ihnp4,akgua,sdcsvax}!sdcrdcf!markb markb@rdcf.sm.unisys.com