[comp.lang.prolog] Representing graphs?

Ari.Huttunen@hut.fi (Ari Juhani Huttunen) (06/23/91)

I would like to know if this is a good way to represent graphs or an ugly hack.
The idea is to have fast access to the current location in the graph and
it's near neighbours. And since SB-prolog doesn't encourage asserting and
the graph has to be created while running...

graph(X) :-
	X = vertice(x,[A,B,C]),
	A = vertice(a,[D,X]),
	D = vertice(d,[A]),
	B = vertice(b,[X,G]),
	C = vertice(c,[X,E]),
	E = vertice(e,[C,F]),
	G = vertice(g,[F,B]),
	F = vertice(f,[E,G]).

test :-
	graph(X),
	writevertice(X),
	move(X,Y),
	writevertice(Y),
	move(Y,Z),
	writevertice(Z).

move(vertice(_,[Y|Ys]),Y).	

writevertice(vertice(Name,ConnectedTo)) :-
	names(ConnectedTo,Names),
	write(Name),
	write('-->'),
	writelist(Names),
	nl.

names([vertice(Name,_)|Vertices],[Name|Names]) :-
	names(Vertices,Names).
names([],[]).

writelist([X|Xs]) :-
	write(X),
	writelist(Xs).
writelist([]).
--
Ari Huttunen		  
puhelin: 90-7285944		T{m{ tila vuokrattavana.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/25/91)

In article <ARI.HUTTUNEN.91Jun23021130@wonderwoman.hut.fi>, Ari.Huttunen@hut.fi (Ari Juhani Huttunen) writes:
> I would like to know if this is a good way to represent graphs
> or an ugly hack.  The idea is to have fast access to the current
> location in the graph and its near neighbours.

> graph(X) :-
> 	X = vertice(x,[A,B,C]),
> 	A = vertice(a,[D,X]),
....
It's an ugly hack.  It is going to get you into big trouble is some
Prologs.  The file GRAPHS.PL in the DEC-10 Prolog library provides
several representations for graphs, including
	list of edges					-- sorted!
	list of node-(list of neighbours)		-- sorted!
These representations appear to be adequate for a great many tasks,
allowing the graph algorithms I was interested in to be coded in
Prolog with the same asymptotic cost as in Pascal or C (admittedly
with higher constant factors).  Another representation would be the
adjacency matrix, represented as a quad-tree, which works rather
nicely.  I would be interested to know which graph algorithms you
are implementing.

By the way, it's "vertEX", not "vertICE".
-- 
I agree with Jim Giles about many of the deficiencies of present UNIX.

bimandre@icarus.cs.kuleuven.ac.be (Andre Marien) (06/26/91)

> > I would like to know if this is a good way to represent graphs
> > or an ugly hack.  The idea is to have fast access to the current
> > location in the graph and its near neighbours.
> 
> > graph(X) :-
> >       X = vertice(x,[A,B,C]),
> >       A = vertice(a,[D,X]),
> ....
> It's an ugly hack.  It is going to get you into big trouble is some
> Prologs.

It is not an ugly hack.

Many people come up with similar ideas; it is a very natural way of
describing such a graph. It is an elegant solution if the graph is known
and needn't be updated.

You can write 'real' Prolog code as in some library, but it is not as
simple as this example.

If some Prolog doesn't support this, complain to your vendor: there
are many non-commercial systems which show it is easy and does not
really come in the way of the efficiency of the system (the most important
characteristic of Prolog ;-)

Andre' Marien
bimandre@cs.kuleuven.ac.be

fcs@aifh.ed.ac.uk (Flavio Soares Correa Da Silva) (06/27/91)

In article <6505@goanna.cs.rmit.oz.au>, ok@goanna.cs.rmit.oz.au (Richard A.
O'Keefe) writes:

# Another representation would be the
# adjacency matrix, represented as a quad-tree, which works rather
# nicely.

Could you explain this a bit more? 

Thanks,

Flavio (fcs@aipna.ed.ac.uk)

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/30/91)

Someone wrote
> I would like to know if this is a good way to represent graphs
> or an ugly hack.  The idea is to have fast access to the current
> location in the graph and its near neighbours.

> graph(X) :-
>       X = vertice(x,[A,B,C]),
>       A = vertice(a,[D,X]),
.....
I replied
> It's an ugly hack.  It is going to get you into big trouble is some
> Prologs.
In article <4134@n-kulcs.cs.kuleuven.ac.be>,
bimandre@icarus.cs.kuleuven.ac.be (Andre Marien) writes:
> It is not an ugly hack.
> 
> Many people come up with similar ideas; it is a very natural way of
> describing such a graph.  It is an elegant solution if the graph is known
> and needn't be updated.

I am grateful to Andre Marien for this opportunity to clarify what I wrote.

Whether this approach is or is not an ugly hack has, of course, little to
do with whether any Prolog system supports it.  There are indeed systems
such as SICStus Prolog which support unification of cyclic terms.  Not all
such systems _fully_ support cyclic terms; unification is, if not trivial,
at least not very hard, but output of cyclic terms may not be supported
(if it's going to be done at all, I'd prefer something like Common Lisp's
notataion, which can show sharing _precisely_, and so can produce much
more compact output than the @(<number of levels "up">) approach), or
output may be but not input, or asserting may not be (again, it is
important not just to construct a representation which would unify with
the source term, but something which has at least as much sharing), or
all the `obvious' things may be supported, but some evaluable predicates
may be surprising.

After L = [1|L], what should length(L, N) do?  Should it bind N to an
IEEE `+infinity' value?  Should it report an error?  Should it just
fail quietly?  (In the Scheme systems I use,
	(let ((L (list 1)))
	    (set-cdr! L L)
	    (length L))
returns #f, meaing "this isn't a proper list, it has no length".)
To see what it should do, consider L = [3-a,2-b,1-c|L], keysort(L, R).
There isn't any cyclic term (`rational ``tree''') which R could be
bound to.  All of the 1s have to precede all of the 2s, but the only
way we can do that is R = [1|R], which means that we'd have lost the
2s and 3s, which is wrong.  The simplest answer seems to be to say
that things like length(L,_), keysort(L,_), append(L,_), L =.. _, and
so on require L to be a _proper_ list.

Here's why cyclic terms are an ugly hack:  

	how can *you* write predicates that work like that?

If there were an evaluable predicate proper_list/1, you could use
that in the special case of lists.  But what about data structures
using other functors?  We don't want a test that says ``is this
term acyclic'', because we don't need anything that strong.  If we
just want to know the length of L, the _elements_ of L can be as
cyclic as they please, it's only the `backbone' of L that must be
acyclic.

The problem reduces to this:
    How can you write a predicate which is guaranteed to terminate
    when given a ground argument which may or may not be cyclic?

The point is that `standard' first-order terms possess an induction
principle:  you can define a `size' function on terms, and an
argument of a compound term has a smaller size than the term that
contains it.  So if you have a recursive predicate, and every step
case of that predicate recurs on an argument of (a term which was
already instantiated), that predicate must terminate.

With cyclic terms, you have no such induction principle.

That means that if you want to use a design method in which it is
easy to show that your programs terminate, you avoid cyclic terms,
no matter how well your system may support them.

I may add that when writing Scheme code, I introduce set-cdr! into
my programs with as much care as if I were loading live ammunition
into a gun that was aimed at myself.

So how does one write terminating procedures in Lisp or Scheme that
can handle cyclic structures?

There are two main techniques.  The one I rely on is `mark bits'.
When walking over a cyclic data structure (such as the support code
for a TMS) I `mark' each object that I visit, and avoid already marked
objects.  There are many ways of implementing such marks.  The other
main technique is to exploit the distinction between EQ? and EQUAL?.
In both cases, you have to be *acutely* aware of the distinction between
the address of a storage location and the value stored in it.  In short,
if you want to write terminating code that handles cyclic terms, you
have to think in terms of POINTERS.

Prolog is not a very good language in which to try to think at the
pointer level.  Consider L1 = [1|L1] and L2 = [1,1|L2].  These two
terms both represent the `infinite list' [1,1,1,1,...] and they would
unify in a system supporting unification of cyclic terms.  But they
contain different numbers of `objects'; L1 involves just one pointer,
while L2 involves 2.  So a recursive procedure manipulating these
things would find itself doing noticeably different things to EQUAL
terms!  Unpleasant things flow from that, but I'll spare you a recital.

I think I've said enough to explain why I call the use of cyclic terms
here an UGLY hack.  I might be persuaded to back down on `hack', but
not on `ugly'.

> Many people come up with similar ideas; it is a very natural way of
> describing such a graph.  It is an elegant solution if the graph is known
> and needn't be updated.

Many people may well come up with similar ideas, but that doesn't mean
it is not a hack, nor that it is elegant.  Basically, it's an imposition
of C/Pascal-style "pointer" thinking onto Prolog, not something that
flows naturally out of LOGIC.

It is illuminating to consider the original problem situation.
The person (who sent me E-mail, sorry to have lost the address) who
posed the question in the first place is trying to implement something
like `Nethack' in Prolog.  I know nothing about Nethack except that it
is like Rogue.  In Rogue, the topology of the maze does not change, as
such, but it isn't necessarily constructed all at once.  (I have never
got deeper than level 12; there's no point in constructing 35 levels
or however many there are if the player never gets that far.)  So one
noteworthy thing about the graph in question is that
	NEW VERTICES AND NEW EDGES MAY BE ADDED.
Andre Marien himself points out that update is a difficulty, but as
the graph is _growing_, it isn't a fatal problem in this case.  The
maze only grows at trapdoors.  (Although there's no reason why a
Rogue-like game couldn't have dwarves busily digging out new tunnels,
and rockfalls closing off old ones...)  But there's another fact about
Rogue which _may_ be relevant to Nethack.  The four-connected mesh
that was proposed is ok for crawling around one step at a time in any
of the cardinal directions.  But in Rogue a player is _not_ restricted
to moving one step at a time in any of the cardinal directions.  A
player may step on a teleportation trap, and arrive anywhere in the
maze, or may read a scroll which will magically move him somewhere
else.  Even if the first draft of the program doesn't have these
features, why use a data structure which will make it hard to add
them later?  If Nethack is like Rogue, then, a data structure is
required in which you can rapidly move from any node in the graph to any
other named node.

It turns out that storing the graph as a collection of facts in the
data base is almost ideal for this particular application.  It has
several merits:
-- representing a set of nodes and a set of edges in the data base
   gives you a representation which is very close to a standard
   mathematical representation of graphs.
-- nodes have *names* which are ground Prolog terms, so sets of
   nodes (`mark bits') can be maintained without needing anything
   exotic in the way of data structures.
-- it provides rapid movement to any node in the graph.
-- it is easier to see what's going on when debugging.
-- many more.

PS:  I'll be away for three weeks, so if there is discussion on this
topic, don't mistake my silence for agreement with _anyone_!

-- 
I agree with Jim Giles about many of the deficiencies of present UNIX.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/30/91)

In article <1991Jun27.094828@aifh.ed.ac.uk>, fcs@aifh.ed.ac.uk (Flavio
Soares Correa Da Silva) writes:
> In article <6505@goanna.cs.rmit.oz.au>, ok@goanna.cs.rmit.oz.au (Richard A.
> O'Keefe) writes:
> 
> # Another representation would be the
> # adjacency matrix, represented as a quad-tree, which works rather
> # nicely.
> 
> Could you explain this a bit more? 

I see that Flavio Soares Correa Da Silva is posting from DAI Edinburgh.
My last notes in Dr Bundy's Mathematical Reasoning Group's ``Blue Book''
are about matrix manipulation in Prolog.

Basically, the idea is that you represent an M-by-N matrix as
	matrix(M, N, QuadTree)
what does the QuadTree look like?
M = 1, N = 1	=> scalar(X_11)
M = 1, N > 1	=> row(X_11_to_X1J, X1K_to_X1N)		 1    J K   N
		   where J = N div 2, K = J+1		+-----+-----+ 1
M > 1, N = 1	=> col(X_11_to_XP1, XQ1_to_XM1)		|  A  |  B  | P
		   where P = M div 2, Q = P+1		+-----+-----+ 
M > 1, N > 1	=> matrix(A, B, C, D)			|  C  |  D  | Q
                   where the matrix is partitioned	+-----+-----+ M
		   as shown on the right.
This provides very efficient `bulk' operations on such matrices,
and logarithmic time access to individual elements.  It's a very nice
data structure for linear algebra operations, and has some useful
extensions.

Graph algorithms expressed in terms of the adjacency matrix can be
programmed using this data structure, and if they work by rows and/or
columns rather than by individual elements, they perform well.
-- 
I agree with Jim Giles about many of the deficiencies of present UNIX.