alanf@bruce.OZ (Alan Grant Finlay) (03/29/90)
.ce
.uh "Prolog in Icon version 2 : Program documentation"
.ce
Alan Finlay, Monash University, March 1990.
This version of Prolog is loosely based upon the various denotational
semantics for Prolog with which I have been acquainted\*[*\*]
.(f
* In particular: T. Nicholson and N. Foo. "A Denotational Semantics for Prolog",
to appear in ACM TOPLAS.
.)f
and discussion with colleagues at Monash University.
The motivation was to gain experience with Icon and to see if there was any
truth to the claim "It should only take about half a page of Icon to implement
a Prolog interpreter."
.sh1 "Data structures"
.ftCW
########################## global variables and types ######################
.nf
record ctxt(env,subst) # integer[string] * ((integer | struct | null) list)
record struct(name,args) # string * ((integer | struct) list)
record rule(ids,head,body)# string list * predicate * predicate list \\
record all(ids,body) # string list * predicate list } clauses
record one(ids,body) # string list * predicate list /
record fun(name,args) # string * predicate list \\ types of
record var(name) # string / predicates
global dbase # table of clauses indexed by head name
global consult # stack of files being consulted
global query # top level query
.ft
.fi
Despite the lack of enforced type discipline in Icon I have used some
self discipline. The record and global declarations above indicate types
using "|" for disjoint sum and shared field names to get a degree of type
inheritance.
Type "ctxt" is a context for goal resolution and consists of an environment
and a substitution. The environment is a table which maps variable identifiers
to variables. A variable is represented by an integer which is its position
in the substitution list. A substitution then effectively maps variables
to terms or unbound. The null value is used to represent an unbound variable.
A term is either another variable or a structure.
Type "struct" is used to represent structures which are either functors or
constants. Functors have a name and a list of arguments which are terms.
Constants are represented as functors with zero arguments.
The types for clauses and predicates are used to represent the syntax of
Prolog. I will use the words predicate and functor more or less
interchangeably since the interpreter doesn't distinguish between them
structurally. A consequence of this is that a variable may be used as
a goal.
The use of separate domains for syntactic predicates and their run time
equivalents (called structures above) is required because the syntax is
independent of any context whereas the "terms" referred to above are only
meaningful when considered together with an appropriate context. At various
stages in a computation a given clause or predicate will have to be used
in different contexts. There are a few idiosyncrasies of the syntactic
domains worth mentioning. Clauses have the redundant field "ids" which is a
list of all the variable identifiers used in the clause. This saves a lot
of recalculation during execution. The types "all" and "one" are respectively
queries where all or one solution are requested. Syntactically this is
signaled by using the turnstile ":-" for all solutions and "?-" for one.
As before type "fun" can be used with no arguments to indicate constants.
Facts are represented as rules with empty bodies.
The database is global and consists of a table of clause lists indexed
by the name of the predicate at the head. The index is used for efficiency
reasons only. The global "consult" is a stack of files being consulted,
used for nesting of consult commands. The top level query is global simply
to simplify the parser and could have been passed to the driver as a parameter
via procedure "prog".
.ne7
.sh1 "The driver"
.ftCW
################################## driver ##################################
procedure main()
dbase:=table([]); consult:=[&input] # empty dbase; standard input
while \\query | *consult>0 do { # more queries possible
prog() # parse clauses, possibly setting query as a side effect
if \\query then case type(query) of {
"all" : {every printsoln(query); write("no more solutions")}
"one" : if not printsoln(query) then write("no")}
else pop(consult)}
end
procedure printsoln(qry) # print first or next solution to qry
local ans,v
every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do {
writes("yes")
every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst))
suspend write()}
end
.ft
Most of procedure "main" is concerned with providing a usable interactive
interface and the details are of little interest. The parser is called
until a query is encountered and "printsoln" is used to resolve it and display
the answer substitution. The separate
procedure "printsoln" is required because it can generate all solutions and
may be activated once only or resumed until it fails depending on the type
of query.
The procedure "printsoln" simply uses procedure "resolve" to generate solution
contexts from an initial context determined by the query. This initial
context has an environment with all and only the free variables in the query
and a substitution in which they are unbound. This context is created by
calling procedure "newctxt" which will be described later.
.ne7
.sh1 "The interpreter proper"
.ftCW
########################### Prolog interpreter #############################
procedure resolve(qry,hd,tl,ctext) # generates all solutions of qry[hd:tl]
local sub,q # in given context, returns updated context
if hd>tl then return ctext
if (q:=qry[hd]).name=="~" then # negation by failure
{if not resolve(q.args,1,1,ctext) then suspend resolve(qry,hd+1,tl,ctext)}
else every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],ctext.subst) do
suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))
end
procedure tryclause(term,cls,sub) # resolves term using given clause or fails
local ctext # a copy of sub is used so no side effects
ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause
if unify(term,scanpred(cls.head,ctext),ctext.subst) then
suspend resolve(cls.body,1,*cls.body,ctext).subst
end
procedure scanpred(prd,ctext) # converts predicate to structure
local args; args:=[]
if type(prd)=="var" then return ctext.env[prd.name]
every put(args,scanpred(!prd.args,ctext))
return struct(prd.name,args)
end
.ft
The procedure "resolve" is a generator which produces all solutions to a sublist
of the supplied goal list "qry". This sublist is from element "hd" to element
.(f
* Lisp's variety of lists can be implemented in Icon but lack the useful
operators the inbuilt lists have. On the other hand the inbuilt list type
in Icon does not have a non copying "tail" or "cdr" operation.
.)f
"tl" and is passed in this way to save copying sublists\*[*\*]. The
resolution takes place with respect to the supplied context and
an updated context is returned as the answer.
Ignoring negation for the moment, "resolve" proceeds by resolving
the first goal in the list and for each solution uses a recursive call
to satisfy the rest of the goal list if possible and generates a result
for every case. The first goal is resolved by resuming "tryclause" for
each clause in the database matching the goal and "tryclause" itself being
a generator can be resumed several times for each clause. For those not
accustomed to Icon's procedure resumption conventions this is more clearly
expressed
.ftCW
q:=qry[hd]
every cls:=!dbase[q.name] do
every sub:=tryclause(scanpred(q,ctext),cls,ctext.subst) do
suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub))
.ft
Notice that the updated substitution return by "tryclause" is supplied
to "resolve" for the rest of the list hence the effects of goal resolution
within a clause body are cumulative. Another important point is to note that
the recursion bottoms out when the sublist of goals is empty and succeeds in
this case. It is tempting to try to use a goal generator instead of a goal
list as in
.ftCW
sub:=ctext.subst
every q:=!qry do
every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],sub)
return ctxt(ctext.env,sub)
.ft
This appears to save an explicit test for the end of the list but suffers
from a fatal flaw. Apart from only being able to generate one solution this
scheme finds the first solution provided there is one but otherwise
succeeds anyway with a partial solution.
Negation is simply handled as "negation as failure" by using Icon's "not"
operator which succeeds if and only if its argument fails. Since when
a negated goal
succeeds the substitution cannot be updated the original substitution is
passed to the remaining goals in this case.
The procedure "tryclause" first creates a context for resolving the supplied
term against the supplied clause. This context is based upon the supplied
substitution and all the free identifiers in the clause. The free identifiers
and the a copy of the substitution are used by "newctxt" to create a context
which has an environment for the identifiers as new variables and the
substitution extended with these new variables unbound. Denotational
semantics for Prolog may perform this task on the fly as a clause is
interpreted and this corresponds operationally to a great deal of recomputation.
Here the syntax parser generates a list of free variable identifiers (without
repetitions) only once and combined with the "newctxt" this avoids extending
the context in a piecemeal fashion.
The substitution passed to "newctxt" is a copy since "newctxt" and "unify"
cause side effects upon their substitution parameter. If these side effects
were eliminated it would require two copying operations to be performed on
very similar substitutions where only one is required. The unifier returns
only success or failure of the attempted unification but in the case of
success the supplied substitution is updated as a side effect.
The procedure "scanpred" simply converts a predicate from its syntactic form
into a term according to some relevant context. There are no side effects.
.ne7
.sh1 "Primitive domain operations and the parser"
.ftCW
######################## primitive domain operations ########################
procedure unify(t1,t2,sub) # (integer | struct),(integer | struct),sub
local v,i,num # side effect: sub is updated
if type(t1)=="integer" then {
while type(v:=sub[t1])=="integer" do t1:=v # apply sub to t1
return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2}
if type(t2)=="integer" then return unify(t2,t1,sub)
if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then {
every i:=1 to num do if not unify(t1.args[i],t2.args[i],sub) then fail
return}
end
procedure newctxt(ids,sub) # forms a new context by extending sub
local env; env:=table(&null) # to accommodate the unbound identifiers
every env[!ids]:=*put(sub,&null)
return ctxt(env,sub)
end
procedure trmstr(trm,sub) # converts a term to a string suitable for output
local s; s:=""
case type(trm) of {
"integer" : return trmstr(sub[trm],sub)
"struct" : if s:=lstr(trm,sub) then return "["||s||"]" # non-empty list
else {every s:=s||trmstr(!trm.args,sub)||","
return trm.name||(if *s=0 then "" else "("||s[1:-1]||")")}
"null" : return "undefined"}
end
procedure lstr(l,sub) # succeeds if l is a proper non-empty list and
local hd,tl # converts l to string suitable for output
if l.name=="." & *l.args=2 then {
hd:=trmstr(l.args[1],sub); tl:=l.args[2]
while type(tl)=="integer" do tl:=sub[tl] # apply sub to tl
case type(tl) of {
"struct" : {if tl.name=="nil" & *tl.args=0 then return hd # nil
return hd||","||lstr(tl,sub)} # cons
"null" : return "undefined"}}
end
############################## Prolog parser ###############################
procedure prog() # parses consult[1] until query found or end of file
query:=&null
while write(read(consult[1])) ? clause()
if /query & consult[1]~===&input then close(consult[1])
end
procedure clause() # adds a clause to the dbase or fails when query set
local p,b,ids,t; b:=[]; ids:=[]
if =":-" then query:=all(ids,b:=body())
else if ="?-" then query:=one(ids,b:=body())
else {p:=pred(); if =":-" then b:=body()}
if (t:=trim(tab(0)))~=="." then # syntax error
return write("syntax error: ",t,if *t=0 then "." else " not"," expected")
every extractids(ids,\\p|!b) # list of variable identifiers
if (\\p).name=="consult" then every push(consult,open((!p.args).name))
return dbase[(\\p).name]:=dbase[p.name]|||[rule(ids,p,b)]
end
procedure body() # list of predicates (may be empty)
local b; b:=[]
if put(b,pred()) then while ="," & put(b,pred())
return b
end
procedure dots() # converts non-empty body of list to cons cells
local p
if p:=pred() then if ="," then return fun(".",[p,dots()])
else return fun (".",[p,fun("nil",[])])
end
procedure pred() # ~pred , name(body) , uc_name , lc_name , [body] , pred|pred
local name,args,d,p,pp; args:=[]
if ="~" then p:=fun("~",[pred()])
else if name:=tab(many(&ucase++&lcase++'0123456789._')) then {
if any(&ucase,name) then p:=var(name)
else {if ="(" & args:=body() then check(")"); p:=fun(name,args)}}
else if ="[]" then p:=fun("nil",[]) # empty list abbreviation
else if ="[" then {p:=dots(); check("]")} # non-empty list abbreviation
if ="|" then if pp:=pred() then return fun(".",[p,pp]) # infix cons
else write("syntax error: missing second argument to \\"|\\"")
return \\p # n.b. fails if predicate invalid
end
procedure check(s) # report error if s not present or skip over it
if not =s then write("syntax error: ",s," expected before ",tab(0))
end
procedure extractids(ids,pred) # build the set of variable identifiers
if type(pred)=="fun" then every extractids(ids,!pred.args)
else if not (pred.name==!ids) then put(ids,pred.name)
return # the identifiers have been appended to reference parameter ids
end
.ft
The rest of the interpreter is supplied for completeness but is of little
intrinsic interest.
The unifier updates the supplied substitution. It is natural to consider
using the backtrackable assignment and hence avoid the need to copy
substitutions altogether. A preliminary investigation indicates that this
is feasible but space inefficient. This may be due to poor implementation
of backtrackable assignment in the particular Icon interpreter used\*[*\*].
.(f
* &version = "Icon Version 6.0. July 7, 1986." (University of Arizona)
.)f
The parser uses procedures which return parse trees except that "clause"
uses a global variable to return a top level query and fails when it
encounters one. This causes termination of string scanning in "prog".
The behaviour of the user interface is described in the user manual.