alanf@bruce.OZ (Alan Grant Finlay) (03/29/90)
.ce .uh "Prolog in Icon version 2 : Program documentation" .ce Alan Finlay, Monash University, March 1990. This version of Prolog is loosely based upon the various denotational semantics for Prolog with which I have been acquainted\*[*\*] .(f * In particular: T. Nicholson and N. Foo. "A Denotational Semantics for Prolog", to appear in ACM TOPLAS. .)f and discussion with colleagues at Monash University. The motivation was to gain experience with Icon and to see if there was any truth to the claim "It should only take about half a page of Icon to implement a Prolog interpreter." .sh1 "Data structures" .ftCW ########################## global variables and types ###################### .nf record ctxt(env,subst) # integer[string] * ((integer | struct | null) list) record struct(name,args) # string * ((integer | struct) list) record rule(ids,head,body)# string list * predicate * predicate list \\ record all(ids,body) # string list * predicate list } clauses record one(ids,body) # string list * predicate list / record fun(name,args) # string * predicate list \\ types of record var(name) # string / predicates global dbase # table of clauses indexed by head name global consult # stack of files being consulted global query # top level query .ft .fi Despite the lack of enforced type discipline in Icon I have used some self discipline. The record and global declarations above indicate types using "|" for disjoint sum and shared field names to get a degree of type inheritance. Type "ctxt" is a context for goal resolution and consists of an environment and a substitution. The environment is a table which maps variable identifiers to variables. A variable is represented by an integer which is its position in the substitution list. A substitution then effectively maps variables to terms or unbound. The null value is used to represent an unbound variable. A term is either another variable or a structure. Type "struct" is used to represent structures which are either functors or constants. Functors have a name and a list of arguments which are terms. Constants are represented as functors with zero arguments. The types for clauses and predicates are used to represent the syntax of Prolog. I will use the words predicate and functor more or less interchangeably since the interpreter doesn't distinguish between them structurally. A consequence of this is that a variable may be used as a goal. The use of separate domains for syntactic predicates and their run time equivalents (called structures above) is required because the syntax is independent of any context whereas the "terms" referred to above are only meaningful when considered together with an appropriate context. At various stages in a computation a given clause or predicate will have to be used in different contexts. There are a few idiosyncrasies of the syntactic domains worth mentioning. Clauses have the redundant field "ids" which is a list of all the variable identifiers used in the clause. This saves a lot of recalculation during execution. The types "all" and "one" are respectively queries where all or one solution are requested. Syntactically this is signaled by using the turnstile ":-" for all solutions and "?-" for one. As before type "fun" can be used with no arguments to indicate constants. Facts are represented as rules with empty bodies. The database is global and consists of a table of clause lists indexed by the name of the predicate at the head. The index is used for efficiency reasons only. The global "consult" is a stack of files being consulted, used for nesting of consult commands. The top level query is global simply to simplify the parser and could have been passed to the driver as a parameter via procedure "prog". .ne7 .sh1 "The driver" .ftCW ################################## driver ################################## procedure main() dbase:=table([]); consult:=[&input] # empty dbase; standard input while \\query | *consult>0 do { # more queries possible prog() # parse clauses, possibly setting query as a side effect if \\query then case type(query) of { "all" : {every printsoln(query); write("no more solutions")} "one" : if not printsoln(query) then write("no")} else pop(consult)} end procedure printsoln(qry) # print first or next solution to qry local ans,v every ans:=resolve(qry.body,1,*qry.body,newctxt(qry.ids,[])) do { writes("yes") every v:=!qry.ids do writes(", ",v,"=",trmstr(ans.env[v],ans.subst)) suspend write()} end .ft Most of procedure "main" is concerned with providing a usable interactive interface and the details are of little interest. The parser is called until a query is encountered and "printsoln" is used to resolve it and display the answer substitution. The separate procedure "printsoln" is required because it can generate all solutions and may be activated once only or resumed until it fails depending on the type of query. The procedure "printsoln" simply uses procedure "resolve" to generate solution contexts from an initial context determined by the query. This initial context has an environment with all and only the free variables in the query and a substitution in which they are unbound. This context is created by calling procedure "newctxt" which will be described later. .ne7 .sh1 "The interpreter proper" .ftCW ########################### Prolog interpreter ############################# procedure resolve(qry,hd,tl,ctext) # generates all solutions of qry[hd:tl] local sub,q # in given context, returns updated context if hd>tl then return ctext if (q:=qry[hd]).name=="~" then # negation by failure {if not resolve(q.args,1,1,ctext) then suspend resolve(qry,hd+1,tl,ctext)} else every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],ctext.subst) do suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub)) end procedure tryclause(term,cls,sub) # resolves term using given clause or fails local ctext # a copy of sub is used so no side effects ctext:=newctxt(cls.ids,copy(sub)) # preallocate context for whole clause if unify(term,scanpred(cls.head,ctext),ctext.subst) then suspend resolve(cls.body,1,*cls.body,ctext).subst end procedure scanpred(prd,ctext) # converts predicate to structure local args; args:=[] if type(prd)=="var" then return ctext.env[prd.name] every put(args,scanpred(!prd.args,ctext)) return struct(prd.name,args) end .ft The procedure "resolve" is a generator which produces all solutions to a sublist of the supplied goal list "qry". This sublist is from element "hd" to element .(f * Lisp's variety of lists can be implemented in Icon but lack the useful operators the inbuilt lists have. On the other hand the inbuilt list type in Icon does not have a non copying "tail" or "cdr" operation. .)f "tl" and is passed in this way to save copying sublists\*[*\*]. The resolution takes place with respect to the supplied context and an updated context is returned as the answer. Ignoring negation for the moment, "resolve" proceeds by resolving the first goal in the list and for each solution uses a recursive call to satisfy the rest of the goal list if possible and generates a result for every case. The first goal is resolved by resuming "tryclause" for each clause in the database matching the goal and "tryclause" itself being a generator can be resumed several times for each clause. For those not accustomed to Icon's procedure resumption conventions this is more clearly expressed .ftCW q:=qry[hd] every cls:=!dbase[q.name] do every sub:=tryclause(scanpred(q,ctext),cls,ctext.subst) do suspend resolve(qry,hd+1,tl,ctxt(ctext.env,sub)) .ft Notice that the updated substitution return by "tryclause" is supplied to "resolve" for the rest of the list hence the effects of goal resolution within a clause body are cumulative. Another important point is to note that the recursion bottoms out when the sublist of goals is empty and succeeds in this case. It is tempting to try to use a goal generator instead of a goal list as in .ftCW sub:=ctext.subst every q:=!qry do every sub:=tryclause(scanpred(q,ctext),!dbase[q.name],sub) return ctxt(ctext.env,sub) .ft This appears to save an explicit test for the end of the list but suffers from a fatal flaw. Apart from only being able to generate one solution this scheme finds the first solution provided there is one but otherwise succeeds anyway with a partial solution. Negation is simply handled as "negation as failure" by using Icon's "not" operator which succeeds if and only if its argument fails. Since when a negated goal succeeds the substitution cannot be updated the original substitution is passed to the remaining goals in this case. The procedure "tryclause" first creates a context for resolving the supplied term against the supplied clause. This context is based upon the supplied substitution and all the free identifiers in the clause. The free identifiers and the a copy of the substitution are used by "newctxt" to create a context which has an environment for the identifiers as new variables and the substitution extended with these new variables unbound. Denotational semantics for Prolog may perform this task on the fly as a clause is interpreted and this corresponds operationally to a great deal of recomputation. Here the syntax parser generates a list of free variable identifiers (without repetitions) only once and combined with the "newctxt" this avoids extending the context in a piecemeal fashion. The substitution passed to "newctxt" is a copy since "newctxt" and "unify" cause side effects upon their substitution parameter. If these side effects were eliminated it would require two copying operations to be performed on very similar substitutions where only one is required. The unifier returns only success or failure of the attempted unification but in the case of success the supplied substitution is updated as a side effect. The procedure "scanpred" simply converts a predicate from its syntactic form into a term according to some relevant context. There are no side effects. .ne7 .sh1 "Primitive domain operations and the parser" .ftCW ######################## primitive domain operations ######################## procedure unify(t1,t2,sub) # (integer | struct),(integer | struct),sub local v,i,num # side effect: sub is updated if type(t1)=="integer" then { while type(v:=sub[t1])=="integer" do t1:=v # apply sub to t1 return if type(v)=="struct" then unify(v,t2,sub) else sub[t1]:=t2} if type(t2)=="integer" then return unify(t2,t1,sub) if (t1.name==t2.name) & ((num:=*t1.args)=*t2.args) then { every i:=1 to num do if not unify(t1.args[i],t2.args[i],sub) then fail return} end procedure newctxt(ids,sub) # forms a new context by extending sub local env; env:=table(&null) # to accommodate the unbound identifiers every env[!ids]:=*put(sub,&null) return ctxt(env,sub) end procedure trmstr(trm,sub) # converts a term to a string suitable for output local s; s:="" case type(trm) of { "integer" : return trmstr(sub[trm],sub) "struct" : if s:=lstr(trm,sub) then return "["||s||"]" # non-empty list else {every s:=s||trmstr(!trm.args,sub)||"," return trm.name||(if *s=0 then "" else "("||s[1:-1]||")")} "null" : return "undefined"} end procedure lstr(l,sub) # succeeds if l is a proper non-empty list and local hd,tl # converts l to string suitable for output if l.name=="." & *l.args=2 then { hd:=trmstr(l.args[1],sub); tl:=l.args[2] while type(tl)=="integer" do tl:=sub[tl] # apply sub to tl case type(tl) of { "struct" : {if tl.name=="nil" & *tl.args=0 then return hd # nil return hd||","||lstr(tl,sub)} # cons "null" : return "undefined"}} end ############################## Prolog parser ############################### procedure prog() # parses consult[1] until query found or end of file query:=&null while write(read(consult[1])) ? clause() if /query & consult[1]~===&input then close(consult[1]) end procedure clause() # adds a clause to the dbase or fails when query set local p,b,ids,t; b:=[]; ids:=[] if =":-" then query:=all(ids,b:=body()) else if ="?-" then query:=one(ids,b:=body()) else {p:=pred(); if =":-" then b:=body()} if (t:=trim(tab(0)))~=="." then # syntax error return write("syntax error: ",t,if *t=0 then "." else " not"," expected") every extractids(ids,\\p|!b) # list of variable identifiers if (\\p).name=="consult" then every push(consult,open((!p.args).name)) return dbase[(\\p).name]:=dbase[p.name]|||[rule(ids,p,b)] end procedure body() # list of predicates (may be empty) local b; b:=[] if put(b,pred()) then while ="," & put(b,pred()) return b end procedure dots() # converts non-empty body of list to cons cells local p if p:=pred() then if ="," then return fun(".",[p,dots()]) else return fun (".",[p,fun("nil",[])]) end procedure pred() # ~pred , name(body) , uc_name , lc_name , [body] , pred|pred local name,args,d,p,pp; args:=[] if ="~" then p:=fun("~",[pred()]) else if name:=tab(many(&ucase++&lcase++'0123456789._')) then { if any(&ucase,name) then p:=var(name) else {if ="(" & args:=body() then check(")"); p:=fun(name,args)}} else if ="[]" then p:=fun("nil",[]) # empty list abbreviation else if ="[" then {p:=dots(); check("]")} # non-empty list abbreviation if ="|" then if pp:=pred() then return fun(".",[p,pp]) # infix cons else write("syntax error: missing second argument to \\"|\\"") return \\p # n.b. fails if predicate invalid end procedure check(s) # report error if s not present or skip over it if not =s then write("syntax error: ",s," expected before ",tab(0)) end procedure extractids(ids,pred) # build the set of variable identifiers if type(pred)=="fun" then every extractids(ids,!pred.args) else if not (pred.name==!ids) then put(ids,pred.name) return # the identifiers have been appended to reference parameter ids end .ft The rest of the interpreter is supplied for completeness but is of little intrinsic interest. The unifier updates the supplied substitution. It is natural to consider using the backtrackable assignment and hence avoid the need to copy substitutions altogether. A preliminary investigation indicates that this is feasible but space inefficient. This may be due to poor implementation of backtrackable assignment in the particular Icon interpreter used\*[*\*]. .(f * &version = "Icon Version 6.0. July 7, 1986." (University of Arizona) .)f The parser uses procedures which return parse trees except that "clause" uses a global variable to return a top level query and fails when it encounters one. This causes termination of string scanning in "prog". The behaviour of the user interface is described in the user manual.