rpg@rex.cs.tulane.edu (Robert Goldman) (10/27/90)
I'm teaching a course in NLP here at Tulane, using James Allen's Natural Language Understanding as a text, and Prolog as the language of instruction.  I've written a small semantic processing system along the lines of the one Allen outlines in his book.  That is:

1.  Semantic processing is done on a completed parse.
2.  Semantic information (selection restrictions and object typing) is
    represented in isa hierarchies.
3.  The semantic representation is a quasi-logical form, centered around a
    representation of the head verb of the sentence, with a fixed number of
    verb cases filled by semantic representations of other sentence
    constituents.
4.  I have used feature structures, along the lines of the ones in Gazdar
    and Mellish's Natural Language Processing in Prolog, as a
    representation for the quasi-logical form.

I would like to give this semantic processor to my students to examine, and I would appreciate it if any of you could comment on the coding and let me know if I have committed any Prolog solecisms.

Some notes:

1.  I have not tried to make it terribly efficient, preferring clarity over
    efficiency where possible.
2.  I'm not terribly concerned about details of the syntactic representation
    that are not of immediate import.  E.g., I have bracketed the sentence in
    ways that have less to do with linguistic validity than with ease of
    translation into semantic representation, and played fast and loose with
    terms like nbar.
3.  I have not felt free to include, in the following source listing, code
    for the two predicates pathval/4 and unify/2.  They are predicates for
    finding values in feature structures and for unifying feature structures,
    and I have used code from Gazdar & Mellish, which is probably protected
    by copyright.  The comment for pathval is:

        pathval(Dag,Path,Value,Remainder)
        The pathvalue of Dag at Path is Value.  Remainder is Dag with the
        Path and Value removed.

    (A rough sketch of compatible definitions is appended after the archive.)
4.  The code is admittedly fragmentary: I have implemented only a small piece
    of Allen's suggested system.  E.g., there is no provision for embedded
    sentences or adjectives.

EDITORIAL COMMENT:
Quite frankly, I have had a fairly difficult time teaching this course using Allen's book and Prolog together.  I think I had good reasons for choosing each of these, but I wouldn't recommend the combination to anyone else.  Why?  Because the discussion in Allen's book gives a very procedural view, and it's very hard for me, let alone my students, to read his discussion, `declarativify' it, and then develop Prolog code based on that.

Following is a shar archive of the system.  Thanks for your patience with my long posting.  If there's interest, I will take the comments people send me, incorporate them in the code, and make the results available for anonymous ftp.

#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#               "End of shell archive."
# Contents:  gensym.pl hierarchy.pl sem.pl grammar.pl load.pl
#   lexicon.pl
# Wrapped by rpg@rex on Fri Oct 26 13:44:58 1990
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'gensym.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'gensym.pl'\"
else
echo shar: Extracting \"'gensym.pl'\" \(303 characters\)
sed "s/^X//" >'gensym.pl' <<'END_OF_FILE'
X
X/*
X    a utility predicate
X    new_atom(A)
X    A must be unbound.  Will be bound to a new name.
X    */
Xnew_atom(A) :- counter(N),
X    number_chars(N, L),
X    append("foo",L,Name),
X    atom_chars(A,Name).
X
Xcounter(N) :-
X    count(N),!,
X    retract(count(N)),
X    N1 is N + 1,
X    assert(count(N1)).
Xcounter(0) :-
X    assert(count(1)).
END_OF_FILE
if test 303 -ne `wc -c <'gensym.pl'`; then
    echo shar: \"'gensym.pl'\" unpacked with wrong size!
fi
# end of 'gensym.pl'
fi
if test -f 'hierarchy.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'hierarchy.pl'\"
else
echo shar: Extracting \"'hierarchy.pl'\" \(2690 characters\)
sed "s/^X//" >'hierarchy.pl' <<'END_OF_FILE'
X:- op(500, xfx, isa).
X
Xfeatures(Thing,Feats) :-
X    local_feats(Thing,Feats),
X    inherited_feats(Thing,Feats).
X
Xlocal_feats(Thing,Feats) :- feats(Thing,Feats),!.
Xlocal_feats(_,_).
X
Xinherited_feats(Thing,Feats) :-
X    Thing isa OtherThing,!,
X    features(OtherThing, NewFeats),
X    unify(Feats, NewFeats).
Xinherited_feats(_,_).
X
Xfindcase(Verb,Indicator,Case) :-
X    case(Indicator,Verb,Case).
Xfindcase(Verb,Indicator,Case) :-
X    Verb isa OtherVerb,!,      %single inheritance
X    findcase(OtherVerb,Indicator,Case).
X
X/*
X    feats(Type,FeatStr)
X    The features in FeatStr are characteristic of Type
X    */
X%features for things corresponding to nouns
Xfeats(physobj,[concrete:t|_]).
Xfeats(organic,[organic:t|_]).
Xfeats(inanimate,[organic:f|_]).
Xfeats(animate,[animate:t|_]).
Xfeats(vegetable,[animate:f|_]).
Xfeats(human,[human:t|_]).
Xfeats(dog,[human:f|_]).
Xfeats(cat,[human:f|_]).
X%features for verbs
Xfeats(actionverb,[agent:[animate:t|_],
X    experiencer:nil,
X    co_agent:[animate:t|_],
X    at_time:[time:t|_],
X    beneficiary:[human:t|_],
X    at_loc:[animate:f|_]|_]).
Xfeats(objaction,[theme:[physobj:t|_],
X    instrument:[animate:f|_]|_]).
Xfeats(put,[to_loc:[animate:f|_]|_]).
Xfeats(transfer,[to_poss:[animate:t|_]|_]).
Xfeats(interhuman,[agent:[human:t|_],
X    theme:[human:t|_]|_]).
Xfeats(break,[agent:[human:t|_],
X    theme:[animate:f|_]|_]).
Xfeats(unactionverb,[agent:nil|_]).
X
X
Xcase(subject,actionverb,agent).
Xcase(subject,objaction,instrument).
Xcase(subject,objaction,theme).
Xcase(dobj,objaction,theme).
Xcase(dobj,interhuman,theme).
Xcase(iobj,objaction,beneficiary).
Xcase(iobj,transfer,to_poss).
Xcase(by,objaction,agent).
Xcase(at,actionverb,at_loc).
Xcase(on,actionverb,at_loc).
Xcase(in,actionverb,at_loc).
Xcase(along,actionverb,at_loc).
Xcase(from,transfer,from_poss).
Xcase(from,transfer,from_loc).
Xcase(to,transfer,to_poss).
Xcase(to,transfer,to_loc).
Xcase(for,actionverb,beneficiary).
Xcase(with,actionverb,instrument).
Xcase(with,actionverb,co_agent).
Xcase(subject,unactionverb,experiencer).
Xcase(dobj,unactionverb,theme).
X
X
X/*
X    isa hierarchy
X    */
Xorganic isa physobj.
Xinanimate isa physobj.
X
Xanimate isa organic.
Xvegetable isa organic.
X
Xhuman isa animate.
Xdog isa animate.
Xcat isa animate.
X
Xtree isa vegetable.
Xplant isa vegetable.
X
Xrock isa inanimate.
Xcar isa inanimate.
Xbuilding isa inanimate.
Xhammer isa inanimate.
Xwindow isa inanimate.
X
Xhouse isa building.
Xapartment_complex isa building.
X
X%verb isa hierarchy
Xobjaction isa actionverb.
Xinstraction isa objaction.
Xinterhuman isa actionverb.
Xkiss isa interhuman.
Xlove isa interhuman.
Xput isa objaction.
Xtransfer isa objaction.
Xbreak isa objaction.
Xcry isa actionverb.
Xsee isa unactionverb.
END_OF_FILE
if test 2690 -ne `wc -c <'hierarchy.pl'`; then
    echo shar: \"'hierarchy.pl'\" unpacked with wrong size!
fi
# end of 'hierarchy.pl'
fi
if test -f 'sem.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'sem.pl'\"
else
echo shar: Extracting \"'sem.pl'\" \(3619 characters\)
sed "s/^X//" >'sem.pl' <<'END_OF_FILE'
X/*
X    SEM.PL
X    Rules for semantic processing
X    */
Xtranslate(String,Tree,Sem) :-
X    parse(Tree,Mood,String),
X    semantics(Tree,Mood,Sem).
X
Xsemantics(smaj(Tree),Mood,Sem) :- Mood \= wh_q,
X    Sem =.. [Mood,SSem],
X    sem_translate(Tree,SSem,nil).
X
Xsemantics(smaj(Tree),wh_q,wh_q(Whvar,SSem)) :-
X    new_atom(Whvar),
X    sem_translate(Tree,SSem,Whvar).
X
Xsem_translate(s(NPTree,VPTree),Sem,Whvar) :-
X    vp_sem(VPTree,NPSlot, Sem, Whvar),
X    np_sem(NPTree,NP,Whvar),
X    unify(NP,NPSlot),
X    new_atom(EventName),
X    unify([token:EventName|_],Sem).
X
X
X/*
X    vp_sem(VPTree, NPSlot, Sem, Whvar)
X    The semantics of VPTree is Sem.  A substructure of Sem is
X    NPSlot.  This is the substructure with which the semantic
X    representation of the subject np should be unified.
X    Whvar is an input variable indicating the variable corresponding
X    to the wh-word in a wh_q.
X    */
Xvp_sem(VPTree, NPSlot, Sem, Whvar) :-
X    head_verb(VPTree, Verb),
X    verb_sem(Verb,VerbSense,Sem),
X    findcase(VerbSense,subject,Case),
X    pathval(Sem,Case,NPSlot,_),
X    complement_semantics(VPTree,VerbSense,Sem,Whvar).
X
X/*
X    if there's a verb complement, it must have some
X    semantics.  If not, no problem.
X    */
Xcomplement_semantics(VPTree,VerbSense,Sem,Whvar) :-
X    verb_complement(VPTree,Comp),!,
X    comp_sem(Comp,VerbSense,Sem,Whvar).
X
Xhead_verb(vp(aux(_),v(Verb),compl(_)),Verb) :- !.
X
Xverb_complement(vp(aux(_),v(_),Comp),Comp) :- !.
X
X
X/*
X    comp_sem(Comp,VerbSense,Sem)
X    */
Xcomp_sem(compl(nil),_,_,_).
Xcomp_sem(compl(Tree1,Tree2),VerbSense,Sem,Whvar) :-
X    np_sem(Tree1,Iobj,Whvar),
X    findcase(VerbSense,iobj,Case1),
X    unify([Case1:Iobj|_],Sem),
X    np_sem(Tree2,Dobj,Whvar),
X    findcase(VerbSense,dobj,Case2),
X    unify([Case2:Dobj|_],Sem).
Xcomp_sem(compl(Tree),VerbSense,Sem,Whvar) :-
X    np_sem(Tree,Dobj,Whvar),
X    findcase(VerbSense,dobj,Case),
X    unify([Case:Dobj|_],Sem).
X
X
X
X
X/*
X    np_sem(NPTree,NPSem,Wh_var),
X    */
Xnp_sem(np(what),[token:Wh_var|_],Wh_var).
Xnp_sem(np(who),[token:Wh_var|_],Wh_var).
Xnp_sem(NPTree,Sem,Wh_var) :-
X    np_sem1(NPTree,Sem,Wh_var),
X    new_atom(A),
X    unify([token:A|_],Sem).
X
Xnp_sem1(np(Nbar,Mods), NbarSem, Wh_var) :-
X    np_mod_sem(Mods, ModsSem),
X    nbar_sem(Nbar, NbarSem, Wh_var),
X    unify([mods:ModsSem|_],NbarSem).
Xnp_sem1(np(Nbar), NbarSem, _) :-
X    nbar_sem(Nbar, NbarSem).
X
Xnbar_sem(nbar(prop_n(Name)),Sem) :- prop_noun_sem(Name,Sem).
Xnbar_sem(nbar(art(Art),n(Noun)), Sem) :-
X    art_sem(Art,ArtSem),
X    noun_sem(Noun,Sem),
X    unify(Sem,ArtSem).
Xnbar_sem(nbar(pn(PN)),Sem) :-
X    pronoun_sem(PN,Sem).
X
Xnp_mod_sem((Mod,Mods),Sem) :-
X    np_mod_sem(Mod,Sem1),
X    np_mod_sem(Mods,Sem),
X    unify(Sem1,Sem).
Xnp_mod_sem(pp(p(P),np(NPTree)),Sem) :-
X    pp_noun_mod(P, NPSlot, Sem),
X    np_sem(NPTree,NP,nil),
X    unify(NP,NPSlot).
X
Xprop_noun_sem(Name,Sem) :-
X    name_sem(Name,Sem),
X    features(human,PSem),
X    unify(Sem,PSem).
X
Xpronoun_sem(PN,Sem) :-
X    PN \= it,
X    noun_sem(person,PSem),
X    pnsem(PN,Sem),
X    unify(Sem,PSem).
Xpronoun_sem(it,Sem) :-
X    pnsem(it, Sem),
X    noun_sem(inanimate,NSem),
X    unify(Sem,NSem).
X
X/*
X    pp_noun_mod(Preposition, NPSlot, Structure)
X    The prep builds the semantic Structure;
X    a substructure of this semantic Structure is
X    NPSlot.  This is the place in the structure which should be
X    unified with the semantics of the np which is the argument
X    of the pp.
X    */
Xpp_noun_mod(at, NPSlot, [at_location:NPSlot|_]) :-
X    NPSlot = [organic:f|_].
X
Xpp_noun_mod(on, NPSlot, [on_location:NPSlot|_]) :-
X    NPSlot = [organic:f|_].
X
Xpp_noun_mod(in, NPSlot, [in_location:NPSlot|_]) :-
X    NPSlot = [organic:f|_].
X
X
Xnoun_sem(Noun,Sem) :- noun_sense(Noun,Sense),
X    features(Sense, Sem).
X
Xverb_sem(Verb,Sense,Sem) :- verb_sense(Verb,Sense),
X    features(Sense, Sem).
X
Xart_sem(the,[ref:def|_]).
Xart_sem(a,[ref:indef|_]).
X
X
X
X
X
X
END_OF_FILE
if test 3619 -ne `wc -c <'sem.pl'`; then
    echo shar: \"'sem.pl'\" unpacked with wrong size!
fi
# end of 'sem.pl'
fi
if test -f 'grammar.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'grammar.pl'\"
else
echo shar: Extracting \"'grammar.pl'\" \(2707 characters\)
sed "s/^X//" >'grammar.pl' <<'END_OF_FILE'
X/*
X    GRAMMAR.PL
X    The grammar we will use with semantic processing.
X*/
X
Xparse(Tree,Mood,String) :- smaj(Tree,Mood,String,[]).
X
X/*
X    smaj has been added to the grammar for the sake of
X    allowing it to later be expanded to include
X    clauses (embedded sentences), like the ones in our
X    verb complementation homework.
X    */
Xsmaj(smaj(Tree),Mood) --> s(Tree,Mood).
X
X/*
X    s(ParseTree,Mood)
X    */
Xs(s(NP,VP),decl) --> np(NP,NPnum,[]/[]),
X    vp(VP,VPnum,[]/[],[]/[]),
X    {num_agree(NPnum,VPnum,_)}.
X
Xs(s(NP,VP),yes-no-q) -->
X    aux(Aux,AuxNum,[]/[]),
X    np(NP,NPnum,[]/[]),
X    vp(VP,AuxNum,[Aux,AuxNum]/[],[]/[]),
X    {num_agree(NPnum,AuxNum,_)}.
X
X%A wh-question querying the subject is normal in
X%structure
Xs(s(NP,VP),wh_q) -->
X    wh(GapInfo),
X    np(NP,NPnum,[GapInfo]/[]),
X    vp(VP,VPNum,[]/[],[]/[]),
X    {num_agree(NPnum,VPNum,_)}.
X%querying anything else requires aux-inversion.
Xs(s(NP,VP),wh_q) -->
X    wh(GapInfo),
X    aux(Aux,AuxNum,[]/[]),
X    np(NP,NPnum,[]/[]),
X    vp(VP,AuxNum,[Aux,AuxNum]/[],[GapInfo]/[]),
X    {num_agree(NPnum,AuxNum,_)}.
X
X/*
X    vp(Tree,Number,AuxGap,GapInfo)
X    The vp rules have GapInfo as an argument because they have to
X    pass the gaps on to the np's that may be nested in them.
X    */
Xvp(vp(aux(Aux),v(V),Compl),Num,AuxGap,GapInfo) --> aux(Aux,Num,AuxGap),
X    verb(V, Num, Form),
X    {aux_agree(Aux,Form),
X    subcat(V,Subcat)},
X    compl(Compl,Subcat,GapInfo).
X
X
X/*
X    compl(Tree,Subcat,GapInfo)
X    verb complement with Tree representation, admitted by a verb with
X    subcategorization alternatives SubCat.
X    */
Xcompl(compl(nil),Subcat,X/X) --> {member(iv,Subcat)},[].
Xcompl(compl(NP),Subcat,GapInfo) --> {member(tv,Subcat)},
X    np(NP,_,GapInfo).
Xcompl(compl(NP1,NP2),Subcat,GapIn/GapOut) --> {member(bv,Subcat)},
X    np(NP1,_,GapIn/Gap1),
X    np(NP2,_,Gap1/GapOut).
X
X/*
X    np(ParseTree,Number,GapInfo)
X    */
Xnp(np(Nbar),Num,GapIn/GapOut) -->
X    nbar(Nbar,Num,GapIn/GapOut).
Xnp(np(Nbar,Mods),Num,GapIn/GapOut) -->
X    nbar(Nbar,Num,GapIn/Gap1),
X    npmods(Mods,Gap1/GapOut).
Xnp(np(Gap),_,[np(Gap)|Gaps]/Gaps) --> [].
X
X/*
X    npmods(TreeFragment,Gap)
X    for the moment, we allow only prepositional
X    phrases as modifiers
X    */
Xnpmods(NPmod,GapIn/GapOut) --> pp(NPmod,GapIn/GapOut).
Xnpmods((NPmod,NPmods),GapIn/GapOut) --> pp(NPmod,GapIn/Gap1),
X    npmods(NPmods,Gap1/GapOut).
X
X
X/*
X    nbar(Tree,Num,GapInfo)
X    */
Xnbar(nbar(art(Art),n(Noun)),Num,Gap/Gap) --> art(Art,ArtNum),
X    noun(Noun,NounNum),
X    {num_agree(ArtNum,NounNum,Num)}.
Xnbar(nbar(prop_n(Name)),[s3],Gap/Gap) --> proper_noun(Name).
Xnbar(nbar(pn(PN)),Num,Gap/Gap) --> pronoun(PN,Num).
X
X/*
X    pp(Tree, GapInfo)
X    */
Xpp(pp(p(Prep),np(NP)),GapIn/GapOut) --> prep(Prep),np(NP,_,GapIn/GapOut).
X
X/*
X    aux(Aux,Num,AuxGapInfo
X    */
Xaux(Aux,Num,[Aux,Num]/[]) --> [].
Xaux(Aux,Num,AuxGap/AuxGap) --> aux(Aux,Num).
END_OF_FILE
if test 2707 -ne `wc -c <'grammar.pl'`; then
    echo shar: \"'grammar.pl'\" unpacked with wrong size!
fi
# end of 'grammar.pl'
fi
if test -f 'load.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'load.pl'\"
else
echo shar: Extracting \"'load.pl'\" \(251 characters\)
sed "s/^X//" >'load.pl' <<'END_OF_FILE'
X/*
X    The load file for our parser with semantic analysis
X    */
X:-consult(library(not)).
X:-consult(library(sets)).
X:-consult(library(basics)).
X
X:-compile([dag_unify,gensym]).
X
X:- consult([grammar,lexicon,sem,hierarchy]).   %fs_utils
X
X:- dynamic count/1.
END_OF_FILE
if test 251 -ne `wc -c <'load.pl'`; then
    echo shar: \"'load.pl'\" unpacked with wrong size!
fi
# end of 'load.pl'
fi
if test -f 'lexicon.pl' -a "${1}" != "-c" ; then
  echo shar: Will not clobber existing file \"'lexicon.pl'\"
else
echo shar: Extracting \"'lexicon.pl'\" \(2521 characters\)
sed "s/^X//" >'lexicon.pl' <<'END_OF_FILE'
X/*
X    LEXICON.PL
X    */
X
X
X
X/*
X    noun(Lexeme,Num)
X    */
Xnoun(dog,[p3]) --> [dogs].
Xnoun(dog,[s3]) --> [dog].
Xnoun(hammer,[s3]) --> [hammer].
Xnoun(window,[s3]) --> [window].
X
X
X/*
X    art(Article,Num)
X    */
Xart(the,[s3,p3]) --> [the].
Xart(a,[s3]) --> [a].
X
X
X/*
X    verb(Root,Number-and-person,Form)
X    Form =  1 tenseless/present
X            2 -s form (3s)
X            3 past
X            4 present participle
X            5 past participle
X    */
Xverb(cry,[s1,p1,s2,p2,p3],1) --> [cry].
Xverb(cry,[s3],2) --> [cries].
Xverb(cry,_,4) --> [crying].
Xverb(see,[s1,p1,s2,p2,p3],1) --> [see].
Xverb(see,[s3],2) --> [sees].
Xverb(see,_,3) --> [saw].
Xverb(see,_,4) --> [seeing].
Xverb(see,_,5) --> [saw].
Xverb(love,[s3],2) --> [loves].
Xverb(love,[s1,p1,s2,p2,p3],1) --> [love].
Xverb(love,_,4) --> [loving].
Xverb(break,[s1,p1,s2,p2,p3],1) --> [break].
Xverb(break,[s3],2) --> [breaks].
Xverb(break,_,3) --> [broke].
Xverb(break,_,5) --> [broken].
X
Xproper_noun(john) --> [john].
Xproper_noun(jack) --> [jack].
Xproper_noun(mary) --> [mary].
X
X/*
X    pronoun(Pronoun,NumberandPerson
X    */
Xpronoun(i,[s1]) --> [i].
Xpronoun(we,[p1]) --> [we].
Xpronoun(you,[s2,p2]) --> [you].
Xpronoun(he,[s3]) --> [he].
Xpronoun(she,[s3]) --> [she].
Xpronoun(it,[s3]) --> [it].
Xpronoun(they,[p3]) --> [they].
X
Xprep(at) --> [at].
Xprep(on) --> [on].
Xprep(in) --> [in].
Xprep(along) --> [along].
Xprep(from) --> [from].
Xprep(to) --> [to].
Xprep(for) --> [for].
Xprep(by) --> [by].
Xprep(with) --> [with].
X
X
X/*
X    aux(AuxVerb,Number)
X    */
Xaux(do,[s1,p1,s2,p2,p3]) --> [do].
Xaux(do,[s3]) --> [does].
Xaux(be,[s1]) --> [am].
Xaux(be,[s3]) --> [is].
Xaux(be,[p1,s2,p2,p3]) --> [are].
Xaux(nil,_) --> [].
X
X/*
X    wh(GapIntroduced)
X    */
Xwh(np(who)) --> [who].
Xwh(np(what)) --> [what].
X
X/*
X    num_agree(N1,N2,N3)
X    N3 is the non-empty result of making N1 and N2 agree wrt
X    number.
X    */
Xnum_agree(N1,N2,N3) :- intersection(N1,N2,N3),
X    ( N3 = [] -> fail
X    | true
X    ).
X
X%do can be followed by the tenseless form
Xaux_agree(do,1).
X% be can be followed by the present participle
Xaux_agree(be,4).
X%participles not permitted without auxverb
Xaux_agree(nil,1).
Xaux_agree(nil,2).
Xaux_agree(nil,3).
X
Xsubcat(cry,[iv]).
Xsubcat(love,[tv]).
Xsubcat(see,[tv]).
Xsubcat(break,[iv,tv]).
X
X
X/*
X    Word sense entries
X    */
Xnoun_sense(dog,dog).
Xnoun_sense(hammer,hammer).
Xnoun_sense(window,window).
Xnoun_sense(rock,rock).
X
X
Xverb_sense(break,break).
Xverb_sense(cry,cry).
Xverb_sense(love,love).
Xverb_sense(kiss,kiss).
Xverb_sense(see,see).
X
Xname_sem(john, [name:john,gender:male|_]).
Xname_sem(jack, [name:jack,gender:male|_]).
Xname_sem(mary, [name:mary,gender:female|_]).
END_OF_FILE
if test 2521 -ne `wc -c <'lexicon.pl'`; then
    echo shar: \"'lexicon.pl'\" unpacked with wrong size!
fi
# end of 'lexicon.pl'
fi
if test ! -d '' ; then
    echo shar: Creating directory \"''\"
    mkdir ''
fi
echo shar: End of shell archive.
exit 0
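
As posted, the archive will not load: pathval/4 and unify/2 are missing, and load.pl expects to find them in a file dag_unify.pl.  For readers who do not have Gazdar & Mellish to hand, here is a rough sketch of definitions consistent with the comment in note 3 above and with the way sem.pl and hierarchy.pl call them.  It is a reconstruction of the general idea, not the book's code, and it treats a Path as a single feature name, which is all this program needs.

    /*
        dag_unify.pl (sketch only)
        Feature structures are open-ended lists of Feature:Value pairs,
        e.g.  [token:ev1, agent:[animate:t|_] | _].
    */

    %   unify(Dag1, Dag2)
    %   The two feature structures are compatible; each is extended with
    %   the features of the other.
    unify(Dag, Dag) :- !.               % plain unification suffices
    unify([Feature:Value|Rest], Dag) :-
        pathval(Dag, Feature, Value, StripDag),
        unify(Rest, StripDag).

    %   pathval(Dag, Path, Value, Remainder)
    %   The pathvalue of Dag at Path is Value.  Remainder is Dag with that
    %   Feature:Value pair removed.  If the feature is absent, it is added
    %   to the open tail of Dag.
    pathval([Feature:Value1|Rest], Feature, Value, Rest) :- !,
        unify(Value1, Value).
    pathval([Pair|Rest], Feature, Value, [Pair|Remainder]) :-
        pathval(Rest, Feature, Value, Remainder).

This appears adequate for the calls made in the posted code, but it is offered as a sketch only and is no substitute for the Gazdar & Mellish version.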
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (10/30/90)
In article <4691@rex.cs.tulane.edu>, rpg@rex.cs.tulane.edu (Robert Goldman) writes:
> 4.  I have used feature structures, along the lines of the ones in Gazdar
>     and Mellish's Natural Language Processing in Prolog, as a
>     representation for the quasi-logical form.

It's worth noting that their representation for feature structures (an improper list of Feature:Value pairs) is more than somewhat ugly.  A tiny touch of pre-processing can make the source code much clearer (declare feature clusters, e.g.

    :- features(case_frame, [agent,patient,beneficiary,...])

and then write rules that say e.g.

    p(Subj, Features, ...) --> {Features^[agent] = Subj}.

) and the code that actually _runs_ much faster (because unifications are done in-line *as* unifications, not as calls to a non-logical unify/3 or whatever it was).  I guess I should tidy up the code I gave my students for this and post it.

> I would like to give this semantic processor to my students to
> examine, and I would appreciate it if any of you could comment on the
> coding and let me know if I have committed any Prolog solecisms.

I hope you really meant that.

> EDITORIAL COMMENT:
> Quite frankly, I have had a fairly difficult time teaching this course
> using Allen's book and Prolog together.

I was doing the same thing exactly this year.  We ended up using rather little of Allen.  My students got a _lot_ of handouts to make up for it.

> X    a utility predicate
> X    new_atom(A)
> X    A must be unbound.  Will be bound to a new name.

The program appears to be using Quintus Prolog (use of library(basics) and the like).  What on earth was wrong with the existing library predicate gensym/1, or, if the "foo" prefix was so very important, gensym/2?

The :- dynamic declaration for count/1 should have been in the file gensym.pl (which is the _only_ file that has any business knowing about that predicate), not in load.pl.

There are actually two essentially unrelated things going on in the file 'hierarchy.pl'.  First, symbols are being mapped to classes.  The improper-list-of-pairs encoding of feature structures is a rather poor representation for types.  A much better representation is due to Chris Mellish.  Suppose we have the single-inheritance 'ako' tree

              a
            /   \
           b     e
          / \   / \
         c   d f   g

with the individuals cee: c, dee: d, eff: f, gee: g.  We would map the individuals to terms representing their types thus:

    type_of(cee, b(c(_))).
    type_of(dee, b(d(_))).
    type_of(eff, e(f(_))).
    type_of(gee, e(g(_))).

More generally, for each <class> / <parent> arc in the tree, we have

    class_type(<class>, T0, T) :-
        class_type(<parent>, <class>(T0), T).

The top of the hierarchy has the corresponding rule

    class_type(<top>, T, T).

When we have an individual <indiv> belonging to <class> we say

    indiv_type(<indiv>, T) :- class_type(<class>, _, T).

So here we have

    class_type(c, T0, T) :- class_type(b, c(T0), T).
    class_type(d, T0, T) :- class_type(b, d(T0), T).
    class_type(b, T0, T) :- class_type(a, b(T0), T).
    class_type(f, T0, T) :- class_type(e, f(T0), T).
    class_type(g, T0, T) :- class_type(e, g(T0), T).
    class_type(e, T0, T) :- class_type(a, e(T0), T).
    class_type(a, T, T).

    class_type(C, T) :- class_type(C, _, T).

    indiv_type(cee, T) :- class_type(c, T).
    indiv_type(dee, T) :- class_type(d, T).
    indiv_type(eff, T) :- class_type(f, T).
    indiv_type(gee, T) :- class_type(g, T).

This too is the kind of thing that can be done rather neatly by a preprocessor.
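
One way such a preprocessor might look (a sketch for illustration only, not the code referred to above; the ako/2 and ako_root/1 fact names are assumptions) is to record the tree as plain facts and let term_expansion/2, which is user-definable in Quintus Prolog, compile each fact into the corresponding class_type clause as the file is loaded:

    %   Sketch only.  ako(Class, Parent) records one arc of the tree;
    %   ako_root(Top) names the top of the hierarchy.
    term_expansion(ako(Class, Parent),
                   (class_type(Class, T0, T) :- class_type(Parent, Step, T))) :-
        Step =.. [Class, T0].

    %   The root gets the terminating clause.
    term_expansion(ako_root(Top), class_type(Top, T, T)).

Writing ako(c, b) in a source file then yields class_type(c, T0, T) :- class_type(b, c(T0), T), and (=..)/2 does its work once at load time instead of at run time.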
Now, imagine that we want to say that a particular verb must have an animate subject.  We might say

    may_fill(subject, see, X) :-
        class_type(animate, T),
        indiv_type(X, T).       % X's type is compatible with T

where the class_type/2 call can be preprocessed away.

Chris Mellish pointed out that this scheme generalises to systems where multiple classifications apply to the same thing.  For example, something of type "agreement" might be classified according to "person", "number", and "gender", so we might have

    agreement(1 | 2 | 3,  s | p,  m | f | n)

With that scheme, we can easily represent things like

    agreement(_,p,_)    "plural"
    agreement(3,_,f)    "third person feminine"

and combine them:

    agreement(3,p,f)    "third person plural feminine"

This can be much more economical, and is in my view much clearer, than lists of unstructured atoms.
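
To see what that buys against the posted grammar (a sketch by way of illustration, not something from either posting): the number-and-person lists such as [s3,p3] in lexicon.pl could become partially instantiated agr(Person, Number) terms, and agreement then is ordinary unification, with no num_agree/3 and no intersection/3 at all.

    %   Sketch only: agr(Person, Number), Person in 1..3, Number in
    %   {sing,plur}; a dimension the word does not constrain is left
    %   as a variable.
    noun(dog,  agr(3, sing)) --> [dog].
    noun(dog,  agr(3, plur)) --> [dogs].
    art(a,     agr(3, sing)) --> [a].
    art(the,   agr(3, _))    --> [the].    % 'the' is unmarked for number
    pronoun(you, agr(2, _))  --> [you].    % 'you' is unmarked for number

    %   The nbar rule then just shares one Agr variable: if the article
    %   and the noun cannot agree, the rule fails by unification alone.
    nbar(nbar(art(Art),n(Noun)), Agr, Gap/Gap) -->
        art(Art, Agr),
        noun(Noun, Agr).

Forms like the posted [s1,p1,s2,p2,p3] ("anything except third singular") do not collapse into a single agr term this way; they would need either two lexical entries or a richer encoding, so this is a simplification of the posted lexicon rather than a drop-in replacement.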
MAKE UNIFICATION WORK FOR YOU!

As a particular example of doing things clearly with terms instead of pounding away on lists, consider the complements of a verb phrase.  Goldman's program does

    vp(...) --> ...
        {subcat(V, Subcat)},
        compl(..., Subcat, Gap).

    compl(compl(nil),Subcat,X/X) --> {member(iv,Subcat)},[].
    compl(compl(NP),Subcat,GapInfo) --> {member(tv,Subcat)},
        np(NP,_,GapInfo).
    compl(compl(NP1,NP2),Subcat,GapIn/GapOut) --> {member(bv,Subcat)},
        np(NP1,_,GapIn/Gap1),
        np(NP2,_,Gap1/GapOut).

where subcat/2 returns a subset of {iv,tv,bv} represented as a list.  But why use a list here?  Suppose instead that we represent the verb subcategorisation as a triple

    v(i | 0,  t | 0,  b | 0)

where i, t, b mean that the verb _can_ be used as an intransitive, transitive, or ditransitive-or-benefactive respectively, and 0 in a particular slot means it can't.  Let's move this information to the front as well: it is always a good idea to have the argument which we're dispatching on be the first, so that a human reader has the least possible trouble finding it.  Then we get

    compl(v(i,_,_), comp0,        Gap,  Gap) --> [].
    compl(v(_,t,_), comp1(NP),    Gap0, Gap) --> np(NP, _, Gap0, Gap).
    compl(v(_,_,b), comp2(N1,N2), Gap0, Gap) --> np(N1, _, Gap0, Gap1),
                                                 np(N2, _, Gap1, Gap).

There's a lot of left-over Lisp in the code.  For example, there's a rule that starts out

    s(s(NP,VP),yes-no-q) --> ...

Now yes-no-q is a perfectly good Lisp atom (it's a spelling of |YES-NO-Q|), but it is a compound term in Prolog: -(-(yes,no),q).  Why does that matter?  Because a later rule tries to use it as a function symbol!

A rather worse hangover (and if it isn't a headache now, it soon will be) from Lisp is the use of 'nil' as a "default" or "absent" marker.  Here's a particularly important case.

    translate(String,Tree,Sem) :-
        parse(Tree,Mood,String),
        semantics(Tree,Mood,Sem).

    semantics(smaj(Tree),Mood,Sem) :- Mood \= wh_q,
        Sem =.. [Mood,SSem],
        sem_translate(Tree,SSem,nil).

    semantics(smaj(Tree),wh_q,wh_q(Whvar,SSem)) :-
        new_atom(Whvar),
        sem_translate(Tree,SSem,Whvar).

In both of the calls to sem_translate/3 we pass an atom as the last argument.  An atom spelled "nil" means "there isn't any Wh-variable".  An atom spelled "foo123" or the like means "there is a Wh-variable called foo123".  That is NOT good Prolog coding practice.  What are the situations, and what are the associated data?

    - there is a Wh-variable X
    - there is no Wh-variable

Invent names for these situations, and make the associated data the arguments of appropriate terms:

    - var(X) means there is a Wh-variable X
    - novar  means there is no Wh-variable

Then later on we'll be able to ask "was there a Wh-variable" by doing Wh = var(_) instead of by doing Wh \== nil.

That's far from the only problem here.  The program does "Mood \= wh_q" in order to test whether Mood is decl or yn_q (assuming that yes-no-q should have been yn_q).  There is no point in using (\=)/2 here; it would be better to use the built-in predicate (\==)/2.  But it's better still to say exactly what you do mean, so that a human reader can see what the possible cases for Mood are.  (The use of (=..)/2 is a fairly reliable cue that something rather strange is going on.  This is the bit that breaks if Mood is yes-no-q.)  The variable names aren't too good either: there isn't any String here, but there _is_ a list of Words.

    translate(Words, Tree, Sem) :-
        parse(Tree, Mood, Words),
        semantics(Mood, Tree, Sem).

    semantics(decl, smaj(Tree), decl(Sem)) :-
        sem_translate(Tree, Sem, novar).
    semantics(yn_q, smaj(Tree), yn_q(Sem)) :-
        sem_translate(Tree, Sem, novar).
    semantics(wh_q, smaj(Tree), wh_q(WhVar,Sem)) :-
        gensym(WhVar),
        sem_translate(Tree, Sem, var(WhVar)).

And so it goes.  It would improve the program a _lot_ to have a comment which says exactly what a Tree or a Sem can look like.

There's a lot more that could be said.  One thing that _does_ need to be said is that I was very pleased to see this posting, and I've put a copy of it where my students can get at it.  Never mind the flaws, at least it's _there_ and it's a place to _start_.  Much the same can be said about the code in the Gazdar & Mellish book; the code there isn't very good, but it's _there_ and is a place to _start_, whereas Allen leaves you pretty much on your own.
-- 
The problem about real life is that moving one's knight to QB3
may always be replied to with a lob across the net.  --Alasdair Macintyre.