markh@csd4.milw.wisc.edu (Mark William Hopkins) (07/30/89)
Unification is an algorithm that forms the basis of Logic programming languages, theorem provers, and type checking systems in functional languages. Until recently it had been assumed that there was no way to efficiently implement an algorithm this an "occurrence check" that checked for cyclic references. This is an outline of an algorithm, dating from the early 1980's, that does just that. It keeps track of cycles in an incremental fashion and so has a little overhead as compared to the traditional algorithm. The occurrence check is a constant-time operation. UNIFICATION To unify two expression U, V means to find substitutions for the variables occurring in U and V so that the two expression become identical. For example, unifying f(X, Y) with f(g(Y), Z) is done by binding: X to g(Y), and Y and Z together. to get: f(g(Y), Y). Unification can be seen as an algorithm for solving a system of equations. In setting up the algorithm to unify expressions U = V, a list, L, of unifications yet-to-be-done is set up with (U = V) as its only member. Each variable, X, occurring in U and V, is set to point to an equation, EQ(X), of the form: Variable-List = Expression: Occurrences where Occurrences keeps track of the number of times any variable on the left-hand side occurs on the right-hand side of any other equation. This will keep track of cyclic references. Initially, EQ(X) is set to the equation: [V] = nil: 0. Different lists, E0, E1, are kept for those equations with zero-occurrence counts and those without. (i) Set L to the list [U = V], For all variables X in U, V, set EQ(X) to [X] = nil: 0 and add this equation to the list E0. (ii) while L is not empty: remove an equation X1 = X2 from L; * if X1, X2 are variables then access their equations -- EQ(X1): VL1 = Ex1: Occ1 EQ(X2): VL2 = Ex2: Occ2 and MERGE them: EQ(X1) = EQ(X2): VL1 union VL2 = (Ex1 | Ex2): Occ1 + Occ2. >>> (iii) To calculate Ex = Ex1 | Ex2: >>> * if Ex1 = nil then Ex = Ex2, increment variables in Ex2, >>> * if Ex2 = nil then Ex = Ex1, increment variables in Ex1, >>> * else: >>> Ex = Ex1, add (Ex1 = Ex2) to the unify list, L. >>> (iv) To increment variables in an expression Ex: >>> for all variables, X, occurring in Ex: >>> access EQ(X): VL = Exp: Occ >>> and add 1 to Occ. >>> if the equation was in the zero-count list E0 (Occ was 0), >>> then move it over to the non-zero-count list E1. * if X1 is a variable and X2 an expression Ex2, access X1's equation -- EQ(X1): VL = Ex1: Occ and insert Ex2 on the right: EQ(X1): VL = (Ex1 | Ex2): Occ * if X2 is a variable and X1 an expression Ex1, access X2's equation -- EQ(X2): VL = Ex2: Occ and insert Ex1 on the right: EQ(X2): VL = (Ex1 | Ex2): Occ * if X1 and X2 are constants, check for their equality and FAIL if the test fails. * if X1 and X2 are functions X1 = f(y1, ..., ym), X2 = g(z1, ..., zn) check for equality f = g, m = n and FAIL if the test fails. Otherwise add the arguments: y1 = z1, ..., yn = zn to the list, L, of unifications. * else X1 is a function, X2 a constant, of vica versa. FAIL. If the list, L, can be exhausted without encountering a FAIL, then unification succeeds (so far). (v) while the zero-count list, E0, is not empty: remove an equation VL = Exp: 0 from the list; for all variables, X, occurring on the right-hand side, Exp decrement X. >>> (vi) To derement a variable, X: >>> access the equation EQ(X): VL = Exp: Occ, >>> and subtract 1 from Occ. >>> If Occ is 0 as a result, then move the equation >>> from the non-zero-list, E1, to the zero-list E0. (vii) OCCURRENCE CHECK: If there are any equations left with non-zero counts, in E1, then some of them contain cyclic references. ------------------------------------------------------------ EXAMPLE: Unify f(X, Y) with f(g(Y), Z) STEP (i) L is initially the list: (0): f(X, Y) = f(g(Y), Z) EQ(X) is [X] = nil: 0, EQ(Y) is [Y] = nil: 0, EQ(Z) is [Z] = nil: 0. STEP (ii) Process element (0) of the list, L, * Add respective arguments to the list: * (1): X = g(Y), * (2): Y = Z. Process element (1) of the list, L, Modify EQ(X): EQ(X) is [X] = g(Y): 0, * Increment all variables in g(Y): * EQ(Y) is now [Y] = nil: 1. Process element (2) of L, Merge EQ(Y): [Y] = nil: 1, EQ(Z): [Z] = nil: 0 into EQ(Y) = EQ(Z): [Y, Z] = nil: 1. No more equations are left in L. STEP (v) The lists E0 and E1 are: E0: (0) [X] = g(Y): 0. E1: (0) [Y, Z] = nil: 1. Process element (0) of E0: X = g(Y): 0. * decrement all variables in g(Y): * [Y, Z] = nil: 1 ==> [Y, Z] = nil: 0. * The equation is now deleted from E1, and added to the list E0: * (1) [Y, Z] = nil: 0. Process element (1) of E0: [Y, Z] = nil: 0. * no variables on the right-hand side. No equations left to process in E0. STEP (vii) OCCURRENCE CHECK: No equations left in E1 -- no cyclic references. The result of unifying f(X, Y) with f(g(Y), Z) is the set of equations: [X] = g(Y), [Y, Z] = nil. The resulting unified expression is obtained by equating variables that occur in the left-hand sides of the same equation: f(X, Y) => f(X, Y) f(g(Y), Z) => f(g(Y), Y) and then "dereferencing" those variables with non-empty right-hand sides: f(X, Y) => f(g(Y), Y) f(g(Y), Y) => f(g(Y), Y) ------------------------------------------------------------ EXAMPLE: Unify f(X, Y) with f(g(Y), X). This will bind X to g(Y) (first argument) and Y to X (second argument) and thus create a cycle. STEP(i) L: (0) f(X, Y) = f(g(Y), X). E0: (0) X = nil: 0 (1) Y = nil: 0 (2) Z = nil: 0 STEP(ii) Process L(0): * Add * (1) X = g(Y) * (2) Y = X * to L. Process L(1): * Insert g(Y) into equation * EQ(X): X = nil: 0 * to get * EQ(X): X = g(Y): 0 * and then increment Y: * EQ(Y): Y = nil: 1. Process L(2): * Merge equations * EQ(X): X = g(Y): 0 * EQ(Y): Y = nil: 1 * to get * EQ(X) = EQ(Y): [X, Y] = g(Y): 1 STEP (v) The list E0 is empty, the list E1 has only the equation (0) [X, Y] = g(Y): 1 in it. There is no processing to be done, since E0 is empty. STEP (vii) OCCURRENCE CHECK The list E1 is not empty so there is a cyclic reference. The cyclic reference is in the variable Y of the equation: [X, Y] = g(Y): 1
jha@lfcs.ed.ac.uk (Jamie Andrews) (08/08/89)
Questions on Mark's unification algorithm: - How efficient is it on the size of the input? You mention only that the occurs check is done in constant time. If it is linear, how does it differ from Martelli & Montanari's algorithm? - How space-efficient is it? This is where linear unification algorithms usually fall down, and the reason that they are not implemented very often. --Jamie. jha@lfcs.ed.ac.uk "It's normal to be lazy" -- G. Huet