hansteb@cs.vu.nl (Tebra Hans) (11/18/89)
Next tuesday, at November 21, I will defend my thesis, titled "Optimistic AND-parallelism in Prolog" A number of copies are available to interested readers in the logic programming and parallel-processing field. Softcover, 180 pages. I am willing to send copies for U.S. $15 by surface mail. This price covers postage and printing costs. To order a copy: Transfer U.S. $15 to bank account number 65.09.12.713 of Hans Tebra, Bank: N.M.B; P.O Box 1800; Amsterdam Holland. Or, send a check to Hans Tebra, N.M.B. Bank, P.O. Box 1800 Amsterdam, Holland, Location KC 03.02. SUMMARY OF THE THESIS In this thesis, a new method for parallel execution of the logic programming language Prolog is presented. In addition to a detailed discussion of algorithms, a realistic implementation of the method is given. The thesis concludes with an observation of the strong and weak points of the method. In chapter 2, the principles of executing a logic program are described; they are centered around the notions of resolution, unification and backtracking. For the first practical logic language, called Prolog, the implementation in a conventional computer system is illustrated by means of program fragments that reflect the resolution and unification of Prolog. Chapter 3 covers the methods of obtaining faster execution by means of parallel methods. The main categories are AND and OR parallelism. They are orthogonal, i.e. each of the methods can be implemented without concern of the details of the other method. A number of logic programming languages have been designed that have the notion of parallelism incorporated in the language. The languages that are most widely known are discussed in chapter 4. In two aspects, they differ significantly from the original language Prolog. In addition to the facilities that allow the programmer to control parallel execution directly, the semantics of these languages has been altered with respect to the capability to find multiple solutions. The concept of backtracking has disappeared and a new "guard" statement has been introduced to restrict the process of searching for alternative solutions. To solve implementation difficulties, simpler "flat" versions have been derived. Because it is uncertain at this moment whether the parallel languages are as powerful as Prolog, that language has been chosen as a basis for investigation of AND-parallelism. In chapter 5, a new method, called "optimistic AND-parallelism" is introduced. It is a method that tries to exploit as much of the potential AND-parallelism in the program as can be found. The parallelism is in the process executing the Prolog program, not in the logic program itself. Contrary to other AND-parallel methods, the optimistic method is capable of dealing with complex, recursive data structures such as lists and trees. This is essential for any parallel method because almost every real life program is based on these structures. It is called optimistic because the parallel parts of the program "hope" that their operations do not create conflicts. If conflicts occur, it is expected that they are not frequent. Then, a repair operation must be carried out to correct a the contents of a logic variable. From this point of view, the other AND-parallel methods may be called pessimistic. They are based on conflict avoidance, hence reducing the number of processes that are executed in parallel. The optimistic method is specified in terms of "solver" processes, operating in an abstract machine. New versions of the resolution and unification algorithms illustrate the differences with respect of the sequential implementation of chapter 2. Large programs cause a large number of solvers to be generated. To control the resources of the abstract machine, the method must be fitted with a utility to perform reorganisations at intervals. In chapter 6, an analysis of interactions between parallel unifications of three different example programs are given. Cost functions are defined that are based on the number of accesses to variables that are located in the memory of foreign solver processes. A composition of parallel operations gives functions for the number of consecutive elementary operations that are performed to executed the examples. For sorting example programs, the cost model predicts that the number of operations to be executed after another is proportional to the size of the lists to be sorted. Chapter 7 presents a realistic implementation of the abstract machine of chapter 5. The architecture of the system must be scalable, to obtain a further improvement of the processing capability if components are added. The Transputer has been selected as the key component of a grid structure. The properties of this MIMD-system have been used to build a simulator program to investigate performance in terms of elapsed time, allocation of processes, overloading of processors and uncontrolled growth of the process load in parts of the system. The simulation timings were compared to the cost model of chapter 6. For one of the example programs, the summing program, the results of the model and the simulation did agree. The number of active processes in this example is within narrow bounds. A sorting program resultsed in uncontrolled growth and heavy overloading of the system caused the processing times to rise beyond values predicted by the cost model. The investigation of the optimistic method has shown that this kind of implicit parallelism is possible. The small-scale simulation experiment has learned that improvements over sequential execution are possible. The execution of the logic programs must be controlled to prevent an abundance of useless processes to flood the system. Further research is needed to implement an automatic throttle that prevents the solvers from being created at will.