CSL.GERLACH@SU-SIERRA.ARPA (Sharon Gerlach) (02/10/86)
From: Sharon Gerlach <CSL.GERLACH@SU-SIERRA.ARPA>
On Friday, Feb 21, Anoop Gupta, a CSL faculty candidate from CMU, will
be speaking on "Parallelism in Production Systems" in MJH 352 at 3:15.
Parallelism in Production Systems
Anoop Gupta
Department of Computer Science
Carnegie-Mellon University
Pittsburgh, PA 15213
Production systems (or rule-based systems) are widely used in Artificial
Intelligence for modeling intelligent behavior and building expert systems.
Most production system programs, however, are extremely computation-intensive
and run quite slowly. The slow speed of execution has precluded the use of
production systems in domains requiring high performance and real-time
response. The talk will elaborate on the role of parallelism in the high-speed
execution of production systems.
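
(For readers unfamiliar with the model: a production system repeatedly
matches all of its rules against a global working memory, selects one
satisfied rule, and fires it, changing working memory.  The Python fragment
below is only an illustrative sketch of that recognize-act cycle; the rules
and working-memory contents are invented here and are not from the systems
discussed in the talk.  Real OPS5-style rules contain variables, which makes
the match step far more expensive than this variable-free toy suggests.)

working_memory = {("monkey", "at", "door"), ("bananas", "at", "ceiling")}

rules = [
    {"name": "climb-to-ceiling",
     "if":   [("monkey", "at", "door")],
     "then": ("monkey", "at", "ceiling")},
    {"name": "grab-bananas",
     "if":   [("monkey", "at", "ceiling"), ("bananas", "at", "ceiling")],
     "then": ("monkey", "has", "bananas")},
]

def match(rules, wm):
    # Recognize: every rule whose conditions all hold and whose action would
    # actually change working memory.  With variables and many rules, this
    # step dominates the cost of the whole cycle.
    return [r for r in rules
            if all(c in wm for c in r["if"]) and r["then"] not in wm]

while True:
    conflict_set = match(rules, working_memory)   # the expensive step
    if not conflict_set:
        break
    rule = conflict_set[0]                        # trivial conflict resolution
    working_memory.add(rule["then"])              # act: change working memory

print(working_memory)   # eventually contains ("monkey", "has", "bananas")
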
On the surface, production system programs appear to be capable of using
large amounts of parallelism -- it is possible to perform the match for each
production in a program in parallel. Our research shows that in practice,
however, the speed-up obtainable from parallelism is quite limited, around
10-fold as compared to initial expectations of 100-fold to 1000-fold. The main
reasons for the limited speed-up are: (1) only a small number of productions
are affected (that is, require significant processing) as a result of a change
to working memory, and (2) there is a large variation in the processing
requirements of these productions. Since the number of affected productions is
not controlled by the implementor of the production system interpreter (it is
governed mainly by the author of the program and the nature of the problem),
the solution to the problem of limited speed-up is to somehow decrease the
variation in the processing cost of affected productions. We propose a
parallel version of the Rete algorithm that exploits parallelism at a very
fine grain to reduce this variation. We further suggest that, to exploit this
fine-grained parallelism, a shared-memory multiprocessor with 32-64
high-performance processors should be used. To schedule the fine-grained
tasks, each consisting of about 50-100 instructions, a hardware task scheduler
is proposed.
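
(To give a feel for what "parallelism at a very fine grain" means, the sketch
below breaks the processing of one working-memory change into many small
node-activation tasks served by a pool of workers, rather than handing each
affected production to a single processor.  It is only an illustrative
sketch; the task representation, queue, and thread pool are invented software
stand-ins for the Rete network and the hardware task scheduler described in
the talk.)

import queue, threading

task_queue = queue.Queue()            # software stand-in for a task scheduler

def process_node_activation(task):
    # Placeholder for a ~50-100 instruction unit of match work: evaluate one
    # Rete node for one token, then push any successor activations it yields.
    for successor in task.get("successors", []):
        task_queue.put(successor)

def worker():
    # Each worker models one processor of a shared-memory multiprocessor.
    while True:
        task = task_queue.get()
        process_node_activation(task)
        task_queue.task_done()

for _ in range(32):                   # the talk suggests 32-64 processors
    threading.Thread(target=worker, daemon=True).start()

# One change to working memory seeds the network with its initial activations;
# spreading many small tasks over many workers reduces the variance that a
# one-production-per-processor decomposition suffers from.
task_queue.put({"node": "root",
                "token": ("monkey", "at", "door"),
                "successors": [{"node": "alpha-1", "successors": []},
                               {"node": "alpha-2", "successors": []}]})
task_queue.join()                     # wait for all fine-grained tasks
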
The results presented in the talk are based on simulations, done for a large
set of production systems, that exploit different sources of parallelism. The
simulation results show that using the suggested multiprocessor architecture
(with individual processors performing at 2 MIPS), it is possible to obtain
execution speeds of 5000-27000 working memory element changes per second. This
corresponds to a speed-up of 5-fold to 27-fold over the best known sequential
implementation using a 2 MIPS processor. This performance is also higher than
that obtained by other proposed parallel implementations of production systems.
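
(As a consistency check on the quoted figures: both ranges line up with a
sequential baseline of roughly 1000 working memory element changes per second
on a 2 MIPS processor, since 5000/1000 is 5-fold and 27000/1000 is 27-fold.)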