[mod.ai] Seminar - Parallelism in Production Systems

CSL.GERLACH@SU-SIERRA.ARPA (Sharon Gerlach) (02/10/86)

From: Sharon Gerlach <CSL.GERLACH@SU-SIERRA.ARPA>


   On Friday, Feb 21, Anoop Gupta, a CSL faculty candidate from CMU, will
be speaking on "Parallelism in Production Systems" in MJH 352 at 3:15.



                       Parallelism in Production Systems

                                  Anoop Gupta
                        Department of Computer Science
                          Carnegie-Mellon University
                             Pittsburgh, PA 15213


  Production  systems  (or  rule-based  systems)  are widely used in Artificial
Intelligence for modeling intelligent behavior  and  building  expert  systems.
Most  production  system programs, however, are extremely computation intensive
and run quite slowly.  The slow speed of execution has prohibited  the  use  of
production   systems  in  domains  requiring  high  performance  and  real-time
response.  The talk will elaborate on the role of parallelism in the high-speed
execution of production systems.

  On  the  surface,  production  system  programs appear to be capable of using
large amounts of parallelism -- it  is  possible  to  perform  match  for  each
production  in  a  program  in  parallel.  Our research shows that in practice,
however, the speed-up obtainable from  parallelism  is  quite  limited,  around
10-fold as compared to initial expectations of 100-fold to 1000-fold.  The main
reasons for the limited speed-up are:  (1) there are only  a  small  number  of
productions that are affected (require significant processing) as a result of a
change to working memory and (2) there is a large variation in  the  processing
requirement  of these productions.  Since the number of affected productions is
not controlled by the implementor of the production system interpreter  (it  is
governed  mainly  by  the author of the program and the nature of the problem),
the solution to the problem of limited speed-up  is  to  somehow  decrease  the
variation  in  the  processing  cost  of  affected  productions.   We propose a
parallel version of the Rete algorithm which exploits  parallelism  at  a  very
fine  grain  to  reduce this variation.  We further suggest that to exploit the
fine-grained  parallelism,  a  shared-memory  multiprocessor  with  32-64  high
performance  processors  should be used.  For scheduling the fine-grained tasks
consisting of about 50-100 instructions, a hardware task scheduler is proposed.

  The results presented in the talk are based on simulations done for  a  large
set  of  production  systems  exploiting different sources of parallelism.  The
simulation results show that using the  suggested  multiprocessor  architecture
(with  individual  processors  performing  at 2 MIPS), it is possible to obtain
execution speeds of 5000-27000 working memory element changes per second.  This
corresponds  to  a speed-up of 5-fold to 27-fold over the best known sequential
implementation using a 2 MIPS processor.  This performance is also higher  than
that obtained by other proposed parallel implementations of production systems.