[comp.ai.digest] A four-bit definition of Genetic Algorithms

rik%cs@UCSD.EDU (Rik Belew) (03/17/88)

My two-bit definition of "genetic algorithms" has,
like all over-simplifications, drawn criticism
for being --- you guessed it --- over-simplified!
So here's a second pass:

  First  and foremost,  I want to  "make one thing
perfectly  clear":  When I  defined GA(1) in terms
of  "...   John Holland  and his  students" I most
certainly did not intend that only those who
had journeyed to Ann Arbor and been anointed
by the man himself were fully certified to open
GA  franchises!    Defining a  scientific approach
in  terms  of  the personalities  involved  is not
adequate,  for  GA  or  any other  work.     I was
attempting  to  distinguish a  particular approach
from  the broader set of  techniques that I called
GA(2).    In  my  experience, John  is  the "root"
of   all  such  work  and  much  of  it  has  been
done  by  "direct"  students  of  his at  Michigan
and  "indirect"  students of  these students.    I
also  know, however, that  others --- notably Dave
Ackley  and  Larry  Rendell  ---  have  worked  on
these  methods  without direct  contact  with this
lineage.   But I very much consider them "students
of  Holland," in that  they are aware  of and have
benefited  from John's work.   (Again, I mean that
as  a compliment, not because  I have been charged
with  GA  Clique membership  validation.)   I  see
absolutely  no benefit and  potentially great harm
in  drawing lines  between closely  related bodies
of research. 


  So  let's move on to more meaningful attempts to
define  the GA.  My two-bit definition  focused on
the  cross-over operator:   GA(1) depended  on it,
and GA(2) generally relied on the weaker (less
intelligent) mutation operator.  This led Dave
Ackley to feel that:
      ...   membership in  GA(1) is restricted
    to  a small and  somewhat quirky "DNA-ish"
    subset  of all  possible combination rules
    [ackley@flash.bellcore.com, 3 Mar 88]

  I   take  Dave   to  mean  that   the  algorithm
presented by Holland (let's say the R class of
algorithms described in Chapter 6 of his ANAS,
"Adaptation in Natural and Artificial Systems,"
to be specific) sacrifices some performance in order
to  remain more  biologically plausible.   But I'm
with Dave on this one:  Personally, I'm also more
interested in the algorithms than in their relation
to real biological mechanisms.  (Let the record
show, however, that there are GA practitioners
who  do try  to take biological  plausibility more
seriously, e.g., Grosso and Westerdale.)
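
  For readers who want the distinction GA(1)/GA(2)
turns on in concrete form, here is a minimal sketch
of the two operators on plain bit strings.  (The
representation, rates, and function names are my
own illustration, not anything from ANAS.)

    import random

    def one_point_crossover(a, b):
        # Cut both parents at the same random point and
        # swap tails: contiguous "blocks" survive intact.
        point = random.randint(1, len(a) - 1)
        return a[:point] + b[point:], b[:point] + a[point:]

    def mutate(bits, rate=0.01):
        # Flip each bit independently with small probability;
        # no information passes between individuals.
        return [b ^ 1 if random.random() < rate else b
                for b in bits]

Cross-over can juxtapose good substrings from two
parents in a single step; mutation can only perturb
one individual blindly, which is why I called it
the weaker operator.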

  So   the  next   possibility  is  to   refer  to
properties  of  the GA  we  find desirable.    For
Dave,  I think the  key property of the  GA is its
"implicit  parallelism":  the  ability to search a
huge  space implicitly  by explicitly manipulating
a  very small set  of structures.   Jon Richardson
comes  closer  to  the definition  I  had  in mind
with  his emphasis on  Holland's "building blocks"
notion:

      The  proper   distinction  I   think  is
    whether  or not the recombination operator
    in  question  supports the  building block
    hypothesis.     "Mutation-like  operators"
    do  not  do  this.     Any  kind  of weird
    recombination   which  can   be  shown  to
    propagate  and construct  building blocks,
    I  would  call a  Genetic  Algorithm.   If
    the  operator  does nothing  with building
    blocks,  I  would consider  it apocryphal.
    It   may   be   valuable   but  apocryphal
    nonetheless and  shouldn't be called a GA.
    [richards@UTKCS2.CS.UTK.EDU, 4 Mar 88]
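
  To see Dave's "implicit parallelism" concretely:
each position of a string may be taken as fixed or
as a wild card, so a single string of length l is
an instance of 2^l schemata at once, and evaluating
it yields (implicit) information about all of them.
A toy enumeration (my own illustration, taken from
neither posting):

    from itertools import product

    def schemata_of(s):
        # Each position is either fixed at the string's
        # value or wildcarded ('*'); a string of length l
        # is thus an instance of 2**l schemata at once.
        return [''.join(c if fix else '*'
                        for c, fix in zip(s, mask))
                for mask in product((True, False), repeat=len(s))]

    print(len(schemata_of('10110')))   # 32 = 2**5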


  While  I  would  echo the  value  of  both these
desiderata,  I don't  find them  technically tight
enough  to be useful.  So I suggest that we follow
Holland's  suggestion  (in  his  talk  at  ICGA85)
and  reserve  the term  "GA" for  those algorithms
for  which  we  can prove  the  "schemata theorem"
(ANAS, Theorem 6.2.3).  I believe this theorem
is still the best understanding we have of how
and why the GA gives rise to the properties of
implicit parallelism and building blocks.  Of
course,  there are  problems with  this definition
as  well.     In  particular,  it  is  so  tightly
wed  to the  string representation  and cross-over
operator  that it is very difficult to imagine any
algorithm  very  different from  the (traditional)
GA  that would  satisfy the  theorem.   But that's
exactly where I think the work needs to be done.
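
  For readers without ANAS at hand, the form of the
theorem usually quoted in the GA literature (my
transcription; Holland's own Theorem 6.2.3 uses
somewhat different notation) bounds the expected
growth of a schema H of order o(H) and defining
length \delta(H) on strings of length l:

    E[m(H,t+1)] \ge m(H,t) \cdot \frac{f(H,t)}{\bar{f}(t)}
      \left[ 1 - p_c \frac{\delta(H)}{l-1} - o(H)\, p_m \right]

Here m(H,t) counts the instances of H at generation
t, f(H,t) is their average fitness, \bar{f}(t) the
population average, and p_c, p_m the cross-over and
mutation rates.  The explicit dependence on
\delta(H) and l is precisely the tight wedding to
strings and cross-over I complained of above.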

  Finally,  I want  to say  why I (as  Dave Ackley
says) "took the trouble to exclude" Ackley's SIGH
system  from my  definition of  GA(1).   My answer
is  simply  that I  view  SIGH as  a hybrid.    It
has borrowed techniques from a number of areas:
the GA, connectionism, and simulated annealing, to
name three.  There is absolutely nothing wrong
with  doing this, and as  Dave's thesis showed and
Larry Eshelman's note confirmed
[Larry.Eshelman@F.GP.CS.CMU.EDU, 11 Mar 1988],
there are at least some problems on which SIGH
does much better than the traditional GA.  My only
problem with SIGH is that I can't solve the
apportionment of credit problem:  when it works, I
don't know
exactly  which technique is  responsible, and when
it  doesn't  work  I  don't  know  who  to  blame.
I  too  think about  connectionist  algorithms and
simulated  annealing along  with Holland's  GA and
bucket  brigade,  and see all  of them  as members
of  a  class of  algorithms I  want  to understand
better.   But  I find it necessary  to isolate the
properties  of each before trying to combine them.
In  short,  I think Dave  and I  agree to  a great
extent (on the problem, and on what the portions
of the solution might be), and disagree only in our
respective approaches to putting it all together.
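
  One way to do that isolation is the obvious
ablation study: run each mechanism alone and in
every combination on the same problem, and see
where the credit falls.  A schematic sketch
("run_trial" and its arguments are hypothetical
placeholders, not SIGH's actual interface):

    import itertools

    def ablate(mechanisms, run_trial, problem):
        # Score every non-empty subset of mechanisms so
        # that credit (or blame) can be traced to
        # particular members of the hybrid.
        return {subset: run_trial(subset, problem)
                for r in range(1, len(mechanisms) + 1)
                for subset in itertools.combinations(mechanisms, r)}

The 2^n - 1 trials are affordable when n is three
or four mechanisms, which is exactly the hybrid
case at issue.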