[mod.ai] Prisoners' Dilemma Computer Programs Tournament

MEGIDDO@IBM-SJ.ARPA (02/07/86)

From: MEGIDDO@IBM-SJ.ARPA


                       First Announcement of a

                     COMPUTER PROGRAMS TOURNAMENT
                   (of the Prisoners' Dilemma game)

1. INTRODUCTION
_______________

  This is a first announcement of a tournament for computer programs,
playing the famous Prisoners' Dilemma game.  Detailed instructions and
some background information are provided below.  The tournament is
organized for the purpose of research and no prizes are offered.  It
is intended however that the results and winners' names will be
published with permission from the persons involved.  One of the goals
is to see what will happen during a SEQUENCE of tournaments in which
information about the participating programs will be released, so that
participants will be able to revise their programs.  The tournament is
open to everyone.  However, notice the warnings below.  If you have
access to electronic mail then you can participate by submitting a
FORTRAN program according to the instructions below.  By doing so you
will also release and waive all your copyright rights and any other
intellectual property rights to your program.  It will also be assumed
that you have not violated any rights of any third party.  This
announcement also includes some programs that will help you prepare
for the tournament.

2. BACKGROUND
_____________

  The so-called prisoners' dilemma game has drawn the attention
of researchers from many fields: psychology, economics, political
science, philosophy, biology, and mathematics.  Computer scientists
are also interested in this game in the context of fundamentals of
distributed systems.

  The game is simple to describe, does not require much skill and is yet
extremely interesting from both the theoretical and practical points
of view.  By the (one-shot) Prisoners' Dilemma game we refer to a game
as follows.  The game is played by two players with symmetric roles.
Each has to choose (independently of the other) between playing action
C ("cooperate") or action D ("defect").  The scores to the two
players, corresponding to the four possible combinations of choices of
actions, are as shown in the following table:

                          Player 2

                         C       D
                      ---------------
                     |     3 |     4 |
                  C  |       |       |
                     | 3     | 0     |
       Player 1      |-------|-------|
                     |     0 |     1 |
                  D  |       |       |
                     | 4     | 1     |
                      ---------------

Thus, both players score 3 if both play C.  Both score 1 if both play D.
If one plays C and the other one plays D then the one who plays C scores
0 while the other one scores 4.

  The prisoner's dilemma game has been the subject of many experiments.
A tournament was organized several years ago by R.  Axelrod who later
published a book on it under the title "The evolution of cooperation"
(Basic Books, Inc., New York, 1984).

  Following is some discussion for the benefit of readers who are not
familiar with the fundamental considerations of how to play the game. One
should be careful to distinguish the one-shot game from the REPEATED game
in which the (one-shot) game is played many times, and after each round
both players are informed of each other's actions.  Furthermore, one
should distinguish between the infinitely repeated game and the finitely
repeated one.  These seem to be quite different from the point of view
of equilibrium.  An equilibrium in a 2-person game is a pair (S1,S2) of
strategies (one for each player) such that, given that player  i  (i=1,2)
is playing  Si , the other player, j=3-i, scores the maximum if he plays
Sj .

  We are interested here in the finitely repeated game where the number
of rounds is known in advance.  We first consider the one-shot game.
The analysis of the one-shot game is obvious.  Each of the players
realizes that no matter what his opponent does, it is always better
for him to play D rather than C.  Thus, under a very weak assumption
of rationality (namely, players do not choose actions that are
strictly dominated by other actions), the pair of actions (D,D)
remains the only rational choice.  The resulting score of (1,1) is
inferior to (3,3), which is possible if the choices are (C,C), and
this is the source of the "dilemma".

  To get some insight into the more general case, consider first
the 2-round game.  After the first round (in which the players choose
independently C or D) each player is informed of the choice of the
other one and then, once again, the players choose independently C or
D.  In this game each player has EIGHT strategies that can be coded in
the form XYZ where each of X,Y and Z equals either C or D.  The
interpretation of this notation is as follows.  (1) Play X in round 1.
(2) In round 2, play Y if the opponent played C and play Z if the
opponent played D.  It is easy to verify that any strategy XYZ is
strictly dominated by XDD (that is, regardless of what was done in
round 1, and regardless of what the opponent does in round 2, it is
better to play D rather than C in round 2.  However, there is no
domination relation between the strategies CDD and DDD: if player 2
plays DDD then player 1 is better off playing DDD rather than CDD,
whereas if player 2 plays DDC, player 1 is better off playing CDD
rather than DDD.  Of course, strategy DDC for player 2 is dominated by
DDD, but in order for player 1 to deduce that player 2 will not play
DDC, he has to assume that player 2 is capable of discovering this
domination.  Under such an assumption player 1 can eliminate 2's DDC.
Thus, if both players are "rational" they are left only with strategy
DDD as a reasonable choice.

  A similar process of repeatedly eliminating dominated strategies
applies to the general N-round game.  It is dominant for both players
to defect in the last round.  Therefore (after we drop all strategies
that play C in the last round), it becomes dominant to defect in round
N-1, and so on.  This eventually leaves both players only with the
strategy of always playing D.

  The winner in both tournaments run by R. Axelrod was the simple
strategy called "Tit-for-Tat".  It starts by playing C and in round i+1
plays whatever the opponent played in round i.  It seems like a very good
strategy for playing the repeated dilemma for an indefinite number of
rounds.  In the N-round game it is obvious that an improvement over Tit-
for-Tat would be to play Tit-for-Tat except for the last round in which
the optimal play is always to defect.

3. HOW TO PARTICIPATE IN THE TOURNAMENT?
________________________________________

  If you think you understand the dilemma quite well and would like to
participate in this tournament then please act according to the following
instructions:
1.  Design a strategy of how to play the game when the number of rounds
is known in advance.  The strategy should specify what to do in round 1
and at any point of the game, knowing what has been done so far and the
number of rounds left, specify what to do in the next round.
2.  Write a FORTRAN subroutine with the following specifications.  Give
it a six-letter name, for example, the first four letters of your last
name followed by two initials.  Suppose you picked the name JONERJ for
your subroutine.  Then the first line of your program should look as
follows.

      SUBROUTINE JONERJ (N,J,I,M)

  The arguments are defined as follows.

N - This is the total number of rounds to be played.  Whenever your
    program is called it is told the total number of rounds and
    this will not change during a single game.
J - This is the serial number of the round you are supposed to play in
    the current call.
I - When J is greater than 1, this argument tells you what your opponent
    has played in the previous round.  If I=1 it means your opponent has
    played C.  If J=2 then he played D.  Any other value is an error.
M - This is what you return as your play in the current round.  M=1 means
    you play C.  M=2 means you play D.  Any other value will result in an
    error.

  Your subroutine may compute anything you wish.  In particular, it may
keep track of the entire history of a single (N-round) game.  However,
it will not be able to record past games against any opponent since it
will be unloaded at the end of a single N-round game.  Please be
reasonable with respect to the space and time you intend your program to
use.  Unreasonable programs will have to be dropped from the tournament
at the discretion of the organizers.  Also, if your program ever returns
a faulty play, that is, it returns an M which is neither 1 nor 2, then it
will be dropped from the tournament automatically.

3. Fill in the following information (to be transmitted only by
electronic mail):

     NAME:_____________________________________________________________
     AFFILIATION:______________________________________________________
     STREET:___________________________________________________________
     CITY:___________________  STATE:_____________  Zip:_______________
     COUNTRY:__________________________________________________________
     TELEPHONE:________________________________________________________
     ELECTRONIC MAIL ADDRESS:__________________________________________


4. Important notice!

       _________________________________________________________
      |   By sending your program to any one of the following   |
      | addresses you agree to waive and release, to the extent |
      | permitted by law, all your copyright rights and other   |
      | intellectual property rights in your computer program.  |
      | You also warrant that no portion of your program or its |
      | use or distribution, violates or is protected by any    |
      | copyright or other intellectual property right of any   |
      | third party.  You also warrant you have the right to,   |
      | and hereby do, grant to IBM a royalty-free license to   |
      | use your program.  If any contestant is a minor under   |
      | the laws of the state in which contestant resides, at   |
      | least one of the contestant's parents should sign this  |
      | warranty and license.  IBM may elect to publish the     |
      | results of the contest; names of participants or their  |
      | submissions will not be published without the written   |
      | approval and signature of the individual authors.       |
      |_________________________________________________________|

Please transmit your program by March 31, 1986, along with the filled
questionnaire to one of the following addresses:

     CSNET or ARPANET:      megiddo@ibm-sj
     VNET  or BITNET :      megiddo at almvma


4. TRAINING PROGRAM
___________________

  For your convenience, we include here an interactive program that lets
you play the game with another "player".  While playing this interactive
program please remember that your goal is actually to SCORE high and not
necessarily to BEAT the other player.  In the tournament, your ability
to affect the player's total score is limited since he plays against many
other players besides you.  Thus you will benefit if you will create
"confidence" so that both of you end up playing C very often.  You have
the option of either playing yourself or using the subroutine that
represents you.  If you use a subroutine then you have to name it MINE
and follow the instructions in Section 3.  Simply append it the following
program.  It is advised that you use this option to test your own program
before submitting it to the tournament.

-------------------------------------------------------------------------
      INTEGER SCORE,SCORE2,CH1,CH2,PRE1,PRE2,CC,DD,CD,DC
C
      DATA CC,DD,CD,DC/3,1,0,4/
   20 SCORE = 0
      SCORE2 = 0
      PRE1=1
      PRE2=1
      WRITE(6,102)
  102 FORMAT(' ENTER NUMBER OF ROUNDS YOU WISH TO PLAY (0=END)')
  103 FORMAT (I6)
      READ (5,*) NR
      IF (NR.LE.0) STOP
  118 FORMAT(' WILL YOU (1) PLAY OR WILL YOUR SUBROUTINE (2) DO? (1/2)')
  430 WRITE (6,118)
      READ (5,*) II
      IF (II.EQ.2)  GO TO 420
      IF (II.NE.1) GO TO 430
  420 DO 30 JR = 1, NR
  104 FORMAT(' ROUND NO.',I6,'  OF',I6,'  ROUNDS.  PLEASE ENTER 1 OR 2')
      IF (II.EQ.2) GO TO 440
      WRITE (6,104) JR,NR
   40 CONTINUE
      READ (5,*) CH1
      GO TO 450
  440 CALL MINE(NR,JR,PRE2,CH1)
      IF ((CH1-1)*(CH1-2)) 470,71,470
  470 WRITE (6,117)
  117 FORMAT (' YOUR SUBROUTINE RETURNED A FAULTY PLAY')
      GO TO 20
  450 IF ((CH1-1)*(CH1-2)) 70,71,70
   70 IF (CH1.EQ.0) GO TO 20
  105 FORMAT(' PLEASE ENTER EITHER  1  OR  2 .     (0=END)')
      WRITE (6,105)
      GO TO 40
   71 IF (JR-1) 220,220,230
  220 CH2 = 1
      IF (NR.EQ.1) CH2 = 2
      GO TO 300
  230 IF (JR-NR) 250,260,260
  250 CH2 = PRE1
      GO TO 300
  260 CH2 = 2
  107 FORMAT(' PLAY WAS:   YOU=',I3,'  OPPONENT=',I3)
  300 WRITE(6,107) CH1,CH2
      IF (CH1-1) 110,110,120
  110 IF (CH2-1) 130,130,140
  130 SCORE = SCORE + CC
      SCORE2 = SCORE2 + CC
      GO TO 35
  140 SCORE = SCORE + CD
      SCORE2 = SCORE2 + DC
      GO TO 35
  120 IF (CH2-1) 150,150,160
  150 SCORE = SCORE + DC
      SCORE2 = SCORE2 + CD
      GO TO 35
  160 SCORE = SCORE + DD
      SCORE2 = SCORE2 + DD
   35 WRITE (6,106) SCORE,SCORE2
  106 FORMAT (' NEW TOTAL SCORE:    YOU=',I5,'    OPPONENT=',I5)
      PRE1=CH1
      PRE2=CH2
   30 CONTINUE
      GO TO 20
      END
-------------------------------------------------------------------------

5. SAMPLE PROGRAMS
__________________

  For your convenience we include here copies of two sample programs.
The first subroutine, called TIFRTA, plays Tit-for-Tat (see Section 2)
except that it always defects in the last round.  The second, called
GRIM, starts playing C but switches to D the first time th opponent has
played D.  It also always defects in the last round.

-------------------------------------------------------------------------
      SUBROUTINE TIFRTA (N,J,IHE,MY)
C
C      THIS IS THE TIT-FOR-TAT RULE.  IN ROUND 1 PLAY 1.  IN ROUND  N
C    PLAY 0.  OTHERWISE, PLAY WHAT THE OPPONENT PLAYED IN THE PRECEDING
C    ROUND.
C
C       N = TOTAL NUMBER OF ROUNDS
C       J = CURRENT ROUND
C       IHE = THE CHOICE OF THE OPPONENT IN THE PRECEDING ROUND (1 OR 2)
C       MY = MY CHOICE FOR THE CURRENT ROUND (1 OR 2)
C
      IF (J-1) 20,20,30
   20 MY = 1
      IF(N.EQ.1) MY=2
      RETURN
   30 IF (J-N) 50,60,60
   50 MY = IHE
      RETURN
   60 MY = 2
      RETURN
      END
C------------------------------------------------------------------------
      SUBROUTINE GRIM (N,J,IHE,MY)
C
C      THIS IS THE   GRIM  STRATEGY: START WITH  C  AND SWITCH TO  D
C      AS SOON AS THE OPPONENT DOES
C
      IF (J-1) 10,10,20
   10 IX = 1
   20 IF (IHE.EQ.2) IX = 2
      IF (J.EQ.N) IX = 2
      MY = IX
      RETURN
      END
-------------------------------------------------------------------------