MEGIDDO@IBM-SJ.ARPA (02/07/86)
From: MEGIDDO@IBM-SJ.ARPA First Announcement of a COMPUTER PROGRAMS TOURNAMENT (of the Prisoners' Dilemma game) 1. INTRODUCTION _______________ This is a first announcement of a tournament for computer programs, playing the famous Prisoners' Dilemma game. Detailed instructions and some background information are provided below. The tournament is organized for the purpose of research and no prizes are offered. It is intended however that the results and winners' names will be published with permission from the persons involved. One of the goals is to see what will happen during a SEQUENCE of tournaments in which information about the participating programs will be released, so that participants will be able to revise their programs. The tournament is open to everyone. However, notice the warnings below. If you have access to electronic mail then you can participate by submitting a FORTRAN program according to the instructions below. By doing so you will also release and waive all your copyright rights and any other intellectual property rights to your program. It will also be assumed that you have not violated any rights of any third party. This announcement also includes some programs that will help you prepare for the tournament. 2. BACKGROUND _____________ The so-called prisoners' dilemma game has drawn the attention of researchers from many fields: psychology, economics, political science, philosophy, biology, and mathematics. Computer scientists are also interested in this game in the context of fundamentals of distributed systems. The game is simple to describe, does not require much skill and is yet extremely interesting from both the theoretical and practical points of view. By the (one-shot) Prisoners' Dilemma game we refer to a game as follows. The game is played by two players with symmetric roles. Each has to choose (independently of the other) between playing action C ("cooperate") or action D ("defect"). The scores to the two players, corresponding to the four possible combinations of choices of actions, are as shown in the following table: Player 2 C D --------------- | 3 | 4 | C | | | | 3 | 0 | Player 1 |-------|-------| | 0 | 1 | D | | | | 4 | 1 | --------------- Thus, both players score 3 if both play C. Both score 1 if both play D. If one plays C and the other one plays D then the one who plays C scores 0 while the other one scores 4. The prisoner's dilemma game has been the subject of many experiments. A tournament was organized several years ago by R. Axelrod who later published a book on it under the title "The evolution of cooperation" (Basic Books, Inc., New York, 1984). Following is some discussion for the benefit of readers who are not familiar with the fundamental considerations of how to play the game. One should be careful to distinguish the one-shot game from the REPEATED game in which the (one-shot) game is played many times, and after each round both players are informed of each other's actions. Furthermore, one should distinguish between the infinitely repeated game and the finitely repeated one. These seem to be quite different from the point of view of equilibrium. An equilibrium in a 2-person game is a pair (S1,S2) of strategies (one for each player) such that, given that player i (i=1,2) is playing Si , the other player, j=3-i, scores the maximum if he plays Sj . We are interested here in the finitely repeated game where the number of rounds is known in advance. We first consider the one-shot game. The analysis of the one-shot game is obvious. Each of the players realizes that no matter what his opponent does, it is always better for him to play D rather than C. Thus, under a very weak assumption of rationality (namely, players do not choose actions that are strictly dominated by other actions), the pair of actions (D,D) remains the only rational choice. The resulting score of (1,1) is inferior to (3,3), which is possible if the choices are (C,C), and this is the source of the "dilemma". To get some insight into the more general case, consider first the 2-round game. After the first round (in which the players choose independently C or D) each player is informed of the choice of the other one and then, once again, the players choose independently C or D. In this game each player has EIGHT strategies that can be coded in the form XYZ where each of X,Y and Z equals either C or D. The interpretation of this notation is as follows. (1) Play X in round 1. (2) In round 2, play Y if the opponent played C and play Z if the opponent played D. It is easy to verify that any strategy XYZ is strictly dominated by XDD (that is, regardless of what was done in round 1, and regardless of what the opponent does in round 2, it is better to play D rather than C in round 2. However, there is no domination relation between the strategies CDD and DDD: if player 2 plays DDD then player 1 is better off playing DDD rather than CDD, whereas if player 2 plays DDC, player 1 is better off playing CDD rather than DDD. Of course, strategy DDC for player 2 is dominated by DDD, but in order for player 1 to deduce that player 2 will not play DDC, he has to assume that player 2 is capable of discovering this domination. Under such an assumption player 1 can eliminate 2's DDC. Thus, if both players are "rational" they are left only with strategy DDD as a reasonable choice. A similar process of repeatedly eliminating dominated strategies applies to the general N-round game. It is dominant for both players to defect in the last round. Therefore (after we drop all strategies that play C in the last round), it becomes dominant to defect in round N-1, and so on. This eventually leaves both players only with the strategy of always playing D. The winner in both tournaments run by R. Axelrod was the simple strategy called "Tit-for-Tat". It starts by playing C and in round i+1 plays whatever the opponent played in round i. It seems like a very good strategy for playing the repeated dilemma for an indefinite number of rounds. In the N-round game it is obvious that an improvement over Tit- for-Tat would be to play Tit-for-Tat except for the last round in which the optimal play is always to defect. 3. HOW TO PARTICIPATE IN THE TOURNAMENT? ________________________________________ If you think you understand the dilemma quite well and would like to participate in this tournament then please act according to the following instructions: 1. Design a strategy of how to play the game when the number of rounds is known in advance. The strategy should specify what to do in round 1 and at any point of the game, knowing what has been done so far and the number of rounds left, specify what to do in the next round. 2. Write a FORTRAN subroutine with the following specifications. Give it a six-letter name, for example, the first four letters of your last name followed by two initials. Suppose you picked the name JONERJ for your subroutine. Then the first line of your program should look as follows. SUBROUTINE JONERJ (N,J,I,M) The arguments are defined as follows. N - This is the total number of rounds to be played. Whenever your program is called it is told the total number of rounds and this will not change during a single game. J - This is the serial number of the round you are supposed to play in the current call. I - When J is greater than 1, this argument tells you what your opponent has played in the previous round. If I=1 it means your opponent has played C. If J=2 then he played D. Any other value is an error. M - This is what you return as your play in the current round. M=1 means you play C. M=2 means you play D. Any other value will result in an error. Your subroutine may compute anything you wish. In particular, it may keep track of the entire history of a single (N-round) game. However, it will not be able to record past games against any opponent since it will be unloaded at the end of a single N-round game. Please be reasonable with respect to the space and time you intend your program to use. Unreasonable programs will have to be dropped from the tournament at the discretion of the organizers. Also, if your program ever returns a faulty play, that is, it returns an M which is neither 1 nor 2, then it will be dropped from the tournament automatically. 3. Fill in the following information (to be transmitted only by electronic mail): NAME:_____________________________________________________________ AFFILIATION:______________________________________________________ STREET:___________________________________________________________ CITY:___________________ STATE:_____________ Zip:_______________ COUNTRY:__________________________________________________________ TELEPHONE:________________________________________________________ ELECTRONIC MAIL ADDRESS:__________________________________________ 4. Important notice! _________________________________________________________ | By sending your program to any one of the following | | addresses you agree to waive and release, to the extent | | permitted by law, all your copyright rights and other | | intellectual property rights in your computer program. | | You also warrant that no portion of your program or its | | use or distribution, violates or is protected by any | | copyright or other intellectual property right of any | | third party. You also warrant you have the right to, | | and hereby do, grant to IBM a royalty-free license to | | use your program. If any contestant is a minor under | | the laws of the state in which contestant resides, at | | least one of the contestant's parents should sign this | | warranty and license. IBM may elect to publish the | | results of the contest; names of participants or their | | submissions will not be published without the written | | approval and signature of the individual authors. | |_________________________________________________________| Please transmit your program by March 31, 1986, along with the filled questionnaire to one of the following addresses: CSNET or ARPANET: megiddo@ibm-sj VNET or BITNET : megiddo at almvma 4. TRAINING PROGRAM ___________________ For your convenience, we include here an interactive program that lets you play the game with another "player". While playing this interactive program please remember that your goal is actually to SCORE high and not necessarily to BEAT the other player. In the tournament, your ability to affect the player's total score is limited since he plays against many other players besides you. Thus you will benefit if you will create "confidence" so that both of you end up playing C very often. You have the option of either playing yourself or using the subroutine that represents you. If you use a subroutine then you have to name it MINE and follow the instructions in Section 3. Simply append it the following program. It is advised that you use this option to test your own program before submitting it to the tournament. ------------------------------------------------------------------------- INTEGER SCORE,SCORE2,CH1,CH2,PRE1,PRE2,CC,DD,CD,DC C DATA CC,DD,CD,DC/3,1,0,4/ 20 SCORE = 0 SCORE2 = 0 PRE1=1 PRE2=1 WRITE(6,102) 102 FORMAT(' ENTER NUMBER OF ROUNDS YOU WISH TO PLAY (0=END)') 103 FORMAT (I6) READ (5,*) NR IF (NR.LE.0) STOP 118 FORMAT(' WILL YOU (1) PLAY OR WILL YOUR SUBROUTINE (2) DO? (1/2)') 430 WRITE (6,118) READ (5,*) II IF (II.EQ.2) GO TO 420 IF (II.NE.1) GO TO 430 420 DO 30 JR = 1, NR 104 FORMAT(' ROUND NO.',I6,' OF',I6,' ROUNDS. PLEASE ENTER 1 OR 2') IF (II.EQ.2) GO TO 440 WRITE (6,104) JR,NR 40 CONTINUE READ (5,*) CH1 GO TO 450 440 CALL MINE(NR,JR,PRE2,CH1) IF ((CH1-1)*(CH1-2)) 470,71,470 470 WRITE (6,117) 117 FORMAT (' YOUR SUBROUTINE RETURNED A FAULTY PLAY') GO TO 20 450 IF ((CH1-1)*(CH1-2)) 70,71,70 70 IF (CH1.EQ.0) GO TO 20 105 FORMAT(' PLEASE ENTER EITHER 1 OR 2 . (0=END)') WRITE (6,105) GO TO 40 71 IF (JR-1) 220,220,230 220 CH2 = 1 IF (NR.EQ.1) CH2 = 2 GO TO 300 230 IF (JR-NR) 250,260,260 250 CH2 = PRE1 GO TO 300 260 CH2 = 2 107 FORMAT(' PLAY WAS: YOU=',I3,' OPPONENT=',I3) 300 WRITE(6,107) CH1,CH2 IF (CH1-1) 110,110,120 110 IF (CH2-1) 130,130,140 130 SCORE = SCORE + CC SCORE2 = SCORE2 + CC GO TO 35 140 SCORE = SCORE + CD SCORE2 = SCORE2 + DC GO TO 35 120 IF (CH2-1) 150,150,160 150 SCORE = SCORE + DC SCORE2 = SCORE2 + CD GO TO 35 160 SCORE = SCORE + DD SCORE2 = SCORE2 + DD 35 WRITE (6,106) SCORE,SCORE2 106 FORMAT (' NEW TOTAL SCORE: YOU=',I5,' OPPONENT=',I5) PRE1=CH1 PRE2=CH2 30 CONTINUE GO TO 20 END ------------------------------------------------------------------------- 5. SAMPLE PROGRAMS __________________ For your convenience we include here copies of two sample programs. The first subroutine, called TIFRTA, plays Tit-for-Tat (see Section 2) except that it always defects in the last round. The second, called GRIM, starts playing C but switches to D the first time th opponent has played D. It also always defects in the last round. ------------------------------------------------------------------------- SUBROUTINE TIFRTA (N,J,IHE,MY) C C THIS IS THE TIT-FOR-TAT RULE. IN ROUND 1 PLAY 1. IN ROUND N C PLAY 0. OTHERWISE, PLAY WHAT THE OPPONENT PLAYED IN THE PRECEDING C ROUND. C C N = TOTAL NUMBER OF ROUNDS C J = CURRENT ROUND C IHE = THE CHOICE OF THE OPPONENT IN THE PRECEDING ROUND (1 OR 2) C MY = MY CHOICE FOR THE CURRENT ROUND (1 OR 2) C IF (J-1) 20,20,30 20 MY = 1 IF(N.EQ.1) MY=2 RETURN 30 IF (J-N) 50,60,60 50 MY = IHE RETURN 60 MY = 2 RETURN END C------------------------------------------------------------------------ SUBROUTINE GRIM (N,J,IHE,MY) C C THIS IS THE GRIM STRATEGY: START WITH C AND SWITCH TO D C AS SOON AS THE OPPONENT DOES C IF (J-1) 10,10,20 10 IX = 1 20 IF (IHE.EQ.2) IX = 2 IF (J.EQ.N) IX = 2 MY = IX RETURN END -------------------------------------------------------------------------