Difference between revisions of "Prisoner's dilemma"

Revision as of 19:11, May 7, 2007

Introduction

The Prisoner's Dilemma is a classic problem in game theory. It has the paradoxical outcome that, in certain scenarios, members of a group will knowingly steer towards a sub-optimal outcome.

		B
		don't confess	confess
A	don't confess	A: 6 months B: 6 months	A: 10 years B: free
A	confess	A: free B: 10 years	A: 2 years B: 2 years

The game is usually phrased in terms of two suspects, both of whom have been arrested, and offered a bargain. If both stay silent, they will both serve 6 months in prison for a minor crime. If one of them confesses, this provides evidence of a major crime. The confessor is rewarded by being let off, and the other suspect will serve ten years in prison. If both confess, they will both serve two years.

It is obvious that the best outcome for the group would be if both prisoners cooperated and stayed silent: six months for each prisoner, which is the lowest combined sentence and a Pareto improvement over both confessing. However, in the "default" setting of the Prisoner's Dilemma, we assume that the prisoners are not given the chance to work out such a strategy and that each is interested in his own wellbeing first.

Prisoner A will now analyze his options:

If Prisoner B chooses "don't confess", Prisoner A's best choice will be "confess": A gets out of prison immediately.
If Prisoner B chooses "confess", Prisoner A's best choice will be "confess", too: 2 years is better than 10 years.

(The case for Prisoner B is symmetric.)

Using this reasoning, both prisoners will choose "confess", even though this is not the best result for either of them: each serves two years instead of six months.

The strategy "confess" is a strictly dominant strategy: whatever Prisoner B chooses, Prisoner A does better by confessing, so B's choice does not change how A will act. The "confess/confess" scenario is also the only Nash equilibrium in this problem.
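
The best-response reasoning above can be checked mechanically. The following is a minimal sketch, assuming the sentences from the table are converted to months (10 years = 120, 2 years = 24) and that a shorter sentence is always preferred; the names `SENTENCE`, `best_response_a`, `best_response_b` and `nash_equilibria` are made up for this illustration.

```python
from itertools import product

ACTIONS = ("don't confess", "confess")

# Sentence in months for (A, B), indexed by (A's action, B's action).
# Values are taken from the table above; lower is better.
SENTENCE = {
    ("don't confess", "don't confess"): (6, 6),
    ("don't confess", "confess"): (120, 0),
    ("confess", "don't confess"): (0, 120),
    ("confess", "confess"): (24, 24),
}


def best_response_a(b_action):
    """Action that minimises A's sentence, given B's action."""
    return min(ACTIONS, key=lambda a: SENTENCE[(a, b_action)][0])


def best_response_b(a_action):
    """Action that minimises B's sentence, given A's action."""
    return min(ACTIONS, key=lambda b: SENTENCE[(a_action, b)][1])


def nash_equilibria():
    """Action pairs in which each action is a best response to the other."""
    return [
        (a, b)
        for a, b in product(ACTIONS, ACTIONS)
        if best_response_a(b) == a and best_response_b(a) == b
    ]


if __name__ == "__main__":
    for b_action in ACTIONS:
        print(f"B plays {b_action!r}: A's best response is {best_response_a(b_action)!r}")
    print("Nash equilibria:", nash_equilibria())
```

With these numbers, "confess" is the best response to either choice by the other prisoner, and "confess/confess" is the only pair the search returns.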

Iterated Prisoner's Dilemma

In the Iterated Prisoner's Dilemma, the basic game is played multiple times (sometimes infinitely many times). Here, cooperation can be a Nash equilibrium. This requires that each player pays attention to what the other player did in previous "rounds", and punishes or rewards the other player as appropriate.

In 1980 Robert Axelrod put out a call for experts in game theory and computational science to send in algorithms for playing an iterated Prisoner's Dilemma. He proposed to have all submitted algorithms compete in a tournament to see which one was best. A total of 14 algorithms were submitted, ranging from immensely complicated and computationally intensive to extremely simple. The results were published in the Journal of Conflict Resolution and, as it turned out, the simplest and shortest algorithm won the tournament. It was submitted by Anatol Rapoport of the University of Toronto and was called "tit for tat". The "tit for tat" strategy is to cooperate on the first round and, on every subsequent round, to do whatever the opponent did on the previous round. While algorithms have since been developed that can beat "tit for tat", it remains one of the simplest and most computationally efficient. Because of this, it has been proposed as the strategy that humans employ in social interactions.
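
The "tit for tat" rule is short enough to write down directly. The following is a minimal sketch, not Axelrod's tournament code: it assumes the per-round payoffs 5 / 3 / 1 / 0 from the table in the international affairs section, and the names `PAYOFF`, `tit_for_tat`, `always_defect` and `play_match` are made up for this illustration.

```python
# Per-round payoffs (player A, player B); "C" = cooperate, "D" = defect.
# Assumed values: the 5 / 3 / 1 / 0 payoffs from the table further below.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(own_history, opponent_history):
    """Baseline opponent that defects on every round."""
    return "D"


def play_match(strategy_a, strategy_b, rounds=10):
    """Play a fixed number of rounds and return both total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play_match(tit_for_tat, tit_for_tat))    # (30, 30)
    print(play_match(tit_for_tat, always_defect))  # (9, 14)
```

Over ten rounds, two "tit for tat" players cooperate throughout and score 30 each. Against a constant defector, "tit for tat" loses only the first round and then defects as well, ending at 9 to 14.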

Another strategy that is often followed and debated is the "Grim Trigger": cooperate until the opponent's first defection, and from then on defect every turn. Grim Trigger tends to work only when there is informational exchange, so that the other player knows a single defection will be punished permanently.
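
The difference between Grim Trigger and "tit for tat" shows up after a single defection. The following is a minimal sketch; the helper `responses` and the opponent sequence `"CCDCCC"` are made up for this illustration, with "C" for cooperate and "D" for defect.

```python
def grim_trigger(own_history, opponent_history):
    """Cooperate until the opponent has defected once, then defect forever."""
    return "D" if "D" in opponent_history else "C"


def tit_for_tat(own_history, opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def responses(strategy, opponent_moves):
    """Moves that `strategy` plays against a fixed sequence of opponent moves."""
    own_history = []
    for round_index in range(len(opponent_moves)):
        # The strategy only sees what the opponent did in earlier rounds.
        seen_so_far = list(opponent_moves[:round_index])
        own_history.append(strategy(own_history, seen_so_far))
    return "".join(own_history)


if __name__ == "__main__":
    opponent = "CCDCCC"  # hypothetical opponent who defects once, in round 3
    print(responses(grim_trigger, opponent))  # CCCDDD
    print(responses(tit_for_tat, opponent))   # CCCDCC
```

Grim Trigger never returns to cooperation after the defection in round 3, whereas "tit for tat" punishes once and then forgives.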

Relation to International Affairs: Nuclear Detente

		B
		defect	cooperate
A	defect	A: +1 B: +1	A: +5 B: +0
A	cooperate	A: +0 B: +5	A: +3 B: +3

The Prisoner's Dilemma can be used to explain the awkward situation of exact nuclear parity. So long as a first-strike is possible, and the first-strike would eliminate the chance of retaliation, the players are in a "Prisoner's Dilemma," and, as noted above, are incentivized to defect.

Much of Cold War policy, then, was aimed at preventing a Prisoner's Dilemma situation. For example, the retention of American nuclear warheads in untraceable submarines, and the retention of Russian arms in untraceable rail cars, removed the incentive for a first strike, and kept the parties locked in a situation where cooperation remained the best strategy.

Nuclear "escalation" could destabilize such parity. For example, the development of MIRV (Multiple Independently targetable Re-entry Vehicle) warheads placed Russia briefly ahead of the US. Similarly, American development of an effective nuclear shield would have "won" the game; it was greatly feared by Russia, and was contemplated as a reason for a first strike were its completion imminent.

The "game" as studied by political scientists is shown in the table at the start of this section, with the traditional values.
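
Whether a payoff table describes a Prisoner's Dilemma can be checked with two inequalities. The following is a minimal sketch, assuming the standard textbook conditions (temptation > reward > punishment > sucker's payoff, plus twice the reward exceeding temptation plus sucker's payoff for the iterated game); the constant and function names are made up for this illustration, and the values are read from the table in this section.

```python
# Payoffs to one player, read from the table in this section.
TEMPTATION = 5   # defect while the other side cooperates
REWARD = 3       # both cooperate
PUNISHMENT = 1   # both defect
SUCKER = 0       # cooperate while the other side defects


def is_prisoners_dilemma(t, r, p, s):
    """Defection dominates, yet mutual cooperation beats mutual defection."""
    return t > r > p > s


def cooperation_beats_alternation(t, r, s):
    """In repeated play, steady cooperation pays more than taking turns exploiting."""
    return 2 * r > t + s


if __name__ == "__main__":
    print(is_prisoners_dilemma(TEMPTATION, REWARD, PUNISHMENT, SUCKER))  # True
    print(cooperation_beats_alternation(TEMPTATION, REWARD, SUCKER))     # True
```

Both checks hold for the values 5, 3, 1 and 0, so the table has the same structure as the prison-sentence version in the introduction.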

References

Mutually Assured Destruction
See Thomas Schelling, "The Strategy of Conflict."

@@ Line 43: / Line 43: @@
 ==Iterated Prisoner's Dilemma==
-The Iterated Prisoner's dilemma is when the basic game is played multiple times (sometimes infinitely many times). Here, co-operation can be a Nash equilibrium. This requires that each player pays attention to what the other player does on previous "rounds", and punish or reward the other player as appropriate. One of the best strategies in the Iterated Prisoner's dilemma is the "tit for tat" strategy. The "tit for tat" strategy is to cooperate the first time and then on all subsequent times the strategy is to do whatever the opponent did on the turn prior to the one you are on.
+The Iterated Prisoner's dilemma is when the basic game is played multiple times (sometimes infinitely many times). Here, co-operation can be a Nash equilibrium. This requires that each player pays attention to what the other player does on previous "rounds", and punish or reward the other player as appropriate.
+In 1980 Robert Axelrod put out a call for experts in game theory and computational science to send in algorithms for playing an iterative Prisoner's Dilemma. He proposed to have all submitted algorithms compete in a tournament to see which one was the best. A total of 14 algorithms were submitted, ranging from immensely complicated and computational intensive to extremely simple. The results were published in the Journal of Conflict Resolution and as it turned out the simplest and smallest algorithm won the tournament. It was developed by Anatol Rapoport out of the University of Toronto and it was called "tit for tat". The "tit for tat" strategy is to cooperate the first time and then on all subsequent times the strategy is to do whatever the opponent did on the turn prior to the one you are on. While subsequent algorithms have been developed that can best the "tit for tat" strategy it remains the most computationally efficent. Because of this it has been proposed as the strategy that humans employ in social interactions.
 An additional strategy that is often followed and debated is the "Grim Trigger": that is to say, cooperate until the first defection, and from then on out, defect every turn.  Grim Trigger tends to work only when there is informational exchange.