Last modified on November 28, 2016, at 23:36

Prisoner's dilemma

The prisoner's dilemma is a classic problem in game theory. Its paradoxical result is that members of a group, each acting rationally in their own interest, will knowingly steer towards an outcome that is worse for all of them.

                      B: don't confess            B: confess
A: don't confess      A: 6 months, B: 6 months    A: 10 years, B: free
A: confess            A: free, B: 10 years        A: 2 years, B: 2 years

The game is usually phrased in terms of two suspects who have been arrested for a major crime and are each offered a bargain. If both stay silent, each can still be convicted of a minor crime and sentenced to 6 months in prison. If one of them confesses and implicates the other, this provides evidence of the major crime: the confessor is rewarded by being let off all charges, and the other suspect serves ten years in prison. If both confess, both serve two years under a plea bargain for the major crime.

The best outcome for the group would be for both prisoners to cooperate with each other and stay silent: six months each, which is Pareto efficient and gives the lowest total prison time of any outcome. However, in the "default" setting of the prisoner's dilemma, we assume that the prisoners are not given the chance to agree on such a strategy and that each is interested in his own wellbeing first.

Prisoner A will now analyze his options:

  • If Prisoner B chooses "don't confess", Prisoner A's best choice will be "confess": A gets out of prison immediately.
  • If Prisoner B chooses "confess", Prisoner A's best choice will be "confess", too: 2 years is better than 10 years.

(The case for Prisoner B is symmetric.)
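Prisoner A's reasoning can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `SENTENCES`, `CHOICES` and `best_response_for_a` are made up for this example, and prison time is written in years (6 months as 0.5, going free as 0).

```python
# Prison time in years as (A's sentence, B's sentence),
# indexed by (A's choice, B's choice). Names here are illustrative only.
SENTENCES = {
    ("don't confess", "don't confess"): (0.5, 0.5),
    ("don't confess", "confess"): (10, 0),
    ("confess", "don't confess"): (0, 10),
    ("confess", "confess"): (2, 2),
}
CHOICES = ("don't confess", "confess")


def best_response_for_a(b_choice):
    """Return the choice that minimizes A's own prison time, given B's choice."""
    return min(CHOICES, key=lambda a_choice: SENTENCES[(a_choice, b_choice)][0])


for b_choice in CHOICES:
    answer = best_response_for_a(b_choice)
    print(f"If B chooses {b_choice!r}, A's best choice is {answer!r}")
```

Whichever choice is passed in for Prisoner B, the function returns "confess", matching the two bullet points above.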

Using this reasoning, both prisoners will choose "confess", since it gives each the best outcome for himself in all circumstances, even though it is not the best result for the group.

The strategy "confess" is a strictly dominant strategy: whatever Prisoner B chooses, Prisoner A does strictly better by confessing, so B's choice does not change the way A will act. The "confess/confess" outcome is also the only Nash equilibrium in this problem.
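Both claims can be checked by brute force over the four possible outcomes. The sketch below is an illustration, not a general game solver; the helper names are invented for this example, and sentences are given in years (lower is better).

```python
from itertools import product

# (A's years, B's years) indexed by (A's choice, B's choice); illustrative names.
SENTENCES = {
    ("don't confess", "don't confess"): (0.5, 0.5),
    ("don't confess", "confess"): (10, 0),
    ("confess", "don't confess"): (0, 10),
    ("confess", "confess"): (2, 2),
}
CHOICES = ("don't confess", "confess")


def is_strictly_dominant_for_a(choice):
    """True if `choice` gives A strictly less time than any alternative, whatever B does."""
    return all(
        SENTENCES[(choice, b)][0] < SENTENCES[(other, b)][0]
        for b in CHOICES
        for other in CHOICES
        if other != choice
    )


def is_nash_equilibrium(a_choice, b_choice):
    """True if neither prisoner can shorten his own sentence by changing his choice alone."""
    a_years, b_years = SENTENCES[(a_choice, b_choice)]
    a_cannot_improve = all(SENTENCES[(alt, b_choice)][0] >= a_years for alt in CHOICES)
    b_cannot_improve = all(SENTENCES[(a_choice, alt)][1] >= b_years for alt in CHOICES)
    return a_cannot_improve and b_cannot_improve


nash_equilibria = [
    pair for pair in product(CHOICES, repeat=2) if is_nash_equilibrium(*pair)
]
print("confess strictly dominant for A:", is_strictly_dominant_for_a("confess"))
print("Nash equilibria:", nash_equilibria)
```

With the sentences used in this article, the only pair that survives the check is ("confess", "confess").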

When a situation like the prisoner's dilemma, in which the equilibrium outcome is not Pareto efficient, occurs in a market economy, it can be an example of a market failure.
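To make the Pareto comparison concrete, the following sketch tests each outcome of the game for Pareto efficiency. An outcome is treated as Pareto dominated if some other outcome is no worse for either prisoner and strictly better for at least one. The function and variable names are assumptions made for this example.

```python
# (A's years, B's years) indexed by (A's choice, B's choice); illustrative names.
SENTENCES = {
    ("don't confess", "don't confess"): (0.5, 0.5),
    ("don't confess", "confess"): (10, 0),
    ("confess", "don't confess"): (0, 10),
    ("confess", "confess"): (2, 2),
}


def pareto_dominates(x, y):
    """True if sentences x are no longer than y for both prisoners and shorter for one."""
    no_worse = all(xi <= yi for xi, yi in zip(x, y))
    strictly_better = any(xi < yi for xi, yi in zip(x, y))
    return no_worse and strictly_better


pareto_efficient = [
    choices
    for choices, years in SENTENCES.items()
    if not any(pareto_dominates(other, years) for other in SENTENCES.values())
]
print(pareto_efficient)
```

Under these numbers, "confess/confess" is the only outcome missing from the list: it is dominated by both prisoners staying silent, yet it is the one that self-interested play produces.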

Iterated Prisoner's Dilemma

In the iterated prisoner's dilemma, the basic game is played multiple times (sometimes infinitely many times). Here, co-operation (neither player confessing) can be a Nash equilibrium. This requires that each player pays attention to what the other player did in previous "rounds" and punishes or rewards the other player as appropriate.
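One common way to see when co-operation can be an equilibrium is to discount future rounds. The sketch below rests on assumptions that are not part of the description above: each round's payoff is the negative of the sentence in years, future rounds are weighted by a hypothetical discount factor `delta` between 0 and 1, and each player punishes a confession by confessing in every later round (often called a "grim trigger").

```python
# Per-round payoffs as negative years in prison (assumption for this sketch).
REWARD = -0.5      # both stay silent
TEMPTATION = 0.0   # confess while the other stays silent
PUNISHMENT = -2.0  # both confess


def value_of_cooperating(delta):
    """Discounted payoff of staying silent forever against a silent partner."""
    return REWARD / (1 - delta)


def value_of_deviating(delta):
    """Discounted payoff of confessing once, then facing mutual confession forever."""
    return TEMPTATION + delta * PUNISHMENT / (1 - delta)


def cooperation_is_sustainable(delta):
    """True if a player gains nothing by deviating from mutual silence."""
    return value_of_cooperating(delta) >= value_of_deviating(delta)


# Smallest discount factor at which cooperation is sustainable in this sketch.
threshold = (TEMPTATION - REWARD) / (TEMPTATION - PUNISHMENT)
print(threshold, cooperation_is_sustainable(0.5), cooperation_is_sustainable(0.1))
```

With this article's sentences the threshold works out to 0.25: players who care enough about future rounds have no incentive to confess, while very impatient players still do.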

The best-known strategy in the iterated prisoner's dilemma is "tit for tat": cooperate in the first round, and in every subsequent round do whatever the opponent did in the previous round. Facing this strategy, a prisoner will cooperate, because by doing so he ensures that the other player will cooperate in the next iteration. For the strategy to foster cooperation, the game must be played indefinitely, or at least without a known final round. Otherwise, players will deviate and confess in the last iteration, because there are no consequences. That in turn removes the consequence of deviating in the second-to-last iteration, then the third-to-last, and so on, so both players confess from the start.
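A small simulation shows how tit for tat behaves. This is a minimal sketch with invented names (`tit_for_tat`, `always_confess`, `play`); a strategy is modelled as a function that receives the opponent's past moves and returns the next move, and the totals are years in prison summed over all rounds.

```python
COOPERATE, CONFESS = "don't confess", "confess"

# (A's years, B's years) indexed by (A's move, B's move); illustrative names.
SENTENCES = {
    (COOPERATE, COOPERATE): (0.5, 0.5),
    (COOPERATE, CONFESS): (10, 0),
    (CONFESS, COOPERATE): (0, 10),
    (CONFESS, CONFESS): (2, 2),
}


def tit_for_tat(opponent_history):
    """Cooperate in the first round, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else COOPERATE


def always_confess(opponent_history):
    """Confess in every round, regardless of what the opponent did."""
    return CONFESS


def play(strategy_a, strategy_b, rounds):
    """Play the game repeatedly and return total years in prison for A and B."""
    history_a, history_b = [], []
    total_a = total_b = 0.0
    for _ in range(rounds):
        # Both moves are chosen before either history is updated,
        # so the players act simultaneously within a round.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        years_a, years_b = SENTENCES[(move_a, move_b)]
        total_a += years_a
        total_b += years_b
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b


print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, always_confess, 10))
```

Over ten rounds, two tit-for-tat players serve 5 years each, while a player who always confesses against tit for tat serves 18 years: the single round of freedom is outweighed by the retaliation that follows.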