Iterated Strict Dominance in Pure Strategies

We know that no rational player ever plays a strictly dominated strategy.

As each player knows that each player is rational, each player knows that his opponents will not play strictly dominated strategies, and thus all players know that they are effectively facing a "smaller" game.

As rationality is common knowledge, everyone knows that everyone knows that the game is effectively smaller.

Thus everyone knows that nobody will play strategies that are strictly dominated in the smaller game (and such strategies may indeed exist).

Because it is common knowledge that all players will perform this kind of reasoning again, the process can continue until no more strictly dominated strategies can be eliminated.

IESDS

The previous reasoning yields the Iterated Elimination of Strictly Dominated Strategies (IESDS):

Define a sequence $D_i^0, D_i^1, D_i^2, \ldots$ of strategy sets of player $i$. (Denote by $G_{DS}^k$ the game obtained from $G$ by restricting to $D_i^k$, $i \in N$.)

1. Initialize $k = 0$ and $D_i^0 = S_i$ for each $i \in N$.
2. For all players $i \in N$: let $D_i^{k+1}$ be the set of all pure strategies of $D_i^k$ that are not strictly dominated in $G_{DS}^k$.
3. Let $k := k + 1$ and go to 2.

We say that $s_i \in S_i$ survives IESDS if $s_i \in D_i^k$ for all $k = 0, 1, 2, \ldots$

Definition 10
A strategy profile $s = (s_1, \ldots, s_n) \in S$ is an IESDS equilibrium if each $s_i$ survives IESDS.
A game is IESDS solvable if it has a unique IESDS equilibrium.

Remark: If all $S_i$ are finite, then in step 2 we may remove only some of the strictly dominated strategies (not necessarily all). The result is not affected by the order of elimination, since strictly dominated strategies remain strictly dominated even after some other strictly dominated strategies are removed.

IESDS Examples

In the Prisoner’s dilemma:

           C           S
    C   −5, −5      0, −20
    S   −20, 0     −1, −1

(C, C) is the only profile surviving the first round of IESDS.

In the Battle of Sexes:

           O        F
    O    2, 1     0, 0
    F    0, 0     1, 2

all strategies survive all rounds (i.e. IESDS ≡ anything may happen, sorry).

A Bit More Interesting Example

           L        C        R
    L    4, 3     5, 1     6, 2
    C    2, 1     8, 4     3, 6
    R    3, 0     9, 6     2, 8

IESDS for this game is worked out on the greenboard in the lecture.

Political Science Example: Median Voter Theorem

Hotelling (1929) and Downs (1957)

• $N = \{1, 2\}$
• $S_i = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$ (political and ideological spectrum)
• 10 voters belong to each position (here 10 means ten percent in the real world)
• Voters vote for the closest candidate. If there is a tie, then 1/2 of the votes goes to each candidate.
• Payoff: the number of voters for the candidate; each candidate (selfishly) strives to maximize this number.

Applying IESDS:

• 1 and 10 are the (only) strictly dominated strategies ⇒ $D_1^1 = D_2^1 = \{2, \ldots, 9\}$
• in $G_{DS}^1$, 2 and 9 are the (only) strictly dominated strategies ⇒ $D_1^2 = D_2^2 = \{3, \ldots, 8\}$
• ...
• only 5, 6 survive IESDS

Belief & Best Response

IESDS eliminates apparently unreasonable behavior (leaving "reasonable" behavior implicitly untouched).

What if we rather want to actively preserve reasonable behavior?

What is reasonable? ... what we believe is reasonable :-).

Intuition:
• Imagine that your colleague did something stupid.
• What would you ask him? Usually something like "What were you thinking?"
• The colleague may respond with a reasonable description of his belief in which his action was (one of) the best he could do. (You may of course question the reasonableness of the belief.)

Let us formalize this type of reasoning ...

Definition 11
A belief of player $i$ is a pure strategy profile $s_{-i} \in S_{-i}$ of his opponents.

Definition 12
A strategy $s_i \in S_i$ of player $i$ is a best response to a belief $s_{-i} \in S_{-i}$ if
$u_i(s_i, s_{-i}) \geq u_i(s_i', s_{-i})$ for all $s_i' \in S_i$.

Claim 3
A rational player who believes that his opponents will play $s_{-i} \in S_{-i}$ always chooses a best response to $s_{-i}$.

Definition 13
A strategy $s_i \in S_i$ is never best response if it is not a best response to any belief $s_{-i} \in S_{-i}$.
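Definitions 12 and 13 can be tried out on the 3x3 game from "A Bit More Interesting Example". The following is only a minimal sketch for two-player games with pure beliefs; the names `utility`, `best_responses` and `never_best_responses` are illustrative and do not come from the lecture.

```python
# Payoffs of the 3x3 example game: PAYOFFS[(row, col)] = (u1, u2).
# Player 0 picks the row, player 1 picks the column.
STRATEGIES = ["L", "C", "R"]
PAYOFFS = {
    ("L", "L"): (4, 3), ("L", "C"): (5, 1), ("L", "R"): (6, 2),
    ("C", "L"): (2, 1), ("C", "C"): (8, 4), ("C", "R"): (3, 6),
    ("R", "L"): (3, 0), ("R", "C"): (9, 6), ("R", "R"): (2, 8),
}


def utility(player, own, other):
    """Payoff of `player` (0 = row, 1 = column) playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def best_responses(player, belief):
    """Definition 12: all pure strategies maximizing the payoff against `belief`."""
    best = max(utility(player, s, belief) for s in STRATEGIES)
    return {s for s in STRATEGIES if utility(player, s, belief) == best}


def never_best_responses(player):
    """Definition 13: strategies that are a best response to no pure belief."""
    ever_best = set()
    for belief in STRATEGIES:
        ever_best |= best_responses(player, belief)
    return set(STRATEGIES) - ever_best


for player in (0, 1):
    print("player", player + 1, "never best responses:", never_best_responses(player))
```

In this game the middle strategy C is never a best response for either player, although for the row player it is not strictly dominated by any single pure strategy.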
A rational player never plays any strategy that is never best response.

Best Response vs Strict Dominance

Proposition 1
If $s_i$ is strictly dominated for player $i$, then it is never best response.

The converse does not have to be true in pure strategies:

           X        Y
    A    1, 1     1, 1
    B    2, 1     0, 1
    C    0, 1     2, 1

Here A is never best response but is strictly dominated neither by B nor by C.

Elimination of Stupid Strategies = Rationalizability

Using iterated reasoning similar to that for IESDS, strategies that are never best response can be iteratively eliminated.

Define a sequence $R_i^0, R_i^1, R_i^2, \ldots$ of strategy sets of player $i$. (Denote by $G_{Rat}^k$ the game obtained from $G$ by restricting to $R_i^k$, $i \in N$.)

1. Initialize $k = 0$ and $R_i^0 = S_i$ for each $i \in N$.
2. For all players $i \in N$: let $R_i^{k+1}$ be the set of all strategies of $R_i^k$ that are best responses to some beliefs in $G_{Rat}^k$.
3. Let $k := k + 1$ and go to 2.

We say that $s_i \in S_i$ is rationalizable if $s_i \in R_i^k$ for all $k = 0, 1, 2, \ldots$

Definition 14
A strategy profile $s = (s_1, \ldots, s_n) \in S$ is a rationalizable equilibrium if each $s_i$ is rationalizable.
We say that a game is solvable by rationalizability if it has a unique rationalizable equilibrium.

(Warning: For some reason, rationalizable strategies are almost always defined using mixed strategies!)

Rationalizability Examples

In the Prisoner’s dilemma:

           C           S
    C   −5, −5      0, −20
    S   −20, 0     −1, −1

(C, C) is the only rationalizable equilibrium.

In the Battle of Sexes:

           O        F
    O    2, 1     0, 0
    F    0, 0     1, 2

all strategies are rationalizable.

Cournot Duopoly

$G = (N, (S_i)_{i \in N}, (u_i)_{i \in N})$
• $N = \{1, 2\}$
• $S_i = [0, \infty)$
• $u_1(q_1, q_2) = q_1(\kappa - q_1 - q_2) - q_1 c_1 = (\kappa - c_1) q_1 - q_1^2 - q_1 q_2$
  $u_2(q_1, q_2) = q_2(\kappa - q_2 - q_1) - q_2 c_2 = (\kappa - c_2) q_2 - q_2^2 - q_2 q_1$

Assume for simplicity that $c_1 = c_2 = c$ and denote $\theta = \kappa - c$.

What is a best response of player 1 to a given $q_2$?

Solve $\partial u_1 / \partial q_1 = \theta - 2 q_1 - q_2 = 0$, which gives that $q_1 = (\theta - q_2)/2$ is the only best response of player 1 to $q_2$ (for $q_2 \leq \theta$; if $q_2 > \theta$, the constraint $q_1 \geq 0$ binds and the only best response is $q_1 = 0$).
Similarly, $q_2 = (\theta - q_1)/2$ is the only best response of player 2 to $q_1$.

Since $q_2 \geq 0$, we obtain that $q_1$ is never best response iff $q_1 > \theta/2$. Similarly, $q_2$ is never best response iff $q_2 > \theta/2$.

Thus $R_1^1 = R_2^1 = [0, \theta/2]$.

Now, in $G_{Rat}^1$, we still have that $q_1 = (\theta - q_2)/2$ is the best response to $q_2$, and $q_2 = (\theta - q_1)/2$ the best response to $q_1$.

Since $q_2 \in R_2^1 = [0, \theta/2]$, we obtain that $q_1$ is never best response iff $q_1 \in [0, \theta/4)$. Similarly, $q_2$ is never best response iff $q_2 \in [0, \theta/4)$.

Thus $R_1^2 = R_2^2 = [\theta/4, \theta/2]$.

...

Cournot Duopoly (cont.)

In general, after $2k$ iterations we have $R_1^{2k} = R_2^{2k} = [\ell_k, r_k]$, where
• $r_k = (\theta - \ell_{k-1})/2$ for $k \geq 1$
• $\ell_k = (\theta - r_k)/2$ for $k \geq 1$, and $\ell_0 = 0$

Solving the recurrence we obtain
• $\ell_k = \theta/3 - (1/4)^k \, \theta/3$
• $r_k = \theta/3 + (1/4)^{k-1} \, \theta/6$

Hence $\lim_{k \to \infty} \ell_k = \lim_{k \to \infty} r_k = \theta/3$, and thus $(\theta/3, \theta/3)$ is the only rationalizable equilibrium.

Are $q_i = \theta/3$ Pareto optimal? NO!

$u_1(\theta/3, \theta/3) = u_2(\theta/3, \theta/3) = \theta^2/9$

but

$u_1(\theta/4, \theta/4) = u_2(\theta/4, \theta/4) = \theta^2/8$

IESDS vs Rationalizability in Pure Strategies

Theorem 15
Assume that $S$ is finite. Then for all $k$ and all $i \in N$ we have that $R_i^k \subseteq D_i^k$. That is, in particular, all rationalizable strategies survive IESDS.
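The inclusion in Theorem 15 can be checked round by round on a small finite game. The sketch below assumes a two-player game, pure beliefs and domination by pure strategies only, as in this section. It uses the 3x3 game from "A Bit More Interesting Example"; the helper names (`undominated`, `best_replies`, `iterate`, `at`) are illustrative and not part of the lecture.

```python
# Payoffs of the 3x3 example game: PAYOFFS[(row, col)] = (u1, u2).
PAYOFFS = {
    ("L", "L"): (4, 3), ("L", "C"): (5, 1), ("L", "R"): (6, 2),
    ("C", "L"): (2, 1), ("C", "C"): (8, 4), ("C", "R"): (3, 6),
    ("R", "L"): (3, 0), ("R", "C"): (9, 6), ("R", "R"): (2, 8),
}


def utility(player, own, other):
    """Payoff of `player` (0 = row, 1 = column) playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def undominated(player, own_set, other_set):
    """IESDS step: strategies of own_set not strictly dominated in the restricted game."""
    def dominated(s):
        return any(
            all(utility(player, t, o) > utility(player, s, o) for o in other_set)
            for t in own_set if t != s
        )
    return {s for s in own_set if not dominated(s)}


def best_replies(player, own_set, other_set):
    """Rationalizability step: best responses to some pure belief in other_set."""
    result = set()
    for o in other_set:
        best = max(utility(player, s, o) for s in own_set)
        result |= {s for s in own_set if utility(player, s, o) == best}
    return result


def iterate(step, strategies):
    """Apply `step` to both players simultaneously until the sets stop changing."""
    rounds = [(set(strategies), set(strategies))]
    while True:
        rows, cols = rounds[-1]
        nxt = (step(0, rows, cols), step(1, cols, rows))
        if nxt == rounds[-1]:
            return rounds
        rounds.append(nxt)


def at(rounds, k):
    """Round k of a stabilized sequence (the last entry repeats forever)."""
    return rounds[min(k, len(rounds) - 1)]


D = iterate(undominated, "LCR")   # D[k] = (D_1^k, D_2^k)
R = iterate(best_replies, "LCR")  # R[k] = (R_1^k, R_2^k)

for k in range(max(len(D), len(R))):
    for i in (0, 1):
        assert at(R, k)[i] <= at(D, k)[i]  # R_i^k is a subset of D_i^k
    print(k, "D:", at(D, k), "R:", at(R, k))
```

Here both procedures end in the single profile (L, L), but rationalizability already removes the row strategy C in the first round, whereas IESDS removes it only in the second.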
The opposite inclusion does not have to be true in pure strategies:

           X        Y
    A    1, 1     1, 1
    B    2, 1     0, 1
    C    0, 1     2, 1

Recall that A is never best response but is strictly dominated by neither B nor C. That is, A survives IESDS but is not rationalizable.

Proof of Theorem 15

By induction on $k$. For $k = 0$ we have that $R_i^0 = S_i = D_i^0$ by definition.

Now assume that $R_i^k \subseteq D_i^k$ for some $k \geq 0$. We prove that $R_i^{k+1} \subseteq D_i^{k+1}$ by showing the following:

For all $s_i^* \in R_i^k \subseteq D_i^k$: if $s_i^* \notin D_i^{k+1}$, then $s_i^* \notin R_i^{k+1}$.

Let us fix $s_i^* \in R_i^k$ such that $s_i^* \notin D_i^{k+1}$. By definition, it suffices to prove that for every $s_{-i}^k \in R_{-i}^k$ there exists $s_i^k \in R_i^k$ such that

    $u_i(s_i^k, s_{-i}^k) > u_i(s_i^*, s_{-i}^k)$    (1)

(In words: for every possible behavior of the opponents of player $i$ in $G_{Rat}^k$, player $i$ has a strictly better strategy than $s_i^*$ in $G_{Rat}^k$.)

As $s_i^* \notin D_i^{k+1}$, the strategy $s_i^*$ must be strictly dominated in $G_{DS}^k$ by a strategy $\bar{s}_i$. That is, for all $s_{-i}^k \in D_{-i}^k \supseteq R_{-i}^k$ we have

    $u_i(\bar{s}_i, s_{-i}^k) > u_i(s_i^*, s_{-i}^k)$    (2)

(Now note that if $\bar{s}_i \in R_i^k \subseteq D_i^k$, then we are done. Indeed, it suffices to put $s_i^k := \bar{s}_i$ and inequality (1) will be satisfied for all $s_{-i}^k \in R_{-i}^k \subseteq D_{-i}^k$. However, it does not have to be the case that $\bar{s}_i \in R_i^k$.)

Clearly, there is $\ell \leq k$ such that $\bar{s}_i \in R_i^\ell$. (Note that $\bar{s}_i$ does not have to strictly dominate $s_i^*$ in $G_{Rat}^\ell$, since $R_{-i}^\ell$ may be larger than $D_{-i}^k$.)

Recall that we need to find $s_i^k \in R_i^k$ for every given $s_{-i}^k \in R_{-i}^k$ so that inequality (1) holds. (That is, $s_i^k$ may be different for different $s_{-i}^k$'s.)

Let us fix $s_{-i}^k \in R_{-i}^k \subseteq D_{-i}^k$. Let $s_i^k \in R_i^\ell$ be a strategy maximizing $u_i(s_i, s_{-i}^k)$ over all $s_i \in R_i^\ell$ (a maximizer exists because $S$ is finite). In particular, by (2) we obtain inequality (1):

    $u_i(s_i^k, s_{-i}^k) \geq u_i(\bar{s}_i, s_{-i}^k) > u_i(s_i^*, s_{-i}^k)$

Finally, note that $s_i^k \in R_i^k$ follows immediately from the fact that $s_i^k$ is a best response to $s_{-i}^k$ in all games $G_{Rat}^\ell, \ldots, G_{Rat}^k$. (Indeed, even after removing some strategies (other than $s_i^k$ and $s_{-i}^k$), $s_i^k$ remains a best response to $s_{-i}^k$.)

Pinning Down Beliefs – Nash Equilibria

Criticism of previous approaches:
• Strictly dominant strategy equilibria often do not exist
• IESDS and rationalizability may not remove any strategies

A typical example is the Battle of Sexes:

           O        F
    O    2, 1     0, 0
    F    0, 0     1, 2

Here all strategies are equally reasonable according to the above concepts.

But are all strategy profiles really equally reasonable?

Assume that each player has a belief about the strategies of the other players.

By Claim 3, each player plays a best response to his beliefs.

Is (O, F) as reasonable as (O, O) in this respect?

Note that if player 1 believes that player 2 plays O, then playing O is reasonable, and if player 2 believes that player 1 plays F, then playing F is reasonable. But such beliefs cannot be correct together!

(O, O) can be obtained as a profile where each player plays the best response to his belief and the beliefs are correct.

Nash Equilibrium

Nash equilibrium can be defined as a set of beliefs (one for each player) and a strategy profile in which every player plays a best response to his belief and each strategy of each player is consistent with the beliefs of his opponents.

The usual definition is as follows:

Definition 16
A pure-strategy profile $s^* = (s_1^*, \ldots, s_n^*) \in S$ is a (pure) Nash equilibrium if $s_i^*$ is a best response to $s_{-i}^*$ for each $i \in N$, that is,
$u_i(s_i^*, s_{-i}^*) \geq u_i(s_i, s_{-i}^*)$ for all $s_i \in S_i$ and all $i \in N$.

Note that this definition is equivalent to the previous one in the sense that $s_{-i}^*$ may be considered as the (consistent) belief of player $i$ to which he plays a best response $s_i^*$.

Nash Equilibria Examples

In the Prisoner’s dilemma:

           C           S
    C   −5, −5      0, −20
    S   −20, 0     −1, −1

(C, C) is the only Nash equilibrium.
In the Battle of Sexes:

           O        F
    O    2, 1     0, 0
    F    0, 0     1, 2

only (O, O) and (F, F) are Nash equilibria.

In Cournot Duopoly, $(\theta/3, \theta/3)$ is the only Nash equilibrium. (The best-response relations $q_1 = (\theta - q_2)/2$ and $q_2 = (\theta - q_1)/2$ are both satisfied only by $q_1 = q_2 = \theta/3$.)

Example: Stag Hunt

Story:
• Two (in some versions more than two) hunters, players 1 and 2, can each choose to hunt
  • stag (S) = a large tasty meal
  • hare (H) = also tasty but small
• Hunting stag is much more demanding and requires both players to join forces (hare can be hunted individually)

Strategy-form game model: $N = \{1, 2\}$, $S_1 = S_2 = \{S, H\}$, the payoffs:

           S        H
    S    5, 5     0, 3
    H    3, 0     3, 3

Two NE: (S, S) and (H, H), where the former Pareto dominates the latter! Which one is more reasonable?

If each player believes that the other one will go for hare, then (H, H) is a reasonable outcome ⇒ a society of individualists who do not cooperate at all.

If each player believes that the other will cooperate, then this anticipation is self-fulfilling and results in what can be called a cooperative society.

This is supposed to explain why in the real world there are societies that have similar endowments, access to technology and physical environment but have very different achievements, all because of self-fulfilling beliefs (or norms of behavior).

Another point of view: (H, H) is less risky.

The minimum secured by playing S is 0, as opposed to 3 by playing H. (We will get to this minimax principle later.)

So it seems to be rational to expect (H, H) (?)
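Both observations about the Stag Hunt can be reproduced by brute force: enumerate all pure profiles and test Definition 16, then compute the worst-case payoff of each strategy. This is only a sketch for this 2x2 game; `is_nash`, `pure_nash_equilibria` and `security_level` are illustrative names, not notation from the lecture.

```python
from itertools import product

# Stag Hunt payoffs: PAYOFFS[(row, col)] = (u1, u2).
STRATEGIES = ["S", "H"]
PAYOFFS = {
    ("S", "S"): (5, 5), ("S", "H"): (0, 3),
    ("H", "S"): (3, 0), ("H", "H"): (3, 3),
}


def utility(player, own, other):
    """Payoff of `player` (0 = row, 1 = column) playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def is_nash(profile):
    """Definition 16: no player gains by deviating unilaterally."""
    row, col = profile
    row_ok = all(utility(0, row, col) >= utility(0, d, col) for d in STRATEGIES)
    col_ok = all(utility(1, col, row) >= utility(1, d, row) for d in STRATEGIES)
    return row_ok and col_ok


def pure_nash_equilibria():
    """All pure-strategy profiles that pass the Nash test."""
    return [p for p in product(STRATEGIES, repeat=2) if is_nash(p)]


def security_level(player, own):
    """Minimum payoff `own` secures against any opponent strategy."""
    return min(utility(player, own, other) for other in STRATEGIES)


print(pure_nash_equilibria())
print({s: security_level(0, s) for s in STRATEGIES})
```

The enumeration is exponential in the number of players and is meant only for small illustrative games like the ones in these slides.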
Nash Equilibria vs Previous Concepts

Theorem 17
1. If $s^*$ is a strictly dominant strategy equilibrium, then it is the unique Nash equilibrium.
2. Each Nash equilibrium is rationalizable and survives IESDS.
3. If $S$ is finite, neither rationalizability nor IESDS creates new Nash equilibria.

Proof: Homework!

Corollary 18
Assume that $S$ is finite. If rationalizability or IESDS results in a unique strategy profile, then this profile is a Nash equilibrium.

Interpretations of Nash Equilibria

Besides the two definitions, the usual interpretations are as follows:

• When the goal is to give advice to all of the players in a game (i.e., to advise each player what strategy to choose), any advice that was not an equilibrium would have the unsettling property that there would always be some player for whom the advice was bad, in the sense that, if all other players followed the parts of the advice directed to them, it would be better for some player to do differently than he was advised. If the advice is an equilibrium, however, this will not be the case, because the advice to each player is the best response to the advice given to the other players.

• When the goal is prediction rather than prescription, a Nash equilibrium can also be interpreted as a potential stable point of a dynamic adjustment process in which individuals adjust their behavior to that of the other players in the game, searching for strategy choices that will give them better results.
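One concrete instance of such an adjustment process is simultaneous best-response dynamics in the Cournot duopoly. The lecture does not fix a particular process, so this choice is an assumption made for illustration, as are the value $\theta = 12$, the starting quantities and the function names.

```python
def best_response(q_other, theta):
    """Cournot best response (theta - q_other) / 2, clipped at 0 to stay in [0, inf).

    The clipping is an addition of this sketch; it only matters if q_other > theta.
    """
    return max(0.0, (theta - q_other) / 2)


def best_response_dynamics(q1, q2, theta, rounds=50):
    """Both firms repeatedly switch to a best response to the rival's last quantity."""
    for _ in range(rounds):
        q1, q2 = best_response(q2, theta), best_response(q1, theta)
    return q1, q2


theta = 12.0  # assumed value of theta = kappa - c
q1, q2 = best_response_dynamics(0.0, 10.0, theta)
print(q1, q2, "target:", theta / 3)
```

Under these assumptions each round halves the distance of both quantities from $\theta/3$, so the process settles at the Nash equilibrium $(\theta/3, \theta/3)$. Other adjustment processes, or other games, need not converge this way.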