Best Response

Definition 33
A strategy σi ∈ Σi of player i is a best response to a strategy profile σ−i ∈ Σ−i of his opponents if

    ui(σi, σ−i) ≥ ui(σ′i, σ−i)   for all σ′i ∈ Σi.

We denote by BRi(σ−i) ⊆ Σi the set of all best responses of player i to the strategy profile of opponents σ−i ∈ Σ−i.

92

Best Response – Example

Consider a game with the following payoffs of player 1:

         X   Y
    A    2   0
    B    0   2
    C    1   1

• Player 1 (row) plays σ1 = (a(A), b(B), c(C)).
• Player 2 (column) plays (q(X), (1 − q)(Y)) (we write just q).

Compute BR1(q).

93

Rationalizability in Mixed Strategies (Two Players)

For simplicity, we temporarily switch to the two-player setting N = {1, 2}.

Definition 34
A (mixed) belief of player i ∈ {1, 2} is a mixed strategy σ−i of his opponent.

(A general definition works with so-called correlated beliefs, which are arbitrary distributions on S−i; the notion of expected payoff then needs to be adjusted. We are not going in this direction.)

Assumption: Any rational player with a belief σ−i always plays a best response to σ−i.

Definition 35
A strategy σi ∈ Σi of player i ∈ {1, 2} is never a best response if it is not a best response to any belief σ−i.

No rational player plays a strategy that is never a best response.

94

Rationalizability in Mixed Strategies (Two Players)

Define a sequence R^0_i, R^1_i, R^2_i, . . . of strategy sets of player i.
(Denote by G^k_Rat the game obtained from G by restricting the pure strategy sets to R^k_i, i ∈ N.)

1. Initialize k = 0 and R^0_i = Si for each i ∈ N.
2. For all players i ∈ N: let R^{k+1}_i be the set of all strategies of R^k_i that are best responses to some (mixed) beliefs in G^k_Rat.
3. Let k := k + 1 and go to 2.

We say that si ∈ Si is rationalizable if si ∈ R^k_i for all k = 0, 1, 2, . . .

Definition 36
A strategy profile s = (s1, . . . , sn) ∈ S is a rationalizable equilibrium if each si is rationalizable.

95

Rationalizability vs IESDS (Two Players)

         X   Y
    A    3   0
    B    0   3
    C    1   1

• Player 1 (row) plays σ1 = (a(A), b(B), c(C)).
• Player 2 (column) plays (q(X), (1 − q)(Y)) (we write just q).

What strategies of player 1 are never best responses?
What strategies of player 1 are strictly dominated?

Observation: The set of strictly dominated strategies coincides with the set of never best responses!

... and this holds in general for two-player games:

Theorem 37
Assume N = {1, 2}. A pure strategy si is never a best response to any belief σ−i ∈ Σ−i iff si is strictly dominated by a strategy σi ∈ Σi.

It follows that a strategy of Si survives IESDS iff it is rationalizable.

(The theorem is true also for an arbitrary number of players, but correlated beliefs need to be used.)

96
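A quick numerical companion to the two examples above (my own illustration, not part of the slides): the Python sketch below lists player 1's pure best responses to a belief (q, 1 − q). The matrix layout and the helper name pure_best_responses are assumptions of the sketch.

```python
import numpy as np

ROWS = ["A", "B", "C"]

def pure_best_responses(U1, q):
    """Pure strategies of player 1 that are best responses to the belief (q(X), (1-q)(Y))."""
    expected = U1 @ np.array([q, 1.0 - q])          # expected payoffs of A, B, C
    return [ROWS[i] for i, v in enumerate(expected) if np.isclose(v, expected.max())]

U1_first  = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])   # example on the best-response slide
U1_second = np.array([[3.0, 0.0], [0.0, 3.0], [1.0, 1.0]])   # example on the IESDS slide

for q in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(q, pure_best_responses(U1_first, q), pure_best_responses(U1_second, q))
# In the first game C is a best response only at q = 1/2 (where A, B, C all yield 1);
# in the second game C is never a best response, matching the observation above.
```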
Mixed Nash Equilibrium

Definition 38
A mixed-strategy profile σ∗ = (σ∗1, . . . , σ∗n) ∈ Σ is a (mixed) Nash equilibrium if σ∗i is a best response to σ∗−i for each i ∈ N, that is,

    ui(σ∗i, σ∗−i) ≥ ui(σi, σ∗−i)   for all σi ∈ Σi and all i ∈ N.

An interpretation: each σ∗−i can be seen as a belief of player i against which he plays a best response σ∗i.

Given a mixed strategy profile of opponents σ−i ∈ Σ−i, we denote by BRi(σ−i) the set of all σi ∈ Σi that are best responses to σ−i. Then σ∗ is a Nash equilibrium iff σ∗i ∈ BRi(σ∗−i) for all i ∈ N.

Theorem 39 (Nash 1950)
Every finite game in strategic form has a Nash equilibrium.

This is THE fundamental theorem of game theory.

97

Example: Matching Pennies

         H        T
    H    1, −1    −1, 1
    T    −1, 1    1, −1

Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q).

Compute all Nash equilibria.

What are the expected payoffs of playing pure strategies for player 1?

    v1(H, q) = 2q − 1   and   v1(T, q) = 1 − 2q

Then v1(p, q) = p·v1(H, q) + (1 − p)·v1(T, q) = p(2q − 1) + (1 − p)(1 − 2q).

We obtain the best-response correspondence BR1:

    BR1(q):  p = 0         if q < 1/2
             p ∈ [0, 1]    if q = 1/2
             p = 1         if q > 1/2

98

Example: Matching Pennies

         H        T
    H    1, −1    −1, 1
    T    −1, 1    1, −1

Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q).

Compute all Nash equilibria.

Similarly for player 2:

    v2(p, H) = 1 − 2p   and   v2(p, T) = 2p − 1

We obtain the best-response correspondence BR2:

    BR2(p):  q = 1         if p < 1/2
             q ∈ [0, 1]    if p = 1/2
             q = 0         if p > 1/2

The only "intersection" of BR1 and BR2 is the only Nash equilibrium σ1 = σ2 = (1/2, 1/2).

99

Computing Mixed Nash Equilibria

Lemma 40
σ∗ = (σ∗1, . . . , σ∗n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds:
• For all i ∈ N and all si ∈ supp(σ∗i) we have ui(si, σ∗−i) = wi.
• For all i ∈ N and all si ∉ supp(σ∗i) we have ui(si, σ∗−i) ≤ wi.
Here, the right-hand side implies ui(σ∗) = wi.

Proof.
The fact that the right-hand side implies ui(σ∗) = wi follows immediately from Lemma 23:

    ui(σ∗) = ∑_{si∈Si} σ∗i(si)·ui(si, σ∗−i)
           = ∑_{si∈supp(σ∗i)} σ∗i(si)·ui(si, σ∗−i)
           = ∑_{si∈supp(σ∗i)} σ∗i(si)·wi
           = wi · ∑_{si∈supp(σ∗i)} σ∗i(si)
           = wi

100

Computing Mixed Nash Equilibria

Lemma 41
σ∗ = (σ∗1, . . . , σ∗n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds:
• For all i ∈ N and all si ∈ supp(σ∗i) we have ui(si, σ∗−i) = wi.
• For all i ∈ N and all si ∉ supp(σ∗i) we have ui(si, σ∗−i) ≤ wi.
Here, the right-hand side implies ui(σ∗) = wi.

Proof (cont.)
"⇐": Use the first equality of Lemma 23 to obtain, for every i ∈ N and every σ′i ∈ Σi,

    ui(σ′i, σ∗−i) = ∑_{si∈Si} σ′i(si)·ui(si, σ∗−i) ≤ ∑_{si∈Si} σ′i(si)·ui(σ∗) = ui(σ∗).

Thus σ∗ is a Nash equilibrium.

101

Computing Mixed Nash Equilibria

Lemma 42
σ∗ = (σ∗1, . . . , σ∗n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds:
• For all i ∈ N and all si ∈ supp(σ∗i) we have ui(si, σ∗−i) = wi.
• For all i ∈ N and all si ∉ supp(σ∗i) we have ui(si, σ∗−i) ≤ wi.
Here, the right-hand side implies ui(σ∗) = wi.

Proof (cont.)
Idea for "⇒": Let wi := ui(σ∗). Clearly, every i ∈ N and si ∈ Si satisfy ui(si, σ∗−i) ≤ ui(σ∗) = wi. By Corollary 24, there is at least one si ∈ supp(σ∗i) satisfying ui(si, σ∗−i) = ui(σ∗) = wi. Now if there is s′i ∈ supp(σ∗i) such that ui(s′i, σ∗−i) < ui(σ∗) (= ui(si, σ∗−i)), then increasing the probability σ∗i(si) and decreasing (in proportion) σ∗i(s′i) strictly increases ui, a contradiction with σ∗ being a NE.

102

Example: Matching Pennies

         H        T
    H    1, −1    −1, 1
    T    −1, 1    1, −1

Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q).

Compute all Nash equilibria.

There are no pure strategy equilibria.

There are no equilibria where only player 1 randomizes: Indeed, assume that (p, H) with p ∈ (0, 1) is such an equilibrium. Then by Lemma 42, 1 = u1(H, H) = u1(T, H) = −1, a contradiction. Also, (p, T) cannot be an equilibrium. Similarly, there is no NE where only player 2 randomizes.

103
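The two conditions of Lemma 42 can be checked mechanically. The following sketch (mine, not from the slides; the function name is_nash and the numerical tolerance are assumptions) tests whether a given mixed profile of a bimatrix game satisfies them, using Matching Pennies as the test case.

```python
import numpy as np

def is_nash(U1, U2, sigma1, sigma2, tol=1e-9):
    """Check the conditions of Lemma 42 for a two-player game in strategic form.

    U1, U2 are payoff matrices (rows = player 1's strategies, columns = player 2's);
    sigma1, sigma2 are mixed strategies given as probability vectors."""
    v1 = U1 @ sigma2            # expected payoff of each pure strategy of player 1
    v2 = sigma1 @ U2            # expected payoff of each pure strategy of player 2
    w1 = sigma1 @ v1            # u1(sigma), the candidate value w1
    w2 = v2 @ sigma2            # u2(sigma), the candidate value w2
    ok1 = all(abs(v1[k] - w1) < tol if sigma1[k] > 0 else v1[k] <= w1 + tol
              for k in range(len(sigma1)))
    ok2 = all(abs(v2[k] - w2) < tol if sigma2[k] > 0 else v2[k] <= w2 + tol
              for k in range(len(sigma2)))
    return ok1 and ok2

# Matching Pennies: only the uniform profile passes the test.
U1 = np.array([[1.0, -1.0], [-1.0, 1.0]])
U2 = -U1
print(is_nash(U1, U2, np.array([0.5, 0.5]), np.array([0.5, 0.5])))   # True
print(is_nash(U1, U2, np.array([1.0, 0.0]), np.array([0.5, 0.5])))   # False
```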
Example: Matching Pennies

         H        T
    H    1, −1    −1, 1
    T    −1, 1    1, −1

Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q).

Compute all Nash equilibria.

Assume that both players randomize, i.e., p, q ∈ (0, 1). The expected payoffs of playing pure strategies for player 1 are

    v1(H, q) = 2q − 1   and   v1(T, q) = 1 − 2q

and similarly for player 2:

    v2(p, H) = 1 − 2p   and   v2(p, T) = 2p − 1

By Lemma 42, Nash equilibria must satisfy

    2q − 1 = 1 − 2q   and   1 − 2p = 2p − 1.

That is, p = q = 1/2 is the only Nash equilibrium.

104

Example: Battle of Sexes

         O       F
    O    2, 1    0, 0
    F    0, 0    1, 2

Player 1 (row) plays (p(O), (1 − p)(F)) (we write just p) and player 2 (column) plays (q(O), (1 − q)(F)) (we write q).

Compute all Nash equilibria.

There are two pure strategy equilibria, (O, O) and (F, F) (with payoffs (2, 1) and (1, 2)), and no Nash equilibrium where only one player randomizes.

Now assume that
• player 1 (row) plays (p(O), (1 − p)(F)) (we write just p) and
• player 2 (column) plays (q(O), (1 − q)(F)) (we write q)
where p, q ∈ (0, 1). By Lemma 42, any such Nash equilibrium must satisfy

    2q = 1 − q   and   p = 2(1 − p).

This holds only for q = 1/3 and p = 2/3.

105

An Algorithm?

What did we do in the previous examples?

We went through all support combinations for both players (pure, one player mixing, both mixing).

For each pair of supports we tried to find equilibria in strategies with these supports. (In the Battle of Sexes: two pure equilibria, no equilibrium with just one player mixing, one equilibrium when both mix.)

Whenever one of the supports was non-singleton, we reduced the computation of Nash equilibria to linear equations.

106

Support Enumeration (Idea)

Recall Lemma 42: σ∗ = (σ∗1, . . . , σ∗n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds:
• For all i ∈ N and all si ∈ supp(σ∗i) we have ui(si, σ∗−i) = wi.
• For all i ∈ N and all si ∉ supp(σ∗i) we have ui(si, σ∗−i) ≤ wi.

Suppose that we somehow know the supports supp(σ∗1), . . . , supp(σ∗n) of some Nash equilibrium σ∗1, . . . , σ∗n (which itself is unknown to us).

Now we may consider all σ∗i(si)'s and all wi's as variables and use the above conditions to design a system of inequalities capturing Nash equilibria with the given support sets supp(σ∗1), . . . , supp(σ∗n).

107

Support Enumeration

To simplify notation, assume that for every i we have Si = {1, . . . , mi}. Then σi(j) is the probability of the pure strategy j in the mixed strategy σi.

Fix supports suppi ⊆ Si for every i ∈ N and consider the following system of constraints with variables σ1(1), . . . , σ1(m1), . . . , σn(1), . . . , σn(mn), w1, . . . , wn:

1. For all i ∈ N and all k ∈ suppi we have

       ( ui(k, σ−i) = )   ∑_{s∈S, si=k} ( ∏_{j≠i} σj(sj) ) · ui(s) = wi

2. For all i ∈ N and all k ∉ suppi we have

       ( ui(k, σ−i) = )   ∑_{s∈S, si=k} ( ∏_{j≠i} σj(sj) ) · ui(s) ≤ wi

3. For all i ∈ N: σi(1) + · · · + σi(mi) = 1.
4. For all i ∈ N and all k ∈ suppi: σi(k) ≥ 0.
5. For all i ∈ N and all k ∉ suppi: σi(k) = 0.

108

Support Enumeration

Consider the system of constraints from the previous slide. The following lemma follows immediately from Lemma 42.

Lemma 43
Let σ∗ ∈ Σ be a strategy profile.
• If σ∗ is a Nash equilibrium and supp(σ∗i) = suppi for all i ∈ N, then assigning σi(k) := σ∗i(k) and wi := ui(σ∗) solves the system.
• If σi(k) := σ∗i(k) and wi := ui(σ∗) solves the system, then σ∗ is a Nash equilibrium with supp(σ∗i) ⊆ suppi for all i ∈ N.

109
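With two players and both supports guessed to be full, the system above collapses to two small linear systems: player 1's indifference (constraint 1.) together with constraint 3. determines σ2 and w1, and symmetrically player 2's indifference determines σ1 and w2. A minimal sketch for the Battle of Sexes (my own variable names, strategies ordered O, F; not part of the slides):

```python
import numpy as np

# Battle of Sexes payoffs (rows and columns ordered O, F).
U1 = np.array([[2.0, 0.0], [0.0, 1.0]])
U2 = np.array([[1.0, 0.0], [0.0, 2.0]])

# Player 1 must be indifferent on his full support: U1 @ sigma2 = w1 (componentwise),
# together with sum(sigma2) = 1; this is linear in (sigma2, w1).
A = np.block([[U1,              -np.ones((2, 1))],
              [np.ones((1, 2)),  np.zeros((1, 1))]])
sol = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))
sigma2, w1 = sol[:2], sol[2]

# Symmetrically, player 2's indifference U2^T @ sigma1 = w2 determines sigma1 and w2.
B = np.block([[U2.T,             -np.ones((2, 1))],
              [np.ones((1, 2)),   np.zeros((1, 1))]])
sol = np.linalg.solve(B, np.array([0.0, 0.0, 1.0]))
sigma1, w2 = sol[:2], sol[2]

print(sigma1, sigma2)   # (2/3, 1/3) for player 1 and (1/3, 2/3) for player 2, i.e. p = 2/3, q = 1/3
```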
Support Enumeration (Two Players)

The constraints are non-linear in general, but linear for two-player games! Let us stick to two players.

How to find supp1 and supp2? ... Just guess!

Input: A two-player strategic-form game G with strategy sets S1 = {1, . . . , m1} and S2 = {1, . . . , m2} and rational payoffs u1, u2.
Output: A Nash equilibrium σ∗.

Algorithm: For all possible supp1 ⊆ S1 and supp2 ⊆ S2:
• Check whether the corresponding system of linear constraints (from the previous slide) has a feasible solution σ∗, w∗1, w∗2.
• If so, STOP: the feasible solution σ∗ is a Nash equilibrium satisfying ui(σ∗) = w∗i.

Question: How many possible subsets supp1, supp2 are there to try?
Answer: 2^(m1+m2). So, unfortunately, the algorithm requires worst-case exponential time.

110

Remarks on Support Enumeration

• The algorithm combined with Theorem 39 and properties of linear programming implies that every finite two-player game has a rational Nash equilibrium (furthermore, the rational numbers have a polynomial representation in binary).
• The algorithm can be used to compute all Nash equilibria. (There are algorithms for computing (a finite representation of) the set of all feasible solutions of a given linear constraint system.)
• The algorithm can be used to compute "good" equilibria. For example, to find a Nash equilibrium maximizing the sum of all expected payoffs (the "social welfare") it suffices to solve the system of constraints while maximizing w1 + · · · + wn. More precisely, the algorithm can be modified as follows:
  – Initialize W := −∞ (W stores the current maximum welfare).
  – For all possible supp1 ⊆ S1 and supp2 ⊆ S2:
    – Find the maximum value max(∑ wi) of w1 + · · · + wn subject to the constraints (using linear programming).
    – Put W := max{W, max(∑ wi)}.
  – Return W.

111

Remarks on Support Enumeration (Cont.)

A similar trick works for any notion of a "good" NE that can be expressed using a linear objective function and (additional) linear constraints in the variables σi(j) and wi. (E.g., maximize the payoff of player 1, minimize the payoff of player 2 while keeping the probability of playing strategy 1 below 1/2, etc.)

112
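A compact prototype of the support-enumeration algorithm described above, using scipy's LP solver as the feasibility oracle. The structure follows the slides; the Python wrapping, the function names, and the solver choice ("highs") are my own assumptions, and the sketch is an illustration rather than an efficient implementation.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def nonempty_subsets(m):
    """All non-empty subsets of {0, ..., m-1}, smallest first."""
    return itertools.chain.from_iterable(
        itertools.combinations(range(m), r) for r in range(1, m + 1))

def support_enumeration(U1, U2):
    """Guess support pairs and solve the linear feasibility system of Lemma 43.

    LP variables: sigma1 (m1 entries), sigma2 (m2 entries), w1, w2.
    Returns the first Nash equilibrium found as (sigma1, sigma2)."""
    m1, m2 = U1.shape
    n = m1 + m2 + 2                       # w1 sits at index m1+m2, w2 at m1+m2+1
    for supp1 in nonempty_subsets(m1):
        for supp2 in nonempty_subsets(m2):
            A_eq, b_eq, A_ub, b_ub = [], [], [], []
            for k in range(m1):           # u1(k, sigma2) = w1 on supp1, <= w1 elsewhere
                row = np.zeros(n)
                row[m1:m1 + m2] = U1[k, :]
                row[m1 + m2] = -1.0
                (A_eq if k in supp1 else A_ub).append(row)
                (b_eq if k in supp1 else b_ub).append(0.0)
            for l in range(m2):           # u2(sigma1, l) = w2 on supp2, <= w2 elsewhere
                row = np.zeros(n)
                row[:m1] = U2[:, l]
                row[m1 + m2 + 1] = -1.0
                (A_eq if l in supp2 else A_ub).append(row)
                (b_eq if l in supp2 else b_ub).append(0.0)
            for lo, hi in [(0, m1), (m1, m1 + m2)]:   # probabilities sum to one
                row = np.zeros(n)
                row[lo:hi] = 1.0
                A_eq.append(row)
                b_eq.append(1.0)
            bounds = ([(0, 1) if k in supp1 else (0, 0) for k in range(m1)]
                      + [(0, 1) if l in supp2 else (0, 0) for l in range(m2)]
                      + [(None, None), (None, None)])
            res = linprog(np.zeros(n),
                          A_ub=np.array(A_ub) if A_ub else None,
                          b_ub=np.array(b_ub) if b_ub else None,
                          A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                          bounds=bounds, method="highs")
            if res.success:
                return res.x[:m1], res.x[m1:m1 + m2]
    return None                            # unreachable for finite games (Theorem 39)

# Battle of Sexes: the first support pair tried already yields the pure equilibrium (O, O).
U1 = np.array([[2.0, 0.0], [0.0, 1.0]])
U2 = np.array([[1.0, 0.0], [0.0, 2.0]])
print(support_enumeration(U1, U2))
```

Solving with a zero objective turns each LP into a pure feasibility check, which is exactly what Lemma 43 requires; replacing the objective by w1 + w2 gives the welfare-maximizing variant sketched above.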
Complexity Results (Two Players)

Theorem 44
All the following problems are NP-complete: Given a two-player game in strategic form, does it have
1. a NE in which player 1 has utility at least a given amount v?
2. a NE in which the sum of expected payoffs of the two players is at least a given amount v?
3. a NE with a support of size greater than a given number?
4. a NE whose support contains a given strategy s?
5. a NE whose support does not contain a given strategy s?
6. ...

Membership in NP follows from support enumeration: For example, for 1., it suffices to guess supports supp1, supp2 and add w1 ≥ v to the constraints; the resulting NE σ∗ satisfies u1(σ∗) ≥ v.

113

Complexity Results (Two Players)

Theorem 45
All the following problems are NP-complete: Given a two-player game in strategic form, does it have
1. a NE in which player 1 has utility at least a given amount v?
2. a NE in which the sum of expected payoffs of the two players is at least a given amount v?
3. a NE with a support of size greater than a given number?
4. a NE whose support contains a given strategy s?
5. a NE whose support does not contain a given strategy s?
6. ...

NP-hardness can be proved using a reduction from SAT. (The reduction is not difficult, but we are not going into it. It is presented in "New Complexity Results about Nash Equilibria" by V. Conitzer and T. Sandholm, pages 6–8.)

114

The Reduction (It's Short and Sweet)

115

... But What is the Exact Complexity of Computing Nash Equilibria in Two-Player Games?

Let us concentrate on the problem of computing one Nash equilibrium (sometimes called the sample equilibrium problem).

As the class NP consists of decision problems, it cannot be directly used to characterize the complexity of the sample equilibrium problem. We use complexity classes of function problems such as FP, FNP, etc.

The support enumeration gives a deterministic algorithm which runs in exponential time. Can we do better?

In what follows we show that
• the sample equilibrium problem can be solved in polynomial time for zero-sum two-player games (using a beautiful characterization of all Nash equilibria),
• the sample equilibrium problem belongs to the complexity class PPAD (a subclass of FNP, to be defined later) for two-player games.

116

MaxMin

Is there a better characterization of Nash equilibria than Lemma 42?

Definition 46
σ∗i ∈ Σi is a maxmin strategy of player i if

    σ∗i ∈ argmax_{σi∈Σi} min_{σ−i∈Σ−i} ui(σi, σ−i)

(Intuitively, a maxmin strategy σ∗1 maximizes player 1's worst-case payoff in the situation where player 2 strives to cause the greatest harm to player 1.)

(Since ui is continuous and Σ−i compact, min_{σ−i∈Σ−i} ui(σi, σ−i) is well defined and continuous on Σi, which implies that there is at least one maxmin strategy.)

117

MaxMin

Lemma 47
σ∗i is maxmin iff

    σ∗i ∈ argmax_{σi∈Σi} min_{s−i∈S−i} ui(σi, s−i)

Proof.
By Corollary 24, for every σ ∈ Σ we have ui(σi, σ−i) ≥ ui(σi, s−i) for some s−i ∈ S−i. Thus min_{σ−i∈Σ−i} ui(σi, σ−i) = min_{s−i∈S−i} ui(σi, s−i). Hence

    argmax_{σi∈Σi} min_{σ−i∈Σ−i} ui(σi, σ−i) = argmax_{σi∈Σi} min_{s−i∈S−i} ui(σi, s−i)   □

Question: Suppose both players play their maxmin strategies. Does the resulting profile have to be a Nash equilibrium?

118

Zero-Sum Games: von Neumann's Theorem

Assume that G is zero-sum, i.e., u1 = −u2. Then σ∗2 ∈ Σ2 is a maxmin strategy of player 2 iff

    σ∗2 ∈ argmin_{σ2∈Σ2} max_{σ1∈Σ1} u1(σ1, σ2)   ( = argmin_{σ2∈Σ2} max_{s1∈S1} u1(s1, σ2) )

(Intuitively, a maxmin strategy of player 2 minimizes the payoff of player 1 when player 1 plays his best responses. Such a strategy of player 2 is often called minmax.)

Theorem 48 (von Neumann)
Assume a two-player zero-sum game. Then

    max_{σ1∈Σ1} min_{σ2∈Σ2} u1(σ1, σ2) = min_{σ2∈Σ2} max_{σ1∈Σ1} u1(σ1, σ2)

Moreover, σ∗ = (σ∗1, σ∗2) ∈ Σ is a Nash equilibrium iff both σ∗1 and σ∗2 are maxmin.

So to compute a Nash equilibrium it suffices to compute (arbitrary) maxmin strategies for both players.

119

Proof of Theorem 48 (Homework)

Homework: Prove von Neumann's Theorem in 4 easy steps:

1. Prove the inequality

       max_{σ1∈Σ1} min_{σ2∈Σ2} u1(σ1, σ2) ≤ min_{σ2∈Σ2} max_{σ1∈Σ1} u1(σ1, σ2)

2. Prove that (σ∗1, σ∗2) is a Nash equilibrium iff

       min_{σ2∈Σ2} u1(σ∗1, σ2) ≥ u1(σ∗1, σ∗2) ≥ max_{σ1∈Σ1} u1(σ1, σ∗2)

   Hint: One of the inequalities is trivial and the other one almost.

3. Use 1. and 2. together with Theorem 39 to prove

       max_{σ1∈Σ1} min_{σ2∈Σ2} u1(σ1, σ2) ≥ min_{σ2∈Σ2} max_{σ1∈Σ1} u1(σ1, σ2)

4. Use the above to prove the rest of the theorem.
   Hint: Use the characterization of NE from 2.; do not forget that you already have max_{σ1∈Σ1} min_{σ2∈Σ2} u1(σ1, σ2) = min_{σ2∈Σ2} max_{σ1∈Σ1} u1(σ1, σ2). You may already have proved one of the implications when proving 3.

120

Zero-Sum Two-Player Games – Computing NE

Assume S1 = {1, . . . , m1} and S2 = {1, . . . , m2}. We want to compute

    σ∗1 ∈ argmax_{σ1∈Σ1} min_{ℓ∈S2} u1(σ1, ℓ)

Consider a linear program with variables σ1(1), . . . , σ1(m1), v:

    maximize:    v
    subject to:  ∑_{k=1}^{m1} σ1(k) · u1(k, ℓ) ≥ v     for ℓ = 1, . . . , m2
                 ∑_{k=1}^{m1} σ1(k) = 1
                 σ1(k) ≥ 0                              for k = 1, . . . , m1

Lemma 49
σ∗1 ∈ argmax_{σ1∈Σ1} min_{ℓ∈S2} u1(σ1, ℓ) iff assigning σ1(k) := σ∗1(k) and v := min_{ℓ∈S2} u1(σ∗1, ℓ) gives an optimal solution.

121
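The linear program above translates almost directly into code. A small sketch (mine, not part of the slides; since linprog minimizes, the objective is −v) computing a maxmin strategy of player 1 for Matching Pennies:

```python
import numpy as np
from scipy.optimize import linprog

def maxmin_strategy(U1):
    """Solve the LP above: maximize v subject to sigma1 . U1[:, l] >= v for every column l.

    LP variables are (sigma1(1), ..., sigma1(m1), v)."""
    m1, m2 = U1.shape
    c = np.zeros(m1 + 1)
    c[-1] = -1.0                                   # maximize v  <=>  minimize -v
    # sum_k sigma1(k) * u1(k, l) >= v   <=>   -U1[:, l] . sigma1 + v <= 0, for every l
    A_ub = np.hstack([-U1.T, np.ones((m2, 1))])
    b_ub = np.zeros(m2)
    A_eq = np.hstack([np.ones((1, m1)), np.zeros((1, 1))])   # probabilities sum to 1
    b_eq = np.array([1.0])
    bounds = [(0, 1)] * m1 + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[:m1], res.x[-1]                   # (maxmin strategy, optimal value v)

# Matching Pennies: the maxmin strategy is (1/2, 1/2) and the value is 0.
U1 = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(maxmin_strategy(U1))
```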
Zero-Sum Two-Player Games – Computing NE

Summary:
• We have reduced the computation of NE to the computation of maxmin strategies for both players.
• Maxmin strategies can be computed using linear programming in polynomial time.
• That is, Nash equilibria of zero-sum two-player games can be computed in polynomial time.

122

IESDS vs Rationalizability Revisited

We get Theorem 37 as a simple corollary of Theorem 48. Let s∗1 be a strategy of player 1. Consider a zero-sum game G′ = ({1, 2}, (S′1, S′2), (u′1, u′2)) where
• S′1 = S1 \ {s∗1} and S′2 = S2,
• u′1(s1, s2) = u1(s1, s2) − u1(s∗1, s2) and u′2(s1, s2) = u1(s∗1, s2) − u1(s1, s2).

Now s∗1 is never a best response in G
  iff for every σ2 ∈ Σ2 there exists σ1 ∈ Σ1 with u1(σ1, σ2) − u1(s∗1, σ2) > 0
  iff for every σ2 ∈ Σ2 there exists s1 ∈ S1 with u1(s1, σ2) − u1(s∗1, σ2) > 0 (such an s1 is necessarily different from s∗1, so s1 ∈ S′1)
  iff min_{σ2∈Σ2} max_{s1∈S′1} u′1(s1, σ2) > 0
  iff min_{σ2∈Σ2} max_{σ1∈Σ′1} u′1(σ1, σ2) > 0
  iff max_{σ1∈Σ′1} min_{σ2∈Σ2} u′1(σ1, σ2) > 0     (by Theorem 48 applied to G′)
  iff there is σ1 ∈ Σ′1 such that for all σ2 ∈ Σ2 we have 0 < u′1(σ1, σ2) = u1(σ1, σ2) − u1(s∗1, σ2)
  iff s∗1 is strictly dominated (by this σ1) in G.

123
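A tiny sanity check of this construction (my own, not part of the slides): take the 3×2 example from the Rationalizability vs IESDS slide and s∗1 = C. The mixture 1/2 A + 1/2 B has a strictly positive margin against every column of the auxiliary game G′, so C is strictly dominated in G and hence never a best response, as Theorem 37 predicts.

```python
import numpy as np

# Player 1's payoffs in the 3x2 example (rows A, B, C; columns X, Y).
U1 = np.array([[3.0, 0.0],
               [0.0, 3.0],
               [1.0, 1.0]])

# Auxiliary zero-sum payoffs of the construction for s1* = C:
# u1'(s1, s2) = u1(s1, s2) - u1(C, s2), with player 1 restricted to S1 \ {C} = {A, B}.
U1_aux = U1[:2, :] - U1[2, :]

# The mixture 1/2 A + 1/2 B (found, e.g., by the maxmin LP of the previous slides)
# gives a strictly positive margin against both columns, witnessing strict domination of C.
sigma1 = np.array([0.5, 0.5])
print(sigma1 @ U1_aux)        # [0.5 0.5]; both entries are positive
```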