3 Decision Theory

3.1 Utility Functions

Suppose the utility of having x units of a good is u(x). It is normally assumed that u is a concave, increasing function with u(0) = 0, i.e. the utility from having 2x Euros is greater than the utility of having x Euros, but not more than twice as great. It is assumed that individuals act so as to maximise their (expected) utility.

Utility functions

Utility functions for the acquisition of n goods can be defined in a similar way. Suppose x = (x_1, x_2, ..., x_n) defines a bundle of goods, i.e. an individual has x_i units of good i. The utility from having such a bundle is defined to be u(x), where u(0) = 0, ∂u/∂x_i > 0 and ∂²u/(∂x_i ∂x_j) ≤ 0.

Utility functions and optimality in deterministic problems

From the assumptions made, if a deterministic problem only involves the acquisition of one good (commonly money in the form of profit), maximising profit will automatically maximise utility. Hence, the form of the utility function will not have any effect on the optimal solution of such a problem.

However, if a problem involves the acquisition of two goods, then the form of the utility function will in general have an effect on the optimal solution. For example, when allowed to choose five pieces of fruit, some individuals may prefer to take 3 apples and 2 oranges, while others prefer 2 apples and 3 oranges. However, taking 2 apples and 2 oranges would clearly be a sub-optimal solution.

Concavity of utility functions

The assumption that a utility function is concave seems reasonable for the following reason: the utility I gain by obtaining an apple (the marginal utility of an apple) when I have no apples is greater than the utility I gain by obtaining an apple when I already have a large number of apples. It should be noted that
a) Individuals have different utility functions.
b) A person's utility function depends on his/her present circumstances.

Utility functions and optimality in probabilistic problems

In probabilistic models we maximise expected utility. In general, the optimal solution depends on the form of this utility function. The concept of utility is related to Jensen's inequalities.

Jensen's Inequalities
If f is a concave function, g is a convex function and X is a random variable, then
E[f(X)] ≤ f[E(X)]
E[g(X)] ≥ g[E(X)]

Risk neutral individuals

Suppose u(x) is an individual's utility function for obtaining x units of money. If u''(x) = 0 for all x (i.e. u(x) is linear), then that individual is said to be risk neutral. When presented with the choice between a fixed sum of money k and a lottery in which the expected prize is k, such an individual is indifferent between these choices.

Risk averse individuals

If a person has a utility function that is strictly concave, then that individual is said to be risk averse. When presented with the choice between a fixed sum of money k and a lottery in which the expected prize is k, such an individual would choose the fixed sum. This follows from the fact that if u is a strictly concave function, E[u(X)] < u[E(X)]. The first expression is the expected utility from playing the lottery; the second is the utility from choosing the fixed sum.
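The inequality E[u(X)] < u[E(X)] for a strictly concave u, and the reverse inequality for a convex function, can be checked numerically. The following is a minimal Python sketch using a hypothetical two-outcome lottery (prizes 0 and 100, each with probability 1/2) and √x and x² as illustrative concave and convex functions; the lottery and the function choices are assumptions made for illustration only.

```python
import math

# A hypothetical two-outcome lottery: prizes 0 or 100, each with probability 1/2.
outcomes = [0.0, 100.0]
probs = [0.5, 0.5]

def expectation(f, outcomes, probs):
    """E[f(X)] for a discrete random variable X."""
    return sum(p * f(x) for x, p in zip(outcomes, probs))

mean = expectation(lambda x: x, outcomes, probs)   # E(X) = 50
concave = math.sqrt                                # illustrative concave utility
convex = lambda x: x ** 2                          # illustrative convex function

print(expectation(concave, outcomes, probs), "<=", concave(mean))  # 5.0 <= 7.07...
print(expectation(convex, outcomes, probs), ">=", convex(mean))    # 5000.0 >= 2500.0
```

The output shows E[√X] below √E(X) and E[X²] above (E(X))², which is exactly the pattern behind the distinction between risk-averse and risk-seeking behaviour in the example that follows.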
Example 3.1

Consider the following situation. A contestant in a game show must choose between
a) A guaranteed reward of 125 000 Euros.
b) Taking one of two bags. One of the bags is empty, the other contains 250 000 Euros.
It can be seen that by taking option b), the expected amount of money obtained by the player is also 125 000 Euros.

We consider 3 utility functions describing the utility of obtaining x Euros:
u1(x) = x;  u2(x) = √x;  u3(x) = x².

Example 3.1 - Risk neutral individual

If an individual has a utility function given by u1(x) = x, then by taking option a), he/she obtains an expected utility of 125 000. By taking option b), he/she obtains an expected utility of
E[u1(X)] = (1/2) × 0 + (1/2) × 250 000 = 125 000.
Hence, this player is indifferent between choosing a guaranteed amount of money and a lottery which gives the same expected amount of money (i.e. is risk neutral).

Example 3.1 - Risk averse individual

If an individual has a utility function given by u2(x) = √x, then by taking option a), he/she obtains an expected utility of √125 000 ≈ 353.55. By taking option b), he/she obtains an expected utility of
E[u2(X)] = (1/2) × 0 + (1/2) × √250 000 = 250.
Hence, this player would rather choose a guaranteed amount of money than a lottery which gives the same expected amount of money (i.e. is risk averse).

We can calculate the guaranteed amount x1 for which the player is indifferent between that amount and the lottery as follows:
u2(x1) = √x1 = 250 ⇒ x1 = 62 500.

Example 3.1 - Risk seeking individual

If an individual has a utility function given by u3(x) = x², then by taking option a), he/she obtains an expected utility of 125 000² = 1.5625 × 10¹⁰. By taking option b), he/she obtains an expected utility of
E[u3(X)] = (1/2) × 0 + (1/2) × 250 000² = 3.125 × 10¹⁰.
Hence, this player would rather choose a lottery than a guaranteed amount of money, if the lottery gives the same expected amount of money (i.e. is risk seeking).
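As a quick check of these figures, the following Python sketch reproduces the three expected utilities and the certainty equivalent x1 = 62 500 for the risk-averse player; the variable names are illustrative, while the prizes, probabilities and utility functions are those of Example 3.1.

```python
import math

# Example 3.1: an empty bag or 250 000 Euros, each with probability 1/2.
prizes = [0.0, 250_000.0]
probs = [0.5, 0.5]
guaranteed = 125_000.0

utilities = {
    "u1 (risk neutral)": lambda x: x,
    "u2 (risk averse)":  math.sqrt,
    "u3 (risk seeking)": lambda x: x ** 2,
}

for name, u in utilities.items():
    eu_lottery = sum(p * u(x) for x, p in zip(prizes, probs))
    print(name, "lottery:", eu_lottery, "guaranteed:", u(guaranteed))

# Certainty equivalent under u2: the guaranteed sum x1 with sqrt(x1) = 250.
print("certainty equivalent under u2:", 250.0 ** 2)   # 62500.0
```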
Decision Analysis - Deterministic models

When making a decision we must often take various criteria into account. For example, when I want to buy an airline ticket I do not just take the price of the ticket into account. Other factors I take into account may be the time of the flight and the distance of the airports from a) my home and b) my destination, etc.

In order to decide which ticket to buy, I may ascribe
i) various weights to these factors;
ii) scores describing the attractiveness of each option with respect to price, flight times and location.
Such an approach is commonly used to decide which bid a public sector company should accept.

I choose the option that maximises my utility, which is defined to be the appropriate weighted average of these scores. In mathematical terms, suppose there are k factors. Let w_i be the weight associated with factor i and s_{i,j} the score of option j according to factor i (i.e. how good option j is according to factor i). It is normally assumed that ∑_{i=1}^k w_i = 1.

I choose the option j which maximises the utility of option j, u_j, which is taken to be the weighted average of the scores, i.e.
u_j = ∑_{i=1}^k w_i s_{i,j}.
Hence, a mathematical model of such a problem is defined by
a) The set of options.
b) The set of factors influencing the decision.
c) Weights for each of these factors.
d) Scores describing how attractive each option is according to each factor.

Example 3.2

Suppose I am travelling to Barcelona and I consider two possibilities. I can fly using a cheap airline from Shannon to Girona (about 70 km from Barcelona), with one of the flights being at a very early time, or I can fly using a more expensive airline from Shannon to Barcelona, where the times of the flights are convenient.

The three factors I consider are price (factor 1), location (factor 2) and time (factor 3). I ascribe the weights w_1 = 0.5, w_2 = 0.3 and w_3 = 0.2 to these factors, respectively. The budget airline scores s_{1,1} = 90 according to price, s_{2,1} = 40 according to location and s_{3,1} = 60 according to time. The other airline scores s_{1,2} = 70 according to price, s_{2,2} = 90 according to location and s_{3,2} = 90 according to time.

Example 3.2 - Utility of the options

Hence,
u_1 = 0.5 × 90 + 0.3 × 40 + 0.2 × 60 = 69
u_2 = 0.5 × 70 + 0.3 × 90 + 0.2 × 90 = 80.
It follows that I should choose the flight from Shannon to Barcelona.

Multi-person decision processes

When more than one person is involved in the decision process, we may define a composite utility as a weighted average of the utilities of the alternatives to each of the decision makers. Suppose Adam and Betty want to meet in one of two restaurants. They take two factors into account: a) the location of a restaurant and b) the attractiveness of a restaurant.

In this case we need to define
a) Weights describing the importance of each person in the decision process (say p and q here).
b) The weights Adam gives to the importance of location, p_1, and attractiveness, p_2, and the scores he gives to restaurant j according to these criteria, r_{1,j} and r_{2,j}.
c) The weights Betty gives to the importance of location, q_1, and attractiveness, q_2, and the scores she gives to restaurant j according to these criteria, s_{1,j} and s_{2,j}.
These scores must be made on the same scale.

Under these assumptions we can define the utility of the choice of restaurant j to both Adam, u_{A,j}, and Betty, u_{B,j}. These are simply the weighted averages of the scores each one gives for location and attractiveness. We have
u_{A,j} = ∑_{i=1}^2 p_i r_{i,j}
u_{B,j} = ∑_{i=1}^2 q_i s_{i,j}.

The composite utility is the weighted average of these utilities (weighted according to the importance of the decision makers). In this case, the composite utility gained from choosing the j-th restaurant is u_j, where
u_j = p u_{A,j} + q u_{B,j}.
The decision makers should choose the restaurant which maximises the composite utility.

For example, suppose the decision makers are of equal importance, i.e. p = q = 0.5. Adam's weights for the importance of location and attractiveness are 0.7 and 0.3. Betty's weights are 0.4 and 0.6, respectively.

They must choose between 2 restaurants. Adam assigns a score of 70 (out of 100) for location and 50 for attractiveness to Restaurant 1, and a score of 40 for location and 90 for attractiveness to Restaurant 2. The analogous scores given by Betty are 50 and 60 (to Restaurant 1) and 30 and 80 (to Restaurant 2).

The utilities to Adam of these restaurants are
u_{A,1} = 0.7 × 70 + 0.3 × 50 = 64
u_{A,2} = 0.7 × 40 + 0.3 × 90 = 55.
The utilities to Betty of these restaurants are
u_{B,1} = 0.4 × 50 + 0.6 × 60 = 56
u_{B,2} = 0.4 × 30 + 0.6 × 80 = 60.

Hence, the composite utilities of these choices are given by
u_1 = 0.5 × 64 + 0.5 × 56 = 60
u_2 = 0.5 × 55 + 0.5 × 60 = 57.5.
It follows that Restaurant 1 should be chosen.
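The weighted-average calculations in Example 3.2 and in the restaurant example can be reproduced with a few lines of Python. The sketch below is illustrative: the helper name weighted_score is my own, while the weights and scores are those given above.

```python
def weighted_score(weights, scores):
    """u = sum_i w_i * s_i for a single option."""
    return sum(w * s for w, s in zip(weights, scores))

# Example 3.2: factors are price, location, time.
w = [0.5, 0.3, 0.2]
budget_airline = weighted_score(w, [90, 40, 60])   # 69.0
other_airline = weighted_score(w, [70, 90, 90])    # 80.0
print(budget_airline, other_airline)

# Multi-person example: composite utility with person weights p = q = 0.5.
adam_w, betty_w = [0.7, 0.3], [0.4, 0.6]
for name, adam_scores, betty_scores in [
    ("Restaurant 1", [70, 50], [50, 60]),
    ("Restaurant 2", [40, 90], [30, 80]),
]:
    u_a = weighted_score(adam_w, adam_scores)   # Adam's utility
    u_b = weighted_score(betty_w, betty_scores) # Betty's utility
    print(name, 0.5 * u_a + 0.5 * u_b)          # 60.0 and 57.5
```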
Probabilistic models

In many cases we do not know what the outcome of our decisions will be, but we can ascribe some likelihood (probability) to these outcomes. In this case we can define a decision tree to illustrate the actions we may take and the possible outcomes resulting from these actions.

Consider the following simplified model of betting on a horse. Suppose there are only 2 horses we are interested in. If the going is good or firmer (this occurs with probability 0.6), horse 1 wins with probability 0.5; otherwise it wins with probability 0.2. If the going is good or firmer, horse 2 wins with probability 0.1; otherwise it wins with probability 0.4.

The odds on horse 1 are evens. The odds on horse 2 are 2 to 1. Hence (ignoring taxes), if we place a 10 Euro bet on horse 1, we win 10 Euro if it wins. If we place a 10 Euro bet on horse 2, we win 20 Euro if it wins. In all other cases we lose 10 Euro. Winnings are denoted by W.

It should be noted that if the odds are valid only before the ground conditions are known, we are only really interested in whether these horses win or lose and not in whether the ground is good or not. Hence, to simplify the decision tree, we calculate the probability that each horse wins. Let A be the event that the first horse wins and let B be the event that the second horse wins. Let G be the event that the ground is good or firmer (G^c is the complement of event G, i.e. not G).

Using the law of total probability,
P(A) = P(A|G)P(G) + P(A|G^c)P(G^c) = 0.5 × 0.6 + 0.2 × 0.4 = 0.38
P(B) = P(B|G)P(G) + P(B|G^c)P(G^c) = 0.1 × 0.6 + 0.4 × 0.4 = 0.22.

Fig. 1: Decision tree for the betting problem. The decision node offers the choice between betting on horse I or horse II; each choice branches into "wins" and "loses", with payoffs 10 and −10 for horse I, and 20 and −10 for horse II.

Criteria for Choosing under Uncertainty

In the case where risk exists, there are various criteria for choosing the optimal action. The first criterion we consider is the criterion of maximising the expected amount of the good to be obtained. Here we should maximise the expected winnings. This criterion is valid for individuals who are indifferent to risk (i.e. have a linear utility function). By betting on horse 1, my expected winnings (in Euros) are
E(W) = 10 × 0.38 − 10 × 0.62 = −2.4.
By betting on horse 2, my expected winnings are
E(W) = 20 × 0.22 − 10 × 0.78 = −3.4.
Thus, it is better to bet on horse 1 than on horse 2. Of course, in this case it would be better not to bet at all.
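The sketch below reproduces the prior win probabilities and the expected winnings of each bet; the function and variable names are illustrative, while the probabilities, stake and prizes are those of the betting example above.

```python
# Prior probabilities via the law of total probability.
p_good = 0.6                                   # P(G): going good or firmer
p_win1 = 0.5 * p_good + 0.2 * (1 - p_good)     # P(A) = 0.38
p_win2 = 0.1 * p_good + 0.4 * (1 - p_good)     # P(B) = 0.22

def expected_winnings(p_win, prize, stake=10):
    """Expected winnings of a bet paying `prize` with probability `p_win`."""
    return prize * p_win - stake * (1 - p_win)

print(round(expected_winnings(p_win1, 10), 2))  # -2.4
print(round(expected_winnings(p_win2, 20), 2))  # -3.4 (not betting gives 0)
```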
Information in probabilistic problems

One interesting point to note regarding this example is that there is a signal (the state of the ground) which is correlated with the result of the race. It was assumed that the bet took place before the signal could be observed. The probabilities of each horse winning in this case are termed the "a priori" (prior) probabilities.

Suppose now that the signal is observed before the bet is made. In this case we may base our bet on the signal, using the "a posteriori" (posterior) probabilities of each horse winning. These are the conditional probabilities of the horses winning given the state of the ground. Hence, we should define the decision to be made when the conditions are good or firmer, D_G, and the decision to be made when the conditions are different, D_NG, i.e. we define the decision to be made for each possible signal.

Suppose the conditions are good or firmer. If I bet on horse 1 (which wins with probability 0.5), my expected winnings are
E(W) = 10 × 0.5 − 10 × 0.5 = 0.
If I bet on horse 2 (which wins with probability 0.1), my expected winnings are
E(W) = 20 × 0.1 − 10 × 0.9 = −7.
Hence, when the conditions are good or firmer I should bet on horse 1. In this case I am indifferent between betting and not betting.

Suppose the conditions are softer than good. If I bet on horse 1 (which wins with probability 0.2), my expected winnings are
E(W) = 10 × 0.2 − 10 × 0.8 = −6.
If I bet on horse 2 (which wins with probability 0.4), my expected winnings are
E(W) = 20 × 0.4 − 10 × 0.6 = 2.
Hence, when the conditions are softer I should bet on horse 2. In this case, I prefer betting to not betting.

Maximisation of expected utility

The second criterion we consider is the maximisation of expected utility. Although the criterion of maximising the expected amount of the good to be obtained is very simple to use, some people may be more averse to risk than others. In this case it may be useful to maximise expected utility instead.

Example 3.3

Consider the previous example in which the ground conditions are observed before the bet is made. Assume that my utility function is u(x) = √(10 + x), where x denotes my winnings. Note that 10 + x is the amount of money I have after the bet.

Suppose the conditions are good or firmer. If I bet on horse 1 (which wins with probability 0.5), my expected utility is
E[u(W)] = √20 × 0.5 + 0 × 0.5 ≈ 2.236.
If I bet on horse 2 (which wins with probability 0.1), my expected utility is
E[u(W)] = √30 × 0.1 + 0 × 0.9 ≈ 0.5477.
If I do not bet, then my utility is √10 ≈ 3.162. Hence, when the conditions are good or firmer, if I bet I should bet on horse 1. In this case, however, I would rather not bet.

Suppose the conditions are softer than good. If I bet on horse 1 (which wins with probability 0.2), my expected utility is
E[u(W)] = √20 × 0.2 + 0 × 0.8 ≈ 0.8944.
If I bet on horse 2 (which wins with probability 0.4), my expected utility is
E[u(W)] = √30 × 0.4 + 0 × 0.6 ≈ 2.191.
Hence, when the conditions are softer, if I bet I should bet on horse 2. However, in this case I still prefer not betting to betting.
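The Example 3.3 figures can be reproduced with the sketch below; the function names are illustrative, while the utility function u(x) = √(10 + x), the 10 Euro stake and the conditional win probabilities are as given above.

```python
import math

def u(x):
    """Utility of winnings x from a 10 Euro stake: sqrt(10 + x)."""
    return math.sqrt(10 + x)

def expected_utility(p_win, prize, stake=10):
    return p_win * u(prize) + (1 - p_win) * u(-stake)

no_bet = u(0)                                            # sqrt(10) = 3.162...
# Going good or firmer: horse 1 wins w.p. 0.5, horse 2 w.p. 0.1.
print(expected_utility(0.5, 10), expected_utility(0.1, 20), no_bet)
# Going softer than good: horse 1 wins w.p. 0.2, horse 2 w.p. 0.4.
print(expected_utility(0.2, 10), expected_utility(0.4, 20), no_bet)
```

In both cases the expected utility of the best bet stays below √10, so this risk-averse bettor never bets.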
Other criteria for choice under uncertainty

Other criteria for choosing under uncertainty are
1) Laplace's criterion
2) The minimax criterion
3) The Savage criterion
4) The Hurwicz criterion

Laplace's criterion

The Laplace criterion is based on the principle of insufficient reason. Since we normally do not know the probabilities of the events that interest us, we assume that these events are equally likely. This means that if there are n horses in a race, then we assume that each has a probability 1/n of winning. In this case it is clear that, in order to maximise our expected winnings (or utility), any bet should be placed on the horse with the longest odds.

The minimax criterion

The minimax criterion is based on the conservative attitude of minimising the maximum possible loss (this is equivalent to maximising the minimum possible gain).

The Savage criterion

The Savage (regret) criterion aims at moderating conservatism by introducing a regret matrix. Let v(a_i, s_j) be the payoff (or loss) when a decision maker takes action a_i and the state (the result of the random experiment) is s_j. If v denotes a payoff, then the measure of regret r(a_i, s_j) is given by
r(a_i, s_j) = max_{a_k} {v(a_k, s_j)} − v(a_i, s_j).
This is the additional gain the decision maker could have made had he/she known what the state of nature was going to be.

If v denotes a loss, then the measure of regret r(a_i, s_j) is given by
r(a_i, s_j) = v(a_i, s_j) − min_{a_k} {v(a_k, s_j)}.
This is the reduction in costs the decision maker could have achieved (relative to the loss he/she actually incurred) had he/she known what the state of nature would be. The decision is made by applying the minimax criterion to the regret matrix. Since regret is a "cost", we minimise the maximum regret.

The Hurwicz criterion

The Hurwicz criterion is designed to model a range of decision-making attitudes, from the most conservative to the most optimistic. Define 0 ≤ α ≤ 1 and suppose the gain function is v(a_i, s_j). The action selected is the action which maximises
α max_{s_j} v(a_i, s_j) + (1 − α) min_{s_j} v(a_i, s_j).

The parameter α is called the index of optimism. If α = 0, this criterion is simply the minimax criterion (i.e. the minimum gain is maximised, a conservative criterion). If α = 1, then the criterion seeks the maximum possible payoff (an optimistic criterion). A typical value for α is 0.5.

If v(a_i, s_j) is a loss, then the criterion selects the action which minimises
α min_{s_j} v(a_i, s_j) + (1 − α) max_{s_j} v(a_i, s_j).
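Before applying these criteria in the example that follows, here is a minimal Python sketch of all four for a payoff (reward) matrix whose rows are actions and whose columns are states. The function and variable names are my own choices; ties are broken by taking the first optimal action.

```python
def laplace(payoffs):
    """Laplace: treat all states as equally likely and maximise the mean payoff."""
    means = [sum(row) / len(row) for row in payoffs]
    return max(range(len(payoffs)), key=lambda i: means[i])

def minimax(payoffs):
    """Minimax (maximin): maximise the minimum possible payoff."""
    return max(range(len(payoffs)), key=lambda i: min(payoffs[i]))

def savage(payoffs):
    """Savage: minimise the maximum regret."""
    col_best = [max(col) for col in zip(*payoffs)]          # best payoff per state
    regrets = [[best - v for v, best in zip(row, col_best)] for row in payoffs]
    return min(range(len(regrets)), key=lambda i: max(regrets[i]))

def hurwicz(payoffs, alpha):
    """Hurwicz: maximise alpha * (best case) + (1 - alpha) * (worst case)."""
    return max(range(len(payoffs)),
               key=lambda i: alpha * max(payoffs[i]) + (1 - alpha) * min(payoffs[i]))
```

Each function returns the index of the recommended action, so the recommendations for a concrete payoff matrix can be read off directly.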
Example 3.4

Suppose a farmer can plant corn (a1), wheat (a2), soybeans (a3) or use the land for grazing (a4). The possible states of nature are: heavy rainfall (s1), moderate rainfall (s2), light rainfall (s3) or drought (s4). The payoff matrix, in thousands of Euro, is as follows:

        s1    s2    s3    s4
a1     -20    60    30    -5
a2      40    50    35     0
a3     -50   100    45   -10
a4      12    15    15    10

Using the four criteria given above, determine the action that the farmer should take. Use an index of optimism α = 0.4.

Example 3.4 - Using Laplace's criterion

Using Laplace's criterion, we assume each of the states is equally likely and we calculate the expected reward obtained for each action, E(W; a_i). We have
E(W; a1) = (−20 + 60 + 30 − 5)/4 = 16.25
E(W; a2) = (40 + 50 + 35 + 0)/4 = 31.25
E(W; a3) = (−50 + 100 + 45 − 10)/4 = 21.25
E(W; a4) = (12 + 15 + 15 + 10)/4 = 13.
Under this criterion, the farmer should take action a2 (plant wheat).

Example 3.4 - Using the minimax criterion

Under the minimax criterion, we maximise the minimum possible reward. Let W_min(a_i) be the minimum possible reward given that action a_i is taken. We have
W_min(a1) = −20;  W_min(a2) = 0;  W_min(a3) = −50;  W_min(a4) = 10.
Under this criterion, the farmer should take action a4 (use the land for grazing).

Example 3.4 - Using the Savage criterion

When using Savage's criterion, we first need to calculate the regret matrix. The values in the column of the regret matrix corresponding to state s_i are the gains a decision maker could make by switching from the action taken to the optimal action in that state. Given the state is s1, the optimal action is a2, which gives a payoff of 40. Hence, the regret from playing a1 is 60, the regret from playing a3 is 90 and the regret from playing a4 is 28.

Calculating in a similar way, the regret matrix is given by

        s1    s2    s3    s4
a1      60    40    15    15
a2       0    50    10    10
a3      90     0     0    20
a4      28    85    30     0

In order to choose the appropriate action, we apply the minimax criterion to the regret matrix. Since regret is a cost (loss), we minimise the maximum regret. Let R_max(a_i) be the maximum regret possible given that action a_i is taken. We have
R_max(a1) = 60;  R_max(a2) = 50;  R_max(a3) = 90;  R_max(a4) = 85.
It follows that according to this criterion, action a2 should be taken (plant wheat).

Example 3.4 - Using the Hurwicz criterion

Finally, we consider the Hurwicz criterion with α = 0.4. The Hurwicz score of action a_i is given by
H(a_i) = α max_{s_j} v(a_i, s_j) + (1 − α) min_{s_j} v(a_i, s_j).
If the farmer takes action a1, the maximum possible reward is 60 and the minimum possible reward is −20. The Hurwicz score is thus
H(a1) = 0.4 × 60 + 0.6 × (−20) = 12.
Similarly,
H(a2) = 0.4 × 50 + 0.6 × 0 = 20
H(a3) = 0.4 × 100 + 0.6 × (−50) = 10
H(a4) = 0.4 × 15 + 0.6 × 10 = 12.
Hence, according to this criterion, action a2 (plant wheat) should be taken. It can be seen that for small α (a pessimistic decision maker), action a4 should be taken, while if α is close to 1 (an optimistic decision maker), action a3 should be taken.
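As a self-contained check of these results (it repeats the criterion logic compactly rather than relying on the earlier sketch), the following Python snippet reproduces the recommendations for Example 3.4, including how the Hurwicz choice moves from a4 to a3 as the index of optimism grows.

```python
# Example 3.4 payoff matrix (thousands of Euro); rows are actions, columns states.
payoffs = {
    "a1 (corn)":     [-20,  60, 30,  -5],
    "a2 (wheat)":    [ 40,  50, 35,   0],
    "a3 (soybeans)": [-50, 100, 45, -10],
    "a4 (grazing)":  [ 12,  15, 15,  10],
}
col_best = [max(col) for col in zip(*payoffs.values())]    # best payoff per state

laplace_choice = max(payoffs, key=lambda a: sum(payoffs[a]) / 4)
minimax_choice = max(payoffs, key=lambda a: min(payoffs[a]))
savage_choice = min(payoffs,
                    key=lambda a: max(b - v for v, b in zip(payoffs[a], col_best)))
hurwicz_choice = lambda alpha: max(
    payoffs, key=lambda a: alpha * max(payoffs[a]) + (1 - alpha) * min(payoffs[a]))

print(laplace_choice, minimax_choice, savage_choice, hurwicz_choice(0.4))
# a2 (wheat), a4 (grazing), a2 (wheat), a2 (wheat)
print(hurwicz_choice(0.1), hurwicz_choice(0.9))
# a4 (grazing) for small alpha, a3 (soybeans) for alpha close to 1
```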