3 Decision Theory

3.1 Utility Functions

Suppose the utility of having x units of a good is u(x). It is normally assumed that u is a concave, increasing function with u(0) = 0, i.e. the utility from having 2x Euros is greater than the utility of having x Euros, but not more than twice as great. It is assumed that individuals act so as to maximise their (expected) utility.

Utility functions

Utility functions for the acquisition of n goods can be defined in a similar way. Suppose x = (x_1, x_2, ..., x_n) defines a bundle of goods, i.e. an individual has x_i units of good i. The utility from having such a bundle is defined to be u(x), where u(0) = 0, ∂u/∂x_i > 0 and ∂²u/(∂x_i ∂x_j) ≤ 0.

Utility functions and optimality in deterministic problems

From the assumptions made, if a deterministic problem only involves the acquisition of one good (commonly money in the form of profit), maximising profit will automatically maximise utility. Hence, the form of the utility function will not have any effect on the optimal solution of such a problem.

However, if a problem involves the acquisition of two goods, then the form of the utility function will in general have an effect on the optimal solution. For example, when allowed to choose five pieces of fruit, some individuals may prefer to take 3 apples and 2 oranges, while others prefer 2 apples and 3 oranges. However, taking 2 apples and 2 oranges would clearly be a sub-optimal solution.

Concavity of utility functions

The assumption that a utility function is concave seems reasonable for the following reason: the utility I gain by obtaining an apple (the marginal utility of an apple) when I have no apples is greater than the utility I gain by obtaining an apple when I already have a large number of apples. It should be noted that
a) Individuals have different utility functions.
b) A person's utility function depends on his/her present circumstances.

Utility functions and optimality in probabilistic problems

In probabilistic models we maximise expected utility. In general, the optimal solution depends on the form of this utility function. The concept of utility is related to Jensen's inequalities.

Jensen's Inequalities
If f is a concave function, g is a convex function and X is a random variable, then
E[f(X)] ≤ f[E(X)]
E[g(X)] ≥ g[E(X)]

Risk neutral individuals

Suppose u(x) is an individual's utility function for obtaining x units of money. If u''(x) = 0 for all x (i.e. u(x) is linear), then that individual is said to be risk neutral. When presented with the choice between a fixed sum of money k and a lottery in which the expected prize is k, such an individual is indifferent between these choices.

Risk averse individuals

If a person has a utility function that is strictly concave, then that individual is said to be risk averse. When presented with the choice between a fixed sum of money k and a lottery in which the expected prize is k, such an individual would choose the fixed sum. This follows from the fact that if u is a strictly concave function, E[u(X)] < u[E(X)]. The first expression is the expected utility from playing the lottery; the second is the utility from choosing the fixed sum.
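The inequality E[u(X)] < u[E(X)] for a strictly concave u, and the reverse inequality for a convex function, can be checked numerically. The following is a minimal Python sketch using a hypothetical two-outcome lottery (prizes 0 and 100, each with probability 1/2) and √x and x² as illustrative concave and convex functions; the lottery and the function choices are assumptions made for illustration only.

```python
import math

# A hypothetical two-outcome lottery: prizes 0 or 100, each with probability 1/2.
outcomes = [0.0, 100.0]
probs = [0.5, 0.5]

def expectation(f, outcomes, probs):
    """E[f(X)] for a discrete random variable X."""
    return sum(p * f(x) for x, p in zip(outcomes, probs))

mean = expectation(lambda x: x, outcomes, probs)   # E(X) = 50
concave = math.sqrt                                # illustrative concave utility
convex = lambda x: x ** 2                          # illustrative convex function

print(expectation(concave, outcomes, probs), "<=", concave(mean))  # 5.0 <= 7.07...
print(expectation(convex, outcomes, probs), ">=", convex(mean))    # 5000.0 >= 2500.0
```

The output shows E[√X] below √E(X) and E[X²] above (E(X))², which is exactly the pattern behind the distinction between risk-averse and risk-seeking behaviour in the example that follows.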
Example 3.1

Consider the following situation. A contestant in a game show must choose between
a) A guaranteed reward of 125 000 Euros.
b) Taking one of two bags. One of the bags is empty, the other contains 250 000 Euros.
It can be seen that by taking option b), the expected amount of money obtained by the player is also 125 000 Euros.

We consider 3 utility functions describing the utility of obtaining x Euros:
u1(x) = x;  u2(x) = √x;  u3(x) = x².

Example 3.1 - Risk neutral individual

If an individual has a utility function given by u1(x) = x, then by taking option a), he/she obtains an expected utility of 125 000. By taking option b), he/she obtains an expected utility of
E[u1(X)] = (1/2) × 0 + (1/2) × 250 000 = 125 000.
Hence, this player is indifferent between choosing a guaranteed amount of money and a lottery which gives the same expected amount of money (i.e. is risk neutral).

Example 3.1 - Risk averse individual

If an individual has a utility function given by u2(x) = √x, then by taking option a), he/she obtains an expected utility of √125 000 ≈ 353.55. By taking option b), he/she obtains an expected utility of
E[u2(X)] = (1/2) × 0 + (1/2) × √250 000 = 250.
Hence, this player would rather choose a guaranteed amount of money than a lottery which gives the same expected amount of money (i.e. is risk averse).

We can calculate the guaranteed amount x1 for which the player is indifferent between that amount and the lottery as follows:
u2(x1) = √x1 = 250 ⇒ x1 = 62 500.

Example 3.1 - Risk seeking individual

If an individual has a utility function given by u3(x) = x², then by taking option a), he/she obtains an expected utility of 125 000² = 1.5625 × 10¹⁰. By taking option b), he/she obtains an expected utility of
E[u3(X)] = (1/2) × 0 + (1/2) × 250 000² = 3.125 × 10¹⁰.
Hence, this player would rather choose a lottery than a guaranteed amount of money, if the lottery gives the same expected amount of money (i.e. is risk seeking).
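As a quick check of these figures, the following Python sketch reproduces the three expected utilities and the certainty equivalent x1 = 62 500 for the risk-averse player; the variable names are illustrative, while the prizes, probabilities and utility functions are those of Example 3.1.

```python
import math

# Example 3.1: an empty bag or 250 000 Euros, each with probability 1/2.
prizes = [0.0, 250_000.0]
probs = [0.5, 0.5]
guaranteed = 125_000.0

utilities = {
    "u1 (risk neutral)": lambda x: x,
    "u2 (risk averse)":  math.sqrt,
    "u3 (risk seeking)": lambda x: x ** 2,
}

for name, u in utilities.items():
    eu_lottery = sum(p * u(x) for x, p in zip(prizes, probs))
    print(name, "lottery:", eu_lottery, "guaranteed:", u(guaranteed))

# Certainty equivalent under u2: the guaranteed sum x1 with sqrt(x1) = 250.
print("certainty equivalent under u2:", 250.0 ** 2)   # 62500.0
```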
Decision Analysis - Deterministic models

When making a decision we must often take various criteria into account. For example, when I want to buy an airline ticket I do not just take the price of the ticket into account. Other factors I take into account may be the time of the flight and the distance of the airports from a) my home and b) my destination, etc.

In order to decide which ticket to buy, I may ascribe
i) various weights to these factors;
ii) scores describing the attractiveness of each option with respect to price, flight times and location.
Such an approach is commonly used to decide which bid a public sector company should accept.

I choose the option that maximises my utility, which is defined to be the appropriate weighted average of these scores. In mathematical terms, suppose there are k factors. Let w_i be the weight associated with factor i and s_{i,j} the score of option j according to factor i (i.e. how good option j is according to factor i). It is normally assumed that ∑_{i=1}^k w_i = 1.

I choose the option j which maximises the utility of option j, u_j, which is taken to be the weighted average of the scores, i.e.
u_j = ∑_{i=1}^k w_i s_{i,j}.
Hence, a mathematical model of such a problem is defined by
a) The set of options.
b) The set of factors influencing the decision.
c) Weights for each of these factors.
d) Scores describing how attractive each option is according to each factor.

Example 3.2

Suppose I am travelling to Barcelona and I consider two possibilities. I can fly using a cheap airline from Shannon to Girona (about 70 km from Barcelona), with one of the flights being at a very early time, or I can fly using a more expensive airline from Shannon to Barcelona, where the times of the flights are convenient.

The three factors I consider are price (factor 1), location (factor 2) and time (factor 3). I ascribe the weights w_1 = 0.5, w_2 = 0.3 and w_3 = 0.2 to these factors, respectively. The budget airline scores s_{1,1} = 90 according to price, s_{2,1} = 40 according to location and s_{3,1} = 60 according to time. The other airline scores s_{1,2} = 70 according to price, s_{2,2} = 90 according to location and s_{3,2} = 90 according to time.

Example 3.2 - Utility of the options

Hence,
u_1 = 0.5 × 90 + 0.3 × 40 + 0.2 × 60 = 69
u_2 = 0.5 × 70 + 0.3 × 90 + 0.2 × 90 = 80.
It follows that I should choose the flight from Shannon to Barcelona.

Multi-person decision processes

When more than one person is involved in the decision process, we may define a composite utility as a weighted average of the utilities of the alternatives to each of the decision makers. Suppose Adam and Betty want to meet in one of two restaurants. They take two factors into account: a) the location of a restaurant and b) the attractiveness of a restaurant.

In this case we need to define
a) Weights describing the importance of each person in the decision process (say p and q here).
b) The weights Adam gives to the importance of location, p_1, and attractiveness, p_2, and the scores he gives to restaurant j according to these criteria, r_{1,j} and r_{2,j}.
c) The weights Betty gives to the importance of location, q_1, and attractiveness, q_2, and the scores she gives to restaurant j according to these criteria, s_{1,j} and s_{2,j}.
These scores must be made on the same scale.

Under these assumptions we can define the utility of the choice of restaurant j to both Adam, u_{A,j}, and Betty, u_{B,j}. These are simply the weighted averages of the scores each one gives for location and attractiveness. We have
u_{A,j} = ∑_{i=1}^2 p_i r_{i,j}
u_{B,j} = ∑_{i=1}^2 q_i s_{i,j}.

The composite utility is the weighted average of these utilities (weighted according to the importance of the decision makers). In this case, the composite utility gained from choosing the j-th restaurant is u_j, where
u_j = p u_{A,j} + q u_{B,j}.
The decision makers should choose the restaurant which maximises the composite utility.

For example, suppose the decision makers are of equal importance, i.e. p = q = 0.5. Adam's weights for the importance of location and attractiveness are 0.7 and 0.3. Betty's weights are 0.4 and 0.6, respectively.

They must choose between 2 restaurants. Adam assigns a score of 70 (out of 100) for location and 50 for attractiveness to Restaurant 1, and a score of 40 for location and 90 for attractiveness to Restaurant 2. The analogous scores given by Betty are 50 and 60 (to Restaurant 1) and 30 and 80 (to Restaurant 2).

The utilities to Adam of these restaurants are
u_{A,1} = 0.7 × 70 + 0.3 × 50 = 64
u_{A,2} = 0.7 × 40 + 0.3 × 90 = 55.
The utilities to Betty of these restaurants are
u_{B,1} = 0.4 × 50 + 0.6 × 60 = 56
u_{B,2} = 0.4 × 30 + 0.6 × 80 = 60.

Hence, the composite utilities of these choices are given by
u_1 = 0.5 × 64 + 0.5 × 56 = 60
u_2 = 0.5 × 55 + 0.5 × 60 = 57.5.
It follows that Restaurant 1 should be chosen.
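The weighted-average calculations in Example 3.2 and in the restaurant example can be reproduced with a few lines of Python. The sketch below is illustrative: the helper name weighted_score is my own, while the weights and scores are those given above.

```python
def weighted_score(weights, scores):
    """u = sum_i w_i * s_i for a single option."""
    return sum(w * s for w, s in zip(weights, scores))

# Example 3.2: factors are price, location, time.
w = [0.5, 0.3, 0.2]
budget_airline = weighted_score(w, [90, 40, 60])   # 69.0
other_airline = weighted_score(w, [70, 90, 90])    # 80.0
print(budget_airline, other_airline)

# Multi-person example: composite utility with person weights p = q = 0.5.
adam_w, betty_w = [0.7, 0.3], [0.4, 0.6]
for name, adam_scores, betty_scores in [
    ("Restaurant 1", [70, 50], [50, 60]),
    ("Restaurant 2", [40, 90], [30, 80]),
]:
    u_a = weighted_score(adam_w, adam_scores)   # Adam's utility
    u_b = weighted_score(betty_w, betty_scores) # Betty's utility
    print(name, 0.5 * u_a + 0.5 * u_b)          # 60.0 and 57.5
```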
Probabilistic models

In many cases we do not know what the outcome of our decisions will be, but we can ascribe some likelihood (probability) to these outcomes. In this case we can define a decision tree to illustrate the actions we may take and the possible outcomes resulting from these actions.

Consider the following simplified model of betting on a horse. Suppose there are only 2 horses we are interested in. If the going is good or firmer (this occurs with probability 0.6), horse 1 wins with probability 0.5; otherwise it wins with probability 0.2. If the going is good or firmer, horse 2 wins with probability 0.1; otherwise it wins with probability 0.4.

The odds on horse 1 are evens. The odds on horse 2 are 2 to 1. Hence (ignoring taxes), if we place a 10 Euro bet on horse 1, we win 10 Euro if it wins. If we place a 10 Euro bet on horse 2, we win 20 Euro if it wins. In all other cases we lose 10 Euro. Winnings are denoted by W.

It should be noted that if the odds are valid only before the ground conditions are known, we are only really interested in whether these horses win or lose and not in whether the ground is good or not. Hence, to simplify the decision tree, we calculate the probability that each horse wins. Let A be the event that the first horse wins and let B be the event that the second horse wins. Let G be the event that the ground is good or firmer (G^c is the complement of event G, i.e. not G).

Using the law of total probability,
P(A) = P(A|G)P(G) + P(A|G^c)P(G^c) = 0.5 × 0.6 + 0.2 × 0.4 = 0.38
P(B) = P(B|G)P(G) + P(B|G^c)P(G^c) = 0.1 × 0.6 + 0.4 × 0.4 = 0.22.

Fig. 1: Decision tree for the betting problem. The decision node offers the choice between betting on horse I or horse II; each choice branches into "wins" and "loses", with payoffs 10 and −10 for horse I, and 20 and −10 for horse II.

Criteria for Choosing under Uncertainty

In the case where risk exists, there are various criteria for choosing the optimal action. The first criterion we consider is the criterion of maximising the expected amount of the good to be obtained. Here we should maximise the expected winnings. This criterion is valid for individuals who are indifferent to risk (i.e. have a linear utility function). By betting on horse 1, my expected winnings (in Euros) are
E(W) = 10 × 0.38 − 10 × 0.62 = −2.4.
By betting on horse 2, my expected winnings are
E(W) = 20 × 0.22 − 10 × 0.78 = −3.4.
Thus, it is better to bet on horse 1 than on horse 2. Of course, in this case it would be better not to bet at all.
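The sketch below reproduces the prior win probabilities and the expected winnings of each bet; the function and variable names are illustrative, while the probabilities, stake and prizes are those of the betting example above.

```python
# Prior probabilities via the law of total probability.
p_good = 0.6                                   # P(G): going good or firmer
p_win1 = 0.5 * p_good + 0.2 * (1 - p_good)     # P(A) = 0.38
p_win2 = 0.1 * p_good + 0.4 * (1 - p_good)     # P(B) = 0.22

def expected_winnings(p_win, prize, stake=10):
    """Expected winnings of a bet paying `prize` with probability `p_win`."""
    return prize * p_win - stake * (1 - p_win)

print(round(expected_winnings(p_win1, 10), 2))  # -2.4
print(round(expected_winnings(p_win2, 20), 2))  # -3.4 (not betting gives 0)
```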
Information in probabilistic problems

One interesting point to note regarding this example is that there is a signal (the state of the ground) which is correlated with the result of the race. It was assumed that the bet took place before the signal could be observed. The probabilities of each horse winning in this case are termed the "a priori" (prior) probabilities.

Suppose now that the signal is observed before the bet is made. In this case we may base our bet on the signal, using the "a posteriori" (posterior) probabilities of each horse winning. These are the conditional probabilities of the horses winning given the state of the ground. Hence, we should define the decision to be made when the conditions are good or firmer, D_G, and the decision to be made when the conditions are different, D_NG, i.e. we define the decision to be made for each possible signal.

Suppose the conditions are good or firmer. If I bet on horse 1 (which wins with probability 0.5), my expected winnings are
E(W) = 10 × 0.5 − 10 × 0.5 = 0.
If I bet on horse 2 (which wins with probability 0.1), my expected winnings are
E(W) = 20 × 0.1 − 10 × 0.9 = −7.
Hence, when the conditions are good or firmer I should bet on horse 1. In this case I am indifferent between betting and not betting.

Suppose the conditions are softer than good. If I bet on horse 1 (which wins with probability 0.2), my expected winnings are
E(W) = 10 × 0.2 − 10 × 0.8 = −6.
If I bet on horse 2 (which wins with probability 0.4), my expected winnings are
E(W) = 20 × 0.4 − 10 × 0.6 = 2.
Hence, when the conditions are softer I should bet on horse 2. In this case, I prefer betting to not betting.

Maximisation of expected utility

The second criterion we consider is the maximisation of expected utility. Although the criterion of maximising the expected amount of the good to be obtained is very simple to use, some people may be more averse to risk than others. In this case it may be useful to maximise expected utility instead.

Example 3.3

Consider the previous example in which the ground conditions are observed before the bet is made. Assume that my utility function is u(x) = √(10 + x), where x denotes my winnings. Note that 10 + x is the amount of money I have after the bet.

Suppose the conditions are good or firmer. If I bet on horse 1 (which wins with probability 0.5), my expected utility is
E[u(W)] = √20 × 0.5 + 0 × 0.5 ≈ 2.236.
If I bet on horse 2 (which wins with probability 0.1), my expected utility is
E[u(W)] = √30 × 0.1 + 0 × 0.9 ≈ 0.5477.
If I do not bet, then my utility is √10 ≈ 3.162. Hence, when the conditions are good or firmer, if I bet I should bet on horse 1. In this case, however, I would rather not bet.

Suppose the conditions are softer than good. If I bet on horse 1 (which wins with probability 0.2), my expected utility is
E[u(W)] = √20 × 0.2 + 0 × 0.8 ≈ 0.8944.
If I bet on horse 2 (which wins with probability 0.4), my expected utility is
E[u(W)] = √30 × 0.4 + 0 × 0.6 ≈ 2.191.
Hence, when the conditions are softer, if I bet I should bet on horse 2. However, in this case I still prefer not betting to betting.
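The Example 3.3 figures can be reproduced with the sketch below; the function names are illustrative, while the utility function u(x) = √(10 + x), the 10 Euro stake and the conditional win probabilities are as given above.

```python
import math

def u(x):
    """Utility of winnings x from a 10 Euro stake: sqrt(10 + x)."""
    return math.sqrt(10 + x)

def expected_utility(p_win, prize, stake=10):
    return p_win * u(prize) + (1 - p_win) * u(-stake)

no_bet = u(0)                                            # sqrt(10) = 3.162...
# Going good or firmer: horse 1 wins w.p. 0.5, horse 2 w.p. 0.1.
print(expected_utility(0.5, 10), expected_utility(0.1, 20), no_bet)
# Going softer than good: horse 1 wins w.p. 0.2, horse 2 w.p. 0.4.
print(expected_utility(0.2, 10), expected_utility(0.4, 20), no_bet)
```

In both cases the expected utility of the best bet stays below √10, so this risk-averse bettor never bets.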
Other criteria for choice under uncertainty

Other criteria for choosing under uncertainty are
1) Laplace's criterion
2) The minimax criterion
3) The Savage criterion
4) The Hurwicz criterion

Laplace's criterion

The Laplace criterion is based on the principle of insufficient reason. Since we normally do not know the probabilities of the events that interest us, we assume that these events are equally likely. This means that if there are n horses in a race, then we assume that each has a probability 1/n of winning. In this case it is clear that, in order to maximise our expected winnings (or utility), any bet should be placed on the horse with the longest odds.

The minimax criterion

The minimax criterion is based on the conservative attitude of minimising the maximum possible loss (this is equivalent to maximising the minimum possible gain).

The Savage criterion

The Savage (regret) criterion aims at moderating conservatism by introducing a regret matrix. Let v(a_i, s_j) be the payoff (or loss) when a decision maker takes action a_i and the state (the result of the random experiment) is s_j. If v denotes a payoff, then the measure of regret r(a_i, s_j) is given by
r(a_i, s_j) = max_{a_k} {v(a_k, s_j)} − v(a_i, s_j).
This is the additional gain the decision maker could have made had he/she known what the state of nature was going to be.

If v denotes a loss, then the measure of regret r(a_i, s_j) is given by
r(a_i, s_j) = v(a_i, s_j) − min_{a_k} {v(a_k, s_j)}.
This is the reduction in costs the decision maker could have achieved (relative to the loss he/she actually incurred) had he/she known what the state of nature would be. The decision is made by applying the minimax criterion to the regret matrix. Since regret is a "cost", we minimise the maximum regret.

The Hurwicz criterion

The Hurwicz criterion is designed to model a range of decision-making attitudes, from the most conservative to the most optimistic. Define 0 ≤ α ≤ 1 and suppose the gain function is v(a_i, s_j). The action selected is the action which maximises
α max_{s_j} v(a_i, s_j) + (1 − α) min_{s_j} v(a_i, s_j).

The parameter α is called the index of optimism. If α = 0, this criterion is simply the minimax criterion (i.e. the minimum gain is maximised, a conservative criterion). If α = 1, then the criterion seeks the maximum possible payoff (an optimistic criterion). A typical value for α is 0.5.

If v(a_i, s_j) is a loss, then the criterion selects the action which minimises
α min_{s_j} v(a_i, s_j) + (1 − α) max_{s_j} v(a_i, s_j).
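Before applying these criteria in the example that follows, here is a minimal Python sketch of all four for a payoff (reward) matrix whose rows are actions and whose columns are states. The function and variable names are my own choices; ties are broken by taking the first optimal action.

```python
def laplace(payoffs):
    """Laplace: treat all states as equally likely and maximise the mean payoff."""
    means = [sum(row) / len(row) for row in payoffs]
    return max(range(len(payoffs)), key=lambda i: means[i])

def minimax(payoffs):
    """Minimax (maximin): maximise the minimum possible payoff."""
    return max(range(len(payoffs)), key=lambda i: min(payoffs[i]))

def savage(payoffs):
    """Savage: minimise the maximum regret."""
    col_best = [max(col) for col in zip(*payoffs)]          # best payoff per state
    regrets = [[best - v for v, best in zip(row, col_best)] for row in payoffs]
    return min(range(len(regrets)), key=lambda i: max(regrets[i]))

def hurwicz(payoffs, alpha):
    """Hurwicz: maximise alpha * (best case) + (1 - alpha) * (worst case)."""
    return max(range(len(payoffs)),
               key=lambda i: alpha * max(payoffs[i]) + (1 - alpha) * min(payoffs[i]))
```

Each function returns the index of the recommended action, so the recommendations for a concrete payoff matrix can be read off directly.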
Example 3.4

Suppose a farmer can plant corn (a1), wheat (a2), soybeans (a3) or use the land for grazing (a4). The possible states of nature are: heavy rainfall (s1), moderate rainfall (s2), light rainfall (s3) or drought (s4). The payoff matrix, in thousands of Euro, is as follows:

        s1    s2    s3    s4
a1     -20    60    30    -5
a2      40    50    35     0
a3     -50   100    45   -10
a4      12    15    15    10

Using the four criteria given above, determine the action that the farmer should take. Use an index of optimism α = 0.4.

Example 3.4 - Using Laplace's criterion

Using Laplace's criterion, we assume each of the states is equally likely and we calculate the expected reward obtained for each action, E(W; a_i). We have
E(W; a1) = (−20 + 60 + 30 − 5)/4 = 16.25
E(W; a2) = (40 + 50 + 35 + 0)/4 = 31.25
E(W; a3) = (−50 + 100 + 45 − 10)/4 = 21.25
E(W; a4) = (12 + 15 + 15 + 10)/4 = 13.
Under this criterion, the farmer should take action a2 (plant wheat).

Example 3.4 - Using the minimax criterion

Under the minimax criterion, we maximise the minimum possible reward. Let W_min(a_i) be the minimum possible reward given that action a_i is taken. We have
W_min(a1) = −20;  W_min(a2) = 0;  W_min(a3) = −50;  W_min(a4) = 10.
Under this criterion, the farmer should take action a4 (use the land for grazing).

Example 3.4 - Using the Savage criterion

When using Savage's criterion, we first need to calculate the regret matrix. The values in the column of the regret matrix corresponding to state s_i are the gains a decision maker could make by switching from the action taken to the optimal action in that state. Given the state is s1, the optimal action is a2, which gives a payoff of 40. Hence, the regret from playing a1 is 60, the regret from playing a3 is 90 and the regret from playing a4 is 28.

Calculating in a similar way, the regret matrix is given by

        s1    s2    s3    s4
a1      60    40    15    15
a2       0    50    10    10
a3      90     0     0    20
a4      28    85    30     0

In order to choose the appropriate action, we apply the minimax criterion to the regret matrix. Since regret is a cost (loss), we minimise the maximum regret. Let R_max(a_i) be the maximum regret possible given that action a_i is taken. We have
R_max(a1) = 60;  R_max(a2) = 50;  R_max(a3) = 90;  R_max(a4) = 85.
It follows that according to this criterion, action a2 should be taken (plant wheat).

Example 3.4 - Using the Hurwicz criterion

Finally, we consider the Hurwicz criterion with α = 0.4. The Hurwicz score of action a_i is given by
H(a_i) = α max_{s_j} v(a_i, s_j) + (1 − α) min_{s_j} v(a_i, s_j).
If the farmer takes action a1, the maximum possible reward is 60 and the minimum possible reward is −20. The Hurwicz score is thus
H(a1) = 0.4 × 60 + 0.6 × (−20) = 12.
Similarly,
H(a2) = 0.4 × 50 + 0.6 × 0 = 20
H(a3) = 0.4 × 100 + 0.6 × (−50) = 10
H(a4) = 0.4 × 15 + 0.6 × 10 = 12.
Hence, according to this criterion, action a2 (plant wheat) should be taken. It can be seen that for small α (a pessimistic decision maker), action a4 should be taken, while if α is close to 1 (an optimistic decision maker), action a3 should be taken.
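As a self-contained check of these results (it repeats the criterion logic compactly rather than relying on the earlier sketch), the following Python snippet reproduces the recommendations for Example 3.4, including how the Hurwicz choice moves from a4 to a3 as the index of optimism grows.

```python
# Example 3.4 payoff matrix (thousands of Euro); rows are actions, columns states.
payoffs = {
    "a1 (corn)":     [-20,  60, 30,  -5],
    "a2 (wheat)":    [ 40,  50, 35,   0],
    "a3 (soybeans)": [-50, 100, 45, -10],
    "a4 (grazing)":  [ 12,  15, 15,  10],
}
col_best = [max(col) for col in zip(*payoffs.values())]    # best payoff per state

laplace_choice = max(payoffs, key=lambda a: sum(payoffs[a]) / 4)
minimax_choice = max(payoffs, key=lambda a: min(payoffs[a]))
savage_choice = min(payoffs,
                    key=lambda a: max(b - v for v, b in zip(payoffs[a], col_best)))
hurwicz_choice = lambda alpha: max(
    payoffs, key=lambda a: alpha * max(payoffs[a]) + (1 - alpha) * min(payoffs[a]))

print(laplace_choice, minimax_choice, savage_choice, hurwicz_choice(0.4))
# a2 (wheat), a4 (grazing), a2 (wheat), a2 (wheat)
print(hurwicz_choice(0.1), hurwicz_choice(0.9))
# a4 (grazing) for small alpha, a3 (soybeans) for alpha close to 1
```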