Lectures on the FTAP
Notes complementing Delbaen and
Schachermayer's book
"The Mathematics of Arbitrage"
Harry van Zanten
September 25, 2007 (this version)
use at your own risk
this text undoubtedly contains many errors
please report them to harry@cs.vu.nl
Preface
These lecture notes treat various versions of the so-called "fundamental theorem
of asset pricing". Many students are familiar with statements about models for
financial markets saying that "absence of arbitrage" and the existence of an
"equivalent martingale measure" are in some sense equivalent. The purpose of
these notes is to present and explain mathematical results making such statements
precise.
We start with the relatively simple situation of models defined on a finite
underlying probability space, cf. Harrison and Pliska (1981). In this case a
mathematically precise and economically satisfactory fundamental theorem can
be derived using the separating hyperplane theorem. We then move to general
continuous time models, treating the theorem of Kreps (1981). The basic
idea behind this theorem is in fact similar to the finite case. However, since
the problem is now infinite-dimensional, it is technically much more involved
and it is necessary to involve topological considerations in the definition of "no
arbitrage", called "no free lunch" by Kreps. Although Kreps' theorem is satisfactory
from the mathematical perspective, the use of the weak$^*$-topology in
the definition of "no free lunch" destroys the economic interpretation of the result.
We finally treat the fundamental theorem of Delbaen and Schachermayer
(1994), who work in the setting of asset prices modelled by locally bounded
semimartingales, and trading strategies modelled by predictable processes. In
this setting Kreps' definition of "no free lunch" can be replaced by the condition
of "no free lunch with vanishing risk", which does not involve an unnatural
topology, and has a clear economic interpretation. The proof of the fundamental
theorem of Delbaen and Schachermayer (1994) is technically very involved, and
we only sketch the main arguments in these notes. In particular, we explain the
connection to the result of Kreps (1981).
The text is mainly based on Chapters 2, 5, 8 and 9 of Delbaen
and Schachermayer (2006). The necessary results from functional analysis are
treated in the appendix, and are taken from Conway (1990) or Rudin (1991).
Readers are assumed to be familiar with basic topology, measure theoretic
probability theory, martingale theory and stochastic integration theory,
the latter up to the level of stochastic integration of locally bounded predictable
processes relative to general semimartingales.
Amsterdam, fall 2006
HvZ
Contents
Preface
1 A simple example
1.1 Exercises
2 Finite underlying probability spaces
2.1 Description of the model and basic definitions
2.2 Fundamental theorem of asset pricing
2.3 Single period versus multiperiod models
2.4 Completeness
2.5 Change of numerair
2.6 No-arbitrage pricing
2.7 Example: binomial model
2.8 Exercises
3 The Kreps-Yan theorem
3.1 Description of the model
3.2 Kreps-Yan theorem
3.3 Discussion
3.4 Exercises
4 The general theorem
4.1 Preliminaries on stochastic integration
4.2 No free lunch with vanishing risk and the FTAP
4.3 Sketch of proof of the fundamental theorem
4.3.1 The relatively easy half
4.3.2 The much more difficult half
4.4 Example: Itô processes
4.5 Exercises
A Elements of functional analysis
A.1 Separating hyperplane theorem
A.2 Topological vector spaces
A.3 Hahn-Banach theorem
A.4 Dual space
A.5 Exercises
B Elements of martingale theory
B.1 Basic definitions
B.2 Theorems
B.3 Exercises
References
1 A simple example
Consider a world with two time points, $t = 0$ (today) and $t = 1$ (tomorrow).
In this world there exists a bank where money can be deposited or borrowed at
zero interest and a stock is traded. The value $S_0$ of the stock at time 0 equals
1 and the value $S_1$ at time $t = 1$ equals $u$ with probability $p \in (0, 1)$
and $d < u$ with probability $1 - p$. Note that the stock price process
can be viewed as a stochastic process defined on an underlying probability space
$(\Omega, \mathcal{F}, P)$ with $\Omega$ consisting of just two elements, one corresponding to the stock
price going up, one to the price going down, $\mathcal{F}$ the power set of $\Omega$ and $P$ the
probability measure that gives probability $p$ to the event of the price going up
and $1 - p$ to the event of the price going down.
A trader in this world can form a portfolio today consisting of a number
of units of money in the bank, call this $\phi_0$, and a number of stocks, call this $\psi_0$.
Clearly, this portfolio is worth $V_0 = \phi_0 + \psi_0$ at time 0. When tomorrow comes,
the portfolio will have the new, random value $V_1 = \phi_0 + \psi_0 S_1$.
We say that there exists an arbitrage opportunity in this world if there
exists a portfolio $(\phi_0, \psi_0)$ as above such that the associated value process $V$
satisfies $V_0 = 0$, $V_1 \ge 0$ and $P(V_1 > 0) > 0$. Clearly, this corresponds to a
possibility of making a risk-free profit.
Proposition 1.0.1. There exist no arbitrage opportunities in this world if and
only if d < 1 < u.
Proof. If $d < 1 < u$ then $q = (1 - d)/(u - d)$ belongs to $(0, 1)$ and hence the
probability measure $Q$ under which the stock price moves up or down according
to the probability $q$ instead of $p$ is equivalent (i.e. mutually absolutely continuous)
to the underlying probability measure $P$ (see Exercise 1). Now let $(\phi_0, \psi_0)$
be a portfolio such that $V_0 = 0$ and $V_1 \ge 0$. Then by construction we have
$E_Q V_1 = (\phi_0 + \psi_0 u)q + (\phi_0 + \psi_0 d)(1 - q) = \phi_0 + \psi_0 = V_0 = 0$. Hence, since
$V_1 \ge 0$, we have $Q(V_1 > 0) = 0$. But since the measures $P$ and $Q$ are equivalent,
it follows that $P(V_1 > 0) = 0$ as well.
To prove the converse statement, suppose for instance that $1 \le d < u$.
Then the stock always performs at least as well as money in the bank and there
is a positive probability that it performs strictly better. Borrowing money from
the bank and investing it in stock then yields an arbitrage. Specifically, consider
the portfolio $\phi_0 = -1$, $\psi_0 = 1$. The corresponding value process satisfies $V_0 = 0$
and $V_1$ is either $d - 1$ or $u - 1$, which is nonnegative in both cases and strictly
positive with positive probability. The case $d < u \le 1$ can be handled similarly.
In the proof of the proposition we noted that in the case of no-arbitrage,
i.e. when $d < 1 < u$, the new underlying probability measure $Q$ under which the
stock goes up with probability $q = (1 - d)/(u - d)$ is equivalent to the original
probability measure $P$. Observe that
$$E_Q S_1 = uq + d(1 - q) = S_0.$$
In other words, the stock price process $S = (S_0, S_1)$ is a martingale under $Q$
(relative to the filtration $\mathcal{F}_0 = \{\emptyset, \Omega\}$, $\mathcal{F}_1 = \mathcal{P}(\Omega)$, the power set of $\Omega$). Note that any other
equivalent measure $Q'$ is fully described by specifying a probability $q' \in (0, 1)$
with which the stock goes up (Exercise 1 again). For such a measure $Q'$ we have
$$E_{Q'} S_1 = uq' + d(1 - q') = q'(u - d) + d.$$
Hence if $Q'$ has the property that $E_{Q'} S_1 = S_0$, then $(1 - d)/(u - d) = q' \in (0, 1)$,
which is the same as saying that $d < 1 < u$. We just proved that this implies
absence of arbitrage opportunities.
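The computation above is easy to check numerically. The following is a minimal sketch, not part of the original notes; the function names and the values of u and d are illustrative assumptions. It uses exact rational arithmetic to confirm that the stock has expectation S0 = 1 under the up-probability (1 - d)/(u - d), and that the portfolio "borrow 1, buy 1 stock" from the proof is an arbitrage when 1 ≤ d < u.

```python
from fractions import Fraction

def martingale_probability(u, d):
    """Up-probability q = (1 - d)/(u - d) under which E_Q S_1 = S_0 = 1."""
    return (1 - d) / (u - d)

def has_arbitrage(u, d):
    """By Proposition 1.0.1, arbitrage exists exactly when d < 1 < u fails."""
    return not (d < 1 < u)

def terminal_values(bank, stock, u, d):
    """V_1 in the up and down state for a portfolio (bank, stock), with S_0 = 1."""
    return bank + stock * u, bank + stock * d

# Illustrative numbers (assumed, not from the notes).
u, d = Fraction(3, 2), Fraction(1, 2)
q = martingale_probability(u, d)
expected_s1 = u * q + d * (1 - q)      # should equal S_0 = 1
print(q, expected_s1, has_arbitrage(u, d))

# A case with 1 <= d < u: borrowing 1 and buying 1 stock costs nothing
# and never loses money.
print(terminal_values(-1, 1, Fraction(6, 5), Fraction(11, 10)))
```

Exact fractions are used so that the martingale identity holds without rounding error.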
We conclude that the condition for no-arbitrage can be reformulated in
terms of the existence of certain probability measures.
Proposition 1.0.2 (Fundamental theorem of asset pricing 0). There exist
no arbitrage opportunities in this world if and only if there exists a probability
measure $Q$ equivalent to the original probability measure $P$ such that the stock
price process $S = (S_0, S_1)$ satisfies $E_Q S_1 = S_0$.
For obvious reasons a probability measure Q as in the proposition is called
an equivalent martingale measure. Using this terminology the result asserts that
absence of arbitrage is equivalent to the existence of an equivalent martingale
measure. It turns out that (the appropriate version of) this result, often called
the fundamental theorem of asset pricing, is true in a very general setting. This
is of great interest, since it relates a fundamental economic notion (arbitrage)
to an important mathematical concept (martingales). As a result, martingale
theory plays a central role in the modelling of financial markets and pricing of
derivatives. In these notes we discuss the fundamental theorem of asset pricing
in increasingly general settings.
1.1 Exercises
1. Show that two probability measures on a finite set are equivalent (i.e. mutually
absolutely continuous) if and only if they give positive probability
to the same singletons.
2 Finite underlying probability spaces
2.1 Description of the model and basic definitions
Consider a finite probability space $(\Omega, \mathcal{F}, P)$, with $\Omega = \{\omega_1, \ldots, \omega_n\}$, $\mathcal{F} = \mathcal{P}(\Omega)$ (the power set of $\Omega$)
and $p_i = P(\{\omega_i\}) > 0$ for every $i$. Suppose that on this probability space we
are given a filtration $(\mathcal{F}_t)_{t=0,\ldots,T}$ and a $(d + 1)$-dimensional adapted stochastic
process $S = (S^{(0)}_t, \ldots, S^{(d)}_t)_{t=0,\ldots,T}$, where $T$ is some finite positive integer.
Assume that $\mathcal{F}_T = \mathcal{F}$.
We think of the components of $S$ as the price processes of $d + 1$ different
financial assets, measured relative to the price of the 0-th asset, called a numerair.
Since we measure prices relative to the price of the numerair, the price
process $S^{(0)}$ equals 1 at all times.
A portfolio is a $(d + 1)$-dimensional, predictable process $\phi = (\phi^{(0)}_t, \ldots, \phi^{(d)}_t)_{t=1,\ldots,T}$.
We think of $\phi^{(i)}_t$ as the number of assets of type $i$ that
is in the portfolio in the time interval $(t - 1, t]$. The requirement that $\phi$ is predictable
means that at each time $t - 1$, the portfolio is constructed using only the
information available up to that time, i.e. the trader cannot look into the future.
The value process associated with a portfolio $\phi$ is the process $V = (V_t)_{t=0,\ldots,T}$
defined by
$$V_0 = \sum_{i=0}^d \phi^{(i)}_1 S^{(i)}_0, \qquad V_t = \sum_{i=0}^d \phi^{(i)}_t S^{(i)}_t, \quad t \ge 1. \tag{2.1}$$
Clearly $V$ is an adapted process. The value $V_0$ is called the initial value of the
portfolio.
A special role is played by portfolios that do not involve injections or withdrawals
of money after time 0. Consider such a portfolio with value $V_{t-1}$ at
time $t - 1$. Just after $t - 1$ the portfolio is rebalanced, and the new value equals
$$\sum_{i=0}^d \phi^{(i)}_t S^{(i)}_{t-1} = V_t - \sum_{i=0}^d \phi^{(i)}_t \bigl(S^{(i)}_t - S^{(i)}_{t-1}\bigr).$$
If no money is injected or withdrawn this should equal $V_{t-1}$, hence
$$V_t - V_{t-1} = \sum_{i=0}^d \phi^{(i)}_t \bigl(S^{(i)}_t - S^{(i)}_{t-1}\bigr) = \sum_{i=1}^d \phi^{(i)}_t \bigl(S^{(i)}_t - S^{(i)}_{t-1}\bigr)$$
(we use that $S^{(0)}$ equals 1 at all times).
Definition 2.1.1. A portfolio $\phi$ is called self-financing if its value process $V$
satisfies
$$V_t - V_{t-1} = \sum_{i=1}^d \phi^{(i)}_t \bigl(S^{(i)}_t - S^{(i)}_{t-1}\bigr)$$
for all $t \ge 1$.
We use the usual notation $\Delta f(t) = f(t) - f(t - 1)$ for a (possibly vector-valued)
function $f$ on the integers. Moreover, we write $\langle v, w \rangle$ for the Euclidean
inner product of two vectors in $\mathbb{R}^d$. Using that notation the preceding display
reads
$$\Delta V_t = \langle \phi_t, \Delta S_t \rangle$$
and a self-financing portfolio satisfies the relation
$$V_t = V_0 + \sum_{u=1}^t \langle \phi_u, \Delta S_u \rangle.$$
If we define the process $\phi \cdot S$ by $(\phi \cdot S)_0 = 0$ and $(\phi \cdot S)_t = \sum_{u=1}^t \langle \phi_u, \Delta S_u \rangle$ for
$t \ge 1$, we can write
$$V_t = V_0 + (\phi \cdot S)_t$$
for a self-financing portfolio.
If we compare the definition of a self-financing portfolio with (2.1) we see
that for such a portfolio it holds that
$$\phi^{(0)}_1 = V_0 - \sum_{i=1}^d \phi^{(i)}_1 S^{(i)}_0$$
and
$$\Delta\phi^{(0)}_t = -\sum_{i=1}^d \Delta\phi^{(i)}_t S^{(i)}_{t-1}, \quad t \ge 2.$$
Hence, if we specify the initial value $V_0$ and $(\phi^{(1)}, \ldots, \phi^{(d)})$, the process $\phi^{(0)}$
describing the holdings in the numerair asset is completely determined by the
requirement that the portfolio is self-financing.
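The last remark can be made concrete with a small computation. The sketch below is an illustration only (function names, data layout and numbers are assumptions, not part of the notes). It builds the value process of a self-financing portfolio from the initial value and the holdings in the risky assets, and then recovers the numerair holdings from the equivalent relation that the rebalanced portfolio at time t - 1 is worth V_{t-1}.

```python
def value_process(v0, risky_holdings, prices):
    """V_t = V_0 + sum_{u<=t} <phi_u, Delta S_u> over the risky assets 1..d.

    prices[t][i]           : discounted price of risky asset i+1 at time t = 0..T
    risky_holdings[t-1][i] : units of risky asset i+1 held over (t-1, t]
    """
    values = [v0]
    for t in range(1, len(prices)):
        gain = sum(h * (now - prev)
                   for h, now, prev in zip(risky_holdings[t - 1], prices[t], prices[t - 1]))
        values.append(values[-1] + gain)
    return values

def numeraire_holdings(values, risky_holdings, prices):
    """Holdings in asset 0 over (t-1, t]: V_{t-1} minus the cost of the risky
    positions at time t-1 prices (asset 0 has price 1)."""
    return [values[t - 1] - sum(h * s for h, s in zip(risky_holdings[t - 1], prices[t - 1]))
            for t in range(1, len(prices))]

# Illustrative path of one risky asset over two periods (assumed numbers).
prices = [[4], [6], [3]]
risky = [[2], [-1]]
V = value_process(1, risky, prices)
phi0 = numeraire_holdings(V, risky, prices)
print(V, phi0)
```

With these holdings the values computed from (2.1), that is, numerair holdings plus risky holdings times current prices, agree with V at every time point, which is exactly the self-financing property.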
2.2 Fundamental theorem of asset pricing
In this setting the definition of an arbitrage opportunity is as follows.
Definition 2.2.1. An arbitrage opportunity is a self-financing portfolio whose
value process $V$ satisfies $V_0 = 0$, $V_T \ge 0$ and $P(V_T > 0) > 0$.
To prepare for the proof of the theorem below it is useful to reformulate
this in more geometric terms. Define the collection of random variables
$$K = K(S) = \{(\phi \cdot S)_T : \phi \text{ predictable}\}.$$
Note that $K$ is the set of all possible pay-offs of self-financing portfolios with
zero initial value. Denoting the collection of all integrable nonnegative random
variables on $(\Omega, \mathcal{F}, P)$ by $L^\infty_+$, absence of arbitrage is the same as the requirement
that $K \cap L^\infty_+ = \{0\}$. In our setting of finite $\Omega$ we can identify collections
of random variables with subsets of $\mathbb{R}^n$: simply identify a random variable
$X$ with the vector of possible realizations $(X(\omega_1), \ldots, X(\omega_n))$. For instance
$L^\infty_+$ corresponds to the set $\{(x_1, \ldots, x_n) : x_1 \ge 0, \ldots, x_n \ge 0\}$. This way the
requirement $K \cap L^\infty_+ = \{0\}$ of no-arbitrage translates into a geometric statement
about subsets of $\mathbb{R}^n$.
By $L^\infty$ we denote the collection of all bounded random variables. Since $\Omega$
is finite, bounded just means finite-valued, so that $L^\infty$ can be identified with
all of $\mathbb{R}^n$.
We will see, as in the preceding chapter, that absence of arbitrage is equivalent
to the existence of an equivalent martingale measure, which is defined as
follows.
Definition 2.2.2. A probability measure $Q$ on $(\Omega, \mathcal{F})$ is called an equivalent martingale
measure if it is equivalent to $P$ (i.e. $Q \ll P$ and $P \ll Q$) and $S$ is
a ($d$-dimensional) martingale with respect to $Q$. The collection of equivalent
martingale measures is denoted by $\mathcal{M}^e = \mathcal{M}^e(S)$.
Theorem 2.2.3 (Fundamental theorem of asset pricing I). There are no
arbitrage opportunities in this model if and only if there exists an equivalent
martingale measure.
Proof. Suppose first that there exists an equivalent martingale measure $Q$ and let $\phi$ be a
self-financing portfolio whose value process $V$ satisfies $V_0 = 0$ and $V_T \ge 0$. Since
$\Omega$ is finite $\phi$ is bounded, hence $V = V_0 + \phi \cdot S$ is a $Q$-martingale (see Exercise
1). In particular, $E_Q V_T = E_Q V_0 = 0$, so $V_T = 0$ with $Q$-probability one, but
then also $P$-almost surely.
Now assume that no arbitrage opportunities exist, so that $K \cap L^\infty_+ = \{0\}$.
Let $A$ be the convex hull of the elements $1_{\{\omega_1\}}, \ldots, 1_{\{\omega_n\}}$ in $L^\infty$. This
is a convex, compact subset of $L^\infty$, disjoint from $K$ by assumption. Since the
latter is a linear subspace it is closed and convex, and hence we can apply the
separating hyperplane theorem. This yields a vector $q \in \mathbb{R}^n$ and $\alpha, \beta \in \mathbb{R}$ such
that
$$\langle q, f \rangle \le \alpha < \beta \le \langle q, h \rangle \tag{2.2}$$
for all $f \in K$ and $h \in A$. Since $K$ is a linear space we can take $\alpha = 0$ in this
display, i.e.
$$\langle q, f \rangle \le 0 < \beta \le \langle q, h \rangle$$
for all $f \in K$ and $h \in A$ (see Exercise 2). It follows that for every $i$ we have
$q_i = \langle q, 1_{\{\omega_i\}} \rangle \ge \beta > 0$, hence we can renormalize $q$ such that it becomes a
vector of strictly positive probabilities adding up to 1. The last display then
remains true, but with $\beta$ suitably normalized. The corresponding probability
measure $Q$, defined by $Q(\{\omega_i\}) = q_i$, is equivalent to $P$ and satisfies $E_Q f \le 0$
for all $f \in K$. But since $K$ is a linear space this implies that in fact $E_Q f = 0$
for all $f \in K$. By Exercise 3 it follows that $S$ is a $Q$-martingale.
2.3 Single period versus multiperiod models
Recall our setting of a finite underlying probability space $(\Omega, \mathcal{F}, P)$ on which
we have a filtration $(\mathcal{F}_t)_{t=0,\ldots,T}$ and a $d$-dimensional adapted stochastic process
$S = (S^{(1)}_t, \ldots, S^{(d)}_t)_{t=0,\ldots,T}$ describing the discounted asset prices. If $T \ge 2$ this
is called a multiperiod model. It turns out that absence of arbitrage in such
a model is equivalent to absence of arbitrage in all the single-period sub-models.
Theorem 2.3.1. There are no arbitrage opportunities in the full model if and
only if for every $t$, the one-period model $(S_t, S_{t+1})$, with respect to the filtration
$(\mathcal{F}_t, \mathcal{F}_{t+1})$, admits no arbitrage opportunities.
Proof. Suppose all the one-period models are free of arbitrage. Then by the
fundamental theorem there exist probability measures $Q_t$ on $(\Omega, \mathcal{F}_{t+1})$ such that
$Q_t$ is equivalent to $P$ on $\mathcal{F}_{t+1}$ and $E_{Q_t}(S_{t+1} \mid \mathcal{F}_t) = S_t$. By the lemma following
the theorem we may assume that $Q_t|_{\mathcal{F}_t} = P|_{\mathcal{F}_t}$. Now define the process $L$ by
$L_0 = 1$ and
$$L_t = \frac{dQ_0}{dP} \cdots \frac{dQ_{t-1}}{dP},$$
and define the measure $Q$ by $dQ = L_T \, dP$. Then $Q \in \mathcal{M}^e$ (Exercise 4), hence
the full model is free of arbitrage by the fundamental theorem.
The following lemma applies to the general, multiperiod model, but in the
proof of the theorem it is applied only to single period models.
Lemma 2.3.2. Suppose there is no arbitrage. Let $Q \in \mathcal{M}^e$ and define $Z_t = E_P(dQ/dP \mid \mathcal{F}_t)$
and $L_t = Z_t/Z_0$. Then the measure $Q'$ defined by $dQ' = L_T \, dP$
belongs to $\mathcal{M}^e$ and satisfies $Q'|_{\mathcal{F}_0} = P|_{\mathcal{F}_0}$.
Proof. Since $Z$ is a martingale we have $E_P(Z_T \mid \mathcal{F}_t) = Z_t$ for all $t \le T$. Hence
for $A \in \mathcal{F}_t$ and $t \le T$,
$$Q(A) = \int_A Z_T \, dP = \int_A Z_t \, dP.$$
It follows that $Z_t = (dQ|_{\mathcal{F}_t})/(dP|_{\mathcal{F}_t})$. The fact that $Q$ is equivalent to $P$ now
implies that $Z_t > 0$ for all $t$, and in particular that the process $L$ is well defined.
Since $Z$ is a positive $P$-martingale and $Z_0$ is $\mathcal{F}_0$-measurable, $L$ is a positive $P$-martingale
as well and therefore $Q'$ is a probability measure equivalent to $P$.
The fact that $S$ is a $Q$-martingale implies that $SZ$ is a $P$-martingale (check!).
Since $Z_0$ is $\mathcal{F}_0$-measurable, $SL$ is a $P$-martingale as well. Using the fact that
$L_t = (dQ'|_{\mathcal{F}_t})/(dP|_{\mathcal{F}_t})$, we obtain that $S$ is a $Q'$-martingale (check!), hence
$Q' \in \mathcal{M}^e$. Finally, the fact that $L_0 = 1$ implies that $Q'|_{\mathcal{F}_0} = P|_{\mathcal{F}_0}$.
2.4 Completeness
The FTAP does not say how many equivalent martingale measures there are in
the absence of arbitrage. We will see below that this is related to the notion of
completeness.
Definition 2.4.1. We call $f \in L^\infty$ attainable if $f = a + (\phi \cdot S)_T$ for some $a \in \mathbb{R}$
and predictable process $\phi$. The model is called complete if every claim $f \in L^\infty$
is attainable.
So an attainable contingent claim f is a random pay-off at time T that can
be realized by following a self-financing strategy requiring some initial capital
a.
For the proof of the following theorem it is useful to introduce, in addition
to the set $K$, the set of random variables
$$C = \{f \in L^\infty : \text{there exists a } g \in K \text{ such that } g \ge f\}.$$
This set is a cone$^1$ containing $K$ and it is easy to see that $K \cap L^\infty_+ = \{0\}$ if and
only if $C \cap L^\infty_+ = \{0\}$ (see Exercise 5). Moreover, under no-arbitrage it holds
that $K = C \cap (-C)$ (Exercise 6).
Lemma 2.4.2. For any probability measure $Q$ we have that $S$ is a $Q$-martingale
if and only if $E_Q g \le 0$ for all $g \in C$.
$^1$ Recall that a subset $C$ of a vector space is called a cone if for all $x \in C$ and $a \ge 0$, it
holds that $ax \in C$.
Proof. Take $g \in C$, say $g \le f$ for $f \in K$. Then if $S$ is a $Q$-martingale we have
$E_Q f = 0$ (see Exercise 1) and hence $E_Q g \le 0$.
If $E_Q g \le 0$ for all $g \in C$ then for all $f \in K$ it holds that $E_Q f = 0$, since
$f \in C$ and $-f \in C$ for $f \in K$. Hence, by Exercise 3, $S$ is a $Q$-martingale.
Theorem 2.4.3. Assume there are no arbitrage opportunities. Then
$$K = \{f \in L^\infty : E_Q f = 0 \text{ for all } Q \in \mathcal{M}^e\}.$$
Proof. The set $C$ is convex and closed (see Exercise 7) and hence, by the
Bipolar Theorem, equals its own bipolar $C^{00}$. Since $C$ is closed under multiplication
with positive scalars we have (see the appendix)
$C^0 = \{q \in \mathbb{R}^n : \langle g, q \rangle \le 0 \text{ for all } g \in C\}$. Hence, by the lemma preceding the theorem, the
collection $\mathcal{M}^a$ of probability measures $Q$ such that $S$ is a $Q$-martingale is contained
in $C^0$, hence $\mathrm{cone}(\mathcal{M}^a) \subseteq C^0$. By considering the elements $-1_{\{\omega_i\}} \in C$
we see that every $q \in C^0$ has nonnegative coordinates and hence is a nonnegative
multiple of a probability distribution. By the lemma again this probability
measure belongs to $\mathcal{M}^a$. We conclude that $\mathrm{cone}(\mathcal{M}^a) = C^0$. Now $C^0$ is closed
under multiplication with positive scalars and hence
$C = C^{00} = \{g \in \mathbb{R}^n : \langle g, q \rangle \le 0 \text{ for all } q \in C^0\}$. Combined with the preceding observations we obtain
$C = \{g \in \mathbb{R}^n : E_Q g \le 0 \text{ for all } Q \in \mathcal{M}^a\}$. By Exercise 8 we have that $\mathcal{M}^e$ is
dense in $\mathcal{M}^a$ and hence
$$C = \{g \in \mathbb{R}^n : E_Q g \le 0 \text{ for all } Q \in \mathcal{M}^e\}. \tag{2.3}$$
The proof is completed by using the fact that $K = C \cap (-C)$ (Exercise 9).
Corollary 2.4.4 (Completeness). Assume there are no arbitrage opportunities.
(i) The model is complete if and only if the equivalent martingale measure is
unique.
(ii) In case of completeness the representation $f = a + f_0$ with $a \in \mathbb{R}$ and
$f_0 \in K$ of a claim $f \in L^\infty$ is unique.
Proof. (i). Suppose first that $\mathcal{M}^e = \{Q\}$ and take $f \in L^\infty$. By Theorem 2.4.3
$f - E_Q f \in K$, hence $f$ is attainable. Conversely, suppose we have $Q_1 \ne Q_2$
in $\mathcal{M}^e$. Then there exists an $f \in L^\infty$ such that $E_{Q_1} f \ne E_{Q_2} f$. If this $f$ were
attainable, there would exist an $a \in \mathbb{R}$ such that $f - a \in K$. By Theorem 2.4.3
this would imply that $E_{Q_1} f = a = E_{Q_2} f$, a contradiction.
(ii). Exercise 10.
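In the one-period model of Chapter 1 the martingale measure is unique, so by (i) every claim should be attainable, with initial capital equal to its expectation under Q. The following minimal sketch (function name and numbers are illustrative assumptions) solves the two linear equations for the replicating portfolio and compares the required capital with the expectation under Q.

```python
from fractions import Fraction

def replicate(f_up, f_down, u, d):
    """Solve a + psi * (S_1 - S_0) = f in both states, with S_0 = 1 and S_1 in {u, d}.

    Returns the initial capital a and the number of stocks psi.
    """
    psi = (f_up - f_down) / (u - d)   # difference of the two state equations
    a = f_up - psi * (u - 1)          # back-substitute into the up-state equation
    return a, psi

# Illustrative claim: f = max(S_1 - 1, 0) with u = 3/2, d = 1/2 (assumed numbers).
u, d = Fraction(3, 2), Fraction(1, 2)
f_up, f_down = Fraction(1, 2), Fraction(0)
a, psi = replicate(f_up, f_down, u, d)

q = (1 - d) / (u - d)                           # martingale up-probability
expectation_under_q = q * f_up + (1 - q) * f_down
print(a, psi, expectation_under_q)
```

The capital a returned by the solver coincides with the expectation of f under Q, in line with Theorem 2.4.3.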
2.5 Change of numerair
Recall that in our model we have $d + 1$ traded assets and the processes
$S^{(0)}, \ldots, S^{(d)}$ are the prices of these assets relative to the price of the 0-th asset,
the so-called numerair. Intuitively, absence or presence of arbitrage should not
depend on the choice of numerair. In this section we prove that this is indeed
the case.
Any asset with a strictly positive price at all times could be taken as a
numerair. More generally, we shall allow any self-financing portfolio of assets
with a strictly positive value at all times. Let $\phi$ be a predictable process and
consider the value process $V = 1 + \phi \cdot S$. Assume that almost surely $V_t > 0$
for all $t$. We can view this portfolio as a traded asset and use it to express the
value of our $d + 1$ original assets. The new value process $\tilde S = (\tilde S^{(0)}, \ldots, \tilde S^{(d)})$ is
given by
$$\tilde S^{(i)} = \frac{S^{(i)}}{V}, \quad i = 0, \ldots, d.$$
Theorem 2.5.1 (Change of numerair). Suppose the model $S$ admits no
arbitrage opportunities. Then the model $\tilde S$ admits no arbitrage opportunities
either. It holds that $Q \in \mathcal{M}^e(S)$ if and only if the measure $\tilde Q$ defined by
$d\tilde Q = V_T \, dQ$ belongs to $\mathcal{M}^e(\tilde S)$.
Proof. We claim that $K(\tilde S) = V_T^{-1} K(S)$. To see this, first observe that,
for every $i$,
$$\Delta \tilde S^{(i)}_t = \frac{1}{V_t} \bigl( \Delta S^{(i)}_t - \tilde S^{(i)}_{t-1} \Delta V_t \bigr).$$
It follows that for a given predictable process $\psi$, we have
$$(\psi \cdot \tilde S)_T = \sum_t \frac{f_t}{V_t},$$
with $f_t$ an $\mathcal{F}_t$-measurable element of $K(S)$, for every $t$. By the lemma following
the theorem it holds that $f_t/V_t = g_t/V_T$ for certain $g_t \in K(S)$. Hence, we have
the inclusion $K(\tilde S) \subseteq V_T^{-1} K(S)$. The converse inclusion follows by symmetry,
by considering the model $\tilde S$ and taking $1/V$ as numerair.
It now follows from the definition of arbitrage that the model $S$ is free
of arbitrage if and only if this holds for $\tilde S$. To complete the proof, take an
equivalent probability measure $Q$. By Lemma 2.4.2 and Exercise 6 it holds that
$Q \in \mathcal{M}^e(S)$ if and only if $E_Q f = 0$ for all $f \in K(S)$. By the first part of the
proof, this holds if and only if $E_Q V_T f = 0$ for all $f \in K(\tilde S)$, which is the same as
saying that the measure $\tilde Q$ defined by $d\tilde Q = V_T \, dQ$ belongs to $\mathcal{M}^e(\tilde S)$.
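Lemma 2.5.2. For $f$ an $\mathcal{F}_t$-measurable element of $K(S)$ and $t \le T$, it holds
that $(V_T/V_t) f \in K(S)$.
Proof. Observe that
$$\frac{V_T}{V_t} f = f + \sum_{s=t+1}^T \frac{f}{V_t} \Delta V_s = f + (\psi \cdot S)_T,$$
where
$$\psi_s = \frac{f}{V_t} \phi_s 1_{\{s \ge t+1\}}.$$
Since $f$ is $\mathcal{F}_t$-measurable the process $\psi$ is predictable, and it follows that
$(V_T/V_t) f \in K(S)$.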
2.6 No-arbitrage pricing
Suppose that in our market we can buy the claim $f \in L^\infty$ at price $a$ at time
$t = 0$. Then the collection of all claims that we can attain with 0 initial cost
changes from $K$ to
$$K^{f,a} = \mathrm{span}(K \cup \{f - a\}).$$
In case of no-arbitrage it should hold, as before, that $K^{f,a} \cap L^\infty_+ = \{0\}$. This
leads us to the following definition.
Definition 2.6.1. We call $a \in \mathbb{R}$ an arbitrage-free price for the claim $f \in L^\infty$
if $K^{f,a} \cap L^\infty_+ = \{0\}$.
Observe that if the claim $f \in L^\infty$ is attainable at price $a$, i.e. $f - a \in K$,
then $K^{f,a} = K$. In the absence of arbitrage we have $K \cap L^\infty_+ = \{0\}$ and hence
in this case $a$ is an arbitrage-free price for $f$. Moreover, any other value $b \ne a$
is not an arbitrage-free price for $f$ (Exercise 11).
Theorem 2.6.2 (No-arbitrage pricing). Assume absence of arbitrage and
let $f \in L^\infty$. Define $I = \{E_Q f \mid Q \in \mathcal{M}^e\}$. The set $I$ is the collection of all
arbitrage-free prices for $f$. Either $I = \{a\}$, in which case $f$ is attainable at price
$a$, or $I$ is a bounded, open interval, in which case $f$ is not attainable.
Proof. Define
$$\underline{\pi}(f) = \inf\{E_Q f \mid Q \in \mathcal{M}^e\}, \qquad \overline{\pi}(f) = \sup\{E_Q f \mid Q \in \mathcal{M}^e\}.$$
Suppose $\underline{\pi}(f) = \overline{\pi}(f) = a$. Then by Theorem 2.4.3, $f - a \in K$, which
means that $f$ is attainable at price $a$. Hence, by the remarks preceding the
theorem, $a$ is the unique arbitrage-free price for $f$.
Now assume $\underline{\pi}(f) < \overline{\pi}(f)$. Since $f$ is bounded and $\mathcal{M}^e$ is convex,
$I = \{E_Q f \mid Q \in \mathcal{M}^e\}$ is a bounded subinterval of $\mathbb{R}$. Suppose $a \in I$. Then there
is a $Q \in \mathcal{M}^e$ such that $E_Q(f - a) = 0$. This implies that $K^{f,a} \cap L^\infty_+ = \{0\}$
(check). Hence, $a$ is an arbitrage-free price for $f$. Conversely, suppose that
$K^{f,a} \cap L^\infty_+ = \{0\}$. Then by repeating the proof of Theorem 2.2.3 with $K^{f,a}$
in the place of $K$ we find a $Q \in \mathcal{M}^e$ such that $E_Q g = 0$ for all $g \in K^{f,a}$. In
particular, we see that $E_Q(f - a) = 0$, hence $a \in I$. It remains to show that $I$
is an open interval. Set $a = \overline{\pi}(f)$, the right boundary point of $I$, and consider
the claim $f - a$. By definition we have $E_Q(f - a) \le 0$ for all $Q \in \mathcal{M}^e$. Since
we have the representation (2.3) for the set $C$, it follows that $f - a \in C$. Hence
there exists a $g \in K$ such that $g \ge f - a$. Now suppose that $a \in I$. Then
there exists a $Q' \in \mathcal{M}^e$ such that $E_{Q'} f = a$ and hence $E_{Q'}(g - (f - a)) = 0$,
so that $g = f - a$. But this means that $f - a \in K$, i.e. $f$ is attainable at price
$a$. Theorem 2.4.3 implies that $I$ reduces to a singleton in that case, which leads
to a contradiction. We conclude that the right endpoint of $I$ does not belong
to $I$. The left endpoint can be handled similarly (or by considering the claim
$-f$).
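To see the dichotomy of the theorem in a concrete case, one can take a one-period model with zero interest and three possible stock values, which is an assumption made only for this illustration. The martingale measures then form a line segment, parametrized below by the probability of the middle state, and the expectation of a claim is affine in that parameter, so the endpoints of I are the values at the ends of the segment. The function name and the parametrization are hypothetical helpers, not taken from the notes.

```python
from fractions import Fraction

def price_interval(s, f):
    """Endpoints (inf, sup) of {E_Q f : Q equivalent martingale measure}.

    s = (s1, s2, s3): possible values of S_1, with s1 > s2 > s3, s3 < 1 < s1, S_0 = 1.
    f = (f1, f2, f3): pay-off of the claim in the three states.
    """
    s1, s2, s3 = map(Fraction, s)
    f1, f2, f3 = map(Fraction, f)
    assert s1 > s2 > s3 and s3 < 1 < s1
    # Largest admissible probability of the middle state: q1 and q3 stay >= 0.
    lam_max = min((1 - s3) / (s2 - s3), (s1 - 1) / (s1 - s2))

    def expectation(lam):
        # q1, q3 solve q1 + lam + q3 = 1 and q1*s1 + lam*s2 + q3*s3 = 1.
        q1 = ((1 - s3) - lam * (s2 - s3)) / (s1 - s3)
        q3 = ((s1 - 1) - lam * (s1 - s2)) / (s1 - s3)
        return q1 * f1 + lam * f2 + q3 * f3

    ends = (expectation(Fraction(0)), expectation(lam_max))
    return min(ends), max(ends)

# Claim max(S_1 - 1, 0) with stock values 2, 1, 1/2 (assumed numbers).
print(price_interval((2, 1, Fraction(1, 2)), (1, 0, 0)))
# The claim S_1 - 1 lies in K, so its interval collapses to a single point.
print(price_interval((2, 1, Fraction(1, 2)), (1, 0, Fraction(-1, 2))))
```

When the two returned endpoints differ, the set of arbitrage-free prices is the open interval between them, since the endpoints correspond to measures that give zero mass to some state and are therefore not equivalent to P.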
2.7 Example: binomial model
Consider a world with time points $t = 0, \ldots, T$. In this world there exists a
bank where money can be deposited or borrowed at interest rate $r > 0$ and a
stock is traded. We denote the bank account process by $B$, so $B_t = (1 + r)^t$.
The value $X_0$ of the stock at time 0 equals 1 and given the stock has value $X_t$ at
time $t$, the value $X_{t+1}$ at time $t + 1$ equals $uX_t$ with probability $p \in (0, 1)$ and
$dX_t$ with probability $1 - p$, where $d < u$ are certain given numbers.
This model is called the binomial model.
Let us choose the bank account process $B$ as numerair and denote the
discounted price processes by $S^{(0)}$ and $S^{(1)}$, so $S^{(0)} \equiv 1$ and $S^{(1)}_t = X_t/B_t$. Then
under the objective probability measure $P$ described above we have $S^{(1)}_0 = 1$
and given $S^{(1)}_t$ we have that $S^{(1)}_{t+1}$ equals $(u/(1 + r))S^{(1)}_t$ with probability $p$ and
$(d/(1 + r))S^{(1)}_t$ with probability $1 - p$.
We want to investigate the existence of arbitrage opportunities in this
model. According to Theorem 2.3.1, it suffices to consider the one-period model
we studied in Chapter 1. Proposition 1.0.1 (applied with d/(1 + r) in the place
of d and u/(1 + r) in the place of u) then implies that the binomial model is
free of arbitrage if and only if d < 1 + r < u. The considerations following
Proposition 1.0.1 show that the one-period model admits a unique martingale
measure, described by changing the probability with which the stock moves up
from p to q = (1 + r - d)/(u - d). This implies that the full, multi-period
model also admits only one martingale measure Q. Hence, by Corollary 2.4.4,
the model is complete.
In this complete model every claim $f \in L^\infty$ is attainable and by Theorem
2.6.2 its arbitrage-free price is given by the expectation of $f$ under the martingale
measure. Recall that all of this is still relative to the chosen numerair, the bank
account process $B$. Hence, a claim $f$ should be interpreted as a pay-off at
time $T$ of $f$ units of the bank account process. In ordinary money terms, this
is a pay-off of $fB_T = f(1 + r)^T$ euros at time $T$. At time 0 this has the value
of $E_Q f$ units of the bank account process. But $B_0$ equals one euro, and hence
a pay-off of $f(1 + r)^T$ euros at time $T$ is worth $E_Q f$ euros at time 0. In other
words, a pay-off of $f$ euros at time $T$ is worth $E_Q (1 + r)^{-T} f$ euros at time 0.
Putting this together we obtain the following result.
Proposition 2.7.1. The binomial model is free of arbitrage if and only if $d < 1 + r < u$.
In this case the model is complete and the arbitrage-free value at
time 0 of a claim paying $f \in L^\infty$ euros at time $T$ is $E_Q (1 + r)^{-T} f$, where $Q$
is the probability measure obtained by changing the probability with which the
stock price moves up from $p$ to $q = (1 + r - d)/(u - d)$.
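For claims that depend only on the terminal stock price, the pricing formula of the proposition reduces to a finite sum over the number of up-moves. The sketch below is an illustration under that extra assumption (path-dependent claims are not covered); the function name and parameter values are hypothetical.

```python
from fractions import Fraction
from math import comb

def binomial_price(payoff, u, d, r, T):
    """E_Q[(1 + r)^(-T) f] for f = payoff(X_T), where X_T = u^k d^(T-k) after k up-moves."""
    if not d < 1 + r < u:
        raise ValueError("the model admits arbitrage: need d < 1 + r < u")
    q = (1 + r - d) / (u - d)   # up-probability under the martingale measure
    expected = sum(comb(T, k) * q**k * (1 - q)**(T - k) * payoff(u**k * d**(T - k))
                   for k in range(T + 1))
    return expected / (1 + r)**T

# Assumed parameters for illustration.
u, d, r = Fraction(2), Fraction(1, 2), Fraction(1, 4)
print(binomial_price(lambda x: max(x - 1, 0), u, d, r, 1))   # call with strike 1
print(binomial_price(lambda x: x, u, d, r, 3))               # the stock itself
```

Pricing the stock itself returns its time-0 value 1, which is a quick sanity check that the discounted stock price is a martingale under Q.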
2.8 Exercises
1. If $S$ is a (multi-dimensional) martingale and $\phi$ is a (multi-dimensional)
bounded, predictable process, then $\phi \cdot S$ is a martingale.
2. In the proof of Theorem 2.2.3, show that the fact that $K$ is a linear space
implies that we can take $\alpha = 0$ in (2.2).
3. If $E_Q(\phi \cdot S)_T = 0$ for every bounded, predictable process $\phi$, then $S$ is a
$Q$-martingale.
4. In the proof of Theorem 2.3.1, show that $Q \in \mathcal{M}^e$.
5. Show that $K \cap L^\infty_+ = \{0\}$ if and only if $C \cap L^\infty_+ = \{0\}$.
6. Show that under no-arbitrage, $K = C \cap (-C)$.
7. Let $K$ be a linear subspace of $\mathbb{R}^n$ and $L_+ = \{x \in \mathbb{R}^n : x_1 \ge 0, \ldots, x_n \ge 0\}$.
(a) For $v_1, \ldots, v_m \in \mathbb{R}^n$, show that the convex cone
$C = \{\sum c_i v_i : c_1 \ge 0, \ldots, c_m \ge 0\}$ is closed. (Hint: Write $C = \{ax : a \ge 0, x \in H\}$,
where $H = \{\sum c_i v_i : c_1 \ge 0, \ldots, c_m \ge 0, \sum c_i = 1\}$ is the convex hull
of the $v_i$, and first prove that $H$ is compact.)
(b) Show that $K + L_+$ is closed. (Hint: Say $\dim(K) = k$ and let $f_1, \ldots, f_n$
be an orthonormal basis of $\mathbb{R}^n$ such that $f_1, \ldots, f_k$ is an orthonormal
basis of $K$. Using (a), show that the coordinates of the points of
$K + L_+$ relative to this basis form a closed subset of $\mathbb{R}^n$.)
8. Show that under no-arbitrage, $\mathcal{M}^e$ is dense in $\mathcal{M}^a$. (Here we identify
probability measures on $\Omega$ with points in $\mathbb{R}^n$ again.)
9. Supply the details of the last part of the proof of Theorem 2.4.3.
10. Prove Corollary 2.4.4.(ii).
11. Assume absence of arbitrage. Show that for a claim $f \in L^\infty$ that is
attainable at price $a$, the value $a$ is the unique arbitrage-free price.
12. Consider a one-period model with a bank with zero interest and a stock
which has value 1 at time 0 and value $s_1$, $s_2$ or $s_3$, respectively, at time
1, with probabilities $p_1$, $p_2$ or $p_3$, respectively. Assume that $s_1 > s_2 > s_3$
and the $p_i$'s are non-zero and add up to 1.
(i) Give conditions on the values $s_1, s_2, s_3$ characterizing absence of arbitrage.
(ii) In case of absence of arbitrage, investigate whether the model is complete
or not.
3 The Kreps-Yan theorem
3.1 Description of the model
Let $S = (S_t)_{t \in [0,T]}$ be a $(d + 1)$-dimensional, cadlag, adapted, locally bounded
stochastic process, with $T > 0$ a fixed time horizon, defined on a probability
space $(\Omega, \mathcal{F}, P)$ endowed with a filtration $(\mathcal{F}_t)_{t \in [0,T]}$ satisfying the usual conditions.
As in the preceding chapter, we think of $S = (S^{(0)}, \ldots, S^{(d)})$ as describing
the value of $d + 1$ financial assets, expressed relative to a chosen numerair, which
is the 0-th asset. In particular, we assume that $S^{(0)} \equiv 1$.
By the classical theorems on regularity of continuous-time martingales, the
usual conditions on the filtration imply that (local) martingales have a cadlag
modification. Since the basic theorems of martingale theory are valid for right-continuous
martingales, the usual conditions ensure that we may apply the
classical theorems to the martingales that we encounter.
Observe that our setup includes the discrete-time case: simply take the
process $S$ (and the filtration) to be constant on the intervals $[t - 1, t)$ for integers
$t$. If the underlying probability space is finite the process $S$ is necessarily
uniformly bounded, and hence the present setup includes the setting of the
preceding chapter.
A first important question is which trading strategies we want to allow in
this model. At the very least we shall allow strategies in which the asset portfolio
is only rebalanced at a finite number of stopping times, in a predictable manner.
Definition 3.1.1. A $d$-dimensional process $\phi$ is called a simple trading strategy
if it is of the form
$$\phi_t = \sum_{i=1}^n \phi_i 1_{(\tau_{i-1}, \tau_i]}(t),$$
where $0 = \tau_0 \le \cdots \le \tau_n \le T$ are finite stopping times and the $\phi_i$ are $d$-dimensional,
$\mathcal{F}_{\tau_{i-1}}$-measurable random variables.
The strategy is called admissible if, in addition, the stopped process $S^{\tau_n}$
and the random variables $\phi_1, \ldots, \phi_n$ are uniformly bounded.
The interpretation of the definition is clear: $\phi_i^{(j)}$ is the number of assets of
type j in the portfolio between times $\tau_{i-1}$ and $\tau_i$. As in the preceding chapter
we define the stochastic process $\phi \cdot S$ by
$$(\phi \cdot S)_t = \sum_{i=1}^n \langle \phi_i, S_{\tau_i \wedge t} - S_{\tau_{i-1} \wedge t} \rangle, \quad t \in [0, T].$$
As before, $\phi \cdot S$ should be interpreted as the value process of a self-financing
portfolio starting with 0 initial capital and following the trading strategy $\phi$,
the adjustments of the positions in assets 1 to d being financed by taking the
appropriate amount from the "bank account", modelled by the numeraire asset
0.
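The value process of a simple strategy can be evaluated directly from its defining sum. The following is a minimal sketch for a single asset, assuming a piecewise-constant price path and deterministic rebalancing times (a special case of stopping times); the path, the times and the positions are made-up illustration data, and the helper names are hypothetical.

```python
from bisect import bisect_right

def price_at(times, prices, t):
    """Piecewise-constant, right-continuous price path: value at time t."""
    i = bisect_right(times, t) - 1
    return prices[i]

def simple_strategy_value(times, prices, taus, phis, t):
    """(phi . S)_t = sum_i phi_i * (S_{tau_i ^ t} - S_{tau_{i-1} ^ t}) for one asset."""
    total = 0.0
    for i, phi in enumerate(phis, start=1):
        upper = min(taus[i], t)       # tau_i ^ t
        lower = min(taus[i - 1], t)   # tau_{i-1} ^ t
        total += phi * (price_at(times, prices, upper) - price_at(times, prices, lower))
    return total

# Hypothetical path of one discounted asset, observed at times 0, 1, 2, 3.
times = [0.0, 1.0, 2.0, 3.0]
prices = [1.0, 1.5, 1.2, 2.0]
# Hold 2 units on (0, 1], then -1 unit (a short position) on (1, 3].
taus = [0.0, 1.0, 3.0]
phis = [2.0, -1.0]

values = [simple_strategy_value(times, prices, taus, phis, t) for t in times]
print(values)
```

The position held on (0, 1] only earns the price change over that interval; after time 1 its contribution is frozen, which is exactly the effect of the minima with t in the formula.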
Our first notion of no-arbitrage in this setting is the requirement that we
can not make a risk-free profit by following a simple, admissible strategy. We
proceed analogously to the finite case and first define the space
$$K^s = \{(\phi \cdot S)_T : \phi \text{ simple, admissible}\}$$
of pay-offs that can be achieved with 0 initial capital, following a simple, admissible
self-financing trading strategy.
Definition 3.1.2. We say that S satisfies the condition of no-arbitrage with
simple strategies if $K^s \cap L^\infty_+(\Omega, \mathcal{F}, P) = \{0\}$.
A sufficient condition for the absence of arbitrage with simple strategies
is the existence of a so-called equivalent local martingale measure. This is, by
definition, a probability measure Q equivalent to the objective measure P such
that S is a local martingale under Q.
Proposition 3.1.3. If there exists an equivalent local martingale measure, S
admits no arbitrage with simple strategies.
Proof. Let Q be an equivalent local martingale measure. We first show that
for every simple, admissible strategy $\phi$ it holds that
$$E_Q (\phi \cdot S)_T = 0.$$
By Definition 3.1.1 it suffices to show that if $0 \le \sigma \le \tau \le T$ are stopping times
such that $S^\tau$ is bounded and X is a bounded, d-dimensional, $\mathcal{F}_\sigma$-measurable
random variable, then
$$E_Q \langle X, S_\tau - S_\sigma \rangle = 0.$$
This can be derived from the optional stopping theorem (Exercise 1).
Now suppose we have $f \in K^s$, $f \ge 0$, say $f = (\phi \cdot S)_T$ for a simple,
admissible $\phi$. Then by the first part of the proof we have $E_Q f = 0$. Since f is
nonnegative it follows that f vanishes Q-a.s. and hence also P-a.s., since P and
Q are equivalent.
The converse of this proposition is, unfortunately, not true. To have the
existence of a (local) martingale measure in this general continuous-time setting,
the absence of simple arbitrages is not strong enough.
Example 3.1.4. Consider a process $S = (S_t)_{t \in [0,1]}$ with $S_0 = 1$ that is
constant except for jumps at times $t_n = 1 - (n+1)^{-1}$ for n = 1, 2, .... At
time $t_n$ the process S has a jump of magnitude $3^{-n} Z_n$, where $Z_1, Z_2, \ldots$ are
independent random variables with $P(Z_n = 1) = 1 - P(Z_n = -1) = 1/2 + \alpha_n$ for
certain numbers $\alpha_n \in (-1/2, 1/2)$. Since the process S is uniformly bounded,
it is a martingale under a measure Q if it is a local martingale under Q (check!).
But there is only one probability measure under which S is a martingale, namely the
measure Q under which $Q(Z_n = 1) = 1 - Q(Z_n = -1) = 1/2$. Hence, by
Example B.2.4, there exists no equivalent local martingale measure if we choose
the $\alpha_n$ such that $\sum_n \alpha_n^2 = \infty$. However, this model does satisfy the condition of
no-arbitrage with simple strategies. To see that, first observe that if there is
a simple admissible arbitrage strategy, then there is a simple arbitrage strategy
of the form $\phi = h 1_{(\sigma, \tau]}$, for stopping times $\sigma \le \tau \le 1$ and a bounded
$\mathcal{F}_\sigma$-measurable random variable h (Exercise 2). Moreover, we only have to consider
stopping times that take values in the collection of $t_n$'s. Such a strategy has pay-off
$V = h(S_\tau - S_\sigma)$. Now observe that on the event $A_n = \{\sigma = t_{n-1}, \tau \ge t_n\}$ we
have $\mathrm{sign}(S_\tau - S_\sigma) = \mathrm{sign}(Z_n) = Z_n$, so $\mathrm{sign}(V) = \mathrm{sign}(h) Z_n$. By assumption
$\mathrm{sign}(V) \ge 0$, so we get that
$$\mathrm{sign}(h) 1_{A_n} Z_n \ge 0.$$
Note that $A_n \in \mathcal{F}_{t_{n-1}}$ and hence, by definition of $\mathcal{F}_\sigma$, $\mathrm{sign}(h) 1_{A_n}$ is
$\mathcal{F}_{t_{n-1}}$-measurable. So $\mathrm{sign}(h) 1_{A_n}$ and $Z_n$ are independent and in view of the
preceding display, it follows that $\mathrm{sign}(h) 1_{A_n} = 0$ a.s. (check). Hence h = 0 on every
event $A_n$ and therefore h = 0 on the event $\{\tau > \sigma\} = \bigcup_n A_n$.
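The step sign(S_tau - S_sigma) = Z_n in Example 3.1.4 rests on the fact that the first jump after sigma dominates all later jumps combined, because the jump sizes 3^{-n} decay geometrically: the sum of 3^{-k} over k > n equals 3^{-n}/2. The sketch below checks this by brute force over all sign patterns of a few consecutive jumps, using exact rational arithmetic; the function names are hypothetical, and a finite enumeration is of course only an illustration, not a proof.

```python
from fractions import Fraction
from itertools import product

def jump_size(n):
    """Magnitude 3^{-n} of the jump at time t_n (exact rational arithmetic)."""
    return Fraction(1, 3 ** n)

def tail_bound(n):
    """Sum of all later jump magnitudes: sum_{k > n} 3^{-k} = 3^{-n} / 2."""
    return jump_size(n) / 2

def increment_sign(z, n, m):
    """Sign of sum_{k=n}^{m} 3^{-k} z_k, where z maps k to a sign in {-1, +1}."""
    total = sum(jump_size(k) * z[k] for k in range(n, m + 1))
    return 1 if total > 0 else -1

def first_jump_dominates(n, m):
    """Check that the increment over jumps n..m always has the sign of z_n."""
    for signs in product((-1, 1), repeat=m - n + 1):
        z = dict(zip(range(n, m + 1), signs))
        if increment_sign(z, n, m) != z[n]:
            return False
    return True

checks = [first_jump_dominates(n, n + 5) for n in range(1, 6)]
print(checks)
```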
3.2 Kreps-Yan theorem
As in Chapter 2 we can introduce the cone
$$C^s = \{f \in L^\infty(\Omega, \mathcal{F}, P) : \text{there exists a } g \in K^s \text{ such that } g \ge f\}.$$
Then (see Exercise 5 of Chapter 2) no-arbitrage with simple integrands is equivalent
to $C^s \cap L^\infty_+(\Omega, \mathcal{F}, P) = \{0\}$. In the preceding section we remarked that
this condition is not strong enough to guarantee the existence of an equivalent
martingale measure. It turns out we have to replace the cone $C^s$ by its closure
$\overline{C}$ with respect to the weak*-topology on $L^\infty(\Omega, \mathcal{F}, P)$, the latter space viewed
as the dual of $L^1(\Omega, \mathcal{F}, P)$.
Definition 3.2.1. We say that S satisfies the condition of no free lunch if
$\overline{C} \cap L^\infty_+(\Omega, \mathcal{F}, P) = \{0\}$.
Clearly this condition is stronger than the condition of no-arbitrage with
simple strategies. It gives us the following version of the fundamental theorem
of asset pricing.
Theorem 3.2.2 (Fundamental theorem of asset pricing II). The process
S satisfies the condition of no free lunch if and only if there exists an equivalent
local martingale measure.
Proof. Suppose first that there exists an equivalent local martingale measure Q.
Then by the first part of the proof of Proposition 3.1.3 it holds that $E_Q f \le 0$ for
all $f \in C^s$. Since the map $f \mapsto E_Q f$ is weak*-continuous (check), the inequality
$E_Q f \le 0$ holds in fact for every $f \in \overline{C}$. It follows that if $f \in \overline{C}$, $f \ge 0$, then
$E_Q f = 0$, so that f vanishes Q-a.s. and hence also P-a.s.
Now suppose that S satisfies the condition of no free lunch. For $\delta \in (0, 1)$,
define $B_\delta = \{f \in L^\infty : 0 \le f \le 1, Ef \ge \delta\}$. The set $B_\delta$ is a weak*-closed
subset of the unit ball of $L^\infty$, the latter space viewed as dual of $L^1$ (Exercise
4). Hence, by Alaoglu's theorem it is weak*-compact. Clearly, it is also convex.
By the separation theorem there exists a $g_\delta \in L^1$ and $\alpha, \beta \in \mathbb{R}$ such that
$$\sup_{f \in \overline{C}} E g_\delta f \le \alpha < \beta \le \inf_{h \in B_\delta} E g_\delta h.$$
Since $0 \in \overline{C}$ we have $\alpha \ge 0$. Since $\overline{C}$ is a cone, it follows that $E g_\delta f \le 0$ for all
$f \in \overline{C}$, hence
$$\sup_{f \in \overline{C}} E g_\delta f \le 0 < \inf_{h \in B_\delta} E g_\delta h.$$
Since $\overline{C}$ contains all negative functions in $L^\infty$ we have $g_\delta \ge 0$ a.s. Also observe
that $1 \in B_\delta$, so that $E g_\delta > 0$, and therefore $g_\delta$ does not vanish almost surely.
We renormalize $g_\delta$ such that $E g_\delta = 1$.
For every positive integer n we now define the probability measure $Q_n$
by $dQ_n = g_{2^{-n}}\, dP$ and $Q = \sum_n a_n Q_n$, for a sequence $a_n$ of positive numbers
such that $\sum_n a_n = 1$. Note that if $P(A) > 2^{-n}$ then $1_A \in B_{2^{-n}}$ and hence
$Q_n(A) > 0$. It follows that P is absolutely continuous with respect to Q (check).
The fact that Q is absolutely continuous relative to P is immediate, and hence
P and Q are equivalent. To complete the proof observe that if $f \in \overline{C}$, then
$E_Q f = \sum_n a_n E_{Q_n} f = \sum_n a_n E g_{2^{-n}} f \le 0$. This implies that S is a local martingale
under Q (Exercise 3).
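The "(check)" on the absolute continuity of P with respect to Q in the proof above can be spelled out in a few lines. The following is a sketch of one way to do it, using only the properties of the measures Q_n and the weights a_n stated in the proof.

```latex
\begin{align*}
Q(A) = 0
  &\;\Longrightarrow\; Q_n(A) = 0 \text{ for all } n
     && \text{(since } Q = \textstyle\sum_n a_n Q_n \text{ and all } a_n > 0\text{)} \\
  &\;\Longrightarrow\; P(A) \le 2^{-n} \text{ for all } n
     && \text{(contrapositive of } P(A) > 2^{-n} \Rightarrow Q_n(A) > 0\text{)} \\
  &\;\Longrightarrow\; P(A) = 0 .
\end{align*}
```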
3.3 Discussion
We saw in this chapter that in the general continuous-time setting, absence of
arbitrage with simple strategies is not sufficient for the existence of a (local)
martingale measure. The collection of pay-offs that can be attained with simple
strategies is somehow too small, and it turned out to be necessary to take the
weak*-closure of this set. While this is completely satisfactory from a mathematical
point of view, we should observe that it destroys the economic meaning
of Theorem 3.2.2. Since taking the weak*-closure is not a very intuitive operation,
the weak*-closure of a collection of pay-offs of simple strategies is not
a set that can for instance be interpreted as a collection of pay-offs of "more
complex" strategies.
To obtain an economically meaningful result, we would prefer to replace
the weak*-topology by a stronger, more intuitive one. It turns out that this
is possible if we are willing to restrict ourselves to asset prices that are semimartingales.
Then the class of simple strategies can be enlarged in a natural
way, using the theory of integration with respect to semimartingales. Taking
the closure with respect to the weak*-topology can then be replaced by taking the
closure in the norm-topology of $L^\infty$, which is much more satisfactory from the
economic perspective.
3.4 Exercises
1. Complete the first part of the proof of Proposition 3.1.3.
2. Show that if there exists a simple arbitrage strategy, there also exists an
arbitrage strategy of the form $\phi = h 1_{(\sigma, \tau]}$ with $\sigma \le \tau \le 1$ stopping times
and h a bounded, $\mathcal{F}_\sigma$-measurable random variable. (Hint: use induction
on the number of stopping times in the given simple strategy.)
3. Show that if the equivalent measure Q satisfies $E_Q f \le 0$ for all $f \in \overline{C}$,
then it is an equivalent local martingale measure.
4. Show that the set $B_\delta$ defined in the proof of Theorem 3.2.2 is a weak*-closed
subset of the unit ball of $(L^1)^*$.
4
The general theorem
4.1 Preliminaries on stochastic integration
In this chapter we will assume that the asset price process S is a one-dimensional
semimartingale, defined on some filtered probability space
$(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ satisfying the usual conditions. We will consider all
processes on a finite time interval [0, T], with the time horizon T > 0 fixed. As
before, S is interpreted as the value of an asset relative to a numeraire asset;
the numeraire asset itself has the constant value 1.
We will interpret a predictable process $\phi$ as a trading strategy, $\phi_t$ denoting
the number of non-numeraire assets that we hold at time t. If $\phi$ is a simple
process of the form
$$\phi_t = \sum_{i=1}^n \phi_i 1_{(t_{i-1}, t_i]}(t),$$
for $0 = t_0 < \cdots < t_n = T$ a deterministic partition of [0, T] and $\phi_i$ bounded
and $\mathcal{F}_{t_{i-1}}$-measurable, then, as explained in Chapter 2, the process
$$(\phi \cdot S)_t = \sum_{i=1}^n \phi_i (S_{t_i \wedge t} - S_{t_{i-1} \wedge t}), \quad t \in [0, T],$$
can be interpreted as the value process of a self-financing portfolio starting with
0 initial capital and following the trading strategy $\phi$, the adjustments of the
position of the non-numeraire assets being financed by taking the appropriate
amount from the "bank account", modelled by the numeraire asset.
As the notation suggests, $\phi \cdot S$ is exactly the stochastic integral process
of the locally bounded, predictable process $\phi$ relative to the semimartingale
S. Hence, using stochastic integration theory we can now go beyond simple
strategies. However, to retain an economically meaningful theory we have to
verify that for non-simple predictable processes, $\phi \cdot S$ can still be interpreted
as the value process associated to the trading strategy $\phi$. Since we are now
considering strategies that can involve continuous trading, we should consider
approximations to make this precise. For a fine enough partition
$0 = t_0 < \cdots < t_n = T$ of [0, T], the strategy $\phi$ is well approximated by the simple strategy
$\sum_i \phi_{t_{i-1}} 1_{(t_{i-1}, t_i]}$ and the value process associated to this simple strategy equals
$$\sum_i \phi_{t_{i-1}} (S_{t_i} - S_{t_{i-1}}).$$
Hence, we want the latter process to be a good approximation of the integral
process $\phi \cdot S$. Thanks to the continuity property of the stochastic integral this
is indeed the case.
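Lemma 4.1.1. Let $\phi$ be a left-continuous, adapted process. Then if
$0 = t^n_0 < \cdots < t^n_{k_n} = T$ is a sequence of partitions of [0, T] with mesh tending to 0, we have
$$\sup_{t \in [0,T]} \Bigl| \sum_i \phi_{t^n_{i-1}} (S_{t^n_i \wedge t} - S_{t^n_{i-1} \wedge t}) - (\phi \cdot S)_t \Bigr| \xrightarrow{P} 0.$$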
Proof. For every n, define the process $\phi^n$ by
$$\phi^n = \sum_i \phi_{t^n_{i-1}} 1_{(t^n_{i-1}, t^n_i]}.$$
Then $\phi^n$ is left-continuous and adapted, hence locally bounded and predictable.
Since $\phi$ is left-continuous it holds that $\phi^n \to \phi$ on $[0, T] \times \Omega$, and we have
$$|\phi^n_t| \le \sup_{s \le t} |\phi_s|.$$
The process on the right-hand side of the display is adapted and left-continuous,
hence predictable and locally bounded (cf. Exercise 5.53 in Van der Vaart's
notes). The conclusion of the lemma now follows from Lemma 5.52 of Van der
Vaart's notes.
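The convergence in Lemma 4.1.1 can be observed numerically. The sketch below is an illustration under specific assumptions that are not part of the notes: both the integrand and the integrator are taken to be one simulated Brownian path W, for which the stochastic integral has the known closed form (W_T^2 - T)/2. The left-point sums are evaluated at the terminal time T only, over partitions of decreasing mesh, and the function names are hypothetical.

```python
import math
import random

def brownian_path(n_steps, horizon, rng):
    """Simulate W on an equidistant grid with n_steps increments."""
    dt = horizon / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

def left_point_sum(path, stride):
    """sum_i W_{t_{i-1}} (W_{t_i} - W_{t_{i-1}}) over the sub-partition with the given stride."""
    pts = [path[i] for i in range(0, len(path), stride)]
    return sum(a * (b - a) for a, b in zip(pts, pts[1:]))

rng = random.Random(0)
T = 1.0
n = 2 ** 14
W = brownian_path(n, T, rng)
ito_value = 0.5 * (W[-1] ** 2 - T)   # closed form of the integral of W against W over [0, T]

# Print the number of partition intervals and the absolute error at time T.
for stride in (2 ** 10, 2 ** 6, 2 ** 2, 1):
    approx = left_point_sum(W, stride)
    print(n // stride, abs(approx - ito_value))
```

On a fixed partition the left-point sum equals (W_T^2 - Q)/2 exactly, where Q is the sum of squared increments over that partition, so the error in this example is |Q - T|/2 and shrinks as the realized quadratic variation approaches T.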
Unfortunately, the definition of the stochastic integral of locally bounded
predictable processes with respect to semimartingales is still not general enough
for the next version of the FTAP. To extend the integral we first endow the
space of semimartingales with a topology. We denote by S(P) the space of all
P-semimartingales on our fixed filtered probability space. For $X \in S(P)$ we
define
$$\|X\|_{S(P)} = \sup_{|H| \le 1} E\bigl( |(H \cdot X)_T| \wedge 1 \bigr),$$
where the supremum is over all predictable processes H that are uniformly
bounded by 1. It is easy to see that $\|\cdot\|_{S(P)}$ satisfies the triangle inequality
(check). It follows that we can define a metric d on S(P) by setting
$d(X, Y) = \|X - Y\|_{S(P)}$. The topology that this metric generates on S(P) is called the
semimartingale topology. Observe that $X^n \to X$ in this topology if and only if
$$(H \cdot X^n)_T \xrightarrow{P} (H \cdot X)_T$$
for all uniformly bounded predictable processes H (Exercise 1).
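The triangle inequality for the quantity defined above can be verified in a few lines. The sketch below uses two facts: the stochastic integral is linear in the integrator, and for nonnegative numbers a and b one has (a + b) ∧ 1 ≤ (a ∧ 1) + (b ∧ 1).

```latex
\begin{align*}
\|X + Y\|_{S(P)}
  &= \sup_{|H| \le 1} E\bigl( |(H \cdot X)_T + (H \cdot Y)_T| \wedge 1 \bigr) \\
  &\le \sup_{|H| \le 1} \Bigl[ E\bigl( |(H \cdot X)_T| \wedge 1 \bigr)
        + E\bigl( |(H \cdot Y)_T| \wedge 1 \bigr) \Bigr] \\
  &\le \|X\|_{S(P)} + \|Y\|_{S(P)} .
\end{align*}
```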
We can now use the semimartingale topology to extend the definition of
the stochastic integral. For a predictable process H and a positive integer n the
process $H 1_{\{|H| \le n\}}$ is bounded and predictable. Hence if X is a semimartingale
the stochastic integral process $H 1_{\{|H| \le n\}} \cdot X$ is well defined in the sense of
integration of locally bounded predictable processes (see for instance Van der
Vaart's notes). If $X \in S(P)$, the sequence of processes $H 1_{\{|H| \le n\}} \cdot X$ belongs to
S(P) as well. If the sequence has a limit in S(P) (relative to the semimartingale
topology) we say that H is X-integrable and we denote the limit semimartingale
by $H \cdot X$. Observe that by Lemma 5.52 of Van der Vaart, the new definition of
$H \cdot X$ coincides with the old one if H is locally bounded (check).
Since the FTAP involves changes of measure, it is useful to investigate how
stochastic integrals depend on the underlying probability measure. We write
$(H \cdot X)^P$ if we want to emphasize the dependence of the integral process on P.
For a simple predictable process it is clear that the stochastic integral does not
depend on the underlying measure at all. Now let H be a nonnegative, bounded,
predictable process and X a P-semimartingale and let Q be equivalent to P.
A general version of Girsanov's theorem says that for equivalent probability
measures P and Q, the spaces S(P) and S(Q) of P- and Q-semimartingales
coincide, hence X is a Q-semimartingale as well. Now by standard measure
theory there exists a sequence $H^n$ of simple predictable processes, independent
of P, such that $H^n \to H$ on $[0, \infty) \times \Omega$. By Lemma 5.52 of Van der Vaart we
have that
$$(H^n \cdot X)_t \xrightarrow{P} (H \cdot X)^P_t$$
for all $t \ge 0$. Hence, there exists a sequence $k_n \to \infty$ such that
$(H^{k_n} \cdot X)_t \to (H \cdot X)^P_t$, P-a.s. Repeating the argument with Q instead of P we see that $k_n$
has a further subsequence $l_n$ such that $(H^{l_n} \cdot X)_t \to (H \cdot X)^Q_t$, Q-a.s. But
P and Q are equivalent, so $(H \cdot X)^P_t = (H \cdot X)^Q_t$ almost surely (relative to P
or Q). We conclude that for bounded predictable H and P and Q equivalent,
$(H \cdot X)^P$ and $(H \cdot X)^Q$ are indistinguishable. It can be shown that if P and Q
are equivalent, the semimartingale topologies induced on S(P) = S(Q) by P and
Q coincide. For a predictable process H we just observed that $H 1_{\{|H| \le n\}} \cdot X$
does not depend on the probability measure. It follows that whether or not
a predictable process H is X-integrable only depends on the equivalence class
of the underlying probability measure P, and the same holds for the integral
processes $H \cdot X$.
Some care should be taken with integrands that are not locally bounded.
For the extended integral it is for instance no longer true that the integral with
respect to a local martingale is again a local martingale.
Example 4.1.2. Suppose we have a standard exponential random variable $\tau$
and, independent of $\tau$, a standard Bernoulli variable B, i.e. P(B = -1) = P(B =
1) = 1/2. Define the process M by $M_t = B 1_{\{t \ge \tau\}}$. Then M is a martingale
relative to its natural filtration $(\mathcal{F}_t)$ (Exercise 3). Now define the deterministic
process H by $H_t = 1/t$ for t > 0. Then it holds that
$$(H 1_{\{|H| \le n\}} \cdot M)_t = \begin{cases} 0, & t < \tau, \\ \dfrac{B}{\tau} 1_{\{\tau \ge 1/n\}}, & t \ge \tau. \end{cases}$$
It follows that $H 1_{\{|H| \le n\}} \cdot M$ converges in the semimartingale topology to the
process X given by
$$X_t = \begin{cases} 0, & t < \tau, \\ \dfrac{B}{\tau}, & t \ge \tau, \end{cases}$$
and $X = H \cdot M$ by definition (check!). Observe however that
$$E|X_t| = E \frac{1}{\tau} 1_{\{\tau \le t\}} = \int_0^t \frac{1}{x} e^{-x}\, dx = \infty,$$
hence X is not a martingale. It can be shown that X is not a local martingale
either (Exercise 4).
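The divergence of the integral in Example 4.1.2 can be made visible numerically. The sketch below computes the first absolute moment of the truncated integral, which equals the integral of e^{-x}/x over [1/n, t], and compares it with the elementary lower bound e^{-t} log(nt). The midpoint rule, the step count and the function name are choices made here for illustration only.

```python
import math

def truncated_mean(n, t, steps=20000):
    """Midpoint-rule value of the integral of e^{-x} / x over [1/n, t].

    Substituting x = e^u turns the integrand into exp(-e^u), which is bounded by 1.
    """
    lo, hi = math.log(1.0 / n), math.log(t)
    du = (hi - lo) / steps
    return sum(math.exp(-math.exp(lo + (k + 0.5) * du)) for k in range(steps)) * du

t = 1.0
for n in (10, 10 ** 3, 10 ** 6, 10 ** 9):
    value = truncated_mean(n, t)
    lower = math.exp(-t) * math.log(n * t)   # uses e^{-x} >= e^{-t} on [1/n, t]
    print(n, round(value, 4), round(lower, 4))
```

The printed values grow roughly like log n, so the moments of the truncated integrals are unbounded in n, in line with the infinite expectation found in the example.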
4.2 No free lunch with vanishing risk and the FTAP
Replacing the general asset price process of the preceding chapter by a semimartingale
will allow us to replace the economically meaningless condition of
no free lunch by the condition of no free lunch with vanishing risk.
As discussed in the preceding section, we think of a predictable process
as describing a self-financing trading strategy. We will assume that a trader
has a finite credit line, in the sense that her wealth always stays above some
deterministic (but possibly very negative) number. This is formalized by the
following definition.
Definition 4.2.1. An a-admissible strategy is an S-integrable predictable process
$\phi$ that satisfies $(\phi \cdot S) \ge -a$. An admissible strategy is a predictable process
that is a-admissible for some a > 0.
In order to formulate the condition of no free lunch with vanishing risk we
introduce the sets
$$K = \{(\phi \cdot S)_T : \phi \text{ admissible}\},$$
which is a convex cone in the space $L^0$ of finite-valued random variables (it is
not a linear space in general, since admissibility is a one-sided restriction), and
$C = \{f \in L^\infty : \text{there exists a } g \in K \text{ such that } g \ge f\}$. By $\overline{C}$ we denote in this
chapter the closure of C with respect to the norm-topology of $L^\infty$.
Definition 4.2.2. We say that S satisfies the condition of no free lunch with
vanishing risk if $\overline{C} \cap L^\infty_+ = \{0\}$.
Observe that this condition has a clear economic interpretation. If S does
not satisfy the condition, there exists for every small enough $\varepsilon > 0$ an admissible
strategy $\phi$ (depending on $\varepsilon$) such that for the pay-off we have $(\phi \cdot S)_T > -\varepsilon$ and
$P((\phi \cdot S)_T > 0) > 0$ (Exercise 2). Hence, if we are willing to take an arbitrarily
small, but positive loss, we have a positive probability of receiving a strictly
positive pay-off.
We can now formulate the following version of the fundamental theorem of
asset pricing. The proof is discussed in the next section.
Theorem 4.2.3 (FTAP IIIa). If $S = (S_t)_{t \in [0,T]}$ is a bounded, real-valued
semimartingale, then there exists an equivalent martingale measure if and only
if S satisfies the condition of no free lunch with vanishing risk.
If S is only locally bounded, the martingale measure has to be replaced by
a local martingale measure.
Corollary 4.2.4 (FTAP IIIb). If $S = (S_t)_{t \in [0,T]}$ is a locally bounded, real-valued
semimartingale, then there exists an equivalent local martingale measure
if and only if S satisfies the condition of no free lunch with vanishing risk.
Proof. The sufficiency of no free lunch with vanishing risk follows from the
preceding theorem. Indeed, suppose it holds and let $\tau_n \uparrow \infty$ be stopping times
such that $|S^{\tau_n}| \le K_n$, with $K_n$ deterministic numbers. Define the new process
$\tilde{S}$ by
$$\tilde{S} = S 1_{[0, \tau_1]} + \sum_{n \ge 2} 2^{-n} \frac{1}{K_n} 1_{(\tau_{n-1}, \tau_n]} \cdot S.$$
Then $\tilde{S}$ is a bounded semimartingale. Moreover, it satisfies the condition of no
free lunch with vanishing risk (Exercise 5). Hence, by the theorem, there exists
an equivalent probability measure Q such that $\tilde{S}$ is a Q-martingale. But then
the original process S is a Q-local martingale (Exercise 6).
The converse statement is proved as in Theorem 4.2.3, see the next section.
4.3 Sketch of proof of the fundamental theorem
4.3.1 The relatively easy half
We noted above that if M is a local martingale and H is M-integrable, then
$H \cdot M$ is not necessarily a local martingale anymore. The proof of the fact
that the existence of a martingale measure is sufficient for no free lunch with
vanishing risk uses a characterization of the martingality of $H \cdot M$. To explain
this characterization it is useful to first note that a local martingale M is in
fact locally uniformly integrable. Indeed, let $\tau_n$ be a localizing sequence for M,
i.e. $\tau_n \uparrow \infty$ a.s. and every $M^{\tau_n}$ is a martingale. Then $\sigma_n = \tau_n \wedge n$ also satisfies
$\sigma_n \uparrow \infty$ a.s. and every $M^{\sigma_n}$ is a uniformly integrable martingale. Moreover, if
we now consider the stopping time $\rho_n = \sigma_n \wedge \inf\{t : |M_t| > n\}$ we have that
$$\sup_{t \le \rho_n} |M_t| \le n + |M_{\rho_n}|.$$
Since $\rho_n \le \sigma_n$ and $M^{\sigma_n}$ is UI, the right-hand side of the display is integrable.
So we have proved the following useful lemma.
Lemma 4.3.1. A local martingale M is locally uniformly integrable. Moreover,
there exists a localizing sequence $\tau_n$ such that
$$E \sup_{t \le \tau_n} |M_t| < \infty$$
for all n.
Now consider a local martingale M and an M-integrable predictable process
H and suppose that $H \cdot M$ is a local martingale. For a localizing sequence
$\tau_n$ such that $E \sup_{t \le \tau_n} |(H \cdot M)_t| < \infty$ we have
$\sup_t |\Delta (H \cdot M)^{\tau_n}_t| \le 2 \sup_{t \le \tau_n} |(H \cdot M)_t|$, so $\Delta (H \cdot M)^{\tau_n} \ge Z_n$, where
$Z_n = -2 \sup_{t \le \tau_n} |(H \cdot M)_t|$.
So we see that there exists a localizing sequence $\tau_n$ and integrable random variables
$Z_n$ such that $\Delta (H \cdot M)^{\tau_n} \ge Z_n$ for all n. It turns out that the converse is
true as well.
Theorem 4.3.2. Let M be a local martingale and H a predictable process
that is M-integrable. Then $H \cdot M$ is a local martingale if and only if there
exists a localizing sequence $\tau_n$ and integrable random variables $Z_n$ such that
$\Delta (H \cdot M)^{\tau_n} \ge Z_n$ for all n.
We can now prove that the existence of an equivalent martingale measure
implies there is no free lunch with vanishing risk. Suppose there exists an
equivalent martingale measure Q and let $\phi$ be an a-admissible strategy. By the
observations in the preceding section the integral process $\phi \cdot S$ does not depend
on the underlying measure. Under Q the process S is a local martingale. Now
consider the stopping times $\tau_n = \inf\{t : (\phi \cdot S)_t \ge n\}$. Then $\tau_n \uparrow \infty$ and by the
admissibility of $\phi$ we have
$$\Delta (\phi \cdot S)^{\tau_n}_t = (\phi \cdot S)^{\tau_n}_t - (\phi \cdot S)^{\tau_n}_{t-} \ge -(n + a).$$
Hence, by the preceding theorem, $\phi \cdot S$ is a Q-local martingale. Together with
the a-admissibility this implies that $\phi \cdot S$ is in fact a Q-supermartingale (Exercise
7). It follows that for every $f \in C$ we have $E_Q f \le 0$ and then also $E_Q f \le 0$ for
every $f \in \overline{C}$. This implies that every $f \in \overline{C} \cap L^\infty_+$ vanishes Q-a.s., but then also
P-a.s.
4.3.2 The much more difficult half
The essential step in the proof of the fact that no free lunch with vanishing risk
is sufficient for the existence of a martingale measure is the following theorem.
Theorem 4.3.3. If the bounded semimartingale S satisfies the condition of no
free lunch with vanishing risk, then the cone C is weak*-closed.
Indeed, if we have this result we can argue as in the proof of Theorem
3.2.2 to find a probability measure Q equivalent to P such that $E_Q f \le 0$ for all
$f \in C$. It then follows that $E_Q f = 0$ for all $f \in K$. Since S is bounded it is
integrable. Moreover, for $s \le t$ and $A \in \mathcal{F}_s$ the predictable process $\phi = 1_{A \times (s,t]}$
is admissible, so $(\phi \cdot S)_T \in K$ and hence
$$E_Q 1_A (S_t - S_s) = E_Q (\phi \cdot S)_T = 0,$$
which shows that S is a Q-martingale.
The proof of Theorem 4.3.3 is long and difficult. The first difficulty that
arises is that the weak*-topology is in general not metrizable. This implies
that to show that a set is weak*-closed, it is in general not enough to consider
converging sequences (in fact, one should consider nets). However, it turns out
that in the present context the situation is not that complicated, since we can
use the following consequence of the so-called Krein-Smulian theorem.
Theorem 4.3.4. Let $(E, \mathcal{E}, \mu)$ be a finite measure space and $C \subseteq L^\infty(\mu)$ a
convex cone. Suppose that for each uniformly bounded sequence $f_n$ in C that
converges in measure to a function f, it holds that $f \in C$. Then C is weak*-closed.
So to prove Theorem 4.3.3 it suffices to consider a sequence $h_n$ in C such
that $|h_n| \le 1$ for all n and $h_n \xrightarrow{as} h$ for some $h \in L^\infty$, and show that h belongs
to C. To prove that $h \in C$ we have to find an $f_0 \in K$ such that $h \le f_0$. To
that end it turns out to be useful to consider the set
$$D = \{f : \text{there exist 1-admissible } \phi^n \text{ such that } (\phi^n \cdot S)_T \xrightarrow{as} f,\ f \ge h\}.$$
It can be shown that this set contains a maximal element $f_0$. So the random
variable $f_0$ dominates h and is the almost sure limit of elements $(\phi^n \cdot S)_T$ of K,
$\phi^n$ 1-admissible.
The remaining task is to show that $f_0$ belongs to K itself. The first step is
the observation that convergence of $\phi^n \cdot S$ at the terminal time T in fact implies
convergence for all time points. To see this one first shows that as $n, m \to \infty$,
$$\sup_{t \in [0,T]} |(\phi^n \cdot S)_t - (\phi^m \cdot S)_t| \xrightarrow{P} 0. \qquad (4.1)$$
The proof of this fact uses the 1-admissibility of the $\phi^n$ and the maximality
of $f_0$. Next we want to apply some results on the semimartingale topology,
but the preceding display does not imply that $\phi^n \cdot S$ is a Cauchy sequence in
the semimartingale topology. A long and technical proof however shows that
for every n there exists a $\psi^n$ in the convex hull of the processes $\phi^n, \phi^{n+1}, \ldots$
such that $\psi^n \cdot S$ is a Cauchy sequence in the semimartingale topology. The
semimartingale topology can be shown to be complete, so that we now have
$\psi^n \cdot S \to Z$ for some semimartingale Z. Moreover, Mémin's theorem shows
that the semimartingale Z must necessarily be of the form $Z = \phi \cdot S$ for some
S-integrable predictable process $\phi$.
Observe that since $\psi^n$ is a convex combination of $\phi^n, \phi^{n+1}, \ldots$, the process
$\psi^n$ is 1-admissible. Since the convergence in the semimartingale topology implies
that
$$(\psi^n \cdot S)_t \xrightarrow{P} (\phi \cdot S)_t$$
for all $t \in [0, T]$, it follows that $\phi$ is 1-admissible as well. The fact that $\psi^n$ is
a convex combination of $\phi^n, \phi^{n+1}, \ldots$ also implies that the almost sure limit of
$(\psi^n \cdot S)_T$ equals the almost sure limit of $(\phi^n \cdot S)_T$, which is $f_0$ (Exercise 8).
Combined with the previous observations we conclude that $f_0 = (\phi \cdot S)_T$, so
indeed $f_0 \in K$.
4.4 Example: Itô processes
Suppose we have, on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ satisfying
the usual conditions, continuous adapted processes B and X satisfying
$$B_t = \exp\Bigl( \int_0^t r_s\, ds \Bigr),$$
$$dX_t = \mu_t X_t\, dt + \sigma_t X_t\, dW_t,$$
where W is a standard Brownian motion and r, $\mu$ and $\sigma$ are locally bounded
predictable processes. All processes are indexed by [0, T] for some fixed time
horizon T > 0. We think of B as describing a bank account process with
(continuous, stochastic) interest rate $r_t$, and X as describing the value of a
stock with local return rate $\mu_t$ and (possibly stochastic) volatility $\sigma_t$.
We use B as numeraire, putting S = X/B. Integration by parts then gives
the stochastic differential equation
$$dS_t = (\mu_t - r_t) S_t\, dt + \sigma_t S_t\, dW_t$$
for the discounted process S (check). Now suppose that the Sharpe ratio
$$\theta_t = \frac{\mu_t - r_t}{\sigma_t}$$
is uniformly bounded by a deterministic constant for all $t \in [0, T]$. Then by the
classical Girsanov theorem, there exists a probability measure Q equivalent to
P under which the process
$$\tilde{W}_t = W_t + \int_0^t \theta_s\, ds, \quad t \in [0, T],$$
is a Brownian motion. Combining the definition of $\tilde{W}$ with the SDE for S we
get
$$dS_t = \sigma_t S_t\, d\tilde{W}_t.$$
In particular, we see that S is a Q-local martingale. Hence, by Corollary 4.2.4,
this model satisfies the condition of no free lunch with vanishing risk.
Observe that the classical Black-Scholes model corresponds to the special
case that r, $\mu$ and $\sigma$ are deterministic and independent of time. The condition
on the Sharpe ratio is then trivially satisfied, so we recover the well-known fact
that the Black-Scholes model is free of arbitrage (in the sense of no free lunch
with vanishing risk).
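In the Black-Scholes special case the effect of the measure change can be checked by simulation. The sketch below is a Monte Carlo illustration under assumed, made-up parameter values: under P the discounted price has drift equal to the excess return, so its mean grows exponentially, whereas under Q the drift is removed and the mean of the discounted terminal price stays at its initial value. The function name and all parameters are hypothetical.

```python
import math
import random

def discounted_terminal_price(s0, drift, sigma, horizon, w_T):
    """S_T for dS = drift * S dt + sigma * S dW, given the terminal value w_T of the driving BM."""
    return s0 * math.exp((drift - 0.5 * sigma ** 2) * horizon + sigma * w_T)

mu, r, sigma, T, s0 = 0.08, 0.03, 0.2, 1.0, 1.0   # hypothetical Black-Scholes parameters
theta = (mu - r) / sigma                          # constant Sharpe ratio
rng = random.Random(1)
n_paths = 20000
draws = [rng.gauss(0.0, math.sqrt(T)) for _ in range(n_paths)]

# Under P the discounted price has drift mu - r; draws play the role of W_T.
mean_P = sum(discounted_terminal_price(s0, mu - r, sigma, T, w) for w in draws) / n_paths
# Under Q the shifted process is a Brownian motion and S has zero drift;
# the same draws are reused as samples of its terminal value under Q.
mean_Q = sum(discounted_terminal_price(s0, 0.0, sigma, T, w) for w in draws) / n_paths

print(theta, mean_P, math.exp((mu - r) * T), mean_Q)
```

A sample mean close to the initial value under Q is consistent with the martingale property, but it only tests one expectation at one time point.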
4.5 Exercises
1. Show that $X^n \xrightarrow{P} X$ if and only if $E(|X^n - X| \wedge 1) \to 0$.
2. Show that if S does not satisfy the condition of no free lunch with vanishing
risk, there exists for every $\varepsilon > 0$ small enough an admissible strategy $\phi^\varepsilon$
with a pay-off satisfying $(\phi^\varepsilon \cdot S)_T > -\varepsilon$ and $P((\phi^\varepsilon \cdot S)_T > 0) > 0$.
3. Show that the process M defined in Example 4.1.2 is a martingale.
4. Show that the process X in Example 4.1.2 is not a local martingale.
5. Show that the process $\tilde{S}$ in the proof of Corollary 4.2.4 satisfies the condition
of no free lunch with vanishing risk.
6. Show that the process S in the proof of Corollary 4.2.4 is a Q-local martingale.
7. Show that a local martingale that is bounded from below by a deterministic
number is a supermartingale.
8. Show that in Section 4.3.2, we have the a.s. convergence $(\psi^n \cdot S)_T \to f_0$.
A
Elements of functional
analysis
A.1 Separating hyperplane theorem
Let $v \in \mathbb{R}^n$ be nonzero, let $\alpha \in \mathbb{R}$ be given and consider the set
$H = \{x \in \mathbb{R}^n : \langle v, x \rangle = \alpha\}$. For $x \in H$ we have
$$\langle v, x - (\alpha / \|v\|^2) v \rangle = 0,$$
so $H = v^\perp + (\alpha / \|v\|^2) v$. The complement of H consists of the two sets
$\{x : \langle v, x \rangle < \alpha\}$ and $\{x : \langle v, x \rangle > \alpha\}$ on the two "sides" of the hyperplane.
The following theorem says that for two disjoint, convex sets, one compact
and one closed, there exist two "parallel" hyperplanes such that the sets lie
strictly on different sides of those hyperplanes.
The assumption that one of the sets is compact cannot be dropped (see
Exercise 1).
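Theorem A.1.1 (Separating hyperplane theorem). Let K and C be disjoint,
convex subsets of $\mathbb{R}^n$, K compact and C closed. There exist $v \in \mathbb{R}^n$ and
$\alpha_1, \alpha_2 \in \mathbb{R}$ such that
$$\langle v, x \rangle < \alpha_1 < \alpha_2 < \langle v, y \rangle$$
for all $x \in K$ and $y \in C$.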
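Proof. Consider the function $f : K \to \mathbb{R}$ defined by $f(x) = \inf\{\|x - y\| : y \in C\}$,
i.e. f(x) is the distance of x to C. The function f is continuous (check) and
since K is compact, there exists $x_0 \in K$ such that f attains its minimum at $x_0$.
Let $y_n \in C$ be such that $\|x_0 - y_n\| \to f(x_0)$. By the parallelogram law we have
$$\Bigl\| \frac{y_n - y_m}{2} \Bigr\|^2 = \Bigl\| \frac{y_n - x_0}{2} - \frac{y_m - x_0}{2} \Bigr\|^2
= \tfrac{1}{2} \|y_n - x_0\|^2 + \tfrac{1}{2} \|y_m - x_0\|^2 - \Bigl\| \frac{y_n + y_m}{2} - x_0 \Bigr\|^2.$$
By convexity $(y_n + y_m)/2 \in C$, so that $\|(y_n + y_m)/2 - x_0\| \ge f(x_0)$. Hence, we
have
$$\Bigl\| \frac{y_n - y_m}{2} \Bigr\|^2 \le \tfrac{1}{2} \|y_n - x_0\|^2 + \tfrac{1}{2} \|y_m - x_0\|^2 - f^2(x_0).$$
The right-hand side of this display converges to 0 as $n, m \to \infty$. So the $y_n$
form a Cauchy sequence and hence they converge to some $y_0 \in \mathbb{R}^n$. Since C
is closed, $y_0 \in C$. Let $v = y_0 - x_0$. Since K and C are disjoint, $v \ne 0$. It
follows that $0 < \|v\|^2 = \langle v, y_0 - x_0 \rangle = \langle v, y_0 \rangle - \langle v, x_0 \rangle$. It remains to show that
$\langle v, x \rangle \le \langle v, x_0 \rangle$ and $\langle v, y_0 \rangle \le \langle v, y \rangle$ for all $x \in K$ and $y \in C$.
Take $y \in C$. Since C is convex, the line segment $y_0 + \lambda (y - y_0)$, $\lambda \in [0, 1]$,
belongs to C. Since $y_0$ minimizes the distance to $x_0$, we have
$$\|y_0 - x_0\| \le \|y_0 - x_0 + \lambda (y - y_0)\|$$
for every $\lambda$. By squaring this we find that
$$0 \le 2 \lambda \langle y_0 - x_0, y - y_0 \rangle + \lambda^2 \|y - y_0\|^2.$$
Dividing by $\lambda$ and then letting $\lambda \to 0$ gives $\langle v, y - y_0 \rangle \ge 0$, as desired.
A similar argument shows that $\langle v, x \rangle \le \langle v, x_0 \rangle$ for $x \in K$.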
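The construction in the proof (take a closest pair of points and let v be their difference) can be carried out explicitly in a simple case. The sketch below assumes K and C are two disjoint closed discs in the plane, for which the closest points lie on the line through the centres; the centres, radii and helper names are made up for illustration, and the inequalities are only checked on a finite sample of points.

```python
import math

def closest_points(c_k, r_k, c_c, r_c):
    """Closest pair (x0, y0) between two disjoint closed discs in the plane."""
    dx, dy = c_c[0] - c_k[0], c_c[1] - c_k[1]
    dist = math.hypot(dx, dy)
    ux, uy = dx / dist, dy / dist                 # unit vector from K's centre to C's centre
    x0 = (c_k[0] + r_k * ux, c_k[1] + r_k * uy)
    y0 = (c_c[0] - r_c * ux, c_c[1] - r_c * uy)
    return x0, y0

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def disc_samples(centre, radius, n_angles=36, n_radii=5):
    """Deterministic sample of points in a closed disc, including boundary points."""
    pts = [centre]
    for j in range(1, n_radii + 1):
        for i in range(n_angles):
            a = 2 * math.pi * i / n_angles
            rho = radius * j / n_radii
            pts.append((centre[0] + rho * math.cos(a), centre[1] + rho * math.sin(a)))
    return pts

# Hypothetical example: K = disc around (0, 0) of radius 1, C = disc around (4, 3) of radius 2.
x0, y0 = closest_points((0.0, 0.0), 1.0, (4.0, 3.0), 2.0)
v = (y0[0] - x0[0], y0[1] - x0[1])                # v = y0 - x0 as in the proof
gap = dot(v, y0) - dot(v, x0)                     # equals ||v||^2 > 0
alpha1 = dot(v, x0) + gap / 3                     # any alpha1 < alpha2 strictly inside the gap works
alpha2 = dot(v, x0) + 2 * gap / 3

ok_K = all(dot(v, x) < alpha1 for x in disc_samples((0.0, 0.0), 1.0))
ok_C = all(dot(v, y) > alpha2 for y in disc_samples((4.0, 3.0), 2.0))
print(v, alpha1, alpha2, ok_K, ok_C)
```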
The polar $C^0$ of a set $C \subseteq \mathbb{R}^n$ is defined as
$$C^0 = \{y \in \mathbb{R}^n : \langle x, y \rangle \le 1 \text{ for all } x \in C\}.$$
Note that in the special case that C is closed under multiplication with positive
scalars, we have $C^0 = \{y \in \mathbb{R}^n : \langle x, y \rangle \le 0 \text{ for all } x \in C\}$ (check). For a given
x, the set $C^0_x = \{z : \langle x, z \rangle \le 0\}$ is the set of all vectors that lie on the same
side of $x^\perp$ as -x. The polar is in this case the intersection of all the sets $C^0_x$ for
$x \in C$.
To illustrate the bipolar theorem geometrically, consider a V-shaped set C,
the union of two rays emanating from the origin. Then one readily sees that
the polar of the polar of C precisely equals the convex hull of C. The general
result is as follows.
Theorem A.1.2 (Bipolar theorem). Let $C \subseteq \mathbb{R}^n$ contain 0. Then the
bipolar $C^{00} = (C^0)^0$ equals the closed convex hull of C.
Proof. It is clear that $C^{00}$ is a closed, convex set containing C, so the closed
convex hull A of C is a subset of $C^{00}$. Suppose that the converse inclusion does
not hold. Then there exists a point $x_0 \in C^{00}$ that is not in A. By the separating
hyperplane theorem there then exists a vector $v \in \mathbb{R}^n$ and $\alpha_1, \alpha_2 \in \mathbb{R}$ such that
$\langle x_0, v \rangle > \alpha_1 > \alpha_2 > \langle y, v \rangle$ for all $y \in A$. Since $0 \in C \subseteq A$ we have $\alpha_1 > 0$.
Dividing by $\alpha_1$ shows there exists a vector $v \in \mathbb{R}^n$ such that $\langle x_0, v \rangle > 1 > \langle y, v \rangle$
for all $y \in A$. The second inequality implies that $v \in C^0$, and then the first one
implies that $x_0 \notin C^{00}$, which gives a contradiction.
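The V-shaped illustration of the bipolar theorem can be checked by brute force on a small integer grid. The sketch below assumes a specific V (the rays through (1, 1) and (-1, 1)) and replaces both C and its polar by finite samples, so it is a sanity check under those assumptions rather than a proof; all names and sample sizes are hypothetical choices.

```python
from itertools import product

RAYS = [(1, 1), (-1, 1)]            # hypothetical V-shaped set: two rays from the origin
SCALES = [0, 1, 10, 100, 1000]      # finite sample of points t * d on each ray
GRID = list(product(range(-4, 5), repeat=2))

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

C_SAMPLE = [(t * d[0], t * d[1]) for d in RAYS for t in SCALES]

def in_polar(y):
    """y is in C^0 if <x, y> <= 1 for all x in C (checked on the finite sample)."""
    return all(dot(x, y) <= 1 for x in C_SAMPLE)

# Grid points (and large multiples of them) that pass the polar test.
POLAR_SAMPLE = [(s * a, s * b) for (a, b) in GRID for s in (1, 100) if in_polar((s * a, s * b))]

def in_bipolar(z):
    """z is in C^00 if <y, z> <= 1 for all y in C^0 (checked on the finite sample)."""
    return all(dot(y, z) <= 1 for y in POLAR_SAMPLE)

def in_convex_hull(z):
    """The convex hull of this V is the region between the rays: z2 >= |z1|."""
    return z[1] >= abs(z[0])

mismatches = [z for z in GRID if in_bipolar(z) != in_convex_hull(z)]
print(len(POLAR_SAMPLE), mismatches)
```

An empty list of mismatches means that, on this grid, membership in the sampled bipolar agrees with membership in the convex hull of the V.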
A.2 Topological vector spaces
A vector space X is called a topological vector space if it is endowed with a
topology which is such that every point of X is a closed set and the addition
and scalar multiplication operations are continuous.
It is easy to see that translation by a fixed vector and multiplication by a
nonzero scalar are homeomorphisms of a topological vector space. This implies
in particular that the topology is translation-invariant, meaning that a set $E \subseteq X$
is open if and only if each of its translates x + E is open.
Topological vector spaces have nice separation properties. Combined with
the fact that points are closed sets, the next theorem implies for instance that
they are always Hausdorff.
Theorem A.2.1. Suppose that K and C are disjoint subsets of a topological
vector space X, K compact and C closed. Then there exits a neighborhood V
of 0 such that K + V and C + V are disjoint.
Proof. The continuity of addition implies that for every neighborhood W of 0
there exist neighborhoods V1 and V2 of 0 such that V1 + V2  W (check). Now
put U = V1  V2  (-V1)  (-V2). Then U is symmetric (i.e. U = -U) and
U + U  W. Applying the same procedure to the neighborhood U we see that
for every neighborhood W of 0 there exists a symmetric neighborhood U such
that U + U + U  W (etc.).
Pick an $x \in K$. Then $X \setminus C$ is an open neighborhood of $x$. By translation
invariance and the preceding paragraph there exists a symmetric neighborhood
$V_x$ of 0 such that $x + V_x + V_x + V_x$ does not intersect $C$. By the symmetry
of $V_x$ this implies that $x + V_x + V_x$ and $C + V_x$ are disjoint (check). Since
$K$ is compact, it is covered by finitely many sets $x_1 + V_{x_1}, \ldots, x_n + V_{x_n}$. Put
$V = V_{x_1} \cap \cdots \cap V_{x_n}$. Then
$$K + V \subseteq \bigcup_{i=1}^n (x_i + V_{x_i} + V) \subseteq \bigcup_{i=1}^n (x_i + V_{x_i} + V_{x_i})$$
and none of the terms in the last union intersects $C + V$.
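The step marked (check) in the second paragraph of this proof can be spelled out as follows. This is only a sketch of the standard argument; the names $v_1, v_2, v_3$ and $c$ are introduced here for illustration and do not appear in the text.

```latex
% Claim: if (x + V_x + V_x + V_x) \cap C = \emptyset and V_x = -V_x,
% then (x + V_x + V_x) \cap (C + V_x) = \emptyset.
Suppose, to the contrary, that there are $v_1, v_2, v_3 \in V_x$ and $c \in C$ with
\[
  x + v_1 + v_2 = c + v_3 .
\]
Then
\[
  c = x + v_1 + v_2 + (-v_3) \in x + V_x + V_x + V_x ,
\]
because $-v_3 \in -V_x = V_x$. This contradicts the choice of $V_x$.
```

Since $V \subseteq V_{x_i}$ for every $i$, the same argument shows that $x_i + V_{x_i} + V_{x_i}$ does not meet $C + V$.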
The following lemma implies that if $V$ is a neighborhood of 0 in a topological
vector space $X$, then for every $x \in X$ it holds that $x \in rV$ if $r$ is large
enough. A set $V$ with this property is called absorbing.
Lemma A.2.2. Suppose $V$ is a neighborhood of 0 in a topological vector space
$X$ and $r_n$ is a sequence of positive numbers tending to infinity. Then
$$\bigcup_n r_n V = X.$$
Proof. Fix $x \in X$. Then since $V$ is open in $X$ and $\alpha \mapsto \alpha x$ from $\mathbb{R}$ to $X$ is
continuous, $\{\alpha : \alpha x \in V\}$ is open in $\mathbb{R}$. The set contains 0, and hence it contains
$1/r_n$ for $n$ large enough, i.e. $x \in r_n V$. This completes the proof.
For an arbitrary absorbing subset $A$ (for instance a neighborhood of 0) of a
topological vector space $X$ we define the Minkowski functional $\mu_A : X \to [0, \infty)$
by
$$\mu_A(x) = \inf\{t > 0 : x/t \in A\}.$$
Note that $\mu_A$ is indeed finite-valued, since $A$ is absorbing. The following lemma
collects properties that we need later.
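Lemma A.2.3. Let $A$ be a convex, absorbing subset of a topological vector
space $X$ and let $\mu_A$ be its Minkowski functional.
(i) $\mu_A(x + y) \le \mu_A(x) + \mu_A(y)$ for all $x, y \in X$.
(ii) $\mu_A(tx) = t\mu_A(x)$ for all $x \in X$ and $t \ge 0$.
Proof. For $x, y \in X$ and $\varepsilon > 0$, consider $t = \mu_A(x) + \varepsilon$, $s = \mu_A(y) + \varepsilon$. Then
by definition of $\mu_A$ and the convexity of $A$, $x/t \in A$ and $y/s \in A$. Hence, the convex combination
$$\frac{x + y}{s + t} = \frac{t}{s + t}\,\frac{x}{t} + \frac{s}{s + t}\,\frac{y}{s}$$
belongs to $A$ as well, so that $\mu_A(x + y) \le s + t = \mu_A(x) + \mu_A(y) + 2\varepsilon$.
Letting $\varepsilon \downarrow 0$ proves (i). The proof of (ii) is easy.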
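Part (ii) of Lemma A.2.3 can be verified directly from the definition of the Minkowski functional. The following is a sketch, writing $\mu_A$ for the Minkowski functional of $A$; the substitution variable $u$ is introduced here for illustration.

```latex
% Homogeneity of the Minkowski functional for t > 0 (substitute s = t u):
\[
  \mu_A(tx) = \inf\{ s > 0 : tx/s \in A \}
            = \inf\{ tu : u > 0,\ x/u \in A \}
            = t\,\mu_A(x).
\]
% Case t = 0: an absorbing set contains 0, so 0/s = 0 \in A for every s > 0.
\[
  \mu_A(0) = \inf\{ s > 0 : 0 \in A \} = 0 = 0 \cdot \mu_A(x).
\]
```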
For the proof of the following characterization of continuous linear functionals
we need the notion of a balanced neighborhood. A set $B \subseteq X$ is said to
be balanced if $\alpha B \subseteq B$ for every scalar $\alpha \in \mathbb{R}$ with $|\alpha| \le 1$.
Lemma A.2.4. Every neighborhood of 0 contains a balanced neighborhood of
0.
Proof. Let $U$ be a neighborhood of 0. Since scalar multiplication is continuous,
there exist a $\delta > 0$ and a neighborhood $V$ of 0 in $X$ such that $\alpha V \subseteq U$ whenever
$|\alpha| < \delta$. Then $W = \bigcup_{|\alpha| < \delta} \alpha V$ is a balanced neighborhood of 0 contained in $U$.
A linear map $\Lambda : X \to \mathbb{R}$ is called a linear functional on the space $X$. A
linear functional $\Lambda$ on $X$ is called bounded on a subset $A \subseteq X$ if there exists a
number $K > 0$ such that $|\Lambda x| \le K$ for all $x \in A$.
Theorem A.2.5. Let $\Lambda$ be a nontrivial linear functional on a topological vector
space $X$. Then $\Lambda$ is continuous if and only if $\Lambda$ is bounded on a neighborhood
of 0.
Proof. Suppose $\Lambda$ is continuous. Then the null space $N = \{x \in X : \Lambda x = 0\}$
is closed. Since $\Lambda$ is nontrivial, there exists $x \in X \setminus N$. By Theorem A.2.1 and
Lemma A.2.4 there exists a balanced neighborhood $V$ of 0 such that $x + V$ and $N$ are disjoint. Then
$\Lambda(V)$ is a balanced subset of $\mathbb{R}$. Suppose it is not bounded. Then since it is
balanced, it must be all of $\mathbb{R}$. In particular, there then exists a $y \in V$ such that
$\Lambda y = -\Lambda x$. But then $x + y \in N$, a contradiction. Hence, $\Lambda(V)$ is bounded, i.e.
$\Lambda$ is bounded on $V$.
Conversely, suppose that $|\Lambda x| \le M$ for all $x$ in a neighborhood $V$ of 0. For $r > 0$, put $W =
(r/M)V$. Then for $x \in W$, say $x = (r/M)y$ for $y \in V$, we have $|\Lambda x| =
(r/M)|\Lambda y| \le r$. Hence, $\Lambda$ is continuous at 0. By translation invariance, it is
continuous everywhere.
A.3 Hahn-Banach theorem
The proof of the following version of the Hahn-Banach theorem relies on the
axiom of choice, in the form of the Hausdorff maximality theorem:
Every nonempty partially ordered set P contains a totally ordered subset Q
which is maximal with respect to the property of being totally ordered.
A proof of this fact can for instance be found in Rudin (1987), pp. 395-396.
Theorem A.3.1 (Hahn-Banach theorem). Suppose $X$ is a (real) vector
space and $p : X \to \mathbb{R}$ satisfies $p(x + y) \le p(x) + p(y)$ and $p(tx) = tp(x)$ for
$x, y \in X$ and $t \ge 0$. Then if $f$ is a linear functional on a subspace $M$ of $X$ such
that $f(x) \le p(x)$ for all $x \in M$, $f$ extends to a linear functional $\Lambda$ on the whole
space $X$ such that
$$-p(-x) \le \Lambda x \le p(x)$$
for all $x \in X$.
Proof. Suppose $M$ is a proper subspace of $X$ and pick $x_1 \in X \setminus M$. For $x, y \in M$
we have
$$f(x) + f(y) = f(x + y) \le p(x + y) \le p(x - x_1) + p(y + x_1),$$
hence $f(x) - p(x - x_1) \le p(y + x_1) - f(y)$. So there exists an $\alpha$ such that
$$f(x) - \alpha \le p(x - x_1), \qquad f(y) + \alpha \le p(y + x_1) \qquad \text{(A.1)}$$
for all $x, y \in M$. Now let $M_1$ be the vector space spanned by $M$ and $x_1$. An
element of $M_1$ is of the form $x + \lambda x_1$ for some $x \in M$ and $\lambda \in \mathbb{R}$. So we can extend $f$ to $M_1$
by setting $f_1(x + \lambda x_1) = f(x) + \lambda\alpha$. Then $f_1$ is a well-defined linear functional
on $M_1$ and the inequalities in (A.1) imply that $f_1(x) \le p(x)$ for all $x \in M_1$
(check).
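The (check) at the end of this paragraph can be carried out by distinguishing the sign of the coefficient of $x_1$. The following is a sketch; it writes $\alpha$ for the constant in (A.1) and $\lambda$ for the coefficient of $x_1$, which are notational assumptions since the original symbols were lost in extraction.

```latex
% Case \lambda > 0: apply the second inequality in (A.1) to y/\lambda
% and multiply by \lambda, using p(tx) = t p(x) for t >= 0.
\[
  f_1(y + \lambda x_1) = \lambda\bigl( f(y/\lambda) + \alpha \bigr)
    \le \lambda\, p(y/\lambda + x_1) = p(y + \lambda x_1).
\]
% Case \lambda < 0: apply the first inequality in (A.1) to x/(-\lambda)
% and multiply by -\lambda > 0.
\[
  f_1(x + \lambda x_1) = (-\lambda)\bigl( f(x/(-\lambda)) - \alpha \bigr)
    \le (-\lambda)\, p\bigl(x/(-\lambda) - x_1\bigr) = p(x + \lambda x_1).
\]
% Case \lambda = 0: f_1 = f \le p on M by assumption.
```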
Let $\mathcal{C}$ be the collection of pairs $(M', f')$, where $M'$ is a subspace of $X$
containing $M$ and $f'$ is a linear extension of $f$ to $M'$ such that $f' \le p$ on
$M'$. Put an ordering on $\mathcal{C}$ by saying that $(M', f') \le (M'', f'')$ if $M' \subseteq M''$
and $f''|_{M'} = f'$. This is a partial ordering and $\mathcal{C}$ is not empty. Hence, by
the Hausdorff maximality theorem, we can extract a maximal totally ordered
subcollection $\mathcal{C}'$. Let $\tilde M$ be the union of all $M'$ for which $(M', f') \in \mathcal{C}'$. Then
$\tilde M$ is a subspace of $X$ (check). If $x \in \tilde M$ then $x \in M'$ for some $M'$ such that
$(M', f') \in \mathcal{C}'$. We then put $\Lambda x = f'(x)$. This defines a linear functional $\Lambda$ on
$\tilde M$ and we have that $\Lambda \le p$ on $\tilde M$ (check). If $\tilde M$ were a proper subspace of $X$
the construction of the preceding paragraph would give us a further extension
of $\Lambda$, contradicting the maximality of $\mathcal{C}'$. Hence, $\tilde M = X$. This completes the
proof, upon noting that $\Lambda \le p$ implies that $-p(-x) \le -\Lambda(-x) = \Lambda x$ for all
$x \in X$.
Before we use the Hahn-Banach theorem to prove the infinite-dimensional
version of the separating hyperplane theorem we introduce some more concepts
and notation.
A topological vector space $X$ is called locally convex if for every neighborhood
$V$ of 0 there exists a convex neighborhood $U$ of 0 such that $U \subseteq V$. The
space of continuous linear maps from $X$ to $\mathbb{R}$ is denoted by $X^*$. It is called the
dual of $X$, and is treated in more detail in the next section.
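Theorem A.3.2 (Separation theorem). Let $A$ and $B$ be disjoint, nonempty,
convex subsets of a topological vector space $X$.
(i) If $A$ is open there exist $\Lambda \in X^*$ and $\gamma \in \mathbb{R}$ such that
$$\Lambda x < \gamma \le \Lambda y$$
for every $x \in A$ and $y \in B$.
(ii) If $X$ is locally convex, $A$ is compact and $B$ is closed, there exist $\Lambda \in X^*$
and $\gamma_1, \gamma_2 \in \mathbb{R}$ such that
$$\Lambda x < \gamma_1 < \gamma_2 < \Lambda y$$
for every $x \in A$ and $y \in B$.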
Proof. (i). Pick $a_0 \in A$ and $b_0 \in B$ and put $x_0 = b_0 - a_0$. Define $C = A - B + x_0$
and note that $C$ is a convex, open neighborhood of 0. Let $\mu_C$ be the Minkowski
functional of $C$.
Let $M$ be the linear subspace generated by $x_0$ and define a linear functional
$f$ on $M$ by putting $f(\lambda x_0) = \lambda$. Since $A$ and $B$ are disjoint, $x_0 \notin C$ so we have
$\mu_C(x_0) \ge 1$ and hence, for $\lambda \ge 0$, $f(\lambda x_0) = \lambda \le \lambda\mu_C(x_0) = \mu_C(\lambda x_0)$. For
$\lambda < 0$ we have $f(\lambda x_0) < 0 \le \mu_C(\lambda x_0)$. By Lemma A.2.3 and the Hahn-Banach
theorem, Theorem A.3.1, the functional $f$ extends to a linear functional $\Lambda$ on
$X$, and the extension satisfies $\Lambda x \le \mu_C(x)$ for all $x \in X$. In particular $\Lambda \le 1$
on $C$, so that $|\Lambda| \le 1$ on the neighborhood $C \cap (-C)$ of 0. By Theorem A.2.5 this
implies that $\Lambda$ is continuous, i.e. $\Lambda \in X^*$.
Now for $a \in A$ and $b \in B$ we have that
$$\Lambda a - \Lambda b + 1 = \Lambda(a - b + x_0) \le \mu_C(a - b + x_0) < 1,$$
since $a - b + x_0 \in C$ and $C$ is open (Exercise 2), so $\Lambda a < \Lambda b$. It follows that
$\Lambda(A)$ and $\Lambda(B)$ are disjoint, convex subsets of $\mathbb{R}$, the first one lying on the left
of the second one. Since $A$ is open and $\Lambda$ is nonconstant, $\Lambda(A)$ is open as well
(Exercise 3). Letting $\gamma$ be the right end point of $\Lambda(A)$ completes the proof of
(i).
(ii). By Theorem A.2.1 and the local convexity of $X$ there exists a convex
neighborhood $V$ of 0 such that $(A + V) \cap B = \emptyset$. By the proof of part (i) there
exists $\Lambda \in X^*$ such that $\Lambda(A + V)$ and $\Lambda(B)$ are disjoint, convex subsets of $\mathbb{R}$, the
first one lying on the left of the second one, the first one being open. Moreover,
$\Lambda(A)$ is a compact subset of $\Lambda(A + V)$. The proof is now easily completed.
Corollary A.3.3. If $X$ is a locally convex topological vector space, $X^*$ separates
the points of $X$.
Proof. Given distinct points $x, y \in X$, apply the separation theorem with $A =
\{x\}$ and $B = \{y\}$.
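For $x \in X$ and $\Lambda \in X^*$ we define, in analogy with the finite-dimensional
situation, $\langle x, \Lambda\rangle = \Lambda x$. The polar $C^0$ of a set $C \subseteq X$ is defined as
$$C^0 = \{\Lambda \in X^* : \langle x, \Lambda\rangle \le 1 \text{ for all } x \in C\}.$$
Similarly, the bipolar is defined as
$$C^{00} = (C^0)^0 = \{x \in X : \langle x, \Lambda\rangle \le 1 \text{ for all } \Lambda \in C^0\}.$$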
Theorem A.3.4 (Bipolar theorem). The bipolar $C^{00}$ of a subset $C$ of a
locally convex topological vector space $X$ with $0 \in C$ equals the closed convex hull of $C$.
Proof. It is clear that $C^{00}$ is a closed, convex set containing $C$, so the closed convex
hull $A$ of $C$ is a subset of $C^{00}$. Suppose that the reverse inclusion does not
hold. Then there exists a point $x_0 \in C^{00}$ that is not in $A$. By the separation
theorem there then exists a functional $\Lambda \in X^*$ such that $\Lambda x_0 > 1 > \Lambda y$ for all
$y \in A$ (check). The second inequality implies that $\Lambda \in C^0$, and then the first
one implies that $x_0 \notin C^{00}$, which is a contradiction.
A.4 Dual space
The dual of a topological vector space $X$ is the space $X^*$ of continuous linear
functionals on $X$. By Theorem A.2.5 this is the same as the space of linear
functionals that are bounded on a neighborhood of 0.
It is easy to see that if the topology on $X$ is induced by a norm $\|\cdot\|$, a
linear functional $\Lambda$ belongs to $X^*$ if and only if the unit ball in $X$ is mapped
into a bounded subset of $\mathbb{R}$. In that case we define the norm of $\Lambda$ by
$$\|\Lambda\| = \sup_{\|x\| \le 1} |\Lambda(x)|$$
and we have the relation $|\Lambda(x)| \le \|\Lambda\|\,\|x\|$ for every $x \in X$.
Example A.4.1. Let $(E, \mathcal{E}, \mu)$ be a measure space with $\mu$ a finite measure,
$p \in [1, \infty)$ and $X = L^p(E, \mathcal{E}, \mu)$. Consider a continuous linear functional $\Lambda$ on
$X$. Then the map $\nu : \mathcal{E} \to \mathbb{R}$ defined by $\nu(B) = \Lambda(1_B)$ is a signed measure (note
that the finiteness of $\mu$ implies that $\nu$ is well-defined). Indeed, if $B_n$ are disjoint
elements of $\mathcal{E}$ and $B = \bigcup_n B_n$, then $1_{\bigcup_{k \le n} B_k} \to 1_B$ in $L^p$. Since $\Lambda$ is continuous,
this implies that $\nu$ is countably additive. If $\mu(B) = 0$ then $1_B$ vanishes in $L^p$
and hence $\nu(B) = 0$, so $\nu \ll \mu$. Hence, by the Radon-Nikodym theorem, there
exists a $g \in L^1$ such that
$$\Lambda(1_B) = \nu(B) = \int_B g\,d\mu$$
for all $B \in \mathcal{E}$. By linearity we then have
$$\Lambda(f) = \int fg\,d\mu \qquad \text{(A.2)}$$
for all simple functions $f$. Every bounded measurable function $f$ is the uniform
limit of simple functions and since $\mu$ is finite, uniform convergence implies
convergence in $L^p$. It follows that (A.2) holds for all $f \in L^\infty$.
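Suppose that $p > 1$ and let $q$ be the conjugate exponent. For $E_n = \{x :
|g(x)| \le n\}$ we have, since $g$ is bounded on $E_n$ and $\Lambda$ is continuous and hence
bounded,
$$\int_{E_n} |g|^q\,d\mu = \int_{E_n} |g|^{q-1}\operatorname{sign}(g)\,g\,d\mu = \Lambda\bigl(1_{E_n}|g|^{q-1}\operatorname{sign}(g)\bigr) \le \|\Lambda\| \Bigl(\int_{E_n} |g|^q\,d\mu\Bigr)^{1/p}.$$
It follows that
$$\Bigl(\int_{E_n} |g|^q\,d\mu\Bigr)^{1/q} \le \|\Lambda\| \qquad \text{(A.3)}$$
and letting $n \to \infty$ shows that $g \in L^q$. If $p = 1$ then for every $B \in \mathcal{E}$ we have
$$\Bigl|\int_B g\,d\mu\Bigr| = |\Lambda(1_B)| \le \|\Lambda\|\,\mu(B).$$
But this implies that $|g| \le \|\Lambda\|$ a.e. (indeed: if not there would exist an $\varepsilon > 0$
such that the set $B = \{x : |g(x)| > \|\Lambda\| + \varepsilon\}$ has positive $\mu$-measure, leading to
a contradiction), hence $g \in L^\infty$.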
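The passage from the chain of (in)equalities for $\int_{E_n} |g|^q\,d\mu$ to the bound (A.3) in the case $p > 1$ uses only the relation between conjugate exponents. The following sketch spells this out; the abbreviation $I_n$ is introduced here for illustration and is not part of the text.

```latex
% Conjugate exponents: 1/p + 1/q = 1, hence (q-1)p = q and 1 - 1/p = 1/q.
\[
  \bigl\| 1_{E_n} |g|^{q-1}\operatorname{sign}(g) \bigr\|_{L^p}
   = \Bigl( \int_{E_n} |g|^{(q-1)p}\,d\mu \Bigr)^{1/p}
   = \Bigl( \int_{E_n} |g|^{q}\,d\mu \Bigr)^{1/p}.
\]
% Write I_n := \int_{E_n} |g|^q d\mu, which is finite since g is bounded
% on E_n and \mu is finite. If I_n > 0, divide I_n \le \|\Lambda\| I_n^{1/p}
% by I_n^{1/p}; if I_n = 0 the bound is trivial.
\[
  I_n^{1/q} = I_n^{1 - 1/p} \le \|\Lambda\| .
\]
```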
So in all cases the function $g$ in (A.2) belongs to $L^q$. We proved already
that (A.2) holds for all bounded functions $f$. Now $\Lambda$ is continuous on $L^p$ by
assumption and Hölder's inequality implies that the right-hand side is continuous
for $f \in L^p$ as well. Since the bounded functions are dense in $L^p$, this shows that
the relation holds in fact for all $f \in L^p$.
Uniqueness of $g$ is easy to prove. We conclude that we may identify the dual of
$L^p$ with $L^q$. Moreover, using (A.3) it is easy to see that for $\Lambda \in (L^p)^*$ given by
(A.2), we have $\|\Lambda\| = \|g\|_{L^q}$ (Exercise 4).
Let $X$ be a topological vector space with dual $X^*$. Every point $x \in X$
induces a linear functional on $X^*$, defined by $\Lambda \mapsto \Lambda x$. The weak*-topology of
$X^*$ is the weakest (i.e. smallest) topology making all these maps continuous.
The following theorem states that $X^*$ with the weak*-topology is a locally
convex topological vector space. This implies for instance that we can apply the
separation theorem to it. In general, the space $X^*$ endowed with the weak*-topology
is not a Banach space. (In fact, it is not even metrizable if $X$ is an
infinite-dimensional Banach space.)
Theorem A.4.2. The dual $X^*$ of a topological vector space $X$, endowed with
the weak*-topology, is a locally convex topological vector space. Its dual is given
by $\{\Lambda \mapsto \Lambda x : x \in X\}$.
Proof. Denote by $f_x$ the linear functional $\Lambda \mapsto \Lambda x$. If $\Lambda \ne \Lambda'$ in $X^*$, there
exists an $x \in X$ such that $f_x\Lambda \ne f_x\Lambda'$. Hence, in $\mathbb{R}$ there exist disjoint neighborhoods
$U$ of $f_x\Lambda$ and $U'$ of $f_x\Lambda'$. Since $f_x$ is continuous, $f_x^{-1}(U)$ and $f_x^{-1}(U')$
are disjoint neighborhoods of $\Lambda$ and $\Lambda'$. This shows that $X^*$ is Hausdorff, and
in particular that points are closed.
To show that the weak*-topology is translation invariant, consider an open
base set
$$U = \{\Lambda : \Lambda x_1 \in B_1, \ldots, \Lambda x_n \in B_n\}$$
and $\Lambda' \in X^*$. Then $\Lambda' + U = \{\Lambda : \Lambda x_1 \in B_1 + \Lambda' x_1, \ldots, \Lambda x_n \in B_n + \Lambda' x_n\}$ is
an open base set as well. It follows that the topology is translation invariant.
Note that the open sets $V$ of the form
$$V = \{\Lambda : |\Lambda x_1| < r_1, \ldots, |\Lambda x_n| < r_n\} \qquad \text{(A.4)}$$
for $x_1, \ldots, x_n \in X$ and $r_1, \ldots, r_n > 0$ form a local base at 0. Every such set $V$
is convex, balanced and absorbing (check). In particular, $X^*$ is a locally convex
space.
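For the set $V$ in the preceding display we have $V/2 + V/2 = V$ and hence
addition is continuous at $(0, 0)$. As for scalar multiplication, suppose that $\alpha\Lambda \in
V$ for some scalar $\alpha \in \mathbb{R}$ and $\Lambda \in X^*$. By Exercise 2, there exists $t > 0$ such that
$t < 1/|\alpha|$ and $\Lambda \in tV$. For $\varepsilon \in \mathbb{R}$ and $\Lambda' \in tV$ we have that $(\alpha + \varepsilon)\Lambda' \in (\alpha + \varepsilon)tV$.
Hence, since $V$ is balanced, $(\alpha + \varepsilon)\Lambda' \in V$ for all $\varepsilon$ such that $|\alpha|t + |\varepsilon|t \le 1$. Since
$|\alpha|t < 1$ there is a nonempty interval around 0 of $\varepsilon$ satisfying this condition.
Hence, scalar multiplication is continuous.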
It remains to identify the dual of $X^*$ (endowed with the weak*-topology). If
$x \in X$, the linear map $\Lambda \mapsto \Lambda(x)$ is weak*-continuous by definition of the weak*-topology.
Conversely, let $f : X^* \to \mathbb{R}$ be weak*-continuous. By Theorem A.2.5,
$f$ is bounded on a neighborhood of 0, and hence also on a base set $V$ of the form
(A.4). This implies that $f$ vanishes on the set $N = \{\Lambda : \Lambda x_1 = \cdots = \Lambda x_n = 0\}$
(Exercise 5). Now $N$ is the kernel of the linear map $\pi : X^* \to \mathbb{R}^n$ defined by
$\pi(\Lambda) = (\Lambda x_1, \ldots, \Lambda x_n)$. It follows that the linear map $F : \pi(X^*) \to \mathbb{R}$ given by
$F(\pi(\Lambda)) = f(\Lambda)$ is well defined (check). We can extend $F$ to a linear functional
on $\mathbb{R}^n$. It is then necessarily of the form $F(z_1, \ldots, z_n) = \sum_i \alpha_i z_i$ for certain real
numbers $\alpha_i$. In particular,
$$f(\Lambda) = F(\Lambda x_1, \ldots, \Lambda x_n) = \sum_i \alpha_i \Lambda x_i.$$
So indeed, $f(\Lambda) = \Lambda x$, with $x = \sum_i \alpha_i x_i$.
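The (check) concerning the map $F$ in this paragraph amounts to the following two observations. This is a sketch which takes for granted that $f$ vanishes on $N$ (Exercise 5); the scalars $a, b$ and the second functional $\Lambda'$ are introduced here for illustration.

```latex
% Well-definedness of F on \pi(X^*): if \pi(\Lambda) = \pi(\Lambda'), then
\[
  \pi(\Lambda - \Lambda') = 0
  \;\Longrightarrow\; \Lambda - \Lambda' \in N
  \;\Longrightarrow\; f(\Lambda) - f(\Lambda') = f(\Lambda - \Lambda') = 0 .
\]
% Linearity of F follows from the linearity of f and \pi:
\[
  F\bigl(a\,\pi(\Lambda) + b\,\pi(\Lambda')\bigr)
   = F\bigl(\pi(a\Lambda + b\Lambda')\bigr)
   = f(a\Lambda + b\Lambda')
   = a\,F(\pi(\Lambda)) + b\,F(\pi(\Lambda')).
\]
```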
If $X$ is a Banach space its dual $X^*$ is endowed with a norm, and the unit
ball in $X^*$ is the set $\{\Lambda \in X^* : |\Lambda x| \le \|x\| \text{ for all } x \in X\}$. In the norm topology
this set is not compact in general (think of an infinite-dimensional
Hilbert space). In the weak*-topology however, it is always compact.
Theorem A.4.3 (Banach-Alaoglu). The unit ball of the dual of a Banach
space is weak*-compact.
Proof. Denote the Banach space by $X$ and let $B^*$ be the unit ball in its dual. By
Tychonov's theorem, $P = \prod_{x \in X} [-\|x\|, \|x\|]$ is compact (relative to the product
topology). We can view $P$ as a collection of functions on $X$, with $f \in P$ if and
only if $|f(x)| \le \|x\|$ for all $x \in X$. As such, we have $B^* \subseteq X^* \cap P$. Hence, $B^*$
inherits two topologies: the weak*-topology from $X^*$ and the product topology
from $P$. These two topologies on $B^*$ coincide. To see this, take $\Lambda_0 \in B^*$. The
sets of the form
$$V_1 = \{\Lambda \in X^* : |\Lambda x_1 - \Lambda_0 x_1| < r_1, \ldots, |\Lambda x_n - \Lambda_0 x_n| < r_n\}$$
and
$$V_2 = \{f \in P : |f(x_1) - \Lambda_0 x_1| < r_1, \ldots, |f(x_n) - \Lambda_0 x_n| < r_n\}$$
form a local base for the weak*-topology and, respectively, the product topology
at $\Lambda_0$. Since $B^* \subseteq X^* \cap P$ we have $V_1 \cap B^* = V_2 \cap B^*$ and hence the two relative
topologies coincide.
Next we show that $B^*$ is closed in $P$. Take $f_0$ in the closure of $B^*$ (with
respect to the product topology). For $x, y \in X$, $\alpha, \beta \in \mathbb{R}$ and $\varepsilon > 0$ we have
that the set
$$U = \{f \in P : |f(x) - f_0(x)| < \varepsilon,\ |f(y) - f_0(y)| < \varepsilon,\ |f(\alpha x + \beta y) - f_0(\alpha x + \beta y)| < \varepsilon\}$$
is an open neighborhood of $f_0$. Hence, there exists an $f \in U \cap B^*$. Since $f$ is
linear we have
$$f_0(\alpha x + \beta y) - \alpha f_0(x) - \beta f_0(y) = (f_0 - f)(\alpha x + \beta y) - \alpha(f_0 - f)(x) - \beta(f_0 - f)(y)$$
and hence
$$|f_0(\alpha x + \beta y) - \alpha f_0(x) - \beta f_0(y)| \le \varepsilon(1 + |\alpha| + |\beta|).$$
Since $\varepsilon$ was arbitrary, it follows that $f_0$ is linear. By definition of $P$ we have
that $|f_0(x)| \le \|x\|$ for every $x \in X$, so indeed $f_0 \in B^*$.
The proof is now completed upon noting that by the preceding paragraph,
$B^*$ is a closed subset of the compact set $P$ and hence compact with respect to
the product topology. But by the first part of
the proof, the latter topology coincides on $B^*$ with the weak*-topology.
Example A.4.4. Although the weak*-topology has some nice properties according
to Theorem A.4.2, it is good to note that it is typically "strange".
Consider for instance a finite measure $\mu$ on the line and view $L^\infty(\mu)$ as the dual
of $L^1(\mu)$. Then from the form of the local base at 0 given in the proof of the
theorem one sees that a sequence $f_n$ in $L^\infty$ converges in the weak*-topology to
0 if $\int f_n g\,d\mu \to 0$ for every $g \in L^1$. By dominated convergence, this holds for
instance for $f_n = 1_{(-n,n)^c}$. This sequence does however not converge to 0 in
the ordinary, uniform topology on $L^\infty$ (provided $\mu$ has unbounded support, so
that $\|f_n\|_\infty = 1$ for all $n$). More generally, to say that a function
$f \in L^\infty$ belongs to the weak*-closure of a set $C \subseteq L^\infty$ does not necessarily
mean that $f$ is well-approximated by elements of $C$ in a uniform or any other
intuitively reasonable way.
A.5 Exercises
1. Give an example which shows that the separation theorem does not hold
in general if the assumption of compactness of one of the sets is dropped.
2. Suppose that $C$ is an open neighborhood of 0 in a topological vector space
and let $\mu_C$ be its Minkowski functional. Show that for all $x \in C$ it holds
that $\mu_C(x) < 1$.
3. Show that a non-constant linear functional on a topological vector space
maps open sets to open sets.
4. In Example A.4.1, show that for the functional $\Lambda$ on $L^p$ defined by (A.2)
we have $\|\Lambda\| = \|g\|_{L^q}$.
5. In the last part of the proof of Theorem A.4.2, show that the functional
$f$ vanishes on the set $N$.
B Elements of martingale theory
B.1 Basic definitions
Let $(\Omega, \mathcal{F}, P)$ be a probability space. A collection of $\mathbb{R}^d$-valued random variables
$X = (X_t)_{t \in T}$ indexed by a set $T \subseteq \mathbb{R}$ is called a ($d$-dimensional) stochastic
process. We call the process continuous (or cadlag) if its trajectories $t \mapsto X_t(\omega)$
are continuous (or cadlag). The process is called bounded if there exists a finite
number $K$ such that a.s. $\|X_t\| \le K$ for all $t$.
A filtration is a collection $(\mathcal{F}_t)_{t \in T}$ of sub-$\sigma$-fields of $\mathcal{F}$ such that $\mathcal{F}_s \subseteq \mathcal{F}_t$
for all $s \le t$. It is said to satisfy the usual conditions if it is right-continuous,
i.e. $\bigcap_{s > t} \mathcal{F}_s = \mathcal{F}_t$ for all $t$, and $\mathcal{F}_0$ contains all the $P$-null sets in $\mathcal{F}$. A process $X$
is called adapted to $(\mathcal{F}_t)$ if for every $t$, $X_t$ is $\mathcal{F}_t$-measurable. For a process $X$
and $t \in T$ we define $\mathcal{F}^X_t$ to be the $\sigma$-field generated by the collection of random
variables $\{X_s : s \le t\}$. The filtration $(\mathcal{F}^X_t)$ is called the natural filtration of the
process $X$. It is the smallest filtration to which it is adapted. A process $X =
(X_t)_{t \in [0,T]}$ is called progressively measurable relative to the filtration $(\mathcal{F}_t)_{t \in [0,T]}$
if for all $t$, the map $(\omega, s) \mapsto X_s(\omega)$ on $\Omega \times [0, t]$ is $\mathcal{F}_t \otimes \mathcal{B}([0, t])$-measurable.
A $[0, \infty]$-valued random variable $\tau$ is called a stopping time relative to the
filtration $(\mathcal{F}_t)$ if $\{\tau \le t\} \in \mathcal{F}_t$ for every $t$. If $\tau$ is a stopping time and $X$ a
process, the stopped process $X^\tau$ is defined by $X^\tau_t = X_{\tau \wedge t}$. A localizing sequence
is a sequence of stopping times $\tau_n$ increasing a.s. to infinity. A process $X$ is said
to have a property P locally if there exists a localizing sequence $\tau_n$ such that for
every $n$, the stopped process $X^{\tau_n}$ has the property P.
A process $M$ is called a martingale relative to the filtration $(\mathcal{F}_t)$ if it is adapted, every
$M_t$ is integrable and for all $s \le t$ it holds that $E(M_t \mid \mathcal{F}_s) = M_s$ a.s. In
accordance with the previously introduced notation the process $M$ is called a
local martingale if there exists a localizing sequence $\tau_n$ such that for every $n$, the
stopped process $M^{\tau_n}$ is a martingale. Every martingale is a local martingale,
but not vice versa.
B.2 Theorems
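For a filtration $(\mathcal{F}_t)$ and a stopping time $\tau$ we define
$$\mathcal{F}_\tau = \{A \in \mathcal{F} : A \cap \{\tau \le t\} \in \mathcal{F}_t \text{ for all } t\}.$$
The set $\mathcal{F}_\tau$ is always a $\sigma$-field and should be thought of as the collection of
events describing the history before time $\tau$.
Theorem B.2.1 (Optional stopping theorem). Let $M$ be a cadlag, uniformly
integrable martingale. Then for all stopping times $\sigma \le \tau$,
$$E(M_\tau \mid \mathcal{F}_\sigma) = M_\sigma.$$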
Theorem B.2.2 (Kakutani's theorem). Let $X_1, X_2, \ldots$ be independent nonnegative
random variables with mean 1. Define $M_0 = 1$ and $M_n = X_1 X_2 \cdots X_n$.
It holds that $M$ is uniformly integrable if and only if $\sum_n (1 - E\sqrt{X_n}) < \infty$. If $M$
is not uniformly integrable, then $M_n \to 0$ a.s.
Corollary B.2.3. Let $X = (X_1, X_2, \ldots)$ and $Y = (Y_1, Y_2, \ldots)$ be two sequences
of independent random variables. Assume $X_i$ has a positive density $f_i$ with
respect to a dominating measure $\mu$, and $Y_i$ has a positive density $g_i$ with respect
to $\mu$. Then the laws of the sequences $X$ and $Y$ are equivalent probability
measures on $(\mathbb{R}^\infty, \mathcal{B}(\mathbb{R}^\infty))$ if and only if
$$\sum_{i=1}^\infty \int (\sqrt{f_i} - \sqrt{g_i})^2\,d\mu < \infty.$$
If the laws are not equivalent, they are mutually singular.
Proof. Let $(\Omega, \mathcal{F}) = (\mathbb{R}^\infty, \mathcal{B}(\mathbb{R}^\infty))$ and $Z = (Z_1, Z_2, \ldots)$ the coordinate process
on $(\Omega, \mathcal{F})$, so $Z_i(\omega) = \omega_i$. Let $\mathcal{F}_n \subseteq \mathcal{F}$ be the $\sigma$-field generated by $Z_1, \ldots, Z_n$.
Since the densities $f_i$ and $g_i$ are all positive, the distributions $P_X$ and $P_Y$ of the
sequences $X$ and $Y$ are equivalent on $\mathcal{F}_n$. For $A \in \mathcal{F}_n$ we have
$$P_X(A) = \int_A M_n\,dP_Y,$$
where the Radon-Nikodym derivative is defined by $M_n = \prod_{i=1}^n f_i(Z_i)/g_i(Z_i)$.
Observe that under $P_Y$, the process $M$ is a martingale to which the preceding
theorem applies. It is readily verified that the measures $P_X$ and $P_Y$ are
equivalent on the whole $\sigma$-field $\mathcal{F}$ if and only if $M$ is uniformly integrable with
respect to $P_Y$ (Exercise 1). Hence, by the preceding theorem, the measures are
equivalent if and only if
$$\sum_{i=1}^\infty \Bigl(1 - \int \sqrt{f_i g_i}\,d\mu\Bigr) < \infty.$$
The proof of the first part is completed by noting that $\int (\sqrt{f_i} - \sqrt{g_i})^2\,d\mu =
2 - 2\int \sqrt{f_i g_i}\,d\mu$.
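The two identities used in this paragraph can be checked in one line each. The sketch below assumes, as in the corollary, that $f_i$ and $g_i$ are probability densities with respect to the same dominating measure, written $\mu$ here.

```latex
% Expand the square and use \int f_i d\mu = \int g_i d\mu = 1:
\[
  \int (\sqrt{f_i} - \sqrt{g_i})^2\,d\mu
   = \int f_i\,d\mu + \int g_i\,d\mu - 2\int \sqrt{f_i g_i}\,d\mu
   = 2 - 2\int \sqrt{f_i g_i}\,d\mu .
\]
% Link with Kakutani's condition: under P_Y the coordinate Z_i has density g_i,
% so the i-th factor f_i(Z_i)/g_i(Z_i) of M_n satisfies
\[
  E_{P_Y}\sqrt{f_i(Z_i)/g_i(Z_i)}
   = \int \sqrt{f_i/g_i}\; g_i\,d\mu
   = \int \sqrt{f_i g_i}\,d\mu .
\]
```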
We noted that if $P_X$ and $P_Y$ are not equivalent, then $M$ is not uniformly
integrable relative to $P_Y$. Hence, by the preceding theorem, $M_n \to 0$, $P_Y$-a.s.
We can reverse the roles of $X$ and $Y$, which amounts to replacing $M$ by
$1/M$. Then we find that if $P_X$ and $P_Y$ are not equivalent, $1/M_n \to 0$, $P_X$-a.s.
It follows that for the event $A = \{M_n \to 0\}$ we have $P_Y(A) = 1$ and
$P_X(A) = 0$.
Example B.2.4. Let $X = (X_1, X_2, \ldots)$ and $Y = (Y_1, Y_2, \ldots)$ be two sequences
of independent random variables. Suppose that $P(X_i = 1) = P(X_i = -1) = 1/2$
and $P(Y_i = 1) = 1 - P(Y_i = -1) = 1/2 + \varepsilon_i$ for some $\varepsilon_i \in (-1/2, 1/2)$. By
the corollary, applied with $\mu$ the counting measure, $f_i(1) = f_i(-1) = 1/2$,
$g_i(1) = 1 - g_i(-1) = 1/2 + \varepsilon_i$, the laws of the sequences $X$ and $Y$ are equivalent
if and only if
$$\sum_i \Bigl[ \bigl(\sqrt{1/2} - \sqrt{1/2 + \varepsilon_i}\bigr)^2 + \bigl(\sqrt{1/2} - \sqrt{1/2 - \varepsilon_i}\bigr)^2 \Bigr] < \infty.$$
By Taylor's formula the function $h(x) = (\sqrt{1/2} - \sqrt{1/2 + x})^2 + (\sqrt{1/2} -
\sqrt{1/2 - x})^2$ behaves like a multiple of $x^2$ near $x = 0$ (check!). It follows that
the laws of the sequences are equivalent if and only if $\sum_i \varepsilon_i^2 < \infty$.
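The Taylor expansion behind the (check!) in Example B.2.4 can be sketched as follows, with $h$ the function defined in the example. The closed form in the first display is obtained here by expanding the squares and is not stated in the text.

```latex
% Expanding the two squares gives a closed form for h on (-1/2, 1/2):
\[
  h(x) = 2 - 2\sqrt{\tfrac12}\Bigl(\sqrt{\tfrac12 + x} + \sqrt{\tfrac12 - x}\Bigr)
       = 2 - \sqrt{1 + 2x} - \sqrt{1 - 2x}.
\]
% Taylor: \sqrt{1+u} = 1 + u/2 - u^2/8 + u^3/16 + O(u^4) as u -> 0.
% With u = 2x and u = -2x the odd-order terms cancel in the sum.
\[
  \sqrt{1+2x} + \sqrt{1-2x} = 2 - x^2 + O(x^4)
  \quad\Longrightarrow\quad
  h(x) = x^2 + O(x^4) \quad (x \to 0).
\]
```

Since $h$ is continuous and strictly positive away from 0, summability of $h(\varepsilon_i)$ forces $\varepsilon_i \to 0$, after which $h(\varepsilon_i)$ and $\varepsilon_i^2$ are comparable.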
B.3 Exercises
1. In the proof of Corollary B.2.3, show that the measures $P_X$ and $P_Y$ are
equivalent on the whole $\sigma$-field $\mathcal{F}$ if and only if $M$ is uniformly integrable.
References
Conway, J. B. (1990). A course in functional analysis, volume 96 of Graduate
Texts in Mathematics. Springer-Verlag, New York, second edition.
Delbaen, F. and Schachermayer, W. (1994). A general version of the fundamental
theorem of asset pricing. Math. Ann. 300(3), 463-520.
Delbaen, F. and Schachermayer, W. (2006). The mathematics of arbitrage.
Springer Finance. Springer-Verlag, Berlin.
Harrison, J. M. and Pliska, S. R. (1981). Martingales and stochastic integrals in
the theory of continuous trading. Stochastic Process. Appl. 11(3), 215-260.
Kreps, D. M. (1981). Arbitrage and equilibrium in economies with infinitely
many commodities. J. Math. Econom. 8(1), 15-35.
Rudin, W. (1987). Real and complex analysis. McGraw-Hill Book Co., New
York, third edition.
Rudin, W. (1991). Functional analysis. International Series in Pure and Applied
Mathematics. McGraw-Hill Inc., New York, second edition.