MARTINGALES, DIFFUSIONS AND FINANCIAL MATHEMATICS
A.W. van der Vaart
ii
CONTENTS
1. Measure Theory ......................1
1.1. Conditional Expectation.................1
1.2. Uniform Integrability..................4
1.3. Monotone Class Theorem................6
2. Discrete Time Martingales..................8
2.1. Martingales ......................8
2.2. Stopped Martingales .................10
2.3. Martingale Transforms.................12
2.4. Doob's Upcrossing Inequality..............13
2.5. Martingale Convergence................14
2.6. Reverse Martingale Convergence............16
2.7. Doob Decomposition .................19
2.8. Optional Stopping ..................20
2.9. Maximal Inequalities .................22
3. Discrete Time Option Pricing ...............25
4. Continuous Time Martingales ...............32
4.1. Stochastic Processes..................32
4.2. Martingales .....................34
4.3. Martingale Convergence................37
4.4. Stopping.......................37
4.5. Brownian Motion...................43
4.6. Local Martingales...................45
4.7. Maximal Inequalities .................47
5. Stochastic Integrals....................49
5.1. Predictable Sets and Processes.............49
5.2. Doleans Measure...................52
5.3. Square-integrable Martingales .............54
5.4. Locally Square-integrable Martingales..........61
5.5. Brownian Motion...................65
5.6. Martingales of Bounded Variation............68
5.7. Semimartingales...................72
5.8. Quadratic Variation..................77
5.9. Predictable Quadratic Variation ............81
5.10. Ito's Formula for Continuous Processes .........87
5.11. Space of Square-integrable Martingales .........91
5.12. Ito's Formula.....................95
6. Stochastic Calculus....................96
6.1. Levy's Theorem ...................97
6.2. Brownian Martingales.................100
6.3. Exponential Processes.................103
6.4. Cameron-Martin-Girsanov Theorem...........105
7. Stochastic Differential Equations..............110
7.1. Strong Solutions...................113
iii
7.2. Martingale Problem and Weak Solutions.........119
7.3. Markov Property...................126
8. Option Pricing in Continuous Time.............128
iv
LITERATURE
There are very many books on the topics of the course. The list below is a small selection.
Discrete martingales are discussed in most advanced introductions to general probability theory. The book by David Williams is particularly close to our presentation.
For an introduction to stochastic integration we prefer the book by Chung and Williams (Ruth Williams this time). It has introductions to most of the important topics and is very well written. The two volumes by Rogers and Williams (David again) are a classic, but they are not easy and perhaps even a bit messy at times. The book by Karatzas and Shreve is more accessible. The book by Revuz and Yor I do not know, but it gets good reviews. Unlike Chung and Williams the latter two books are restricted to martingales with continuous sample paths, which obscures some interesting aspects, but also makes some things easier.
The theory of stochastic integration and much of the theory of abstract stochastic processes was originally developed by the "french school", with Meyer as the most famous proponent. Few people can appreciate the fairly abstract and detailed original books (Look for Dellacherie and Meyer, volumes 1, 2, 3, 4). The book by Elliott is in this tradition, but somewhat more readable. The first chapter of Jacod and Shiryaev is an excellent summary and reference, but is not meant for introductory reading.
The book by 0ksendal is a popular introduction. Unfortunately, at many places it is obscure and sometimes wrong, in particular in the later chapters. Unpleasant notation as well.
The book by Stroock and Varadhan is a classic on stochastic differential equations and particularly important as a source on the "martingale problem".
There are also many books on financial calculus. Some of them are written from the perspective of differential equations. Then Brownian motion is reduced to a process such that (dBt)2 = dt. The books mentioned below are of course written from a probabilistic point of view. Baxter and Rennie have written their book for a wide audience. It is interesting how they formulate "theorems" very imprecisely, but never wrong. It is good to read to get a feel for the subject. Karatzas and Shreve, and Kopp and Elliott have written rigorous mathematical books that give you less feel, but more theorems.
[1] Baxter, M. and Rennie, A., (1996). Financial calculus. Cambridge
University Press, Cambridge. [2] Chung, K.L. and Williams, R.J., (1990). Introduction to stochastic
integration, second edition. Birkhauser, London.
[3] Elliott, R.J., (1982). Stochastic calculus and applications. Springer-Verlag, New York.
V
[4] Jacod, J. and Shiryaev, A.N., (1987). Limit theorems for stochastic processes. Springer-Verlag, Berlin.
[5] Kopp, P.E. and Elliott, R.J., (1999). Mathematics and financial markets. Springer-Verlag, New York.
[6] Karatzas, I. and Shreve, S.E., (1988). Brownian motion and stochastic calculus. Springer-Verlag, Berlin.
[7] Karatzas, I. and Shreve, S.E., (1998). Methods of mathematical finance. Springer-Verlag, Berlin.
[8] 0ksendal, B., (1998). Stochastic differential equations, 5th edition. Springer, New York.
[9] Revuz, D. and Yor, M., (1994). Continuous martingales and Brown-ian motion. Springer, New York.
[10] Rogers, L.CG. and Williams, D., (2000). Diffusions, Markov Processes and Martingales, volumes 1 and 2. Cambridge University Press, Cambridge.
[11] Stroock, D.W. and Varadhan, S.R.S., (1979). Multidimensional Diffusion Processes. Springer-Verlag, Berlin.
[12] van der Vaart, A.W. and Wellner, J.A., (1996). Weak Convergence and Empirical Processes. Springer Verlag, New York.
[13] Williams, D., (1991). Probability with Martingales. Cambridge University Press, Cambridge.
vi
EXAM
The written exam will consist of problems as in these notes, questions to work out examples as in the notes or variations thereof, and will require to give precise definitions and statements of theorems plus a numbers of proofs.
The requirements for the oral exam are the same. For a very high mark it is, of course, necessary to know everything.
Very important is to be able to give a good overview of the main points of the course and their connections.
Starred sections or lemmas in the lecture notes can be skipped completely. Starred exercises may be harder than other exercises.
Proofs to learn by heart:
2.13, 2.43, 2.44 for p=2.
4.21, 4.22, 4.26, 4.28.
5.22, 5.25(i)-(iii), 5.43, 5.46 case that M is continuous, 5.52, 5.57, 5.76,
5.85.
6.1, 6.9(h),
7.7 case that Ef2 < oo and (7.5) holds for every x,y, 7.14.
1
Measure Theory
In this chapter we review or introduce a number of results from measure theory that are especially important in the following.
1.1  Conditional Expectation
Let X be an integrable random variable defined on the probability space (U,T,P). In other words X:U —> M is a measurable map (relative to T and the Borel sets on R) with E|X| < oo.
1.1 Definition. Given a sub a-Geld To C T the conditional expectation of X relative to To is a To-measurable map X'-.tt —> M such that
(1.2). EX1F = EX'1F,       for every F € To,
The random variable X' is denoted by E(X| To).
It is clear from this definition that any other .Fo-measurable map X": fi —> M such that X' = X" almost surely is also a conditional expectation. In the following theorem it is shown that conditional expectations exist and are unique, apart from this indeterminacy on null sets.
1.3 Theorem. Let X be a random variable with E|X| < oo and To C T a a-held. Then there exists an To-measurable map X'-.tt —> M such that (1.2) holds. Furthermore, any two such maps X' agree almost surely.
Proof. If X > 0, then on the u-field ^"o we can define a measure fi{F) = fF X dP. Clearly this measure is finite and absolutely continuous relative to the restriction of P to To- By the Radon-Nikodym theorem there exists
2
1: Measure Theory
an .Fo-measurable function X', unique up to null sets, such that fi{F) = fFX' dP for every F € Tq. This is the desired map X'. For a general X we apply this argument separately to X+ and X~ and take differences.
Suppose that F,(X' — X")1f = 0 for every F in a u-field for which X' — X" is measurable. Then we may choose F = {X' > X} to see that the probability of this set is zero, because the integral of a strictly positive variable over a set of positive measure must be positive. Similarly we see that the set F = {X' < X"} must be a null set. Thus X' = X" almost surely. ■
The definition of a conditional expectation is not terribly insightful, even though the name suggests an easy interpretation as an expected value. A number of examples will make the definition clearer.
A measurable map Y: ft —> (B,V) generates a u-field a(Y). We use the notation E(X\ Y) as an abbreviation of E(X\ a(Y)).
1.4 Example (Ordinary expectation). The expectation EX of a random variable X is a number, and as such can of course be viewed as a degenerate random variable. Actually, it is also the conditional expectation relative to the trivial cr-field T0 = {0, ft}. More generally, we have that E(X\F0) = EX if X and Tq are independent. In this case Tq gives "no information" about X and hence the expectation given Tq is the "unconditional" expectation.
To see this note that E(EX)1F = EXElF = EXlF for every F such that X and F are independent. □
1.5 Example. At the other extreme we have that E(X\To) = X if X itself is ^b-measurable. This is immediate from the definition. "Given Tq we then know X exactly." □
1.6 Example. Let (X, Y): ft -fMxl' be measurable and possess a density f{x,y) relative to a u-fmite product measure fi x v on ]R x IR* (for instance, the Lebesgue measure on IR*"1"1). Then it is customary to define a conditional density of X given Y = y by
This is well-defined for every y for which the denominator is positive, i.e. for all y in a set of measure one under the distribution of Y.
We now have that the conditional expection is given by the "usual formula"
where we may define the right hand zero as zero if the expression is not well-defined.
f{x\v) =
f(x,y)
f f{x,y)dn{x)'
1.1: Conditional Expectation 3
That this formula is the conditional expectation according to the abstract definition follows by a number of applications of Fubini's theorem. Note that, to begin with, it is a part of the statement of Fubini's theorem that the function on the right is a measurable function of Y. □
1.7 Example (Partitioned fi). If Tq = cr(Fi,... ,Fk) for a partition fi = U^=1Fi, then
k
E(X\T0) = ^(X\Fi)lFi,
i=l
where E{X\Fi) is defined as EXlFi/P{Fi) if P{Fi) > 0 and arbitrary otherwise. Thus the conditional expectation is constant on every of the partitioning sets i*1; (as it needs to be to be .Fo-measurable) and the constant values are equal to the average values of X over these sets.
The validity of (1.2) is easy to verify for F = Fj and every j. And then also for every F € Tq by taking sums, since every F € Tq is a union of a number of Fj's.
This example extends to u-fields generated by a countable partition of fi. In particular, F,(X\ Y) is exactly what we would think it should be if Y is a discrete random variable. □
A different perspective on an expectation is to view it as a best prediction if "best" is defined through minimizing a second moment. For instance, the ordinary expectation EX minimizes fi F,(X — fi)2 over fi € 1R. A conditional expectation is a best prediction by an .Fo-measurable variable.
1.8 Lemma (I/2-projection). If EX2 < oo, then E(X| J^q) minimizes F,(X — Y)2 over all To-measurable random variables Y.
Proof. We first show that X' = E(X| T0) satisfies EX'Z = EXZ for every ^b-measurable Z with EZ2 < oo.
By linearity of the conditional expectation we have that KX'Z = KXZ for every ^b-simple variable Z. If Z is ^b-measurable with EZ2 < oo, then there exists a sequence Zn of ^b-simple variables with E(Z„ — Z)2 —> 0. Then EX'Zn —»■ EX'Z and similarly with X instead of X' and hence EX'Z = EXZ.
Now we decompose, for arbitrary square-integrable Y,
E{X - Y)2 = E{X - X')2 + 2E{X - X'){X' — Y)+ E{X' - Y)2.
The middle term vanishes, because Z = X' — Y is ^b-measurable and square-integrable. The third term on the right is clearly minimal for X' = Y. m
4        1: Measure Theory
1.9 Lemma (Properties).
(i) EE(X\F0) = EX.
(ii) If Z is To-measurable, then F,(ZX\To) = ZF,(X\To) a.s.. (Here require that X € Lp(n,T,P) and Z € Lq(n,T,P) for 1 < p < oo and p'1 + Q-1 = I.)
(Hi) (linearity) E(al + /3Y\T0) = oE{X\T0) + /?E(Y|T0) a.s..
(iv) (positivity) IfX > 0 a.s., then E(X\T0) > 0 a.s..
(v) (towering property) If To C ^"i CT, thenE(E(X\T^\ T0) = E(X| ^"0) a.s..
(vi) (Jensen) If <j>: ffi ^ ffi is convex, then E(cj>(X)\T0) > 0(E(X| ^"0)) a.s.. (Here require that (j>{X) is integrable.)
(vii) \\E(X\To)\\P<\\X\\p (p>l).
* 1.10 Lemma (Convergence theorems).
(i) IfO<XntX a.s., then 0 < E(X„| T0) t E(X| T0) a.s..
(ii) If Xn > 0 a.s. for every n, then E(lim inf Xn\To) < lim inf E(X„| To) a.s..
(Hi) If \Xn\ < Y for every n and and integrable variable Y, and Xn J^. X, then E{Xn\T0)     E(X| T0) a.s..
The conditional expectation E(X| Y) given a random vector Y is by definition a <r(Y)-measurable function. For most Y, this means that it is a measurable function g(Y) of Y. (See the following lemma.) The value g(y) is often denoted by E(X| Y = y).
Warning. Unless P(Y = y) > 0 it is not right to give a meaning to E(X|Y = y) for a fixed, single y, even though the interpretation as an expectation given "that we know that Y = y" often makes this tempting. We may only think of a conditional expectation as a function y >->■ E(X| Y = y) and this is only determined up to null sets.
1.11 Lemma. Let {Ya:a £ A} be random variables on $7 and Jet X be a a(Ya:a € A)-measurable random variable.
(i) If A = {1,2,...,fc}, then there exists a measurable map g:Mk —> ffi such that X = g{Yi,... ,Yu).
(ii) If \A\ = oo, then there exists a countable subset {an}^=1 C A and a measurable map g: 1R00 —> ffi such that X = g(Yai, Y«2,...).
1.2  Uniform Integrability
In many courses on measure theory the dominated convergence theorem is one of the best results. Actually, domination is not the right concept, uniform integrability is.
1.2: Uniform Integrability 5
1.12 Definition. A collection {Xa:a £ A} of random variables is uniformly integrable if
lim sup E|X«|1|x„|>m = 0.
1.13 Example. A finite collection of integrable random variables is uniformly integrable.
This follows because E|X|1|x|>m —> 0 as M —> oo for any integrable variable X, by the dominated convergence theorem. □
1.14 Example. A dominated collection of random variables is uniformly integrable: if \Xa\ < Y and EY < oo, then {Xa:a € A} is uniformly integrable.
To see this note that |X*|1|x„|>m < Yly>m- □
1.15 Example. If the collection of random variables {Xa:a £ A} is bounded in L^, then it is is uniformly integrable.
This follows from the inequality E|X|1|x|>m < M_1EX2, which is valid for any random variable X.
Similarly, it suffices for uniform integrability that sup„E|X«|p < oo for some p > 1. □
1.16 EXERCISE. Show that a uniformly integrable collection of random variables is bounded in Li(U,T,P).
1.17 EXERCISE. Show that any converging sequence Xn in T, P) is uniformly integrable.
1.18 Theorem. Suppose that {Xn:n € N} C Lt(n,F,P). Then E\Xn -X\ 0 for some X € Li (fi, T, P) if and only if Xn ^. X and {Xn: n € N} is uniformly integrable.
Proof. We only give the proof of "if". (The main part of the proof in the other direction is the preceding exercise.)
If Xn J4 X, then there is a subsequence Xnj that converges almost surely to X. By Fatou's lemma E|X| < liminf E|Xnj|. If Xn is uniformly integrable, then the right side is finite and hence X € Li(U,T,P).
For any random variables X and Y and positive numbers M and N,
E|X|lm>M < E\X\1\X>N1\y\>m + NP(\Y\ > M)
(1.19) AT
<E|X|l|X|>JV+-E|Y|lm>M.
Applying this with M = N and (X, Y) equal to the four pairs that can be formed of Xn and X we find, for any M > 0,
E\Xn - X\(l\Xn\>M + 1|x|>m) < 2E|X„|l|Xn|>M + 2E|X|1|X|>M.
6        1: Measure Theory
We can make this arbitrarily small by making M sufficiently large. Next, for any e > 0,
E\Xn - X\1\x.\<m,\x\<m < e + 2MP(\Xn - X\ > e).
As n —> oo the second term on the right converges to zero for every fixed e > 0 and M. ■
1.20 EXERCISE. If {\Xn\P:n € N} is uniformly integrable (p > 1) and Xn 4 X, then E\Xn - X\?    0. Show this.
1.21 Lemma. IfXGLi (fi, T, P), then the collection of all conditional expectations E(X| To) with Tq ranging over all sub a-Eelds of T is uniformly integrable.
Proof. By Jensen's inequality |E(Jf | J^-"o)| < EdXII^o) almost surely. It therefore suffices to show that the conditional expectations E(|X| | To) are uniformly integrable. For simplicity of notation suppose that X > 0. With X' = E(X| To) and arguing as in (1.19) we see that
EX 1x>>m = EXlx'>m < EX1x>n + j^fEX .
We can make the right side arbitrarily small by first choosing N and next M sufficiently large. ■
We conclude with a lemma that is sometimes useful.
1.22 Lemma. Suppose that Xn and X are random variables such that Xn 4 X and limsupE|X„|P < E\X\? < oo for somep > 1. Then {Xn:n € N} is uniformly integrable and E\Xn - X\p -»■ 0.
1.3  Monotone Class Theorem
Many arguments in measure theory are carried out first for simple types of functions and then extended to general functions by taking limits. A monotone class theorem is meant to codify this procedure. This purpose of standardizing proofs is only partly successful, as there are many monotone class theorems in the literature, each tailored to a particular purpose. The following theorem will be of use to us.
We say that a class T-L of functions h: fi —> M. is closed under monotone limits if for each sequence {hn} C H such that 0 < hn f h for some function h, the limit h is contained in H. We say that it is closed under bounded monotone limits if this is true for every such sequence hn with a (uniformly) bounded limit. A class of sets is intersection-stable if it contains the intersection of every pair of its elements (i.e. is a 7r-system).
1.3: Monotone Class Theorem 7
1.23 Theorem. Let % be a vector space of functions h: fi —> ffi on a measurable space (fi, J7) that contains the constant functions and the indicator of every set in a collection Tq C T, and is closed under (bounded) monotone limits. If Tq is intersection-stable, then H contains all (bounded) a(J-ro)-measurable functions.
Proof. See e.g. Williams, A3.1 on p205. ■
2
Discrete Time Martingales
A stochastic process X in discrete time is a sequence Xq, X\, X%,... of random variables defined on some common probability space (fi,.?7, P). The index n of Xn is referred to as "time" and a map n Xn(oj), for a fixed oj € fi, is a sample path. (Later we replace n by a continuous parameter t € [0, oo) and use the same terminology.) Usually the discrete time set is Z+ = N U {0}. Sometimes we delete 0 or add oo to get N or Z+ = N U {0, oo}, and or delete a corresponding random variable X^ or Xq to form the stochastic process.
2.1 Martingales
A filtration {Tn} (in discrete time) on a given probability space (fi, T, P) is a nested sequence of u-fields
ToCTxC-'-CT.
The u-field Tn is interpreted as the events F of which it is known at "time" n whether F has occurred or not. A stochastic process X is said to be adapted if Xn is ^-measurable for every n > 0. The quadruple (fi, T, {Tn}, P) is called a "filtered probability space" or "stochastic basis".
A typical example of a filtration is the natural filtration generated by a stochastic process X, defined as
= &(Xo,Xi,...,Xn).
Then F € Tn if and only if F = {{X0,..., Xn) € B} for some Borel set B. Once Xq,..., Xn are realized we know whether F has occurred or not. The natural filtration is the smallest filtration to which X is adapted.
2.1: Martingales 9
2.1 Definition. An adapted, integrable stochastic process X on the filtered space (fi,T, {Tn},P) is a
(i) martingale if E(X„| T^) = Xm a.s. for all m <n.
(ii) submartingale if E(X„| Tm) > -Xm a.s. for all m < n. (ii) supermartingale if E(X„| .T7™) < Xm a.s. for all m <n.
A different way of writing the martingale property is
E(Xn-Xm\Fm) = 0, m<n.
Thus given all information at time m the expected increment Xn — Xm in the future time interval (m, n] is zero, for every initial time m. This shows that a martingale Xn can be interpreted as the total gain up to time n in a fair game: at every time m we expect to make a zero gain in the future (but may have gained in the past and we expect to keep this). In particular, the expectation EX„ of a martingale is constant in n.
Submartingales and supermartingales can be interpreted similarly as total gains in favourable and unfavourable games. If you are not able to remember which inequalities correspond to "sub" and "super", that is probably normal. It helps a bit to try and remember that a submartingale is increasing in mean: EXm < EXn if m < n.
2.2 EXERCISE. If E(Xn+1\Fn) = Xn for every n > 0, then automatically E(X„| Tjn) = Xm for every m < n and hence X is a martingale. Similarly for sub/super. Show this.
2.3 Example. Let Y\,Yi,... be a sequence of independent random variables with mean zero. Then the sequence of partial sums Xn = Y\ H-----YYn
is a martingale relative to the filtration Tn = <r{Y\,. ■ ■ ,Yn). Set X0 = 0.
This follows upon noting that for m < n the increment Xn — Xm = ^2m<i<nYi is independent of Tm and hence E(Xn - Xm\Tm) = E(Xn -Xm) =0. □
2.4 EXERCISE. In the preceding example show that <r(Yi,...,Yn) = <j(Xi,. .. ,Xn).
2.5 EXERCISE. If {N(t):t > 0} is a standard Poisson process and 0 < to < ti < ■ ■ ■ is a fixed sequence of numbers, then Xn = N(tn) — tn is a martingale relative to the filtration Tn = a(N(t):t < tn). Show this, using the fact that the Poisson process has independent increments.
2.6 Example. Let f be a fixed, integrable random variable and Tn an arbitrary filtration. Then Xn = E(f | Tn) is a martingale.
This is an immediate consequence of the towering property of conditional expectations, which gives that E(X„| Tm) = E(E(f | Tn)\Tm) = E(f | Tm) for every m <n.
10
2:  Discrete Time Martingales
By Theorem 1.18 this martingale X is uniformly integrable. Later we shall see that any uniformly integrable martingale takes this form. Moreover, we can choose f such that Xn 0%. ? as n —> oo. □
It is part of the definition of a martingale X that every of the random variables Xn is integrable. If sup„E|X„| < oo, then we call the martingale Li-bounded. If E|X„|P < oo for ally n and some p, then we call X an Lp-martingale and if supnE|X„|p < oo, then we call X Lp-bounded.
Warning. Some authors use the phrase "Lp-martingale" for a martingale that is bounded in Lp(U,!F,P). To avoid this confusion, it is perhaps better to use the more complete phrases "martingale in Lp" and "martingale that is bounded in Lp".
2.7 Lemma. If cf>: R —)• R is convex and X a martingale, then {(f>(Xn)} is a submartingale relative to the same filtration, provided that 4>(Xn) is integrable for every n.
Proof. Because a convex function is automatically measurable, the variable cj)(Xn) is adapted for every n. By Jensen's inequality E(0(X„)| Tm) > 0(E(X„| Tm)) almost surely. The right side is <j>(Xm) almost surely if m < n, by the martingale property. ■
2.8 EXERCISE. If 0: R -)• R is convex and nondecreasing and X is a submartingale, then {(j)(Xn)} is a submartingale relative to the same filtration, provided that 4>(Xn) is integrable for every n. Show this.
2.2  Stopped Martingales
If Xn is interpreted as the total gain at time n, then a natural question is if we can maximize profit by quitting the game at a suitable time. If Xn is a martingale with FjXq = 0 and we quit at a fixed time T, then our expected profit is EXr = EX0 = 0 and hence quitting the game does not help. However, this does not exclude the possibility that stopping at a random time might help. This is the gambler's dream.
If we could let our choice to stop depend on the future, then it is easy to win. For instance, if we were allowed to stop just before we incurred a big loss. This we prohibit by considering only "stopping times" as in the following definition.
2.9 Definition. A random variable T: $7 —> Z+ on ($7, T, {Tn}, P) is a stopping time if {T <n) € Tn for every n > 0.
Warning.  A stopping time is permitted to take the value oo.
2.2: Stopped Martingales 11
2.10 EXERCISE. Let X be an adapted stochastic process and let B C ffi be measurable. Show that T = mi{n:Xn € B} defines a stopping time. (Set inf 0 = oo.)
2.11 EXERCISE. Show that T is a stopping time if and only if {T = n) € Tn for all n € N U {0}.
The restriction to stopping times is natural. If we are to stop playing at time T, then for every time n = 0,1,2 ... we must know if T = n at time n. If the filtration is generated by the process X, then the event {T = n} must, for every n, depend on the history Xq, ... ,Xn of the process up to time n only, if T is a stopping time. So we are allowed to base our decision to stop on the past history of gains or losses, but not on future times.
The question now is if we can find a stopping time T such that EXr > 0. We shall see that this is usually not the case. Here the random variable Xt is defined as
(2.12) (Xt)(lo) = Xt{u)(lo).
If T can take the value oo, this requires that X^ is defined.
A first step towards answering this question is to note that the stopped process XT defined by
(XT)n(u;) = Xr(w)An(w), is a martingale whenever X is one.
2.13 Theorem. If T is a stopping time and X is a martingale, then XT is a martingale.
Proof. We can write (with an empty sum denoting zero)
71
X% = Xq + ^1;<t(X; — X;_i). i=l
Hence -Xj = ln+1<T(Xn+1-Xn). The variable ln+i<T = 1 - 1t<™ is J-'n-measurable. Taking the conditional expectation relative to Tn we find that
E(Xj+1 — X%\Fn) = 17i+i<tE(X„+i — Xn\Tn) = 0, a.s.
because X is a martingale. (To be complete, also note that \X^\ < maxi<;<„ \Xi\ is integrable for every fixed n and verify that XT is a stochastic process.) ■
12
2:  Discrete Time Martingales
2.14 EXERCISE. Show that the sub- and supermartingale properties are also retained under stopping.
If the stopped process XT is a martingale, then EXj = EXrAn is constant in n. If T is bounded and EX0 = 0, we can immediately conclude that F,Xt = 0 and hence stopping does not help. For general T we would like to take the limit as n —> oo in the relation EXrA„ = 0 and obtain the same conclusion that F,Xt = 0. Here we must be careful. If T < oo we always have that Xtau -^l Xt as n —> oo, but we need some integrability to be able to conclude that the expectations converge as well. Domination of X suffices. Later we shall see that uniform integrability is also sufficient, and then we can also allow the stopping time T to take the value oo (after defining X^ appropriately).
2.15 EXERCISE. Suppose that X is a martingale with uniformly bounded increments: \Xn+\ — Xn\ < M for every n and some constant M. Show that F,Xt = 0 for every stopping time T with ET < oo.
2.3  Martingale Transforms
Another way to try and beat the system would be to change stakes. If Xn — Xn_i is the standard pay-off at time n, we could devise a new game in which our pay-off is Cn(Xn — X„_i) at time n. Then our total capital at time n is
71
(2.16) {C-X^—Y^CiiXi-Xi-i), Y0=0.
i=l
If Cn were allowed to depend on Xn — Xn_i, then it would be easy to make a profit. We exclude this by requiring that Cn may depend on knowledge of the past only.
2.17 Definition. A stochastic process C on (U,!F,{!Fn},P) is predictable if Cn is Tn-\ measurable for every n > 1.
The process C ■ X in (2.16) is called a martingale transform of X (if X is a martingale). It is the discrete time version of the stochastic integral that we shall be concerned with later. Again we cannot beat the system: the martingale transform is a martingale.
2.4: Doob's Upcrossing Inequality
13
2.18 Theorem. Suppose that Cn € Lv(tt,T,P) and Xn € Lq(tt,T,P) for all n and some p-1 + q~x = 1.
(i) If C is predictable and X a martingale, then C ■ X is a martingale.
(ii) If C is predictable and nonnegative and X is a supermartingale, then C ■ X is a supermartingale.
Proof. If Y = C ■ X, then Yn+i — Yn = Cn(Xn+i — Xn). Because Cn is .^-measurable, E(Y„+i — Yn\ Tn) = C„E(X7l+i — Xn\ Tn) almost surely. Both (i) and (ii) are now immediate. ■
2.4 Doob's Upcrossing Inequality
Let a < b be given numbers. The number of upcrossings of the interval [a, b] by the process X in the time interval {0,1,..., n} is defined as the largest integer k for which we can find
0 < si < t\ < s-2 < t-2 < ■ ■ ■ < Sk < tk < n,
with
XSi <a,   Xti>b,      i = 1,2, ...,k.
The number of upcrossings is denoted by Un[a, b]. The definition is meant to be "w"-wise and hence Un[a, b] is a function on fi. Because the description involves only finitely many steps, Un[a, b] is a random variable.
A high number of upcrossings of [a, b] indicates that X is "variable" around the level [a,b]. The upcrossing numbers Un[a,b] are therefore an important tool to study convergence properties of processes. For super-martingales Doob's lemma gives a surprisingly simple bound on the size of the upcrossings, just in terms of the last variable.
2.19 Lemma. If X is a supermartingale, then
{b - a)EUn[a,b] <E(X„-a)".
Proof. We define a process C\,Ci,--. taking values "0" and "1" only as follows. If Xq > a, then Cn = 0 until and including the first time n that Xn < a, then Cn = 1 until and including the first time that Xn > b, next Cn = 0 until and including the first time that Xn < a, etcetera. If Xq < a, then Cn = 1 until and including the first time that Xn > b, then Cn = 0 etcetera. Thus the process is switched "on" and "off" each time the process X crosses the levels a or b. It is "on" during each crossing of the interval [a,b\.
14
2:  Discrete Time Martingales
We claim that
(2.20)
{b-a)Un[a,b] < (C-X)n + (X,
71
— a
where C ■ X is the martingale transform of the preceding section. To see this note that (C • X)n is the sum of all increments Xi — Xi-\ for which d = 1. A given realization of the process C is a sequence of n zeros and ones. Every consecutive series of ones (a "run") corresponds to a crossing of [a, b] by X, except possibly the final run (if this ends at position n). The final run (as every run) starts when X is below a and ends at Xn, which could be anywhere. Thus the final run contributes positively to (C • X)n if Xn > a and can contribute negatively only if Xn < a. In the last case it can contribute in absolute value never more than \Xn — a\. Thus if we add (Xn — a)~ to (C ■ X)n, then we obtain at least the sum of the increments over all completed crossings.
It follows from the description, that Cn depends on C\,..., C„_i and Xn-i only. Hence, by induction, the process C is predictable. By Theorem 2.18 the martingale transform C ■ X is a supermartingale and has nonincreasing mean E(C • X)n < E(C • X)0 = 0. Taking means across (2.20) concludes the proof. ■
2.5  Martingale Convergence
In this section we give conditions under which a (sub/super) martingale converges to a limit Xoo, almost surely or in pth mean. Furthermore, we investigate if we can add X^ to the end of the sequence Xq , X\,... and obtain a (sub/super) martingale Xo,Xl, ... ^X^ (with the definition extended to include the time oo in the obvious way).
2.21 Theorem. If Xn is a (sub/super) martingale with sup„E|X„| < oo, then there exists an integrable random variable X^ with Xn —»■ X^ almost surely.
Proof. If we can show that Xn converges almost surely to a limit X^ in [—00,00], then is automatically integrable, because by Fatou's lemma EIXool < liminf E\Xn\ < 00.
We can assume without loss of generality that Xn is a supermartingale. For a fixed pair of numbers a <b, let
that oj € Fa^. Because the rational numbers are dense in ]R, we can even find
If lim,
2.5: Martingale Convergence 15
such a < b among the rational numbers. The theorem is proved if we can show that P(Fa!b) = 0 for every of the countably many pairs (a, b) € <Qf •
Fix a < b and let Un[a,b] be the number of upcrossings of [a, b] on {0,..., n} by X. If oj € Faib, then Un[a, b] f oo as n —> oo and hence by monotone convergence F,Un[a,b] f oo if P(Fa!b) > 0. However, by Doob's upcrossing's inequality
(6- a)EUn[a,b] <E{Xn-a)~ < E\Xn - a\ < supE|X„| + \a\.
71
The right side is finite by assumption and hence the left side cannot increase to oo. We conclude that P(Fa}b) = 0. ■
2.22 EXERCISE. Let Xn be a nonnegative supermartingale. Show that sup„E|X„| < oo and hence Xn converges almost surely to some limit.
If we define X^ as lim Xn if this limit exists and as 0 otherwise, then, if X is adapted, X^ is measurable relative to the u-field
Too = ■ ■ ■)■
Then the stochastic process X0, X\,..., X^ is adapted to the filtration Tq,T\, ..., Too. We may ask whether the martingale property E(X„| T^) = Xm (for n>m) extends to the case n = oo. The martingale is then called closed. From Example 2.6 we know that the martingale Xm = ^X^ Tm) is uniformly integrable. This condition is also sufficient.
2.23 Theorem. If X is a uniformly integrable (sub/super) martingale, then there exists a random variable X^ such that Xn —> X^ almost surely and in L\. Moreoever,
(i) If X is a martingale, then Xn = ^X^] Tn) almost surely for every n>0.
(ii) If X is a submartingale, then Xn < ^X^] Tn) almost surely for every n>0.
Proof. The first assertion is a corollary of the preceding theorem and the fact that a uniformly integrable sequence of random variables that converges almost surely converges in L\ as well.
Statement (i) follows by taking the Li-limit as n —> oo in the equality Xm = E{Xn\?m), where we use that ||E(X„| Tm) -Epf^l Tm)\\x < \\Xn--^oo||i —> 0, so that the right side converges to F,(XTm).
Statement (ii) follows similarly (where we must note that L\-convergence retains ordering almost surely), or by the following argument. By the submartingale property, for every m < n, ~EXmlF < FjXnlp. By uniformly integrability of the process XIf we can take the limit as n —> oo in this and obtain that F,XmlF < EE(X„| J7m)lj? = EX^lp for every F € Tm. The right side equals EX^lp for X'm = ^X^T^) and hence
16
2:  Discrete Time Martingales
E(Xm - X'm)lF < 0 for every F € Tm. This implies that Xm - X'm < 0 almost surely. ■
2.24 Corollary. If £ is an integrable random variable and Xn = E(f | Tn) for a filtration {Tn}, then Xn —> E(f | Too) almost surely and in L\.
Proof. Because X is a uniformly integrable martingale, the preceding theorem gives that Xn —»■ X^ almost surely and in L\ for some integrable random variable Xoo, and Xn = FjIX^I Tn) for every n. The variable can be chosen Too measurable (a matter of null sets). It follows that E(f| Tn) = Xn = E^ool Tn) almost surely for every n and hence E£lp = EXooli? for every F € UnTn. But the set of F for which this holds is a u-field and hence Efl;? = EX^lp for every F € Too- This shows that X00 = E($|^00). -
The preceding theorem applies in particular to Lp-bounded martingales (for p > 1). But then more is true.
2.25 Theorem. If X is an Lp-bounded martingale (p > 1), then there exists a random variable Xoo such that Xn —»■ Xoo almost surely and in Lp.
Proof. By the preceding theorem Xn —> Xoo almost surely and in L\ and moreover F,(Xoo\Tn) = Xn almost surely for every n. By Jensen's inequality |Xn|» = \&{Xoo\Tn)^ < E(\Xoo\p\Fj) and hence E|Xn|» < E^p* for every n. The theorem follows from Lemma 1.22. ■
2.26 EXERCISE. Show that the theorem remains true if X is a nonnega-tive submartingale.
Warning. A stochastic process that is bounded in Lp and converges almost surely to a limit does not necessarily converge in Lp. For this \X\P must be uniformly integrable. The preceding theorem makes essential use of the martingale property of X. Also see Section 2.9.
2.6 Reverse Martingale Convergence
Thus far we have considered nitrations that are increasing. In this section, and in this section only, we consider a reverse filtration
T D Tq D T\ D ■ ■ ■ D Too = rin^n.
2.6: Reverse Martingale Convergence 17
2.27 Definition. An adapted, integrable stochastic process X on the reverse filtered space (fi, T, {Tn}, P) is a
(i) reverse martingale if E(Xm| Tn) = Xn a.s. for all m < n.
(ii) reverse submartingale if E(Xm| Tn) > Xn a.s. for all m <n. (ii) reverse supermartingale if E(Xm| Tn) < Xn a.s. for all m < n.
It is more insightful to say that a reverse (sub/super) martingale is a process X = (Xo,Xi,...) such that the sequence ... ,X2,Xi,Xo is a (sub/super) martingale as defined before, relative to the filtration • • • C Ti C T\ C Tq. In deviation from the definition of (sub/super) martingales, the time index ..., 2,1,0 then runs against the natural order and there is a "final time" 0. Thus the (sub/super) martingales obtained by reversing a reverse (sub/super) martingale are automatically closed (by the "final element" X0).
2.28 Example. If f is an integrable random variable and {Tn} an arbitrary reverse filtration, then Xn = E(f | Tn) defines a reverse martingale. We can include n = oo in this definition.
Because every reverse martingale satisfies Xn = E(X0|J-'7l), this is actually the only type of reverse martingale. □
2.29 Example. If {N(t):t > 0} is a standard Poisson process, and t\ > ti > ■ ■ ■ > 0 a decreasing sequence of numbers, then Xn = N(tn) — tn is a reverse martingale relative to the reverse filtration Tn = a(N(t): t <tn).
The verification of this is exactly the same as the for the corresponding martingale property of this process for an increasing sequence of times. □
That a reverse martingale becomes an ordinary martingale if we turn it around may be true, but it is not very helpful for the convergence results that we are interested in. The results on (sub/super) martingales do not imply those for reverse (sub/super) martingales, because the "infiniteness" is on the other end of the sequence. Fortunately, the same techniques apply.
2.30 Theorem. If X is a uniformly integrable reverse (sub/super) martingale, then there exists a random variable X^ such that Xn —> X^ almost surely and in mean as n —> oo. Moreover,
(i) If X is a reverse martingale, then E(Xm| Too) = X^ a.s. for every m.
(ii) If X is a reverse submartingale, then E(Xm| Too) > -^oo a.s. for every m.
Proof. Doob's upcrossings inequality is applicable to bound the number of upcrossings of Xq, ..., Xn, because Xn, Xn-\,..., Xq is a super martingale if X is a reverse supermartingale. Thus we can mimic the proof of Theorem 2.21 to prove the existence of an almost sure limit Xoo- By uniform integrability this is then also a limit in L\.
18
2:  Discrete Time Martingales
The submartingale property implies that EXmlj? > ~EXnlp for every F € Tn and n > m. In particular, this is true for every F € Too- Upon taking the limit as n —> oo, we see that EXmlj? > EX^lp for every F € Too- This proves the relationship in (ii). The proof of (i) is easier. ■
2.31 EXERCISE. Let {Tn} be a reverse filtration and f integrable. Show that E(f | Tn) —> E(f | Too) in L\ and in mean for Too = r\nTn. What if Xi,X-2,... are i.i.d.?
* 2.32 Example (Strong law of large numbers). A stochastic process X = {Xi,X-2, ■ ■ ■) is called exchangeable if for every n the distribution of (Xa(ij,...,Xa(nj) is the same for every permutation (u(l),...,u(n)) of (l,...,n). If E|Xl| < oo, then the sequence of averages Xn converges almost surely and in mean to a limit (which may be stochastic).
To prove this consider the reverse filtration Tn = a(Xn,Xn+i,...).
The u-field Tn "depends" onli,..., Xn only through X\ +----h Xn and
hence by symmetry and exchangeability E(Xi|Jr7l) is the same for i = 1,... ,n. Then
—        — 1 ™
Xn = E(Xn\Tn) = - VE(Ii|fJ=E(I1|fn), a.s..
The right side converges almost surely and in mean by the preceding theorem. □
2.33 EXERCISE. Identify the limit in the preceding example as E(Xl | T^) for      = ^u^u-
Because, by definition, a reverse martingale satisfies Xn = E(X0| Tn), a reverse martingale is automatically uniformly integrable. Consequently the preceding theorem applies to any reverse martingale. A reverse (sub/super) martingale is uniformly integrable as soon as it is bounded in L\. In fact, it suffices to verify that F,Xn is bounded below/above.
2.34 Lemma. A reverse supermartingale X is uniformly integrable if and only ifF,Xn is bounded above (in which case it increases to a finite limit as n —> ooj.
Proof. The expectations F,Xn of any uniformly integrable process X are bounded. The "if" part is the nontrivial part of the lemma. Suppose that X is a reverse supermartingale.
The sequence of expectations F,Xn is nondecreasing in n by the reverse supermartingale property. Because it is bounded above it converges to a finite limit. Furthermore, Xn > E(Xo| Tn) for every n and hence X~ is uniformly integrable, since E(X0| Tn) is. It suffices to show that X+ is
2.7: Doob Decomposition 19
uniformly integrable, or equivalently that EX„1x„>m —> 0 as M —> oo, uniformly in n.
By the supermartingale property and because {Xn < M} € Tn, for every M, N > 0 and every m < n,
EX„1x„>m = EX„ — EX„1x„<m < EX„ — F,Xmlxn<m = ~EXn — ~EXm + ~EXmlxn>m
N
< EXn - EXm + EX+lXm>N + ^X+.
We can make the right side arbitrarily small, uniformly in n > m, by first choosing m sufficiently large (so that EX„ — F,Xm is small), next choosing N sufficiently large and finally choosing M large. For the given m we can increase M, if necessary, to ensure that F,Xnlxn>m is also small for every 0 < n < m. m
* 2.7 Doob Decomposition
If a martingale is a model for a fair game, then non-martingale processes should correspond to unfair games. This can be made precise by the Doob decomposition of an adapted process as a sum of a martingale and a predictable process. The Doob decomposition is the discrete time version of the celebrated (and much more complicated) Doob-Meyer decomposition of a "semi-martingale" in continuous time. We need it here to extend some results on martingales to (sub/super) martingales.
2.35 Theorem. For any adapted process X there exists a martingale M and a predictable process A, unique up to null sets, both 0 at 0, such that Xn = X0 + Mn + An, for every n>0,
Proof. If we set A0 = 0 and An — An_i = E(X„ — Tn-i), then A is
predictable. In order to satisfy the equation, we must set
M0 =0,      Mn- Afn_i =Xn- Xn_i - E(Xn - Xn_i| ^n_i).
This clearly defines a martingale M.
Conversely, if the decomposition holds as stated, then E(X„ — Tn-i) = F,(An — An_i\ Tn-i), because M is a martingale. The right side is equal to An — An-i because A is predictable. ■
If Xn - Xn_i = (Mn - M„_i) + (A„. - An-i) were our gain in the nth game, then our strategy could be to play if An — An_i > 0 and not
20
2:  Discrete Time Martingales
to play if this is negative. Because A is predictable, we "know" this before time n and hence this would be a valid strategy. The martingale part M corresponds to a fair game and would give us expected gain zero. Relative to the predictable part we would avoid all losses and make all gains. Thus our expected profit would certainly be positive. We conclude that only martingales correspond to fair games.
From the fact that An — An-i = E(X„ — Tn-i) it is clear that
(sub/super) martingales X correspond precisely to the cases that the sample paths of A are increasing or decreasing.
2.8  Optional Stopping
Let T be a stopping time relative to the filtration Tn. Just as Tn are the events "known at time n", we like to introduce a a-field Tt of "events known at time T". This is to be an ordinary a-field. Plugging T into Tn would not do, as this would give something random.
2.36 Definition. The cr-Geld Tt is defined as the collection of all F C fi such that F C\ {T <n} £ Tn for all n € Z+. (This includes n = oo, where Too = o-(T0,Ti,...).)
2.37 EXERCISE. Show that TT is indeed a cr-field.
2.38 EXERCISE. Show that Tt can be equivalently described as the collection of all F C fi such that F n {T = n) € Tn for all n € Z+.
2.39 EXERCISE. Show that TT = Tn if T = n.
2.40 EXERCISE. Show that XT is Jr-measurable if {Xn: n € Z+} is adapted.
2.41 Lemma. Let S and T be stopping times. Then
(i) ifS < T, then Ts C TT.
(ii) TsnTT = Tsat-
Proof, (i). If S < T, then F n {T < n} = (F n {S < n}) n {T < n}. If F € Ts, then F n {S < n} € Tn and hence, because always {T < n} € Tn, the right side is in Tn. Thus F € Tt-
(ii). By (i) we have Tsat C TsC\Tt- Conversely, if F € TsC\Tt, then F n {S A T < n) = (F n {S < n}) U (F n {T < n}) € Tn for every n and hence F € Tsat- ■
2.8: Optional Stopping 21
If the (sub/super) martingale X is uniformly integrable, then there exists an integrable random variable X^ such that Xn —»■ X^ almost surely and in mean, by Theorem 2.23. Then we can define Xt as in (2.12), also if T assumes the value oo. The optional stopping theorem shows that in this case we may replace the fixed times m < n in the defining martingale relationship E(X„| Tm) = Xm by stopping times S <T.
2.42 Theorem. If X is a uniformly integrable supermartingale, then Xt is integrable for any stopping time T. Furthermore,
(i) IfT is a stopping time, then E(-Xoo| Tt) < Xt a.s..
(ii) If S <T are stopping times, then E(Xr| Ts) < Xs a.s..
Proof. First we note that Xt is ^r-measurable (see Exercise 2.40). For (i) we wish to prove that EX^If < EXtIf for all F € Tt- Now
EXooli? = '&^jX00\p\T=n = ^ FiX001fIt=7i,
71=0 71=0
by the dominated convergence theorem. (The "+" in the upper limit oo+ of the sums indicates that the sums also include a term n = oo.) Because F n {T = n} € Tn and E^ool Tn) < Xn for every n, the supermartingale property gives that the right side is bounded above by
^ 'EXnlF'i-T=n = EXtIf,
71=0
if Xt is integrable, by the dominated convergence theorem. This gives the desired inequality and concludes the proof of (i) for any stopping time T for which Xt is integrable.
If T is bounded, then \XT\ < \Xm,\ for n an upper bound on
T and hence Xt is integrable. Thus we can apply the preceding paragraph to see that E^ool Ttav) < ^ta» almost surely for every n. If X is a martingale, then this inequality is valid for both X and —X and hence, for every n,
Xtati = E^ool Ttav,), a-s-
for every n. If n —> oo the left side converges to Xt- The right side is a uniformly integrable martingale that converges to an integrable limit in L\ by Theorem 2.23. Because the limits must agree, Xt is integrable.
Combining the preceding we see that Xt = E(X00|Jr7') for every stopping time T if X is a uniformly integrable martingale. Then for stopping times S <T the towering property of conditional expectations gives E{X\TS) = E(E(X00|Tt)\Ts) = E(X(X)|.Fs), because Ts C TT- Applying (i) again we see that the right side is equal to Xs- This proves (ii) in the case that X is a martingale.
22
2:  Discrete Time Martingales
To extend the proof to supermartingales X, we employ the Doob decomposition Xn = X0 + Mn — An, where M is a martingale with Mq = 0 and A is a nondecreasing (predictable) process with A0 = 0. Then F,An = F,Xo — EX„ is bounded if X is uniformly integrable. Hence Ave = limA^ is integrable, and A is dominated (by A^) and hence uniformly integrable. Then M must be uniformly integrable as well, whence, by the preceding, Mt is integrable and E(Mt| Ts) = Mt- It follows that Xt = Xo + Mt — At is integrable. Furthermore, by linearity of the conditional expectation, for S <T,
because As < At implies that As < E(At\Ts) almost surely. This concludes the proof of (ii). The statement (i) (with S playing the role of T) is the special case that T = oo. ■
One consequence of the preceding theorem is that F,Xt = EX0, whenever T is a stopping time and X a uniformly integrable martingale.
Warning. The condition that X be uniformly integrable cannot be omitted.
2.9  Maximal Inequalities
A maximal inequality for a stochastic process X is a bound on some aspect of the distribution of sup„ Xn. Suprema over stochastic processes are usually hard to control, but not so for martingales. Somewhat remarkably, we can bound the norm of sup„ Xn by the supremum of the norms, up to a constant.
We start with a probability inequality.
2.43 Lemma. If X is a submartingale, then for any x > 0 and every n £ Z+,
Proof. We can write the event in the left side as the disjoint union \J™=0Fi of the events
Because Fi € Ti, the submartingale property gives EX„1 pt > EXili?; > xP(Fi), because Xi > x on Fi. Summing this over i = 0,1,..., n yields the result. ■
E(Xr| Ts) =X0+ E(MT| Ts) - E{AT\Ts) <X0 + MS-AS= Xs,
F0 = {X0 > x],   Fx = {X0 < x, Xx > x],
F2 = {X0 < x, Xx < x, X2 > x},....
2.9: Maximal Inequalities 23
2.44 Corollary. If X is a nonnegative submartingale, then for any p > 1 and p_1 + q~x = 1, and every n € Z+,
max Xi
0<i<n
< q\\X„
If X is bounded in Lp(U,!F,P), then Xn —> X^ in Lp for some random
variable X^ and
SWpXn
< qWXooWp = gsup \\Xn
Proof. Set Yn
tion),
maxo<i<„Xi. By Fubini's theorem (or partial integra-
/•oo /*oo
Ey,f = /   pxv~xP{Yn >x)dx< / pxp-2EXnl Jo Jo
Yn>:
: dX,
by the preceding lemma. After changing the order of integration and expectation, we can write the right side as
pF,(xnJ xp-2dx^
P
■EJW1.
10 < P-l
Herep/(p-l) = q and EJ^YP"1 < H^HpH^-1 ||g bY Holder's inequality. Thus EYP < H^llpliyp-1!!,. If Yn € Lp(n,T,P), then we can rearrange this inequality to obtain the result.
This rearranging is permitted only if EYP < oo. By the submartingale property 0 < Xi < E(X„| Ti), whence EXf < EXP, by Jensen's inequality. Thus EYP is finite whenever EXP is finite, and this we can assume without loss of generality.
Because X is a nonnegative submartingale, so is Xp and hence the sequence EXP is nondecreasing. If X is Lp-bounded (for p > 1), then it is uniformly integrable and hence Xn —»■ X^ almost surely for some random variable X^, by Theorem 2.23. Taking the limit as n —> oo in the first assertion, we find by the monotone convergence theorem that
EsupXP = EYP = lim EY„p < <f lim EX£ = <?psupEX£.
The supremum on the left does not increase if we extend it to n € Z+. Because \Xn — X\ is dominated by 2Yoo, we find that Xn —> X^ also in Lp
and hence EXP = lim,,
,EXP.
The results of this section apply in particular to the submartingales formed by applying a convex function to a martingale. For instance, \X\, X2 or eaX for some a > 0 and some martingale X. This yields a wealth of useful inequalities. For instance, for any martingale X,
sup|X„|    < 2sup||X„||2.
24        2:  Discrete Time Martingales
2.45 EXERCISE. Let Y\,Yi, ■ ■ ■ be an i.i.d. sequence of random variables with mean zero, set Sn = Y^i=i^i- Show that Emaxi<;<„ £? < 4ES^.
3
Discrete Time Option Pricing
In this chapter we discuss the binary tree model for the pricing of "contingent claims" such as options, due to Cox, Ross and Rubinstein. In this model the price 5*71 of a stock is evaluated and changes at the discrete time instants n = 0,1,... only and it is assumed that its increments Sn — Sn-i can assume two values only. (This is essential; the following would not work if the increments could assume e.g. three values.) We assume that S is a stochastic process on a given probability space and let Tn be its natural filtration.
Next to stock the model allows for bonds. A bond is a "risk-free investment", comparable to a deposit in a savings account, whose value increases deterministically according to the relation
the constant rn > 0 being the "interest rate" in the time interval (n — l,n). A general name for both stock and bond is "asset".
A "portfolio" is a combination of bonds and stocks. Its contents may change over time. A portfolio containing An bonds and Bn stocks at time n possesses the value
A pair of processes (A,B), giving the contents over time, is an "investment strategy" if the processes are predictable. We call a strategy "self-financing" if after investment of an initial capital at time 0, we can reshuffle the portfolio according to the strategy without further capital import. Technically this requirement means that, for every n > 1,
(3.2) AnRn-i + BnSn-i = An-iRn-i + Bn-iSn-i-
Thus the capital V„_i at time n — 1 (on the right side of the equation) is used in the time interval (n — l,n) to exchange bonds for stocks or vice
Rn = (1 + rn)Rn-i,
Ro = 1,
(3.1)
26        3:  Discrete Time Option Pricing
versa at the current prices Rn-i and Sn-\. The left side of the equation gives the value of the portfolio after the reshuffling. At time n the value changes to Vn = AnSn + BnSn, due to the changes in the values of the underlying assets.
A "derivative" is a financial contract that is based on the stock. A popular derivative is the option, of which there are several varieties. A "European call option" is a contract giving the owner of the option the right to buy the stock at some fixed time N (the "term" or "expiry time" of the option) in the future at a fixed price K (the "strike price"). At the expiry time the stock is worth Sn- If Sn > K, then the owner of the option will exercise his right and buy the stock, making a profit of Sn — K. (He could sell off the stock immediately, if he wanted to, making a profit of Sn — K.) On the other hand, if Sn < K, then the option is worthless. (It is said to be "out of the money".) If the owner of the option would want to buy the stock, he would do better to buy it on the regular market, for the price Sn, rather than use the option.
What is a good price for an option? Because the option gives a right and no obligation it must cost money to get one. The value of the option at expiry time is, as seen in the preceding discussion, (Sn — However, we want to know the price of the option at the beginning of the term. A reasonable guess would be F,(Sn — K)+, where the expectation is taken relative to the "true" law of the stock price Sn- We don't know this law, but we could presumably estimate it after observing the stock market for a while.
Wrong! Economic theory says that the actual distribution of Sn has nothing to do with the value of the option at the beginning of the term. This economic reasoning is based on the following theorem.
Recall that we assume that possible values of the stock process S form a binary tree. Given its value Sn-i at time n — 1, there are two possibilities for the value Sn. For simplicity of notation assume that
where an and bn are known numbers. We assume that given Tn-i each of the two possibilities is chosen with fixed probabilities l—pn andpn. We do not assume that we know the "true" numbers pn, but we do assume that we know the numbers (an, bn). Thus, for n > 1,
(3.3)
P(Sn — dnSn-l P(Sn = bnSn-l
) = 1-P-) = Pn-
(Pretty unrealistic, this, but good exercise for the continuous time case.) It follows that the complete distribution of the process S, given its value So at time 0, can be parametrized by a vector p = (pi,... ,pn) of probabilities.
3:  Discrete Time Option Pricing 27
3.4 Theorem. Suppose that 0 < an < 1 + rn < bn for all n and nonzero numbers an,bn. Then there exists a unique self-financing strategy (A,B) with value process V (as in (3.1)) such that
(i) V>0.
(ii) VN = (SN-K)+.
This strategy requires an initial investment of
(iii) V0_=ER]f1{SN-K)+,
where E is the expectation under the probability measure defined by (3.3) with p = (pi,... ,pn) given by
l+rn-an Pn- =    , -•
On ®n
The values p are the unique values in (0,1) that ensure that the process S defined by Sn = R^Sn is a martingale.
Proof. By assumption, given Tn-\, the variable Sn is supported on the points anSn-i and bnSn-i with probabilities 1 — pn and pn. Then
E(5„| Fn-l) = ~Pn)an +Pnbn)Sn-l-
This is equal to Sn-i = Rn-i^n-i if and only if
a\         ,        ,              Rn           i    ,                                              1 ~r~ fn Q>n ~ Pnjdn + Pnbn =   p- = 1 + rn, -B-      Pn =      ,    _-•
ftn—1 ^n Ojn
By assumption this value of pn is contained in (0,1). Thus there exists a unique martingale measure, as claimed.
The process Vn = F,(Rjf1(SN — K)+\Tn) is a j5-martingale. Given Tn-\ the variables Vn — Vn-i and Sn — Sn-i are both functions of Sn/Sn-i and hence supported on two points (dependent on Tn-i). (Note that the possible values of Sn are So times a product of the numbers an and bn and hence are nonzero by assumption.) Because these variables are martingale differences, they have conditional mean zero under pn. Together this implies that there exists a unique .T^-i-measurable variable Bn (given Tn-\ this is a "constant") such that (for n > 1)
(3.5) Vn-Vn-1=Bn(Sn-Sn-1). Given this process B, define a process A to satisfy
(3.6) AnRn_i + BnSn-i = Rn-\Vn-\.
Then both the processes A and B are predictable and hence (A, B) is a strategy. (The values (Aq,Bq) matter little, because we change the portfolio to (Ai,B{) before anything happens to the stock or bond at time 1; we can choose (A0,B0) = (A^Bi).)
28        3:  Discrete Time Option Pricing
The preceding displays imply
An + BnSn-i = Vn-i,
An + BnSn = Vn-i + Bn(Sn - Sn-i) = Vn,      by (3.5),
Rn-A-n + BnSn = RnVn.
Evaluating the last line with n — 1 instead of n and comparing the resulting equation to (3.6), we see that the strategy {A,B) is self-financing.
By the last line of the preceding display the value of the portfolio (An,Bn) at time n is
Vn = RnVn = RnE(R^{SN - K)+\Tn).
At time n this becomes Vn = (Sn — K)+. At time 0 the value is Vq = RoF,Rn1(Sn - K)+. That V > 0 is clear from the fact that V > 0, being a conditional expectation of a nonnegative random variable.
This concludes the proof that a strategy as claimed exists. To see tat it is unique, suppose that (A, B) is an arbitrary self-financing strategy satisfying (i) and (ii). Let Vn = AnRn + BnSn be its value at time n, and define Sn = R^Sn and Vn = i?"1 V„, all as before. By the first paragraph of the proof there is a unique probability measure p making S into a martingale. Multipyling the self-financing equation (3.2) by R~\, we obtain (for n > 1)
Vn-i = An + BnSn-i = An_i + Bn_iSn-i.
Replacing n — 1 by n in the second representation of Vn-i yields Vn = An+BnSn. Subtracting from this the first representation of Vn-i, we obtain that
Vn — Vn-1 = Bn(Sn ~ Sn-l)-
Because S is a p-martingale and B is predictable, V is a j5-martingale as well. In particular, Vn = E(VV| Fn) for every n < N. By (ii) this means that V is exactly as in the first part of the proof. The rest must also be the same. ■
A strategy as in the preceding theorem is called a "hedging strategy". Its special feature is that given an initial investment of Vq at time zero (to buy the portfolio (A0,B0)) it leads with certainty to a portfolio with value (Sn — K)+ at time N. This is remarkable, because S is a stochastic process. Even though we have limited its increments to two possibilities at every time, this still allows 2^ possible sample paths for the process Si,...,Sn, and each of these has a probability attached to it in the real world. The hedging strategy leads to a portfolio with value (Sn — K)+ at time N, no matter which sample path the process S will follow.
The existence of a hedging strategy and the following economic reasoning shows that the initial value Vq = F,Rn1(Sn — K)+ is the only right price for the option.
3:  Discrete Time Option Pricing 29
First, if the option were more expensive than Vo, then nobody would buy it, because it would cost less to buy the portfolio (A0,B0) and go through the hedging strategy. This is guaranteed to give the same value (Sn — K)+ at the expiry time, for less money.
On the other hand, if the option could be bought for less money than Vo) then selling a portfolio (Aq,Bq) and buying an option at time 0 would yield some immediate cash. During the term of the option we could next implement the inverse hedging strategy: starting with the portfolio (—A0,— B0) at time 0, we reshuffle the portfolio consecutively at times n = 1,2,..., N to (—An, —Bn). This can be done free of investment and at expiry time we would possess both the option and the portfolio (—Am, — Bn), i.e. our capital would be —Vjv + {Sn — K)+, which is zero. Thus after making an initial gain of Vo minus the option price, we would with certainty break even, no matter the stock price: we would be able to make money without risk. Economists would say that the market would allow for "arbitrage". But in real markets nothing comes free; real markets are "arbitrage-free".
Thus the value Vo = FiRn1(Sn — K)+ is the only "reasonable price". As noted before, this value does not depend on the "true" values of the probabilities (pi,... ,pn)'- the expectation must be computed under the "martingale measure" given by (fti,... ,pn). It depends on the steps («i, bi,..., an, bn), the interest rates rn, the strike price K and the value So of the stock at time 0. The distribution of Sn under p is supported on at most 2^ values, the corresponding probabilities being (sums of) products over the probabilities pi. We can write out the expectation as a sum, but this is not especially insightful. (Below we compute a limiting value, which is more pleasing.)
The martingale measure given by p is the unique measure (within the model (3.3)) that makes the "discounted stock process" R^Sn into a martingale. It is sometimes referred to as the "risk-free measure". If the interest rate were zero and the stock process a martingale under its true law, then the option price would be exactly the expected value F,(Sn — K)+ of the option at expiry term. In a "risk-free world we can price by expectation".
The discounting of values, the premultiplying with i?"1 = Yl7=i 0-+ri), expresses the "time value of money". A capital v at time 0 can be increased to a capital Rnv at time n in a risk-free manner, for instance by putting it in a savings account. Then a capital v that we shall receive at time n in the future is worth only R~xv today. For instance, an option is worth (SN - K)+ at expiry time N, but only RJ^^n - K)+ at time 0. The right price of the option is the expectation of this discounted value "in the risk-free world given by the martingale measure".
The theorem imposes the condition that an < 1 + rn < bn for all n. This condition is reasonable. If we had a stock at time n — 1, worth Sn-i, and kept on to it until time n, then it would change in value to either anSn-i or bnSn-i. If we sold the stock and invested the money in bonds,
30        3:  Discrete Time Option Pricing
then this capital would change in value to (1 + rn)Sn-i. The inequality 1 + rn < an < bn would mean that keeping the stock would always be more advantageous; nobody would buy bonds. On the other hand, the inequality dn < bn < 1 + rn would mean that investing in bonds would always be more advantageous. In both cases, the market would allow for arbitrage: by exchanging bonds for stock or vice versa, we would have a guaranteed positive profit, no matter the behaviour of the stock. Thus the condition is necessary for the market to be "arbitrage-free".
3.7 EXERCISE. Extend the theorem to the cases that:
(i) the numbers (an, bn) are predictable processes.
(ii) the interest rates rn form a stochastic process.
3.8 EXERCISE. Let ei,ei,... be i.i.d. random variables with the uniform distribution on {—1,1} and set Xn = ^2™=1£i- Suppose that Y is a martingale relative to Tn = a(Xi,... ,Xn). Show that there exists a predictable process C such that Y = Y0 + C ■ X.
We might view the binary stock price model of this section as arising as a time discretization of a continuous time model. Then the model should become more realistic by refining the discretization. Given a fixed time t > 0, we might consider the binary stock price model for (So, Si,..., Sn) as a discretization on the grid 0, T/N, 2T/N,..., T. Then it would be reasonable to scale the increments (an, bn) and the interest rates rn, as they will reflect changes on infinitesimal intervals as N —> oo. Given T > 0 consider the choices
These choices can be motivated from the fact that the resulting sequence of binary tree models converges to the continuous time model that we shall discuss later on.
Combining (3.3) and (3.9) we obtain that the stock price is given by
where Xn is the number of times the stock price goes up in the time span
It is thought that a realistic model for the stock market has jumps up and down with equal probabilities. Then Xn is binomially (N, |)-distributed and the "log returns" satisfy
(3.9)
1 + rn,n
erT/N.
Sn = Soe^T + aVT{2XN-N)),
1,2,...,JV.
log -JL = vT + o-VT So
XN - N/2 VN/2
N(nT,a2T),
3:  Discrete Time Option Pricing 31
by the Central limit theorem. Thus in the limit the log return at time T is log normally distributed with drift fiT and variance a2T.
As we have seen the true distribution of the stock prices is irrelevant for pricing the option. Rather we need to repeat the preceding calculation using the martingale measure p. Under this measure Xn is binomially(iV, Jjjv) distributed, for
erT/N _ ^T/N-ay/T/N
PN = -7=-7=
e^T/N+a^/T/N _ e^T/N-a^/T/N
= 1-1, [Z (»+1^-r) + 0(l.)
2    2\N\       a       J+ U\NJ' by a Taylor expansion. Then j5jv(1 — Pn) —)■ 1/4 and
-» N({r-l<r2)T,a2T).
Thus, under the martingale measure, in the limit the stock at time T is log normally distributed with drift (r — \<t2)T and variance a2T.
Evaluating the (limiting) option price is now a matter of straightforward integration. For an option with expiry time T and strike price K it is the expectation of e~rT(ST — where log^-r/Sb) possesses the log normal distribution with parameters (r — ^cr2)T and variance a2T. This can be computed to be
log(S0/JQ + (r + \a2)T^    ^_rT^ AoS(S0/K) + (r - \a2)T
This is the formula found by Black and Scholes in 1973 using a continuous time model. We shall recover it later in a continuous time set-up.
4
Continuous Time Martingales
In this chapter we extend the theory for discrete time martingales to the continuous time setting. Besides much similarity there is the important difference of dealing with uncountably many random variables, which is solved by considering martingales with cadlag sample paths.
4.1  Stochastic Processes
A stochastic process in continuous time is a collection X = {Xt: t > 0} of random variables indexed by the "time" parameter t € [0, oo) and defined on a given probability space (fi, T, P). Occasionally we work with the extended time set [0, oo] and have an additional random variable Xoa.
The finite-dimensional marginals of a process X are the random vectors (Xtl,..., Xtk), for t\,..., tk ranging over the time set and fc € N, and the marginal distributions of X are the distributions of these vectors. The maps 11—^ Xt(uj), for oj € fi, are called sample paths. Unless stated otherwise the variables Xt will be understood to be real-valued, but the definitions apply equally well to vector-valued variables.
Two processes X and Y defined on the same probability space are equivalent or each other's modification if (XtlXtk) = (Ytl,..., Ytk) almost surely. They are indistinguishable iiP(Xt = Yt, Vt) = 1. Both concepts express that X and Y are the "same", but indistinguishability is quite a bit stronger in general, because we are working with an uncountable set of random variables. However, if the sample paths of X and Y are determined by the values on a fixed countable set of time points, then the concepts agree. This is the case, for instance, if the sample paths are continuous, or more generally left- or right continuous. Most of the stochastic processes that we shall be concerned with possess this property. In particular, we
4.1: Stochastic Processes 33
often consider cadlag processes (from "continu a droite, limite a gauche"): processes with sample paths that are right-continuous and have limits from the left at every point t > 0. If X is a left- or right-continuous process, then
Xt- =   lim Xa,       and       Xt+ =   lim Xs
define left- and right-continuous processes. These are denoted by X_ and X+ and referred to as the left- or right-continuous version of X. The difference AX: = X+ — X- is the jump process of X. The variable X0- can only be defined by convention; it will usually be taken equal to 0.
A filtration in continuous time is a collection {Tt}t>o of sub u-fields of T such that Ts C Tt whenever s < t. A typical example is the natural filtration Tt = a(Xs:s < t) generated by a stochastic process X. A stochastic process X is adapted to a filtration {Tt} if Xt is .^-measurable for every t. The natural filtration is the smallest filtration to which X is adapted. We define Too = cr(Tt:1 > 0). As in the discrete time case, we call a probability space equipped with a filtration a filtered probability space or a "stochastic basis". We denote it by (fi,T, {Tt},P), where it should be clear from the notation or the context that t is a continuous parameter.
Throughout, without further mention, we assume that the probability space (fi, T, P) is complete. This means that every subset of a null set (a null set being a set F € T with P(F) = 0) is contained in T (and hence is also a null set). This is not a very restrictive assumption, because we can always extend a given u-field and probability measure to make it complete. (This will make a difference only if we would want to work with more than one probability measure at the same time.)
We also always assume that our filtration satisfies the usual conditions: for all t > 0:
(i) (completeness): Tt contains all null sets.
(ii) (right continuity): Tt = C\s>tTs.
The first condition can be ensured by completing a given filtration: replacing a given Tt by the u-field generated by Tt and all null sets. The second condition is more technical, but turns out to be important for certain arguments. Fortunately, the (completions of the) natural nitrations of the most important processes are automatically right continuous. Furthermore, if a given filtration is not right continuous, then we might replace it by the the filtration C\s>tTs, which can be seen to be right-continuous.
Warning. The natural filtration of a right-continuous process is not necessarily right continuous.
Warning-. When completing a filtration we add all null sets in (U,T,P) to every Tt. This gives a bigger filtration than completing the space (U,Tt,P) for every t > 0 separately.
4.1 EXERCISE (Completion). Given an arbitrary probability space (fi, T, P), let T be the collection of all sets F U N for F ranging over T
34        4:  Continuous Time Martingales
and n ranging over all subsets of null sets, and define P(f U n) = P(f). Show that (fi, T, P) is well-defined and a probability space.
4.2 EXERCISE. Let (fi, T, P) be a complete probability space and T0 C T a sub u-field. Show that the u-field generated by Tq and the null sets of (fi, T, P) is the collection of all f € T such that there exists f0 € Tq with P(f af0) = 0; equivalently, all f € T such that there exists f0 € Tq and null sets n, n' with f0 — n c f c f0u n'.
4.3 EXERCISE. Show that the completion of a right-continuous filtration is right continuous.
4.4 EXERCISE. Show that the natural filtration of the Poisson process is right continuous. (More generally, this is true for any counting process.)
4.2 Martingales
The definition of a martingale in continuous time is an obvious generalization of the discrete time case.
4.5 Definition. An adapted, integrable stochastic process X on the filtered space (fi, T, {Tt}, P) is a
(i) martingale if E(X4| Ts) = Xs a.s. for all s <t.
(ii) submartingale if E(X4| Ts) > Xs a.s. for all s <t. (ii) supermartingale if E(X4| Ts) < Xs a.s. for all s <t.
The (sub/super) martingales that we shall be interested in are cadlag processes. It is relatively straightforward to extend results for discrete time martingales to these, because given a (sub/super) martingale X:
(i) If 0 < ti < ti < ■ ■ ■, then Yn = Xtn defines a (sub/super) martingale relative to the filtration Qn = Ttn ■
(ii) If to > h > ■■■ > 0, then Yn = Xtn defines a reverse (sub/super) martingale relative to the reverse filtration Qn = Ttn ■
Thus we can apply results on discrete time (sub/super) martingales to the discrete time "skeletons" Xtn formed by restricting X to countable sets of times. If X is cadlag, then this should be enough to study the complete sample paths of X.
The assumption that X is cadlag is not overly strong. The following theorem shows that under the simple condition that the mean function 11—^ EXt is cadlag, a cadlag modification of a (sub/super) martingale always exists. Because we assume our nitrations to be complete, such a modification is automatically adapted. Of course, it also satisfies the (sub/super)
4.2: Martingales 35
martingale property and hence is a (sub/super) martingale relative to the original filtration. Thus rather than with the original (sub/super) martingale we can work with the modification.
We can even allow nitrations that are not necessarily right-continuous. Then we can both replace X by a modification and the filtration by its "right-continuous version" Tt+ = r\s>tTs and still keep the (sub/super) martingale property, provided that X is right continuous in probability. (This is much weaker than right continuous.) In part (ii) of the following theorem, suppose that the filtration is complete, but not necessarily right-continuous.
4.6 Theorem. Let X be a (sub/super) martingale relative to the complete filtration {Tt}.
(i) If the filtration {Tt} is right continuous and the map 1F,Xt is right continuous, then there exists a cadlag modification of X.
(ii) If X is right continuous in probability, then there exists a modification of X that is a cadlag (sub/super) martingale relative to the filtration
Proof. Assume without loss of generality that X is a super martingale. Then Xs > ¥j(Xt\Ts) almost surely for every s < t, whence X~ < Fj(Xt\Ts) almost surely and hence {^s~:0 < s < t} is uniformly integrate. Combined with the fact that t F,Xt is decreasing and hence bounded on compacts, if follows that F,\Xt\ is bounded on compacts. For fixed T and every a <b, define the event
Fab = \oj\3t € [0,T): liminf Xs(oj) <a<b< limsupXs(w),
or     lim inf Xs (oj) < a < b < lim sup Xs (oj) >
(The symbol s ft t denotes a limit as s f t with s restricted to s < t.) Let Q fl [0,T) = {h,t2,...} and let Un[a,b] be the number of upcrossings of [a, b] by the process XtlXtn put in its natural time order. If a; € Fa^, then Un[a,b] t oo. However, by Doob's upcrossings lemma E£/„[a, b] < sup0<i<TE|Xt| + \a\. We conclude that P(Fa!b) = 0 for every a < b and hence the left and right limits
Xt- =    lim   Xs,       Xt+ =    lim Xs
exist for every t € [0, T), almost surely. If we define these processes to be zero whenever one of the limits does not exist, then Xt+ is J-"t+-adapted. Moreover, from the definitions Xt+ can be seen to be right-continuous with left limits equal to Xt-. By Fatou's lemma Xt+ is integrable.
We can repeat this for a sequence Tn f oo to show that the limits Xt-and Xt+ exist for every t € [0, oo), almost surely. Setting Xt+ equal to zero
36        4:  Continuous Time Martingales
on the exceptional null set, we obtain a cadlag process that is adapted to
By the super martingale property F,Xslp > F,XtlF for every F € Ts and s < t. Given a sequence of rational numbers tn 44 t, the sequence {Xtn } is a reverse super martingale. Because EXtn is bounded above, the sequence is uniformly integrable and hence Xtn —> Xt+ both almost surely (by construction) and in mean. We conclude that F,Xslp > F,Xt+lF for every F € Ts and s < t. Applying this for every s = sn and sn a sequence of rational numbers decreasing to some fixed s, we find that ~EXs+lp > ~EXt+lF for every F € T8+ = C\nF8n and s < t. Thus {Xt+:t > 0} is a supermartingale relative to Tt+.
Applying the first half of the argument of the preceding paragraph with s = t we see that F,XtlF > EXt+l;r for every F € Tt- If Tt+ = Tt, then Xt—Xt+ is ^-measurable and we conclude that Xt—Xt+ > 0 almost surely. If, moreover, t F,Xt is right continuous, then F,Xt = lim™-^ F,Xtn = EXt+, because Xtn —> Xt+ in mean. Combined this shows that Xt = Xt+ almost surely, so that Xt+ is a modification of X. This concludes the proof of (i).
To prove (ii) we recall that Xt+ is the limit in mean of a sequence Xtn for tn 4-4- * - If X is right continuous in probability, then Xtn —> Xt in probability. Because the limits in mean and in probability must agree almost surely, it follows that Xt = Xt+ almost surely. ■
In particular, every martingale (relative to a "usual filtration") possesses a cadlag modification, because the mean function of a martingale is constant and hence certainly continuous.
4.7 Example. If for a given filtration {Tt} and integrable random variable f we "define" Xt = E(f| Tt), then in fact Xt is only determined up to a null set, for every t. The union of these null sets may have positive probability and hence we have not defined the process X yet. Any choice of the conditional expectations Xt yields a martingale X. By the preceding theorem there is a choice such that X is cadlag. □
4.8 EXERCISE. Given a standard Poisson process {Nt:t > 0}, let Tt be the completion of the natural filtration a(Ns:s < t). (This can be proved to be right continuous.) Show that:
(i) The process Nt is a submartingale.
(ii) The process Nt — t is a martingale.
(hi) The process (Nt — t)2 — t is a martingale.
4.9 EXERCISE. Show that every cadlag super martingale is right continuous in mean. (Hint: use reverse super martingale convergence, as in the proof of Theorem 4.6.)
4.3: Martingale Convergence 37
4.3  Martingale Convergence
The martingale convergence theorems for discrete time martingales extend without surprises to the continuous time situation.
4.10 Theorem. If X is a uniformly integrable, cadlag (sub/super) martingale, then there exists an integrable random variable X^ such that Xt —>       almost surely and in L\ as t —> oo.
(i) If X is a martingale, then Xt = E(Xoc| J^) a.s. for all t > 0.
(ii) If X is a submartingale, then Xt < E(Xoc| J^) a.s. for t > 0. Furthermore, if X is Lp-bounded for some p > 1, then Xt —> Xoo also in
Proof. In view of Theorems 2.23 and 2.25 every sequence Xtn for t\ < ti < ■■■—>■ oo converges almost surely, in L\ or in Lp to a limit X^. Then we must have that Xt —> X^ in L\ or in Lp as t —> oo. Assertions (i) and (ii) follow from Theorem 2.23 as well.
The almost sure convergence of Xt as t —> oo requires an additional argument, as the null set on which a sequence Xtn as in the preceding paragraph may not converge may depend on the sequence {tn}. In this part of the proof we use the fact that X is cadlag. As in the proof of Theorem 2.21 it suffices to show that for every fixed numbers a < b the event
Fab = \ u: lim inf Xt(u>) < a <b < lim sup Xt(u>) \
' I        t-)-oo t^oo J
is a null set. Assume that X is a supermartingale and for given t\,...,tn let Un[a,b] be the number of upcrossings of [a,b] by the process Xtl,...,Xtn put in its natural time order. By Doob's upcrossing's inequality, Lemma 2.19,
(6- a)EUn[a,b] < supE|Xt| + \a\.
t
If we let Q = {ii, *2, • • •}> then Un[a, b] f oo on Faji, in view of the right-continuity of X. We conclude that P(Faji) = 0. ■
4.4 Stopping
The main aim of this section is to show that a stopped martingale is a martingale, also in continuous time, and to extend the optional stopping theorem to continuous time.
38        4:  Continuous Time Martingales
4.11 Definition. A random variable T: tt —>• [0, oo] is a stopping time if {T < t) € Tt for every t > 0.
Warning. Some authors use the term optional time instead of stopping time. Some authors define an optional time by the requirement that {T < t} € Tt for every t > 0. This can make a difference if the filtration is not right-continuous.
4.12 EXERCISE. Show that T: tt ->■ [0,oo] is a stopping time if and only if {T < t} € Tt for every t > 0. (Assume that the filtration is right-continuous.)
4.13 Definition. The cr-Geld Tt is defined as the collection of all F C tt such that F n{T <t} £ Tt for all t € [0, oo]. (This includes t = oo, where Too=a(Tt:t>0).)
The collection Tt is indeed a u-field, contained in Too C T, and Tt = Tt'iiT = t. Lemma 2.41 on comparing the u-fields Ts and Tt also remains valid as stated. The proofs are identical to the proofs in discrete time. However, in the continuous time case it would not do to consider events of the type {T = t} only. We also need to be a little more careful with the definition of stopped processes, as the measurability is not automatic. The stopped process XT and the variable Xt are defined exactly as before:
(XT)t{uj) = XT{u)M{u),      Xt(lo) = XT{u){u).
In general these maps are not measurable, but if X is cadlag and adapted, then they are. More generally, it suffices that X is "progressively measurable". To define this concept think of X as the map X: [0,oo) x tt —> M given by
(t,u) i ^ Xt(u).
The process X is measurable if X is measurable relative to the product a-field Boo x i-e. if it is "jointly measurable in (t, w)" relative to the product u-field. The process X is progressively measurable if, for each t > 0, the restriction X: [0,t] x tt —> M is measurable relative to the product u-field Bt x Tt- This is somewhat stronger than being adapted.
4.14 EXERCISE. Show that a progressively measurable process is adapted.
4.15 Lemma. If the process X is progressively measurable and T is a stopping time, then:
(i) XT is progressively measurable (and hence adapted).
(ii) Xt is TT-measurable (and hence a random variable).
4.4: Stopping 39
(In (ii) it is assumed that is defined and Too-measurable ifT assumes the value oo.J
Proof. For each t the map T A t: tt —> [0, oo] is Tt measurable, because {T At > s} = {T > s} € Ta C Tt if s < t and {T At > s} is empty if s > t. Then the map
(s,oj)     (s,T(oj) At,u>) i-)- (s A T(w),w), [0, l]x!l-> [0, i] x[0,(]xfl-» [0, t] x n,
is BtxTt — BtxBtxTt — BtX ^t-measurable. The stopped process XT as a map on [0, t] x tt is obtained by composing X: [0, t] x fi —>■ R to the right side and hence is Z?t x ^-measurable, by the chain rule. That a progressively measurable process is adapted is the preceding exercise.
For assertion (ii) we must prove that {Xt € B} n {T < t} € Tt for every Borel set B and t € [0, oo]. The set on the left side can be written as {Xtm € B}C\{T < t}. For t < oo this is contained in Tt by (i) and because T is a stopping time. For t = oo we note that {Xt € B} = Ut{XrAt € B) n {T < i) U {Xoo € .B} n {T = oo} and this is contained in Tx. m
4.16 Example (Hitting time). Let X be an adapted, progressively measurable stochastic process, B a Borel set, and define
T = inf{i > 0:Xt € B}.
(The infimum of the empty set is defined to be oo.) Then T is a stopping time.
Here X = (Xi,..., Xd) may be vector-valued, where it is assumed that all the coordinate processes Xi are adapted and progressively measurable and B is a Borel set in Md.
That T is a stopping time is not easy to prove in general, and does rely on our assumption that the filtration satisfies the usual conditions. A proof can be based on the fact that the set {T < t} is the projection on tt of the set {(s,uj): s < t,Xs(uj) € B}. (The projection on tt of a subset A C T x tt of some product space is the set {uj: 3t > 0: (t,cj) €}.) By the progressive measurability of X this set is measurable in the product u-field Bt x Tt-By the projection theorem (this is the hard part), the projection of every product measurable set is measurable in the completion. See Elliott, p50.
Under special assumptions on X and B the proof is more elementary. For instance, suppose that X is continuous and that B is closed. Then, for t > 0,
{T<t} = f]   |J {d(Xs,B)<n-1}.
n s<t,seQ
The right side is clearly contained in Tt- Furthermore, by the continuity of X and the closedness of B we have {T = 0} = {X0 € B} and this is contained in ^o-
40        4:  Continuous Time Martingales
To establish the preceding display, we note first that the event {T = 0} is contained in both sides of the equation. Furthermore, it is easy to see the inclusion of right side in left side; we now prove the inclusion of left in right side. By the definition of T and continuity of X, the function t i—^ d(Xt,B) must vanish at t = T and be strictly positive on [0,T) if T > 0. By continuity this function assumes every value in the interval [0, d(X0, B)] on the interval [0, T]. In particular, for every n € N there must be some rational number s € (0,T) such that d(Xs,B) < n~x. □
4.17 EXERCISE. Give a direct proof that T = inf{i: Xt € B} is a stopping time if B is open and X is right-continuous. (Hint: consider the sets {T < t} and use the right-continuity of the filtration.)
4.18 EXERCISE. Let X be a continuous stochastic process with X0 = 0 and T = inf{£ > 0: \Xt\ > a} for some a > 0. Show that T is a stopping time and that \XT\ < a.
4.19 Lemma. If X is adapted and right continuous, then X is progressively measurable. The same is true if X is adapted and left continuous.
Proof. We give the proof for the case that X is right continuous. For fixed f > 0, let 0 = *q < f™ < ■ ■ ■ < f£ = t be a sequence of partitions of [0,t] with mesh widths tending to zero as n —> oo. Define Xn to be the discretization of X equal to Xtv on and equal to Xt at {t}. By
right continuity of X, X™(uj) —> Xs(oj) as n —> oo for every (s, oj) € [0, t] x fi. Because a pointwise limit of measurable functions is measurable, it suffices to show that every of the maps Xn: [0, t] xfi —> M is Bt x ^-measurable. Now {Xn € B} can be written as the union of the sets \t™_1, £") x {oj: Xtv (w) € B} and the set {t} x {uj: Xt(uj) € B} and each of these sets is certainly contained in B[ x J(. ■
Exactly as in the discrete time situation a stopped (sub/super) martingale is a (sub/super) martingale, and the (in) equalities defining the (sub/super) martingale property remain valid if the (sub/super) martingale is uniformly integrable and the times are replaces by stopping times. At least if we assume that the (sub/super) martingale is cadlag.
4.20 Theorem. If X is a cadlag (sub/super) martingale and T is a stopping time, then XT is a (sub/super) martingale.
Proof. We can assume without loss of generality that X is a submartingale. For n € N define Tn to be the upward discretization of T on the grid 0 < 2"™ < 22"™ < • • ■; i.e. Tn = k2~n if T € [{k - 1)2"™, k2~n) (for k € N) and Tn = oo if T = oo. Then Tn I T as n —> oo and by right continuity Xrn/\t —> Xtm for all t, pointwise on fi. For fixed t > 0 let knftrn be the biggest point k2~n on the grid smaller than or equal to t.
4.4: Stopping 41
For fixed t the sequence
X0,X2-n,X22-n, . . .,Xknt2-n,Xt
is a submartingale relative to the filtration
To C .7-2-" C ^"22-" C • • • C Tknt2~n C J-j.
Here the indexing by numbers k2~n or t differs from the standard indexing by numbers in Z+, but the interpretation of "submartingale" should be clear. Because the submartingale has finitely many elements, it is uniformly integrable. (If you wish, you may also think of it as an infinite sequence, by just repeating Xt at the end.)
Both Tn At and T„_i At can be viewed as stopping times relative to this filtration. For instance, the first follows from the fact that {Tn < k2~n} = {T < k2~n} € Tk2-n for every k, and the fact that the minimum of two stopping times is always a stopping time. For T„_i we use the same argument and also note that the grid with mesh width 2~n+1 is contained in the grid with mesh width 2~n. Because T„_i At > TnAt, the optional stopping theorem in discrete time, Theorem 2.42, gives E(Xrn_lAt| TrnAt) > ^t„a* almost surely. Furthermore, E(XrnAt| To) > X0 and hence EXrnAt > EX0.
This being true for every n it follows that Xtx a*, Xt2m, • • • is a reverse submartingale relative to the reverse filtration T^At 3 ^T2At D ■ ■ ■ with mean bounded below by ~EXo- By Lemma 2.34 {XrnAt} is uniformly integrable. Combining this with the first paragraph we see that XrnAt —>■ ^ta* in L\, as n —> oo.
For fixed s < t the sequence
Xo,X2-n,. . . ,^,„2-",^, • • • ,Xkt^2-n,Xt
is a submartingale relative to the filtration
To C T2-» C • • • C Tkan2— CTSC---C Tktn2— C Tt.
The variable TnAt is a stopping time relative to this set-up. By the extension of Theorem 2.13 to submartingales the preceding process stopped at TnAt is a submartingale relative to the given filtration. This is the process
Xo,X2-™ATn, . . . ,Xkan2-"-atn^Xsatn,- ■ ■ ,Xktn2-"-atn,Xtatn-
In particular, this gives that
E(Xt„m\Fs) > XrnAs, a.s..
As n —> oo the left and right sides of the display converge in L\ to E(XrAt| Ts) and Xtas- Because Li-convergence implies the existence of an almost surely converging subsequence, the inequality is retained in the limit in an almost sure sense. Hence E(Xta*| Ts) > Xtas almost surely. ■
A uniformly integrable, cadlag (sub/super) martingale X converges in Li to a limit X^, by Theorem 4.10. This allows to define Xt also if T takes the value oo.
42        4:  Continuous Time Martingales
4.21 Theorem. If X is a uniformly integrable, cadlag submartingale and S <T are stopping times, then Xs and Xt are integrable and E(Xr| Ts) > Xs almost surely.
Proof. Define Sn and Tn to be the discretizations of S and T upwards on the grid 0 < 2~™ < 22~™ < ■ ■ defined as in the preceding proof. By right continuity X$n —> Xs and Xrn —> Xt pointwise on fi. Both Sn and Tn are stopping times relative to the filtration To C T^-™ C ■ ■ ■, and Xq, X2-n,... is a uniformly integrable submartingale relative to this filtration. Because Sn < Tn the optional sampling theorem in discrete time, Theorem 2.42, yields that X$n and Xt„ are integrable and E(Xrn | Tsn) > X$n almost surely. In other words, for every F € Ts„,
EXTn If ^ ^Xsn 1 f•
Because S < Sn we have Ts C Tsn and hence the preceding display is true for every F € Ts- If the sequences X$n and Xt„ are uniformly integrable, then we can take the limit as n —> oo to find that EXtIf > EXgli? for every F € Ts and the proof is complete.
Both T„_i and Tn are stopping times relative to the filtration C •F2-» C • • • and Tn < T„_i. By the optional stopping theorem in discrete time F,(XTn_1\ Tt„) > Xt„, since X is uniformly integrable. Furthermore, E(Xrn | ^o) > -^o and hence EXrn > EX0. It follows that {Xt„ } is a reverse submartingale relative to the reverse filtration Ttx 3 Tt2 D ■ ■ ■ with mean bounded below. Therefore, the sequence {Xrn } is uniformly integrable by Lemma 2.34. Of course, the same proof applies to {Xsn}. ■
If X is a cadlag, uniformly integrable martingale and S <T are stopping times, then E(Xt| Ts) = Xs, by two applications of the preceding theorem. As a consequence the expectation EXr of the stopped process at oo is equal to the expectation EX0 for every stopping time T. This property actually characterizes uniformly integrable martingales.
4.22 Lemma. Let X = {Xt: t € [0, oo]} be a cadlag adapted process such that Xt is integrable with EXr = EX0 for every stopping time T. Then X is a uniformly integrable martingale.
Proof. For a given F € Tt define the random variable T to be t on F and to be oo otherwise. Then T can be seen to be a stopping time, and
~EXt = EXilj? + EXooljrc, EXq = EXqq = EXp + EX00lp': ■
We conclude that EXtlj? = EX^lp for every F € Tt and hence Xt = E^ool Tt) almost surely. ■
4.5: Brownian Motion
43
4.23 EXERCISE. Suppose that X is a cadlag process such that Xt = E(f | Tt) almost surely, for every t. Show that Xt = E(f | Tt) almost surely for every stopping time T.
4.5  Brownian Motion
Brownian motion is a special stochastic process, which was first introduced as a model for the "Brownian motion" of particles in a gas or fluid, but has a much greater importance, both for applications and theory. It could be thought of as the "standard normal distribution for processes".
4.24 Definition. A stochastic process B is a (standard) Brownian motion relative to the filtration {Tt} if:
(i) B is adapted.
(ii) all sample paths are continuous.
(Hi) Bt — Bs is independent of Ts for all 0 < s < t.
(iv) Bt - Bs is N(0,t- s)-distributed.
(v) B0 = 0.
The model for the trajectory in ]R3 of a particle in a gas is a process (Bj,B2,Bf) consisting of three independent Brownian motions defined on the same filtered probability space. Property (ii) is natural as a particle cannot jump through space. Property (iii) says that given the path history Ts the displacement Bt — Bs in the time interval (s, t] does not depend on the past. Property (iv) is the only quantative property. The normality can be motivated by the usual argument that, even in small time intervals, the displacement should be a sum of many infinitesimal movements, but has some arbitrariness to it. The zero mean indicates that there is no preferred direction. The variance t — s is, up to a constant, a consequence of the other assumptions if we also assume that it may only depend on t — s. Property (iv) is the main reason for the qualification "standard". If we replace 0 by x, then we obtain a "Brownian motion starting at x".
We automatically have the following properties:
(vi) (independent increments) Bt2 — Btl,Bt3 — Bt2,...,Btk — Btk_1 are independent for every 0 < t\ <ti < ■ ■ ■ < tk ■
(vii) (Btl,..., Btk) is multivariate-normally distributed with mean zero and covariance matrix cov(Bti, Btj) = ti Atj.
It is certainly not immediately clear that Brownian motion exists, but it does.
4.25 Theorem. There exists a complete probability space (tt, T, P) and measurable maps Bt: tt —> M. such that the process B satisfies (i)-(v) relative
44        4:  Continuous Time Martingales
to the completion of the natural filtration generated by B (which is right-continuous).
There are many different proofs of this theorem, but we omit giving a proof altogether. It is reconforting to know that Brownian motion exists, but, on the other hand, it is perfectly possible to work with it without worrying about its existence.
The theorem asserts that a Brownian motion exists relative to its (completed) natural filtration, whereas the definition allows a general filtration. In fact, there exist many Brownian motions. Not only can we use different probability spaces to carry them, but, more importantly, we may use another than the natural filtration.
Warning. Some authors always use the natural filtration, or its completion. Property (iii) is stronger if {Tt} is a bigger filtration.
Brownian motion is "the" example of a continuous martingale.
4.26 Theorem. Any Brownian motion is a martingale.
Proof. Because Bt — Bs is independent of Ts, we have E(Bt — BS\TS) = F,(Bt — Bs) almost surely, and this is 0. ■
4.27 EXERCISE. Show that the process {B% - t} is a martingale.
Brownian motion has been studied extensively and possesses many remarkable properties. For instance:
(i) Almost every sample path is nowhere differentiable.
(ii) Almost every sample path has no point of increase. (A point of increase of a function / is a point t that possesses a neighbourhood such that on this neighbourhood / is maximal at t.)
(iii) For almost every sample path the set of points of local maximum is countable and dense in [0, oo).
(iv) lim sup^^ Bt/\Z2t loglog t = 1 a.s..
These properties are of little concern in the following. A weaker form of property (i) follows from the following theorem, which is fundamental for the theory of stochastic integration.
4.28 Theorem. IfB is a Brownian motion andO < tn < t™ < ■ ■ ■ < t%n = t
is a sequence of partitions of [0, t] with mesh widths tending to zero, then
Y(BU-Bu_1fl,t.
i=l
Proof. We shall even show convergence in quadratic mean. Because Bt—Bs is N(0, t — s)-distributed, the variable (Bt—Bs)2 has mean t — s and variance
4.6: Local Martingales 45
2(t — s)2. Therefore, by the independence of the increments and because
t = Ei{u-u-1)
E[j2(Bti - Bt^)2 - t\2 =J^var(Bti-Bti_1)2 = 2j^(ti-ti.1)2.
i=l i=l i=l
The right side is bounded by 2J„^^1 \U — £;_i| = 25nt for 5n the mesh width of the partition. Hence it converges to zero. ■
A consequence of the preceding theorem is that for any sequence of partitions with mesh widths tending to 0
lim sup     \Bti — Btil | = oo, a.s..
1=1
Indeed, if the lim sup were finite on a set of positive probability, then on this set we would have that Yli=i(^U ~^t;-i)2 —> 0 almost surely, because max; \Bti — | —> 0 by the (uniform) continuity of the sample paths. This would contradict the convergence in probability to t.
We conclude that the sample paths of Brownian motion are of unbounded variation. In comparison if /: [0, t] —> M. is continuously differen-tiable, then
hm £|/(fc)-/(fc-i)|= I \f{s)\d8.
i=l 1/0
It is the roughness (or "randomness") of its sample paths that makes Brownian motion interesting and complicated at the same time.
Physicists may even find that Brownian motion is too rough as a model for "Brownian motion". Sometimes this is alleviated by modelling velocity using a Brownian motion, rather than location.
4.6  Local Martingales
In the definition of a stochastic integral I/2-martingales play a special role. A Brownian motion is I/2-bounded if restricted to a compact time interval, but not if the time set is [0, oo). Other martingales may not even be square-integrable.
Localization is a method to extend definitions or properties from processes that are well-behaved, often in the sense of integrability properties, to more general processes. The simplest form is to consider a process X in turn on the intervals [0, Ti], [0, T2], • • • for numbers Xi < T2 < ■ ■ ■ increasing to infinity. Equivalently, we consider the sequence of stopped processes XTn. More flexible is to use stopping times Tn for this purpose. The following definition of a "local martingale" is an example.
46        4:  Continuous Time Martingales
4.29 Definition. An adapted process X is a local (sub/super) martingale in Lp if there exists a sequence of stopping times 0 < Xi < T2 < • • • with Tn f oo almost surely such that XTn is a (sub/super) martingale in Lp for every n.
In the case that p = 1 we drop the "in L{" and speak simply of a local (sub/super) martingale. Rather than "martingale in Lp we also speak of "Lp-martingale". Other properties of processes can be localized in a similar way, yielding for instance, "locally bounded processes" or "locally I/2-bounded martingales". The appropriate definitions will be given when needed, but should be easy to guess. (Some of these classes actually are identical. See the exercises at the end of this section.)
The sequence of stopping times 0 < Tn f oo is called a localizing sequence. Such a sequence is certainly not unique. For instance, we can always choose Tn < n by truncating Tn at n.
Any martingale is a local martingale, for we can simply choose the localizing sequence equal to Tn = oo. Conversely, a "sufficiently integrable" local (sub/super) martingale is a (sub/super) martingale, as we now argue. If X is a local martingale with localizing sequence Tn, then Xjn —> Xt almost surely for every t. If this convergence also happens in L\, then the martingale properties of XTn carries over onto X and X itself is a martingale.
4.30 EXERCISE. Show that a dominated local martingale is a martingale.
Warning'. A local martingale that is bounded in Li need not be a martingale. A fortiori, a uniformly integrable local martingale need not be a martingale. See Chung and Williams, pp20-21, for a counterexample.
Warning. Some authors define a local (sub/super) martingale in Lp by the requirement that the process X — Xq can be localized as in the preceding definition. If Xq € Lp, this does not make a difference, but otherwise it may. Because (^T")o = our definition requires that the initial value Xq of a local (sub/super) martingale in Lp be in Lp.
We shall mostly encounter the localization procedure as a means to reduce a proof to bounded stochastic processes. If X is adapted and continuous, then
(4.31) Tn = inf{t: \Xt\ > n)
is a stopping time. On the set Tn > 0 we have \XTn \ < n. If X is a local martingale, then we can always use this sequence as the localizing sequence.
4.32 Lemma. If X is a continuous, local martingale, then Tn given by (4.31) defines a localizing sequence. Furthermore, X is automatically a local Lp-martingale for every p > 1 such that Xq € Lp.
4.7: Maximal Inequalities 47
Proof. If Tn = 0, then (XT-)t = X0 for all t > 0. On the other hand, if Tn > 0, then \Xt\ < n for t < Tn and there exists tm I Tn with \Xtm \ > n. By continuity of X it follows that \XTn \ = n in this case. Consequently, \XTn | < \X0\ V n and hence XT" is even dominated by an element of Lp if Xq € I/p. It suffices to prove that Tn is a localizing sequence.
Suppose that Sm is a sequence of stopping times with Sm —> oo as m —> oo and such that XSm is a martingale for every m. Then XSmATn = (XSm)Tn is a martingale for each m and n, by Theorem 4.20. For every fixed n we have \XSmATn\ < \X0\V n for every m, and XSmAT" XT" almost surely as m —> oo. Because X0 = (XSm)0 and XSm is a martingale by assumption, it follows that X0 is integrable. Thus X$mATnM —> ^t„a* in Li as m —> oo, for every t > 0. Upon taking limits on both sides of the martingale equality E(XsmATnAt\ Ts) = XsmATnAs of XSmATn we see that XTn is a martingale for every n.
Because X is continuous, its sample paths are bounded on compacta. This implies that Tn —> oo as n —> oo. ■
4.33 EXERCISE. Show that a local martingale X is a uniformly integrable martingale if and only if the set {Xt- T finite stopping time} is uniformly integrable. (A process with this property is said to be of class D.)
4.34 EXERCISE. Show that a local Li-martingale X is also a locally uniformly integrable martingale, meaning that there exists a sequence of stopping times 0 < Tn f oo such that XTn is a uniformly integrable martingale.
4.35 EXERCISE. Show that (for p > 1) a local I/p-martingale X is locally bounded in Lp, meaning that there exists a sequence of stopping times 0 < Tn f oo such that XTn is a martingale that is bounded in Lp, for every n.
4.7 Maximal Inequalities
The maximal inequalities for discrete time (sub/super) martingales carry over to continuous time cadlag (sub/super) martingales, without surprises. The essential observation is that a supremum sup4 Xt over t > 0 is equal to the supremum over a countable dense subset of [0, oo) if X is cadlag. Furthermore, a countable supremum is the (increasing) limit of finite maxima.
4.36 Lemma. If X is a nonnegative, cadlag submartingale, then for any x > 0 and every t > 0,
sup
0<a<t
48        4:  Continuous Time Martingales
4.37 Corollary. If X is a nonnegative, cadlag submartingale, then for any p > 1 and p_1 + q~x = 1, and every t > 0,
sup Xs
0<s<t
< q\\xt\\P.
If X is bounded in Lp(£l,T,P), then Xt —> X
oo in Lp for some random
variable X*, and
supXt
t>0
< q\\X„
q sup \\Xt\\p.
t>0
The preceding results apply in particular to the absolute value of a martingale. For instance, for any martingale X,
(4.38)
sup|Xt|
t
< 2sup||Xt||2.
2 t
5
Stochastic Integrals
In this chapter we define integrals / X dM for pairs of a " predictable" process X and a martingale M. The main challenge is that the sample paths of many martingales of interest are of infinite variation. We have seen this for Brownian motion in Section 4.5; this property is in fact shared by all martingales with continuous sample paths. For this reason the integral / X dM cannot be defined using ordinary measure theory. Rather than defining it "pathwise for every u", we define it as a random variable through an I/2-isometry.
In general the predictability of the integrand (defined in Section 5.1) is important, but in special cases, including the one of Brownian motion, the definition can be extended to more general processes.
The definition is carried out in several steps, each time including more general processes X or M. After completing the definition we close the chapter with Ito's formula, which is the stochastic version of the chain rule from calculus, and gives a method to manipulate stochastic integrals.
Throughout the chapter (tt,T,{Tt},P) is a given filtered probability space.
5.1  Predictable Sets and Processes
The product space [0, oo) x tt is naturally equipped with the product a-field Boo x T. Several sub u-fields play an important role in the definition of stochastic integrals.
A stochastic process X can be viewed as the map X: [0, oo) x tt —> M given by {t,oj) Xt(uj). We define u-fields by requiring that certain types of processes must be measurable as maps on [0, oo) x tt.
50        5:  Stochastic Integrals
5.1 Definition. The predictable a-field V is the a-Geld on [0, oo) x fi generated by the left-continuous, adapted processes X: [0, oo) x fi —> M. (It can be shown that the same a-Geld is generated by all continuous, adapted processes X: [0, oo) x fi —> M.)
5.2 Definition. The optional a-field O is the a-Geld on [0, oo) x fi generated by the cadlag, adapted processes X: [0, oo) x fi —> M.
5.3 Definition. The progressive a-field M. is the a-Geld on [0, oo) x fi generated by the progressively measurable processes X: [0, oo) x fi —> M.
We call a process X: [0, oo) x fi —> M predictable or optional if it is measurable relative to the predictable or optional u-field.
It can be shown that the three u-fields are nested in the order of the definitions:
VcOcMcBooXF.
The predictable u-field is the most important one to us, as it defines the processes X that are permitted as integrands in the stochastic integrals. Because, obviously, left-continuous, adapted processes are predictable, these are "good" integrands. In particular, continuous, adapted processes.
Warning.  Not every predictable process is left-continuous.
The term "predictable" as applied to left-continuous processes expresses the fact that the value of a left-continuous process at a time t is (approximately) "known" just before time t. In contrast, a general process may jump and hence be "unpredictable" from its values in the past. However, it is not true that a predictable process cannot have jumps. The following exercise illustrates this.
5.4 EXERCISE. Show that any measurable function /: [0, oo) ->■ ffi defines a predictable process (t,oj) f(t). "Deterministic processes are predictable" .
There are several other ways to describe the various u-fields. We give some of these as a series of lemmas. For proofs, see Chung and Williams p25-30 and p57-63.
5.5 Lemma. The predictable a-Geld is generated by the collection of all subsets of [0, oo) x fi of the form
{0}xF0,   F0GT0,       and       [s,t] x Fs, Fs£Fs,s<t.
We refer to the sets in Lemma 5.5 as predictable rectangles.
Given two functions S,T:U —> [0, oo], the subset of [0, oo) x fi given
by
[S,T] = {(t,u) € [0,oo) x n-.S(u) < t < T{lo)}
5.1: Predictable Sets and Processes 51
is a stochastic interval. In a similar way, we define the stochastic intervals {S, T], [S, T) and {S, T). The set [T] = [T, T] is the graph of T. By definition these are subsets of [0, oo) x fi, even though the right endpoint T may assume the value oo. If S and/or T is degenerate, then we use the same notation, yielding, for instance, [0,T] or (s,i\.
Warning. This causes some confusion, because notation such as (s, t] may now denote a subset of [0, oo] or of [0, oo) x fi.
We are especially interested in stochastic intervals whose boundaries are stopping times. These intervals may be used to describe the various cr-fields, where we need to single out a special type of stopping time.
5.6 Definition. A stopping time T: fi —> [0, oo] is predictable if there exists a sequence Tn of stopping times such that 0 < Tn f T and such that Tn <T for every n on the set {T > 0}.
A sequence of stopping times Tn as in the definition is called an announcing sequence. It "predicts" that we are about to stop. The phrase "predictable stopping time" is often abbreviated to "predictable time".
Warning. A hitting time of a predictable process is not necessarily a predictable time.
5.7 Lemma. Each of the following collections of sets generates the predictable a-Eeld.
(i) All stochastic intervals [T, oo), where T is a predictable stopping time.
(ii) All stochastic intervals [S, T), where S is a predictable stopping time and T is a stopping time.
(Hi) All sets {0} x F0, F0 € Tq and all stochastic intervals (S,T], where S
and T are stopping times. Furthermore, a stopping time T is predictable if and only if its graph [T] is a predictable set.
5.8 Lemma. Each of the following collections of sets generates the optional a-Eeld.
(i) All stochastic intervals [T, oo), where T is a stopping time.
(ii) All stochastic intervals [S,T], [S,T), {S,T], {S,T), where S and T are stopping times.
5.9 Example. If T is a stopping time and c > 0, then T+c is a predictable stopping time. An announcing sequence is the sequence T + cn for cn < c numbers with 0 < c„ f c. Thus there are many predictable stopping times. □
5.10 Example. Let X be an adapted process with continuous sample paths and B be a closed set. Then T = inf{£ > 0:Xt € B} is a predictable time. An announcing sequence is Tn = inf {t > 0: d(Xt, B) < n-1}. The proof of this is more or less given already in Example 4.16. □
52
5:  Stochastic Integrals
5.11 Example. It can be shown that any stopping time relative to the natural filtration of a Brownian motion is predictable. See Chung and Williams, p30-31. □
5.12 Example. The left-continuous version of an adapted cadlag process if predictable, by left continuity. Then so is the jump process AX of a predictable process X. It can be shown that this jump process is nonzero only on the union Un[Tn] of the graphs of countably many predictable times Tn. (These predictable times are said to "exhaust the jumps of X".) Thus a predictable process has "predictable jumps". □
5.13 Example. Every measurable process that is indistinguishable from a predictable process is predictable. This means that we do not need to "worry about null sets" too much.
This is true only if the filtered probability space satisfies the usual conditions (as we agreed to assume throughout).
To verify the claim it suffices to show that every measurable process X that is indistinguishable from the zero process (an evanescent process) is predictable. By the completeness of the filtration a process of the form 1(m,«]xJV is left-continuous and adapted for every null set N, and hence predictable. The product u-field Boo x T is generated by the sets of the form (u, v] x F with F € T and hence for every fixed null set N its trace on the set [0, oo) x N is generated by the collection of sets of the form (u, v] x (F n N). Because the latter sets are predictable the traces of the product u-field and the predictable u-field on the set [0, oo) x N are identical for every fixed null set N. We apply this with the null set N of all oj such that there exists t > 0 with Xt(oj) / 0. For every Borel set B in R the set {(t,uj): Xt(uj) € B} is Boo x ^-"-measurable by assumption, and is contained in [0, oo) x N if B does not contain 0. Thus it can be written as All ([0,oo) x N) for some predictable set A and hence it is predictable, because [0, oo) x N is predictable. The set B = {0} can be handled by taking completements. □
5.2  Doleans Measure
In this section we prove that for every cadlag martingale M in Li there exists a u-finite measure hm on the predictable u-field such that
(5.14)
Hm(0 x F0)=0,
((«,*] x F.) = ElFa(M42 - Ms2),
F0 G To, s <t,FsG Ts.
5.2: Doleans Measure 53
The right side of the preceding display is nonnegative, because M2 is a submartingale. We can see this explicitly by rewriting it as
E1F.(Aft - M8){Mt + M.) = ElFa(Mt - Ms)2,
which follows because Elj?a(Mt — MS)MS = 0 by the martingale property, so that we can change "+" into "—". The measure hm is called the Doleans measure of M.
5.15 Example (Brownian motion). If M = B is a Brownian motion, then by the independence of Bt — Bs and Fs,
A»b((«,*] x F.) = E1F.E(B? - B2S) = P(Fs)(t - s) = (XxP)((s,t]xFs).
Thus the Doleans measure of Brownian motion is the product measure X x P. This is not only well-defined on the predictable u-field, but also on the bigger product u-field      x T. □
5.16 EXERCISE. Find the Doleans measure of the Poisson process.
In order to prove the existence of the measure hm in general, we follow the usual steps of measure theory. First we extend hm by additivity to disjoint unions of the form
k
A={0}xF0 [j [j{8i,U] x Ft,      F0 € T0,Ft € TSi,
i=l
by setting
k
HM(A) = Y£lFi(Ml-K)-
i=l
It must be shown that this is well-defined: if A can be represented as a disjoint, finite union of predictable rectangles in two different ways, then the two numbers hm{A) obtained in this way must agree. This can be shown by the usual trick of considering the common refinement. Given two disjoint, finite unions that are equal,
k i A={0}xFo{J \J{8i,ti] x Fi = {0} x F0 [j [j {s'3,t'3] x F\,
i=l j=l
we can write A also as the disjoint union of {0} x F0 and the sets
((*,*<] xF)n {{s'3,t'3] xF'3) = x F!'tj.
Next we show that the three definitions of fiM {A) all agree. We omit further details of this verification. Once we have verified that the measure hm is
54        5:  Stochastic Integrals
well-defined in this way, it is clear that it is finitely additive on the collection of finite disjoint unions of predictable rectangles.
The set of all finite disjoint unions of predictable rectangles is a ring, and generates the predictable u-field. The first can be proved in the same way as it is proved that the cells in ]R2 form a ring. The second is the content of Lemma 5.5. We take both for facts. Next Caratheodory's theorem implies that hm is extendible to V provided that it is countably additive on the ring. This remains to proved.
5.17 Theorem. For every cadlag martingale M in L2 there exists a unique measure fiM on the predictable a-Eeld such that (5.14) holds.
Proof. See Chung and Williams, p50-53. ■
5.18 EXERCISE. Show that /j,M{[0,t] x fi) < oo for every t > 0 and conclude that hm is er-fmite.
5.3  Square-integrable Martingales
Given a square-integrable martingale M we define an integral J X dM for increasingly more general processes X. If X is of the form l(s,t]Z for some (time-independent) random variable Z, then we want to define
J\{sAZdM = Z{Mt-Ms).
Here l(s,t]-£ is short-hand notation for the map (u,lo) h-»- \^a^{u)Z{u) and hence the integral is like a Riemann-Stieltjes integral for fixed oj. The right side is the random variable oj Z(oj)(Mt(oj) — Ms(oj)). We also want the integral to be linear in the integrand, and are lead to define
/ ^Oil(siiti]><Fi dM = ^2a,ilFi{Mu - MSi).
J   i=l i=l
By convention we choose "to give measure 0 to 0" and set
Jaol{o}xF0 dM = 0.
We can only postulate these definitions if they are consistent. If X = ^2i=iai^-(si,ti]xFi has two representations as a linear combination of predictable rectangles, then the right sides of the second last display must agree. For this it is convenient to restrict the definition initially to linear combinations of disjoint predictable rectangles. The consistency can then be checked using the joint refinements of two given representations. We omit the details.
5.3: Square-integrable Martingales 55
5.19 Definition. If X = aol{o}xF0 + ^2i=iai^-(si,ti]xFi JS a linear combination of disjoint predictable rectangles, then the stochastic integral of X relative to M is defined as jXdM = YH=iailFi{Mti - MSi).
In this definition there is no need for the restriction to predictable processes. However, predictability is important for the extension of the integral. We extend by continuity, based on the following lemmas.
5.20 Lemma. Every uniformly continuous map defined on a dense subset of a metric space with values in another metric space extends in a unique way to a continuous map on the whole space. If the map is a linear isometry between two normed spaces, then so is the extension.
5.21 Lemma. The collection of simple process X as in Definition 5.19 is dense in L2 ([0, oo) x fi, V, hm) ■ Every bounded X € Li ([0, oo) x fi, V, hm) is a limit in this space of a uniformly bounded sequence of simple processes.
5.22 Lemma. For every X as in Definition 5.19 we have f X2 d[iM = E{fXdM)2.
Proofs. The first lemma is a standard result from topology.
Because any function in Li ([0, oo) x fi, V, hm) is the limit of a sequence of bounded functions, for Lemma 5.21 it suffices to show that any bounded element of Li([0,oo) x U,V,hm) can be obtained as such a limit. Because 1[o,t]X —> X in 1/2([0, oo) x U,V,hm) as t —> oo, we can further restrict ourselves to elements that vanish off [0, t] x fi.
Let H be the set of all bounded, predictable X such that ^l[o,t] is a limit in Li(jO,oo) x SI,V,hm) of a sequence of linear combinations of indicators of predictable rectangles, for every t > 0. Then H is a vector space and contains the constants. A "diagonal type" argument shows that it is also closed under bounded monotone limits. Because H contains the indicators of predictable rectangles (the sets in Lemma 5.5) and this collection of sets is intersection stable, Lemma 5.21 follows from the monotone class theorem, Theorem 1.23.
Using the common refinement of two finite disjoint unions of predictable rectangels, we can see that the minimum of two simple processes is again a simple process. This implies the second statement of Lemma 5.21.
Finally consider Lemma 5.22. Given a linear combination X of disjoint predictable rectangles as in Definition 5.19, its square is given by X2 = ^{oyxFo + T!l=ia2il{si,ti\xFi- Hence, by (5.14),
(5.23)    jX2dfiM = Y,a^M{{si,U] x Fi) = Y,<%VlFi{Mti - MSi)2.
i=l i=l
56        5:  Stochastic Integrals
On the other hand, by Definition 5.19,
v(^ailFi{Mti-MSi)f
i=l
k k
i=lj'=l
Because the rectangles are disjoint we have for i / j that either 1^ If,- =0 or (si,ti]r\(sj,tj] = 0. In the first case the corresponding term in the double sum is clearly zero. In the second case it is zero as well, because, if U < Sj, the variable lj^lj^M^ — MSi) is TSj-measurable and the martingale difference Mtj — MSj is orthogonal to TSj. Hence the off-diagonal terms vanish and the expression is seen to reduce to the right side of (5.23). ■
Lemma 5.22 shows that the map
X v+ J XdM,
L2([0,oo) x n,V,fiM) -)■ L2{9,,T,P),
is an isometry if restricted to the linear combinations of disjoint indicators of predictable rectangles. By Lemma 5.21 this class of functions is dense in L2([0,oo) x SI,V,hm)- Because an isometry is certainly uniformly continuous, this map has a unique continuous extension to L2([0,oo) x £1,V,hm), by Lemma 5.20. We define this extension to be the stochastic integral / XdM.
E
(y xdMj
5.24 Definition. For M a cadlag martingale in L2 and X a predictable process in L2 ([0, oo) x fi, V, hm), the stochastic integral X >->■ f X dM is the unique continuous extension to I/2([0,oo) x £1,V,hm) of the map defined in Definition 5.19 with range inside L,2(£l,T,P).
Thus defined a stochastic integral is an element of the Hilbert space L2{íl,T,P) and therefore an equivalence class of functions. We shall also consider every representative of the class to be "the" stochastic integral J X dM. In general, there is no preferred way of choosing a representative.
If X is a predictable process such that ~L[o,t]X € L2([0, oo) x Q,V,hm), then f l[o,t]X dM is defined through the preceding definition. A short-hand notation for this is f£ X dM. By linearity of the stochastic integral we then have
[liS;t]XdM=[ XdM- \ XdM, s<t. J Jo Jo
We abbreviate this to /s* X dM.
5.3: Square-integrable Martingales 57
If l[o,t]X € 1/2([0,oo) x U,V,hm) for every t > 0, then we can define a process X ■ M satisfying
(X-M)t =     XdM = Jl[0>t]XdM.
Because for every t > 0 the stochastic integral on the right is defined only up to a null set, this display does not completely define the process X ■ M. However, any specification yields a martingale X-M and there always exists a cadlag version of X ■ M.
5.25 Theorem. Suppose that M is a cadlag martingale in L2 and that X is a predictable process with f 1[0^X2 dfiM < oo for every t > 0.
(i) Any version of X ■ M = {/0* X dM: t > 0} is a martingale in L2.
(ii) There exists a cadlag version of X ■ M.
(Hi) If M is continuous, then there exists a continuous version of X ■ M. (iv) The processes A(X ■ M), where X ■ M is chosen cadlag, and XAM are indistinguishable.
Proof. If X is a finite linear combination of predictable rectangles, of the form as in Definition 5.19, then so is l[o,t]^ and hence / l[0}t]X dM is defined as
/ lmXdM = Y,ai(MUAt - MSiM).
As a process in t, this is a martingale in L?, because each of the stopped processes M4i or MSi is a martingale, and a linear combination of martingales is a martingale. The stochastic integral X ■ M of a general integrand X is defined as an I/2-limit of stochastic integrals of simple predictable processes. Because the martingale property is retained under convergence in Li, the process X ■ M is a martingale.
Statement (ii) is an immediate consequence of (i) and Theorem 4.6, which implies that any martingale possesses a cadlag version.
To prove statement (iii) it suffices to show that the cadlag version of X ■ M found in (ii) is continuous if M is continuous. If X is elementary, then this is clear from the explicit formula for the stochastic integral used in (i). In general, the stochastic integral (X ■ M)t is defined as the I/2-limit of a sequence of elementary stochastic integrals (Xn ■ M)t. Given a fixed T > 0 we can use the same sequence of linear combinations of predictable rectangles for every 0 < t < T. Each process X-M — Xn ■ M is a cadlag martingale in L2 and hence, by Corollary 4.37, for every T > 0,
sup \(X ■ M)t - (Xn ■ M)t
0<t<T
The right side converges to zero as n —> oo and hence the variables in the left side converge to zero in probability. There must be a subsequence {n;}
<2\\(X -M)T-(Xn-M)T
58        5:  Stochastic Integrals
along which the convergence is almost surely, i.e. (Xni ■ M)t —> (X ■ M)t uniformly in t € [0, T], almost surely. Because continuity is retained under uniform limits, the process X-M is continuous almost surely. This concludes the proof of (hi).
Let H be the set of all bounded predictable processes X for which (iv) is true. Then H is a vector space that contains the constants, and it is readily verified that it contains the indicators of predictable rectangles. If 0 < Xn f X for a uniformly bounded X, then l[0^Xn —> l[o,t]^ in £2 ([0,00) x S7,7^, hm) ■ As in the preceding paragraph we can select a subsequence such that, for the cadlag versions, Xni ■ M —»■ X ■ M uniformly on compacta, almost surely. Because |AY| < 2||y||00 for any cadlag process Y, the latter implies that A(Xni-M) —> A (X-M) uniformly on compacta, almost surely. On the other hand, by pointwise convergence of Xn to X, XniAM —> XAM pointwise on [0,oo) x fi. Thus {Xn} C Ti implies that X € Ti. By the monotone class theorem, Theorem 1.23, H contains all bounded predictable X. A general X can be truncated to the interval [—n, n], yielding a sequence Xn with Xn —»■ X pointwise on [0,00) x fi and l[0i]X„ —»■ l[o,t]^ in I/2 ([0,00) x U,V,hm)- The latter implies, as before, that there exists a subsequence such that, for the cadlag versions, Xni ■ M —»■ X ■ M uniformly on compacta, almost surely. It is now seen that (iv) extends to X. m
The following two lemmas gives further properties of stochastic integrals. Here we use notation as in the following exercise.
5.26 EXERCISE. Let S < T be stopping times and let X be an Ts-measurable random variable. Show that the process 1(S}T]X defined as (t,uj)     l(S(w)iT(w)](i)A"(w) is predictable.
5.27 Lemma. Let M be a cadlag martingale in L? and let S < T be bounded stopping times.
(i) f 1(S}T]X dM = X(Mt — Ms) almost surely, for every bounded Ts-measurable random variable X.
(ii) f 1(S}t]XY dM = X f 1(S}t]Y dM almost surely, for every bounded Ts-measurable random variable X and bounded predictable process Y.
(Hi) f l(s,t]X dM = Nt — Ns almost surely, for every bounded predictable
process X, and N a cadlag version of X ■ M. (iv) f 1{0}xQ^ dM = 0 almost surely for every predictable process X.
Proof. Let Sn and Tn be the upward discretizations of S and T on the grid 0 < 2~n < 22"71 < • • • < kn2~n, as in the proof of Theorem 4.20, for kn sufficiently large that kn2~n > SVT. Then Sn I S and Tn I T, so that 1(s„,t„] ~~>■ l(s,t] pointwise on $7. Furthermore,
(5-28) !(S„,T„] = ^2 1(fc2-",(fc+l)2-"]x{S<fc2-"<T}-
5.3: Square-integrable Martingales 59
If we can prove the lemma for (Sn, Tn] taking the place of (S, T] and every n, then it follows for (S, T] upon taking limits. (Note here that fiM is a finite measure on sets of the form [0, if] x fi and all (Sn,Tn] are contained in a set of this form.)
For the proof of (i) we first consider the case that X = lp for some F € TS- In view of (5.28) and because {S < k2~n <T}nF = ({S < k2~n} n F)n{k2~n < T} is contained in Tk2-n, the process 1(s„,t„]X = l^SntTn]lp is a linear combination of predictable rectangles. Hence, by Definition 5.19,
/ 1(Sn,Tn]^-FdM = ^ l{s<jt2-"<T}nF(-^(jt+l)2-" ~ Mk2-») ■* k=0
= lF(MTn - MSn).
This proves (i) in the case that X = If- By linearity (i) is then also true for X that are simple over Ts- A general, bounded J-g-measurable X can be approximated by a uniformly bounded sequence of simple X. Both sides of the equality in (i) then converge in Li and hence the equality is valid for such X.
For the proof of (ii) first assume that X = If for some F £ Ts and that Y = 1(Mi„]xf„ for some Fu € Tu. In view of (5.28),
kn
1(S„,T„]1f1(m,«]xF„ = ^2 l(it2-"V™,(fc+l)2-"Au]x{S<ife2-"<T}nFnF„
k=0
k2~n Vii<(fc+1)2~" Av
is a linear combination of predictable rectangles, whence, by Definition 5.19, with the summation index k ranging over the same set as in the preceding display,
J ~i-(S„,T„]~LF~L(u,v]xF„dM
= ^2 l{S<k2-"<T}nFnF„(Af(iH-l)2-"Au — Mfc2-»Vn) k
= If ^2 l{s<fc2-"<T}nF„(Af(ife+i)2-"Au — Mki-nVu)
k
= If J 1(s„,t„]xf„ dM.
This proves (ii) for X and Y of the given forms. The general case follows again by linear extension and approximation.
For (hi) it suffices to show that Nt„ = f l(pttn]X dM almost surely.
60        5:  Stochastic Integrals
Since N0 = 0,
Nt„ = 1{*2-"<T}(-W(A;+l)2-" --Nfc2-»)
A:
= l{fc2-»<T}l(fc2-»,(fc+l)2-»]-X"d-^ = y !(0,T„]^d-M",
where the second equality follows from (ii), and the last equality by (5.28) after changing the order of summation and integration.
Because hm does not charge {0} x fi, l{0}xn^ = 0 in ^([0,oo) x ^,V,hm) for any X and hence 0 = J 1{o}xqX dM in L?, by the isometry. This proves (iv). ■
The preceding lemma remains valid for unbounded processes X, Y or unbounded stopping times S,T, provided the processes involved in the statements are appropriately square-integrable. In each case this is true under several combinations of conditions on X, Y, S, T and M.
5.29 Lemma (Substitution). Let M be a cadlagmartingale in Li and let N = Y ■ M be a cadlag version of the stochastic integral of a predictable process Y with ~L[o,t]Y € 1/2([0, oo) x U,V,hm) for every t>0. Then
(i) [in is absolutely continuous relative to nu and dfiN = Y2 dfiM-
(ii) JXdN = JXYdM almost surely for every X  € L2([0,oo) x H,,V,hn)-
Proof. By Lemma 5.27(h), for every bounded predictable process Y and every s < t and Fs € Ts,
(5.30) 1F, j\{sAY dM = J l{sAxFY dM.
This can be extended to predictable Y as in the statement of the lemma by approximation. Specifically, if Yn is Y truncated to the interval [—n, n], then l(s,t]^n ->■ 1(s,t]y in L2([0,oo) x n,V,hm) and hence also lFal(S}t]Yn ->■ l-fa~L(s,i\Y in this space. By the isometry property of the stochastic integral it follows that f l(„tt]YndM and f l(s}t]xfaYndM converge in L2 to the corresponding expressions with Y instead of Yn, as n —> oo. Therefore, if (5.30) is valid for Yn instead of Y for every n, then it is valid for Y.
We can rewrite the left side of (5.30) as lpa(Nt — Ns). Therefore, for every predictable rectangle (s,i\ x Fs,
liN((s,t] xFs) =ElFa{Nt-Nsf = E(Jl[sAxFYdM)2
= J 1(s,i]xf/2 dflM,
5.4: Locally Square-integrable Martingales
61
by the isometry property of the stochastic integral. The predictable rectangles are an intersection stable generator of the predictable u-field and [0, oo) x fi is a countable union of predictable rectangles of finite measures under fiN and Y2 • fiM- Thus these measures must agree on all predictable sets, as asserted in (i).
For the proof of (ii) first assume that X = l(S}t]xFa f°r € Ts. Then the equality in (ii) reads
lFa(Nt - Ns) = Jl{s!t]xFYdM, a.s..
The left side of this display is exactly the left side of (5.30) and hence (ii) is correct for this choice of X. By linearity this extends to all X that are simple over the predictable rectangles.
A general X € Li ([0, oo) x fl,V, [In) can be approximated in this space by a sequence of simple Xn. Then by (i)
J \XnY - XY\2 dfiM = J \Xn-X\2dfiN -+0.
Thus, by the isometry property of the stochastic integral, we can take limits as n —> oo in the identities / XnY dM = f Xn dN to obtain the desired identity for general X and Y. m
5.4 Locally Square-integrable Martingales
In this section we extend the stochastic integral by localization to more general processes X and M.
Given a cadlag local I/2-martingale M we allow integrands X that are predictable processes and are such that there exists a sequence of stopping times 0 < Tn f oo such that, for every n,
(i) MT" is a martingale in L?,
(ii) l[0itATn]X € L2([0,oo) x i},V,nmtn) for every t > 0.
A sequence of stopping times Tn of this type is called a localizing sequence for the pair (X, M). If such a sequence exists, then
j l[0}tATn]X dMT"
is a well-defined element of 1/2(0, F, P), for every n, by Definition 5.24. We define /0* X dM as the almost sure limit as n —> oo of these random variables.
62        5:  Stochastic Integrals
5.31 Definition. Given a cadlag local L2-martingale M and a predictable process X for which there exists a localizing sequence Tn for the pair (X, M), the stochastic integral fQ X dM is defined as the almost sure limit of the sequence of random variables f l[q^atn]X dMTn, as n —>■ oo. The stochastic process 1f*X dM is denoted by X ■ M.
It is not immediately clear that this definition is well posed. Not only do we need to show that the almost sure limit exists, but we must also show that the limit does not depend on the localizing sequence. This issue requires scrutiny of the definitions, but turns out to be easily resolvable. An integral of the type f ~L[o,s]X dMT ought to depend only on 5 A T and the values of the processes X and M on the set [0,5 A T], because the integrand ~L[o,s]X vanishes outside [0,5] and the integrator MT is constant outside [0,T]. In analogy with the ordinary integral, a nonzero integral should require both a nonzero integrand and a nonzero measure.
This reasoning suggests that, for every n > m, on the event {t < Tm}, where t A Tm = t A Tn, the variable / l[0}tatm]X dMTm is the same as the variable / I^^at^X dMTn. Then the limit as n —> oo trivially exists on the event {t < Tm}. Because Um{t <Tm} = tt the limit exists everywhere.
The following lemma makes these arguments precise.
5.32 Lemma. Let M be a cadlag process and X a predictable process, and let 5, T, U, V be stopping times such that 5 and U are bounded, MT and Mv are martingales in L2 and such that I^q^X and l^^X are contained in L2([0,oo) x U,V,Hmt) and ^([0,oo) x U,V,Hmv), respectively. Then / 1[0 s]X dMT = f 1[0 u]X dMv almost surely on the event {5 A T = U A V}.'
Proof. First assume that X is a predictable rectangle of the form X = l(s,t]xF„- By Lemma 5.27(ii) and next (i),
J l[o,s]l(.,t]xF. dMT = lFa J l[0,s]l(.,t] dMT = lFa (MgM - MjAs) = If, {Ms At at — MSasat)-
The right side depends on on (5,T) only through SAT. Clearly the same calculation with the stopping times U and V gives the same result on the event {5 AT = U A V}.
Next let X be a bounded predictable process. Then, for every given t > 0, the process ~L[o,t]X is automatically contained in L2([0,oo) x ^■>'P->Vmt +Vmv) and by (a minor extension of) Lemma 5.21 there exists a bounded sequence of simple processes Xn with Xn —> ~L[o,t]X in L2 ([0, oo) x ^)'P)/umt + Vmv)- H t > S, then this implies that l[0]sjXn —> l[0is]X in L2([0,oo) x SI,V,Hmt) and hence / l[0,s]Xn dMT -»■ f l[o,s]XdMT in L2, by the isometry. We can argue in the same way with 5 and T replaced by U
5.4: Locally Square-integrable Martingales
63
and V. Thus the equality of f l[0}S]XndMT and f l[0}U]XndMv for every n on the event {S A T = U A V} carries over onto X.
A general X as in the lemma can be truncated to [—n, n] and next we take limits. ■
Thus the reasoning given previously is justified and shows that the almost sure limit of / l[o^atn]X dMTn exists. To see that the limit is also independent of the localizing sequence, suppose that Sn and Tn are two localizing sequences for the pair of processes (X, M). Then the lemma implies that on the event An = {t A Sn = t A Tn}, which contains {t < Sn A Tn},
J l[0!tASjIdMs» = J l[0MTn]XdMT\ a.s..
It follows that the almost sure limits of left and right sides of the display, as n —> oo, are the same almost surely on the event An for every n, and hence on the event UnAn = fi. Thus the two localizing sequences yield the same definition of /„ X dM.
In a similar way we can prove that we get the same stochastic integral if we use separate localizing sequences for X and M. (See Exercise 5.33.) In particular, if M is a martingale in L2, X is a predictable process, and 0 < Tn f oo is a sequence of stopping times such that l[0}tATn]X € L2 ([0, oo) x ^,V,hm) for every t and every n, then
J l[0jtATn]XdM,
which is well-defined by Definition 5.24, converges almost surely to f* X dM as defined in Definition 5.31. So "if it is not necessary to localize M, then not doing so yields the same result".
5.33 EXERCISE. Suppose that M is a local I/2-martingale with localizing sequence Tn, X a predictable process, and 0 < Sn t oo are stopping times such that l[0}tASri]X € L2([0,oo) x tt,V,HMT»-) for every t > 0 and n. Show that linin-j.oo f l[o,tAS„]^^T" exists almost surely and is equal to /J X dM. (Note that Sn A Tn is a localizing sequence for the pair (X, M), so that /q X dM is well defined in view of Exercise 5.34.)
5.34 EXERCISE. Let M be a cadlag process and S and T stopping times such that Ms and MT are I/2-martingales. Show that
(i) nMs (A n [0, S A T]) = hmt (A n [0, S A T]) for every A € V.
(ii) if M is an I/2-martingale, then fiMs(A) = hm{A fl [0,5]) for every A£V.
The present extension of the stochastic integral possesses similar properties as in the preceding section.
64        5:  Stochastic Integrals
5.35 Theorem. Suppose that M is a cadlag local L^-martingale and X a predictable process for which there exists a localizing sequence Tn for the pair (X, M).
(i) There exists a cadlag version of X ■ M.
(ii) Any cadlag version of X ■ M is a local L^-martingale relative to the localizing sequence Tn.
(Hi) If M is continuous, then there exists a continuous version of X ■ M. (iv) The processes A(X ■ M), where X ■ M is chosen cadlag, and XAM are indistinguishable.
Proof. For every n let Yn be a cadlag version of the process t / 1[o,*at„]^dMTn. By Theorem 5.25 such a version exists; it is an L2-martingale; and we can and do choose it continuous if M is continuous. For fixed t > 0 the variable Tm A t is a stopping time and hence by Lemma 5.27(iii)
By Lemma 5.32 the right side of this display changes at most on a null set if we replace MTn by MTm. For m < n we have Tm A T„ = Tm and hence the integrand is identical to l\o,tATm]X. If we make both changes, then the right side becomes Ym,t- We conclude that YnjTm/\t = Ymjt almost surely, for every fixed t and m <n. This shows that the stopped martingale Yjm is a version of the stopped martingale Y^m, for m <n. Because both martingales possess cadlag sample paths, the two stopped processes are indistinguishable. This implies that Yn and Ym agree on the set [0,Tm] except possibly for points (t,uS) with oj ranging over a null set. The union of all null sets attached to some pair (m, n) is still a null set. Apart from points (t,io) with oj contained in this null set, the limit Y as n —> oo of Yn,t{oj) exists and agrees with Ymit(w) on [0,Tm]. The latter implies that it is cadlag, and YTm is indistinguishable of Ym. Furthermore, the jump process of Y is indistinguishable of the jump process of Ym on the set [0,Tm] and hence is equal to l[0tTm\XAMTm = XAM on the set [0,Tm], by Theorem 5.25(iv).
By definition this limit Y is a version of X ■ M. m
The properties as in Lemmas 5.27 and 5.29 also extend to the present more general integral. For instance, in a condensed notation we have, for T a stopping time and for processes X, Y and M for which the expressions are defined,
a.s.
(5.36)
(X ■ M)T = X ■ MT = {l[0,T]X) ■ M, X-(Y ■ M) = (XY) ■ M, A{X ■ M) = XAM.
5.5: Brownian Motion
65
We shall formalize this later, after introducing the final extension of the stochastic integral.
5.37 Example (Continuous processes). The stochastic integral X ■ M is defined for every pair of a continuous process X and a continuous local martingale M with Mq = 0.
Such a pair can be localized by the stopping times
Tn = mf{t > 0: \Xt\ > n, \Mt\ > n}.
If 0 < t < Tn, then \Xt\ < n and \Mt\ < n, by the continuity of the sample paths of the processes. It follows that MTn is an L2-bounded martingale and
|1(o,t„]-X'| < n, »MTn (0, oo) = E(M^ - M0T")2 < n\
Therefore /Umt« is a finite measure and 1(o,t„]^ is bounded and hence in L2([0,oo) x il,V,nmtn). Trivially l{o}^ € L2([0,oo) x il,V,nmtn), because /«mt« ({0}xfi) = 0, and hence l[0}TnM]X € -^([O, oo) xU,V,fimtn) for every t > 0. □
5.38 EXERCISE. Extend the preceding example to processes that may have jumps, but of jump sizes that are uniformly bounded.
5.39 Example (Locally bounded integrators). The stochastic integral X ■ M is defined for every pair of a local I/2-martingale M and a locally bounded predictable process X.
Here "locally bounded" means that there exists a sequence of stopping times 0 < Tn f oo such that XTn is uniformly bounded, for every n. We can choose this sequence of stopping times to be the same as the localizing sequence for M. (Otherwise, we use the minimum of the two localizing sequences.) Then l^-j^^X is uniformly bounded and hence is contained in 1/2([0,oo) x tt,V,HMT»-) for every t and n. Thus Definition 5.24 applies. □
5.5  Brownian Motion
The Doleans measure of Brownian motion is the product measure A x P and hence exists as a measure on the product u-field x T, which is bigger than the predictable u-field. This can be used to define the stochastic integral / X dB relative to a Brownian motion B also for non-predictable integrands. The main aim of this section is to define the stochastic integral
66        5:  Stochastic Integrals
/J X dB for all measurable, adapted processes X such that f* X2 ds is finite almost surely.
Going from predictable to adapted measurable processes may appear an important extension. However, it turns out that any measurable, adapted process X is almost everywhere equal to a predictable process X, relative to A x P. Because we want to keep the isometry relationship of a stochastic integral, then the only possibility is to define f* X dM as f* X dM. From this perspective we obtain little new.
The key in the construction is the following lemma.
5.40 Lemma. For every measurable, adapted process X: [0, oo) x fi —> M there exists a predictable process X such that X = X almost everywhere under A x P.
Proof. The proof is based on two facts:
(i) For every bounded, measurable process X there exists a bounded optional process X such that E,(Xt\Tt) = Xt almost surely, for every t>0.
(ii) For every bounded, optional process X there exists a predictable process X such that the set {X / X} is contained in the union U„[Tn] of the graphs of countably many stopping times.
If we accept (i)-(ii), then the lemma can be proved as follows. For every bounded measurable process X, facts (i) and (ii) yield processes X and X. If X is adapted, then Xt = E(Xt| Tt) = Xt almost surely for every t > 0, by (i). Consequently, by Fubini's theorem
A x P(X ± X) = JP{uj:Xt{uj) ± Xt{uj)) d\(t) = 0.
Because the sections {t: (uj,t) € G} of the set G = U„[Tn] contain at most countably many points, they have Lebesgue measure zero and hence A x P(X / X) = 0, by another application of Fubini's theorem. Combining (i) and (ii), we see that A x P(X / X) = 0. This proves the lemma for bounded, measurable, adapted processes X. We can treat general processes X by truncating and taking limits. Specifically, if Xn is X truncated to [—n,n], then Xn —> X on [0, oo) x fi. If Xn is predictable with Xn = Xn except on a null set Bn, then Xn converges to a limit at least on the complement of UnBn. We can define X to be \imXn if this exists and 0 otherwise.
We prove (i) by the monotone class theorem, Theorem 1.23. Let H be the set of all bounded, measurable processes X for which there exists an optional process X as in (i). Then H is a vector space and contains the constants. If Xn € H with 0 < Xn f X for some bounded measurable X and Xn are the corresponding optional processes as in (i), then the process X defined as liminf Xn if this liminf is finite, and as 0 if not, is optional. By the monotone convergence theorem for conditional expectations (Xn)t =
5.5: Brownian Motion
67
E((Xn)t\Ft) t E{Xt\Ft) almost surely, for every t > 0. Hence for each t > 0, we have that Xt = E(Xt| Tt) almost surely.
In view of Theorem 1.23 it now suffices to show that the indicators of the sets [0, s) x F, for s > 0 and F € T, which form an intersection stable generator of B^ x T, are in H. By Example 2.6 there exists a cadlag process Y such that Yt = E(l;r| Tt) almost surely, for every t > 0. Then X = l[o}S)Y is right continuous and hence optional. It also satisfies Xt = E(1[0iS)xj?| Tt) almost surely. The proof of (i) is complete.
To prove (ii) we apply the monotone class theorem another time, this time with H equal to the set of bounded, optional processes X for which there exists a predictable process X as in (ii). Then H is a vector space that contains the constants. It is closed under taking bounded monotone limits, because if Xn = Xn on Gn and Xn —»■ X, then \\mXn must exist at least on n„G„ and be equal to X there. We can define X to be lim Xn if this exists and 0 otherwise. Because the stochastic integral (S, T] for two given stopping times S, T is predictable, H clearly contains all indicators of stochastic intervals [S,T), [S,T], (S,T] and (S,T). These intervals form an intersection stable generator of the optional u-field by Lemma 5.8. ■
Let X be a measurable, adapted process for which there exists a sequence of stopping times 0 < Tn f oo such that, for every t > 0 and n,
(5.41) l[o,tAT„]* € L2([0,oo) xd.Bo.xf.AxF).
By the preceding lemma there exists a predictable process X such that X = X almost everywhere under A x P. Relation (5.41) remains valid if we replace X by X. Then we can define a stochastic integral f l[o^ATn]X dB as in Definition 5.24 and the discussion following it. We define /0* X dB as the almost sure limit of these variables as n —> oo.
5.42 Definition. Given a measurable, adapted process X for which there exists a localizing sequence Tn satisfying (5.41) the stochastic integral Jq X dB is defined as the almost sure limit of the sequence of cadlag processes ii->-/ l[o}tATn]X dB.
The verification that this definition is well posed is identical to the similar verification for stochastic integrals relative to local martingales.
Condition (5.41) is exactly what is needed, but it is of interest to have a more readily verifiable condition for a process X to be a good integrand.
5.43 Lemma. Let X be a measurable and adapted process.
(i) If fQ X^ds < oo almost surely for every t > 0, then there exists a sequence of stopping times 0 < Tn f oo such that (5.41) is satisfied and hence J"0* X dB can be defined as a continuous local martingale.
68        5:  Stochastic Integrals
(ii) If f* EX2 ds < oo, then f* X dB can be defined as a continuous martingale in 1/2 •
Proof. There exists a predictable process X with X = X almost everywhere under A x P. By Fubini's theorem the sections {t:Xt(uj) / Xt(uj)} of the set {X / X} are Lebesgue null sets for P-almost every oj. Therefore, the conditions (i) or (ii) are also satisfied with X replacing X. Because X is predictable, it is progressive. This means that X: [0, t] x fi —> M is an Bt xTt-measurable map and so is X2. Consequently, by the measurability part of Fubini's theorem, the map oj Yt(oj): = fQ X2(oj) ds is .^-measurable for every t > 0, which means that the process Y is adapted. The variables Tn = inf{t >0:Yt> n} are stopping times, with 0 < Tn f oo on the event where Yt is finite for every t, by the continuity of the sample paths of Y. This is a set of probability one by assumption (i), and hence we can redefine Tn such that 0 < Tn f oo everywhere. Furthermore,
J l[0mtn]X2 d(\ xP)= EYTnAt < n.
Thus the process X satisfies (5.41), concluding the proof of (i).
For (ii) it suffices to prove that l[o,t]^ € L2 ([0, oo) x fi, B^ xF,\xP) for every t > 0. Then the same is true for X, and the result follows from Theorem 5.25(iii). (The localization applied in Definition 5.42 is unnecessary in this situation. Equivalently, we can put Tn = oo.) But / l[o,t]^2 dXx P = f* EX2 ds, by Fubini's theorem. ■
5.6  Martingales of Bounded Variation
We recall that the variation of a cadlag function A:W—tW over the interval (a, b] is defined as
/  \dAs\:=        sup        Y^\AU ~ Au-i\,
where the supremum is taken over all partitions a = to < t\ < ■ ■ ■ < tk = b of the interval. The function is called of "locally bounded variation" if its variation over every compact interval is finite. It can be shown that this is equivalent to the existence of two nondecreasing cadlag functions A\ and Ai such that A = A\ — A^. Thus every function of locally bounded variation defines a signed measure Bh> fBdA defined as the difference of the measures defined by the functions A\ and Ai- It can be shown that there is a unique decomposition, written as A = A+ — A~, such that the measures
5.6: Martingales of Bounded Variation
69
defined by A+ and A~ are orthogonal. The sum of the corresponding measures is denoted \A\ = A+ +A~ and is called the total variation of A. It can be shown that f^a b^ d\A\ is equal to the variation over (a, b] as defined in
the preceding display. In particular, the expressions \dAs \ and Jjo ^ d\A\ denote the same.
If the sample paths of the martingale M are of bounded variation, then we can also define an integral f X dM based on the usual Lebesgue-Stieltjes integral. Specifically, if for a given oj € fi the variation f\dMt\(uj) of the function 11—^ Mt(oj) is finite, then Bi->/B dMt(oj) defines a signed measure on the Borel sets (a difference of two ordinary measures) and hence we can define an integral
for every process X and oj such that the function t Xt(uj) is Borel measurable and integrable relative to the measure B >->■ fBd\Mt\(uj). (All integrals are relative to t, for fixed oj.)
If this is true for every oj, then we have two candidates for the integral / X dM, the "pathwise" Lebesgue-Stieltjes integral and the stochastic integral. These better be the same. They are under some conditions. For clarity of the following theorem we denote the two integrals by f Xs dMs and fXdM.
A process X is said to be locally bounded if there exists a sequence of stopping times 0 < Tn f oo such that XTn is uniformly bounded on [0, oo) x fi, for every n. A process X is said to be of locally bounded variation if there exists a sequence of stopping times 0 < Tn f oo such that every of the sample paths of XTn is of bounded variation on [0, oo), for every n. This can be seen to be identical to the variation of every sample path of X on every compact interval [0, t] being finite, which property is well described as locally of bounded variation.
Warning. "Locally bounded" is defined to mean "locally uniformly bounded". This appears to be stronger than existence of a localizing sequence such that each of the sample paths of every of the stopped processes is bounded. On the other hand, "locally of bounded variation" is to be understood in a nonuniform way; it is weaker than existence of a sequence of stopping times such that all sample paths of XTn are of variation bounded by a fixed constant, depending only on n.
5.44 Theorem. Let M be a cadlag local Li-martingale of locally bounded variation, and let X a locally bounded predictable process. Then for every t > 0 the stochastic integral fQ X dM and the Lebesgue-Stieltjes integral 4j Xs dMs are both well-defined and agree almost surely.
Proof. If X is a measurable process, then the Lebesgue-Stieltjes integral f Xs dMs is well-defined (up to integrability), because the map 1Xt(oj)
70        5:  Stochastic Integrals
is measurable for every oj. The integral f Xs dMs is then also measurable as a map on fi. This is clear if X is the indicator function of a product set in [0, oo) x fi. Next we can see it for a general X by an application of the monotone class theorem, Theorem 1.23.
By assumption there exist sequences of stopping times 0 < Tn f oo such that MTn is an I/2-martingale and such that XTn is uniformly bounded, for every n. It is not a loss of generality to choose these two sequences the same; otherwise we use the minimum of the two sequences. We may also assume that MTn is I/2-bounded. If not, then we replace Tn by Tn A n; the martingale MT"A™ is bounded in L2, because ~EM%nAtAn < E(MT")2 < oo for all t > 0, by the submartingale property of (M7")2.
The process l\o,Tn]X is uniformly bounded and hence is contained in the Hilbert space Li ([0, oo) x fi, V, /«mt« ) • Therefore, the stochastic integral /J X dM is well-defined according to Definition 5.24 as the almost sure limit of the sequence / 1[0!t„a*]^ dMTn.
Because f^Q ^ \dMs\ is finite for every t, and the process l[o,t]^ is uniformly bounded on the event An = {t < Tn}, the Lebesgue-Stieltjes integral J|0 tj \XS \ \dMs\ if finite on this event, and hence almost surely on fi = UnAn, for every given t. We conclude that f^Q ^ Xs dMs is well-defined and finite, almost surely. By dominated convergence it is the limit as n —> oo of the sequence f l(0}TnAt](s)Xs dMs, almost surely.
We conclude that it suffices to show that / 1[0!t„a*]^ dMTn and / ^(o,TnAi\{s)Xs dMs agree almost surely, for every n. For simplicity of notation, we drop the localization and prove that for any I/2-bounded martingale M with f\dMs\ < oo almost surely, and every bounded, predictable process X the stochastic integral / X dM and Lebesgue-Stieltjes integral f Xs dMs are the same almost surely, where we interprete the mass that s '—y Ms puts at 0 to be zero.
We apply the monotone class theorem, with H the set of all bounded predictable X for which the integrals agree almost surely. Then H contains all indicators of predictable rectangles, because both integrals agree with the Riemann-Stieltjes integral for such integrands. Because both integrals are linear, H is a vector space. Because fiM ([0, oo) x fi) = E(Moc — M0)2 < oo, the Doleans measure of M is finite, and hence the constant functions are integrable. If 0 < Xn f X for a bounded X and {Xn} C ~H, then Xn —> X in 1/2 ([0, oo) x Hj'Pj/xm) by the dominated convergence theorem, and hence /XndM -»■ /XdM in L2. Furthermore, /XU}SdMs -»■ /XsdMs pointwise on $7, by the dominated convergence theorem, because / \dMs\ < oo. Because I/2-limits and pointwise limits must agree, it follows that the two integrals agree almost surely. The unit function is a limit of a sequence of indicators of predictable rectangles and hence we can first infer that the constant functions are in H. Next an application of Theorem 1.23 shows that H contains all bounded predictable processes. ■
5.6: Martingales of Bounded Variation
71
As a corollary of the preceding theorem we see that the Lebesgue-Stieltjes integral of a locally bounded predictable process relative to a cad-lag local I/2-martingale of locally bounded variation is a local martingale. Indeed, under these conditions the two types of integrals coincide and the stochastic integral is a local martingale. In the next section we want to drop the "L2" from the conditions and for this reason we now give a direct proof of this martingale property for integrators that are only local martingales.
5.45 Lemma. If M is a cadlag local martingale of locally bounded variation and X is a locally bounded predictable process, then the Lebesgue-Stieltjes integrals (X ■ M)t: = f^Q ^ Xs dMs define a cadlag local martingale X-M.
Proof. Write /„ for J(0i]. Let 0 < Tn f 00 be a sequence of stopping times
such that MTn is a martingale and such that XTn is uniformly bounded, for every n. Because (X ■ M)Jn = fQ Xjn dMjn, the lemma will follow if
t 1—^ /q Xs dMs is a cadlag martingale for every given pair of a bounded predictable process X and martingale of locally bounded variation m. This is clear if X is the indicator of a predictable rectangle. In that case the Lebesgue-Stieltjes integral is a Riemann-Stieltjes integral, and coincides with the elementary stochastic integral, which is a martingale. The set H of all bounded predictable X for which X ■ M is a martingale is a vector space and contains the constants. If 0 < Xn f X for a uniformly bounded process X, then fQ XnjS dMs —> fQ Xs dMs pointwise on fi and in L\, for every t > 0, by two applications of the dominated convergence theorem. We conclude that the set H is closed under bounded monotone limits and hence it contains all bounded predictable processes, by the monotone class theorem, Theorem 1.23. ■
Warning. The predictability of the integrand is important. For instance, if N is a standard Poisson process and T is the time of its first jump, then the process m defined by Mt = Nt — t and the process MT are martingales. The Lebesgue-Stieltjes integral fQ Ns dMj = lt>rNT = 1*>t is certainly not a martingale (as can be seen from the fact that El{t > T} = 1 — e~* is not constant) and hence this Lebesgue-Stieltjes integral lacks the most striking property of the stochastic integral. In comparison iV_ is a predictable process and /0 iVs_ dMj = 0 is certainly a martingale.
The most important example of a continuous martingale is Brownian motion and this has sample paths of unbounded variation. The latter property is not special to Brownian motion, but is shared by all continuous martingales, or more generally all predictable local martingales. We can prove this important and interesting result by a comparison of stochastic and Lebesgue-Stieltjes integrals.
72        5:  Stochastic Integrals
5.46 Theorem. Let M be a cadlagpredictable process that is both a local martingale and a process of locally bounded variation, and 0 at 0. Then M = 0 up to indistinguishability.
Proof. First assume that M is continuous. By assumption there exists a sequence 0 < Tn f oo of stopping times such that MTn is both a martingale and of bounded variation. If necessary we can replace Tn by the minimum of Tn and inf{t > 0: \ Mt\ > n} to ensure also that MTn is bounded, and hence in L2- Because MTn is of bounded variation, the integration by parts formula for Lebesgue-Stieltjes integrals yields (with fQ denoting f^ot^)
(MT")2= f MT" dMTn + f MTndMTn. Jo Jo
Under the present assumption that M is continuous, the integrands in these integrals are continuous and hence predictable. (The two integrals are also identical, but we write them differently because the identity is valid even for discontinuous M, and we need it in the second part of the proof.) Therefore, the integrals on the right can be viewed equivalently as Lebesgue-Stieltjes or stochastic integrals, by Theorem 5.44. The interpretation as stochastic integrals shows that the right side is a martingale. This implies that EM|,nAi = 0 and hence Mt = 0 almost surely, for every t.
The proof if M is not continuous is similar but requires additional steps, and should be skipped at first reading. A stopped predictable process is automatically predictable. (This is easy to verify for indicators of predictable rectangles and next can be extended to general predictable processes by a monotone class argument.) Therefore, the integrands in the preceding display are predictable also if M is not continuous. On the other hand, if M is not continuous, then MTn as constructed previously is not necessarily bounded and we cannot apply Theorem 5.44 to conclude that the Lebesgue-Stieltjes integral fQ MTn dMTn is a martingale. We can solve this by "stopping earlier", if necessary. The stopping time Sn = ini{t > 0: \Mt\ > n} is predictable, as [Sn] = [0,Sn] n M~1([—n,n]c) is predictable. (See the last assertion of Lemma 5.7.) Thus Sn is the monotone limit of a sequence of stopping times {Sm,n}m=i strictly smaller than Sn on {Sn > 0} = fi. Then Rn = maxi^xn Sij defines a sequence of stopping times with 0 < Rn t oo and |Mfl"| < n for every n by the definition of Sn and the fact that Rn < Sn. Now we may replace the original sequence of stopping times Tn by the minimum of Tn and Rn, and conclude the argument as before. ■
5.7: Semimartingales 73
5.7 Semimartingales
The ultimate generalization of the stochastic integral uses "semimartingales" as integrators. Because these are defined as sums of local martingales and bounded variation processes, this does not add much to what we have already in place. However, the concept of a semimartingale does allow some unification, for instance in the statement of Ito's formula.
5.47 Definition. A cadlag adapted stochastic process X is a semimartingale if it has a representation of the form X = X0 + M + A for a cadlag local martingale M and a cadlag adapted process of locally bounded variation A.
The representation X = X0 + M + A oi & semimartingale is non-unique. It helps to require that M0 = A0 = 0, but this does not resolve the nonuniqueness. This is because there exist martingales that are locally of bounded variation. The compensated Poisson process is a simple example.
We would like to define a stochastic integral Y ■ X as Y ■ M + Y ■ A, where the first integral Y ■ M is a stochastic integral and the second integral Y ■ A can be interpreted as a Lebesgue-Stieltjes integral. If we restrict the integrand Y to locally bounded, predictable processes, then Y-M is defined as soon as M is a local I/2-martingale, by Definition 5.31. In the given decomposition X = Xq + M + A, the martingale is not required to be locally in L-2, but one can always achieve this by proper choice of M and A, in view of the following lemma. The proof of this lemma is long and difficult and should be skipped at first reading. It suffices to remember that "local martingale" in the preceding definition may be read as "local I/2-martingale", without any consequence; and that a continuous semimartingale can be decomposed into continuous processes M and A. The latter means that a continuous semimartingale can equivalently be defined as a process that is the sum of a continuous local martingale and a continuous adapted process of locally bounded variation.
5.48 Lemma. For any cadlag semimartingale X there exists a decomposition X = Xq + M + A such that M is a cadlag local L^-martingale and A is a cadlag adapted process of locally bounded variation. Furthermore, if X is continuous, then M and A can be chosen continuous.
* Proof. We may without loss of generality assume that X is a local martingale. Define a process Z by Zt = ^2s<t ^XBl^AXa\>i- This is well-defined, because a cadlag function can have at most finitely many jumps of absolute size bigger than some fixed constant on any given compact interval. We show below that there exists a cadlag predictable process B of locally bounded variation such that Z — B is a local martingale. Next we set A = Z — B and M = X - X0 - A and show that \AM\ < 2. Then M is a locally bounded
74        5:  Stochastic Integrals
martingale and hence certainly a local I/2-martingale, and hence the first assertion of the lemma is proved.
In order to show the existence of the process B define a process Zu by Z™ = Y^s<t AXsl&xa>i- This is clearly nondecreasing. We claim that it is locally in L\ and hence a local submartingale. To see this, let 0 < Sn t oo be a sequence of stopping times such that XSn is a uniformly integrable martingale, for every n, and define Tn = ini{t > 0: Z™ > n, \Xt\ > n} A Sn. Then     V \XTn \ < n on [0, Tn) and
0 < ZTnAt <ZTn<n + \AXTn\<2n+ \XTn\.
The right side is integrable by the optional stopping theorem, because Tn < Sn and XSn is uniformly integrable.
Being a local submartingale, the process Zu possesses a compensator Bu by the Doob-Meyer decomposition, Lemma 5.69. We can apply a similar argument to the process of cumulative jumps of X less than —1, and take differences to construct a process B with the required properties.
The proof that \AM\ < 2 is based on the following facts:
(i) For every cadlag predictable process X there exists a sequence of predictable times Tn such that {{t,uj): AXt(uj) ± 0} C U„[T„]. (The sequence Tn is said to exhaust the jumps of X. See e.g. Jacod and Shiryaev, 12.24.)
(ii) If X is a predictable process and T a stopping time, then Xt is Tt--measurable, where we define X^ to be 0. (See e.g. Jacod and Shiryaev, 12.4; and 11.11 for the definition of Tt--)
(iii) For any cadlag martingale X and predictable stopping time T we have E(Xr| Tt-) = Xt- almost surely on {T < oo}. (See e.g. Jacod and Shiryaev, 12.27.)
(iv) For any cadlag martingale X and predictable stopping time T we have ~E(AXT\TT-) = 0 almost surely on the set {T < oo}. This follows by applying (ii) to the predictable process X- to see that Xt- is Tt--measurable and combining this with (iii) to compute the conditional expectation of AXt = Xt — Xt- ■
The processes X, A = Z — B and M = X — Xq — A are local martingales. If we can show that | AM7" | < 2 for every Tn in a localizing sequence 0 < Tn t oo, then it follows that \ AM\ < 2 and the proof is complete. For simplicity of notation assume that X, M and A are martingales. The process Z has been constructed so that | A(X - Z)\ < 1 and hence \E(A(X - Z)T\ ?t-) \ < 1 almost surely, for every stopping time T. By (iv) E(AMt| Tt-) = 0 almost surely on {T < oo}, for every predictable time T. Because AM = A(X — Z) + AB, it follows that |E(A.Bt| Tt-) | < 1 for every predictable time T. Since B and B- are predictable, ABt is Tt--measurable by (ii) and hence |ABt| < 1 almost surely. Consequently |AB| < 1 by (i), and hence \AM\ < \A{X -Z)\ + \AB\ < 2.
This concludes the proof of the first assertion of the theorem. Next,
5.7: Semimartingales 75
we prove that a continuous semimartingale X permits a decomposition X = Xq + M + A such that M and A are continuous.
Suppose that X is continuous and let X = X0 + M + A be a given decomposition in a local I/2-martingale M and a process of locally bounded variation A. Let 0 < Sn f oo be a sequence of stopping times such that MSn is a martingale for every n and define
Then the process MTn is a uniformly integrable martingale, is bounded in absolute value by n on [0, Tn), and
The right side is integrable by the optional stopping theorem, Theorem 4.21, whence the process A is locally integrable. We conclude that the positive and negative variation processes corresponding to A are both locally integrable. Because they are nondecreasing, they are submartingales, and permit Doob-Meyer decompositions as in Lemma 5.69. We conclude that there exists a cadlag predictable process A such that A — A is a local martingale. Now X = X0 + {M+A — A)+A is a decomposition of X into a local martingale M = M + A — A and a predictable process of locally bounded variation A. We shall show that these processes are necessarily continuous.
By predictability the variable AAt is .^--measurable for every stopping T, by (ii). If M is integrable, then E(AMt| Tt-) = 0 for every predictable stopping time, by (iv) because M is a martingale. Since AX = 0, it then follows that AAt = F,(AAt\Tt-) = 0 for every predictable time T, whence the process A and hence M are continuous, by (i). If AMt is not integrable, we can first localize the processes and apply the argument to stopped processes. ■
5.49 Definition. The integral Y ■ X of a locally bounded, predictable process Y relative to a cadlag semimartingale X with decomposition X = Xq + M + A as in Lemma 5.48 is defined as Y ■ M + Y ■ A, where the first integral Y -M is a stochastic integral defined according to Definition 5.31 and the second integral is the Lebesgue-Stieltjes integral (Y -A)t = f^Q ^ Ys dAs.
The notations (Y ■ X)t and f* Y dX are used interchangeably.
Because the decomposition of Lemma 5.48 is not unique, we must verify that the preceding definition is well posed. This follows from the fact that for any other decomposition X = Xq + M + A as in Lemma 5.48 the process M — M = A — A is a cadlag local I/2-martingale that is locally of bounded variation. Therefore, the Lebesgue-Stieltjes integral and the stochastic integral of a locally bounded predictable process Y relative to
Tn = mi\t> 0:\Mt\ > n,
Jo J[0,t„)
[ " |a4.| = /      \dAs\ + \AA
■tJ <n + \AXTn | + \AMTn \<2n+ \MTn
76        5:  Stochastic Integrals
this process coincide by Theorem 5.44 and hence Y-M+Y-A = Y-M+Y-A, if the integrals Y ■ M, Y ■ A,Y ■ M, and Y ■ A are interpreted as stochastic or Lebesgue-Stieltjes integrals, as in the definition.
5.50 EXERCISE. Suppose that M is a cadlag local martingale that is locally of bounded variation, and Y is a locally bounded process. Show that the integral Y ■ M as defined by the preceding definition coincides with the Lebesgue-Stieltjes integral /0 Y„ dM„. (Hint: this is a trivial consequence of the fact that the definition is well posed. Don't be confused by the fact that M is a martingale.)
5.51 Theorem. If X is a cadlag semimartingale and Y is a predictable locally bounded process, then:
(i) There exists a cadlag version ofY-X.
(ii) This version is a semimartingale.
(Hi) If X is a local martingale, then this version is a local martingale.
(iv) If X is continuous, then there exists a continuous version ofY-X.
(v) The processes A(Y ■ X), where Y ■ X is a cadlag version, and YAX are indistinguishable.
Proof. Let X = X0 + M + A be an arbitrary decomposition in a cadlag local I/2-martingale M and a cadlag adapted process of locally bounded variation A. By definition Y-X = Y-M+Y-A, where the first is a stochastic integral and the second a Lebesgue-Stieltjes integral. By Theorem 5.35 the stochastic integral Y ■ M permits a cadlag version and this is a local Li-martingale; it permits a continuous version if M is continuous; and its jump process is YAM. The Lebesgue-Stieltjes integral Y ■ A is of locally bounded variation and cadlag; it is continuous if A is continuous; it is a local martingale if A is a local martingale, by Lemma 5.45; and its jump process is YAA. Finally, if X is continuous, then the processes M and A can be chosen continuous. ■
Now that we have completely dressed up the definition of the stochastic integral, it is useful to summarize some properties.
5.52 Lemma. If X is a semimartingale, and Yn is a sequence of predictable processes such that Yn —»■ Y pointwise on [0, oo) x fi and \Yn\ < K for a locally bounded predictable process K and every n, then the cadlag versions ofYn ■ X and Y ■ X satisfy sups<t | (Y„ • X)s - (Y ■ X)8 | 4 0, for every t > 0.
Proof. We can decompose X = X0 + M + A for a cadlag local Li-martingale M and a cadlag process of locally bounded variation A, 0 at 0. That A is of locally bounded variation implies that /0 |d74s|(w) < oo and that K is locally bounded implies that sups<iKs(oj) < oo, both for
every fixed oj and t. It follows that     Ks(oj) \dAs\(oj) < oo (the integral
5.8: Quadratic Variation 77
is relative to s). Because, for fixed oj, the map s YnjS(oj) is dominated by the map s K„ (oj) , the dominated convergence theorem implies that /o |5^i,s(w) — ys(w) | |<L4s|(w) —> 0. This being true for every u, we conclude that sups<i|(Y„ • A)s — (Y ■ A)B \ converges to zero almost surely and hence in probability.
There exists a sequence of stopping times 0 < Tm f oo such that MTm is an I/2-bounded martingale and KTm is a uniformly bounded process, for every m. Then, because l[o,TmAt]^n is bounded by KTm, the dominated convergence theorem yields
n.At]Yn ~ l[0,TmAt]^)2 dflmtm -»■ 0.
On the set [0,Tm] the stochastic integral Yn ■ M can be defined as s 4 / l[o,TmAs]^n dMTm, and similarly with Y instead of Yn. (See Lemma 5.32 or the proof of Theorem 5.35.) For the cadlag versions of these processes, the maximal inequality (4.38) yields, for every fixed m, as n —> oo,
Esup|(Y„-M)s-(Y-M)s|2W,
s<4
< Esup
s<t
< 4E
J l[0}TmAs]YndMT™ - j l[0}TmAs]Y dMT' J !%TmM]YndMT™ - J l[0,TmAt]YdMT'
0.
This being true for every m implies that sups<i|(Y„ • M)s — (Y ■ M)s\ converges to zero in probability. ■
5.53 EXERCISE. Show that every left-continuous adapted process that is 0 at 0 is locally bounded.
5.54 Lemma. For every locally bounded predictable processes X and Y, semimartingale Z and stopping T, up to indistinguishability:
(i) (Y-Z)T = Y-ZT= (i[0i71y) • z.
(ii) X-(Y -Z) = (XY) ■ Z.
(Hi) A(Y • Z) = YAZ, ifY Z is chosen cadlag.
(iv) (V1(s,t\) " Z = yi(s,T] " Z f°T every Ts-measurable random variable V.
Proof. The statements follow from the similar statements on stochastic integrals, properties of Lebesgue-Stieltjes integrals, and localization arguments. We omit the details. ■
78
5:  Stochastic Integrals
5.8  Quadratic Variation
To every semimartingale or local I/2-martingale X correspond processes [X] and (X), which play an important role in stochastic calculus. They are known as the "quadratic variation process" and "predictable quadratic variation process", and are also referred to as the square bracket process and the angle bracket process. In this section we discuss the first of the two. In the next section we shall see that the two processes are the same for continuous I/2-local martingales.
5.55 Definition. The quadratic covariation of two semimartingales X and Y is a cadlag version of the process
The process [X, X], abbreviated to [X], is called the quadratic variation of
As usual we need to check that the definition is well posed. In this case this concerns the semimartingale integrals X- ■ Y and Y_ • X; these are well-defined by Definition 5.49, because a left-continuous adapted process that is 0 at 0 (such as X_ and Y_) is predictable and locally bounded.
We refer to the formula (5.56) as the integration-by-parts formula. The ordinary integration-by-parts formula for processes X and Y of locally bounded variation, from Lebesgue-Stieltjes theory, asserts that
Comparing this to ((5.56) we see that in this case the quadratic variation [X, Y] is the last term on the right. (Cf. Example 5.65 for more details.) One way of looking at the quadratic variation process for general semimartingales is to view it as the process that "makes the partial integral formula true". Many semimartingales are not locally of bounded variation, and then the quadratic covariation does not reduce to a function of the jump processes, as in the preceding display. In particular, the quadratic covariation of a continuous semimartingales is typically nonzero.
The name "quadratic covariation" is better explained by the following theorem, which may also be viewed as an alternative definition of this process.
5.57 Theorem. For any pair of cadlag semimartingales X and Y, any sequence of partitions 0 = tn < tn < ■ ■ ■ < f£ = t of mesh widths tending to zero, and any t > 0, as n —> oo,
(5.56)
[X, Y]=XY - X0Y0 - X_ ■ Y - y_ • X.
X.
k,
(5.58)
i=l
5.8: Quadratic Variation 79
Proof. Because ixy = (x + y)2 — (x — y)2 for any numbers x,y, the case of two semimartingales X and Y can be reduced to the case that X = Y. For simplicity of notation we only consider the latter case. By the identity (x — y)2 = x2 — y2 — 2y(x — y) we can write
kn kn kn
£(Xt? - Xt7_t)2 = J2(X27 - X27J -2j2xt7_i(Xt7 - Xt7_t)
i=l i=l i=l
(5.59) = X2 - X2 - 2{Xn ■ X)t,
for Xn the simple predictable process defined by
Xn = J2xt^_11(t?_1,t7y
i=l
The sequence of processes Xn converges pointwise on [0, t] x U to the process X- (where X0- = 0). The process K defined by Kt = sups<iXs_ is adapted and left continuous and hence predictable and locally bounded, and it dominates Xn. Lemma 5.52 implies that the sequence (Xn ■ X)t converges in probability to (X- ■ X)t. m
5.60 Example (Brownian motion). The quadratic variation process of Brownian motion is computed in Theorem 4.28 and is given by [B]t = t. This is special, because it is a deterministic process. We shall see later that Brownian motion is the only continuous local martingale with quadratic variation process the identity function.
In view of the representation in (5.56) and the continuity of Brownian motion,
B2 = 2 J BdB + t.
Compare this to the formula f2(t) = 2 f(s) df(s) (where df(s) = f'(s) ds) for a continuously differentiable function /, and be at least a little bit surprised. Ito's formula in Section 5.10 is the generalization of this result and has a similar "correction term" relative to ordinary calculus. □
5.61 EXERCISE. Show that the quadratic variation process of both the Poisson process n and the compensated Poisson process {nt — t: t > 0} is n itself. (Hint: subtraction of the smooth function t does not change the limit of the sum of squares; n is a jump process of jump sizes 1 = l2.)
5.62 EXERCISE. Show that 4[X, Y] = [X + Y] - [X - Y].
5.63 Example (Multivariate Brownian motion). The quadratic covariation between the coordinates of a multivariate Brownian motion
80        5:  Stochastic Integrals
(B1,... ,Bd) is given by [B\B^]t = Sijt, for Sij = 0 or 1 if i = j or i / j the Kronecker delta.
This can be seen in a variety of ways. For instance, the covariation between two independent martingales is zero in general. A simple proof, which makes use of the special properties of Brownian motion, is to note that {Bi -Bi)/V2 and (B' + Bi)/^ are both Brownian motions in their own right and hence [Bl - Bj] = [Bi + Bj], whence [B\Bj] = 0 by Exercise 5.62, for i / j. □
It is clear from the defining relation (5.58) that the quadratic variation process [X] can be chosen nondecreasing almost surely. By the "polarization identity" of Exercise 5.62, the quadratic covariation process [X, Y] is the difference of two nondecreasing processes and hence is of locally bounded variation. The following lemma lists some further properties.
5.64 Lemma. Let X and Y be cadlag semimartingales.
(i) [XT, Y] = [X, Y]T = [XT, YT] for every stopping time T.
(ii) If X and Y are local martingales, then XY — [X, Y] is a local martingale.
(Hi) If X and Y are L^-martingales, then XY — [X,Y] is a martingale.
(iv) If X and Y are L2-bounded martingales, then [X,Y] is Li-bounded.
(v) If X and Y are continuous, then [X, Y] is continuous.
(vi) The processes A[X,Y] and AX AY are indistinguishable.
Proof. Assertion (i) can be proved using (5.56), or from (5.58) after verifying that this relation remains true for partitions with a random endpoint.
For statements (ii)-(vi) it suffices to consider the case that X = Y.
Assertion (ii) is a consequence of the representation (5.56) of X2 — [X] in terms of the stochastic integral X_ • X and Theorem 5.51(iii).
If X is a square-integrable martingale, then the term (Xn-X)t in (5.59) has mean zero by the orthogonality of the martingale increment Xtv- —Xtv-to .Ft»  . Then, by Fatou's lemma and (5.59),
E[X]t < liminf EV(X,. - Xt- J2 = E(X2 - X2).
i=l
This proves (iv) and also that the process [X] is in L\ if X is in L2- To see that in the latter case X2 — [X] is a martingale, as claimed in (iii), it suffices to show that X_ • X is a martingale. By (ii) it is a local martingale. If Tn is a localizing sequence, then, by (5.56) and (i),
2 (X- ■ X)Jn = \X^nAt - X2 - [X]t„m\ < Xj*n/\t + X2 + [X]t,
because [X] is nondecreasing. Because XrnAt = E(Xt| Fr„At) by the optional stopping theorem, Jensen's inequality yields that X%,nAt <
5.9: Predictable Quadratic Variation 81
E(X|| TTnAt) and hence the sequence {XTnM}n^=1 is uniformly integrable, for every fixed t > 0. We conclude that the right side and hence the left side of the preceding display is uniformly integrable, and the sequence of processes (X- ■ X)Tn converges in L\ to the process X- ■ X, as n —> oo. Then the martingale property of the processes (X_ ■X)Tn) carries over onto the process X_ • X. This concludes the proof of (hi).
Assertion (v) is clear from the fact that the stochastic integral X_ • X is continuous if X is continuous, by Theorem 5.51(iv). For assertion (vi) we note first that X2 = (X_)2 + 2X_AX + (AX)2, so that its jump process is given by A(X2) = 2X_AX + (AX)2. Next we use (5.56) to see that A[X] = A(X2)-2A(X_-X), and conclude by applying Lemma 5.54(iii). ■
5.65 Example (Bounded variation processes). The quadratic variation process of a cadlag semimartingale X that is locally of bounded variation
This can be proved directly from the definition of [X] as the sum of infinitesimal square increments in equation (5.58) of Theorem 5.57, but an indirect proof is easier. An intuitive explanation of the result is that for a process of locally bounded variation the sums of infinitesimal absolute increments converges to a finite limit. Therefore, for a continuous process of locally bounded variation the sums of infinitesimal square increments, as in (5.58), converges to zero. On the other hand, the squares of the pure jumps in the discrete part of a process of locally bounded variation remain.
A proof can be based on the integration-by-parts formula for cadlag functions of bounded variation. This shows that
Here the integral on the right is to be understood as a pathwise Lebesgue-Stieltjes integral, and is equal to the Lebesgue-Stieltjes integral 4j Xs_ d(Xs — Xq). Because the decomposition X = X0 + m + a of the semimartingale X can be chosen with M = 0 and a = X — X0, the latter Lebesgue-Stieltjes integral is by definition the semimartingale integral (X_ • X)t, as defined in Definition 5.49. Making this identification and comparing the preceding display to (5.56) we conclude that
is given by [X]t =
0<s<4
(AXS)2.
0<s<4
(AXS)2. □
5.66 EXERCISE. Let X be a continuous semimartingale and Y a cadlag semimartingale that is locally of bounded variation. Show that [X, Y] = 0. [Hint: one possibility is to use (5.58).]
82
5:  Stochastic Integrals
5.9 Predictable Quadratic Variation
The "angle bracket process" (M) is defined for the smaller class of local I/2-martingales M, unlike the square bracket process, which is defined for general semimartingales. If M is continuous, we can define (M) simply to be identical to [M]. For general local I/2-martingales, we define the angle bracket process by reference to the Doob-Meyer decomposition. This decomposition, given in Lemma 5.69, implies that for any local LVmartingale M there exists a unique predictable process A such that M2 — A is a local martingale. We define this process as the predictable quadratic variation of M.
5.67 Definition. The predictable quadratic variation of a cadlag local L?-martingale M is the unique cadlag nondecreasing predictable process (M), 0 at 0, such that M2 — (M) is a local martingale. The predictable quadratic covariation of a pair of cadlag local L^-martingales M and N is the process {M,N) defined by 4(M,N) = {M + N) - {M - N).
5.68 EXERCISE. Show that MN - {M,N) is a local martingale.
If M is a local martingale, then the process M2 — [M] is a local martingale, by Lemma 5.64(h). If [M] is predictable, in particular if M is continuous, then (M) = [M]. However, the process [M] is not necessarily predictable, and hence is not necessarily equal to the process (M).
To see that Definition 5.67 is well posed, we use the Doob-Meyer decomposition. The square of a local martingale is a local submartingale, by Jensen's inequality, and hence existence and uniqueness of (M) follows from (ii) of the following lemma.
A process Z is said to be of class D, if the collection of all random variable Zt with T ranging over all finite stopping times, is uniformly in-tegrable.
5.69 Lemma (Doob-Meyer).
(i) Any cadlag submartingale Z of class D can be written uniquely in the form Z = Zq + M + A for a cadlag uniformly integrable martingale M and a cadlag predictable nondecreasing process A with EA^ < oo, both 0 at 0. The process A is continuous if and only if EZt- = EZt for every finite predictable time T.
(ii) Any cadlag local submartingale Z can be written uniquely in the form Z = Zq + M + A for a cadlag local martingale M and a cadlag predictable nondecreasing process A, both 0 at 0.
Proof. For a proof of (i) see e.g. Rogers and Williams, VI-29.7 and VI-31.1. The uniqueness of the decomposition follows also from Theorem 5.46, because given two decompositions Z = Z0+M + A = Z0+M + Aoi the
5.9: Predictable Quadratic Variation 83
given form, the process M — M = A — A is & cadlag predictable process of bounded variation, 0 at 0, and hence is 0.
Given (i) we can prove (ii) by localization as follows. Suppose 0 < Tn f oo is a sequence of stopping times such that ZTn is a submartingale of class D for every n. Then by (i) it can be written as ZTn = Z0 + Mn + An for a uniformly integrable martingale Mn and a cadlag, nondecreasing integrable predictable process An. For m < n we have ZTm = (ZTn)Tm = Z0 + M^m + A^™ ■ By uniqueness of the decomposition it follows that Mjm = Mm and Af™ = Am. This allows us to define processes M and A in a consistent manner by specifying their values on the set [0, Tm] to be Mm and Am, for every m. Then MTn = Mn and hence M is a local martingale. Also Z = Zq + M + A on [0, Tm] for every m and hence on [0, oo) x fi.
We still need to show the existence of the stopping times Tn. By assumption there are stopping times 0 < Sn f oo such that ZSn is a submartingale. Define
Tn = Sn A n A mf{t > 0: \Zf" | > n}.
Then \Zf" \ < \Z^\ Vn for t € [0,Tn] and hence \Z^n \ < \Z$r \ Vn for every stopping time T. The right side is integrable because Tn is bounded and ZSn is a submartingale (and hence is in L{) by Theorem 4.20. ■
The nondecreasing, predictable process A in the Doob-Meyer decomposition given by Lemma 5.69(i) (ii) is called the compensator or "dual predictable projection" of the submartingale Z.
5.70 Example (Poisson process). The standard Poisson process is non-decreasing and integrable and hence trivially a local submartingale. The process M defined by Mt = Nt — t is a martingale, and the identity function t i-)- t, being a deterministic process, is certainly predictable. We conclude that the compensator of N is the identity function.
The process t Mf — t is also a martingale. By the same reasoning we find that the predictable quadratic variation of M is given by (M)t = t. In contrast, the quadratic variation is [M] = N. (See Exercise 5.61.) □
5.71 EXERCISE. Show that the compensator of [M] is given by (M).
5.72 EXERCISE. Show that (MT) = (M)T for every stopping time T. (Hint: a stopped predictable process is predictable.)
* 5.73 EXERCISE. Show that M2 - (M) is a martingale if M is an L2-martingale. (Hint: if M is I/2-bounded, then M2 is of class D and we can apply (i) of the Doob-Meyer lemma; a general M can be stopped.)
Both quadratic variation processes are closely related to the Doleans measure. The following lemma shows that the Doleans measure can be
84        5:  Stochastic Integrals
disintegrated as,
dfiM{s,uj) = d[M]s{oj) dP{oj) =d(M)s{oj)dP{uj).
Here d[M]s(oj) denotes the measure on [0,oo) corresponding to the non-decreasing, cadlag function t [M]s(oj), for given oj, and similarly for d{M)s(oj). The three measures in the display agree on the predictable a-field, where the Doleans measure was first defined. (See (5.14)). Off the predictable u-field the two disintegrations offer possible extensions, which may be different.
5.74 Lemma. If M is an L^-martingale, then, for all A £ V,
p poo p poo
HM{A)= J        lA{s,Lu)d[M}s{Lu)dP{L0)= J lA{s,Lu)d(M)s{Lu)dP{L0).
Proof. Because the predictable rectangles form an intersection stable generator of the predictable u-field, it suffices to verify the identity for every set of the form A = (s, t] x Fs with Fs € Ts. Now
/•oo
E /    l{s!t]xFa(u,oj)d[M]u =ElFa([M]t - [M]s). Jo
Because M2 — [M] is a martingale, by Lemma 5.64(iii), the variable (M2 — [M]t) — (M2 — [M]s) is orthogonal to Ts. This implies that we may replace [M]t — [M]s in the display by M2 — . The resulting expression is exactly fiM{{s,t] x Fs).
The argument for (M) is identical, if we note that M2 — (M) is a martingale if M is in L2- (Cf. Exercise 5.73.) ■
5.75 Example (Integration with Continuous Integrators). We have seen in Example 5.37 that a continuous local martingale M, 0 at 0, is a local I/2-martingale, and hence can act as an integrator. It can now be seen that any predictable process X with, for every t > 0,
t
Xl d[M]a < oo, a.s.
is a good integrand relative to M. This is to say that under this condition there exists a localizing sequence 0 < Tn f oo for the pair (X, M) and hence Definition 5.31 of the stochastic integral applies. An appropriate localizing sequence is
Tn = inf |« > 0: \Mt\ > n,     X2S d[M]s > nj.
For this sequence we have that MTn is bounded and l[ojATn]X is contained in I/2([0,oo) x fi,?,^^) in view of Lemma 5.74, because / l[o,T„](«)^2 d[M]s < n and hence has expectation f X2 dfimtn bounded by n. □
5.9: Predictable Quadratic Variation 85
5.76 Lemma. The quadratic variation process of a cadlag local martingale is the unique adapted process A of locally bounded variation, 0 at 0, such that M2 - A is a local martingale and AA = (AM)2.
Proof. The quadratic variation process [M] possesses the listed properties, by Lemma 5.64(h) and (vi). Given an other process A with these properties the process [M] — A is the difference of two local martingales and hence a local martingale. It is also of locally bounded variation and 0 at 0. Moreover, it is continuous, because A[M] = (AM)2 = AA. Theorem 5.46 shows that [M]-A = 0. m
Because the quadratic covariation process [X, Y] is of locally bounded variation, integrals of the type Zsd[X, Y]s can be defined as Lebesgue-Stieltjes integrals, for every measurable (integrable) process Z. (The s in the notation is to indicate that the integral is a Lebesgue-Stieltjes integral relative to s, for every fixed pair or sample paths of Z and [X, Y].) The integrals in the following lemmas can be understood in this way.
5.77 Lemma. Let M and N be local L^-martingales and let X and Y be locally bounded predictable processes.
(i) [X-M,Y- N]t = f*XSYS d[M,N]s.
(ii) (X-M,Y-N)t = f* XSYS d(M, N)s.
Proof. For simplicity of notation we give the proof in the case that X = Y and M = N. Furthermore, we abbreviate the process t f X2 d[M]s to X2 ■ [M], and define X2 ■ (M) similarly.
Because a compensator of a local submartingale is unique, for (ii) it suffices to show that the process X2 ■ (M) is predictable and that the process (X ■ M)2 - X2 ■ (M) is a local martingale.
Similarly, for (i) it suffices to show that the process (X ■ M)2 - X2 ■ [M] is a local martingale and that A(X2 ■ [M]) = (A(X ■ M))2.
Now any integral relative to a predictable process of locally bounded variation is predictable, as can be seen by approximation by integrals of simple integrands. Furthermore, by properties of the Lebesgue-Stieltjes integral A{X2 ■ [M]) = X2A[M] = X2{AM)2, by Lemma 5.64(vi), while (A(X ■ M))2 = (XAM)2, by Lemma 5.54. We are left with showing that the processes {X ■ M)2 - X2 ■ (M) and {X ■ M)2 - X2 ■ [M] are local martingales.
Suppose first that M is I/2-bounded and that X is a predictable process with f X2 dfiM < oo. Then X ■ M is an L2-bounded martingale, and for every stopping time T, by Lemma 5.54(i),
e{X ■M)%. = e(JXl[0}T]dMy = JX2l[0}T]dnM
= e [ X2sd[M]8 = e{X2-[M])T, Jo
86        5:  Stochastic Integrals
where we use Lemma 5.74 for the first equality on the second line of the display. We can conclude that the process (X-M)2 — X2- [M] is a martingale by Lemma 4.22.
For a general local I/2-martingale we can find a sequence of stopping times 0 < Tn f oo such that MTn is I/2-bounded and such that l[0}Tri^X € 1/2 ([0, oo) x fi, V, /Umt« ) f°r every n. By the preceding argument the process (l[0}Tn]X ■ MTn)2 — l[0}Tn]X2 ■ [MTn] is a martingale for every n. But this is the process (X ■ M)2 — X2 ■ [M] stopped at Tn and hence this process is a local martingale.
The proof for the process (X ■ M)2 - X2 ■ (M) is similar. ■
The following lemma is of interest, but will not be used in the remainder.
* 5.78 Lemma (Kunita-Watanabe). If M and N are cadlag local martingales and X and Y are predictable processes, then
a.s.,
(j \d[M,N]u\)  < J d[M]uJ^ d[N]u, (E J \XuYu\\d[M,N]\uy < JX2diiu J Y2diiN.
Proof. For s < t abbreviate [M,N]t - [M,N]S to [M,iV]*. Let s = tn < t™ < ■ ■ ■ < t%n = t be a sequence of partitions of [s, t] of mesh widths tending to zero as n —> oo. Then, by Theorem 5.57 and the Cauchy-Schwarz inequality,
|[M,JV]*|2 = lim yVAft- - Aft- ){Nt>
< hm E(M*? " Mn J2 £Wr - Nt7 J2 = [M]*[iV]*
Here the limits may be interpreted as limits in probability, or, by choosing an appropriate subsequence of {n}, as almost sure limits. By applying this inequality to every partitioning interval (U-i,ti) in a given partition s = to < t\ < ■ ■ ■ < tk = t of [s, t], we obtain
k k
E| [M, N]%_ 11 < ^y/mlLMu-! <
by the Cauchy-Schwarz inequality. The right side is exactly the square root of f d[M]u f d[N]u. The supremum of the left side over all partitions of the
interval [s, t] is /* \d[M, N]u\. This concludes the proof of the first inequality in Lemma 5.78.
5.10: Ito's Formula for Continuous Processes
87
To prove the second assertion we first note that by the first, for any measurable processes X and Y,
Next we take expectations, use the Cauchy-Schwarz inequality on the right side, and finally rewrite the resulting expression in terms of the Doleans measures, as in Lemma 5.74. ■
5.10  Ito's Formula for Continuous Processes
Ito's formula is the cornerstone of stochastic calculus. In this section we present it for the case of continuous processes, which allows some simplification. In the first statement we also keep the martingale and the bounded variation process separated, which helps to understand the essence of the formula. The formulas for general semimartingales are more symmetric, but also more complicated at first.
For a given function /: Md —> M write for its ith partial derivative and Di^f for its (i,j)th second degree partial derivative.
5.79 Theorem (Ito's formula). Let M be a continuous local martingale and A a continuous process that is locally of bounded variation. Then, for every twice continuously differentiable function /: ]R2 —> M,
The special feature of Ito's formula is that the martingale M gives two contributions on the right hand side (the first and third terms). These result from the linear and quadratic approximations to the function on the left. An informal explanation of the formula is as follows. For a given partition 0 = to < ti < ■ ■ ■ < tk = t, we can write the left side of the theorem as
Y{f(Mti+1, Ati+1) - f(Mti+1 ,Ati))+Y{f(Mti+1 ,Ati)- f(Mti,AU))
+ YDif(Mti,Ati)(Mti+1 -Mti) + i^i?11/(Mti,Ati)(Mti+1 - Mtif.
a.s..
(5.80) «^D2/(Mti+1,Ati)(Ati+1
88
5:  Stochastic Integrals
We have dropped the quadratic approximation involving the terms {Ati+1 — Ati)2 and all higher order terms, because these should be negligible in the limit if the mesh width of the partition converges to zero. On the other hand, the quadratic approximation coming from the martingale part, the term on the far right, does give a contribution. This term is of comparable magnitude as the quadratic variation process on the left side of (5.58).
5.81 EXERCISE. Apply Theorem 5.79 to the function f(m, a) = m2. Compare the result to Theorem 5.57.
If we apply Theorem 5.79 with the function f(m,a) = g(m + a), then we find the formula
Here X = m + A is a semimartingale. If we define its quadratic variation [X] as [M], then we can also write this as
This pleasantly symmetric formula does not permit the study of transformations of pairs of processes (M, A), but this can be remedied by studying functions g{Xitt, ■ ■ ■, Xd,t) of several semimartingales Xi = {Xitt:t > 0}. In the present section we restrict ourselves to continuous semimartingales. It was shown in Lemma 5.48 that the processes M and A in the decomposition X = Xq + m + A of a continuous semimartingale can always be chosen continuous. The following definition is therefore consistent with the earlier definition of a semimartingale.
5.83 Definition. A continuous semimartingale X is a process that can be written as the sum X = X0 + m + Aofa continuous local martingale m and a continuous process A of locally bounded variation, both 0 at 0.
The decomposition X = X0 + m + Aoi& continuous semimartingale in its continuous martingale and bounded variation parts m and A is unique, because a continuous local martingale that is of locally bounded variation is necessarily constant, by Theorem 5.46.
It can also be proved that for a semimartingale X = m + A with A a continuous process of locally bounded variation, the quadratic variation [X] of X is indeed given by [M]. We leave this as an exercise.
s
(5.82)
5.10: Ito's Formula for Continuous Processes
89
5.84 EXERCISE. Show that the quadratic variation of a continuous semi-martingale X = Xq + M + A, as defined in (5.58), is given by [M], i.e. the contributions of the bounded variation part is negligible. Furthermore, show that [M,A] = 0 = [A]. (Hint: the continuity of the processes is essential.)
5.85 Theorem (Ito's formula). Let X = (Xi,..., Xj) be a vector of continuous semimartingales. Then, for every twice continuously differentiable function f: Md ->■ ffi,
f(xt)-f(x0) =J2fDif{x)dXi+\Y,Y,fDijfWd{x»x& a-s-
i=iJo i=ij=iJo
Proofs. For a proof of Theorem 5.79 based directly on the Taylor approximation (5.80), see Chung and Williams, pp94-97. Here we give a proof of the more general Theorem 5.85, but following the "convention" stated by Rogers and Williams, p61: "Convention dictates that Ito's formula should only be proved for d = 1, the general case being left as an exercise, amid bland assurances that only the notation is any more difficult."
The proof proceeds by first establishing the formula for all polynomials / and next generalization to general smooth functions by approximation. The formula is trivially true for the polynomials f(x) = 1 and f(x) = x. Next we show that the formula is correct for the function fg if it is correct for the functions / and g. Because the set of functions for which it is correct is also a vector space, we then can conclude that the formula is correct for all polynomials.
An essential step in this argument is the defining equation (5.56) for the quadratic variation process, which can be viewed as the Ito formula for polynomials of degree 2 and can be written in the form
(5.86) XtYt - X0Y0 = {X ■ Y)t + (Y ■ X)t + [X, Y]t.
Then suppose that Ito's formula is correct for the functions / and g. This means that (5.82) is valid for g (as it stands) and for / in the place of g. The formula implies that the processes f{X) and g(X) are semimartingales. For instance, if X = X0 + M + A then the process g(X) has decomposition g(X) = g(X0) + M + A given by
Mt= f g'(X)dM,      At= f g'(Xs)dAs + \ f g"(Xs)d[X]s. Jo Jo Jo
In view of Exercise 5.84, the quadratic covariation [f(X),g(X)] is the quadratic covariation between the martingale parts of f{X) and g{X), and is equal to f'(X)g'(X) ■ [X], by Lemma 5.77. Applying (5.86) with X and
90        5:  Stochastic Integrals
Y there replaced by f(X) and g(X), we find
f(Xt)g(Xt) - f(X0)g(X0)
= {f(X)-g(X))t+{g(X)-f(X))t + [f(X),g(X)]t
= {f{X)g'{X).X)t + \f{X)g"{X).[X]t
+ (g(X)f(X) ■ X)t + \g{X)f'{X) • [X]t + f'(X)g'(X) ■ [X]t,
where we have used (5.82) for / and g, and the substitution formula of Lemma 5.54(ii). By regrouping the terms this can be seen to be the Ito formula for the function fg.
Finally, we extend Ito's formula to general functions / by approximation. Because /" is continuous, there exists a sequence of polynomials /„ with /" —> /", f'n —> /' and /«—>■/ pointwise on m and uniformly on com-pacta, by an extension of the Weierstrass approximation theorem. Then fn(X), fn{X) and ft(X) converge pointwise on ft x [0,oo) to f{X), f{X) and f"{X). The proof of the theorem is complete, if we can show that all terms of Ito's formula applied with /„ converge to the corresponding terms with / instead of /„, as n —> oo. This convergence is clear for the left side of the formula. For the proof of the convergence of the integral terms, we can assume without loss of generality that the process X in the integrand satisfies Xq = 0; otherwise we replace the integrand by Xl(0tOOy
The process K = sup„ 1/4(^)1 is predictable and is bounded on sets where \X\ is bounded. If Tm = ini{t > 0: \Xt\ > m}, then, as we have assumed that Xq = 0, \X\ < m on the set [0, Tm] and hence KTm is bounded. We conclude that K is locally bounded, and hence, by Lemma 5.52, /;(X)-X4/'(X)-X,afln->oo.
Finally, for a fixed m on the event {t < Tm}, the processes s fn{X) are uniformly bounded on [0,*]. On this event /0* f"{Xs) [X]s -»■
/0 f"(Xs) d[X]s, as n —> oo, by the dominated convergence theorem, for fixed m. Because the union over m of these events is ft, the second terms on the right in the Ito formula converge in probability. ■
Ito's formula is easiest to remember in terms of differentials. For instance, the one-dimensional formula can be written as
df(Xt) = f'(Xt) dXt + y"(Xt) d[X]t.
The definition of the quadratic variation process suggests to think of [X]t as /(dXt)2. For this reason Ito's rule is sometimes informally stated as
df(Xt) = f'(Xt) dXt + \f"{Xt) {dXtf.
Since the quadratic variation of a Brownian motion B is given by [B]t = t, a Brownian motion then satisfies (dBt)2 = dt. A further rule is that (dBt)(dAt) = 0 for a process of bounded variation A, expressing that [B,A]t = 0. In particular dBtdt = 0.
5.11: Space of Square-integrable Martingales
91
5.87 Lemma. For every twice continuously differentiable function f: R —► R there exist polynomials pn: R —>■ R such that sup|a.|<n|pll^(a;) —/W (a;)| —>. 0 as n —> oo, for i = 0,1,2.
Proof. For every n € N the function [0,1] —> ffi defined by 5„(a;) = f"(x/n) is continuous and hence by Weierstrass' theorem there exists a polynomial rn such that the uniform distance on [—1,1] between gn and rn is smaller than n~3. This uniform distance is identical to the uniform distance on [—n, n] between /" and the polynomial qn defined by qn(x) = rn(xn). We now define pn to be the polynomial with p„(0) = /(O), p'n{0) = /'(O) and p" = qn. By integration of /" — p" it follows that the uniform distance between /' and p'n on [—n, n] is smaller than n~2, and by a second integration it follows that the uniform distance between / and pn on [—n, n] is bounded above by n_1. ■
* 5.11  Space of Square-integrable Martingales
Recall that we call a martingale M square-integrable if EM2 < oo for every t > 0 and Li-bounded if supi>0 EM2 < oo. We denote the set of all cadlag I/2-bounded martingales by ~H2, and the subset of all continuous I/2-bounded martingales by ~H2.
By Theorem 4.10 every L2-bounded martingale M = {Mt:t > 0} converges almost surely and in Li to a "terminal variable" and Mt = E(Moc| Tt) almost surely for all t > 0. If we require the martingale to be cadlag, then it is completely determined by the terminal variable (and the filtration, up to indistinguishability). This permits us to identify a martingale M with its terminal variable M^, and to make T-L2 into a Hilbert space, with inner product and norm
(M, JV) = EMooiVoo,       ||M|| = VEM^.
The set of continuous martingales H2 is closed in H2 relative to this norm. This follows by the maximal inequality (4.38), which shows that —> Moo in L>2 implies the convergence of sup4 |M™ — Mt \ in L2, so that continuity is retained when taking limits in T-L2. We denote the orthocom-plement of U2C in U2 by %2d, so that
n/2 _      nil  i nil nil   i nil
n — nc + nd,      nc _i_ nd.
The elements of H2, are referred to as the purely discontinuous martingales bounded in Li.
Warning. The sample paths of a purely discontinuous martingale are not "purely discontinuous", as is clear from the fact that they are cadlag by
92        5:  Stochastic Integrals
definition. Nor is it true that they change by jumps only. The compensated Poisson process (stopped at a finite time to make it I/2-bounded) is an example of a purely discontinuous martingale. (See Example 5.89.)
The quadratic covariation processes [M, N] and (M, N) offer another method of defining two martingales to be "orthogonal": by requiring that their covariation process is zero. For the decomposition of a martingale in its continuous and purely discontinuous part this type of orthogonality is equivalent to orthogonality in the inner product (■, •).
5.88 Lemma. For every M £~H2 the following statements are equivalent.
(i) Me Hi
(ii) Mq = 0 almost surely and MN is a uniformly integrable martingale for every N € V2.
(Hi) Mq = 0 almost surely and MN is a local martingale for every continuous local martingale N.
(iv) Mq = 0 almost surely and [M, JV] = 0 for every continuous local martingale N.
(v) M0 = 0 almost surely and (M, N) = 0 for every N € V2. Furthermore, statements (in) and (iv) are equivalent for every local martingale M.
Proof. If M and N are both in H2, then \MtNt\ < M2 + N2 < supt(M2 + N2), which is integrable by (4.38). Consequently, the process MN is dominated and hence uniformly integrable. If it is a local martingale, then it is automatically martingale. Thus (iii) implies (ii). Also, that (ii) is equivalent to (v) is now immediate from the the definition of the predictable covariation. That (iv) implies (v) is a consequence of Lemma 5.64(h) and the fact that the zero process is predictable. That (iv) implies (iii) is immediate from Lemma 5.64(ii).
(ii) =>■ (i). If MN is a uniformly integrable martingale, then (M,N) = EMooiVoo = ~EM0N0 and this is zero if M0 = 0.
(i) (ii). Fix M € %2d, so that EM^N^ = 0 for every N € %2. The choice N = lp for a set F € Tq yields, by the martingale property of M that EM01f = EMooIf = EMooJVoo = 0. We conclude that M0 = 0 almost surely.
For an arbitrary N € H2 and an arbitrary stopping time T, the process NT is also contained in H2 and hence, again by the martingale property of M combined with the optional stopping theorem, EMtNt = EM^Nt = EM00(iVT)00 = 0. Thus MN is a uniformly integrable martingale by Lemma 4.22.
(i)+(ii) (iii). A continuous local martingale N is automatically locally I/2-bounded and hence there exists a sequence of stopping times 0 < Tn f oo such that NTn is an I/2-bounded continuous martingale, for every n. If M is purely discontinuous, then 0 = [NTn,M] = [NTn,MTn]. Hence (MN)Tn = MTnNTn is a martingale by Lemma 5.64(h), so that
5.11: Space of Square-integrable Martingales
93
MN is a local martingale.
(iii) (iv) By Lemma 5.64(ii) the process MN — [M,N] is always a local martingale. If MN is a local martingale, then [M, N] is also a local martingale. The process [M, N] is always locally of bounded variation. If N is continuous this process is also continuous in view of Lemma 5.64(vi). Therefore [M, N] = 0 by Theorem 5.46. ■
The quadratic covariation process [M, N] is defined for processes that are not necessarily I/2-bounded, or even square-integrable. It offers a way of extending the decomposition of a martingale into a continuous and a purely discontinuous part to general local martingales. A local martingale M is said to be purely discontinuous if Mq = 0 and [M, N] = 0 for every continuous local martingale N. By the preceding lemma it is equivalent to say that M is purely discontinuous if and only if MN is a local martingale for every continuous local martingale N, and hence the definition agrees with the defintion given earlier in the case of I/2-bounded martingales.
5.89 Example (Bounded variation martingales). Every local martingale that is of locally bounded variation is purely discontinuous.
To see this, note that if N is a continuous process, 0 at 0, then max; |JVt» — JVt~_ | —>• 0 almost surely, for every sequence of partitions as in Theorem 5.57. If M is a process whose sample paths are of bounded variation on compacta, it follows that the left side in the definition (5.58) of the quadratic covariation process converges to zero, almost surely. Thus [M, N] = 0 and MN is a local martingale by Lemma 5.64(h). □
The definition of H2d as the orthocomplement of H2 and the projection theorem in Hilbert spaces, shows that any L2-bounded martingale M can be written uniquely as M = Mc + Md for Mc € U2C and Md € %2d. This decomposition can be extended to local martingales, using the extended definition of orthogonality.
5.90 Lemma. Any cadlag local martingale M possesses a unique decomposition M = Mo + Mc + Md into a continuous local martingale Mc and a purely discontinuous local martingale Md, both 0 at 0. (The uniqueness is up to indistinguishability.)
Proof. In view of Lemma 5.48 we can decompose M as M = Mo + N + A for a cadlag local I/2-martingale N and a cadlag local martingale A of locally bounded variation, both 0 at 0. By Example 5.89 A is purely discontinuous. Thus to prove existence of the decomposition it suffices to decompose N. If 0 < Tn f oo is a sequence of stopping times such that NTn is an I/2-martingale for every n, then we can decompose NTn = N£ + Nd in H2 for every n. Because this decomposition is unique and both H2 and T-L\ are closed under stopping (because [MT,N] = [M, N]T), and NTm =
94        5:  Stochastic Integrals
(JVT«)T™ = (A^)Tm + {Nd)T™ for m < n, it follows that (iV£)Tm = and (Nd)Tm = N^. This implies that we can define Nc and Nd consistently as
and on [0,Tm]. The resulting processes satisfy (JVc)Tm = and (Nd)Tm = N^. The first relation shows immediately that Nc is continuous, while the second shows that Nd is purely discontinuous, in view of the fact [Nd,K]T™ = [(Nd)T™,K] = 0 for every continuous K € H2.
Given two decompositions M = M0 +Mc + Md = M0+Nc + Nd, the process X = Mc — Nc = Nd — Md is a continuous local martingale that is purely discontinuous, 0 at 0. By the definition of "purely discontinuous" it follows that X2 is a local martingale as well. Therefore there exist sequences of stopping times 0 < Tn f oo such that Y = XTn and Y2 = (X2)Tn are uniformly integrable martingales, for every n. It follows that t FY2 is constant on [0, oo] and at the same time Yt = E(Y00|Ft) almost surely, for every t. Because a projection decreases norm, this is possible only if Yt = Yoo almost surely for every t. Thus X is constant. ■
The decomposition of a local martingale in its continuous and purely discontinuous parts makes it possible to describe the relationship between the two quadratic variation processes.
5.91 Lemma. If M and N are JocaJ L^-maitingales with decompositions M = M0 + Mc + Md and N = N0 + Nc + Nd as in Lemma 5.90, then
[M,N]t = {Mc,Nc)t + J2AM*ANs-
s<t
Proof. For simplicity we give the proof in the case that M = N. Because the process (Mc) is continuous, by Lemma 5.69, the process [M] as in the lemma satisfies A[M] = (AM)2. As in the proof of Lemma 5.77 it suffices to prove that M2 — [M] is a local martingale. Assume M0 = 0 for simplicity. The decomposition implies that M2 = (Mc)2 + 2McMd + (Md)2. The middle term 2McMd is a local martingale, because Md is purely discontinuous and Mc is continuous. The first term on the right has compensator (Mc). Therefore, it suffices to show that the purely discontinuous submartingale Md has quadratic variation [Md]t = Ss<t(^^)2• F°r this see Rogers and Williams, pp384-385, in particular the proof of Theorem 36.5. ■
Warning. For a martingale M of locally bounded variation the decomposition M = Mq + Mc + Md is not the same as the decomposition of M in its continuous and jump parts. For instance, the compensated Poisson process is purely discontinuous and hence has continuous part zero.
The local martingale M in the decomposition X = Xq + M + A of a semimartingale can be split in its continuous and purely discontinuous parts Mc and Md. Even though the decomposition of X is not unique, the continuous martingale part Mc is the same for every decomposition. This
5.12: Ito's Formula 95
is true because M — M = A — A implies that the local martingale M — M is of locally bounded variation, whence it is purely discontinuous by Example 5.89. It is called the continuous martingale part of the semimartingale X, and denoted by Xc.
* 5.12  Ito's Formula
5.92 Theorem (Ito's formula). Let X = (Xi,... ,Xd) be a vector of cadlag semimartingales. Then, for every twice continuously differentiable function f: Md ->■ ffi,
d
d
+     [/(*.) - /(*.-) - E Dif(Xs_)AXi!S],
a.s..
s<t i=l
6
Stochastic Calculus
In this chapter we discuss some examples of "stochastic calculus", the manipulation of stochastic integrals, mainly by the use of the Ito formula. The more substantial application to stochastic differential equations is discussed in Chapter 7.
We recall the differential notationior stochastic integrals. For processes X, Y, Z we write
dX = Y dZ,       iff      X = X0 + Y ■ Z.
In particular d(Y ■ Z) =Y dZ. By the substitution rule, Lemma 5.54(h), it follows that dZ = Y~x dX if dX = Y dZ for a strictly positive process Y, provided the stochastic integrals are well-defined.
For notational convenience we use complex-valued processes in some of the proofs. A complex-valued random variable Z on a probability space {n,F,P) is a function Z: ft -> C of the form Z = U + iV for ordinary, real-valued random variables U and V. Its expectation is defined as EZ = E£7 + iEV', if U and V are integrable. Conditional expectations E(Z| ^o) are defined similarly from the conditional expectations of the real and imaginary parts of Z. A complex-valued stochastic process is a collection Z = {Zt: t > 0} of complex-valued random variables. A complex-valued martingale Z is a complex-valued process whose real and imaginary parts are martingales. Given the preceding definitions of (conditional) expectations, this is equivalent to the process satisfying the martingale property E(Zt\Ts) = Zs for s<t.
With these definitions it can be verified that Ito's formula extends to twice continuously differentiable complex-valued functions /: Md —> C. We simply apply the formula to the real and imaginary parts of / and next combine.
6.1: Levy's Theorem 97
6.1  Levy's Theorem
The quadratic variation process of a Brownian motion is the identity function. Levy's theorem asserts that Brownian motion is the only continuous local martingale with this quadratic variation process. It is a useful tool to show that a given process is a Brownian motion. The continuity is essential, because the compensated Poisson process is another example of a martingale with quadratic variation process equal to the identity.
6.1 Theorem (Levy). Let M be a continuous local martingale, 0 at 0, such that [M] is the identity function. Then M is a Brownian motion process.
Proof. For a fixed real number 8 consider the complex-valued stochastic process
By application of Ito's formula to Xt = f(Mt,t) with the complex-valued function f(m,t) = exp(i8m + \02t), we find
dXt = Xti6dMt + \Xt{i6f d[M]t + Xt±82dt = Xti6dMt,
since [M]t = t by assumption. It follows that X = X0 + iOX ■ M and hence X is a (complex-valued) local martingale. Because \Xt \ is actually bounded for every fixed t, X is a martingale. The martingale relation E(Xt| Ts) = Xs can be rewritten in the form
E(e«(^-".)|.Fs) =e-h°'(*-'),      a.s., s<t.
This implies that Mt — Ms is independent of Ts and possesses the normal distribution with mean zero and variance t — s. (Cf. Exercise 6.2.) ■
6.2 EXERCISE. Let X be a random variable on the probability space (U,T,P) and Tq C T a sub u-field such that F,(el6X\To) is equal to a constant c(8) for every 8 € ffi. Show that X is independent of To-
Levy's theorem may be interpreted in the sense that among the continuous local martingales Brownian motion is determined by its quadratic variation process. Actually, every continuous local martingale is "determined" by its quadratic variation process, in a certain sense. The following theorem shows that we can generate an arbitrary continuous local martingale from a Brownian motion by transforming the time scale using the inverse process of the quadratic variation. In the words of Rogers and Williams, p64, any such continuous local martingale "has delusions of grandeur: it thinks it is a Brownian motion" running on a different clock.
98        6:  Stochastic Calculus
6.3 Theorem. Let M be a continuous local martingale relative to a filtration {Tt} such that [M]t t °o almost surely, as t f oo. Let Tt = inf{s > 0: [M]s > t}. Then the process Bt = Mrt is a Browman motion relative to the filtration {Trt} and Mt = -B[m]t-
Proof. For every fixed t the variable Tt is a stopping time relative to the filtration {Tt}, and the maps t Tt are right continuous. It follows from this that {Trt} is a right continuous filtration. Indeed, if A € Trq for every rational number q > t, then A n {Tq < u} € Tu for every u > 0, by the definition of Trq - Hence A n {Tt < u} = Uq>tA n {Tq < u} € Tu for every u > 0, whence A € Trt - The filtration {Trt} is complete, because Trt D To for every t.
For simplicity assume first that the sample paths s [M]s of [M] are strictly increasing. Then the maps t Tt are their true inverses and, for every s, t > 0,
(6.4) TtA[M]a = Tt A «.
In the case that t < [M]s, which is equivalent to Tt < s, this is true because both sides reduce to Tt. In the other case, that t > [M]s, the identity reduces to T[M]a = s, which is correct because T is the inverse of [M].
The continuous local martingale M can be localized by the stopping times Sn = inf{s > 0: \MS\ > n}. The stopped process MSn is a bounded martingale, for every n. By the definition Bt = Mrt and (6.4),
Bt/\[M]Sn = mttasn,
fl2a[m]s„ " * A [M]Sn = M2tASn - [M]TtASn,
where we also use the identity t = [M]^. The variable Rn = [M]sn is an Trt-stopping time, because, for every t > 0,
{[M}Sn>t} = {Sn>Tt}€TTt.
The last inclusion follows from the fact that for any pair of stopping times S, T the event {T < S} is contained in Tt, because its intersection with {T < t} can be written in the form Uq<t{T < q < t < S} € Tt, where the union is restricted to rational numbers q > 0.
By the optional stopping theorem the processes t MTt/\sn and 1 MTtASn — [M]tt/\sn are martingales relative to the filtration {Trt}- Because they are identical to the processes t Bt and tv^B\—t stopped at Rn, we conclude that the latter two processes are local martingales. From the local martingale property of the process t B2 — t it follows that (B) is the identity process. Because M and T are continuous, so is B. By Levy's theorem, Theorem 6.1, we conclude that B is a Brownian motion. This concludes the proof if [M] is strictly increasing.
For the proof in the general case we may still assume that the sample paths of [M] are continuous and nondecreasing, but we must allow them to
6.1: Levy's Theorem 99
possess intervals of constant value, which we shall refer to as "flats". The maps t i-)- Tt are "generalized inverses" of the maps s [M]s and map a value t to the largest time s with [M]s = t, i.e. the right end point of the flat at height t. The function s [M]s is constant on each interval of the form [s, X[M] J, the time 2]M]a being the right end point of the flat at height [M]s. The inverse maps t *-+Tt are cadlag with jumps at the values t that are heights of flats of nonzero length. For every s,t >0,
Tt<sifft< [M]s, [M]Tt = t, T[m]a > s,
with, in the last line, equality unless s is in the interior or on the left side of a flat of nonzero length.
These facts show that (6.4) is still valid for every s that is not in the interior or on the left side of a flat. Then the proof can be completed as before provided that the stopping time Sn is never in the interior or on the left of a flat and the sample paths of B are continuous.
Both properties follow if M is constant on every flat. (Then Sn cannot be in the interior or on the left of a flat, because by its definition M increases immediately after Sn.) It is sufficient to show that the stopped process MSn has this property, for every n. By the martingale relation, for every stopping time T > s,
E((M|" - Mf")2| ?,) = E(M2„AT - MlAs\?,)
= E([M]s„AT-[M]s„As|^s).
For T equal to the stopping time mi{t > s: [M]gnAt > [M]gnAs}, the right side vanishes. We conclude that for every s > 0, the process M takes the same value at s as at the right end point of the flat containing s, almost surely. For oj not contained in the union of the null sets attached to some rational s, the corresponding sample path of M is constant on the flats of [M]. .
The filtration {Trt} may be bigger than the completed natural filtration generated by B and the variables [M]t may not be stopping times for the filtration generated by B. This hampers the interpretation of M as a time-changed Brownian motion, and the Brownian motion may need to have special properties. The theorem is still a wonderful tool to derive properties of general continuous local martingales from properties of Brownian motion.
The condition that [M]t f 00 cannot be dispensed of in the preceding theorem, because if [M]t remains bounded, then the process B is not defined on the full time scale [0, oo). However, the theorem may be adapted to cover more general local martingales, by piecing B as defined together with an
100        6:  Stochastic Calculus
additional independent Brownian motion that starts at time [M]oo- For this, see Chung and Williams, p??, or Rogers and Williams, p64-67.
Both theorems allow extension to multidimensional processes. The multivariate version of Levy's theorem can be proved in exactly the same way. We leave this as an exercise. Extension of the time-change theorem is harder.
6.5 EXERCISE. For i = 1,... ,d let M; be a continuous local martingale, 0 at 0, such that [M;,Mj]t = Sijt almost surely for every t > 0. Show that M = (Mi,..., Md) is a vector-valued Brownian motion, i.e. for every s < t the random vector Mt — Ms is independent of Ts and normally distributed with mean zero and covariance matrix (t — s) times the identity matrix.
6.2  Brownian Martingales
Let B be a Brownian motion on a given probability space (tt, T, P), and denote the completion of the natural filtration generated by B by {Tt}. Stochastic processes on the filtered space (tt, T, {Tt}, P) that are martingales are referred to as Brownian martingales. Brownian motion itself is an example, and so are all stochastic integrals X ■ B for predictable processes X that are appropriately integrable to make the stochastic integral well-defined.
The following theorem shows that these are the only Brownian martingales.
One interesting corollary is that every Brownian martingale can be chosen continuous, because all stochastic integrals relative to Brownian motion have a continuous version.
6.6 Theorem. Let {Tt} be the completion of the natural filtration of a Brownian motion process B. If M is a cadlag local martingale relative to {Tt}, then there exists a predictable process X with fQ X2 ds < oo almost surely for every t>0 such that M = M0 + X ■ B, up to indistinguishability.
Proof. We can assume without loss of generality that M0 = 0.
First suppose that M is an L2-bounded martingale, so that Mt = E(Moc| Tt) almost surely, for every t > 0, for some square-integrable variable Moo. For a given process X € Li ([0, oo) x tt, V, hm) the stochastic integral X-B is an L2-bounded martingale with Li-\imit (X-B)^ = f XdB, because f (X1[0A—X)2 dfiM —> 0 as t —> oo. The map /: X —> (X-B)^ is an isometry from Li([0, oo) xtt,V,fiM) into Li(tt, T, P). If Moo is contained in the range range(7) of this map, then Mt = E(Moo| Ft) = ^{{X ■ B) x\Tt) =
6.2: Brownian Martingales 101
(X-B)t, almost surely, because X-B is a martingale. Therefore, it suffices to show that range(-T) contains all square-integrable variables with mean zero.
Because the map / is an isometry on a Hilbert space, its range is a closed linear subspace of L^Sl, T, P). It suffices to show that 0 is the only element of mean zero that is orthogonal to range(7).
Given some process X € 1/2 ([0, oo) x U,V,hm) and a stopping time T, the process Xl[0tT] is also an element of 1*2([0, oo) x U,V,hm) and (X1[0it] ■ -B)oo = (X ■ B)T, by Lemma 5.27(iii). If _L range(7), then it is orthogonal to (Xl^^-B)^ and hence 0 = EMoc(X-B)T = ~EMT(X-B)T, because M is a martingale and (X ■ B)t is ^r-measurable. By Lemma 4.22 we conclude that the process M(X-B) is a uniformly integrable martingale.
The process Xt = exp(i8Bt + \02t) satisfies dXt = i8XtdBt, by Ito's formula (cf. the proof of Theorem 6.1), and hence X = 1 + iOX ■ B. The process X is not uniformly bounded and hence is not an eligible choice in the preceding paragraph. However, the process Xl[0tt] is uniformly bounded for every fixed constant T > 0 and hence the preceding shows that the process MXT = M + i0M(Xl[ott] ■ B) is a uniformly integrable martingale. This being true for every T > 0 implies that MX is a martingale. The martingale relation for the process MX can be written in the form
E(Mte^s'-s»)|^) = Mse-^2(*-s),      a.s.,  s < t.
Multiplying this equation by exp(i6'(Bs — Bu)) for u < s and taking conditional expectation relative to Tu, we find, for u < s < t,
E(Mteie^-B^+ie^B'-B^\Tu) =M„e-592(t-)-59'2(»-), a.s..
Repeating this operation finitely many times, we find that for an arbitrary partition 0 = to < t\ < ■ ■ ■ < tk = t and arbitrary numbers 6\,...,6k,
EE(Mie'2/'(B'.-fl'i-i)|70) = EMoe-5£i'?(i'-i'-1' = 0.
We claim that this shows that M = 0, concluding the proof in the case that M is I/2-bounded.
The claim follows essentially by the uniqueness theorem for characteristic functions. In view of the preceding display the measures ^ t and fit1,...,tk on     defined by
l4u...,th{A)=EM±lA{Btl-t0,...,Btk-tk_1),
possess identical characteristic functions and hence are identical. This shows that the measures n+ and n~ on (ft, J7) defined by /«±(-F1) = EM^lj? agree on the cr-field generated by Btl-t0, ■ ■ ■ ,Btk-tk_1- This being true for every partition of [0, t] shows that fi+ and fi~ also agree on the algebra generated
102        6:  Stochastic Calculus
by {Bs:0 < s < t} and hence, by Caratheodory's theorem, also on the u-field generated by these variables. Thus EMtl;r = 0 for every F in this (j-field, whence Mt = 0 almost surely, because Mt is measurable in this u-field.
Next we show that any local martingale M as in the statement of the theorem possesses a continuous version. Because we can localize M, it suffices to prove this in the case that M is a uniformly integrable martingale. Then Mt = E(Moo| Tt) for an integrable variable M^. If we let M£> be Moo truncated to the interval [—n,n], then M™:= E(M^|Jrt) defines a bounded and hence LVbounded martingale, for every n. By the preceding paragraph this can be represented as a stochastic integral with respect to Brownian motion and hence it possesses a continuous version. The process \Mn — M| is a cadlag submartingale, whence by the maximal inequality given by Lemma 4.36,
p(sup |M™ - Mt| > e) < ^E|M™ - M^.
The right side converges to zero as n —> oo, by construction, whence the sequence of suprema in the left side converges to zero in probability. There exists a subsequence which converges to zero almost surely, and hence the continuity of the processes M™ carries over onto the continuity of M.
Every continuous local martingale M is locally I/2-bounded. Let 0 < Tn f oo be a sequence of stopping times such that MTn is an L2-bounded martingale, for every n. By the preceding we can represent M7™ as M7™ = Xn ■ B for a predictable process Xn € I/2([0, oo) x U,V,hm), for every n. For m < n,
Xm ■ B = MTm = (MT")T™ = {Xn ■ B)Tm = Xnl[0,rm] • B,
by Lemma 5.27(iii) or Lemma 5.54. By the isometry this implies that, for every t > 0,
0 = E(Xm • B — Xnl[0tTm] ■ B)2 = E / (Xm — X„l[o!Tm])2 d\.
Jo
We conclude that Xm = Xn on the set [0,Tm] almost everywhere under A x P. This enables to define a process X on [0, oo) x fi in a consistent way, up to a A x P-null set, by setting X = Xm on the set [0,Tm]. Then (X-B)T™ = Xl[0}TmyB = Xm-B = MT™ for every m and hence M = X-B.
The finiteness of E f X^ d\ for every m implies that f* X2 d\ < oo almost surely, for every t > 0. ■
The preceding theorem concerns processes that are local martingales relative to a filtration generated by a Brownian motion. This is restrictive in terms of the local martingales it can be applied to, but at the same time
6.3: Exponential Processes
103
determines the strength of the theorem, which gives a representation as a stochastic integral relative to the given Brownian motion.
If we are just interested in representing a local martingale as a stochastic integral relative to some Brownian motion, then we need not restrict the filtration to a special form. Then we can define a Brownian motion in terms of the martingale, and actually the proof of the representation can be much simpler. We leave one result of this type as an exercise. See e.g. Karatzas and Shreve, pl70-173 for slightly more general results.
6.7 EXERCISE. Let M be a continuous local martingale with quadratic variation process [M] of the form [M]t = f0 Xsds for a continuous, strictly positive stochastic process A. Show that B = A-1/2 • M is a Brownian motion, and M = y/\ ■ B. [Hint: don't use the preceding theorem!]
6.3 Exponential Processes
The exponential process corresponding to a continuous semimartingale X is the process £(X) defined by
The name "exponential process" would perhaps suggest the process ex rather than the process £(X) as defined here. The additional term \[X] in the exponent of £(X) is motivated by the extra term in the Ito formula. An application of this formula to the right side of the preceding display yields
(Cf. the proof of the following theorem.) If we consider the differential equation df(x) = f(x) dx as the true definition of the exponential function f(x) = ex, then £(X) is the "true" exponential process of X, not ex.
Besides that, the exponentiation as defined here has the nice property of turning local martingales into local martingales.
6.9 Theorem. The exponential process £{X) of a continuous local martingale X with Xo = 0 is a local martingale. Furthermore,
(i) If EeaM' < oo for every t>0, then £(X) is a martingale.
(ii) If X is an Li-martingale and E J* £(X)1 d[X]„ < oo for every t > 0, then £(X) is an L^-martingale.
Proof. By Ito's formula applied to the function f(Xt, [X]t) = £{X)t, we find that
£{X)
(6.8)
d£{X)t = £{X)t dXt.
d£(X)t = £(X)t dXt + \£{X)t d[X]t + £(X)t (-1) d[X]t.
104        6:  Stochastic Calculus
This simplifies to (6.8) and hence £(X) = 1+£(X)-X is a stochastic integral relative to X. If X is a local martingale, then so is S(X). Furthermore, if X is an I/2-martingale and J l[o,t]£{X)2 dfix < oo for every t > 0, then £(X) is an I/2-martingale, by Theorem 5.25. This condition reduces to the condition in (ii), in view of Lemma 5.74.
The proof of (i) should be skipped at first reading. If 0 < Tn f oo is a localizing sequence for £(X), then Fatou's lemma gives
E(£{X)t\Fs) <liminfE(£(X)tATJ^s) = hminf £(X)sAT„ =£{X)8.
n—too n—}oo
Therefore, the process £(X) is a supermartingale. It is a martingale if and only if its mean is constant, where the constant must be E£(X)0 = 1.
In view of Theorem 6.3 we may assume that the local martingale X takes the form Xt = B[X]t for a process B that is a Brownian motion relative to a certain filtration. For every fixed t the random variable [X]t is a stopping time relative to this filtration. We conclude that it suffices to prove: if B is a Brownian motion and T a stopping time with Eexp(|T) < oo, then Eexp(Br - \T) = 1.
Because 2BS is normally distributed with mean zero and variance 4s,
e/ £{Bfsds= f Ee2B-e-sds= f e8 ds < oo Jo Jo Jo
By (ii) it follows that £(B) is an I/2-martingale. For given a < 0 define Sa = ini{t > 0: Bt — t = a}. Then Sa is a stopping time, so that £(B)Sa is a martingale, whence E£(B)saAt = 1 for every t. It can be shown that Sa is finite almost surely and
E£{B)S" = EeBs"-5Sa = 1.
(The distribution of Sa is known in closed form. See e.g. Rogers and Williams 1.9, pl8-19; because Bsa = Sa + a, the right side is the expectation of exp(a + TjSa).) With the help of Lemma 1.22 we conclude that £(B)saAt —> £{B)sa in L\ as t —> oo, and hence £(B)Sa is uniformly inte-grable. By the optional stopping theorem, for any stopping time T,
1 = E£(B)%? = ElT<saeBT~5T + ElT>SaeBsa~^Sa •
Because the sample paths of the process t —>■ Bt — t are bounded on compact time intervals, Sa t °o if a 4- —oo. Therefore, the first term on the right converges to Eexp(.E?T — \T), by the monotone convergence theorem. The second term is equal to
ElT>saeSa+a~^Sa < eaEe5T. If Eexp(iT) < oo, then this converges to zero as a —> —oo. ■
6.4: Cameron-Martin-Girsanov Theorem
105
In applications it is important to determine whether the process £(X) is a martingale, rather than just a local martingale. No simple necessary and sufficient condition appears to be known, although the condition in (i), which is known as Novikov's condition, is optimal in the sense that the factor \ in the exponent cannot be replaced by a smaller constant, in general.
6.10 EXERCISE. Let X be a continuous semimartingale with X0 = 0. Show that Y = £{X) is the unique solution to the pair of equations dY = YdX and Y0 = 1. (Hint: using Ito's formula show that d{Y£(X)~1) = 0 for every solution Y, so that Y£(X)~1 = Y0£{X)q1 = 1.)
6.11 EXERCISE. Show that £(X)T = £{XT) for every stopping time T.
6.4  Cameron-Martin-Girsanov Theorem
Let X be a continuous local martingale on the filtered probability space (fi,F, {Tt},P), 0 at 0. If the exponential process £(X) corresponding to X is a uniformly integrable martingale, then we can define a probability measure P on T by
Thus P possesses Radon-Nikodym derivative £(X)oc relative to P. Because P(F) = E1F£(X)00 = ElF£(X)t for every F € Tt and £{X)t is .^-measurable, the restriction Pt of P to Tt possesses a Radon-Nikodym density £{X)t relative to the restriction Pt of P to Tt, i.e.
The condition that £ (X) be a uniformly integrable martingale is somewhat restrictive. It is satisfied, for instance, if X is a process that satisfies Novikov's condition (as in Theorem 6.9 (i)) stopped at a finite time. We illustrate this situation in Example 6.18.
If M is a local martingale relative to P, then it typically looses the local martingale property if we use the measure P instead. The Cameron-Martin-Girsanov theorem shows that M is still a semimartingale if we use P, and gives an explicit decomposition of M in its martingale and bounded variation parts.
We start with a general lemma on the martingale property under a "change of measure". We refer to a process that is a local martingale under P as a P-local martingale.
106        6:  Stochastic Calculus
6.12 Lemma. Let P and P be equivalent probability measures on (fi, T) and let Lt = dPt/dPt be a Radon-Nikodym density of the restrictions of P and P to Tf Then a stochastic process M is a P-local martingale if and only if the process LM is a P-local martingale.
Proof. We first prove the theorem without "local". If M is an adapted P-integrable process, then, for every s < t and F € Fs,
EMtlF = ELjMjIf, EMS1F = ELsMs1f,
The two left sides are identical for every F € Ts and s < t if and only if M is a P-martingale. Similarly, the two right sides are identical if and only if LM is a P-martingale. We conclude that M is a P-martingale if and only if LM is a P-martingale.
If M is a P-local martingale and 0 < Tn f oo is a localizing sequence, then the preceding shows that the process LMTn is a P-martingale, for every n. Then so is (LMTn)Tn = (LM)Tn, and we can conclude that LM is a P-local martingale.
Because P and P are equivalent, we can select a version of L that is strictly positive. Then dPt/dPt = L^1 and we can use the argument of the preceding paragraph in the other direction to see that M = L~X[LM) is a P-local martingale if LM is a P-local martingale. ■
6.13 EXERCISE. In the situation of the preceding lemma, show that Lt = F,(dP/dP\Ft) almost surely and conclude that there exists a cadlag version of L.
If M itself is a P-local martingale, then generally the process LM will not be a P-local martingale, and hence the process M will not be a P-local martingale. We can correct for this by subtracting an appropriate process. We assume that the likelihood ratio process L is continuous. The processes
LM - [L, M]
L(L~X ■ [L, M]) - [L, M]
are both P-local martingales. For the first this is an immediate consequence of Lemma 5.64(h). For the second it follows from the integration-by-parts or Ito's formula. (See the proof of the following theorem.) It follows that the difference of the two processes is also a P-local martingale and hence the process
(6.14) M - L~x ■ [L, M].
is a P-local martingale. In the case that L = £(X) this takes the nice form given in the following theorem.
6.4: Cameron-Martin-Girsanov Theorem
107
6.15 Theorem. Let X be a continuous local martingale, 0 at 0, such that £{X) is a uniformly integrable martingale, and let dP = £(X)00 dP'. If'M is a continuous P-local martingale, then M — [X, M] is a P-local martingale.
Proof. The exponential process L = £{X) satisfies dL = LdX, or equiv-alently, L = 1 + L ■ X. Hence L~x ■ [L,M] = L~X[L ■ X,M] = [X,M], by Lemma 5.77(i). The theorem follows if we can show that the process in (6.14) is a P-local martingale. By Lemma 6.12 it suffices to show that L times the process is a P-local martingale.
By the integration-by-parts (or Ito's) formula it follows that
d(L(L~x ■ [L,M])) = (L~x ■ [L,M])dL + Ld{L-x ■ [L,M]).
No "correction term" appears at the end of the display, because the quadratic covariation between the continuous process L and the process of locally bounded variation L~x ■ [L, M] is zero. The integral of the first term on the right is a stochastic integral (of L~x ■ [L,M]) relative to the P-martingale L and hence is a P-local martingale. The integral of the second term is [L, M], by Lemma 5.77(i). It follows that the process (L(L~X ■ [L, M]) — [L, M] is a local martingale. The difference of this with the local martingale LM — [L, M] is L times the process in (6.14). ■
The quadratic covariation process [X, M] in the preceding theorem was meant to be the quadratic covariation process under the orginal measure P. Because P and P are equivalent and a quadratic covariation process can be defined as a limit of inner products of increments, as in (5.58), it is actually also the quadratic variation under P.
Under the continuity assumptions on M and X, the process M—[X, M] possesses the same quadratic variation process [M] as M, where again it does not matter if we use P or P as the reference measure.
The latter remark is particularly interesting if M is a P-Brownian motion process. Then both M and M — [X, M] possess quadratic variation process the identity. Because M — [X, M]_ is a continuous local martingale under P, it is a Brownian motion under P by Levy's theorem. This proves the following corollary.
6.16 Corollary. Let X be a continuous local martingale, 0 at 0, such that £{X) is a uniformly integrable martingale, and let dP = £(X)00dP. If B is a P-Brownian motion, then B — [X, B] is a P-Brownian motion.
A further specialization is to choose X equal to a stochastic integral X = Y ■ B of a process Y relative to Brownian motion. Then
(6.17) —— = eJo 2 Jo " a.s.,
dPt
108        6:  Stochastic Calculus
and, by the preceding theorem, the process
t<-+ Bt- [ Ysds Jo
is a Brownian motion under P. Here the process Y must be chosen such that the stochastic integral Y ■ B is well-defined and £ (Y ■ B) is a uniformly integrable martingale. For the first it suffices that Y is adapted and measurable with f Yj2 ds finite almost surely. The second condition is more restrictive.
6.18 Example. For a given measurable, adapted process Y and constant T > 0 assume that
Ee2 Jo   "    < oo.
Then the process Y1[o,t] - B = (Y ■ B)T satisfies Novikov's condition, as its quadratic variation is given by
rTAt
\Yl[0>Ti-B]t= Y?ds. Jo
By Theorem 6.9 the process £{{Y ■ B)T) is a martingale. Because it is constant on [T, oo), it is uniformly integrable. We conclude that the process {Bt — f0™ Ys ds: t > 0} is a Brownian motion under the measure P given by dP = £{{Y •JB)T)00dP. □
It is a fair question why we would be interested in "changes of measure" of the form (6.17). We shall see some reasons when discussing stochastic differential equations or option pricing in later chapters. For now we can note that in the situation that the filtration is the completion of the filtration generated by a Brownian motion any change to an equivalent measure is of the form (6.17).
6.19 Lemma. Let {Tt} be the completion of the natural filtration of a Brownian motion process B defined on P). If P is a probability measure on (fi, T) that is equivalent to P, then there exists a predictable process Y with f* Yj2 ds < oo almost surely for every t > 0 such that the restrictions Pt and Pt of P and P to Tt satisfy (6.17).
Proof. Let Lt = dPt/dPt be a version of the density of Pt relative to Pt. Then for every F € Tt,
dP f dP
E^-1F = P(F) = Pt(F) =     ±J.dPt = VLtlp. af jF aft
This shows that Lt = F,(dP/dP\Tt) almost surely, and hence L is a martingale relative to the filtration {Tt}. Because this is a Brownian filtration,
6.4: Cameron-Martin-Girsanov Theorem
109
Theorem 6.6 implies that L permits a continuous version. By the equivalence of P and P the variable Lt is strictly positive almost surely, for every t > 0, and hence we can choose the process L strictly positive without loss of generality. Then L~x is a well-defined continuous process and hence is locally bounded. The stochastic integral Z = L~x ■ L is well-defined and a local martingale, relative to the Brownian filtration {Tt}- By Theorem 6.6 it can be represented as Z = Y ■ B for a predictable process Y as in the statement of the lemma. The definition Z = L~x ■ L implies dL = LdZ. This is solved uniquely by L = £(Z). (Cf. Exercise 6.10.) ■
It could be of interest to drop the condition that £(X) is uniformly integrable, which we have made throughout this section. As long as £(X) is a martingale, then we can define probability measures Pt on Tt by
Pt(F) = ElF£(X)t.
By the martingale property this collection of measures will be consistent in the sense that Ps is the restriction of Pt of Ts, for s < t. If we could find a measure P on with restriction Pt to Tt, much of the preceding goes through.
Such a measure P does not necessarily exist under just the condition that £(X) is a martingale. A sufficient condition is that the filtration be generated by an appropriate process Z. If Tt = a(Zs:s < t) for every t, then we can invoke the Kolmogorov extension theorem to see the existence of a measure P on T^. It should be noted that this condition does not permit that we complete the filtration. In fact, completion (under P) may cause problems, because, in general, the measure P will not be absolutely continuous relative to P. See Chung and Williams, p?? for further discussion.
7
Stochastic
Differential Equations
In this chapter we consider stochastic differential equations of the form
Here [i and a are given functions and B is a Brownian motion process. The equation may be thought of as a randomly perturbed version of the first order differential equation dXt = [i(t, Xt) dt. Brownian motion is often viewed as an appropriate "driving force" for such a noisy perturbation.
The stochastic differential equation is to be understood in the sense that we look for a continuous stochastic process X such that, for every t > 0,
Usually, we add an initial condition Xq = f, for a given random variable f, or require that X0 possesses a given law.
It is useful to discern two ways of posing the problem, the strong and the weak one, differing mostly in the specification of what is being given a-priori and of which further properties the solution X must satisfy. The functions [i and a are fixed throughout, and are referred to as the "drift" and "diffusion coefficients" of the equation.
In the "strong setting" we are given a particular filtered probability space (fi, T, {Tt}, P), a Brownian motion B and an initial random variable f, both defined on the given filtered space, and we search for a continuous adapted process X, also defined on the given filtered space, which satisfies the stochastic differential equation with X0 = f. It is usually assumed here that the filtration {Tt} is the smallest one to which B is adapted and for which f is ^b-measurable, and which satisfies the usual conditions. The requirement that the solution X be adapted then implies that it can be expressed as X = F(^, B) for a suitably measurable map F, and the precise
dXt = n(t, Xt) dt + a(t, Xt) dBt.
(7.1)
Xt=X0+     n(s,Xs)ds+ <7{s,Xs)dB;
a.s..
7:  Stochastic Differential Equations 111
definition of a strong solution could include certain properties of F, such as appropriate measurability, or the requirement that F(x,B') is a solution of the stochastic differential equation with initial variable x € R, for every x and every Brownian motion B' defined on some filtered probability space. Different authors make this precise in different ways; we shall not add to this confusion here.
For a weak solution of the stochastic differential equation we search for a filtered probability space, as well as a Brownian motion and an initial random variable f, and a continuous adapted process X satisfying the stochastic differential equation, all defined on the given filtered space. The initial variable X0 is usually required to possess a given law. The filtration is required to satisfy the usual conditions only, so that a weak solution X is not necessarily a function of the pair (£, B).
Clearly a strong solution in a given setting provides a weak solution, but the converse is false. The existence of a weak solution does not even imply the existence of a strong solution (depending on the measurability assumptions we impose). In particular, there exist examples of weak solutions, for which it can be shown that the filtration must necessarily be bigger than the filtration generated by the driving Brownian motion, so that the solution X cannot be a function of (£, B) alone. (For instance, Tanaka's example, see Chung and Williams, pages 248-250.)
For X to solve the stochastic differential equation, the integrals in (7.1) must be well defined. This is certainly the case if fi and a are measurable functions and, for every t > 0,
Throughout we shall silently understand that it is included in the requirements for "X to solve the stochastic differential equation" that these conditions are satisfied.
7.2 EXERCISE. Show that t ^ a{t, Xt) is a predictable process if a: M2 -»■ R is measurable and X is predictable. (Hint: consider the map {t,uj) (t,Xt(uj)) on [0, oo) x fi equipped with the predictable u-field.)
The case that fi and a depend on X only is of special interest. The stochastic differential equation
a.s.,
a.s..
(7.3)
dXt = n(Xt) dt + o-(Xt) dBt
is known as a diffusion equation. Under some conditions the solution X of a diffusion equation is a time-homogeneous Markov process. Some authors use the term diffusion process to denote any time-homogeneous (strong)
112
7:  Stochastic Differential Equations
Markov process, while other authors reserve the term for solutions of diffusion equations only, sometimes imposing additional conditions of a somewhat technical nature, or relaxing the differential equation to a statement concerning first and second infinitesimal moments of the type
These infinitesimal conditions give an important interpretation to the functions fi and a, and can be extended to the more general equation (7.1). Apparently, stochastic differential equations were invented, by Ito in the 1940s, to construct processes that are "diffusions" in this vaguer sense.
Rather than simplifying the stochastic differential equation, we can also make it more general, by allowing the functions [i and a to depend not only on (t,Xt), but on t and the sample path of X until t. The resulting stochastic differential equations can be treated by similar methods. (See e.g. pages 122-124 of Rogers and Williams.)
Another generalization is to multi-dimensional equations, driven by a multivariate Brownian motion B = (B±,... ,Bi) and involving a vector-valued function fj,: [0, oo) x ffi* ->■ ffi* and a function a: [0, oo) x ffi* ->■ IR*' with values in the k x /-matrices. Then we search for a continuous vector-valued process X = (Xi,..., Xu) satisfying, for i = 1,..., k,
Multivariate stochastic differential equations of this type are not essentially more difficult to handle than the one-dimensional equation (7.1). For simplicity we consider the one-dimensional equation (7.1), or at least shall view the equation (7.1) as an abbreviation for the multivariate equation in the preceding display.
We close this section by showing that Girsanov's theorem may be used to construct a weak solution of a special type of stochastic differential equation, under a mild condition. This illustrates that special approaches to special equations can be more powerful than the general results obtained in this chapter.
7.4 Example. Let f be an .Fn-measurable random variable and let X — f be a Brownian motion on a filtered probability space (fi, T, {Tt},P). For a given measurable function [i define a process Y by Yt = [i(t, Xt), and assume that the exponential process £(Y ■ X) is a uniformly integrable martingale. Then dP = £(Y ■X)00 defines a probability measure and, by Corollary 6.16 the process B defined by Bt = Xt — f — /0* Ys ds is a P-Brownian motion process. (Note that Y ■ X = Y ■ (X - £).) It follows that X together with
E(Xt+h - Xt\ Tt) = /i{Xt)h + o(h), w3x{Xt+h - Xt\ Tt) = o-2{Xt)h + o{h),
a.s.
a.s.,
hiO,.
7.1: Strong Solutions 113
the filtered probability space (fi, T, {Tt}, P) provides a weak solution of the stochastic differential equation Xt = £ + /0* n(s,Xs)ds + Bt.
The main condition to make this work is that the exponential process of Y ■ X is a uniformly integrable martingale. This is easy to achieve on compact time intervals by Novikov's condition. □
7.1  Strong Solutions
Following Ito's original approach we construct in this section strong solutions under Lipschitz and growth conditions on the functions [i and a. We assume that for every t > 0 there exists a constant Ct such that, for all s € [0,t] and for all x, y € [—t, t],
^75j \v(s,x) -fJ,(s,y)\ < Ct\x-y\,
\<r(s,x) - (r{s,y)\ < Ct\x - y\.
Furthermore, we assume that for every t > 0 there exists a constant Ct such that, for all s € [0, t] and x € ffi,
(76) \n{8,x)\<Ct{l + \x\),
\a{s,x)\ <Ct{l + \x\).
Then the stochastic differential equation (7.1) possesses a strong solution in every possible setting. The proof of this is based on an iterative construction of processes that converge to a solution, much like the Picard iteration scheme for solving a deterministic differential equation.
Let (fi,F, {Tt},P) be an arbitrary filtered probability space, and let B be a Brownian motion and an Fo-measurable random variable f defined on it.
7.7 Theorem. Let fi and a be measurable functions that satisfy (7.5)-(7.6). Then there exists a continuous, adapted process X on (fi, T, {Tt}, P) with Xq = f that satisfies (7.1). This process is unique up to indistinguisha-bility, and its distribution is uniquely determined by the distribution of f.
Proof. For a given process X let LX denote the process on the right of (7.1), i.e.
{LX)t = S+ [ fi{s,Xs)ds+ [ cr{s,Xs)dBs. Jo Jo We wish to prove that the equation LX = X possesses a unique continuous adapted solution X. By assumption (7.6) the absolute values of the integrands are bounded above by Ct(l + \XS\) and hence the integrals in the definition of LX are well-defined for every continuous adapted process X.
114        7:  Stochastic Differential Equations
First assume that f is square-integrable and the Lipschitz condition (7.5) is valid for every x,y € ffi (and not just for x,y € [—*,*]). We may assume without of loss of generality that the constants Ct are nondecreasing in t.
By the triangle inequality, the maximal inequality (4.38), the Cauchy-Schwarz inequality, and the defining isometry of stochastic integrals,
Esup|(LX)s - {LY)S
s<t
< E
Jo
+ E
/V(«.
Jo
Xs)-a(s,Ys))dBs
<tE     \n{8,X.)-n{8,Y.)fd8 + E / {o-{s,Xs)-o-{s,Ys))2ds Jo Jo
< (* + l)C2E / \XS-Ys\2ds. Jo
The use of the maximal inequality (in the first <) is justified as soon as the process t f*(a(s,XB) — a(s,Ys)) dBs is an I/2-martingale, which is certainly the case if the final upper bound is finite.
Define processes by X^ = £ and, recursively, = I/X^™-1), for n > 1. In particular,
+ [\{8,t)d8+ f' 0-(s,t;)dBs. Jo Jo
By similar arguments as previously,
Esup \XU _       |2 < * E /V(s,0 d« + E /* a3(«,0 ds
s<4 JO JO
<(* + l)2C2E(l + f2)-Furthermore, for n > 1, since - XW = LXW - U^™"1),
Esup - XSW|2 < (i + 1)C2 E /" - X^"1))2 ds.
s<4 JO
Iterating this last inequality and using the initial bound for n = 0 of the preceding display, we find that, with M = E(l + f2),
EbupIXW-X^ < (* + 1)2"Cf2"M
We conclude that, for m < n, by the triangle inequality,
sup|XsW -Xs(m)|
s<4
(£+1) 2   . Vii
< E
■C!VM
7.1: Strong Solutions 115
For fixed t, we have that em^n —)■ 0 as m, n —)■ oo. We conclude that the variables in the left side of the last display converge to zero in quadratic mean and hence in probability as m, n —> oo. In other words, the sequence of processes l'"' forms a Cauchy sequence in probability in the space C[0, t] of continuous functions, equipped with the uniform norm. Since this space is complete there exists a process X such that, as n —> oo,
sup |X
s<t
(»)
■X.
0.
Being a uniform limit of continuous processes, the process X must be continuous. By Fatou's lemma
sup|Xs -Xim)|
< lim e„
Because LXW = X(n+1\ the triangle inequality gives that
sup\{LX)s-X; <
sup|(LX).-(LXW)
s<t
+
sup |xi™+1) - Xs
s<t
X<;n)\2ds + £n+1
The right side converges to zero as n —> oo, for fixed t, and hence the left side must be identically zero. This shows that LX = X, so that X solves the stochastic differential equation, at least on the interval [0,t\.
If Y is another solution, then, since in that case X — Y = LX — LY,
Esup \X„ - Ys\2 < {t + l)C2 f Esup \XU - Yu\2 ds.
s<t ,70 uKs
By Gronwall's lemma, Lemma 7.10, applied to the function on the left side and with A = 0, it follows that the left side must vanish and hence X = Y.
By going through the preceding for every t € N we can consistently construct a solution on [0, oo), and conclude that this is unique.
By the measurability of fi and a the processes t fi(t,Xt) and t i—y a(t,Xt) are predictable, and hence progressively measurable, for every predictable process X. (Cf. Exercise 7.2.) By Fubini's theorem the process t [i(s,Xs) ds is adapted, while the stochastic integral t
/J a(s,Xs) dBs is a local martingale and hence certainly adapted. Because the processes are also continuous, they are predictable. The process X^ is certainly predictable and hence by induction the process       is predictable
116        7:  Stochastic Differential Equations
for every n. The solution to the stochastic differential equation is indistinguishable from lim infn-j.oo X^ and hence is predictable and adapted.
The remainder of the proof should be skipped at first reading. It consists of proving the theorem without the additional conditions on the functions fi and o~ and the variable f, and is based on the identification lemma given as Lemma 7.11 below. First assume that fi and a only satisfy (7.5) and (7.6), but f is still square-integrable.
For n € N let Xn'-^- —>■ R be continuously differentiate with compact support and be equal to the unit function on [—n,n\. Then the functions fj,n and <7„ defined by nn{t,x) = n{t,x)xn{x) and an(t,x) = a{t,x)xn{x) satisfy the conditions of the first part of the proof. Hence there exists, for every n, a continuous adapted process Xn such that
(7.8) Xn!t = £+     fj,n(s,XniB)ds + an(s,XniB)dBB.
Jo Jo
For fixed m < n the functions fim and fin, and am and an agree on the interval [—m, m], whence by Lemma 7.11 the process Xm and Xn are indistinguishable on the set [0,Tm] for Tm = inf{£ > 0: |^m,t| > m or \X„,it\ > m}.
In particular, the first times that Xm, or Xn leave the interval [—m, m] are identical and hence the possibility "|X„it| > m" in the definition of Tm is superfluous. If 0 < Tn f oo, then we can consistently define a process X by setting it equal to Xn on [0, Tn], for every n. Then XTn = X7" and, by the preceding display and Lemma 5.54(i),
(7.9) xTn=£+     l(0tTn](s)[i,n(s,XniB)ds+ l(o,Tn]{s)o-7i{s,Xn!S)dBs.
Jo Jo
By the definitions of Tn, fin, an and X the integrands do not change if we delete the subscript n from fin, an and Xn. We conclude that
fTnAt rT„M
Xfn=£,+ fJ,{s,Xs)ds+ a{s,Xs)dBs.
Jo Jo
This being true for every n implies that X is a solution of the stochastic differential equation (7.1).
We must still show that 0 < Tn f oo. By the integration-by-parts formula and (7.8)
■y-2 v2
0 = 2     Xn}Sfj,n(s, Xn}S) ds + 2 / XntBan(s,XntB)dBB Jo Jo
+ / al(s,Xn,s)ds. Jo
The process 1(o,t„]XniBan(s, XnjS) is bounded on [0, t] and hence the process
11—y f0TnAt Xn}San(s, Xn}S) dBs is a martingale. Replacing t by Tn A t in the
7.1: Strong Solutions 117
preceding display and next taking expectations we obtain
t-TnAt
1 + EX2 TnAt = 1 + Ef2 + 2E /      Xn,.pn(s, *»,.) ds
Jo
rTnAt
+ E / o-l{s,Xn}S)ds Jo
<l+Ef2 + (Ct + C2)E /      (l + X2!S)dS Jo
<l+Ef2 + (Ct + C2) / (l + EX2^As)dS. Jo
We can apply Gronwall's lemma, Lemma 7.10, to the function on the far left of the display to conclude that this is bounded on [0,t], uniformly in n, for every fixed t. By the definition of Tn
P(0 <Tn< t)n2 < VXlTnM.
Hence P(0 < Tn < t) = 0(n~2) —> 0 as n —> oo, for every fixed t. Combined with the fact that P{Tn = 0) = P(|£| > n) ->■ 0, this proves that 0 < Tn f oo.
Finally, we drop the condition that f is square-integrable. By the preceding there exists, for every n € N, a solution Xn to the stochastic differential equation (7.1) with initial value fl|£|<n. By Lemma 7.11 the processes Xm and Xn are indistibguishable on the event {|f | < m) for every n >m. Thus Hindoo Xn exists almost surely and solves the stochastic differential equation with initial value f.
The last assertion of the theorem is a consequence of Lemma 7.12 below, or can be argued along the following lines. The distribution of the triple (£, B, ^W) on RxC[0, oo) xC[0, oo) is determined by the distribution of (£,B,X(n-1'>) and hence ultimately by the distribution of (£,B,XW), which is determined by the distribution of f, the distribution of B being fixed as that of a Brownian motion. Therefore the distribution of X is determined by the distribution of f as well. (Even though believable this argument needs to be given in more detail to be really convincing.) ■
7.10 Lemma (Gronwall). Let f:[0,T] -> 1 be a measurable function such that f(t) < A + B J* f(s) ds for every t € [0, T] and constants A and B > 0. Then f{t) < AeBt on [0,T].
Proof. We can write the inequality in the form F'(t) — BF(t) < A, for F the primitive function of / with F(0) = 0. This implies that (F(t)e~Bt)' < Ae~Bt. By integrating and rearranging we find that F(t) < (A/B)(eBt — 1). The lemma follows upon reinserting this in the given inequality. ■
118        7:  Stochastic Differential Equations
* 7.1.1  Auxiliary Results
The remainder of this section should be skipped at first reading.
The following lemma is used in the proof of Theorem 7.7, but is also of independent interest. It shows that given two pairs of functions (fa,ai) that agree on [0, oo) x [—n, n], the solutions Xi of the corresponding stochastic differential equations (of the type (7.1)) agree as long as they remain within [—n, n]. Furthermore, given two initial variables ft the corresponding solutions Xi are indistinguishable on the event {ft = ft}.
7.11 Lemma. For i = 1,2 let fa, <7;: [0, oo) x R -> R be measurable functions that satisfy (7.5)-(7.6), let ft be To-measurable random variables, and let Xi be continuous, adapted processes that satisfy (7.1) with (ft,/U;,<7;) replacing (ft/x, <r). If fa = fa and 0i = &2 on [0, oo) x [-n,n] and T = ini{t > 0: \Xi t\ > n, or \X2 t\ > n}, then X^ = X2 on the event {ft=ft}-
Proof. By subtracting the stochastic differential equations (7.1) with (ft,Hi,(Ji,Xi) replacing (ft/i,u,X), and evaluating at T A t instead of t, we obtain
rTAt
Xlt ~ X?t = a - ft + /     {fa{s,X1}S) - fa{s,X2}S)) ds Jo
pTAt
+        (<7i(s,Xi,.)-o-2{s,X2}S))dBs. Jo
On the event F = {ft = ft} € To the first term on the right vanishes. On the set (0, T] the processes X\ and Xi are bounded in absolute value by n. Hence the functions fa and fa, and ui and U2, agree on the domain involved in the integrands and hence can be replaced by their common values fa = fa and ui = a?. Then we can use the Lipschitz properties of fa and ui, and obtain, by similar arguments as in the proof of Theorem 7.7, that
Eavp\Xl.-XZflF<{t + l)C?E / \X1}S-X2}S\2dslF.
s<t Jo
(Note that given an event F € To the process Y1F is a martingale whenever the process Y is a martingale.) By Gronwall's lemma the left side of the last display must vanish and hence Xj" = X2 on F. m
The next lemma gives a strengthening of the last assertion of Theorem 7.7. The lemma shows that, under the conditions of the theorem, solutions to the stochastic differential equation (7.1) can be constructed in a canonical way as X = F(ft B) for a fixed map F in any strong setting consisting of an initial variable f and a Brownian motion B defined on some
7.2: Martingale Problem and Weak Solutions
119
filtered probability space. Because the map F is measurable, it follows in particular that the law of X is uniquely determined by the law of f.
The sense of the measurability of F is slightly involved. The map F is defined as a map F: Rx C[0, oo) —> C[0, oo). Here C[0, oo) is the collection of all continuous functions x: [0, oo) —> R. The projection a-field on this space is the smallest a-field making all evaluation maps ("projections") itt'-x i—y x(t) measurable. The projection filtration {Ht} is defined by n4 = o-(tts:s < t). (The projection u-field can be shown to be the Borel a-field for the topology of uniform convergence on compacta.) A Brownian motion process induces a law on the measurable space (C[0,00)^00). This is called the Wiener measure. We denote the completion of the projection filtration under the Wiener measure by {nt}.
For a proof of the following lemma, see e.g. Rogers and Williams, pages 125-127 and 136-138.
7.12 Lemma. Under the conditions of Theorem 7.7 there exists a map F:M x C[0,oo) —> C[0,oo) such that, given any filtered probability space (fi, T, {Tt),P) with a Brownian motion B and an To-measurable random variable f defined on it X = F(^, B) is a solution to the stochastic differential equation (7.1). This map can be chosen such that the map f F(^,x) is continuous for every x € C[0,00) and such that the map x F(£,x) is Ut — Ut-measurable for every t > 0 and every f € M. In particular, it can be chosen B x floo — -measurable.
7.2  Martingale Problem and Weak Solutions
If X is a continuous solution to the diffusion equation (7.3), defined on some filtered probability space, and /: ffi R is a twice continuously differentiate function, then Ito's formula yields that
df(Xt) = f'(Xt)a(Xt) dBt + f'{Xt)ii{Xt) dt + y"(Xt)a2(Xt) dt.
Defining the differentiable operator A by
Af = nf' + ho3f",
we conclude that the process
(7.13) t H- f{Xt) - f(X0) - f Af(Xs) ds
Jo
is identical to the stochastic integral (f'a)(X) ■ B, and hence is a local martingale. If / has compact support, in addition to being twice continuously differentiable, and a is bounded on compacta, then the function fa
120
7:  Stochastic Differential Equations
is bounded and the process in (7.13) is also a martingale. It is said that X is a solution to the (local) martingale problem. This martingale problem can be used to characterize, study and construct solutions of the diffusion equation: instead of constructing a solution directly, we search for a solution to the martingale problem. The following theorem shows the feasibility of this approach.
7.14 Theorem. Let X be a continuous adapted process on a given filtered space such that the process in (7.13) is a local martingale for every twice continuously differentiable function with compact support. Then there exists a weak solution to the diffusion equation (7.3) with the law of X0 as the initial law.
Proof. For given n € N let Tn = inf{i > 0: \Xt\ > n}, so that \XTn \ < n on (0,T„]. Furthermore, let / and g be twice continuously differentiable functions with compact supports that coincide with the functions x x and x i—^ x2 on the set [—n, ri\. By assumption the processes (7.13) obtained by setting the function / in this equation equal to the present / and to g are local martingales. On the set (0,Tn] they coincide with the processes M and N defined by
At time 0 the processes M and N vanish and so do the processes of the type (7.13). We conclude that the correspondence extends to [0,Tn] and hence the processes M and N are local martingales. By simple algebra
for the process A defined by
v      Jo '    vo ' Jo
By Ito's formula
7.2: Martingale Problem and Weak Solutions
121
We conclude that the process A is a local martingale and hence so is the process t ^ M? - /„* a2(Xs) ds. This implies that [M]t = /„* a2{Xs) ds.
Define a function a: ffi —>■ ffi by setting <7 equal to 1/a if u / 0 and equal to 0 otherwise, so that aa = la^o- Furthermore, given a Brownian motion process B define
B = a(X) ■ M + 1CT(X)=0 • B.
Being the sum of two stochastic integrals relative to continuous martingales, the process B possesses a continuous version that is a local martingale. Its quadratic variation process is given by
[B]t = a2(X) ■ [M]t + 2(a(X)la(x)=0) ■ [M,B]t + la(x)=0 ■ [B]t.
Here we have linearly expanded [B] = [B,B] and used Lemma 5.77. The middle term vanishes by the definition of a, while the sum of the first and third terms on the right is equal to /0*(a2a2(Xs) + la(xa)=o) ds = t. By Levy's theorem, Theorem 6.1, the process B is a Brownian motion process. By our definitions a(X) ■ B = la-(x)^o • M = M, because [i.a(x)=o ■ M] = 0 whence lo-(x)=o • M = 0. We conclude that
Xt = X0 + Mt+ [ n{Xs)ds = X0+ f a{Xs)dBs+ f n{Xs)ds. Jo Jo Jo
Thus we have found a solution to the diffusion equation (7.3).
In the preceding we have implicitly assumed that the process X and the Brownian motion B are defined on the same filtered probability space, but this may not be possible on the filtered space (fi,.F, {Tt},P) on which X is given originally. However, we can always construct a Brownian motion B on some filtered space (fi,F, {Tt},P) and next consider the product space
(n xfi,FxF,{Ft xft},PxP),
with the maps
{oj,oj) i—y X(oj), {oj,oj) i—y B(oj).
The latter processes are exactly as the original processes X and B and hence the first process solves the martingale problem and the second is a Brownian motion. The enlarged filtered probability space may not be complete and satisfy the usual conditions, but this may be remedied by completion and replacing the product filtration Tt x Tt by its completed right-continuous version. ■
It follows from the proof of the preceding theorem, that a solution X of the martingale problem together with the filtered probability space on which it is defined yields a weak solution of the diffusion equation if
122        7:  Stochastic Differential Equations
<7 is never zero. If a can assume the value zero, then the proof proceeds by extending the given probability space, and X, suitably defined on the extension, again yields a weak solution. The extension may be necessary, because the given filtered probability space may not be rich enough to carry a suitable Brownian motion process.
It is interesting that the proof of Theorem 7.14 proceeds in the opposite direction of the proof of Theorem 7.7. In the latter theorem the solution X is constructed from the given Brownian motion, whereas in Theorem 7.14 the Brownian motion is constructed out of the given X.
Now that it is established that solving the martingale problem and solving the stochastic differential equation in the weak sense are equivalent, we can prove existence of weak solutions for the diffusion equation from consideration of the martingale problem. The advantage of this approach is the availability of additional technical tools to handle martingales.
7.15 Theorem. If fi,a:M. —»■ R are bounded and continuous and v is a probability measure on M, then there exists a filtered probability space (fi, T, {Tt}, P) with a Brownian motion and a continuous adapted process X satisfying the diffusion equation (7.3) and such that X0 has law v.
Proof. Let (B,£) be a pair of a Brownian motion and an .Fo-measurable random variable with law v, defined on some filtered probability space. For every n € N define a process X^ by
k2~n <t<(k + 1)2"™, k = 0,1,2,....
Then, for every n, the process is a continuous solution of the stochastic differential equation
rt ft
(7.16) x[n)=t;+     nn(s)ds+ o-n(s)dBs,
Jo Jo
for the processes fin and an defined by
A*»(*)=M*m-»), = *2"" <*<(* +1)2"".
By Lemma 5.77 the quadratic variation of the process M defined by Mt = an ■ Bs+t - an ■ Bs is given by [M]t = fQs+t al(u) du. For s < t we obtain, by the triangle inequality and the Burkholder-Davis-Gundy inequality, Lemma 7.18,
E|XW-XtW|4<E [ »n(u)du\E f o-l(u)dBv
J s J s
<IMI4Js-*|4 + IMI4Js-*|2.
2
7.2: Martingale Problem and Weak Solutions 123
By Kolmogorov's criterion (e.g. Van der Vaart and Wellner, page 104) it follows that the sequence of processes X^ is uniformly tight in the metric space C[0, oo), equipped with the topology of uniform convergence on compacta. By Prohorov's theorem it contains a weakly converging subsequence. For simplicity of notation we assume that the whole sequence converges in distribution in C[0, oo) to a process X. We shall show that X solves the martingale problem, and then can complete the proof by applying Theorem 7.14.
The variable X0 is the limit in law of the sequence Xq1^ and hence is equal in law to f.
For a twice continuously differentiate function /: R —> M with compact support, an application of Ito's formula and (7.16) shows that the process
(7.17)    /(X<">) - /(X<">) - f\nn(s)f(X^) + \al{s)f'{X^)) ds
Jo
is a martingale. (Cf. the discussion before the statement of Theorem 7.14.) By assumption the functions [i and a are uniformly continuous on compacta. Hence for every fixed M the moduli of continuity
m(8)=     sup    \n(x) - n(y)\,      s(5) =    sup    \a(x) - a(y)\
\x-y\<5 \x-y\<5 \x\v\y\<M \x\v\y\<M
converge to zero as J J, 0. The weak convergence of the sequence implies the weak convergence of the sequence sups<i ixi™^, for every fixed t > 0. Therefore, we can choose M such that the events Fn = {sups<i |Xs"^ | < M} possess probability arbitrarily close to one, uniformly in n. The weak convergence also implies that, for every fixed t > 0,
An:= sup        |XW-XW|4 0.
\u-v\<2~n ,u<v<t
On the event Fn
[\nM-»{xin)))f{xin))ds <MAn)||/'|U 4o.
Jo
Combining this with a similar argument for an we conclude that the sequence of processes in (7.17) is asymptotically equivalent to the sequence of processes
M- = f(X^) - /(X<">) - f Af(X^) ds.
Jo
These processes are also uniformly bounded on compacta. The martingale property of the processes in (7.17) now yields that F,M^g(X^:u < s) —)■ 0 for every bounded, continuous function g: C[0, s] —> ffi. Because the map
124
7:  Stochastic Differential Equations
x '—y f(xt) — f(xo) — /J Af(xs) ds is also continuous and bounded as a map from C[0, oo) to ffi, this implies that
We conclude that X is a martingale relative to its natural filtration. It is automatically also a martingale relative to the completion of its natural filtration. Because X is right continuous, it is again a martingale relative to the right-continuous version of its completed natural filtration, by Theorem 4.6.
Thus X solves the martingale problem, and there exists a weak solution to the diffusion equation with initial law the law of X0, by Theorem 7.14. ■
7.18 Lemma (Burkholder-Davis-Gundy). For every p > 2 there exists a constant Cp such that E|Mt|p < CpE[M]f ^ for every continuous martingale M, 0 at 0, and every t > 0.
Proof. Define m = p/2 and Yt = cM2 + [M]t for a constant c > 0 to be determined later. By Ito's formula applied with the functions x x2m and {x,y) —> (ex2 + y)m we have that
Assume first that the process Y is bounded. Then the integrals of the two first terms on the right are martingales. Taking the integrals and next expectations we conclude that
The middle term in the second equation is nonnegative, so that the sum of the first and third terms is bounded above by EY/71. Because M2 < Yt/c, we can bound the right side of the first equation by a multiple of this sum. Thus we can bound the left side EM2™1 of the first equation by a multiple of the left side EY/71 of the second equation. Using the inequality |a;-r-2/|m < 2m-1(xm + ym) we can bound EYtm by a multipe of cmEMfm + E[M]^. Putting this together, we obtain the desired inequality after rearranging and choosing c > 0 sufficiently close to 0.
dM2m = 2mM2m-1 dMt + \2m(2m - ljM2"1"2 d[M]t, dY™ = mYtrrl~12cMt dMt + mY™'1 d[M]t
7.2: Martingale Problem and Weak Solutions
125
If Y is not uniformly bounded, then we stop M at the time Tn = ini{t > 0: \Yt\ > n}. Then YTn relates to MTn in the same way as Y to M and is uniformly bounded. We can apply the preceding to find that the desired inequality is valid for the stopped process M. Next we let n —> oo and use Fatou's lemma on the left side and the monotone convergence theorem on the right side of the inequality to see that it is valid for M as well. ■
Within the context of weak solutions to stochastic differential equations "uniqueness" of a solution should not refer to the underlying filtered probability space, but it does make sense to speak of "uniqueness in law". Any solution X in a given setting induces a probability distribution on the metric space C[0, oo). A solution X is called unique-in-law if any other solution X, possibly defined in a different setting, induces the same distribution on C[0,oo). Here X and X possess the same distribution if the vectors (Xtl,..., Xtk) and (Xtl,..., Xtk) are equal in distribution for every 0 < ti < ■ ■ ■ < tk- (This corresponds to using on C[0, oo) the u-field of all Borel sets of the topology of uniform convergence on compacta.)
The last assertion of Theorem 7.7 is exactly that, under the conditions imposed there, that the solution of the stochastic differential equation is unique-in-law. Alternatively, t here is an interesting sufficient condition for uniqueness in law in terms of the Cauchy problem accompanying the differential operator A. The Cauchy problem is to find, for a given initial function /, a solution u: [0, oo) x ]R —> M. to the partial differential equation
du
— = Au, u{0, •) = /.
Here du/dt is the partial derivative relative to the first argument of u, whereas the operator A on the right works on the function x u(t, x) for fixed t. We make it part of the requirements for solving the Cauchy problem that the partial derivatives du/dt and d2u/dx2 exist on (0, oo) x R and possess continuous extensions to [0, oo) x R.
A sufficient condition for solvability of the Cauchy problem, where the solution also satisfies the condition in the next theorem, is that the functions fi and o~2 are Holder continuous and that a2 is bounded away from zero. See Stroock and Varadhan, Theorem 3.2.1.
For a proof of the following theorem, see Karatzas and Shreve, pages 325-427 or Stroock and Varadhan.
7.19 Theorem. Suppose that the accompanying Cauchy problem admits for every twice continuous differentiable function f with compact support a solution u which is bounded and continuous on the strips [0, t] x M, for every t > 0. Then for any x € M the solution X to the diffusion equation with initial law Xq = x is unique.
126        7:  Stochastic Differential Equations
7.3  Markov Property
In this section we consider the diffusion equation
o Jo
Evaluating this equation at the time points t+s and s, taking the difference, and making the change of variables u = v + s in the integrals, we obtain
Because the stochastic integral depends only on the increments of the integrator, the process Bs+V can be replaced by the process Bv = Bs+V — Bs, which is a Brownian motion itself and is independent of Ts. The resulting equation suggests that conditionally on Ts (and hence given Xs) the process {Xs+t: t > 0} relates to the initial value Xs and the Brownian motion B in the same way as the process X relates to the pair (XS,B) (with Xs fixed). In particular, the conditional law of the process {Xs+t:t > 0} given Ts should be the same as the law of X given the initial value Xs (considered fixed).
This expresses that a solution of the diffusion equation is a time-homogeneous Markov process: at any time the process will given its past evolve from its present according to the same probability law that determines its evolvement from time zero. This is indeed true, even though a proper mathematical formulation is slightly involved.
A Markov kernel from ]R into 1R is a map {x,B) >->■ Q(x,B) such that
(i) the map x     Q(x,B) is measurable, for every Borel set B;
(ii) the map B >->■ Q(x,B) is a Borel measure, for every x € 1R.
A general process X is called a time-homogeneous Markov process if for every t > 0 there exists a Markov kernel Qt such that, for every Borel set B and every s > 0,
By the towering property of a conditional expectation the common value in the display is then automatically also a version of P(Xs+t € B\XS). The property expresses that the distribution of X at the future time s + t given the "past" up till time s is dependent on its value at the "present" time s only. The Markov kernels Qt are called the transition kernels of the process.
Suppose that the functions fi and a satisfy the conditions of Theorem 7.7. In the present situation these can be simplified to the existence, for every t > 0 of a constant Ct such that, for all x,y € [—t, t],
Xs+t = Xs+     n(Xs+v) dv + cr(Xs+,
-v
P(Xs+t GB\Xu:u<s)=Qt(Xs,B),
a.s..
(7.20)
\n(x) -n(y) \ < Ct\x-y \a(x) - a(y) \ < Ct\x - y
7.3: Markov Property
127
and the existence of a constant C such that, for all x € ffi,
(7.21)
\p(x)\ <C(l + \x\), \a{x) \ < C(l + |a;|).
Under these conditions Theorem 7.7 guarantees the existence of a solution Xx to the diffusion equation with initial value Xq = x, for every x € 1R, and this solution is unique in law. The following theorem asserts that the distribution Qt{x, •) of Xf defines a Markov kernel, and any solution to the diffusion equation is a Markov process with Qt as its transition kernels.
Informally, given Ts and Xs = x the distribution of Xs+t is the same as the distribution of Xf.
7.22 Theorem. Assume that the functions fi,a:M —> M satisfy (7.20)-(7.21). Then any solution X to the diffusion equation (7.3) is a Markov process with transition kernels Qt defined by Qt(x,B) = P(X* € B).
Proof. See Chung and Williams, pages 235-243. These authors (and most authors) work within the canonical set-up where the process is (re)defined as the identity map on the space C[0, oo) equipped with the distribution induced by Xx. This is immaterial, as the Markov property is a distributional property; it can be written as
for every measurable set B and bounded measurable function g: C[0, s] —> ffi. This identity depends on the law of X only, as does the definition of Qt.
The map x >-*■ f f(y)Qt{x,dy) is shown to be continuous for every bounded continuous function f:M —> R in Lemma 10.9 of Chung and Williams. In particular, it is measurable. By a monotone class argument this can be seen to imply that the map x Qt(x,B) is measurable for every Borel set B. m
ElXa+teBg(Xu:u < s) = EQt(Xs,B)g(Xu:Xu:u < s),
8
Option Pricing
in Continuous Time
In this chapter we discuss the Black-Scholes model for the pricing of derivatives. Given the tools developed in the preceding chapters it is relatively straightforward to obtain analogues in continuous time of the discrete time results for the Cox-Ross-Rubinstein model of Chapter 3. The model can be set up for portfolios consisting of several risky assets, but for simplicity we restrict to one such asset.
We suppose that the price St of a stock at time t > 0 satisfies a stochastic differential equation of the form
(8.1) dSt=HtStdt + o-tStdWt.
Here W is a Brownian motion process on a given filtered probability space (Q,T,{Tt},P), and {fit'-t > 0} and {at:t > 0} are predictable processes. The filtration {Tt} is the completed natural filtration generated by W, and it is assumed that S is continuous and adapted to this filtration. The choices fit = A* and ct = for constants [i and a, give the original Black-Scholes model. These choices yield a stochastic differential equation of the type considered in Chapter 7, and Theorem 7.7 guarantees the existence of a solution S in this case. For many other choices the existence of a solution is guaranteed as well. For our present purpose it is enough to assume that there exist a continuous adapted solution S.
The process a is called the volatility of the stock. It determines how variable or "volatile" the movements of the stock are. We assume that this process is never zero. The process fi gives the drift of the stock. It is responsible for the exponential growth of a typical stock price.
Next to stocks our model allows for bonds, which in the simplest case are riskless assets with a predetermined yield, much as money in a savings account. More generally, we assume that the price Rt of a bond at time t satisfies the differential equation
dRt = rtRt dt,
Ro = 1.
8:  Option Pricing in Continuous Time 129
Here rt is some continuous adapted process called the interest rate process. (Warning: r is not the derivative of R, as might be suggested by the notation.) The differential equation can be solved to give
Rt = efor'ds.
This is the "continuously compounded interest" over the interval [0,t\. In the special case of a constant interest rate rt = r this reduces to Rt = ert.
A portfolio (A, B) is defined to be a pair of predictable processes A and B. The pair (At,Bt) gives the numbers of bonds and stocks owned at time t, giving the portfolio value
(8.2) Vt = AtRt + BtSt.
The predictable processes A and B can depend on the past until "just before t" and we may think of changes in the content of the portfolio as a reallocation of bonds and stock that takes place just before time t. A portfolio is "self-financing" if such reshuffling can be carried out without import or export of money, whence changes in the value of the portfolio are due only to changes in the values of the underlying assets. More precisely, we call the portfolio (A, B) self-financing if
(8.3) . dVt=AtdRt + BtdSt.
This is to be interpreted in the sense that V must be a semimartingale satisfying V = Vq + A ■ R + B ■ S. It is implicitly required that A and B are suitable integrands relative to R and S.
A contingent claim with expiry time T > 0 is defined to be an Tt-measurable random variable. It is interpreted as the value at the expiry time of a "derivative", a contract based on the stock. The European call option, considered in Chapter 3, is an important example, but there are many other contracts. Some examples of contingent claims are:
(i) European call option: (St — K)+■
(ii) European put option: (K — St)+-(hi) Asian call option: (fQT St dt — K) +.
(iv) lookback call option: St — nrino<t<T St,
(v) down and out barrier option: (St — -£0+l{mino<t<T St > H}.
The constants K and H and the expiry time T are fixed in the contract. There are many more possibilities; the more complicated contracts are referred to as exotic options. Note that in (iii)-(v) the claim depends on the history of the stock price throughout the period [0, T]. All contingent claims can be priced following the same no-arbitrage approach that we outline below.
A popular option that is not covered in the following is the American put option. This is a contract giving the right to sell a stock at any time in [0, T] for a fixed price K. The value of this contract cannot be expressed in a
130        8:  Option Pricing in Continuous Time
contingent claim, because its value depends on an optimization of the time to exercise the contract (i.e. sell the stock). Pricing an American put option involves optimal stopping theory, in addition to the risk-neutral pricing we discuss below. A bit surprising is that a similar complication does not arise with the American call option, which gives the right to buy a stock at any time until expiry time. It can be shown that it is never advantageous to exercise a call option before the expiry time and hence the American call option is equivalent to the European call option.
Because the claims we wish to evaluate always have a finite term T, all the processes in our model matter only on the interval [0, T]. We may or must understand the assumptions and assertions accordingly.
In the discrete time setting of Chapter 3 claims are priced by reference to a "martingale measure", defined as the unique measure that turns the "discounted stock process" into a martingale. In the present setting the discounted stock price is the process S defined by St = Rt~1St- By Ito's formula and (8.1),
(8.4)        dSt = —§ dRt + ^-dSt = ^—^^St dt + dWt.
Kt Kt o~t    Ht Ht
Here and in the following we apply Ito's formula with the function r >->• 1/r, which does not satisfy the conditions of Ito's theorem as we stated it. However, the derivations are correct, as can be seen by substituting the explicit form for Rt as an exponential and next applying Ito's formula.
Under the true measure P governing the Black-Scholes stochastic differential equation (8.1) the process W is a Brownian motion and hence S is a local martingale if its drift component vanishes, i.e. if fit = n- This will rarely be the case in the real world. Girsanov's theorem allows us to eliminate the drift part by a change of measure and hence provides the martingale measure that we are looking for. The process
crt
is called the market price of risk. If it is zero, then the real world is already "risk-neutral"; if not, then the process 8 measures the deviation from a risk-neutral market relative to the volatility process.
Let Z = £(—8 ■ W) be the exponential process of —8 ■ Z, i.e.
„        - f* 9adWa-\ V Bids
Zt = e Jo 2 Jo
We assume that the process 8 is such that the process Z is a martingale (on [0,T]). For instance, this is true under Novikov's condition. We can next define a measure P on (fi, T, P) by its density dP = Zt dP relative to P. Then the process W defined by
Wt = Wt + f 8sds Jo
8:  Option Pricing in Continuous Time 131
is a Brownian motion under P, by Corollary 6.16, and, by the preceding calculations,
(8.5) dSt = ^-StdWt.
-ft*
It follows that S is a P-local martingale. As in the discrete time setting the "reasonable price" at time 0 for a contingent claim with pay-off X is the expectation under the martingale measure of the discounted value of the claim at time T, i.e.
Vo = ERrp^X,
where E denotes the expectation under P. This is a consequence of economic, no-arbitrage reasoning, as in Chapter 3, and the following theorem.
8.6 Theorem. Let X be a nonnegative contingent claim with EJ?^11 | < oo. Then there exists a self-financing strategy with value process V such that
(i) V > 0 up to indistiguishability.
(ii) Vt = X almost surely. (Hi) Vo = ER^X.
Proof. The process S = R~XS is a continuous semimartingale under P and a continuous local martingale under P, in view of (8.5). Let V be a cadlag version of the martingale
Vt = E(Rr1X\Tt).
Suppose that there exists a predictable process B such that
dVt = BtdSf
Then V is continuous, because S is continuous, and hence predictable. Define
A = V -BS.
Then A is predictable, because V, B and S are predictable. The value of the portfolio (A, B) is given by V = AR + BS = (V - BS)R + BS = RV and hence, by Ito's formula and (8.4),
dVt = Vt dRt + Rt dVt = {At + BtSt) dRt + RtBt dSt
= (At + BtR^St) dRt + RtBt{-StRt2 dRt + R*1 dSt) = AtdRt + Bt dSt.
Thus the portfolio (A, B) is self-financing. Statements (i)-(iii) of the theorem are clear from the definition of V and the relation V = RV.
132        8:  Option Pricing in Continuous Time
We must still prove the existence of the process B. In view of (8.5) we need to determine this process B such that
dVt = Bt^dWt. -ft*
The process W is a P-Brownian motion and V is a P-martingale. If the underlying filtration would be the completion of the natural filtration generated by W, then the representation theorem for Brownian local martingales, Theorem 6.6, and the fact that atSt is strictly positive would immediate imply the result. By assumption the underlying filtration is the completion of the natural filtration generated by W. Because W and W differ by the process fQ 6S ds, it appears that the two nitrations are not identical and hence this argument fails in general. (In the special case in which [it, & and rt and hence 6t are deterministic functions the two filtration are clearly the same and hence the proof is complete at this point.) We can still prove the desired representation by a detour. We first write the P-local martingale V in terms of P-local martingales through
- _ E(R^XZT\Ft) _ Ut
Vt~     E(ZT\Ft)     ~ Zt' a'S"
Here U, defined as the numerator in the preceding display, is a P-martingale relative to {Ft}. By the representation theorem for Brownian martingales the process U possesses a continuous version and there exists a predictable process C such that U = Uq + C-W. The exponential process Z = £(—8-W) satisfies dZ = Z d{-6 ■ Z) = -Z6 dW and hence d[Z]t = Z\d\ dt. Careful application of Ito's formula gives that
dVt = ~^dZt + ^ + T^d[Z\t ~ ^2 d[U, Z]t
= -^(-Zt6t) dWt + + § m dt + ^CtZt6t dt
Ut6t + ct JT?, =-=-dWt.
This gives the desired representation of V in terms of W. m
We interpret the preceding theorem economically as saying that Vq = ER^X is the just price for the contingent claim X. In general it is not easy to evaluate this explicitly, but for Black-Scholes option pricing it is.
First the stock price can be solved explicitly from (8.1) to give
C        C     r^s-haDds+r aadWa
St = SoeJo ^   2 °> Jo
8:  Option Pricing in Continuous Time 133
Because we are interested in this process under the martingale measure P, it is useful to write it in terms of W as
St = SoeJoK    2 >> Jo
Note that the drift process [i does not make part of this equation: it plays no role in the pricing formula. Apparently the systematic part of the stock price diffusion can be completely hedged away. If the volatility u and the interest rate r are constant in time, then this can be further evaluated, and we find that, under P,
This is exactly as in the limiting case for the discrete time situation in Chapter 3. The price of a European call option can be written as, with Z a standard normal variable,
It is straightforward calculus to evaluate this explicitly, and the result is given already in Chapter 3.
The exact values of most of the other option contracts mentioned previously can also be evaluated explicitly in the Black-Scholes model. This is more difficult, because the corresponding contingent claims involve the full history of the process S, not just the martingale distribution at some fixed time point. However, if the processes u and r are not constant, then the explicit evaluation may be impossible. In some cases the problem can be reduced to a partial differential equation, which can next be solved numerically.
Assume that the value process V of the replicating portfolio as in Theorem 8.6 can be written as Vt = f{t,St) for some twice differentiate function      Then, by Ito's formula and (8.1),
dVt = Dif{t, St) dt + D2f(t, St) dSt + \-D22f(t, StWtS? dt.
By the self-financing equation and the definition of V = AR + BS, we have that
log
fi~jv(c
r-±a2)t,a2t).
dVt = At dRt + Bt dSt = (Vt - BtSt)rt dt + Bt dSt. The right sides of these two equations are identical if
£>i/(i, 5t) + ±D22f(t,StytS2 = {Vt - BtSt)n, D2f(t,St)=Bt.
I do not know in what situations this is a reasonable assumption.
134        8:  Option Pricing in Continuous Time
We can substitute Vt = f(t,St) in the right side of the first equation, and replace Bt by the expression given in the second. If we assume that at = a(t,St) and rt = r(t,St), then the resulting equation can be written in the form
ft + \fsso-'1s1 = fr- fssr,
where we have omitted the arguments (t,s) from the functions ft, fss, f, fs and r, and the indices t and s denote partial derivatives relative to t or s of the function (t, s) f(t,s). We can now try and solve this partial differential equation, under a boundary condition that results from the pay-off equation. For instance, for a European call option the equation f(T, St) = Vt = {St — K)+ yields the boundary condition
f(T,s) = (s-K)+.
8.7 EXERCISE. Show that the value of a call option at time t is always at least (St — e~r(T~^K)+, where r is the (fixed) interest rate. (Hint: if not, show that any owner of a stock would gain riskless profit by: selling the stock, buying the option and putting e~rtK in a savings account, sit still until expiry and hence owning on option and money K at time T, which is worth at least St-)
8.8 EXERCISE. Show, by an economic argument, that the early exercise of an American call option never pays. (Hint: if exercised at time t, then the value at time t is (St - K)+. This is less than (St - e-r{-T-^K)+.)
8.9 EXERCISE. The put-call parity for European options asserts that the values Pt of a put and Ct of a call option at t with strike price K and expiry time T based on the stock S are related as St + Pt = Ct + Ke~r(T~t\ where r is the (fixed) interest rate. Derive this by an economic argument, e.g. comparing portfolios consisting of one stock and one put option, or one call option and an amount Ke~rT in a savings account. Which one of the two portfolios would you prefer?