1 Natural units
Almost all units in nature are derived. For instance, in the cgs system of
units one chooses the unit of length to be cm, the unit of mass to be g and
the unit of time to be s. All other units can be expressed in terms of these
three. For instance, the unit of force can be found from Newton’s law to
be $ML/T^2$, where M, L and T represent the particular units chosen for mass,
length and time respectively. Similarly, the unit of charge can be found from
Coulomb’s law to be $M^{1/2}L^{3/2}/T$. We see that sometimes odd-looking fractional
powers appear which is probably one reason why people have invented new
names for these units. Anyway, the choice of units which are regarded as
fundamental is by no means unique. For instance, instead of choosing M, L
and T, we can choose as fundamental, the unit of energy E, the unit of
velocity V and the unit of “action” A. The last unit is somewhat unusual.
It is the unit that the action functional in mechanics or quantum mechanics
carries. Since the action is $\int dt\,(K - V)$ we see that the unit is energy·time,
the same as the constant $\hbar$ carries. (It would also be possible to view A as
the unit of angular momentum.) If we choose the unit of velocity such that
the speed of light $c = 1$ and the unit of action such that $\hbar = 1$, then we have
something that is called “natural units”. It only remains to choose a unit for
energy, which we will usually choose to be eV ($1\,\mathrm{eV} = 1.602 \cdot 10^{-19}$ J).
To be able to convert between these two systems of units, we express the
new units in terms of the old ones. It is not difficult to find that
$$V = \frac{L}{T}, \qquad E = \frac{ML^2}{T^2}, \qquad A = \frac{ML^2}{T},$$
or
$$M = \frac{E}{V^2}, \qquad T = \frac{A}{E}, \qquad L = \frac{AV}{E}. \qquad (1)$$
Using these relations we can make a small table
Quantity          CGS                    Natural
Force             $ML/T^2$               $E^2/(AV)$
Charge            $M^{1/2}L^{3/2}/T$     $(AV)^{1/2}$
Magnetic moment   $M^{1/2}L^{5/2}/T$     $(AV)^{3/2}/E$
From the table we see that, for instance, charge is dimensionless in natural
units (it is given in $\mathrm{eV}^0$). Force has dimension $\mathrm{eV}^2$ while length, time and
magnetic moment all have the same dimension $\mathrm{eV}^{-1}$. If you are given a
number in natural units and you want to change it into CGS units you just
have to insert the proper powers of $\hbar$ and $c$ to restore the dimensions. Here
is an example: the charge of the electron in natural units is
$$e_n = 8.543 \cdot 10^{-2}\ \mathrm{eV}^0. \qquad (2)$$
Converting this to CGS we have to consult the table to see that we need to
multiply this by $\sqrt{\hbar c}$ to restore the dimensions. Note here that the units
you choose for $\hbar$ and $c$ will also give you the units of the final result. Since
we are interested in CGS units we have to give $\hbar$ and $c$ in cm, g and s instead
of the more usual m, kg and s. For reference I give them here:
$$\hbar = 1.05459 \cdot 10^{-27}\ \mathrm{g\,cm^2/s}, \qquad c = 2.9979 \cdot 10^{10}\ \mathrm{cm/s}. \qquad (3)$$
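Multiplying together gives
$$e_{cgs} = e_n\sqrt{\hbar c} = 4.803 \cdot 10^{-10}\ \mathrm{esu} = 1.602 \cdot 10^{-19}\ \mathrm{C}, \qquad (4)$$
where we used $1\ \mathrm{esu} = 3.336 \cdot 10^{-10}$ C.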
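The conversion in (4) is easy to check numerically. The following is a minimal sketch that uses only the constants quoted in (2) and (3); the function name `charge_natural_to_cgs` is my own choice for illustration.

```python
import math

# Constants in CGS units, as quoted in the text (eq. 3).
HBAR_CGS = 1.05459e-27  # g cm^2 / s
C_CGS = 2.9979e10       # cm / s

def charge_natural_to_cgs(e_natural):
    """Restore the factor sqrt(hbar * c) that natural units set to 1."""
    return e_natural * math.sqrt(HBAR_CGS * C_CGS)

e_cgs = charge_natural_to_cgs(8.543e-2)
print(f"e_cgs = {e_cgs:.3e} esu")
```

The result should agree with the value in (4) to the four digits quoted there.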
Another example is the expression for the Bohr radius in natural units
$$(a_0)_n = \frac{1}{m_n e_n^2}\ \mathrm{eV}^{-1}. \qquad (5)$$
To transform this to an expression of dimension length we multiply by $\hbar c$
to get
$$(a_0)_{cgs} = \frac{\hbar c}{m_n e_n^2}. \qquad (6)$$
The formula still depends on the values of e and m in natural units though.
We can convert these also by using the relation between the charges derived
above and the relation between the masses, $m_n = m_{cgs}c^2$. This gives
$$(a_0)_{cgs} = \frac{\hbar^2 c^2}{m_{cgs}c^2 e_{cgs}^2} = \frac{\hbar^2}{m_{cgs}e_{cgs}^2}. \qquad (7)$$
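As a sanity check, one can evaluate the Bohr radius both directly from (7) and by going through natural units as in (5) and (6). This is only a sketch: the electron mass `M_E_CGS` is an assumed value that is not quoted in these notes, and the variable names are mine.

```python
import math

HBAR_CGS = 1.05459e-27   # g cm^2 / s (from the text)
C_CGS = 2.9979e10        # cm / s (from the text)
ERG_PER_EV = 1.602e-12   # 1 eV = 1.602e-19 J = 1.602e-12 erg
E_CGS = 4.803e-10        # esu, electron charge (from the text)
M_E_CGS = 9.109e-28      # g, electron mass (assumed value, not from the text)

# Route 1: the CGS formula (7).
a0_direct = HBAR_CGS**2 / (M_E_CGS * E_CGS**2)

# Route 2: convert to natural units, use (5), then restore hbar*c as in (6).
m_n = M_E_CGS * C_CGS**2 / ERG_PER_EV          # mass in eV
e_n = E_CGS / math.sqrt(HBAR_CGS * C_CGS)      # dimensionless charge
a0_n = 1.0 / (m_n * e_n**2)                    # length in eV^-1
hbar_c_ev_cm = HBAR_CGS * C_CGS / ERG_PER_EV   # hbar*c expressed in eV cm
a0_via_natural = a0_n * hbar_c_ev_cm           # back to cm

print(f"a0 (direct)        = {a0_direct:.3e} cm")
print(f"a0 (natural units) = {a0_via_natural:.3e} cm")
```

Both routes are algebraically identical, so they should agree up to rounding, at roughly half an ångström.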
Comment 1: The definition and even the unit of charge differs between
different systems of units. This comes about because there are different
conventions about how to write Coulomb’s law. We have
$$e_{cgs} = \frac{e_{hl}}{\sqrt{4\pi}} = \frac{e_{SI}}{\sqrt{4\pi\varepsilon_0}}, \qquad (8)$$
where $e_{hl}$ represents the charge in the Heaviside-Lorentz system of units. It clearly
has the same dimension as the CGS charge. However, in the SI system
there is an additional constant $\varepsilon_0$ which has the dimension of inverse velocity
squared. Charge in the SI system therefore has a different dimension than in
CGS.
Comment 2: There exist other systems of units with only one basic unit.
For instance, in relativistic gravitational physics it is often advantageous
to choose a system of units where the speed of light c = 1 and Newton’s
gravitational constant G = 1. The remaining unit is the unit of length which
can be chosen arbitrarily (light-year, m, cm etc.). This system of units is
called geometrical units. In this system, for example, the mass of the earth
is approximately 0.44 cm.
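The number quoted for the earth can be reproduced by restoring the factors of G and c, that is, by computing $GM/c^2$. In this sketch the values of G and of the earth mass are assumed textbook values, not taken from these notes.

```python
G_CGS = 6.674e-8        # cm^3 g^-1 s^-2 (assumed value)
C_CGS = 2.9979e10       # cm / s (from the text)
M_EARTH_G = 5.972e27    # g (assumed value)

def mass_to_length_cm(mass_g):
    """Express a mass as a length in geometrical units: G * M / c^2."""
    return G_CGS * mass_g / C_CGS**2

earth_cm = mass_to_length_cm(M_EARTH_G)
print(f"Earth mass in geometrical units: {earth_cm:.2f} cm")
```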
2 The Dirac equation
There is a curious way to “derive” the Schrödinger equation. Namely, take
the relation for the energy in classical physics
$$E = \frac{p^2}{2m} + V. \qquad (9)$$
One gets the Schrödinger equation by making the replacement
$$E \to i\hbar\frac{\partial}{\partial t}, \qquad p_i \to -i\hbar\frac{\partial}{\partial x^i}, \qquad (10)$$
and then letting the relation (9) “act” on a wavefunction, which gives
$$i\hbar\frac{\partial}{\partial t}\psi = \left(-\frac{\hbar^2}{2m}\partial_x^2 + V\right)\psi. \qquad (11)$$
This derivation inspired many people to try to derive a relativistic analog
of the Schrödinger equation by starting with the relativistic energy relation
$E^2 = p^2c^2 + m^2c^4$ (or $E^2 = p^2 + m^2$ in natural units) instead of starting with
(9). Making the same substitution (10) as before we get a relativistic wave
equation
$$-\partial_t^2\phi = -\left(\partial_x^2 + \partial_y^2 + \partial_z^2\right)\phi + m^2\phi. \qquad (12)$$
This can be written in a more relativistic fashion by introducing a metric
$g_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$ as
$$g^{\mu\nu}\partial_\mu\partial_\nu\phi + m^2\phi = 0, \qquad (13)$$
an equation which is known as the Klein-Gordon equation.
To find out more about its properties, we now go on to find solutions to
the Klein-Gordon equation. For instance, there is a complete set of plane-wave
solutions, as we will now show. First we make the ansatz $\phi = e^{-ik_\mu x^\mu}$.
Acting on this with a four-derivative $\partial_\mu$ gives us
$$\partial_\mu e^{-ik_\nu x^\nu} = -ik_\mu e^{-ik_\nu x^\nu}. \qquad (14)$$
Using this result twice we may insert the ansatz into the Klein-Gordon equation
to get
$$\partial_\mu\partial^\mu\phi + m^2\phi = (-k_\mu k^\mu + m^2)\phi. \qquad (15)$$
We see that for $\phi$ to be a solution to the Klein-Gordon equation we need the
four momentum $k^\mu$ to satisfy the relation
$$k_\mu k^\mu = m^2, \qquad (16)$$
and rewriting the four momentum in terms of its components, $k^\mu = (E, \mathbf{k})$,
where $\mathbf{k}$ is the ordinary three momentum, we recover the relativistic energy
relation $E^2 = \mathbf{k}^2 + m^2$. Let us recapitulate: the Klein-Gordon equation has a
complete set of plane wave solutions $\phi(x) = e^{-ik\cdot x}$ where the four momentum
has to satisfy the relativistic energy condition $k \cdot k = m^2$. Any solution
can then be written as a linear combination of these plane waves. There is
however a funny new feature of these solutions. If the four vector $k^\mu = (E, \mathbf{k})$
gives a solution, then the four vector $k^\mu = (-E, \mathbf{k})$ with negative energy is
also a solution! Thus, for every solution with positive energy, there is a
solution with negative energy, which seems physically unacceptable since it
would lead to an unstable theory (there would be no state with lowest energy,
that is, no vacuum state).
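Dirac identified the root of this problem in the fact that the Klein-Gordon
equation is quadratic in the time derivative whereas the Schrödinger equation
is linear. He tried to get around this by introducing an equation which
would be linear in time derivatives. To achieve this he used some interesting
properties of the Pauli matrices
$$\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad (17)$$
which fulfill the relation $\sigma^i\sigma^k = i\epsilon^{ikl}\sigma^l + \delta^{ik}\mathbf{1}$. This made it possible for Dirac
to write
$$k_\mu k^\mu = E^2 - k^ik^i = \left(E\mathbf{1} - k^i\sigma^i\right)\left(E\mathbf{1} + k^l\sigma^l\right). \qquad (18)$$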
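The factorization (18) can be verified numerically for any particular choice of E and k. Below is a minimal sketch with plain nested lists; the helper names `matmul`, `lincomb` and `factorized_product` and the sample values are my own choices for illustration.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lincomb(coeffs, mats):
    """Return sum_i coeffs[i] * mats[i] for equally sized matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats))
             for j in range(cols)] for i in range(rows)]

ONE = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def factorized_product(E, k):
    """(E*1 - k.sigma)(E*1 + k.sigma) as a 2x2 matrix."""
    k_sigma = lincomb(k, SIGMA)
    left = lincomb([E, -1], [ONE, k_sigma])
    right = lincomb([E, 1], [ONE, k_sigma])
    return matmul(left, right)

E, k = 5.0, (1.0, 2.0, 3.0)
product = factorized_product(E, k)
expected = E**2 - sum(ki**2 for ki in k)   # E^2 - k.k
print(product, expected)
```

The product should come out proportional to the unit matrix, with the number $E^2 - k^ik^i$ on the diagonal.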
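That is, by writing the equation in terms of two by two matrices, he was able
to split it into factors linear in energy. The price he had to pay was that
the wave functions now become two-dimensional column vectors (or spinors as
they are more commonly known). Thus our second attempt at a relativistic
wave equation looks like this:
$$\left(\mathbf{1}\,i\partial_t - \sigma^i\,i\partial_i\right)\left(\mathbf{1}\,i\partial_t + \sigma^l\,i\partial_l\right)\phi_A = m^2\phi_A, \qquad (19)$$
where $\phi_A = \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix}$ is a two-dimensional column vector. By introducing a
second two-dimensional column vector
$$m\phi_B = \left(\mathbf{1}\,i\partial_t + \sigma^l\,i\partial_l\right)\phi_A, \qquad (20)$$
we can write an equation (well, really a system of equations) which is linear
in time derivatives
$$m\phi_B = \left(\mathbf{1}\,i\partial_t + \sigma^l\,i\partial_l\right)\phi_A, \qquad m\phi_A = \left(\mathbf{1}\,i\partial_t - \sigma^i\,i\partial_i\right)\phi_B. \qquad (21)$$
For purely conventional reasons one often redefines the column vectors as
$\phi_\pm = \phi_A \pm \phi_B$, which makes it possible to write the above equation as
$$m\phi_+ = \mathbf{1}\,i\partial_t\phi_+ + \sigma^l\,i\partial_l\phi_-, \qquad m\phi_- = -\mathbf{1}\,i\partial_t\phi_- - \sigma^l\,i\partial_l\phi_+, \qquad (22)$$
or, defining a four component column vector $\psi = \begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix}$ and four by four
matrices
$$\gamma^0 = \begin{pmatrix} \mathbf{1} & 0 \\ 0 & -\mathbf{1} \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}, \qquad (23)$$
we may write the resulting equations in a very compact form as
$$\gamma^\mu\,i\partial_\mu\psi = m\psi. \qquad (24)$$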
Notice that this is a matrix equation (it is really four equations written in
a very nice and compact form using matrices) and that it is linear in time
derivatives, which is exactly what Dirac wanted to achieve. This equation
is known as the Dirac equation. To make the comparison to the ordinary
Schrödinger equation more prominent, we can rewrite it as
$$\gamma^0\,i\partial_t\psi = -\gamma^l\,i\partial_l\psi + m\psi, \qquad (25)$$
and using that $\gamma^0\gamma^0 = \mathbf{1}$ we find
$$i\partial_t\psi = -\gamma^0\gamma^l\,i\partial_l\psi + m\gamma^0\psi. \qquad (26)$$
We thus see that the Hamiltonian operator that we get from the Dirac equation
is $H = -\gamma^0\gamma^l\,i\partial_l + m\gamma^0$.
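The property $\gamma^0\gamma^0 = \mathbf{1}$ used above can be checked directly from the explicit matrices (23). The sketch below also checks the standard anticommutation relation $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}\mathbf{1}$, which is not derived in these notes but holds for this representation. The helper names are my own.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def neg(m):
    """Return -m."""
    return [[-x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

ONE = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# gamma^0 and gamma^i exactly as in eq. (23).
GAMMA = [block(ONE, ZERO, ZERO, neg(ONE))] + [block(ZERO, s, neg(s), ZERO) for s in SIGMA]
METRIC = [1, -1, -1, -1]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def satisfies_clifford_algebra():
    """Check {gamma^mu, gamma^nu} = 2 g^{mu nu} * 1 for all index pairs."""
    for mu in range(4):
        for nu in range(4):
            target = 2 * METRIC[mu] if mu == nu else 0
            anti = anticommutator(GAMMA[mu], GAMMA[nu])
            for i in range(4):
                for j in range(4):
                    if abs(anti[i][j] - (target if i == j else 0)) > 1e-12:
                        return False
    return True

print(satisfies_clifford_algebra())
```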
Again, to get a feeling for the physics we can try to solve the equation.
Since the wavefunction is a four component column vector we make the ansatz
for a plane wave
$$\psi = u(p)\,e^{-ip\cdot x}, \qquad (27)$$
where $u(p)$ is a four component column vector possibly dependent on p.
Inserting this into the Dirac equation we get
$$(i\gamma^\mu\partial_\mu - m)\psi = (\gamma^\mu p_\mu - m)\,u(p)\,e^{-ip\cdot x}, \qquad (28)$$
so we see that for this to be a solution of the Dirac equation we need the
four column vector u to satisfy the matrix equation
$$(\gamma^\mu p_\mu - m)\,u(p) = 0. \qquad (29)$$
Using the expressions for the gamma matrices found earlier we can rewrite
this in an even more explicit form
$$\begin{pmatrix} E-m & 0 & -p_3 & -p_- \\ 0 & E-m & -p_+ & p_3 \\ p_3 & p_- & -E-m & 0 \\ p_+ & -p_3 & 0 & -E-m \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} = 0, \qquad (30)$$
where we have defined the complex combinations $p_\pm = p_1 \pm ip_2$. This equation
has four independent solutions. We will find one of them, but I recommend
that you similarly try to find the other three. Actually, for this equation to
be solvable we need the determinant of the matrix to be zero. We can easily
evaluate it to be $(E^2 - p^2 - m^2)^2$, so we see that a necessary condition for
this equation to have solutions is that the “old” relativistic energy condition
is satisfied. Unfortunately this means that we did not get rid of the solutions
with negative energy. Therefore we first need to assume that the condition
holds, then we can go on and try to find a solution. To make it a little bit
simpler, let us first try it in the case where $\mathbf{p} = 0$. Then the equation looks
like
$$\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -2m & 0 \\ 0 & 0 & 0 & -2m \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} = 0, \qquad (31)$$
for the case of positive energy, i.e. when $E = +m$, and in the case where the
energy is negative, i.e. when $E = -m$, it looks like
$$\begin{pmatrix} -2m & 0 & 0 & 0 \\ 0 & -2m & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} = 0. \qquad (32)$$
In the positive energy case we have the two independent solutions
$$\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \qquad (33)$$
and in the negative energy case the solutions look like
$$\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}. \qquad (34)$$
Turning on the three momentum p we have to solve the full equations (30)
but we can expect that the solutions should not differ too much from the
zero p solutions, at least when p is small. Then we should be able to find a
solution of the form
$$\begin{pmatrix} 1 \\ 0 \\ a \\ b \end{pmatrix}, \qquad (35)$$
where a and b are small of order p (or possibly smaller). Inserting this
ansatz into the equation immediately gives us that $a = \frac{p_3}{E+m}$ and $b = \frac{p_+}{E+m}$.
For reasons to be explained later we choose the normalization to be $u^\dagger u = 2E$,
which leads to the final answer
$$\psi = \sqrt{E+m}\begin{pmatrix} 1 \\ 0 \\ \frac{p_3}{E+m} \\ \frac{p_+}{E+m} \end{pmatrix} e^{-ip\cdot x}. \qquad (36)$$
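It is a useful exercise to check (36) numerically: the spinor should be annihilated by the matrix in (30) and should satisfy $u^\dagger u = 2E$. The following sketch does this for one arbitrarily chosen momentum; the function names and the sample values of m and p are my own.

```python
import math

def dirac_matrix(E, p, m):
    """The 4x4 matrix (gamma^mu p_mu - m) written out as in eq. (30)."""
    p1, p2, p3 = p
    p_plus, p_minus = complex(p1, p2), complex(p1, -p2)
    return [
        [E - m, 0, -p3, -p_minus],
        [0, E - m, -p_plus, p3],
        [p3, p_minus, -E - m, 0],
        [p_plus, -p3, 0, -E - m],
    ]

def spinor_u(E, p, m):
    """The positive energy spinor of eq. (36), without the plane wave factor."""
    p1, p2, p3 = p
    norm = math.sqrt(E + m)
    return [norm, 0, norm * p3 / (E + m), norm * complex(p1, p2) / (E + m)]

m = 1.0
p = (0.3, -0.4, 1.2)
E = math.sqrt(sum(x * x for x in p) + m * m)   # positive energy branch

u = spinor_u(E, p, m)
residual = [sum(row[j] * u[j] for j in range(4)) for row in dirac_matrix(E, p, m)]
norm_sq = sum(abs(x) ** 2 for x in u)

print("max |residual| =", max(abs(r) for r in residual))
print("u^dagger u =", norm_sq, " 2E =", 2 * E)
```

The same functions can be reused to test your own candidates for the other three solutions.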
3 The non-relativistic limit of the Dirac equation
One check that one should always do is to see how the new physics one is
investigating reduces in known situations. In the case at hand this means
that we should try to see how the physics of the Dirac equation looks in a
non-relativistic situation. To do this, let us have a look at it in the form
given in (22) but in momentum space. The equation looks like
$$(E - m)\phi_+ = \sigma^l p^l\phi_-, \qquad (E + m)\phi_- = \sigma^l p^l\phi_+. \qquad (37)$$
The non-relativistic limit means the limit where $p \ll m$. This in turn implies
that $E = \sqrt{p^2 + m^2} = m + \frac{p^2}{2m} + \ldots = m + E^{(NR)}$, where $E^{(NR)}$ is the
non-relativistic energy. This immediately tells us that the quantity $E^{(NR)} = E - m$
is small (of order $m\left(\frac{p}{m}\right)^2$, or $\frac{v^2}{c^2}mc^2$ in ordinary units) while the quantity $E + m$
is large (of order m). A look at the equations now tells us that $\phi_+$ is of order
one while $\phi_-$ is of order $\frac{p}{m}$, so it goes to zero in the non-relativistic limit. We
can now solve for the “small” component $\phi_-$ to get an equation for $\phi_+$ only,
since $\phi_+$ is what is left in the non-relativistic limit. Solving for $\phi_-$ gives us
$$\phi_- = \frac{1}{2m + E^{(NR)}}\,\mathbf{p}\cdot\boldsymbol{\sigma}\,\phi_+, \qquad (38)$$
which, when inserted back into the equation, gives us
$$E^{(NR)}\phi_+ = \mathbf{p}\cdot\boldsymbol{\sigma}\,\frac{1}{2m + E^{(NR)}}\,\mathbf{p}\cdot\boldsymbol{\sigma}\,\phi_+. \qquad (39)$$
In the non-relativistic limit $m \gg E^{(NR)}$, so we can expand the denominator
to get
$$E^{(NR)}\phi_+ = \mathbf{p}\cdot\boldsymbol{\sigma}\,\frac{1}{2m}\left(1 - \frac{E^{(NR)}}{2m} + \ldots\right)\mathbf{p}\cdot\boldsymbol{\sigma}\,\phi_+, \qquad (40)$$
and to lowest order we get back the non-relativistic Schrödinger equation
$$E^{(NR)}\phi_+ = \frac{p^2}{2m}\phi_+. \qquad (41)$$
This is maybe not a very exciting result but it is gratifying to see that we
get the correct non-relativistic limit of our equation.
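To get a feeling for how good the approximation $E^{(NR)} \approx p^2/2m$ is, one can compare it with the exact $\sqrt{p^2 + m^2} - m$ for a few momenta. This is a small sketch in natural units with m = 1; the function names are my own.

```python
import math

def kinetic_energy_exact(p, m):
    """E - m with E = sqrt(p^2 + m^2), in natural units."""
    return math.sqrt(p * p + m * m) - m

def kinetic_energy_nr(p, m):
    """Lowest order non-relativistic approximation p^2 / 2m."""
    return p * p / (2 * m)

def relative_error(p, m):
    exact = kinetic_energy_exact(p, m)
    return abs(kinetic_energy_nr(p, m) - exact) / exact

m = 1.0
for p in (0.5, 0.1, 0.01):
    print(f"p/m = {p:5.2f}  relative error = {relative_error(p, m):.2e}")
```

The relative error falls off roughly like $p^2/4m^2$, which is the size of the first neglected term in the expansion.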
We get a slightly more interesting result if we include a potential from
an external electromagnetic field. This is done in a relativistically covariant
fashion in the Dirac equation, introducing the relativistic electromagnetic
vector potential $A^\mu = (\varphi, \mathbf{A})$, by replacing $i\partial_\mu \to i\partial_\mu - eA_\mu$. This changes
the Dirac equation to
$$\gamma^\mu\left(i\partial_\mu - eA_\mu\right)\psi = m\psi, \qquad (42)$$
or, after a Fourier transform,
$$\gamma^\mu\left(p_\mu - eA_\mu\right)u(p) = m\,u(p). \qquad (43)$$
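When rewriting this in terms of the large and small components we get
$$(E - e\varphi - m)\,u_+ = \boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\,u_-, \qquad (E - e\varphi + m)\,u_- = \boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\,u_+, \qquad (44)$$
and solving for the small component we get
$$u_- = \frac{1}{2m + E^{(NR)} - e\varphi}\,\boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\,u_+,$$
$$E^{(NR)}u_+ = \left[e\varphi + \boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\,\frac{1}{2m + E^{(NR)} - e\varphi}\,\boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\right]u_+. \qquad (45)$$
Notice that we have to be careful about the order in which we write things, since $\varphi$
and $\mathbf{A}$ depend on x and thus do not commute with $\mathbf{p}$. Using the same
approximations as before we get an equation for $u_+$
$$E^{(NR)}u_+ = \left[e\varphi + \frac{1}{2m}\,\boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\,\boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\right]u_+. \qquad (46)$$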
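To evaluate this we again need to use the properties of the Pauli matrices to
be able to write
$$\boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\,\boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A}) = \sigma^i\sigma^k(p - eA)_i(p - eA)_k = (\delta^{ik} + i\epsilon^{ikl}\sigma^l)(p - eA)_i(p - eA)_k$$
$$= (\mathbf{p} - e\mathbf{A})\cdot(\mathbf{p} - e\mathbf{A}) + i\boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\times(\mathbf{p} - e\mathbf{A}). \qquad (47)$$
The cross product can be evaluated as
$$\epsilon^{lik}(p - eA)_i(p - eA)_k = \frac{1}{2}\epsilon^{lik}\left[(p - eA)_i, (p - eA)_k\right] = \frac{1}{2}\epsilon^{lik}\left(-e[p_i, A_k] - e[A_i, p_k]\right) = -\epsilon^{lik}e[p_i, A_k] \qquad (48)$$
$$= ie\,\epsilon^{lik}\partial_iA_k = ieB^l, \qquad (49)$$
which gives us the non-relativistic equation (Pauli equation)
$$E^{(NR)}u_+ = \left[\frac{(\mathbf{p} - e\mathbf{A})^2}{2m} + e\varphi - \frac{e}{2m}\boldsymbol{\sigma}\cdot\mathbf{B}\right]u_+. \qquad (50)$$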
This is exactly the Schrödinger equation for a non-relativistic spin half particle
with an intrinsic magnetic moment $\boldsymbol{\mu} = \frac{e}{m}\mathbf{s}$, where $\mathbf{s} = \frac{\boldsymbol{\sigma}}{2}$ is the spin
operator. This is a very interesting result. We see that only from the requirement
that the theory should be relativistically invariant, we find that
particles carry an intrinsic magnetic moment. This is not something that
we can turn off or change in any way. It is fundamentally built into the
theory and comes from the relativistic invariance. Furthermore, it cannot
be understood in any classical sense as “something charged going around
in circles”. In fact, you can easily verify by yourself that if we have some
charged particle moving in a circle of radius R it produces a magnetic moment
which is $\boldsymbol{\mu} = \frac{e}{2m}\mathbf{L}$, and what we get out of our equation is twice this
value. We say that the electron has a gyromagnetic ratio of 2. In fact this is
not completely true and this value receives quantum corrections which can
be computed with great accuracy (moreover, you will in principle be able to
do it yourself using what you learn in this course).
One can go on and keep higher order corrections to this result. This will
result in extra terms in the Hamiltonian. The calculation is slightly more
involved since now it will not be justiﬁed to neglect φ− any more. Anyway, it
is still possible to write a non-relativistic Hamiltonian for a two component
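spinor. If one puts $\mathbf{A} = 0$ (no magnetic field) the Schrödinger equation
becomes
$$\left[\frac{p^2}{2m} + e\varphi - \frac{p^4}{8m^3} - \frac{e\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\mathbf{p})}{4m^2} - \frac{e}{8m^2}\nabla\cdot\mathbf{E}\right]\psi = E^{(NR)}\psi. \qquad (51)$$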
The ﬁrst two terms are the lowest order terms which we have already derived
(remember that we put A = 0). The next three terms are higher order corrections.
If we for instance apply this Hamiltonian to the hydrogen atom they
will give small corrections to the spectrum (known as ﬁne structure). The
third term is simply the first non-trivial correction to the non-relativistic energy
(from expanding $\sqrt{p^2 + m^2} - m$). The fourth term is called the Thomas
term and it has the interpretation as an interaction between the spin of the
electron and the eﬀective magnetic ﬁeld it sees when moving through the
electric ﬁeld. It can be rewritten as a spin-orbit interaction (proportional to
S · L). The last term is known as the Darwin term. It represents an interaction
with the charge density that produces the electric ﬁeld. In the hydrogen
atom it gives a shift in energy of the s-states. There is also something called
hyperﬁne structure of the hydrogen spectrum. It comes from the interaction
of the magnetic moments of the proton and the electron but is a much smaller
effect than the fine structure.
I would like to point out once again that all these terms one gets automatically
from the Dirac equation when going to the non-relativistic limit.
There are no additional assumptions involved. Quite a nice little equation!
4 Transformation properties of the Dirac equation
You are familiar with how covariant and contravariant vectors transform
when we change coordinate systems (we also say “when we do Lorentz rotations”
or “boost” the coordinate system) in special relativity. The typical
contravariant vector is the coordinate vector $x^\mu$ itself. When we do a Lorentz
boost it transforms into $x'^\mu = \Lambda^\mu{}_\nu x^\nu$ where, if we for instance boost to a coordinate
system which is moving with speed v in the x direction, we have
$$\Lambda^\mu{}_\nu = \begin{pmatrix} \frac{1}{\sqrt{1-v^2}} & -\frac{v}{\sqrt{1-v^2}} & 0 & 0 \\ -\frac{v}{\sqrt{1-v^2}} & \frac{1}{\sqrt{1-v^2}} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \qquad (52)$$
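As a quick numerical illustration of the boost matrix (52), the sketch below boosts two arbitrary four vectors and checks that their Minkowski scalar product, taken with the metric diag(1, −1, −1, −1), is unchanged. The sample vectors and the speed v = 0.6 are arbitrary choices of mine.

```python
import math

METRIC = (1, -1, -1, -1)

def boost_x(v):
    """Boost matrix of eq. (52) for speed v (in units of c) along x."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [
        [g, -g * v, 0, 0],
        [-g * v, g, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ]

def transform(lam, x):
    """x'^mu = Lambda^mu_nu x^nu."""
    return [sum(lam[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]

def minkowski_dot(x, y):
    """g_{mu nu} x^mu y^nu with signature (+, -, -, -)."""
    return sum(g * a * b for g, a, b in zip(METRIC, x, y))

x = [2.0, 0.5, -1.0, 3.0]
p = [1.5, 0.2, 0.7, -0.4]
lam = boost_x(0.6)

before = minkowski_dot(x, p)
after = minkowski_dot(transform(lam, x), transform(lam, p))
print(before, after)
```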
We may define the matrix $\Lambda_\mu{}^\nu = g_{\mu\rho}\Lambda^\rho{}_\sigma g^{\sigma\nu}$ and we can check that $\Lambda^\rho{}_\mu\Lambda_\rho{}^\nu = \delta^\nu_\mu$.
All covariant quantities (for example the momentum vector $p_\mu$, a
vector field $A_\mu$ or the ordinary derivative operator $\partial_\mu$) transform as
$A'_\mu = \Lambda_\mu{}^\nu A_\nu$. Therefore the scalar product is invariant, $x'^\mu p'_\mu = x^\mu p_\mu$. Using this
information it is easy to see that for a scalar field $\phi$ (a scalar field is defined by
the property that it does not transform at all under Lorentz transformations)
the Klein-Gordon equation is invariant under Lorentz transformations:
$$\partial_\mu\partial^\mu\phi + m^2\phi = 0. \qquad (53)$$
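A spinor is not invariant under Lorentz transformations but transforms
as $\psi'_a = S_{ab}\psi_b$ for some matrix S which we will not need the exact form of.
The Dirac equation itself transforms as
$$i\not\partial\psi - m\psi = 0 \;\to\; i\gamma^\mu\Lambda_\mu{}^\sigma\partial_\sigma(S\psi) - mS\psi = 0, \qquad (54)$$
or
$$iS^{-1}\gamma^\mu S\,\Lambda_\mu{}^\sigma\partial_\sigma\psi - m\psi = 0. \qquad (55)$$
We see that for the Dirac equation to be invariant we need that
$$S^{-1}\gamma^\mu S = \Lambda^\mu{}_\sigma\gamma^\sigma. \qquad (56)$$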
Taking the hermitian conjugate of this equation and using that we know from
the explicit representation of the gamma matrices that $(\gamma^\mu)^\dagger = \gamma^0\gamma^\mu\gamma^0$ we
get
$$\gamma^0S^\dagger\gamma^0 = S^{-1}. \qquad (57)$$
Having this formula we may investigate how for instance $\psi^\dagger$ transforms under
Lorentz transformations. We get
$$\psi'^\dagger = \psi^\dagger\gamma^0S^{-1}\gamma^0. \qquad (58)$$
So the hermitian conjugate does not transform as the inverse of the original
object. However, if we check how $\bar\psi \equiv \psi^\dagger\gamma^0$ transforms we find
$$\bar\psi' = \bar\psi S^{-1}, \qquad (59)$$
which is indeed “nicer” since we can form objects with simple Lorentz transformation
properties from it, for instance
$$\bar\psi'\psi' = \bar\psi\psi \quad \text{(scalar)}, \qquad \bar\psi'\gamma^\mu\psi' = \bar\psi S^{-1}\gamma^\mu S\psi = \Lambda^\mu{}_\nu\,\bar\psi\gamma^\nu\psi \quad \text{(vector)}. \qquad (60)$$
5 Field quantization (“second quantization”)
To be able to describe quantum systems where the number of particles is able
to change (for instance, an electron and a positron annihilate into two photons)
we use a formalism called “second quantization”. Notice that the name
second quantization is rather badly chosen since it is not a question of
“quantizing again”. It is simply an alternative formalism for describing
the states we have in the quantum world. It is not only used in relativistic
quantum mechanics, but also in, for instance, solid state physics or anywhere
where our quantum system consists of many types of particles which can also
change into each other.
As a technical detail to simplify computations, let us imagine that our
universe is a box with side length L. Then the universe has finite volume $V = L^3$
and if we impose periodic boundary conditions, the allowed momenta
form a countable set. In this universe the allowed momenta can be written
as
$$\mathbf{k} = \frac{2\pi}{L}(n_1, n_2, n_3), \qquad (61)$$
for any integers $n_1, n_2, n_3$. At the end of each calculation we may let $L \to \infty$
(if we have done the calculation correctly, nothing should depend on L).
Imagining that we have ordered the allowed momenta in some particular
way, we may write them as $\mathbf{k}_1, \mathbf{k}_2, \ldots, \mathbf{k}_i, \ldots$. This gives us the possibility to
write an arbitrary state of the system as
$$|n_{k_1}, n_{k_2}, \ldots, n_{k_i}, \ldots\rangle, \qquad (62)$$
which we interpret as meaning: there are $n_{k_1}$ particles with momentum $\mathbf{k}_1$
(that means plane waves), there are $n_{k_2}$ particles with momentum $\mathbf{k}_2$ and so
on. These states in fact form a complete basis, so any state can be written
as a linear combination of these basis states. To be able to write down how
operators act on these states we consider the “basic” operators $a_k$ and $a_k^\dagger$
satisfying the commutation relations $[a_p, a_k^\dagger] = \delta_{p,k}$. That is, if p and k
are different then $a_p$ and $a_k^\dagger$ commute, but if they are the same they satisfy
the usual harmonic oscillator algebra. Then, remembering the harmonic
oscillator, we have that
$$a_{k_i}|n_{k_1}, n_{k_2}, \ldots, n_{k_i}, \ldots\rangle = \sqrt{n_{k_i}}\,|n_{k_1}, n_{k_2}, \ldots, n_{k_i} - 1, \ldots\rangle,$$
$$a_{k_i}^\dagger|n_{k_1}, n_{k_2}, \ldots, n_{k_i}, \ldots\rangle = \sqrt{n_{k_i} + 1}\,|n_{k_1}, n_{k_2}, \ldots, n_{k_i} + 1, \ldots\rangle. \qquad (63)$$
Thus, the $a$, $a^\dagger$ operators describe the basic operations which take us between
different states. For instance, a process where a particle with momentum $\mathbf{k}_1$
is scattered into a particle with momentum $\mathbf{k}_2$ would be accomplished by
the operator $a_{k_2}^\dagger a_{k_1}$ acting on the state $|1_{k_1}, 0, \ldots\rangle$. Explicitly, using (63), we
would have
$$a_{k_2}^\dagger a_{k_1}|1_{k_1}, 0, \ldots\rangle = a_{k_2}^\dagger|0, 0, \ldots\rangle = |0, 1_{k_2}, \ldots\rangle. \qquad (64)$$
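The action (63) of the operators is simple enough to put on a computer. Below is a toy sketch with only two momentum modes, where a state is stored as a dictionary from occupation numbers to amplitudes; this representation and the function names `annihilate` and `create` are my own choices for illustration.

```python
import math

def annihilate(state, mode):
    """Apply a_mode to a state stored as {occupation tuple: amplitude}."""
    result = {}
    for occupations, amplitude in state.items():
        n = occupations[mode]
        if n == 0:
            continue  # annihilating an empty mode gives zero
        new = occupations[:mode] + (n - 1,) + occupations[mode + 1:]
        result[new] = result.get(new, 0.0) + math.sqrt(n) * amplitude
    return result

def create(state, mode):
    """Apply a_mode^dagger to a state stored as {occupation tuple: amplitude}."""
    result = {}
    for occupations, amplitude in state.items():
        n = occupations[mode]
        new = occupations[:mode] + (n + 1,) + occupations[mode + 1:]
        result[new] = result.get(new, 0.0) + math.sqrt(n + 1) * amplitude
    return result

# The scattering process of eq. (64): one particle moves from mode k1 to mode k2.
initial = {(1, 0): 1.0}
scattered = create(annihilate(initial, 0), 1)
print(scattered)   # {(0, 1): 1.0}
```

An empty dictionary plays the role of the zero vector, which is what one gets when trying to annihilate a particle that is not there.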
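As in the case of the harmonic oscillator, any state in the basis can be
constructed by acting with the $a^\dagger$ operators on the vacuum
$$|n_{k_1}, n_{k_2}, \ldots, n_{k_i}, \ldots\rangle = \frac{\left(a_{k_1}^\dagger\right)^{n_{k_1}}}{\sqrt{n_{k_1}!}}\,\frac{\left(a_{k_2}^\dagger\right)^{n_{k_2}}}{\sqrt{n_{k_2}!}}\cdots\frac{\left(a_{k_i}^\dagger\right)^{n_{k_i}}}{\sqrt{n_{k_i}!}}\cdots|0\rangle. \qquad (65)$$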
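Let us now look at the coordinate representation of these states. Since
we know the total number of particles in each state we know how many
coordinates we need, i.e. one particle has three coordinates $\mathbf{x}$, two particles
have six coordinates $\mathbf{x}_1, \mathbf{x}_2$, etc. Therefore we have
$$\langle\mathbf{x}|1_k\rangle = \phi_k(\mathbf{x}), \qquad \langle\mathbf{x}_1, \mathbf{x}_2|1_{k_1}, 1_{k_2}\rangle = \phi_{k_1}(\mathbf{x}_1)\phi_{k_2}(\mathbf{x}_2), \qquad (66)$$
where we have denoted the coordinate representation of the state with momentum
$\mathbf{k}$ as $\phi_k = \frac{1}{\sqrt{V}}e^{i\mathbf{k}\cdot\mathbf{x}}$. The factor $\frac{1}{\sqrt{V}}$ is a normalization factor.
Now consider the operator
$$\phi(\mathbf{x}) = \sum_k a_k\phi_k(\mathbf{x}). \qquad (67)$$
When its hermitian conjugate acts on the vacuum, it creates a state
$$\phi^\dagger(\mathbf{x}_0)|0\rangle = \sum_k\frac{1}{\sqrt{V}}e^{-i\mathbf{k}\cdot\mathbf{x}_0}a_k^\dagger|0\rangle = \sum_k|1_k\rangle\frac{1}{\sqrt{V}}e^{-i\mathbf{k}\cdot\mathbf{x}_0}. \qquad (68)$$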
In the coordinate representation, the resulting state looks like
$$\langle\mathbf{x}|\phi^\dagger(\mathbf{x}_0)|0\rangle = \sum_k\langle\mathbf{x}|1_k\rangle\frac{1}{\sqrt{V}}e^{-i\mathbf{k}\cdot\mathbf{x}_0} = \frac{1}{V}\sum_k e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}_0)}. \qquad (69)$$
The final expression may seem a little bit strange but it is really a delta
function. If $\mathbf{x} \neq \mathbf{x}_0$ the exponential oscillates as a function of $\mathbf{k}$ and on average
it is zero. For $\mathbf{x} = \mathbf{x}_0$ however, all exponentials are 1, independently of
$\mathbf{k}$, so the sum diverges. The integral over $\mathbf{x}$ of this function gives zero for
all terms with $\mathbf{k} \neq 0$ and V for the $\mathbf{k} = 0$ term. The factor $\frac{1}{V}$ ensures
that the final result of the integration is 1. Thus we see that the operator
$\phi^\dagger(\mathbf{x}_0)$ creates a wavefunction which in coordinate representation is a delta
function located at $\mathbf{x}_0$. In other words, the operator $\phi^\dagger(\mathbf{x}_0)$ creates a particle
completely localized at $\mathbf{x}_0$, complementary to $a_k^\dagger$ which creates a particle with
fixed momentum $\mathbf{k}$. In the same way, the operator $\phi(\mathbf{x}_0)$ annihilates a particle
located at $\mathbf{x}_0$.
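The delta function behaviour of the sum in (69) can be made visible numerically. The sketch below is a one-dimensional analogue with the sum over momenta cut off at $|n| \le n_{max}$; both the restriction to one dimension and the cutoff are simplifications of mine and are not part of the argument above.

```python
import cmath
import math

def box_delta(x, x0, length, n_max):
    """(1/L) * sum_{n=-n_max}^{n_max} exp(i k_n (x - x0)), with k_n = 2 pi n / L."""
    total = sum(cmath.exp(1j * 2 * math.pi * n / length * (x - x0))
                for n in range(-n_max, n_max + 1))
    return total.real / length

LENGTH, X0, N_MAX, GRID = 2.0, 0.3, 20, 200

peak = box_delta(X0, X0, LENGTH, N_MAX)                 # grows like (2 n_max + 1) / L
far = box_delta(X0 + 0.5 * LENGTH, X0, LENGTH, N_MAX)   # stays of order 1 / L

# Integrate over the box with a simple Riemann sum on a uniform grid.
dx = LENGTH / GRID
integral = sum(box_delta(j * dx, X0, LENGTH, N_MAX) for j in range(GRID)) * dx

print(peak, far, integral)
```

Increasing `N_MAX` makes the peak higher and narrower while the integral stays equal to 1, as expected for a delta function.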
Now consider any operator on the full system that can be thought of as
being composed of operations on single particles. In formulas we would write
$$O = \sum_a O_a, \qquad (70)$$
where the sum is over all particles in the system. This is a very general
expression. Many operators of physical interest are of this type. For instance,
the total energy is the sum of the energy of the single particles. The total
momentum is the sum of the momenta of the single particles etc.
The operator $O_a$ is a “normal” one particle operator. Its action on a
one particle state can be expanded into a linear combination of one particle
states
$$O_x\phi_a(\mathbf{x}) = \sum_b\phi_b(\mathbf{x})f_{ba}, \qquad (71)$$
where, as usual,
$$f_{ba} = \int d^3x\,\phi_b^*(\mathbf{x})\,O_x\,\phi_a(\mathbf{x}). \qquad (72)$$
Thus the action of each of the one particle operators in (70) can be seen as
a reshuﬄing of the particles. The total number of particles is not changed,
but they are moved between diﬀerent states.
If we want to write how the operator (70) acts on the basis (62) we
know that since it does not change the total number of particles but rather
shuffles them around, it has to be written as a linear combination of the
operators $a_a^\dagger a_b$, since this operator first annihilates a particle in state b but
then immediately creates a particle in state a. Explicitly we write
$$O = \sum_{ab}f_{ab}\,a_a^\dagger a_b, \qquad (73)$$
and you should check that the coefficients $f_{ab}$ are really the same as in (72)
by, for instance, checking how O acts on one particle states. Thus we may write
$$O = \sum_{ab}\int d^3x\,\phi_a^*(\mathbf{x})\,O_x\,\phi_b(\mathbf{x})\,a_a^\dagger a_b = \int d^3x\left(\sum_a a_a\phi_a(\mathbf{x})\right)^\dagger O_x\left(\sum_b a_b\phi_b(\mathbf{x})\right) = \int d^3x\,\phi^\dagger(\mathbf{x})\,O_x\,\phi(\mathbf{x}), \qquad (74)$$
where we have used the operator φ(x) deﬁned in (67). From this we read oﬀ
the procedure for writing operators (this works only for operators that can
be thought of as being composed of operations on the single particles) in the
second quantized formalism. Take the one particle operator (here written
Ox) and compute what looks like an expectation value but instead of a wave
function we use the operator φ(x). Since we know what the operator φ(x)
does we know how to interpret this expression intuitively. First the operator
φ annihilates a particle located at x (if there is a particle there, otherwise the
result is zero), then the operator Ox computes whatever it should compute
(the energy, momentum or something else) and ﬁnally the particle is created
again by φ†
. The integral means that this process is repeated for each point
in space and then summed.
Notice also that here is the origin of the awkward term “second quantization”.
It comes from the fact that the operator φ looks like an arbitrary
wavefunction but with the coeﬃcients in the expansion replaced by the annihilation
operators aa. Thus it looks like the wave function is “quantized
again” which of course is not true and is the source of much confusion. Second
quantization is just a formalism, within the framework of ordinary quantum
theory, to describe systems with many particles and in particular where the
types of particles may change.
There is a pleasant surprise incorporated in the second quantized formalism.
Since the creation operators $a_a^\dagger$ commute with themselves, any wavefunction
is automatically symmetric with respect to interchange of these particles,
as should be the case for bosons. This naturally leads to the question of what
one should do if one would like to instead describe fermions, since in that case
the wavefunctions should be anti-symmetric with respect to interchange of
particles. The natural thing to try is to use operators which do not commute,
but anti-commute. That is, operators $b_a$, $b_a^\dagger$ which satisfy
$$\{b_a, b_b^\dagger\} = b_ab_b^\dagger + b_b^\dagger b_a = \delta_{ab}, \qquad \{b_a, b_b\} = \{b_a^\dagger, b_b^\dagger\} = 0. \qquad (75)$$
In this case, since the creation operators anti-commute, we get an extra
minus sign when we interchange particles, $b_a^\dagger b_b^\dagger = -b_b^\dagger b_a^\dagger$, giving us the required
behavior under interchange of particles. Furthermore we see what happens if we try
to put more than one particle in each state:
$$|2\rangle = b^\dagger b^\dagger|0\rangle = \frac{1}{2}\{b^\dagger, b^\dagger\}|0\rangle = 0\,! \qquad (76)$$
This means that the Pauli principle is automatically incorporated when we
use anti-commuting creation/annihilation operators.
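For a single fermionic state the algebra (75) can be realized by two by two matrices acting on the basis (|0⟩, |1⟩). The following sketch checks the anticommutation relation and the Pauli principle (76) for this realization; the matrix representation is a standard one and the helper names are my own.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Basis ordering: |0> = (1, 0), |1> = (0, 1).
B = [[0, 1], [0, 0]]       # annihilation operator: b|1> = |0>, b|0> = 0
B_DAG = [[0, 0], [1, 0]]   # creation operator: b^dagger|0> = |1>

anticommutator = add(matmul(B, B_DAG), matmul(B_DAG, B))   # should be the unit matrix
double_creation = matmul(B_DAG, B_DAG)                     # should vanish identically

print(anticommutator, double_creation)
```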
6 Dyson’s method - the interaction picture
Since we have found the time dependent plane wave solutions of the Dirac
equation, we completely know the time evolution of any state, if there are no
interactions (the theory without interactions we call the free theory). Just
Fourier expand the full wavefunction at any given time and then let the
individual plane waves evolve in time. The problem comes when we consider
an interacting theory. Then the Hamiltonian can be written as a sum of
two operators H = H0 + HI where H0 is the free Hamiltonian which is
responsible for the (almost trivial) time evolution of the free theory (i.e. the
plane waves are eigenstates of H0) and HI is the interaction Hamiltonian
which does not necessarily commute with H0 or even with itself at diﬀerent
times. This makes the time evolution problem quite involved. However,
Dyson has invented a nice little trick which “hides” the (almost trivial) time
evolution of the free theory so that we may concentrate on the (slightly more
complicated) time evolution given by the interaction Hamiltonian HI. We
want to solve the “Schrödinger” equation
$$(H_0 + H_I)|\psi\rangle = i\frac{\partial}{\partial t}|\psi\rangle. \qquad (77)$$
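Now define a new state by $|\psi\rangle = e^{-iH_0t}|\chi\rangle$. Inserting this we get
$$(H_0 + H_I)\,e^{-iH_0t}|\chi\rangle = i\frac{\partial}{\partial t}\left(e^{-iH_0t}|\chi\rangle\right) = e^{-iH_0t}\left(i\frac{\partial}{\partial t} + H_0\right)|\chi\rangle. \qquad (78)$$
Multiplying from the left with $e^{iH_0t}$ and using that $H_0$ commutes with itself
we get
$$e^{iH_0t}H_Ie^{-iH_0t}|\chi\rangle = i\frac{\partial}{\partial t}|\chi\rangle. \qquad (79)$$
If we define a time dependent interaction Hamiltonian $H_I(t) = e^{iH_0t}H_Ie^{-iH_0t}$
this equation takes a very simple form
$$H_I(t)|\chi\rangle = i\frac{\partial}{\partial t}|\chi\rangle, \qquad (80)$$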
i.e. it looks just like the Schrödinger equation, but with the Hamiltonian
$H_I(t)$. The “trivial” time dependence generated by $H_0$ is taken care of by
making the operators in $H_I$ evolve in time like in the free theory. Notice that
since $|\chi\rangle$ is a solution to the time dependent Schrödinger equation it depends
on time as in the so called Schrödinger representation, while the operator
$H_I(t) = e^{iH_0t}H_Ie^{-iH_0t}$ depends on time as an operator in the Heisenberg representation!
This is a funny mix of representations known as the interaction
representation. Anyway, from our studies of Quantum Mechanics we know
how to solve the time dependence of $|\chi\rangle$. The solution is given as
$$|\chi(t)\rangle = U(t, t_0)|\chi(t_0)\rangle, \qquad (81)$$
where we have introduced the time evolution operator
$$U(t, t_0) = T\exp\left(-i\int_{t_0}^{t}H_I(t')\,dt'\right), \qquad (82)$$
where T represents the time ordering operator.
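To see concretely what $H_I(t) = e^{iH_0t}H_Ie^{-iH_0t}$ does, consider a toy system in the basis where $H_0$ is diagonal. Each matrix element of $H_I$ then simply picks up a phase $e^{i(E_a - E_b)t}$. The two level system and all the numbers in this sketch are hypothetical and only serve as an illustration.

```python
import cmath

def interaction_picture(h_int, energies, t):
    """Return exp(i H0 t) H_I exp(-i H0 t) for H0 = diag(energies)."""
    phases = [cmath.exp(1j * energy * t) for energy in energies]
    n = len(energies)
    return [[phases[a] * h_int[a][b] * phases[b].conjugate()
             for b in range(n)] for a in range(n)]

ENERGIES = [1.0, 2.5]                 # eigenvalues of H0 (hypothetical)
H_INT = [[0.0, 0.3], [0.3, 0.0]]      # interaction in the H0 eigenbasis (hypothetical)

t = 0.7
h_t = interaction_picture(H_INT, ENERGIES, t)
expected_01 = H_INT[0][1] * cmath.exp(1j * (ENERGIES[0] - ENERGIES[1]) * t)
print(h_t[0][1], expected_01)
```

The operator stays hermitian at all times, and at t = 0 it reduces to the original $H_I$.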
One may worry that it is really the wave function $|\psi\rangle$ which is the “correct”
wave function which one should use to calculate probability amplitudes,
but it is not difficult to show using what you know about the time evolution
operator (exercise) that the expressions
$$\langle\psi_f|\,T e^{-i\int(H_0 + H_I)\,dt}\,|\psi_i\rangle \qquad (83)$$
and
$$\langle\chi_f|\,T e^{-i\int H_I(t)\,dt}\,|\chi_i\rangle \qquad (84)$$
are equal.
7 The quantized Dirac ﬁeld
We would now like to write the equivalent of the ﬁeld operators (67) for
the Dirac ﬁeld. Since we know that electrons are fermions, we know that
we should use anti-commuting creation/annihilation operators rather than
commuting ones. With this in mind we may immediately write down a
candidate for the ﬁeld operators
$$\psi(\mathbf{x}) = \frac{1}{\sqrt{V}}\sum_k\sum_{r=1}^{4}\frac{1}{\sqrt{2|E|}}\,b_{k,r}\,u^{(r)}(k)\,e^{i\mathbf{k}\cdot\mathbf{x}}. \qquad (85)$$
Here $r \in \{1, 2, 3, 4\}$ is an index which runs over the four independent spinor
solutions. Since we are using the interaction representation, we now need to
make this operator transform in time according to the free theory. That is,
$\psi(t) = e^{iH_0t}\psi e^{-iH_0t}$. If one then uses the formula $e^ABe^{-A} = B + [A, B] + \frac{1}{2!}[A, [A, B]] + \ldots$
and the fact that $[H_0, b_{k,r}^\dagger] = E(k)\,b_{k,r}^\dagger$ (see footnote 1), one finds that
$b_{k,r}(t) = b_{k,r}e^{-iEt}$, which tells us that
$$\psi(\mathbf{x}, t) = \frac{1}{\sqrt{V}}\sum_k\left[\sum_{r=1}^{2}\frac{1}{\sqrt{2E}}\,b_{k,r}\,u^{(r)}(k)\,e^{-iE(k)t + i\mathbf{k}\cdot\mathbf{x}} + \sum_{r=3}^{4}\frac{1}{\sqrt{2|E|}}\,b_{k,r}\,u^{(r)}(k)\,e^{+i|E(k)|t + i\mathbf{k}\cdot\mathbf{x}}\right]. \qquad (86)$$
Footnote 1: This follows from the fact that $H_0|1_k\rangle = E(k)|1_k\rangle$ and that $H_0|0\rangle = 0$. Alternatively
one may explicitly evaluate the (second quantized) Hamiltonian, which turns out to be
$H_0 = \sum_{k,r}E\,b_{k,r}^\dagger b_{k,r}$.
Notice that we had to separate the positive energy solutions (r = 1, 2) from
the negative energy solutions (r = 3, 4) since they will have a time dependence
of a different type! In fact, the time dependence of the negative energy
annihilation operators looks more like the time dependence of a creation
operator. Loosely one can reason as follows: if the operator O(t) creates
something, the final state should have bigger energy than the initial state.
Thus we have
$$\langle E_f|O(t)|E_i\rangle = \langle E_f|e^{iH_0t}Oe^{-iH_0t}|E_i\rangle = \langle E_f|O\,e^{i(E_f - E_i)t}|E_i\rangle, \qquad (87)$$
so we see that a creation operator should have the time dependence $e^{+iEt}$. If,
on the other hand, the operator O would annihilate something, then $E_f < E_i$
and the time dependence going together with annihilation should thus be of
the type $e^{-iEt}$. This gives us a hint on how to treat the problematic negative
energy solutions which seem to be inherent in any relativistic theory. Namely
we define
$$d_{k,1}^\dagger = -b_{-k,4}, \qquad d_{k,2}^\dagger = b_{-k,3}. \qquad (88)$$
The annihilation of a negative energy electron is thus reinterpreted as the
creation of a new positive energy particle, a positron. The new particle has
exactly the same properties as the electron (mass etc.) except that the charge
is opposite. It is known as the anti-particle of the electron. To go together
with this we also redefine the spinors
\[ v^{(1)}(\mathbf{k}) = -u^{(4)}(-\mathbf{k}), \qquad v^{(2)}(\mathbf{k}) = u^{(3)}(-\mathbf{k}). \tag{89} \]
With these new definitions we may rewrite the field operator in the Heisenberg
representation as
\[ \psi(\mathbf{x}, t) = \frac{1}{\sqrt{V}} \sum_{\mathbf{k}} \sum_{r=1}^{2} \frac{1}{\sqrt{2E}} \left[ b_{\mathbf{k},r}\, u^{(r)}(\mathbf{k})\, e^{-iE(\mathbf{k})t + i\mathbf{k}\cdot\mathbf{x}} + d^\dagger_{\mathbf{k},r}\, v^{(r)}(\mathbf{k})\, e^{+iE(\mathbf{k})t - i\mathbf{k}\cdot\mathbf{x}} \right] . \tag{90} \]
According to what we know about second quantization, this operator will
annihilate an electron at point x and create a positron at x.
8 Scattering of electrons in the field of a nucleus
(Rutherford/Mott scattering)
We will consider the scattering of an electron in the field of a positively
charged heavy particle, typically in the field of a nucleus, so we will take the
potential to be
\[ \phi = A_0 = \frac{Ze}{r} . \tag{91} \]
The incoming state will of course be a state describing the incoming electron
with momentum $p$ and spin $r$. That is, it will be described by the state
\[ |i\rangle = b^\dagger_{p,r}|0\rangle \tag{92} \]
since the $b^\dagger$ operator creates a state of one electron with the specified momentum
and spin. The outgoing, final state is also a state with only one
electron, but since the electron has scattered it has a different momentum $k$
and a (possibly) different spin $s$. This is given by the state
\[ |f\rangle = b^\dagger_{k,s}|0\rangle . \tag{93} \]
The probability amplitude for this process (scattering of an electron with
momentum $p$ and spin $r$ to an electron with momentum $k$ and spin $s$)
we get by taking the initial state $|i\rangle$, evolving it with the time evolution
operator and finally taking the overlap with the state $\langle f|$. The probability
amplitude is therefore given by the expression
\[ M = \langle f|\, T e^{-i\int H_I}\, |i\rangle \tag{94} \]
and the probability is of course the absolute square of the probability amplitude.
We know that $H_I$ is small (since $e$ is a small number) so we can evaluate
the probability amplitude in perturbation theory
\[ \langle f|\, T e^{-i\int H_I}\, |i\rangle = \langle f|i\rangle - ie\, \langle f| \int d^4x\, \hat{\bar\psi} \slashed{A} \hat\psi\, |i\rangle + \ldots \tag{95} \]
Notice here that the time ordering operator $T$ is trivial since all operators
are at the same time. Only in the higher order terms is the $T$ operator
important. If $k \neq p$, which means that scattering has taken place, the first
term is zero.
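Inserting the information we have we can compute
\[ M = -\frac{i}{V} \sum_{q_1,t_1} \sum_{q_2,t_2} \int d^4x\, \frac{Ze^2}{r}\, \langle 0|\, b_{k,s} \left[ \frac{1}{\sqrt{2E_{q_1}}}\, d_{q_1,t_1} \bar v_{q_1,t_1} e^{-iq_1\cdot x} + \frac{1}{\sqrt{2E_{q_1}}}\, b^\dagger_{q_1,t_1} \bar u_{q_1,t_1} e^{iq_1\cdot x} \right] \gamma^0 \left[ \frac{1}{\sqrt{2E_{q_2}}}\, d^\dagger_{q_2,t_2} v_{q_2,t_2} e^{iq_2\cdot x} + \frac{1}{\sqrt{2E_{q_2}}}\, b_{q_2,t_2} u_{q_2,t_2} e^{-iq_2\cdot x} \right] b^\dagger_{p,r}\, |0\rangle \tag{96} \]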
There are in principle four different terms but clearly only terms with the
same number of creation and annihilation operators will survive. There are
two such terms, the first being the one where we select the positron creation/annihilation
operators from the parentheses. The operators squeezed
between the vacuum states in that case are
\[ \langle 0|\, b_{k,s}\, d_{q_1,t_1}\, d^\dagger_{q_2,t_2}\, b^\dagger_{p,r}\, |0\rangle \tag{97} \]
and using the anti-commutation relations we can transform this into
\[ \delta_{p,k}\, \delta_{r,s}\, \delta_{q_1,q_2}\, \delta_{t_1,t_2} . \tag{98} \]
We see that this term does not give anything unless $p = k$, which means
that no scattering is taking place. It would represent a process where the
electron is just passing by when a positron is created and annihilated out of
the vacuum. This clearly has no effect on the scattering process and we will
therefore drop this term.
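The second term is the one with only electron creation/annihilation operators
which, writing only the operators, gives us
\[ \langle 0|\, b_{k,s}\, b^\dagger_{q_1,t_1}\, b_{q_2,t_2}\, b^\dagger_{p,r}\, |0\rangle = \delta_{k,q_1}\, \delta_{s,t_1}\, \delta_{q_2,p}\, \delta_{t_2,r} . \tag{99} \]
This is the type of term we expect. The interpretation is that the incoming
electron gets annihilated and there is a new electron (with new momentum
and spin) created, i.e. the electron gets “scattered”. We thus get the formula
\[ M = -\frac{i}{V} \sum_{q_1,t_1} \sum_{q_2,t_2} \int d^4x\, \frac{Ze^2}{r}\, \frac{e^{i(q_1-q_2)\cdot x}}{\sqrt{4E_{q_1}E_{q_2}}}\, \bar u_{q_1,t_1} \gamma^0 u_{q_2,t_2}\, \delta_{k,q_1}\, \delta_{s,t_1}\, \delta_{q_2,p}\, \delta_{t_2,r} . \tag{100} \]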
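We can use the Kronecker deltas to get rid of the summations
\[ M = -\frac{i}{V} \int d^4x\, \frac{Ze^2}{r}\, \frac{e^{i(k-p)\cdot x}}{\sqrt{4E_pE_k}}\, \bar u_{k,s} \gamma^0 u_{p,r} = \frac{-i\, U(\mathbf{k}-\mathbf{p})}{V\sqrt{4E_pE_k}}\, \left( \bar u_{k,s} \gamma^0 u_{p,r} \right) \int_{-T/2}^{T/2} dt\, e^{i(E_k-E_p)t} \tag{101} \]
where we have introduced the 3-dimensional Fourier transform of the Coulomb
potential
\[ U(\mathbf{k}) \equiv \int d^3x\, \frac{Ze^2}{|\mathbf{x}|}\, e^{-i\mathbf{k}\cdot\mathbf{x}} = \frac{4\pi Ze^2}{|\mathbf{k}|^2} . \tag{102} \]
When we let the interaction time $T$ go to infinity, the last integral in (101)
is just ($2\pi$ times) a delta function of the energy, telling us that the energy is
conserved in the scattering process.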
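The statement about the time integral in (101) can be made explicit. The following worked step is a sketch; the shorthand $\Delta \equiv E_k - E_p$ is introduced here only for convenience.

```latex
\[
\int_{-T/2}^{T/2} dt\, e^{i\Delta t}
  = \frac{2\sin(\Delta T/2)}{\Delta}
  \;\xrightarrow{\;T\to\infty\;}\; 2\pi\,\delta(\Delta),
\qquad
\left.\int_{-T/2}^{T/2} dt\, e^{i\Delta t}\right|_{\Delta = 0} = T .
\]
```

The second relation is the one used when the square of the delta function is evaluated: at $\Delta = 0$ the integrand is 1 and the integral is just the interaction time.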
Now the actual probability, let us call it $P$, is given by the absolute square
of the probability amplitude
\[ P = |M|^2 = \frac{|U|^2}{4V^2E_pE_k}\, \left| \bar u_{k,s} \gamma^0 u_{p,r} \right|^2\, \left| 2\pi\delta(E_k - E_p) \right|^2 . \tag{103} \]
Here the last term, the square of the delta function, may seem a little bit odd, but
we can treat it using a trick, namely, we may write it as
\[ \left| 2\pi\delta(E_k - E_p) \right|^2 = \lim_{T\to\infty} 2\pi\delta(E_k - E_p) \int_{-T/2}^{T/2} dt\, e^{i(E_k-E_p)t} . \tag{104} \]
Due to the presence of the delta function, the $E_k - E_p$ in the integrand can
be replaced by 0. This means that the integrand can be replaced by 1 and
thus the integral is equal to $T$. We therefore get the result
\[ P = \lim_{T\to\infty} \frac{|U|^2}{4V^2E_pE_k}\, \left| \bar u_{k,s} \gamma^0 u_{p,r} \right|^2\, T\, 2\pi\delta(E_k - E_p) \tag{105} \]
which gives us an expression for the probability per unit time
\[ w \equiv \frac{P}{T} = \frac{|U|^2}{4V^2E_pE_k}\, \left| \bar u_{k,s} \gamma^0 u_{p,r} \right|^2\, 2\pi\delta(E_k - E_p) . \tag{106} \]
Since we cannot experimentally separate scattering into final energies and
momenta which are close to each other, we need to sum over all these probabilities
to get a total probability for scattering into a state with final energy
$E_k$ or into a state with energy close to it. This we do by multiplying the
probability with the density of states $\rho(E_k)$ and then integrating over energy
\[ w_{tot} = \int dE_k\, \frac{|U|^2}{4V^2E_pE_k}\, \left| \bar u_{k,s} \gamma^0 u_{p,r} \right|^2\, 2\pi\delta(E_k - E_p)\, \rho(E_k) = \frac{|U|^2}{4V^2E_p^2}\, \left| \bar u_{k,s} \gamma^0 u_{p,r} \right|^2\, 2\pi\rho(E_p) . \tag{107} \]
This expression now depends only on the energy of the incoming electron
$E_p$, which we will hereafter denote by just $E$. Note however that the spinors
$u$ still depend on the 3-dimensional final momentum $\mathbf{k}$. Because of energy
conservation $|\mathbf{k}| = |\mathbf{p}|$, but the direction can still be different.
The density of states function $\rho(E)$ we find in the following way. We have
assumed that the universe is a (large but finite) box with length, width and
height $L$. In such a box the allowed 3-momenta are not arbitrary but rather
discrete points in momentum space. Only momenta with values $k_i = \frac{2\pi}{L} n_i$, $i = 1, 2, 3$,
where the $n_i$ are integers, are allowed. This gives a density of states in
momentum space of $\rho(\mathbf{k}) = \frac{L^3}{(2\pi)^3} = \frac{V}{(2\pi)^3}$. Since the energy is a function of
the 3-momenta we can write
\[ \rho(E)\,dE = \rho(\mathbf{k})\,d^3k = \frac{V}{(2\pi)^3}\, \mathbf{k}^2\, d|\mathbf{k}|\, d\Omega_k \tag{108} \]
where we have introduced spherical coordinates in momentum space. From
this it follows that
\[ \rho(E) = \frac{V}{(2\pi)^3}\, \mathbf{k}^2\, \frac{d|\mathbf{k}|}{dE}\, d\Omega_k \tag{109} \]
and since $\frac{dE}{d|\mathbf{k}|} = \frac{|\mathbf{k}|}{E}$ we have
\[ \rho(E) = \frac{V}{(2\pi)^3}\, E\, |\mathbf{k}|\, d\Omega_k . \tag{110} \]
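The momentum-space density $V/(2\pi)^3$ behind (108)-(110) can be illustrated by brute-force counting of the allowed box momenta. This is a sketch under assumptions made here: the function names and the cut-off $n\cdot n \le 100$ are illustrative, and agreement is only expected up to surface effects of order a percent at this size.

```python
import numpy as np

def count_states(n_max_sq):
    """Count integer vectors n with n.n <= n_max_sq.

    Each n labels one allowed box momentum k = 2*pi*n/L, so this is the exact
    number of states with |k| <= (2*pi/L) * sqrt(n_max_sq).
    """
    n_lim = int(np.floor(np.sqrt(n_max_sq))) + 1   # one extra layer for safety
    n = np.arange(-n_lim, n_lim + 1)
    nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
    return int(np.count_nonzero(nx**2 + ny**2 + nz**2 <= n_max_sq))

def continuum_estimate(n_max_sq, L=1.0):
    """Number of states from rho(k) = V/(2*pi)^3 times the volume of the k-sphere."""
    K = 2 * np.pi / L * np.sqrt(n_max_sq)          # radius of the sphere in k-space
    V = L**3
    return V / (2 * np.pi) ** 3 * (4.0 / 3.0) * np.pi * K**3

exact = count_states(100)
approx = continuum_estimate(100)
rel_err = abs(exact - approx) / approx
print(exact, approx, rel_err)
```

The estimate does not depend on `L`, as expected: a larger box packs the momenta more densely by exactly the factor by which the volume grows.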
We thus have the number
\[ w = \frac{|U|^2}{4V^2E^2}\, \left| \bar u_{k,s} \gamma^0 u_{p,r} \right|^2\, \frac{V}{(2\pi)^2}\, E\, |\mathbf{k}|\, d\Omega_k \tag{111} \]
representing the probability per unit time that a particle gets scattered into
the solid angle $d\Omega_k$. More precisely, since the incoming wave-function is
extended in all space and is normalized to one, which means that there is
only one particle in the whole universe, we have calculated the probability
for scattering if we have an incoming flux (= the number of particles per
unit time and unit area) of $\frac{v}{V}$ (where $v = \frac{|\mathbf{k}|}{E}$ is the speed of the incoming
particle). Since we would like to get a number which is not dependent on the
particular incoming flux that we have chosen, we divide the probability $w$ by
the flux and get a number called the (differential) cross section. This number
characterizes the physical process and is not dependent on any particular
choice of flux used in the experiment. It is given by
\[ d\sigma = \frac{|U(\mathbf{k}-\mathbf{p})|^2}{4(2\pi)^2}\, \left| \bar u_{k,s} \gamma^0 u_{p,r} \right|^2\, d\Omega_k . \tag{112} \]
To get the actual number of scattered particles per unit time that we will
measure in our detector, we have to multiply this number with the incoming
flux we are using in the experiment. If we are interested here in the cross
section when the incoming particle has some particular spin and the outgoing
particle also has some fixed spin we just insert their corresponding spinors $u$
and $\bar u$ in the expression for the cross section above and we are done.
However, if we assume that the initial state is unpolarized, which means
that half of the particles have spin up and the other half have spin down,
but that the relative phases of the particles are totally random, then the
resulting probability (cross section) is given by averaging over the spin of the
initial wave-function. In this case that means summing the final result over
$r$ and multiplying by $\frac{1}{2}$. If we also do not measure the spin of the outgoing
particle we have to sum the final probability (cross section) over the separate
probabilities to measure an outgoing particle with spin up and an outgoing
particle with spin down. This gives us
\[ d\sigma = \frac{|U(\mathbf{k}-\mathbf{p})|^2}{4(2\pi)^2}\, \frac{1}{2} \sum_{r} \sum_{s} \left| \bar u_{k,s} \gamma^0 u_{p,r} \right|^2\, d\Omega_k . \tag{113} \]
The sums over the different spins can be written
\[ \sum_{r} \sum_{s} \bar u_{k,s} \gamma^0 u_{p,r}\, \bar u_{p,r} \gamma^0 u_{k,s} \tag{114} \]
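or, writing out the matrix indexes explicitly
\[ \sum_{s} (u_{k,s})_a (\bar u_{k,s})_b\, \gamma^0_{bc} \sum_{r} (u_{p,r})_c (\bar u_{p,r})_d\, \gamma^0_{da} \tag{115} \]
which, defining the matrix $M_{ab}(k) \equiv \sum_s (u_{k,s})_a (\bar u_{k,s})_b$, can be written
\[ \mathrm{Tr}\left[ M(k)\, \gamma^0\, M(p)\, \gamma^0 \right] . \tag{116} \]
Using the explicit representation of the spinors one can find that
\[ M(k) = \slashed{k} + m \tag{117} \]
so we have
\[ d\sigma = \frac{|U(\mathbf{k}-\mathbf{p})|^2}{8(2\pi)^2}\, \mathrm{Tr}\left[ (\slashed{k} + m)\gamma^0 (\slashed{p} + m)\gamma^0 \right] d\Omega_k . \tag{118} \]
Using the gamma-matrix anti-commutation relations we can compute
\[ \mathrm{Tr}\left[ (\slashed{k} + m)\gamma^0 (\slashed{p} + m)\gamma^0 \right] = 4\left( m^2 + E_kE_p + \mathbf{k}\cdot\mathbf{p} \right) . \tag{119} \]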
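The trace identity (119) is easy to check numerically with explicit gamma matrices. The sketch below assumes the standard Dirac representation and the metric $(+,-,-,-)$; the helper names `slash` and `trace_both_sides` and the sample momenta are illustrative choices made here.

```python
import numpy as np

# Dirac matrices in the standard (Dirac) representation, metric (+,-,-,-).
I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma0 = np.block([[I2, Z2], [Z2, -I2]]).astype(complex)
gamma_vec = [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def slash(E, k3):
    """Return gamma^0 E - gamma . k for the four-vector (E, k3)."""
    out = E * gamma0
    for ki, gi in zip(k3, gamma_vec):
        out = out - ki * gi
    return out

def trace_both_sides(m, k3, p3):
    """Left- and right-hand side of eq. (119) for on-shell momenta k3, p3."""
    k3, p3 = np.asarray(k3, float), np.asarray(p3, float)
    Ek = np.sqrt(m**2 + k3 @ k3)
    Ep = np.sqrt(m**2 + p3 @ p3)
    one = np.eye(4)
    lhs = np.trace((slash(Ek, k3) + m * one) @ gamma0
                   @ (slash(Ep, p3) + m * one) @ gamma0)
    rhs = 4 * (m**2 + Ek * Ep + k3 @ p3)
    return lhs, rhs

lhs, rhs = trace_both_sides(0.511, [0.3, -0.2, 0.9], [0.0, 0.0, 1.1])
print(lhs.real, rhs)
```

The check does not require $E_k = E_p$; the identity holds for any two on-shell momenta.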
Finally, using that $E_k = E_p$ and $|\mathbf{k}| = |\mathbf{p}|$, we can choose a coordinate system
so that $\mathbf{p}$ is along the z-axis and $\mathbf{k}$ is pointing in the $(\theta, \phi)$ direction. Inserting
this we get the final formula for the relativistically corrected Rutherford
formula, also called the Mott cross section:
\[ d\sigma = \frac{Z^2 e^4}{4 \sin^4\frac{\theta}{2}}\, \frac{E^2}{|\mathbf{k}|^4} \left( 1 - v^2 \sin^2\frac{\theta}{2} \right) d\Omega . \tag{120} \]
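As a consistency check, the closed form (120) can be compared with the unsimplified combination of (102), (118) and (119). The sketch below is illustrative only: the function names are invented here, and the default value $e^2 \approx 1/137.036$ assumes that $e^2$ equals the fine-structure constant in the Gaussian natural units of these notes.

```python
import numpy as np

ALPHA = 1 / 137.036   # assumption: e^2 = alpha in Gaussian natural units

def mott_dsigma_domega(Z, E, theta, m, e2=ALPHA):
    """Eq. (120): Z^2 e^4 E^2 (1 - v^2 sin^2(theta/2)) / (4 |k|^4 sin^4(theta/2))."""
    k = np.sqrt(E**2 - m**2)
    v = k / E
    s2 = np.sin(theta / 2) ** 2
    return Z**2 * e2**2 * E**2 * (1 - v**2 * s2) / (4 * k**4 * s2**2)

def mott_from_trace(Z, E, theta, m, e2=ALPHA):
    """Eqs. (102), (118) and (119) combined, before any simplification."""
    k = np.sqrt(E**2 - m**2)
    q2 = 2 * k**2 * (1 - np.cos(theta))             # |k - p|^2 for |k| = |p|
    U = 4 * np.pi * Z * e2 / q2                      # eq. (102)
    trace = 4 * (m**2 + E**2 + k**2 * np.cos(theta)) # eq. (119) with E_k = E_p
    return U**2 / (8 * (2 * np.pi) ** 2) * trace     # eq. (118)

closed = mott_dsigma_domega(79, 2.0, 1.0, 0.511)
direct = mott_from_trace(79, 2.0, 1.0, 0.511)
print(closed, direct)
```

For $v \to 0$ the factor $1 - v^2\sin^2(\theta/2)$ tends to 1 and the expression reduces to the non-relativistic Rutherford form.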
9 Pair creation
What is the probability that an electron/positron pair is created in the potential
\[ A_\mu = \left( 0, 0, 0, \sqrt{4\pi}\, a \cos(\omega t) \right) ? \tag{121} \]
This 4-potential represents an electric field directed in the 3 direction and
oscillating with frequency $\omega$. In this case the initial and final states are of
course given by
\[ |i\rangle = |0\rangle , \qquad |f\rangle = b^\dagger_{k_1,r_1}\, d^\dagger_{k_2,r_2}\, |0\rangle \tag{122} \]
representing the fact that initially we do not have any particles at all but we
will end up with both an electron with momentum $k_1$ and a positron with
momentum $k_2$. As usual the probability amplitude is given by
\[ M = \langle f|\, T e^{-i\int H_I}\, |i\rangle \tag{123} \]
which, to lowest order in the expansion parameter $e$, can be written as
\[ M = -ie\, \langle f| \int d^4x\, \hat{\bar\psi} \slashed{A} \hat\psi\, |0\rangle = -ie \int d^4x\, A_3\, \langle 0|\, b_{k_1,r_1}\, d_{k_2,r_2}\, \hat{\bar\psi} \gamma^3 \hat\psi\, |0\rangle . \tag{124} \]
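It is quite clear that the only piece that will survive is the piece containing
the operator $b^\dagger$ from $\hat{\bar\psi}$ and the operator $d^\dagger$ from $\hat\psi$. What remains is
\[ -\frac{ie}{V} \sum_{q_1,s_1} \sum_{q_2,s_2} \int d^4x\, A_3\, \frac{e^{i(q_1+q_2)\cdot x}}{\sqrt{4E_{q_1}E_{q_2}}}\, \bar u_{q_1,s_1} \gamma^3 v_{q_2,s_2}\, \langle 0|\, b_{k_1,r_1}\, d_{k_2,r_2}\, b^\dagger_{q_1,s_1}\, d^\dagger_{q_2,s_2}\, |0\rangle \tag{125} \]
which, using the anti-commutation relations, simplifies to
\[ \frac{ie}{V} \int d^4x\, A_3\, \frac{e^{i(k_1+k_2)\cdot x}}{\sqrt{4E_{k_1}E_{k_2}}}\, \bar u_{k_1,r_1} \gamma^3 v_{k_2,r_2} . \tag{126} \]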
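Inserting the expression for the potential we write
\[ \frac{iea\sqrt{4\pi}}{V\sqrt{4E_{k_1}E_{k_2}}}\, \bar u_{k_1,r_1} \gamma^3 v_{k_2,r_2} \int d^4x\, \cos(\omega t)\, e^{i(k_1+k_2)\cdot x} . \tag{127} \]
The integral can be performed by rewriting $\cos(\omega t)$ in terms of exponentials
as
\[ \int d^4x\, \cos(\omega t)\, e^{i(k_1+k_2)\cdot x} = \int d^3x\, e^{-i(\mathbf{k}_1+\mathbf{k}_2)\cdot\mathbf{x}} \int dt\, e^{i(E_{k_1}+E_{k_2})t}\, \frac{e^{i\omega t} + e^{-i\omega t}}{2} = \frac{(2\pi)^4}{2}\, \delta^3(\mathbf{k}_1 + \mathbf{k}_2) \left( \delta(E_{k_1} + E_{k_2} + \omega) + \delta(E_{k_1} + E_{k_2} - \omega) \right) . \tag{128} \]
The term containing $\delta(E_{k_1} + E_{k_2} + \omega)$ will clearly not give any contribution
since $E_{k_1}$, $E_{k_2}$ and $\omega$ are all positive. Therefore we have for the probability
amplitude
\[ M = \frac{i(2\pi)^4 ea\sqrt{4\pi}}{2V\sqrt{4E_{k_1}E_{k_2}}}\, \bar u_{k_1,r_1} \gamma^3 v_{k_2,r_2}\, \delta^3(\mathbf{k}_1 + \mathbf{k}_2)\, \delta(E_{k_1} + E_{k_2} - \omega) . \tag{129} \]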
Again we see that the delta functions express energy conservation $\omega = E_{k_1} + E_{k_2}$
and momentum conservation $\mathbf{k}_1 + \mathbf{k}_2 = 0$. Namely, the frequency of the
electric field has to represent an energy which precisely matches the energy of
the created electron/positron pair. Also, the electron and positron have to come
out back-to-back so that momentum is conserved. This also means that
$E_{k_1} = E_{k_2} \equiv E$.
The probability is the absolute square of the probability amplitude
\[ P = \frac{\pi e^2 a^2}{4V^2E^2}\, \left| \bar u_{k_1,r_1} \gamma^3 v_{k_2,r_2} \right|^2 \left[ (2\pi)^3 \delta^3(\mathbf{k}_1 + \mathbf{k}_2)\, (2\pi)\delta(2E - \omega) \right]^2 . \tag{130} \]
By a similar trick as in the last section we evaluate the square of the delta
functions to be
\[ \left[ (2\pi)^3 \delta^3(\mathbf{k}_1 + \mathbf{k}_2)\, (2\pi)\delta(2E - \omega) \right]^2 = VT\, (2\pi)^4\, \delta^3(\mathbf{k}_1 + \mathbf{k}_2)\, \delta(2E - \omega) \tag{131} \]
giving us
\[ P = T\, \frac{\pi e^2 a^2 (2\pi)^4}{4VE^2}\, \left| \bar u_{k_1,r_1} \gamma^3 v_{k_2,r_2} \right|^2\, \delta^3(\mathbf{k}_1 + \mathbf{k}_2)\, \delta(2E - \omega) . \tag{132} \]
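As in the previous example, we cannot separate final states which are too
close in phase space. Therefore we have to sum over these probabilities to
get a total “effective” probability. As in the previous example, this means
including a factor $\frac{V}{(2\pi)^3} d^3k$ for each final particle, giving us
\[ P = T \int\!\!\int \frac{\pi e^2 a^2 (2\pi)^4}{4VE^2}\, \left| \bar u_{k_1,r_1} \gamma^3 v_{k_2,r_2} \right|^2\, \delta^3(\mathbf{k}_1 + \mathbf{k}_2)\, \delta(2E - \omega)\, \frac{V}{(2\pi)^3} d^3k_1\, \frac{V}{(2\pi)^3} d^3k_2 . \tag{133} \]
One of the integrals is easily performed using the first delta function, giving us
that $\mathbf{k}_1 = -\mathbf{k}_2 \equiv \mathbf{k}$, and the second integral we can perform, as in the previous
case, after rewriting $d^3k = |\mathbf{k}|^2\, d|\mathbf{k}|\, d\Omega_k = |\mathbf{k}|^2\, \frac{d|\mathbf{k}|}{dE}\, dE\, d\Omega_k = |\mathbf{k}|\, E\, dE\, d\Omega_k$. The
result is (noticing that $\delta(2E - \omega) = \frac{1}{2}\delta(E - \frac{\omega}{2})$)
\[ P = VT\, \frac{e^2 a^2 |\mathbf{k}|}{16\pi\omega}\, \left| \bar u_{k_1,r_1} \gamma^3 v_{k_2,r_2} \right|^2\, d\Omega_k . \tag{134} \]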
Notice that the probability is proportional to the volume (of the region with
the electric field) and the time we let the field act, in accordance with physical
expectations. This makes it more useful to speak about the probability per
unit volume and unit time, $\frac{P}{VT}$.
If we are not observing the spins of the final particles we have to sum over
the different probabilities of observing the different possible spins. Then we
will get the formula
\[ \frac{P}{VT} = \frac{e^2 a^2 |\mathbf{k}|}{16\pi\omega}\, d\Omega_k \sum_{r_1,r_2} \left| \bar u_{k_1,r_1} \gamma^3 v_{k_2,r_2} \right|^2 . \tag{135} \]
Using the trick of the last section, this can be rewritten as
\[ \frac{P}{VT} = \frac{e^2 a^2 |\mathbf{k}|}{16\pi\omega}\, d\Omega_k\, \mathrm{Tr}\left[ (\slashed{k}_1 + m)\gamma^3 (\slashed{k}_2 - m)\gamma^3 \right] \tag{136} \]
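or, using the anti-commutation relations of the gamma matrices and remembering
that $k_1 = (E, \mathbf{k})$, $k_2 = (E, -\mathbf{k})$,
\[ \frac{P}{VT} = \frac{e^2 a^2 |\mathbf{k}|}{16\pi\omega}\, d\Omega_k\, 2\omega^2 \left( 1 - v^2 \cos^2\theta \right) = \frac{e^2 a^2 \omega^2}{16\pi} \sqrt{1 - \frac{4m^2}{\omega^2}}\, \left( 1 - v^2 \cos^2\theta \right) d\Omega_k \tag{137} \]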
which gives the probability of an electron/positron pair with a momentum
at angle $\theta$ to the electric field. If we are interested in the total
probability, irrespective of the angle, we have to integrate over $d\Omega$ to get
\[ \frac{P}{VT} = \frac{e^2 a^2 \omega^2}{6} \sqrt{1 - \frac{4m^2}{\omega^2}} \left( 1 + \frac{2m^2}{\omega^2} \right) . \tag{138} \]
Notice that there is a “threshold” in the energy. The probability is zero for
$\omega \leq 2m \leq 2E$, i.e. the energy of the photons in the field must be larger than
the mass of the electron/positron pair to be able to create it. In contrast, the
probability is non-zero for an arbitrarily small amplitude $a$ of the field.
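Two steps of this section lend themselves to a quick numerical check: the trace in (136) against the closed form $2\omega^2(1 - v^2\cos^2\theta)$ in (137), and the angular integration leading from (137) to (138). The following is a sketch under assumptions made here: Dirac representation, metric $(+,-,-,-)$, hypothetical helper names, and arbitrary sample values for $e^2$, $a$, $\omega$ and $m$.

```python
import numpy as np

# Dirac matrices gamma^0..gamma^3 in the standard representation.
I2, Z2 = np.eye(2), np.zeros((2, 2))
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
g = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
g += [np.block([[Z2, s], [-s, Z2]]) for s in sig]

def slash(a):
    """a-slash for a contravariant four-vector a = (a0, a1, a2, a3)."""
    return a[0] * g[0] - a[1] * g[1] - a[2] * g[2] - a[3] * g[3]

def pair_trace(m, kvec):
    """Tr[(k1-slash + m) gamma^3 (k2-slash - m) gamma^3], k1=(E,k), k2=(E,-k)."""
    kvec = np.asarray(kvec, float)
    E = np.sqrt(m**2 + kvec @ kvec)
    k1 = np.concatenate(([E], kvec))
    k2 = np.concatenate(([E], -kvec))
    one = np.eye(4)
    return np.trace((slash(k1) + m * one) @ g[3] @ (slash(k2) - m * one) @ g[3]).real

def trace_closed_form(m, kvec):
    """2 omega^2 (1 - v^2 cos^2 theta) with omega = 2E, theta measured from the 3-axis."""
    kvec = np.asarray(kvec, float)
    E = np.sqrt(m**2 + kvec @ kvec)
    v2 = (kvec @ kvec) / E**2
    cos2 = kvec[2] ** 2 / (kvec @ kvec)
    return 2 * (2 * E) ** 2 * (1 - v2 * cos2)

def total_rate(e2, a, omega, m):
    """Eq. (138)."""
    return e2 * a**2 * omega**2 / 6 * np.sqrt(1 - 4 * m**2 / omega**2) * (1 + 2 * m**2 / omega**2)

def integrated_rate(e2, a, omega, m, n=8):
    """Integrate eq. (137) over the solid angle (Gauss-Legendre in cos theta)."""
    v2 = 1 - 4 * m**2 / omega**2
    x, w = np.polynomial.legendre.leggauss(n)      # nodes and weights on [-1, 1]
    pref = e2 * a**2 * omega**2 / (16 * np.pi) * np.sqrt(v2)
    return 2 * np.pi * np.sum(w * pref * (1 - v2 * x**2))

print(pair_trace(1.0, [0.4, -0.3, 1.2]), trace_closed_form(1.0, [0.4, -0.3, 1.2]))
print(integrated_rate(0.0073, 1.0, 3.0, 1.0), total_rate(0.0073, 1.0, 3.0, 1.0))
```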
10 The quantized electro-magnetic field
In the two previous examples the electro-magnetic field was treated classically
as an external field. In order to incorporate photons into the theory we need
to quantize also the electro-magnetic field. We will do this in a relativistically
covariant fashion, so let us start by recapitulating some notation. Remember
that we may use a scalar potential $\phi$ and a vector potential $\mathbf{A}$ to describe
the electric and magnetic fields
\[ \mathbf{E} = -\nabla\phi - \partial_t \mathbf{A}, \qquad \mathbf{B} = \nabla \times \mathbf{A}. \tag{139} \]
Introducing the four vector $A^\mu = (\phi, \mathbf{A})$ we may write the electric and the
magnetic field in a compact form
\[ F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0 \end{pmatrix} . \tag{140} \]
Using $F^{\mu\nu}$, Maxwell’s equations can also be written covariantly as
\[ \partial_\mu F^{\mu\nu} = 0, \qquad \partial^\mu F^{\nu\sigma} + \partial^\sigma F^{\mu\nu} + \partial^\nu F^{\sigma\mu} = 0. \tag{141} \]
Here we have used Einstein’s summation convention (sum over repeated indexes).
It is interesting to observe that $A^\mu$ is not uniquely specified by the
electric and magnetic field. Namely, if we define a new vector potential by
\[ A^\mu_{new} = A^\mu_{old} + \partial^\mu \chi, \tag{142} \]
for any function $\chi(x)$, the field strength $F^{\mu\nu}$, and thus the electric and
magnetic fields, remain unchanged. This we can use to simplify the form
of Maxwell’s equations. If we choose $\chi$ so that $\partial_\mu A^\mu_{new} = 0$, i.e. so that
$\partial_\mu \partial^\mu \chi + \partial_\mu A^\mu_{old} = 0$, then we have
\[ \partial_\mu F^{\mu\nu}_{new} = \partial_\mu \left( \partial^\mu A^\nu_{new} - \partial^\nu A^\mu_{new} \right) = \partial_\mu \partial^\mu A^\nu_{new} = 0, \tag{143} \]
that is, each component of the vector potential has to satisfy the (massless)
Klein-Gordon equation (which we have already solved!). Thus, the vector
potential we will use will have to satisfy two equations
\[ \partial_\mu A^\mu = 0, \qquad \partial_\mu \partial^\mu A^\nu = 0. \tag{144} \]
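The two facts used above, invariance of the field strength under (142) and the disappearance of the second term in (143), follow in one line each. The worked step below is a sketch that only assumes that partial derivatives commute.

```latex
\[
F^{\mu\nu}_{new}
 = \partial^\mu\!\left(A^\nu_{old} + \partial^\nu\chi\right)
 - \partial^\nu\!\left(A^\mu_{old} + \partial^\mu\chi\right)
 = F^{\mu\nu}_{old} + \left(\partial^\mu\partial^\nu - \partial^\nu\partial^\mu\right)\chi
 = F^{\mu\nu}_{old},
\]
\[
\partial_\mu \partial^\nu A^\mu_{new}
 = \partial^\nu\!\left(\partial_\mu A^\mu_{new}\right) = 0 .
\]
```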
However, this still does not completely specify $A^\mu$. We may still shift it as
$A^\mu_{fin} = A^\mu_{new} + \partial^\mu \Lambda$ with a $\Lambda$ satisfying $\partial_\mu \partial^\mu \Lambda = 0$, since this leaves the two
equations (144) invariant. This additional invariance can be used to choose
$A^0_{fin} = 0$. We have thus seen that we can always choose a vector potential
which satisfies the following three equations
\[ A^0 = 0, \qquad \partial_\mu A^\mu = 0, \qquad \partial_\mu \partial^\mu A^\nu = 0. \tag{145} \]
This choice of the form of the vector potential (or choice of gauge as the
jargon goes) is known as the Coulomb gauge.
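Using this information we may now immediately write down the quantized
electromagnetic field
\[ A_\mu = \frac{1}{\sqrt{V}} \sum_{k,\alpha} \frac{1}{\sqrt{2\omega}} \left[ a_{k,\alpha}\, \epsilon^{(\alpha)}_\mu\, e^{-ikx} + a^\dagger_{k,\alpha}\, \epsilon^{(\alpha)}_\mu\, e^{ikx} \right] . \tag{146} \]
From the third equation in (145) we find that $k_\mu k^\mu = \omega^2 - \mathbf{k}^2 = 0$. The
second equation tells us that $k \cdot \epsilon^{(\alpha)} = 0$ while the first equation tells us that
$\epsilon^{(\alpha)}_0 = 0$. We thus find that $\epsilon^{(\alpha)}_\mu$ is a four vector with zero time component
and orthogonal to the four momentum. Thus, out of the four orthonormal
four vectors
\[ \epsilon^{(0)}_\mu = (1, \mathbf{0}), \qquad \epsilon^{(1)}_\mu = (0, \bar\epsilon^{(1)}), \qquad \epsilon^{(2)}_\mu = (0, \bar\epsilon^{(2)}), \qquad \epsilon^{(3)}_\mu = \left( 0, \frac{\mathbf{k}}{|\mathbf{k}|} \right), \tag{147} \]
only $\epsilon^{(1)}_\mu$ and $\epsilon^{(2)}_\mu$ are admissible. Thus we see that the sum over $\alpha$ in (146)
is restricted to $\alpha = 1, 2$ in order for $A_\mu$ to satisfy all three equations in
(145). The $\epsilon^{(\alpha)}_\mu$ vectors are known as the polarization vectors of the photon.
As usual, when we second quantize, the $a_{k,\alpha}$, $a^\dagger_{k,\alpha}$ become annihilation and
creation operators which annihilate/create a photon with momentum $k$ and
polarization $\epsilon^{(\alpha)}_\mu$.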
11 The electron propagator
If we are interested in the next to lowest order corrections to the scattering
in an external potential, we have to study the term
\[ -\frac{e^2}{2}\, \langle f| \int d^4x_1 \int d^4x_2\, T\left[ (\bar\psi \slashed{A} \psi)(x_1)\, (\bar\psi \slashed{A} \psi)(x_2) \right] |i\rangle . \tag{148} \]
Notice here that the $T$ operator is non-trivial and important since the two
$\bar\psi \slashed{A} \psi$ factors change place if $t_2 > t_1$.
Let us first assume that $t_1 > t_2$. Then the ordering given above is the
correct one and we can use the term as it stands. For each $\psi$ factor there
are two different choices for the operator, one associated with the electron and
one associated with the positron. Since there are four $\psi$ operators, we have
in principle 16 different terms. However, out of these 16 terms, only 2 are
non-trivial. All the others are either zero or they represent “non-connected”
terms in the sense discussed before. For instance, there is one term which
represents an electron getting scattered at $x_1$ while at $x_2$ a positron is created
and annihilated but there is no contact between these two points. The first
non-trivial term is schematically
\[ \langle 0|\, b_f\, b^\dagger_1 b_1\, b^\dagger_2 b_2\, b^\dagger_i\, |0\rangle = \delta_{f1}\, \delta_{12}\, \delta_{2i} \tag{149} \]
representing the incoming electron being scattered first at $x_2$ and then, later,
at $x_1$. The second non-trivial term is
\[ \langle 0|\, b_f\, d_1 b_1\, b^\dagger_2 d^\dagger_2\, b^\dagger_i\, |0\rangle = -\delta_{12}\, \delta_{f2}\, \delta_{1i} + \delta_{12}\, \delta_{12}\, \delta_{fi} . \tag{150} \]
The second term here is again “non-connected”, representing a process with no
scattering because of the $\delta_{fi}$, but the first term is interesting and slightly hard
to interpret. Since $t_1 > t_2$ we have to interpret it as the incoming electron
flying past the point $x_2$, where an electron/positron pair is created, and
later annihilating with the positron just created at $x_2$. The electron created
at $x_1$ is in fact the final outgoing electron. Graphically we have
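[Figure not reproduced.]
Similarly, when $t_2 > t_1$ we have graphically
[Figure not reproduced.]
Using this knowledge we write diagrams (1) and (3) (where the initial electron
is annihilated in $x_2$) as
\[ \langle 0|\, b_f\, \bar\psi_a(x_1)\, |0\rangle\, \langle 0|\, T\left[ \psi_b(x_1) \bar\psi_c(x_2) \right] |0\rangle\, \langle 0|\, \psi_d(x_2)\, b^\dagger_i\, |0\rangle \tag{151} \]
and the diagrams (2) and (4) (where the initial electron is annihilated in $x_1$)
as
\[ \langle 0|\, b_f\, \bar\psi_c(x_2)\, |0\rangle\, \langle 0|\, T\left[ \psi_d(x_2) \bar\psi_a(x_1) \right] |0\rangle\, \langle 0|\, \psi_b(x_1)\, b^\dagger_i\, |0\rangle \tag{152} \]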
and the total amplitude is of course the sum of these two terms. We see that
in this expression the object $\langle 0|\, T[\psi_a(x_1) \bar\psi_b(x_2)]\, |0\rangle$ plays an important role.
It represents the particle going between the points $x_1$ and $x_2$, and if $t_1 > t_2$
it is an electron but if $t_2 > t_1$ it is a positron. This object is therefore called
the electron/positron propagator and we will now proceed to calculate it.
Because of the time ordering operator $T$ we have to consider two cases.
Assume to begin with that $t_1 > t_2$. Then we pick out the electron creation/annihilation
operators and the propagator can be written as
\[ \sum_{k_1,s_1} \sum_{k_2,s_2} \frac{e^{-ik_1\cdot x_1 + ik_2\cdot x_2}}{2V\sqrt{E_1E_2}}\, (u_1)_a (\bar u_2)_b\, \langle 0|\, b_1 b^\dagger_2\, |0\rangle . \tag{153} \]
Since $\langle 0|\, b_1 b^\dagger_2\, |0\rangle = \delta_{k_1,k_2}\, \delta_{s_1,s_2}$ we can evaluate one of the sums “for free”
\[ \sum_{k_1,s_1} \frac{e^{ik_1\cdot(x_2-x_1)}}{2VE_1}\, (u_1)_a (\bar u_1)_b . \tag{154} \]
Using the results of the previous section ($\sum_s u^{(s)}_a \bar u^{(s)}_b = (\slashed{k} + m)_{ab}$) we can
calculate the sum over the spin in the expression for the propagator. The
result is
\[ \sum_{k_1} \frac{e^{ik_1\cdot(x_2-x_1)}}{2VE_1}\, (\slashed{k}_1 + m)_{ab} . \tag{155} \]
For convenience we will here change the summation over momenta $\mathbf{k}$ into
an integral. This we can do since the volume of the universe is large (so that
the distribution of states in momentum space is almost continuous), which
means that $\sum_{\mathbf{k}} = \frac{V}{(2\pi)^3} \int d^3k$. This gives us
\[ \int \frac{d^3k}{(2\pi)^3}\, e^{iE(t_2-t_1)}\, e^{-i\mathbf{k}\cdot(\mathbf{x}_2-\mathbf{x}_1)}\, \frac{(\slashed{k} + m)_{ab}}{2E} \tag{156} \]
(where we have dropped the index 1 for convenience). Notice here that $E$ is
a function of $\mathbf{k}$.
The same analysis in the case where $t_2 > t_1$ gives
\[ -\int \frac{d^3k}{(2\pi)^3}\, e^{iE(t_1-t_2)}\, e^{-i\mathbf{k}\cdot(\mathbf{x}_1-\mathbf{x}_2)}\, \frac{(\slashed{k} - m)_{ab}}{2E} \tag{157} \]
where the minus sign comes from the fact that the $T$ operator has reordered
two fermionic operators.
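We can rewrite the result in a more covariant form by using the integral
\[ -\frac{e^{-iE|t|}}{2E} = \lim_{\varepsilon\to 0} \frac{1}{2\pi i} \int_{-\infty}^{\infty} dk_0\, \frac{e^{-ik_0 t}}{k_0^2 - E^2 + i\varepsilon} . \tag{158} \]
The first part of the propagator then becomes
\[ \int \frac{d^3k}{(2\pi)^3}\, e^{iE(t_2-t_1)}\, e^{-i\mathbf{k}\cdot(\mathbf{x}_2-\mathbf{x}_1)}\, \frac{(\slashed{k} + m)_{ab}}{2E} = (-i\slashed{\partial}_2 + m)_{ab} \int \frac{d^3k}{(2\pi)^3}\, \frac{e^{iE(t_2-t_1)}}{2E}\, e^{-i\mathbf{k}\cdot(\mathbf{x}_2-\mathbf{x}_1)} = (-i\slashed{\partial}_2 + m)_{ab} \int \frac{d^3k}{(2\pi)^3}\, i \int \frac{dk_0}{2\pi}\, \frac{e^{ik_0(t_2-t_1)}}{k_0^2 - E^2 + i\varepsilon}\, e^{-i\mathbf{k}\cdot(\mathbf{x}_2-\mathbf{x}_1)} . \tag{159} \]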
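Since $E^2 = \mathbf{k}^2 + m^2$ the denominator of the integrand can be written
$k_0^2 - E^2 + i\varepsilon = k_0^2 - \mathbf{k}^2 - m^2 + i\varepsilon = k^2 - m^2 + i\varepsilon$, giving us
\[ i\, (-i\slashed{\partial}_2 + m)_{ab} \int \frac{d^4k}{(2\pi)^4}\, \frac{e^{ik\cdot(x_2-x_1)}}{k^2 - m^2 + i\varepsilon} \tag{160} \]
and pushing the derivative operator back in, we get
\[ i \int \frac{d^4k}{(2\pi)^4}\, \frac{(\slashed{k} + m)_{ab}}{k^2 - m^2 + i\varepsilon}\, e^{ik\cdot(x_2-x_1)} . \tag{161} \]
The second part of the propagator (the one with $t_2 > t_1$) can similarly be
rewritten
\[ i \int \frac{d^4k}{(2\pi)^4}\, \frac{(-\slashed{k} + m)_{ab}}{k^2 - m^2 + i\varepsilon}\, e^{ik\cdot(x_1-x_2)} \tag{162} \]
and, changing the integration variable from $k$ to $-k$, we get
\[ i \int \frac{d^4k}{(2\pi)^4}\, \frac{(\slashed{k} + m)_{ab}}{k^2 - m^2 + i\varepsilon}\, e^{ik\cdot(x_2-x_1)} \tag{163} \]
which is exactly the same expression as for the part of the propagator with
$t_1 > t_2$. We thus have a unique expression for the propagator
\[ G(x_1 - x_2) \equiv i \int \frac{d^4k}{(2\pi)^4}\, \frac{(\slashed{k} + m)_{ab}}{k^2 - m^2 + i\varepsilon}\, e^{-ik\cdot(x_1-x_2)} \tag{164} \]
independent of whether $t_1$ or $t_2$ comes first.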
Notice that if we act with the Dirac operator $i\slashed{\partial} - m$ on the propagator
we get
\[ (i\slashed{\partial} - m)\, G(x) = i \int \frac{d^4k}{(2\pi)^4}\, \frac{(\slashed{k} - m)(\slashed{k} + m)}{k^2 - m^2 + i\varepsilon}\, e^{-ik\cdot x} = i \int \frac{d^4k}{(2\pi)^4}\, e^{-ik\cdot x} = i\delta^4(x) \tag{165} \]
so that $G(x)$ is the Green function of the Dirac operator, in accordance with
the usual interpretation of the propagator.
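The cancellation in (165) rests on the matrix identity $(\slashed{k} - m)(\slashed{k} + m) = (k^2 - m^2)\,\mathbb{1}$, valid for an arbitrary (off-shell) four-momentum. A minimal numerical sketch, assuming the Dirac representation and metric $(+,-,-,-)$, with illustrative names and sample values chosen here:

```python
import numpy as np

# Dirac matrices gamma^0..gamma^3 in the standard representation.
I2, Z2 = np.eye(2), np.zeros((2, 2))
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]
g = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
g += [np.block([[Z2, s], [-s, Z2]]) for s in sig]

def slash(k):
    """k-slash = gamma^0 k^0 - gamma . k for a contravariant four-vector k."""
    return k[0] * g[0] - k[1] * g[1] - k[2] * g[2] - k[3] * g[3]

k = np.array([1.3, 0.2, -0.7, 0.5])   # arbitrary off-shell four-momentum
m = 0.9
one = np.eye(4)
k_squared = k[0] ** 2 - k[1:] @ k[1:]  # Minkowski square k^2

product = (slash(k) - m * one) @ (slash(k) + m * one)
print(np.allclose(product, (k_squared - m**2) * one))
```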
12 Compton scattering
Compton scattering is scattering of a photon on an electron. In the initial
state we therefore have a photon and an electron
\[ |i\rangle = b^\dagger_{p,s}\, a^\dagger_{k,\alpha}\, |0\rangle . \tag{166} \]
Since the photon is physical the index $\alpha$ takes only the values 1, 2, corresponding
to the two physical polarizations of the photon. The final state
also contains a photon and an electron but with different spins and momenta
\[ |f\rangle = b^\dagger_{p',s'}\, a^\dagger_{k',\alpha'}\, |0\rangle . \tag{167} \]
As usual, the probability amplitude is given by
\[ M = \langle f|\, T e^{-i\int H_I}\, |i\rangle = \langle f|i\rangle - i\, \langle f| \int H_I\, |i\rangle - \frac{1}{2}\, \langle f|\, T\left[ \int H_I \int H_I \right] |i\rangle + \ldots \tag{168} \]
The first term is non-zero only for the case when no scattering is taking place
($p = p'$, $k = k'$). The second term is zero because it always involves exactly
three photon creation/annihilation operators, so the lowest non-trivial term
is the third one. Separating the piece that has to do with electrons/positrons
and the piece that has to do with photons we can write it as
\[ -\frac{e^2}{2} \int d^4x_1\, d^4x_2\, \langle 0|\, b_{p',s'}\, T\left[ (\bar\psi_a \psi_b)(x_1)\, (\bar\psi_c \psi_d)(x_2) \right] b^\dagger_{p,s}\, |0\rangle\, (\gamma^\mu)_{ab}\, (\gamma^\nu)_{cd}\, \langle 0|\, a_{k',\alpha'}\, T\left[ A_\mu(x_1) A_\nu(x_2) \right] a^\dagger_{k,\alpha}\, |0\rangle . \tag{169} \]
The piece associated with the electrons/positrons we have already calculated
in the previous section. It is given by two terms corresponding to the cases
where the electron first goes to the point $x_2$, interacts, then goes to the point
$x_1$ where it is scattered to the final electron state and, oppositely, when it
goes first to $x_1$ and then continues to $x_2$. Graphically this can be represented
as
[Figure not reproduced.]
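For the second process the expression is
\[ \frac{(\bar u_{p',s'})_a}{\sqrt{2VE'}}\, e^{ip'\cdot x_1}\; i \int \frac{d^4q}{(2\pi)^4}\, \frac{(\slashed{q} + m)_{bc}}{q^2 - m^2 + i\varepsilon}\, e^{-iq\cdot(x_1-x_2)}\; \frac{(u_{p,s})_d}{\sqrt{2VE}}\, e^{-ip\cdot x_2} . \tag{170} \]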
Here we recognize the middle part as the electron/positron propagator corresponding
to the piece where the electron propagates from the point $x_2$ to the
point $x_1$. The expression for the first term is similar. Evaluating the photon
piece we get a sum of two terms. The first comes from choosing the creation
operator in $A_\mu(x_1)$ and the annihilation operator in $A_\nu(x_2)$, representing the
case where the incoming photon is annihilated in $x_2$ and the outgoing photon
is created in $x_1$, and the second term (where we also have to ignore
a “non-connected” piece) comes from choosing the annihilation operator in
$A_\mu(x_1)$ and the creation operator in $A_\nu(x_2)$, representing the case where the
incoming photon is annihilated in $x_1$ and the outgoing photon is created in
$x_2$. Explicitly we have
\[ 4\pi\, \frac{(\varepsilon_{k',\alpha'})_\mu}{\sqrt{2V\omega'}}\, e^{ik'\cdot x_1}\, \frac{(\varepsilon_{k,\alpha})_\nu}{\sqrt{2V\omega}}\, e^{-ik\cdot x_2} + 4\pi\, \frac{(\varepsilon_{k',\alpha'})_\nu}{\sqrt{2V\omega'}}\, e^{ik'\cdot x_2}\, \frac{(\varepsilon_{k,\alpha})_\mu}{\sqrt{2V\omega}}\, e^{-ik\cdot x_1} . \tag{171} \]
Graphically we can write this as
[Figure not reproduced.]
Putting these two terms together we graphically get
[Figure not reproduced.]
Since we are integrating over $x_1$ and $x_2$ in the final expression these variables
are really “dummy” variables, meaning that we can rename them anywhere
as we wish. In particular we can interchange them, $x_1 \leftrightarrow x_2$. From the
pictures we see that two of the pictures change into the other two under this
relabeling, so we really have to calculate only two terms. Graphically they
look like this
[Figure not reproduced.]
The second term we can write as
\[ \frac{\pi}{V^2\sqrt{E'E\omega'\omega}} \int d^4x_1 \int d^4x_2\; i \int \frac{d^4q}{(2\pi)^4}\, \frac{\bar u_{p',s'}\, \slashed{\varepsilon}_{k',\alpha'}\, (\slashed{q} + m)\, \slashed{\varepsilon}_{k,\alpha}\, u_{p,s}}{q^2 - m^2 + i\varepsilon}\, e^{ip'\cdot x_2}\, e^{ik'\cdot x_2}\, e^{-ip\cdot x_1}\, e^{-ik\cdot x_1}\, e^{-iq\cdot(x_2-x_1)} \tag{172} \]
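and doing the integrals over $x_1$ and $x_2$ we get
\[ \frac{\pi}{V^2\sqrt{E'E\omega'\omega}}\; i \int \frac{d^4q}{(2\pi)^4}\, \frac{\bar u_{p',s'}\, \slashed{\varepsilon}_{k',\alpha'}\, (\slashed{q} + m)\, \slashed{\varepsilon}_{k,\alpha}\, u_{p,s}}{q^2 - m^2 + i\varepsilon}\, (2\pi)^4 \delta^4(p' + k' - q)\, (2\pi)^4 \delta^4(q - p - k) . \tag{173} \]
Notice that the delta functions express momentum conservation at each of
the vertexes. The expression for the first diagram is the same except that one
has to switch positions for the polarization vectors $\varepsilon$ and switch the place of
$k$ and $k'$ in the delta functions. We can get rid of one of the delta functions
by performing the $q$ integral, which gives us
\[ -\frac{i\pi e^2 (2\pi)^4 \delta^4(p' + k' - p - k)}{V^2\sqrt{EE'\omega\omega'}} \left[ \bar u_{p',s'}\, \slashed{\varepsilon}_{k',\alpha'}\, \frac{\slashed{p} + \slashed{k} + m}{(p + k)^2 - m^2 + i\varepsilon}\, \slashed{\varepsilon}_{k,\alpha}\, u_{p,s} + \bar u_{p',s'}\, \slashed{\varepsilon}_{k,\alpha}\, \frac{\slashed{p} - \slashed{k}' + m}{(p - k')^2 - m^2 + i\varepsilon}\, \slashed{\varepsilon}_{k',\alpha'}\, u_{p,s} \right] . \tag{174} \]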
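Here we notice that the delta function which is left just expresses the momentum
conservation of the whole process. To simplify this expression further
we can use the Dirac equation on the spinors
\[ \slashed{p}\, u_p = m\, u_p \tag{175} \]
which, using the anti-commutation relations of the gamma matrices, leads to
\[ \slashed{p}\, \slashed{\varepsilon}\, u_p = -\slashed{\varepsilon}\, m\, u_p + 2\, p\cdot\varepsilon\; u_p \tag{176} \]
and choosing a coordinate system where the initial electron is at rest, $p = (m, 0, 0, 0)$,
so that $p\cdot\varepsilon = p\cdot\varepsilon' = 0$, we get
\[ -\frac{i\pi e^2 (2\pi)^4 \delta^4(p' + k' - p - k)}{V^2\sqrt{EE'\omega\omega'}} \left[ \frac{\bar u'\, \slashed{\varepsilon}'\, \slashed{k}\, \slashed{\varepsilon}\, u}{2\, p\cdot k} + \frac{\bar u'\, \slashed{\varepsilon}\, \slashed{k}'\, \slashed{\varepsilon}'\, u}{2\, p\cdot k'} \right] . \tag{177} \]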
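Now we calculate the probability density by taking the absolute square
of the amplitude. At the same time we say that we are not interested in the
polarization of the final electron, and the initial electron comes in a mixed
state, so that we have to include a sum $\frac{1}{2}\sum_{s,s'}$. We then have the probability
\[ P = \left[ \frac{\pi e^2 (2\pi)^4 \delta^4(p' + k' - p - k)}{2V^2\sqrt{EE'\omega\omega'}} \right]^2 \frac{1}{2} \sum_{s,s'} \left( \frac{\bar u'\, \slashed{\varepsilon}'\, \slashed{k}\, \slashed{\varepsilon}\, u}{p\cdot k} + \frac{\bar u'\, \slashed{\varepsilon}\, \slashed{k}'\, \slashed{\varepsilon}'\, u}{p\cdot k'} \right) \left( \frac{\bar u\, \slashed{\varepsilon}\, \slashed{k}\, \slashed{\varepsilon}'\, u'}{p\cdot k} + \frac{\bar u\, \slashed{\varepsilon}'\, \slashed{k}'\, \slashed{\varepsilon}\, u'}{p\cdot k'} \right) \tag{178} \]
where we have used that $(\gamma^\mu)^\dagger = \gamma^0 \gamma^\mu \gamma^0$. Performing the spin sums and
noticing that the first term in each parenthesis is equal to the second if we
make the exchange $\varepsilon \leftrightarrow \varepsilon'$ and $k \leftrightarrow -k'$, we have
\[ P = \left[ \frac{\pi e^2 (2\pi)^4 \delta^4(p' + k' - p - k)}{2V^2\sqrt{2EE'\omega\omega'}} \right]^2 \mathrm{Tr}\left[ \frac{\slashed{\varepsilon}'\, \slashed{k}\, \slashed{\varepsilon}\, (\slashed{p} + m)\, \slashed{\varepsilon}\, \slashed{k}\, \slashed{\varepsilon}'\, (\slashed{p}' + m)}{(p\cdot k)^2} + \frac{\slashed{\varepsilon}'\, \slashed{k}\, \slashed{\varepsilon}\, (\slashed{p} + m)\, \slashed{\varepsilon}'\, \slashed{k}'\, \slashed{\varepsilon}\, (\slashed{p}' + m)}{(p\cdot k)(p\cdot k')} + \left( \begin{array}{c} \varepsilon \leftrightarrow \varepsilon' \\ k \leftrightarrow -k' \end{array} \right) \right] . \tag{179} \]
The reason for the minus sign in the exchange of the photon momenta is
that we do not want $p$ or $p'$ to change, but since $p' = p + k - k'$ we have to
interchange $k$ and $k'$ with an extra minus sign.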
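Let us perform the trace over the first term explicitly. The fact that the
trace over any odd number of gamma matrices is zero allows us to write it
as
\[ \mathrm{Tr}\left( \slashed{\varepsilon}'\, \slashed{k}\, \slashed{\varepsilon}\, \slashed{p}\, \slashed{\varepsilon}\, \slashed{k}\, \slashed{\varepsilon}'\, \slashed{p}' \right) + m^2\, \mathrm{Tr}\left( \slashed{\varepsilon}'\, \slashed{k}\, \slashed{\varepsilon}\, \slashed{\varepsilon}\, \slashed{k}\, \slashed{\varepsilon}' \right) . \tag{180} \]
Using the gamma matrix algebra we know that $\slashed{a}\slashed{b} = -\slashed{b}\slashed{a} + 2\, a\cdot b$, which also
implies that $\slashed{a}\slashed{a} = a\cdot a$, and using the cyclicity of the trace, we can show that
the second term is
\[ m^2\, \mathrm{Tr}\left( \slashed{k}\, \slashed{\varepsilon}\, \slashed{\varepsilon}\, \slashed{k}\, \slashed{\varepsilon}'\, \slashed{\varepsilon}' \right) = m^2\, (\varepsilon\cdot\varepsilon)(\varepsilon'\cdot\varepsilon')\, \mathrm{Tr}\left( \slashed{k}\slashed{k} \right) = 0 . \tag{181} \]
Thus it only remains to calculate the first term. Anti-commuting the $\slashed{p}\slashed{k}$ in
the middle and using that $\slashed{k}\slashed{k} = k\cdot k = 0$ we have
\[ \mathrm{Tr}\left( \slashed{\varepsilon}'\, \slashed{k}\, \slashed{\varepsilon}\, \slashed{p}\, \slashed{\varepsilon}\, \slashed{k}\, \slashed{\varepsilon}'\, \slashed{p}' \right) = 2(p\cdot k)\, \mathrm{Tr}\left( \slashed{\varepsilon}'\, \slashed{k}\, \slashed{\varepsilon}'\, \slashed{p}' \right) . \tag{182} \]
Then we can anti-commute $\slashed{\varepsilon}'\slashed{k}$ and use that $\varepsilon'\cdot\varepsilon' = -1$ to get
\[ 2(k\cdot p)\, \mathrm{Tr}\left( \left( -\slashed{k}\slashed{\varepsilon}' + 2(\varepsilon'\cdot k) \right) \slashed{\varepsilon}'\, \slashed{p}' \right) = 8(k\cdot p) \left( 2(k\cdot\varepsilon')(\varepsilon'\cdot p') + (k\cdot p') \right) \tag{183} \]
where we in the last step used that $\mathrm{Tr}(\gamma^\mu \gamma^\nu) = 4g^{\mu\nu}$.
Similarly the second term in the full trace can be calculated to be
\[ -8(k\cdot p)(k'\cdot p) + 16(\varepsilon'\cdot\varepsilon)^2 (k\cdot p)(k'\cdot p) - 8(k\cdot\varepsilon')^2 (k'\cdot p) + 8(k'\cdot\varepsilon)^2 (k\cdot p) \tag{184} \]
and the two last terms are given by the first two by the interchange above.
Summing all the traces together there are a lot of cancellations and the final
result is
\[ 8 \left[ \frac{k'\cdot p}{k\cdot p} + \frac{k\cdot p}{k'\cdot p} + 4(\varepsilon'\cdot\varepsilon)^2 - 2 \right] \tag{185} \]
or, using that $k\cdot p = m\omega$ and $k'\cdot p = m\omega'$,
\[ 8 \left[ \frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 4(\varepsilon'\cdot\varepsilon)^2 - 2 \right] . \tag{186} \]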
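Now let us return to the calculation of the probability. It can be written as
\[ P = \left[ \frac{\pi e^2 (2\pi)^4 \delta^4(p' + k' - p - k)}{2V^2\sqrt{2EE'\omega\omega'}} \right]^2 8 \left[ \frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 4(\varepsilon'\cdot\varepsilon)^2 - 2 \right] = VT\, \frac{\pi^2 e^4}{V^4 EE'\omega\omega'}\, (2\pi)^4 \delta^4(p' + k' - p - k) \left[ \frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 4(\varepsilon'\cdot\varepsilon)^2 - 2 \right] . \tag{187} \]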
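Calculating the probability per unit time and summing over inseparable final
states we have
\[ \frac{P}{T} = \int\!\!\int \frac{\pi^2 e^4}{V^3 EE'\omega\omega'} \left[ \frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 4(\varepsilon'\cdot\varepsilon)^2 - 2 \right] (2\pi)^4 \delta^4(p' + k' - p - k)\, \frac{V}{(2\pi)^3} d^3p'\, \frac{V}{(2\pi)^3} d^3k' . \tag{188} \]
We get rid of the space-like part of the delta function if we perform one of
the integrals, say the one over $p'$, which leaves us with
\[ \frac{P}{T} = \int \frac{\pi^2 e^4}{(2\pi)^2 V EE'\omega\omega'} \left[ \frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 4(\varepsilon'\cdot\varepsilon)^2 - 2 \right] \delta(E' + \omega' - E - \omega)\, d^3k' . \tag{189} \]
To get rid of the last delta function we can use that $\omega' = |\mathbf{k}'|$ to rewrite
$d^3k' = (\omega')^2\, d\omega'\, d\Omega_{k'}$. What complicates things slightly is that, since we already
used the space-like delta function, $E' = \sqrt{|\mathbf{p}'|^2 + m^2} = \sqrt{|\mathbf{k} - \mathbf{k}'|^2 + m^2} = \sqrt{\omega^2 + \omega'^2 - 2\omega\omega'\cos\theta + m^2}$
is also a function of $\omega'$.
To perform the integration we have to write
\[ \int \delta(E' + \omega' - E - \omega)\, d\omega' = \int \delta(E' + \omega' - E - \omega)\, \frac{d\omega'}{d(E' + \omega')}\, d(E' + \omega') = \frac{d\omega'}{d(E' + \omega')} . \tag{190} \]
Now
\[ \frac{d(E' + \omega')}{d\omega'} = \frac{\omega' - \omega\cos\theta}{E'} + 1 \tag{191} \]
which, using that $E' = \omega - \omega' + m$, we can write as
\[ \frac{d(E' + \omega')}{d\omega'} = \frac{\omega(1 - \cos\theta) + m}{E'} = \frac{\omega\omega'(1 - \cos\theta) + m\omega'}{E'\omega'} = \frac{k\cdot k' + p\cdot k'}{E'\omega'} = \frac{(k + p)\cdot k'}{E'\omega'} = \frac{p'\cdot k'}{E'\omega'} = \frac{p\cdot k}{E'\omega'} = \frac{E\omega}{E'\omega'} \tag{192} \]
where we have used that $p + k = p' + k'$, which in turn implies (by squaring)
that $p\cdot k = p'\cdot k'$.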
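Finally, to get a number which is independent of how often we throw in
photons (i.e. the cross section) we have to divide by the flux. Since the
photon wave-function is normalized to one photon in the whole universe and
since the speed of the photon is 1 (in natural units), the incoming flux is $\frac{1}{V}$.
This gives us an expression for the cross section as
\[ d\sigma = \frac{\pi^2 e^4}{(2\pi)^2 EE'\omega\omega'} \left[ \frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 4(\varepsilon'\cdot\varepsilon)^2 - 2 \right] \frac{E'\omega'^3}{E\omega}\, d\Omega = \frac{e^4}{4m^2} \left( \frac{\omega'}{\omega} \right)^2 \left[ \frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 4(\varepsilon'\cdot\varepsilon)^2 - 2 \right] d\Omega \tag{193} \]
which is the famous Klein-Nishina formula for the cross section of Compton
scattering.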
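For numerical work, (193) can be combined with the relation between $\omega'$ and $\omega$ that follows from (192), namely $\omega(1 - \cos\theta) + m = m\omega/\omega'$. The sketch below is illustrative: the function names are invented here, and the default $e^2 \approx 1/137.036$ assumes that $e^2$ equals the fine-structure constant in the Gaussian natural units of these notes.

```python
import numpy as np

ALPHA = 1 / 137.036   # assumption: e^2 = alpha in Gaussian natural units

def scattered_omega(omega, theta, m=1.0):
    """omega' from omega(1 - cos theta) + m = m omega / omega' (see eq. (192))."""
    return m * omega / (m + omega * (1 - np.cos(theta)))

def klein_nishina(omega, theta, eps_dot, m=1.0, e2=ALPHA):
    """d(sigma)/d(Omega) of eq. (193); eps_dot is the product eps' . eps."""
    w1 = scattered_omega(omega, theta, m)
    return (e2**2 / (4 * m**2) * (w1 / omega) ** 2
            * (w1 / omega + omega / w1 + 4 * eps_dot**2 - 2))

# Energy conservation check for an electron initially at rest (m = 1 units).
omega, theta, m = 1.0, np.pi / 2, 1.0
w1 = scattered_omega(omega, theta, m)
E_final_a = omega - w1 + m
E_final_b = np.sqrt(omega**2 + w1**2 - 2 * omega * w1 * np.cos(theta) + m**2)

# Low-energy limit: omega' -> omega and the bracket reduces to 4 (eps' . eps)^2.
low_energy = klein_nishina(1e-6, 1.0, 0.8)
thomson = ALPHA**2 / 1.0**2 * 0.8**2
print(w1, E_final_a, E_final_b, low_energy, thomson)
```

In the limit $\omega \ll m$ the formula reduces to $(e^4/m^2)(\varepsilon'\cdot\varepsilon)^2$, the classical Thomson result for polarized photons.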
13 The photon propagator
Remember that in a previous section we derived the electron/positron propagator
which could be written as
$$G_{ab}(x_1-x_2) = \langle 0|T\left[\psi_a(x_1)\bar\psi_b(x_2)\right]|0\rangle \tag{194}$$
As we will see in the next section, there is also a photon propagator corresponding
to a photon propagating between two points $x_1$ and $x_2$. As one can
guess, it can be written as
$$D_{\mu\nu}(x_1-x_2) = \langle 0|T\left[A_\mu(x_1)A_\nu(x_2)\right]|0\rangle \tag{195}$$
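Let us calculate it here so that we can use it when it appears in the next
section. Inserting the expressions for the photon fields we have
$$\begin{aligned}
&\sum_{k_1,\alpha_1}\sum_{k_2,\alpha_2}\frac{4\pi}{2V\sqrt{\omega_1\omega_2}}\,\langle 0|T\left[\left(a_1^\dagger\varepsilon_{1\mu}e^{ik_1\cdot x_1}+a_1\varepsilon_{1\mu}e^{-ik_1\cdot x_1}\right)\left(a_2^\dagger\varepsilon_{2\nu}e^{ik_2\cdot x_2}+a_2\varepsilon_{2\nu}e^{-ik_2\cdot x_2}\right)\right]|0\rangle \\
&= \sum_{k_1,\alpha_1}\sum_{k_2,\alpha_2}\frac{4\pi}{2V\sqrt{\omega_1\omega_2}}\Big[\theta(t_1-t_2)\,\varepsilon_{1\mu}\varepsilon_{2\nu}\,e^{-ik_1\cdot x_1}e^{ik_2\cdot x_2}\,\langle 0|a_1a_2^\dagger|0\rangle
+\theta(t_2-t_1)\,\varepsilon_{2\nu}\varepsilon_{1\mu}\,e^{-ik_2\cdot x_2}e^{ik_1\cdot x_1}\,\langle 0|a_2a_1^\dagger|0\rangle\Big]
\end{aligned} \tag{196}$$
Remembering that
$$\left[a_{k,\alpha},a^\dagger_{k',\alpha'}\right] = \delta_{k,k'}\,\delta_{\alpha,\alpha'},\qquad \alpha = 1,2 \tag{197}$$
we find
$$D_{\mu\nu}(x_1-x_2) = \sum_k\frac{4\pi}{2V\omega}\Big[\theta(t_1-t_2)\left(\varepsilon^{(1)}_\mu\varepsilon^{(1)}_\nu+\varepsilon^{(2)}_\mu\varepsilon^{(2)}_\nu\right)e^{-ik\cdot(x_1-x_2)}
+\theta(t_2-t_1)\left(\varepsilon^{(1)}_\nu\varepsilon^{(1)}_\mu+\varepsilon^{(2)}_\nu\varepsilon^{(2)}_\mu\right)e^{ik\cdot(x_1-x_2)}\Big]. \tag{199}$$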
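This expression does not look too satisfying since the rather messy expression
$\varepsilon^{(1)}_\nu\varepsilon^{(1)}_\mu+\varepsilon^{(2)}_\nu\varepsilon^{(2)}_\mu$ appears. In fact, using the explicit expressions for
the polarization vectors
$$\varepsilon^{(0)}_k = (1,0,0,0),\qquad \varepsilon^{(1)}_k = (0,\bar\varepsilon_1),\qquad \varepsilon^{(2)}_k = (0,\bar\varepsilon_2),\qquad \varepsilon^{(3)}_k = \left(0,\frac{\mathbf{k}}{|\mathbf{k}|}\right) \tag{200}$$
where $\bar\varepsilon_1$ and $\bar\varepsilon_2$ are two three dimensional unit vectors which are orthogonal
both to $\mathbf{k}$ and to each other, one may show that
$$-\varepsilon^{(0)}_\nu\varepsilon^{(0)}_\mu+\varepsilon^{(1)}_\nu\varepsilon^{(1)}_\mu+\varepsilon^{(2)}_\nu\varepsilon^{(2)}_\mu+\varepsilon^{(3)}_\nu\varepsilon^{(3)}_\mu = -g_{\mu\nu}. \tag{201}$$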
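One way to prove it is to note that the left-hand side is a symmetric matrix with
eigenvalues $(-1,1,1,1)$, so it can be diagonalized to $-g_{\mu\nu}$ by an orthogonal
matrix $M$: $M^T(-\varepsilon_0\varepsilon_0^T+\varepsilon_1\varepsilon_1^T+\varepsilon_2\varepsilon_2^T+\varepsilon_3\varepsilon_3^T)M = -g$. Multiplying
on the left with $M$ and on the right with $M^T$ we find that
$(-\varepsilon_0\varepsilon_0^T+\varepsilon_1\varepsilon_1^T+\varepsilon_2\varepsilon_2^T+\varepsilon_3\varepsilon_3^T) = -MgM^T = -g$,
where the last step holds because $M$ only rotates the spatial components, on which $g$ is proportional to the unit matrix.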
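Furthermore we may rewrite the $-\varepsilon^{(0)}_\nu\varepsilon^{(0)}_\mu+\varepsilon^{(3)}_\nu\varepsilon^{(3)}_\mu$ piece using the
four vectors $k^\mu = (\omega,-\mathbf{k})$ and $a^\mu = (\omega,\mathbf{k})$ as
$$-\varepsilon^{(0)}_\nu\varepsilon^{(0)}_\mu+\varepsilon^{(3)}_\nu\varepsilon^{(3)}_\mu = -\frac{1}{2\omega^2}\left(a_\mu k_\nu+k_\mu a_\nu\right). \tag{202}$$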
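The two polarization identities (201) and (202) can be checked numerically. The sketch below, in plain Python, builds the four vectors of (200) for an arbitrarily chosen photon momentum and compares both sides component by component, using the components exactly as they are written in the text. All function names and the sample momentum are illustrative assumptions.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    norm = math.sqrt(sum(c * c for c in u))
    return tuple(c / norm for c in u)

def polarization_vectors(k):
    """Return eps^(0)..eps^(3) of eq. (200) for a spatial momentum k."""
    n = unit(k)
    # Any helper axis not parallel to n yields two transverse unit vectors.
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = unit(cross(n, helper))
    e2 = cross(n, e1)
    return [(1.0, 0.0, 0.0, 0.0), (0.0,) + e1, (0.0,) + e2, (0.0,) + n]

def signed_sum(vectors, signs):
    """Sum of sign * eps_mu * eps_nu over the given vectors."""
    total = [[0.0] * 4 for _ in range(4)]
    for eps, sign in zip(vectors, signs):
        for mu in range(4):
            for nu in range(4):
                total[mu][nu] += sign * eps[mu] * eps[nu]
    return total

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

k = (0.3, -1.2, 2.0)                       # sample photon momentum
eps = polarization_vectors(k)
omega = math.sqrt(sum(c * c for c in k))
k4 = (omega, -k[0], -k[1], -k[2])          # (omega, -k) as in the text
a4 = (omega, k[0], k[1], k[2])             # (omega, +k)

lhs_201 = signed_sum(eps, (-1, 1, 1, 1))
minus_g = [[-1.0 if i == j == 0 else (1.0 if i == j else 0.0)
            for j in range(4)] for i in range(4)]
lhs_202 = signed_sum([eps[0], eps[3]], (-1, 1))
rhs_202 = [[-(a4[mu] * k4[nu] + k4[mu] * a4[nu]) / (2 * omega ** 2)
            for nu in range(4)] for mu in range(4)]
```

Both `max_diff(lhs_201, minus_g)` and `max_diff(lhs_202, rhs_202)` should vanish up to rounding errors, independently of how the two transverse vectors are chosen.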
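We thus find that we may write the propagator
$$D_{\mu\nu}(x_1-x_2) = \sum_k\left[-g_{\mu\nu}+\frac{1}{2\omega^2}\left(a_\mu k_\nu+k_\mu a_\nu\right)\right]\times\frac{4\pi}{2V\omega}\left[\theta(t_1-t_2)\,e^{-ik\cdot(x_1-x_2)}+\theta(t_2-t_1)\,e^{-ik\cdot(x_2-x_1)}\right] \tag{203}$$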
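Again using the fact that for a high density of states we can replace the sum
over $k$ by an integral over the density of states
$$\sum_k = V\int\frac{d^3k}{(2\pi)^3} \tag{204}$$
and again using the rewriting of the energy dependence as
$$-\frac{e^{-i\omega|t|}}{2\omega} = \frac{1}{2\pi i}\int_{-\infty}^{\infty}dk_0\,\frac{e^{-ik_0t}}{k_0^2-|\mathbf{k}|^2+i\varepsilon} \tag{205}$$
we can write the propagator as
$$\begin{aligned}
D_{\mu\nu}(x_1-x_2) = \int\frac{d^3k}{(2\pi)^3}\,4\pi\left[g_{\mu\nu}-\frac{1}{2\omega^2}\left(a_\mu k_\nu+k_\mu a_\nu\right)\right]\times
\Big[&\theta(t_1-t_2)\,\frac{1}{2\pi i}\int\frac{dk_0}{k^2+i\varepsilon}\,e^{ik_0(t_1-t_2)}\,e^{i\mathbf{k}\cdot(\mathbf{x}_1-\mathbf{x}_2)} \\
+\,&\theta(t_2-t_1)\,\frac{1}{2\pi i}\int\frac{dk_0}{k^2+i\varepsilon}\,e^{ik_0(t_1-t_2)}\,e^{-i\mathbf{k}\cdot(\mathbf{x}_1-\mathbf{x}_2)}\Big]
\end{aligned} \tag{206}$$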
and changing the sign of the integration variable $\mathbf{k}$ in the second term, we
see that both the term for $t_1 > t_2$ and the term for $t_2 > t_1$ have exactly the
same form, just as for the electron/positron propagator, so we can write it
in a compact form
$$D_{\mu\nu}(x_1-x_2) = -i4\pi\int\frac{d^4k}{(2\pi)^4}\left[g_{\mu\nu}-\frac{1}{2\omega^2}\left(a_\mu k_\nu+k_\mu a_\nu\right)\right]\frac{e^{-ik\cdot(x_1-x_2)}}{k^2+i\varepsilon}. \tag{207}$$
This is very nice except for the “ak + ka” piece which ruins the covariant
form of the propagator. However, the fact that this piece contains explicit
factors of $k_\mu$, which is the same four momentum as the momentum which
flows in the propagator, saves us. To see this, let us consider a part of a
Feynman diagram in which an electron line with momentum $p$ absorbs the photon momentum $k$.
The electron line corresponds to an expression (up to normalization constants)
$$\bar u(p+k)\,\gamma^\mu\,u(p). \tag{208}$$
To the $\gamma^\mu$ is connected the momentum space propagator $D_{\mu\nu}(k)$. The part of
$D_{\mu\nu}$ proportional to $k_\mu$ together with (208) gives a contribution proportional
to
$$\bar u(p+k)\,\not{k}\,u(p) = \bar u(p+k)\left(\not{k}+\not{p}-\not{p}\right)u(p) = \{\text{Dirac equation}\} = \bar u(p+k)\,(m-m)\,u(p) = 0. \tag{209}$$
Therefore the piece proportional to kµaν does not give any contribution to
physical processes and can consequently be dropped. Similarly, the piece
proportional to aµkν cancels in the other end of the propagator. The full
proof of this fact is a little bit involved. One needs to check that it is true
also when the photon propagator ends on an internal electron propagator
and not on an external line as in the simple example above. If you believe
me for now that this is true we can write the photon propagator simply as
$$D_{\mu\nu}(x_1-x_2) = -i4\pi g_{\mu\nu}\int\frac{d^4k}{(2\pi)^4}\,\frac{e^{-ik\cdot(x_1-x_2)}}{k^2+i\varepsilon}. \tag{210}$$
14 Electron-electron scattering
Let us now consider scattering of two electrons. Let us assume that initially
they have momenta and spins $p_1, s_1$ and $p_2, s_2$. They are scattered into
electrons with momenta and spins $p_3, s_3$ and $p_4, s_4$. We therefore take as the
initial state
$$|i\rangle = b_1^\dagger b_2^\dagger|0\rangle \tag{211}$$
and as the final state we take
$$|f\rangle = b_4^\dagger b_3^\dagger|0\rangle \tag{212}$$
The probability amplitude is given by the usual expression
$$M = \langle f|\,T\,e^{-i\int H_I}\,|i\rangle \tag{213}$$
and the ﬁrst non-trivial term is the term which is second order in HI. Since
there are 4 b operators in the initial and ﬁnal states the only combination
which will give anything is when we choose the b operators from the 4 ψ
ﬁelds. The expression is therefore
0|b3b4b†
b b †
b b†
1b†
2|0 =
−δ1,p δ2,p δ3,p δ4,p + δ1,p δ2,p δ3,pδ4,p
+δ1,p δ2,p δ3,p δ4,p − δ1,p δ2,p δ3,pδ4,p
+ non−connected pieces (214)
where we have used a shorthand notation in that each delta function also
comes with a corresponding delta for the spin dependence, so that $\delta_{1,p}$ really
means $\delta_{1,p}\,\delta_{\alpha_1,\alpha}$. These terms can be graphically represented as four diagrams.
Again using the trick of changing the integration variables $x_1$ and $x_2$ we see
that the two last diagrams are equal to the two first diagrams, which leaves
us with only two expressions to be calculated. The one corresponding to the
first diagram we can write as
$$\frac{e^2}{4V^2\sqrt{E_1E_2E_3E_4}}\,(\bar u_4\gamma^\mu u_2)(\bar u_3\gamma^\nu u_1)\int d^4x_1\int d^4x_2\;e^{i(p_4-p_2)\cdot x_1}\,e^{i(p_3-p_1)\cdot x_2}\,\langle 0|T\left[A_\mu(x_1)A_\nu(x_2)\right]|0\rangle \tag{215}$$
where in the last expression we recognize the photon propagator $D_{\mu\nu}(x_1-x_2)$
which we calculated in the previous section. Inserting the expression we
obtained we get
$$\frac{e^2}{4V^2\sqrt{E_1E_2E_3E_4}}\,(\bar u_4\gamma^\mu u_2)(\bar u_3\gamma^\nu u_1)\int d^4x_1\int d^4x_2\int\frac{d^4k}{(2\pi)^4}\;e^{i(p_4-p_2-k)\cdot x_1}\,e^{i(p_3-p_1+k)\cdot x_2}\,\frac{4\pi g_{\mu\nu}}{i(k^2+i\varepsilon)} \tag{216}$$
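and performing the $x_1$ and $x_2$ integrals we get
$$\frac{e^2}{4V^2\sqrt{E_1E_2E_3E_4}}\,(\bar u_4\gamma^\mu u_2)(\bar u_3\gamma^\nu u_1)\int\frac{d^4k}{(2\pi)^4}\,(2\pi)^4\delta(p_4-p_2-k)\,(2\pi)^4\delta(p_3-p_1+k)\,\frac{4\pi g_{\mu\nu}}{i(k^2+i\varepsilon)} \tag{217}$$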
We can get rid of one of the delta functions by doing the $k$ integral
$$\frac{e^2}{4V^2\sqrt{E_1E_2E_3E_4}}\,(\bar u_4\gamma^\mu u_2)(\bar u_3\gamma^\nu u_1)\,(2\pi)^4\delta(p_1+p_2-p_3-p_4)\,\frac{4\pi g_{\mu\nu}}{i((p_1-p_3)^2+i\varepsilon)} \tag{218}$$
Notice that the remaining delta function expresses the total conservation of
momentum. The second diagram can be easily calculated when we realize
that the only thing that differs between the second and the first diagram is
that we have to switch 3 and 4 and also the overall sign. Altogether we therefore
have the probability amplitude
$$M = \frac{e^2(2\pi)^4\delta(p_1+p_2-p_3-p_4)}{4V^2\sqrt{E_1E_2E_3E_4}}\left[(\bar u_4\gamma^\mu u_2)\,\frac{4\pi g_{\mu\nu}}{i((p_1-p_3)^2+i\varepsilon)}\,(\bar u_3\gamma^\nu u_1)-(\bar u_3\gamma^\mu u_2)\,\frac{4\pi g_{\mu\nu}}{i((p_1-p_4)^2+i\varepsilon)}\,(\bar u_4\gamma^\nu u_1)\right] \tag{219}$$
If we assume that the incoming electrons are unpolarized, so that we have to
average over the incoming spins, and that we do not observe the spin of the
outgoing electrons, so that we have to sum over the probabilities of observing
different outgoing spins, we have to include a sum over
$$\frac{1}{2}\sum_{s_1}\frac{1}{2}\sum_{s_2}\sum_{s_3}\sum_{s_4} \tag{220}$$
Introducing the notation
$$t = (p_1-p_3)^2 = (p_4-p_2)^2,\qquad u = (p_1-p_4)^2 = (p_3-p_2)^2 \tag{221}$$
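we can write the total probability as
$$P = T\,\frac{e^4(2\pi)^6\delta(p_1+p_2-p_3-p_4)}{4V^3E_1E_2E_3E_4}\;\frac{1}{4}\sum_{s_1,s_2,s_3,s_4}\left|\frac{(\bar u_4\gamma_\mu u_2)(\bar u_3\gamma^\mu u_1)}{t}-\frac{(\bar u_3\gamma_\mu u_2)(\bar u_4\gamma^\mu u_1)}{u}\right|^2 \tag{222}$$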
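which, using the expressions for the spin sums, can be written as
$$\begin{aligned}
P = T\,\frac{e^4(2\pi)^6\delta(p_1+p_2-p_3-p_4)}{4V^3E_1E_2E_3E_4}\;\frac{1}{4}\Bigg[
&\frac{\mathrm{Tr}\big((\not{p}_4+m)\gamma_\mu(\not{p}_2+m)\gamma_\nu\big)\,\mathrm{Tr}\big((\not{p}_3+m)\gamma^\mu(\not{p}_1+m)\gamma^\nu\big)}{t^2} \\
+&\frac{\mathrm{Tr}\big((\not{p}_3+m)\gamma_\mu(\not{p}_2+m)\gamma_\nu\big)\,\mathrm{Tr}\big((\not{p}_4+m)\gamma^\mu(\not{p}_1+m)\gamma^\nu\big)}{u^2} \\
-&\frac{\mathrm{Tr}\big((\not{p}_4+m)\gamma_\mu(\not{p}_2+m)\gamma_\nu(\not{p}_3+m)\gamma^\mu(\not{p}_1+m)\gamma^\nu\big)}{tu} \\
-&\frac{\mathrm{Tr}\big((\not{p}_3+m)\gamma_\mu(\not{p}_2+m)\gamma_\nu(\not{p}_4+m)\gamma^\mu(\not{p}_1+m)\gamma^\nu\big)}{tu}\Bigg]
\end{aligned} \tag{223}$$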
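Let us explicitly calculate the first trace and leave the other ones as an
exercise. To do this we observe that
$$\mathrm{Tr}\big((\not{p}+m)\gamma_\mu(\not{q}+m)\gamma_\nu\big) = \mathrm{Tr}(\not{p}\gamma_\mu\not{q}\gamma_\nu)+m^2\,\mathrm{Tr}(\gamma_\mu\gamma_\nu) = 4\left[p_\mu q_\nu-(p\cdot q)g_{\mu\nu}+p_\nu q_\mu+m^2g_{\mu\nu}\right] \tag{224}$$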
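The trace identity (224) can be verified numerically with explicit gamma matrices. The following Python sketch is only an illustration: it assumes the Dirac representation and the metric diag(1, −1, −1, −1), works with upper indices throughout, and all helper names and sample momenta are made up for this example. The identity does not require the momenta to be on shell.

```python
# 2x2 building blocks
I2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
METRIC = [1, -1, -1, -1]  # diagonal of g

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# gamma^0 and gamma^1..3 in the Dirac representation (an assumed choice)
GAMMA = [block(I2, ZERO2, ZERO2, scale(-1, I2))]
GAMMA += [block(ZERO2, s, scale(-1, s), ZERO2) for s in SIGMA]
IDENTITY4 = block(I2, ZERO2, ZERO2, I2)

def slash(p):
    """p-slash = gamma^mu p_mu for contravariant components p^mu."""
    out = scale(0, IDENTITY4)
    for mu in range(4):
        out = add(out, scale(METRIC[mu] * p[mu], GAMMA[mu]))
    return out

def dot(p, q):
    return sum(METRIC[mu] * p[mu] * q[mu] for mu in range(4))

def trace_lhs(p, q, m, mu, nu):
    a = add(slash(p), scale(m, IDENTITY4))
    b = add(slash(q), scale(m, IDENTITY4))
    return trace(matmul(matmul(a, GAMMA[mu]), matmul(b, GAMMA[nu])))

def trace_rhs(p, q, m, mu, nu):
    g = METRIC[mu] if mu == nu else 0
    return 4 * (p[mu] * q[nu] + p[nu] * q[mu] + (m * m - dot(p, q)) * g)

p = (2.0, 0.3, -0.5, 1.1)   # sample four-vectors
q = (1.7, -0.2, 0.4, 0.9)
max_error = max(abs(trace_lhs(p, q, 0.7, mu, nu) - trace_rhs(p, q, 0.7, mu, nu))
                for mu in range(4) for nu in range(4))
```

Here `max_error` collects the largest deviation between the two sides over all sixteen index combinations; it should be zero up to rounding.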
giving the result for the first term
$$16\Big[2(p_1\cdot p_2)(p_3\cdot p_4)+2(p_1\cdot p_4)(p_2\cdot p_3)+2(m^2-p_2\cdot p_4)(p_1\cdot p_3)+2(m^2-p_1\cdot p_3)(p_2\cdot p_4)+4(m^2-p_1\cdot p_3)(m^2-p_2\cdot p_4)\Big] \tag{225}$$
and using that the relation $p_1+p_2 = p_3+p_4$ implies
$$p_1\cdot p_2 = p_3\cdot p_4,\qquad p_1\cdot p_3 = p_2\cdot p_4, \tag{226}$$
$$p_1\cdot p_4 = p_2\cdot p_3 \tag{227}$$
we can write it as
$$32\left[(p_1\cdot p_2)^2+(p_1\cdot p_4)^2+2m^2(m^2-p_1\cdot p_3)\right] \tag{228}$$
Computing also the remaining traces we get the full answer
$$P = T\,\frac{e^4(2\pi)^6\delta(p_1+p_2-p_3-p_4)}{4V^3E_1E_2E_3E_4}\;8\left[\frac{(p_1\cdot p_2)^2+(p_1\cdot p_4)^2+2m^2(m^2-p_1\cdot p_3)}{t^2}+\frac{(p_1\cdot p_2)^2+(p_1\cdot p_3)^2+2m^2(m^2-p_1\cdot p_4)}{u^2}-\frac{2(p_1\cdot p_2)(2m^2-p_1\cdot p_2)}{tu}\right] \tag{229}$$
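Let us choose center of mass coordinates such that
$$p_1 = (E,\mathbf{p}),\qquad p_2 = (E,-\mathbf{p}),\qquad p_3 = (E,\mathbf{p}'),\qquad p_4 = (E,-\mathbf{p}') \tag{230}$$
and that $\mathbf{p}\cdot\mathbf{p}' = |\mathbf{p}|^2\cos\theta$. Then we see that we can write
$$p_1\cdot p_2 = m^2+2|\mathbf{p}|^2,\qquad p_1\cdot p_3 = m^2+2|\mathbf{p}|^2\sin^2\frac{\theta}{2},\qquad p_1\cdot p_4 = m^2+2|\mathbf{p}|^2\cos^2\frac{\theta}{2} \tag{231}$$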
To calculate the cross section we need to sum over the probabilities of observing
final states which are close to each other in momentum space. This
we do by including the factors
$$\frac{V}{(2\pi)^3}\,d^3p_3\;\frac{V}{(2\pi)^3}\,d^3p_4 \tag{232}$$
and dividing by the incoming flux $(v_1+v_2)/V$, where $v_1$ and $v_2$ are the speeds of
particles 1 and 2 respectively. In the center of mass system the speeds of the
two particles are equal and can be expressed as $|\mathbf{p}|/E$, so that the flux is $2|\mathbf{p}|/(EV)$.
The integrals over the momenta can be taken care of in the usual way
$$\delta^4(p_1+p_2-p_3-p_4)\,d^3p_4\,d^3p_3 = \delta(E_1+E_2-E_3-E_4)\,|\mathbf{p}_3|^2\,d|\mathbf{p}_3|\,d\Omega \tag{233}$$
and since in the center of mass system we have that $E_4 = E_3 = \sqrt{|\mathbf{p}_3|^2+m^2}$
we can write
$$\delta(E_1+E_2-2E_3)\,|\mathbf{p}_3|^2\,\frac{d|\mathbf{p}_3|}{d(2E_3)}\,d(2E_3)\,d\Omega = \frac{|\mathbf{p}_3|E_3}{2}\,d\Omega \tag{234}$$
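Putting everything together we have
$$\begin{aligned}
d\sigma &= \frac{e^4(2\pi)^6}{4V^3E^4}\,\frac{V^2}{(2\pi)^6}\,\frac{EV}{2|\mathbf{p}|}\,\frac{|\mathbf{p}|E}{2}\;8\left[\frac{(p_1\cdot p_2)^2+(p_1\cdot p_4)^2+2m^2(m^2-p_1\cdot p_3)}{t^2}+\frac{(p_1\cdot p_2)^2+(p_1\cdot p_3)^2+2m^2(m^2-p_1\cdot p_4)}{u^2}-\frac{2(p_1\cdot p_2)(2m^2-p_1\cdot p_2)}{tu}\right]d\Omega \\
&= \frac{e^4}{2E^2}\left[\frac{(p_1\cdot p_2)^2+(p_1\cdot p_4)^2+2m^2(m^2-p_1\cdot p_3)}{t^2}+\frac{(p_1\cdot p_2)^2+(p_1\cdot p_3)^2+2m^2(m^2-p_1\cdot p_4)}{u^2}-\frac{2(p_1\cdot p_2)(2m^2-p_1\cdot p_2)}{tu}\right]d\Omega
\end{aligned} \tag{235}$$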
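This cross section simplifies in the non-relativistic and the ultra-relativistic
cases. In the non-relativistic case we have that $|\mathbf{p}| \ll m$ so that to lowest
order
$$p_1\cdot p_2 \approx m^2,\qquad p_1\cdot p_3 \approx m^2,\qquad p_1\cdot p_4 \approx m^2 \tag{236}$$
$$t = (p_1-p_3)^2 = 2m^2-2\left(m^2+2|\mathbf{p}|^2\sin^2\frac{\theta}{2}\right) = -4|\mathbf{p}|^2\sin^2\frac{\theta}{2},\qquad
u = 2m^2-2\left(m^2+2|\mathbf{p}|^2\cos^2\frac{\theta}{2}\right) = -4|\mathbf{p}|^2\cos^2\frac{\theta}{2} \tag{237}$$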
which gives the expression
$$d\sigma = \frac{e^4}{m^2v^4}\left[\frac{1}{\sin^4\frac{\theta}{2}}+\frac{1}{\cos^4\frac{\theta}{2}}-\frac{1}{\sin^2\frac{\theta}{2}\cos^2\frac{\theta}{2}}\right]d\Omega \tag{238}$$
where $v$ is the velocity $v = |\mathbf{p}|/m$. Notice that the first term gives exactly the
Rutherford cross-section. The two additional terms are of quantum mechanical
origin and come about because the particles that scatter are quantum
mechanically identical. This means that the cross section has to be invariant
under $\theta \to \pi-\theta$. The second term alone would be enough to achieve
that. The third term is there, however, because scattering of identical Fermi
particles is very much suppressed at $\theta = \pi/2$. This is essentially an effect
of the Pauli principle which tells us that the total wave function has to be
anti-symmetric.
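The statements about the angular dependence can be explored numerically. The Python sketch below evaluates the full center-of-mass expression (235) with the kinematics (231) and (237), and compares its shape with the angular factor of (238) and with the ultra-relativistic form (239). The function names are illustrative, e² is a caller-supplied number, and only ratios of cross sections are compared in the non-relativistic case, so the overall prefactor of (238) is not tested.

```python
import math

def moller_full(p, m, theta, e_squared=1.0):
    """d(sigma)/d(Omega) of eq. (235); p is |p| in the center of mass frame."""
    s2 = math.sin(theta / 2) ** 2
    c2 = math.cos(theta / 2) ** 2
    energy_sq = p * p + m * m
    p1p2 = m * m + 2 * p * p            # eq. (231)
    p1p3 = m * m + 2 * p * p * s2
    p1p4 = m * m + 2 * p * p * c2
    t = -4 * p * p * s2                 # eq. (237), exact in this frame
    u = -4 * p * p * c2
    bracket = ((p1p2 ** 2 + p1p4 ** 2 + 2 * m * m * (m * m - p1p3)) / t ** 2
               + (p1p2 ** 2 + p1p3 ** 2 + 2 * m * m * (m * m - p1p4)) / u ** 2
               - 2 * p1p2 * (2 * m * m - p1p2) / (t * u))
    return e_squared ** 2 / (2 * energy_sq) * bracket

def nonrelativistic_angular_factor(theta):
    """Square bracket of eq. (238)."""
    s2 = math.sin(theta / 2) ** 2
    c2 = math.cos(theta / 2) ** 2
    return 1 / s2 ** 2 + 1 / c2 ** 2 - 1 / (s2 * c2)

def moller_ultrarelativistic(energy, theta, e_squared=1.0):
    """Eq. (239), valid for |p| >> m."""
    s2 = math.sin(theta / 2) ** 2
    c2 = math.cos(theta / 2) ** 2
    bracket = ((1 + s2 ** 2) / c2 ** 2 + (1 + c2 ** 2) / s2 ** 2
               + 2 / (s2 * c2))
    return e_squared ** 2 / (8 * energy ** 2) * bracket
```

At θ = π/2 the angular factor of (238) equals 4 + 4 − 4 = 4, half of what the first two terms alone would give, which is the suppression attributed to the Pauli principle in the text.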
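In the ultra-relativistic limit, $|\mathbf{p}| \gg m$, the cross section can similarly be
written in a simple form (Møller)
$$d\sigma = \frac{e^4}{8E^2}\left[\frac{1+\sin^4\frac{\theta}{2}}{\cos^4\frac{\theta}{2}}+\frac{1+\cos^4\frac{\theta}{2}}{\sin^4\frac{\theta}{2}}+\frac{2}{\sin^2\frac{\theta}{2}\cos^2\frac{\theta}{2}}\right]d\Omega \tag{239}$$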
15 Feynman rules, higher order processes
Feynman graphs are very helpful to get an overview of the various contributions
to probability amplitudes in higher orders. Basically, in n-th order
one has n vertices, and from the nature of the electromagnetic interaction
Hamiltonian it follows that at each vertex two fermion lines and one photon line
meet. There are a few general rules for composing transition amplitudes
at a certain order.
1. Draw all connected graphs with a given number of vertices.
2. Add a factor $e\gamma^\mu$ to every vertex and integrate over all of space.
3. Take a propagator $G(x-y)$, or $D_{\mu\nu}(x-y)$, respectively, for each fermion
or photon line between the vertices at x and y.
4. Take free wave functions for external lines.
5. The exchange of any two fermions (in the construction of a graph from
another one) gives a minus sign, as does every closed fermion loop.
6. In the case of n ingoing positrons there is a relative factor $(-1)^n$ in
comparison to n ingoing electrons.
As an example we take fourth-order diagrams for electron-positron
scattering. Altogether there are 17 connected graphs; here we display five of
them, labeled (a) to (e).
Their meaning is obviously the following:
(a) Exchange of two photons,
(b) annihilation, followed by creation of a new pair,
(c) annihilation and creation, scattering of the two outgoing particles,
(d) annihilation and creation, emission and absorption of a photon by one
of the outgoing electrons,
(e) repeated annihilation and creation.
The amplitudes for processes (a) and (b) can be calculated according to the rules
above; the calculation is lengthy but straightforward. In the cases (c), (d)
and (e), however, new problems of a serious nature arise: one encounters
divergent integrals, which require new technical tools and an interpretation, the
renormalization paradigm. In the following we study the renormalization
procedure according to Pauli-Villars.
16 The vacuum polarization
Let’s begin with diagram (e). Here, in relation to the corresponding second-order
diagram, a “fermion loop” is inserted into the photon propagator. Denote
the ingoing electron’s momentum by $p$ and the ingoing positron’s momentum
by $p'$. The photon four-momentum is $p+p'$, which is equal
to the sum of the outgoing four-momenta. If the fermion four-momentum in
one arch of the loop is $k$, then in the other arch it is $p+p'-k$, where $k$ is
arbitrary. For a given momentum $p+p' =: q$ the photon propagator $\frac{g_{\mu\nu}}{q^2+i\varepsilon}$ is
replaced by
$$\frac{1}{q^2+i\varepsilon}\,I_{\mu\nu}(q,m)\,\frac{1}{q^2+i\varepsilon}, \tag{240}$$
where $I_{\mu\nu}$ contains a double fermion propagator
$$I_{\mu\nu}(q,m) = -e^2\int\frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\left[\gamma_\mu\,\frac{1}{\not{k}-m+i\varepsilon}\,\gamma_\nu\,\frac{1}{\not{k}-\not{q}-m+i\varepsilon}\right]. \tag{241}$$
(The trace appears because of the closed loop.) For large values of $k$ the integrand
goes asymptotically like $k^{-2}$, so the momentum space integral $I_{\mu\nu}(q)$
diverges quadratically, as arbitrarily large momenta may circulate in the loop.
The resulting divergence is called an “ultraviolet catastrophe”. Technically
a “cut-off” of high frequencies brings a remedy, but this means a “change
of the rules in the middle of the game” and needs a physical justification.
Before modifying the theory in such a way it is convenient to carry out some
formal transformations that reformulate the divergent integral.
First we write the propagator in the form of an integral
$$\frac{1}{\not{k}-m+i\varepsilon} = \frac{\not{k}+m}{k^2-m^2+i\varepsilon} = -i(\not{k}+m)\int_0^\infty dz\;e^{iz(k^2-m^2+i\varepsilon)}. \tag{242}$$
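Insertion gives
$$\begin{aligned}
I_{\mu\nu}(q,m) = -4ie^2\int_0^\infty dz_1\int_0^\infty dz_2\int\frac{d^4k}{(2\pi)^4}\;&\left[k_\mu(k-q)_\nu+k_\nu(k-q)_\mu-g_{\mu\nu}(k^2-k\cdot q-m^2)\right] \\
\times\;&\exp\left\{iz_1\left(k^2-m^2+i\varepsilon\right)+iz_2\left((k-q)^2-m^2+i\varepsilon\right)\right\}
\end{aligned} \tag{243}$$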
To carry out the $k$-integral, the part of the exponent containing $k$ and $q$ is
rearranged in the form of a complete square,
$$\exp\left\{i(z_1+z_2)\left(k-\frac{z_2\,q}{z_1+z_2}\right)^2-i\,\frac{(z_2\,q)^2}{z_1+z_2}+iz_2\,q^2\right\}.$$
With the definition
$$l := k-\frac{z_2\,q}{z_1+z_2} = k-q+\frac{z_1\,q}{z_1+z_2}$$
this becomes
$$e^{il^2(z_1+z_2)}\;e^{\,i\frac{z_1z_2q^2}{z_1+z_2}},$$
so that the $k$ integral turns into three types of Gaussian integrals, namely
$$\int\frac{d^4l}{(2\pi)^4}\begin{pmatrix}1\\ l_\mu\\ l_\mu l_\nu\end{pmatrix}e^{il^2(z_1+z_2)} = \frac{1}{16\pi^2i(z_1+z_2)^2}\begin{pmatrix}1\\ 0\\ \dfrac{ig_{\mu\nu}}{2(z_1+z_2)}\end{pmatrix}.$$
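Now
$$\begin{aligned}
I_{\mu\nu}(q,m) = \frac{\alpha}{\pi}\int_0^\infty dz_1\int_0^\infty\frac{dz_2}{(z_1+z_2)^2}\;&e^{\,i\left[\frac{q^2z_1z_2}{z_1+z_2}-(m^2-i\varepsilon)(z_1+z_2)\right]} \\
\times\;&\left\{2\left(g_{\mu\nu}q^2-q_\mu q_\nu\right)\frac{z_1z_2}{(z_1+z_2)^2}+g_{\mu\nu}\left[\frac{-i}{z_1+z_2}-\frac{z_1z_2\,q^2}{(z_1+z_2)^2}+m^2\right]\right\}
\end{aligned} \tag{244}$$
($\alpha = \frac{e^2}{4\pi}$ is the fine structure constant.)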
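The part of the integrand in square brackets can be shown to contribute
nothing. For this purpose we rescale the $z$’s,
$$z_i \to \lambda z_i.$$
Then this part becomes
$$\begin{aligned}
&\int_0^\infty\!\!\int_0^\infty\frac{dz_1\,dz_2}{(z_1+z_2)^2}\left[m^2-\frac{i}{\lambda(z_1+z_2)}-\frac{z_1z_2\,q^2}{(z_1+z_2)^2}\right]e^{\,i\lambda\left[\frac{z_1z_2q^2}{z_1+z_2}-(m^2-i\varepsilon)(z_1+z_2)\right]} \\
&= i\lambda\,\frac{\partial}{\partial\lambda}\int_0^\infty\!\!\int_0^\infty\frac{dz_1\,dz_2}{\lambda(z_1+z_2)^3}\;e^{\,i\lambda\left[\frac{z_1z_2q^2}{z_1+z_2}-(m^2-i\varepsilon)(z_1+z_2)\right]}.
\end{aligned}$$
Undoing the scaling transformation,
$$\lambda z_i \to z_i,$$
makes the integral $\lambda$-independent, so the derivative is zero.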
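One further transformation is done by inserting the identity
$$1 = \int_0^\infty\frac{d\lambda}{\lambda}\;\delta\!\left(1-\frac{z_1+z_2}{\lambda}\right), \tag{245}$$
leading to the form
$$\begin{aligned}
I_{\mu\nu}(q,m) &= \frac{2i\alpha}{\pi}\left(q_\mu q_\nu-g_{\mu\nu}q^2\right)\int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty\frac{d\lambda\,dz_1\,dz_2\;z_1z_2}{\lambda\,(z_1+z_2)^4}\;\delta\!\left(1-\frac{z_1+z_2}{\lambda}\right)e^{\,i\left[\frac{z_1z_2q^2}{z_1+z_2}-(m^2-i\varepsilon)(z_1+z_2)\right]} \\
&= \frac{2i\alpha}{\pi}\left(q_\mu q_\nu-g_{\mu\nu}q^2\right)\int_0^\infty\!\!\int_0^\infty dz_1\,dz_2\;z_1z_2\;\delta(1-z_1-z_2)\int_0^\infty\frac{d\lambda}{\lambda}\;e^{\,i\lambda\left(z_1z_2q^2-m^2+i\varepsilon\right)}.
\end{aligned} \tag{246}$$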
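In the last line the $z_i$ were again multiplied by $\lambda$. Evaluating the $\delta$-function (so that $z_1 = z$ and $z_2 = 1-z$) we
finally find
$$I_{\mu\nu}(q,m) = \frac{2i\alpha}{\pi}\left(q_\mu q_\nu-g_{\mu\nu}q^2\right)\int_0^1dz\;z(1-z)\int_0^\infty\frac{d\lambda}{\lambda}\;e^{\,i\lambda\left(z(1-z)q^2-m^2+i\varepsilon\right)}, \tag{247}$$
where the original asymptotic divergence of the $k$-integral was replaced by a
logarithmic divergence of the $\lambda$-integral.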
This is the divergence we want to cut off in the regularization procedure.
It is removed by subtracting an analogous expression $I_{\mu\nu}(q,M)$ with a fictitious
large mass $M$ and the same behavior close to $\lambda = 0$, i.e. we consider
$$\int_0^\infty\frac{d\lambda}{\lambda}\left[e^{\,i\lambda\left(z(1-z)q^2-m^2+i\varepsilon\right)}-e^{\,i\lambda\left(z(1-z)q^2-M^2+i\varepsilon\right)}\right].$$
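Integrals of this type can be calculated by introducing a further integration,
$$\int_0^\infty\frac{d\lambda}{\lambda}\left(e^{-a\lambda}-e^{-b\lambda}\right) = \int_0^\infty d\lambda\int_a^b dx\;e^{-\lambda x} = \int_a^b\frac{dx}{x} = \ln\frac{b}{a}. \tag{248}$$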
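The integral identity (248) is easy to check numerically for real, positive a and b. The following Python sketch uses a composite Simpson rule on a truncated interval; the function name, the cutoff `upper` and the number of steps are arbitrary choices for this illustration. In the application above the exponents are complex, so this is only a check of the real-variable formula.

```python
import math

def frullani_numeric(a, b, upper=60.0, steps=60000):
    """Simpson estimate of the integral of (exp(-a x) - exp(-b x)) / x
    over 0 <= x <= upper (steps must be even)."""
    def integrand(x):
        if x == 0.0:
            return b - a  # limit of the integrand for x -> 0
        return (math.exp(-a * x) - math.exp(-b * x)) / x

    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3
```

For example, `frullani_numeric(1.0, 3.0)` can be compared with `math.log(3.0)`. The integrand is finite at the lower end, which reflects the fact that the two logarithmically divergent pieces cancel in the difference.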
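For large $M$ the regularized expression for the fermion loop is approximately
$$\begin{aligned}
\bar I_{\mu\nu}(q) = I_{\mu\nu}(q,m)-I_{\mu\nu}(q,M) &\approx \frac{2i\alpha}{\pi}\left(q_\mu q_\nu-g_{\mu\nu}q^2\right)\int_0^1dz\;z(1-z)\,\ln\frac{M^2}{m^2-q^2z(1-z)} \\
&= \frac{i\alpha}{3\pi}\left(q_\mu q_\nu-g_{\mu\nu}q^2\right)\left\{\ln\frac{M^2}{m^2}-6\int_0^1dz\;z(1-z)\,\ln\left[1-\frac{q^2}{m^2}\,z(1-z)\right]\right\}.
\end{aligned} \tag{249}$$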
When $\bar I_{\mu\nu}$ is inserted into the photon propagator, the part with $q_\mu q_\nu$ does
not contribute to the amplitude for the same reasons as in the unrenormalized
case. Taking the sum of the second-order and the fourth-order terms and
neglecting $i\varepsilon$ in the denominator, we get, to first order in $\alpha$, for the photon
propagator
$$D_{\mu\nu} \to \frac{ig_{\mu\nu}}{q^2}-i\,\frac{1}{q^2}\,\bar I_{\mu\nu}\,\frac{1}{q^2}, \tag{250}$$
explicitly
$$\frac{ig_{\mu\nu}}{q^2}\left\{1-\frac{\alpha}{3\pi}\ln\frac{M^2}{m^2}+\frac{2\alpha}{\pi}\int_0^1dz\;z(1-z)\,\ln\left[1-\frac{q^2z(1-z)}{m^2-i\varepsilon}\right]\right\}. \tag{251}$$
In the limit $q^2 \to 0$ the renormalization amounts merely to a multiplication
of the propagator by the factor
$$Z_3 = 1-\frac{\alpha}{3\pi}\ln\frac{M^2}{m^2}. \tag{252}$$
For a physical interpretation consider Coulomb scattering with small momentum
transfer: the lowest-order expression $e^2\,\bar u\gamma_0u/q^2$ is, to first order in
$\alpha$, replaced by
$$\frac{e^2\,\bar u\gamma_0u}{q^2}\left(1-\frac{\alpha}{3\pi}\ln\frac{M^2}{m^2}\right) =: e_R^2\,\frac{\bar u\gamma_0u}{q^2}. \tag{253}$$
$e_R = \sqrt{Z_3}\,e$ is called the renormalized charge of the electron.
$e_R$ is the observed charge, which is measured as $e_R^2 = \frac{4\pi}{137}$ in natural units.
The parameter e in the Dirac equation would be the unobservable charge,
if the electromagnetic interaction could be switched oﬀ. Accordingly e is
called the “bare charge”, and eR is called the dressed charge. The interpretation
of the diﬀerence between eR and e is that a charge is surrounded by a
cloud of virtual photons, which, in turn, create short-lived electron-positron
pairs, visualized by fermion loops in Feynman diagrams. As eﬀective dipoles,
these electron-positron pairs partially screen oﬀ the bare charge from distant
observers, so that it appears smaller.
The renormalized photon propagator (251) splits into two parts: the limit
$q^2 \to 0$, which contains the cutoff parameter $M$ and describes the static vacuum
polarization, and a $q$-dependent part, which is physically meaningful as a
first-order correction in $\alpha$. Fermion loops of the considered type are sometimes
called “vacuum bubbles”.
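As an example of a first-order correction consider once more the Coulomb
scattering amplitude:
$$\frac{ie^2\,\bar u\gamma_0u}{q^2}\left(1-\frac{\alpha}{3\pi}\ln\frac{M^2}{m^2}-\frac{\alpha}{15\pi}\,\frac{q^2}{m^2}\right) \approx \frac{ie_R^2\,\bar u\gamma_0u}{q^2}\left(1-\frac{\alpha_R}{15\pi}\,\frac{q^2}{m^2}+O(\alpha_R^2)\right).$$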
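The coefficient 1/15 of the $q^2/m^2$ term comes from expanding the $z$-integral of (249) for small $q^2$: since $\ln(1-X) \approx -X$ and $\int_0^1 z^2(1-z)^2\,dz = 1/30$, the integral behaves like $-q^2/(30m^2)$, and the factor $2\alpha/\pi$ in front turns this into $-\alpha q^2/(15\pi m^2)$. The Python sketch below checks this numerically with a Simpson rule; the function name and the step count are illustrative choices.

```python
import math

def loop_integral(x, steps=2000):
    """Simpson estimate of the integral over 0 <= z <= 1 of
    z (1 - z) ln(1 - x z (1 - z)), where x stands for q^2 / m^2.

    Requires x < 4 so that the argument of the logarithm stays positive;
    steps must be even.
    """
    def integrand(z):
        w = z * (1.0 - z)
        return w * math.log(1.0 - x * w)

    h = 1.0 / steps
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3
```

For small `x` the ratio `2 * loop_integral(x) / x` approaches −1/15; the next term of the expansion is smaller by a relative factor of order `x`.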
In position space the momentum space quantity $q^2$ corresponds to the Laplacian
operator, so that we get the following action on the electrostatic potential
$$\left(1-\frac{\alpha_R}{15\pi m^2}\,\nabla^2\right)\frac{e_R^2}{4\pi r} = \frac{e_R^2}{4\pi r}+\frac{\alpha_R\,e_R^2}{15\pi m^2}\,\delta^{(3)}(\mathbf{x}). \tag{254}$$
To first order in $\alpha$ there is a point-like potential that leads to a lowering of
the energy levels of s-wave functions in atoms, which have their maximum
at $\mathbf{x} = 0$,
$$\Delta E_{nl} = -\frac{Ze_R^2\,\alpha_R}{15\pi m^2}\,|\psi_{nl}(0)|^2 = -\frac{1}{2}Z^2\alpha^2m\;\frac{8Z^2\alpha^3}{15\pi n^3}\;\delta_{l0}, \tag{255}$$
the Lamb shift. For hydrogen, for example, a frequency shift by $\nu = \Delta E/\hbar = 27$ MHz
corresponds to the energy difference between the $2S_{1/2}$ and the $2P_{1/2}$
levels.
For large momentum transfer, $|\mathbf{q}|^2 \approx -q^2 \gg m^2$, on the other hand, the
logarithm in the physical part of the propagator can be approximated in the
following way
$$\ln\left[1-\frac{q^2z(1-z)}{m^2-i\varepsilon}\right] \approx \ln\frac{|\mathbf{q}|^2z(1-z)}{m^2} = \ln\frac{|\mathbf{q}|^2}{m^2}+\ln\big(z(1-z)\big),$$
where the contribution of the last logarithm, when integrated with $z(1-z)$,
is small, so that the unrenormalized propagator becomes
$$-\frac{ig_{\mu\nu}}{q^2}\left(1+\frac{\alpha}{3\pi}\ln\frac{|\mathbf{q}|^2}{m^2}-\frac{\alpha}{3\pi}\ln\frac{M^2}{m^2}\right)+O(\alpha^2). \tag{256}$$
Growing momentum transfer q partially compensates the renormalization.
As large |q | means small impact parameter, particles coming very close to
each other in a scattering process “dive” into their clouds of virtual dipoles
and begin to feel the bare charge.
What remains to be done is to take care of free photon lines. Like propagators,
they pick up a vacuum bubble in fourth order and a factor $Z_3$ after renormalization.
As for the propagator, $\sqrt{Z_3}$ is associated with the vertex where the
line begins or ends; for open ends, where there is no charge to renormalize,
we have to divide the amplitude by $\sqrt{Z_3}$ when the renormalization is done.
17 Electron mass renormalization
17.1 Fourth-order correction to the fermion propagator
Diagram (d), where the electron emits and then absorbs a photon, is a contribution
to the self-energy of an electric charge, which is a problem already in
classical electrostatics. The amplitude for the loop consisting of one electron
and one photon propagator is
$$i\Sigma(p) := (-ie)^2\int\frac{d^4k}{(2\pi)^4}\;\frac{-i}{k^2-\lambda^2+i\varepsilon}\;\gamma_\mu\,\frac{i}{\not{p}-\not{k}-m+i\varepsilon}\,\gamma^\mu. \tag{257}$$
λ is a small photon mass introduced in order to avoid infrared divergences.
(A physical argument could be a ﬁnite extension of the universe as a cutoﬀ
of inﬁnite wavelengths.) The integral is linearly divergent.
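As in eq. (242) we introduce variables $z_i$ to rewrite the integrals in $\Sigma$,
$$\Sigma(p) = \frac{\alpha}{2\pi}\int_0^\infty\!\!\int_0^\infty\frac{dz_1\,dz_2}{(z_1+z_2)^2}\left[2m-\frac{\not{p}\,z_1}{z_1+z_2}\right]e^{\,i\left[\frac{p^2z_1z_2}{z_1+z_2}-m^2z_2-\lambda^2z_1\right]}. \tag{258}$$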
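As in the case of vacuum polarization, we rescale $z_i$ by $\gamma z_i$ and apply
$1 = \int_0^\infty\frac{d\gamma}{\gamma}\,\delta\!\left(1-\frac{z_1+z_2}{\gamma}\right)$ to obtain
$$\Sigma(p) = \frac{\alpha}{2\pi}\int_0^1dz\;\left[2m-\not{p}\,(1-z)\right]\int_0^\infty\frac{d\gamma}{\gamma}\;e^{\,i\gamma\left[p^2z(1-z)-m^2z-\lambda^2(1-z)+i\varepsilon\right]}. \tag{259}$$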
The integral
$$J(p,m,\lambda) = \int_0^\infty\frac{d\gamma}{\gamma}\;e^{\,i\gamma\left[p^2z(1-z)-m^2z-\lambda^2(1-z)+i\varepsilon\right]} \tag{260}$$
diverges logarithmically. $\Sigma(p)$ is regularized by subtraction of an analogous
integral with a large photon mass $\Lambda$, followed by application of (248),
$$J(p,m,\lambda)-J(p,m,\Lambda) \approx \ln\frac{\Lambda^2(1-z)}{m^2z+\lambda^2(1-z)-p^2z(1-z)-i\varepsilon} \approx \ln\frac{\Lambda^2(1-z)}{m^2z^2}+\ln\frac{m^2z^2}{m^2z+\lambda^2(1-z)-p^2z(1-z)}.$$
Now $\lambda$ can be dropped; then the last term is zero for $p^2 = m^2$, i.e. for a free
electron on the mass shell. This leads to the regularized expression
$$\bar\Sigma(p) = \frac{\alpha}{2\pi}\int_0^1dz\;\left[2m-(1-z)\not{p}\right]\ln\frac{\Lambda^2(1-z)}{m^2z^2}+\frac{\alpha}{2\pi}\int_0^1dz\;\left[2m-(1-z)\not{p}\right]\ln\frac{m^2z}{m^2-p^2(1-z)}.$$
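The integrals in the first, cutoff-dependent term can be easily carried out, so
that
$$\bar\Sigma(p) = \frac{3\alpha m}{4\pi}\ln\frac{\Lambda^2}{m^2}-\frac{\alpha}{4\pi}\,(\not{p}-m)\ln\frac{\Lambda^2}{m^2}+\frac{\alpha}{2\pi}\int_0^1dz\;\left[2m-(1-z)\not{p}\right]\ln\frac{m^2z}{m^2-p^2(1-z)}. \tag{261}$$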
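The integral containing the physical corrections is approximately
$$\frac{\alpha}{2\pi}\int_0^1dz\;\left[2m-(1-z)\not{p}\right]\ln\frac{m^2z}{m^2-p^2(1-z)} = \frac{\alpha m}{\pi}\,\frac{m^2-p^2}{p^2}\,\ln\frac{m^2-p^2}{m^2}-\frac{\alpha}{4\pi}\,\not{p}\;\frac{m^2-p^2}{p^2}\left[1+\frac{m^2+p^2}{p^2}\,\ln\frac{m^2-p^2}{m^2}\right]. \tag{262}$$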
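Close to the mass shell, $p^2 \approx m^2$ (so that $p^2-m^2 \approx 2m(\not{p}-m)$), we find
$$\bar\Sigma(p) \simeq \frac{3\alpha}{4\pi}\,m\ln\frac{\Lambda^2}{m^2}-\frac{\alpha}{4\pi}\,(\not{p}-m)\left[\ln\frac{\Lambda^2}{m^2}+4\ln\frac{m^2-p^2}{m^2}\right]. \tag{263}$$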
17.2 Propagator renormalization
In the step from second to fourth order perturbation theory $\frac{i}{\not{p}-m}$ is replaced
by
$$\frac{i}{\not{p}-m}+\frac{i}{\not{p}-m}\,\big(-i\Sigma(p)\big)\,\frac{i}{\not{p}-m} = \frac{i}{\not{p}-m-\Sigma(p)}+O(\alpha^2). \tag{264}$$
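Here $\bar\Sigma(p)$ is briefly written as $\Sigma$. This relation is shown in the following way:
$$\begin{aligned}
&\frac{1}{\not{p}-m}+\frac{1}{\not{p}-m}\left(-\not{p}+m+\Sigma+\not{p}-m\right)\frac{1}{\not{p}-m} \\
&= \frac{1}{\not{p}-m}+\frac{1}{\not{p}-m}\left(-\not{p}+m+\Sigma\right)\left[1-\frac{1}{\not{p}-m-\Sigma}\,(\not{p}-m)\right]\frac{1}{\not{p}-m} \\
&= \frac{1}{\not{p}-m}-\left[1-\frac{1}{\not{p}-m}\,\Sigma\right]\left[\frac{1}{\not{p}-m}-\frac{1}{\not{p}-m-\Sigma}\right] \\
&= \frac{1}{\not{p}-m-\Sigma}+\frac{1}{\not{p}-m}\,\Sigma\left[\frac{1}{\not{p}-m}-\frac{1}{\not{p}-m-\Sigma}\right].
\end{aligned}$$
The difference in the last brackets is of first order in $\alpha$, like $\Sigma$, so the last term is of order $\alpha^2$; this proves
relation (264).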
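The content of (264) can be illustrated with ordinary numbers in place of the Dirac matrices. In the Python sketch below, `x` stands for $\not{p}-m$ and `sigma` for $\Sigma$; both are plain floats, so all ordering questions disappear. This is a deliberate simplification made for the example, and the function names are hypothetical.

```python
def one_insertion(x, sigma):
    """1/x + (1/x) sigma (1/x): free propagator plus one self-energy insertion."""
    return 1.0 / x + sigma / (x * x)

def resummed(x, sigma):
    """1/(x - sigma): propagator with the self-energy in the denominator."""
    return 1.0 / (x - sigma)

def remainder(x, sigma):
    """Last term of the derivation: (1/x) sigma [1/x - 1/(x - sigma)]."""
    return sigma / x * (1.0 / x - 1.0 / (x - sigma))
```

For any `x` and `sigma` one finds `one_insertion(x, sigma) == resummed(x, sigma) + remainder(x, sigma)` up to rounding, and halving `sigma` reduces the remainder by roughly a factor of four, as expected for a term of second order.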
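Now we write $\bar\Sigma$ in the form
$$\bar\Sigma(p) = \delta m-\left[Z_2^{-1}-1+C(p)\right](\not{p}-m) \tag{265}$$
with
$$\delta m = \frac{3\alpha m}{4\pi}\ln\frac{\Lambda^2}{m^2}. \tag{266}$$
The function $C(p)$ is chosen such that $C(p) = 0$ when $\not{p} = m$, thus
$$Z_2^{-1}-1 = \frac{\alpha}{4\pi}\left[\ln\frac{\Lambda^2}{m^2}-2\ln\frac{m^2}{\lambda^2}\right].$$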
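(265) is inserted into (264),
$$\frac{i}{\not{p}-m-\bar\Sigma} = \frac{i}{\not{p}-m-\delta m+\left[Z_2^{-1}-1+C(p)\right](\not{p}-m)} = \frac{i}{-\delta m+\left[Z_2^{-1}+C(p)\right](\not{p}-m)} = \frac{i\,Z_2}{(\not{p}-m)\left[1+Z_2C(p)\right]-Z_2\,\delta m}.$$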
As $\bar\Sigma(p)$ is of order $\alpha$, from (265) it follows that $1-Z_2+Z_2C(p)$ is of order $\alpha$,
thus
$$Z_2 = 1+Z_2C(p)+O(\alpha).$$
$\delta m$ being of order $\alpha$, too, it follows that
$$Z_2\,\delta m = \big(1+Z_2C(p)\big)\,\delta m+O(\alpha^2),$$
so that, up to order $\alpha^2$,
$$\frac{i}{\not{p}-m-\bar\Sigma} = \frac{i\,Z_2}{(\not{p}-m-\delta m)\left[1+C(p)\right]}+O(\alpha^2). \tag{267}$$
With the definition of the renormalized physical mass,
$$m_{ph} = m+\delta m = m\left(1+\frac{3\alpha}{4\pi}\ln\frac{\Lambda^2}{m^2}\right) \tag{268}$$
the cutoff constant $\Lambda$ has disappeared. $\delta m$ is interpreted as the electron’s
mass increase coming from its electrostatic field. The unrenormalized mass $m$
is unobservable. In the limit $\not{p} = m_{ph}$, when $C(p) = 0$, the propagator simply
picks up a multiplicative factor $Z_2$ in addition to the mass renormalization,
$$\frac{i}{\not{p}-m} \to \frac{i\,Z_2}{\not{p}-m_{ph}}. \tag{269}$$
Analogously to the case of charge renormalization, we could multiply the
charges at the ends of the propagator by a factor $\sqrt{Z_2}$, but these factors will
cancel in the end. For each free fermion line, however, we have to divide the
amplitude by $\sqrt{Z_2}$.
18 Vertex correction
18.1 Vertices in fourth order
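The quantity corresponding to the loop in (c) is
$$\Lambda_\mu(p',p) = (-ie)^2\int\frac{d^4k}{(2\pi)^4}\;\frac{-i}{k^2-\lambda^2+i\varepsilon}\;\gamma_\nu\,\frac{i}{\not{p}'-\not{k}-m+i\varepsilon}\,\gamma_\mu\,\frac{i}{\not{p}-\not{k}-m+i\varepsilon}\,\gamma^\nu. \tag{270}$$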
Again $\lambda$ is a small photon mass; the integral is logarithmically divergent.
Consider the case of small energy-momentum transfer between an external
source and a free fermion, $\not{p}' \approx \not{p} \approx m$ (see figure in chapter 13). In this case
$\Lambda_\mu$ is expressed in a simple way by the renormalization constant $Z_1$, which
is defined by
$$\bar u(p)\,\Lambda_\mu(p,p)\,u(p) = (Z_1^{-1}-1)\;\bar u(p)\,\gamma_\mu\,u(p). \tag{271}$$
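$\Lambda_\mu(p,p)$ can be calculated from $\Sigma(p)$, observing that
$$\Lambda_\mu(p,p) = -\frac{\partial\Sigma(p)}{\partial p^\mu}, \tag{272}$$
this relation coming from
$$\frac{\partial}{\partial p^\mu}\,\frac{1}{\not{p}-m} = -\frac{1}{\not{p}-m}\,\gamma_\mu\,\frac{1}{\not{p}-m}. \tag{273}$$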
(Compare (270) with (257).) With the aid of (272), $\Lambda_\mu(p,p')$ can be split into
a part with $Z_1$, hiding the divergence, and a unique finite part $\Lambda^c_\mu(p,p')$,
$$\Lambda_\mu(p,p') = (Z_1^{-1}-1)\,\gamma_\mu+\Lambda^c_\mu(p,p'). \tag{274}$$
Application of (272) to (265) yields
$$\frac{\partial\Sigma(p)}{\partial p^\mu} = -(Z_2^{-1}-1)\,\gamma_\mu,$$
(note that $C(p) = 0$ for $\not{p} = m$), and in the sequel
$$\bar u(p)\,\Lambda_\mu(p,p)\,u(p) = (Z_2^{-1}-1)\;\bar u(p)\,\gamma_\mu\,u(p),$$
which means
$$Z_1 = Z_2, \tag{275}$$
so far up to order $\alpha$. Like $Z_2$, also $Z_1$ could be plugged into a further charge
renormalization.
18.2 Synopsis of renormalisations
At a vertex of a Feynman diagram all the considered renormalisations meet.
In the neighborhood there are the following diagrams, up to order $e^2$:
(a) is the lowest (2nd order) graph, (b) - (d) show 4th order contributions.
We consider the limit of small energy-momentum transfer, i.e. the limit of
the photon four momentum $q$ going to zero. The contributions corresponding
to the graphs are the following:
(a) $-ie\gamma_\mu$
(b) $-ie\gamma_\mu(Z_1^{-1}-1)$
(c) $2ie\gamma_\mu(Z_2^{-1}-1)$
(d) $-ie\gamma_\mu(Z_3-1)$
For the free photon line we divide by $\sqrt{Z_3}$, for each of the two free fermion
lines by $\sqrt{Z_2}$. Altogether we thus obtain for the diagrams in the above figure
the expression
$$-\frac{ie\gamma_\mu}{Z_2\sqrt{Z_3}}\left[1+(Z_1^{-1}-1)-2(Z_2^{-1}-1)+Z_3-1\right]. \tag{276}$$
The three expressions $Z_1^{-1}-1$, $Z_2^{-1}-1$, and $Z_3-1$ are of order $\alpha$.
Now this is transformed, keeping only terms up to order $\alpha$ in every step.
First we take out the factor $1+(Z_1^{-1}-1)$ from the square bracket:
$$\left[1+(Z_1^{-1}-1)\right]\left[1-\frac{2(Z_2^{-1}-1)-(Z_3-1)}{1+(Z_1^{-1}-1)}\right].$$
The numerator of the fraction being already of order $\alpha$, the denominator can
be approximated by 1, leading to
$$\approx Z_1^{-1}\left[Z_3-2(Z_2^{-1}-1)\right].$$
Then we take out the factor $Z_3$, and because $Z_3 \approx 1+O(\alpha)$, to first order
$(Z_2^{-1}-1)/Z_3 \approx Z_2^{-1}-1$:
$$Z_1^{-1}\,Z_3\left[1-2(Z_2^{-1}-1)\right].$$
Up to order $\alpha$ the square bracket can be replaced by
$$\frac{1}{\left[1+(Z_2^{-1}-1)\right]^2} = Z_2^2.$$
Collecting these terms, including the prefactor $1/(Z_2\sqrt{Z_3})$ of (276), we finally get
$$-ie\,\frac{Z_2}{Z_1}\,\sqrt{Z_3}\;\gamma_\mu \tag{277}$$
as the corrected vertex contribution. When $Z_1 = Z_2$, as we know is the case to
order $\alpha$, then the renormalizations due to $Z_1$ and $Z_2$ cancel and all we get is
$-ie_R\gamma_\mu$ with $e_R = \sqrt{Z_3}\,e$, as was obtained after handling the vacuum polarization.
In the next chapter we will see that this is indeed the case in all orders.
19 The Ward-Takahashi identity
In the last chapters we have studied the renormalization procedure in the
lowest order, where divergences occur, that is in fourth order in e. In this
order it was possible to hide all inﬁnities in the electron’s/positron’s charge
and mass. Particularly, two of three renormalization constants turned out to
be equal.
Generally, a theory is called renormalizable, if an approach of this kind
works in all orders, more speciﬁcally, if a ﬁnite number of renormalization
constants is suﬃcient. The appearance of new kinds of divergences that
would require new renormalization constants in every order would spoil the
predictive power of a theory.
Concerning the relation Z1 = Z2 there is a general identity that extends
its validity to all orders, the Ward-Takahashi identity. Furthermore it conﬁrms
that the term kµaν +kνaµ in the photon propagator does not contribute
to amplitudes in the general case, when the photon does not necessarily couple
immediately to free fermions.
To prove the Ward-Takahashi identity we consider arbitrary diagrams
with at least one external photon with momentum k and denote the probability
amplitude of the process by M(k). If we remove this photon, we get
a simpler diagram, which contributes to a simpler amplitude M0. Inserting
the photon somewhere else into the diagram gives a contribution to M(k),
and summing over all diagrams that contribute to M0 and over all possible
insertions in each of these diagrams gives the full amplitude M(k). The
Ward-Takahashi identity applies for each diagram contributing to M0, once
we sum over all insertion points. The external photon must attach either to
a fermion line that runs out of the diagram to two external points, or to a
closed fermion loop.
1) Fermion line with n vertices going to infinity.
The ingoing fermion momentum is $p$, the photon momenta are counted
as ingoing, such that $p_1 = p+q_1, \dots, p_n = p' = p+\sum_i q_i$. Now insert the
photon with momentum $k$ after the i-th vertex.
At the vertex where this photon couples to the fermion we replace $\varepsilon_\mu(k)$
by $k_\mu$, so that we obtain
$$-iek_\mu\gamma^\mu = -ie\left[(\not{p}_i+\not{k}-m)-(\not{p}_i-m)\right].$$
With this relation the expression for this vertex and the two adjacent fermion
lines, represented by propagators, becomes
$$\frac{i}{\not{p}_i+\not{k}-m}\,(-ie\not{k})\,\frac{i}{\not{p}_i-m} = e\left[\frac{i}{\not{p}_i-m}-\frac{i}{\not{p}_i+\not{k}-m}\right].$$
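The identity used here is purely algebraic and holds for non-commuting matrices, since $\not{k} = (\not{p}_i+\not{k}-m)-(\not{p}_i-m)$. The Python sketch below checks $(A+K)^{-1}K\,A^{-1} = A^{-1}-(A+K)^{-1}$ for two arbitrary complex 2×2 matrices, where `A` plays the role of $\not{p}_i-m$ and `K` that of $\not{k}$. The matrices are sample values without physical meaning, and the helper names are illustrative.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def mat_inv(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

A = [[1.0 + 0.5j, 0.3], [-0.2, 2.0]]   # stands in for p-slash_i - m
K = [[0.4, -0.1j], [0.7, 0.2]]         # stands in for k-slash

lhs = mat_mul(mat_mul(mat_inv(mat_add(A, K)), K), mat_inv(A))
rhs = mat_sub(mat_inv(A), mat_inv(mat_add(A, K)))
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

Because the identity does not rely on `A` and `K` commuting, it carries over unchanged to the 4×4 Dirac matrices that appear in the actual propagators.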
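Thus the diagram has a segment described by
$$\dots\frac{i}{\not{p}_{i+1}+\not{k}-m}\,\gamma^{\lambda_{i+1}}\left[\frac{i}{\not{p}_i-m}-\frac{i}{\not{p}_i+\not{k}-m}\right]\gamma^{\lambda_i}\,\frac{i}{\not{p}_{i-1}-m}\,\gamma^{\lambda_{i-1}}\dots$$
When we insert the photon at the position $i-1$, the corresponding expression
is
$$\dots\frac{i}{\not{p}_{i+1}+\not{k}-m}\,\gamma^{\lambda_{i+1}}\,\frac{i}{\not{p}_i+\not{k}-m}\,\gamma^{\lambda_i}\left[\frac{i}{\not{p}_{i-1}-m}-\frac{i}{\not{p}_{i-1}+\not{k}-m}\right]\gamma^{\lambda_{i-1}}\dots$$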
The ﬁrst term in this expression cancels the second term of the previous one,
and so on. In the sum over all possible insertions the unpaired terms at the
ends survive.
The sum on the left-hand side is meant to be taken over all insertion
points $i$. $p'+k$ has been relabeled as $q$. Obviously the right-hand side does
not contribute to the transition amplitude $M(k)$ for $p \to q$.
2) Closed fermion loop.
The left diagram of the figure shows a closed fermion loop with n photons attached.
In the right diagram a photon with momentum $k$ is inserted between
the positions $i$ and $i+1$. The momentum $k$ exits at vertex 1 by convention.
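When we insert the photon between the vertices 1 and 2, we obtain the
contribution
$$-e\int\frac{d^4p_1}{(2\pi)^4}\,\mathrm{Tr}\left[\frac{i}{\not{p}_n+\not{k}-m}\,\gamma^{\lambda_n}\dots\frac{i}{\not{p}_2+\not{k}-m}\,\gamma^{\lambda_2}\left(\frac{i}{\not{p}_1-m}-\frac{i}{\not{p}_1+\not{k}-m}\right)\gamma^{\lambda_1}\right].$$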
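The first term will be canceled by one of the amplitudes coming from the
insertion between 2 and 3, and so on. In the end two terms survive,
$$-e\int\frac{d^4p_1}{(2\pi)^4}\,\mathrm{Tr}\left[\frac{i}{\not{p}_n-m}\,\gamma^{\lambda_n}\,\frac{i}{\not{p}_{n-1}-m}\,\gamma^{\lambda_{n-1}}\dots\frac{i}{\not{p}_1-m}\,\gamma^{\lambda_1}-\frac{i}{\not{p}_n+\not{k}-m}\,\gamma^{\lambda_n}\,\frac{i}{\not{p}_{n-1}+\not{k}-m}\,\gamma^{\lambda_{n-1}}\dots\frac{i}{\not{p}_1+\not{k}-m}\,\gamma^{\lambda_1}\right].$$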
After shifting the integration variable from $p_1$ to $p_1+k$ in the second term
the two terms cancel. The diagrams with the photon inserted along a closed
loop add up to zero.
In the most general case there may be n ingoing and n outgoing fermions
and an arbitrary number of further external photons. The Ward-Takahashi
identity can be shown graphically; in terms of a formula it reads
$$k_\mu M^\mu(k,p_1,\dots,p_n,q_1,\dots,q_n) = e\sum_i\Big[M_0(p_1,\dots,p_n,q_1,\dots,q_i-k,\dots,q_n)-M_0(p_1,\dots,p_i+k,\dots,p_n,q_1,\dots,q_n)\Big].$$
The right-hand side does not contribute to the S-matrix.
In the simplest case there is just one external fermion line, so that the
left-hand side can be seen in fact as a full, renormalized vertex.
For full propagators $S(p) = \frac{i}{\not{p}-m-\Sigma(p)}$ and a full vertex $\Gamma^\mu$ the diagram can
be translated into
$$S(p+k)\left[-iek_\mu\Gamma^\mu(p+k,p)\right]S(p) = e\left[S(p)-S(p+k)\right]. \tag{278}$$
If we multiply by $S^{-1}$ (inverse propagators = Dirac matrices) from the left
and from the right, we obtain
$$-ik_\mu\Gamma^\mu(p+k,p) = S^{-1}(p+k)-S^{-1}(p). \tag{279}$$
Sometimes this more special relation is called the Ward-Takahashi identity.
From this we can find a relation between $Z_1$ and $Z_2$: in the limit $k \to 0$
$$\Gamma^\mu(p+k,p) \to Z_1^{-1}\gamma^\mu\qquad\text{and}\qquad S(p) \to \frac{i\,Z_2}{\not{p}-m}.$$
Expansion of (279) around $k = 0$ gives (recall the differential relation (272))
$$-ik_\mu Z_1^{-1}\gamma^\mu = -iZ_2^{-1}\left(\not{p}+\not{k}-m-\not{p}+m\right),$$
and from this follows $Z_1 = Z_2$ in all orders.
In quantum electrodynamics no types of divergences other than those considered
occur in higher orders; such theories are called renormalizable.