Panel Data: Lecture 1
April 18, 2024
Recap
▶ Linear regression is the best linear approximation of the conditional expectation function (CEF).
▶ The CEF is the best predictor of Y. Thus, linear regression is the best linear
predictor of Y.
▶ Linear regression can capture non-linearities by controlling for higher-order
polynomials of the explanatory variables. What matters is that the model is
linear in parameters.
▶ The goal of econometrics is to find causal relationships, so we want to
compare actual outcomes to potential outcomes. E.g., what would have
happened if someone without a college degree had had one?
▶ To make causal inference, we need to get rid of endogenous (bad) variation
in the explanatory variables.
Omitted Time Constant Variables
Consider the regression
$$\ln(\text{wage}_{it}) = \alpha + \rho\,\text{Union}_{it} + \gamma A_i + \beta X_{it} + e_{it} \qquad (1)$$
▶ $\text{Union}_{it}$ is a dummy variable equal to 1 if worker $i$ belongs to a labour union
in period $t$ and 0 otherwise.
▶ $A_i$ is a set of unobserved variables that do not change over time. (Example?)
▶ $X_{it}$ is a set of observed variables that vary across individuals and over time.
(Example?)
▶ $\ln(\text{wage}_{it})$ is the natural logarithm of the observed wage.
However, $A_i$ is unobserved, so we can only estimate
$$\ln(\text{wage}_{it}) = \alpha + \rho\,\text{Union}_{it} + \beta X_{it} + u_{it} \qquad (2)$$
What are the consequences?
Omitted Variable Bias
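First, stack the data into matrices. Without loss of generality:
▶ Define $W_{it} = [\,1 \;\; \text{Union}_{it} \;\; X_{it}\,]$, where the leading 1 carries the intercept.
▶ Gather all observations of $W_{it}$ and $\ln(\text{wage}_{it})$ into the large matrices $W$ and
$\ln(WAGE)$.
▶ Rewrite equation (2) as
$$\ln(WAGE) = W\Theta + u \qquad (3)$$
where $\Theta = [\,\alpha \;\; \rho \;\; \beta\,]^T$.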
Omitted Variable Bias contd...
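Now start with the OLS formula:
$$
\begin{aligned}
\hat{\Theta} &= (W^T W)^{-1} W^T \ln(WAGE) \\
&= (W^T W)^{-1} W^T (W\Theta + u) \\
&= (W^T W)^{-1} W^T (W\Theta + A\Gamma + e) \\
&= (W^T W)^{-1} W^T W\Theta + (W^T W)^{-1} W^T A\Gamma + (W^T W)^{-1} W^T e \\
&= \Theta + (W^T W)^{-1} W^T A\Gamma \\
&= \Theta + \Delta\Gamma
\end{aligned}
$$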
The derivation drops the term in e, which vanishes in expectation because e is
assumed to be uncorrelated with W.
The last line implies that the coefficients from the short regression (2) equal
the coefficients from the long regression (1) plus the effect of the omitted
variables A on the outcome ($\Gamma$) times the coefficients from regressing the
omitted variables A on the included variables W ($\Delta = (W^T W)^{-1} W^T A$).
Thus, if (1) is the true causal model, estimating (2) leads to omitted
variable bias. The direction of the bias depends on the signs of $\Gamma$ and $\Delta$.
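The bias formula can be checked numerically. The following is a minimal sketch with a single included regressor and no noise term, so the identity holds exactly; the toy data, the coefficient values, and the helper `ols_slope` are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical toy data: unobserved ability is higher among union members.
union = [0, 0, 0, 0, 1, 1, 1, 1]
ability = [1.0, 2.0, 1.5, 2.5, 3.0, 2.0, 3.5, 4.0]

# Assumed "true" coefficients of the long regression (no noise term).
rho, gamma = 0.10, 0.05
log_wage = [2.0 + rho * u + gamma * a for u, a in zip(union, ability)]

delta = ols_slope(union, ability)        # regression of A on Union
rho_short = ols_slope(union, log_wage)   # short regression that omits A

print("delta              :", delta)
print("short-regression rho:", rho_short)
print("rho + gamma * delta :", rho + gamma * delta)
```

In this sketch both Gamma (here `gamma`) and Delta (here `delta`) are positive, so the short regression overstates the union premium.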
Fixed Effects Model
Panel data allow us to discard the variation in W that is due to the omitted
variables A. To see this, consider again equation (1):
$$\ln(\text{wage}_{it}) = \alpha + \rho\,\text{Union}_{it} + \gamma A_i + \beta X_{it} + e_{it}$$
Let $\alpha_i \equiv \alpha + \gamma A_i$. Then
$$\ln(\text{wage}_{it}) = \alpha_i + \rho\,\text{Union}_{it} + \beta X_{it} + e_{it} \qquad (4)$$
▶ Note that this model has more parameters to estimate than there are
individuals (N).
▶ Fortunately, we do not need consistent estimates of $\alpha_i$ to obtain consistent
estimates of $\rho$. We just need to "kill" the variation in Union and X that is
related to the fixed effects $\alpha_i$.
Fixed Effects Model: Deviations from the Mean
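We can exploit the panel data structure to get rid of the individual fixed effects $\alpha_i$.
First, for every individual i, calculate the time averages of the variables in equation (4):
$$\overline{\ln(\text{wage})}_i = \alpha_i + \rho\,\overline{\text{Union}}_i + \beta \bar{X}_i + \bar{e}_i \qquad (5)$$
where every variable V in model (5) is calculated in the following way:
$$\bar{V}_i = \frac{1}{T} \sum_{t} V_{i,t}$$
Next, subtract (5) from (4):
$$\ln(\text{wage}_{it}) - \overline{\ln(\text{wage})}_i = \rho\,\text{Union}_{it} - \rho\,\overline{\text{Union}}_i + \beta X_{it} - \beta \bar{X}_i + e_{it} - \bar{e}_i \qquad (6)$$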
Equation (6) is the Fixed Effects model. It is sometimes called the Within
estimator because it uses only the within unit (e.g. individual) variation.
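The demeaning step can be sketched in a few lines. The example below is an illustration only: the three-worker panel, the values of the fixed effects, and the helpers `ols_slope`, `demean`, and `flatten` are hypothetical, and the noise term is set to zero so that the within estimator recovers the assumed coefficient exactly.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean(rows):
    """Subtract each individual's time average from that individual's observations."""
    return [[v - mean(row) for v in row] for row in rows]


def flatten(rows):
    """Stack the individual time series into one long list."""
    return [v for row in rows for v in row]


# Hypothetical panel: 3 workers observed for 4 periods each.
union = [[0, 0, 1, 0], [0, 1, 1, 0], [1, 1, 1, 0]]
alpha = [1.0, 1.5, 2.0]   # fixed effects, higher for workers who are unionized more often
rho = 0.10                # assumed true union premium
log_wage = [[a + rho * u for u in row] for a, row in zip(alpha, union)]

rho_pols = ols_slope(flatten(union), flatten(log_wage))                  # ignores alpha_i
rho_fe = ols_slope(flatten(demean(union)), flatten(demean(log_wage)))    # equation (6)

print("pooled OLS:", rho_pols)
print("within    :", rho_fe)
```

Because the fixed effects are positively related to union status in this toy panel, pooled OLS overstates the premium, while the regression on demeaned data does not.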
Within Estimator: Caveat
▶ The within estimator is useful when we want to control for fixed effects, for
example the unobserved time-constant variables absorbed in $\alpha_i$. But there is a caveat.
Which?
▶ The within estimator can only estimate the effects of variables that vary on both
dimensions (i and t). Time-constant variables are wiped out by the demeaning.
▶ Pooled OLS (POLS) uses all the variation (between and within), so it can
estimate the coefficients on all variables, but there is a risk of omitted variable bias.
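The caveat can be seen directly by demeaning a time-constant variable. This is a small sketch on made-up data; the variable values and the helpers `demean` and `within_variation` are hypothetical.

```python
from statistics import mean


def demean(rows):
    """Subtract each individual's time average from that individual's observations."""
    return [[v - mean(row) for v in row] for row in rows]


def within_variation(rows):
    """Sum of squared deviations from individual means (the variation FE can use)."""
    return sum(v ** 2 for row in demean(rows) for v in row)


# Hypothetical panel: 3 workers, 4 periods.
black = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]   # constant over time
union = [[0, 0, 1, 0], [0, 1, 1, 0], [1, 1, 1, 0]]   # changes over time

print("within variation of black:", within_variation(black))
print("within variation of union:", within_variation(union))
```

A regressor with zero within variation has no demeaned counterpart to regress on, so its coefficient cannot be estimated by the within estimator.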
Within Estimator or POLS?
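Formalize the conditions. Consider equation (4):
$$\ln(\text{wage}_{it}) = \alpha_i + \rho\,\text{Union}_{it} + \beta X_{it} + e_{it}$$
Suppose we run POLS:
$$\ln(\text{wage}_{it}) = \alpha + \rho\,\text{Union}_{it} + \beta X_{it} + v_{it}$$
Suppose we do not observe some of the variables in $\alpha_i$. POLS will give us
consistent estimates of $\rho$ if the following holds:
$$cov(\text{Union}_{it}, v_{it}) = 0$$
Since $v_{it} = (\alpha_i - \alpha) + e_{it}$, this requires
$$cov(\text{Union}_{it}, e_{it}) = 0$$
and
$$cov(\text{Union}_{it}, \alpha_i) = 0$$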
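To see where the two conditions come from, it may help to write out the probability limit of the POLS estimate. The sketch below drops the controls X, an assumption made only to keep the algebra short.

```latex
% POLS error term: the part of the fixed effect not captured by the common intercept, plus e
v_{it} = (\alpha_i - \alpha) + e_{it}

% Simple-regression probability limit (no controls X, assumed for illustration)
\operatorname{plim} \hat{\rho}_{POLS}
  = \rho
  + \frac{cov(\text{Union}_{it}, \alpha_i)}{var(\text{Union}_{it})}
  + \frac{cov(\text{Union}_{it}, e_{it})}{var(\text{Union}_{it})}
```

Both covariance terms must be zero for POLS to be consistent. The within estimator removes the first term by construction, but not the second.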
Example
Suppose you have data on the hourly wages of male workers in the U.S. Each of
these men worked continuously from 1980 to 1987. You observe the following
variables: education, experience, race, and whether a worker is a member of a
labour union. Suppose we come up with the following model:
$$\ln(\text{wage}_{i,t}) = \beta_0 + \beta_1 \text{educ}_{i,t} + \beta_2 \text{exper}_{i,t} + \beta_3 \text{exper}^2_{i,t} + \beta_4 \text{black}_i + \beta_5 \text{hispan}_i + \beta_6 \text{union}_{i,t} + \sum_{j=1981}^{1987} \gamma_j D_j + \epsilon_{i,t}$$
a Suppose we decide to estimate the equation using Pooled OLS. What
assumption on the error term should we impose to obtain consistent
estimates?
b Suppose we run an FE regression. What is the underlying assumption on the
error term now? Can we identify all the coefficients? Explain in detail.
c Suppose the union premium estimated by FE is 10% lower than the OLS
estimate. What does this suggest about the correlation between union and
the unobserved effect?
Example contd.
a Consistent estimation requires $cov(x_{i,t}, \epsilon_{i,t}) = 0$. In
panel data the error term usually has the form $\epsilon_{i,t} = \alpha_i + u_{i,t}$. Hence
consistency requires both of the following to hold:
$$cov(x_{i,t}, \alpha_i) = 0$$
and
$$cov(x_{i,t}, u_{i,t}) = 0$$
b The FE model relaxes the condition $cov(x_{i,t}, \alpha_i) = 0$, but still requires
$cov(x_{i,t}, u_{i,t}) = 0$ to hold.
FE cannot identify the coefficients on time-invariant variables.
Hence $\beta_0$, $\beta_4$, and $\beta_5$ are not identified.
c The union coefficient is $\beta_6$. $\hat{\beta}_6^{FE} < \hat{\beta}_6^{OLS}$ suggests that $cov(\text{union}_{i,t}, \alpha_i) > 0$.
Estimates of the Individual Effects αi
▶ To estimate $\rho$ in equation (4) consistently, we just need a large number of
individuals (N).
▶ However, to estimate the $\alpha_i$'s consistently, we need a large number of time
periods (T).
Proof of αi Inconsistency With Fixed Time Periods
Suppose a model $y_{it} = x_{it}'\beta + \alpha_i + u_{it}$. Show that the estimate of $\alpha_i$ is inconsistent when T is fixed.
First, notice that an estimator is consistent if in the limit it approaches the true
coefficient. Write this condition in the mean squared error sense:
$$MSE = E_\theta[(h(X) - \theta)^2] = V_\theta(h(X)) + (E_\theta[h(X)] - \theta)^2 \to 0$$
where $h(X)$ is the estimator and $\theta$ is the true population coefficient.
Now write the equation for the estimate of $\alpha_i$, where bars denote averages over t:
$$\hat{\alpha}_i = \bar{y}_i - \bar{x}_i'\hat{\beta}$$
$$Var(\hat{\alpha}_i) = \frac{1}{T}\sigma_u^2 + \bar{x}_i'\,Var(\hat{\beta})\,\bar{x}_i$$
If $\hat{\beta}$ is consistent, the second term of the variance equation goes to 0 as N goes to
infinity. However, the first term does not vanish unless T goes to infinity as well.
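A small simulation can illustrate the first term of the variance. The sketch below assumes that beta is known exactly (the limiting case N to infinity), so that the estimate of the individual effect equals the true effect plus the time average of the errors. The function name `alpha_hat_variance`, the value of the true effect, normally distributed errors, and the number of draws are all assumptions made for illustration.

```python
import random
from statistics import pvariance


def alpha_hat_variance(T, sigma_u=1.0, draws=5000, seed=0):
    """Simulated variance of alpha_hat_i over repeated samples with T periods.

    Beta is treated as known, so alpha_hat_i = alpha_i + (time average of u_it).
    """
    rng = random.Random(seed)
    alpha_i = 2.0  # hypothetical true individual effect
    estimates = []
    for _ in range(draws):
        u_bar = sum(rng.gauss(0.0, sigma_u) for _ in range(T)) / T
        estimates.append(alpha_i + u_bar)
    return pvariance(estimates)


for T in (4, 50):
    print(f"T = {T:3d}  simulated variance = {alpha_hat_variance(T):.4f}  sigma_u^2 / T = {1.0 / T:.4f}")
```

The simulated variance stays close to sigma_u^2 / T regardless of how many individuals are in the sample; it only shrinks when T grows.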
Exercise
Use the data in ATTEND.dta to answer this question.
a To determine the effects of attending lecture on final exam performance,
estimate a model relating stndfnl (the standardized final exam score) to
atndrte (the percent of lectures attended). Include the binary variables frosh
and soph as explanatory variables. Interpret the coefficient on atndrte, and
discuss its significance.
b How confident are you that the OLS estimates from part (a) are estimating
the causal effect of attendance? Explain.
c You are worried that you omitted student ability. Comment on the direction
of the bias. Be sure to state all necessary assumptions and justify your
reasoning on the sign of the coefficients necessary to quantify the bias.
d As proxy variables for student ability, add priGPA (prior cumulative GPA)
and ACT (achievement test score) to the regression. Now what is the
effect of atndrte? Discuss how the effect differs from that in part (a).
e Visualize the results from (a) and (d) in a graph using ggplot2.
Exercise contd.
By next week, upload a document with all the questions answered and
a working R script.