Exercise 8


The file JTRAIN2.dta contains data on a job training experiment for a group of men. Men could enter
the program starting in January 1976 through about mid-1977. The program ended in December 1977.
The idea is to test whether participation in the job training program had an effect on unemployment
probabilities and earnings in 1978.

(i)  The variable train is the job training indicator. How many men in the sample participated
in the job training program? What was the highest number of months a man actually participated in
the program?

smpl train --restrict

smpl full

summary mostrn

185 of the 445 men participated in the job training program. The longest time any man spent in the
program was 24 months.
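
As a rough sketch, both numbers can also be computed directly rather than read off the sample and summary output. This assumes JTRAIN2.dta sits in gretl's working directory; the scalar names n_train and max_mos are illustrative and not part of the original solution.

```gretl
# Assumes JTRAIN2.dta is in gretl's working directory
open JTRAIN2.dta

# train is a 0/1 dummy, so its sum counts the participants
scalar n_train = sum(train)

# mostrn records months in training; its maximum is the longest spell
scalar max_mos = max(mostrn)

printf "participants: %g of %g\n", n_train, $nobs
printf "longest participation: %g months\n", max_mos
```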

(ii)  Run a linear regression of train on several demographic and pretraining variables: unem74,
unem75, age, educ, black, hisp, and married. Are these variables jointly significant at the 5%
level?

ols train const unem74 unem75 age educ black hisp married

The F statistic for joint significance of the explanatory variables is F(7,437) = 1.43 with p-value
= .19. Therefore, they are jointly insignificant at even the 15% level. Note that, even though we
have estimated a linear probability model, the null hypothesis we are testing is that all slope
coefficients are zero, and so there is no heteroskedasticity under H0. This means that the usual F
statistic is asymptotically valid.
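
The overall F statistic is printed with the regression output, but it can also be retrieved from the model accessors. The following is a minimal sketch under the assumption that the data file is in the working directory; the scalar names Fst and pv are hypothetical.

```gretl
# Assumes JTRAIN2.dta is in gretl's working directory
open JTRAIN2.dta

ols train const unem74 unem75 age educ black hisp married --quiet

# Overall F statistic for the seven slopes and its residual df
scalar Fst = $Fstat
scalar pv = pvalue(F, 7, $df, Fst)
printf "F(7,%g) = %.2f, p-value = %.3f\n", $df, Fst, pv
```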

(iii)  Estimate a probit version of the linear model in part (ii). Compute the likelihood
ratio test for joint significance of all variables. What do you conclude?

probit train const unem74 unem75 age educ black hisp married

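After estimating the model P(train = 1 | x) = Φ(β0 + β1 unem74 + β2 unem75 + β3 age + β4 educ +
β5 black + β6 hisp + β7 married) by probit maximum likelihood, the likelihood ratio statistic for
joint significance is 10.18. In a χ² distribution with 7 degrees of freedom (one per slope
coefficient), this gives a p-value of .18, which is very similar to that obtained for the LPM in
part (ii).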
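
The likelihood ratio statistic can be reproduced by hand as twice the gap between the unrestricted and restricted log-likelihoods. The sketch below is one way to do it, not necessarily how the figure above was obtained. It assumes the data file is in the working directory and no missing values, and it uses the closed form for the intercept-only log-likelihood, n[p log p + (1 − p) log(1 − p)] with p the sample share of train = 1. All scalar names are illustrative.

```gretl
# Assumes JTRAIN2.dta is in gretl's working directory
open JTRAIN2.dta

# Unrestricted probit: all seven demographic and pretraining variables
probit train const unem74 unem75 age educ black hisp married --quiet
scalar lnl_u = $lnl

# Restricted (intercept-only) log-likelihood in closed form
scalar n = $nobs
scalar p = mean(train)
scalar lnl_r = n * (p*log(p) + (1 - p)*log(1 - p))

# LR = 2(lnl_u - lnl_r), chi-square with 7 df under H0
scalar LR = 2 * (lnl_u - lnl_r)
scalar pv = pvalue(X, 7, LR)
printf "LR = %.2f, p-value = %.3f\n", LR, pv
```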

(iv)  Based on your answers to parts (ii) and (iii), does it appear that participation in job
training can be treated as exogenous for explaining 1978 unemployment status? Explain.

Training eligibility was randomly assigned among the participants, so it is not surprising that
train appears to be independent of other observed factors. (However, there can be a difference
between eligibility and actual participation, as men can always refuse to participate if chosen.)

(v)   Run a simple regression of unem78 on train and report the results in equation form. What is
the estimated effect of participating in the job training program on the probability of being
unemployed in 1978? Is it statistically significant?

ols unem78 const train

In equation form, the fitted line is predicted unem78 = .354 − .111 train. Participating in the
job training program lowers the estimated probability of being unemployed in 1978 by .111, or 11.1
percentage points. This is a large effect: the probability of being unemployed without
participation is .354, and the training program reduces it to .243. The difference is
statistically significant at almost the 1% level against a two-sided alternative. (Note that this
is another case where, because training was randomly assigned, we have confidence that OLS is
consistently estimating a causal effect, even though the R-squared from the regression is very
small. There is much about being unemployed that we are not explaining, but we can be pretty
confident that this job training program was beneficial.)

(vi)  Run a probit of unem78 on train. Does it make sense to compare the probit coefficient on
train with the coefficient obtained from the linear model in part (v)?

It does not make sense to compare the coefficient on train for the probit, −.321, with the LPM
estimate, −.111. The two models use different functional forms for the response probability, so
their coefficients are not on the same scale. However, note that the probit and LPM t statistics
are essentially the same (although the LPM standard errors should be made robust to
heteroskedasticity).
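
No commands are listed for this part. A minimal sketch of what could be run follows, assuming the data file is in the working directory. The --robust flag requests the heteroskedasticity-robust LPM standard errors mentioned in the answer.

```gretl
# Assumes JTRAIN2.dta is in gretl's working directory
open JTRAIN2.dta

# LPM from part (v), with heteroskedasticity-robust standard errors
ols unem78 const train --robust

# Probit of unem78 on train; the coefficient is on the index scale,
# not the probability scale
probit unem78 const train
```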

(vii)  Find the fitted probabilities from parts (v) and (vi). Explain why they are
identical. Which approach would you use to measure the effect and statistical significance of the
job training program?

There are only two fitted values in each case, and they are the same: .354 when train = 0 and .243
when train = 1. This has to be the case: with a single binary regressor the model is saturated, so
any method simply delivers the cell frequencies as the estimated probabilities. The LPM estimates
are easier to interpret because they do not involve the transformation by Φ(⋅), but it does not
matter which is used provided the probability differences are calculated.
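
To check this numerically, the fitted values from the two models can be saved and compared. For the probit they can also be rebuilt by hand as Φ(b0) and Φ(b0 + b1). This is a sketch under the assumption that the data file is in the working directory; the series and scalar names (p_lpm, p_probit, p0, p1, gap) are hypothetical.

```gretl
# Assumes JTRAIN2.dta is in gretl's working directory
open JTRAIN2.dta

ols unem78 const train --quiet
series p_lpm = $yhat

probit unem78 const train --quiet
series p_probit = $yhat

# Probit fitted probabilities by hand: Phi(b0) and Phi(b0 + b1)
scalar p0 = cnorm($coeff(const))
scalar p1 = cnorm($coeff(const) + $coeff(train))
printf "P(unem78=1 | train=0) = %.3f\n", p0
printf "P(unem78=1 | train=1) = %.3f\n", p1

# Largest absolute gap between the two sets of fitted values
scalar gap = max(abs(p_lpm - p_probit))
printf "max |LPM - probit| = %g\n", gap
```

If the two sets of fitted values agree, gap should be zero up to numerical tolerance.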


(viii)  Add all of the variables from part (ii) as additional controls to the models from
parts (v) and (vi). Are the fitted probabilities now identical? What is the correlation between
them?

ols unem78 const train unem74 unem75 age educ black hisp married

series yhat=$yhat

probit unem78 const train unem74 unem75 age educ black hisp married

series yhat2=$yhat

corr yhat yhat2

The fitted values are no longer identical because the model is not saturated, that is, the
explanatory variables are not an exhaustive, mutually exclusive set of dummy variables. But,
because the other explanatory variables are insignificant, the fitted values are highly correlated:
the LPM and probit fitted values have a correlation of about .993.