Econometrics
Qualitative and Limited Dependent Variable Models
Anna Donina
Lecture 8

Introduction
So far, the dependent variable (Y) was continuous:
• Average wage
• Number of children
• Money growth rate
But what if it is a binary variable?
• Y = 1 if the person has a college degree, 0 otherwise
• Y = 1 if the person smokes, 0 otherwise
Two approaches:
• The linear probability model (LPM)
• Nonlinear probability models
  • Probit
  • Logit

Limited Dependent Variable Models
• Limited dependent variables (LDV)
  • LDV are variables whose range is substantively restricted
  • Binary variables, e.g. employed/not employed
  • Nonnegative variables, e.g. wages, prices, interest rates
  • Nonnegative variables with excess zeros, e.g. labor supply
  • Count variables, e.g. the number of arrests in a year
  • Censored variables, e.g. unemployment durations

A Binary Dependent Variable: The Linear Probability Model
• Linear regression when the dependent variable is binary
  Linear probability model (LPM): y = β0 + β1x1 + · · · + βkxk + u
  If the dependent variable only takes on the values 1 and 0, then E(y|x) = P(y = 1|x) = β0 + β1x1 + · · · + βkxk
  In the linear probability model, the coefficients describe the effect of the explanatory variables on the probability that y = 1 (the probability of “success”): βj = ∂P(y = 1|x)/∂xj
• Example: Labor force participation of married women
  [Estimated regression equation: not recoverable; the annotations on it follow]
  Dependent variable: = 1 if in labor force, = 0 otherwise
  nwifeinc: non-wife income (in thousand dollars per year)
  If the number of kids under six years increases by one, the probability that the woman works falls by 26.2 percentage points
  One coefficient is marked as not significant
• Example: Labor force participation of married women (cont.)
  [Figure: predicted probability of labor force participation against educ, for nwifeinc=50, exper=5, age=30, kidslt6=1, kidsge6=0]
  The predicted probability is negative at low levels of education, but this is no problem because no woman in the sample has educ < 5.
  The maximum level of education in the sample is educ = 17. For the given case, this leads to a predicted probability of being in the labor force of about 50%.
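The out-of-range predictions described above can be reproduced in a minimal sketch. The data below are made up for illustration (they are not the sample from the example), and the model has a single regressor so that OLS can be written out by hand. The names educ, inlf and predict are assumptions of this sketch.

```python
# Hypothetical toy data (NOT the sample from the lecture example):
# years of education and a 0/1 labor force indicator.
educ = [8, 9, 10, 12, 12, 12, 13, 14, 16, 16, 17, 17]
inlf = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1]

n = len(educ)
educ_mean = sum(educ) / n
inlf_mean = sum(inlf) / n

# OLS with a single regressor: slope = S_xy / S_xx.
s_xx = sum((x - educ_mean) ** 2 for x in educ)
s_xy = sum((x - educ_mean) * (y - inlf_mean) for x, y in zip(educ, inlf))
slope = s_xy / s_xx
intercept = inlf_mean - slope * educ_mean


def predict(x):
    """Fitted LPM 'probability' that y = 1 at education level x."""
    return intercept + slope * x


# The fitted line is not restricted to the unit interval.
for x in (2, 12, 17):
    print(f"educ = {x:2d}: predicted probability = {predict(x):.3f}")
```

With these toy numbers the fitted line dips below zero at very low education and exceeds one at educ = 17. A linear function of x cannot be kept inside the unit interval for all values of x.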
The Linear Probability Model: Heteroskedasticity
Yi = β0 + β1X1i + · · · + βkXki + ui
The variance of a Bernoulli random variable:
Var(Y) = Pr(Y = 1) × (1 − Pr(Y = 1))
We can use this to find the conditional variance of the error term:
Var(ui | X1i, …, Xki) = Pr(Yi = 1 | X1i, …, Xki) × (1 − Pr(Yi = 1 | X1i, …, Xki))
= (β0 + β1X1i + · · · + βkXki) × (1 − β0 − β1X1i − · · · − βkXki)
The error variance therefore depends on the regressors, so the errors are heteroskedastic.
Solution: always use heteroskedasticity-robust standard errors when estimating an LPM

• Disadvantages of the linear probability model
  • Predicted probabilities may be larger than one or smaller than zero
  • Marginal probability effects are sometimes logically impossible
  • The linear probability model is necessarily heteroskedastic (variance of a Bernoulli variable)
  • Heteroskedasticity-consistent standard errors need to be computed
• Advantages of the linear probability model
  • Easy estimation and interpretation
  • Estimated effects and predictions are often reasonably good in practice

• Disadvantages of the LPM for binary dependent variables
  • Predictions are sometimes outside the unit interval
  • Partial effects of explanatory variables are constant
• Nonlinear models for binary response
  • The response probability is a nonlinear function of the explanatory variables:
    P(y = 1|x) = G(β0 + β1x1 + · · · + βkxk) = G(xβ)
    P(y = 1|x) is the probability of a “success” given the explanatory variables; G is a cumulative distribution function, so 0 < G(z) < 1.
    The response probability is thus a function of the explanatory variables x.
    Shorthand vector notation: the vector of explanatory variables x also contains the constant of the model.
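Returning to the heteroskedasticity of the LPM: the sketch below evaluates the Bernoulli variance formula at several values of Pr(Y = 1 | X) and then compares conventional with heteroskedasticity-robust standard errors for a one-regressor LPM. The coefficient values, the toy data and the function names are assumptions made for illustration. The robust formula is the basic White (HC0) estimator, which is only one of several variants in use.

```python
import math


def conditional_error_variance(p):
    """Var(u | X) in an LPM, where p = Pr(Y = 1 | X)."""
    return p * (1.0 - p)


# Hypothetical LPM coefficients: Pr(Y = 1 | X) = 0.1 + 0.08 * X.
beta0, beta1 = 0.1, 0.08
for x in (1, 5, 10):
    p = beta0 + beta1 * x
    print(f"X = {x:2d}: Pr(Y=1|X) = {p:.2f}, Var(u|X) = {conditional_error_variance(p):.4f}")


def slope_standard_errors(x, y):
    """OLS slope of y on x with conventional and HC0 robust standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Conventional formula: assumes one common error variance.
    sigma2 = sum(u ** 2 for u in resid) / (n - 2)
    se_conventional = math.sqrt(sigma2 / s_xx)
    # HC0: each squared residual is weighted by its own (x - x_bar)^2.
    meat = sum(((xi - x_bar) ** 2) * u ** 2 for xi, u in zip(x, resid))
    se_robust = math.sqrt(meat) / s_xx
    return slope, se_conventional, se_robust


# Hypothetical toy data, not taken from the lecture.
x_data = [8, 9, 10, 12, 12, 12, 13, 14, 16, 16, 17, 17]
y_data = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1]
slope, se_conventional, se_robust = slope_standard_errors(x_data, y_data)
print(f"slope = {slope:.4f}, conventional SE = {se_conventional:.4f}, robust SE = {se_robust:.4f}")
```

The first loop shows that the error variance changes with X, which is why the conventional formula is not appropriate for an LPM. In applied work one would normally rely on a regression package's robust option rather than hand-coded formulas.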
Logit and Probit Models for Binary Response
• Choices for the link function
  Logit: G(z) = Λ(z) = exp(z)/[1 + exp(z)] (logistic function)
  Probit: G(z) = Φ(z), the standard normal cumulative distribution function (standard normal distribution)
• Probit example: Pr(Y = 1) = Pr(Z ≤ −0.8) = Φ(−0.8) = 0.2119
• Logit example: Pr(Y = 1) = Pr(Z ≤ −0.8) = 1/(1 + e^0.8) = 0.31
• Interpretation of coefficients in Logit and Probit models
  • Partial effects are nonlinear and depend on the level of x!
  Continuous explanatory variables: ∂P(y = 1|x)/∂xj = g(xβ)βj, where g(z) = dG(z)/dz > 0
  How does the probability for y = 1 change if explanatory variable xj changes by one unit?
  Discrete explanatory variables: G[β0 + β1x1 + · · · + βk(ck + 1)] − G[β0 + β1x1 + · · · + βkck]
  For example, explanatory variable xk increases by one unit.

Logit and Probit Models: Estimation
• So far, we used OLS to estimate models
• Logit and Probit models are nonlinear in parameters
• Hence, OLS cannot be used in this case
• The method used to estimate Logit and Probit models is Maximum Likelihood Estimation (MLE)
• The maximum likelihood estimates are the values of the parameters that best describe the full distribution of the data
• The likelihood function is the joint probability distribution of the data, treated as a function of the unknown coefficients
• The maximum likelihood estimates are the values of the coefficients that maximize the likelihood function
• They are the parameter values “most likely” to have produced the data

• Goodness-of-fit measures for Logit and Probit models
  • Percent correctly predicted: individual i's outcome is predicted as one if the predicted probability of this event is larger than 0.5; the percentage of correctly predicted y = 1 and y = 0 is then counted
  • Pseudo R-squared: compare the maximized log-likelihood of the model with that of a model that only contains a constant (and no explanatory variables)
  • Correlation-based measures: look at the correlation (or squared correlation) between the predictions or predicted probabilities and the true values
• Reporting partial effects of explanatory variables
  • The difficulty is that partial effects are not constant but depend on x
  • Partial effects at the average: the partial effect of explanatory variable xj is considered for an “average individual” (this is problematic in the case of explanatory variables such as gender)
  • Average partial effects: the partial effect of explanatory variable xj is computed for each individual in the sample and then averaged across all sample members (makes more sense)
• Example: Married women's labor force participation
  The coefficients are not comparable across models.
  Often, Logit estimated coefficients are about 1.6 times the Probit estimated coefficients because the standard normal density at zero (about 0.4) is about 1.6 times the logistic density at zero (0.25).
  The biggest difference between the LPM and Logit/Probit is that partial effects are nonconstant in Logit/Probit (larger decrease in probability for the first child).
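The two worked probabilities and the dependence of partial effects on x can be checked numerically. The sketch below uses only the Python standard library; the function names, the coefficient value beta_j = 0.5 and the index values are assumptions chosen for illustration, not estimates from the lecture.

```python
import math


def logit_cdf(z):
    """Logistic function: exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


def probit_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_pdf(z):
    """Derivative of the logistic function."""
    p = logit_cdf(z)
    return p * (1.0 - p)


def probit_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def partial_effect(pdf, index, beta_j):
    """Partial effect of a continuous x_j: g(x beta) * beta_j."""
    return pdf(index) * beta_j


def discrete_effect(cdf, index, beta_k):
    """Change in probability when x_k increases by one unit."""
    return cdf(index + beta_k) - cdf(index)


# The two examples from the slides, evaluated at an index of -0.8.
print(round(probit_cdf(-0.8), 4))  # 0.2119
print(round(logit_cdf(-0.8), 2))   # 0.31

# The same (hypothetical) coefficient implies different partial effects
# at different values of the index x beta.
beta_j = 0.5
for index in (0.0, 2.0):
    print(index,
          partial_effect(logit_pdf, index, beta_j),
          partial_effect(probit_pdf, index, beta_j),
          discrete_effect(logit_cdf, index, beta_j))

# Ratio of the densities at zero, the source of the 1.6 rule of thumb.
density_ratio = probit_pdf(0.0) / logit_pdf(0.0)
print(density_ratio)
```

Averaging partial_effect over the index values of all sample members would give an average partial effect, while evaluating it once at the index of the sample means would give the partial effect at the average.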