Qualitative and Limited Dependent Variable Models
Ketevani Kapanadze
Brno, 2020

A Single Dummy Independent Variable
• Qualitative information
  • Examples: gender, race, industry, region, rating grade, …
  • One way to incorporate qualitative information is to use dummy variables
  • They may appear as the dependent variable or as independent variables

Dummy Variables
• A dummy variable takes the value 0 or 1, depending on a qualitative attribute
• Examples of dummy variables are:

Intercept Dummy
• A dummy variable included in a regression on its own (not interacted with other variables) is an intercept dummy
• It changes the intercept for the subset of the data defined by the dummy variable condition:
  Y_i = β0 + β1 D_i + β2 X_i + u_i
• We have: (on the board)
• Graphical illustration

Intercept Dummy Example
• Estimating the determinants of wages:
  wage_i = -3.89 + 2.156 M_i + 0.603 educ_i + 0.010 exper_i
  (0.270) (0.051) (0.064)
• Interpretation of the dummy variable M: men earn on average $2.156 per hour more than women, ceteris paribus

A Single Dummy Independent Variable
• Estimated wage equation with intercept shift
  • Holding education, experience, and tenure fixed, women earn $1.81 less per hour than men
• Does that mean that women are discriminated against?
  • Not necessarily. Being female may be correlated with other productivity characteristics that have not been controlled for.

A Single Dummy Independent Variable
• Comparing means of subpopulations described by dummies
  • Not holding other factors constant, women earn $2.51 per hour less than men, i.e. the difference between the mean wage of men and that of women is $2.51
• Discussion
  • It can easily be tested whether the difference in means is significant
  • The wage difference between men and women is larger if nothing else is controlled for, i.e. part of the difference is due to differences in education, experience and tenure between men and women
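The intercept-dummy logic and the comparison of group means can be checked numerically. The following is a minimal sketch on simulated data, not the wage data behind the slides: the coefficients of the data-generating process, the sample size, and the helper name `ols` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; the data are simulated
n = 500
male = rng.integers(0, 2, size=n)  # dummy: 1 = man, 0 = woman
# Assumption for illustration: men have one more year of education on average
educ = 12.0 + 1.0 * male + rng.normal(0.0, 2.0, size=n)
# Assumed data-generating process: intercept shift of 2.0 for men
wage = 1.0 + 2.0 * male + 0.6 * educ + rng.normal(0.0, 1.0, size=n)


def ols(y, *columns):
    """OLS coefficients of y on a constant and the given regressors."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


# Regression on the dummy alone: the slope is the difference in group means
b_simple = ols(wage, male)
diff_means = wage[male == 1].mean() - wage[male == 0].mean()

# Adding education as a control: the dummy coefficient is the intercept shift
b_full = ols(wage, male, educ)

print("dummy only:       ", b_simple[1])
print("difference means: ", diff_means)
print("dummy with control:", b_full[1])
```

In this sketch the coefficient on the dummy alone reproduces the raw difference in mean wages, and it is larger than the coefficient obtained once education is held fixed, because part of the raw gap comes from the assumed difference in education.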
A Single Dummy Independent Variable
• If a dummy variable is interacted with another variable (x), it is a slope dummy
• It changes the relationship between x and y for the subset of the data defined by the dummy variable condition:
  Y_i = β0 + β1 X_i + β2 (X_i * D_i) + u_i
• We have: (on the board)

Slope Dummy Example
• Estimating the determinants of wages:
  wage_i = -2.620 + 0.450 educ_i + 0.17 M_i * educ_i + 0.010 exper_i
  (0.054) (0.021) (0.065)
• Interpretation: men gain on average 17 cents per hour more than women for each additional year of education, ceteris paribus

Multiple Categories
• What if a variable defines three or more qualitative attributes?
• Example: level of education (elementary school, high school, and college)
• Define and use a set of dummy variables: one equal to 1 for high school education, one equal to 1 for college education
• Should we also include a third dummy in the regression, equal to 1 for people with elementary education?
  • No, unless we exclude the intercept!
  • Using the full set of dummies leads to perfect multicollinearity (the dummy variable trap)

Dummy Variable Trap
• A model that contains an intercept and a dummy for every category cannot be estimated (perfect collinearity)
• When using dummy variables, one category always has to be omitted; the omitted category is the base category (men if the dummy for women is included, women if the dummy for men is included)
• Alternatively, one could omit the intercept. Disadvantages:
  1) It is more difficult to test for differences between the parameters
  2) The R-squared formula is only valid if the regression contains an intercept

Interactions Involving Dummy Variables
• Allowing for different slopes: with an interaction term between the dummy and education, the model has a separate intercept and a separate slope for men and for women
• Interesting hypotheses
  • The return to education is the same for men and women
  • The whole wage equation is the same for men and women

A Binary Dependent Variable: The Linear Probability Model
• Linear regression when the dependent variable is binary
• Linear probability model (LPM): if the dependent variable only takes on the values 1 and 0, then
  P(y = 1 | x) = E(y | x) = β0 + β1 x1 + … + β_k x_k
• In the linear probability model, the coefficients describe the effect of the explanatory variables on the probability that y = 1 (the probability of "success")

A Binary Dependent Variable: The Linear Probability Model
• Example: Labor force participation of married women
  • Dependent variable: = 1 if in the labor force, = 0 otherwise
  • nwifeinc: non-wife income (in thousand dollars per year)
  • If the number of kids under six years increases by one, the probability that the woman works falls by 26.2 percentage points
  • One of the estimated coefficients is flagged as not significant
• Example: Labor force participation of married women (cont.)
  • Graph for nwifeinc = 50, exper = 5, age = 30, kidslt6 = 1, kidsge6 = 0
  • The predicted probability is negative at very low levels of education, but this is no problem because no woman in the sample has educ < 5
  • The maximum level of education in the sample is educ = 17. For the given case, this leads to a predicted probability of being in the labor force of about 50%.

A Binary Dependent Variable: The Linear Probability Model
• Disadvantages of the linear probability model
  • Predicted probabilities may be larger than one or smaller than zero
  • Marginal probability effects are sometimes logically impossible
  • The linear probability model is necessarily heteroskedastic, because y has the variance of a Bernoulli variable:
    Var(y | x) = P(y = 1 | x) [1 − P(y = 1 | x)]
  • Heteroskedasticity-consistent standard errors need to be computed
• Advantages of the linear probability model
  • Easy estimation and interpretation
  • Estimated effects and predictions are often reasonably good in practice

A Binary Dependent Variable: The Linear Probability Model
• Disadvantages of the LPM for binary dependent variables
  • Predictions are sometimes outside the unit interval
  • Partial effects of explanatory variables are constant
• Nonlinear models for binary response
  • The response probability is a nonlinear function of the explanatory variables:
    P(y = 1 | x) = G(β0 + β1 x1 + … + β_k x_k) = G(xβ)
  • P(y = 1 | x) is the probability of a "success" given the explanatory variables; G is a cumulative distribution function, so 0 < G(z) < 1. The response probability is thus a function of the explanatory variables x.
  • Shorthand vector notation: the vector of explanatory variables x also contains the constant of the model.
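The drawbacks of the linear probability model listed above can be seen in a small simulation. This is a minimal sketch under stated assumptions: the data are simulated from a logistic response probability with made-up coefficients, and the White (HC0) sandwich formula is used as one common choice of heteroskedasticity-consistent standard errors; the slides do not say which variant is meant.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed; simulated data
n = 2000
x = rng.normal(0.0, 2.0, size=n)
# Assumed true response probability (logistic), for illustration only
p_true = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x)))
y = (rng.uniform(size=n) < p_true).astype(float)

# The LPM is simply OLS of the binary outcome on a constant and x
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
fitted = X @ beta
resid = y - fitted

# Share of predicted "probabilities" below 0 or above 1
outside = np.mean((fitted < 0.0) | (fitted > 1.0))

# Conventional OLS standard errors (valid only under homoskedasticity)
sigma2 = resid @ resid / (n - X.shape[1])
ols_se = np.sqrt(np.diag(sigma2 * XtX_inv))

# White (HC0) standard errors: (X'X)^-1 X' diag(e^2) X (X'X)^-1
meat = X.T @ (X * resid[:, None] ** 2)
robust_se = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print("LPM coefficients:", beta)
print("share of fitted values outside [0, 1]:", outside)
print("OLS SE:   ", ols_se)
print("robust SE:", robust_se)
```

The constant slope `beta[1]` is what pushes fitted values outside the unit interval for extreme x, and the gap between `ols_se` and `robust_se` reflects that the error variance changes with the response probability.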
Logit and Probit Models for Binary Response
• Choices for the link function G
  • Logit: G(z) = exp(z) / [1 + exp(z)] (logistic function)
  • Probit: G(z) = Φ(z), the standard normal cumulative distribution function

Logit and Probit Models for Binary Response
• Interpretation of coefficients in logit and probit models
  • How does the probability of y = 1 change if explanatory variable x_j changes by one unit?
    ∂P(y = 1 | x) / ∂x_j = g(xβ) β_j, where g(z) = dG(z)/dz > 0
  • Marginal effects are nonlinear and depend on the level of x!

Logit and Probit Models for Binary Response
Interpretation of marginal effects
• An increase in x increases (decreases) the probability that y = 1 by the marginal effect, expressed as a percent.
• For dummy independent variables, the marginal effect is expressed in comparison to the base category (x = 0).
• For continuous independent variables, the marginal effect is expressed for a one-unit change in x.
• We interpret both the sign and the magnitude of the marginal effects.
• The probit and logit models produce almost identical marginal effects.

• Goodness-of-fit measures for logit and probit models
  • Percent correctly predicted: individual i's outcome is predicted as one if the predicted probability of this event is larger than 0.5; the percentage of correctly predicted y = 1 and y = 0 is then counted
  • Pseudo R-squared: compares the maximized log-likelihood of the model with that of a model that contains only a constant (and no explanatory variables)

Logit and Probit Models for Binary Response
Discussion about binary outcome models
Choice between the logit and probit model
• The choice depends on the data generating process, which is unknown.
• The models produce almost identical results (different coefficients but similar marginal effects).
• The choice is up to you.
Coding of the dependent variable
• If we reverse the categories 0 and 1, the signs of the coefficients are reversed (positive become negative and vice versa) but the magnitudes are the same.
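The points on marginal effects and goodness of fit can be illustrated with a logit model fitted by hand. This is a minimal sketch on simulated data: the coefficients used to generate the data, the Newton-Raphson routine `fit_logit`, and the evaluation points are assumptions for illustration, and the pseudo R-squared shown is the common McFadden form, 1 − lnL / lnL0, which the slides do not name explicitly.

```python
import numpy as np


def logistic(z):
    """Logit link G(z) = exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logit(X, y, iterations=25):
    """Maximum likelihood estimates for the logit model via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(iterations):
        p = logistic(X @ beta)
        gradient = X.T @ (y - p)
        information = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(information, gradient)
    return beta


def log_likelihood(p, y):
    return np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


rng = np.random.default_rng(2)  # fixed seed; simulated data
n = 1000
x = rng.normal(0.0, 1.0, size=n)
y = (rng.uniform(size=n) < logistic(-0.5 + 1.5 * x)).astype(float)
X = np.column_stack([np.ones(n), x])

beta = fit_logit(X, y)
p_hat = logistic(X @ beta)


def marginal_effect(x_value):
    """g(x*beta) * beta_1, with g(z) = G(z) * (1 - G(z)) for the logit."""
    p = logistic(beta[0] + beta[1] * x_value)
    return p * (1.0 - p) * beta[1]


me_at_0 = marginal_effect(0.0)  # effect of x evaluated at x = 0
me_at_3 = marginal_effect(3.0)  # effect of x evaluated at x = 3

# Percent correctly predicted with the 0.5 cutoff
percent_correct = np.mean((p_hat > 0.5) == (y == 1))

# Pseudo R-squared against the constant-only model (fitted probability = mean of y)
p_null = np.full(n, y.mean())
pseudo_r2 = 1.0 - log_likelihood(p_hat, y) / log_likelihood(p_null, y)

print("logit coefficients:", beta)
print("marginal effect at x=0:", me_at_0, " at x=3:", me_at_3)
print("percent correctly predicted:", percent_correct)
print("pseudo R-squared:", pseudo_r2)
```

The same slope coefficient yields a much smaller marginal effect at x = 3 than at x = 0, which is the sense in which marginal effects depend on the level of x. A probit version would replace `logistic` by the standard normal CDF and `p * (1 - p)` by the standard normal density.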
Logit and Probit Models for Binary Response
• Example: Married women's labor force participation
  • The coefficients are not comparable across models
  • Often, logit coefficient estimates are about 1.6 times the probit estimates, because the standard normal density at zero (about 0.4) is about 1.6 times the logistic density at zero (0.25)
  • The biggest difference between the LPM and logit/probit is that partial effects are nonconstant in logit/probit (larger decrease in probability for the first child)

Next class: 10.04, on Zoom at 1pm
Regression Analysis with Time Series Data