LECTURE 9
Introduction to Econometrics
Multicollinearity & Heteroskedasticity
November 3, 2017

ON PREVIOUS LECTURES
- We discussed the specification of a regression equation
- Specification consists of choosing:
  1. correct independent variables
  2. correct functional form
  3. correct form of the stochastic error term

SHORT REVISION
- We talked about the choice of the correct functional form:
  What are the most common functional forms?
- We studied what happens if we omit a relevant variable:
  Does omitting a relevant variable cause a bias in the other coefficients?
- We studied what happens if we include an irrelevant variable:
  Does including an irrelevant variable cause a bias in the other coefficients?
- We defined the four specification criteria that determine if a variable belongs to the equation:
  Can you list some of these specification criteria?
ON TODAY'S LECTURE
- We will finish the discussion of the choice of independent variables by talking about multicollinearity
- We will start the discussion of the correct form of the error term by talking about heteroskedasticity
- For both of these issues, we will learn:
  - what the nature of the problem is
  - what its consequences are
  - how it is diagnosed
  - what remedies are available

Multicollinearity

PERFECT MULTICOLLINEARITY
- Some explanatory variable is a perfect linear function of one or more other explanatory variables
- Violation of one of the classical assumptions
- The OLS estimate cannot be found
- Intuitively: the estimator cannot distinguish which of the explanatory variables causes the change of the dependent variable if they move together
- Technically: the matrix X'X is singular (not invertible)
- Rare and easy to detect
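To make the singularity concrete, here is a minimal numerical sketch (not from the lecture; the data are simulated and the variable names are illustrative). One regressor is an exact linear function of another regressor and the constant, so X'X has deficient rank and the usual OLS formula (X'X)^(-1) X'y cannot be evaluated reliably.

```python
# Illustrative sketch of perfect multicollinearity: x2 is an exact linear
# function of x1 and the constant, so X'X is (numerically) singular.
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = 2 * x1 + 3                          # exact linear function of x1 and the intercept
X = np.column_stack([np.ones(n), x1, x2])

XtX = X.T @ X
print("rank of X'X:", np.linalg.matrix_rank(XtX))   # 2, although X'X is a 3x3 matrix
print("condition number:", np.linalg.cond(XtX))     # effectively infinite
```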
EXAMPLES OF PERFECT MULTICOLLINEARITY
- Dummy variable trap
- Inclusion of a dummy variable for each category in a model with an intercept
- Example: wage equation for a sample of individuals who have high-school education or higher:
  wage_i = β1 + β2 high_school_i + β3 university_i + β4 phd_i + e_i
- Automatically detected by most statistical software

IMPERFECT MULTICOLLINEARITY
- Two or more explanatory variables are highly correlated in the particular data set
- The OLS estimate can be found, but it may be very imprecise
- Intuitively: the estimator can hardly distinguish the effects of the explanatory variables if they are highly correlated
- Technically: the matrix X'X is nearly singular, and this causes the variance of the estimator, Var(β̂) = σ²(X'X)^(-1), to be very large
- Usually referred to simply as "multicollinearity"
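A small simulation (not from the lecture; data and numbers are illustrative) shows this mechanism: the coefficient standard errors, computed from σ̂²(X'X)^(-1) exactly as on the slide, blow up when two regressors are nearly collinear.

```python
# Illustrative sketch: standard errors from sigma^2 (X'X)^{-1} inflate
# when the two regressors are almost perfectly correlated.
import numpy as np

def ols_se(x1, x2, y):
    """OLS coefficient standard errors from sigma_hat^2 * (X'X)^{-1}."""
    X = np.column_stack([np.ones(len(y)), x1, x2])
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
eps = rng.normal(size=n)

x2_indep = rng.normal(size=n)              # regressor uncorrelated with x1
x2_coll = x1 + 0.05 * rng.normal(size=n)   # regressor almost identical to x1

for x2, label in [(x2_indep, "independent"), (x2_coll, "collinear")]:
    y = 1 + x1 + x2 + eps
    print(label, ols_se(x1, x2, y))        # s.e. of x1 and x2 much larger in the collinear case
```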
CONSEQUENCES OF MULTICOLLINEARITY
1. Estimates remain unbiased and consistent (estimated coefficients are not affected)
2. Standard errors of coefficients increase
   - confidence intervals are very large - estimates are less reliable
   - t-statistics are smaller - variables may become insignificant

DETECTION OF MULTICOLLINEARITY
- Some multicollinearity exists in every equation - the aim is to recognize when it causes a severe problem
- Multicollinearity can be signaled by the underlying theory, but it is very sample-dependent
- We judge the severity of multicollinearity based on the properties of our sample and on the results we obtain
- One simple method: examine the correlation coefficients between explanatory variables
  - if some of them are too high, we may suspect that the coefficients of these variables are affected by multicollinearity
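A minimal sketch of this detection step (simulated data, illustrative column names - the lecture itself only mentions the correlation check). The variance inflation factor printed at the end is a closely related, commonly used diagnostic, not part of the slide.

```python
# Inspect pairwise correlations between regressors; very high values are a warning sign.
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
uhm = rng.normal(size=50)
df = pd.DataFrame({
    "tax": rng.normal(size=50),
    "uhm": uhm,
    "reg": uhm + 0.1 * rng.normal(size=50),  # strongly correlated with uhm by construction
})

print(df.corr())   # correlation matrix of the explanatory variables

# Variance inflation factors (rule of thumb: values above ~5-10 suggest severe collinearity)
X = np.column_stack([np.ones(len(df)), df.values])
for i, name in enumerate(df.columns, start=1):
    print(name, variance_inflation_factor(X, i))
```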
REMEDIES FOR MULTICOLLINEARITY
- Drop a redundant variable
  - when the variable is not needed to represent the effect on the dependent variable
  - in case of severe multicollinearity, it makes no statistical difference which variable is dropped
  - the theoretical underpinnings of the model should be the basis for such a decision
- Do nothing
  - when multicollinearity does not cause insignificant t-scores or unreliable estimated coefficients
  - deletion of a collinear variable can cause specification bias
- Increase the size of the sample
  - the confidence intervals are narrower when we have more observations

EXAMPLE
Estimating the demand for gasoline in the U.S.:

  PCON_i = 389.6 − 36.5 TAX_i + 60.8 UHM_i − 0.061 REG_i
                  (13.2)       (10.3)       (0.043)
  t:              −2.77         5.92        −1.43
  R² = 0.924, n = 50, Corr(UHM, REG) = 0.978

  PCON_i ... petroleum consumption in the i-th state
  TAX_i  ... the gasoline tax rate in the i-th state
  UHM_i  ... urban highway miles within the i-th state
  REG_i  ... motor vehicle registrations in the i-th state

EXAMPLE
- We suspect multicollinearity between urban highway miles and motor vehicle registrations across states, because states that have a lot of highways might also have a lot of motor vehicles. Therefore, we might run into multicollinearity problems.
- How do we detect multicollinearity?
  - Look at the correlation coefficient. It is indeed huge (0.978).
  - Look at the coefficients of the two variables. Are they both individually significant? UHM is significant, but REG is not. This further suggests the presence of multicollinearity.
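As a quick check of the significance claim, the t-ratios follow directly from the reported coefficients and standard errors: for REG, t = −0.061 / 0.043 ≈ −1.4, which is below the 5% two-sided critical value of roughly 2.01 in absolute value (50 − 4 = 46 degrees of freedom), while for UHM, t = 60.8 / 10.3 ≈ 5.9, which is clearly significant.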
- Remedy: try dropping one of the correlated variables.

EXAMPLE

  PCON_i = 551.7 − 53.6 TAX_i + 0.186 REG_i
                  (16.9)       (0.012)
  t:              −3.18        15.88
  R² = 0.866, n = 50

  PCON_i = 410.0 − 39.6 TAX_i + 46.4 UHM_i
                  (13.1)       (2.16)
  t:              −3.02        21.40
  R² = 0.921, n = 50

Heteroskedasticity

HETEROSKEDASTICITY
- Observations of the error term are drawn from a distribution that no longer has a constant variance:
  Var(ε_i) = σ_i², i = 1, 2, ..., n
- Note: constant variance means Var(ε_i) = σ² (i = 1, 2, ..., n)
- Often occurs in data sets in which there is a wide disparity between the largest and smallest observed values
  - smaller values are often connected to a smaller variance and larger values to a larger variance (e.g. consumption of households based on their income level)
- One particular form of heteroskedasticity (the variance of the error term is a function of some observable variable):
  Var(ε_i) = h(x_i), i = 1, 2, ..., n

HETEROSKEDASTICITY
[Figure: scatter plot of Y against X illustrating an error variance that increases with X]
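A minimal simulation (not from the lecture) of the form Var(ε_i) = h(x_i): here h(x_i) = (0.5 x_i)², so the error spread grows with x, producing the fan-shaped pattern the figure illustrates.

```python
# Simulate an error term whose standard deviation is proportional to x,
# then compare the residual variance for small and large x.
import numpy as np

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(1, 10, size=n)
eps = rng.normal(scale=0.5 * x)          # sd(eps_i) proportional to x_i
y = 2 + 1.5 * x + eps

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

k = n // 3
order = np.argsort(x)
print("var(resid | small x):", resid[order[:k]].var())   # small
print("var(resid | large x):", resid[order[-k:]].var())  # much larger: non-constant variance
```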
CONSEQUENCES OF HETEROSKEDASTICITY
Violation of one of the classical assumptions
1. Estimates remain unbiased and consistent (estimated coefficients are not affected)
2. Estimated standard errors of the coefficients are biased
   - a heteroskedastic error term causes the dependent variable to fluctuate in a way that the OLS estimation procedure attributes to the independent variable
   - heteroskedasticity biases the t-statistics, which leads to unreliable hypothesis testing
   - typically, we encounter underestimation of the standard errors, so the t-scores are incorrectly too high

DETECTION OF HETEROSKEDASTICITY
- There is a battery of tests for heteroskedasticity
- Sometimes, a simple visual analysis of the residuals is sufficient to detect heteroskedasticity
- We will derive a test for the model
  y_i = β0 + β1 x_i + β2 z_i + ε_i
- The test is based on analysis of the residuals
  e_i = y_i − ŷ_i = y_i − (β̂0 + β̂1 x_i + β̂2 z_i)
- The null hypothesis for the test is no heteroskedasticity: E(e_i²) = σ²
- Therefore, we will analyse the relationship between e² and the explanatory variables

WHITE TEST FOR HETEROSKEDASTICITY
1. Estimate the equation, get the residuals e_i
2. Regress the squared residuals on all explanatory variables and on the squares and cross-products of all explanatory variables:
   e_i² = α0 + α1 x_i + α2 z_i + α3 x_i² + α4 z_i² + α5 x_i z_i + ν_i    (1)
3. Get the R² of this regression and the sample size n
4. Test the joint significance of (1): test statistic nR² ∼ χ²_k, where k is the number of slope coefficients in (1)
5. If nR² is larger than the χ²_k critical value, we reject H0 of no heteroskedasticity
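A hand-rolled sketch of the five steps above, followed by statsmodels' packaged het_white for comparison. The data are simulated and the variable names are illustrative; neither is part of the lecture.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(4)
n = 200
x = rng.normal(size=n)
z = rng.normal(size=n)
eps = rng.normal(scale=np.exp(0.5 * z))        # error variance depends on z: heteroskedastic
y = 1 + 2 * x - z + eps

# Step 1: estimate the equation and keep the residuals e_i
X = sm.add_constant(np.column_stack([x, z]))
e = sm.OLS(y, X).fit().resid

# Step 2: auxiliary regression of e_i^2 on x, z, x^2, z^2 and x*z
Z = sm.add_constant(np.column_stack([x, z, x**2, z**2, x * z]))
aux = sm.OLS(e**2, Z).fit()

# Steps 3-5: compare nR^2 with the chi-squared distribution with k = 5 slope coefficients
stat = n * aux.rsquared
print("nR^2 =", stat, "p-value =", chi2.sf(stat, df=5))

# Packaged equivalent in statsmodels
lm_stat, lm_pval, f_stat, f_pval = het_white(e, X)
print("het_white LM statistic =", lm_stat, "p-value =", lm_pval)
```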
REMEDIES FOR HETEROSKEDASTICITY
1. Redefining the variables
   - in order to reduce the variance of observations with extreme values
   - e.g. by taking logarithms or by scaling some variables
2. Weighted Least Squares (WLS)
   - consider the model y_i = β0 + β1 x_i + β2 z_i + ε_i
   - suppose Var(ε_i) = σ² z_i²
   - we prove in the lecture that if we redefine the model as
     y_i/z_i = β0 (1/z_i) + β1 (x_i/z_i) + β2 + ε_i/z_i ,
     it becomes homoskedastic
3. Heteroskedasticity-corrected robust standard errors
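A minimal sketch of remedies 2 and 3 in statsmodels (simulated data, illustrative names). The WLS call with weights 1/z_i² corresponds to the divide-by-z_i transformation above; the robust fit keeps the OLS coefficients and only corrects the standard errors.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 200
x = rng.normal(size=n)
z = rng.uniform(1, 5, size=n)
eps = rng.normal(scale=z)                      # Var(eps_i) = z_i^2
y = 1 + 2 * x + 0.5 * z + eps

X = sm.add_constant(np.column_stack([x, z]))

ols = sm.OLS(y, X).fit()
robust = sm.OLS(y, X).fit(cov_type="HC1")      # same coefficients, heteroskedasticity-robust s.e.
wls = sm.WLS(y, X, weights=1.0 / z**2).fit()   # weights = inverse of the error variance

print("OLS s.e.   :", ols.bse)
print("robust s.e.:", robust.bse)
print("WLS s.e.   :", wls.bse)
```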
HETEROSKEDASTICITY-CORRECTED ROBUST ERRORS
- The logic behind: since heteroskedasticity causes problems with the standard errors of OLS but not with the coefficients, it makes sense to improve the estimation of the standard errors in a way that does not alter the estimates of the coefficients (White, 1980)
- Heteroskedasticity-corrected standard errors are typically larger than the OLS s.e., thus producing lower t-scores
- In panel and cross-sectional data with group-level variables, clustering the standard errors is the preferred answer to heteroskedasticity

SUMMARY
- Multicollinearity
  - does not lead to inconsistent estimates, but it makes them lose significance
  - if really necessary, it can be remedied by dropping or transforming variables, or by getting more data
- Heteroskedasticity
  - does not lead to inconsistent estimates, but it invalidates inference
  - can be simply remedied by the use of (clustered) robust standard errors
- Readings:
  - Studenmund, Chapters 8 and 10
  - Wooldridge, Chapter 8