Exercise 6

Problem 1

The file stockton96.gdt contains 940 observations on home sales in Stockton, CA in 1996.

a) Use least squares to estimate a linear equation that relates house price PRICE to the size of the house in square feet SQFT and the age of the house in years AGE. Interpret all the estimates.

    ols price const age sqft

b) Suppose that you own two houses. One has 1400 square feet; the other has 1800 square feet. Both are 20 years old. What price do you estimate you will get for each house?

c) Test the hypothesis that the size and the age of the house are important determinants of its price (separately as well as jointly).

Both coefficients have three stars in the output, so each is individually significant at the 1% level. They are also jointly significant according to the output above.

d) Using the Breusch-Pagan test for heteroskedasticity, test whether the model satisfies the homoskedasticity assumption by using the command for the BP test in Gretl.

    series yhat = $yhat
    genr resid = price - yhat
    modtest --breusch-pagan

e) Use the White test to test for heteroskedasticity.

    modtest --white

f) What do you conclude regarding heteroskedasticity? Does your conclusion depend on the choice of a specific test? Discuss also the drawbacks of the BP and White tests.

There is heteroskedasticity.

A weakness of the BP test is that it assumes the heteroskedasticity is a linear function of the independent variables. Failing to find evidence of heteroskedasticity with the BP test does not rule out a nonlinear relationship between the independent variable(s) and the error variance.

A weakness of the White test is that, with many variables, the number of regressors in the auxiliary regression (the original variables, their squares and all possible interactions) can become quite high.

g) Test the hypothesis that the size and the age of the house are important determinants of its price (separately as well as jointly). Hint: choose appropriate standard errors. Does your conclusion differ from part (c)?

    ols price const age sqft --robust

Compare the robust and non-robust standard errors and parameters.
You can see that the parameters did not change, while the standard errors increased. Still, based on the F-statistic, the conclusions have not changed.

Problem 2

Using the data in cps4_small.gdt, estimate the following wage equation with least squares and heteroskedasticity-robust standard errors:

    ln(WAGE) = β1 + β2 EDUC + β3 EXPER + β4 EXPER^2 + β5 (EXPER × EDUC) + e

(a) Report the results.

    genr exper2 = exper^2
    genr experedu = exper*educ
    genr lnwage = ln(wage)
    ols lnwage educ exper exper2 experedu const --robust

[Gretl output]

(b) Add MARRIED to the equation and re-estimate. Holding education and experience constant, do married workers get higher wages? Using a 5% significance level, test the null hypothesis that wages of married workers are less than or equal to those of unmarried workers against the alternative that wages of married workers are higher.

[Gretl output]

The null and alternative hypotheses for testing whether married workers get higher wages are given by

    H0: β6 ≤ 0    H1: β6 > 0

where β6 is the coefficient on MARRIED. The test value is 1.188, and the critical value at the 5% level of significance is 1.646. Since the test value is less than the critical value, we do not reject the null hypothesis at the 5% level. We conclude that there is insufficient evidence to show that wages of married workers are greater than those of unmarried workers.

(c) Plot the residuals from part (a) against the two values of MARRIED. Is there evidence of heteroskedasticity?

    series uhat = $uhat
    gnuplot uhat married

[Figure: residuals against MARRIED]

The residual plot suggests that the variance of wages for married workers is greater than that for unmarried workers. Thus, there is evidence of heteroskedasticity. It probably makes more sense to plot the squared residuals against MARRIED, because the variance is the expected value of the squared error.
However, the figure above still shows how the dispersion of the data cloud changes with the explanatory variable. As we can see, the fitted line is not horizontal, which means that there is a heteroskedasticity issue.

[Figure: residual plot]

(d) Plot the least squares residuals against EDUC and against EXPER. What do they suggest?

[Figure: residuals against EDUC]

[Figure: residuals against EXPER]

Both residual plots exhibit a pattern in which the absolute magnitudes of the residuals tend to increase as the values of EDUC and EXPER increase, although for EXPER the increase is not very pronounced. Thus, the plots suggest that there is heteroskedasticity, with the variance dependent on EDUC and possibly EXPER.

(e) Test for heteroskedasticity using a Breusch-Pagan test where the variance depends on EDUC, EXPER and MARRIED. What do you conclude at a 5% significance level?

    modtest --breusch-pagan

[Gretl output]

With the variance function written as σ_i^2 = h(α1 + α2 EDUC + α3 EXPER + α4 MARRIED), the null and alternative hypotheses are

    H0: α2 = α3 = α4 = 0    H1: not all of α2, α3, α4 are zero

with H1 implying that the error variance depends on one or more of EXPER, EDUC or MARRIED. The value of the test statistic is 26.1, with p-value 0.000085. Therefore, we reject the null hypothesis and conclude that heteroskedasticity exists.
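
As a sanity check on the decision in part (e), the chi-square tail probability of the reported statistic 26.1 can be recomputed by hand. The following is only a minimal sketch in Python (standard library only), not part of the Gretl workflow; the helper name chi2_sf is made up for illustration. It evaluates the p-value both for 3 degrees of freedom (EDUC, EXPER, MARRIED, as in the hypotheses above) and for 5 degrees of freedom, the latter under the assumption that modtest --breusch-pagan uses every regressor of the last estimated model (educ, exper, exper2, experedu, married) in its auxiliary regression.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1.

    Hypothetical helper for illustration; uses the closed-form series that
    exists for integer degrees of freedom.
    """
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_j x^j / (1*3*...*(2j+1))
    tail = math.erfc(math.sqrt(half))
    if df == 1:
        return tail
    term, total = 1.0, 1.0
    for j in range(1, (df - 1) // 2):
        term *= x / (2 * j + 1)
        total += term
    return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


LM = 26.1  # Breusch-Pagan statistic reported in part (e)
for df in (3, 5):  # 3 = variables named in H1; 5 = assumed number of regressors used by modtest
    print(f"df = {df}: p-value = {chi2_sf(LM, df):.6f}")
```

Under these assumptions, the 5-degrees-of-freedom value rounds to the reported 0.000085, and the 3-degrees-of-freedom value is smaller still. The statistic also exceeds the 5% critical values for both cases (7.815 and 11.070), so the null hypothesis of homoskedasticity is rejected either way.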