Introduction to Econometrics
Worksheet, Week 8

1. Consider the following model:

   log(price) = β0 + β1 log(assess) + β2 log(sqrft) + β3 log(lotsize) + β4 d_bdrms + ε,   (1)

   where price is the house price, assess is the assessed housing value (before the house was sold), lotsize is the size of the lot (in square feet), sqrft is the square footage of the house, and d_bdrms is a dummy variable indicating whether the house has more than 3 bedrooms.

   (a) Use the data in housing.gdt to estimate model (1). First take logarithms of the first four variables (price, assess, lotsize, sqrft), then construct the dummy variable as

       d_bdrms = 1 if bdrms > 3,
                 0 otherwise,

       and run the regression. Interpret the coefficients.

   (b) Now suppose we would like to test whether the assessed housing price is a rational valuation. If it is, then a 1% change in assess should be associated with a 1% change in price. In addition, lotsize, sqrft, and d_bdrms should not help to explain log(price) once the assessed value has been controlled for. Define the hypotheses to be tested and the test statistic, and explain how you would conduct the test. Then test for rational valuation in Gretl.

2. You have organized a ski trip to the mountains for a group of your friends and, as a true econometrician, you have decided to estimate a model of the expenditures of each participant. You suppose that the cost of the trip for each person depends on how many days he or she spent there (some people arrived later and some left earlier) and on what type of skis he or she used. Some people used downhill skis, some used cross-country skis, and some managed to use both. Since you are friends only with people who like sports, nobody went without skiing (i.e., everybody used downhill skis, cross-country skis, or both).

   (a) You specify the following model:

       cost = β0 + β1 days + β2 DS + β3 CS + ε,

       where cost is the cost of the trip, days is the number of days the person stays in the mountains, DS is a dummy equal to 1 if the person uses downhill skis and 0 otherwise, and CS is a dummy equal to 1 if the person uses cross-country skis and 0 otherwise.

       i. In terms of the parameters of your model, what is the expected cost of the trip for a person who spends two days in the mountains and uses both cross-country and downhill skis? What is the expected cost for a person who spends three days in the mountains and uses downhill skis only?

       ii. You want to test whether the two dummy variables are jointly significant in your model. Running the model with the dummies included gives R² = 0.8, whereas running it without the dummies gives R² = 0.65. Given that you have 25 observations, test for the joint significance of the two dummies at the 5% significance level.

   (b) A friend of yours is working on the same problem, but he specifies the model slightly differently:

       cost = γ1 days + γ2 DSO + γ3 CSO + γ4 BS + ε,

       where cost is the cost of the trip, days is the number of days the person stays in the mountains, DSO is a dummy equal to 1 if the person uses only downhill skis and 0 otherwise, CSO is a dummy equal to 1 if the person uses only cross-country skis and 0 otherwise, and BS is a dummy equal to 1 if the person uses both types of skis and 0 otherwise. (Hence, your friend's model differs from yours only in the definition of the dummies.)

       i. You see that your friend includes the full set of dummies in the model. How can you explain that he is not facing perfect multicollinearity?

       ii. In terms of the parameters of your friend's model, what is the expected cost of the trip for a person who spends two days in the mountains and uses both cross-country and downhill skis?
           What is the expected cost for a person who spends three days in the mountains and uses downhill skis only?

3. Suppose your data produce the regression result y = 1 + 0.7x. Consider rescaling the data to express them in dollars of a different base year by multiplying observations by 0.8.

   (a) If both y and x are scaled, what regression results would you obtain?
   (b) If y is scaled but x is not, what regression results would you obtain?
   (c) If x is scaled but y is not, what regression results would you obtain?

4. Suppose we have estimated the model y = 10 + 2x + 3D, where y represents earnings, x is experience, and D is a dummy variable equal to 1 for females and 0 for males. If we redefine the dummy as 1 for females and -1 for males, what result would we get (how would the estimated coefficients change)?

5. You are interested in what determines election results. You have arrived at the following specification:

   voteA = β0 + β1 log(expendA) + β2 log(expendB) + β3 prtystrA + β4 democA + ε,

   where voteA is the percentage of votes received by candidate A, expendA and expendB are the campaign expenditures of candidates A and B, prtystrA is a measure of party strength for candidate A, and democA is a dummy indicating whether candidate A is a Democrat. Using state-level data from past elections, you have obtained the following results:

   Model 1: OLS estimates using the 173 observations 1-173
   Dependent variable: voteA

                 coefficient   std. error   t-ratio   p-value
     -----------------------------------------------------------
     const        37.6614      4.73604        7.952   2.56E-013  ***
     lexpendA      5.77929     0.391820      14.75    4.03E-032  ***
     lexpendB     -6.23784     0.397460     -15.69    9.34E-035  ***
     prtystrA      0.251918    0.0712925      3.534   0.0005     ***
     democA        3.79294     1.40652        2.697   0.0077     ***

     Mean of dependent variable = 50.5029
     Standard deviation of dep. var. = 16.7848
     Sum of squared residuals = 9635.07
     Standard error of the regression = 7.57309
     Unadjusted R-squared = 0.80116
     Adjusted R-squared = 0.79643
     F-statistic (4, 168) = 169.229 (p-value < 0.00001)

   (a) What is the interpretation of β1?
   (b) Test the hypothesis that democratic candidates had 4% higher chances to win.
   (c) How would you test the hypothesis that a 1% increase in A's expenditures is offset by a 1% increase in B's expenditures? (State the null hypothesis in terms of the regression parameters. Perform the test if possible. If not, propose a model that would allow you to test this hypothesis.)
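
Computational note. Two of the tests above can be carried out by hand from the numbers quoted in the worksheet: the joint significance test in Problem 2(a)ii and the t-test in Problem 5(b). The Python sketch below only does the arithmetic. It assumes the R² form of the F statistic (valid here because both regressions share the same dependent variable) and it reads the null hypothesis in 5(b) as H0: β4 = 4, which is an interpretation rather than something the worksheet states. The function names are hypothetical helpers, not part of Gretl or any library.

```python
# Sketch only: recomputes two test statistics from figures quoted in the worksheet.
# Function names are hypothetical; critical values still come from statistical tables.


def f_stat_from_r2(r2_unrestricted, r2_restricted, q, n, k):
    """F statistic for q exclusion restrictions, R-squared form.

    n is the number of observations and k the number of estimated
    parameters (including the intercept) in the unrestricted model.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / (n - k)
    return numerator / denominator


def t_stat(estimate, hypothesized, std_error):
    """t statistic for H0: coefficient = hypothesized value."""
    return (estimate - hypothesized) / std_error


# Problem 2(a)ii: two dummies dropped, 25 observations, 4 parameters in the full model.
f_stat = f_stat_from_r2(0.80, 0.65, q=2, n=25, k=4)

# Problem 5(b), ASSUMING the null is H0: beta4 = 4 (coefficient on democA).
t_democ = t_stat(3.79294, 4.0, 1.40652)

print(f"F(2, 21) = {f_stat:.3f}")
print(f"t(168)   = {t_democ:.3f}")
```

Each statistic is then compared with the appropriate critical value: roughly 3.47 for F(2, 21) at the 5% level, and roughly 1.97 for a two-sided t-test with 168 degrees of freedom at the 5% level.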