Multiple Regression Analyses: Statistical Inference
October 15, 2021
Dali Laxton

Today's Lecture
• We are going to discuss how hypotheses about coefficients can be tested in regression models
• We will explain what significance of coefficients means
• We will learn how to read regression output
  – Wooldridge Chapter 4
  – Studenmund Chapter 5.1-5.4

Multiple Regression Analyses: Inference
• Statistical inference in the regression model
  – Hypothesis tests about population parameters
  – Construction of confidence intervals
• Sampling distributions of the OLS estimators
  – The OLS estimators are random variables
  – We already know their expected values and their variances
  – For hypothesis testing we need to know their distribution

Inference: Sampling distributions of the OLS Estimators
• Assumption 6 (Normality of error terms)
  $u \sim \mathrm{Normal}(0, \sigma^2)$, independently of $x_1, x_2, \dots, x_k$
  It is assumed that the unobserved factors are normally distributed around the population regression function. The form and the variance of the distribution do not depend on any of the explanatory variables.
• Give example of impact of age on fighters' performance

Show normality of the error terms in GRETL
• Open GRETL and load the sample data "Engel"
• Run the regression:
    ols foodexp income const
• Generate residuals:
    series exphat = $yhat
    genr resid = foodexp - exphat
  or
    genr resid = foodexp - ($coeff(const) + $coeff(income)*income)
• Display the distribution of the residuals:
    freq resid --plot=display

• Discussion of the normality assumption
• The error term is the sum of "many" different unobserved factors
• Sums of many independent factors are approximately normally distributed (CLT)
• Problems:
  – How many different factors? Is the number of observations large enough?
  – Possibly very heterogeneous distributions of individual factors
  – How independent are the different factors?
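The CLT argument in the bullets above (a sum of many independent factors is approximately normal) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the "factors" are hypothetical independent exponential draws, the seed is fixed, and the helper names `skewness` and `simulate_error_terms` are illustrative and not part of the lecture's gretl workflow. Skewness close to zero is used here as a rough indicator of symmetry, not as a formal normality test.

```python
import random
import statistics


def skewness(values):
    """Sample skewness (third standardized moment); 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def simulate_error_terms(n_factors, n_draws=10000, seed=0):
    """Each simulated error term is the sum of n_factors independent, skewed
    (exponential) factors, each centered to have mean zero (assumption)."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) - 1.0 for _ in range(n_factors))
        for _ in range(n_draws)
    ]


# Skewness of the simulated error term for 1, 5 and 40 underlying factors
skew_by_factors = {k: skewness(simulate_error_terms(k)) for k in (1, 5, 40)}
for k, sk in skew_by_factors.items():
    print(f"{k:>2} factors: skewness = {sk:.2f}")
```

With more factors per error term, the skewness should shrink toward zero, which is the sense in which the error distribution becomes "close" to normal. The simulation assumes independent factors with identical distributions, which is exactly what the "Problems" bullets call into question for real data.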
• The normality of the error term is an empirical question
• At least the error distribution should be "close" to normal
• In many cases, normality is questionable or impossible by definition

Show in gretl normality of residuals.

The central limit theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement (a common rule of thumb is n > 30), then the distribution of the sample means will be approximately normal with mean equal to the population mean, regardless of whether the original population distribution was normal.

• Discussion of the normality assumption (cont.)
• Examples where normality cannot hold:
  – Wages (nonnegative; also: minimum wage)
  – Unemployment (indicator variable, takes on only 1 or 0)
• In some cases, normality can be achieved through transformations of the dependent variable
• Under normality, OLS is the best unbiased estimator, even among nonlinear estimators
• Important: For the purposes of statistical inference, the assumption of normality can be replaced by a large sample size (CLT)

CLT
The central limit theorem produces approximately normal sampling distributions in this histogram.
Source: https://statisticsbyjim.com/basics/central-limit-theorem/
[Figure: histogram of a skewed population and of the sampling distributions of the mean for n = 5, 20 and 40; image not recovered]
In the graph, the gray color shows the skewed distribution of the values in the population. The other colors represent the sampling distributions of the means for different sample sizes. Red shows the distribution of means when the sample size is 5, blue denotes a sample size of 20, and green a sample size of 40. The red curve (n = 5) is still a bit skewed, but the blue and green curves (n = 20 and n = 40) are not visibly skewed.
As the sample size increases, the sampling distributions more closely approximate the normal distribution and become more tightly clustered around the population mean, just as the central limit theorem states.

Multiple Regression Analyses: Hypothesis Testing
• We cannot prove that a given hypothesis is "correct" using hypothesis testing
• All we can do is to state that a particular sample conforms to a particular hypothesis
• We can often reject a given hypothesis with a certain degree of confidence
• In such a case, we conclude that it is very unlikely the sample result would have been observed if the hypothesized theory were correct

• Step 1: state explicitly the hypothesis to be tested
• Null hypothesis: statement of the range of values of the regression coefficient that would be expected to occur if the researcher's theory were not correct
• Alternative hypothesis: specification of the range of values of the coefficient that would be expected to occur if the researcher's theory were correct
• In other words, we define the null hypothesis as the result we do not expect

Type I and Type II Errors
• It would be unrealistic to think that conclusions drawn from regression analysis will always be right
• There are two types of errors we can make:
  – Type I: we reject a true null hypothesis
  – Type II: we fail to reject a false null hypothesis

Example:
• H0: The defendant is innocent
• HA: The defendant is guilty
  – Type I error: sending an innocent person to jail
  – Type II error: freeing a guilty person
• For a given sample size, lowering the probability of a Type I error means increasing the probability of a Type II error
• In hypothesis testing, we focus on the Type I error and ensure that its probability is not unreasonably large

[Figure: picture accompanying the Type I and Type II errors slide; image not recovered]

Inference: The t Test
• Testing hypotheses about a single population parameter
• Theorem
(t-distribution for standardized estimators)
  Under assumptions 1-6:
  $\dfrac{\hat\beta_j - \beta_j}{\mathrm{se}(\hat\beta_j)} \sim t_{n-k-1}$
  If the standardization is done using the estimated standard deviation (= standard error), the normal distribution is replaced by a t-distribution.
  Note: The t-distribution is close to the standard normal distribution if n-k-1 is large.
• Null hypothesis (for more general hypotheses, see below)
  $H_0: \beta_j = 0$
  The population parameter is equal to zero, i.e. after controlling for the other independent variables, there is no effect of $x_j$ on $y$.

• t-statistic (or t-ratio)
  $t_{\hat\beta_j} = \dfrac{\hat\beta_j}{\mathrm{se}(\hat\beta_j)}$
  The t-statistic will be used to test the above null hypothesis. The farther the estimated coefficient is away from zero, the less likely it is that the null hypothesis holds true. But what does "far" away from zero mean? This depends on the variability of the estimated coefficient, i.e. its standard deviation. The t-statistic measures how many estimated standard deviations the estimated coefficient is away from zero.
• Distribution of the t-statistic if the null hypothesis is true
  $t_{\hat\beta_j} \sim t_{n-k-1}$
• Goal: Define a rejection rule so that, if H0 is true, it is rejected only with a small probability (= significance level, e.g. 5%)

• Testing against one-sided alternatives (greater than zero)
  Test $H_0: \beta_j = 0$ against $H_1: \beta_j > 0$.
  Reject the null hypothesis in favour of the alternative hypothesis if the estimated coefficient is "too large" (i.e. larger than a critical value). Construct the critical value so that, if the null hypothesis is true, it is rejected in, for example, 5% of the cases. In the given example, this is the point of the t-distribution with 28 degrees of freedom that is exceeded in 5% of the cases.
  Reject if the t-statistic is greater than 1.701

• Example: Wage equation
• Test whether, after controlling for education and tenure, higher work experience leads to higher hourly wages
  [Estimated wage equation with standard errors: image not recovered]
  Test the null hypothesis that the coefficient on experience is zero against the alternative that it is greater than zero.
  One would either expect a positive effect of experience on hourly wage or no effect at all.

• Example: Wage equation (cont.)
  [t-statistic, degrees of freedom and critical values: images not recovered]
  – Degrees of freedom: here the standard normal approximation applies
  – Critical values for the 5% and the 1% significance level (these are conventional significance levels)
  – The null hypothesis is rejected because the t-statistic exceeds the critical value
  "The effect of experience on hourly wage is statistically greater than zero at the 5% (and even at the 1%) significance level."

• Testing against one-sided alternatives (less than zero)
  Test $H_0: \beta_j = 0$ against $H_1: \beta_j < 0$.
  Reject the null hypothesis in favour of the alternative hypothesis if the estimated coefficient is "too small" (i.e. smaller than a critical value). Construct the critical value so that, if the null hypothesis is true, it is rejected in, for example, 5% of the cases. In the given example, this is the point of the t-distribution with 18 degrees of freedom so that 5% of the cases are below the point.
  Reject if the t-statistic is less than -1.734

• Example: Student performance and school size
• Test whether smaller school size leads to better student performance
  [Estimated equation with standard errors: image not recovered]
  Variables: percentage of students passing maths test; average annual teacher compensation; staff per one thousand students; school enrollment (= school size)
  Test the null hypothesis that the coefficient on enrollment is zero against the alternative that it is less than zero.
  Do larger schools hamper student performance or is there no such effect?

• Example: Student performance and school size (cont.)
  [t-statistic, degrees of freedom and critical values: images not recovered]
  – Degrees of freedom: here the standard normal approximation applies
  – Critical values for the 5% and the 15% significance level
  – The null hypothesis is not rejected because the t-statistic is not smaller than the critical value
  One cannot reject the hypothesis that there is no effect of school size on student performance (not even at the larger significance level of 15%).

• Example: Student performance and school size (cont.)
• Alternative specification of functional form:
  [Estimated equation: image not recovered]
  R-squared slightly higher
  Test the null hypothesis of no enrollment effect against the alternative of a negative effect.

• Example: Student performance and school size (cont.)
  [t-statistic and critical value for the 5% significance level: images not recovered] → reject null hypothesis
  The hypothesis that there is no effect of school size on student performance can be rejected in favor of the hypothesis that the effect is negative.
  How large is the effect? +10% enrollment → -0.129 percentage points of students passing the test (small effect)
  The interpretation depends on the functional form:
  – linear-log: if x increases by 1%, y changes by approximately β/100 units
  – log-linear: if x increases by 1 unit, y changes by approximately 100·β percent

• Testing against two-sided alternatives
  Test $H_0: \beta_j = 0$ against $H_1: \beta_j \neq 0$.
  Reject the null hypothesis in favour of the alternative hypothesis if the absolute value of the estimated coefficient is too large. Construct the critical value so that, if the null hypothesis is true, it is rejected in, for example, 5% of the cases. In the given example, these are the points of the t-distribution so that 5% of the cases lie in the two tails.
  Reject if the t-statistic is less than -2.06 or greater than 2.06, i.e. if its absolute value exceeds 2.06

• Example: Determinants of college GPA
  [Estimated equation with standard errors: image not recovered]
  skipped = lectures missed per week
  For critical values, use the standard normal distribution.
  The effects of hsGPA and skipped are significantly different from zero at the 1% significance level. The effect of ACT is not significantly different from zero, not even at the 10% significance level.

• "Statistically significant" variables in a regression
• If a regression coefficient is significantly different from zero in a two-sided test, the corresponding variable is said to be "statistically significant"
• If the number of degrees of freedom is large enough so that the normal approximation applies, the following rules of thumb apply:
  – $|t| > 1.645$: "statistically significant at 10% level"
  – $|t| > 1.96$: "statistically significant at 5% level"
  – $|t| > 2.576$: "statistically significant at 1% level"

• Guidelines for discussing economic and statistical significance
• If a variable is statistically significant, discuss the magnitude of the coefficient to get an idea of its economic or practical importance
• The fact that a coefficient is statistically significant does not necessarily mean it is economically or practically significant!
• If a variable is statistically and economically important but has the "wrong" sign, the regression model might be misspecified
• If a variable is statistically insignificant at the usual levels (10%, 5%, 1%), one may think of dropping it from the regression
• If the sample size is small, effects might be imprecisely estimated so that the case for dropping insignificant variables is less strong

• Testing more general hypotheses about a regression coefficient
• Null hypothesis
  $H_0: \beta_j = a_j$, where $a_j$ is the hypothesized value of the coefficient
• t-statistic
  $t = \dfrac{\hat\beta_j - a_j}{\mathrm{se}(\hat\beta_j)}$
• The test works exactly as before, except that the hypothesized value is subtracted from the estimate when forming the statistic

• Example: Campus crime and enrollment
• An interesting hypothesis is whether crime increases by one percent if enrollment is increased by one percent
  [Estimated equation, hypotheses and t-statistic: images not recovered]
  The estimate is different from one, but is this difference statistically significant?
  The hypothesis is rejected at the 5% level.

• Computing p-values for t-tests
• If the significance level is made smaller and smaller, there will be a point where the null hypothesis cannot be rejected anymore
• The reason is that, by lowering the significance level, one becomes less and less willing to risk rejecting a correct H0
• The smallest significance level at which the null hypothesis is still rejected is called the p-value of the hypothesis test
• A small p-value is evidence against the null hypothesis because one would reject the null hypothesis even at small significance levels
• A large p-value means the data provide little evidence against the null hypothesis
• P-values are more informative than tests at fixed significance levels

• How the p-value is computed (here: two-sided test)
  The p-value is the significance level at which one is indifferent between rejecting and not rejecting the null hypothesis.
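A minimal sketch of a two-sided p-value, computed as the probability that the test statistic's distribution produces a larger absolute value than the realized t-statistic. It assumes the large-sample case, so the standard normal distribution stands in for the t-distribution; with few degrees of freedom the t-distribution should be used instead. The helper names and the t-statistic of 1.85 are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(t_stat):
    """P(|Z| > |t_stat|) for a standard normal Z (large-sample approximation)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


def reject(t_stat, alpha):
    """Reject H0 if and only if the p-value is smaller than the significance level."""
    return two_sided_p_value(t_stat) < alpha


t_stat = 1.85  # hypothetical realized value of the test statistic
p = two_sided_p_value(t_stat)
print(f"p-value = {p:.4f}")
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: reject H0? {reject(t_stat, alpha)}")
```

Comparing the p-value with each significance level reproduces the decision of the corresponding critical-value test, which is why reporting the p-value conveys more than a single reject/do-not-reject statement.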
  In the two-sided case, the p-value is thus the probability that the t-distributed variable takes on a larger absolute value than the realized value of the test statistic, e.g.:
  [Figure: t-distribution with the value of the test statistic and the critical values for a 5% significance level marked; image and numerical example not recovered]
  From this, it is clear that a null hypothesis is rejected if and only if the corresponding p-value is smaller than the significance level. For example, for a significance level of 5% the t-statistic would not lie in the rejection region.

Inference: Confidence Intervals
• Confidence intervals
• Simple manipulation of the result in Theorem 4.2 implies that
  $P\big(\hat\beta_j - c_{0.05}\,\mathrm{se}(\hat\beta_j) \le \beta_j \le \hat\beta_j + c_{0.05}\,\mathrm{se}(\hat\beta_j)\big) = 0.95$
  Here $\hat\beta_j - c_{0.05}\,\mathrm{se}(\hat\beta_j)$ is the lower bound of the confidence interval, $\hat\beta_j + c_{0.05}\,\mathrm{se}(\hat\beta_j)$ is the upper bound, $c_{0.05}$ is the critical value of the two-sided test, and 0.95 is the confidence level.
• Interpretation of the confidence interval
  – The bounds of the interval are random
  – In repeated samples, the interval that is constructed in the above way will cover the population regression coefficient in 95% of the cases

• Confidence intervals for typical confidence levels
  $P\big(\hat\beta_j - c_{0.01}\,\mathrm{se}(\hat\beta_j) \le \beta_j \le \hat\beta_j + c_{0.01}\,\mathrm{se}(\hat\beta_j)\big) = 0.99$
  $P\big(\hat\beta_j - c_{0.05}\,\mathrm{se}(\hat\beta_j) \le \beta_j \le \hat\beta_j + c_{0.05}\,\mathrm{se}(\hat\beta_j)\big) = 0.95$
  $P\big(\hat\beta_j - c_{0.10}\,\mathrm{se}(\hat\beta_j) \le \beta_j \le \hat\beta_j + c_{0.10}\,\mathrm{se}(\hat\beta_j)\big) = 0.90$
  Use rules of thumb: $c_{0.01} = 2.576$, $c_{0.05} = 1.96$, $c_{0.10} = 1.645$
• Relationship between confidence intervals and hypotheses tests
  If $a_j$ lies outside the interval, reject $H_0: \beta_j = a_j$ in favor of $H_1: \beta_j \neq a_j$

• Example: Model of firms' R&D expenditures
  [Estimated equation and confidence intervals: images not recovered; a surviving fragment of the equation reads 0.0217 (0.0128)]
  Variables: spending on R&D; annual sales; profits as percentage of sales
  – The effect of sales on R&D is relatively precisely estimated as the interval is narrow. Moreover, the effect is significantly different from zero because zero is outside the interval.
  – The effect of profits is imprecisely estimated as the interval is very wide. It is not even statistically significant because zero lies in the interval.

Inference: Testing hypotheses about a linear combination of parameters
• Example: Return to education at 2-year vs.
at 4-year colleges
  [Model: image not recovered]
  Regressors: years of education at 2-year colleges (coefficient written here as $\beta_1$); years of education at 4-year colleges (coefficient written here as $\beta_2$)
  Test $H_0: \beta_1 - \beta_2 = 0$ against $H_1: \beta_1 - \beta_2 < 0$.
  A possible test statistic would be:
  $t = \dfrac{\hat\beta_1 - \hat\beta_2}{\mathrm{se}(\hat\beta_1 - \hat\beta_2)}$
  The difference between the estimates is normalized by the estimated standard deviation of the difference. The null hypothesis would have to be rejected if the statistic is "too negative" to believe that the true difference between the parameters is equal to zero.

• Impossible to compute with standard regression output because
  $\mathrm{se}(\hat\beta_1 - \hat\beta_2) = \sqrt{\widehat{\mathrm{Var}}(\hat\beta_1) + \widehat{\mathrm{Var}}(\hat\beta_2) - 2\,\widehat{\mathrm{Cov}}(\hat\beta_1, \hat\beta_2)}$
  and the covariance term is usually not available in regression output
• Alternative method
  Define $\theta_1 = \beta_1 - \beta_2$ and test $H_0: \theta_1 = 0$ against $H_1: \theta_1 < 0$.
  Insert $\beta_1 = \theta_1 + \beta_2$ into the original regression; this creates a new regressor (= total years of college)

• Estimation results
  [Estimated equation including total years of college: image not recovered]
  Hypothesis is rejected at 10% level but not at 5% level
• This method always works for single linear hypotheses

Inference: The F Test
• Testing multiple linear restrictions: The F-test
• Testing exclusion restrictions
  [Model: image not recovered]
  Variables: salary of major league baseball player; years in the league; average number of games per year; batting average; home runs per year; runs batted in per year
  Test H0: the coefficients on batting average, home runs per year and runs batted in per year are all zero, against H1: H0 is not true.
  Test whether performance measures have no effect, i.e. can be excluded from the regression.

• Estimation of the unrestricted model
  [Estimated unrestricted model: image not recovered]
  None of these variables is statistically significant when tested individually
  Idea: How would the model fit change if these variables were dropped from the regression?
• Estimation of the restricted model
  [Estimated restricted model: image not recovered]
  The sum of squared residuals necessarily increases, but is the increase statistically significant?
• Test statistic
  $F = \dfrac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)} \sim F_{q,\,n-k-1}$
  where $q$ is the number of restrictions
  The relative increase of the sum of squared residuals when going from H1 to H0 follows an F-distribution (if the null hypothesis H0 is correct)

• Rejection rule
  An F-distributed variable only takes on nonnegative values. This corresponds to the fact that the sum of squared residuals can only increase if one moves from H1 to H0.
  Choose the critical value so that the null hypothesis is rejected in, for example, 5% of the cases, although it is true.

• Test decision in example
  [F-statistic and critical values: images not recovered]
  – Inputs: number of restrictions to be tested; degrees of freedom in the unrestricted model
  The null hypothesis is overwhelmingly rejected (even at very small significance levels).
• Discussion
  – The three variables are "jointly significant"
  – They were not significant when tested individually
  – The likely reason is multicollinearity between them

• Test of overall significance of a regression
  $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
  The null hypothesis states that the explanatory variables are not useful at all in explaining the dependent variable
  Restricted model (regression on constant): $y = \beta_0 + u$
  $F = \dfrac{R^2/k}{(1-R^2)/(n-k-1)} \sim F_{k,\,n-k-1}$
• The test of overall significance is reported in most regression packages; the null hypothesis is usually overwhelmingly rejected
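The two F statistics discussed above can be computed directly from regression output. This is a minimal sketch under stated assumptions: the function names are illustrative, and the SSR values, sample size and number of regressors are illustrative inputs that are not read from the slide images. The Python standard library has no F-distribution, so the sketch stops at the test statistic; the critical value or p-value would come from a table or from the regression package.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, q, df_unrestricted):
    """F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)].

    q is the number of restrictions; df_unrestricted is n - k - 1 in the
    unrestricted model.
    """
    if ssr_restricted < ssr_unrestricted:
        raise ValueError("restricted SSR cannot be smaller than unrestricted SSR")
    return ((ssr_restricted - ssr_unrestricted) / q) / (
        ssr_unrestricted / df_unrestricted
    )


def overall_f_statistic(r_squared, k, n):
    """F statistic for H0: all k slope coefficients are zero, from R-squared."""
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))


# Illustrative inputs (assumed for this sketch, not taken from the slides)
ssr_r, ssr_ur = 198.311, 183.186
n, k, q = 353, 5, 3

f_excl = f_statistic(ssr_r, ssr_ur, q, n - k - 1)
print(f"F statistic for the exclusion restrictions: {f_excl:.2f}")
```

The overall-significance statistic is the same formula applied to the special case where the restricted model contains only a constant, so that the restricted SSR equals the total sum of squares; writing it in terms of R-squared is just a convenience.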