Econometrics
Multiple Regression Analysis: Statistical Inference
Anna Donina, Lecture 4

Sample regression function vs. population regression function (recap figure)

Classical Assumptions
1. Linearity
2. Random sampling
3. No perfect collinearity
4. Zero conditional mean
5. Homoskedasticity
6. Normality of the error term
▪ OLS is unbiased under assumptions (1)-(4)
▪ Gauss-Markov theorem: OLS is BLUE under assumptions (1)-(5)

Today's Lecture
• We are going to discuss how hypotheses about coefficients can be tested in regression models
• We will explain what the significance of coefficients means
• We will learn how to read regression output
– Readings: Wooldridge, Chapter 4; Studenmund, Chapters 5.1-5.4

Multiple Regression Analysis: Inference
• Statistical inference in the regression model
▪ Hypothesis tests about population parameters
▪ Construction of confidence intervals
• Sampling distributions of the OLS estimators
▪ The OLS estimators are random variables
▪ We already know their expected values and their variances
▪ For hypothesis testing we need to know their distribution

Inference: Sampling Distributions of the OLS Estimators
• Assumption 6 (normality of error terms): $u \sim \mathrm{Normal}(0, \sigma^2)$, independently of the explanatory variables
▪ The unobserved factors are assumed to be normally distributed around the population regression function; the form and the variance of the distribution do not depend on any of the explanatory variables
• Discussion of the normality assumption
▪ The error term is the sum of "many" different unobserved factors
▪ Sums of independent factors are normally distributed (CLT)
▪ Problems: How many different factors are there, and is their number large enough? The distributions of the individual factors may be very heterogeneous. How independent are the different factors?
▪ The normality of the error term is an empirical question
▪ At least the error distribution should be "close" to normal
▪ In many cases, normality is questionable or impossible by definition
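Since the normality of the error term is an empirical question, one can inspect the OLS residuals directly. Below is a minimal sketch (simulated data; numpy and scipy only; all numbers and names are illustrative, not from the lecture) that fits a regression by least squares and applies the Jarque-Bera normality test to the residuals:

```python
# Minimal sketch: check residual normality with the Jarque-Bera test.
# Simulated data with deliberately skewed (non-normal) errors.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
u = rng.exponential(1.0, size=n) - 1.0      # skewed errors, mean zero
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])        # design matrix with intercept
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

jb_stat, jb_pval = stats.jarque_bera(resid)
print(f"Jarque-Bera = {jb_stat:.1f}, p-value = {jb_pval:.4f}")
# A tiny p-value is evidence against normal errors; with a sample this
# large, inference can still lean on the CLT approximation.
```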
• Discussion of the normality assumption (cont.)
▪ Examples where normality cannot hold:
  – Wages (nonnegative; also: minimum wage)
  – Unemployment (an indicator variable, takes on only the values 0 and 1)
▪ In some cases, normality can be achieved through transformations of the dependent variable
▪ Under normality, OLS is the best (even nonlinear) unbiased estimator
▪ Important: for the purposes of statistical inference, the assumption of normality can be replaced by a large sample size (CLT)

Inference: Sampling Distributions of the OLS Estimators, CLT
[CLT illustration; source: https://statisticsbyjim.com/basics/central-limit-theorem/]

Multiple Regression Analysis: Hypothesis Testing
• We cannot prove that a given hypothesis is "correct" using hypothesis testing
• All we can do is state that a particular sample conforms to a particular hypothesis
• We can often reject a given hypothesis with a certain degree of confidence
• In such a case, we conclude that it is very unlikely the sample result would have been observed if the hypothesized theory were correct

• Step 1: state explicitly the hypothesis to be tested
▪ Null hypothesis: statement of the range of values of the regression coefficient that would be expected to occur if the researcher's theory were not correct
▪ Alternative hypothesis: specification of the range of values of the coefficient that would be expected to occur if the researcher's theory were correct
▪ In other words, we define the null hypothesis as the result we do not expect

• Step 2: set the significance level (α)
▪ α is the probability of rejecting the null hypothesis when it is actually true (i.e. of wrongly supporting the alternative)
▪ The smaller the significance level, the greater the burden of proof needed to reject the null hypothesis, or in other words, to support the alternative hypothesis

Type I and Type II Errors
• It would be unrealistic to think that conclusions drawn from regression analysis will always be right
• There are two types of errors we can make:
– Type I: we reject a true null hypothesis
– Type II: we fail to reject a false null hypothesis
• Example: $H_0$: the defendant is innocent; $H_A$: the defendant is guilty
– Type I error: sending an innocent person to jail
– Type II error: freeing a guilty person
• Lowering the probability of a Type I error means increasing the probability of a Type II error
• In hypothesis testing, we focus on the Type I error and ensure that its probability is not unreasonably large

Inference: The t Test
• Testing hypotheses about a single population parameter
• Theorem 4.2 (t-distribution for the standardized estimators): under assumptions (1)-(6),
  $(\hat\beta_j - \beta_j)/\mathrm{se}(\hat\beta_j) \sim t_{n-k-1}$
▪ If the standardization is done using the estimated standard deviation (= the standard error), the normal distribution is replaced by a t-distribution
▪ Note: the t-distribution is close to the standard normal distribution if $n-k-1$ is large
• Null hypothesis: $H_0: \beta_j = 0$
▪ The population parameter is equal to zero, i.e. after controlling for the other independent variables, there is no effect of $x_j$ on $y$
• t-statistic (or t-ratio): $t_{\hat\beta_j} = \hat\beta_j/\mathrm{se}(\hat\beta_j)$
▪ The t-statistic will be used to test the above null hypothesis; if the null hypothesis is true, it follows the $t_{n-k-1}$ distribution
• Goal: define a rejection rule so that, if $H_0$ is true, it is rejected only with a small probability (= the significance level, e.g. 5%)
• The farther the estimated coefficient is away from zero, the less likely it is that the null hypothesis holds true
• But what does "far" away from zero mean? This depends on the variability of the estimated coefficient, i.e. its standard deviation: the t-statistic measures how many estimated standard deviations the estimated coefficient is away from zero
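As a quick illustration of these mechanics, here is a hedged sketch (scipy; the estimate, standard error and degrees of freedom are hypothetical) that forms the t-ratio and looks up a critical value of the t-distribution:

```python
# Minimal sketch: t-ratio for H0: beta_j = 0 and a one-sided critical value.
from scipy import stats

beta_hat, se, df = 0.50, 0.21, 28        # hypothetical estimate, se, n-k-1
t_stat = beta_hat / se                   # standard errors away from zero
crit = stats.t.ppf(0.95, df)             # ~1.701: upper-tail 5% critical value
print(t_stat, crit, t_stat > crit)       # reject H0 against beta_j > 0?
```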
Inference: The t Test
• Testing against one-sided alternatives (greater than zero)
▪ Test $H_0: \beta_j = 0$ against $H_1: \beta_j > 0$
▪ Reject the null hypothesis in favour of the alternative hypothesis if the estimated coefficient is "too large" (i.e. larger than a critical value)
▪ Construct the critical value so that, if the null hypothesis is true, it is rejected in, for example, 5% of the cases
▪ In the given example, this is the point of the t-distribution with 28 degrees of freedom that is exceeded in 5% of the cases → reject if the t-statistic is greater than 1.701

• Example: wage equation
▪ Test whether, after controlling for education and tenure, higher work experience leads to higher hourly wages
▪ Test $H_0: \beta_{exper} = 0$ against $H_1: \beta_{exper} > 0$: one would either expect a positive effect of experience on hourly wage or no effect at all
▪ With the degrees of freedom available here, the standard normal approximation applies; the t-statistic exceeds the critical values for both the 5% and the 1% significance levels (these are the conventional significance levels), so the null hypothesis is rejected
▪ Conclusion: "The effect of experience on hourly wage is statistically greater than zero at the 5% (and even at the 1%) significance level."

• Testing against one-sided alternatives (less than zero)
▪ Test $H_0: \beta_j = 0$ against $H_1: \beta_j < 0$
▪ Reject the null hypothesis in favour of the alternative hypothesis if the estimated coefficient is "too small" (i.e. smaller than a critical value)
▪ Construct the critical value so that, if the null hypothesis is true, it is rejected in, for example, 5% of the cases
▪ In the given example, this is the point of the t-distribution with 18 degrees of freedom below which 5% of the cases lie → reject if the t-statistic is less than −1.734

• Example: student performance and school size
▪ Regress the percentage of students passing a maths test on average annual teacher compensation, staff per one thousand students, and school enrollment (= school size)
▪ Test $H_0: \beta_{enroll} = 0$ against $H_1: \beta_{enroll} < 0$: do larger schools hamper student performance, or is there no such effect?
▪ The t-statistic is not smaller than the critical value, so the null hypothesis is not rejected: one cannot reject the hypothesis that there is no effect of school size on student performance (not even at a larger significance level of 15%)

• Example: student performance and school size (cont.)
▪ Alternative specification of functional form: take logarithms of the explanatory variables; the R-squared is slightly higher
▪ Test $H_0: \beta_{\log(enroll)} = 0$ against $H_1: \beta_{\log(enroll)} < 0$
▪ The t-statistic is now smaller than the 5% critical value → reject the null hypothesis: the hypothesis that there is no effect of school size on student performance can be rejected in favor of the hypothesis that the effect is negative
▪ How large is the effect? A 10% increase in enrollment is associated with a decrease of about 0.129 percentage points in the share of students passing the test (a small effect)
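The lower-tail decision used in the school-size example can be sketched the same way (scipy; the t-statistic and degrees of freedom below are hypothetical):

```python
# Minimal sketch: one-sided test against H1: beta_j < 0.
from scipy import stats

t_stat, df = -2.10, 18              # hypothetical t-ratio and n-k-1
crit = stats.t.ppf(0.05, df)        # ~ -1.734: lower-tail 5% critical value
p_one = stats.t.cdf(t_stat, df)     # one-sided p-value
print(t_stat < crit, p_one)         # reject H0 in favour of beta_j < 0?
```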
Inference: The t Test
• Testing against two-sided alternatives
▪ Test $H_0: \beta_j = 0$ against $H_1: \beta_j \neq 0$
▪ Reject the null hypothesis in favour of the alternative hypothesis if the absolute value of the estimated coefficient is too large
▪ Construct the critical value so that, if the null hypothesis is true, it is rejected in, for example, 5% of the cases; in the given example, these are the points of the t-distribution such that 5% of the cases lie in the two tails → reject if the t-statistic is less than −2.06 or greater than 2.06, i.e. if $|t| > 2.06$

• Example: determinants of college GPA (skipped = lectures missed per week)
▪ For the critical values, use the standard normal distribution
▪ The effects of hsGPA and skipped are significantly different from zero at the 1% significance level; the effect of ACT is not significantly different from zero, not even at the 10% significance level

• "Statistically significant" variables in a regression
▪ If a regression coefficient is different from zero in a two-sided test, the corresponding variable is said to be "statistically significant"
▪ If the number of degrees of freedom is large enough so that the normal approximation applies, the following rules of thumb apply: $|t| > 1.645$ → statistically significant at the 10% level; $|t| > 1.96$ → at the 5% level; $|t| > 2.576$ → at the 1% level

• Guidelines for discussing economic and statistical significance
▪ If a variable is statistically significant, discuss the magnitude of the coefficient to get an idea of its economic or practical importance
▪ The fact that a coefficient is statistically significant does not necessarily mean it is economically or practically significant!
▪ If a variable is statistically and economically important but has the "wrong" sign, the regression model might be misspecified
▪ If a variable is statistically insignificant at the usual levels (10%, 5%, 1%), one may think of dropping it from the regression
▪ If the sample size is small, effects might be imprecisely estimated, so the case for dropping insignificant variables is less strong

• Testing more general hypotheses about a regression coefficient
▪ Null hypothesis: $H_0: \beta_j = a_j$, where $a_j$ is the hypothesized value of the coefficient
▪ t-statistic: $t = (\hat\beta_j - a_j)/\mathrm{se}(\hat\beta_j)$
▪ The test works exactly as before, except that the hypothesized value is subtracted from the estimate when forming the statistic

• Example: campus crime and enrollment
▪ An interesting hypothesis is whether crime increases by one percent if enrollment is increased by one percent, i.e. $H_0: \beta_{\log(enroll)} = 1$
▪ The estimate is different from one, but is this difference statistically significant? Forming the t-statistic with the hypothesized value of one, the hypothesis is rejected at the 5% level
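A sketch of the same calculation with a nonzero hypothesized value, in the spirit of the campus-crime example (scipy; all numbers are hypothetical):

```python
# Minimal sketch: test H0: beta_j = 1 against a two-sided alternative.
from scipy import stats

beta_hat, se, df = 1.27, 0.11, 95        # hypothetical estimate, se, n-k-1
a = 1.0                                  # hypothesized value under H0
t_stat = (beta_hat - a) / se             # subtract the hypothesized value
p_two = 2 * stats.t.sf(abs(t_stat), df)  # two-sided p-value
print(t_stat, p_two)
```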
Inference: The t Test
• Computing p-values for t-tests
▪ If the significance level is made smaller and smaller, there will be a point where the null hypothesis can no longer be rejected
▪ The reason is that, by lowering the significance level, one increasingly wants to avoid the error of rejecting a correct $H_0$
▪ The smallest significance level at which the null hypothesis is still rejected is called the p-value of the hypothesis test
▪ A small p-value is evidence against the null hypothesis, because one would reject it even at small significance levels
▪ A large p-value is evidence in favor of the null hypothesis
▪ p-values are more informative than tests at fixed significance levels

• How the p-value is computed (here: a two-sided test)
▪ The p-value is the significance level at which one is indifferent between rejecting and not rejecting the null hypothesis
▪ In the two-sided case, the p-value is thus the probability that the t-distributed variable takes on a larger absolute value than the realized value of the test statistic: $p = P(|T| > |t|)$
▪ From this it is clear that a null hypothesis is rejected if and only if the corresponding p-value is smaller than the significance level
▪ For example, if the p-value is larger than 5%, the t-statistic does not lie in the rejection region of a 5%-level test

Inference: Confidence Intervals
• Simple manipulation of the result in Theorem 4.2 implies that
  $P\big(\hat\beta_j - c \cdot \mathrm{se}(\hat\beta_j) \le \beta_j \le \hat\beta_j + c \cdot \mathrm{se}(\hat\beta_j)\big) = 0.95$,
  where $c$ is the critical value of the two-sided test and the confidence level is 95%; the lower and upper bounds of the confidence interval are $\hat\beta_j \pm c \cdot \mathrm{se}(\hat\beta_j)$
• Interpretation of the confidence interval
▪ The bounds of the interval are random
▪ In repeated samples, the interval constructed in this way will cover the population regression coefficient in 95% of the cases
• Confidence intervals for typical confidence levels: when the normal approximation applies, use the rules of thumb $c \approx 1.645$ (90%), $c \approx 1.96$ (95%), $c \approx 2.576$ (99%)
• Relationship between confidence intervals and hypothesis tests: reject $H_0: \beta_j = a_j$ in favor of $H_1: \beta_j \neq a_j$ at the 5% level if and only if $a_j$ lies outside the 95% confidence interval

• Example: model of firms' R&D expenditures
▪ Spending on R&D is explained by annual sales and by profits as a percentage of sales
▪ The effect of sales on R&D is relatively precisely estimated, as the interval is narrow; moreover, the effect is significantly different from zero, because zero is outside the interval
▪ The effect of the profit margin (coefficient 0.0217, standard error 0.0128) is imprecisely estimated, as the interval is very wide; it is not even statistically significant, because zero lies inside the interval

Inference: Three Ways to Conclude about the t Test
• Rejection region: no need to know the test statistic in order to determine the rejection region; the critical value is around two at the usual 5% level
• Confidence interval: interesting in its own right; no need to specify the hypothesized value first; problematic for one-tailed tests
• p-value: no need to specify the significance level in advance; the result is immediately seen for varying significance levels

Inference: Testing Hypotheses about a Linear Combination of Parameters
• Example: return to education at 2-year vs. 4-year colleges
▪ $\beta_1$: years of education at 2-year colleges; $\beta_2$: years of education at 4-year colleges
▪ Test $H_0: \beta_1 = \beta_2$ against $H_1: \beta_1 < \beta_2$
▪ A possible test statistic would be $t = (\hat\beta_1 - \hat\beta_2)/\mathrm{se}(\hat\beta_1 - \hat\beta_2)$: the difference between the estimates is normalized by the estimated standard deviation of the difference
▪ The null hypothesis would have to be rejected if the statistic is "too negative" to believe that the true difference between the parameters is equal to zero
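Both the confidence intervals above and this difference-in-parameters test can be reproduced with statsmodels. The sketch below uses simulated data; the names jc, univ and lwage and all numbers are illustrative. The covariance term that standard regression output omits is available from cov_params(), and t_test() runs the linear-combination test directly (note it reports a two-sided p-value, while the alternative here is one-sided):

```python
# Minimal sketch: confidence intervals and a test of beta_jc = beta_univ.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 1000
dat = pd.DataFrame({"jc": rng.poisson(1.0, n).astype(float),
                    "univ": rng.poisson(2.0, n).astype(float)})
dat["lwage"] = (1.5 + 0.07 * dat["jc"] + 0.10 * dat["univ"]
                + rng.normal(0.0, 0.4, n))

res = smf.ols("lwage ~ jc + univ", data=dat).fit()
print(res.conf_int())                       # 95% confidence intervals

V = res.cov_params()                        # covariance matrix of estimates
se_diff = np.sqrt(V.loc["jc", "jc"] + V.loc["univ", "univ"]
                  - 2.0 * V.loc["jc", "univ"])
t_stat = (res.params["jc"] - res.params["univ"]) / se_diff
print(t_stat)                               # linear-combination t-statistic
print(res.t_test("jc - univ = 0"))          # same test, done automatically
```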
Inference: Testing Hypotheses about a Linear Combination of Parameters
• The statistic is impossible to compute with standard regression output, because
  $\mathrm{se}(\hat\beta_1 - \hat\beta_2) = \sqrt{\mathrm{se}(\hat\beta_1)^2 + \mathrm{se}(\hat\beta_2)^2 - 2\,\mathrm{Cov}(\hat\beta_1, \hat\beta_2)}$
  and the covariance of the estimates is usually not available in regression output
• Alternative method
▪ Define $\theta = \beta_1 - \beta_2$ and test $H_0: \theta = 0$ against $H_1: \theta < 0$
▪ Insert $\beta_1 = \theta + \beta_2$ into the original regression; this creates a new regressor (= total years of college)
• Estimation results: the hypothesis is rejected at the 10% level but not at the 5% level
• This method always works for single linear hypotheses

Estimation: Goodness-of-Fit Measure
How well does the model fit our data? (The goal is to end up with a single number, ideally expressed as a percentage.)
• Total sum of squares: $SST = \sum_{i=1}^n (y_i - \bar y)^2$
• Explained sum of squares: $SSE = \sum_{i=1}^n (\hat y_i - \bar y)^2$
• Residual sum of squares: $SSR = \sum_{i=1}^n (y_i - \hat y_i)^2 = \sum_{i=1}^n \hat u_i^2$
• Important algebraic identity: $SST = SSE + SSR$ → a nice way of describing the goodness of fit of the model
• R-squared of the regression (or the coefficient of determination): $R^2 = SSE/SST = 1 - SSR/SST$
• Properties of $R^2$:
▪ $0 \le R^2 \le 1$
▪ $R^2 = 1$ only if $SSR = 0$, which means that all residuals are zero and all observations lie exactly on the regression line
▪ $R^2 = 0$ only if $SSE = 0$, which implies $\hat\beta_1 = 0$ and $\hat\beta_0 = \bar y$
▪ $R^2$ is the fraction of the sample variation in $y$ that is explained by $x$
▪ Notice that $R^2$ can only increase if another explanatory variable is added to the regression
▪ Alternative expression: $R^2$ is equal to the squared correlation coefficient between the actual and the predicted values of the dependent variable

Inference: The F Test
• Testing multiple linear restrictions: the F test
• Example: the salary of a major league baseball player is explained by years in the league, average number of games per year, batting average, home runs per year, and runs batted in per year
▪ Test whether the performance measures (batting average, home runs per year, runs batted in per year) have no effect / can be excluded from the regression
• Estimation of the unrestricted model: none of the three performance variables is statistically significant when tested individually
▪ Idea: how would the model fit be if these variables were dropped from the regression?
• Estimation of the restricted model: the sum of squared residuals necessarily increases, but is the increase statistically significant?
• Test statistic:
  $F = \dfrac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)}$,
  where $q$ is the number of restrictions; the relative increase of the sum of squared residuals when going from $H_1$ to $H_0$ follows an $F_{q,\,n-k-1}$ distribution if the null hypothesis $H_0$ is correct
• Rejection rule
▪ An F-distributed variable only takes on positive values; this corresponds to the fact that the sum of squared residuals can only increase if one moves from $H_1$ to $H_0$
▪ Choose the critical value so that the null hypothesis is rejected in, for example, 5% of the cases, although it is true
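The F-statistic mechanics can be sketched directly (scipy; the sums of squared residuals and sample sizes below are hypothetical stand-ins, not the lecture's estimates):

```python
# Minimal sketch: F test of q exclusion restrictions.
from scipy import stats

ssr_ur, ssr_r = 183.2, 198.3     # hypothetical SSR: unrestricted, restricted
n, k, q = 353, 5, 3              # observations, regressors, restrictions
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
crit = stats.f.ppf(0.95, q, n - k - 1)   # 5% critical value of F(q, n-k-1)
p_val = stats.f.sf(F, q, n - k - 1)
print(F, crit, p_val)            # reject H0 if F > crit (p_val < 0.05)
```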
Inference: The F Test
• Test decision in the example
▪ There are three restrictions to be tested ($q = 3$); the degrees of freedom are those of the unrestricted model ($n - k - 1$)
▪ The null hypothesis is overwhelmingly rejected (even at very small significance levels)
• Discussion
▪ The three variables are "jointly significant"
▪ They were not significant when tested individually
▪ The likely reason is multicollinearity between them

• Test of overall significance of a regression
▪ The null hypothesis states that the explanatory variables are not useful at all in explaining the dependent variable: $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
▪ The restricted model is a regression on a constant only
▪ The test of overall significance is reported in most regression packages; the null hypothesis is usually overwhelmingly rejected
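For this special case, the F statistic can also be written in terms of R-squared alone, since the restricted model (a regression on a constant) has $R^2 = 0$; this is the standard R-squared form of the overall-significance test. A minimal sketch with illustrative numbers (scipy):

```python
# Minimal sketch: overall-significance F test computed from R-squared.
from scipy import stats

r2, n, k = 0.63, 353, 5                   # hypothetical R^2, n, number of slopes
F = (r2 / k) / ((1 - r2) / (n - k - 1))   # F statistic for H0: all slopes = 0
p_val = stats.f.sf(F, k, n - k - 1)
print(F, p_val)
```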