Further Issues in Using OLS with Time Series Data
Ketevani Kapanadze
Brno, 2020

Stationary and Weakly Dependent Time Series
• The assumptions used so far are too restrictive for many time series applications
  • Strict exogeneity, homoskedasticity, and no serial correlation are very demanding requirements, especially in the time series context
  • Exact statistical inference rests on the validity of the normality assumption
  • Much weaker assumptions are needed if the sample size is large
  • A key requirement for large sample analysis of time series is that the series in question are stationary and weakly dependent
• Stationary time series
  • Loosely speaking, a time series is stationary if its stochastic properties and its temporal dependence structure do not change over time
• Stationary stochastic processes
  A stochastic process $\{x_t : t = 1, 2, \dots\}$ is stationary if, for every collection of indices $1 \le t_1 < t_2 < \dots < t_m$, the joint distribution of $(x_{t_1}, x_{t_2}, \dots, x_{t_m})$ is the same as that of $(x_{t_1+h}, x_{t_2+h}, \dots, x_{t_m+h})$ for all integers $h \ge 1$.
• Covariance stationary processes
  A stochastic process $\{x_t : t = 1, 2, \dots\}$ is covariance stationary if its expected value, its variance, and its covariances are constant over time: 1) $E(x_t) = \mu$, 2) $Var(x_t) = \sigma^2$, and 3) $Cov(x_t, x_{t+h}) = f(h)$, which depends only on $h$ and not on $t$.
• Weakly dependent time series
  A stochastic process $\{x_t : t = 1, 2, \dots\}$ is weakly dependent if $x_t$ is "almost independent" of $x_{t+h}$ as $h$ grows to infinity (for all $t$).
• Discussion of the weak dependence property
  • An implication of weak dependence is that the correlation between $x_t$ and $x_{t+h}$ must converge to zero as $h$ grows to infinity
  • For the LLN and the CLT to hold, the individual observations must not be too strongly related to each other; in particular, their relation must become weaker (and do so fast enough) the farther apart they are
  • Note that a series may be nonstationary but weakly dependent
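The definitions of covariance stationarity and weak dependence above can be illustrated numerically. The following is a minimal sketch, not part of the original slides: it assumes a hypothetical process $x_t = 0.5\,x_{t-1} + e_t$ with i.i.d. standard normal shocks and computes the sample correlation between $x_t$ and $x_{t+h}$ for several values of $h$. The function names `simulate_ar1` and `sample_autocorr`, the coefficient 0.5, the sample size, and the seed are all illustrative assumptions.

```python
import random


def simulate_ar1(n, rho, seed=0):
    """Simulate x_t = rho * x_{t-1} + e_t with i.i.d. N(0, 1) shocks (illustrative)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


def sample_autocorr(x, h):
    """Sample correlation between x_t and x_{t+h}."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h))
    return cov / var


series = simulate_ar1(n=20000, rho=0.5)
for h in (1, 2, 5, 10):
    # The correlation should shrink towards zero as the distance h grows.
    print(h, round(sample_autocorr(series, h), 3))
```

For this assumed process the population correlation is $0.5^h$, so the printed values should fall towards zero as $h$ grows, which is the behaviour weak dependence requires.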
• Examples for weakly dependent time series
  • Moving average process of order one (MA(1)): $x_t = e_t + \alpha_1 e_{t-1}$
    The process is a short moving average of an i.i.d. series $e_t$. It is weakly dependent because observations that are more than one time period apart have nothing in common and are therefore uncorrelated.
  • Autoregressive process of order one (AR(1)): $y_t = \rho_1 y_{t-1} + e_t$
    The process carries over, to a certain extent, the value of the previous period (plus random shocks from an i.i.d. series $e_t$). If the stability condition $|\rho_1| < 1$ holds, the process is weakly dependent because serial correlation converges to zero as the distance between observations grows to infinity.
• Assumption TS.1' (Linear in parameters)
  • Same as assumption TS.1, but now the dependent and independent variables are assumed to be stationary and weakly dependent
• Assumption TS.2' (No perfect collinearity)
  • Same as assumption TS.2
• Assumption TS.3' (Zero conditional mean)
  • Now the explanatory variables are assumed to be only contemporaneously exogenous rather than strictly exogenous, i.e. $E(u_t \mid \mathbf{x}_t) = 0$
  • The explanatory variables of the same period are uninformative about the mean of the error term

Asymptotic Properties of OLS
• Theorem 11.1 (Consistency of OLS)
  • Under assumptions TS.1' to TS.3', the OLS estimators are consistent: $\operatorname{plim} \hat{\beta}_j = \beta_j$, $j = 0, 1, \dots, k$
  • Important note: for consistency it would even suffice to assume that the explanatory variables are merely contemporaneously uncorrelated with the error term
• Why is it important to relax the strict exogeneity assumption?
  • Strict exogeneity is a serious restriction because it rules out all kinds of dynamic relationships between explanatory variables and the error term
  • In particular, it rules out feedback from the dependent variable on future values of the explanatory variables (which is very common in economic contexts)
  • Strict exogeneity precludes the use of lagged dependent variables as regressors
• Why do lagged dependent variables violate strict exogeneity?
  This is the simplest possible regression model with a lagged dependent variable: $y_t = \beta_0 + \beta_1 y_{t-1} + u_t$
  • Contemporaneous exogeneity: $E(u_t \mid y_{t-1}) = 0$
  • Strict exogeneity: $E(u_t \mid y_0, y_1, \dots, y_{n-1}) = 0$
  Strict exogeneity would imply that the error term is uncorrelated with all $y_t$, $t = 1, \dots, n-1$. This leads to a contradiction because $Cov(y_t, u_t) = \beta_1 Cov(y_{t-1}, u_t) + Var(u_t) = Var(u_t) > 0$.
• OLS estimation in the presence of lagged dependent variables
  • Under contemporaneous exogeneity, OLS is consistent but biased
• Assumption TS.4' (Homoskedasticity)
  • $Var(u_t \mid \mathbf{x}_t) = \sigma^2$: the errors are contemporaneously homoskedastic
• Assumption TS.5' (No serial correlation)
  • $E(u_t u_s \mid \mathbf{x}_t, \mathbf{x}_s) = 0$ for $t \ne s$: conditional on the explanatory variables in periods $t$ and $s$, the errors are uncorrelated
• Theorem 11.2 (Asymptotic normality of OLS)
  • Under assumptions TS.1' to TS.5', the OLS estimators are asymptotically normally distributed. Further, the usual OLS standard errors, t-statistics, and F-statistics are asymptotically valid.
• Example: Efficient Markets Hypothesis (EMH)
  • The EMH in a strict form states that information observable to the market prior to week $t$ should not help to predict the return during week $t$. A simplification assumes in addition that only past returns are considered as relevant information to predict the return in week $t$. With $y_t$ denoting the return in week $t$, this implies that $E(y_t \mid y_{t-1}, y_{t-2}, \dots) = E(y_t)$.
  • A simple way to test the EMH is to specify an AR(1) model, $y_t = \beta_0 + \beta_1 y_{t-1} + u_t$, and to test $H_0: \beta_1 = 0$. Under the EMH, assumption TS.3' holds, so that an OLS regression can be used to test whether this week's returns depend on last week's.
  • The estimation results give no evidence against the EMH. Including more lagged returns yields similar results.
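The AR(1) test of the EMH described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the estimation from the slides: the return series is simulated as i.i.d. draws (so the null hypothesis holds by construction), and the helper `ols_simple`, the mean 0.2, the standard deviation 2.0, the sample size, and the seed are all hypothetical choices. The t-statistic uses the usual OLS standard error, which Theorem 11.2 justifies asymptotically under TS.1' to TS.5'.

```python
import math
import random


def ols_simple(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, se_slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(ssr / (n - 2) / sxx)  # usual homoskedastic standard error
    return intercept, slope, se_slope


# Hypothetical weekly returns: i.i.d. draws, so past returns carry no information.
rng = random.Random(1)
returns = [rng.gauss(0.2, 2.0) for _ in range(5000)]

# Regress return_t on return_{t-1} and test H0: beta_1 = 0.
b0, b1, se_b1 = ols_simple(returns[:-1], returns[1:])
t_stat = b1 / se_b1  # approximately standard normal in large samples under H0
print(f"beta1_hat = {b1:.4f}, se = {se_b1:.4f}, t = {t_stat:.2f}")
```

With real return data one would replace the simulated list by the observed weekly returns; a t-statistic that is small in absolute value would then be read as no evidence against the EMH.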
Using Highly Persistent Time Series in Regression Analysis
• Using trend-stationary series in regression analysis
  • Time series with deterministic time trends are nonstationary
  • If they are stationary around the trend and in addition weakly dependent, they are called trend-stationary processes
  • Trend-stationary processes also satisfy assumption TS.1'
• Using highly persistent time series in regression analysis
  • Unfortunately, many economic time series violate weak dependence because they are highly persistent (= strongly dependent)
  • In this case OLS methods are generally invalid (unless the CLM assumptions hold)
  • In some cases transformations to weak dependence are possible
• Random walks: $y_t = y_{t-1} + e_t$
  • The random walk is called a random walk because it wanders from the previous position $y_{t-1}$ by an i.i.d. random amount $e_t$
  • $y_t = e_t + e_{t-1} + \dots + e_1 + y_0$: the value today is the accumulation of all past shocks plus an initial value. This is the reason why the random walk is highly persistent: the effect of a shock will be contained in the series forever.
  • With a nonrandom initial value $y_0$: $E(y_t) = y_0$, $Var(y_t) = \sigma_e^2 t$, and $Corr(y_t, y_{t+h}) = \sqrt{t/(t+h)}$
  • The random walk is not covariance stationary because its variance and its covariances depend on time. It is also not weakly dependent because the correlation between observations vanishes very slowly, and how slowly depends on how large $t$ is.
• Examples for random walk realizations
• Random walks with drift: $y_t = \alpha_0 + y_{t-1} + e_t$
  • In addition to the usual random walk mechanism, there is a deterministic increase/decrease (= drift) in each period
  • $y_t = \alpha_0 t + e_t + e_{t-1} + \dots + e_1 + y_0$: this leads to a linear time trend around which the series follows its random walk behaviour. As there is no clear direction in which the random walk develops, it may also wander away from the trend.
  • Otherwise, the random walk with drift has similar properties to the random walk without drift
  • Random walks with drift are not covariance stationary and not weakly dependent
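The time dependence of the moments of a random walk with drift can be checked by simulation. The sketch below is an illustration under assumptions, not material from the slides: shocks are i.i.d. standard normal, the initial value is zero, the drift is 0.5, and the function names, number of paths, and seed are hypothetical choices.

```python
import random


def random_walk(n, drift=0.0, y0=0.0, rng=None):
    """Return [y_1, ..., y_n] with y_t = drift + y_{t-1} + e_t, e_t i.i.d. N(0, 1)."""
    if rng is None:
        rng = random.Random(0)
    path, y = [], y0
    for _ in range(n):
        y = drift + y + rng.gauss(0.0, 1.0)
        path.append(y)
    return path


def cross_section_moments(paths, t):
    """Mean and variance of y_t across simulated paths (t is 1-based)."""
    values = [p[t - 1] for p in paths]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, var


rng = random.Random(42)
paths = [random_walk(100, drift=0.5, rng=rng) for _ in range(2000)]
for t in (10, 50, 100):
    mean, var = cross_section_moments(paths, t)
    print(f"t={t:3d}  mean={mean:6.2f}  variance={var:6.2f}")
```

With unit shock variance and a zero starting value, theory gives a mean of $0.5\,t$ and a variance of $t$, so both printed columns should grow roughly linearly in $t$; setting `drift=0.0` gives the pure random walk, where only the variance grows.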
• Sample path of a random walk with drift
• Transformations on highly persistent time series
  • Order of integration
    • Weakly dependent time series are integrated of order zero (= I(0))
    • If a time series has to be differenced one time in order to obtain a weakly dependent series, it is called integrated of order one (= I(1))
  • Examples for I(1) processes
    • For example, the random walk $y_t = y_{t-1} + e_t$ is I(1): its first difference $\Delta y_t = y_t - y_{t-1} = e_t$ is weakly dependent
  • Differencing is often a way to achieve weak dependence
• Deciding whether a time series is I(1)
  • There are statistical tests for whether a time series is I(1) (= unit root tests)
  • Alternatively, look at the sample first order autocorrelation $\hat{\rho}_1 = \widehat{Corr}(y_t, y_{t-1})$, which measures how strongly adjacent time series observations are related to each other
  • If the sample first order autocorrelation is close to one, this suggests that the time series may be highly persistent
• Example: Fertility equation
  • This equation could be estimated by OLS if the CLM assumptions hold. These may be questionable, so that one would have to resort to large sample analysis.
  • For large sample analysis, the fertility series and the series of the personal tax exemption have to be stationary and weakly dependent. This is questionable because the two series are highly persistent.
  • It is therefore better to estimate the equation in first differences

Testing for Unit Roots
• For the validity of regression analysis, it is crucial to know whether or not dependent or independent variables are highly persistent
• Dickey-Fuller test
  • The test is based on an AR(1) regression: $y_t = \alpha + \rho y_{t-1} + e_t$
  • $H_0: \rho = 1$: under the null hypothesis, the process has a unit root
  • $H_1: \rho < 1$: under the alternative, it is a stable AR(1) process
  • One can use the t-statistic to test the hypothesis, but under the null it does not have the t distribution but the Dickey-Fuller distribution

Next Class: Panel Data Models
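As a supplement to the Dickey-Fuller slide above, the test regression can be sketched as follows. This is a minimal illustration under assumptions rather than a full implementation: it rewrites the AR(1) regression as $\Delta y_t = \alpha + \theta y_{t-1} + e_t$ with $\theta = \rho - 1$, so that the usual t-statistic on $\theta$ tests $H_0: \rho = 1$, and it compares that statistic with the asymptotic 5% Dickey-Fuller critical value of $-2.86$ for a regression with an intercept and no trend. The simulated series, function names, sample size, and seed are hypothetical.

```python
import math
import random


def df_t_statistic(y):
    """Estimate dy_t = alpha + theta * y_{t-1} + e_t; return (theta_hat, t-statistic)."""
    lagged = y[:-1]
    diff = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(diff)
    x_bar = sum(lagged) / n
    d_bar = sum(diff) / n
    sxx = sum((x - x_bar) ** 2 for x in lagged)
    sxy = sum((x - x_bar) * (d - d_bar) for x, d in zip(lagged, diff))
    theta = sxy / sxx
    alpha = d_bar - theta * x_bar
    ssr = sum((d - alpha - theta * x) ** 2 for x, d in zip(lagged, diff))
    se_theta = math.sqrt(ssr / (n - 2) / sxx)
    return theta, theta / se_theta


def simulate_ar1(n, rho, seed):
    """y_t = rho * y_{t-1} + e_t with y_0 = 0 and i.i.d. N(0, 1) shocks."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
    return y


# Asymptotic 5% critical value (intercept, no time trend); not a t-distribution quantile.
DF_CRITICAL_5PCT = -2.86

for label, rho in (("random walk", 1.0), ("stable AR(1)", 0.5)):
    theta, t_stat = df_t_statistic(simulate_ar1(500, rho, seed=7))
    reject = t_stat < DF_CRITICAL_5PCT
    print(f"{label}: theta_hat={theta:.3f}, t={t_stat:.2f}, reject unit root: {reject}")
```

The test is one-sided: the unit root null is rejected only for t-statistics below the critical value. For the stable series the statistic should be strongly negative, while for the simulated random walk the estimate of $\theta$ should be close to zero.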