Problems with autocorrelated errors

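A key assumption of the linear regression models discussed so far is that the error terms are independent. This section discusses how linear regression proceeds when that assumption is violated. In particular, we consider the case in which the error terms are correlated over time; such errors are said to be autocorrelated or serially correlated. When the error terms are autocorrelated, fitting the model by ordinary least squares (OLS) runs into several problems.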

When are autocorrelated errors most likely to occur? Be careful whenever you work with data that are collected repeatedly over time.

Time series: ACF & PACF

A time series is a sequence of measurements of the same variable(s) made over time. The measurements are usually made at evenly spaced times (e.g., monthly or yearly). For example, a y-variable time series $\{y_t\}_{t=1,2,3,\cdots}$ is the sequence

$$ y_1, y_2, y_3, \cdots, y_t, \cdots $$

Next we introduce two concepts: the ACF and the PACF.

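The **AutoCorrelation Function (ACF)** measures the total correlation between two terms of a time series that are $k$ steps apart; numerically it equals $\text{Corr}(y_t, y_{t-k})$. The **Partial AutoCorrelation Function (PACF)** measures the pure (net) correlation between the same two terms; numerically it equals $\beta_k$, the coefficient on the last lag in the regression $y_t = \beta_0 + \beta_1 y_{t-1} + \beta_2 y_{t-2} + \cdots + \beta_k y_{t-k} + \epsilon_t$.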
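The two definitions above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `acf` and `pacf`, the simulated series, and its coefficient 0.6 are all assumptions made up for the example. The PACF is computed literally as defined, by regressing $y_t$ on lags 1 through $k$ and reading off the last coefficient.

```python
import numpy as np

def acf(y, k):
    """Sample lag-k autocorrelation, Corr(y_t, y_{t-k})."""
    y = np.asarray(y, dtype=float)
    if k == 0:
        return 1.0
    return float(np.corrcoef(y[k:], y[:-k])[0, 1])

def pacf(y, k):
    """Sample lag-k partial autocorrelation: the OLS coefficient on
    y_{t-k} when y_t is regressed on an intercept and lags 1..k."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # The j-th slice holds y_{t-j} for t = k, ..., n-1.
    lags = [y[k - j : n - j] for j in range(1, k + 1)]
    X = np.column_stack([np.ones(n - k)] + lags)
    beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return float(beta[-1])

# Hypothetical data: y_t = 0.6 * y_{t-1} + noise, so y_{t-2} affects
# y_t only through y_{t-1}.
rng = np.random.default_rng(0)
n = 20000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + rng.normal()

print("lag-2 ACF :", acf(y, 2))   # total correlation, theoretical value 0.6**2
print("lag-2 PACF:", pacf(y, 2))  # pure correlation, theoretical value 0
```

In this simulated series the lag-2 ACF is clearly nonzero while the lag-2 PACF is close to zero, which is the total-versus-pure distinction in action.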

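For example, consider lag 2. The lag-2 ACF, $\text{Corr}(y_t, y_{t-2})$, captures both the direct link between $y_t$ and $y_{t-2}$ and the indirect link that passes through $y_{t-1}$. The lag-2 PACF is $\beta_2$ in $y_t = \beta_0 + \beta_1 y_{t-1} + \beta_2 y_{t-2} + \epsilon_t$, the link that remains after accounting for $y_{t-1}$.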


In practice the PACF is the more useful of the two: the PACF, together with lag plots, is used to determine the order of an autoregressive model.
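One way to turn this into a procedure is sketched below. It rests on assumptions that are not part of these notes: a sample PACF value is treated as "significant" when it falls outside $\pm z/\sqrt{n}$ (a common large-sample approximation, with $z = 1.96$ by default), and the suggested order is the number of leading significant lags. The names `pacf` and `suggest_ar_order`, and the simulated series with coefficients 0.5 and 0.3, are hypothetical.

```python
import numpy as np

def pacf(y, k):
    """Sample lag-k partial autocorrelation: the OLS coefficient on
    y_{t-k} when y_t is regressed on an intercept and lags 1..k."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    lags = [y[k - j : n - j] for j in range(1, k + 1)]
    X = np.column_stack([np.ones(n - k)] + lags)
    beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return float(beta[-1])

def suggest_ar_order(y, max_lag=10, z=1.96):
    """Count the leading lags whose sample PACF lies outside
    +/- z / sqrt(n); stop at the first lag that falls inside."""
    bound = z / np.sqrt(len(y))
    order = 0
    for k in range(1, max_lag + 1):
        if abs(pacf(y, k)) <= bound:
            break
        order = k
    return order

# Hypothetical data from a second-order autoregression.
rng = np.random.default_rng(1)
n = 5000
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.normal()

for k in range(1, 5):
    print(f"lag {k}: PACF = {pacf(y, k):+.3f}")
print("suggested order:", suggest_ar_order(y, max_lag=6))
```

The cutoff is a rule of thumb; with the default `z`, roughly one truly-zero lag in twenty will still cross the bound by chance, so the printed PACF values and the lag plots deserve a look before settling on an order.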

Autoregressive model

An autoregressive model regresses a value from a time series on previous values from that same time series. For example, with a y-variable measured as a time series, we can regress $y_t$ on $y_{t-1}$: