1. Variance decomposition

1.1 Identity


SSY = SSE + SSR, that is:

$$ \sum_i(y_i-\bar{y})^2 = \sum_i (y_i-\hat{y}_i)^2 + \sum_i (\hat{y}_i-\bar{y})^2 $$

For an intuitive animated illustration, see: https://online.stat.psu.edu/stat501/lesson/2/2.5
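
The identity can also be checked numerically. The following is a minimal sketch in plain Python, assuming a simple linear regression with an intercept fitted by ordinary least squares; the function name `variance_decomposition` and the sample data are made up for illustration.

```python
def variance_decomposition(x, y):
    """Return (SSY, SSE, SSR) for a simple least-squares line fitted to (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares estimates of the slope and the intercept
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = s_xy / s_xx
    beta0 = y_bar - beta1 * x_bar

    # Fitted values
    y_hat = [beta0 + beta1 * xi for xi in x]

    ssy = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # residual sum of squares
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)            # regression sum of squares
    return ssy, sse, ssr


# Illustrative data only (not taken from any real data set)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ssy, sse, ssr = variance_decomposition(x, y)
print(f"SSY       = {ssy:.6f}")
print(f"SSE + SSR = {sse + ssr:.6f}")
```

The two printed values should agree up to floating-point rounding. The identity relies on the least-squares fit including an intercept; for an arbitrary line the cross term need not vanish.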

1.2 Proof

Square both sides of $y_i - \bar{y} = (y_i - \hat{y_i}) + (\hat{y_i} - \bar{y})$:

$$ (y_i - \bar{y})^2 = (y_i - \hat{y_i})^2 + (\hat{y_i} - \bar{y})^2+2(y_i - \hat{y_i})(\hat{y_i} - \bar{y}) $$

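Summing over $i$, it remains to show that the cross term vanishes, i.e. $\sum_{i=1}^n(y_i - \hat{y_i})(\hat{y_i} - \bar{y})=0$. Writing $e_i = y_i - \hat{y_i}$ for the residuals: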

$$ \begin{equation*} \begin{aligned} \text{LHS} &= \sum_ie_i(\hat{y_i}-\bar{y}) \\ &= \sum_ie_i\hat{y_i}-\sum_ie_i\bar{y} \\ &= 0-0=0 \\ \end{aligned} \end{equation*} $$
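
The two zero sums in the last step are not justified in the derivation itself. One way to see them, assuming the fitted line is $\hat{y_i} = \hat{\beta_0} + \hat{\beta_1}x_i$ with both coefficients estimated by least squares, is through the normal equations:

$$ \begin{equation*} \begin{aligned} \sum_i e_i &= 0, \qquad \sum_i e_i x_i = 0 \\ \sum_i e_i\hat{y_i} &= \hat{\beta_0}\sum_i e_i + \hat{\beta_1}\sum_i e_i x_i = 0 \\ \sum_i e_i\bar{y} &= \bar{y}\sum_i e_i = 0 \end{aligned} \end{equation*} $$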

1.3 Test of significance

Previously we used the t-test to test the significance of the regression ($H_0: \beta_1 = 0$). Here we construct a new F statistic for the same test:

$$ F_0 = \frac{\text{MSR}}{\text{MSE}} = \frac{\text{SSR}/1}{\text{SSE}/(n-2)} $$

When $H_0$ is true, this statistic follows an $F(1, n-2)$ distribution. Note also how this F statistic relates to the t statistic:

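$$ \begin{equation*} \begin{aligned} t_0^2 &= \left(\frac{\hat{\beta_1}}{se(\hat{\beta_1})}\right)^2 \\ &= \frac{\hat{\beta_1}^2}{s^2/S_{xx}} \\ &= \frac{\hat{\beta_1}^2 S_{xx}}{\text{SSE}/(n-2)} \\ &=\frac{\text{SSR}/1}{\text{SSE}/(n-2)} \\ &= F_0 \end{aligned} \end{equation*} $$

Here $s^2 = \text{MSE} = \text{SSE}/(n-2)$, and the fourth equality uses $\text{SSR} = \sum_i(\hat{y_i}-\bar{y})^2 = \hat{\beta_1}^2 S_{xx}$, which holds because $\hat{y_i}-\bar{y} = \hat{\beta_1}(x_i-\bar{x})$.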
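
The relation $t_0^2 = F_0$ can be checked numerically as well. The sketch below uses plain Python and computes both statistics from their definitions, assuming a simple linear regression with an intercept; the helper name `t_and_f` and the data are hypothetical and serve only as an illustration.

```python
import math


def t_and_f(x, y):
    """Return (t0, F0) for testing H0: beta1 = 0 in a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares estimates of the slope and the intercept
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = s_xy / s_xx
    beta0 = y_bar - beta1 * x_bar
    y_hat = [beta0 + beta1 * xi for xi in x]

    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual sum of squares
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # regression sum of squares
    mse = sse / (n - 2)                                    # s^2, the estimate of the error variance

    t0 = beta1 / math.sqrt(mse / s_xx)  # t statistic: slope divided by its standard error
    f0 = (ssr / 1) / mse                # F statistic: MSR / MSE
    return t0, f0


# Illustrative data only (not taken from any real data set)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

t0, f0 = t_and_f(x, y)
print(f"t0^2 = {t0 ** 2:.6f}")
print(f"F0   = {f0:.6f}")
```

The two printed values should match up to rounding, so in simple linear regression the two-sided t-test and the F-test lead to the same conclusion for $H_0: \beta_1 = 0$.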