Covariance Matrix Estimation

As a necessary precursor to estimating $\sigma^2$ for the covariance matrix estimator, we need to find $\mathsf{E}{(\hat u' \hat u)}$ using the residual maker $M$. First, write the vector $\hat u$ of least squares residuals:

$$\hat u = y - X\hat\beta.$$

Substituting for $\hat\beta$,
\begin{align*}\hat u &= y - X (X'X)^{-1} X' y \\
&= [I_n - X (X'X)^{-1} X']y = My\end{align*}

  • $M$ is $n \times n$ and symmetric: $M = M'$
  • $M$ is idempotent: $M = M^2$
  • $M$ is orthogonal to $X$: $MX = 0$ (equivalently $M'X = 0$, since $M = M'$).

Intuition: when $X$ is regressed on $X$, the fit is perfect and the residuals are zero, hence $MX = 0$.
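These three properties, and the fact that $My$ reproduces the least squares residuals, can be checked numerically. A minimal sketch in NumPy, where the design matrix $X$ and response $y$ are arbitrary simulated data chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3                                 # illustrative dimensions
X = rng.normal(size=(n, k))                  # hypothetical design matrix
y = rng.normal(size=n)

# Residual maker: M = I_n - X (X'X)^{-1} X'
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

assert np.allclose(M, M.T)                   # symmetric: M = M'
assert np.allclose(M, M @ M)                 # idempotent: M = M^2
assert np.allclose(M @ X, np.zeros((n, k)))  # orthogonal to X: MX = 0

# My reproduces the least squares residuals y - X beta_hat
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(M @ y, y - X @ beta_hat)
```

Using `np.linalg.solve` rather than forming the inverse explicitly is the usual numerically stabler choice; the assertions hold up to floating-point tolerance.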

\begin{align*}
\hat u &= My = M(X\beta + u)\\
&= MX\beta + Mu = Mu\\
\therefore \hat u' &= (Mu)' = u'M'\\
\hat u' \hat u &= u'M'Mu = u'Mu\\
\hat u' \hat u &= \hat u_1^2 + \hat u_2^2 + \dots + \hat u_n^2 \equiv \text{tr}(\hat u \hat u') \quad \text{(the sum of the diagonal elements of } \hat u \hat u'\text{)}
\end{align*}
Because $\hat u'\hat u = u'Mu$ is a scalar, it equals its own trace. The expectation of the trace is the trace of the expectation, and $M$ is a constant matrix, so the two operators can be interchanged:

$\mathsf{E}{(\hat u' \hat u)} = \mathsf{E}{[\text{tr}(u' M u)]} = \mathsf{E}{[\text{tr}(M u u')]} = \text{tr}[M \mathsf{E}{(u u')}] = \sigma^2 \text{tr}(M)$, using $\mathsf{E}{(u u')} = \sigma^2 I_n$.

Note that, by the cyclic property of the trace,
\begin{align*}
\text{tr}(X (X'X)^{-1} X') &= \text{tr}((X'X)^{-1} (X'X))\\
&= \text{tr}(I_k) = k\\
\therefore \mathsf{E}{(\hat u' \hat u)} &= \sigma^2( \text{tr}(I_n) - \text{tr}(I_k))\\
&= \sigma^2( n - k)\\
\therefore \mathsf{E}{(\hat \sigma^2)} &= \mathsf{E}\!\left[\dfrac{\hat u' \hat u}{n-k}\right] = \dfrac{ \sigma^2( n - k)}{n-k} = \sigma^2
\end{align*}
so $\hat\sigma^2 = \hat u'\hat u/(n-k)$ is an unbiased estimator of $\sigma^2$.
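Both $\text{tr}(M) = n - k$ and the unbiasedness of $\hat\sigma^2$ can be checked by simulation. A sketch under assumed illustrative values ($n = 30$, $k = 4$, $\sigma^2 = 2$, Gaussian errors), none of which come from the text:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, sigma2 = 30, 4, 2.0                 # illustrative sizes and true variance
X = rng.normal(size=(n, k))               # hypothetical design matrix
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

assert np.isclose(np.trace(M), n - k)     # tr(M) = tr(I_n) - tr(I_k) = n - k

# Monte Carlo: the average of sigma_hat^2 over many error draws
# should approach sigma^2, since E(u'Mu) = sigma^2 (n - k).
est = []
for _ in range(20_000):
    u = rng.normal(scale=np.sqrt(sigma2), size=n)
    uhat = M @ u                          # residuals: beta cancels, uhat = Mu
    est.append(uhat @ uhat / (n - k))     # sigma_hat^2 for this draw
assert abs(np.mean(est) - sigma2) < 0.05  # close to the true sigma^2 = 2.0
```

Note that $\beta$ never enters the simulation: as derived above, $\hat u = Mu$ regardless of the true coefficients.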
$M = I - P$ is sometimes called the “annihilator”, where $P = X(X'X)^{-1}X'$ is the projection matrix. Trace properties used above: for conformable $A$, $C$, and $B$, $\text{tr}(ACB) = \text{tr}(CBA) = \text{tr}(BAC)$ (cyclic permutation), and $\text{tr}(A+C) = \text{tr}(A) + \text{tr}(C)$ (linearity).
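These two trace properties are easy to confirm numerically on arbitrary random matrices (the matrices below are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(3)
A, C, B = (rng.normal(size=(4, 4)) for _ in range(3))

# Cyclic permutation: tr(ACB) = tr(CBA) = tr(BAC)
t = np.trace(A @ C @ B)
assert np.isclose(t, np.trace(C @ B @ A))
assert np.isclose(t, np.trace(B @ A @ C))

# Linearity: tr(A + C) = tr(A) + tr(C)
assert np.isclose(np.trace(A + C), np.trace(A) + np.trace(C))
```

Only cyclic shifts are allowed: $\text{tr}(ACB) \ne \text{tr}(CAB)$ in general, which is exactly why $\text{tr}(u'Mu) = \text{tr}(Muu')$ in the derivation above.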