Now that we have our chi-squared result, we can start doing inference on our coefficients. So beta hat, in case you don't know this already (we haven't gone over it), is (X^T X)^{-1} X^T y. Under the assumption that Y is N(X beta, sigma^2 I), show that the expected value of beta hat is beta and that the variance of beta hat is equal to (X^T X)^{-1} sigma^2. This should be old hat to you by now.

I want to point out one comment, though. Just as we saw in simple linear regression, it's really variability in the x's that makes the variance of our coefficients smaller, because of that inverse there. So just like before, you want to increase the variability in your regressors to decrease the variance, the standard error, of the coefficients. That makes sense: if we have a linear regression, our estimate of the line is going to be much more variable if we only measure the dots over a really tiny range of x; but if we spread them out along a very wide range of x, then we'll get a less variable estimate of the line. So that's a small extra point.

The other point I would like to make is that the covariance of beta hat and e is 0. I can say this because Cov(beta hat, e) = Cov((X^T X)^{-1} X^T Y, (I - H(X)) Y), since e = (I - H(X)) Y. That's equal to (X^T X)^{-1} X^T Cov(Y, Y) (I - H(X)) when I pull the matrices out on either side (remember I - H(X) is symmetric). Cov(Y, Y) is sigma^2 I, so this works out to be (X^T X)^{-1} X^T (I - H(X)) sigma^2. And we know that X^T times (I - H(X)) is 0. So the covariance of beta hat and e is zero, and because of the normality, that means beta hat and our residuals are independent. Similarly, our y hat and our residuals are independent.
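Here's a quick numerical sketch of the two claims above: that Var(beta hat) = (X^T X)^{-1} sigma^2 and that Cov(beta hat, e) = 0. This is my own illustration, not from the lecture; the design matrix, true beta, sigma, and seed are all arbitrary choices.

```python
import numpy as np

# Arbitrary setup for illustration
rng = np.random.default_rng(0)
n, p = 50, 3
sigma = 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, -2.0, 0.5])

XtX_inv = np.linalg.inv(X.T @ X)
H = X @ XtX_inv @ X.T          # hat matrix H(X)
A = XtX_inv @ X.T              # maps y to beta hat

# The algebraic heart of Cov(beta hat, e) = 0: (X'X)^{-1} X' (I - H) = 0
print(np.allclose(A @ (np.eye(n) - H), 0))                    # True

# Monte Carlo check that Var(beta hat) = (X'X)^{-1} sigma^2:
# simulate many datasets, estimate beta hat's covariance empirically
Y = (X @ beta)[:, None] + sigma * rng.normal(size=(n, 100_000))
B = A @ Y                      # each column is one beta hat
print(np.allclose(np.cov(B), XtX_inv * sigma**2, atol=0.01))  # True
```

The first check is exact linear algebra (it holds up to floating-point error for any X of full rank); the second is approximate, so the tolerance just needs to cover Monte Carlo noise.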
We've seen that they're orthogonal, but now we actually see that they're statistically independent. Okay, now let's prove our t-distribution result for our coefficients. Take some linear contrast of the betas, say q^T beta, where q is a p by 1 vector. Then q^T beta is some contrast of the betas that we're interested in. As a simple example, q might be equal to (0, 1, 0, ..., 0), and then q^T beta would just pick off the second component of the beta vector; a contrast like that can pick off any single component of the beta vector. But often we might be interested in the difference between two coefficients, and so on.

So the natural estimate of q^T beta is q^T beta hat. Clearly q^T beta hat is normally distributed with mean q^T beta and variance q^T (X^T X)^{-1} q sigma^2. Furthermore, since beta hat is independent of the residuals, q^T beta hat is independent of the residuals, and hence independent of any function of the residuals. So it's independent of s^2 and any function of s^2.

Okay, so (q^T beta hat - q^T beta) / sqrt(q^T (X^T X)^{-1} q sigma^2) is N(0, 1). Furthermore, we can divide this whole thing by the square root of (n - p) s^2 / sigma^2 over n - p, which is the square root of a chi-squared divided by its degrees of freedom: (n - p) s^2 / sigma^2 is the chi-squared quantity we've talked about, and n - p is its degrees of freedom.
So working all of this together, I get that (q^T beta hat - q^T beta) / (s sqrt(q^T (X^T X)^{-1} q)) is exactly a N(0, 1) divided by the square root of a chi-squared over its degrees of freedom, where the top and the bottom are independent, so this is a t distribution with n - p degrees of freedom. It is exactly this result that gives us that the individual coefficients from our linear model follow a t distribution, and this is exactly how we get the t table for the individual coefficients, for example in R when you do summary. Okay, so let's go through a coding example, but this is exactly where the results in that t table come from.
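The full pivot can be checked numerically too. The sketch below (my own example; the design, beta, q, and seed are arbitrary) simulates T = (q^T beta hat - q^T beta) / (s sqrt(q^T (X^T X)^{-1} q)) many times and compares its quantiles against a t distribution with n - p degrees of freedom, which is exactly the reference distribution behind the t table in R's summary.

```python
import numpy as np
from scipy import stats

# Arbitrary setup for illustration
rng = np.random.default_rng(2)
n, p = 20, 4
sigma = 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 0.5, -0.5, 2.0])
q = np.array([0.0, 0.0, 1.0, 0.0])

XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T

# One t pivot per simulated dataset (each column of Y)
Y = (X @ beta)[:, None] + sigma * rng.normal(size=(n, 100_000))
B = A @ Y
s = np.sqrt(((Y - X @ B) ** 2).sum(axis=0) / (n - p))
T = (q @ B - q @ beta) / (s * np.sqrt(q @ XtX_inv @ q))

# Empirical 97.5% quantile vs. the t_{n-p} quantile; these should agree
# up to Monte Carlo error
print(np.quantile(T, 0.975))
print(stats.t.ppf(0.975, n - p))
```

Note that sigma cancels out of T entirely; that is the whole point of dividing by s instead of sigma, and why the pivot is usable for confidence intervals even though sigma is unknown.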