Least Squares and Statistics
Last time I talked about the beautiful linear Algebra interpretation of Least Squares which adds a wonderful geometric interpretation to the analytical footwork we had to do before. This time I will borrow from a video I saw on Khan Academy a long time ago. I connects some of the elemental principles of statistics with regression.
The basic idea is similar to the analytical approach. But this time, we will only try to fit a line to $$M$$ $$x$$/$$y$$ pairs. So, we have a model of the form
The squared errors between model and data is then
Of course, we search for the global minima of this solution with respect to the parameters $$m$$ and $$c$$. So let's find the derivations:
We can conveniently drop the 2 that appears in front of all terms. We will now rewrite these equations by using the definition of the sample mean:
This gives:
Let's loose the $$M$$ s and solve for $$m$$ and $$c$$.
If you look closely at the term for the slope $$m$$ you see that this is actually just the covariance of $$x$$ and $$y$$ divided by the variance of $$x$$. I find this very intriguing.