We have finally arrived at the last post in our series proving that linear regression is indeed a sharp learner.

Recall that in the first post we motivated linear regression as a problem of predicting house prices, and quickly came to understand that there is a beautiful way to frame this problem abstractly:

given any set of features $\mathfrak{X}$ and finite-dimensional Euclidean space of labels $\mathfrak{y}$, as well as a data space $\mathfrak{D}$ satisfying the separation condition for a finite-dimensional hypothesis space $\mathfrak{H}\subset \mathfrak{y}^{\mathfrak{X}}$, is it possible to find a map $$h:\mathfrak{D} \longrightarrow \mathfrak{H}$$ such that $c(\Delta, h_\Delta)=\min_{h \in \mathfrak{H}} c(\Delta,h)$, where $$ c(\Delta, h)=\sum_{(x,y)\in \Delta}\vert \vert y-h(x)\vert\vert^2 $$
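In code, the cost $c(\Delta, h)$ is just a sum of squared errors over the dataset. A minimal sketch, using a hypothetical one-dimensional dataset `delta` and candidate hypothesis `h` (both invented here for illustration):

```python
def cost(delta, h):
    """Sum of squared Euclidean distances between labels and predictions,
    i.e. c(delta, h) = sum over (x, y) in delta of ||y - h(x)||^2."""
    return sum((y - h(x)) ** 2 for x, y in delta)

# Hypothetical dataset of (feature, label) pairs and a candidate hypothesis.
delta = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]
h = lambda x: 2.0 * x

print(cost(delta, h))  # small residual: the data is nearly on the line y = 2x
```

The learning problem asks for the hypothesis in $\mathfrak{H}$ that makes this quantity as small as possible.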

We then proved in a subsequent post that this claim is indeed true. The proof relied heavily on a lesser-known linear-algebraic object called the pseudo-inverse.

Finally, we introduced coordinates on the feature space $\mathfrak{X}$ and the label space $\mathfrak{y}$ and showed that, in this setting, all is right with the world: the solution to linear regression is indeed given by the normal equation.
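Concretely, once coordinates are chosen, the minimizer can be computed either by solving the normal equation $X^\top X w = X^\top y$ or by applying the pseudo-inverse $X^+$ directly. A minimal numpy sketch, with a hypothetical design matrix `X` and label vector `y` invented for illustration:

```python
import numpy as np

# Hypothetical design matrix (one row per data point, first column a
# constant feature for the intercept) and label vector.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0])

# Normal equation: solve X^T X w = X^T y (valid when X^T X is invertible).
w_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Pseudo-inverse: w = X^+ y, which also covers the rank-deficient case.
w_pinv = np.linalg.pinv(X) @ y

print(np.allclose(w_normal, w_pinv))  # prints True: both recover the minimizer
```

The pseudo-inverse route is the coordinate version of the abstract solution from the earlier post; the normal equation is what it reduces to when $X^\top X$ is invertible.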

Thanks very much for following the series!
