In linear algebra, one of the foundational equations is the matrix equation A**x** = **b**. It is when we cannot find a solution to this matrix equation that we turn to the least squares method, which gives us a good approximation to the solution we seek. After completing a few of the problems a professor assigned to me, I really began to see the power of this least squares method: one can estimate the future profits of an airline, model biomass growth, and the list goes on.

Here is a summary that I wrote about the least squares problem, where **x** is a vector and **x̂** is its least-squares estimate, and likewise **b̂** is the projection of **b**:

…when we were faced with an over-determined system of equations A**x** = **b**, we simply gave up and said "the system has no solution" or "the system is inconsistent" (the points are not collinear). What the least squares method seeks to do is find an **x̂** that minimizes the error, i.e., the distance between A**x̂** and **b**. This gives us a solution to the problem; even though it is not an exact solution, it is the "best approximation" of a solution. The definition of this problem given by David C. Lay in his textbook *Linear Algebra and Its Applications* is:

If A is *m* × *n* and **b** is in R^m, a **least-squares solution** of A**x** = **b** is an **x̂** in R^n such that

||**b** − A**x̂**|| ≤ ||**b** − A**x**|| for all **x** in R^n.
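This definition can be checked numerically. The sketch below uses NumPy and a made-up example of my own (three non-collinear points), comparing the least-squares residual against the residual of many random candidate vectors **x**:

```python
import numpy as np

# A made-up over-determined system: three non-collinear points,
# so A x = b has no exact solution.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 4.0])

# Least-squares solution computed by NumPy.
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
best = np.linalg.norm(b - A @ x_hat)

# The defining inequality: ||b - A x_hat|| <= ||b - A x|| for every x,
# spot-checked here against random vectors x in R^2.
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.normal(size=2)
    assert best <= np.linalg.norm(b - A @ x) + 1e-12
print(x_hat, best)
```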

As Lay also points out, no matter which **x** is selected, the vector A**x** will always lie in the column space of A, Col(A). As we will see, this justifies our use of least squares. The solution begins by applying the Best Approximation Theorem to the subspace Col(A), so let

**b̂** = proj_Col(A) **b**

Since **b̂** is in Col(A), there is an **x̂** in R^n with A**x̂** = **b̂**; that is, the equation A**x** = **b̂** is consistent. Now, by the orthogonal decomposition, **b** − A**x̂** is orthogonal to Col(A), so A^T(**b** − A**x̂**) = **0**. Expanding this equation gives A^T**b** − A^TA**x̂** = **0**, and through algebraic manipulation we have

A^TA**x̂** = A^T**b**

which represents the *normal equations* for A**x** = **b**, whose solution is **x̂**. Wrapping this all up, A^TA is an invertible square matrix when the columns of A are linearly independent.
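As a small illustration of the normal equations, here is a NumPy sketch (the matrix and right-hand side are a made-up example, not from the text) that forms A^TA and A^T**b** and solves for **x̂**:

```python
import numpy as np

# Made-up over-determined system: 3 equations, 2 unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 4.0])

# Form the normal equations  A^T A x_hat = A^T b  and solve them.
AtA = A.T @ A          # 2 x 2, invertible since A's columns are independent
Atb = A.T @ b
x_hat = np.linalg.solve(AtA, Atb)
print(x_hat)           # the least-squares solution
```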

For our case, **x̂** = (A^TA)^{-1}A^T**b**, and the least-squares error of the approximation is ||**b** − A**x̂**||. This problem has some remarkable properties, and after completing a few of these problems one can see the power math has when explaining the world around us; this problem is a classic example of that power…
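The closed-form expression can be verified directly. A sketch with a made-up example follows; note that explicitly inverting A^TA is fine for illustration, though in practice `np.linalg.solve` or `np.linalg.lstsq` is preferred for numerical stability:

```python
import numpy as np

# Made-up example: fit a line through three non-collinear points.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 4.0])

# x_hat = (A^T A)^{-1} A^T b  -- the closed-form least-squares solution.
x_hat = np.linalg.inv(A.T @ A) @ A.T @ b

# Least-squares error ||b - A x_hat||.
err = np.linalg.norm(b - A @ x_hat)

# Agrees with NumPy's built-in least-squares routine.
ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_hat, err)
```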

The reason I am writing about this is that, for me, this was the most intuitive application I have found in a math course, and it shows how math applies to the world in so many different ways.

References:

Bentz, R. (2014). *Least Squares Summary*. Blackwood.

Echeverria, P. (n.d.). Orthogonal Projections. *Instructor Notes*.

Lay, D. C. (2012). *Linear Algebra and Its Applications.* Upper Saddle River: Addison Wesley.