At this point, I'll spare you the linear algebra and simply give you the answer, which you can check by direct substitution:
(10)
Figure 3 shows the fitted data. From this graph, it appears that a straight line is not a very good fit to the data; we might have been better off with a quadratic. But for the straight line that we assumed, this certainly looks like the best fit we could hope for.
Elegance
Now that we've seen how to do the math for a simple 4-point example, let's try to generalize it. To do so, we're going to have to lean on summation notation, but if anything, it comes out much cleaner than the math of our example. Take another look at Equation 5:
(11)
Substituting from Equation 4 gives:
(12)
Expanding the polynomial gives:
(13)
As we expand into individual sums, remember that a and b are the same for all values of i, so we can extract them from the sum operators. Only the terms involving x_i and y_i actually get summed. To simplify the notation, I'm going to drop the summation range info, except for the first sum. We get:
(14)
The value of that first sum is simply n, the number of data points.
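For anyone following along without the typeset equations, here is a sketch of what the expanded sum looks like. It assumes the straight line is written f(x) = a + bx, with a as the intercept and b as the slope, which is the form that makes the first sum come out to n. The order of the terms is my own choice, and the sign convention for the error doesn't matter, because the error gets squared.

```latex
% Assumption: f(x) = a + b x, with error e_i = y_i - f(x_i)
M = \sum_{i=1}^{n} \left( y_i - a - b x_i \right)^2
  = a^2 \sum_{i=1}^{n} 1
  + 2ab \sum x_i
  + b^2 \sum x_i^2
  - 2a \sum y_i
  - 2b \sum x_i y_i
  + \sum y_i^2
```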
As in the example, we seek the minimum of M by taking its partial derivatives with respect to a and b. We get:
(15)
Setting both of these to zero (and dividing both by two) gives our two linear equations in a and b:
(16)
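To make the result concrete, here is a minimal Python sketch of solving two equations of this kind. It assumes the line is f(x) = a + bx, so that the equations read a·n + b·Σx = Σy and a·Σx + b·Σx² = Σxy. The function name fit_line and the sample points are made up for illustration; they are not the data from the 4-point example.

```python
# Sketch only: assumes the line is f(x) = a + b*x (a = intercept, b = slope).
# The function name and the sample data are hypothetical, not from the article.

def fit_line(xs, ys):
    """Solve the two normal equations for a and b using simple sums."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Normal equations (assumed form):
    #   a*n     + b*sum_x  = sum_y
    #   a*sum_x + b*sum_xx = sum_xy
    # Solved here by Cramer's rule.
    det = n * sum_xx - sum_x * sum_x
    if det == 0:
        raise ValueError("need at least two distinct x values")
    a = (sum_y * sum_xx - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b


# Made-up points lying exactly on y = 1 + 2x
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)  # prints 1.0 2.0
```

Notice that the data enter only through five sums, so you can accumulate them one point at a time and never need to store the whole data set.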