Stephen Stigler’s *The History of Statistics* is not for the faint of heart. It’s a deep exploration of the origins of statistics as a form of mathematical inquiry, from astronomy and geodesy in the late 18th century to social science in the 19th century. It assumes you’re familiar with modern proofs of results like the weak law of large numbers and the normal approximation to the binomial distribution. It’s an intellectual challenge, and I’m loving it.

As I’ve been working through Stigler’s book, I’ve been taking the time to learn (or relearn) derivations that are unfamiliar to me. One of those is Legendre’s original exposition of the method of least squares. I’m comfortable with modern approaches to finding the best-fitting curve for a set of points, but parts of Legendre’s formulation felt foreign to me.

To shore up my own understanding and connect it to Legendre’s exposition, I worked through several resources, which I share below:

- Here is Legendre’s original work, translated into English
- Steven Miller’s explanation of the proof in modern terms is remarkably clear
- This textbook explanation offers different perspectives from geometry, algebra, and calculus
- The MathWorld article has a deep yet concise explanation, along with some simplifications that appear in many statistics textbooks
- Though dry, this video covers the modern derivation thoroughly
- This video demonstrates an application of least squares

Finally, there is some debate about who developed the method first: Gauss or Legendre. Legendre published first, in 1805, but Gauss, whose own account appeared in 1809, claimed he had been using the method since 1795. Stephen Stigler has a fascinating essay on this priority dispute.