# Learning statistics

Alex Popescu has an interesting post entitled “Programmers Need to Learn Statistics”. He mentions hating having to learn about probability and statistics in college. I never took a statistics course in college, but I did take Probability (Math 361 at UIUC) and let’s just say I did very poorly.

So it’s a little disconcerting to realize that “machine learning” and “data science” are both just fancy terms for “Applied Statistics”. In any case, I’ve been trying to get better at statistics as I go along, with the help of some of the resources below:

- OpenIntro Statistics – A free textbook on introductory statistics (you can also order the paperback at Amazon for less than $10). The OpenIntro project aims to produce free textbooks in many different subject areas. On a related note: Boundless is a new startup devoted to producing free textbook content, but they don’t have a Statistics textbook yet.
- Elements of Statistical Learning – Another textbook that you can download for free. Covers more advanced topics in statistical/machine learning.
- Think Stats – Another book free for download, this one focusing on using Python for statistical computing. Also published by O’Reilly.
- Think Bayes – Another free textbook recommended by a colleague. You probably should know some of the more basic statistics material before diving too deeply into Bayesian statistics.
- I’ve mentioned it before, but Jeff Leek’s Coursera course on Data Analysis (video and slides) gives a good background on some statistical concepts like distributions, regression, etc. while using the R programming language to perform data analysis and some basic machine learning techniques.

There are, of course, many other resources available at extra cost, but the above should get you started learning enough statistics to be dangerous. And you won’t have to spend money on a textbook, something virtually no one outside of a college course would ever do willingly.