A solid grasp of the essential concepts of statistics and probability is indispensable. Many practitioners in the field consider classical (non-neural-network) machine learning to be nothing but statistical learning. The subject is vast, so focused planning is critical to cover the most essential concepts:
- Data summaries and descriptive statistics, central tendency, variance, covariance, correlation
- Basic probability: core ideas, expectation, probability calculus, conditional probability, Bayes' theorem
- Probability distribution functions: uniform, normal, binomial, chi-square, Student's t-distribution, and the central limit theorem
- Sampling, measurement, error, random number generation
- Hypothesis testing, A/B testing, confidence intervals, p-values
- ANOVA, t-test
- Linear regression, regularization
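
To make the first item on the list concrete, here is a minimal sketch of a data summary using only Python's standard library. The two data lists and the helper names `sample_covariance` and `pearson_correlation` are made up for illustration; in practice you would probably reach for a library such as NumPy or pandas.

```python
import statistics

# Hypothetical paired measurements, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def sample_covariance(a, b):
    """Sample covariance of two equal-length lists (n - 1 denominator)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    products = ((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    return sum(products) / (len(a) - 1)


def pearson_correlation(a, b):
    """Pearson correlation: covariance scaled by both sample standard deviations."""
    return sample_covariance(a, b) / (statistics.stdev(a) * statistics.stdev(b))


mean_x = statistics.mean(x)        # central tendency
var_x = statistics.variance(x)     # sample variance (n - 1 denominator)
cov_xy = sample_covariance(x, y)   # how x and y vary together
corr_xy = pearson_correlation(x, y)  # unit-free version of covariance, in [-1, 1]

print(f"mean(x)={mean_x}, var(x)={var_x}, cov(x,y)={cov_xy}, corr(x,y)={corr_xy:.4f}")
```

Correlation is just covariance rescaled by the spread of each variable, which is why it is easier to compare across datasets than raw covariance.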
Where You Might Use It
In interviews: if you can show you have mastered these concepts, you will quickly impress the other side of the table. You will also use them nearly every day as a data scientist.