Essential Statistics for Data Science · Part 3 of 4
Probability is the language ML speaks. This part of the series moves from describing data to reasoning about it under uncertainty: joint, marginal, and conditional probability; Bayes' rule; and the distributions (normal, Bernoulli, binomial, Poisson, exponential, log-normal) that show up over and over in real datasets.
It closes the loop on preprocessing: when to standardize vs normalize, when log-transforms actually help, and how distributional assumptions quietly determine which models will work on your data.