Probability and Statistics for Machine Learning

This section outlines the concepts from probability theory and statistics that machine learning relies on.

Probability Theory

Basic Concepts

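  • Sample Space: The set of all possible outcomes of an experiment
  • Events: Subsets of the sample space
  • Probability Axioms: Non-negativity, total probability of one, and additivity over disjoint events
  • Conditional Probability: Probability of one event given that another has occurred, P(A | B) = P(A ∩ B) / P(B) for P(B) > 0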
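
As a small illustration of these definitions, the sketch below enumerates the sample space of two fair dice and computes a conditional probability directly from the ratio P(A ∩ B) / P(B). The dice setup and the helper names `prob` and `cond_prob` are assumptions made for this example only.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of two fair six-sided dice (assumed example).
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a predicate on outcomes) when all outcomes are equally likely."""
    favorable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favorable, len(sample_space))

def cond_prob(event_a, event_b):
    """P(A | B) = P(A and B) / P(B), defined only when P(B) > 0."""
    p_b = prob(event_b)
    if p_b == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda o: event_a(o) and event_b(o)) / p_b

def sum_is_eight(outcome):
    return outcome[0] + outcome[1] == 8

def first_is_even(outcome):
    return outcome[0] % 2 == 0

p_a = prob(sum_is_eight)                           # 5/36
p_a_given_b = cond_prob(sum_is_eight, first_is_even)  # 1/6
```

Here knowing that the first die is even raises the probability of a total of eight from 5/36 to 1/6, so the two events are dependent.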

Random Variables

  • Discrete Variables
  • Continuous Variables
  • Probability Mass Functions
  • Probability Density Functions

Probability Distributions

  • Normal Distribution
  • Binomial Distribution
  • Poisson Distribution
  • Exponential Distribution
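
The four distributions above can be written down directly from their textbook formulas. The following is a minimal sketch using only the standard library; the function names and the parameter values in the sanity checks are illustrative choices, not part of this outline.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of Normal(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def exponential_pdf(x, rate):
    """Density of Exponential(rate) at x; zero for negative x."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

# Sanity checks: a PMF sums to one, and the Binomial(n, p) mean is n * p.
binomial_total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
binomial_mean = sum(k * binomial_pmf(k, 10, 0.3) for k in range(11))
```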

Statistical Inference

Point Estimation

  • Maximum Likelihood
  • Method of Moments
  • Bayesian Estimation
  • Bias and Variance
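
To connect maximum likelihood with bias, consider a Gaussian model: the maximum likelihood estimate of the mean is the sample mean, while the maximum likelihood estimate of the variance divides by n and is biased downward by a factor of (n - 1) / n. The sketch below compares it with the unbiased estimator; the data values are made up for illustration.

```python
import statistics

# Hypothetical measurements, assumed to come from a normal distribution.
data = [2.1, 2.5, 1.9, 2.4, 2.6]
n = len(data)

# Maximum likelihood estimates for a Gaussian model.
mu_mle = sum(data) / n
var_mle = sum((x - mu_mle) ** 2 for x in data) / n

# Unbiased variance estimate (Bessel's correction divides by n - 1).
var_unbiased = sum((x - mu_mle) ** 2 for x in data) / (n - 1)
```

The two variance estimates differ only by the factor n / (n - 1), so the bias of the maximum likelihood estimate vanishes as the sample grows.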

Interval Estimation

  • Confidence Intervals
  • Prediction Intervals
  • Credible Intervals
  • Tolerance Intervals
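
As one concrete case, a confidence interval for a mean can be sketched with the normal approximation: estimate plus or minus a quantile times the standard error. This is a rough sketch under that assumption; for small samples a Student t quantile would usually be preferred. The function name and the sample values are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the mean using a normal quantile."""
    n = len(sample)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided interval: put (1 - level) / 2 of the probability in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * std_err, mean + z * std_err

sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]
ci_95 = mean_confidence_interval(sample, level=0.95)
ci_99 = mean_confidence_interval(sample, level=0.99)
```

Raising the confidence level widens the interval, which is the usual trade-off between coverage and precision.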

Hypothesis Testing

  • Null and Alternative Hypotheses
  • Type I and Type II Errors
  • p-values
  • Statistical Power
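
The pieces above fit together in even a very small test. The sketch below computes an exact two-sided p-value for a binomial null hypothesis (for example, that a coin is fair) by summing the probabilities of all outcomes that are no more likely than the observed one. This is one common definition of a two-sided exact p-value; the function names are chosen for this example.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_test_two_sided(successes, n, p_null=0.5):
    """Exact two-sided p-value under H0: success probability equals p_null."""
    observed = binomial_pmf(successes, n, p_null)
    # Small relative tolerance so that ties are not lost to rounding.
    threshold = observed * (1 + 1e-9)
    total = 0.0
    for k in range(n + 1):
        pk = binomial_pmf(k, n, p_null)
        if pk <= threshold:
            total += pk
    return min(1.0, total)

# Hypothetical experiment: 60 heads in 100 tosses.
p_value = binomial_test_two_sided(60, 100)
reject_at_5_percent = p_value < 0.05
```

With a significance level of 0.05 (the accepted Type I error rate), the null hypothesis is not rejected in this example because the p-value is slightly above the threshold.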

Statistical Learning Theory

Model Assessment

  • Bias-Variance Decomposition
  • Overfitting and Underfitting
  • Cross-validation
  • Bootstrap Methods
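
The mechanics of cross-validation can be sketched without any model: split the sample indices into k folds, and let each fold serve once as the validation set while the rest form the training set. The generator below is an illustrative sketch with a hypothetical name; it does not shuffle, so in practice the data would normally be shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(10, 3))
```

Each index appears in exactly one validation fold, so every observation is used for validation once and for training k - 1 times.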

Learning Bounds

  • PAC Learning
  • VC Dimension
  • Generalization Error
  • Sample Complexity
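
As a worked instance of sample complexity, a standard PAC bound for a finite hypothesis class in the realizable setting says that m ≥ (1/ε)(ln|H| + ln(1/δ)) examples suffice for any consistent learner to reach error at most ε with probability at least 1 − δ. The helper below simply evaluates that bound; its name and the example numbers are assumptions for illustration.

```python
import math

def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Number of examples sufficient under the finite-class, realizable PAC bound."""
    if hypothesis_count < 1 or not 0 < epsilon < 1 or not 0 < delta < 1:
        raise ValueError("need |H| >= 1 and epsilon, delta in (0, 1)")
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / epsilon)

# Example: 1000 hypotheses, 10% error tolerance, 95% confidence.
m_needed = pac_sample_bound(1000, epsilon=0.1, delta=0.05)
```

The bound grows only logarithmically in the number of hypotheses and in 1/δ, but linearly in 1/ε.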

Information Theory

  • Entropy
  • Mutual Information
  • KL Divergence
  • Information Gain
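
For discrete distributions these quantities reduce to short sums. The sketch below computes entropy and KL divergence in bits, using the conventions 0 · log 0 = 0 and D(p ‖ q) = ∞ when q assigns zero probability to an outcome that p allows. Function names and example distributions are illustrative.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution given as a list of probabilities."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """KL divergence D(p || q) in bits for distributions over the same outcomes."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log2(pi / qi)
    return total

fair = [0.5, 0.5]
biased = [0.9, 0.1]

h_fair = entropy(fair)
h_biased = entropy(biased)
d_fair_biased = kl_divergence(fair, biased)
d_biased_fair = kl_divergence(biased, fair)
```

The two divergences differ, which shows that KL divergence is not symmetric and therefore not a distance in the metric sense.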

Regression Analysis

Linear Regression

  • Simple Linear Regression
  • Multiple Linear Regression
  • Polynomial Regression
  • Regularization
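
For a single feature, both ordinary least squares and ridge (L2) regularization have closed forms. The sketch below minimizes the sum of squared residuals plus `lam` times the squared slope, leaving the intercept unpenalized; with `lam = 0` it reduces to simple linear regression. The function name, the penalty parameter name, and the data are assumptions for this example.

```python
def fit_line(x, y, lam=0.0):
    """Fit y ≈ intercept + slope * x, with an L2 penalty lam * slope**2 on the slope only."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / (s_xx + lam)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Noise-free data generated from y = 2x + 1.
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]

ols_slope, ols_intercept = fit_line(x, y)
ridge_slope, ridge_intercept = fit_line(x, y, lam=10.0)
```

The penalty shrinks the slope toward zero, trading some bias for lower variance; on this noise-free data that only hurts, but on noisy data it can reduce test error.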

Model Diagnostics

  • Residual Analysis
  • Goodness of Fit
  • Outlier Detection
  • Influence Analysis

Advanced Methods

  • Generalized Linear Models
  • Non-linear Regression
  • Time Series Regression
  • Survival Analysis

Multivariate Analysis

Correlation

  • Pearson Correlation
  • Spearman Correlation
  • Partial Correlation
  • Canonical Correlation
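
The difference between the first two coefficients is easy to see in code: Pearson correlation measures linear association, while Spearman correlation is the Pearson correlation of the ranks and so captures any monotonic relationship. The following is a minimal sketch with hypothetical helper names; ties receive average ranks.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v**2 for v in x]  # monotonic but non-linear in x

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
```

Because y increases monotonically with x, the Spearman coefficient is 1 while the Pearson coefficient falls short of 1.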

Dimensionality Reduction

  • Principal Component Analysis
  • Factor Analysis
  • Multidimensional Scaling
  • t-SNE

Clustering

  • K-means
  • Hierarchical Clustering
  • Mixture Models
  • Density-based Clustering
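
K-means is commonly implemented with Lloyd's algorithm: assign each point to its nearest centroid, recompute each centroid as the mean of its assigned points, and repeat until nothing changes. The sketch below follows that loop for points given as tuples. The function names, the choice to pass initial centroids explicitly, and the rule of keeping an empty cluster's old centroid are assumptions of this example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def nearest_index(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))

def k_means(points, initial_centroids, max_iters=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iters):
        # Assignment step: group points by nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(points, initial_centroids=[(0, 0), (10, 10)])
```

The result depends on the initial centroids, which is why practical implementations usually run several random restarts or use a seeding scheme.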

Bayesian Statistics

Bayesian Inference

  • Prior Distributions
  • Likelihood Functions
  • Posterior Distributions
  • Conjugate Priors
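
A conjugate prior makes the posterior available in closed form. The standard example is a Beta prior on a Bernoulli success probability: after observing k successes in n trials, a Beta(a, b) prior becomes a Beta(a + k, b + n − k) posterior. The sketch below applies that update; the function names and the coin-toss numbers are illustrative.

```python
from fractions import Fraction

def beta_binomial_update(alpha, beta, successes, trials):
    """Posterior Beta parameters after binomial data, given a Beta(alpha, beta) prior."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return alpha + successes, beta + (trials - successes)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return Fraction(alpha, alpha + beta)

# Uniform prior Beta(1, 1), then 7 heads observed in 10 tosses (hypothetical data).
post_alpha, post_beta = beta_binomial_update(1, 1, successes=7, trials=10)
posterior_mean = beta_mean(post_alpha, post_beta)  # 2/3
```

The posterior mean of 2/3 sits between the prior mean of 1/2 and the observed frequency of 7/10, and moves toward the data as more trials are observed.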

Bayesian Methods

  • Bayesian Linear Regression
  • Bayesian Networks
  • Markov Chain Monte Carlo
  • Variational Inference