Probability Distributions

An overview of common probability distributions, their properties, and their applications in machine learning and statistics.

Probability distributions describe the likelihood of different outcomes in random phenomena and are fundamental to statistical modeling and machine learning.

Discrete Distributions

Bernoulli Distribution

  • PMF: $P(X = k) = p^k(1-p)^{1-k}$, $k \in \{0,1\}$
  • Mean: $\mu = p$
  • Variance: $\sigma^2 = p(1-p)$
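
import numpy as np


class BernoulliDistribution:
    """Single trial with success probability p."""

    def __init__(self, p):
        self.p = p

    def pmf(self, k):
        # The support is {0, 1}; any other value has probability zero.
        if k == 1:
            return self.p
        if k == 0:
            return 1 - self.p
        return 0.0

    def sample(self, size=1):
        # A Bernoulli draw is a binomial draw with a single trial.
        return np.random.binomial(1, self.p, size)

    def mean(self):
        return self.p

    def variance(self):
        return self.p * (1 - self.p)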

Binomial Distribution

  • PMF: $P(X = k) = \binom{n}{k}p^k(1-p)^{n-k}$
  • Mean: $\mu = np$
  • Variance: $\sigma^2 = np(1-p)$

import numpy as np
from scipy import stats


class BinomialDistribution:
    """Number of successes in n independent trials, each with success probability p."""

    def __init__(self, n, p):
        self.n = n
        self.p = p

    def pmf(self, k):
        # P(X = k)
        return stats.binom.pmf(k, self.n, self.p)

    def cdf(self, k):
        # P(X <= k)
        return stats.binom.cdf(k, self.n, self.p)

    def sample(self, size=1):
        return np.random.binomial(self.n, self.p, size)

    def mean(self):
        return self.n * self.p

    def variance(self):
        return self.n * self.p * (1 - self.p)

Poisson Distribution

  • PMF: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
  • Mean: $\mu = \lambda$
  • Variance: $\sigma^2 = \lambda$
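
import numpy as np
from scipy import stats


class PoissonDistribution:
    """Count of events in a fixed interval, with average rate lambda_."""

    def __init__(self, lambda_):
        # Trailing underscore avoids clashing with the Python keyword `lambda`.
        self.lambda_ = lambda_

    def pmf(self, k):
        # P(X = k)
        return stats.poisson.pmf(k, self.lambda_)

    def cdf(self, k):
        # P(X <= k)
        return stats.poisson.cdf(k, self.lambda_)

    def sample(self, size=1):
        return np.random.poisson(self.lambda_, size)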

Geometric Distribution

  • Success probability
  • Memoryless property
  • Applications
  • Waiting time problems
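
The geometric distribution can be sketched in the same style as the classes in this section. This is a minimal sketch, assuming the "number of trials until the first success" convention (support k = 1, 2, ...), which is the one `scipy.stats.geom` and `np.random.geometric` use. The class name and the `survival` helper are illustrative and not part of the original text.

```python
import numpy as np
from scipy import stats


class GeometricDistribution:
    """Trials until the first success; support k = 1, 2, ... (assumed convention)."""

    def __init__(self, p):
        self.p = p

    def pmf(self, k):
        # P(X = k) = (1 - p)^(k - 1) * p
        return stats.geom.pmf(k, self.p)

    def survival(self, k):
        # P(X > k) = (1 - p)^k
        return stats.geom.sf(k, self.p)

    def sample(self, size=1):
        return np.random.geometric(self.p, size)

    def mean(self):
        return 1 / self.p

    def variance(self):
        return (1 - self.p) / self.p ** 2


geom = GeometricDistribution(p=0.25)

# Memoryless property: P(X > m + n | X > m) = P(X > n)
m, n = 3, 5
conditional = geom.survival(m + n) / geom.survival(m)
print(np.isclose(conditional, geom.survival(n)))
```

The memoryless check is what makes the distribution a natural model for waiting-time problems: having already waited m trials does not change the distribution of the remaining wait.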

Continuous Distributions

Normal Distribution

  • PDF: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
  • CDF: $F(x) = \frac{1}{2}\left[1 + \text{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$

import numpy as np
from scipy import stats


class NormalDistribution:
    """Normal distribution with mean mu and standard deviation sigma."""

    def __init__(self, mu=0, sigma=1):
        self.mu = mu
        self.sigma = sigma

    def pdf(self, x):
        return stats.norm.pdf(x, self.mu, self.sigma)

    def cdf(self, x):
        # P(X <= x)
        return stats.norm.cdf(x, self.mu, self.sigma)

    def sample(self, size=1):
        return np.random.normal(self.mu, self.sigma, size)

    def quantile(self, p):
        # Inverse CDF: the value x such that P(X <= x) = p.
        return stats.norm.ppf(p, self.mu, self.sigma)

Uniform Distribution

  • Definition
  • Properties
  • Random number generation
  • Sampling applications
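
As a rough illustration of the points above, the sketch below builds a continuous uniform distribution on [a, b] and then uses Uniform(0, 1) draws for inverse transform sampling. The endpoints, the seed, and the choice of the standard normal as the target are assumptions made for the example.

```python
import numpy as np
from scipy import stats

a, b = 2.0, 5.0  # illustrative endpoints
uniform = stats.uniform(loc=a, scale=b - a)  # SciPy uses loc = a and scale = b - a

print(uniform.pdf(3.0))  # constant density 1 / (b - a) inside [a, b]
print(uniform.mean())    # (a + b) / 2
print(uniform.var())     # (b - a)^2 / 12

# Inverse transform sampling: push Uniform(0, 1) draws through the target
# distribution's inverse CDF (here the standard normal quantile function).
rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1.0, size=100_000)
z = stats.norm.ppf(u)

print(z.mean(), z.std())  # should be close to 0 and 1
```

Inverse transform sampling is one reason uniform random numbers sit underneath most other samplers: any distribution with a computable quantile function can be sampled this way.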

Exponential Distribution

  • Rate parameter
  • Memoryless property
  • Relationship to Poisson
  • Survival analysis
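
The memoryless property and the link to the Poisson distribution can both be checked numerically. This is a sketch under assumed values: the rate `lam`, the times `s` and `t`, and the seed are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

lam = 3.0  # rate parameter (events per unit time); illustrative value
expon = stats.expon(scale=1 / lam)  # SciPy parameterizes by scale = 1 / rate

# Memoryless property: P(X > s + t | X > s) = P(X > t)
s, t = 0.4, 0.7
conditional = expon.sf(s + t) / expon.sf(s)

# Relationship to Poisson: if gaps between events are Exponential(lam), the
# number of events in each unit-length interval is Poisson(lam).
rng = np.random.default_rng(0)
gaps = rng.exponential(scale=1 / lam, size=200_000)
arrival_times = np.cumsum(gaps)
horizon = int(arrival_times[-1])  # number of complete unit intervals observed
in_window = arrival_times[arrival_times < horizon]
counts = np.bincount(in_window.astype(int), minlength=horizon)

print(conditional, expon.sf(t))
print(counts.mean(), counts.var())  # both should be close to lam
```

The survival function `sf` used here is the same quantity that survival analysis works with: the probability that the waiting time exceeds a given value.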

Student's t-Distribution

  • Degrees of freedom
  • Relationship to normal
  • Robust statistics
  • Small sample inference
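
For small-sample inference, the t-distribution replaces the normal when the standard deviation is estimated from the data. The sketch below compares the two critical values for a 95% confidence interval on a mean. The synthetic sample, its size, and the seed are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=5)  # synthetic small sample
n = sample.size
mean = sample.mean()
sem = sample.std(ddof=1) / np.sqrt(n)  # standard error of the mean

t_crit = stats.t.ppf(0.975, df=n - 1)  # heavier tails give a wider interval
z_crit = stats.norm.ppf(0.975)

t_interval = (mean - t_crit * sem, mean + t_crit * sem)
z_interval = (mean - z_crit * sem, mean + z_crit * sem)

# Relationship to normal: as the degrees of freedom grow, the t quantile
# approaches the normal quantile.
large_df_crit = stats.t.ppf(0.975, df=1000)

print(t_crit, z_crit, large_df_crit)
```

With only four degrees of freedom the t interval is noticeably wider than the normal one, which reflects the extra uncertainty from estimating the standard deviation.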

Multivariate Distributions

  1. Multivariate Normal

    • Parameters
    • Properties
    • Covariance structure
    • Applications
  2. Dirichlet Distribution

    • Parameters
    • Properties
    • Relationship to Beta
    • Topic modeling
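
A short sketch of both multivariate distributions, using NumPy's `Generator` API. The mean vector, covariance matrix, concentration parameters, and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Multivariate normal: parameterized by a mean vector and a covariance matrix.
mean = np.array([1.0, -2.0])
cov = np.array([[2.0, 0.6],
                [0.6, 1.0]])  # symmetric positive definite (illustrative values)
x = rng.multivariate_normal(mean, cov, size=100_000)
sample_cov = np.cov(x, rowvar=False)  # should recover the covariance structure

# Dirichlet: each sample is a probability vector (non-negative, summing to 1),
# which is why it is used as a prior over topic proportions.
alpha = np.array([2.0, 3.0, 5.0])
theta = rng.dirichlet(alpha, size=100_000)

# Relationship to Beta: component i is marginally Beta(alpha_i, alpha_0 - alpha_i)
# with alpha_0 = alpha.sum(), so its mean is alpha_i / alpha_0.
expected_mean = alpha / alpha.sum()

print(sample_cov)
print(theta.mean(axis=0), expected_mean)
```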

Special Distributions

Beta Distribution

  • Parameters
  • Properties
  • Conjugate priors
  • Probability modeling
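
The conjugate-prior property can be shown in a few lines: a Beta prior on a success probability, combined with Bernoulli or binomial data, gives a Beta posterior. The function name `beta_posterior` and the counts below are hypothetical, chosen only for illustration.

```python
from scipy import stats


def beta_posterior(a, b, successes, failures):
    """Posterior Beta parameters after observing Bernoulli outcomes.

    A Beta(a, b) prior combined with a Bernoulli/binomial likelihood
    yields a Beta(a + successes, b + failures) posterior.
    """
    return a + successes, b + failures


# Illustrative numbers: uniform prior Beta(1, 1), then 7 successes and 3 failures.
post_a, post_b = beta_posterior(1, 1, successes=7, failures=3)
posterior = stats.beta(post_a, post_b)

print(posterior.mean())          # (a + successes) / (a + b + n)
print(posterior.interval(0.95))  # central 95% credible interval
```

Because the update is just addition of counts, the posterior after one batch of data can serve directly as the prior for the next batch.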

Gamma Distribution

  • Shape and scale
  • Properties
  • Relationship to others
  • Applications
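
The shape/scale parameterization and the link to the exponential distribution can be sketched as follows. The parameter values and seed are assumptions made for the example; `scipy.stats.gamma` takes the shape as `a` and uses `scale` (the reciprocal of the rate).

```python
import numpy as np
from scipy import stats

shape, scale = 3.0, 0.5  # illustrative values; rate = 1 / scale
gamma = stats.gamma(a=shape, scale=scale)

print(gamma.mean())  # shape * scale
print(gamma.var())   # shape * scale**2

# Relationship to the exponential: Gamma(shape=1, scale) is Exponential(rate=1/scale).
x = np.linspace(0.1, 5.0, 50)
same_as_expon = np.allclose(stats.gamma.pdf(x, a=1.0, scale=scale),
                            stats.expon.pdf(x, scale=scale))

# For integer shape, a sum of `shape` independent exponentials is Gamma distributed.
rng = np.random.default_rng(0)
sums = rng.exponential(scale=scale, size=(100_000, int(shape))).sum(axis=1)

print(same_as_expon, sums.mean(), sums.var())
```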

Applications in Machine Learning

Model Building

  • Distribution assumptions
  • Parameter estimation
  • Model selection
  • Validation

Probabilistic Models

  • Mixture models
  • Latent variable models
  • Bayesian networks
  • Generative models
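
A two-component Gaussian mixture is a small example that touches several of these ideas at once: it is a mixture model, the component label is a latent variable, and sampling from it is a generative process. The weights, means, standard deviations, and function names below are illustrative assumptions.

```python
import numpy as np
from scipy import stats

weights = np.array([0.3, 0.7])  # mixing proportions, must sum to 1
means = np.array([-2.0, 1.5])
sigmas = np.array([0.5, 1.0])


def mixture_pdf(x):
    """Density of the mixture: weighted sum of the component densities."""
    x = np.asarray(x, dtype=float)[..., None]  # add an axis to broadcast over components
    return (weights * stats.norm.pdf(x, means, sigmas)).sum(axis=-1)


def mixture_sample(size, rng):
    """Ancestral sampling: draw the latent component, then draw from that component."""
    component = rng.choice(len(weights), size=size, p=weights)
    return rng.normal(means[component], sigmas[component])


rng = np.random.default_rng(0)
samples = mixture_sample(100_000, rng)

print(samples.mean(), (weights * means).sum())  # sample mean vs. mixture mean
```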

Sampling and Estimation

  1. Sampling Methods

    • Direct sampling
    • Rejection sampling
    • Importance sampling
    • MCMC methods
  2. Parameter Estimation

    • Maximum likelihood
    • Method of moments
    • Bayesian estimation
    • Distribution fitting
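
The sketch below pairs one sampling method with one estimation method from the lists above: rejection sampling from a Beta(2, 5) target with a Uniform(0, 1) proposal, followed by a method-of-moments estimate of the Beta parameters from the accepted samples. The target, the envelope constant `M`, the function name, and the seed are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

a_true, b_true = 2.0, 5.0  # illustrative target: Beta(2, 5)
target = stats.beta(a_true, b_true)
M = 2.5  # envelope constant; the target pdf peaks at about 2.46 (x = 0.2)


def rejection_sample(n_proposals, rng):
    """Rejection sampling with a Uniform(0, 1) proposal (density 1 on [0, 1])."""
    x = rng.uniform(0.0, 1.0, size=n_proposals)  # proposals
    u = rng.uniform(0.0, 1.0, size=n_proposals)  # acceptance thresholds
    accepted = u * M <= target.pdf(x)            # accept with probability pdf(x) / M
    return x[accepted]


n_proposals = 200_000
rng = np.random.default_rng(0)
samples = rejection_sample(n_proposals, rng)
acceptance_rate = samples.size / n_proposals  # expected to be about 1 / M

# Method of moments for a Beta distribution: match the sample mean and variance.
m, v = samples.mean(), samples.var()
common = m * (1 - m) / v - 1  # estimate of a + b
a_hat, b_hat = m * common, (1 - m) * common

print(acceptance_rate, a_hat, b_hat)
```

A tighter envelope (smaller `M`, or a proposal shaped more like the target) raises the acceptance rate, which is the main tuning decision in rejection sampling.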

Practical Considerations

Implementation

  • Numerical stability
  • Random number generation
  • Library support
  • Computational efficiency
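
Numerical stability is the easiest of these points to demonstrate. The sketch below shows a likelihood product underflowing to zero and the usual remedy of working in log space; it also uses a seeded generator so the draws are reproducible. The data size and the log-probability values are arbitrary illustrative choices.

```python
import numpy as np
from scipy import stats
from scipy.special import logsumexp

rng = np.random.default_rng(0)  # seeded generator for reproducible draws
data = rng.normal(size=2_000)

# Naive likelihood: a product of 2,000 densities underflows to exactly 0.0.
naive_likelihood = np.prod(stats.norm.pdf(data))

# Stable alternative: sum log-densities instead of multiplying densities.
log_likelihood = stats.norm.logpdf(data).sum()

# logsumexp adds probabilities stored as logs without leaving log space.
log_probs = np.array([-1000.0, -1001.0])
with np.errstate(divide="ignore"):
    naive_log_sum = np.log(np.exp(log_probs).sum())  # exp underflows, so this is -inf
stable_log_sum = logsumexp(log_probs)                # about -999.69

print(naive_likelihood, log_likelihood)
print(naive_log_sum, stable_log_sum)
```

Most SciPy distributions expose `logpdf` or `logpmf` alongside `pdf` and `pmf`, so the stable version usually costs nothing extra to write.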

Common Challenges

  • Parameter estimation
  • Model selection
  • Goodness of fit
  • Distribution testing
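
Goodness of fit and distribution testing can be illustrated with a Kolmogorov-Smirnov test. This is a minimal sketch on synthetic data, assuming the reference distribution N(0, 1) is fully specified in advance; the sample sizes and seed are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
normal_data = rng.normal(loc=0.0, scale=1.0, size=2_000)
skewed_data = rng.exponential(scale=1.0, size=2_000)

# Kolmogorov-Smirnov test against a fully specified N(0, 1) reference.
# The statistic is the largest gap between the empirical and reference CDFs.
good_fit = stats.kstest(normal_data, "norm", args=(0.0, 1.0))
bad_fit = stats.kstest(skewed_data, "norm", args=(0.0, 1.0))

print(good_fit.statistic, good_fit.pvalue)
print(bad_fit.statistic, bad_fit.pvalue)  # large statistic, p-value near zero
```

One caveat: if the reference parameters are estimated from the same data being tested, the standard KS p-value tends to be too optimistic, so a corrected test or a held-out sample is the safer choice.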