Probability Distributions
Understanding common probability distributions, their properties, and applications in machine learning and statistics.
Probability distributions describe the likelihood of different outcomes in random phenomena and are fundamental to statistical modeling and machine learning.
Discrete Distributions
Bernoulli Distribution
- PMF: $P(X = k) = p^k (1 - p)^{1 - k}$, $k \in \{0, 1\}$
- Mean: $p$
- Variance: $p(1 - p)$
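import numpy as np

class BernoulliDistribution:
    """Single trial with success probability p (support {0, 1})."""
    def __init__(self, p):
        self.p = p
    def pmf(self, k):
        # Outcomes other than 0 and 1 have zero probability
        if k == 1:
            return self.p
        return 1 - self.p if k == 0 else 0.0
    def sample(self, size=1):
        # A Bernoulli draw is a binomial draw with n = 1
        return np.random.binomial(1, self.p, size)
    def mean(self):
        return self.p
    def variance(self):
        return self.p * (1 - self.p)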
Binomial Distribution
- PMF: $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$, $k = 0, 1, \dots, n$
- Mean: $np$
- Variance: $np(1 - p)$
import numpy as np
from scipy import stats

class BinomialDistribution:
    """Number of successes in n independent trials, each with success probability p."""
    def __init__(self, n, p):
        self.n = n
        self.p = p
    def pmf(self, k):
        return stats.binom.pmf(k, self.n, self.p)
    def cdf(self, k):
        return stats.binom.cdf(k, self.n, self.p)
    def sample(self, size=1):
        return np.random.binomial(self.n, self.p, size)
    def mean(self):
        return self.n * self.p
    def variance(self):
        return self.n * self.p * (1 - self.p)
Poisson Distribution
- PMF: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$, $k = 0, 1, 2, \dots$
- Mean: $\lambda$
- Variance: $\lambda$
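import numpy as np
from scipy import stats

class PoissonDistribution:
    def __init__(self, lambda_):
        self.lambda_ = lambda_
    def pmf(self, k):
        return stats.poisson.pmf(k, self.lambda_)
    def cdf(self, k):
        return stats.poisson.cdf(k, self.lambda_)
    def sample(self, size=1):
        return np.random.poisson(self.lambda_, size)
    # Mean and variance both equal the rate parameter
    def mean(self):
        return self.lambda_
    def variance(self):
        return self.lambda_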
Geometric Distribution
- Success probability
- Memory-less property
- Applications
- Waiting time problems
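The geometric distribution can be wrapped in the same style as the classes above. The sketch below is only an illustration: the class name `GeometricDistribution` is hypothetical, and it assumes the convention used by SciPy and NumPy in which the variable counts trials up to and including the first success (support $k = 1, 2, \dots$). It also checks the memoryless property $P(X > s + t \mid X > s) = P(X > t)$ numerically.

```python
import numpy as np
from scipy import stats

class GeometricDistribution:
    """Trials up to and including the first success (support k = 1, 2, ...)."""
    def __init__(self, p):
        self.p = p
    def pmf(self, k):
        return stats.geom.pmf(k, self.p)
    def cdf(self, k):
        return stats.geom.cdf(k, self.p)
    def sample(self, size=1):
        return np.random.geometric(self.p, size)
    def mean(self):
        return 1 / self.p
    def variance(self):
        return (1 - self.p) / self.p ** 2

geom = GeometricDistribution(0.2)

# Memoryless property: having already waited s trials does not change
# the distribution of the remaining waiting time.
s, t = 3, 4
conditional_tail = stats.geom.sf(s + t, geom.p) / stats.geom.sf(s, geom.p)
unconditional_tail = stats.geom.sf(t, geom.p)
```

With $p = 0.2$ the expected waiting time is $1/p = 5$ trials, which is the kind of quantity a waiting time problem typically asks for.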
Continuous Distributions
Normal Distribution
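- PDF: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
- CDF: $F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x - \mu}{\sigma\sqrt{2}}\right)\right]$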
import numpy as np
from scipy import stats

class NormalDistribution:
    """Normal distribution with mean mu and standard deviation sigma."""
    def __init__(self, mu=0, sigma=1):
        self.mu = mu
        self.sigma = sigma
    def pdf(self, x):
        return stats.norm.pdf(x, self.mu, self.sigma)
    def cdf(self, x):
        return stats.norm.cdf(x, self.mu, self.sigma)
    def sample(self, size=1):
        return np.random.normal(self.mu, self.sigma, size)
    def quantile(self, p):
        # Inverse CDF (percent point function)
        return stats.norm.ppf(p, self.mu, self.sigma)
Uniform Distribution
- Definition
- Properties
- Random number generation
- Sampling applications
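One way to see why the uniform distribution underlies random number generation is inverse transform sampling: a Uniform(0, 1) draw pushed through the inverse CDF of a target distribution yields a draw from that target. The sketch below assumes an exponential target, and the interval endpoints `a`, `b` and the `rate` value are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Uniform(a, b): mean (a + b) / 2, variance (b - a)^2 / 12
a, b = 2.0, 5.0
u = rng.uniform(a, b, size=100_000)
uniform_mean = (a + b) / 2
uniform_var = (b - a) ** 2 / 12

# Inverse transform sampling: if U ~ Uniform(0, 1), then
# X = -ln(1 - U) / rate follows an Exponential distribution with that rate.
rate = 1.5
u01 = rng.uniform(0.0, 1.0, size=100_000)
x = -np.log1p(-u01) / rate
```

The sample mean of `x` should land close to `1 / rate`, and the sample mean and variance of `u` close to `uniform_mean` and `uniform_var`.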
Exponential Distribution
- Rate parameter
- Memory-less property
- Relationship to Poisson
- Survival analysis
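The relationship to the Poisson distribution can be checked by simulation: if the gaps between events are independent Exponential draws with a given rate, the number of events falling in each unit-length interval should be approximately Poisson with the same rate. This is a rough sketch with an arbitrary `rate` and sample size, not a formal proof.

```python
import numpy as np

rng = np.random.default_rng(42)
rate = 3.0

# Exponential inter-arrival times with mean 1 / rate
gaps = rng.exponential(scale=1 / rate, size=200_000)
arrival_times = np.cumsum(gaps)

# Count the arrivals in each unit interval [0, 1), [1, 2), ...
# Only intervals fully covered by the simulation are kept.
horizon = int(arrival_times[-1])
in_range = arrival_times[arrival_times < horizon]
counts = np.bincount(in_range.astype(int), minlength=horizon)

# For a Poisson distribution the mean and the variance both equal the rate
count_mean = counts.mean()
count_var = counts.var()
```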
Student's t-Distribution
- Degrees of freedom
- Relationship to normal
- Robust statistics
- Small sample inference
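For small sample inference, the t-distribution replaces the normal when the standard deviation is estimated from the data. The sketch below builds a 95% confidence interval for a mean from eight made-up measurements (the `sample` values are hypothetical) and compares the t critical value with the normal one.

```python
import numpy as np
from scipy import stats

# Hypothetical small sample, for illustration only
sample = np.array([4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0])
n = sample.size

sample_mean = sample.mean()
standard_error = sample.std(ddof=1) / np.sqrt(n)

# Two-sided 95% interval using n - 1 degrees of freedom
t_crit = stats.t.ppf(0.975, df=n - 1)
ci = (sample_mean - t_crit * standard_error,
      sample_mean + t_crit * standard_error)

# The normal critical value is smaller, so a normal-based interval would be
# too narrow here. The gap shrinks as the degrees of freedom grow.
z_crit = stats.norm.ppf(0.975)
t_crit_large_df = stats.t.ppf(0.975, df=10_000)
```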
Multivariate Distributions
Multivariate Normal
- Parameters
- Properties
- Covariance structure
- Applications
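A minimal sketch of working with a multivariate normal, assuming a two-dimensional example with an arbitrary mean vector `mu` and covariance matrix `cov`. It evaluates the density with `scipy.stats.multivariate_normal`, draws samples with NumPy, and checks that the empirical covariance recovers the covariance structure.

```python
import numpy as np
from scipy import stats

mu = np.array([0.0, 1.0])
cov = np.array([[2.0, 0.6],
                [0.6, 1.0]])  # symmetric and positive definite

mvn = stats.multivariate_normal(mean=mu, cov=cov)

# The density peaks at the mean: 1 / (2 * pi * sqrt(det(cov))) in two dimensions
density_at_mean = mvn.pdf(mu)

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, cov, size=50_000)

# Rows are observations, columns are variables
empirical_mean = samples.mean(axis=0)
empirical_cov = np.cov(samples, rowvar=False)
```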
Dirichlet Distribution
- Parameters
- Properties
- Relationship to Beta
- Topic modeling
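The sketch below illustrates two Dirichlet facts with an arbitrary concentration vector `alpha`: every sample is a probability vector (non-negative entries summing to one, which is why it is used for topic proportions), and each single component follows a Beta distribution with parameters $\alpha_i$ and $\alpha_0 - \alpha_i$, where $\alpha_0 = \sum_j \alpha_j$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = np.array([2.0, 3.0, 5.0])
alpha0 = alpha.sum()

# Each row is one draw: a point on the probability simplex
theta = rng.dirichlet(alpha, size=50_000)
row_sums = theta.sum(axis=1)

# Component means are alpha_i / alpha_0
expected_mean = alpha / alpha0

# Marginal of the first component: Beta(alpha_1, alpha_0 - alpha_1)
marginal = stats.beta(alpha[0], alpha0 - alpha[0])
```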
Special Distributions
Beta Distribution
- Parameters
- Properties
- Conjugate priors
- Probability modeling
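Conjugacy is easiest to see with a Bernoulli or binomial likelihood: a Beta(a, b) prior on the success probability, combined with observed successes and failures, gives a Beta(a + successes, b + failures) posterior. The prior parameters and the counts below are hypothetical values chosen for illustration.

```python
from scipy import stats

# Prior belief about a success probability
a_prior, b_prior = 2.0, 2.0

# Hypothetical observed data: 7 successes and 3 failures
successes, failures = 7, 3

# Conjugate update: add the counts to the prior parameters
a_post = a_prior + successes
b_post = b_prior + failures
posterior = stats.beta(a_post, b_post)

posterior_mean = posterior.mean()           # a_post / (a_post + b_post)
credible_interval = posterior.interval(0.95)
```

No numerical integration is needed, which is the practical appeal of a conjugate prior.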
Gamma Distribution
- Shape and scale
- Properties
- Relationship to others
- Applications
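A short sketch of the shape and scale parameterization and of two relationships to the exponential distribution: a Gamma with shape 1 is an Exponential with the same scale, and the sum of `k` independent Exponential variables with a common scale is Gamma with shape `k`. The parameter values are arbitrary.

```python
import numpy as np
from scipy import stats

shape, scale = 3.0, 2.0
gamma = stats.gamma(a=shape, scale=scale)

# Mean = shape * scale, variance = shape * scale^2
gamma_mean = gamma.mean()
gamma_var = gamma.var()

# Gamma(shape=1, scale) has the same density as Exponential(scale)
x = np.linspace(0.1, 10.0, 50)
gamma_pdf_shape1 = stats.gamma.pdf(x, a=1.0, scale=scale)
expon_pdf = stats.expon.pdf(x, scale=scale)

# Sum of 3 independent exponentials ~ Gamma(shape=3, scale)
rng = np.random.default_rng(2)
sums = rng.exponential(scale, size=(100_000, 3)).sum(axis=1)
```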
Applications in Machine Learning
Model Building
- Distribution assumptions
- Parameter estimation
- Model selection
- Validation
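As one possible illustration of parameter estimation followed by model selection, the sketch below fits two candidate distributions to the same synthetic, skewed data by maximum likelihood and compares them with the Akaike information criterion (AIC, lower is better). The data, the candidate set, and the helper name `aic` are assumptions made for this example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = rng.gamma(shape=3.0, scale=2.0, size=5_000)  # synthetic, right-skewed

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood

# Candidate 1: normal, two free parameters (mean, standard deviation)
mu, sigma = stats.norm.fit(data)
ll_norm = stats.norm.logpdf(data, mu, sigma).sum()

# Candidate 2: gamma with the location fixed at 0, two free parameters
a, loc, scale = stats.gamma.fit(data, floc=0)
ll_gamma = stats.gamma.logpdf(data, a, loc=loc, scale=scale).sum()

aic_norm = aic(ll_norm, 2)
aic_gamma = aic(ll_gamma, 2)
```

Because the data are skewed and strictly positive, the gamma candidate should come out with the lower AIC; validation on held-out data would be the natural next check.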
Probabilistic Models
- Mixture models
- Latent variable models
- Bayesian networks
- Generative models
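A two-component Gaussian mixture is a small example that is at once a mixture model, a latent variable model, and a generative model: a hidden component label is drawn first, then the observation is drawn from that component. The weights, means, and standard deviations below are arbitrary, and `mixture_pdf` is a hypothetical helper name.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

weights = np.array([0.3, 0.7])   # mixing proportions, sum to 1
means = np.array([-2.0, 3.0])
sigmas = np.array([0.5, 1.0])

# Generative process: latent label z, then observation x given z
z = rng.choice(len(weights), size=100_000, p=weights)
x = rng.normal(means[z], sigmas[z])

def mixture_pdf(values):
    """Marginal density: weighted sum of the component densities."""
    values = np.asarray(values, dtype=float)
    return sum(w * stats.norm.pdf(values, m, s)
               for w, m, s in zip(weights, means, sigmas))

# The mixture mean is the weighted average of the component means
mixture_mean = float(np.dot(weights, means))
```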
Sampling and Estimation
Sampling Methods
- Direct sampling
- Rejection sampling
- Importance sampling
- MCMC methods
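Two of the methods listed above can be sketched in a few lines. The first part uses rejection sampling to draw from a Beta(2, 2) target with a uniform proposal; the second uses importance sampling to estimate a small normal tail probability. The target, the proposals, and the sample sizes are choices made for this illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Rejection sampling
# Target: Beta(2, 2) with density f(x) = 6x(1 - x) on [0, 1]
# Proposal: Uniform(0, 1); envelope constant M = max f = 1.5
def target_pdf(x):
    return 6.0 * x * (1.0 - x)

M = 1.5
proposals = rng.uniform(0.0, 1.0, size=60_000)
u = rng.uniform(0.0, 1.0, size=60_000)
accepted = proposals[u * M <= target_pdf(proposals)]
acceptance_rate = accepted.size / proposals.size  # about 1 / M in expectation

# Importance sampling
# Estimate P(Z > 4) for Z ~ N(0, 1) by sampling from N(4, 1), where the
# event is common, and reweighting by target density / proposal density.
x = rng.normal(loc=4.0, scale=1.0, size=100_000)
weights = stats.norm.pdf(x) / stats.norm.pdf(x, loc=4.0)
tail_estimate = np.mean((x > 4.0) * weights)
tail_exact = stats.norm.sf(4.0)
```

Direct sampling from N(0, 1) would almost never produce a value above 4 at this sample size, which is why the shifted proposal helps.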
Parameter Estimation
- Maximum likelihood
- Method of moments
- Bayesian estimation
- Distribution fitting
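The sketch below contrasts method of moments with maximum likelihood on synthetic gamma data, and shows the closed-form Poisson case where both coincide with the sample mean. The true parameters and sample sizes are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Poisson: the MLE of the rate is the sample mean
poisson_data = rng.poisson(lam=4.0, size=20_000)
lambda_hat = poisson_data.mean()

# Gamma data with known shape and scale
gamma_data = rng.gamma(shape=3.0, scale=2.0, size=50_000)

# Method of moments: match mean = shape * scale and variance = shape * scale^2
m = gamma_data.mean()
v = gamma_data.var(ddof=1)
shape_mom = m ** 2 / v
scale_mom = v / m

# Maximum likelihood via numerical optimization, location fixed at 0
shape_mle, _, scale_mle = stats.gamma.fit(gamma_data, floc=0)
```

Both estimators should land near the true values here; in general the maximum likelihood estimate tends to be the more efficient of the two, while the moment estimate is cheaper and makes a reasonable starting point.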
Practical Considerations
Implementation
- Numerical stability
- Random number generation
- Library support
- Computational efficiency
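A common numerical stability issue is underflow when many small probabilities are multiplied. The sketch below shows the usual remedy of working in log space, using `logpdf` for likelihoods and `scipy.special.logsumexp` for sums of exponentials. The data are synthetic.

```python
import numpy as np
from scipy import stats
from scipy.special import logsumexp

rng = np.random.default_rng(3)
data = rng.normal(size=2_000)

# Multiplying 2,000 densities (each below 0.4) underflows to exactly 0.0
naive_likelihood = np.prod(stats.norm.pdf(data))

# Summing log-densities stays finite
log_likelihood = stats.norm.logpdf(data).sum()

# log(exp(-1000) + exp(-1000)) evaluated naively gives -inf,
# while logsumexp returns -1000 + log(2)
log_terms = np.array([-1000.0, -1000.0])
with np.errstate(divide="ignore"):
    naive_log_sum = np.log(np.exp(log_terms).sum())
stable_log_sum = logsumexp(log_terms)
```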
Common Challenges
- Parameter estimation
- Model selection
- Goodness of fit
- Distribution testing
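Goodness of fit can be checked with the Kolmogorov-Smirnov test, which compares the empirical CDF of the data with a hypothesized CDF. In this sketch the data are synthetic: one sample really is normal and is tested against its true parameters, the other is exponential and is tested against a normal with matched mean and standard deviation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
normal_data = rng.normal(loc=2.0, scale=1.5, size=2_000)
skewed_data = rng.exponential(scale=1.5, size=2_000)

# Correct model: the KS statistic (largest CDF gap) should be small
good_fit = stats.kstest(normal_data, "norm", args=(2.0, 1.5))

# Wrong model: exponential data against a normal distribution
bad_fit = stats.kstest(
    skewed_data, "norm",
    args=(skewed_data.mean(), skewed_data.std(ddof=1)),
)

ks_good, ks_bad = good_fit.statistic, bad_fit.statistic
p_bad = bad_fit.pvalue
```

Note that when the parameters are estimated from the same data being tested, as in the second call, the standard KS p-value is only approximate and tends to be conservative.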