Engineering Math

Probability density and mass functions

Consider an experiment that measures a random variable. We can plot the relative frequency of the measurand landing in different “bins” (ranges of values). This plot is called a frequency distribution, and it can be interpreted as an estimate of a probability mass function (PMF).

Figure 3.5: Plot of a probability mass function.

Consider, for instance, a probability mass function as plotted in fig. 3.5, where a frequency \(a_i\) can be interpreted as an estimate of the probability of the measurand being in the \(i\)th interval. The sum of the frequencies must be unity: \[\begin{aligned} \sum_{i=1}^k a_i = 1\end{aligned}\] with \(k\) being the number of bins.

The frequency density distribution is similar to the frequency distribution, but with \(a_i \mapsto a_i/\Delta{}x\), where \(\Delta x\) is the bin width.
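
This normalization is what NumPy’s histogram routine performs with density=True. The following is a minimal sketch, assuming NumPy is available and using a synthetic sample chosen purely for illustration.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)  # synthetic measurements (illustrative)

# Relative frequencies a_i: bin counts normalized by the total count
counts, edges = np.histogram(x, bins=20)
a = counts / counts.sum()

# Frequency densities a_i / Δx: density=True also divides by bin width
density, edges = np.histogram(x, bins=20, density=True)
dx = np.diff(edges)
print(np.allclose(density, a / dx))  # True
print(np.sum(density * dx))          # 1.0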

If we let the number of bins approach infinity and the bin width approach zero, we derive the probability density function (PDF) \[\begin{aligned} f(x) = \lim_{\substack{k\rightarrow\infty \\ \Delta x \rightarrow 0}} \frac{a_i}{\Delta{}x}, \end{aligned}\] where the \(i\)th bin is the one containing \(x\). We typically think of a probability density function \(f\), like the one in fig. 3.6, as a function that can be integrated over to find the probability of the random variable (measurand) being in an interval \([a,b]\): \[\begin{aligned} P(x\in[a,b]) &= \int_a^b f(\chi)\, d\chi. \end{aligned}\] Of course, \[\begin{aligned} P(x \in (-\infty,\infty)) &= \int_{-\infty}^\infty f(\chi)\, d\chi \\ &= 1.\end{aligned}\]
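
As a numerical check of this interpretation, we can integrate a PDF over \([a,b]\) and compare with the closed-form result. The following is a minimal sketch, assuming SciPy is available and using an exponential PDF \(f(x) = \lambda e^{-\lambda x}\) for \(x \ge 0\), chosen purely for illustration.

import numpy as np
from scipy.integrate import quad

lam = 2.0  # illustrative rate parameter

def f(x):
    return lam * np.exp(-lam * x)  # exponential PDF, x >= 0

a, b = 0.5, 1.5
P, _ = quad(f, a, b)  # numerical P(x in [a, b])
print(P, np.exp(-lam * a) - np.exp(-lam * b))  # the two agree

total, _ = quad(f, 0, np.inf)  # total probability is unity
print(total)  # 1.0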

Figure 3.6: Plot of a probability density function.

We now consider a common PMF and a common PDF.

Binomial PMF

Consider a random binary sequence of length \(n\) such that each element is a random \(0\) or \(1\), generated independently, like \[\begin{aligned} (1, 0, 1, 1, 0, \cdots , 1, 1). \end{aligned}\] Let events \(\{1\}\) and \(\{0\}\) be mutually exclusive and exhaustive and \(P(\{1\}) = p\). By independence, the probability of the sequence above occurring is \[\begin{aligned} P((1, 0, 1, 1, 0, \cdots , 1, 1)) &= p (1-p) p p (1-p) \cdots p p,\end{aligned}\] and any particular sequence with \(k\) ones and \(n-k\) zeros has the same probability, \(p^k (1-p)^{n-k}\). There are “\(n\) choose \(k\),” \[\begin{aligned} \binom{n}{k} &= \frac{n!} {k! (n-k)!}, \end{aligned}\] possible arrangements of \(k\) ones among \(n\) bits. Therefore, the probability of observing exactly \(k\) ones in the sequence is \[\begin{aligned} \label{eq:binomial_pmf} f(k) &= \binom{n}{k} p^k (1-p)^{n-k}. \end{aligned}\] We call this the binomial distribution PMF.
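
A quick numerical check that this PMF sums to unity over \(k = 0, \ldots, n\) (a minimal sketch using Python’s standard library; the values of n and p are illustrative):

from math import comb

n, p = 100, 0.25  # illustrative values
total = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
print(total)  # 1.0, up to floating-point rounding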

Example 3.7

Consider a field sensor that fails for a given measurement with probability p. Given n measurements, plot the binomial PMF as a function of the number of failed measurements k for several failure probabilities p ∈ {0.04, 0.25, 0.5, 0.75, 0.96}.

We proceed in Python.

Import the necessary libraries.

import numpy as np
import matplotlib.pyplot as plt
from scipy.special import comb

Define the parameters.

n = 100  # number of measurements
k_a = np.arange(0, n + 1)  # numbers of failed measurements, k = 0, ..., n
p_a = np.array([0.04, 0.25, 0.5, 0.75, 0.96])  # failure probabilities

Define the binomial distribution function.

# Probability of exactly k ones (failures) in n independent trials
def binomial(n, k, p):
    return comb(n, k) * (p ** k) * ((1 - p) ** (n - k))

Construct the binomial distribution array.

# Evaluate the PMF at each k for each failure probability p
f_a = np.zeros((len(k_a), len(p_a)))
for i in range(len(k_a)):
    for j in range(len(p_a)):
        f_a[i, j] = binomial(n, k_a[i], p_a[j])
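
As a cross-check (a sketch assuming scipy.stats is available), the same array can be computed with the library’s built-in binomial PMF:

from scipy.stats import binom as binom_dist

# scipy.stats.binom.pmf broadcasts over the k array; stack one column per p
f_check = np.column_stack([binom_dist.pmf(k_a, n, p) for p in p_a])
print(np.allclose(f_a, f_check))  # True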

Plot the binomial distribution.

colors = plt.cm.jet(np.linspace(0, 1, len(p_a)))
fig, ax = plt.subplots()
for j in range(len(p_a)):
    ax.bar(k_a, f_a[:, j], color=colors[j], alpha=0.5, label=f'$p = {p_a[j]}$')
ax.legend(loc='best', frameon=False, fontsize='medium')
ax.set_xlabel('Number of failed measurements $k$')
ax.set_ylabel('Probability')
ax.set_xlim([0, n])
plt.show()
Figure 3.7: Binomial PMF for n = 100 measurements and different values of P({1}) = p, the probability of a measurement error.

Note that the symmetry is due to the fact that events {1} and {0} are mutually exclusive and exhaustive: a sequence with k ones has exactly n − k zeros, so f(k; p) = f(n − k; 1 − p), and the PMFs for complementary probabilities (e.g., p = 0.25 and p = 0.75) mirror each other about k = n/2.

Gaussian PDF

The Gaussian or normal random variable \(x\) has PDF \[\begin{aligned} f(x) &= \frac{1} {\sigma\sqrt{2\pi}} \exp\left(\frac{-(x-\mu)^2} {2\sigma^2}\right). \end{aligned}\] Although we’re not quite ready to understand these quantities in detail, it can be shown that the parameters \(\mu\) and \(\sigma\) have the following meanings:

  • \(\mu\) is the mean of \(x\),

  • \(\sigma\) is the standard deviation of \(x\), and

  • \(\sigma^2\) is the variance of \(x\).

Figure 3.8: PDF for Gaussian random variable x, mean μ = 0, and standard deviation $\sigma = 1/\sqrt{2}$.

Consider the “bell-shaped” Gaussian PDF in fig. 3.8. It is always symmetric about its mean. The mean \(\mu\) is its central value and the standard deviation \(\sigma\) is directly related to its width. We will continue to explore the Gaussian distribution in the following lectures, especially in section 4.3.
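
As a numerical check of these interpretations (a minimal sketch assuming SciPy is available; the parameters μ = 0 and σ = 1/√2 match fig. 3.8), we can integrate the Gaussian PDF to recover unit area, the mean, and the variance:

import numpy as np
from scipy.integrate import quad

mu, sigma = 0.0, 1 / np.sqrt(2)  # parameters from fig. 3.8

def f(x):
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

area, _ = quad(f, -np.inf, np.inf)                    # unit area
mean, _ = quad(lambda x: x * f(x), -np.inf, np.inf)   # recovers mu
var, _ = quad(lambda x: (x - mean)**2 * f(x), -np.inf, np.inf)  # sigma**2
print(area, mean, var)  # ~1.0, ~0.0, ~0.5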

Online Resources for Section 3.6

No online resources.