Transforming Random Variables
TODO: describe the theory and formulae
For random variables \(X\) and \(Y\) with PDFs \(f_X\) and \(f_Y\), and with invertible transformation \(Y = g(X)\), we have the linear approximation \[ f_Y(y) = \left.\frac{1}{|d y/d x|} f_X(x)\right|_{x \mapsto g^{-1}(y)}. \qquad{(1)}\]
Suppose we are to probabilistically quantify a parachutist’s chances of landing within a certain horizontal distance of a landing target, accounting for random wind displacements. Develop a PDF for the random variable R, the landing distance from the target.
Without much intuition, no data, or a very good physical model of the situation, we are left to bootstrap a solution. A toehold can perhaps be found by narrowing the problem to a drop of a fixed, relatively short distance, such as that shown in [@fig:skyfall1].
For each vertical drop of 10 m, we might expect a horizontal displacement of a few meters. Without any information about average prevailing winds, we cannot expect any particular direction to be most likely. It seems more likely that wind gusts would displace the parachutist a small amount than a large amount, and even less likely to displace a very large amount. These facts suggest a reasonable model to start with is a Gaussian distribution with PDF $$ f_X(x) = \frac{1}{\sqrt{2 \pi} \sigma} \exp\frac{-(x - \mu)^2}{2 \sigma^2}, $$ where μ = 0 m and σ = 5 m. This model could clearly be improved with some data or a detailed analysis of the physics involved, but this seems to be a reasonable place to begin.
From here, we can extrapolate. For one 10-m drop, the displacement random variable is X. For two 10-m drops, the displacement random variable is 2X, and so on. We conclude that for N drops of 10 m, the landing displacement random variable R is R = NX. Here we have assumed the parachutist lands after N drops of 10 m. Another way of writing this is R = h(X) = NX. The function h transforms random variable X (with value x) to random variable R with value (r).
We can apply [@eq:rv_transformation_invertible] directly to find the PDF of R as follows: $$\begin{align} f_Y(y) &= \left.\frac{1}{|d r/d x|} f_X(x)\right|_{x \mapsto h^{-1}(r)} \\ &= \frac{1}{N} \cdot \frac{1}{\sqrt{2 \pi} \sigma} \exp\frac{-(r/N - \mu)^2}{2 \sigma^2}. \end{align}$$ Letting μ′ = Nμ and σ′ = Nσ, we obtain $$ f_R(r) = \frac{1}{\sqrt{2 \pi} \sigma'} \exp\frac{-(x - \mu')^2}{2 \sigma'^2}. $$ That is, R also has a Gaussian PDF. We see that the linear transformation has simply transformed the mean μ and standard deviation σ accordingly.
We observe that for greater N (higher jumps), the standard deviation is also greater. This is an intuitive result. We now turn to Python for graphical and simulation purposes.
Load the necessary packages:
import numpy as np
import sympy as sp
import matplotlib.pyplot as pltDefine fixed parameters:
mu = 0.0 # Mean of the Gaussian distribution for the 10 m drop
sigma = 5.0 # Standard deviation of the Gaussian distribution for the 10 m dropDefine the 10 m drop Gaussian distribution fX(x) symbolically
x, r, N = sp.symbols('x, r, N', real=True)
f_X = 1/(sigma * sp.sqrt(2 * sp.pi)) * sp.exp(-(x - mu)**2 / (2 * sigma**2))
print(f_X)0.1*sqrt(2)*exp(-0.02*x**2)/sqrt(pi)
Define the functional relationship between X and R, the horizontal distance from the initial drop point
h_eq = sp.Eq(r, N * x)
h_sol = sp.solve(h_eq, r, dict=True)[0]
h_inv = sp.solve(h_eq, x, dict=True)[0]
dr_dx = sp.diff(h_sol[r], x)
print(dr_dx)N
Define symbolically fR(r), the probability density function for the horizontal distance from the initial drop point:
f_R = 1/sp.Abs(dr_dx) * f_X.subs(h_inv)
print(f_R)0.1*sqrt(2)*exp(-0.02*r**2/N**2)/(sqrt(pi)*Abs(N))
Lambdify the PDF for numerical evaluation
f_R_fun = sp.lambdify((r, N), f_R, 'numpy')Plot the PDF fR(r) for several values of N:
N_vals = np.array([600, 800, 1000])/10 # Drop steps of 10 m
r_vals = np.linspace(-1000, 1000, 1001)
fig, ax = plt.subplots()
for N_val in N_vals:
p_vals = f_R_fun(r_vals, N_val)
ax.plot(r_vals, p_vals, label=f'N = {N_val}')
ax.set_xlabel('$r_f$ (m)')
ax.set_ylabel('$p(r_f)$')
ax.legend()
plt.draw()Compute the probability of landing within ± 500 m of the initial drop point:
r_min, r_max = -500, 500
p_landing = sp.integrate(f_R, (r, r_min, r_max))
print(p_landing)Piecewise((0.707106781186548*sqrt(2)*Abs(N)*erf(70.7106781186548/Abs(N))/N, N >= 0), (-0.707106781186548*sqrt(2)*Abs(N)*erf(70.7106781186548/Abs(N))/N, True))
Plot the probability of landing within ± 500 m of the initial drop point as a function of N:
N_vals = np.linspace(1, 1200, 1001)
p_landing_fun = sp.lambdify(N, p_landing, 'numpy')
p_landing_vals = np.zeros(N_vals.shape[0]) # Preallocate
for i, N_val in enumerate(N_vals):
p_landing_vals[i] = p_landing_fun(N_val) # Evaluate
fig, ax = plt.subplots()
ax.plot(N_vals * 10, p_landing_vals)
ax.set_xlabel('Drop height (m)')
ax.set_ylabel('$p(\pm 500 m)$')
plt.draw()Define a function to take one 10 m drop:
def take_drop(x_previous):
x_new = x_previous + np.random.normal(mu, sigma)
return x_newDefine a function to simulate a random walk:
def simulate_random_walk(N_sim):
y_sim = np.flip(np.arange(0, N_sim + 1)) * 10 # Heights
x_sim = np.zeros(N_sim + 1) # Preallocate
x_sim[0] = 0 # Initial drop point
for i in range(1, N_sim + 1):
x_sim[i] = take_drop(x_sim[i - 1])
return x_sim, y_simSimulate several random walks (drops) for various values of N:
N_vals = [60, 80, 100]
n_sim = 50 # Number of simulations
x_sims = [np.zeros((n_sim, N_val+1)) for N_val in N_vals] # Preallocate
y_sims = [np.zeros((n_sim, N_val+1)) for N_val in N_vals] # Preallocate
for i, N_val in enumerate(N_vals):
for j in range(n_sim):
x_sim, y_sim = simulate_random_walk(N_val)
x_sims[i][j] = x_sim
y_sims[i][j] = y_simPlot the random walks (drops) for several values of N:
fig, ax = plt.subplots()
for i, N_val in enumerate(N_vals):
for j in range(n_sim):
ax.plot(
x_sims[i][j], y_sims[i][j],
color=f'C{i}', alpha=[0.7, 0.5, 0.3][i]
)
ax.set_xlabel('Horizontal distance (m)')
ax.set_ylabel('Height (m)')
plt.show()Online Resources for Section 3.9
No online resources.