Cumulative Distribution Function#

The PMF is one way to describe the distribution of a discrete random variable. As we will see later on, a PMF cannot be defined for continuous random variables. The cumulative distribution function (CDF) is another way to describe the distribution of a random variable. Its advantage is that it can be defined for any kind of random variable (discrete, continuous, and mixed) [Pishro-Nik, 2014].

The takeaway is that the CDF describes the distribution of any random variable. In particular, continuous random variables have no equivalent of the PMF, so we use the CDF instead.

Definition#

Definition 20 (Cumulative Distribution Function)

Let \(X\) be a discrete random variable with \(\S = \lset \xi_1, \xi_2, \ldots \rset\) where \(\xi_i \in \R\) for all \(i\). Note that \(X(\xi_i) = x_i\) for all \(i\), where each \(x_i\) is a state of \(X\). Assume the states are indexed in increasing order, \(x_1 < x_2 < \cdots\).

Then the cumulative distribution function \(\cdf\) is defined as

(5)#\[ \cdf(x_k) \overset{\text{def}}{=} \P \lsq X \leq x_k \rsq = \sum_{\ell=1}^k \P \lsq X = x_{\ell} \rsq = \sum_{\ell=1}^k \pmf(x_{\ell}) \]

Since \(\P \lsq X = x_{\ell} \rsq\) is the probability mass function, we can also replace the symbol with the \(\pmf\) symbol.
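The definition gives the CDF at the states \(x_k\); between states the value stays constant, since no probability mass is added there. A minimal sketch of evaluating the CDF at an arbitrary real \(x\) follows. The helper name `discrete_cdf` and the sample states and probabilities are illustrative assumptions, not part of the definition.

```python
import numpy as np


def discrete_cdf(x, states, probs):
    """Evaluate P[X <= x] for a discrete random variable (hypothetical helper)."""
    states = np.asarray(states, dtype=float)
    probs = np.asarray(probs, dtype=float)

    # Sort the states so that the running sum follows x_1 < x_2 < ...
    order = np.argsort(states)
    states, probs = states[order], probs[order]

    # cumulative[k] is the sum of the first k probabilities; cumulative[0] = 0.
    cumulative = np.concatenate(([0.0], np.cumsum(probs)))

    # side="right" counts how many states are <= x.
    k = np.searchsorted(states, x, side="right")
    return cumulative[k]


# Illustrative values (assumed): states 0, 1, 4 with masses 1/4, 1/2, 1/4.
states = [0, 1, 4]
probs = [0.25, 0.5, 0.25]

for x in (-1, 0, 0.5, 1, 3.9, 4, 10):
    print(x, discrete_cdf(x, states, probs))
```

Using `side="right"` makes the function include the mass at \(x\) itself, which matches the \(\leq\) in the definition.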

Example 8 (CDF)

Consider a random variable \(X\) with the following probability mass function:

\[\begin{split} \pmf(x) = \begin{cases} \frac{1}{4} & \text{if } x = 0 \\ \frac{1}{2} & \text{if } x = 1 \\ \frac{1}{4} & \text{if } x = 4 \\ \end{cases} \end{split}\]

Then, by Definition 20, the CDF of \(X\) at each state is:

\[\begin{split} \begin{align} \cdf(0) & = \P \lsq X \leq 0 \rsq = \P \lsq X = 0 \rsq = \frac{1}{4} \\ \cdf(1) & = \P \lsq X \leq 1 \rsq = \P \lsq X = 0 \rsq + \P \lsq X = 1 \rsq = \frac{1}{4} + \frac{1}{2} = \frac{3}{4} \\ \cdf(4) & = \P \lsq X \leq 4 \rsq = \P \lsq X = 0 \rsq + \P \lsq X = 1 \rsq + \P \lsq X = 4 \rsq = \frac{1}{4} + \frac{1}{2} + \frac{1}{4} = 1 \end{align} \end{split}\]

Thus, our CDF is given by:

\[\begin{split} \cdf(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{1}{4} & \text{if } 0 \leq x < 1 \\ \frac{3}{4} & \text{if } 1 \leq x < 4 \\ 1 & \text{if } x \geq 4 \end{cases} \end{split}\]

```python
import numpy as np
import matplotlib.pyplot as plt

# PMF of X from the example: the states and their probabilities.
x = np.array([0, 1, 4])
p = np.array([0.25, 0.5, 0.25])

# The CDF at each state is the running sum of the PMF.
F = np.cumsum(p)

# Plot the PMF and the CDF side by side, with the y axis running from 0 to 1.
fig, ax = plt.subplots(1, 2, sharex=False, sharey=False, figsize=(10, 5))

ax[0].set_ylim(0, 1)
ax[0].set_title("PMF")
ax[0].set_ylabel("Probability")
ax[0].set_xlabel("x")
# Recent Matplotlib releases no longer accept use_line_collection, so it is omitted.
ax[0].stem(x, p)
ax[0].grid(False)

ax[1].set_ylim(0, 1)
ax[1].set_title("CDF")
ax[1].set_ylabel("Probability")
ax[1].set_xlabel("x")
# where="post" holds each value until the next state, so the jump occurs at the state.
ax[1].step(x, F, where="post")
ax[1].grid(False)

plt.show()
```
Figure: PMF (left panel) and CDF (right panel) of \(X\).

Properties#

Theorem 5 (Properties of CDF)

Let \(X\) be a discrete random variable with \(\S = \lset \xi_1, \xi_2, \ldots \rset\) where \(\xi_i \in \R\) for all \(i\). Then, the CDF \(\cdf\) of \(X\) satisfies the following properties:

  1. The CDF is a staircase function and is non-decreasing. That is, for any \(a, b \in \R\) with \(a \leq b\), we have

    \[ \cdf(a) \leq \cdf(b) \]
  2. The CDF takes values in \([0, 1]\), since it is a probability.

    \[ 0 \leq \cdf(x) \leq 1 \]

    In particular, the CDF tends to its minimum of 0 as \(x \to -\infty\) and to its maximum of 1 as \(x \to \infty\).

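  3. The CDF is right-continuous: for every \(x \in \R\), \(\cdf(x + h) \to \cdf(x)\) as \(h \to 0^+\). At a state \(x_k\), the CDF jumps by \(\pmf(x_k)\) and takes the upper value at \(x_k\) itself.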
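The three properties can be checked numerically on a small case. The sketch below is only a sanity check under assumed values (states 0, 1, 4 with probabilities 1/4, 1/2, 1/4), not a proof. The helper `cdf`, the grid, and the small offset `eps` are illustrative choices.

```python
import numpy as np

# Assumed illustrative PMF: states 0, 1, 4 with masses 1/4, 1/2, 1/4.
states = np.array([0.0, 1.0, 4.0])
probs = np.array([0.25, 0.5, 0.25])
cumulative = np.concatenate(([0.0], np.cumsum(probs)))


def cdf(x):
    """P[X <= x]: count the states <= x and look up the running sum."""
    return cumulative[np.searchsorted(states, x, side="right")]


grid = np.linspace(-2.0, 6.0, 801)
values = cdf(grid)

# Property 1: non-decreasing along the grid.
is_non_decreasing = bool(np.all(np.diff(values) >= 0))

# Property 2: values stay in [0, 1], reaching 0 far left and 1 far right.
is_bounded = bool(np.all((values >= 0) & (values <= 1)))

# Property 3: approaching a state from the right gives the value at the state,
# while approaching from the left falls short by the mass at that state.
eps = 1e-9
right_continuous = bool(np.allclose(cdf(states + eps), cdf(states)))
left_jumps = cdf(states) - cdf(states - eps)

print(is_non_decreasing, is_bounded, right_continuous)
print(left_jumps)
```

The jump sizes seen from the left equal the probabilities at the states, which is why the CDF of a discrete random variable has a staircase shape.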

PMF and CDF Conversion#

Theorem 6 (PMF and CDF Conversion)

Let \(X\) be a discrete random variable with \(\S = \lset \xi_1, \xi_2, \ldots \rset\) where \(\xi_i \in \R\) for all \(i\). Note that \(X(\xi_i) = x_i\) for all \(i\) where \(x_i\) is the state of \(X\). Then, the PMF of \(X\) can be obtained from the CDF by

(6)#\[ \pmf(x_k) = \cdf(x_k) - \cdf(x_{k-1}) \]

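Here the states are indexed in increasing order, \(x_1 < x_2 < \cdots\), and we adopt the convention \(\cdf(x_0) = 0\) so that the formula also holds for \(k = 1\).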
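As a quick illustration of the conversion, the PMF can be recovered from the CDF values at the states by taking consecutive differences. The sketch assumes the states are sorted in increasing order and uses illustrative CDF values 1/4, 3/4, 1 at the states 0, 1, 4. Prepending 0 plays the role of the value before the first state.

```python
import numpy as np

# Assumed illustrative CDF values at the sorted states 0, 1, 4.
states = np.array([0, 1, 4])
F = np.array([0.25, 0.75, 1.0])

# p(x_k) = F(x_k) - F(x_{k-1}); prepend=0.0 supplies the value before x_1.
pmf = np.diff(F, prepend=0.0)

for state, mass in zip(states, pmf):
    print(state, mass)

# Round trip: the running sum of the recovered PMF gives back the CDF.
print(np.allclose(np.cumsum(pmf), F))
```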