8  Discrete random variables

8.1 Last time

| Notation | Meaning |
|---|---|
| \(N\) | Number of cards in deck |
| \(K\) | Number of cards dealt |
| \(u\) | Number of unique outcomes |
| \(K_1\) | Number of cards of the first unique outcome |
| \(K_2\) | Number of cards of the second unique outcome |
| \(\vdots\) | |
| \(K_u\) | Number of cards of the \(u^{th}\) unique outcome |

Probability of a specific sequence (order matters) or hand (order does not matter):

| | Sequence | Hand |
|---|---|---|
| With replacement | \(\frac{1}{N^K}\) | \(\frac{\frac{K!}{K_1!K_2!\cdots K_u!}}{N^K}\) |
| Without replacement | \(\frac{1}{\frac{N!}{(N-K)!}}\) | \(\frac{K!}{\frac{N!}{(N-K)!}} = \frac{1}{{N \choose K}}\) |
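
As a quick numerical check of the table, the R sketch below evaluates the four entries for an assumed standard deck (\(N = 52\)) and an assumed deal of \(K = 5\) cards that are all different. These specific numbers are illustrative choices, not part of the notes.

```r
# Illustrative values (assumptions, not from the notes)
N  <- 52          # cards in the deck
K  <- 5           # cards dealt
Ks <- rep(1, K)   # every dealt card is a different outcome: K_1 = ... = K_u = 1

# With replacement
p_seq_with  <- 1 / N^K
p_hand_with <- (factorial(K) / prod(factorial(Ks))) / N^K

# Without replacement
n_ordered      <- factorial(N) / factorial(N - K)   # N! / (N - K)!
p_seq_without  <- 1 / n_ordered
p_hand_without <- factorial(K) / n_ordered

# The hand probability without replacement equals 1 / choose(N, K)
all.equal(p_hand_without, 1 / choose(N, K))
```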

8.2 TODAY

Random variables created from Bernoulli random variables.

8.3 Bernoulli random variable

  • A process or experiment that generates a binary outcome (0 or 1; heads or tails; success or failure)

  • Successive replications of the process are independent

  • The probability \(P(outcome = 1)\) is constant

  • Notation:

    • \(p = P(outcome = 1)\)
    • \(q = (1-p) = P(outcome = 0)\)
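
A Bernoulli process can be simulated in R as a binomial draw with a single trial. The sketch below assumes \(p = 0.3\) and an arbitrary seed; both are illustrative choices.

```r
set.seed(1)   # arbitrary seed so the draws are reproducible
p <- 0.3      # assumed P(outcome = 1)
q <- 1 - p    # P(outcome = 0)

# 10,000 independent replications; each outcome is 0 or 1
outcomes <- rbinom(10000, size = 1, prob = p)

# The proportion of 1s should be close to p
mean(outcomes)
```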

8.4 Bernoulli sequences

\[ Success,\ Success,\ Failure,\ Success\] \[ 1,\ 0,\ 1,\ 1,\ 1,\ 0 \] \[tails,\ tails,\ tails,\ heads\]

Note: A Bernoulli sequence can be thought of as \(K\) draws with replacement from a deck of \(N=2\) cards.

8.4.1 How to create the sequence (stopping rules)

  1. A fixed number of flips, trials, or draws: binomial
  2. Keep flipping/drawing until \(K\) successes are observed: negative binomial
  3. Keep flipping/drawing until either \(K\) successes or \(K\) failures are observed: World Series

8.4.1.1 Examples

  1. Number of new cancer cases in a cohort of 1000 patients.
  2. Number of vehicles observed until observing 100 cars.
  3. World Series
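
The three stopping rules can be sketched as small simulations. The function names below are hypothetical helpers written for illustration, and \(p = 0.5\) is an assumed value. With \(K = 4\), the third rule mirrors a best-of-seven series.

```r
set.seed(2)   # arbitrary seed
p <- 0.5      # assumed probability of success on each trial

# Rule 1: fixed number of trials; record the number of successes
fixed_trials <- function(n, p) {
  sum(rbinom(n, size = 1, prob = p))
}

# Rule 2: continue until K successes; record the number of failures
until_k_successes <- function(K, p) {
  successes <- 0
  failures <- 0
  while (successes < K) {
    if (rbinom(1, size = 1, prob = p) == 1) {
      successes <- successes + 1
    } else {
      failures <- failures + 1
    }
  }
  failures
}

# Rule 3: continue until K successes or K failures; record the number of trials
until_k_either <- function(K, p) {
  successes <- 0
  failures <- 0
  while (successes < K && failures < K) {
    if (rbinom(1, size = 1, prob = p) == 1) {
      successes <- successes + 1
    } else {
      failures <- failures + 1
    }
  }
  successes + failures
}

fixed_trials(10, p)       # a value between 0 and 10
until_k_successes(5, p)   # a value in 0, 1, 2, ...
until_k_either(4, p)      # a value between 4 and 7
```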

8.5 Binomial random variable

  • The number of successes in a Bernoulli sequence of size \(N\)

  • Bernoulli properties still apply: independent outcomes, constant probability

  • Notation:

    • \(p\) = probability of success in a single Bernoulli replicate
    • \(N\) = number of replicates in Bernoulli sequence

8.5.1 Ways to express probabilities associated with the outcome:

  • Table
  • Function
  • Figure

| \(x\) | \(P(X = x)\) | \(P(X \leq x)\) |
|---|---|---|
| 0 | 0.03125 | 0.03125 |
| 1 | 0.15625 | 0.18750 |
| 2 | 0.31250 | 0.50000 |
| 3 | 0.31250 | 0.81250 |
| 4 | 0.15625 | 0.96875 |
| 5 | 0.03125 | 1.00000 |

  • The function \(P(X=x)\) is called the probability mass function (PMF).
  • The function \(P(X\leq x)\) is called the cumulative probability function, also known as the cumulative distribution function (CDF).
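
The tabulated values match a binomial random variable with \(N = 5\) and \(p = 0.5\). That is an inference from the numbers, since the notes do not state the parameters. Under that assumption, the table can be reproduced in R:

```r
x   <- 0:5
pmf <- dbinom(x, size = 5, prob = 0.5)   # P(X = x)
cdf <- pbinom(x, size = 5, prob = 0.5)   # P(X <= x)

data.frame(x = x, pmf = pmf, cdf = cdf)

# The cumulative probabilities are running sums of the PMF
all.equal(cdf, cumsum(pmf))
```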

8.5.2 How to calculate the probability of a sequence (order matters)?

Because successive outcomes are independent and \(p\) is constant,

\[ \begin{align*}P(1,\ 0,\ 1,\ 1,\ 1,\ 0) &= P(1)P(0)P(1)P(1)P(1)P(0) \\ & = p(1-p)ppp(1-p) \\ & = p^4(1-p)^2 \end{align*} \]
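
The same product can be computed directly from the sequence. This is a minimal sketch with an assumed \(p = 0.6\):

```r
p <- 0.6                          # assumed probability of a 1
outcomes <- c(1, 0, 1, 1, 1, 0)   # the sequence from the text

# Each 1 contributes p and each 0 contributes (1 - p)
prob_sequence <- prod(ifelse(outcomes == 1, p, 1 - p))

# Same value from the closed form p^4 (1 - p)^2
all.equal(prob_sequence, p^4 * (1 - p)^2)
```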

8.5.3 What about when order doesn’t matter?

  • How do we calculate P(2 heads in 4 flips)?
    • If the sequence was \(H,\ H,\ T,\ T\), then \[P(H,\ H,\ T,\ T) = p^2(1-p)^2\]
    • If the sequence was \(T,\ T,\ H,\ H\), then \[P(T,\ T,\ H,\ H) = p^2(1-p)^2\]
    • We could list all possible 4 flip sequences …

    • Identify all the sequences that have 2 heads …

\[\begin{align*}P(\text{2 heads}\ & \text{in 4 flips}) =\\ P(&\text{HHTT or HTHT or HTTH or} \\ &\text{THHT or THTH or TTHH})\end{align*}\]

    • Because the sequences are mutually exclusive:

\[\begin{align*}P(\text{2 heads in 4 flips}) &= P(HHTT)\\& + P(HTHT)\\& + P(HTTH)\\& + P(THHT)\\& + P(THTH)\\& + P(TTHH) \end{align*}\]

\[\begin{align*}P(\text{2 heads in 4 flips}) &= p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2 \end{align*}\]

\[P(\text{2 heads in 4 flips}) = 6\ p^2(1-p)^2\]
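
The enumeration argument can be checked by brute force. The sketch below lists every 4-flip sequence, coding heads as 1 and tails as 0, and assumes \(p = 0.5\). Any value of \(p\) between 0 and 1 would work the same way.

```r
p <- 0.5   # assumed probability of heads

# All 2^4 = 16 sequences of 4 flips, coded 1 = heads, 0 = tails
flips <- expand.grid(f1 = 0:1, f2 = 0:1, f3 = 0:1, f4 = 0:1)

n_heads  <- rowSums(flips)
seq_prob <- p^n_heads * (1 - p)^(4 - n_heads)   # probability of each sequence

sum(n_heads == 2)             # number of sequences with exactly 2 heads
sum(seq_prob[n_heads == 2])   # P(2 heads in 4 flips) by summing sequences
6 * p^2 * (1 - p)^2           # closed form from the text
```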

8.5.3.1 Generally (for arbitrary N)

  • How do we calculate \(P(X \text{ heads in } N \text{ flips})\)?

\[\begin{align*}P(X &\text{ heads in } N \text{ flips}) = \\ &\text{[Number of sequences with X heads]} \times p^X(1-p)^{N-X}\end{align*}\]

\[P(X \text{ heads in } N \text{ flips}) = {N \choose X} p^X(1-p)^{N-X}\]
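
The general formula translates directly into R. Here `binom_pmf` is a hypothetical helper written for illustration, and `dbinom` is R's built-in binomial PMF, so the two should agree.

```r
# Binomial PMF written directly from the formula
binom_pmf <- function(x, N, p) {
  choose(N, x) * p^x * (1 - p)^(N - x)
}

binom_pmf(2, 4, 0.5)   # 2 heads in 4 flips
dbinom(2, 4, 0.5)      # built-in equivalent

# Agreement across all possible values for an assumed N = 10, p = 0.75
all.equal(binom_pmf(0:10, 10, 0.75), dbinom(0:10, 10, 0.75))
```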

8.6 Binomial random variable

  • In R
# Probability of exactly 2 heads in 4 flips when p = 0.5
dbinom(2, 4, 0.5)
#> [1] 0.375

# Probability of 2 or fewer heads in 4 flips when p = 0.5
pbinom(2, 4, 0.5)
#> [1] 0.6875

# Pseudo-random draw of the number of heads in 10 flips when p = 0.75
# (the result varies from run to run)
rbinom(1, 10, 0.75)
#> [1] 10

8.7 Negative Binomial

  • Suppose one flips a coin until there are 5 heads

    • HHHHH
    • HTTHHTHH
    • TTTTHTTTHTTTHTTHH
  • The number of failures before the \(K^{th}\) success in a Bernoulli sequence is a negative binomial random variable

  • Bernoulli properties still apply: independent outcomes, constant probability

  • Notation:

    • \(p\) = probability of success in a single Bernoulli replicate
    • \(K\) = number of successes required before the sequence stops
  • What are the possible values?

    • 0, 1, 2, 3, …

8.7.1 How to calculate probabilities of a negative binomial random variable

  • What is P(3 tails before 5th head)?

    • [3 tails before 5th head] = [hand of 3 tails and 4 heads in the first 7 flips] and [heads on the 8th flip]

    \[\text{P[3 tails before 5th head]} = \text{P[hand of 3 tails and 4 heads]} \times \text{P[heads]}\]

    \[{7 \choose 3}p^4(1-p)^3 \times p = {7 \choose 3}p^5(1-p)^3\]

    • P[3 tails before 5th head] = dbinom(4,7,p)*p
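
R also provides `dnbinom(x, size, prob)`, which gives the probability of `x` failures before the `size`-th success. As a sketch with an assumed \(p = 0.5\), the binomial-times-\(p\) decomposition and the built-in function should agree:

```r
p <- 0.5   # assumed probability of heads

# 4 heads in the first 7 flips, then heads on the 8th flip
via_binomial <- dbinom(4, 7, p) * p

# 3 failures (tails) before the 5th success (head)
via_negative_binomial <- dnbinom(3, size = 5, prob = p)

all.equal(via_binomial, via_negative_binomial)
```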

8.8 Poisson random variable

8.9 In teaching lore … the death by horse kick distribution

8.10 Poisson distribution

  • In general, the Poisson distribution is useful for describing the number of events that occur during a specified time interval.

8.10.1 Examples of data that might be described by a Poisson distribution

From a PubMed search for articles.

8.10.2 PMF

\[ P(Outcome = x|\lambda) = \frac{\lambda^xe^{-\lambda}}{x!} \]

\[\lambda = \text{expected number of events in the interval (controls the rate of events)}\]

In R

# Probability of exactly 4 crashes today on a stretch of I-64 (lambda = 2)
dpois(4, 2)
#> [1] 0.09022352

# Probability of 4 or fewer crashes today on a stretch of I-64 (lambda = 2)
ppois(4, 2)
#> [1] 0.947347

# Ten pseudo-random draws of the daily number of crashes on a stretch of I-64
# (lambda = 2; the draws vary from run to run)
rpois(10, 2)
#>  [1] 2 1 1 3 0 2 1 2 2 2
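
The PMF can also be evaluated straight from its formula. In the sketch below, `pois_pmf` is a hypothetical helper written for illustration, and \(\lambda = 2\) matches the second argument of the calls above.

```r
lambda <- 2   # rate used in the calls above

# Poisson PMF written directly from the formula
pois_pmf <- function(x, lambda) {
  lambda^x * exp(-lambda) / factorial(x)
}

pois_pmf(4, lambda)          # same value as dpois(4, 2)
sum(pois_pmf(0:4, lambda))   # same value as ppois(4, 2)
```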

8.11 Poisson distribution