8 Discrete random variables
8.1 Last time
Notation | Meaning |
---|---|
\(N\) | Number of cards in deck |
\(K\) | Number of cards dealt |
\(u\) | Number of unique outcomes |
\(K_1\) | Number of cards of the first unique outcome |
\(K_2\) | Number of cards of the second unique outcome |
\(\vdots\) | |
\(K_u\) | Number of cards of the \(u^{th}\) unique outcome |
 | Sequence | Hand |
---|---|---|
With replacement | \(\frac{1}{N^K}\) | \(\frac{\frac{K!}{K_1!K_2!\cdots K_u!}}{N^K}\) |
Without replacement | \(\frac{1}{\frac{N!}{(N-K)!}}\) | \(\frac{K!}{\frac{N!}{(N-K)!}} = \frac{1}{{N \choose K}}\) |
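As a concrete check of the without-replacement column, the formulas can be evaluated in R. The deck size \(N = 52\) and hand size \(K = 5\) below are illustrative choices, not values from the notes:

```r
# Probability of one specific 5-card hand dealt without replacement
# from a deck of N = 52 cards: 1 / choose(N, K)
N <- 52
K <- 5
p_hand <- 1 / choose(N, K)

# The sequence probability is 1 / (N! / (N - K)!); multiplying by the
# K! orderings of the hand recovers the same hand probability
p_sequence <- factorial(N - K) / factorial(N)
p_hand_via_sequence <- factorial(K) * p_sequence

p_hand
p_hand_via_sequence
```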
8.2 TODAY
Random variables created from Bernoulli random variables.
8.3 Bernoulli random variable
- A process or experiment that generates a binary outcome (0 or 1; heads or tails; success or failure)
- Successive replications of the process are independent
- The probability \(P(outcome = 1)\) is constant
Notation:
- \(p = P(outcome = 1)\)
- \(q = (1-p) = P(outcome = 0)\)
8.4 Bernoulli sequences
\[ Success,\ Success,\ Failure,\ Success\] \[ 1,\ 0,\ 1,\ 1,\ 1,\ 0 \] \[tails,\ tails,\ tails,\ heads\]
Note: A Bernoulli sequence can be thought of as \(K\) draws with replacement from a deck of \(N=2\) cards.
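A Bernoulli sequence is easy to simulate in R as a series of size-1 binomial draws; the values of \(p\) and the sequence length below are arbitrary:

```r
set.seed(1)  # for reproducibility
p <- 0.5     # P(outcome = 1)

# Six independent Bernoulli draws: a sequence of 0s and 1s
x <- rbinom(6, size = 1, prob = p)
x
```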
8.4.1 How to create the sequence (stopping rules)
- Fixed number of flips, trials, or draws: binomial
- Keep flipping/drawing until \(K\) successes are observed: negative binomial
- Keep flipping/drawing until \(K\) successes or \(K\) failures are observed: World Series
8.4.1.1 Examples
- Number of new cancer cases in a cohort of 1000 patients.
- Number of vehicles observed until observing 100 cars.
- World Series
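Each stopping rule corresponds to a different way of generating the sequence. A minimal simulation sketch (the parameter values are illustrative):

```r
set.seed(2)
p <- 0.5

# Fixed number of trials (binomial): number of successes in 10 flips
rbinom(1, size = 10, prob = p)

# Stop after K = 5 successes (negative binomial):
# number of failures observed along the way
rnbinom(1, size = 5, prob = p)

# World Series: flip until 4 successes or 4 failures
flips <- c()
while (sum(flips == 1) < 4 && sum(flips == 0) < 4) {
  flips <- c(flips, rbinom(1, size = 1, prob = p))
}
length(flips)  # number of games played
```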
8.5 Binomial random variable
The number of successes in a Bernoulli sequence of size N
Bernoulli properties still apply: independent outcomes, constant probability
Notation:
- \(p\) = probability of success in a single Bernoulli replicate
- \(N\) = number of replicates in Bernoulli sequence
8.5.1 Ways to express probabilities associated with the outcome:
- Table
- Function
- Figure
For example, for a binomial random variable with \(N = 5\) and \(p = 0.5\):
\(x\) | P(X = x) | P(X \(\leq x\)) |
---|---|---|
0 | 0.03125 | 0.03125 |
1 | 0.15625 | 0.18750 |
2 | 0.31250 | 0.50000 |
3 | 0.31250 | 0.81250 |
4 | 0.15625 | 0.96875 |
5 | 0.03125 | 1.00000 |
- The function \(P(X=x)\) is called the probability mass function.
- The function \(P(X\leq x)\) is called the cumulative distribution function (or simply the distribution function).
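The table above can be reproduced with `dbinom()` and `pbinom()`; the values match a binomial with \(N = 5\) trials and \(p = 0.5\) (inferred from the table, since these values are not stated alongside it):

```r
x <- 0:5
pmf <- dbinom(x, size = 5, prob = 0.5)  # P(X = x)
cdf <- pbinom(x, size = 5, prob = 0.5)  # P(X <= x)
cbind(x, pmf, cdf)

# The cumulative probability is just the running sum of the PMF
all.equal(cdf, cumsum(pmf))
```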
8.5.2 How to calculate the probability of a sequence (order matters)?
Because successive outcomes are independent and \(p\) is constant,
\[ \begin{align*}P(1,\ 0,\ 1,\ 1,\ 1,\ 0) &= P(1)P(0)P(1)P(1)P(1)P(0) \\ & = p(1-p)ppp(1-p) \\ & = p^4(1-p)^2 \end{align*} \]
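The product rule can be verified numerically: the probability of this specific sequence equals the product of the individual Bernoulli probabilities (the value of \(p\) below is arbitrary):

```r
p <- 0.3
seq01 <- c(1, 0, 1, 1, 1, 0)

# Product of the individual Bernoulli probabilities
prod(dbinom(seq01, size = 1, prob = p))

# Closed form: p^4 (1 - p)^2
p^4 * (1 - p)^2
```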
8.5.3 What about when order doesn’t matter?
- How do we calculate P(2 heads in 4 flips)?
- If the sequence was \(H,\ H,\ T,\ T\), then \[P(H,\ H,\ T,\ T) = p^2(1-p)^2\]
- If the sequence was \(T,\ T,\ H,\ H\), then \[P(T,\ T,\ H,\ H) = p^2(1-p)^2\]
- We could list all possible 4 flip sequences …
- Identify all the sequences that have 2 heads …
\[\begin{align*}P(\text{2 heads}\ & \text{in 4 flips}) =\\ P(&\text{HHTT or HTHT or HTTH or} \\ &\text{THHT or THTH or TTHH})\end{align*}\]
- Because the sequences are mutually exclusive:
\[\begin{align*}P(\text{2 heads in 4 flips}) &= P(HHTT)\\& + P(HTHT)\\& + P(HTTH)\\& + P(THHT)\\& + P(THTH)\\& + P(TTHH) \end{align*}\]
\[\begin{align*}P(\text{2 heads in 4 flips}) &= p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2 \end{align*}\]
\[P(\text{2 heads in 4 flips}) = 6\ p^2(1-p)^2\]
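The count of 6 can be confirmed by brute force, enumerating all \(2^4 = 16\) four-flip sequences and keeping those with exactly two heads:

```r
# All 2^4 = 16 sequences of four flips (1 = heads, 0 = tails)
flips <- expand.grid(f1 = 0:1, f2 = 0:1, f3 = 0:1, f4 = 0:1)

# How many sequences contain exactly 2 heads?
sum(rowSums(flips) == 2)  # 6, which equals choose(4, 2)
```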
8.5.3.1 Generally (for arbitrary N)
- How do we calculate \(P(X \text{ heads in } N \text{ flips})\)?
\[\begin{align*}P(X &\text{ heads in } N \text{ flips}) = \\ &\text{[Number of sequences with X heads]} \times p^X(1-p)^{N-X}\end{align*}\]
\[P(X \text{ heads in } N \text{ flips}) = {N \choose X} p^X(1-p)^{N-X}\]
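This closed form is exactly what `dbinom()` computes, so the two can be checked against each other directly (the values of \(N\), \(X\), and \(p\) below are arbitrary):

```r
N <- 4; X <- 2; p <- 0.5

# By hand: choose(N, X) * p^X * (1 - p)^(N - X)
choose(N, X) * p^X * (1 - p)^(N - X)  # 0.375

# Same value from the built-in binomial PMF
dbinom(X, size = N, prob = p)
```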
8.6 Binomial random variable
- In R
# Probability of 2 heads in 4 flips when p = 0.5
dbinom(2, 4, 0.5)
[1] 0.375
# Probability of 2 or fewer heads in 4 flips when p = 0.5
pbinom(2, 4, 0.5)
[1] 0.6875
# Pseudo-random draw of the number of heads in 10 flips when p = 0.75
rbinom(1, 10, 0.75)
[1] 10
8.7 Negative Binomial
Suppose one flips a coin until there are 5 heads
- HHHHH
- HTTHHTH
- TTTTHTTTHTTTHTTHH
- …
The number of failures before the \(K^{th}\) success in a Bernoulli sequence is a negative binomial random variable
Bernoulli properties still apply: independent outcomes, constant probability
Notation:
- \(p\) = probability of success in a single Bernoulli replicate
- \(K\) = number of successes in Bernoulli sequence
What are the possible values?
- 0, 1, 2, 3, …
8.7.1 How to calculate probabilities of a negative binomial random variable
What is P(3 tails before 5th head)?
- [3 tails before 5th head] = [hand of 3 tails and 4 heads in the first 7 flips] and [a head on the 8th flip]
\[\text{P[hand of 3 tails and 4 heads]} \times \text{P[heads]}\]
\[{7 \choose 3}p^4(1-p)^3 \times p = {7 \choose 3}p^5(1-p)^3\]
- P[3 tails before 5th head] =
dbinom(4,7,p)*p
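R's `dnbinom()` parameterizes the negative binomial as the number of failures before the `size`-th success, so it should agree with the hand-built `dbinom(4, 7, p) * p`; checking at an arbitrary \(p\):

```r
p <- 0.5

# P(3 tails before the 5th head), built from the binomial
# "hand" probability times a final head
dbinom(4, 7, p) * p

# The same probability from R's negative binomial density:
# 3 failures before the 5th success
dnbinom(3, size = 5, prob = p)
```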
8.8 Poisson random variable
8.9 In teaching lore … the death by horse kick distribution
Ladislaus Bortkiewicz famously used the Poisson distribution to model the yearly number of Prussian cavalry soldiers killed by horse kicks.
8.10 Poisson distribution
- In general, the Poisson distribution is useful for describing the number of events that occur during a specified time interval.
8.10.1 Examples of data that might be described by a Poisson distribution
(Examples taken from a PubMed search for articles.)
8.10.2 PMF
\[ P(Outcome = x|\lambda) = \frac{\lambda^xe^{-\lambda}}{x!} \]
\[\lambda = \text{the expected number of events in the interval (the rate)}\]
In R
# Probability of 4 crashes today on a stretch of I-64
dpois(4, 2)
[1] 0.09022352
# Probability of 4 or fewer crashes today on a stretch of I-64
ppois(4, 2)
[1] 0.947347
# Ten pseudo-random draws of the number of crashes today on a stretch of I-64
rpois(10, 2)
[1] 2 1 1 3 0 2 1 2 2 2
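The PMF formula above can be checked directly against `dpois()`; the values \(\lambda = 2\) and \(x = 4\) match the crash example:

```r
lambda <- 2
x <- 4

# By hand: lambda^x * exp(-lambda) / x!
lambda^x * exp(-lambda) / factorial(x)

# Same value from the built-in Poisson PMF
dpois(x, lambda)
```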