Combinatorics, Binomial Distribution & Negative Binomial Distribution

Suppose

outcome_A <- sample(LETTERS, 1)

Question: What is the probability that outcome_A = Q?

`sample` probabilities

There are many random processes where all the possible outcomes are equally likely.
In these settings, \[ P(\text{Outcome}) = \frac{1}{\text{number of possible outcomes}} \]
Example:

\[ P(\text{Randomly selected letter from English alphabet}) = \frac{1}{\text{26}} \]
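A quick simulation can check this value. The sketch below is only illustrative: the replicate count and the seed are arbitrary choices, not part of the original example. The estimate should land near 1/26.

```r
# Estimate P(outcome_A == "Q") by simulation.
# The seed and the number of replicates are arbitrary (assumed) choices.
set.seed(1)
draws <- sample(LETTERS, 100000, replace = TRUE)

estimate <- mean(draws == "Q")   # share of draws equal to "Q"
exact <- 1 / length(LETTERS)     # 1 / number of possible outcomes

c(estimate = estimate, exact = exact)
```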

`sample` probabilities

There are many random processes where all the possible outcomes are equally likely.
Informally, outcomes of this type are often called sample outcomes or urn outcomes.
More formally, outcomes of this type are sometimes called discrete uniform random variables.

`sample` probabilities

There are many random processes where the possible outcomes are constructed from a sequence of urn outcomes (or discrete uniform random variables).

three_letter_word <- c(
    sample(LETTERS, 1)
  , sample(LETTERS, 1)
  , sample(LETTERS, 1)
)
three_letter_word

## [1] "D" "T" "B"

What is the probability that the sequence of outcomes spells “TOM”?

`sample` probabilities

What if

another_three_letter_word <- sample(LETTERS, 3)

What is the probability that another_three_letter_word = TOM?

In order to calculate

\[ \begin{array}{l} P(\texttt{three_letter_word = TOM}) = \\ \phantom{ = }\phantom{ = } \frac{1}{\text{Number of 3 letter sequences}} \end{array} \]

we need to calculate the number of 3-letter sequences.

There are a number of tools for calculating the total number of sequences.

Q1: How many sequences of \(K\) cards from \(N\) are there?

Q1.a: with replacement

three_letter_word <- c(
    sample(LETTERS, 1)
  , sample(LETTERS, 1)
  , sample(LETTERS, 1)
)

Q1.b: without replacement

another_three_letter_word <- sample(LETTERS, 3)

Hand vs Sequence

Is the outcome TOM the same as MOT or OTM?

Sequence: Order matters TOM \(\neq\) MOT

Hand: Order doesn’t matter TOM = MOT

Q2: How many hands of \(K\) cards from \(N\) are there?

Q2.a: with replacement

Q2.b: without replacement

Sequences with replacement

Tool: Tree

A systematic way to write out all sequences.

Sequences with replacement

Each path represents a sequence

Sequences with replacement

The total number of paths is the product of the number of candidate cards at each draw

\[ \underbrace{N \times N \times \cdots \times N}_{K \text{ draws}} = N^K \]
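The count can be confirmed by enumerating every path for a small deck. The sketch below assumes a hypothetical 3-card deck and 2 draws, then applies the same rule to three letters drawn with replacement.

```r
# Enumerate every sequence of K draws with replacement from a small deck.
deck <- c("A", "B", "C")   # illustrative deck, N = 3
K <- 2

# expand.grid builds one row per path through the tree
all_sequences <- expand.grid(rep(list(deck), K), stringsAsFactors = FALSE)
nrow(all_sequences)        # number of enumerated sequences
length(deck)^K             # N^K

# Three letters drawn with replacement: P(sequence spells TOM)
p_tom_with_replacement <- 1 / length(LETTERS)^3
p_tom_with_replacement
```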

Sequences without replacement?

How many possible sequences are there?

Sequences without replacement

Enumerate possible sequences with a tree

Sequences without replacement

The total number of sequences is the product of the number of candidate cards at each draw

\[ \underbrace{N \times (N-1) \times \cdots \times (N-K+1)}_{K \text{ draws}} = \frac{N!}{(N-K)!} \]
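As a sketch of this count for the letter example, the code below computes the number of 3-letter sequences without replacement two ways: as the product of candidates at each draw, and with R's built-in `factorial`. Because `sample(LETTERS, 3)` draws without replacement by default, the result gives the probability that `another_three_letter_word` spells TOM.

```r
N <- length(LETTERS)   # 26 cards in the deck
K <- 3                 # 3 draws

# Product of the number of candidate cards at each draw: 26 * 25 * 24
n_sequences <- prod(N:(N - K + 1))

# Same count written as N! / (N - K)!
n_sequences_factorial <- factorial(N) / factorial(N - K)

# Each sequence is equally likely, so P(TOM) = 1 / number of sequences
p_tom_without_replacement <- 1 / n_sequences

c(n_sequences, n_sequences_factorial, p_tom_without_replacement)
```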

The factorial

\[N! = N\times(N-1)\times(N-2)\times \cdots \times 3 \times 2 \times 1\]

\[ 5! = 5 \times 4 \times 3 \times 2 \times 1\]

The factorial

In R:

factorial(5)

## [1] 120

The factorial

Note

\[ 0! = 1 \]

For sequences of K draws from N cards

Total number of sequences is \(N^K\) when drawing with replacement
Total number of sequences is \(N!/(N-K)!\) when drawing without replacement

Hands

In a sequence of draws, order matters:

\[ABC \neq CAB\]

Hands

In a hand (or sample) of draws, order does NOT matter:

\[ABC = CAB\]

Hands

Q2: How many hands of \(K\) cards from \(N\) are there?

Q2.a: with replacement (We’ll come back to this)

Q2.b: without replacement (Start here)

Hand without replacement

Go back to all the sequences:

Hand without replacement

Which are duplicates?

Hand without replacement

For each hand there are a certain number of duplicate sequences

Hand without replacement

For each hand, how many sequences are there?
Hint: Think of the hand as a mini deck.
How many sequences of \(K\) cards from a deck of \(K\) cards are there, without replacement?
\(K\times(K-1)\times \cdots \times 2 \times 1\)

Hand without replacement

\(K\) cards from \(N\) without replacement has \(N!/(N-K)!\) possible sequences
For each hand there are \(K!\) sequences (The hand to sequence multiplier)

Hand without replacement

To get the number of hands, divide the number of sequences by the multiplier.

\[\frac{\text{Number of sequences}}{\text{hand to sequence multiplier}} = \frac{\frac{N!}{(N-K)!}}{K!} = \frac{N!}{(N-K)!K!} \]

choose

\[ {N \choose K} = \frac{N!}{(N-K)!K!} \]

The number of hands of size \(K\) from a deck of \(N\) cards when drawing without replacement.

choose

In R

# The number of 5-card hands from a 52-card deck
choose(52, 5)

## [1] 2598960
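The division argument can also be checked by brute force on a small deck. The sketch below assumes a hypothetical 4-card deck and hands of size 2; `combn` is base R's function for enumerating combinations.

```r
deck <- c("A", "B", "C", "D")   # illustrative deck, N = 4
N <- length(deck)
K <- 2

# Enumerate every hand directly: each column of the result is one hand
hands <- combn(deck, K)
n_hands <- ncol(hands)

# Number of sequences without replacement, N! / (N - K)!
n_sequences <- factorial(N) / factorial(N - K)

# Divide by the hand-to-sequence multiplier K!
n_hands_by_division <- n_sequences / factorial(K)

c(enumerated = n_hands, by_division = n_hands_by_division, choose = choose(N, K))
```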

Bernoulli sequences

Bernoulli random variable

A process or experiment that generates a binary outcome (0 or 1; heads or tails; success or failure)
Successive replications of the process are independent
The probability \(P(outcome = 1)\) is constant
Notation:
- \(p = P(outcome = 1)\)
- \(q = (1-p) = P(outcome = 0)\)

Bernoulli sequences

\[ Success,\ Success,\ Failure,\ Success\] \[ 1,\ 0,\ 1,\ 1,\ 1,\ 0 \] \[tails,\ tails,\ tails,\ heads\]

Bernoulli sequences

Note: A Bernoulli sequence can be thought of as \(K\) draws with replacement from a deck of \(N=2\) cards.

Bernoulli sequences

Because successive outcomes are independent and \(p\) is constant,

\[ \begin{align*}P(1,\ 0,\ 1,\ 1,\ 1,\ 0) &= P(1)P(0)P(1)P(1)P(1)P(0) \\ & = p(1-p)ppp(1-p) \\ & = p^4(1-p)^2 \end{align*} \]
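The same product can be computed directly from the observed outcomes. In the sketch below, the value p = 0.3 is an arbitrary assumption chosen only for illustration.

```r
p <- 0.3                          # assumed success probability
outcomes <- c(1, 0, 1, 1, 1, 0)   # the sequence from the slide

# Probability of each individual outcome: p for a 1, (1 - p) for a 0
prob_each <- ifelse(outcomes == 1, p, 1 - p)

# Independence lets us multiply the individual probabilities
prob_sequence <- prod(prob_each)

c(product = prob_sequence, formula = p^4 * (1 - p)^2)
```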

Binomial random variable

The number of successes in a Bernoulli sequence
Bernoulli properties still apply: independent outcomes, constant probability
Notation:
- \(p\) = probability of success in a single Bernoulli replicate
- \(N\) = number of replicates in Bernoulli sequence

Binomial random variable

For a sequence of 10 coin flips, what are the possible numbers of heads?
- 0, 1, 2, 3, …, 10

Binomial random variable

How do we calculate P(2 heads in 4 flips)?
- If the sequence was \(H,\ H,\ T,\ T\), then \[P(H,\ H,\ T,\ T) = p^2(1-p)^2\]
- If the sequence was \(T,\ T,\ H,\ H\), then \[P(T,\ T,\ H,\ H) = p^2(1-p)^2\]

Binomial random variable

How do we calculate P(2 heads in 4 flips)?
- We could list all possible 4 flip sequences …
- Identify all the sequences that have 2 heads …

\[\begin{align*}P(\text{2 heads}\ & \text{in 4 flips}) =\\ P(&\text{HHTT or HTHT or HTTH or} \\ &\text{THHT or THTH or TTHH})\end{align*}\]

Binomial random variable

How do we calculate P(2 heads in 4 flips)?
- We could list all possible 4 flip sequences …
- Identify all the sequences that have 2 heads …
- Because the sequences are mutually exclusive:

\[\begin{align*}P(\text{2 heads in 4 flips}) &= P(HHTT)\\& + P(HTHT)\\& + P(HTTH)\\& + P(THHT)\\& + P(THTH)\\& + P(TTHH) \end{align*}\]

Binomial random variable

How do we calculate P(2 heads in 4 flips)?
- We could list all possible 4 flip sequences …
- Identify all the sequences that have 2 heads …
- Because the sequences are mutually exclusive:

\[\begin{align*}P(\text{2 heads in 4 flips}) &= p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2\\& + p^2(1-p)^2 \end{align*}\]

Binomial random variable

How do we calculate P(2 heads in 4 flips)?
- We could list all possible 4 flip sequences …
- Identify all the sequences that have 2 heads …
- Because the sequences are mutually exclusive:

\[P(\text{2 heads in 4 flips}) = 6\ p^2(1-p)^2\]
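The listing step can be done in R. This sketch enumerates all 16 four-flip sequences, counts those with exactly 2 heads, and applies the formula; the fair-coin value p = 0.5 is an assumption for illustration.

```r
# List all possible 4-flip sequences (one row per sequence)
flips <- expand.grid(rep(list(c("H", "T")), 4), stringsAsFactors = FALSE)

# Count the heads in each sequence and find the sequences with exactly 2
n_heads <- rowSums(flips == "H")
n_two_heads <- sum(n_heads == 2)

p <- 0.5   # assumed: fair coin
prob_two_heads <- n_two_heads * p^2 * (1 - p)^2

c(sequences = nrow(flips), with_two_heads = n_two_heads, probability = prob_two_heads)
```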

Binomial random variable

How do we calculate \(P(X \text{ heads in } N \text{ flips})\)?

\[\begin{align*}P(X &\text{ heads in } N \text{ flips}) = \\ &\text{[Number of sequences with X heads]} \times p^X(1-p)^{N-X}\end{align*}\]

Binomial random variable

How do we calculate \(\text{[Number of sequences with X heads in N flips]}\)?
- Trick: Consider a deck of cards numbered 1 to N
  - Let the number denote location in the sequence
  - Deal X cards
  - Create a sequence by placing heads at the locations indicated by the cards
  - Place tails in all the other locations

\[ \text{cards: } [2],\ [1] \to \{\text{heads}, \text{heads}, \text{tails}, \text{tails}\} \] \[ \text{cards: } [2],\ [4] \to \{\text{tails}, \text{heads}, \text{tails}, \text{heads}\} \]

Binomial random variable

Does the order of the cards matter?

\[ \text{cards: } [2],\ [1] \to \{\text{heads}, \text{heads}, \text{tails}, \text{tails}\} \] \[ \text{cards: } [1],\ [2] \to \{\text{heads}, \text{heads}, \text{tails}, \text{tails}\} \] \[ \text{cards: } [2],\ [4] \to \{\text{tails}, \text{heads}, \text{tails}, \text{heads}\} \] \[ \text{cards: } [4],\ [2] \to \{\text{tails}, \text{heads}, \text{tails}, \text{heads}\} \]

Binomial random variable

How do we calculate \(\text{[Number of sequences with X heads in N flips]}\)?

\[\text{Every sequence with X heads in N flips}\] \[\text{corresponds to}\] \[\text{A hand of X cards from a deck of N (without replacement)}\]

We already know how to calculate the number of possible hands of X cards from a deck of N (without replacement)

Binomial random variable

How do we calculate \(\text{[Number of sequences with X heads in N flips]}\)?

\[{N \choose X}\]
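The card trick can be sketched in R for N = 4 flips and X = 2 heads. Each hand of position cards from `combn` is turned into a flip sequence, and the number of sequences matches `choose(N, X)`.

```r
N <- 4   # number of flips
X <- 2   # number of heads

# Each column is one hand of X position cards from a deck numbered 1 to N
position_hands <- combn(N, X)

# Place heads at the dealt positions and tails everywhere else
sequences <- apply(position_hands, 2, function(heads_at) {
  flips <- rep("T", N)
  flips[heads_at] <- "H"
  paste(flips, collapse = "")
})

sequences
length(sequences) == choose(N, X)
```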

Binomial random variable

How do we calculate \(P(X \text{ heads in } N \text{ flips})\)?

\[P(X \text{ heads in } N \text{ flips}) = {N \choose X} p^X(1-p)^{N-X}\]

Binomial random variable

In R

# Probability 2 heads in 4 flips when p = 0.5
dbinom(2,4,.5)
# Probability 2 or fewer heads in 4 flips when p = 0.5
pbinom(2,4,.5)
# Pseudo-random draw of the number of heads in 10 flips when p = 0.75
rbinom(1,10,.75)

## [1] 0.375
## [1] 0.6875
## [1] 5
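As a check on the formula, the sketch below codes the probability by hand and compares it with `dbinom`. The function name `binomial_pmf` is a hypothetical helper introduced here, not part of base R.

```r
# Hand-coded binomial probability: choose(N, X) * p^X * (1 - p)^(N - X)
# (binomial_pmf is a hypothetical helper name)
binomial_pmf <- function(x, n, p) {
  choose(n, x) * p^x * (1 - p)^(n - x)
}

n <- 10
p <- 0.75
x <- 0:n   # every possible number of heads

manual <- binomial_pmf(x, n, p)
builtin <- dbinom(x, n, p)

all.equal(manual, builtin)   # the two calculations agree
sum(manual)                  # probabilities over all possible values sum to 1
```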

Negative Binomial

Negative binomial random variable

Suppose one flips a coin until there are 5 heads
- HHHHH
- HTTHHTH
- TTTTHTTTHTTTHTTHH
- …

Negative binomial random variable

The number of failures before the \(K^{th}\) success in a Bernoulli sequence is a negative binomial random variable
Bernoulli properties still apply: independent outcomes, constant probability
Notation:
- \(p\) = probability of success in a single Bernoulli replicate
- \(K\) = number of successes at which the Bernoulli sequence stops

Negative binomial random variable

What are the possible values?
- 0, 1, 2, 3, …

Negative binomial random variable

What is P(3 tails before 5th head)?
- [3 tails before 5th head] = [hand of 3 tails and 4 heads in the first 7 flips] and [heads on the 8th flip]

Negative binomial random variable

What is P(3 tails before 5th head)?
- P[3 tails before 5th head] =
\[\text{P[hand of 3 tails and 4 heads] P[heads]}\]

Negative binomial random variable

What is P(3 tails before 5th head)?
- P[3 tails before 5th head] = dbinom(4,7,p)*p

Negative binomial random variable

In R

# Probability of 3 tails before the 5th head with a fair coin
dnbinom(3,5,.5)
# Probability of 3 or fewer tails before the 5th head with a fair coin
pnbinom(3,5,.5)
# Pseudo-random draw of the number of tails before the 5th head with a fair coin
rnbinom(1,5,.5)

## [1] 0.1367188
## [1] 0.3632813
## [1] 5
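The decomposition into a binomial hand followed by a final success can be checked against `dnbinom`. In this sketch, `nb_pmf` is a hypothetical helper that generalizes the argument to k failures before the r-th success.

```r
p <- 0.5   # fair coin, as in the example above

# 4 heads in the first 7 flips, then heads on the 8th flip
manual <- dbinom(4, 7, p) * p
builtin <- dnbinom(3, 5, p)
c(manual = manual, builtin = builtin)

# General form: k failures before the r-th success
# (nb_pmf is a hypothetical helper name)
nb_pmf <- function(k, r, p) {
  dbinom(r - 1, k + r - 1, p) * p
}

k <- 0:10
all.equal(nb_pmf(k, 5, p), dnbinom(k, 5, p))
```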
