6  Exam 1 Prep

Instructions. Create a reproducible report of your answers using either Quarto or a Jupyter notebook. (If you use Quarto, the .qmd of this file is available here: (link).)

Submission instructions.

  1. Render your report as HTML, then print to PDF.
  2. Upload the PDF report to GradeScope via Canvas (link).

6.1 Computing

1.1 This document was initially set up to hide code in revealable chunks. Modify this document so that the code chunks can still be hidden but are visible by default. Hint: (link).

1.2 The following line of code came from the roulette deliverable.

bad_luck = max([len(list(values)) for key, values in groupby(ledger["outcome"]) if key == 0])

1.2a Please explain what the code does.

1.2b If

ledger["outcome"] = [0,1,1,0,0,0,1,1,1,1,0,0,1,0,1,1]

what would be the value of bad_luck?
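One way to check an answer to 1.2b is to run the line on the example data. The sketch below assumes that `groupby` is `itertools.groupby` and uses a plain dictionary as a stand-in for `ledger`, which may have been a different object (such as a data frame) in the deliverable. The intermediate name `runs` is introduced here only for illustration.

```python
from itertools import groupby

# Stand-in for the ledger from the deliverable (assumed here to be a dict of lists).
ledger = {"outcome": [0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]}

# groupby splits the sequence into runs of consecutive equal values;
# each run is recorded as (value, run length).
runs = [(key, len(list(values))) for key, values in groupby(ledger["outcome"])]
print(runs)

# Keep only the runs of 0s and take the longest one.
bad_luck = max(length for key, length in runs if key == 0)
print(bad_luck)  # 3
```

Printing `runs` first makes it easier to see why the maximum is taken only over the runs whose key is 0.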

1.3 This line of code generated the following error. What is the mistake in the code, and how is it fixed?

>>> np.random.binomial(10,.2)
NameError: name 'np' is not defined
Cell In[1], line 1
----> 1 np.random.binomial(10,.2)

1.4 The following is a schematic of a project folder, with subfolders and files.

/c/
│
├───memes
│
└───project
    │
    ├───code
    │       script.qmd
    │
    ├───data
    │       survey-responses.csv
    │
    └───docs

Supposing the code subfolder is the designated working directory, write the command to include in script.qmd that reads the survey-responses.csv data without using absolute file paths.
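To reason about relative paths in the schematic, it can help to resolve a candidate path against the working directory by hand. The following sketch uses only the standard library; the variable names are illustrative, and the `pd.read_csv` call in the comment assumes pandas is the reader used in the course.

```python
import posixpath

# Designated working directory, taken from the folder schematic.
working_dir = "/c/project/code"

# Candidate relative path: ".." steps up to /c/project, then down into data.
relative_path = "../data/survey-responses.csv"

# Resolve the relative path against the working directory to check where it points.
resolved = posixpath.normpath(posixpath.join(working_dir, relative_path))
print(resolved)  # /c/project/data/survey-responses.csv

# In script.qmd the relative path would be passed to the reader directly,
# for example (assuming pandas): pd.read_csv("../data/survey-responses.csv")
```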

1.5 What is the absolute path for script.qmd?

6.2 Probability

2.1 What is probability? (Describe 3 different ways the word is used.)

2.2 Review the examples of uncertainty from your own life that you submitted in homework.

2.3 Consider the following table about operating system and computer type. Please complete the table, calculating all joint, conditional, and marginal probabilities.

Mac Windows Margin
Laptop Cell 0.8
Row
Col
Desktop Cell 0.15
Row
Col
Margin 0.6

2.4 What is the probability that a randomly selected UVA student has a preference for a laptop with Mac operating system?

2.5 Among laptop users, what is the probability of randomly selecting a student with a preference for Windows?

2.6 If you do not know the operating system preference, what would be the probability that a randomly selected student prefers a desktop?

2.7 Among Mac users, what is the probability of randomly selecting a student with a preference for a desktop?

2.8 One of the big ideas of this course is that probability is a framework for coherently updating beliefs based on new information and data. Please explain how the solution in 2.7 represents an update of beliefs from the solution in 2.6.

2.9 What is the probability that a randomly selected student will prefer a laptop or the Windows operating system?
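The relationships among joint, marginal, and conditional probabilities used in 2.3 through 2.9 can be sketched with a made-up table. The joint probabilities below are hypothetical and are NOT the values from the exercise; only the arithmetic carries over.

```python
# Hypothetical joint probabilities P(device, os); not the values from the exercise.
joint = {
    ("Laptop", "Mac"): 0.5,
    ("Laptop", "Windows"): 0.3,
    ("Desktop", "Mac"): 0.1,
    ("Desktop", "Windows"): 0.1,
}

def marginal_device(device):
    """P(device): sum the joint probabilities across operating systems."""
    return sum(p for (d, _), p in joint.items() if d == device)

def marginal_os(os_name):
    """P(os): sum the joint probabilities across device types."""
    return sum(p for (_, o), p in joint.items() if o == os_name)

# Conditional (row) probability: P(Windows | Laptop) = P(Laptop, Windows) / P(Laptop).
p_windows_given_laptop = joint[("Laptop", "Windows")] / marginal_device("Laptop")

# Belief update: P(Desktop) before the OS is known versus P(Desktop | Mac) after.
p_desktop = marginal_device("Desktop")
p_desktop_given_mac = joint[("Desktop", "Mac")] / marginal_os("Mac")

# Union: P(Laptop or Windows) = P(Laptop) + P(Windows) - P(Laptop, Windows).
p_laptop_or_windows = (
    marginal_device("Laptop") + marginal_os("Windows") - joint[("Laptop", "Windows")]
)

print(p_windows_given_laptop, p_desktop, p_desktop_given_mac, p_laptop_or_windows)
```

With these hypothetical numbers, learning that a student uses a Mac moves the desktop probability from the marginal value to the conditional value, which is the kind of update 2.8 asks about.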

2.10 Complete the following table with all cell, conditional, and marginal probabilities.

Mac Windows Margin
Laptop Cell 0.8
Row .9
Col
Desktop Cell
Row .8
Col
Margin

2.11 In a class of 60 students, what is the probability that 40 students will prefer Mac? (Use the table from 2.10.)

2.12 Create 9 plots of the binomial distribution probability mass function with N = 10 and p = .1, .2, …, .9.
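A minimal sketch of one way to approach 2.12 follows. It computes the binomial PMF with the standard library and, as an assumption, uses matplotlib for the 3-by-3 grid; the function names `binomial_pmf` and `plot_binomial_grid` are illustrative and do not come from the course materials.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def plot_binomial_grid(n=10):
    """Draw a 3-by-3 grid of PMFs for p = 0.1, ..., 0.9 (requires matplotlib)."""
    # Imported lazily so that binomial_pmf can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    ks = list(range(n + 1))
    fig, axes = plt.subplots(3, 3, figsize=(9, 9), sharex=True, sharey=True)
    for i, ax in enumerate(axes.flat, start=1):
        p = i / 10
        ax.bar(ks, [binomial_pmf(k, n, p) for k in ks])
        ax.set_title(f"N = {n}, p = {p:.1f}")
    fig.supxlabel("k")
    fig.supylabel("P(X = k)")
    return fig
```

Calling `plot_binomial_grid()` in a notebook or Quarto chunk returns the figure. Sharing the axes across panels makes the shift in the distribution's center easy to compare as p increases.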

2.13 Create a visualization of the negative binomial probability mass function with p = .7 and k = 5.

2.14 Suppose electric bikes represent 0.1 of all bikes among UVA students. If a researcher were to stand at the Emmett and Ivy intersection to count bikes, what is the probability that 30 or fewer regular bikes will be observed before the 5th e-bike is observed?
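The setup in 2.14 can be sketched with a negative binomial PMF. The sketch assumes the parameterization in which X counts failures (regular bikes) before the k-th success (e-bike), and that bikes are independent with a constant e-bike probability; the course may use a different convention, so check it against the notes. The helper name `neg_binomial_pmf` is illustrative.

```python
from math import comb

def neg_binomial_pmf(x, k, p):
    """P(X = x), where X is the number of failures before the k-th success."""
    return comb(x + k - 1, x) * p**k * (1 - p) ** x

# An e-bike counts as a "success" with probability 0.1;
# regular bikes are the "failures" counted before the 5th e-bike.
p_ebike, k_ebikes = 0.1, 5

# P(X <= 30): add the PMF over x = 0, 1, ..., 30.
prob_30_or_fewer = sum(neg_binomial_pmf(x, k_ebikes, p_ebike) for x in range(31))
print(round(prob_30_or_fewer, 4))
```

The same helper with p = 0.7 and k = 5 gives the values needed for the visualization in 2.13.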

2.15 Create a visualization of the Poisson probability mass function with \(\lambda = 2\).

2.16 Suppose accidents at the Emmett and Ivy intersection occur at a rate of 5 per year. If accidents follow the Poisson distribution, what is the probability of observing 3 accidents in 9 months?
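For 2.16, the key step is rescaling the yearly rate to the 9-month window before evaluating the PMF. The sketch below assumes "observing 3 accidents" means exactly 3; the helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

rate_per_year = 5
window_in_years = 9 / 12

# Expected number of accidents in 9 months: 5 * 0.75 = 3.75.
lam = rate_per_year * window_in_years

p_three = poisson_pmf(3, lam)
print(round(p_three, 4))  # 0.2067
```

Calling `poisson_pmf(k, 2)` over a range of k gives the values needed for the plot in 2.15.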

2.17 Consider a class of 15 students, of which 9 are right-handed, 4 are left-handed, and 2 are ambidextrous. What is the probability that a team of 8 students has no left-handed students if teams are created randomly?
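One way to set up 2.17 is as a counting (hypergeometric) problem. The sketch assumes a single team of 8 is drawn at random without replacement from the 15 students; the variable names are illustrative.

```python
from math import comb

n_students = 15
n_left_handed = 4
team_size = 8

# Teams with no left-handed students are chosen entirely from the other 11.
favorable_teams = comb(n_students - n_left_handed, team_size)
all_teams = comb(n_students, team_size)

p_no_left = favorable_teams / all_teams
print(favorable_teams, all_teams, round(p_no_left, 4))  # 165 6435 0.0256
```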

6.3 Simulation

3.1 Write a simulation that estimates the probability from question 2.17. Use only 200 replicates.
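A minimal sketch of such a simulation is shown below, using only the standard library. The function name `simulate_no_left`, the seed, and the letter labels for handedness are illustrative choices, not part of the course materials.

```python
import random

def simulate_no_left(replicates=200, seed=1):
    """Estimate P(no left-handed students on a random team of 8)."""
    rng = random.Random(seed)  # seeded so the report is reproducible
    # 9 right-handed, 4 left-handed, 2 ambidextrous students.
    students = ["R"] * 9 + ["L"] * 4 + ["A"] * 2
    hits = 0
    for _ in range(replicates):
        team = rng.sample(students, 8)  # draw 8 students without replacement
        if "L" not in team:
            hits += 1
    return hits / replicates

estimate = simulate_no_left()
print(estimate)
```

With only 200 replicates the estimate is coarse (it can only be a multiple of 0.005), which is the motivation for quantifying simulation error.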

3.2 Calculate the simulation error from problem 3.1.

3.3 If we repeated the simulation with 2 million replicates, would the simulation error likely get larger or smaller?
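The effect of the number of replicates can be sketched as follows. This assumes simulation error is measured by the standard error of an estimated proportion, sqrt(p(1 - p) / R); the course may define it differently (for example, as a multiple of this quantity). The value of `p_hat` is a hypothetical placeholder, not a result from 3.1.

```python
from math import sqrt

def simulation_error(p_hat, replicates):
    """Standard error of a proportion estimated from `replicates` draws."""
    return sqrt(p_hat * (1 - p_hat) / replicates)

p_hat = 0.03  # hypothetical estimate; substitute the value from your own simulation

error_200 = simulation_error(p_hat, 200)
error_2m = simulation_error(p_hat, 2_000_000)

# The error scales with 1 / sqrt(R), so 10,000 times more replicates
# shrinks it by a factor of about 100.
print(error_200, error_2m, error_200 / error_2m)
```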