Previously on Inference for one Proportion…

Population proportion vs Sample statistic
Hypothesis testing using theoretical methods
Central limit theorem

Inference on Single Proportion

Today, we will discuss the following:

Using theoretical methods to compute confidence intervals for one proportion.

Proof of COVID-19 vaccination

The problem shown below was taken and slightly modified from your textbook OpenIntro: Introduction to Modern Statistics Section 16.4.

Consider the research study described below.

A Gallup poll surveyed 3,731 randomly sampled US adults in April 2021, asking how they felt about requiring proof of COVID-19 vaccination for travel by airplane. The poll found that 57% said they would favor it. Gallup 2021b

Population Parameter: The proportion of all US adults who favor requiring proof of COVID-19 vaccination for travel by airplane. Let \(p\) be the population proportion.
Sample Statistic: The sample statistic for this parameter is the proportion of US adults in this sample who favor requiring proof of COVID-19 vaccination for travel by airplane. Let \(\hat{p} = 0.57\) be the point-estimate.

Question: How much uncertainty is there in the sample statistic? What is a plausible range of values for the true population proportion?

Conditions

Independence: The observations are independent because the respondents were randomly sampled from the population.
Success-Failure: We have at least 10 observations for each level. \[n\hat{p} = 3731(0.57) \approx 2127 \longrightarrow \text{"successes" - in favor}\] \[n(1-\hat{p}) = 3731(1-0.57) \approx 1604 \longrightarrow \text{"failures" - not in favor}\]
Thus, the sampling distribution of \(\hat{p}\) is approximately normal.
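The success-failure check above can be sketched in a few lines of Python. This is only an illustration: the function names `success_failure_counts` and `meets_success_failure` are hypothetical helpers, not part of the textbook or of any library.

```python
def success_failure_counts(n, p_hat):
    """Return the expected number of successes and failures in the sample."""
    return n * p_hat, n * (1 - p_hat)


def meets_success_failure(n, p_hat, threshold=10):
    """Check that both counts reach the threshold (10 in these slides)."""
    successes, failures = success_failure_counts(n, p_hat)
    return successes >= threshold and failures >= threshold


# Gallup example: n = 3731, p_hat = 0.57
successes, failures = success_failure_counts(3731, 0.57)
print(round(successes), round(failures))   # 2127 1604
print(meets_success_failure(3731, 0.57))   # True
```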

95% Confidence Interval (1/2)

Step 1: Compute standard error. \[ \begin{aligned} SE & = \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}} \\ & = \sqrt{\frac{0.57 (1 - 0.57)}{3731}} \\ & = 0.0081 \end{aligned} \] Notice that here we are using the point-estimate \(\hat{p}\) because \(p\) is unknown.
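Step 2: Compute \(z^*\) for a 95% confidence level. \[z^* = 1.96\] Here, we use the R command qnorm(0.975, 0, 1). Why \(0.975\) as input? Because qnorm takes the cumulative area to the left of \(z^*\): the middle 95% plus the lower tail, \(0.95+\frac{1-0.95}{2} = 0.975\), which leaves \(0.025\) in the right tail of the standard normal distribution.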
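Steps 1 and 2 can also be reproduced in Python as a rough cross-check. The sketch below uses `statistics.NormalDist().inv_cdf` from the standard library, which plays the same role as R's qnorm for the standard normal distribution. The helper names `standard_error` and `critical_value` are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(p_hat, n):
    """Standard error of a sample proportion, using p_hat in place of p."""
    return sqrt(p_hat * (1 - p_hat) / n)


def critical_value(confidence):
    """z* such that the middle `confidence` area lies between -z* and z*."""
    # inv_cdf expects the cumulative area to the left of z*.
    return NormalDist().inv_cdf(confidence + (1 - confidence) / 2)


se = standard_error(0.57, 3731)
z_star = critical_value(0.95)
print(round(se, 4))      # 0.0081
print(round(z_star, 2))  # 1.96
```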

95% Confidence Interval (2/2)

Step 3: Compute the 95% confidence interval. \[ \begin{aligned} \hat{p} \pm z^* SE & = \hat{p} \pm z^*\sqrt{\frac{\hat{p} (1 - \hat{p})}{n}} \\ & = 0.57 \pm (1.96) \sqrt{\frac{0.57 (1 - 0.57)}{3731}} \\ & = 0.57 \pm (1.96)(0.0081) \\ & = 0.57 \pm 0.015876 \\ & \approx (0.55, 0.59) \end{aligned} \]
Therefore, we are 95% confident that the true proportion of US adults who favor requiring proof of COVID-19 vaccination for travel by airplane is between \(0.55\) and \(0.59\).
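The three steps can be combined into one small function. The following is a minimal sketch, assuming the normal approximation holds; `proportion_ci` is a hypothetical name chosen for this illustration. It uses the unrounded \(z^*\), so intermediate values differ slightly from the hand calculation, but the interval agrees to two decimal places.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for one proportion."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_ci(0.57, 3731)
print(round(lower, 2), round(upper, 2))  # 0.55 0.59
```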

Margin of Error

The term \(z^* SE\) is called the Margin of Error (ME). \[ME = z^* SE\]
The current ME is \(0.015876\) (about 1.59 percentage points).
Question - How large a sample do we need in order to cut the margin of error of a 95% confidence interval down to 1%? \[ \begin{aligned} 0.01 & \ge (1.96) \sqrt{\frac{0.57 (1 - 0.57)}{n}} \\ 0.01^2 & \ge (1.96)^2 \frac{0.57 (1 - 0.57)}{n} \\ n & \ge \frac{(1.96)^2 0.57 (1 - 0.57) }{0.01^2} \\ n & \ge 9415.762 \longrightarrow \text{sample size should be at least 9416} \end{aligned} \] Notice here that we are using \(\hat{p}\) from the previous study as our best guess for \(p\), and that we round up because the sample size must be a whole number.
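The sample-size calculation can be sketched in Python as follows. The function name `required_sample_size` is an assumption made for this example, and the rounded value \(z^* = 1.96\) is passed in explicitly so the result matches the hand calculation.

```python
from math import ceil


def required_sample_size(p_hat, target_me, z_star):
    """Smallest n for which z* * sqrt(p_hat * (1 - p_hat) / n) <= target_me."""
    # Round up: a fractional observation is not possible.
    return ceil(z_star ** 2 * p_hat * (1 - p_hat) / target_me ** 2)


print(required_sample_size(0.57, 0.01, 1.96))  # 9416
```

Because \(n\) is proportional to \(1/ME^2\), halving the margin of error requires roughly four times as many observations.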

10.10-Minute Activity (1/3)

The problem shown below was taken and slightly modified from your textbook OpenIntro: Introduction to Modern Statistics Section 16.4.

Consider the research study described below.

Study Abroad

A survey of 1,509 high school seniors who took the SAT and who completed an optional web survey shows that 55% of high school seniors are fairly certain that they will participate in a study abroad program in college. AEC 2008

Construct a 90% confidence interval for the proportion of high school seniors (of those who took the SAT) who are fairly certain they will participate in a study abroad program in college, and interpret this interval in context.
What is the current Margin of Error? How large a sample would you need to cut the Margin of Error down to 1%?

10.10-Minute Activity (2/3)

First, check for the conditions:
- Independence: We treat the observations as independent. Note, however, that the web survey was optional, so this is not strictly a random sample and the results should be interpreted with caution.
- Success-Failure: We have at least 10 observations for each level. \[n\hat{p} = 1509(0.55) \approx 830 \longrightarrow \text{"successes" - fairly certain}\] \[n(1-\hat{p}) = 1509(1-0.55) \approx 679 \longrightarrow \text{"failures" - not fairly certain}\]
Thus, the sampling distribution of \(\hat{p}\) is approximately normal.

10.10-Minute Activity (3/3)

Step 1: Compute the standard error: \[SE = 0.0128\]
Step 2: Compute the \(z^*\) for a 90% CI. \[z^* = 1.64\]
Step 3: Compute the 90% confidence interval. \[(0.53,0.57)\]
Therefore, we are 90% confident that the true proportion of high school seniors who took the SAT and who are fairly certain that they will participate in a study abroad program in college is between \(0.53\) and \(0.57\).
The current Margin of Error: \(ME = z^* SE \approx 0.02\). To cut the Margin of Error down to 1%, we need a sample size of at least 6657.
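The activity's numbers can be checked with a short Python script. This is a sketch for verification only: it uses the rounded \(z^* = 1.64\) from Step 2, and the variable names are chosen for this illustration.

```python
from math import ceil, sqrt

n, p_hat = 1509, 0.55
z_star = 1.64  # rounded critical value for a 90% CI, as in Step 2

se = sqrt(p_hat * (1 - p_hat) / n)        # Step 1: standard error
me = z_star * se                          # margin of error
lower, upper = p_hat - me, p_hat + me     # Step 3: 90% confidence interval

# Sample size needed for a margin of error of at most 0.01
n_needed = ceil(z_star ** 2 * p_hat * (1 - p_hat) / 0.01 ** 2)

print(round(se, 4))                       # 0.0128
print(round(me, 2))                       # 0.02
print(round(lower, 2), round(upper, 2))   # 0.53 0.57
print(n_needed)                           # 6657
```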

Summary

Today, we discussed the following:

Computing the confidence interval for one proportion using the theoretical method.

Next, we will discuss:

Hypothesis testing and confidence intervals for the difference in two proportions

11 - Inference for One Proportion Confidence Intervals