Previously on Statistics…

Population proportion vs Sample statistic
Null and alternative hypothesis
Hypothesis testing using simulations
Central limit theorem

Inference on Single Proportion

Today, we will discuss the following:

Using theoretical methods to perform hypothesis tests for a single proportion.

National Health Plan

The problem shown below was taken and slightly modified from your textbook OpenIntro: Introduction to Modern Statistics Section 16.4.

Consider the research study described below.

A Kaiser Family Foundation poll for a random sample of US adults in 2019 found that 79% of Democrats, 55% of Independents, and 24% of Republicans supported a generic “National Health Plan.” There were 347 Democrats, 298 Republicans, and 617 Independents surveyed. (Kaiser Family Foundation 2019)

Claim: A majority of independents support the National Health Plan (NHP).

Questions: How do we define “majority”? What if it’s not actually a majority but a minority? Do these data provide strong evidence to support the claim?

Population Parameter and Sample Statistic

Question - What is the population parameter and the sample statistic?

Answer:

Population Parameter: Proportion of all independents who support the NHP. Let \(p\) be the population proportion.
Sample Statistic: Proportion of sampled independents who support the NHP. Let \(\hat{p}\) be the sample statistic.

Point estimate and Null Value

Suppose that “majority” means that more than 50% of independents support a National Health Plan. Recall that 55% of the sampled independents supported the National Health Plan.

Question - What is the point-estimate and the null value?

Answer:

Null Value: The value of the population proportion assumed under the null hypothesis, which also serves as the reference value in the alternative hypothesis. Let \(p_0 = 0.50\) be the null value.
Point-estimate: The proportion of sampled independents who support the NHP. Let \(\hat{p} = 0.55\) be the point-estimate.

The Null and Alternative Hypothesis

Null Hypothesis: Neither a majority nor a minority of independents support the NHP.
\[H_0: p = 0.50\]
Alternative Hypothesis: Either a majority or a minority of independents support the NHP. A majority means that more than 50% of independents support the NHP, and a minority means that fewer than 50% do.
\[H_A: p \ne 0.50\]
Notice that we are using the null value \(p_0 = 0.50\) in both hypotheses, and that this is a two-sided hypothesis test.

Hypothesis Testing (1/3)

Goal: To compute the p-value.
Step 1: Check if the conditions are satisfied. Here \(n=617\) and \(\hat{p} = 0.55\).
- Independence: The poll used a random sample of US adults, and the 617 independents surveyed are only a small fraction of all independents in the US, so the observations can be treated as independent.
- Success-failure: Among the sampled independents, about \(617(0.55) \approx 339\) support and \(617(1-0.55) \approx 278\) don’t support the NHP. Both counts are greater than 10. The same holds under the null value: \(617(0.50) = 308.5\) for each outcome.
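The success-failure check can also be done numerically. The following R sketch is illustrative and not part of the original slides; the variable names are assumptions. It computes the expected counts both with the point estimate and with the null value \(p_0 = 0.50\).

```r
# Values from the National Health Plan example
n     <- 617    # independents surveyed
p_hat <- 0.55   # point estimate
p_0   <- 0.50   # null value

# Expected counts using the point estimate
counts_hat <- c(support = n * p_hat, oppose = n * (1 - p_hat))

# Expected counts assuming the null hypothesis is true
counts_null <- c(support = n * p_0, oppose = n * (1 - p_0))

# The success-failure condition holds when every count is at least 10
sf_ok <- all(counts_hat >= 10) && all(counts_null >= 10)

print(counts_hat)
print(counts_null)
print(sf_ok)
```

Either way of counting gives values far above 10, so the normal approximation is reasonable for this sample.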

Hypothesis Testing (2/3)

Step 2: Compute standard error. \[ \begin{aligned} SE & = \sqrt{\frac{p_0(1-p_0)}{n}} \\ & = \sqrt{\frac{0.50(1-0.50)}{617}} \\ & = 0.02 \end{aligned} \]
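Notice that the standard error here uses the null value \(p_0 = 0.50\), because a hypothesis test assumes that the null hypothesis is true. When no null value is available, as with a confidence interval, we use \(\hat{p} = 0.55\) as the best guess.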
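To see how little the choice matters in this example, the R sketch below (illustrative, with assumed variable names) computes the standard error once with the null value and once with the point estimate.

```r
# Values from the National Health Plan example
n     <- 617
p_0   <- 0.50   # null value, used for the hypothesis test
p_hat <- 0.55   # point estimate, used when no null value is available

# Standard error under the null hypothesis
se_null <- sqrt(p_0 * (1 - p_0) / n)

# Standard error based on the point estimate
se_hat <- sqrt(p_hat * (1 - p_hat) / n)

print(round(c(null = se_null, estimate = se_hat), 4))
```

Both values round to 0.02 here, but the hypothesis test should still use the version based on \(p_0\).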

Hypothesis Testing (3/3)

Step 3: Compute the test statistic. \[ \begin{aligned} Z & = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \\ & = \frac{0.55 - 0.50}{\sqrt{0.50(1-0.50)/617}} \\ & = 2.5\\ \end{aligned} \]
A test statistic of \(Z = 2.5\) yields a one-tail area of \(0.0062\). Since this is a two-sided test, we double it to get a p-value of \(2(0.0062)=0.0124\). In R, the p-value can be computed with 2*(1-pnorm(2.5,0,1)).
Because the p-value is smaller than \(0.05\), we reject the null hypothesis at the 5% significance level. We have strong evidence that \(p\) is different from \(0.5\), and since the data provide a point estimate above \(0.5\), we have strong evidence to support the claim.
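The whole calculation can be reproduced in R without rounding along the way. This is a minimal sketch with assumed variable names. Because it keeps the unrounded test statistic (about 2.48 instead of 2.5), the p-value comes out near 0.013 rather than exactly 0.0124, and the conclusion is unchanged.

```r
# Values from the National Health Plan example
n     <- 617
p_hat <- 0.55
p_0   <- 0.50

# Standard error under the null hypothesis
se <- sqrt(p_0 * (1 - p_0) / n)

# Test statistic
z <- (p_hat - p_0) / se

# Two-sided p-value: double the upper-tail area beyond |z|
p_value <- 2 * pnorm(abs(z), lower.tail = FALSE)

print(z)
print(p_value)
print(p_value < 0.05)   # TRUE means we reject the null hypothesis at the 5% level
```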

10-minute Activity (1/3)

The problem shown below was taken and slightly modified from your textbook OpenIntro: Introduction to Modern Statistics Section 16.4.

Is college worth it?

Among a simple random sample of 331 American adults who do not have a four-year college degree and are not currently enrolled in school, 48% said they decided not to go to college because they could not afford school. Pew Research Center 2011

A newspaper article states that only a minority of the Americans who decide not to go to college do so because they cannot afford it and uses the point estimate from this survey as evidence. Conduct a hypothesis test to determine if these data provide strong evidence supporting this statement.


10-minute Activity (2/3)

The hypotheses are as follows: \(H_0:p=0.5\) and \(H_A:p<0.5\).
Before calculating the test statistic, we should check that the conditions are satisfied. The observations are independent because they come from a simple random sample, and the success-failure condition is met, so \(\hat{p}\) is expected to be approximately normal. \[n\hat{p} = 331(0.48) \approx 159 > 10\] \[n(1-\hat{p}) = 331(1-0.48) \approx 172 > 10\]

10-minute Activity (3/3)

The test statistic can be calculated as shown below. \[\hat{p} = 0.48 \text{ and } p_0 = 0.50\] \[ \begin{aligned} Z & = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \\ & = \frac{0.48 - 0.50}{\sqrt{0.50(1-0.50)/331}} \\ & = -0.73\\ \end{aligned} \] \[\text{p-value} = P(Z < -0.73) = 0.2327\]
In R, the p-value can be computed with pnorm(-0.73,0,1).
Since the p-value is large, we fail to reject \(H_0\). The data do not provide strong evidence that less than half of American adults who decide not to go to college make this decision because they cannot afford college.
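The activity's one-sided test can be checked the same way. The R sketch below is illustrative, with assumed variable names. It uses the lower tail only because \(H_A: p < 0.5\), and it keeps the unrounded test statistic, so the p-value is about 0.233, close to the 0.2327 obtained from \(Z = -0.73\).

```r
# Values from the "Is college worth it?" activity
n     <- 331
p_hat <- 0.48
p_0   <- 0.50

# Standard error under the null hypothesis
se <- sqrt(p_0 * (1 - p_0) / n)

# Test statistic
z <- (p_hat - p_0) / se

# One-sided p-value: lower-tail area, matching H_A: p < 0.5
p_value <- pnorm(z)

print(z)
print(p_value)
print(p_value < 0.05)   # FALSE means we fail to reject the null hypothesis
```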

Summary

Today, we discussed the following:

Computing the test statistic Z for hypothesis testing

Next, we will discuss:

Confidence intervals for one proportion

In lab, we will work on:

Using simulations for hypothesis testing and confidence intervals.
Utilizing R for our computations.

11 - Inference for One Proportion Hypothesis Testing