Alex John Quijano
11/08/2021
Population proportion vs Sample statistic
Null and alternative hypothesis
Hypothesis testing using simulations
Central limit theorem
Today, we will discuss the following:
The problem shown below was taken and slightly modified from your textbook OpenIntro: Introduction to Modern Statistics Section 16.4.
Consider the research study described below.
A Kaiser Family Foundation poll for a random sample of US adults in 2019 found that 79% of Democrats, 55% of Independents, and 24% of Republicans supported a generic “National Health Plan.” There were 347 Democrats, 298 Republicans, and 617 Independents surveyed (Kaiser Family Foundation 2019).
Claim: A majority of independents support the National Health Plan (NHP).
Questions: How do we define “majority”? What if it’s not actually a majority but a minority? Do these data provide strong evidence to support the claim?
Suppose that “majority” means that more than 50% of independents support a National Health Plan. Recall that 55% of the independents surveyed supported the National Health Plan.
Let \(p\) be the proportion of all independents who support the NHP.
Null Hypothesis: Exactly half of independents support the NHP, so there is neither a majority nor a minority.
\[H_0: p = 0.50\]
Alternative Hypothesis: Either a majority or a minority of independents support the NHP, where a majority means that more than 50% of independents support the NHP. \[H_A: p \ne 0.50\]
Notice that we are using the null value \(p_0 = 0.50\) in our hypotheses and this is a two-sided hypothesis test.
Goal: To compute the p-value.
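Step 1: Check if the conditions are satisfied. We have \(n=617\) and \(\hat{p} = 0.55\). The observations come from a random sample, so we can treat them as independent, and the success-failure condition holds under the null value \(p_0 = 0.50\). \[np_0 = 617(0.50) = 308.5 > 10\] \[n(1-p_0) = 617(1-0.50) = 308.5 > 10\]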
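The success-failure part of this check can also be done in R. The sketch below assumes the threshold of at least 10 expected successes and 10 expected failures under the null value; the variable names are illustrative only.

```r
# Values from the problem statement
n  <- 617    # number of independents surveyed
p0 <- 0.50   # null value

# Expected counts if the null hypothesis is true
expected_successes <- n * p0
expected_failures  <- n * (1 - p0)

# Success-failure condition: both expected counts should be at least 10
conditions_met <- expected_successes >= 10 && expected_failures >= 10
conditions_met
```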
Step 2: Compute standard error. \[ \begin{aligned} SE & = \sqrt{\frac{p_0(1-p_0)}{n}} \\ & = \sqrt{\frac{0.50(1-0.50)}{617}} \\ & = 0.02 \end{aligned} \]
Notice that I am using the null value \(p_0 = 0.50\) here, because the test is carried out assuming \(H_0\) is true. When no null value is available, we can use \(\hat{p} = 0.55\) as the best guess for \(p\).
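To see how little the choice matters in this example, here is a small R sketch that computes the standard error both ways. The variable names are illustrative, not from the textbook.

```r
n     <- 617
p0    <- 0.50   # null value
p_hat <- 0.55   # sample proportion

se_null   <- sqrt(p0 * (1 - p0) / n)         # standard error under the null value
se_sample <- sqrt(p_hat * (1 - p_hat) / n)   # standard error using the sample proportion

# Both values round to 0.02
round(c(se_null = se_null, se_sample = se_sample), 4)
```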
Step 3: Compute the test statistic. \[ \begin{aligned} Z & = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \\ & = \frac{0.55 - 0.50}{\sqrt{0.50(1-0.50)/617}} \\ & = 2.5\\ \end{aligned} \]
The test statistic yields a one-tail area of \(0.0062\). Since this is a two-tailed test, the p-value is \(2(0.0062)=0.0124\). In R, you can compute the p-value with 2*(1-pnorm(2.5,0,1)).
Because the p-value is smaller than \(0.05\), we reject the null hypothesis. We have strong evidence to support that \(p\) is different from \(0.5\), and since the data provide a point estimate above \(0.5\), we have strong evidence to support the claim.
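Steps 2 and 3 and the p-value calculation can be wrapped in one function. The sketch below is a hypothetical helper, `one_prop_z_test()`, written for these notes; it is not a built-in R function and it assumes the conditions from Step 1 have already been checked.

```r
# Hypothetical helper: one-proportion z-test using the null value in the standard error
one_prop_z_test <- function(p_hat, p0, n,
                            alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  se <- sqrt(p0 * (1 - p0) / n)   # standard error under H_0
  z  <- (p_hat - p0) / se         # test statistic
  p_value <- switch(alternative,
    two.sided = 2 * pnorm(-abs(z)),   # both tails
    less      = pnorm(z),             # lower tail
    greater   = 1 - pnorm(z)          # upper tail
  )
  list(z = z, p_value = p_value)
}

result <- one_prop_z_test(p_hat = 0.55, p0 = 0.50, n = 617)
result$z               # about 2.48 before rounding
result$p_value         # about 0.013
result$p_value < 0.05  # TRUE, so we reject H_0
```

Because the function does not round the test statistic, its p-value is slightly larger than the \(0.0124\) obtained from the rounded \(Z = 2.5\). The conclusion is the same.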
The problem shown below was taken and slightly modified from your textbook OpenIntro: Introduction to Modern Statistics Section 16.4.
Is college worth it?
Among a simple random sample of 331 American adults who do not have a four-year college degree and are not currently enrolled in school, 48% said they decided not to go to college because they could not afford school (Pew Research Center 2011).
The hypotheses are as follows: \(H_0:p=0.5\) and \(H_A:p<0.5\).
Before calculating the test statistic we should check that the conditions are satisfied. The observations come from a simple random sample, so they are independent, and the success-failure condition is met under the null value \(p_0 = 0.50\). The sampling distribution of \(\hat{p}\) is therefore expected to be approximately normal. \[np_0 = 331(0.50) = 165.5 > 10\] \[n(1-p_0) = 331(1-0.50) = 165.5 > 10\]
The test statistic can be calculated as shown below. \[\hat{p} = 0.48 \text{ and } p_0 = 0.50\] \[ \begin{aligned} Z & = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \\ & = \frac{0.48 - 0.50}{\sqrt{0.50(1-0.50)/331}} \\ & = -0.73\\ \end{aligned} \] \[\text{p-value} = P(Z < -0.73) = 0.2327\]
In R, you can compute the p-value with pnorm(-0.73,0,1).
Since the p-value is larger than \(0.05\), we fail to reject \(H_0\). The data do not provide strong evidence that less than half of American adults who decide not to go to college make this decision because they cannot afford college.
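The whole one-sided calculation can be reproduced in R without rounding the test statistic. This is a minimal sketch with illustrative variable names, assuming the \(0.05\) significance level used in these notes.

```r
n     <- 331
p_hat <- 0.48   # sample proportion
p0    <- 0.50   # null value

se <- sqrt(p0 * (1 - p0) / n)   # standard error under H_0
z  <- (p_hat - p0) / se         # test statistic, about -0.73

# Lower-tail area, matching the alternative H_A: p < 0.5
p_value <- pnorm(z)

round(c(z = z, p_value = p_value), 3)
p_value < 0.05   # FALSE, so we fail to reject H_0
```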
Today, we discussed the following:
Next, we will discuss:
In lab, we will work on:
Using simulations for hypothesis testing and confidence intervals.
Utilizing R for our computations.
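As a preview of the simulation approach, here is a rough R sketch that approximates the two-sided p-value for the National Health Plan example by simulating sample proportions under \(H_0: p = 0.50\). The seed and the number of simulations are arbitrary choices, and the lab may use a different setup.

```r
set.seed(1)   # arbitrary seed so the sketch is reproducible

n      <- 617
p0     <- 0.50
p_hat  <- 0.55
n_sims <- 10000   # number of simulated samples (an arbitrary choice)

# Simulate sample proportions assuming the null hypothesis is true
sim_p_hat <- rbinom(n_sims, size = n, prob = p0) / n

# Two-sided p-value: share of simulated proportions at least as far
# from p0 as the observed sample proportion
p_value_sim <- mean(abs(sim_p_hat - p0) >= abs(p_hat - p0))
p_value_sim
```

The simulated p-value should land close to the value from the normal approximation, with some variation from run to run.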