Alex John Quijano
11/22/2021
Confidence intervals for one mean using one sample t-intervals
Confidence intervals for comparing two means using two sample t-intervals
Today, we will discuss the following:
Every year, the US releases to the public a large data set containing information on births recorded in the country. This data set has been of interest to medical researchers who are studying the relation between habits and practices of expectant mothers and the birth of their children. We will work with a random sample of 1,000 cases from the data set released in 2014.
The births14 data can be found in the openintro R package.
fage | mage | weeks | visits | gained | weight | sex | habit |
---|---|---|---|---|---|---|---|
34 | 34 | 37 | 14 | 28 | 6.96 | male | nonsmoker |
36 | 31 | 41 | 12 | 41 | 8.86 | female | nonsmoker |
37 | 36 | 37 | 10 | 28 | 7.51 | female | nonsmoker |
NA | 16 | 38 | NA | 29 | 6.19 | male | nonsmoker |
We would like to know, is there convincing evidence that newborns from mothers who smoke have a different average birth weight than newborns from mothers who don’t smoke?
Habit | n | Mean | SD |
---|---|---|---|
nonsmoker | 867 | 7.27 | 1.23 |
smoker | 114 | 6.68 | 1.60 |
Conditions:
The data come from a simple random sample, so the observations are independent, both within and between samples.
Both groups have more than 30 observations, and an inspection of the data finds no particularly extreme outliers.
Since both conditions are satisfied, the difference in sample means may be modeled using a \(t\)-distribution.
The top panel represents birth weights for infants whose mothers smoked during pregnancy. The bottom panel represents birth weights for infants whose mothers did not smoke during pregnancy.
Consider one group (smoking) from the data. It is known that a newborn baby has an average weight of \(7.5\) lbs. We want to use a one sample t-test to check whether the average weight for the smoking group differs from this value, and in particular whether it is lower.
Do the data (smoking group) provide convincing evidence that the average weight differs from, and in particular is less than, \(7.5\) lbs?
\(H_0\): The average weight of the smoking group is \(7.5\) lbs. \[\mu = 7.5\]
\(H_A\): The average weight of the smoking group is not \(7.5\) lbs. \[\mu \ne 7.5\]
The null value is \(\mu_0 = 7.5\). The point-estimate is \(\bar{x} = 6.68\) and the sample standard deviation is \(s = 1.60\).
Step 1: Compute the standard error \[ \begin{aligned} SE & = \frac{s}{\sqrt{n}} \\ & = \frac{1.60}{\sqrt{114}} \\ SE & = 0.15 \end{aligned} \]
Step 2: Compute the T statistic \[ \begin{aligned} T & = \frac{\bar{x} - \mu_0}{SE} \\ & = \frac{6.68 - 7.5}{0.15} \\ T & = -5.47 \end{aligned} \]
Step 3: The degrees of freedom are \(df = n - 1 = 114 - 1 = 113\).
Step 4: The p-value is approximately \(2.7 \times 10^{-07}\). Here, we used 2*pt(-5.47,113) in R. We multiply by 2 because it’s a two-sided test.
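The four steps above can be reproduced in base R from the summary statistics alone. This is a minimal sketch, not the original analysis script: the variable names (`n`, `x_bar`, `s`, `mu_0`, `se`, `t_stat`, `p_value`) are chosen here for illustration, and because the code keeps full precision instead of rounding the standard error, the last digits may differ from the hand calculation.

```r
# Summary statistics for the smoking group (from the table above)
n     <- 114    # sample size
x_bar <- 6.68   # sample mean birth weight (lbs)
s     <- 1.60   # sample standard deviation (lbs)
mu_0  <- 7.5    # null value

# Step 1: standard error of the sample mean
se <- s / sqrt(n)

# Step 2: T statistic
t_stat <- (x_bar - mu_0) / se

# Step 3: degrees of freedom
df <- n - 1

# Step 4: two-sided p-value (both tails of the t-distribution)
p_value <- 2 * pt(-abs(t_stat), df)

print(c(SE = se, T = t_stat, df = df, p_value = p_value))
```

Using `-abs(t_stat)` makes the two-sided calculation work regardless of the sign of the statistic.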
Conclusions:
Since the p-value is less than a significance level of 0.05 or 0.01 (the p-value is really small), we can conclude that the data provide strong evidence that the average weight for the smoking group is not equal to \(7.5\) lbs.
Since the T statistic is negative, we can say that the average weight is less than the null value.
Habit | n | Mean | SD |
---|---|---|---|
nonsmoker | 867 | 7.27 | 1.23 |
smoker | 114 | 6.68 | 1.60 |
Is there a difference in weight means between the smoking group and nonsmoking group?
\(H_0\): There is no difference in means between the smoking and nonsmoking groups. \[\mu_{smoking} = \mu_{nonsmoking}\]
\(H_A\): There is a significant difference in means between the smoking and nonsmoking groups. In particular, the average weight in the smoking group is less than the average weight in the nonsmoking group. \[\mu_{smoking} < \mu_{nonsmoking}\]
The null value is \(\mu_0 = 0\). The point-estimate is \(\bar{x}_{nonsmoking} - \bar{x}_{smoking} = 0.59\) and the sample standard deviations are \(s_{smoking} = 1.60\) and \(s_{nonsmoking} = 1.23\).
Step 1: Compute the standard error \[ \begin{aligned} SE & = \sqrt{\frac{s_{smoking}^2}{n_{smoking}} + \frac{s_{nonsmoking}^2}{n_{nonsmoking}}} \\ & = \sqrt{\frac{1.60^2}{114} + \frac{1.23^2}{867}} \\ SE & = 0.156 \end{aligned} \]
Step 2: Compute the T statistic \[ \begin{aligned} T & = \frac{\bar{x}_{nonsmoking} - \bar{x}_{smoking} - \mu_0}{SE} \\ & = \frac{0.59 - 0}{0.156} \\ T & = 3.78 \end{aligned} \]
Step 3: The degrees of freedom are \(df = \min(n_{smoking} - 1, n_{nonsmoking} - 1) = 114 - 1 = 113\).
Step 4: The p-value is \(0.000126\). Here, we used 1-pt(3.78,113) in R. This is a one-sided test.
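As a check on the arithmetic, the same steps can be sketched in base R from the summary table. The variable names are illustrative assumptions, not part of the original slides. The code carries full precision, so the T statistic comes out closer to 3.79 than to the 3.78 obtained by hand with the rounded standard error; the conclusion does not change.

```r
# Summary statistics by smoking habit (from the table above)
n_smoking     <- 114;  mean_smoking     <- 6.68;  sd_smoking     <- 1.60
n_nonsmoking  <- 867;  mean_nonsmoking  <- 7.27;  sd_nonsmoking  <- 1.23
mu_0 <- 0  # null value for the difference in means

# Step 1: standard error of the difference in sample means
se <- sqrt(sd_smoking^2 / n_smoking + sd_nonsmoking^2 / n_nonsmoking)

# Step 2: T statistic, with the difference taken as nonsmoking minus smoking
t_stat <- (mean_nonsmoking - mean_smoking - mu_0) / se

# Step 3: conservative degrees of freedom (smaller sample size minus 1)
df <- min(n_smoking - 1, n_nonsmoking - 1)

# Step 4: one-sided p-value (upper tail, matching the order of the difference)
p_value <- pt(t_stat, df, lower.tail = FALSE)

print(c(SE = se, T = t_stat, df = df, p_value = p_value))
```

`pt(t_stat, df, lower.tail = FALSE)` is equivalent to `1 - pt(t_stat, df)` but avoids subtracting two nearly equal numbers when the p-value is very small.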
Conclusions:
Since the p-value is less than a significance level of 0.05 or 0.01 (the p-value is really small), we can conclude that the data provide strong evidence that there is a difference in average weight between the nonsmoking and smoking groups.
Since the T statistic is positive, by the order in which we computed the difference, we can say that the average weight is greater in the nonsmoking group than in the smoking group.
The problem shown below was taken and slightly modified from your textbook OpenIntro: Introduction to Modern Statistics Section 20.6. Consider the research study described below.
Each year the US Environmental Protection Agency (EPA) releases fuel economy data on cars manufactured in that year. Below are summary statistics on fuel efficiency (in miles/gallon) from random samples of cars with manual and automatic transmissions manufactured in 2021. Do these data provide strong evidence of a difference between the average fuel efficiency of cars with manual and automatic transmissions in terms of their average city mileage? (Source: US DOE EPA 2021)
CITY | Mean | SD | n |
---|---|---|---|
Automatic | 17.4 | 3.44 | 25 |
Manual | 22.7 | 4.58 | 25 |
\(H_0\): There is no difference in fuel efficiency means between the manual and automatic groups. \[\mu_{manual} = \mu_{automatic}\]
\(H_A\): There is a significant difference in fuel efficiency means between the manual and automatic groups. In particular, the mean fuel efficiency of the manual group is greater than that of the automatic group. \[\mu_{manual} > \mu_{automatic}\]
The null value is \(\mu_0 = 0\). The point-estimate is \(\bar{x}_{manual} - \bar{x}_{automatic} = 5.3\) and the sample standard deviations are \(s_{manual} = 4.58\) and \(s_{automatic} = 3.44\).
Step 1: Compute the standard error \[ \begin{aligned} SE & = \sqrt{\frac{s_{manual}^2}{n_{manual}} + \frac{s_{automatic}^2}{n_{automatic}}} \\ & = \sqrt{\frac{4.58^2}{25} + \frac{3.44^2}{25}} \\ SE & = 1.1456 \end{aligned} \]
Step 2: Compute the T statistic \[ \begin{aligned} T & = \frac{\bar{x}_{manual} - \bar{x}_{automatic} - \mu_0}{SE} \\ & = \frac{5.3 - 0}{1.1456} \\ T & = 4.6264 \end{aligned} \]
Step 3: The degrees of freedom are \(df = \min(n_{manual} - 1, n_{automatic} - 1) = 25 - 1 = 24\).
Step 4: The p-value is \(5.37 \times 10^{-05}\). Here, we used 1-pt(4.6264,24) in R. This is a one-sided test.
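Since the two-sample calculation follows the same recipe each time, it can be wrapped in a small helper. The function below, `two_sample_t`, is a hypothetical name introduced here for illustration; it is not part of base R or the openintro package. It is a sketch that assumes only summary statistics are available and uses the conservative degrees of freedom from Step 3.

```r
# Hypothetical helper: two-sample t-test from summary statistics.
# The difference is taken as group 1 minus group 2, and the p-value is the
# upper-tail (one-sided) probability, matching H_A: mu_1 > mu_2.
two_sample_t <- function(mean1, sd1, n1, mean2, sd2, n2, null_value = 0) {
  se     <- sqrt(sd1^2 / n1 + sd2^2 / n2)        # Step 1: standard error
  t_stat <- (mean1 - mean2 - null_value) / se    # Step 2: T statistic
  df     <- min(n1 - 1, n2 - 1)                  # Step 3: conservative df
  p_up   <- pt(t_stat, df, lower.tail = FALSE)   # Step 4: one-sided p-value
  list(se = se, t = t_stat, df = df, p_upper = p_up)
}

# City mileage: manual (group 1) versus automatic (group 2)
fuel <- two_sample_t(mean1 = 22.7, sd1 = 4.58, n1 = 25,
                     mean2 = 17.4, sd2 = 3.44, n2 = 25)
print(unlist(fuel))
```

For a two-sided alternative, one could report `2 * pt(-abs(fuel$t), fuel$df)` instead of the upper-tail probability.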
Conclusions:
Since the p-value is less than a significance level of 0.05 or 0.01 (the p-value is extremely small), we can conclude that the data provide strong evidence that there is a difference in mean fuel efficiency between the manual and automatic groups.
Since the T statistic is positive, by the order in which we computed the difference, we can say that the mean fuel efficiency is greater in the manual group than in the automatic group.
Today, we discussed the following:
Hypothesis testing for one mean and for comparing two means
One sample and two sample t-tests
Next, we will discuss:
In lab, we will work on: