12 - Inference for One Mean
Confidence Intervals

Alex John Quijano

11/17/2021

Previously on Statistics…

Inference on Single Mean

Today, we will discuss the following:

Central Limit Theorem for the sample mean

When we collect a sufficiently large sample of \(n\) independent observations from a population with mean \(\mu\) and standard deviation \(\sigma,\) the sampling distribution of \(\bar{x}\) will be nearly normal with

\[\text{Mean} = \mu \qquad \text{Standard Error }(SE) = \frac{\sigma}{\sqrt{n}}\]
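A small simulation can make this concrete. The sketch below is an illustration rather than part of the original slides: it assumes an exponential population with rate 1 (so \(\mu = 1\) and \(\sigma = 1\)), and the sample size of 50 and the 10,000 repetitions are arbitrary choices.

```r
# Illustrative simulation (assumed setup, not from the slides)
set.seed(2021)

n     <- 50      # observations per sample (arbitrary choice)
reps  <- 10000   # number of simulated samples (arbitrary choice)
mu    <- 1       # mean of an Exponential(rate = 1) population
sigma <- 1       # standard deviation of the same population

# Draw `reps` samples of size n and record each sample mean
xbar <- replicate(reps, mean(rexp(n, rate = 1)))

se_theory <- sigma / sqrt(n)   # standard error predicted by the CLT

mean(xbar)   # should land close to mu
sd(xbar)     # should land close to se_theory
se_theory
```

Even though the exponential population is strongly right-skewed, a histogram of the simulated means, for example `hist(xbar)`, should look roughly bell-shaped at this sample size.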

Evaluating the two conditions required for modeling \(\bar{x}\)

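Two conditions are required to apply the Central Limit Theorem for a sample mean \(\bar{x}\): independence of the sample observations, and normality of the population they come from (a condition that can be relaxed as the sample size grows).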

General rule for performing the normality check

Note, it often takes practice to get a sense for whether or not a normal approximation is appropriate.

Normality Assessment (1/2)

Consider the four plots provided that come from simple random samples from different populations.

Are the independence and normality conditions met in each case?

Histograms of samples from two different populations.

Normality Assessment (2/2)

The t-distribution (1/2)

Comparison of a \(t\)-distribution and a normal distribution.

The \(t\)-distribution is always centered at zero and has a single parameter: degrees of freedom. The degrees of freedom describes the precise form of the bell-shaped \(t\)-distribution. In general, we’ll use a \(t\)-distribution with \(df = n - 1\) to model the sample mean when the sample size is \(n.\)

The t-distribution (2/2)

The larger the degrees of freedom, the more closely the \(t\)-distribution resembles the standard normal distribution.
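The convergence can also be checked numerically. The sketch below compares the upper 97.5% cutoff of the \(t\)-distribution with that of the standard normal distribution; the particular degrees of freedom are chosen only for illustration.

```r
# Upper 97.5% cutoffs: t-distribution vs. standard normal
dfs <- c(2, 5, 18, 30, 100, 1000)   # degrees of freedom chosen for illustration

t_cut <- qt(0.975, df = dfs)   # qt() is vectorized over df
z_cut <- qnorm(0.975)          # standard normal cutoff for comparison

data.frame(df = dfs, t_cutoff = round(t_cut, 3), z_cutoff = round(z_cut, 3))
```

The \(t\) cutoffs shrink toward the normal cutoff as the degrees of freedom grow, which reflects the thicker tails of the \(t\)-distribution at small sample sizes.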

Mercury content in Risso’s dolphins (1/3)

We will identify a confidence interval for the average mercury content in dolphin muscle using a sample of 19 Risso’s dolphins from the Taiji area in Japan.

Summary of mercury content in the muscle of 19 Risso’s dolphins from the Taiji area. Measurements are in micrograms of mercury per wet gram of muscle (\(\mu\)g/wet g).
n Mean SD Min Max
19 4.4 2.3 1.7 9.2

Mercury content in Risso’s dolphins (2/3)

One sample t-intervals

\[ \begin{aligned} \text{point estimate} \ &\pm\ t^*_{df} SE \\ \bar{x} \ &\pm\ t^*_{df} \frac{s}{\sqrt{n}} \end{aligned} \]

# use qt() to find the t-cutoff (with 95% in the middle)
qt(0.025, df = 18)
#> [1] -2.1
qt(0.975, df = 18)
#> [1] 2.1

Mercury content in Risso’s dolphins (3/3)

One sample t-intervals

\[ \begin{aligned} \bar{x} \ &\pm\ t^*_{18} SE \\ 4.4 \ &\pm\ 2.10 (0.528) \\ \end{aligned} \] \[(3.29,5.51)\]

We are 95% confident the average mercury content of muscles in Risso’s dolphins is between 3.29 and 5.51 \(\mu\)g/wet gram, which is considered extremely high.
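The whole calculation can be reproduced in R from the summary statistics alone. This is a minimal sketch; the variable names are chosen here for illustration.

```r
# Summary statistics from the table of 19 Risso's dolphins
n    <- 19
xbar <- 4.4
s    <- 2.3

se     <- s / sqrt(n)             # standard error of the sample mean
t_star <- qt(0.975, df = n - 1)   # cutoff with 95% in the middle
me     <- t_star * se             # margin of error

ci <- c(lower = xbar - me, upper = xbar + me)
round(ci, 2)
```

Rounded to two decimal places, the result matches the interval \((3.29, 5.51)\) shown above.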

The Margin of Error for Means

\[ME = t^*_{df}SE = t^*_{df} \frac{s}{\sqrt{n}}\]

where \(t^*_{df}\) is calculated from a specified percentile on the \(t\)-distribution with \(df\) degrees of freedom.
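To see how the margin of error responds to the confidence level, the formula can be wrapped in a small function. The helper name `margin_of_error` is hypothetical, and the inputs reuse the dolphin summary statistics (\(s = 2.3\), \(n = 19\)).

```r
# Hypothetical helper (name chosen for illustration): ME = t* x s / sqrt(n)
margin_of_error <- function(s, n, conf = 0.95) {
  alpha  <- 1 - conf
  t_star <- qt(1 - alpha / 2, df = n - 1)   # two-sided cutoff
  t_star * s / sqrt(n)
}

conf_levels <- c(0.90, 0.95, 0.99)
me <- margin_of_error(s = 2.3, n = 19, conf = conf_levels)

data.frame(confidence = conf_levels, margin_of_error = round(me, 3))
```

A higher confidence level requires a larger cutoff \(t^*_{df}\), so the margin of error widens when the sample size and standard deviation stay fixed.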

We can work backwards:

10.10-Minute Activity (1/4)

The exercise problem shown below was taken and slightly modified from your textbook OpenIntro: Introduction to Modern Statistics Section 19.4.

Heights of adults.

Researchers studying anthropometry collected body measurements, as well as age, weight, height and gender, for 507 physically active individuals. Summary statistics for the distribution of heights (measured in centimeters), along with a histogram, are provided below (Heinz et al. 2003).

Min Q1 Median Mean Q3 Max SD IQR
147 164 170 171 178 198 9.4 14

  1. Check if the conditions are satisfied.

  2. Compute the 90% confidence interval for the average height of adults.

  3. Work backwards to compute the critical \(t^*_{df}\) for a margin of error of \(0.001\).

10.10-Minute Activity (2/4)

    • Independence: The observations come from a simple random sample, so it is reasonable to assume that the 507 physically active individuals are independent.
    • Normality: Based on the summary statistics and the histogram, there are no clear outliers, and the sample size is large enough that the sampling distribution of the mean can be modeled as nearly normal.

10.10-Minute Activity (3/4)

    • Compute the standard error. \[SE = \frac{s}{\sqrt{n}} = \frac{9.4}{\sqrt{507}} = 0.4175\]
    • Compute the \(t^*_{df}\). Given a confidence level of 90%, the critical value computed using the R command qt(0.95, 506) is shown below. \[t^*_{506} = 1.6479\]
    • Compute the 90% confidence interval. \[ \begin{aligned} \bar{x} & \pm t^*_{506} SE \\ 171 & \pm 1.6479 (0.4175) \\ \end{aligned} \] \[(170.312,171.688)\]
    Therefore, we are 90% confident that the true mean height of adults is between \(170.312\)cm and \(171.688\)cm.

10.10-Minute Activity (4/4)

    • The goal is to find the critical \(t^*_{df}\) for a margin of error \(ME = 0.001\).

    • Work backwards. \[ \begin{aligned} ME & = t^*_{df}SE \\ 0.001 & = t^*_{df} (0.4175) \\ t^*_{df} & = \frac{0.001}{0.4175} \\ t^*_{df} & = 0.0024 \\ \end{aligned} \]

    With \(t^*_{df} = 0.0024\) and the same standard error and degrees of freedom, the interval becomes far narrower, but the confidence level drops to roughly 0.2%, so the added precision comes at the cost of almost all confidence.
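Working backwards from a margin of error can also be sketched in R. The two helper names below are hypothetical, and the check reuses the dolphin example (\(n = 19\), \(s = 2.3\)), where the answer is already known to be a 95% confidence level.

```r
# Hypothetical helpers (names chosen for illustration)
# Solve ME = t* x s / sqrt(n) for the critical value t*
t_star_for_me <- function(me, s, n) me / (s / sqrt(n))

# Confidence level implied by a critical value: the area between -t* and t*
conf_for_t_star <- function(t_star, df) 2 * pt(t_star, df = df) - 1

# Check on the dolphin data: start from the 95% margin of error
me_95  <- qt(0.975, df = 18) * 2.3 / sqrt(19)
t_back <- t_star_for_me(me_95, s = 2.3, n = 19)
conf_for_t_star(t_back, df = 18)   # recovers the 95% level

# A very small margin of error implies a very small critical value
t_tiny <- t_star_for_me(0.001, s = 2.3, n = 19)
conf_for_t_star(t_tiny, df = 18)   # confidence level close to zero
```

Holding the data fixed, the only way to shrink the margin of error is to shrink \(t^*_{df}\), which is why the implied confidence level collapses.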

Summary

Today, we discussed the following:

Next, we will discuss: