12 - Inference for Comparing Paired Means

Alex John Quijano

11/29/2021

Previously on Statistics…

Inference on Single Mean

Today, we will discuss the following:

Global Warming

The problem shown below was taken and slightly modified from your textbook OpenIntro: Introduction to Modern Statistics Section 21.5. Consider the research study described below.

Let’s consider a limited set of climate data, examining temperature differences in 1948 vs 2018. We sampled 197 locations from the National Oceanic and Atmospheric Administration’s (NOAA) historical data, where the data was available for both years of interest. We want to know: were there more days with temperatures exceeding 90F in 2018 or in 1948? (NOAA 2018) The difference in number of days exceeding 90F (number of days in 2018 - number of days in 1948) was calculated for each of the 197 locations. The average of these differences was 2.9 days with a standard deviation of 17.2 days.

The climate70 data used in this exercise can be found in the openintro R package.

We are interested in determining whether these data provide strong evidence that there were more days in 2018 that exceeded 90F from NOAA’s weather stations.

The Data Visualized

For each observation in one dataset, there is exactly one specially corresponding observation in the other dataset for the same geographic location. The data are paired.
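Because the data are paired, the two columns reduce to a single sample of differences, and all later inference is done on that one sample. A minimal sketch of this step is shown here. The values in `days_1948` and `days_2018` are hypothetical toy numbers for illustration only and are not taken from the climate70 data.

```python
from statistics import mean, stdev

# Hypothetical toy values (NOT from climate70): number of days above 90F
# at the same five locations in each year. Position i in both lists
# refers to the same location, which is what makes the data paired.
days_1948 = [10, 25, 3, 40, 18]
days_2018 = [14, 22, 9, 47, 20]

# One difference per location: 2018 minus 1948, matching the slide's definition.
differences = [d18 - d48 for d48, d18 in zip(days_1948, days_2018)]

n_diff = len(differences)       # number of pairs
xbar_diff = mean(differences)   # sample mean of the differences
s_diff = stdev(differences)     # sample standard deviation (n - 1 denominator)

print(differences)
print(n_diff, xbar_diff, round(s_diff, 3))
```

The three summaries `n_diff`, `xbar_diff`, and `s_diff` play the same role as the 197, 2.9, and 17.2 reported for the real data.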

The Null and Alternative Hypotheses
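
One way to write the hypotheses, assuming a one-sided alternative that matches the research question ("more days in 2018"), with \(\mu_{diff}\) denoting the true mean of the differences (2018 minus 1948) across locations:

\[H_0: \mu_{diff} = 0, \hspace{10px} H_A: \mu_{diff} > 0\]

A two-sided alternative, \(H_A: \mu_{diff} \neq 0\), is also a reasonable choice if the direction of the change is not assumed in advance.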

Hypothesis Testing
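
A minimal sketch of the test computation from the summary statistics stated in the problem (n = 197, mean difference 2.9, standard deviation 17.2). The test statistic is the usual paired-data T score. The p-values below use a standard normal approximation to the t-distribution with 196 degrees of freedom, which is an assumption made here to keep the example in the Python standard library; an exact t-based p-value would be slightly larger.

```python
import math
from statistics import NormalDist

# Summary statistics given in the problem statement.
n_diff = 197
xbar_diff = 2.9
s_diff = 17.2
null_value = 0.0  # value of mu_diff under the null hypothesis

# Standard error of the mean difference and the T score.
se = s_diff / math.sqrt(n_diff)
t_stat = (xbar_diff - null_value) / se
df = n_diff - 1

# Approximate p-values (ASSUMPTION: normal approximation to t with df = 196).
p_one_sided = 1 - NormalDist().cdf(t_stat)  # for H_A: mu_diff > 0
p_two_sided = 2 * p_one_sided               # for H_A: mu_diff != 0

print(f"SE = {se:.3f}, T = {t_stat:.2f}, df = {df}")
print(f"approx. one-sided p = {p_one_sided:.4f}")
print(f"approx. two-sided p = {p_two_sided:.4f}")
```

With T close to 2.37, both approximate p-values fall below 0.05, so at the 5% level these data would lead us to reject the null hypothesis under either form of the alternative.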

95% Confidence Interval
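
A minimal sketch of the interval, point estimate plus or minus critical value times standard error, using the summary statistics from the problem. As an assumption made here, the critical value is taken from the standard normal distribution (about 1.96) rather than the t-distribution with 196 degrees of freedom, whose critical value is slightly larger, so the t-based interval would be slightly wider.

```python
import math
from statistics import NormalDist

# Summary statistics given in the problem statement.
n_diff = 197
xbar_diff = 2.9
s_diff = 17.2
confidence = 0.95

se = s_diff / math.sqrt(n_diff)

# ASSUMPTION: normal critical value used in place of t* with df = 196.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_star * se
lower = xbar_diff - margin
upper = xbar_diff + margin

print(f"approx. 95% CI for mu_diff: ({lower:.2f}, {upper:.2f}) days")
```

The interval lies entirely above zero, which agrees with rejecting a null value of 0 in a two-sided test at the 5% level.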

10-Minute Activity

Consider the following statement.

Each textbook has two corresponding prices in the data set: one for the UCLA bookstore and one for Amazon. Therefore, each textbook price from the UCLA bookstore has a natural correspondence with a textbook price from Amazon. When two sets of observations have this special correspondence, they are said to be paired.

Are textbooks actually cheaper online? Here we compare the price of textbooks at UCLA’s bookstore and prices at Amazon.com. Seventy-three UCLA courses were randomly sampled in Spring 2010, representing less than 10% of all UCLA courses. Source: AHS

The summary statistics are given here.

\[n_{diff} = 73, \hspace{10px} \bar{x}_{diff} = 12.76, \hspace{10px} s_{diff} = 14.26\]

  1. Are the conditions satisfied? How is this different from our previous topic of comparing two independent means?

  2. What are the null and alternative hypotheses?

  3. Perform a hypothesis test and compute the confidence interval. What are your conclusions in terms of the problem?

10-Minute Activity

TBA

Summary

Today, we discussed the following:

Next, we will discuss: