5 - Hypothesis Testing Continued

Alex John Quijano

09/29/2021

Previously on Hypothesis Testing…

In the previous lecture, we learned about the following:

Hypothesis Testing Continued

In this lecture, we will learn about:

Hypothesis Testing

A hypothesis test is a formal technique for evaluating two competing possibilities.

Null and alternative hypotheses.

Gender Discrimination - Revisited (1/4)

We discussed a study from the 1970s that explored whether there was strong evidence that female candidates were less likely to be promoted than male candidates.

The research question, "Are female candidates discriminated against in promotion decisions?", was framed in the context of hypotheses:

Gender Discrimination - Revisited (2/4)

Gender Discrimination - Revisited (3/4)

The 48 red and white cards are shown in three panels. The first panel represents the original data and original allocation of the male and female files (in the original data there are 3 white cards in the male group and 10 white cards in the female group). The second panel represents the shuffled red and white cards that are randomly assigned as male and female files. The third panel has the cards sorted according to the random assignment of female or male. In the third panel there are 6 white cards in the male group and 7 white cards in the female group.

We summarize the randomized data to produce one estimate of the difference in proportions given no sex discrimination. Note that the sort step is only used to make it easier to visually calculate the simulated sample proportions.
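The card shuffle described above can be sketched in Python. This is only an illustrative sketch: the even 24/24 split between male and female files is an assumption (the text here states only the 48 cards and the white-card counts), and the names `shuffle_once` and `GROUP_SIZE` are made up for this example.

```python
import random

N_CARDS = 48
N_WHITE = 3 + 10   # white cards in the original data (3 male + 10 female)
GROUP_SIZE = 24    # assumption: files are split evenly into two groups

def shuffle_once(rng):
    """Shuffle the cards, deal them into two groups, and count white cards."""
    cards = ["white"] * N_WHITE + ["red"] * (N_CARDS - N_WHITE)
    rng.shuffle(cards)
    male, female = cards[:GROUP_SIZE], cards[GROUP_SIZE:]
    return male.count("white"), female.count("white")

rng = random.Random(1)  # arbitrary seed, only for reproducibility
white_male, white_female = shuffle_once(rng)

# Simulated difference in proportions of white cards (female minus male).
diff = white_female / GROUP_SIZE - white_male / GROUP_SIZE
print(white_male, white_female, round(diff, 3))
```

Each call to `shuffle_once` plays the role of one pass through the three panels: shuffle, assign, then count.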

Gender Discrimination - Revisited (4/4)

The P-value

Case Study - Opportunity Cost

Research Question
“How rational and consistent is the behavior of the typical American college student?”

Sources:

Case Study: Opportunity Cost - The Setup (1/3)

We are interested in whether reminding students about this well-known fact about money causes them to be a little thriftier.

A skeptic might think that such a reminder would have no impact.

We can summarize the two different perspectives using the null and alternative hypothesis framework.

Case Study: Opportunity Cost - The Setup (2/3)

One-hundred and fifty students were recruited for the study, and each was given the following statement:

Imagine that you have been saving some extra money on the side to make some purchases, and on your most recent visit to the video store you come across a special sale on a new video. This video is one with your favorite actor or actress, and your favorite type of movie (such as a comedy, drama, thriller, etc.). This particular video that you are considering is one you have been thinking about buying for a long time. It is available for a special sale price of $14.99. What would you do in this situation? Please circle one of the options below.

Case Study: Opportunity Cost - The Setup (3/3)

Half of the 150 students were randomized into a control group and were given the following two options:

The remaining 75 students were placed in the treatment group, and they saw a slightly modified option (B):

Would the extra statement reminding students of an obvious fact impact the purchasing decision?

Case Study: Opportunity Cost - The Data (1/3)

Summary results of the opportunity cost study.

             decision
group        buy video   not buy video   Total
control      56          19              75
treatment    41          34              75
Total        97          53              150

Case Study: Opportunity Cost - The Data (2/3)

Stacked bar plot of results of the opportunity cost study.

Case Study: Opportunity Cost - The Data (3/3)

The opportunity cost data are summarized using row proportions. Row proportions are particularly useful here since we can view the proportion of buy and not buy decisions in each group.

             decision
group        buy video   not buy video   Total
control      0.747       0.253           1
treatment    0.547       0.453           1
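As a quick check, the row proportions can be recomputed from the counts in the summary table. The sketch below uses plain Python; the dictionary layout and the name `row_proportions` are illustrative choices, not part of the original study materials.

```python
# Counts from the summary table of the opportunity cost study.
counts = {
    "control": {"buy video": 56, "not buy video": 19},
    "treatment": {"buy video": 41, "not buy video": 34},
}

def row_proportions(table):
    """Divide each cell by its row total so every row sums to 1."""
    result = {}
    for group, row in table.items():
        total = sum(row.values())
        result[group] = {decision: n / total for decision, n in row.items()}
    return result

props = row_proportions(counts)
for group, row in props.items():
    print(group, {decision: round(p, 3) for decision, p in row.items()})
```

Rounded to three decimals, this gives 0.747 and 0.253 for the control group and 0.547 and 0.453 for the treatment group.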

Case Study: Opportunity Cost - Point Estimate (1/2)

We will define a success in this study as a student who chooses not to buy the video.

Then, the value of interest is the change in video purchase rates that results by reminding students that not spending money now means they can spend the money later.
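With a success defined as not buying the video, the observed difference can be worked out from the summary table (34 of 75 students in the treatment group and 19 of 75 in the control group chose not to buy). The symbols \(\hat{p}_{T}\) and \(\hat{p}_{C}\) denote the sample proportions of successes in the treatment and control groups:

\[\hat{p}_{T} - \hat{p}_{C} = \frac{34}{75} - \frac{19}{75} = 0.453 - 0.253 = 0.20\]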

Case Study: Opportunity Cost - Point Estimate (2/2)

Case Study: Opportunity Cost - Variability of the Statistic (1/3)

Case Study: Opportunity Cost - Variability of the Statistic (2/3)

Case Study: Opportunity Cost - Variability of the Statistic (3/3)

Case Study: Opportunity Cost - 1 Randomization

The results of a single randomization are shown in the table below.

Summary of student choices against their simulated groups. The group assignment had no connection to the student decisions, so any difference between the two groups is due to chance.

             decision
group        buy video   not buy video   Total
control      46          29              75
treatment    51          24              75
Total        97          53              150

The difference that occurred from the first shuffle of the data (i.e., from chance alone):

\[\hat{p}_{T, shfl1} - \hat{p}_{C, shfl1} = \frac{24}{75} - \frac{29}{75} = 0.32 - 0.387 = - 0.067\]
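A single randomization like this one can be sketched in Python. The sketch assumes the shuffle is done by randomly reassigning the 150 recorded decisions (53 "not buy", 97 "buy") to two groups of 75; the function name `shuffled_difference` and the seed are arbitrary. A different shuffle will generally give a different value than the -0.067 above.

```python
import random

N_TREATMENT = 75
N_CONTROL = 75
N_NOT_BUY = 53   # total "not buy video" decisions (19 + 34)
N_BUY = 97       # total "buy video" decisions (56 + 41)

def shuffled_difference(rng):
    """Reassign decisions to groups at random; return p_T - p_C."""
    decisions = [1] * N_NOT_BUY + [0] * N_BUY  # 1 = not buy (a success)
    rng.shuffle(decisions)
    treatment = decisions[:N_TREATMENT]
    control = decisions[N_TREATMENT:]
    return sum(treatment) / N_TREATMENT - sum(control) / N_CONTROL

rng = random.Random(2021)  # arbitrary seed
diff = shuffled_difference(rng)
print(round(diff, 3))

# The shuffle reported in the table above: 24 and 29 successes.
table_diff = 24 / 75 - 29 / 75
print(round(table_diff, 3))  # -0.067
```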

Case Study: Opportunity Cost - 1000 Randomizations (1/2)

A histogram of 1,000 chance differences produced under the null hypothesis. Histograms like this one are a convenient representation of data or results when there are a large number of simulations.
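A collection of 1,000 chance differences of this kind can be generated by repeating the shuffle. The following is a minimal sketch using only the Python standard library; the helper name `null_differences`, the seed, and the crude text histogram are illustrative choices and not the code used to produce the figure.

```python
import random
import statistics
from collections import Counter

def null_differences(n_reps, rng, n_success=53, n_total=150, n_treatment=75):
    """Shuffle decisions n_reps times; return the list of p_T - p_C values."""
    decisions = [1] * n_success + [0] * (n_total - n_success)
    n_control = n_total - n_treatment
    diffs = []
    for _ in range(n_reps):
        rng.shuffle(decisions)
        t = sum(decisions[:n_treatment])  # successes dealt to treatment
        c = n_success - t                 # the rest go to control
        diffs.append(t / n_treatment - c / n_control)
    return diffs

rng = random.Random(0)  # arbitrary seed
diffs = null_differences(1000, rng)

print("mean:", round(statistics.mean(diffs), 3))
print("sd:  ", round(statistics.stdev(diffs), 3))

# Text histogram: one '#' per 10 simulations at each distinct difference.
for value, n in sorted(Counter(round(d, 3) for d in diffs).items()):
    print(f"{value:+.3f} {'#' * (n // 10)}")
```

Because the shuffles ignore group membership, the simulated differences should be centered near zero.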

Case Study: Opportunity Cost - 1000 Randomizations (2/2)

The P-value and Statistical Significance (1/3)

The P-value and Statistical Significance (2/3)

The P-value and Statistical Significance (3/3)

Statistical Significance


We say that the data provide statistically significant evidence against the null hypothesis if the p-value is less than some predetermined threshold (e.g., 0.01, 0.05, 0.1).
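To make the decision rule concrete, here is a hedged sketch that estimates a p-value for the opportunity cost study from simulated shuffles and compares it with a threshold. It assumes a one-sided alternative (the reminder increases the "not buy" rate), estimates the p-value as the share of simulated differences at least as large as the observed 0.20, and uses 0.05 as the threshold; the function names and seed are hypothetical.

```python
import random

OBSERVED_DIFF = 34 / 75 - 19 / 75  # observed p_T - p_C = 0.20

def simulate_null(n_reps, rng):
    """Chance differences from shuffling 53 successes among 150 students."""
    decisions = [1] * 53 + [0] * 97
    diffs = []
    for _ in range(n_reps):
        rng.shuffle(decisions)
        t = sum(decisions[:75])
        diffs.append(t / 75 - (53 - t) / 75)
    return diffs

def upper_tail_p_value(null_diffs, observed):
    """Share of simulated differences at least as large as the observed one."""
    tol = 1e-12  # guards against floating-point ties
    return sum(d >= observed - tol for d in null_diffs) / len(null_diffs)

def is_significant(p_value, alpha=0.05):
    """True when the p-value falls below the predetermined threshold."""
    return p_value < alpha

rng = random.Random(5)  # arbitrary seed
p_value = upper_tail_p_value(simulate_null(1000, rng), OBSERVED_DIFF)
print("estimated p-value:", p_value)
print("significant at 0.05:", is_significant(p_value, alpha=0.05))
```

The threshold must be chosen before looking at the data; passing a different `alpha` (for example 0.01 or 0.1) changes only the final comparison, not the estimated p-value.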

Caution (1/2)

Caution (2/2)

Summary

In this lecture we talked about:

In the next lecture, we will talk about:

Today’s Activity

Within your group, discuss the answers for the following problem.

Hypotheses. Write the null and alternative hypotheses in words and then symbols for each of the following situations. OpenIntro: IMS Section 11.5

  1. New York is known as “the city that never sleeps”. A random sample of 25 New Yorkers were asked how much sleep they get per night. Do these data provide convincing evidence that New Yorkers on average sleep less than 8 hours a night?

  2. Employers at a firm are worried about the effect of March Madness, a basketball championship held each spring in the US, on employee productivity. They estimate that on a regular business day employees spend on average 15 minutes of company time checking personal email, making personal phone calls, etc. They also collect data on how much company time employees spend on such non-business activities during March Madness. They want to determine if these data provide convincing evidence that employee productivity decreases during March Madness.