class: center, middle ### ANalysis Of VAriance Test <img src="img/DAW.png" width="450px"/> <span style="color: #91204D;"> .large[Kelly McConville | Math 141 | Week 12 | Fall 2020] </span> --- ## Announcements/Reminders * Extra Credit Assignment: Write a stats poem. + Due December 2nd * Lab This Week: + If have a Friday afternoon lab, can attend a TH or Friday morning session. + Can see the times at https://solar.reed.edu/class_schedule/ + MUST inform both lab instructors of which lab you are in and which one you will be attending. * Data Viz Contest Presentations Today! Come vote for your favorites and potentially win door prizes! + Today, 4:30 - 5:30 pm + Zoom link in our class Slack --- ## Week 12 Topics * ANOVA Test * Simulation Methods versus Probability Model Methods for Inference * Inference for Linear Regression ********************************************* ### Goals for Today * The ANOVA Test --- ### Inference for Many Means Consider the situation where: * Response variable: quantitative * Explanatory variable: categorical -- * Parameter of interest: `\(\mu_1 - \mu_2\)` -- This parameter of interest only makes sense if the explanatory variable is restricted to two categories. -- It is time to learn how to conduct inference for more than two means. --- ### Hypotheses Consider the situation where: * Response variable: quantitative * Explanatory variable: categorical -- `\(H_o\)`: `\(\mu_1 = \mu_2 = \cdots = \mu_K\)` (Variables are independent/not related.) `\(H_a\)`: At least one mean is not equal to the rest. (Variables are dependent/related) --- ### Example Do Audience Ratings vary by movie genre? ```r library(tidyverse) # Load data library(Lock5Data) movies <- HollywoodMovies2011 %>% filter(!(Genre %in% c("Fantasy", "Adventure"))) %>% drop_na(Genre, AudienceScore) ``` * **Cases**: * **Variables of interest (including type)**: * **Hypotheses**: --- ### Example Does there appear to be a relationship? ```r ggplot(data = movies, mapping = aes(x = Genre, y = AudienceScore)) + geom_boxplot() + stat_summary(fun = mean, geom = "point", color = "purple") ``` <img src="wk12_mon_files/figure-html/unnamed-chunk-2-1.png" width="360" style="display: block; margin: auto;" /> -- What movie did the audience hate so much?? --- ### Example Does there appear to be a relationship? ```r bad <- filter(movies, AudienceScore == min(AudienceScore)) ggplot(data = movies, mapping = aes(x = Genre, y = AudienceScore)) + geom_boxplot() + geom_label(data = bad, mapping = aes(label = Movie)) + stat_summary(fun = mean, geom = "point", color = "purple") ``` <img src="wk12_mon_files/figure-html/unnamed-chunk-3-1.png" width="360" style="display: block; margin: auto;" /> What movie did the audience hate so much?? --- ### Test Statistic Need a test statistic! * Won't be a sample statistic. $$ \bar{x}_1 - \bar{x}_2 - \cdots - \bar{x}_K \mbox{ won't work!} $$ * Needs to measure the discrepancy between the observed sample and the sample we'd expect to see if `\(H_o\)` were true * Would be nice if its null distribution could be approximated by a known probability model --- ### Test Statistic Let's return to the name of the test. -- * Called "Analysis of VARIANCE" test. -- * Not called "Analysis of MEANS" test. -- **Question**: Why analyze *variability* to test differences in means? --- ### Why analyze *variability* to test differences in means? Let's look at some simulated data for a moment. <img src="wk12_mon_files/figure-html/unnamed-chunk-4-1.png" width="576" style="display: block; margin: auto;" /> **Question**: For which scenario are you most convinced that the means are different? --- ### Key Idea: Partitioning the Variability <img src="wk12_mon_files/figure-html/unnamed-chunk-5-1.png" width="576" style="display: block; margin: auto;" /> `\begin{align*} \mbox{Total Variability} & = \mbox{Variability Between Groups} + \mbox{Variability Within Groups} \end{align*}` -- * Variability Between Groups: How much the group means vary + Compare the red dots. -- * Variability Within Groups: How much natural group variability there is + Within groups, compare the black dots to the red dot. --- `\begin{align*} \mbox{Total Variability} & = \mbox{Variability Between Groups} + \mbox{Variability Within Groups} \end{align*}` * Variability Between Groups: How much the group means vary + Compare the red dots. `\begin{align*} \mbox{Variability Between Groups} &= \sum n_i (\bar{x}_i - \bar{x})^2 \\ & = \mbox{Sum of Squares Group} \\ & = \mbox{SSG} \end{align*}` -- * Variability Within Groups: How much natural group variability there is + Within groups, compare the black dots to the red dot. `\begin{align*} \mbox{Variability Within Groups} &= \sum (x - \bar{x}_i)^2 \\ & = \mbox{Sum of Squares Error} \\ & = \mbox{SSE} \end{align*}` -- * Total Variability: How much points vary from the overall mean `\begin{align*} \mbox{Total Variability} &= \sum (x - \bar{x})^2 \\ & = \mbox{Sum of Squares Total} \\ & = \mbox{SSTotal} \end{align*}` --- ### Mean Squares Need to standardize the Sums of Squares to compare SSG to SSE. -- `\begin{align*} \mbox{Mean Variability Between Groups} & = \frac{\mbox{SSG}}{K - 1} \end{align*}` `\begin{align*} \mbox{Mean Variability Within Groups} & = \frac{\mbox{SSE}}{n - K} \end{align*}` -- Now on a comparable scale! Now we can create a test statistice that compares these two measures of variability. --- ### Test Statistic In some ways, MSG is the natural test statistic but as we saw for this example, MSG alone isn't enough. <img src="wk12_mon_files/figure-html/unnamed-chunk-6-1.png" width="576" style="display: block; margin: auto;" /> -- Scenarios 2 and 3 have roughly the same MSG but we are much more convinced that the means are different for 2 than 3. -- That is where MSE comes in! --- ### Test Statistic $$ F = \frac{\mbox{MSG}}{\mbox{MSE}} = \frac{\mbox{var between groups}}{\mbox{variance within groups}} $$ If `\(H_o\)` is true, then `\(F\)` should be roughly equal to what? -- If `\(H_a\)` is true, then `\(F\)` should be greater than 1 because there is more variation in the group means than we'd expect if the population means are all equal. --- ### Returning to the Movies Example ```r library(infer) #Compute F test stat test_stat <- movies %>% specify(AudienceScore ~ Genre) %>% calculate(stat = "F") test_stat ``` ``` ## # A tibble: 1 x 1 ## stat ## <dbl> ## 1 3.88 ``` * Is 3.88 a large test statistic? Is a test statistic of 3.88 unusual under `\(H_o\)`? --- ### Generating the Null Distribution ``` ## AudienceScore Genre ## 1 43 Action ## 2 73 Thriller ## 3 48 Action ## 4 55 Thriller ## 5 68 Horror ## 6 44 Comedy ## 7 54 Comedy ## 8 62 Horror ## 9 50 Animation ## 10 59 Comedy ``` -- **Steps**: 1. Shuffle Genre. 2. Compute the `\(MSE\)` and `\(MSG\)`. 3. Compute the test statistic. 4. Repeat 1 - 3 many times. --- ### Generating the Null Distribution ```r # Construct null distribution null_dist <- movies %>% specify(AudienceScore ~ Genre) %>% hypothesize(null = "independence") %>% generate(reps = 1000, type = "permute") %>% calculate(stat = "F") visualize(null_dist) ``` <img src="wk12_mon_files/figure-html/unnamed-chunk-9-1.png" width="360" style="display: block; margin: auto;" /> --- ### The Null Distribution <img src="wk12_mon_files/figure-html/unnamed-chunk-10-1.png" width="360" style="display: block; margin: auto;" /> **Key Observations**: * Smallest possible value? <br> * Shape? --- ### The Null Distribution <img src="wk12_mon_files/figure-html/unnamed-chunk-11-1.png" width="360" style="display: block; margin: auto;" /> **Key Observations**: * Smallest possible value? <br> * Shape? <br> * Is our observed test statistic unusual? --- ### The P-value ```r # Compute p-value null_dist %>% get_pvalue(obs_stat = test_stat, direction = "greater") ``` ``` ## # A tibble: 1 x 1 ## p_value ## <dbl> ## 1 0 ``` --- ### Approximating the Null Distribution ```r visualize(null_dist, method = "both") ``` <img src="wk12_mon_files/figure-html/unnamed-chunk-13-1.png" width="360" style="display: block; margin: auto;" /> If * There are at least 30 observations in each group or the response variable is normal * The variability is similar in all groups then $$ \mbox{test statistic} \sim F(df1 = K - 1, df2 = n - K) $$ --- ### The ANOVA Test Check assumptions! ```r movies %>% group_by(Genre) %>% summarize(n(), sd(AudienceScore)) ``` ``` ## # A tibble: 7 x 3 ## Genre `n()` `sd(AudienceScore)` ## <fct> <int> <dbl> ## 1 Action 32 18.4 ## 2 Animation 12 13.9 ## 3 Comedy 27 15.7 ## 4 Drama 21 14.5 ## 5 Horror 17 15.9 ## 6 Romance 10 12.9 ## 7 Thriller 13 14.9 ``` --- ### The ANOVA Test Check assumptions! ```r ggplot(data = movies, mapping = aes(x = AudienceScore)) + geom_histogram(bins = 15) + facet_wrap(~Genre) ``` <img src="wk12_mon_files/figure-html/unnamed-chunk-15-1.png" width="576" style="display: block; margin: auto;" /> --- ### The ANOVA Test ```r library(broom) mod <- aov(AudienceScore ~ Genre, data = movies) tidy(mod) ``` ``` ## # A tibble: 2 x 6 ## term df sumsq meansq statistic p.value ## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 Genre 6 5855. 976. 3.88 0.00137 ## 2 Residuals 125 31413. 251. NA NA ```