- Sort students by Hogwarts House, and within house, by year.
survey %>%
arrange(hogwarts, year) %>%
select(hogwarts, year, everything())
## # A tibble: 103 × 34
## hogwarts year X1 social economic diet college_app reedie_social
## <chr> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 don't know Senior 67 5 6 Neither 6 2
## 2 Gryffindor Freshman 19 5 7 Neither 3 3
## 3 Gryffindor Freshman 41 1 2 Vegan 1 3
## 4 Gryffindor Freshman 42 8 5 Neither 22 2
## 5 Gryffindor Freshman 63 2 2 Neither 12 2
## 6 Gryffindor Freshman 99 6 4 Neither 20 2
## 7 Gryffindor Freshman 103 3 3 Vegetari… 6 2
## 8 Gryffindor Junior 17 3 1 Neither 12 3
## 9 Gryffindor Junior 65 2 5 Vegetari… 8 3
## 10 Gryffindor Junior 90 4 6 Vegetari… 5 4
## # … with 93 more rows, and 26 more variables: reedie_economic <dbl>,
## # study <chr>, commons <chr>, transportation <chr>, division <chr>,
## # tradition <chr>, awkward <chr>, technology <chr>, historian <chr>,
## # alcohol <dbl>, reedie_alcohol <dbl>, marijuana <dbl>,
## # reedie_marijuana <dbl>, social_media <chr>, coffee_tea <chr>,
## # computer <chr>, season <chr>, thai <chr>, ac <chr>, beach_mountain <chr>,
## # donut <chr>, first_kiss <dbl>, meme <chr>, dog_pants <chr>, …
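Because `year` is stored as a character column, `arrange()` orders it alphabetically (Freshman, Junior, Senior, Sophomore) rather than by class standing. The following is a minimal sketch, on hypothetical toy data rather than the real `survey`, of one way to get class order: convert `year` to a factor with explicit levels before sorting. The names `toy`, `class_levels`, and `sorted` are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for the survey
toy <- tibble(
  hogwarts = c("Slytherin", "Gryffindor", "Gryffindor", "Gryffindor"),
  year     = c("Junior", "Sophomore", "Senior", "Freshman")
)

# Assumed class order; adjust if the data use other labels
class_levels <- c("Freshman", "Sophomore", "Junior", "Senior")

# Sort by house first; year only breaks ties within a house
sorted <- toy %>%
  mutate(year = factor(year, levels = class_levels)) %>%
  arrange(hogwarts, year)

sorted
```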
- Find the number of students in each year who think hot dogs are sandwiches.
survey %>%
group_by(year, hot_dog) %>%
summarize(n = n()) %>%
filter(hot_dog == "Yes")
## `summarise()` has grouped output by 'year'. You can override using the `.groups` argument.
## # A tibble: 4 × 3
## # Groups: year [4]
## year hot_dog n
## <chr> <chr> <int>
## 1 Freshman Yes 7
## 2 Junior Yes 6
## 3 Senior Yes 3
## 4 Sophomore Yes 13
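An alternative sketch of the same count, shown on hypothetical toy data: filtering first and then calling `count()` gives one row per year and returns an ungrouped tibble, so no grouping message is printed. The names `toy` and `yes_counts` are illustrative only.

```r
library(dplyr)

# Hypothetical toy data standing in for the survey
toy <- tibble(
  year    = c("Freshman", "Freshman", "Junior", "Junior", "Junior"),
  hot_dog = c("Yes", "No", "Yes", "Yes", "Maybe")
)

# Keep only the "Yes" answers, then count rows per year
yes_counts <- toy %>%
  filter(hot_dog == "Yes") %>%
  count(year)

yes_counts
```

With either approach, a year in which nobody answered "Yes" does not appear in the result at all.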
- Calculate the median number of college applications submitted by students of Herodotus.
survey %>%
filter(historian == "Herodotus") %>%
summarize(median_apps = median(college_app))
## # A tibble: 1 × 1
## median_apps
## <dbl>
## 1 7
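`median()` returns `NA` if any value in the column is missing. The result above is a number, so that did not happen here, but a defensive version could pass `na.rm = TRUE`. This is a sketch on hypothetical toy data; the value "Thucydides" and the missing entry are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical toy data with one missing response
toy <- tibble(
  historian   = c("Herodotus", "Herodotus", "Herodotus", "Thucydides"),
  college_app = c(4, NA, 10, 8)
)

# na.rm = TRUE drops the missing value before taking the median
result <- toy %>%
  filter(historian == "Herodotus") %>%
  summarize(median_apps = median(college_app, na.rm = TRUE))

result
```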
- Create a data set consisting only of categorical variables (ordered alphabetically), and with student responses ordered alphabetically, starting with the first variable.
survey %>%
select(dog_pants, historian, hogwarts, hot_dog, year) %>%
arrange(dog_pants, historian, hogwarts, hot_dog, year)
## # A tibble: 103 × 5
## dog_pants historian hogwarts hot_dog year
## <chr> <chr> <chr> <chr> <chr>
## 1 All legs don't know don't know Maybe Senior
## 2 All legs Herodotus Gryffindor Maybe Junior
## 3 All legs Herodotus Gryffindor Maybe Sophomore
## 4 All legs Herodotus Gryffindor No Freshman
## 5 All legs Herodotus Gryffindor No Sophomore
## 6 All legs Herodotus Gryffindor Yes Sophomore
## 7 All legs Herodotus Hufflepuff Yes Junior
## 8 All legs Herodotus Ravenclaw No Freshman
## 9 All legs Herodotus Ravenclaw No Junior
## 10 All legs Herodotus Ravenclaw No Sophomore
## # … with 93 more rows
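The pipeline above names five variables by hand. As a hedged sketch, the same idea can be written without listing columns: `where(is.character)` keeps every character column, `all_of(sort(names(.)))` puts them in alphabetical order, and `arrange(across(everything()))` sorts rows by each column in turn. The toy data and the names `toy` and `cats` are hypothetical, and this treats every character column as categorical, which may select more variables than intended on the real survey.

```r
library(dplyr)

# Hypothetical toy data: two character columns and one numeric column
toy <- tibble(
  year    = c("Senior", "Junior", "Junior"),
  social  = c(5, 1, 3),
  hot_dog = c("No", "Yes", "Maybe")
)

cats <- toy %>%
  select(where(is.character)) %>%      # keep character columns only
  select(all_of(sort(names(.)))) %>%   # order the columns alphabetically
  arrange(across(everything()))        # sort rows by column 1, then column 2, ...

cats
```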
- Identify students whose social and economic views differ by more than 2 points.
survey %>%
mutate(diff = abs(social - economic)) %>%
select(social, economic, diff) %>%
filter(diff > 2)
## # A tibble: 18 × 3
## social economic diff
## <dbl> <dbl> <dbl>
## 1 1 6 5
## 2 3 8 5
## 3 3 6 3
## 4 4 8 4
## 5 2 5 3
## 6 8 5 3
## 7 3 7 4
## 8 2 5 3
## 9 3 9 6
## 10 6 9 3
## 11 1 4 3
## 12 1 5 4
## 13 4 7 3
## 14 3 7 4
## 15 5 2 3
## 16 2 5 3
## 17 3 6 3
## 18 8 5 3
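The comparison operator decides whether a difference of exactly 2 is kept: the pipeline above uses `>`, which drops it, while `>=` would keep it. A small sketch on hypothetical toy data makes the boundary case visible; `toy`, `strict`, and `inclusive` are illustrative names.

```r
library(dplyr)

# Hypothetical toy data: differences of 2, 0, and 4
toy <- tibble(
  social   = c(1, 3, 5),
  economic = c(3, 3, 9)
) %>%
  mutate(diff = abs(social - economic))

strict    <- toy %>% filter(diff > 2)   # more than 2 points
inclusive <- toy %>% filter(diff >= 2)  # 2 or more points

nrow(strict)
nrow(inclusive)
```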
- Create a data set consisting of two variables: Hogwarts House and Political Views, where the Political Views score is obtained by averaging a student’s Social and Economic views scores.
survey %>%
mutate(poli_view = (social + economic) / 2) %>%
select(hogwarts, poli_view)
## # A tibble: 103 × 2
## hogwarts poli_view
## <chr> <dbl>
## 1 Ravenclaw 4.5
## 2 Ravenclaw 3
## 3 Gryffindor 5
## 4 Hufflepuff 4
## 5 Slytherin 3
## 6 Slytherin 4
## 7 Ravenclaw 3.5
## 8 Slytherin 1
## 9 Hufflepuff 2
## 10 Gryffindor 3
## # … with 93 more rows
- Count how many students think both that dogs should wear pants on their back legs and that hot dogs are sandwiches.
survey %>%
filter(hot_dog == "Yes") %>%
filter(dog_pants == "Back legs") %>%
summarize(how_many = n())
## # A tibble: 1 × 1
## how_many
## <int>
## 1 22
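Two chained `filter()` calls are equivalent to a single call with comma-separated conditions, since `filter()` keeps only rows where every condition is true. A minimal sketch on hypothetical toy data (`toy` and `both` are illustrative names):

```r
library(dplyr)

# Hypothetical toy data standing in for the survey
toy <- tibble(
  hot_dog   = c("Yes", "Yes", "No", "Yes"),
  dog_pants = c("Back legs", "All legs", "Back legs", "Back legs")
)

# Comma-separated conditions are combined with a logical AND
both <- toy %>%
  filter(hot_dog == "Yes", dog_pants == "Back legs") %>%
  summarize(how_many = n())

both
```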
- Among students who drink and are not freshmen, create a data set that could be used to make a scatterplot of alcohol use vs. social views.
survey %>%
filter(alcohol > 0) %>%
filter(year != "Freshman") %>%
select(alcohol, social) %>%
ggplot(aes(x = alcohol, y = social)) +
geom_jitter()
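Since the exercise asks for a data set, it can help to save the filtered tibble under a name before plotting, so that it can be inspected or reused. A sketch on hypothetical toy data follows; `toy` and `drinkers` are illustrative names, and the saved tibble could then be passed to `ggplot()` as in the pipeline above.

```r
library(dplyr)

# Hypothetical toy data standing in for the survey
toy <- tibble(
  year    = c("Freshman", "Junior", "Senior", "Sophomore"),
  alcohol = c(3, 0, 2, 5),
  social  = c(4, 2, 6, 3)
)

# Keep non-freshmen who drink, then keep only the two plotting variables
drinkers <- toy %>%
  filter(alcohol > 0, year != "Freshman") %>%
  select(alcohol, social)

drinkers
```

Note that `filter()` also drops rows where a condition evaluates to `NA`, so students with a missing `alcohol` value would be excluded.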
