class: center, middle

# More Graphing with `ggplot2`

<img src="img/hero_wall_pink.png" width="800px"/>

## Kelly McConville

.large[Math 241 | Week 2 | Spring 2021]

---

# Announcements

* Lab 1 due on Gradescope on Monday!
    + Colored pencils outside my office (end of 3rd floor of Library).
* Lab 2 posted in the shared folder on the RStudio Server.
    + Due next Thursday by 8:30am on Gradescope.
* Make sure to add your GitHub username to [this sheet](https://docs.google.com/spreadsheets/d/1nvM8jJDUvp8H5iYF59aKNdnJlqy_aW18R6d6DBUPeSU/edit?usp=sharing).
* If you have group member preferences for the mini-projects, fill out [this form](https://forms.gle/6UHStiAh4ZvKXej1A) by Friday, Feb 5th.

---

## Goals for Today

* Reproducible workflow
* `reprex`
    + A productive way to ask for help and to give help
* More `ggplot2`
    + Additional geoms
    + Scale
    + Color
    + Themes
* Context
    + Take a sad graph and make it better...
* Not discussed: Visualizing spatial data with maps
    + Will cover later in the course!

---

## Reproducible Workflow

* One where if you shared your data and work with someone else, they could reproduce your results.

--

* Not the same as **replication**: Where someone collects new data following your same design to see if they get the same results.

--

* `RMarkdown` documents allow us to include our R code, output, and narrative in the same place.
    + Load the **raw** data.
    + Be transparent about all the analysis steps.
    + Even if you don't showcase the `R` code in the output file, it is contained in the `Rmd` file.

<img src="img/rmarkdown.png" width="20%" style="display: block; margin: auto;" />

---

## Creating `repr`oducible `ex`amples with `reprex`

Why do I need to learn to create reproducible technical examples?

--

So you can participate on Stack Overflow, R help mailing lists, or our class Slack channel or in a GitHub Issue!

* More on GitHub next week!

---

## What is wrong with this question?

I am trying to create a plot and I can't get the bars to do what I want them to.
Help?!

---

## What is wrong with this coding question?

I want to do the following but it isn't working:

thing <- read.csv("long/file/path/thing.csv")

ggplot(thing, aes(x = that)) + geom_bar() + scale_x_discrete(c("thing1", "thing2", "thing3"))

Help?!

---

## What is wrong with this coding question?

I want to reorder the bars of my plot but can't get it working. Help!

```r
library(tidyverse)
mtcars <- mtcars %>%
  group_by(cyl) %>%
  mutate(mean_mpg = mean(mpg)) %>%
  ungroup() %>%
  mutate(heavy = case_when(wt < mean(wt) ~ "no",
                           wt >= mean(wt) ~ "yes"))

mtcars %>%
  ggplot(mapping = aes(x = factor(cyl))) +
  geom_bar()

mtcars %>%
  count(heavy)
```

```
## # A tibble: 2 x 2
##   heavy     n
##   <chr> <int>
## 1 no       16
## 2 yes      16
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-2-1.png" width="504" />

---

## What is wrong with this coding question?

I want to reorder the bars of my plot but can't get it working. Help!

```r
rm(list = ls())
library(tidyverse)

mtcars %>%
  ggplot(mapping = aes(x = factor(cyl))) +
  geom_bar()
```

---

## [What makes a good coding question?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)

--

* It uses a **minimal** dataset to reproduce the issue.
* It includes the **shortest** amount of **runnable** code necessary to reproduce the issue.
* It includes any necessary information on the used packages, R version, system, etc.
* For random processes, it uses `set.seed(insert #)` for reproducibility.
* It doesn't wreak havoc on other people's computers.
* It includes code **and output** so that others don't have to run it!

---

## Minimal Dataset

Create a toy data frame.

```r
dat <- data.frame(animal = c("cat", "dog", "mouse"),
                  weight = c(5, 10, 0.5))
dat
```

```
##   animal weight
## 1    cat    5.0
## 2    dog   10.0
## 3  mouse    0.5
```

Use a built-in dataset or a dataset from a particular package.
```r
library(help = "datasets")
?mtcars
```

---

# [`datapasta`](https://cran.r-project.org/web/packages/datapasta/vignettes/datapasta-in-the-cloud.html)

If you really want to use a specific dataset, you can use `datapasta` to create a dataframe.

* But don't use a large dataset in your `reprex`.

**Steps**:

* Install `datapasta`.
* Grab [some data](https://www.nytimes.com/interactive/2017/05/25/sunday-review/opinion-pell-table.html).
* Paste into an empty R script.
* Select all and go to the `Addins` and select `Paste as data.frame`.
* Notice that the text has transformed into code defining data!
* Grab code, write `dat_name <- ` and paste in code.

---

## Minimal Code

Include the **necessary** libraries.

Test run the code in a restarted R session to make sure it is runnable!

```r
library(tidyverse)

mtcars %>%
  arrange(cyl) %>%
  ggplot(mapping = aes(x = factor(cyl))) +
  geom_bar()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-6-1.png" width="504" />

---

## Make sure your code is copy-and-paste-able!

Don't copy from the console.

```r
> library(tidyverse)
>
> mtcars %>%
+   arrange(cyl) %>%
+   ggplot(mapping = aes(x = factor(cyl))) +
+   geom_bar()
```

---

## Make sure your code is copy-and-paste-able!

No screenshots!

![](img/screenShotCode.png)

---

## What Else to Include?
(Don't need to do this for our class Slack channel)

* R Version
* Platform

```r
sessionInfo()
```

```
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux 10 (buster)
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] forcats_0.5.0 stringr_1.4.0 dplyr_1.0.2 purrr_0.3.4
## [5] readr_1.4.0 tidyr_1.1.2 tibble_3.0.4 ggplot2_3.3.3
## [9] tidyverse_1.3.0 knitr_1.30
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.1.0 xfun_0.19 haven_2.3.1 colorspace_2.0-0
## [5] vctrs_0.3.6 generics_0.1.0 htmltools_0.5.0 yaml_2.2.1
## [9] utf8_1.1.4 rlang_0.4.10.9000 pillar_1.4.7 glue_1.4.2
## [13] withr_2.3.0 DBI_1.1.0 dbplyr_2.0.0 modelr_0.1.8
## [17] readxl_1.3.1 lifecycle_0.2.0 munsell_0.5.0 gtable_0.3.0
## [21] cellranger_1.1.0 rvest_0.3.6 evaluate_0.14 labeling_0.4.2
## [25] fansi_0.4.1 broom_0.7.3 Rcpp_1.0.5 scales_1.1.1
## [29] backports_1.2.1 jsonlite_1.7.2 farver_2.0.3 fs_1.5.0
## [33] hms_0.5.3 digest_0.6.27 stringi_1.5.3 xaringan_0.19
## [37] grid_4.0.3 cli_2.2.0 tools_4.0.3 magrittr_2.0.1
## [41] crayon_1.3.4 pkgconfig_2.0.3 ellipsis_0.3.1 xml2_1.3.2
## [45] reprex_1.0.0 lubridate_1.7.9.2 assertthat_0.2.1 rmarkdown_2.6
## [49] httr_1.4.2 rstudioapi_0.13 R6_2.5.0 compiler_4.0.3
```

---

## What Else to Include?

* If it is a problem with a specific package, include the version.
```r
packageVersion("tidyverse")
```

```
## [1] '1.3.0'
```

---

## `reprex`

Now we have our reproducible example:

```r
library(tidyverse)

mtcars %>%
  arrange(cyl) %>%
  ggplot(mapping = aes(x = factor(cyl))) +
  geom_bar()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-10-1.png" width="504" />

How can we easily share it?

* Using the `reprex()` function in the `reprex` package.

---

## `reprex` Practice Time!

But first:

**Q**: What is an R script file?

--

* A text file for entering R commands.

--

**Q**: How is an R script file different from an R Markdown document?

--

* You only put code in an R script.
* If you add any text, you must comment it out with `#`.
* Think of it as a single R chunk that you won't compile into an output document.
* Useful when you are writing a lot of code and want to compartmentalize it.
* We will come back to scripts when we write an R data package!

---

#### `reprex` Practice Time!

(1) Restart your R Session.

(2) Open a script file and include in the top line:

```r
library(reprex)
```

(3) Put the code you want to use in the script file and make sure it runs.

```r
library(tidyverse)

mtcars %>%
  arrange(cyl) %>%
  ggplot(mapping = aes(x = factor(cyl))) +
  geom_bar()
```

(4) Surround the code with `reprex({ ... })` and run it.

(5) To the question "Open the output file for manual copy?" select "1: yes". A window with the output file should pop up.

(6) Head over to Slack/GitHub and create a text/code snippet with the following:

+ Type: R
+ Content: Paste code and output (but not any figures).
+ Message: Paste any images and provide your question.

---

## Recap: Components of Data Graphics

* **data**: dataset that contains the raw data
* **geom**: geometric shape that the data are mapped to.
    + point, line, bar, text, ...
* **aes**thetic: visual properties of the **geom**
    + x position, y position, color, fill, shape
* **coord**: coordinate system
    + Cartesian, polar
* **scale**: controls how data are mapped to the visual values of the aesthetic.
    + EX: particular colors, linear
* **guide**: legend to help user convert visual display back to the data

---

## ggplot2 example code

```r
ggplot(data = ---, mapping = aes(---)) +
  geom_---(---) +
  coord_---() +
  scale_---_---() +
  ---
```

---

## Recap Data: Births2015

```r
# Load libraries
library(mosaicData)
library(tidyverse)

# Grab data
data(Births2015)
```

---

## Recap Data: Births2015

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-15-1.png" width="504" />

* Let's think more about the scales.
* Want to add more context!

---

## Scales

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  scale_x_date() +
  scale_y_continuous() +
  scale_color_discrete()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-16-1.png" width="504" />

* Maybe we want to change the default settings.
* Maybe we want a different scale than the default.

---

## Scales

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  scale_y_continuous(breaks = seq(6000, 14000, by = 500)) +
  scale_color_brewer(type = "qual", palette = 1)
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-17-1.png" width="504" />

---

## Context: Labels

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  labs(x = "Date", y = "Number of Births in US",
       title = "Trend of Births in 2015",
       subtitle = "Data: National Vital Statistics System",
       color = "Week Days")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-18-1.png" width="504" />

* Prefer citing the data at the bottom?
---

## Context: Labels

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  labs(x = "Date", y = "Number of Births in US",
       title = "Trend of Births in 2015",
       caption = "Data: National Vital Statistics System",
       color = "Week Days")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-19-1.png" width="504" />

* Now we want to add even more context: Identify holidays.

---

## Context: Labels

* To save slide space, I omit the labels in the rest of the presentation.

---

## Context: Adding Holidays

```r
library(lubridate)
holidays <- data.frame(date = ymd("2015-01-01","2015-05-25", "2015-07-04",
                                  "2015-12-25", "2015-11-26", "2015-12-24",
                                  "2015-09-07"),
                       occasion = c("New Year", "Memorial Day",
                                    "Independence Day", "Christmas",
                                    "Thanksgiving", "Christmas Eve",
                                    "Labor Day"),
                       image = c("https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/party-popper_1f389.png",
                                 "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/military-medal_1f396-fe0f.png",
                                 "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/sparkler_1f387.png",
                                 "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/christmas-tree_1f384.png",
                                 "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/turkey_1f983.png",
                                 "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/wrapped-gift_1f381.png",
                                 "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/320/apple/271/construction-worker_1f477.png"
                                 ),
                       image2 = c("https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/party-popper_1f389.png",
                                  "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/military-medal_1f396-fe0f.png",
                                  "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/sparkler_1f387.png",
"https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/christmas-tree_1f384.png", "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/turkey_1f983.png", "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/240/apple/271/wrapped-gift_1f381.png", "~/math241s21/img/full_bernie.png" )) holidays <- left_join(holidays, Births2015) ``` --- ##Context: Adding Holidays ```r glimpse(holidays) ``` ``` ## Rows: 7 ## Columns: 11 ## $ date <date> 2015-01-01, 2015-05-25, 2015-07-04, 2015-12-25, 2015-11… ## $ occasion <chr> "New Year", "Memorial Day", "Independence Day", "Christm… ## $ image <chr> "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.… ## $ image2 <chr> "https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.… ## $ births <dbl> 8068, 7746, 7944, 6515, 7332, 8714, 8127 ## $ wday <ord> Thu, Mon, Sat, Fri, Thu, Thu, Mon ## $ year <dbl> 2015, 2015, 2015, 2015, 2015, 2015, 2015 ## $ month <dbl> 1, 5, 7, 12, 11, 12, 9 ## $ day_of_year <int> 1, 145, 185, 359, 330, 358, 250 ## $ day_of_month <dbl> 1, 25, 4, 25, 26, 24, 7 ## $ day_of_week <dbl> 5, 2, 7, 6, 5, 5, 2 ``` * Let's add some context. --- ```r ggplot(data = Births2015, mapping = aes(x = date, y = births, color = wday)) + geom_point() + geom_point(data = holidays, mapping = aes(x = date, y = births), color = "black", size = 3) ``` <img src="slidesWk2Th_files/figure-html/unnamed-chunk-22-1.png" width="504" /> --- ```r ggplot(data = Births2015, mapping = aes(x = date, y = births, color = wday)) + geom_point() + geom_text(data = holidays, mapping = aes(label = occasion)) ``` <img src="slidesWk2Th_files/figure-html/unnamed-chunk-23-1.png" width="504" /> * Problems? 
---

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  geom_text(data = holidays, mapping = aes(label = occasion),
            show.legend = FALSE)
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-24-1.png" width="504" />

---

```r
library(ggrepel)
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  geom_text_repel(data = holidays, mapping = aes(label = occasion),
                  show.legend = FALSE)
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-25-1.png" width="504" />

---

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point(size = 6, color = "black", data=holidays) +
  geom_point(size = 5, color = "grey90", data=holidays) +
  geom_point()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-26-1.png" width="504" />

---

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point(size = 6, color = "black", data=holidays) +
  geom_point(size = 5, color = "grey90", data=holidays) +
  geom_point() +
  annotate("text", x = as_date("2015-09-01"), y = 7500,
           label = "Holidays", color="black", size=5)
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-27-1.png" width="504" />

---

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point(size = 6, color = "black", data=holidays) +
  geom_point(size = 5, color = "grey90", data=holidays) +
  geom_point() +
  annotate("segment", colour = "black",
           x = as_date("2015-09-01"), xend = holidays$date,
           y = 6800, yend = holidays$births,
           size = 1, alpha = 0.2, arrow = arrow()) +
  annotate("text", x = as_date("2015-09-01"), y = 6600,
           label = "Holidays", color = "black", size = 5)
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-28-1.png" width="504" />

---

```r
# Create a story label
label_data <- data.frame(date = ymd("2015-01-01"),
                         births = max(Births2015$births),
                         label = "The frequency of births on holidays \nfollows weekend \ntrends.")
```

---

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  geom_text(mapping = aes(label = label),
            data = label_data, color = "black",
            vjust = "top", hjust = "left")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-30-1.png" width="504" />

---

```r
library(ggimage)
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  geom_image(data = holidays,
             mapping = aes(image = image, x = date, y = births),
             inherit.aes = FALSE) +
  geom_text(mapping = aes(label = label),
            data = label_data, color = "black",
            vjust = "top", hjust = "left")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-31-1.png" width="504" />

---

```r
library(ggimage)
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  geom_image(data = holidays,
             mapping = aes(image = image2, x = date, y = births),
             inherit.aes = FALSE, size = .05) +
  geom_text(mapping = aes(label = label),
            data = label_data, color = "black",
            vjust = "top", hjust = "left")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-32-1.png" width="504" />

---

### Context

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-33-1.png" width="504" />

And there are lots more ways to annotate your graph (shaded regions, spike lines...).

--

A few notes:

* Don't overdo it!
* Like with selecting a `geom` or a `mapping` or a `scale`, try several out first.

---

## BLS Consumer Expenditure Survey

* New dataset: Last quarter of the Bureau of Labor Statistics Consumer Expenditure Survey.
```r ce <- read_csv("/home/courses/math241s21/Data/ce.csv") glimpse(ce) ``` ``` ## Rows: 6,301 ## Columns: 49 ## $ FINLWT21 <dbl> 25985, 6581, 20208, 18078, 20112, 19907, 11705, 24431, 42859… ## $ FINCBTAX <dbl> 116920, 200, 117000, 0, 2000, 942, 0, 91000, 95000, 40037, 1… ## $ BLS_URBN <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, … ## $ POPSIZE <dbl> 2, 3, 4, 2, 2, 2, 1, 2, 5, 2, 3, 2, 2, 3, 4, 3, 3, 1, 4, 1, … ## $ EDUC_REF <chr> "16", "15", "16", "15", "14", "11", "10", "13", "12", "12", … ## $ EDUCA2 <chr> "15", "15", "13", NA, NA, NA, NA, "15", "15", "14", "12", "1… ## $ AGE_REF <dbl> 63, 50, 47, 37, 51, 63, 77, 37, 51, 64, 26, 59, 81, 51, 67, … ## $ AGE2 <chr> "50", "47", "46", ".", ".", ".", ".", "36", "53", "67", "44"… ## $ SEX_REF <dbl> 1, 1, 2, 1, 2, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, … ## $ SEX2 <dbl> 2, 2, 1, NA, NA, NA, NA, 2, 2, 1, 1, 1, NA, NA, NA, 1, NA, 1… ## $ REF_RACE <dbl> 1, 4, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, … ## $ RACE2 <dbl> 1, 4, 1, NA, NA, NA, NA, 1, 1, 1, 1, 1, NA, NA, NA, 2, NA, 1… ## $ HISP_REF <dbl> 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, … ## $ HISP2 <dbl> 2, 2, 1, NA, NA, NA, NA, 2, 2, 2, 2, 2, NA, NA, NA, 2, NA, 2… ## $ FAM_TYPE <dbl> 3, 4, 1, 8, 9, 9, 8, 3, 1, 1, 3, 1, 8, 9, 8, 5, 9, 4, 8, 3, … ## $ MARITAL1 <dbl> 1, 1, 1, 5, 3, 3, 2, 1, 1, 1, 1, 1, 2, 3, 5, 1, 3, 1, 3, 1, … ## $ REGION <dbl> 4, 4, 3, 4, 4, 3, 4, 1, 3, 2, 1, 4, 1, 3, 3, 3, 2, 1, 2, 4, … ## $ SMSASTAT <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, … ## $ HIGH_EDU <chr> "16", "15", "16", "15", "14", "11", "10", "15", "15", "14", … ## $ EHOUSNGC <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … ## $ TOTEXPCQ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … ## $ FOODCQ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … ## $ TRANSCQ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … ## $ HEALTHCQ <dbl> 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … ## $ ENTERTCQ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … ## $ EDUCACQ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … ## $ TOBACCCQ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, … ## $ STUDFINX <chr> ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", … ## $ IRAX <chr> "1000000", "10000", "0", ".", ".", "0", "0", "15000", ".", "… ## $ CUTENURE <dbl> 1, 1, 1, 1, 1, 2, 4, 1, 1, 2, 1, 2, 2, 2, 2, 4, 1, 1, 1, 4, … ## $ FAM_SIZE <dbl> 4, 6, 2, 1, 2, 2, 1, 5, 2, 2, 4, 2, 1, 2, 1, 4, 2, 4, 1, 3, … ## $ VEHQ <dbl> 3, 5, 0, 4, 2, 0, 0, 2, 4, 2, 3, 2, 1, 3, 1, 2, 4, 4, 0, 2, … ## $ ROOMSQ <chr> "8", "5", "6", "4", "4", "4", "7", "5", "4", "9", "6", "10",… ## $ INC_HRS1 <chr> "40", "40", "40", "44", "40", ".", ".", "40", "40", ".", "40… ## $ INC_HRS2 <chr> "30", "40", "52", ".", ".", ".", ".", "40", "40", ".", "65",… ## $ EARNCOMP <dbl> 3, 2, 2, 1, 4, 7, 8, 2, 2, 8, 2, 8, 8, 7, 8, 2, 7, 3, 1, 2, … ## $ NO_EARNR <dbl> 4, 2, 2, 1, 2, 1, 0, 2, 2, 0, 2, 0, 0, 1, 0, 2, 1, 3, 1, 2, … ## $ OCCUCOD1 <chr> "03", "03", "05", "03", "04", NA, NA, "12", "04", NA, "01", … ## $ OCCUCOD2 <chr> "04", "02", "01", NA, NA, NA, NA, "02", "03", NA, "11", NA, … ## $ STATE <chr> "41", "15", "48", "06", "06", "48", "06", "42", NA, "27", "2… ## $ DIVISION <dbl> 9, 9, 7, 9, 9, 7, 9, 2, NA, 4, 1, 8, 2, 5, 6, 7, 3, 2, 3, 9,… ## $ TOTXEST <dbl> 15452, 11459, 15738, 25978, 588, 0, 0, 7261, 9406, -1414, 14… ## $ CREDFINX <chr> "0", ".", "0", ".", "5", ".", ".", ".", ".", "0", ".", "0", … ## $ CREDITB <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, … ## $ CREDITX <chr> "4000", "5000", "2000", ".", "7000", "1800", ".", "6000", ".… ## $ BUILDING <chr> "01", "01", "01", "02", "08", "01", "01", "01", "01", "01", … ## $ ST_HOUS <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, … ## $ INT_PHON <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, …
## $ INT_HOME <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
```

---

## Issue with Plotting Bigger Datasets

```r
ggplot(data = ce, aes(x = FINCBTAX, y = TOTEXPCQ)) +
  geom_point()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-35-1.png" width="504" />

---

## Issue with Plotting Bigger Datasets

```r
ggplot(data = ce, aes(x = FINCBTAX, y = TOTEXPCQ)) +
  geom_point(alpha = .2)
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-36-1.png" width="504" />

---

## Heatmaps

```r
ggplot(data = ce, aes(x = TOTEXPCQ, y = FINCBTAX)) +
  geom_hex()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-37-1.png" width="504" />

* Useful for seeing density for large datasets.

---

## Handling Transformations

```r
ggplot(data = ce, aes(x = log10(TOTEXPCQ), y = log10(FINCBTAX))) +
  geom_hex()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-38-1.png" width="504" />

* Transform variables directly.

---

## Handling Transformations

```r
ggplot(data = ce, aes(x = TOTEXPCQ, y = FINCBTAX)) +
  geom_hex() +
  scale_x_log10() +
  scale_y_log10()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-39-1.png" width="504" />

* Transform scale.
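---

## Handling Transformations

A minimal sketch of how the two approaches compare, using toy data rather than `ce`. The data frame `toy` and its columns `income` and `spending` are hypothetical. The idea is that both plots place the points in the same positions, while the axis labels differ: log10 units when the variables are transformed, original units when the scales are transformed.

```r
library(ggplot2)

# Toy data (hypothetical), standing in for the ce expenditure data
toy <- data.frame(income = c(10, 100, 1000, 10000),
                  spending = c(5, 40, 300, 2000))

# Option 1: transform the variables inside aes()
p_var <- ggplot(data = toy, aes(x = log10(income), y = log10(spending))) +
  geom_point()

# Option 2: keep the raw variables and transform the scales
p_scale <- ggplot(data = toy, aes(x = income, y = spending)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10()

# Compare the x positions that each plot actually draws
layer_data(p_var)$x
layer_data(p_scale)$x
```

* With either approach, values of zero or below have no finite log10 and are dropped from the plot with a warning.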
---

## ColorBrewer

```r
RColorBrewer::display.brewer.all()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-40-1.png" width="504" />

---

## Color Options: Saturation

```r
ggplot(data = ce, aes(x = log10(TOTEXPCQ), y = log10(FINCBTAX))) +
  geom_hex() +
  scale_fill_distiller(palette = "Purples")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-41-1.png" width="504" />

---

## ColorBrewer YlGn palette

```r
ggplot(data = ce, aes(x = log10(TOTEXPCQ), y = log10(FINCBTAX))) +
  geom_hex() +
  scale_fill_distiller(palette = "YlGn")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-42-1.png" width="504" />

---

## [Viridis Palette](https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html)

```r
library(viridis)
ggplot(data = ce, aes(x = log10(TOTEXPCQ), y = log10(FINCBTAX))) +
  geom_hex() +
  scale_fill_viridis(direction = -1, option = "A")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-43-1.png" width="504" />

---

## [Viridis Palette](https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html)

```r
library(viridis)
ggplot(data = ce, aes(x = log10(TOTEXPCQ), y = log10(FINCBTAX))) +
  geom_hex() +
  scale_fill_viridis(direction = 1, option = "C")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-44-1.png" width="504" />

---

## [Viridis Palette](https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html)

```r
library(viridis)
ggplot(data = ce, aes(x = log10(TOTEXPCQ), y = log10(FINCBTAX))) +
  geom_hex() +
  scale_fill_viridis(direction = 1, option = "D")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-45-1.png" width="504" />

---

## Color Options: Hue

```r
colors()
```

```
## [1] "white" "aliceblue" "antiquewhite"
## [4] "antiquewhite1" "antiquewhite2" "antiquewhite3"
## [7] "antiquewhite4" "aquamarine" "aquamarine1"
## [10] "aquamarine2" "aquamarine3" "aquamarine4"
## [13] "azure" "azure1" "azure2"
## [16] "azure3" "azure4" "beige"
## [19] "bisque" "bisque1" "bisque2"
## [22]
"bisque3" "bisque4" "black" ## [25] "blanchedalmond" "blue" "blue1" ## [28] "blue2" "blue3" "blue4" ## [31] "blueviolet" "brown" "brown1" ## [34] "brown2" "brown3" "brown4" ## [37] "burlywood" "burlywood1" "burlywood2" ## [40] "burlywood3" "burlywood4" "cadetblue" ## [43] "cadetblue1" "cadetblue2" "cadetblue3" ## [46] "cadetblue4" "chartreuse" "chartreuse1" ## [49] "chartreuse2" "chartreuse3" "chartreuse4" ## [52] "chocolate" "chocolate1" "chocolate2" ## [55] "chocolate3" "chocolate4" "coral" ## [58] "coral1" "coral2" "coral3" ## [61] "coral4" "cornflowerblue" "cornsilk" ## [64] "cornsilk1" "cornsilk2" "cornsilk3" ## [67] "cornsilk4" "cyan" "cyan1" ## [70] "cyan2" "cyan3" "cyan4" ## [73] "darkblue" "darkcyan" "darkgoldenrod" ## [76] "darkgoldenrod1" "darkgoldenrod2" "darkgoldenrod3" ## [79] "darkgoldenrod4" "darkgray" "darkgreen" ## [82] "darkgrey" "darkkhaki" "darkmagenta" ## [85] "darkolivegreen" "darkolivegreen1" "darkolivegreen2" ## [88] "darkolivegreen3" "darkolivegreen4" "darkorange" ## [91] "darkorange1" "darkorange2" "darkorange3" ## [94] "darkorange4" "darkorchid" "darkorchid1" ## [97] "darkorchid2" "darkorchid3" "darkorchid4" ## [100] "darkred" "darksalmon" "darkseagreen" ## [103] "darkseagreen1" "darkseagreen2" "darkseagreen3" ## [106] "darkseagreen4" "darkslateblue" "darkslategray" ## [109] "darkslategray1" "darkslategray2" "darkslategray3" ## [112] "darkslategray4" "darkslategrey" "darkturquoise" ## [115] "darkviolet" "deeppink" "deeppink1" ## [118] "deeppink2" "deeppink3" "deeppink4" ## [121] "deepskyblue" "deepskyblue1" "deepskyblue2" ## [124] "deepskyblue3" "deepskyblue4" "dimgray" ## [127] "dimgrey" "dodgerblue" "dodgerblue1" ## [130] "dodgerblue2" "dodgerblue3" "dodgerblue4" ## [133] "firebrick" "firebrick1" "firebrick2" ## [136] "firebrick3" "firebrick4" "floralwhite" ## [139] "forestgreen" "gainsboro" "ghostwhite" ## [142] "gold" "gold1" "gold2" ## [145] "gold3" "gold4" "goldenrod" ## [148] "goldenrod1" "goldenrod2" "goldenrod3" ## [151] 
"goldenrod4" "gray" "gray0" ## [154] "gray1" "gray2" "gray3" ## [157] "gray4" "gray5" "gray6" ## [160] "gray7" "gray8" "gray9" ## [163] "gray10" "gray11" "gray12" ## [166] "gray13" "gray14" "gray15" ## [169] "gray16" "gray17" "gray18" ## [172] "gray19" "gray20" "gray21" ## [175] "gray22" "gray23" "gray24" ## [178] "gray25" "gray26" "gray27" ## [181] "gray28" "gray29" "gray30" ## [184] "gray31" "gray32" "gray33" ## [187] "gray34" "gray35" "gray36" ## [190] "gray37" "gray38" "gray39" ## [193] "gray40" "gray41" "gray42" ## [196] "gray43" "gray44" "gray45" ## [199] "gray46" "gray47" "gray48" ## [202] "gray49" "gray50" "gray51" ## [205] "gray52" "gray53" "gray54" ## [208] "gray55" "gray56" "gray57" ## [211] "gray58" "gray59" "gray60" ## [214] "gray61" "gray62" "gray63" ## [217] "gray64" "gray65" "gray66" ## [220] "gray67" "gray68" "gray69" ## [223] "gray70" "gray71" "gray72" ## [226] "gray73" "gray74" "gray75" ## [229] "gray76" "gray77" "gray78" ## [232] "gray79" "gray80" "gray81" ## [235] "gray82" "gray83" "gray84" ## [238] "gray85" "gray86" "gray87" ## [241] "gray88" "gray89" "gray90" ## [244] "gray91" "gray92" "gray93" ## [247] "gray94" "gray95" "gray96" ## [250] "gray97" "gray98" "gray99" ## [253] "gray100" "green" "green1" ## [256] "green2" "green3" "green4" ## [259] "greenyellow" "grey" "grey0" ## [262] "grey1" "grey2" "grey3" ## [265] "grey4" "grey5" "grey6" ## [268] "grey7" "grey8" "grey9" ## [271] "grey10" "grey11" "grey12" ## [274] "grey13" "grey14" "grey15" ## [277] "grey16" "grey17" "grey18" ## [280] "grey19" "grey20" "grey21" ## [283] "grey22" "grey23" "grey24" ## [286] "grey25" "grey26" "grey27" ## [289] "grey28" "grey29" "grey30" ## [292] "grey31" "grey32" "grey33" ## [295] "grey34" "grey35" "grey36" ## [298] "grey37" "grey38" "grey39" ## [301] "grey40" "grey41" "grey42" ## [304] "grey43" "grey44" "grey45" ## [307] "grey46" "grey47" "grey48" ## [310] "grey49" "grey50" "grey51" ## [313] "grey52" "grey53" "grey54" ## [316] "grey55" "grey56" "grey57" ## 
[319] "grey58" "grey59" "grey60" ## [322] "grey61" "grey62" "grey63" ## [325] "grey64" "grey65" "grey66" ## [328] "grey67" "grey68" "grey69" ## [331] "grey70" "grey71" "grey72" ## [334] "grey73" "grey74" "grey75" ## [337] "grey76" "grey77" "grey78" ## [340] "grey79" "grey80" "grey81" ## [343] "grey82" "grey83" "grey84" ## [346] "grey85" "grey86" "grey87" ## [349] "grey88" "grey89" "grey90" ## [352] "grey91" "grey92" "grey93" ## [355] "grey94" "grey95" "grey96" ## [358] "grey97" "grey98" "grey99" ## [361] "grey100" "honeydew" "honeydew1" ## [364] "honeydew2" "honeydew3" "honeydew4" ## [367] "hotpink" "hotpink1" "hotpink2" ## [370] "hotpink3" "hotpink4" "indianred" ## [373] "indianred1" "indianred2" "indianred3" ## [376] "indianred4" "ivory" "ivory1" ## [379] "ivory2" "ivory3" "ivory4" ## [382] "khaki" "khaki1" "khaki2" ## [385] "khaki3" "khaki4" "lavender" ## [388] "lavenderblush" "lavenderblush1" "lavenderblush2" ## [391] "lavenderblush3" "lavenderblush4" "lawngreen" ## [394] "lemonchiffon" "lemonchiffon1" "lemonchiffon2" ## [397] "lemonchiffon3" "lemonchiffon4" "lightblue" ## [400] "lightblue1" "lightblue2" "lightblue3" ## [403] "lightblue4" "lightcoral" "lightcyan" ## [406] "lightcyan1" "lightcyan2" "lightcyan3" ## [409] "lightcyan4" "lightgoldenrod" "lightgoldenrod1" ## [412] "lightgoldenrod2" "lightgoldenrod3" "lightgoldenrod4" ## [415] "lightgoldenrodyellow" "lightgray" "lightgreen" ## [418] "lightgrey" "lightpink" "lightpink1" ## [421] "lightpink2" "lightpink3" "lightpink4" ## [424] "lightsalmon" "lightsalmon1" "lightsalmon2" ## [427] "lightsalmon3" "lightsalmon4" "lightseagreen" ## [430] "lightskyblue" "lightskyblue1" "lightskyblue2" ## [433] "lightskyblue3" "lightskyblue4" "lightslateblue" ## [436] "lightslategray" "lightslategrey" "lightsteelblue" ## [439] "lightsteelblue1" "lightsteelblue2" "lightsteelblue3" ## [442] "lightsteelblue4" "lightyellow" "lightyellow1" ## [445] "lightyellow2" "lightyellow3" "lightyellow4" ## [448] "limegreen" "linen" "magenta" 
## [451] "magenta1" "magenta2" "magenta3"
## [454] "magenta4" "maroon" "maroon1"
## [457] "maroon2" "maroon3" "maroon4"
## [460] "mediumaquamarine" "mediumblue" "mediumorchid"
## [463] "mediumorchid1" "mediumorchid2" "mediumorchid3"
## [466] "mediumorchid4" "mediumpurple" "mediumpurple1"
## [469] "mediumpurple2" "mediumpurple3" "mediumpurple4"
## [472] "mediumseagreen" "mediumslateblue" "mediumspringgreen"
## [475] "mediumturquoise" "mediumvioletred" "midnightblue"
## [478] "mintcream" "mistyrose" "mistyrose1"
## [481] "mistyrose2" "mistyrose3" "mistyrose4"
## [484] "moccasin" "navajowhite" "navajowhite1"
## [487] "navajowhite2" "navajowhite3" "navajowhite4"
## [490] "navy" "navyblue" "oldlace"
## [493] "olivedrab" "olivedrab1" "olivedrab2"
## [496] "olivedrab3" "olivedrab4" "orange"
## [499] "orange1" "orange2" "orange3"
## [502] "orange4" "orangered" "orangered1"
## [505] "orangered2" "orangered3" "orangered4"
## [508] "orchid" "orchid1" "orchid2"
## [511] "orchid3" "orchid4" "palegoldenrod"
## [514] "palegreen" "palegreen1" "palegreen2"
## [517] "palegreen3" "palegreen4" "paleturquoise"
## [520] "paleturquoise1" "paleturquoise2" "paleturquoise3"
## [523] "paleturquoise4" "palevioletred" "palevioletred1"
## [526] "palevioletred2" "palevioletred3" "palevioletred4"
## [529] "papayawhip" "peachpuff" "peachpuff1"
## [532] "peachpuff2" "peachpuff3" "peachpuff4"
## [535] "peru" "pink" "pink1"
## [538] "pink2" "pink3" "pink4"
## [541] "plum" "plum1" "plum2"
## [544] "plum3" "plum4" "powderblue"
## [547] "purple" "purple1" "purple2"
## [550] "purple3" "purple4" "red"
## [553] "red1" "red2" "red3"
## [556] "red4" "rosybrown" "rosybrown1"
## [559] "rosybrown2" "rosybrown3" "rosybrown4"
## [562] "royalblue" "royalblue1" "royalblue2"
## [565] "royalblue3" "royalblue4" "saddlebrown"
## [568] "salmon" "salmon1" "salmon2"
## [571] "salmon3" "salmon4" "sandybrown"
## [574] "seagreen" "seagreen1" "seagreen2"
## [577] "seagreen3" "seagreen4" "seashell"
## [580] "seashell1" "seashell2" "seashell3"
## [583] "seashell4" "sienna" "sienna1"
## [586] "sienna2" "sienna3" "sienna4"
## [589] "skyblue" "skyblue1" "skyblue2"
## [592] "skyblue3" "skyblue4" "slateblue"
## [595] "slateblue1" "slateblue2" "slateblue3"
## [598] "slateblue4" "slategray" "slategray1"
## [601] "slategray2" "slategray3" "slategray4"
## [604] "slategrey" "snow" "snow1"
## [607] "snow2" "snow3" "snow4"
## [610] "springgreen" "springgreen1" "springgreen2"
## [613] "springgreen3" "springgreen4" "steelblue"
## [616] "steelblue1" "steelblue2" "steelblue3"
## [619] "steelblue4" "tan" "tan1"
## [622] "tan2" "tan3" "tan4"
## [625] "thistle" "thistle1" "thistle2"
## [628] "thistle3" "thistle4" "tomato"
## [631] "tomato1" "tomato2" "tomato3"
## [634] "tomato4" "turquoise" "turquoise1"
## [637] "turquoise2" "turquoise3" "turquoise4"
## [640] "violet" "violetred" "violetred1"
## [643] "violetred2" "violetred3" "violetred4"
## [646] "wheat" "wheat1" "wheat2"
## [649] "wheat3" "wheat4" "whitesmoke"
## [652] "yellow" "yellow1" "yellow2"
## [655] "yellow3" "yellow4" "yellowgreen"
```

---

## Color Options: Hue

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  # Draw 7 named colors at random, one per weekday; the palette changes on each run
  scale_color_manual(name = "Day of the Week",
                     values = sample(colors(), 7))
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-47-1.png" width="504" />

---

## Color Options: [Hue](http://www.colourlovers.com/palette/694737/Thought_Provoking)

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  # Supply one hex code per weekday
  scale_color_manual(name = "Day of the Week",
                     values = c("#ECD078", "#D95B43", "#C02942", "#542437",
                                "#53777A", "#8A9B0F", "#6A4A3C"))
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-48-1.png" width="504" />

---

## ColorBrewer Hue

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  # Use a built-in ColorBrewer qualitative palette
  scale_color_brewer(name = "Day of the Week",
                     palette = "Dark2")
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-49-1.png" width="504" />

---

## [Themes](https://r-graphics.org/recipe-appearance-theme-modify)

```r
#?theme
```

* Can override specific aspects of the theme
    + EX: `+ theme(legend.position = "bottom")`

---

## Themes

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  # Style each axis title and each set of tick labels separately
  theme(axis.title.x = element_text(color = "#C1D82F", size = 20),
        axis.title.y = element_text(color = "#00857D", size = 25),
        axis.text.x = element_text(color = "#C1D82F", size = 12),
        axis.text.y = element_text(color = "#FF7401", size = 18))
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-51-1.png" width="504" />

---

### [Built-in Themes](https://ggplot2.tidyverse.org/reference/ggtheme.html)

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  theme_bw()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-52-1.png" width="504" />

---

### Built-in Themes

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  theme_dark()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-53-1.png" width="504" />

---

### Built-in Themes

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  theme_void()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-54-1.png" width="504" />

---

### Additional Themes Package: [ggthemes](https://cran.r-project.org/web/packages/ggthemes/vignettes/ggthemes.html)

```r
library(ggthemes)

ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  theme_economist()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-55-1.png" width="504" />

---

### Additional Themes Package

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  theme_fivethirtyeight()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-56-1.png" width="504" />

---

### Additional Themes Package

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point() +
  theme_wsj()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-57-1.png" width="504" />

---

### Controlling the size of your figures

R chunk options:

* `fig.asp`: aspect ratio (height / width)
* `fig.width`: width in inches
* `fig.height`: height in inches

```r
ggplot(data = Births2015,
       mapping = aes(x = date, y = births, color = wday)) +
  geom_point()
```

<img src="slidesWk2Th_files/figure-html/unnamed-chunk-58-1.png" width="504" />

---

## To Do Now

* Practice accessing and downloading the class lab assignments.
    + Shared folder: `/home/courses/math241s21`
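
One possible way to practice this from the R console is sketched below. It assumes you are working on the course RStudio Server, where the shared folder exists. The file name `lab2.Rmd` and the helper `copy_lab()` are hypothetical and used only for illustration; check the folder listing for the real file names.

```r
# Shared course folder named on the slide
course_dir <- "/home/courses/math241s21"

# Hypothetical file name, used only for illustration
lab_file <- "lab2.Rmd"

# Hypothetical helper: copy one file from a source folder into a destination
# folder. Returns TRUE if the file was copied and FALSE otherwise
# (including when the source is missing or the destination already exists).
copy_lab <- function(file, from, to = path.expand("~")) {
  src <- file.path(from, file)
  if (!file.exists(src)) {
    message("Not found: ", src)
    return(FALSE)
  }
  # overwrite = FALSE protects any work already saved under the same name
  file.copy(src, file.path(to, file), overwrite = FALSE)
}

# See what is in the shared folder (empty when not on the course server)
list.files(course_dir)

# Copy the lab into your home directory
copy_lab(lab_file, from = course_dir)
```

Copying with `overwrite = FALSE` is a deliberate choice here: running the sketch a second time will not replace a lab you have already started editing. To get a file onto your own computer, the Files pane in RStudio Server also has an export option under its "More" menu.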