class: center, middle

### Dates/Times with `lubridate` and Factors with `forcats`

<img src="img/hero_wall_pink.png" width="800px"/>

### Kelly McConville

.large[Math 241 | Week 8 | Spring 2021]

---

## Announcements/Reminders

* Mini Project 2 is due next Thursday.
* No new lab this week!

---

## A Few Thoughts on R Programming

* For what you want to do, start with the minimum viable product.
* Think about your inputs and outputs.
    + Class?
    + Size?
    + Indexing?
* Sometimes boxed mac and cheese is better than homemade. Sometimes homemade is better.
* Reduce redundancies with functions and iteration.
* Good names can be as helpful as good comments.
* Consider how you are handling missingness.
* It's okay to start with smelly, working code.
    + And then refactor.

---

## Why do we need to talk about dates and times?

**Question:** When did the crashes happen in Portland in 2018?

```r
library(tidyverse)
crashes <- read_csv("/home/courses/math241s21/Data/pdx_crash_2018_CRASH.csv")
crashes %>%
  count(CRASH_DT) %>%
  ggplot(mapping = aes(x = CRASH_DT, y = n)) +
  geom_point()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-1-1.png" width="288" />

---

## Dates

```r
class(crashes$CRASH_DT)
```

```
## [1] "character"
```

--

What class should it be?

---

## Converting Strings to Dates

* Identify the order of year, month, day, hour, minute, and second in the strings.
* Pick the `lubridate` function that replicates that order.

```r
sample(crashes$CRASH_DT, size = 10)
```

```
##  [1] "09/07/18 00:00:00" "03/28/18 00:00:00" "08/19/18 00:00:00"
##  [4] "02/25/18 00:00:00" "06/26/18 00:00:00" "11/02/18 00:00:00"
##  [7] "09/24/18 00:00:00" "08/08/18 00:00:00" "11/17/18 00:00:00"
## [10] "08/28/18 00:00:00"
```

```r
library(lubridate)
crashes <- crashes %>%
  mutate(CRASH_DT = mdy_hms(CRASH_DT),
         CRASH_D = date(CRASH_DT))
class(crashes$CRASH_DT)
```

```
## [1] "POSIXct" "POSIXt"
```

```r
class(crashes$CRASH_D)
```

```
## [1] "Date"
```

---

## Why do we need to talk about dates and times?

**Question:** When did the crashes happen in Portland in 2018?

```r
crashes %>%
  count(CRASH_D) %>%
  ggplot(mapping = aes(x = CRASH_D, y = n)) +
  geom_point()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-4-1.png" width="288" />

---

## What else makes dates and times unique?

--

* Minutes have 60 seconds. (Well, some have 61.)
* Not all years have 365 days.
* Daylight Saving Time cost us an hour on Sunday, but not folks in Arizona.

---

## Let's Look at [Portland's Biketown Data](https://www.biketownpdx.com/system-data)

```r
biketown <- read_csv("/home/courses/math141f19/Data/biketown_2017_07_09.csv") %>%
  filter(Distance_Miles < 1000)
biketown_dt <- biketown %>%
  select(StartDate, StartTime, EndDate, EndTime, Distance_Miles, BikeID)
glimpse(biketown_dt)
```

```
## Rows: 134,838
## Columns: 6
## $ StartDate      <chr> "7/1/2017", "7/1/2017", "7/1/2017", "7/1/2017", "7/1/20…
## $ StartTime      <time> 00:00:00, 00:00:00, 00:00:00, 00:01:00, 00:03:00, 00:0…
## $ EndDate        <chr> "7/1/2017", "7/1/2017", "7/1/2017", "7/1/2017", "7/1/20…
## $ EndTime        <time> 00:06:00, 00:16:00, 00:02:00, 00:33:00, 00:06:00, 00:0…
## $ Distance_Miles <dbl> 0.55, 2.03, 0.17, 2.75, 0.40, 0.40, 5.08, 0.95, 2.39, 2…
## $ BikeID         <dbl> 7375, 6191, 6321, 6434, 6850, 6420, 6593, 6160, 7380, 6…
```

---

## Let's Look at [Portland's Biketown Data](https://www.biketownpdx.com/system-data)

* Fix the class of the date columns.
* Create date-time columns.

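
A minimal sketch of these two steps on made-up data. The `rides` data frame and its values are hypothetical, and the times are stored as strings here rather than as `<time>` values:

```r
library(lubridate)

# Hypothetical toy data laid out like the Biketown columns
rides <- data.frame(
  StartDate = c("7/1/2017", "7/23/2017"),
  StartTime = c("00:03:00", "13:44:00"),
  stringsAsFactors = FALSE
)

# Step 1: month/day/year strings become Dates
rides$StartDate <- mdy(rides$StartDate)

# Step 2: paste date and time together, then parse as a date-time
rides$StartDateTime <- ymd_hms(paste(rides$StartDate, rides$StartTime))

class(rides$StartDate)
class(rides$StartDateTime)
```

The Biketown version of the same pattern:
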
```r
library(lubridate)
biketown_dt <- biketown_dt %>%
  mutate(StartDate = mdy(StartDate), EndDate = mdy(EndDate)) %>%
  mutate(StartDateTime = ymd_hms(paste(StartDate, StartTime, sep = " ")),
         EndDateTime = ymd_hms(paste(EndDate, EndTime, sep = " ")))
glimpse(biketown_dt)
```

```
## Rows: 134,838
## Columns: 8
## $ StartDate      <date> 2017-07-01, 2017-07-01, 2017-07-01, 2017-07-01, 2017-0…
## $ StartTime      <time> 00:00:00, 00:00:00, 00:00:00, 00:01:00, 00:03:00, 00:0…
## $ EndDate        <date> 2017-07-01, 2017-07-01, 2017-07-01, 2017-07-01, 2017-0…
## $ EndTime        <time> 00:06:00, 00:16:00, 00:02:00, 00:33:00, 00:06:00, 00:0…
## $ Distance_Miles <dbl> 0.55, 2.03, 0.17, 2.75, 0.40, 0.40, 5.08, 0.95, 2.39, 2…
## $ BikeID         <dbl> 7375, 6191, 6321, 6434, 6850, 6420, 6593, 6160, 7380, 6…
## $ StartDateTime  <dttm> 2017-07-01 00:00:00, 2017-07-01 00:00:00, 2017-07-01 0…
## $ EndDateTime    <dttm> 2017-07-01 00:06:00, 2017-07-01 00:16:00, 2017-07-01 0…
```

---

## Grabbing Components

```r
biketown_dt$StartDateTime[40008]
```

```
## [1] "2017-07-23 13:44:00 UTC"
```

```r
year(biketown_dt$StartDateTime[40008])
```

```
## [1] 2017
```

```r
month(biketown_dt$StartDateTime[40008], label = TRUE)
```

```
## [1] Jul
## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
```

```r
day(biketown_dt$StartDateTime[40008])
```

```
## [1] 23
```

---

## Grabbing Components

```r
week(biketown_dt$StartDateTime[40008])
```

```
## [1] 30
```

```r
wday(biketown_dt$StartDateTime[40008], label = TRUE)
```

```
## [1] Sun
## Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat
```

```r
hour(biketown_dt$StartDateTime[40008])
```

```
## [1] 13
```

```r
minute(biketown_dt$StartDateTime[40008])
```

```
## [1] 44
```

---

## Grabbing Components

```r
ggplot(data = biketown_dt,
       mapping = aes(month(StartDateTime, label = TRUE))) +
  geom_bar()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-9-1.png" width="360" />

---

## Grabbing Components

```r
ggplot(data = biketown_dt,
       mapping = aes(wday(StartDateTime, label = TRUE))) +
  geom_bar()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-10-1.png" width="360" />

---

## Grabbing Components

```r
ggplot(data = biketown_dt, mapping = aes(hour(StartDateTime))) +
  geom_bar()
ggplot(data = biketown_dt, mapping = aes(hour(EndDateTime))) +
  geom_bar()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-11-1.png" width="288" /><img src="slidesWk8Th_files/figure-html/unnamed-chunk-11-2.png" width="288" />

---

## Grabbing Components

```r
biketown_dt %>%
  mutate(hour = hour(StartDateTime),
         month = month(StartDateTime, label = TRUE)) %>%
  group_by(hour, month) %>%
  summarise(mean_dist = mean(Distance_Miles, na.rm = TRUE)) %>%
  ggplot(mapping = aes(x = hour, y = mean_dist, color = month)) +
  geom_line(size = 2)
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-12-1.png" width="360" />

---

# And if you are in R and want to know the date/time:

```r
today()
```

```
## [1] "2021-03-18"
```

```r
now()
```

```
## [1] "2021-03-18 08:14:27 PDT"
```

---

class: inverse, middle, center

## Topic Shift!

---

## Motivation: Imposing Structure on Categorical Variables

```r
library(pdxTrees)
pdxTrees <- get_pdxTrees_parks()
five_most_common <- c("Douglas-Fir", "Norway Maple", "Western Redcedar",
                      "Northern Red Oak", "Pin Oak")
pdxCommon <- pdxTrees %>%
  filter(Common_Name %in% five_most_common)
```

---

## Motivation: Imposing Structure on Categorical Variables

```r
ggplot(data = pdxCommon, mapping = aes(x = Common_Name)) +
  geom_bar() +
  coord_flip()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-15-1.png" width="360" />

--

How might we want to restructure this graph?

---

```r
levels(pdxCommon$Common_Name)
```

```
## NULL
```

```r
class(pdxCommon$Common_Name)
```

```
## [1] "character"
```

```r
pdxCommon <- pdxCommon %>%
  mutate(Common_Name = factor(Common_Name))
levels(pdxCommon$Common_Name)
```

```
## [1] "Douglas-Fir"      "Northern Red Oak" "Norway Maple"     "Pin Oak"
## [5] "Western Redcedar"
```

```r
class(pdxCommon$Common_Name)
```

```
## [1] "factor"
```

* What is the order of the levels?

---

## What Are the Classes?

```r
pdxCommon$Common_Name %>%
  fct_unique() %>%
  unclass()
```

```
## [1] 1 2 3 4 5
## attr(,"levels")
## [1] "Douglas-Fir"      "Northern Red Oak" "Norway Maple"     "Pin Oak"
## [5] "Western Redcedar"
```

```r
unique(pdxCommon$Common_Name)
```

```
## [1] Douglas-Fir      Northern Red Oak Norway Maple     Pin Oak
## [5] Western Redcedar
## 5 Levels: Douglas-Fir Northern Red Oak Norway Maple ... Western Redcedar
```

---

# Simple Frequency

```r
pdxCommon$Common_Name %>%
  fct_count()
```

```
## # A tibble: 5 x 2
##   f                    n
##   <fct>            <int>
## 1 Douglas-Fir       6783
## 2 Northern Red Oak   736
## 3 Norway Maple      1502
## 4 Pin Oak            619
## 5 Western Redcedar   964
```

```r
count(pdxCommon, Common_Name) %>%
  arrange(desc(n))
```

```
## # A tibble: 5 x 2
##   Common_Name          n
##   <fct>            <int>
## 1 Douglas-Fir       6783
## 2 Norway Maple      1502
## 3 Western Redcedar   964
## 4 Northern Red Oak   736
## 5 Pin Oak            619
```

---

## Reorder the Levels

* Add the `levels` argument

```r
pdxCommon <- pdxCommon %>%
  mutate(Common_Name = factor(Common_Name, levels = five_most_common))
levels(pdxCommon$Common_Name)
```

```
## [1] "Douglas-Fir"      "Norway Maple"     "Western Redcedar" "Northern Red Oak"
## [5] "Pin Oak"
```

---

## Reorder the Levels

* Order levels by when they show up in the dataset

```r
pdxCommon <- pdxCommon %>%
  mutate(Common_Name = fct_inorder(Common_Name))
levels(pdxCommon$Common_Name)
```

```
## [1] "Douglas-Fir"      "Northern Red Oak" "Norway Maple"     "Pin Oak"
## [5] "Western Redcedar"
```

---

## Reorder the Levels

+ Note: This code didn't permanently change the order in `pdxCommon`.
+ Why?

```r
pdxCommon %>%
  mutate(Common_Name = fct_infreq(Common_Name)) %>%
  ggplot(mapping = aes(Common_Name)) +
  geom_bar() +
  coord_flip()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-21-1.png" width="360" />

---

## Reorder the Levels

```r
pdxCommon %>%
  mutate(Common_Name = fct_infreq(Common_Name),
         Common_Name = fct_rev(Common_Name)) %>%
  ggplot(mapping = aes(Common_Name)) +
  geom_bar() +
  coord_flip()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-22-1.png" width="360" />

---

## If You Love the Pipe...

```r
pdxCommon %>%
  mutate(Common_Name = fct_infreq(Common_Name) %>% fct_rev()) %>%
  ggplot(mapping = aes(Common_Name)) +
  geom_bar() +
  coord_flip()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-23-1.png" width="360" />

---

## Reorder the Levels

* You can also relevel manually after the fact.

```r
pdxCommon %>%
  mutate(Common_Name = fct_relevel(Common_Name, five_most_common)) %>%
  ggplot(mapping = aes(x = Common_Name)) +
  geom_bar() +
  coord_flip()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-24-1.png" width="360" />

---

## Reorder the Levels

* Maybe I just want to bring one or two categories to the front.

```r
pdxCommon %>%
  mutate(Common_Name = fct_relevel(Common_Name, "Norway Maple", "Pin Oak")) %>%
  ggplot(mapping = aes(x = Common_Name)) +
  geom_bar() +
  coord_flip()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-25-1.png" width="360" />

---

## What Have We Wrangled Here?

```r
DBH_by_name <- pdxCommon %>%
  group_by(Common_Name) %>%
  # Mean DBH with rough bounds: mean +/- 2 standard errors
  summarize(mean_DBH = mean(DBH),
            lb_DBH = mean_DBH - 2 * sd(DBH) / sqrt(n()),
            ub_DBH = mean_DBH + 2 * sd(DBH) / sqrt(n()))
DBH_by_name
```

```
## # A tibble: 5 x 4
##   Common_Name      mean_DBH lb_DBH ub_DBH
##   <fct>               <dbl>  <dbl>  <dbl>
## 1 Douglas-Fir          29.6   29.3   29.8
## 2 Northern Red Oak     29.4   28.3   30.5
## 3 Norway Maple         20.3   19.9   20.8
## 4 Pin Oak              25.6   24.8   26.4
## 5 Western Redcedar     18.1   17.3   18.9
```

---

## Reordering by Another Variable

* How might we want to reorder `Common_Name`?

```r
ggplot(data = DBH_by_name, mapping = aes(y = mean_DBH, x = Common_Name)) +
  geom_point() +
  geom_errorbar(aes(ymin = lb_DBH, ymax = ub_DBH), width = 0.4)
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-27-1.png" width="360" />

---

## Reordering by Another Variable

```r
thing <- pdxCommon %>%
  mutate(Common_Name = fct_reorder(Common_Name, DBH))
levels(thing$Common_Name)
```

```
## [1] "Western Redcedar" "Norway Maple"     "Pin Oak"          "Northern Red Oak"
## [5] "Douglas-Fir"
```

```r
DBH_by_name %>%
  mutate(Common_Name = fct_reorder(Common_Name, -mean_DBH)) %>%
  ggplot(mapping = aes(y = mean_DBH, x = Common_Name)) +
  geom_point() +
  geom_errorbar(aes(ymin = lb_DBH, ymax = ub_DBH), width = 0.4)
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-28-1.png" width="360" />

---

## Reordering by Another Variable

```r
ggplot(data = pdxCommon,
       mapping = aes(x = DBH, y = Total_Annual_Services, color = Condition)) +
  geom_smooth()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-29-1.png" width="360" />

---

## Reordering by Another Variable

```r
pdxCommon %>%
  mutate(Condition = fct_reorder2(Condition, DBH, Total_Annual_Services)) %>%
  ggplot(mapping = aes(x = DBH, y = Total_Annual_Services, color = Condition)) +
  geom_smooth()
```

<img src="slidesWk8Th_files/figure-html/unnamed-chunk-30-1.png" width="360" />

---

## Recode

* How might we want to change these categories?

```r
levels(pdxCommon$Common_Name)
```

```
## [1] "Douglas-Fir"      "Northern Red Oak" "Norway Maple"     "Pin Oak"
## [5] "Western Redcedar"
```

---

## Recode

```r
pdxCommon <- pdxCommon %>%
  mutate(Common_Name = fct_recode(Common_Name, "Douglas Fir" = "Douglas-Fir"))
count(pdxCommon, Common_Name)
```

```
## # A tibble: 5 x 2
##   Common_Name          n
##   <fct>            <int>
## 1 Douglas Fir       6783
## 2 Northern Red Oak   736
## 3 Norway Maple      1502
## 4 Pin Oak            619
## 5 Western Redcedar   964
```

---

## Collapsing Levels

```r
pdxCommon <- pdxCommon %>%
  mutate(Common_Name2 = fct_collapse(Common_Name,
                                     Oak = c("Northern Red Oak", "Pin Oak")))
count(pdxCommon, Common_Name2)
```

```
## # A tibble: 4 x 2
##   Common_Name2         n
##   <fct>            <int>
## 1 Douglas Fir       6783
## 2 Oak               1355
## 3 Norway Maple      1502
## 4 Western Redcedar   964
```

---

## Dropping Unused Levels

```r
pdxCommon <- pdxTrees %>%
  mutate(Common_Name = factor(Common_Name)) %>%
  filter(Common_Name %in% five_most_common)
length(levels(pdxCommon$Common_Name))
```

```
## [1] 304
```

```r
levels(pdxCommon$Common_Name)
```

```
##   [1] "Accolade Elm"
##   [2] "Alaska Yellow-Cedar"
##   [3] "Aleppo Pine"
##   [4] "Allegheny Serviceberry"
##   [5] "American Beech"
##   [6] "American Elm"
##   [7] "American Hophornbeam"
##   [8] "American Hornbeam, Blue Beech"
##   [9] "American Linden"
##  [10] "American Persimmon"
##  [11] "American Smoketree"
##  [12] "American Sycamore"
##  [13] "American Yellowwood"
##  [14] "Amur Cork Tree"
##  [15] "Amur Maackia"
##  [16] "Amur Maple"
##  [17] "Apple (Mado)"
##  [18] "Arborvitae, Eastern Arborvitae, Northern White-Cedar"
##  [19] "Ash"
##  [20] "Ashe's Magnolia, Dwarf Bigleaf Magnolia"
##  [21] "Asian Pear, Sand Pear"
##  [22] "Austrian Black Pine"
##  [23] "Bald Cypress"
##  [24] "Balkan Maple"
##  [25] "Bigleaf Maple"
##  [26] "Bigleaf Snowbell, Fragrant Snowbell"
##  [27] "Bird Cherry"
##  [28] "Black Cottonwood"
##  [29] "Black Hawthorn"
##  [30] "Black Locust"
##  [31] "Black Oak"
##  [32] "Black Poplar, Lombardy Poplar"
##  [33] "Black Tupelo"
##  [34] "Black Walnut"
##  [35] "Blue Atlas Cedar"
##  [36] "Box Elder"
##  [37] "Boxleaf Azara"
##  [38] "Brewer Spruce"
##  [39] "Buddhist Pine, Yew Pine"
##  [40] "Bur Oak"
##  [41] "Butternut"
##  [42] "California Buckeye"
##  [43] "Campbell's Magnolia"
##  [44] "Camperdown Elm"
##  [45] "Camphor Tree"
##  [46] "Cascara Buckthorn"
##  [47] "Caucasian Fir, Nordmann Fir"
##  [48] "Cedar Of Lebanon"
##  [49] "Cherry"
##  [50] "China Fir"
##  [51] "Chinese Evergreen Magnolia, Delavay's Magnolia"
##  [52] "Chinese Fringe Tree"
##  [53] "Chinese Horsechestnut"
##  [54] "Chinese Lacebark Elm"
##  [55] "Chinese Paper Birch"
##  [56] "Chinese Parasol Tree"
##  [57] "Chinese Pistache"
##  [58] "Chinese Tupelo"
##  [59] "Chinkapin Oak"
##  [60] "Chitalpa"
##  [61] "Chusan Palm Or Windmill Palm"
##  [62] "Coast Live Oak"
##  [63] "Coast Redwood"
##  [64] "Colorado Blue Spruce"
##  [65] "Common Fig"
##  [66] "Common Hackberry"
##  [67] "Common Horsechestnut"
##  [68] "Cork Oak"
##  [69] "Corkscrew Willow"
##  [70] "Cornelian Cherry"
##  [71] "Coulter Pine"
##  [72] "Crape Myrtle"
##  [73] "Cucumber Magnolia"
##  [74] "Cypress"
##  [75] "Daphniphyllum"
##  [76] "Dawn Redwood"
##  [77] "Deodar Cedar"
##  [78] "Dogwood"
##  [79] "Douglas-Fir"
##  [80] "Dove Or Handkerchief Tree"
##  [81] "Eastern Cotonwood"
##  [82] "Eastern Dogwood"
##  [83] "Eastern Redbud"
##  [84] "Eastern White Oak"
##  [85] "Eastern White Pine"
##  [86] "Elm"
##  [87] "Elm Hybrid"
##  [88] "English Hawthorn, Common Hawthorn"
##  [89] "English Holly"
##  [90] "English Laurel, Cherry Laurel"
##  [91] "English Oak"
##  [92] "English Walnut"
##  [93] "English Yew"
##  [94] "European Ash"
##  [95] "European Aspen"
##  [96] "European Beech"
##  [97] "European Hornbeam"
##  [98] "European Larch"
##  [99] "European Mountain Ash"
## [100] "European Pear (Including Cultivars)"
## [101] "European Silver Fir"
## [102] "European White Birch"
## [103] "Evergreen Dogwood"
## [104] "False Ginseng"
## [105] "Falsecypress"
## [106] "Field Elm"
## [107] "Fir"
## [108] "Flowering Ash"
## [109] "Flowering Pear"
## [110] "Flowering Plum"
## [111] "Fragrant Olive"
## [112] "Franklin Tree, Franklinia"
## [113] "Fullmoon Maple"
## [114] "Giant Dogwood"
## [115] "Giant Sequoia"
## [116] "Ginkgo"
## [117] "Golden Chinkapin"
## [118] "Golden Larch"
## [119] "Goldenchain Tree"
## [120] "Goldenrain Tree"
## [121] "Grand Fir"
## [122] "Green Ash"
## [123] "Guangdong Pine"
## [124] "Hainan White Pine, Kwangtung Pine"
## [125] "Hardy Chinese Rubber Tree, Eucommia"
## [126] "Harlequin Glory Bower"
## [127] "Hartweg's Pine"
## [128] "Hazel Or Hazelnut"
## [129] "Hazel, Filbert"
## [130] "Hedge Maple"
## [131] "Henry Anise Tree, Henry's Star Anise"
## [132] "Hiba Arborvitae, Thujopsis"
## [133] "Himalayan Spruce"
## [134] "Himalayan White Pine"
## [135] "Himalayan Whitebarked Birch"
## [136] "Hinoki Falsecypress"
## [137] "Holly Oak, Holm Oak"
## [138] "Honey Locust"
## [139] "Hornbeam"
## [140] "Horsechestnuts, Buckeyes"
## [141] "Hungarian Oak, Italian Oak"
## [142] "Incense Cedar"
## [143] "Italian Cypress"
## [144] "Japanese Apricot"
## [145] "Japanese Bay Tree"
## [146] "Japanese Black Pine"
## [147] "Japanese Blueberry Tree"
## [148] "Japanese Cedar"
## [149] "Japanese Emperor Oak, Daimyo Oak"
## [150] "Japanese Flowering Cherry"
## [151] "Japanese Hemlock"
## [152] "Japanese Hornbeam"
## [153] "Japanese Maple"
## [154] "Japanese Pagoda Tree, Chinese Scholar Tree"
## [155] "Japanese Persimmon, Asian Persimmon"
## [156] "Japanese Red Pine"
## [157] "Japanese Snowbell"
## [158] "Japanese Stewartia"
## [159] "Japanese Tree Lilac"
## [160] "Japanese White Pine"
## [161] "Japanese Zelkova"
## [162] "Jeffrey Pine"
## [163] "Juniper"
## [164] "Katsura"
## [165] "Kentucky Coffeetree"
## [166] "Korean Fir"
## [167] "Kousa Dogwood"
## [168] "Lacebark Pine"
## [169] "Larch"
## [170] "Largeleaf Linden"
## [171] "Lavalle Hawthorn"
## [172] "Leatherwood"
## [173] "Leuteneggeri Hybrid Fir"
## [174] "Leyland Cypress"
## [175] "Limber Pine"
## [176] "Littleleaf Linden"
## [177] "London Plane Tree"
## [178] "Longleaf Pine"
## [179] "Loquat"
## [180] "Lusterleaf Holly"
## [181] "Magnolia"
## [182] "Martin's Magnolia"
## [183] "Mexican Pinyon"
## [184] "Midland Hawthorn, English Hawthorn"
## [185] "Monpellier Maple"
## [186] "Mountain Ash Or Whitebeam"
## [187] "Mountain Hemlock"
## [188] "Mountain Silverbell"
## [189] "Mu Gua Hong"
## [190] "Mulberry"
## [191] "Narrowleaf Ash (Includes 'Raywood')"
## [192] "Netleaf Hackberry"
## [193] "Noble Fir"
## [194] "Northern Catalpa"
## [195] "Northern Red Oak"
## [196] "Norway Maple"
## [197] "Norway Spruce"
## [198] "Ohio Buckeye"
## [199] "Orange-Bark Stewartia, Tall Stewartia"
## [200] "Oregon Ash"
## [201] "Oregon Myrtle"
## [202] "Oregon White Oak"
## [203] "Oriental Spruce"
## [204] "Ornamental Crabapple"
## [205] "Oyama Magnolia"
## [206] "Pacific Crabapple"
## [207] "Pacific Dogwood"
## [208] "Pacific Madrone"
## [209] "Pacific Silver Fir"
## [210] "Pacific Yew"
## [211] "Paper Birch"
## [212] "Paperbark Cherry, Birchbark Cherry, Tibetan Cherry"
## [213] "Paperbark Maple"
## [214] "Paulownia, Empress Tree, Foxglove Tree"
## [215] "Pear"
## [216] "Pecan"
## [217] "Persian Ironwood"
## [218] "Persimmon"
## [219] "Phoebe"
## [220] "Pin Oak"
## [221] "Pine"
## [222] "Plum"
## [223] "Ponderosa Pine"
## [224] "Port Orford Cedar"
## [225] "Portugal Laurel, Portuguese Laurel"
## [226] "Privet"
## [227] "Prunus Species"
## [228] "Quaking Aspen"
## [229] "Red Alder"
## [230] "Red Horsechestnut"
## [231] "Red Lotus Tree"
## [232] "Red Maple"
## [233] "Red-Silver Maple Hybrid"
## [234] "Redvein Maple"
## [235] "Ring-Cupped Oak, Japanese Blue Oak"
## [236] "River Birch"
## [237] "Rocky Mountain Bristlecone Pine"
## [238] "Rocky Mountain Glow Maple"
## [239] "Sargent's Cherry"
## [240] "Saucer Magnolia"
## [241] "Sawara Cypress"
## [242] "Sawtooth Oak"
## [243] "Scarlet Oak"
## [244] "Schima"
## [245] "Scots Pine"
## [246] "Serbian Spruce"
## [247] "Serviceberry"
## [248] "Seven-Son Plant"
## [249] "Shagbark Hickory"
## [250] "Shore Pine, Lodgepole Pine"
## [251] "Shumard Oak"
## [252] "Siberian Elm"
## [253] "Silk Tree"
## [254] "Silver Linden"
## [255] "Silver Maple"
## [256] "Silverleaf Oak"
## [257] "Sitka Spruce"
## [258] "Smiling Monkey Tree"
## [259] "Snakebark Maple"
## [260] "Sourwood"
## [261] "Southern Catalpa"
## [262] "Southern Live Oak"
## [263] "Southern Magnolia"
## [264] "Spanish Chestnut"
## [265] "Spanish Fir"
## [266] "Spindle-Tree"
## [267] "Star Magnolia"
## [268] "Strawberry Tree"
## [269] "Sugar Maple"
## [270] "Sugarberry, Southern Hackberry"
## [271] "Swamp White Oak"
## [272] "Swedish Whitebeam"
## [273] "Sweetbay"
## [274] "Sweetgum"
## [275] "Sycamore Maple"
## [276] "Tree Of Heaven"
## [277] "Trident Maple"
## [278] "Tubeflower Viburnum"
## [279] "Tuliptree"
## [280] "Turkish Hazel"
## [281] "Umbrella Pine"
## [282] "Unknown (Dead)"
## [283] "Valley Oak"
## [284] "Vine Maple"
## [285] "Washington Hawthorn"
## [286] "Water Oak"
## [287] "Weeping Willow"
## [288] "Western Hemlock"
## [289] "Western Larch"
## [290] "Western Redcedar"
## [291] "Western White Pine"
## [292] "Wheel Tree"
## [293] "White Ash"
## [294] "White Fir"
## [295] "White Spruce"
## [296] "Willow"
## [297] "Willow Oak"
## [298] "Winged Elm"
## [299] "Wingnut"
## [300] "Witch Hazel"
## [301] "Yellow Buckeye"
## [302] "Yellow Lily-Tree"
## [303] "Yulan Magnolia"
## [304] "Zen Magnolia"
```

---

## Dropping Unused Levels

```r
pdxCommon <- pdxTrees %>%
  mutate(Common_Name = factor(Common_Name)) %>%
  filter(Common_Name %in% five_most_common) %>%
  mutate(Common_Name = fct_drop(Common_Name))
length(levels(pdxCommon$Common_Name))
```

```
## [1] 5
```

```r
levels(pdxCommon$Common_Name)
```

```
## [1] "Douglas-Fir"      "Northern Red Oak" "Norway Maple"     "Pin Oak"
## [5] "Western Redcedar"
```

---

## Looking Ahead

* `forcats`/factors questions?
* Ready for `stringr`/strings?

```r
library(glue)
feeling <- "not at all"
glue('I am {feeling} ready!')
```

```
## I am not at all ready!
```

---

## Language

**String**

--

```r
x <- "cat"
```

**Character vector**

--

```r
x <- c("dog", "cat", "mouse")
```

--

**Factor vector**

--

```r
x <- factor(x)
levels(x)
```

```
## [1] "cat"   "dog"   "mouse"
```
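
---

## Language

A minimal sketch, in base R, of how a character vector and a factor differ. The names `x_chr`, `x_fct`, and `x_ord` are made up for illustration and extend the `x` example; they are not from the course data.

```r
# Hypothetical toy vector with a repeated value
x_chr <- c("dog", "cat", "mouse", "cat")
x_fct <- factor(x_chr)

levels(x_chr)       # NULL: a character vector has no levels
levels(x_fct)       # "cat" "dog" "mouse" (sorted by default)
as.integer(x_fct)   # 2 1 3 1: each value's position in levels(x_fct)

# Supplying `levels` sets the order explicitly
x_ord <- factor(x_chr, levels = c("mouse", "cat", "dog"))
levels(x_ord)
as.integer(x_ord)   # 3 2 1 2
```

A factor stores integer codes plus a vector of levels, so changing the level order changes the codes but not the labels.
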