## Goals for Today

* Discuss git and GitHub.
* Go over the GitHub + RStudio workflow.
    + For solo projects
    + For group projects
* Discuss R packages.
* Learn how to create an R data package.

---

## git/GitHub

---

## git and GitHub

* git: version control system
* GitHub: hosting service for git repositories

---

## GitHub Repo = RStudio Project

* A **repo**, short for repository, is the folder that contains all of the files for the project on GitHub.
* For each repo, you should create an RStudio Project (with version control).

---

## Workflow

* **Pull** the most recent version of the project from GitHub to your account on the RStudio Server.
* Do some work on your project in RStudio.
* **Commit** that work.
    + Committing takes a snapshot of the files in the project.
    + Look over the **diff**, which shows what has changed since your last commit.
    + Include a quick note, the **commit message**, to summarize the motivation for the changes.
* **Push** your commit to GitHub from your account on the RStudio Server.
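The same pull, commit, push cycle can be tried from the Terminal. This is only a sketch: a local bare repo (`remote.git`) stands in for GitHub, and the directory names, user name, and email are made up for the demo.

```shell
set -eu

# Work in a throwaway directory so no real project is touched.
demo=$(mktemp -d)
cd "$demo"

# Assumption: a local bare repo plays the role of the GitHub copy.
git init --quiet --bare remote.git
git clone --quiet remote.git project 2>/dev/null
cd project
git config user.name "Demo User"
git config user.email "demo@example.com"

# First commit, so the remote has something to pull later.
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add readme"
git push --quiet -u origin HEAD

# 1. Pull the most recent version.
git pull --quiet

# 2. Do some work.
echo "Notes from today." >> README.md

# 3. Look over the diff, then commit with a message.
git --no-pager diff --stat
git add README.md
git commit --quiet -m "Add notes to the readme"

# 4. Push the commit.
git push --quiet
```

In RStudio, the Git tab buttons (Pull, Commit, Push) run these same steps for you.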

---

## Collaboration: Git Style

* Git is a *decentralized* version control system.
    + Each collaborator has a complete version of the repo.
    + Everyone can work offline and simultaneously.
    + GitHub holds the master copy.
    + Pull regularly to receive and integrate changes.
* **Issues**: The primary method to communicate with your group members.
    + Make an issue if you have a question or comment or want to make a to-do list for the project.
    + Remember that I am part of the repo... though I won't normally read the issues.

---

## Git Real

* git is not friendly and can be frustrating.
    + BUT, the version control and collaborative rewards are big!
* GitHub is a great place to develop an online presence.
    + For now, we will use private repos.
* If you end up with a mess of errors, then don't worry. Come see me and we will make a new repo with your most recent copy of the project.
    + It happens to everyone.

---

## Introduce Yourself to Git

* Run the following code to introduce yourself to git:

```r
library(usethis)

# Replace with your own user name and the email tied to your GitHub account.
use_git_config(user.name = "mcconvil", user.email = "")
```

---

## Sync a GitHub Repo and an RStudio Project

**In your repo on GitHub**:

* Click on the green "Clone or download" button.
* Copy the given URL for "Clone with HTTPS".

**On the RStudio Server**:

* In the upper left, go to File > New Project > Version Control.
* Select Git.
* Paste in the URL. It should automatically fill in a name. Select where you want the project to live in your home directory. Then click OK.

---

## Ignoring Files

* There are several files that we do **NOT** want to push to GitHub.
* These include:
    + `.gitignore`
    + `___.Rproj`
    + `.DS_Store`
* Add these files to the `.gitignore`.

---

## Test the waters: Let's go through the workflow.

* Pull. (Yes, there is nothing to pull yet, but it is always good practice to start here.)
* Click on the readme.
* Add something to the readme.
* Click on the Git tab. Check the box next to the readme. Hit Commit.
* Put in a commit message. Look over the diff.
* Push.

**Look for updates in the readme on GitHub.**
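To check that the ignore rules from the Ignoring Files slide behave as expected, here is a small Terminal sketch. The file names (`project.Rproj`, `analysis.R`) are made up for the demo, and the `*.Rproj` pattern is one way to cover whatever your project file is called.

```shell
set -eu

# Work in a throwaway repo so no real project is touched.
demo=$(mktemp -d)
cd "$demo"
git init --quiet project
cd project

# Files that RStudio and macOS tend to create, plus one real script.
touch project.Rproj .DS_Store analysis.R

# One pattern per line; *.Rproj matches any RStudio project file.
printf '%s\n' '.gitignore' '*.Rproj' '.DS_Store' > .gitignore

# Only analysis.R should be listed as untracked.
git status --porcelain --untracked-files=all
```

If a file still shows up in the Git tab after you add it to `.gitignore`, check the pattern for typos.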

---

## Cache credentials

So that we don't have to type in our username and password every time we want to push or pull from GitHub, run the following **in the Terminal**, not **in the Console**:

`git config --global credential.helper 'cache --timeout=10000000'`

---

## R Packages

* What is an R package?

> "R packages are the fundamental unit of R-ness". -- Jenny Bryan

* Contains functions and datasets
* "base R": 14 base packages that come installed with R
    + There are 15 other (recommended) packages that also come installed.
* CRAN has > 6,000 more packages
    + `install.packages("dplyr")`
    + `library(dplyr)`
* And then there are all the packages on GitHub:
    + `devtools::install_github("hadley/dplyr")`
    + `library(dplyr)`

---

## R Data Packages

* Great way to share data!
* Why?
    + Includes documentation.
    + Very portable.
* Example 1:
    + `library(mosaicData)`
    + `data(package = "mosaicData")`
    + `?Births2015`
* Example 2:
    + `library(pdxTrees)`
    + `get_pdxTrees_parks()`
    + `get_pdxTrees_streets()`
* Example 3:
    + `library(gbfs)`

---

## Creating an R (Data) Package

Key packages:

* `devtools`: supports the development and dissemination of the package
* `usethis`: automates steps of package creation, such as constructing the data file
* `roxygen2`: simplifies writing documentation

---

## Steps

* Let's go through the "Creating an R Data Package" hand-out.
* I will demo the process with this Seattle bike data.

---

## Merge Conflicts

* Scenario: A collaborator modifies a file, commits, and pushes. You also modify that file, commit, and push.
    + Result: Your push will fail because there's a commit on GitHub that you don't have.
    + Usual Solution: Pull, and *usually* git will merge their work nicely with yours. Then push. If that doesn't work, you have a **merge conflict**. Let's cross that bridge when we get there.
* How to avoid merge conflicts?
    + Always pull when you are going to work on your project.
    + Always commit and push when you are done, even if you made small changes.
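The rejected-push scenario can be reproduced safely from the Terminal. This is a sketch under some assumptions: a local bare repo (`remote.git`) stands in for GitHub, and the folder names `you` and `partner`, the file `notes.txt`, and the user details are invented for the demo. The two edits touch different parts of the file, so git can merge them on its own.

```shell
set -eu

demo=$(mktemp -d)
cd "$demo"

# Assumption: a local bare repo plays the role of the GitHub copy.
git init --quiet --bare remote.git

# You: clone, make the first commit, push.
git clone --quiet remote.git you 2>/dev/null
cd you
git config user.name "You"
git config user.email "you@example.com"
printf '%s\n' 'title' 'b' 'c' 'd' 'e' > notes.txt
git add notes.txt
git commit --quiet -m "Start notes"
git push --quiet -u origin HEAD
cd ..

# Your collaborator: clone, edit the first line, commit, push.
git clone --quiet remote.git partner
cd partner
git config user.name "Partner"
git config user.email "partner@example.com"
printf '%s\n' 'title (edited)' 'b' 'c' 'd' 'e' > notes.txt
git commit --quiet -am "Edit the title"
git push --quiet
cd ..

# You: edit the end of the same file and commit without pulling first.
cd you
echo 'f' >> notes.txt
git commit --quiet -am "Add a line at the end"

# The push fails because the remote has a commit you don't have.
if git push --quiet 2>/dev/null; then
  first_push="accepted"
else
  first_push="rejected"
fi
echo "first push: $first_push"

# Usual solution: pull (git merges their work with yours), then push.
git pull --quiet --no-rebase --no-edit
git push --quiet
```

If both of you had changed the same line, the pull would stop with a merge conflict instead of merging automatically.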