class: center, middle

# Data Collection Practice

<img src="img/DAW.png" width="500px"/>

<span style="color: #91204D;">
.large[Kelly McConville | Math 141 | Week 4 | Fall 2020]
</span>

---

## Announcements

* Slack Pro Tips
    + Check at least once a day.
    + Set up your notifications in a way that works for you.
    + Play around in the Preferences.
    + Create at least one content-related post per week.

---

## Reminders

* Lab 3 is due before your lab session this week.
    + Practice visualizing data with `ggplot2` and wrangling data with `dplyr`.

--

* Project Assignment 1 is due on Friday October 2nd (end of day) on Gradescope.

--

* Come to office hours this week, especially if you haven't stopped by twice yet this semester.

---

## Week 4 Topics

* Finish up a couple more **Data Wrangling** examples
* **Data collection**
* Modeling

**This week is light on new R material. Make sure to use that time to get caught up on the R work so far.**

---

# Goals for Today

Practice addressing:

* How were the data collected?
* Who are the data supposed to represent?
    + Who is present? Who is absent?
    + What evidence is there that the data are representative?

---

## Types of Studies

* **Observational Study:** Collect data in a way that doesn't interfere with how the data arise.

--

* **Experiment:** Researchers are interested in causal relationships, so they use random assignment. Other key features include:
    + Blinding
    + Control group
    + Placebo

---

## Thoughts on Data Collection

#### Random Sampling

* Random sampling is important to help ensure the sample is representative of the population.

--

* Representativeness isn't about size.
    + Small random samples will tend to be more representative than large non-random samples.

--

* How do we draw conclusions about the population from non-random samples?

--

→ Investigate how your sampled cases (and respondents) are systematically different from the non-sampled cases (and non-respondents).
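---

## Thoughts on Data Collection

#### Random Sampling: A Simulation Sketch

The claim that a small random sample tends to beat a large non-random one can be checked with a quick simulation. The sketch below is written in Python and is only an illustration: the population of scores, the opt-in probabilities, and the function name `compare_samples` are all made-up assumptions, not course data.

```python
import random
import statistics


def compare_samples(seed=141):
    """Compare a small random sample with a large opt-in sample."""
    rng = random.Random(seed)

    # Hypothetical population of 100,000 scores (mean 50, SD 10).
    population = [rng.gauss(50, 10) for _ in range(100_000)]
    true_mean = statistics.fmean(population)

    # Small simple random sample: every case is equally likely to be drawn.
    random_sample = rng.sample(population, 200)

    # Large non-random sample. Assumption: cases with above-average scores
    # opt in 30% of the time, everyone else only 5% of the time.
    opt_in_sample = [
        x for x in population
        if rng.random() < (0.30 if x > 50 else 0.05)
    ]

    return {
        "true_mean": true_mean,
        "random_n": len(random_sample),
        "random_error": abs(statistics.fmean(random_sample) - true_mean),
        "opt_in_n": len(opt_in_sample),
        "opt_in_error": abs(statistics.fmean(opt_in_sample) - true_mean),
    }


if __name__ == "__main__":
    for name, value in compare_samples().items():
        print(name, round(value, 2))
```

Under these assumptions the opt-in sample is many times larger than the random sample, yet its mean should land farther from the population mean. Its error comes from who is absent, and collecting more cases the same way does not shrink it.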
---

## Thoughts on Data Collection

#### Random Assignment

* Random assignment allows you to explore **causal** relationships between your explanatory variables and the response variable.

--

* How do we draw causal conclusions from studies without random assignment?

--

→ With extreme care! Try to control for all possible confounding variables.

--

→ Discuss the associations/correlations you found. Use domain knowledge to address potentially causal links.

--

→ Take more stats to learn more about causal inference.

--

**Bottom Line:** We often have to use imperfect data to make decisions.

---

class: center, middle, inverse

# Now let's work on the Data Collection Practice Handout!
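---

## Thoughts on Data Collection

#### Random Assignment: A Simulation Sketch

To see why random assignment supports causal conclusions while self-selection does not, here is a minimal Python simulation. Everything in it is hypothetical: the confounder (`health`), the true treatment effect of 2, the selection probabilities, and the function names are assumptions chosen for illustration only.

```python
import random
import statistics


def difference_in_means(treated, outcomes):
    """Mean outcome of the treated group minus mean of the control group."""
    t = [y for d, y in zip(treated, outcomes) if d]
    c = [y for d, y in zip(treated, outcomes) if not d]
    return statistics.fmean(t) - statistics.fmean(c)


def simulate_study(n=10_000, true_effect=2.0, seed=141):
    rng = random.Random(seed)

    # Hypothetical confounder (e.g. baseline health) and random noise.
    health = [rng.gauss(0, 1) for _ in range(n)]
    noise = [rng.gauss(0, 1) for _ in range(n)]

    # Observational study: healthier subjects choose treatment more often.
    self_selected = [rng.random() < (0.8 if h > 0 else 0.2) for h in health]

    # Experiment: a fair coin flip decides treatment, ignoring health.
    randomized = [rng.random() < 0.5 for _ in range(n)]

    def outcomes(treated):
        # The outcome depends on treatment AND on the confounder.
        return [
            true_effect * d + 3.0 * h + e
            for d, h, e in zip(treated, health, noise)
        ]

    return {
        "true_effect": true_effect,
        "observational": difference_in_means(self_selected, outcomes(self_selected)),
        "experiment": difference_in_means(randomized, outcomes(randomized)),
    }


if __name__ == "__main__":
    for name, value in simulate_study().items():
        print(name, round(value, 2))
```

In this setup the experiment's difference in means should sit close to the true effect, while the observational difference is inflated because the treated group was healthier to begin with. Controlling for `health` would help here, but only because the simulation tells us what the confounder is.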