About the Class


Course Title and Semester

MATH 241: Data Science - Spring 2022 - Reed College


Instructor Information

Instructor: Alex John Quijano
Office: Library 394
Email:


Class Time and Location

Lectures: S01: TuTh 8:50 AM – 10:10 AM, S02: TuTh 10:25 AM – 11:45 AM
Rooms: Eliot 103 or Zoom: reed-edu.zoom.us/j/92205820684


Lab Assistants

Maxwell VanLandschoot, Gillian McGinnis, and Isabelle Caldwell


Graders

Taylor Blair and Simon Ahn


Office Hours

  • Alex John Quijano
  • Maxwell VanLandschoot
    • Walk-in (Eliot 126)
      • Th, 3:30 PM - 5:30 PM
  • Gillian McGinnis
    • Walk-in (ETC 211)
      • M, 2:30 PM - 4:30 PM; Tu 3:15 PM - 5:15 PM
  • Isabelle Caldwell


Description

Data science is an interdisciplinary field of study in which scientists use scientific techniques, approaches, methods, and algorithms to extract necessary insights and information from structured and unstructured data. MATH 241 is an applied statistics course with a strong emphasis on developing data literacy, data acumen, and data visualizations. This course focuses on building the ability to employ suitable exploratory data analysis approaches, such as data wrangling, organization, and visualization, as well as developing the ability to create a compelling and accurate story with data. The curriculum contains instructions on how to effectively communicate results and analysis for a non-technical (or technical) audience and utilize a reproducible and collaborative workflow for data analysis. In addition, this course will discuss how to reason through ethical issues regarding data and tackle challenges related to mathematical biology and language modeling.


Prerequisite

MATH 141 or equivalent.


Lectures and Mini-Assignments

Lectures will occur synchronously during their scheduled time. The first 30 minutes of class will be a lecture/presentation followed by an activity with a mini-assignment to be submitted by end-of-class or end-of-day. You will need a computer during the lecture sessions (see Class Materials and Resources).


Modules

Module assignments will be released on the class website 2-3 weeks before the deadline (see Tentative Topic Schedule). Each module contains instructional units and homework problems. A module serves the same purpose as a homework assignment but offers more flexibility: you can start it at any time, as long as you submit your completed work before the deadline.


Project

There will be a final group project report and presentation. Information regarding the project will be released during the second half of the course.


Learning Outcomes

Upon completing MATH 241, students should be able to do the following:

  • Process structured and unstructured data, produce informative data tables and visualizations, and understand and explain the basic structure of data, how it is collected, and what its limitations are.

  • Apply statistical methods, techniques, approaches, and algorithms to extract necessary information and insights from data.

  • Critique claims and evaluate decisions based on data, and write and run simple code for data analysis and visualization.

  • Apply data science concepts and methods to solve problems in real-world contexts and communicate results effectively.


Learning Objectives

Upon completing MATH 241, students should know how to do the following:

  • Wrangle a variety of data types, including spatial, text, and network data.

  • Use proper exploratory data analysis methods to extract knowledge from data.

  • Create multivariate data visualizations that are both static and interactive.

  • Use appropriate mathematical and statistical methods to analyze data.

  • Effectively write about data for a non-technical (or technical) audience.

  • Utilize a reproducible and collaborative workflow for data analysis.

  • Reason through ethical issues regarding data.


Distribution Requirements

This course can be used towards your Group III, “Natural, Mathematical, and Psychological Science” requirement. It accomplishes the following learning goals for the group:

  1. Use and evaluate quantitative data or modeling, or use logical/mathematical reasoning to evaluate, test or prove statements.

  2. Given a problem or question, formulate a hypothesis or conjecture, and design an experiment, collect data, or use mathematical reasoning to test or validate it.

  3. Collect, interpret, and analyze data.

This course does not satisfy the “primary data collection and analysis” requirement.


Class Materials and Resources


Class Website

The syllabus, tentative topics schedule, lecture slides, homework, project information, and all other class materials are posted on the course website, reed-statistics.github.io/math241-spring2022.


Textbook

The main textbook is Modern Data Science with R, 2nd Edition (2021), by Baumer, B. S., Kaplan, D. T., & Horton, N. J. The textbook is free and open-source. We will use other supplementary books and resources throughout the course (please see the Books and Online Resources List).


Computing

This class will use the R programming language and the RStudio Integrated Development Environment (IDE). Below are two ways to access R and RStudio.

  • Use the Reed College RStudio server, where you can log in with your Kerberos credentials. Go to rstudio.reed.edu.

  • Download and install both programs on your own computer. First, install R from r-project.org. Next, install RStudio from rstudio download.

If you need a computer, you can purchase or borrow one; please see the Reed College CIS Getting Equipment webpage. For more information and resources, visit Reed R resources.


Gradescope

Assignments are submitted through Gradescope. Please sign up as a student with your Reed email using this Gradescope entry code: D5N5PB.


Slack Workspace

Please sign up for the class Slack Workspace using your Reed email during the first week of class. We will be using the Slack Workspace math241-quijano.slack.com as the main real-time communication tool for everything from general announcements and question-answering to direct messages. Please check the Slack Workspace regularly. If you want Slack notifications sent to your email, please set up your email preferences.

Concise and specific messages are helpful. If you prefer communicating through email, note that the instructor has set up an email filter for this course, so you must put the “MATH 241” keyword in your subject line. The keyword makes it much easier for the instructor to notice your email.


Class Assessment

Assignments

Modules and the group project are the main sources of evaluation. The mini-assignments are low-stakes sources of evaluation. There will be no exams. The class can move quickly, so it is important that you keep up with the assignments and take feedback seriously.


Attendance and Participation

It is strongly recommended that you attend every class and arrive on time. Participation is an important part of learning data science. Be prepared to participate in the discussion by doing the assigned readings every week and submitting mini-assignments during class.


Grading

Each assignment will be graded according to the grading guide.

Grading guide for conceptual or mathematical questions:

  • 5 – Outstanding; showed full understanding of the material. Congratulations!

  • 4 – Excellent; showed almost full understanding but with minor errors. Well done!

  • 3 – Acceptable; showed some understanding and is okay despite a few errors. Good!

  • 2 – Needs Improvement; showed some potential but it needs more work. Okay!

  • 1 – Needs Major Improvement; at least you tried, E for effort!

  • 0 – Incorrect or no submission; meh.

Grading guide for multiple choice questions:

  • 1 – Correct.

  • 0 – Incorrect.


Extra Credit

Throughout the course, there will be opportunities for extra credit. You can submit at most two extra credit assignments. The extra credit assignment grade is added to your lab report grades. Extra credit can come from either of these two categories:

  • A critique of the statistical method, visualization, and/or analysis in a chosen scientific article or news source, written as a critical essay (2-3 pages, single-spaced). The essay must include a summary of the article, a description of the data and the statistical method used, and a description of a better statistical method, a better way to visualize the results, or a comment on statistical errors or pitfalls, if any exist.

  • An informative and visually appealing visualization of complex data from a chosen data set. You can use any tools (R, Python, etc.) to create the visualization. The resulting visualization must include a half-page description of how to read and interpret it and what its weaknesses are.


Class Expectations 1


Office Hours Guidelines

It is strongly recommended that you attend the walk-in office hours or set up a one-to-one office hour with the instructor if you feel like you are falling behind during our in-person class activities, or if you just need to clarify concepts discussed in class. To make a one-to-one office hour (or the walk-in office hours) more productive, here are three recommendations to follow before you come in:

  • List all gaps in knowledge you have (missed concepts) or all concepts that were unclear to you during class. We will address them one by one.
  • Prepare questions you want answered and be ready to show relevant materials.
  • Regarding assignments, be prepared to show (a) the steps you have tried and (b) the errors you encountered and the solutions you attempted.

Note that these are recommendations so that you can get the most out of the office hours allocated for you. If you just want to come in and chat about something else, feel free to do so. If the dedicated time for one-to-one office hours does not work for you, send the instructor a message to set up an appointment.


Academic Honor Principle

We are committed to adhering to the standards regarding academic honesty contained in the honor principle and the values of mutual trust, concern, and respect for oneself and for others upon which the Reed community depends. In class, give your undivided attention to others. If you don’t agree with what someone else has to say, you are encouraged to express your point of view, but do so respectfully, and support your claims with textual evidence.

In your written work, follow the conventions of an appropriate citation for your respective discipline or major. Please consult with the instructor if you have questions about proper citation.


Academic Support

We expect you to participate in the class through lectures, discussion, labs, and other engagements. We also expect you to make use of opportunities to get help outside of class (office hours, Slack, email, tutoring) if you need help. Concise and specific messages are the most helpful.

The Writing Center offers free appointments and experienced peer tutors who are there to help you at any stage of the writing process. I strongly encourage even experienced writers to take advantage of these services. For more information, start here: Reed Writing Resource.


Library

Reed’s subject librarians can help you locate and access subject-specific resources for projects, classes, and theses. Do not hesitate to turn to them!


Late Assignments and Incompletes

You are expected to turn in all completed assignments on time. Circumstances that may prevent you from turning in your work on time, such as a medical reason, are understandable. Please let the instructor know if you are unable to submit your work or have missed a deadline by a wide margin. Because every assignment is an important aspect of your learning in this class, we will discuss when you will turn in the assignment and decide upon an acceptable consequence for turning it in late. We are committed to helping you successfully learn data science methods in this course.


Collaboration Policy

Students are encouraged to participate in discussions regarding modules and mini-assignments. However, each student must take responsibility for and ownership of their work and submit it individually.


Accommodations

We will make every effort to accommodate students whose personal obligations lead to scheduling conflicts. Please speak with the instructor during the first two weeks of class regarding any accommodations you may need.

If you have a disability or think you may have a disability, you may also want to meet with Disability & Accessibility Resources (DAR) to request an official accommodation. You can find more information about DAR, including contact information, here: Reed Disability Resources.

If you have already been approved for accommodations through Disability & Accessibility Resources, please meet with the instructor outside of class so we can develop an implementation plan together.


Inclusion and Diversity

The natural and mathematical sciences are often viewed as objective disciplines. Science is a method for us to understand how the world works. However, it has historically been built by a small set of privileged populations whose biases often went unexamined. We acknowledge that some parts of this course may carry overt and covert biases. Science is a human endeavor, and the pursuit of knowledge and skill must incorporate a diverse set of experiences.

We value all students regardless of their background, country of origin, race, religion, ethnicity, sexual orientation, disability status, etc. We are committed to providing a climate of excellence and inclusiveness within all aspects of this course. If you have any concerns, issues, or challenges, you are encouraged to discuss them with the instructor (set up a meeting by email or a direct message in the Slack Workspace) with the assurance of full confidentiality, except in cases of academic integrity code violations or sexual harassment.


Inspirational Talk

Learning data science methods and the R programming language is like learning two new languages simultaneously, such as Spanish, French, Mandarin, or Tagalog. Learning probability and statistics can help you with your research or thesis, and it can help you statistically assess and critique someone else’s research, arguments, or claims. Learning R can be difficult and frustrating if you have little or no experience with computer programming. Many experienced statisticians, mathematicians, and computer scientists still get frustrated occasionally when writing code in R or any other programming language. Part of the R learning experience is frustration and self-denial. These are valid emotions and will slowly fade over time. Once it clicks, you will feel joy and excitement, probably with a hint of skepticism as you make sure the code does what it is supposed to do. If you find yourself stuck or taking a lot of time to successfully execute a code snippet, talk to your peers, ask questions, and most importantly send the instructor a message. Take a break and do something fun, like eating, and then try R coding again. You can do this!



  1. Some of the statements in this section are borrowed from the Reed Center for Teaching and Learning, Syllabus Policy Blurbs.