This is a list of data sets that have been used or will be used in class and assignments.
Star Wars Movies (Episodes I-VI) Transcripts
New York City Flights data sets from the nycflights13 package
The county_complete data set from the usdata package
The Titanic data subset taken from Stanford’s CS109, which was originally from Kaggle’s “Titanic - Machine Learning from Disaster” data set.
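As a minimal sketch of how the two package-based data sets above might be loaded in R, assuming the nycflights13 and usdata packages are already installed from CRAN (the install line is commented out and is only a suggestion):

```r
# Assumption: both packages are installed, for example with
# install.packages(c("nycflights13", "usdata"))
library(nycflights13)
library(usdata)

# flights is one of the data frames shipped with nycflights13
head(flights)
dim(flights)

# county_complete ships with usdata
head(county_complete)
dim(county_complete)
```

The Star Wars transcripts and the Titanic subset are not part of these packages, so they need to be obtained separately.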
These are lists of suggested data sets for you to use in your projects. Note that these are just starting points for your data exploration. You may need to search for the original source of a data set and for related data sets. Let the instructor know if you have questions about a data set.
This is a list of websites where you can explore data sets other than those listed above. You may need to search for the original source of a data set. Consult with the instructor if you have questions about a data set.
CORGIS (The Collection of Really Great, Interesting, Situated Datasets)
UCI (University of California Irvine) Machine Learning Repository