Students in EEPS 1340 will complete a data science project as individuals or in teams of up to 3 students. Projects should apply the techniques from the course (exploratory data analysis, predictive modeling, and/or big data analysis) to a real-world data set. Students may propose a novel project or choose to replicate and extend the results of a peer-reviewed study. Projects focused on topics in Earth, Environmental or Planetary sciences are welcomed, but NOT required. Resources for selecting a project and identifying data sets are provided by the instructor below. Theoretical or methodological projects may also be permitted, subject to approval by the instructor.
The listed deadlines are tentative; students should confirm deadlines on Canvas closer to the due date.
Unless otherwise stated, assignments are due at 6pm ET.
Milestone #0: Brainstorming [Part A: due February 11th, Part B: due February 13th]
Milestone #1: Project Proposal [due February 27th]
Milestone #2: Exploratory Data Analysis [due March 17th]
Milestone #3: Initial Draft [due April 21st]
Project due date: May 10th
(Informal) Project presentations: May 6th
All students must submit: an individual Project Cover Sheet (written responses about the project)
Each project team must submit: a Project Report and code (see Project Guidelines)
Grading Rubric for the project
Start by familiarizing yourself with the Project Guidelines and Grading Rubric.
Watch this video: The 7 steps of machine learning [10 minutes] by Yufeng Guo @ Google Cloud
Recommended module: Introduction to Machine Learning Problem Framing module (developed by Google)
This module is designed to help you define a machine learning problem and propose a solution. Although designed for professionals in industry, it is relevant for anyone starting a data science project. It should take less than 1 hour to complete.
Explore the list of data sets and data resources related to Earth sciences and other topics. The list is only meant as a resource; students are also welcome to use data from other sources in their projects.
Recommended video: How to Create a Dataset for Machine Learning [7 minutes] by Jordan Harrod
Code example: Extracting image features using pretrained neural networks (Brown Only)
Group Project Tools (Eberly Center @ CMU): including team roles and team contracts
Collaborative writing tools: Overleaf or Google Docs