Data Science: Principles, Algorithms, and Applications
Wed/Fri 2:30-3:45pm in 1221 CS Building, 3 Credits
Announcements
The class mailing list is compsci838-1-s17@lists.wisc.edu
The class's Piazza page is here. This is a forum for the students. We will monitor it occasionally but do not have enough manpower to answer all questions posted there.
Instructor & TAs
AnHai Doan, contact information available from my homepage.
Office hours: Friday 11-noon and by appointment (please send email)
TA: Sidharth Mudgal <sidharth@cs.wisc.edu>, office hours: Wed 11-12:30 in Room 1351.
Course Description, Prerequisites, and FAQs
Course Format & Grading
See the above course description
Midterm: Wed, March 15, in class at the usual time and room
Final: Wed, May 3
Grading: midterm: 30%, final: 30%, project: 40%
Lecture Slides (tentative)
Logistics discussion throughout the course (not a set of slides)
Problem definition and data acquisition
Information extraction from text
Extraction from template-based data (aka wrapper-based extraction)
Data understanding, cleaning, transformation
Data integration
Data exploration and analysis
classification, clustering
association rule mining (see the book chapter)
anomaly detection
Building data-intensive artifacts & designing data-intensive experiments
cross-cutting techniques, execution stages, workflow management, team organization
the three Ss: stages, steps, stacks
scaling, quality monitoring, crowdsourcing, etc.
implementation/architectures
Project
Students will form teams for a multi-stage project that addresses a data science problem.
Project stage 2 discussion (board shots)
Resources
Click here for resources to learn Python, pandas, machine learning, more data science, etc.
Misc
dotdatascience.org: a UW-Madison student organization focused on data science