Data Science: Principles, Algorithms, and Applications
Wed/Fri 2:30-3:45pm in 1221 CS Building, 3 credits
Announcements
The class mailing list is compsci839-1-s18@lists.wisc.edu.
The class's Piazza page is available here. This is a forum for students. We will monitor it occasionally but do not have enough manpower to answer every question posted there.
Instructor & TAs
AnHai Doan, contact information available from my homepage.
Office hours: Tue and Fri 5-5:45pm, and by appointment (please send email)
TA: Sidharth Mudgal <sidharth@cs.wisc.edu>, office hours: to be decided.
Course Description, Prerequisites, and FAQs
Course Format & Grading
See the course description above
Midterm: Wed March 21, in class at usual time/room
Final: Fri May 4, in class at usual time/room
Grading: midterm: 30%, final: 30%, project: 40%
Lecture Slides (tentative)
Note: the slides below are those from the previous offering of the course. I will update slides AFTER the lectures. When a slide set has been updated, I will indicate so.
Problem definition and data acquisition
Information extraction from text
Extraction from template-based data (aka wrapper-based extraction)
Data exploration, profiling, cleaning, transformation
Data integration
Data exploration and analysis
classification, clustering
association rule mining (see the book chapter)
anomaly detection
Building data-intensive artifacts & designing data-intensive experiments
cross-cutting techniques, execution stages, workflow management, team organization
the three Ss: stages, steps, stacks
scaling, quality monitoring, crowdsourcing, etc.
implementation/architectures
Project
Students will form teams for a multi-stage project that addresses a data science problem.
Resources
Click here for resources to learn Python, pandas, machine learning, more data science, etc.
Misc
dotdatascience.org: a UW-Madison student organization focused on data science