The goal of this course is to provide you with the tools to build data-driven interactive systems and explore the new opportunities enabled by this data through a combination of guest lectures, discussion of current literature, and practical skills development. Over the course of the semester, you will learn about the entire data pipeline from collecting and analyzing to interacting with data.
This course requires comfort with programming, as required projects make use of (at a minimum) Python, SQL, CSS, and JavaScript (including D3). A series of "project bytes" helps lay the groundwork for larger group projects.
The learning goals of the course are as follows:
The class will involve programming and debugging. If required by your background, it is possible to minimize the programming you do for projects (in which case you will be expected to spend more time on other factors such as beautiful visual designs). However, you should not take the course if you find programming or debugging extremely difficult, because you will have to master several very different programming languages and concepts in very short order (projects make use of web programming frameworks and tools including Flask, Bootstrap, Ajax, jQuery, D3, and Google Appspot, and multiple languages including Python, JavaScript, and SQL).
The course is project oriented. It includes 1-2 self-defined projects along with 4-6 smaller "project bytes" designed to provide the stepping stones needed to complete the larger projects. Your work will be evaluated relative to your background and level of effort. This is a graduate class, and the assumption is that you are a mature and motivated student, and that you will define your work so that you learn and grow, given your background. Students who are taking this course as a part of a technical requirement (such as the computer science course requirement in the HCI PhD) will need to do more advanced or ambitious projects, and should consult with the instructor to make sure they are meeting this bar.
All bytes are to be done as individual work. Students may assist each other with conceptual issues, but may not provide code. If you use example code, you must explicitly acknowledge this. If you are unsure about these boundaries, ask. The larger projects are to be done in groups of two or more.
Some of the specific skills that will be covered in projects include:
There will be regular in-class quizzes. There will be a take-home final exam but no midterm. Of the in-class quizzes, you may drop your two lowest scores.
This term we will be using Piazza for class discussion. The system is designed to get you help quickly and efficiently from classmates, the TA, and me. Rather than emailing questions to the teaching staff, I encourage you to post your questions on Piazza. If you have any problems or feedback for the developers, email team@piazza.com.
Find our class page at: https://piazza.com/cmu/spring2016/05839/home/home
Readings will be made available on the CMU Blackboard. The following books are recommended:
Interactive Data Visualization for the Web (Free online version)
Doing Data Science (Schutt & O'Neil) -- based on the very successful Columbia course on data science taught by Schutt (uses R and Python)
These books may also be useful:
Visualize This (Nathan Yau) (uses R and Python)
Programming Google App Engine, Charles Severance (uses Python, plus add-ons like JavaScript)
Python for Data Analysis, Wes McKinney (Python)
Concepts
Skills
You will be expected to read assigned readings before the lecture they pertain to. These may include chapters drawn from textbooks about data, or readings from the research literature. To incentivize this, each student will be required to make at least two relevant postings to the discussion group before the class for which each reading is due.
The tentative breakdown for grading is below. The course will make use of peer grading (details will be provided in class). As a reminder, here is the university policy on academic integrity.