Syllabus

CMSC 436 / 636: Data Visualization

   


Welcome to CMSC 436 / 636, Data Visualization. In this course, you will learn how, when, and why to create data visualizations.

Location: Janet & Walter Sondheim 207
Time: M/W: 2:30 – 3:45pm (ITE 357)
Instructor: Dr. Jian Chen (email: jichen@umbc.edu, phone: 410.455.8937)
Office Hours: M/W: 1:15-2:15 pm or by appointment

Teaching assistants:
Xiaokai Li: VJ82854@umbc.edu
Office hours:  Tu / Thur: 1:00-2:00pm (ITE 349)

Kyle Boyer: kyleboy1@umbc.edu
Office hours: M/W: 12noon - 1pm (ITE 334)


TEXTBOOKS

Required: 
Visualization Analysis and Design: Tamara Munzner, CRC Press Taylor & Francis Group, 2014 (buy it on Amazon)


Extra readings: 
I have these books. Ask me if you would like to take a look at them.
  • Information Visualization: Perception for Design, Colin Ware. (This book includes many design principles) 
  • Visual Thinking for Design, Colin Ware. (This book is relatively high-level and is easy to follow.)
  • Semiology of Graphics: Diagrams, Networks, Maps, Jacques Bertin. (This is the very first book about design. It is still considered the most complete design book today.)
  • Visual Explanations, Edward Tufte
  • Envisioning Information, Edward Tufte
  • Readings in Information Visualization: Using Vision to Think, Stuart K. Card, Jock D. Mackinlay, Ben Shneiderman. (This book includes a set of highly influential papers in (information) visualization.)
  • Learning D3: 
    • textbook:
    • online tutorial (this is very useful): 

PREREQUISITES 
There is no prerequisite for this class, and the class is open to both graduate students and advanced undergraduates. Some working knowledge of, or willingness to learn, graphics programming tools (technologies!) will be useful.


PIAZZA FORUM: 

COURSE DESCRIPTION

Overview

While the world is flooded with increasing amounts of data, human perceptual and cognitive abilities remain relatively constant. Showing data visually often works better than other ways of presenting it.

A “visualization” is simply a visual representation of an object of interest. It’s visual: we consume it with our eyes, and so it is essential that we know how our eyes work and, more importantly, how the parts of our brains connected to our eyes work. It’s also a representation: we get to choose what this representation will be, and different choices lead to different pictures, some good and some bad. We will learn how to tell those apart, and how to make pictures that are more good than bad.

Good data visualization involves perceptual psychology, mathematics, and computer science. This makes our subject uniquely challenging: sometimes the way our eyes work stands in the way of applying some beautiful result from computer science. Sometimes it’s the other way around: something deep about the math in the data will help guide the design process and let us make a picture that is beautiful, informative, and truthful.

The goal of this course is to prepare you for a career involving the design of visualizations and interactive experiences. Students in this course will learn about design through lectures, paper presentations, and assignments, in addition to a final project that addresses a real-world problem. Do not mistake this for a course only in “computer” development. This course focuses on the rules and methods of visualization design, which remain fairly constant regardless of the technology used to develop a visualization technique. While technology will play a significant role in our studies, technological details will not be our focus. We will talk about technologies and implementations when we encounter them.

Topics

The content of the course is split roughly into three distinct aspects: mechanics, principles, and techniques.
Topics will include visual encoding approaches, task analysis, trees and graphs, scalar, vector, and tensor field visualizations, validation approaches, multiple views, two-dimensional and three-dimensional interaction techniques, and tool designs in many application domains.


THE WHY: PRINCIPLES

Data visualization itself has existed for at least 200 years; we’ll learn about Playfair, Nightingale, Minard, and others. We’ll also look at the role of statistics in the 1900s, computers in the 1950s, and exploratory analysis. From the 1960s on, we started to realize that some things in visualization work better than others, and around 1980 scientists started seriously studying the effectiveness of data visualization as a medium itself. This research program continues to this day. To give a few examples, we know that using position works better than using angle; we know that using length works better than using area. We know that, in some cases, using color intensity works better than color hue (and that in other cases, it’s the other way around).

We also know, since the 1960s, that interaction is a powerful idea. Back then people interacted with a data visualization by carefully rearranging bits of paper (no supercomputers in our pockets yet!), but many of the original thoughts are still valid. We will learn the basics of interactive visualizations.

Although much of what we know about visualization is finicky and specific, we have some general principles. We will spend about four weeks studying these principles.

List of Principles
  • retinal variables
  • integral vs separable channels
  • color vision
    • color spaces
    • color blindness
  • interactive vis
    • brushing and linked views
    • transitions
  • (the very basics of) human-centered design
  • confusers, hallucinators, data transformations

THE WHAT: TECHNIQUES

In comparison to the relative paucity of principles, data visualization has an enormous number of existing techniques. We will spend about six weeks in this course going over existing techniques and the kinds of data they apply to.

Here, computer science has much to say about data visualization.

For example, not everything we want to do with data is efficient, and not everything that is efficient is worth doing with data. This means that the practice of data visualization needs to be informed by algorithmic constraints.

Data visualization also interacts with software engineering: not every visualization algorithm plays well with the rest of the code in your program and in your head.

List of Techniques:
  • line plots, dot plots, scatterplots
  • vector and tensor fields
  • treemaps
  • node-link diagrams
    • directed graphs
    • undirected graphs
  • set visualization
  • techniques for large data

THE HOW: MECHANICS

When we talk about mechanics, we mean the practical things you will need to learn in order to create data visualizations. In this course, we will use web-based software. This means making visualizations through web pages, using HTML, CSS, and JavaScript. The main domain-specific tool we’ll learn is D3. If you have taken computer graphics, you should already know OpenGL. If you want to work on the web, WebGL provides 3D graphics in the browser for 3D data visualization. The modern web stack is good, bad, and ugly. In this course, you are expected to learn how to use it to make visualizations. The TAs will help you with D3, which makes use of the mechanics listed below. All homework assignments will be in D3. If you would like to use WebGL in your final project, please talk to Dr. Chen, who will point you to students who know it so that you have someone to discuss it with.

List of Mechanics: 
  • HTML, CSS, selectors
  • JavaScript
  • SVG
  • D3
    • Basics
    • Selections
    • Joins
    • Transitions
    • Events
  • Loading data, sorting data, filtering data
    • CSV, JSON
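
To make the last item concrete, here is a minimal sketch of loading, filtering, and sorting a small table in plain JavaScript. It deliberately avoids D3 so that it runs on its own; in D3 you would typically load files with `d3.csv` or parse text with `d3.csvParse` instead. The inline data, the column names, and the helper `parseSimpleCsv` are made up for illustration, and the parser assumes no quoted fields.

```javascript
// Sketch: load, filter, and sort tabular data in plain JavaScript.
// Assumption: a tiny inline CSV with no quoted fields; the data is made up.
const csvText = `name,year,value
alpha,2014,30
beta,2015,10
gamma,2014,20`;

// Hypothetical helper: turn CSV text into an array of row objects.
function parseSimpleCsv(text) {
  const [headerLine, ...rows] = text.trim().split("\n");
  const columns = headerLine.split(",");
  return rows.map((row) => {
    const cells = row.split(",");
    return Object.fromEntries(columns.map((col, i) => [col, cells[i]]));
  });
}

// Loading: parsed cells are strings, so convert numeric columns explicitly.
const records = parseSimpleCsv(csvText).map((d) => ({
  name: d.name,
  year: Number(d.year),
  value: Number(d.value),
}));

// Filtering: keep only the rows from 2014.
const from2014 = records.filter((d) => d.year === 2014);

// Sorting: ascending by value, on a copy so the filtered array is untouched.
const sorted = [...from2014].sort((a, b) => a.value - b.value);

// JSON: the same rows serialized as a JSON string.
console.log(JSON.stringify(sorted));
```

The same three steps (convert types on load, filter, then sort a copy) carry over directly when the rows come from a D3 loader rather than an inline string.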

LEARNING GOALS
 
  • An understanding of the key principles and techniques used in visualization, including graphical perception and techniques for visual encoding and interaction.
  • Exposure to a number of common data domains and corresponding analysis tasks, including graphs, medical imaging, networks, and multimedia (text, video, set) data.
  • Practical experience building and evaluating visualization techniques (including the most recent crowd-sourcing techniques).
  • The ability to read and discuss research papers from the visualization literature. 

HOMEWORKS, TERM PROJECT, AND GRADING

Your letter grade is determined by your overall percentage as follows. A: > 90%, B: 80-90%, C: 70-80%, D: 60-70%, F: < 60%.

There will be five written homeworks, one midterm, and one major open-ended term project. Course grades will be based 
  • 32% on homeworks (we will take your best four),
  • 10% on reading summaries,
  • 12% on the midterm, and 
  • 46% on the major term project. 

Up to 3% extra credit may be awarded for class participation, such as for helping classmates on the Piazza forum.
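
As a rough illustration of how these weights combine, the sketch below computes a final percentage in JavaScript. It assumes that every component is scored on a 0-100 scale, that the best four homeworks are averaged, and that extra credit is added as percentage points on top of the weighted total. The function name `finalPercentage` and the input shape are hypothetical and are not part of any course tooling.

```javascript
// Sketch of the grade weighting described above.
// Assumptions: every score is on a 0-100 scale; extra credit is added as
// percentage points; all names here are illustrative only.
const WEIGHTS = { homework: 0.32, reading: 0.10, midterm: 0.12, project: 0.46 };
const MAX_EXTRA_CREDIT = 3; // percentage points for class participation

function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function finalPercentage({ homeworks, reading, midterm, project, extraCredit = 0 }) {
  // Only the best four homework scores count.
  const bestFour = [...homeworks].sort((a, b) => b - a).slice(0, 4);
  const base =
    WEIGHTS.homework * mean(bestFour) +
    WEIGHTS.reading * reading +
    WEIGHTS.midterm * midterm +
    WEIGHTS.project * project;
  // Extra credit is clamped to the range 0 to 3 percentage points.
  return base + Math.min(Math.max(extraCredit, 0), MAX_EXTRA_CREDIT);
}

// Example with made-up scores: the lowest homework (50) is dropped.
const example = finalPercentage({
  homeworks: [90, 80, 100, 50, 70],
  reading: 90,
  midterm: 80,
  project: 90,
  extraCredit: 5,
});
console.log(example.toFixed(1));
```

With these made-up scores, the best four homeworks average 85, so the weighted total is 27.2 + 9 + 9.6 + 41.4 = 87.2, and the capped extra credit brings it to about 90.2.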

Grade distribution: 
category     assignment   points
homeworks    hw1          8
             hw2          8
             hw3          8
             hw4          8
             hw5          8

The homeworks will contain written questions and questions that may require some programming.

We will evaluate your work holistically, looking beyond mechanical correctness to the overall quality of the work, using the following scale:

100 = Excellent / no mistakes (or really minor)
90
80 = Good / some mistakes
70 
60
50 = Fair / some major conceptual errors
40
30
20 = Poor / did not finish
10
0 = did not participate / did not hand in

The instructor tries very hard to make questions unambiguous, but some ambiguities may remain. Ask if confused or state your assumptions explicitly. Reasonable assumptions will be accepted in case of ambiguous questions.

The term project is a group project. You will investigate some interesting aspect of design issues in visualization or apply design principles to a problem that interests you. The term project has the following milestones: team, proposal, preliminary result, and final result. Please see the term project page for more information.

The purpose of grading is to clearly and accurately pinpoint the strengths and weaknesses of your progress as a visualization designer. In this course you will earn your grade through hard work. An assignment description sheet will accompany each new assignment, detailing the specific requirements. Assignments are due at the beginning of the class period on the assigned due date. Please do not turn in an assignment more than three days after its deadline. Something can be arranged only in extreme situations that are called to our attention in advance. These excuses must be accompanied by proof.

Submission policy:
Your submission must be made on GitHub. Please create your own GitHub account and send the TAs and Dr. Chen a link to your GitHub page. We will use the work and timestamps on GitHub to grade your homeworks and projects.


LATE POLICY

The final project and the paper summary cannot be turned in late! No exceptions!

For the assignments, you are allowed three late days. You can use all three late days for one homework, or spread them out. That is up to you. To use a late day, you must inform the TA about it in advance. Late assignments submitted without informing the TA in advance of the use of late days will not receive any credit.

Always plan ahead.


OFFICE HOURS and E-MAIL POLICY

I will do my best to answer questions sent to me directly by email within about 24 hours when I am in town. Please don’t count on a response an hour before a deadline. The Piazza discussion board is the most efficient way to get answers (I will answer those questions too). You might be better off posting your questions there (especially programming questions).

ATTENDANCE 

Since learning design principles involves a lot of exercises and critiques, attendance is vital. There will be quizzes every week, if not every class, to help you grasp important concepts. Most of the time, there won’t be a right or wrong answer; the purpose is to help you understand the concepts.

More than three unexcused absences will result in the reduction of one letter grade from your final average for each additional day missed.

Students are expected to arrive on time and remain until class is dismissed. Three tardy arrivals or early departures equal one absence. Attendance will be taken promptly at every class period. Attendance at critiques in several stages of project presentations is mandatory. Absence on critique day will result in a failing grade for the project; a doctor’s note will be required in order to excuse absence due to illness on critique days.


ORIGINAL WORK

Unless otherwise specified in an assignment, all submitted work must be your own, original work. You may discuss general approaches with others on individual assignments, but you may not copy code or other work, and you must indicate on your turned-in assignment who you worked with. You may not provide your solutions to other students. Any excerpts from the work of others must be clearly identified as a quotation, and a proper citation provided. Any violation of the School’s policy on Academic and Professional Integrity (stated in the Student Handbooks) will result in severe penalties, which might range from failing an assignment, to failing a course, to being expelled from the program, at the discretion of the instructor and the Associate Dean for Academic Affairs.

I strongly encourage students to form study groups. Students may discuss and work on homework problems in groups. However, each student must write down the solutions independently, and without referring to written notes from the joint session. In other words, each student must understand the solution well enough to reconstruct it independently. In addition, each student should write on the problem set the names of the people with whom s/he collaborated, or explicitly state that “This work is solely my own contribution.”

CREDITS

This class is based on the classes taught by Levine and Alex Lex at the University of Utah, and draws on the book by Tamara Munzner at the University of British Columbia. Some of the material in this course is based on the classes taught by Carlos Scheidegger at the University of Arizona, Penny Rheingans at UMBC, Jeff Heer at the University of Washington, Torsten Möller at the University of Vienna, Helwig Hauser at the University of Bergen, and Maneesh Agrawala at UC Berkeley. We have heavily drawn on materials and examples found online and tried our best to give credit by linking to the original source. You can find these credits mainly by direct links to the sources from the images. Please contact us if you find materials where the credit is missing or that you would rather have removed.

ACCOMMODATIONS FOR STUDENTS WITH DISABILITIES

If you think you need an accommodation for a disability, please let me know at your earliest convenience. Some aspects of this course, the assignments, the in-class activities, and the way we teach may be modified to facilitate your participation and progress. As soon as you make me aware of your needs, we can work with the Office of Student Disability Services (formerly Student Support Services) to help us determine appropriate accommodations. SDS (410-455-2459 or disability@umbc.edu, http://sds.umbc.edu/) typically recommends accommodations through a request form. I will treat any information you provide as private and confidential.

ETHICS
  • Cell phones should be turned off or set to silent mode during class time. If you must take a phone call, please step into the hallway to minimize class disruption.
  • Please be polite in group discussions.