E6998 Spring 2014

Seminar on the intersection of computer science, journalism, and new media. Interaction of domain-specific challenges in CS with various disciplines related to journalism and digital media. Challenges and opportunities in applying CS to the development and communication of narrative. Topics include: privacy, security, information extraction and natural language processing, business models in digital media, visualization of data, computer-assisted storytelling, and interactive narrative.

COMS 3133/4/7/9 (Data Structures) or equivalent programming ability in at least one systems or scripting language (C++, Java, Python)

Susan McGregor (Journalism)
sem2196 at columbia
Office Hours: Tuesdays 2-5pm, by (email) appointment before class
David Elson (Computer Science)
dke4 at columbia
Office Hours: Wednesdays 8-9 (just after class), Pupin 424 (classroom) -- or by appointment

Teaching Assistant
Shruti Kamath
svk2113 at columbia
Tuesday 2-3, TA Room (1st floor of Mudd)

Wednesday 6:10-8:00, 424 Pupin

Class Discussion
Use Lore.

Grade Breakdown
  • 40% Individual Paper or Report
    • Further self-directed research into one of the issues described in class (approx. 1500-2000 words). Research and evaluate prior examinations of the issue; describe the issue; suggest new approaches if appropriate. As an alternative, students may write a reported story of 800-1000 words on a topic of their choosing (with approval from the instructors). For this option, students should expect to interview at least 8-10 sources, of which at least 4 must appear via direct quotation in the final story. While students may use this story to pursue academic or technical topics, they are expected to pursue perspectives on the subject from outside the academy (e.g. in business, industry or among the general public). Ideal topics will relate in some way to the topics of the class and should be clearly contextualized within current developments in the area that have been reported in the popular press. See Report Guidelines.
  • 50% Final Project
    • Work in groups of 2-4 to design and implement a software product of appropriate scope and complexity given the time constraints. Project should implement one or more of the ideas discussed in class or related issues in digital media, journalism, natural language processing, story generation or interactive storytelling. Give a class presentation and a paper write-up describing the methodology and results. See Report Guidelines.
  • 10% Class Participation
    • As this is a seminar, we will welcome thought leaders and practitioners in digital media, journalism and new media to our class as guest speakers. Students are asked to attend lectures and be prepared to ask questions.
 Date     Topics  Readings  Notes
 1/22  (McGregor, Elson) Class Intro. Where technology meets uncomfortably with other disciplines  Is there an ethics of algorithms?  More recent Hal Varian talk
Pew Research: Future of mobile news
 1/29  (McGregor) Authority and Trust in Journalistic Storytelling  Why Transparency Has Replaced Independence in Journalism, Is Glenn Greenwald the Future of News?, Crowd-sourced journalism and democratic governance  Guardian UK riots coverage
 2/5  (Guest Speaker: Michael Shapiro) Web's Impact on Journalism Business Models  Web Journalism: Bubble or Lasting Business?, Websites Stretch Thin Ad Dollars
 2/12  (Elson) Automatic Summarization and Sentiment Analysis  A Survey on Automatic Text Summarization
Sentiment Analysis Symposium Tutorial

Additional (optional) readings:
Papers highlighted in class
Longer survey of sentiment analysis

Automatic summarising: the state of the art
Automatic summarization of events in social media
 2/19  (Guest Speaker: Maya Anand) Internet's Disruption of TV & Film  New Yorker: Last Blues for Blockbuster
Variety on cord-cutting: 1, 2, 3
New Yorker on being a critic in a fragmented distribution model
 Bonus reading: 
Visual Effects Society: State of the Industry 2013
Comic relief (Onion from 2008!)
 2/26  (McGregor) Data Visualization  Information Visualization, Chapter 1; Fundamentals of Computer Graphics, Chapter 27; Now You See It, Chapter 4 (files on Lore)  Slides
Midterm paper "pitch" due
 3/5  (McGregor) Privacy and Security  A Contextual Approach To Privacy Online, Journalistic Threat Models Slides. Additional readings: "Mugged by a Mugshot Online", "Erasing History"
 3/12  (Elson) Finding Great Content: Search and Recommendations  A Survey on Information Retrieval, Text Categorization, and Web Crawling, Toward the Next Generation of Recommender Systems
Additional readings: Papers discussed in class 
Midterm paper due
UPDATE: Papers now due 11:59PM Friday March 14 through the Lore assignment.
 3/19  No Lecture: Spring Break
 3/26  (Elson) Automatic Story Understanding and Generation; Interactive Narrative  Graesser et al.: How does the mind construct and represent stories? (pp 1-19)
Papers discussed in class
 Final project descriptions due

 4/2  (Elson) Research Methods: Wielding Narrative Responsibly  Journalists and Jabs: Media Coverage of the MMR Vaccine
Wikipedia: Misuse of Statistics
Odds Are, It's Wrong
False-Positive Psychology
Further reading: Working with Numbers and Statistics: A Handbook for Journalists
Why Most Published Research Findings Are False
Books: 1, 2
Scientific method: Statistical errors (Nature)
 4/9 Mike Dewar, NYTimes R&D Lab:
Dealing with streams of data: predictions and practice from the NYT R&D lab.
Introducing Stream Tools
Learnable Programming
Maintaining Stream Statistics Over Sliding Windows
 4/16  (Guest Speakers: Graham Sack and Apoorv Agarwal) Quantitative Literature and Film Script Analysis  Character Networks for Narrative Generation: Structural Balance Theory and the Emergence of Proto-Narratives
 Parsing Screenplays for Extracting Social Networks from Movies
 4/23   Guest Speaker: Jeff Jarvis
 Special time and location:
 CSB 453 (CS Dept Conference Room)
 Swipe access to CS office required
 BuzzMachine (Prof. Jarvis' blog)
 What now for news?
 David's OH for this week changed to 6:00-6:45 in the CS Lounge, adjacent to the CS Conference Room. Additional OH time is available on request. Let David know if you do not have swipe access to the CS Dept for this week's OH and lecture.
 4/30  Final Project Presentations - Code and data packages due - Final project write-up due Friday May 2