Lectures
All lecture slides available in Google Drive (Brown login required)
Course Modules (Tentative)
Module 01: Navigating Your First Year at Brown
Students will explore the essential skills, mindset, and resources needed for success in college-level courses, including topics in time management, study strategies, effective writing, and setting goals.
Lecture 01 (Sept 5th): Introduction
Lecture Slides: Introduction & Who is the GOAT?
Readings and Resources
How Big is Taylor Swift? We Crunched the Numbers. NYTimes (May 17, 2024). [access instructions here]
[optional] The Unstoppable Pop of Taylor Swift. Reuters (July 29, 2023)
[optional] Who is the Biggest Pop Star? The Pudding (Feb 2019)
[optional] What Makes Simone Biles the GOAT? Andrew Doss, Medium (Jul 20, 2021)
Follow-up: See Ed Discussion for discussion questions (instructions on Canvas).
Lecture 02 (Sept 10th): Transition to College - Part 1
Lecture Slides: Who is the GOAT?, Your Data & Time Management
Reading
R. Light. (2001). Making the Most of College. Harvard University Press. Chapters 2, 3, 4 & 6.
Follow-up: See Canvas for three-part assignment associated with this reading.
Resources
Build Your Own College Rankings. NYTimes (Nov 9, 2023). [access instructions here]
How to read a book. Paul Edwards, University of Michigan School of Information. (last accessed 10 Sept 2024)
Lecture 03 (Sept 12th): Transition to College - Part 2
Lecture Slides: Setting Goals, Citing Your Sources, Academic Support
Resources
Citing Your Sources
Citations. Brown University Library. (last accessed 12 Sept 2024)
Study Skills
How to (seriously) read a scientific paper. Elisabeth Pain, Science (Mar 21, 2016).
How to read an academic paper [video]. UBC iSchool (Jan 17, 2013).
Taking notes effectively. Raul Pacheco-Vega, blog post (Sept 15, 2016).
Module 02: What is Data? Data Demystified
Students will learn what data is, its numerous forms, its role in society, and its applications in everyday life with a focus on the potential of data science for social good.
Lecture 04 (Sept 17th): What is Data? - Part 1
Lecture Slides: Making the Most of College, Scavenger Hunt, What is Data?
Resources
DATA 0150 Scavenger Hunt Data (Google Sheets)
Lecture 05 (Sept 19th): What is Data? - Is Data Science a Discipline?
Lecture Slides: Is Data Science a Discipline?
Readings
Is "data science" an academic discipline? Ben Orlin, Math with Bad Drawings Blog. (Oct 17, 2023)
Data Science: An Artificial Ecosystem. Xiao-Li Meng, Harvard Data Science Review (Jul 1, 2019).
Why I believe that Data Science is its own field/discipline. Manuel Rossetti, blog post (Aug 10, 2022).
10 Research Challenge Areas in Data Science. Jeannette Wing, Columbia Data Science Institute (Dec 30, 2019).
[Video] Intro to Data Science: Historical Context. Steve Brunton, University of Washington (Jun 6, 2019).
[optional] [Video] Intro to Data Science: What is Data Science? Steve Brunton, UW (Jun 6, 2019).
Before Class: See EdSTEM for discussion questions (instructions also available on Canvas).
Resources:
Data: Data Science in Colleges/Universities [Google Sheets]
An Interview with Aram Sinnreich and Jesse Gilbert, The Secret Life of Data. Podcast: New Books in Science, Technology and Society (July 10, 2024).
Muller et al. (2019). How Data Science Workers Work with Data: Discovery, Capture, Curation, Design, Creation.
Richardson et al. (2019). Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice. NYU Law Review.
Module 03: Origins of Data Science
Students will investigate the historical context of data science, specifically its impact on society and ethical implications.
Lecture 06 (Sept 24th): The Origins of Data Science
Lecture Slides: The Origins of Data Science
Readings:
How Eugenics Shaped Statistics: Exposing the damned lies of three science pioneers. Aubrey Clayton, Nautilus (Oct 27, 2020).
See Canvas for trigger warnings and alternative (image-free) copy
The History of Data Science and Pioneers You Should Know. WPI blog.
Before Class: See EdSTEM for discussion questions (instructions also available on Canvas).
Resources:
History of Data Science. https://www.historyofdatascience.com/
Breiman (2001). Statistical Modeling: The Two Cultures. Statistical Science.
Data Visualizations and Infographics
Friendly (2006). A Brief History of Data Visualization. Handbook of Computational Statistics: Data Visualization.
Web supplement to A History of Data Visualization & Graphic Communication. Michael Friendly & Howard Wainer [last accessed Sept 2024].
Charles Minard's Infographic of Napoleon's Invasion of Russia. Fosco Lucarelli, SOCKS blog (April 6, 2014).
Depicting data. Ellen Embleton, The Royal Society blog (Aug 16, 2021).
John Snow: Data Cruncher and Public Health Crusader. History of Data Science blog (Aug 30, 2021).
Florence Nightingale, datajournalist: information has always been beautiful. The Guardian (Aug 13, 2010).
How W.E.B. Du Bois used data visualization to confront prejudice in the early 20th century. Jason Forrest, Tableau blog (Feb 20, 2019).
The Significant Problem of P-values. Lydia Denworth, Scientific American (Oct 1, 2019).
Module 04: Where Does Data Come From?
Students will explore diverse sources of data, ranging from censuses to surveys to institutional data repositories. Students will also learn about methods for obtaining data and the importance of distinguishing between reliable and unreliable data sources.
Lecture 07 (Sept 26th): Where does data come from? - Part 1
Lecture Slides: Sources of Data & Your Online Data
Readings:
Chapter 12: A Costless Resource to Exploit. Invisible Women: Data Bias in a World Designed for Men. Caroline Criado Perez (2019).
[optional] Why Everything from Transit to iPhones is Biased Toward Men: WIRED Q&A with Caroline Criado Perez. WIRED. (July 2, 2019).
Unpacking the Mystery of Missing Gender Data. Beegle et al., World Bank Data Blog. (Sept 28, 2023)
[optional] Browse: World Bank Group Gender Portal & Data Stories
Library of missing data sets: Are you being digitally excluded? Patricia Gestoso, blog. (July 18, 2022).
Leaving no one behind - How can we be more inclusive in our data?: What are the critical data gaps? UK Statistics Authority. (Sept 21, 2021).
Before Class: See EdSTEM for discussion questions (instructions also available on Canvas).
Lecture 08 (Oct 1st): Where does data come from? - Part 2
Lecture Slides: Are short socks cringe?: Data Collection
Readings:
Which Taylor Swift Album is the Most Popular? Nathaniel Rakich, FiveThirtyEight (March 24, 2023).
Polling data described in the article:
YouGov: In 2024, what's the most popular Taylor Swift album among her American fans? Jamie Ballard, YouGov blog (Aug 22, 2024).
Morning Consult Poll (2023) - results: PDF
5 key things to know about the margin of error in election polls. Andrew Mercer, Pew Research Center (Sept 8, 2016).
Before Class: See EdSTEM for discussion question (instructions also available on Canvas).
Lecture 09 (Oct 3rd): Where does data come from? - Part 3
Lecture Slides: Are short socks cringe?: Results
See Canvas for link to Google Slides (students have edit access)
Reading:
What's up with Gen Z's socks? A. R. Hayes, YouGov (August 6, 2024).
Module 05: Data Literacy & Visualization
Students will learn to understand and interpret data, exploring data visualizations and the art of presenting data effectively.
Lecture 10 (Oct 8th): Data Literacy - Part 1
Lecture Slides: Data Literacy: Data Biases
Readings (optional)
Suresh & Guttag (2019). A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle. arXiv:1901.10002.
Trusting the Data -- A Look at Data Bias. National Cancer Institute, Cancer Data Science Pulse Blog (May 22, 2023).
Evaluating Information Sources. Brock University Library Research Guides. [last accessed Oct 2024]
Resources:
Survival of the Best Fit. Game by G. Csapo, J. Kim, M. Klasinc & A. ElKattan.
AI Programs are Learning to Exclude Some African American Voices. W. Knight, MIT Technology Review (Aug 16, 2017).
Top 3 Statistical Paradoxes in Data Science. F. Casalegno, Medium (Feb 25, 2021).
Griffith et al. (2020). Collider bias undermines our understanding of COVID-19 disease risk and severity. Nature Communications.
How ChatGPT and Our Language Models are Developed. OpenAI FAQ. [last accessed Oct 2024]
Brown et al. (2020). Language Models are Few-Shot Learners. arXiv:2005.14165
Evaluating Resources. University of California Berkeley Library Guides. [last accessed Oct 2024]
Lecture 11 (Oct 10th): Guest Lecture - Office of Institutional Research @ Brown
Lecture Slides: Brown's Office of Institutional Research
Before class: Complete pre-class survey (see Canvas)
After class: Complete post-class Exit Ticket (see Canvas)
Resources:
Office of Institutional Research website
Lecture 12 (Oct 15th): Statistical Fallacies & Misuse of Statistics I
Lecture Slides: Data Literacy: Evaluating Information Sources & Statistical Fallacies
Reading:
Lessons from How to Lie with Statistics: Timeless Data Literacy Advice. Will Koehrsen, Towards Data Science blog (July 28, 2019).
Before class: Answer discussion question on EdSTEM (see Canvas)
Resources:
Grofman & Cervas (2024). Statistical Fallacies in Claims about "Massive and Widespread Fraud" in the 2020 Presidential Election: Examining Claims Based on Aggregate Election Results. Statistics and Public Policy.
Groharing & McCune (2022). Benford's Law and County-Level Votes in US Presidential Elections. CHANCE Magazine (American Statistical Association).
Lecture 13 (Oct 17th): Statistical Fallacies II & Data Visualization
Lecture Slides: Statistical Fallacies II & Data Visualization
Reading:
A Guide to Getting Data Visualization Right. Sara Dholakia, Smashing Magazine (Jan 5, 2023).
Ask the Question, Visualize the Answer & Visualizing Uncertainty in Data. Nathan Yau, Flowing Data Guides. (last accessed Oct 2024).
What would feminist data visualization look like? Catherine D'Ignazio (@kanarinka), Civic Media: Creating Technology for Social Change Blog, MIT Media Lab. (Dec 1, 2015).
List of Physical Visualizations and Related Artifacts. P. Dragicevic and Y. Jansen, Data Physicalization Wiki. (last accessed Oct 2024).
Before Class: Answer two discussion questions on EdSTEM (see Canvas)
Resources:
See Google Drive > Resources > Bad Data Visualizations
Information is Beautiful website.
Lisnic et al. (2023). Misleading Beyond Visual Tricks: How People Actually Lie with Charts. CHI Conference on Human Factors in Computing Systems.
Lecture 14 (Oct 22nd): Data Visualization II
Lecture Slides: Data Visualization Principles & Student Principles of Data Visualization
Reading:
Data Feminism by C. D'Ignazio & L.F. Klein. MIT Press (2020).
Periscopic US Gun Killings Visualization (discussed in D'Ignazio & Klein, 2020).
Before Class: See Canvas for assignment
Resources:
Bad Data Visualizations (Google Drive)
What to consider when choosing colors for race, ethnicity, and world religions. L.C. Muth, Datawrapper Blog (Oct 9, 2024).
Chapter 11: Data Visualization Principles. Introduction to Data Science: Data Analysis and Prediction Algorithms in R. (1st Edition, 2019). Rafael Irizarry.
Making Data Meaningful. Part 2: A Guide to Presenting Statistics. United Nations Economic Commission for Europe. (2009)
Module 06: Data Curation & Analysis
Students will explore the process of collecting, cleaning, and organizing data for analysis, as well as basic data analysis techniques. No coding experience is necessary.
Lecture 15 (Oct 24th): Data Wrangling
Lecture Slides: Data Wrangling
Reading:
Data Feminism by C. D'Ignazio & L.F. Klein. MIT Press (2020).
For Big-Data Scientists, "Janitor Work" is a Key Hurdle to Insights. Steve Lohr, New York Times (Aug 17, 2014).
Before Class: See Canvas for assignment
Resources:
Press Release: Chipotle introduces new AI hiring platform (Oct 22, 2024)
NOAA / NWS Lightning Strikes data (website)
Lecture 16 (Oct 29th): Data Wrangling II
In-class Activity: Lightning Strikes Data Wrangling Activity (continued)
See Lecture 15 Slides & Canvas for assignment
NOAA / NWS Lightning Strikes data (website)
Lecture 17 (Oct 31st): Data Curation & Analysis
In-class Activity: Data Collection Activity
See Lecture 17 Slides & Canvas for assignment
Data Documentation templates:
The Data Cards Playbook (Google Research)
Datasheets for Datasets (Microsoft Research)
An Introduction to the Data Biography (We All Count)
See Module 07 for Additional Data Curation & Analysis Sessions (to keep things in chronological order)
Module 07: Data Science Resources
This module will introduce students to the data science resources available both within Brown (the Data Science Institute and the libraries) and beyond, including resources for data, software, and additional learning materials.
Nov 5th - United States Election Day (No Classes University-wide)
If you are eligible to do so, remember to vote!
Lecture 18 (Nov 7th): Guest Lecture with the Brown Data Librarians
Lecture Slides: Navigating Data Resources at the Library [pdf] [Google Slides]
Before Class: See Canvas for assignment
Brown University Library Resources:
BruKnow: the Brown University Library Catalog
Subject Guides: Data & Statistics, Digital Scholarship, and more...
Module 08: Data Across Disciplines @ Brown
Students will explore the versatility of data science by studying its applications in various fields, culminating in a final project. This is intended to be the longest course module, and topics will be selected with student input.