Svitlana Volkova: Predicting the Future with Deep Learning and Signals from Social Media
Abstract: Social media communications are reflections of events in the real world that can be used to build a variety of predictive analytics. In this talk I will present three studies that demonstrate how social media signals, in combination with deep learning models, can be effectively used to make predictions about the future. First, I will discuss the advantages of linguistically infused deep learning models for predicting suspicious news posts on Twitter, including satire, hoaxes, clickbait and propaganda. I will highlight significant differences in the use of biased, subjective language and in the moral foundations behind suspicious versus trustworthy news posts. I will then present a large-scale analysis of targeted public sentiment using 1.2 million multilingual connotation frames extracted from Twitter. The analysis relies on connotation frames to build models that forecast country-specific connotation dynamics – perspective change over time towards salient entities and events during the Brussels bombings. Finally, I will discuss a study on modeling language dynamics in social media by tracking how the meaning of words fluctuates over time in the VKontakte social network. My team developed models to forecast short-term shifts in a word's meaning from its previous meaning as well as from word dynamics. Our models and novel findings advance the understanding of journalistic portrayal and biases in news reports, and improve situational awareness during crisis events.
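As a rough illustration of the setup behind the third study (and not the talk's actual models), the sketch below measures short-term semantic drift as the cosine distance between a word's embedding in consecutive time slices. The words, embeddings, and drift values are invented for demonstration, and the per-slice embeddings are assumed to have been trained and aligned to a common space beforehand.

import numpy as np

def cosine_drift(prev_vecs, curr_vecs):
    # Per-word drift between two row-aligned embedding matrices (one row per word).
    prev_norm = prev_vecs / np.linalg.norm(prev_vecs, axis=1, keepdims=True)
    curr_norm = curr_vecs / np.linalg.norm(curr_vecs, axis=1, keepdims=True)
    return 1.0 - np.sum(prev_norm * curr_norm, axis=1)

# Invented data: three words with 50-dimensional embeddings for weeks t-1 and t.
rng = np.random.default_rng(0)
week_prev = rng.normal(size=(3, 50))
week_curr = week_prev + 0.1 * rng.normal(size=(3, 50))  # small week-over-week shift
for word, drift in zip(["word_a", "word_b", "word_c"], cosine_drift(week_prev, week_curr)):
    print(f"{word}: drift = {drift:.3f}")

A large drift score for a word between adjacent time slices would flag it as a candidate for a short-term meaning shift; the forecasting models described in the talk go well beyond this toy signal.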
Bio: Svitlana Volkova is a Senior Research Scientist in the Data Sciences and Analytics Group, National Security Directorate, at Pacific Northwest National Laboratory. Dr. Volkova's research focuses on advancing machine learning and natural language processing techniques to develop novel predictive and forecasting analytics for social media. Svitlana's recent work includes forecasting social media dynamics – opinions and emotions, infectious disease outbreaks, real-world events, entity- and event-driven connotations, deception detection, and information biases in news and social media. Svitlana interned at Microsoft Research with the Natural Language Processing and the Machine Learning and Perception teams. She was awarded the Google Anita Borg Memorial Scholarship in 2010 and the Fulbright Scholarship in 2008. She is the Vice Chair of the ACM Future of Computing Academy. She received her PhD in Computer Science in 2015 from Johns Hopkins University, where she was affiliated with the Center for Language and Speech Processing and the Human Language Technology Center of Excellence.
Lyle Ungar: Measuring Psychological Traits using Social Media
Abstract: The words and images people post on social media such as Twitter and Facebook provide a rich, if imperfect, view of who they are and what they care about. We analyze tens of millions of Facebook posts and tens of billions of tweets to study variation in language use with age, gender, personality, and mental and physical well-being. Word clouds visually illustrate the Big Five personality traits (e.g., "What is it like to be neurotic?"), while correlations between language use and county-level health data suggest connections between health and happiness, including potential psychological causes of heart disease. Similar analyses are increasingly being used for applications ranging from job candidate screening to targeted marketing.
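A minimal sketch of the kind of county-level association the abstract mentions (not the actual analysis pipeline): correlate a language feature aggregated per county with a county-level health outcome. The feature, outcome values, and county counts below are invented for demonstration.

import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-county values: share of posts using anger-related words,
# and age-adjusted heart disease mortality per 100,000 residents.
anger_word_rate = np.array([0.021, 0.034, 0.012, 0.045, 0.027, 0.038])
heart_disease_mortality = np.array([152.0, 188.0, 131.0, 205.0, 164.0, 191.0])

r, p = pearsonr(anger_word_rate, heart_disease_mortality)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")

In practice such analyses use thousands of counties, richer language features (e.g., topic or category frequencies), and adjust for demographic covariates before interpreting any association.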
Bio: Dr. Lyle Ungar is a Professor of Computer and Information Science at the University of Pennsylvania. He received a B.S. from Stanford University and a Ph.D. from MIT. Dr. Ungar directed Penn's Executive Master's in Technology Management (EMTM) Program for a decade, and served as Associate Director of the Penn Center for Bioinformatics (PCBI). He has published over 200 articles and holds eleven patents. His current research focuses on statistical natural language processing, spectral methods, and the use of social media to understand the psychology of individuals and communities.
While a student at Sloan, Dr. Ungar worked as a strategic business analyst at the Boston Consulting Group. Since coming to Penn in 1984, he has consulted for firms ranging from start-ups to Fortune 500 companies on the strategic use of information technology in areas including data mining, information retrieval, online auction design, expert systems, and e-commerce.
Gideon Mann: The War on Facts
Abstract: Our era is one of increasing epistemological danger. From fake conversations to fake news and fake voices, our shared objective reality is under attack. Traditional gatekeepers were once able to confine public discourse to a small set of conversations, but open platforms now allow many voices to participate in and shape a conversation, including some voices with malicious and often hidden agendas. On top of this, technology has made it significantly easier to fabricate synthetic video and voice that mimic a recognized person. In this talk, I'll review the biggest threats to the discovery of truth and the ways in which disinformation is created and promulgated. I'll cover how a traditional newsroom fact-checks, and then briefly describe preliminary work we are doing at Bloomberg to address this emerging threat.
Bio: Gideon Mann is the Head of Data Science at Bloomberg L.P., where he guides the strategic direction for machine learning, natural language processing (NLP) and search across the company. He is part of the leadership team for the Office of the CTO. He is active on issues related to the ethics of data science and is a founding member of both the Data for Good Exchange (D4GX), an annual conference on data science applications for social good, and the Shift Commission on Work, Workers and Technology. Before joining Bloomberg in 2014, he worked at Google Research in NYC after a short postdoc at UMass Amherst. Mann graduated from Brown University in 1999 and received a Ph.D. from The Johns Hopkins University in 2006.
Brandon Stewart: Causal Inference with Statistical Text Analysis: Text as Outcome, Treatment and Confounder
Abstract: Texts are increasingly used to make causal inferences, with the document serving as outcome, treatment, or confounder. We introduce a new conceptual framework for understanding text-based inferences, demonstrate fundamental problems that arise when manual or computational approaches are applied to text for causal inference, and provide several solutions to the pressing problems we raise. Our work connects the social science literature on survey experiments, the A/B test approach from industry and machine learning, and the causal inference literature in statistics. We illustrate these approaches using a range of applications drawn from several areas of the social sciences. Taken together, our work provides a more rigorous foundation to build upon when applying text-based methods to causal inference.
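To make one of the three roles concrete, here is a toy sketch of text as a confounder (not the talk's framework or estimators): document-level topic proportions are assumed to have already been estimated with some topic model, and the treatment effect is estimated while adjusting for them. All data and effect sizes below are simulated.

import numpy as np

rng = np.random.default_rng(1)
n = 500
topics = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=n)    # per-document topic shares
treatment = rng.binomial(1, p=0.3 + 0.4 * topics[:, 0])   # treatment depends on topic 0
outcome = 2.0 * treatment + 3.0 * topics[:, 0] + rng.normal(size=n)

# Naive difference in means is confounded by topic 0.
naive = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# Adjusted estimate: regress outcome on treatment plus topic proportions
# (one topic dropped because the shares sum to one).
X = np.column_stack([np.ones(n), treatment, topics[:, :2]])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
print(f"naive estimate: {naive:.2f}, adjusted estimate: {beta[1]:.2f}")

The simulation is only meant to show why ignoring text-derived confounders biases a naive comparison; the problems with estimating the topic proportions and the treatment on the same documents are among the issues the talk addresses.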
Bio: Brandon Stewart is an Assistant Professor of Sociology at Princeton University, where he is also affiliated with the Politics Department, the Office of Population Research, the Princeton Institute for Computational Science and Engineering, and the Center for Digital Humanities. He develops new quantitative statistical methods for applications across computational social science. He holds a master's degree in Statistics (2014) and a Ph.D. in Government (2015), both from Harvard University. He has been awarded a National Science Foundation Graduate Research Fellowship, the Edward M. Chase Dissertation Prize (for the best essay on a subject relating to the promotion of peace at Harvard University), and the Gosnell Prize for Excellence in Political Methodology. He is a creator of the stm package in R, which implements the Structural Topic Model that he co-developed.