4th Workshop on NLP and CSS

EMNLP 2020, November 20, 2020 — Online

Welcome to the 4th Workshop on Natural Language Processing and Computational Social Science (NLP+CSS)!

Organizers: David Jurgens (University of Michigan), Svitlana Volkova (PNNL), David Bamman (UC Berkeley), Dirk Hovy (Bocconi University), Brendan O'Connor (UMass Amherst)

Email to contact organizers: nlp-and-css -at- googlegroups.com

On Twitter: @NLPandCSS

To attend, please register for EMNLP 2020.

Go to the EMNLP Virtual Workshop page (WS-18). This contains links for:

    • RocketChat channel: #workshop-nlpcss. We will use it for general announcements and updates, and it is also available for general discussion.

    • Zoom: for talks and panels.

    • GatherTown: for virtual poster sessions and the social / birds-of-a-feather session. Click "Calendar" within GatherTown to find the room (F/G; see the notes on the poster session below).

We will therefore switch between Zoom and GatherTown during the workshop. One technical note: you may sometimes need to close Zoom for GatherTown to work. The schedule notes where each event takes place.

This workshop follows the ACL Anti-Harassment Policy.


The workshop takes place on November 20, 2020.

  • 10:00 ET / 15:00 UTC – Opening Remarks (Zoom)

  • 10:15 ET / 15:15 UTC – Invited Speaker: Dong Nguyen, Assistant Professor, Information and Computing Sciences, Utrecht University. "When NLP Meets Language Variation." (Zoom)

  • 11:00 ET / 16:00 UTC – Virtual poster session 1 (GatherTown)

  • 12:00 ET / 17:00 UTC – Break

  • 12:15 ET / 17:15 UTC – Invited Speaker: Elizabeth E. Bruch, Associate Professor in Sociology and Complex Systems, University of Michigan. "How (and Why) Online Dating Experiences Differ across American Cities." (Abstract below) (Zoom)

  • 13:00 ET / 18:00 UTC – Lunch + Birds of a feather (GatherTown)

  • 14:00 ET / 19:00 UTC – Invited Speaker: Jesse Shapiro, Eastman Professor of Political Economy, Brown University. "Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional Speech." (Link to paper; abstract below) (Zoom)

  • 14:45 ET / 19:45 UTC – Virtual poster session 2 (GatherTown)

  • 16:00 ET / 21:00 UTC – Invited Speaker: Diyi Yang, Assistant Professor, Interactive Computing, Georgia Institute of Technology. "Persuasion, Bias, and Choice? Building Socially-aware Language Technologies." (Abstract below) (Zoom)

  • 16:45 ET / 21:45 UTC – Panel: "CSS Research: An Industry Perspective." (Zoom) With panelists:

    • Kristen Altenburger (Facebook)

    • Umashanthi Pavalanathan (Twitter)

    • Emre Kiciman (Microsoft Research)

    • Jason Baldridge (Google)

    • Glen Coppersmith (Qntfy)

  • 17:30 ET / 22:30 UTC – Closing Remarks (Zoom)

Invited Talk Abstracts

When NLP Meets Language Variation
Dong Nguyen, Assistant Professor, Information and Computing Sciences, Utrecht University

There are often various ways to express the same thing. Think of, for example, the different words we can use for a given concept, or the many creative spellings in social media. Language variation is often seen as a challenge for developing robust NLP models. In this talk I will reflect on what NLP and sociolinguistics have to offer each other. In particular, I will focus on how language variation is not just a problem to be solved, but also an opportunity to explore exciting questions about language and social behavior and to develop NLP models that are sensitive to social context.

How (and Why) Online Dating Experiences Differ across American Cities
Elizabeth E. Bruch, Associate Professor in Sociology and Complex Systems, University of Michigan

Social scientists have long shown that city-level differences in patterns of assortative mating, marriage rates, and non-marital childbearing are associated with labor market conditions and partnering opportunities. But it is difficult to observe the interactions that give rise to romantic outcomes in different places. As a result, we know little about whether, how, and why romantic experiences differ across cities. In this talk, I present results from a new study that uses rich activity data from a large, U.S. dating website to explore how population composition interacts with mate-seeking behavior to shape men and women's online romantic experience. Building on insights from psychology and behavioral ecology, I focus on two distinct classes of behavior: choice/preferences and competition. I show that mate seekers in different U.S. cities have divergent strategies for mate pursuit: they differ in their preferences, pickiness, and intensity of competition. In the final section, I focus on individuals who appear to change markets, to assess whether and how changing contexts is associated with a change in strategies for mate pursuit. This study represents a novel quantitative effort to show how men and women's mate-seeking behaviors differ systematically with their opportunities.

Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional Speech
Jesse Shapiro, Eastman Professor of Political Economy, Brown University

We study the problem of measuring group differences in choices when the dimensionality of the choice set is large. We show that standard approaches suffer from a severe finite-sample bias, and we propose an estimator that applies recent advances in machine learning to address this bias. We apply this method to measure trends in the partisanship of congressional speech from 1873 to 2016, defining partisanship to be the ease with which an observer could infer a congressperson’s party from a single utterance. Our estimates imply that partisanship is far greater in recent years than in the past, and that it increased sharply in the early 1990s after remaining low and relatively constant over the preceding century.
(Link to paper)

Persuasion, Bias, and Choice? Building Socially-aware Language Technologies
Diyi Yang, Assistant Professor, Interactive Computing, Georgia Institute of Technology

Over the last few decades, natural language processing (NLP) has had increasing success and produced industrial applications like search and personal assistants. Despite being sufficient to enable these applications, current NLP systems largely ignore the social part of language, e.g., who says it, in what context, and for what goals. In this talk, we take a closer look at the interplay between social signals and computational methods via three works. The first studies what makes language persuasive by introducing a semi-supervised neural network to recognize persuasion strategies in good-faith requests on crowdfunding platforms. We then describe our neural encoder-decoder systems, which automatically transform inappropriately subjective or unwanted framing into a neutral point of view. The last part demonstrates how conversation stages and topics can be used to generate better summaries of everyday interactions.

Accepted Papers, Talks, and Virtual Poster Session

See the list of papers on Virtual EMNLP to access each paper's 5-minute pre-recorded video and PDF. Authors will be available to discuss their papers during the two virtual poster sessions, which take place on GatherTown. You are encouraged to watch some videos beforehand, but if you can't, authors are encouraged to discuss their papers with you anyway!

How to find it: Inside GatherTown, click "Calendar" in the right sidebar, find the "NLPCSS Poster Session" entry in the list, then click it to get directions to the workshop room - it'll be rooms F and G. (Screenshot)

The papers are divided into two sessions - authors should be available to discuss during their session. They certainly can be available in the other one too, if they wish - the virtual "poster" signs are supposed to be there the whole time.

Session 1 (at 16:00 UTC):

  • "How Language Influences Attitudes Toward Brands." David DeFranza, Arul Mishra and Himanshu Mishra.

  • "Using BERT for Qualitative Content Analysis in Psychosocial Online Counseling." Philipp Grandeit, Carolyn Haberkern, Maximiliane Lang, Jens Albrecht and Robert Lehmann.

  • "Swimming with the Tide? Positional Claim Detection across Political Text Types." Nico Blokker, Erenay Dayanik, Gabriella Lapesa and Sebastian Padó.

  • "I miss you babe: Analyzing Emotion Dynamics During COVID-19 Pandemic." Lynnette Hui Xian Ng, Roy Ka-Wei Lee and Md Rabiul Awal.

  • "Assessing population-level symptoms of anxiety, depression, and suicide risk in real time using NLP applied to social media data." Alex Fine, Patrick Crutchley, Jenny Blase, Joshua Carroll and Glen Coppersmith.

  • "Topic preference detection: A novel approach to understand perspective taking in conversation." Michael Yeomans and Alison Wood Brooks.

  • "Viable Threat on News Reading: Generating Biased News Using Natural Language Models." Saurabh Gupta, Hong Huy Nguyen, Junichi Yamagishi and Isao Echizen.

  • "Unsupervised Anomaly Detection in Parole Hearings using Language Models." Graham Todd, Catalin Voss and Jenny Hong.

  • "Identifying Worry in Twitter: Beyond Emotion Analysis." Reyha Verma, Christian von der Weth, Jithin Vachery and Mohan Kankanhalli.

  • "Text Zoning and Classification for Job Advertisements in German, French and English." Ann-Sophie Gnehm and Simon Clematide.

  • "Is Wikipedia succeeding in reducing gender bias? Assessing changes in gender bias in Wikipedia using word embeddings." Katja Geertruida Schmahl, Tom Julian Viering, Stavros Makrodimitris, Arman Naseri Jahfari, David Tax and Marco Loog.

  • "Foreigner-directed speech is simpler than native-directed: Evidence from social media." Aleksandrs Berdicevskis.

  • "Emoji and Self-Identity in Twitter Bios." Jinhang Li, Giorgos Longinos, Steven Wilson and Walid Magdy.

Session 2 (at 19:45 UTC):

  • "Does Social Support (Expressed in Post Titles) Elicit Comments in Online Substance Use Recovery Forums?" Anietie Andy and Sharath Chandra Guntuku.

  • "Measuring Linguistic Diversity During COVID-19." Jonathan Dunn, Tom Coupe and Benjamin Adams.

  • "A Lexical Semantic Leadership Network of Nineteenth Century Abolitionist Newspapers." Sandeep Soni, Lauren Klein and Jacob Eisenstein.

  • "Effects of Anonymity on Comment Persuasiveness in Wikipedia Articles for Deletion Discussions." Yimin Xiao and Lu Xiao.

  • "Uncertainty over Uncertainty: Investigating the Assumptions, Annotations, and Text Measurements of Economic Policy Uncertainty." Katherine Keith, Christoph Teichmann, Brendan O'Connor and Edgar Meij.

  • "Recalibrating classifiers for interpretable abusive content detection." Bertie Vidgen, Scott Hale, Sam Staton, Tom Melham, Helen Margetts, Ohad Kammar and Marcin Szymczak.

  • "Predicting independent living outcomes from written reports of social workers." Angelika Maier and Philipp Cimiano.

  • "Analyzing Political Bias and Unfairness in News Articles at Different Levels of Granularity." Wei-Fan Chen, Khalid Al Khatib, Henning Wachsmuth and Benno Stein.

  • "Mapping Local News Coverage: Precise location extraction in textual news content using fine-tuned BERT based language model." Sarang Gupta and Kumari Nishu.

  • "Diachronic Embeddings for People in the News." Felix Hennig and Steven Wilson.

  • "Social media data as a lens onto care-seeking behavior among women veterans of the US armed forces." Kacie Kelly, Alex Fine and Glen Coppersmith.

  • "Understanding Weekly COVID-19 Concerns through Dynamic Content-Specific LDA Topic Modeling." Mohammadzaman Zamani, H. Andrew Schwartz, Johannes Eichstaedt, Sharath Chandra Guntuku, Adithya Virinchipuram Ganesan, Sean Clouston and Salvatore Giorgi.

  • "Analyzing Gender Bias within Narrative Tropes." Dhruvil Gala, Mohammad Omar Khursheed, Hannah Lerner, Brendan O'Connor and Mohit Iyyer.

  • "An Unfair Affinity Toward Fairness: Characterizing 70 Years of Social Biases in B$^{H}$ollywood." Kunal Khadilkar and Ashiqur KhudaBukhsh.