Stanford Social NLP Reading Group
Purpose
Welcome! The Social NLP reading group focuses on reading papers at the intersection of the social sciences and NLP. Ideally, the papers will expose us to
(1) Significant innovations in NLP methods, models, or design paradigms applied to social problems, or
(2) New theories and concepts from the social sciences and how they are studied with NLP approaches.
Logistics
We hold weekly meetings where we discuss one or more papers related to a theme of interest.
How to get involved
Join us for our weekly meeting: Wednesdays, 10:00-11:00 am, in Gates 304 and on Zoom
Join our Slack channel: #social-nlp
Organizers
Funding
We are grateful to Stanford HAI for supporting us through the Student Affinity Groups program.
Schedule Spring Quarter 2023/24
Week 1, Apr 17th
Topic: LLM agents
Papers:
Wu, Qingyun, et al. "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework." arXiv preprint arXiv:2308.08155 (2023). https://arxiv.org/abs/2308.08155
Discussion lead: Chi Wang (guest)
Week 2, Apr 24th
Topic: Prediction-powered inference and LLM evaluation
Papers:
Angelopoulos, A.N., Bates, S., Fannjiang, C., Jordan, M.I. and Zrnic, T., 2023. Prediction-powered inference. Science, 382(6671), pp.669-674. https://www.science.org/doi/full/10.1126/science.adi6000
Zrnic, T. and Candès, E.J., 2024. Active Statistical Inference. arXiv preprint arXiv:2403.03208. https://arxiv.org/abs/2403.03208
Boyeau, P., Angelopoulos, A.N., Yosef, N., Malik, J. and Jordan, M.I., 2024. AutoEval Done Right: Using Synthetic Data for Model Evaluation. arXiv preprint arXiv:2403.07008. https://arxiv.org/abs/2403.07008
Discussion lead: Tijana Zrnić (guest)
Week 3, May 1st
Topic: Automated data testing in Australian politics and Canadian journalism
Papers:
Digitization of the Australian Parliamentary Debates, 1998–2022 https://www.nature.com/articles/s41597-023-02464-w
Digitization of the Canadian Parliamentary Debates https://www.jstor.org/stable/26858546
Implementing Automated Data Validation for Canadian Political Datasets https://arxiv.org/abs/2309.12886
Evaluating the Decency and Consistency of Data Validation Tests Generated by LLMs https://arxiv.org/abs/2310.01402
Discussion lead: Lindsay Katz (guest)
Week 4, May 8th
Topic: Concept-aware LLMs
Papers:
Shani, Chen, Jilles Vreeken, and Dafna Shahaf. "Towards Concept-Aware Large Language Models." https://aclanthology.org/2023.findings-emnlp.877/
Optional:
Jiang, Yang, et al. "Expert feature-engineering vs. deep neural networks: which is better for sensor-free affect detection?." Artificial Intelligence in Education: 19th International Conference, AIED 2018, London, UK, June 27–30, 2018, Proceedings, Part I 19. Springer International Publishing, 2018. https://learninganalytics.upenn.edu/ryanbaker/jiang-aied2018.pdf
Rytting, Christopher, and David Wingate. "Leveraging the inductive bias of large language models for abstract textual reasoning." Advances in Neural Information Processing Systems 34 (2021): 17111-17122. https://arxiv.org/pdf/2110.02370
Si, Chenglei, et al. "Measuring inductive biases of in-context learning with underspecified demonstrations." arXiv preprint arXiv:2305.13299 (2023). https://aclanthology.org/2023.acl-long.632.pdf
Papadimitriou, Isabel, and Dan Jurafsky. "Injecting structural hints: Using language models to study inductive biases in language learning." Findings of the Association for Computational Linguistics: EMNLP 2023. 2023. https://aclanthology.org/2023.findings-emnlp.563.pdf
Cazenavette, George, and Simon Lucey. "On the bias against inductive biases." arXiv preprint arXiv:2105.14077 (2021). https://arxiv.org/pdf/2105.14077
Discussion lead: Chen Shani
Schedule Winter Quarter 2023/24
Week 1, Jan 24th
Topic: Probing for making consistent measurements with LLMs
Papers:
Unsupervised Contrast-Consistent Ranking with Language Models (Stoehr et al. 2023)
Characterizing Mechanisms for Factual Recall in Language Models (Yu et al., 2023)
Optional:
Measurement in the Age of LLMs: An Application to Ideological Scaling (O’Hagan and Schein, 2023)
Discovering Latent Knowledge in Language Models Without Supervision (Burns et al., 2023)
Linear Representations of Sentiment in Large Language Models (Tigges et al., 2023)
Discussion lead: Niklas Stoehr (guest)
Week 2, Jan 31st
Topic: Persuasion in the era of LLMs
Papers:
Discussion lead: Weiyan Shi
Week 3, Feb 14th
Topic: NLP & Policing
Papers:
Rho, Eugenia H., et al. "Escalated police stops of Black men are linguistically and psychologically distinct in their earliest moments." Proceedings of the National Academy of Sciences 120.23 (2023): e2216162120. https://www.pnas.org/doi/full/10.1073/pnas.2216162120
Optional:
Pierson, E., Simoiu, C., Overgoor, J., Corbett-Davies, S., Jenson, D., Shoemaker, A., ... & Goel, S. (2020). A large-scale analysis of racial disparities in police stops across the United States. Nature human behaviour, 4(7), 736-745. https://www.nature.com/articles/s41562-020-0858-1
Weisburd, D., Telep, C. W., Vovak, H., Zastrow, T., Braga, A. A., & Turchan, B. (2022). Reforming the police through procedural justice training: A multicity randomized trial at crime hot spots. Proceedings of the National Academy of Sciences, 119(14), e2118780119. https://www.pnas.org/doi/full/10.1073/pnas.2118780119
Prowse, G., Weaver, V. M., & Meares, T. L. (2020). The state from below: Distorted responsiveness in policed communities. Urban Affairs Review, 56(5), 1423-1471. https://journals.sagepub.com/doi/full/10.1177/1078087419844831
Discussion lead: Maggie Harrington
Week 4, Feb 21st
Topic: Values in LLMs
Papers:
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties https://arxiv.org/abs/2309.00779
A Roadmap to Pluralistic Alignment https://arxiv.org/abs/2402.05070
LLMs grasp morality in concept https://arxiv.org/abs/2311.02294
Optional:
Can Machines Learn Morality? The Delphi Experiment. https://arxiv.org/abs/2110.07574
Aligning AI With Shared Human Values https://arxiv.org/abs/2008.02275
Discussion lead: Jared Moore
Week 5, Feb 28th
Topic: Reasoning about networks with LLMs
Papers:
Papachristou, Marios, and Yuan Yuan. "Network Formation and Dynamics Among Multi-LLMs." arXiv preprint arXiv:2402.10659 (2024). https://arxiv.org/pdf/2402.10659.pdf
Fatemi, Bahare, Jonathan Halcrow, and Bryan Perozzi. "Talk like a graph: Encoding graphs for large language models." arXiv preprint arXiv:2310.04560 (2023). https://arxiv.org/pdf/2310.04560.pdf
Discussion lead: Alicja Chaszczewicz
Week 6, Mar 6th
Topic: Beyond Memorization: Stronger Language Models and Their Privacy Implications
Papers:
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory https://arxiv.org/abs/2310.17884
What Does it Mean for a Language Model to Preserve Privacy? https://arxiv.org/abs/2202.05520
Optional:
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents https://arxiv.org/abs/2401.10019
Discussion lead: Yijia Shao
Schedule Fall Quarter 2023/24
Week 1, Oct 16th
Topic: Language, Identity, and Language Model Interaction
Papers:
Schlesinger, Ari, W. Keith Edwards, and Rebecca E. Grinter. "Intersectional HCI: Engaging identity through gender, race, and class." Proceedings of the 2017 CHI conference on human factors in computing systems. 2017. https://dl.acm.org/doi/10.1145/3025453.3025766
Zamfirescu-Pereira, J. D., et al. "Why Johnny can’t prompt: how non-AI experts try (and fail) to design LLM prompts." Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 2023. https://dl.acm.org/doi/abs/10.1145/3544548.3581388
Optional:
Zheng, Lianmin, et al. "LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset." arXiv preprint arXiv:2309.11998 (2023). https://arxiv.org/pdf/2309.11998v3.pdf
Volkova, Svitlana, et al. "Inferring latent user properties from texts published in social media." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 29. No. 1. 2015. https://ojs.aaai.org/index.php/AAAI/article/view/9271
Discussion lead: Will Held
Week 2, Oct 23rd
Topic: From Question Answering to Question Asking
Papers:
Large Language Models for Automated Open-domain Scientific Hypotheses Discovery (arXiv preprint, https://arxiv.org/abs/2309.02726)
Socially situated artificial intelligence enables learning from human interaction (PNAS Vol. 119 | No. 39, https://www.pnas.org/doi/full/10.1073/pnas.2115730119 )
Don’t Just Tell Me, Ask Me: AI Systems that Intelligently Frame Explanations as Questions Improve Human Logical Discernment Accuracy over Causal AI explanations (CHI 2023, https://dl.acm.org/doi/10.1145/3544548.3580672)
Discussion lead: Yijia Shao
Week 3, Oct 30th
Topic: Anthropomorphism
Papers:
See the Slack channel.
Optional:
Talking About Large Language Models https://arxiv.org/abs/2212.03551
Computational analysis of 140 years of US political speeches reveals more positive but increasingly polarized framing of immigration https://www.pnas.org/doi/epdf/10.1073/pnas.2120510119
Discussion lead: Myra Cheng
Week 4, Nov 6th
Topic: Guiding Design of LLM-powered chatbots with Expert Feedback
Papers:
Petridis, Savvas, et al. "ConstitutionMaker: Interactively Critiquing Large Language Models by Converting Feedback into Principles." https://arxiv.org/abs/2310.15428
Chen, Siyuan, et al. "LLM-empowered Chatbots for Psychiatrist and Patient Simulation: Application and Evaluation." https://arxiv.org/abs/2305.13614
Discussion lead: Ryan Louie
Week 5, Nov 13th
Topic: Representation of global culture and values in LLMs
Papers:
Towards Measuring the Representation of Subjective Global Opinions in Language Models https://arxiv.org/pdf/2306.16388.pdf
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models https://arxiv.org/pdf/2305.14456.pdf
(only Section 5) BHASA: A Holistic Southeast Asian Linguistic and Cultural Evaluation Suite for Large Language Models https://arxiv.org/vc/arxiv/papers/2309/2309.06085v1.pdf
Discussion lead: Yifan Mai
Week 6, Nov 27th
Topic: Conversational grounding with LLMs
Papers:
Grounding or Guesswork? Large Language Models are Presumptive Grounders https://arxiv.org/abs/2311.09144
Principles of Mixed-Initiative Interaction https://dl.acm.org/doi/pdf/10.1145/302979.303030
Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations https://arxiv.org/abs/2311.05584
Optional:
Co-constructing intersubjectivity with artificial conversational agents: People are more likely to initiate repairs of misunderstandings with agents represented as human https://www.sciencedirect.com/science/article/pii/S0747563215303101
Generative Theories of Interaction https://hal.science/hal-03434142/file/GenTheory%20authorversion.pdf
Discussion lead: Omar Shaikh
Week 7, Dec 4th
Topic: Eliciting human preferences with Language models
Papers:
Eliciting Human Preferences with Language Models. Belinda Z. Li, Alex Tamkin, Noah Goodman, Jacob Andreas. https://arxiv.org/abs/2310.11589
Discussion lead: Alex Tamkin
Schedule Spring Quarter 2022/23
Week 1, Apr 6th
Topic: Overconfidence and uncertainty
Papers:
Prince, Ellen F. (Linguistics), Joel Frader (Pediatrics), and Charles Bosk (Sociology). "On Hedging in Physician-Physician Discourse." (1980). http://www.cs.columbia.edu/~prokofieva/CandidacyPapers/Prince_Hedging.pdf
Zhou, Kaitlyn, Dan Jurafsky, and Tatsunori Hashimoto. "Navigating the Grey Area: Expressions of Overconfidence and Uncertainty in Language Models." arXiv preprint arXiv:2302.13439 (2023). https://arxiv.org/abs/2302.13439
Mielke, Sabrina J., et al. "Reducing conversational agents’ overconfidence through linguistic calibration." Transactions of the Association for Computational Linguistics 10 (2022): 857-872. https://arxiv.org/abs/2012.14983
Discussion lead: Kaitlyn Zhou
Week 2, Apr 13th
Topic: Large language models as simulated humans
Papers:
Generative Agents: Interactive Simulacra of Human Behavior https://arxiv.org/abs/2304.03442
Whose Opinions Do Language Models Reflect? https://arxiv.org/abs/2303.17548
(Optional) Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus? https://john-joseph-horton.com/papers/llm_ask.pdf
Discussion lead: Joon Park
Week 3, Apr 20th
Topic: Culture and pragmatics
Thomas, Jenny. "Cross-cultural pragmatic failure." Applied linguistics 4.2 (1983): 91-112.
Discussion lead: Jing Huang
Week 4, Apr 27th
Topic: Reflection and self-correction
The Capacity for Moral Self-Correction in Large Language Models. https://arxiv.org/abs/2302.07459
Reflexion: an autonomous agent with dynamic memory and self-reflection. https://arxiv.org/abs/2303.11366
Discussion lead: Tiziano Piccardi
Week 5, May 4th
Topic: Polarization and mitigations
Discussion lead: Martin Saveski
Week 6, May 11th
Topic: Causality and common sense
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality. Emre Kıcıman, Robert Ness, Amit Sharma, Chenhao Tan. https://arxiv.org/abs/2305.00050
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs. Maarten Sap, Ronan Le Bras, Daniel Fried, and Yejin Choi https://arxiv.org/pdf/2210.13312
Discussion lead: Kristina Gligoric
Week 7, May 18th
Topic: Measuring stereotypes in LLMs and in simulation scenarios
Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models. Cheng et al. (pdf in the Slack channel)
Discussion lead: Myra Cheng
Week 8, May 25th
Topic: Large language models and linguistic theory
What’s the deal with Chomsky? And other commonly asked questions about linguistics
Chomsky’s The False Promise of ChatGPT
Modern language models refute Chomsky’s approach to language
Optional:
Modern Language Models Refute Nothing (response on Twitter)
The Impact of Large Language Models on Linguistic Theory and Generative Grammar: A Critical Analysis
Discussion lead: Omar Shaikh
Week 9, Jun 1st
Topic: AI and persuasion
Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations. Goldstein et al. https://arxiv.org/pdf/2301.04246.pdf
Quantifying the potential persuasive returns to political microtargeting. Tappin et al. https://psyarxiv.com/dhg6k/
Working with AI to persuade: Examining a large language model’s ability to generate pro-vaccination messages. Karinshak et al. https://dl.acm.org/doi/abs/10.1145/3579592
Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks. Kang et al. https://arxiv.org/pdf/2302.05733.pdf
Discussion lead: Rylan Schaeffer
Week 10, Jun 8th
Brainstorming session
Schedule Winter Quarter 2022/23
Week 1, Jan 19th
Discussing the logistics
Week 2, Jan 26th
Topic: Causality
Papers:
Feder, Amir, et al. "Causal inference in natural language processing: Estimation, prediction, interpretation and beyond." Transactions of the Association for Computational Linguistics 10 (2022): 1138-1158. https://arxiv.org/pdf/2109.00725.pdf
Keith, Katherine A., Douglas Rice, and Brendan O'Connor. "Text as Causal Mediators: Research Design for Causal Estimates of Differential Treatment of Social Groups via Language Aspects." arXiv preprint arXiv:2109.07542 (2021). https://arxiv.org/pdf/2109.07542.pdf
Keidar, Daphna, et al. "Slangvolution: A Causal Analysis of Semantic Change and Frequency Dynamics in Slang." arXiv preprint arXiv:2203.04651 (2022). https://arxiv.org/pdf/2203.04651.pdf
Discussion lead: Kristina Gligoric
Week 3, Feb 2nd
Topic: Reinforcement learning
Papers:
Pyatkin, Valentina, et al. "Reinforced Clarification Question Generation with Defeasibility Rewards for Disambiguating Social and Moral Situations." arXiv preprint arXiv:2212.10409 (2022). https://arxiv.org/abs/2212.10409
Ramamurthy, Rajkumar, et al. "Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization." arXiv preprint arXiv:2210.01241 (2022). https://arxiv.org/abs/2210.01241
Discussion lead: Caleb Ziems
Week 4, Feb 9th
Topic: Prediction, explanation, and integrating the two in NLP
Papers:
Shmueli. To Explain or to predict? https://projecteuclid.org/journals/statistical-science/volume-25/issue-3/To-Explain-or-to-Predict/10.1214/10-STS330.full
Schäfer, Mike S., and Valerie Hase. "Computational methods for the analysis of climate change communication: Towards an integrative and reflexive approach." Wiley Interdisciplinary Reviews: Climate Change (2022): e806. https://wires.onlinelibrary.wiley.com/doi/epdf/10.1002/wcc.806
Hofman, Jake M., et al. "Integrating explanation and prediction in computational social science." Nature 595.7866 (2021): 181-188. https://www.nature.com/articles/s41586-021-03659-0
Agrawal, Mayank, et al. "Scaling up psychology via scientific regret minimization." Proceedings of the National Academy of Sciences 117, no. 16 (2020): 8825-8835. https://www.pnas.org/doi/10.1073/pnas.1915841117
Discussion lead: Myra Cheng
Week 5, Feb 16th
Topic: Common ground and communication
Papers:
Herbert Clark’s “Using Language,” Chapter 4 (Chapter 4.pdf)
Contributing to Discourse. https://onlinelibrary.wiley.com/doi/pdfdirect/10.1207/s15516709cog1302_7
Grounding as a Collaborative Process. https://aclanthology.org/2021.eacl-main.41.pdf
Optional reading:
Communities, Commonalities, and Communication https://www.sol.lu.se/media/utbildning/dokument/kurser/LINB30/20082/Clark.pdf
Shared common ground influences information density in microblog texts
Discussion lead: Omar Shaikh
Week 6, Feb 23rd
Topic: Human-centered data and language models: Privacy, data as labor, and licensing
Papers:
Contractor, Danish, et al. "Behavioral use licensing for responsible AI." 2022 ACM Conference on Fairness, Accountability, and Transparency. 2022. https://dl.acm.org/doi/abs/10.1145/3531146.3533143
The Paradox of Reuse, Language Models Edition. https://nmvg.mataroa.blog/blog/the-paradox-of-reuse-language-models-edition/
ChatGPT Stole Your Work. So What Are You Going to Do? https://www.wired.com/story/chatgpt-generative-artificial-intelligence-regulation/
Guest: Nick Vincent
Week 7, Mar 2nd
Topic: Agreeability among humans
Papers:
Bakker, Michiel A., et al. "Fine-tuning language models to find agreement among humans with diverse preferences." https://www.deepmind.com/publications/fine-tuning-language-models-to-find-agreement-among-humans-with-diverse-preferences
Argyle, Lisa P., et al. "AI Chat Assistants can Improve Conversations about Divisive Topics." https://arxiv.org/pdf/2302.07268.pdf
Discussion lead: Nicole Meister
Week 8, Mar 9th
Topic: Simulating humans with LLMs: Challenges and opportunities for social science research
Papers:
Argyle, Lisa P., et al. "Out of one, many: Using language models to simulate human samples." Political Analysis (2022): 1-15. https://arxiv.org/pdf/2209.06899.pdf
Aher, Gati, Rosa I. Arriaga, and Adam Tauman Kalai. "Using Large Language Models to Simulate Multiple Humans." https://arxiv.org/pdf/2208.10264.pdf
I strongly feel that this is an insult to life itself, Kevin Munger. https://kevinmunger.substack.com/p/i-strongly-feel-that-this-is-an-insult
Optional: Kosinski, Michal. "Theory of mind may have spontaneously emerged in large language models." arXiv preprint arXiv:2302.02083 (2023). https://arxiv.org/pdf/2302.02083.pdf
Discussion lead: Tiziano Piccardi
Week 9, Mar 16th
Topic: Race and racism
Papers:
Voigt, Rob, et al. "Language from police body camera footage shows racial disparities in officer respect." Proceedings of the National Academy of Sciences 114.25 (2017): 6521-6526. https://www.pnas.org/doi/10.1073/pnas.1702413114
Prabhakaran, Vinodkumar, et al. "Detecting institutional dialog acts in police traffic stops." Transactions of the Association for Computational Linguistics 6 (2018): 467-481. https://transacl.org/ojs/index.php/tacl/article/view/1349
Camp, Nicholas P., et al. "The thin blue waveform: Racial disparities in officer prosody undermine institutional trust in the police." Journal of personality and social psychology 121.6 (2021): 1157. https://www.apa.org/pubs/journals/releases/psp-pspa0000270.pdf
Epp, Charles R., Steven Maynard‐Moody, and Donald Haider-Markel. "Beyond profiling: The institutional sources of racial disparities in policing." Public Administration Review 77.2 (2017): 168-178. https://onlinelibrary.wiley.com/doi/full/10.1111/puar.12702
Discussion lead: Anjalie Field