Papers


Ethics & LLM


Man vs the machine: The Struggle for Effective Text Anonymisation in the Age of Large Language Models

Constantinos Patsakis, Nikolaos Lykousas

March 22, 2023

https://arxiv.org/abs/2303.12429


Constitutional AI: Harmlessness from AI Feedback

Yuntao Bai, et al.

December 15, 2022

https://arxiv.org/abs/2212.08073


An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models

Nicholas Meade, Elinor Poole-Dayan, Siva Reddy

October 16, 2021

https://arxiv.org/abs/2110.08527

https://github.com/McGill-NLP/bias-bench


StereoSet: Measuring stereotypical bias in pretrained language models

Moin Nadeem, Anna Bethke, Siva Reddy

April 20, 2020

https://arxiv.org/abs/2004.09456

https://huggingface.co/datasets/stereoset

https://paperswithcode.com/dataset/stereoset


Reasoning


BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents

Zhiwei Liu, Weiran Yao, Jianguo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

August 11, 2023

https://arxiv.org/abs/2308.05960


Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization

Weiran Yao, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Yihao Feng, Le Xue, Rithesh Murthy, Zeyuan Chen, Jianguo Zhang, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

August 4, 2023

https://arxiv.org/abs/2308.02151


REX: Rapid Exploration and eXploitation for AI Agents

Rithesh Murthy, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Le Xue, Weiran Yao, Yihao Feng, Zeyuan Chen, Akash Gokul, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

July 2023

https://arxiv.org/abs/2307.08962


Using Tree-of-Thought Prompting to boost ChatGPT's reasoning (Zero Shot ToT)

Dave Hulbert

June 2023

https://github.com/dave1010/tree-of-thought-prompting


Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan

May 2023

https://arxiv.org/abs/2305.10601

https://github.com/princeton-nlp/tree-of-thought-llm

https://github.com/jieyilong/tree-of-thought-puzzle-solver
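
For readers skimming this list, a compact sketch of the paper's breadth-first search over "thoughts": keep the b most promising partial solutions, expand each with k LM-proposed next steps, and repeat. The propose and score helpers, and the list-of-thoughts state, are hypothetical stand-ins for LM calls, not the paper's actual code.

# Tree-of-Thoughts BFS sketch. `propose(state, k)` would ask an LM for k
# candidate next thoughts; `score(state)` would ask an LM to rate a partial
# solution. Both are assumed helpers supplied by the caller; a state is a
# list of thought strings, starting from `root`.
def tree_of_thoughts_bfs(root, propose, score, depth=3, k=4, b=2):
    frontier = [root]                      # beam of partial solutions
    for _ in range(depth):
        candidates = [s + [t] for s in frontier for t in propose(s, k)]
        frontier = sorted(candidates, key=score, reverse=True)[:b]  # keep best b
    return max(frontier, key=score)        # best complete line of thought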


Self-Refine: Iterative Refinement with Self-Feedback

Aman Madaan, et al.

March 30, 2023

https://arxiv.org/abs/2303.17651


Reflexion: Language Agents with Verbal Reinforcement Learning

Noah Shinn, Federico Cassano, Beck Labash, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao

March 2023

https://arxiv.org/abs/2303.11366


Toolformer: Language Models Can Teach Themselves to Use Tools

Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom

February 2023

https://arxiv.org/abs/2302.04761


ReAct: Synergizing Reasoning and Acting in Language Models

Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao

November 2022

https://arxiv.org/abs/2210.03629
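
For quick reference, a made-up trace in the Thought / Action / Observation format the paper proposes; the Search and Finish actions mirror the paper's action space, while the question and trace here are invented for illustration.

# Illustrative ReAct-style prompt (a Python string for readability). In a real
# loop, the text after each "Action:" is parsed, the tool is called, and its
# result is appended as the next "Observation:" before re-prompting the model.
REACT_EXAMPLE = """Question: Which city hosts the university where Alan Turing earned his PhD?
Thought: Alan Turing earned his PhD at Princeton University. I should find where Princeton is.
Action: Search[Princeton University location]
Observation: Princeton University is in Princeton, New Jersey.
Thought: I now know the city.
Action: Finish[Princeton, New Jersey]
"""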


Automatic Chain of Thought Prompting in Large Language Models

Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola

October 2022

https://arxiv.org/abs/2210.03493

https://github.com/amazon-science/auto-cot


Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou

January 2022

https://arxiv.org/abs/2201.11903
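
For readers new to the technique: a few-shot chain-of-thought prompt, adapted from the paper's running example, where the exemplar answer spells out intermediate steps instead of just the final number.

# Few-shot CoT prompt in the paper's style. Completing it with a sufficiently
# large model tends to elicit step-by-step reasoning ("23 - 20 = 3; 3 + 6 = 9;
# the answer is 9") rather than a bare guess, which is the paper's central observation.
COT_PROMPT = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?
A:"""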



LLM


The False Promise of Imitating Proprietary LLMs

Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine, Dawn Song

May 2023

https://arxiv.org/abs/2305.15717


BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining

Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, Tie-Yan Liu

April 2023

https://arxiv.org/abs/2210.10341


Large Language Models Are Human-Level Prompt Engineers

Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba

March 2023

https://arxiv.org/abs/2211.01910


HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face

Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang

March 2023

https://arxiv.org/abs/2303.17580


Chain of Hindsight Aligns Language Models with Feedback

Hao Liu, Carmelo Sferrazza, Pieter Abbeel

March 2023

https://arxiv.org/abs/2302.02676


Finetuned Language Models Are Zero-Shot Learners (FLAN)

Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le

September 2021

https://arxiv.org/abs/2109.01652

https://openreview.net/forum?id=gEZrGCozdqR


TALM: Tool Augmented Language Models

Aaron Parisi, Yao Zhao, Noah Fiedel

May 2022

https://arxiv.org/abs/2205.12255


Ask Me Anything: A simple strategy for prompting language models (AMA)

Simran Arora, Avanika Narayan, et al.

November 2022

https://arxiv.org/abs/2210.02441


What learning algorithm is in-context learning? Investigations with linear models

Ekin Akyürek, Dale Schuurmans, et al.

November 2022

https://arxiv.org/abs/2211.15661


Training language models to follow instructions with human feedback

Long Ouyang, Jeff Wu, et al.

OpenAI's instruction-tuning (InstructGPT) paper; its supervised fine-tuning stage corresponds to the davinci-instruct-beta and text-davinci-001 models (a minimal sketch of that stage follows this entry).

March 2022

https://arxiv.org/abs/2203.02155
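
A minimal sketch of the supervised fine-tuning stage mentioned in the note above, assuming a Hugging Face-style causal LM: plain next-token cross-entropy on human-written demonstrations, with the loss masked over the prompt tokens. The gpt2 stand-in and the toy example pair are illustrative, not from the paper.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")        # stand-in for the base model
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

examples = [("Explain gravity to a child.",
             "Gravity is the invisible pull that makes things fall toward the ground.")]

for prompt, demonstration in examples:
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + " " + demonstration, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100       # no loss on the prompt tokens
    loss = model(input_ids=full_ids, labels=labels).loss  # next-token cross-entropy
    loss.backward()
    opt.step()
    opt.zero_grad()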


On the Opportunities and Risks of Foundation Models

Rishi Bommasani, et al.

Stanford CRFM and HAI, August 2021

https://crfm.stanford.edu/report.html


On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜

Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, Shmargaret Shmitchell

March 2021

https://dl.acm.org/doi/10.1145/3442188.3445922


Prefix-Tuning: Optimizing Continuous Prompts for Generation

Xiang Lisa Li, Percy Liang

January 2021

https://arxiv.org/abs/2101.00190


Language Models are Few-Shot Learners (GPT3)

Tom B. Brown, Benjamin Mann, et al.

OpenAI's initial GPT-3 paper; its largest model corresponds to the original davinci model in the API.

July 2020

https://arxiv.org/abs/2005.14165


Deep reinforcement learning from human preferences (RLHF)

Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei

June 2017

https://arxiv.org/abs/1706.03741


Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin

June 2017

https://arxiv.org/abs/1706.03762


UK Ofqual Press



Harvard Admissions


Misinformation


Misinformation - news


Facial Recognition


Text-to-Image Generators



Recent Student Publications