AI Agents Research
Our goals for this year's cohort include:
Learn foundations of AI/ML and agentic techniques
Appreciate the advances in LLMs and become familiar with toolchains for using them as consumers
Understand what algorithmic bias is, particularly bias in AI/ML/LLMs
Study a few major algorithmic faux pas made by well-intentioned engineers. We will study the (possibly) unintended consequences of algorithms that lead to unfair biases.
Hypothesize how to mitigate, as much as possible, the biases inherent in computer algorithms and data sets, so as to minimize harm to society.
We intend for you to engage in serious scientific investigation in groups, and to learn how to leverage each group member in your research. No major research is done alone in the real world. If you are interested in more specificity, here are our minimal research expectations for this group.
Research Assistant: this term, we have the pleasure of having Theodore M. as our research assistant. He will share his current research with you and help each group with its research effort. He can be reached at: theodoremui@gmail.com
We are excited to welcome a large group of high school researchers to our summer research on agents and reasoning.
Git Repo: github.com/philmui/research2025
Summer Projects
Jun 21: Introduction to summer research
(more info will come shortly)
This Spring, we focus on understanding and developing AI agents that can autonomously perform tasks, make decisions, and interact with their environment. Our high school researchers will gain hands-on experience with cutting-edge AI technologies while developing critical research and programming skills.
Feb 8: Hello
Setup for agent development environments
Git Repo: github.com/philmui/research2025
Notebook: [link]
Readings
Stanford AI Index Report. (link)
Feb 15: Retrieval Augmented Generation (RAG) Agents
Feb 22: SKI WEEK (no meeting)
March 1: Agentic Tooling
March 9: Agentic Reasoning
March 16: Image RAG
Slides: [link]
Recording: [link]
Notebook: [link]
Readings:
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever. Learning Transferable Visual Models From Natural Language Supervision. arXiv:2103.00020. [link]
Yossi Gandelsman, Alexei A. Efros, and Jacob Steinhardt. Interpreting CLIP's Image Representation via Text-based Decomposition. arXiv:2310.05916. [link]
March 22: Projects with Agents and LLMs
Slides: [link]
Recording: [link]
Research Project
Readings:
Tom Ling. What Elinor Ostrom Teaches Us About Avoiding the 'Tragedy of the Commons' in Delivering the Public Interest in the Post COVID-19 World, RAND Commentary, Jun 22, 2020. [link]
Kreps, David, Robert Wilson, Paul Milgrom, and John Roberts. “Rational Cooperation in the Finitely Repeated Prisoners’ Dilemma.” Journal of Economic Theory 27, no. 2 (August 1982): 245–252.
Milgrom, Paul. “Axelrod’s The Evolution of Cooperation.” RAND Journal of Economics 15, no. 2 (1984): 305–309.
March 29: Agent Evaluations
April 5: Advanced Chunking
Notebook: [advanced RAG]
Recording: [link]
April 12: Reasoning
Notebook: [HuatuoGPT & MedReason]
Recording: [link]
Readings:
Junying Chen, Zhenyang Cai, Ke Ji, Xidong Wang, Wanlong Liu, Rongsheng Wang, Jianye Hou, Benyou Wang (2025) HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs. arXiv:2412.18925 [link] [dataset]
Juncheng Wu, Wenlong Deng, Xingxuan Li, Sheng Liu, Taomian Mi, Yifan Peng, Ziyang Xu, Yi Liu, Hyunjin Cho, Chang-In Choi, Yihan Cao, Hui Ren, Xiang Li, Xiaoxiao Li, Yuyin Zhou (2025) MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs. arXiv:2504.00993, April 1, 2025. [link]
April 19: Reflexion
b: BioReasoning Project
May 3: EvoGames Project
May 10: Research Project Q&A
Reading:
Circuit Tracing: Revealing Computational Graphs in Language Models [link]
Research Project
What are we covering this Spring?
- Introduction to AI Agents and their applications
- Setting up development environments and tools
- Understanding Large Language Models (LLMs)
- Prompt engineering and chain-of-thought reasoning
- Introduction to autonomous agents and decision-making
- Basic agent architectures and frameworks
- Building simple AI agents using popular frameworks
- Implementing goal-oriented behavior
- Working with APIs and external tools
- Collaborative project development
- Multi-agent systems and communication
- Studying social phenomena with multi-agent systems
- Ethical considerations in AI agents
- Original research projects
Sep 10: Hello & ANN [slides, recording]
Readings
LeCun, Yann; Léon Bottou; Yoshua Bengio; Patrick Haffner (1998). "Gradient-based learning applied to document recognition". Proceedings of the IEEE. 86 (11): 2278–2324. CiteSeerX 10.1.1.32.9552. doi:10.1109/5.726791. S2CID 14542261. Retrieved October 7, 2016. (link)
Sep 17: ANN & CNN [slides, recording]
Readings
Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey E. (May 24, 2017). "ImageNet classification with deep convolutional neural networks". Communications of the ACM. 60 (6): 84–90. doi:10.1145/3065386. ISSN 0001-0782. S2CID 195908774.
Oct 1: LLM-powered Chatbot [tutorial video]
Readings
OpenAI (2022) "Introducing ChatGPT," OpenAI Blog, Nov 30, 2022 (link)
Casado, Martin (2023) "The Economic Case for Generative AI," Andreessen Horowitz Blog, Sep 25, 2023 (link)
Baum, Jeremy; Villasenor, John (2023) "The politics of AI: ChatGPT and political bias," Brookings Commentary, May 8, 2023 (link)
Oct 8: Building an LLM Chatbot with Semantic Kernel [tutorial video]
Oct 15: Building Chatbots with Skills and Plugins [tutorial video]
Oct 22: Reasoning with Tools and Plugins [tutorial video]
Readings
Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao (2023), "ReAct: Synergizing Reasoning and Acting in Language Models", ICLR 2023 (link)
Oct 29: Image Understanding with LLMs [in class video]
Nov 12: Building GPTs [in class video]
Nov 19: OpenAI News and Discussions [in class video]
Nov 26: Thanksgiving Holiday (no meeting)
Dec 3: AI Model Evaluations [slides, in class video]
Readings
Stephanie Lin, Jacob Hilton, Owain Evans (2022) "TruthfulQA: Measuring How Models Mimic Human Falsehoods." Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, May 2022 (link)
Dec 10: Assistants API and Knowledge Retrieval [slides, in class video]
Readings
Charles Packer, Vivian Fang, Shishir G. Patil, Kevin Lin, Sarah Wooders, Joseph E. Gonzalez (2023) "MemGPT: Towards LLMs as Operating Systems," arXiv:2310.08560 [cs.AI], Oct 12, 2023 (link)
Dec 17: Multi-Agents [slides, in class video]
Readings
Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Shaokun Zhang, Erkang Zhu, Beibin Li, Li Jiang, Xiaoyun Zhang, Chi Wang (2023) "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation", MSR-TR-2023-33, August 2023 (link)
Sep-Nov : explore new areas of “Deep Learning” and Language Models
Dec-Jan : propose new areas of research
Feb-May : group research focused on publication
June 22: Introduction & Transformers [slides, recording]
Readings
R. Shinha, H. Poosarla, H. Fu, A. Suen, A. Suen, A. Gandhi, V. Lo, L. Avigad, H. Senthikumar. "Statistical Analysis of Bias in ChatGPT Using Prompt Engineering," International Journal for Research in Applied Science & Engineering Technology, Vol 11(VI), June 2023 (link)
June 25: Predictive Machine Learning [slides, recording]
Readings
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, L. Kaiser. "Attention Is All You Need," arXiv:1706.03762, 2017 (link)
June 29: Chatbots & Hallucinations [slides, recording]
Chatbots
Prompts
Hallucinations & Alignment
Papers: see this tab.
July 2: Chatbots & Prompting [slides, recording]
Papers: see this tab.
July 6: Grounding & Research [slides, recording]
We will review a few ideas for summer research
July 9: Embeddings & LlamaIndex [slides, recording]
Readings
Brown, et al. "Language Models are Few-Shot Learners," https://arxiv.org/abs/2005.14165, 2020 (link)
July 13: PII, Anonymization, Semantic Kernel [slides, recording]
Readings:
C. Patsakis, N. Lykousas. "Man vs the machine: The Struggle for Effective Text Anonymisation in the Age of Large Language Models," arXiv:2303.12429 [cs.CR] (link)
July 16: Plugins [slides, recording]
Readings:
OpenAI, ChatGPT Plugins, OpenAI Blog, March 23, 2023 (link)
July 27: Planning & Self-Critique [slides, recording]
Readings:
N. Goodman, "Meta-Prompt: A Simple Self-Improving Language Agent," Substack, April 8, 2023 (link)
H. Touvron, et al, "Llama 2: Open Foundation and Fine-Tuned Chat Models," Meta, July 18, 2023 (link)
A. Ghosh, A. Suen, P. Mui, "Application of Computational Analysis to Identify Housing Discrimination in 21st Century United States," International Journal for Research in Applied Science & Engineering Technology (IJRASET), Vol 11, Issue VII, July 2023 (link)
July 30 - Aug 11 : group meetings
Aug 13: ASDRP Expo Planning
Readings:
D. Hafner, "Benchmarking the Spectrum of Agent Capabilities," arXiv:2109.06780 [cs.AI], Sep 14, 2021 (link)
Y. Wu, et al, "SPRING: GPT-4 Out-performs RL Algorithms by Studying Papers and Reasoning," arXiv:2305.15486 [cs.AI] May 24, 2023 (link)
R. Murthy, et al, "REX: Rapid Exploration and eXploitation for AI Agents," arXiv:2307.08962 [cs.AI], July 18, 2023 (link)
W. Yao, et al, "Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization," arXiv:2308.02151 [cs.CL], August 4, 2023 (link)
July 8 (S) : submit 1-pager research proposal
July 29 (S) : submit initial data analysis
July 30-Aug 13: group research
mid-August : draft research paper
mid-August : ASDRP symposium presentation
September : finalize paper for publication