Bowen Jiang (Lauren)
she/her
Ph.D. Candidate at University of Pennsylvania
Research Intern at Microsoft and Google
Lauren Jiang is an AI researcher and a Ph.D. candidate in Computer Science at the University of Pennsylvania, where she is fortunate to be advised by Prof. Camillo J. Taylor and to collaborate with Prof. Dan Roth and Prof. Lyle Ungar. She is also a research intern at Microsoft and Google, and a visiting student at Argonne National Laboratory. She received her B.S. from the University of Illinois Urbana-Champaign in 2021, graduating with highest honors and the Bronze Tablet, under the generous guidance of Prof. Yoram Bresler and Prof. Sanmi Koyejo.
Her research focuses on advancing human-centered AI, with interests in reinforcement fine-tuning for large language models, personalization and memory, multimodality, synthetic data, and social intelligence. She works to advance reasoning and safety in next-generation AI models that reflect the best of humanity, and is drawn to research problems where simple yet scalable solutions make real-world impact.
👩🏻‍💻 [12-2025] I will be joining an AI start-up in stealth mode in San Francisco, CA as a Member of Technical Staff, working on data infrastructure.
👩🏻‍💻 [12-2025] I will be joining Google in Mountain View, CA as a research intern, working on personalized memory in video models.
📨 [12-2025] The official paper of PersonaMem-v2 is now released!
⭐ [09-2025] I am on the job market seeking industry research opportunities for 2026, including internship and full-time positions.
📨 [09-2025] PersonaMem-v2 is now available on Hugging Face! It is a cutting-edge LLM-personalization benchmark focusing on implicit user preferences in long conversations, and it has attracted 4,000+ downloads per month!
📄 [07-2025] Our new journal paper, "MARSHA: Multi-Agent RAG System for Hazard Adaptation," has been accepted at Nature npj Climate Action, supporting LLMs for scientific research in critical domains.
👩🏻‍💻 [05-2025] I will be joining the Microsoft Office of Applied Research in Redmond, WA as a research intern, working on reinforcement fine-tuning and AI-native collaboration.
📨 [04-2025] We released PersonaMem, a new LLM personalization benchmark accepted at COLM 2025, featuring 180+ persona-oriented multi-session user-model conversations, dynamic user preference updates, and long contexts up to 1M tokens.
🎤 [03-2025] I'll be giving a talk on LLM personalization at MASC-SLL 2025, hosted at Penn State University.
🙋‍♀️ [09-2024] I'll be giving a talk at the Penn Wharton AI & Analytics Initiative's Research & Education Symposium.
University of Pennsylvania
Ph.D. in Computer and Information Science, August 2021 - Present
M.S. in Computer and Information Science, August 2021 - December 2023
GPA 3.99/4.00
University of Illinois Urbana-Champaign
B.S. in Electrical Engineering with Highest Honors and Bronze Tablet award, August 2017 - May 2021
Minor in Computer Science
Minor in Mathematics
Advisors: Prof. Yoram Bresler (Founder Professor in Electrical and Computer Engineering) and Prof. Sanmi Koyejo (Assistant Professor at Stanford University)
GPA 3.99/4.00, Major GPA 4.00/4.00
Columbia University in the City of New York
Summer program for high school students, July - August 2016
Digital Filmmaking - From Initial Concept to Final Edit
An AI Startup in Stealth Mode
Member of Technical Staff, part-time, Upcoming in 2026
San Francisco, CA
Google
Research Intern, full-time, Upcoming in 2026
Mountain View, CA
Large video models, multimodal understanding, personalization, and memory.
Microsoft
Research Intern, part-time, August 2025 - December 2025
Office of Applied Research, Redmond, WA (Remote)
Mentor: Dr. Sihao Chen; Managers and Leadership: Dr. Longqi Yang, Dr. Brent Hecht, and Dr. Jamie Teevan.
LLM post-training, reinforcement learning, social intelligence, personalization, and AI-native collaboration.
Microsoft
Research Intern, full-time, May 2025 - August 2025
Office of Applied Research, Redmond, WA
Mentor: Dr. Sihao Chen; Managers and Leadership: Dr. Longqi Yang, Dr. Brent Hecht, and Dr. Jamie Teevan.
LLM post-training, reinforcement learning, social intelligence, personalization, and AI-native collaboration.
Argonne National Laboratory
Visiting Student, part-time, September 2024 - May 2025
Mathematics and Computer Science Division, Lemont, IL (Remote)
Mentor: Dr. Tanwi Mallick
LLMs for scientific applications, multimodality, and multi-agent systems.
Please refer to my Google Scholar page for a complete list of publications.
PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory
Personalization is one of the next milestones in advancing AI capability and alignment. We introduce PersonaMem-v2, the state-of-the-art dataset for LLM personalization that simulates 1,000 realistic user-chatbot interactions on 300+ scenarios, 20,000+ user preferences, and 128k-token context windows, where most user preferences are implicitly revealed to reflect real-world interactions. Using this data, we investigate how reinforcement fine-tuning enables a model to improve its long-context reasoning capabilities for user understanding and personalization. We also develop a framework for training an agentic memory system, which maintains a single, human-readable memory that grows with each user over time.
@article{jiang2025personamem,
  title={PersonaMem-v2: Towards Personalized Intelligence via Learning Implicit User Personas and Agentic Memory},
  author={Jiang, Bowen and Yuan, Yuan and Shen, Maohao and Hao, Zhuoqun and Xu, Zhangchen and Chen, Zichen and Liu, Ziyi and Vijjini, Anvesh Rao and He, Jiashu and Yu, Hanchao and others},
  journal={arXiv preprint arXiv:2512.06688},
  year={2025}
}
To be released soon.
One Model, All Roles: Conversational Self-Play via Multi-Turn, Multi-Agent Reinforcement Learning towards Social Intelligence
This work introduces a generalized framework for training AI agents to operate in social environments. Real-world social interactions are multi-party, multi-turn, and span long horizons, requiring agents to recognize distinct personas and adapt to evolving group dynamics. By using scalable simulations that reflect real-world collaboration and competition, the framework trains agents to plan strategically and communicate adaptively across extended dialogues. As a result, agents learn to navigate complex, dynamic social interactions autonomously and without human supervision.
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale
Bowen Jiang, Zhuoqun Hao, Young-Min Cho, Bryan Li, Yuan Yuan, Sihao Chen, Lyle Ungar, Camillo J. Taylor, Dan Roth
University of Pennsylvania & Microsoft
COLM 2025
This work is also known as PersonaMem.
A short version has been accepted at the 12th Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2025) [Oral]
We introduce PersonaMem, a personalization benchmark that features scalable and persona-oriented multi-session user-LLM conversations, as well as fine-grained in-situ user query types designed to evaluate LLM capabilities in memorizing, tracking, and incorporating users' dynamic profiles into personalized responses across diverse scenarios.
@article{jiang2025know,
  title={Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale},
  author={Jiang, Bowen and Hao, Zhuoqun and Cho, Young-Min and Li, Bryan and Yuan, Yuan and Chen, Sihao and Ungar, Lyle and Taylor, Camillo J and Roth, Dan},
  journal={arXiv preprint arXiv:2504.14225},
  year={2025}
}
Can LLMs Grasp Implicit Cultural Values? Benchmarking LLMs' Metacognitive Cultural Intelligence with CQ-Bench
Ziyi Liu, Priyanka Dey, Jen-tse Huang, Zhenyu Zhao, Bowen Jiang, Rahul Gupta, Yang Liu, Jieyu Zhao
University of Southern California & Amazon AGI & Johns Hopkins University & University of Pennsylvania
Under Review
CQ-Bench puts LLMs' cultural intelligence to the test, gauging whether they can read between the lines of global conversations. Despite near-human performance in explicit value recognition, models still stumble on subtle attitude detection. With just 500 culturally nuanced samples, reinforcement fine-tuning lets even smaller models outperform larger ones.
@article{liu2025can,
  title={Can LLMs Grasp Implicit Cultural Values? Benchmarking LLMs' Metacognitive Cultural Intelligence with CQ-Bench},
  author={Liu, Ziyi and Dey, Priyanka and Zhao, Zhenyu and Huang, Jen-tse and Gupta, Rahul and Liu, Yang and Zhao, Jieyu},
  journal={arXiv preprint arXiv:2504.01127},
  year={2025}
}
📄 Paper 👩‍🏫 Poster 🎬 Short Video 👩🏻‍💻 GitHub 📕 RedNote
A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners
Bowen Jiang, Yangxinyu Xie, Zhuoqun Hao, Xiaomeng Wang, Tanwi Mallick, Weijie J. Su, Camillo J. Taylor, Dan Roth
University of Pennsylvania & Argonne National Laboratory
A short version has also been accepted to the NeurIPS 2024 Workshop on Statistical Foundations of LLMs and Foundation Models, EMNLP 2024 GenBench Workshop & ICML 2024 Workshop on LLMs and Cognition.
We propose a new perspective for evaluating LLMs' logical reasoning abilities beyond accuracy benchmarks. Our findings reveal that LLMs rely primarily on token biases and superficial patterns rather than genuine reasoning. Using a hypothesis-testing approach with statistical guarantees, we highlight the need for caution when interpreting their generalization in reasoning tasks.
MARSHA: Multi-Agent RAG System for Hazard Adaptation
Yangxinyu Xie, Bowen Jiang, Tanwi Mallick, Joshua David Bergerson, John K. Hutchison, Duane R. Verner, Jordan Branham, M. Ross Alexander, Robert B. Ross, Yan Feng, Leslie-Anne Levy, Weijie Su, Camillo J. Taylor
Argonne National Laboratory & University of Pennsylvania
Nature - npj Climate Action
This work is also known as WildfireGPT.
A short version titled "WildfireGPT: Tailored Large Language Model for Wildfire Analysis" has been accepted to the NeurIPS 2024 Workshop on Tackling Climate Change with Machine Learning & the EMNLP 2024 Workshop on NLP for Positive Impact.
We propose a Retrieval-Augmented Generation (RAG)-based multi-agent LLM system to support analysis and decision-making in the context of natural hazards and extreme weather events. As a proof of concept, we present WildfireGPT, a specialized system focused on wildfire scenarios. The architecture employs a user-centered, multi-agent design to deliver tailored risk insights across diverse stakeholder groups.
@article{xie2025rag,
  title={A RAG-Based Multi-Agent LLM System for Natural Hazard Resilience and Adaptation},
  author={Xie, Yangxinyu and Jiang, Bowen and Mallick, Tanwi and Bergerson, Joshua David and Hutchison, John K and Verner, Duane R and Branham, Jordan and Alexander, M Ross and Ross, Robert B and Feng, Yan and others},
  journal={arXiv preprint arXiv:2504.17200},
  year={2025}
}
@article{xie2024wildfiregpt,
  title={WildfireGPT: Tailored Large Language Model for Wildfire Analysis},
  author={Xie, Yangxinyu and Jiang, Bowen and Mallick, Tanwi and Bergerson, Joshua David and Hutchison, John K and Verner, Duane R and Branham, Jordan and Alexander, M Ross and Ross, Robert B and Feng, Yan and Levy, Leslie-Anne and Taylor, Camillo J},
  journal={arXiv preprint arXiv:2402.07877},
  year={2024}
}
GeoGrid-Bench: Can Foundation Models Understand Multimodal Gridded Geo-Spatial Data?
Bowen Jiang, Yangxinyu Xie, Xiaomeng Wang, Jiashu He, Joshua Bergerson, John K. Hutchison, Jordan Branham, Camillo J. Taylor, Tanwi Mallick
Argonne National Laboratory & University of Pennsylvania
Under Review
We present GeoGrid-Bench, a benchmark designed to evaluate the ability of foundation models to understand geo-spatial data in the grid structure. Geo-spatial datasets pose distinct challenges due to their dense numerical values, strong spatial and temporal dependencies, and unique multimodal representations including tabular data, heatmaps, and geographic visualizations. To assess how foundation models can support scientific research in this domain, GeoGrid-Bench features large-scale, real-world data covering 16 climate variables across 150 locations and extended time frames. The benchmark includes approximately 3,200 question-answer pairs, systematically generated from 8 domain expert-curated templates to reflect practical tasks encountered by human scientists.
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
Bowen Jiang, Yuan Yuan, Xinyi Bai, Zhuoqun Hao, Alyson Yin, Yaojie Hu, Wenyu Liao, Lyle Ungar, Camillo J. Taylor
University of Pennsylvania & Cornell University & University of California Irvine
This work demonstrates that diffusion models can achieve font-controllable multilingual text rendering using just raw images without font label annotations. By integrating a conditional diffusion model with a text segmentation model, the method captures font styles in pixel space in a self-supervised manner, allowing user-specified font customization without ground-truth labels.
@article{jiang2025controltext,
  title={ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations},
  author={Jiang, Bowen and Yuan, Yuan and Bai, Xinyi and Hao, Zhuoqun and Yin, Alyson and Hu, Yaojie and Liao, Wenyu and Ungar, Lyle and Taylor, Camillo J},
  journal={arXiv preprint arXiv:2502.10999},
  year={2025}
}
Towards Rationality in Language and Multimodal Agents: A Survey
Bowen Jiang, Yangxinyu Xie, Xiaomeng Wang, Yuan Yuan, Zhuoqun Hao, Xinyi Bai, Weijie J. Su, Camillo J. Taylor, Tanwi Mallick
University of Pennsylvania & Cornell University & Argonne National Laboratory
A short version titled "Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey" has been accepted to the ICML 2024 Workshop on LLMs and Cognition.
Unlike reasoning, which aims to draw conclusions from premises, rationality ensures that those conclusions are reliably consistent, follow an ordered set of preferences, and align with evidence from various sources and logical principles. This survey is the first to comprehensively explore the notion of rationality in language and multimodal agents, analyzing how designs of existing agents and agent systems advance certain key axioms of rationality.
@article{jiang2024towards,
  title={Towards Rationality in Language and Multimodal Agents: A Survey},
  author={Jiang, Bowen and Xie, Yangxinyu and Wang, Xiaomeng and Yuan, Yuan and Hao, Zhuoqun and Bai, Xinyi and Su, Weijie J and Taylor, Camillo J and Mallick, Tanwi},
  journal={arXiv preprint arXiv:2406.00252},
  year={2024}
}
@article{jiang2024multi,
  title={Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey},
  author={Jiang, Bowen and Xie, Yangxinyu and Wang, Xiaomeng and Su, Weijie J and Taylor, Camillo J and Mallick, Tanwi},
  journal={arXiv preprint arXiv:2406.00252},
  year={2024}
}
Vysics: Object Reconstruction Under Occlusion by Fusing Vision and Contact-Rich Physics
Bibit Bianchini, Minghan Zhu, Mengti Sun, Bowen Jiang, Camillo Jose Taylor, Michael Posa
University of Pennsylvania
A short version titled "Instance-Agnostic Geometry and Contact Dynamics Learning" has been accepted to the IROS 2023 Workshop on Leveraging Models for Contact-Rich Manipulation.
We introduce Vysics, a vision-and-physics framework for a robot to build an expressive geometry and dynamics model of a single rigid body, using a seconds-long RGBD video and the robot's proprioception. It uses a vision-based tracking and reconstruction method, BundleSDF, to estimate the trajectory and the visible geometry from an RGBD video, and an odometry-based model learning method, Physics Learning Library (PLL), to infer the "physible" geometry from the trajectory through implicit contact dynamics optimization.
@article{bianchini2025vysics,
  title={Vysics: Object Reconstruction Under Occlusion by Fusing Vision and Contact-Rich Physics},
  author={Bianchini, Bibit and Zhu, Minghan and Sun, Mengti and Jiang, Bowen and Taylor, Camillo J and Posa, Michael},
  journal={arXiv preprint arXiv:2504.18719},
  year={2025}
}
@article{sun2023instance,
  title={Instance-Agnostic Geometry and Contact Dynamics Learning},
  author={Sun, Mengti and Jiang, Bowen and Bianchini, Bibit and Taylor, Camillo Jose and Posa, Michael},
  journal={arXiv preprint arXiv:2309.05832},
  year={2023}
}
Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge
Bowen Jiang, Zhijun Zhuang, Shreyas S. Shivakumar, Camillo J. Taylor
University of Pennsylvania
A short version titled "Hierarchical Relationships: A New Perspective to Enhance Scene Graph Generation" has been accepted to the NeurIPS 2023 Workshop on New Frontiers in Graph Learning & Workshop on Queer in AI.
We develop plug-and-play modules that lift state-of-the-art scene graph generation methods to new levels of performance. Our approach integrates LLMs to critique predictions and reduce commonsense violations, alongside a Bayesian classification scheme that leverages the hierarchical structure of relations.
@article{jiang2023enhancing,
  title={Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge},
  author={Jiang, Bowen and Zhuang, Zhijun and Taylor, Camillo Jose},
  journal={arXiv preprint arXiv:2311.12889},
  year={2023}
}
@article{jiang2023scene,
  title={Scene Graph Generation from Hierarchical Relationship Reasoning},
  author={Jiang, Bowen and Taylor, Camillo J},
  journal={arXiv preprint arXiv:2303.06842},
  year={2023}
}
Reviewer for the Journal of Machine Learning Research
Reviewer for the IEEE Transactions on Multimedia
Reviewer for the IEEE Sensors Journal
Reviewer for IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) 2026
Reviewer for IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025
Reviewer for the ICLR 2025 Representational Alignment Workshop
Reviewer for the NeurIPS 2024 Workshop on Behavioral ML
Reviewer for the IEEE/CVF CVPR 2024 Workshop on Scene Graphs and Graph Representation Learning
Reviewer for the IEEE/CVF CVPR 2024 Workshop on What is Next in Multimodal Foundation Models
Reviewer for the Elsevier journal Tunnelling and Underground Space Technology
Reviewer for the IEEE/CVF ICCV 2023 Workshop on Scene Graphs and Graph Representation Learning
IEEE ICRA 2022 student volunteer in oral sessions and workshops
Teaching Assistant of Penn CIS 5190 Applied Machine Learning instructed by Prof. Mark Yatskar, Prof. Osbert Bastani, Prof. Surbhi Goel, and Prof. Eric Wong, September 2023 - April 2024
Laboratory Assistant of UIUC ECE 210 Analog Signal Processing, August 2018 - December 2018
Spotlight Oral Talk at the 12th Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2025) at Penn State, April 2025
- "Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale"
Penn Warren & ASSET Center Research Mixer, September 2024
- "A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners"
Wharton AI & Analytics Initiative Research & Education Symposium, September 2024
- "WildfireGPT: Tailored Large Language Model for Wildfire Analysis"
Spotlight Oral Talk at IEEE/CVF CVPR 2024 Workshop on Computer Vision in the Wild, June 2024
- "Multi-Agent VQA: Exploring Multi-Agent Foundation Models in Zero-Shot Visual Question Answering"
EMNLP 2024 GenBench Scholarship supported by ACL and Amazon, September 2024
Penn Engineering WiCS Grace Hopper Celebration Scholarship, August 2024
Five-Year Full Fellowship at Admission, Fall 2021
Highest Honors at Graduation, Spring 2021
The Bronze Tablet, top three percent of the graduating class, Spring 2021
Edmund J. James Scholar, Fall 2018 - Spring 2021
2020-2021 Illinois Engineering Achievement Scholarship, November 2020
2020-2021 Henry O. Koehler Merit Scholarship, September 2020
(\_/) 🍺<\
( •_•) (•_• )
/>🍰 (♪♪/)