Bowen Jiang (Lauren)
she/her
Ph.D. Candidate at University of Pennsylvania
Research Intern at Microsoft
👩🏻💻 I will be joining Microsoft Office of Applied Research in Redmond, WA as a research intern in summer 2025, working on fundamental reinforcement fine-tuning, self-evolving algorithms, and collaboration environments.
🚨 We released PersonaMem, a new LLM personalization benchmark accepted at COLM 2025, featuring 180+ persona-oriented multi-session user-model conversations, dynamic user preference updates, and long contexts of up to 1M tokens. We experimented with GPT-4.1, o4-mini, GPT-4.5, o1, Gemini-2, DeepSeek-R1-607B, LLaMA-4, Claude-3.7, and other SOTA models. I will be presenting an oral talk on this work at MASC-SLL 2025.
🚀 We have a new journal paper accepted at Nature npj Climate Action, titled "MARSHA: Multi-Agent RAG System for Hazard Adaptation" to support LLMs for scientific research.
🚀 We have a new publication at EMNLP 2024 titled "A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners". This work proposes the concept of "Token Bias" to question the generalization of reasoning in LLMs.
🙆♀️ I am going to give a talk at the Penn Wharton AI & Analytics Initiative's Research & Education Symposium.
University of Pennsylvania
Ph.D. in Computer and Information Science, August 2021 - Present
M.S. in Computer and Information Science, August 2021 - December 2023
Advisor: Prof. Camillo J. Taylor - Raymond S. Markowitz President's Distinguished Professor
GPA 3.99/4.00
University of Illinois Urbana-Champaign
B.S. in Electrical Engineering with Highest Honors and Bronze Tablet award, August 2017 - May 2021
Minor in Computer Science
Minor in Mathematics
GPA 3.99/4.00, Major GPA 4.00/4.00
Columbia University in the City of New York
Summer program for high school students, July - August 2016
Digital Filmmaking - From Initial Concept to Final Edit
Microsoft Corporation
Research Intern - Office of Applied Research, May 2025 - August 2025
Mentor: Dr. Sihao Chen; Manager: Dr. Longqi Yang
Post-training, reinforcement learning, social intelligence, self-evolution, and multi-agent collaboration.
Argonne National Laboratory
Visiting Student - Mathematics and Computer Science Division, September 2024 - May 2025
Mentor: Dr. Tanwi Mallick
LLMs for scientific applications, evaluation, multimodality, and multi-agent systems.
University of Pennsylvania
Teaching Assistant of Prof. Mark Yatskar and Prof. Osbert Bastani
CIS 5190 Applied Machine Learning, September 2023 - May 2024
Please refer to my Google Scholar page for a complete list of publications.
Personalization is the pluralistic alignment!
Know Me, Respond to Me:
Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale
Bowen Jiang, Zhuoqun Hao, Young-Min Cho, Bryan Li, Yuan Yuan, Sihao Chen, Lyle Ungar, Camillo J. Taylor, Dan Roth
University of Pennsylvania & Microsoft
A short version has been accepted at the 12th Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2025) [Oral]
We introduce PersonaMem, a personalization benchmark that features scalable and persona-oriented multi-session user-LLM conversations, as well as fine-grained in-situ user query types designed to evaluate LLM capabilities in memorizing, tracking, and incorporating users’ dynamic profiles into personalized responses across diverse scenarios.
@article{jiang2025know,
title={Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale},
author={Jiang, Bowen and Hao, Zhuoqun and Cho, Young-Min and Li, Bryan and Yuan, Yuan and Chen, Sihao and Ungar, Lyle and Taylor, Camillo J and Roth, Dan},
journal={arXiv preprint arXiv:2504.14225},
year={2025}
}
Reasoning requires more generalization!
A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners
Bowen Jiang, Yangxinyu Xie, Zhuoqun Hao, Xiaomeng Wang, Tanwi Mallick, Weijie J. Su, Camillo J. Taylor, Dan Roth
University of Pennsylvania & Argonne National Laboratory
A short version has also been accepted to the NeurIPS 2024 Workshop on Statistical Foundations of LLMs and Foundation Models, EMNLP 2024 GenBench Workshop & ICML 2024 Workshop on LLMs and Cognition.
We propose a new perspective for evaluating LLMs' logical reasoning abilities beyond accuracy benchmarks. Our findings reveal that LLMs primarily rely on token biases and superficial patterns rather than true reasoning. Using a hypothesis testing approach with statistical guarantees, we highlight the need for caution when interpreting their generalization in reasoning tasks.
AI for science with real-world climate data!
MARSHA: Multi-Agent RAG System for Hazard Adaptation
Yangxinyu Xie, Bowen Jiang, Tanwi Mallick, Joshua David Bergerson, John K. Hutchison, Duane R. Verner, Jordan Branham, M. Ross Alexander, Robert B. Ross, Yan Feng, Leslie-Anne Levy, Weijie Su, Camillo J. Taylor
Argonne National Laboratory & University of Pennsylvania
A short version titled "WildfireGPT: Tailored Large Language Model for Wildfire Analysis" has been accepted to the NeurIPS 2024 Workshop on Tackling Climate Change with Machine Learning & EMNLP 2024 Workshop on NLP for Positive Impact
We propose a Retrieval-Augmented Generation (RAG)-based multi-agent LLM system to support analysis and decision-making in the context of natural hazards and extreme weather events. As a proof of concept, we present WildfireGPT, a specialized system focused on wildfire scenarios. The architecture employs a user-centered, multi-agent design to deliver tailored risk insights across diverse stakeholder groups.
@article{xie2025rag,
title={A rag-based multi-agent llm system for natural hazard resilience and adaptation},
author={Xie, Yangxinyu and Jiang, Bowen and Mallick, Tanwi and Bergerson, Joshua David and Hutchison, John K and Verner, Duane R and Branham, Jordan and Alexander, M Ross and Ross, Robert B and Feng, Yan and others},
journal={arXiv preprint arXiv:2504.17200},
year={2025}
}
@article{xie2024wildfiregpt,
title={WildfireGPT: Tailored Large Language Model for Wildfire Analysis},
author={Xie, Yangxinyu and Jiang, Bowen and Mallick, Tanwi and Bergerson, Joshua David and Hutchison, John K and Verner, Duane R and Branham, Jordan and Alexander, M Ross and Ross, Robert B and Feng, Yan and Levy, Leslie-Anne and Taylor, Camillo J},
journal={arXiv preprint arXiv:2402.07877},
year={2024}
}
VLMs can copilot geo-spatial data analysts!
GeoGrid-Bench: Can Foundation Models Understand Multimodal Gridded Geo-Spatial Data?
Bowen Jiang, Yangxinyu Xie, Xiaomeng Wang, Jiashu He, Joshua Bergerson, John K. Hutchison, Jordan Branham, Camillo J. Taylor, Tanwi Mallick
Argonne National Laboratory & University of Pennsylvania
Under Review
We present GeoGrid-Bench, a benchmark designed to evaluate the ability of foundation models to understand geo-spatial data in the grid structure. Geo-spatial datasets pose distinct challenges due to their dense numerical values, strong spatial and temporal dependencies, and unique multimodal representations including tabular data, heatmaps, and geographic visualizations. To assess how foundation models can support scientific research in this domain, GeoGrid-Bench features large-scale, real-world data covering 16 climate variables across 150 locations and extended time frames. The benchmark includes approximately 3,200 question-answer pairs, systematically generated from 8 domain expert-curated templates to reflect practical tasks encountered by human scientists.
AI can render texts with unseen fonts!
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
Bowen Jiang, Yuan Yuan, Xinyi Bai, Zhuoqun Hao, Alyson Yin, Yaojie Hu, Wenyu Liao, Lyle Ungar, Camillo J. Taylor
University of Pennsylvania & Cornell University & University of California Irvine
Under Review
This work demonstrates that diffusion models can achieve font-controllable multilingual text rendering using just raw images without font label annotations. By integrating a conditional diffusion model with a text segmentation model, the method captures font styles in pixel space in a self-supervised manner, allowing user-specified font customization without ground-truth labels.
@article{jiang2025controltext,
title={ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations},
author={Jiang, Bowen and Yuan, Yuan and Bai, Xinyi and Hao, Zhuoqun and Yin, Alyson and Hu, Yaojie and Liao, Wenyu and Ungar, Lyle and Taylor, Camillo J},
journal={arXiv preprint arXiv:2502.10999},
year={2025}
}
How to make your agents more rational?
Towards Rationality in Language and Multimodal Agents: A Survey
Bowen Jiang, Yangxinyu Xie, Xiaomeng Wang, Yuan Yuan, Zhuoqun Hao, Xinyi Bai, Weijie J. Su, Camillo J. Taylor, Tanwi Mallick
University of Pennsylvania & Cornell University & Argonne National Laboratory
A short version titled "Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey" has been accepted to the ICML 2024 Workshop on LLMs and Cognition.
Unlike reasoning that aims to draw conclusions from premises, rationality ensures that those conclusions are reliably consistent, have an orderability of preference, and are aligned with evidence from various sources and logical principles. This survey is the first to comprehensively explore the notion of rationality in language and multimodal agents, analyzing how designs in existing agents and agent systems contribute to advancing certain key axioms of rationality.
@article{jiang2024towards,
title={Towards Rationality in Language and Multimodal Agents: A Survey},
author={Jiang, Bowen and Xie, Yangxinyu and Wang, Xiaomeng and Yuan, Yuan and Hao, Zhuoqun and Bai, Xinyi and Su, Weijie J and Taylor, Camillo J and Mallick, Tanwi},
journal={arXiv preprint arXiv:2406.00252},
year={2024}
}
@article{jiang2024multi,
title={Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey},
author={Jiang, Bowen and Xie, Yangxinyu and Wang, Xiaomeng and Su, Weijie J and Taylor, Camillo J and Mallick, Tanwi},
journal={arXiv preprint arXiv:2406.00252},
year={2024}
}
Physics helps better visual estimations!
Vysics: Object Reconstruction Under Occlusion by Fusing Vision and Contact-Rich Physics
Bibit Bianchini, Minghan Zhu, Mengti Sun, Bowen Jiang, Camillo Jose Taylor, Michael Posa
University of Pennsylvania
A short version titled "Instance-Agnostic Geometry and Contact Dynamics Learning" has been accepted to the IROS 2023 Workshop on Leveraging Models for Contact-Rich Manipulation
We introduce Vysics, a vision-and-physics framework for a robot to build an expressive geometry and dynamics model of a single rigid body, using a seconds-long RGBD video and the robot’s proprioception. It uses a vision-based tracking and reconstruction method, BundleSDF, to estimate the trajectory and the visible geometry from an RGBD video, and an odometry-based model learning method, Physics Learning Library (PLL), to infer the “physible” geometry from the trajectory through implicit contact dynamics optimization.
@article{bianchini2025vysics,
title={Vysics: Object Reconstruction Under Occlusion by Fusing Vision and Contact-Rich Physics},
author={Bianchini, Bibit and Zhu, Minghan and Sun, Mengti and Jiang, Bowen and Taylor, Camillo J and Posa, Michael},
journal={arXiv preprint arXiv:2504.18719},
year={2025}
}
@article{sun2023instance,
title={Instance-Agnostic Geometry and Contact Dynamics Learning},
author={Sun, Mengti and Jiang, Bowen and Bianchini, Bibit and Taylor, Camillo Jose and Posa, Michael},
journal={arXiv preprint arXiv:2309.05832},
year={2023}
}
Predict finer relations among objects!
Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge
Bowen Jiang, Zhijun Zhuang, Shreyas S. Shivakumar, Camillo J. Taylor
University of Pennsylvania
A short version titled "Hierarchical Relationships: A New Perspective to Enhance Scene Graph Generation" has been accepted to the NeurIPS 2023 Workshop on New Frontiers in Graph Learning & Workshop on Queer in AI.
We develop plug-and-play modules that enhance state-of-the-art scene graph generation methods to new levels of performance. Our approach integrates LLMs to critique predictions and reduce commonsense violations, alongside a Bayesian classification scheme that leverages a hierarchical structure in relations for improved performance.
@article{jiang2023enhancing,
title={Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge},
author={Jiang, Bowen and Zhuang, Zhijun and Taylor, Camillo Jose},
journal={arXiv preprint arXiv:2311.12889},
year={2023}
}
@article{jiang2023scene,
title={Scene graph generation from hierarchical relationship reasoning},
author={Jiang, Bowen and Taylor, Camillo J},
journal={arXiv preprint arXiv:2303.06842},
year={2023}
}
Specialized tools make VLMs more capable!
Multi-Agent VQA: Exploring Multi-Agent Foundation Models in Zero-Shot Visual Question Answering
Bowen Jiang, Zhijun Zhuang, Yuan Yuan, Shreyas S. Shivakumar, Dan Roth, Camillo J. Taylor
University of Pennsylvania
CVPR 2024 Workshop on Computer Vision in the Wild [Spotlight Oral] and Workshop on Multimodal Foundation Models
We explore the zero-shot capabilities of foundation models in Visual Question Answering in the open world, and propose an adaptive multi-agent system to address their limitations in object detection and counting. Instead of fine-tuning foundation models for specific datasets, our approach uses specialized agents as tools.
@article{jiang2024multi,
title={Multi-Agent VQA: Exploring Multi-Agent Foundation Models in Zero-Shot Visual Question Answering},
author={Jiang, Bowen and Zhuang, Zhijun and Shivakumar, Shreyas S and Roth, Dan and Taylor, Camillo J},
journal={arXiv preprint arXiv:2403.14783},
year={2024}
}
Active learning reduces annotation costs!
Batch Active Learning from the Perspective of Sparse Approximation
Maohao Shen, Bowen Jiang, Jacky Yibo Zhang, Oluwasanmi Koyejo
MIT, University of Pennsylvania, Stanford University
NeurIPS 2022 Workshop on Human in the Loop Learning
@article{shen2022batch,
title={Batch active learning from the perspective of sparse approximation},
author={Shen, Maohao and Jiang, Bowen and Zhang, Jacky Yibo and Koyejo, Oluwasanmi},
journal={arXiv preprint arXiv:2211.00246},
year={2022}
}
Reviewer for the Journal of Machine Learning Research
Reviewer for the IEEE Transactions on Multimedia
Reviewer for the IEEE Sensors Journal
Reviewer for IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025
Reviewer for the ICLR 2025 Representational Alignment Workshop
Reviewer for the NeurIPS 2024 Workshop on Behavioral ML
Reviewer for the IEEE/CVF CVPR 2024 Workshop on Scene Graphs and Graph Representation Learning
Reviewer for the IEEE/CVF CVPR 2024 Workshop on What is Next in Multimodal Foundation Models
Reviewer for the Elsevier Journal of Tunneling and Underground Space Technology
Reviewer for the IEEE/CVF ICCV 2023 Workshop on Scene Graphs and Graph Representation Learning
IEEE ICRA 2022 student volunteer in oral sessions and workshops
Teaching Assistant of Penn CIS 5190 Applied Machine Learning, September 2023 - April 2024
Laboratory Assistant of UIUC ECE 210 Analog Signal Processing, August 2018 - December 2018
Spotlight Oral Talk at the 12th Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2025) at Penn State, April 2025
- "Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale"
Penn Warren & ASSET Center Research Mixer, September 2024
- "A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners"
Wharton AI & Analytics Initiative Research & Education Symposium, September 2024
- "WildfireGPT: Tailored Large Language Model for Wildfire Analysis"
Spotlight Oral Talk at IEEE/CVF CVPR 2024 Workshop on Computer Vision in the Wild, June 2024
- "Multi-Agent VQA: Exploring Multi-Agent Foundation Models in Zero-Shot Visual Question Answering"
EMNLP 2024 GenBench Scholarship supported by ACL and Amazon, September 2024
Penn Engineering WiCS Grace Hopper Celebration Scholarship, August 2024
Five-Year Full Fellowship at Admission, Fall 2021
Highest Honors at Graduation, Spring 2021
The Bronze Tablet, top three percent of the graduating class, Spring 2021
Edmund J. James Scholar, Fall 2018 - Spring 2021
2020-2021 Illinois Engineering Achievement Scholarship, November 2020
2020-2021 Henry O. Koehler Merit Scholarship, September 2020
I enjoy cooking and exploring food from different cultures. I love pets, especially bunnies and cats. I also like visual arts and going on road trips.
Understanding the value of our humanity is the essential task in the super-intelligence era. Stay hungry, stay foolish.
Feel free to reach out for conversation or research collaborations. If you identify as a member of a historically marginalized group, know that this is a safe and welcoming space.
(\_/) 😺<\
( •_•) (•_• )
/>🐰 (∪∪/)
Contact bwjiang [at] seas [dot] upenn [dot] edu
Redmond, WA and Philadelphia, PA, United States