LinkedIn | Twitter | Google Scholar | CV
Email at hbansal@g.ucla.edu
I am a Ph.D. candidate in the Department of Computer Science at UCLA (2021-Present), co-advised by Kai-Wei Chang and Aditya Grover. This summer, I am interning at Meta FAIR (Seattle) with Ram Pasunuru (co-hosts: Devendra Sachan and Scott Yih).
My research focuses on data and algorithms for (a) reasoning in large language and multimodal models (smaller weaker better, gen verifiers, when2solve, openthoughts, openvlthinker), (b) vision-language understanding (lavida, cyclip, medmax, videocon), and (c) evaluating foundation models (mathvista, bbeh, videophy1/2).
In Summer 2024, I worked as a Student Researcher at Google DeepMind (hosts: Mehran Kazemi, Rishabh Agarwal; manager: Vinh Q. Tran). I spent Summer 2022 at AWS AI (manager: Sravan Bodapati, mentor: Karthik Gopalakrishnan). Before UCLA, I completed my B.Tech at IIT Delhi in 2020 (advisor: Sumeet Agarwal). I also worked at Goldman Sachs as an Analyst for a year in 2021.
I actively collaborate with other UCLA faculty, including Nanyun Peng, Jeffrey Brantingham, and Hongjing Lu. Additionally, I work with Google Research on multimodal learning (mentor: Yonatan Bitton, manager: Idan Szpektor) and with DataComp on data curation (advisor: Ludwig Schmidt).
July 2025: When To Solve, When To Verify is accepted to CoLM 2025!
June 2025: VideoPhy-2 won the 🏆 best paper award at Building Physically Plausible World Models ICML 2025!
May 2025: MathVista has been recognized as the 🔥 most influential ICLR 2024 paper!
May 2025: Our workshop on video foundations is accepted to ICCV 2025!
May 2024: Stormer won the 🏆 best paper award at Climate Change and AI ICLR 2024!
April 2024: VideoCon won the 🏆 best paper award at Data Problems for Foundation Models ICLR 2024!
March 2024: I passed my oral qualifying exam and advanced to candidacy!
January 2024: Peering through preferences and MathVista (Oral) got accepted at ICLR 2024!
August 2023: CleanCLIP got accepted at ICCV 2023 as an Oral Presentation!
May 2023: CleanCLIP won the 🏆 best paper award at the Reliable and Trustworthy Large-Scale ML workshop at ICLR 2023!
April 2023: Two papers accepted as Highlighted papers at the Trustworthy and Reliable Large-Scale ML Workshop at ICLR 2023!
October 2022: GeoMLAMA and ENTIGEN are accepted as Oral Presentations at EMNLP 2022!
September 2022: CyCLIP got accepted as an Oral Presentation at NeurIPS 2022!
Etash Guha*, Ryan Marten*, Sedrick Keh*, Negin Raoof*, Georgios Smyrnis*, Hritik Bansal, ..., Alex Dimakis, Ludwig Schmidt. OpenThoughts: Data Recipes for Reasoning Models. [Summary][Code]
Shufan Li*, Konstantinos Kallidromitis*, Hritik Bansal*, Akash Gokul*, Yusuke Kato, Kazuki Kozuka, Jason Kuen, Zhe Lin, Kai-Wei Chang, Aditya Grover. LaViDa: A Large Diffusion Language Model for Multimodal Understanding. [Code][Summary]
Yihe Deng, Hritik Bansal, Fan Yin, Nanyun Peng, Wei Wang, Kai-Wei Chang. OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement. [Code][Blog]
Hritik Bansal, Daniel Israel*, Siyan Zhao*, Shufan Li, Tung Nguyen, Aditya Grover. MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants. [Website][Code][Summary]
Nishad Singhi*, Hritik Bansal*, Arian Hosseini*, Aditya Grover, Kai-Wei Chang, Marcus Rohrbach, Anna Rohrbach. When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning. [Code][Summary]
Conference on Language Modeling (CoLM 2025)
Hritik Bansal*, Clark Peng*, Yonatan Bitton*, Roman Goldenberg, Aditya Grover, Kai-Wei Chang. VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation. [Code][Website][Summary]
🏆 BEST PAPER AWARD World Models ICML 2025
Mehran Kazemi, Bahare Fatemi, Hritik Bansal, ..., Kate Olszewska, Yi Tay, Vinh Q. Tran, Quoc V. Le, Orhan Firat. BIG-Bench Extra Hard. [Code][Summary]
ACL Main 2025
Hritik Bansal, Arian Hosseini, Rishabh Agarwal, Vinh Q. Tran, Mehran Kazemi. Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling. [Summary]
ICLR 2025
Lunjun Zhang, Arian Hosseini*, Hritik Bansal*, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal. Generative Verifiers: Reward Modeling as Next-Token Prediction. [Summary][Website]
ICLR 2025
Hritik Bansal*, Zongyu Lin*, Tianyi Xie, Zeshun Zong, Michal Yarom, Yonatan Bitton, Chenfanfu Jiang, Yizhou Sun, Kai-Wei Chang, Aditya Grover. VideoPhy: Evaluating Physical Commonsense for Video Generation. [Website][Summary][Code]
ICLR 2025, Oral Video-Language Models NeurIPS 2024
Hritik Bansal*, Pratyush Maini*. Peeking Behind Closed Doors: Risks of LLM Evaluation by Private Data Curators. [Summary][Paper]
ICLR 2025 Blogpost Track
Hritik Bansal, Yonatan Bitton, Idan Szpektor*, Kai-Wei Chang*, Aditya Grover*. VideoCon: Robust Video-Language Alignment via Contrast Captions. [Project Page][Summary][Code]
CVPR 2024, 🏆 BEST PAPER AWARD DPFM ICLR 2024
Hritik Bansal, Yonatan Bitton, Michal Yarom, Idan Szpektor, Aditya Grover, Kai-Wei Chang. TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation. [Website][Summary][Code]
Oral Video-Language Models NeurIPS 2024
Jeffrey Li*, Alex Fang*, Georgios Smyrnis*, Maor Ivgi*, Matt Jordan, Samir Gadre, Hritik Bansal, ..., Ludwig Schmidt, Vaishaal Shankar. DataComp-LM: In search of the next generation of training sets for language models. [Summary]
NeurIPS 2024
Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao, 2023. MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts. [Project Page][Summary][Code]
Oral ICLR 2024 (Top 1.2%), 🔥 Most influential ICLR-2024 paper
Hritik Bansal, John Dang, Aditya Grover, 2023. Peering Through Preferences: Unraveling the Feedback Acquisition for Aligning LLMs. [Summary][Code]
ICLR 2024
Rohan Wadhawan, Hritik Bansal*, Kai-Wei Chang, Nanyun Peng. ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Language Models. [Project Page][Summary][Code]
ICML 2024, Oral Vision Datasets Understanding CVPR 2024
Hritik Bansal*, Ashima Suvarna*, Gantavya Bhatt*, Nanyun Peng, Kai-Wei Chang, Aditya Grover. Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization. [Website][Summary][Code]
ACL Findings 2025, Oral DMLR ICML 2024
Tung Nguyen, Rohan Shah, Hritik Bansal, Troy Arcomano, Sandeep Madireddy, Romit Maulik, Veerabhadra Kotamarthi, Ian Foster, Aditya Grover. Scaling transformer neural networks for skillful and reliable medium-range weather forecasting.
NeurIPS 2024, 🏆 BEST PAPER AWARD Climate Change and AI ICLR 2024
Yonatan Bitton*, Hritik Bansal*, Jack Hessel*, Rulin Shao, Wanrong Zhu, Anas Awadalla, Josh Gardner, Rohan Taori, Ludwig Schmidt, 2023. VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use. [Summary][LAION Blog][Website][Code][Video]
NeurIPS 2023
Hritik Bansal, Aditya Grover, 2023. Leaving Reality to Imagination: Robust Classification via Generated Datasets. [Summary][Code][Dataset]
Oral Large-Scale ML ICLR 2023
Hritik Bansal*, Nishad Singhi*, Yu Yang, Fan Yin, Aditya Grover, Kai-Wei Chang, 2023. CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning. [Code][Summary]
Oral ICCV 2023, 🏆 BEST PAPER AWARD Large-Scale ML ICLR 2023
Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth, 2022. Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale. [Summary][Code]
ACL 2023
Da Yin*, Xiao Liu*, Fan Yin*, Ming Zhong*, Hritik Bansal, Jiawen Han, Kai-Wei Chang, 2023. Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation. [Project Page][Summary]
EMNLP 2023
Tung Nguyen*, Jason Jewik*, Hritik Bansal, Prakhar Sharma, Aditya Grover, 2023. ClimateLearn: Benchmarking Machine Learning for Weather and Climate Modeling. [Code][Summary]
NeurIPS 2023
Shashank Goel*, Hritik Bansal*, Sumit Bhatia, Ryan A. Rossi, Vishwa Vinay, Aditya Grover, 2022. CyCLIP: Cyclic Contrastive Language-Image Pre-training. [Code][Summary]
Oral NeurIPS 2022 (Top 1.7%)
Hritik Bansal*, Da Yin*, Masoud Monajatipoor, Kai-Wei Chang, 2022. How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions? [Summary][Code][Dataset]
Oral EMNLP 2022 (Top 4%)
Da Yin, Hritik Bansal, Masoud Monajatipoor, Liunian Harold Li, Kai-Wei Chang, 2022. GEOMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models. [Summary][Code]
Oral EMNLP 2022 (Top 4%)
Hritik Bansal*, Shashank Goel*, Tung Nguyen*, Aditya Grover, 2022. ClimateLearn: Machine Learning for Climate and Weather. [Tutorial][Slides][Code][Twitter Summary]
Spotlight Climate Change AI at NeurIPS 2022