V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models. Hsu-kuang Chiu (Carnegie Mellon University), Ryo Hachiuma (NVIDIA), Chien-Yi Wang (NVIDIA), Stephen F. Smith (Carnegie Mellon University), Yu-Chiang Frank Wang (NVIDIA / National Taiwan University), Min-Hung Chen (NVIDIA). [Paper] [Poster] [Website]
X-Fusion: Introducing New Modality to Frozen Large Language Models. Sicheng Mo (UCLA), Thao Nguyen (University of Wisconsin - Madison), Xun Huang (Adobe), Siddharth Iyer (Adobe), Yijun Li (Adobe), Yuchen Liu (Adobe), Abhishek Tandon (Adobe), Eli Shechtman (Adobe), Krishna Singh (Adobe), Yong Jae Lee (University of Wisconsin - Madison), Bolei Zhou (UCLA), Yuheng Li (Adobe). [Paper] [Poster] [Website] [Supplementary]
GG-SSMs: Graph-Generating State Space Models. Nikola Zubic (University of Zurich / ETH Zurich), Davide Scaramuzza (University of Zurich). [Paper] [Poster] [Code]
Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation. Chanyoung Kim (Yonsei University), Dayun Ju (Yonsei University), Woojung Han (Yonsei University), Ming-Hsuan Yang (University of California, Merced), Seong Jae Hwang (Yonsei University). [Paper] [Poster] [Code] [Website] [Video]
DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation. Kadir Yilmaz (RWTH Aachen University), Karim Abou Zeid (RWTH Aachen University), Bastian Leibe (RWTH Aachen University), Alexander Hermans (RWTH Aachen University), Daan De Geus (Eindhoven University of Technology), Timm Linder (Bosch Center for AI), David Adrian (Bosch Center for AI). [Paper] [Poster] [Code] [Website]
EfficientQuant: An Efficient Post-Training Quantization for CNN-Transformer Hybrid Models on Edge Devices. Shaibal Saha (Oakland University), Lanyu Xu (Oakland University). [Paper] [Poster]
LiteVLM: A Low-Latency Vision-Language Model Inference Pipeline for Resource-Constrained Environments. Jin Huang (University of Massachusetts Amherst), Yuchao Jin (NVIDIA), Le An (NVIDIA), Josh Park (NVIDIA). [Paper] [Poster]
Scalable Crowd-sourced Global HD Map Construction via Collaborative Map Perception and Sparse Graph Fusion. Ruiyang Zhu (University of Michigan), Minkyoung Cho (University of Michigan), Shuqing Zeng (General Motors), Fan Bai (General Motors), Xiang Gao (General Motors), Z. Morley Mao (University of Michigan). [Paper] [Poster]
Secure Acoustic Semantic Communication via Vision Transformers and Generative Models. Michal Zakrzewski (SoftServe), Kateryna Koval (SoftServe), Dmytro Khamula (SoftServe), Taras Rumezhak (SoftServe). [Paper] [Poster]
CA-ViT: Channel-Aware Vision Transformer for Dynamic Feature Fusion. Aon Safdar (University College Dublin), Mohamed Saadeldin (University College Dublin). [Paper] [Poster] [Code]
OV-COAST: Cost Aggregation with Optimal Transport for Open-Vocabulary Semantic Segmentation. Aditya Gandhamal (Indian Institute of Science, Bangalore), Aniruddh Sikdar (Indian Institute of Science, Bangalore), Suresh Sundaram (Indian Institute of Science, Bangalore). [Paper] [Poster] [Code]
VISTA: Vision-Language Inference for Training-Free Stock Time-Series Analysis. Tina Khezresmaeilzadeh (University of Southern California), Parsa Razmara (University of Southern California), Mohammad Erfan Sadeghi (University of Southern California), Seyedarmin Azizi (University of Southern California), Erfan Baghaei Potraghloo (University of Southern California). [Paper] [Poster]
HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics. Gueter Josmy Faure (National Taiwan University), Jia-Fong Yeh (National Taiwan University), Min-Hung Chen (NVIDIA), Shang-Hong Lai (National Tsing Hua University), Winston H. Hsu (National Taiwan University). [Paper] [Poster] [Code] [Website]
VADER: Enhancing Video Anomaly Understanding through Relation-Aware Large Language Models. Ying Cheng (National Tsing Hua University), Yu-Ho Lin (National Tsing Hua University), Min-Hung Chen (NVIDIA), Fu-En Yang (NVIDIA), Shang-Hong Lai (National Tsing Hua University). [Paper] [Poster]
Please follow the poster printing instructions from the CVPR organizers here; presenters must print their posters themselves. The maximum poster size is 84” x 42”. The poster room will have tables, but no power outlets.