Efficient Additive Attention for Transformer-based Semi-supervised Document Layout Analysis
Tahira Shehzadi, Ifza Ifza, Didier Stricker, Muhammad Zeshan Afzal
Self-Supervised Representation Learning with Diffusion-Based Refinement for Image Rectification
Pooja Kumari, Sukhendu Das
Plyghtmesh: Transformer-Based Point Cloud Masking for Mesh Reconstruction with Occupancy Networks
Claudia Melis Tonti, Alessandro Nicolella, Irene Amerini
Pose to Protect: Federated Skeleton-Based Anomaly Detection for Privacy-Conscious Video Surveillance
Dina Famouri, Md Zarif Hossain, Ahmed Imteaj
Text-Guided Image Retouching Based on Interpretable Aesthetic Scoring
Supatta Viriyavisuthisakul, Parinya Sanguansat, Toshihiko Yamasaki
PRISM-FL: Privacy-preserving Image Synthesis Mechanism for Federated Learning
Efstathia Soufleri
MultiverSeg: Scalable Interactive Segmentation of Biomedical Imaging Datasets with In-Context Guidance
Hallee Wong, Jose Javier Gonzalez Ortiz, John Guttag, Adrian Dalca
Enhancing Safety Judgment on LLM Responses via Text-to-Image Generation
Eunchung Noh, Jeonghun Baek
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
Shehreen Azad, Vibhav Vineet, Yogesh S Rawat
Memory-Aware Selective Unlearning in Vision-Language Models: A Scalable Extension of Parameter-Efficient Forgetting
Kaustubha V
DANTE-AD: Dual-Vision Attention Network for Long-Term Audio Description
Adrienne Deganutti, Simon Hadfield, Andrew Gilbert
JWB-DH-V1: Benchmark for Joint Whole-Body Talking Avatar and Speech Generation Version I
Xinhan Di, Qi Kristin, Pengqian Yu
REVEAL – Reasoning and Evaluation of Visual Evidence through Aligned Language
Ipsita Praharaj, Yukta Butala, Yash Butala
Concept Steerers: Leveraging K-Sparse Autoencoders for Test-Time Controllable Generations
Dahye Kim, Deepti Ghadiyaram
An Improved Semantic Hand Segmentation Model
Eyitomilayo Babatope, Mireya Saraí García‐Vázquez, Alejandro Álvaro Ramírez‐Acosta
Riemannian-Geometric Fingerprints of Generative Models
Hayley Song, Laurent Itti
Improved U-Net-Based Approach for Brain Tumor Segmentation in T1-Weighted MRI
Somayeh Davar, Thomas Fevens
Towards Fair and Robust Face Parsing for Generative AI: A Multi-Objective Approach
Sophia Abraham
GANs and YOLOv11 for Automated Cochlear Hair Cell Detection
Sara Avila, Ariana Mondiri, Samantha Philips, Ashlyn Viereck, Macy Callahan, Danielle Carrol, Izzie Nielsen, Adya Dhuler, Cole Krudwig, Steven Fernandes
Teaching Diffusion Models to Respect the Laws of Physics: Insights from Climate Downscaling
Sophia Abraham
Understanding Co-speech Gestures in-the-wild
Sindhu Hegde, Prajwal K R, Taein Kwon, Andrew Zisserman
Metric Matters: Understanding the Impact of Evaluation Metrics in Shapley-Based Data Valuation
Nastaran Enshaei, Farnoosh Naderkhani