Papers

Oral Session 1

Tuesday 28th May, 13:15 - 13:50

Learning Content-enhanced Mask Transformer for Domain Generalized Urban-Scene Segmentation

Qi Bi (UvA), Shaodi You (UvA), Theo Gevers (UvA)

How to Train Neural Field Representations: A Comprehensive Study and Benchmark 

Samuele Papa (UvA), Riccardo Valperga (UvA), David Knigge (UvA), Miltiadis Kofinas (UvA), Phillip Lippe (UvA), Jan-Jakob Sonke (NKI), Efstratios Gavves (UvA) 

Prototype-based Interpretable Breast Cancer Prediction Models: Analysis and Challenges

Shreyasi Pathak (UTwente), Jörg Schlötterer (Philipps-University Marburg, University of Mannheim), Jeroen Veltman (Hospital Group Twente), Jeroen Geerdink (Hospital Group Twente), Maurice van Keulen, Christin Seifert (Philipps-University Marburg)

ProtoDiff: Learning to Learn Prototypical Networks by Task-Guided Diffusion

Yingjun Du (UvA), Zehao Xiao (UvA), Shengcai Liao (IIAI), Cees Snoek (UvA)

Hyperbolic Deep Learning in Computer Vision: A Survey

Pascal Mettes (UvA), Mina Ghadimi Atigh (UvA), Martin Keller-Ressel (TU Dresden), Jeffrey Gu (Stanford), Serena Yeung (Stanford) 

BECLR: Batch Enhanced Contrastive Few-Shot Learning

Stylianos Poulakakis-Daktylidis (TU Delft), Hadi Jamali-Rad (TU Delft, Shell)

Gaussian-SLAM: Photo-realistic Dense SLAM with Gaussian Splatting

Vladimir Yugay (UvA), Yue Li (UvA), Theo Gevers (UvA), Martin R. Oswald (UvA) 

ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

Narges Norouzi (TU/e), Svetlana Orlova (TU/e), Daan de Geus (TU/e), Gijs Dubbelman (TU/e)

Poster Session 1

Tuesday 28th May, 15:10 - 16:45

Learning Content-enhanced Mask Transformer for Domain Generalized Urban-Scene Segmentation

Qi Bi (UvA), Shaodi You (UvA), Theo Gevers (UvA)

How to Train Neural Field Representations: A Comprehensive Study and Benchmark

Samuele Papa (UvA), Riccardo Valperga (UvA), David Knigge (UvA), Miltiadis Kofinas (UvA), Phillip Lippe (UvA), Jan-Jakob Sonke (NKI), Efstratios Gavves (UvA) 

Prototype-based Interpretable Breast Cancer Prediction Models: Analysis and Challenges

Shreyasi Pathak (UTwente), Jörg Schlötterer (Philipps-University Marburg, University of Mannheim), Jeroen Veltman (Hospital Group Twente), Jeroen Geerdink (Hospital Group Twente), Maurice van Keulen, Christin Seifert (Philipps-University Marburg)

ProtoDiff: Learning to Learn Prototypical Networks by Task-Guided Diffusion

Yingjun Du (UvA), Zehao Xiao (UvA), Shengcai Liao (IIAI), Cees Snoek (UvA)

Hyperbolic Deep Learning in Computer Vision: A Survey

Pascal Mettes (UvA), Mina Ghadimi Atigh (UvA), Martin Keller-Ressel (TU Dresden), Jeffrey Gu (Stanford), Serena Yeung (Stanford) 

BECLR: Batch Enhanced Contrastive Few-Shot Learning

Stylianos Poulakakis-Daktylidis (TU Delft), Hadi Jamali-Rad (TU Delft, Shell)

Gaussian-SLAM: Photo-realistic Dense SLAM with Gaussian Splatting

Vladimir Yugay (UvA), Yue Li (UvA), Theo Gevers (UvA), Martin R. Oswald (UvA) 

ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

Narges Norouzi (TU/e), Svetlana Orlova (TU/e), Daan de Geus (TU/e), Gijs Dubbelman (TU/e)

Find the Cliffhanger: Multi-modal Trailerness in Soap Operas 

Carlo Bretti (UvA), Pascal Mettes (UvA), Hendrik Vincent Koops (RTL), Daan Odijk (RTL), Nanne van Noord (UvA)

Low-Resource Vision Challenges for Foundation Models

Yunhua Zhang (UvA), Hazel Doughty (Leiden University), Cees G.M. Snoek (UvA)

Performance of computer vision algorithms for fine-grained classification using crowdsourced insect images

Rita Pucci (Naturalis), Vincent J. Kalkman (Naturalis), Dan Stowell (Naturalis, Tilburg University) 

Elucidating the Exposure Bias in Diffusion Models

Mang Ning (UU), Mingxiao Li (KU Leuven), Jianlin Su (Moonshot), Albert Ali Salah (UU), Itir Onal Ertugrul (UU)

Video BagNet: Short Temporal Receptive Fields Increase Robustness in Long-Term Action Recognition

Ombretta Strafforello (TU Delft, TNO), Xin Liu (TU Delft), Klamer Schutte (TNO), Jan van Gemert (TU Delft) 

Graph Neural Networks for Learning Equivariant Representations of Neural Networks

Miltiadis Kofinas (UvA), Boris Knyazev (Samsung), Yan Zhang (Samsung), Yunlu Chen (CMU), Gertjan J. Burghouts (TNO), Efstratios Gavves (UvA), Cees G. M. Snoek (UvA), David W. Zhang (UvA) 

IndustReal: A Dataset for Procedure Step Recognition Handling Execution Errors in Egocentric Videos in an Industrial-Like Setting 

Tim J. Schoonbeek (TU/e), Tim Houben (TU/e), Hans Onvlee (ASML), Peter H.N. de With (TU/e), Fons van der Sommen (TU/e)

Human-centric anomaly recognition in real-world complex events 

Xin Yuan (UTwente), Estefanía Talavera Martínez (UTwente)

Unlocking Spatial Comprehension in Text-to-Image Diffusion Models 

Mohammad Mahdi Derakhshani (UvA), Menglin Xia (Microsoft), Harkirat Behl (Microsoft), Cees G. M. Snoek (UvA), Victor Rühle (Microsoft)

Enhancing Manufacturing Quality Assurance: Deep Learning Techniques for Missing/Wrong Part Object Detection in Assembly Processes

Betsy Villa Brochero (UTwente), Estefanía Talavera Martínez (UTwente), Ian Gibson (UTwente)

HyperICaRL: Continual hyperbolic learning of visual instances 

Melika Ayoughi (UvA), Mina Ghadimi Atigh (UvA), Mohammad Mahdi Derakhshani (UvA), Cees Snoek (UvA), Paul Groth (UvA), Pascal Mettes (UvA)

Physics-Guided and Training-Free Diffusion for Controlling Illumination Conditions in Images 

Xiaoyan Xing (UvA), Tao Hu (LMU), Konrad Groh (Bosch), Jan Hendrik Metzen (Bosch), Sezer Karaoglu (UvA), Theo Gevers (UvA)

GO4Align: Group Optimization for Multi-Task Alignment

Jiayi Shen (UvA), Cheems Wang (Tsinghua University, Kaiyuan Mathematical Sciences Institute), Zehao Xiao (UvA), Nanne van Noord (UvA), Marcel Worring (UvA)

Conversational Image Search Meets Interactive Multimodal Learning: Teaching a New Dog Old Tricks 

Hongyi Zhu (UvA), Jia-Hong Huang (UvA), Stevan Rudinac (UvA), Evangelos Kanoulas (UvA) 

Oral Session 2

Wednesday 29th May, 10:15 - 10:45

Any-Shift Prompting for Generalization over Distributions 

Zehao Xiao (UvA), Jiayi Shen (UvA), Mohammad Mahdi Derakhshani (UvA), Shengcai Liao (Core42), Cees G. M. Snoek (UvA) 

Fourier-basis Functions to Bridge Augmentation Gap: Rethinking Frequency Augmentation in Image Classification  

Puru Vaish (UTwente), Shunxin Wang (UTwente), Nicola Strisciuglio (UTwente)

Task-aligned Part-aware Panoptic Segmentation using Joint Object-Part Representations 

Daan de Geus (TU/e), Gijs Dubbelman (TU/e)

PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs 

Michael Dorkenwald (UvA), Nimrod Barazani (UvA), Cees G. M. Snoek (UvA), Yuki M Asano (UvA) 

Color Equivariant Convolutional Networks 

Attila Lengyel (TU Delft), Ombretta Strafforello (TU Delft), Robert-Jan Bruintjes (TU Delft), Alexander Gielisse (TU Delft), Jan van Gemert (TU Delft) 

Context Diffusion: In-Context Aware Image Generation 

Ivona Najdenkoska (UvA), Animesh Sinha (Meta), Abhimanyu Dubey (Meta), Dhruv Mahajan (Meta), Vignesh Ramanathan (Meta), Filip Radenovic (Meta) 

Poster Session 2

Wednesday 29th May, 11:00 - 12:00

Any-Shift Prompting for Generalization over Distributions 

Zehao Xiao (UvA), Jiayi Shen (UvA), Mohammad Mahdi Derakhshani (UvA), Shengcai Liao (Core42), Cees G. M. Snoek (UvA) 

Fourier-basis Functions to Bridge Augmentation Gap: Rethinking Frequency Augmentation in Image Classification  

Puru Vaish (UTwente), Shunxin Wang (UTwente), Nicola Strisciuglio (UTwente)

Task-aligned Part-aware Panoptic Segmentation using Joint Object-Part Representations 

Daan de Geus (TU/e), Gijs Dubbelman (TU/e)

PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs 

Michael Dorkenwald (UvA), Nimrod Barazani (UvA), Cees G. M. Snoek (UvA), Yuki M Asano (UvA) 

Color Equivariant Convolutional Networks 

Attila Lengyel (TU Delft), Ombretta Strafforello (TU Delft), Robert-Jan Bruintjes (TU Delft), Alexander Gielisse (TU Delft), Jan van Gemert (TU Delft) 

Context Diffusion: In-Context Aware Image Generation 

Ivona Najdenkoska (UvA), Animesh Sinha (Meta), Abhimanyu Dubey (Meta), Dhruv Mahajan (Meta), Vignesh Ramanathan (Meta), Filip Radenovic (Meta) 

How to Benchmark Vision Foundation Models for Semantic Segmentation?

Tommie Kerssies (TU/e), Daan de Geus (TU/e), Gijs Dubbelman (TU/e)

Learning Generalized Segmentation for Foggy-scenes by Bi-directional Wavelet Guidance

Qi Bi (UvA), Shaodi You (UvA), Theo Gevers (UvA) 

Automated Camera Calibration via Homography Estimation with GNNs 

Giacomo D'Amicantonio (TU/e), Egor Bondarev (TU/e), Peter H.N. de With (TU/e)

Balanced Hyperbolic Learning Improves Your Out-of-Distribution Detector 

Tejaswi Kasarla (UvA), Max van Spengler (UvA), Pascal Mettes (UvA) 

Automatic Wagon Code Identification using Computer Vision 

Melissa Tijink (UTwente), Estefanía Talavera Martínez (UTwente), Luuk Spreeuwers (UTwente), Nicola Strisciuglio (UTwente)

Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries

Christiaan G. A. Viviers (TU/e), Lena Filatova (Philips), Maurice Termeer (Philips), Peter H. N. de With (TU/e), Fons van der Sommen (TU/e)

We thank the NCCV 2024 sponsors: