Call for Papers and Submission Guidelines
The workshop will provide a common platform to discuss recent progress, challenges, and opportunities in developing transformer-based models for computer vision applications. To this end, we welcome original research contributions on transformer-based methods, including but not limited to the following topics:
• Theoretical insights into transformer-based models
• Transformer models for spatial (image) and temporal (video) data modeling
• Efficient transformer architectures, including novel mechanisms for self-attention and non-local attention
• Visualizing and interpreting transformer networks
• Generative modeling with transformer networks
• Hybrid network designs combining the strengths of transformer models with convolutional and graph-based models
• Unsupervised, weakly supervised, and semi-supervised learning with transformer models
• Multi-modal learning combining visual data with text, speech, and knowledge graphs
• Leveraging multi-spectral data like satellite imagery and infrared images in transformer models for improved semantic understanding of visual content
• Transformer-based designs for low-level vision problems such as image super-resolution, deblurring, de-raining, and denoising
• Novel transformer-based methods for high-level vision problems such as object detection, segmentation, activity recognition, and pose estimation
• Transformer models for volumetric, mesh, and point-cloud data processing in 3D and 4D data settings
Submission Guidelines
Call for papers: pdf
Format: All submissions should follow the formatting instructions for NeurIPS 2022.
Page Limit: 9 pages
Submission site: https://cmt3.research.microsoft.com/VTTA2022/