T4V @ CVPR 2023

June 18, 2023 (PDT)

7:50 - 8:00

Gedas Bertasius

Opening Remarks

8:00 - 8:30

Jianfeng Gao (Microsoft)

From LLMs to Self-Improving AI

8:30 - 9:00

Carl Vondrick (Columbia University)

Connecting Vision and Language via Code

9:00 - 9:30

Saining Xie (NYU)

ConvNet vs Transformer ROUND 2: Self-Supervised Learning and Diffusion Models

9:30 - 10:00

Spotlight Talks 1

Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation
Dual PatchNorm
Point2Vec for Self-Supervised Representation Learning on Point Clouds
FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer

10:00 - 10:30

Coffee Break

10:30 - 11:00

Ishan Misra (Meta AI)

Supercharge your Transformers with Self-supervised Learning

11:00 - 11:30

Ruben Villegas (Google DeepMind)

Visual Storytelling with Generative Models of Video

11:30 - 12:00

Alex Kirillov (Meta AI)

Segment Anything

12:00 - 13:30

Lunch Break

13:30 - 14:30

Poster Session

Please put up the posters in Exhibit Hall #235 - #264.

14:30 - 15:00

Poster Session + Coffee Break

15:00 - 15:30

Huiwen Chang (Google)

Masked Modeling for Vision

15:30 - 16:00

Cordelia Schmid (Google)

Multimodal Video Representations and Their Extension to Visual Language Navigation

16:00 - 16:30

Spotlight Talks 2

RePAST: Relative Pose Attention Scene Representation Transformer
OCTraN: 3D Occupancy Convolutional Transformer Network in Unstructured Traffic Scenarios
Clicks as Queries: Interactive Transformer for Multi-instance Segmentation
Joint Adaptive Representations for Image-Language Learning
PaReprop: Fast Parallelized Reversible Backpropagation

16:30 - 17:30

Panel Discussion + Closing Remarks
