Lecturer: Aishwarya Agrawal
Slides (key), Slides (pdf)
Lecturer: Aishwarya Agrawal
Slides (key), Slides (pdf)
Lecturer: Aishwarya Agrawal
Slides (key), Slides (pdf)
Lecturer: Aishwarya Agrawal
Slides (key), Slides (pdf)
Review paper: Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Paper presentation: Image Captioning
Lecturer: Le Zhang
Project presentation: Discriminative Stable Diffusion (DiscSD)
Project lead: Benno Krojer
Review paper: VQA: Visual Question Answering
Paper presentation: Visual Question Answering: Datasets
Lecturer: Arjun Vaithilingam
Project presentation: Weak language supervised finetuning of SSL vision models
Project lead: Diganta Misra
Review paper: Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Paper presentation: VQA Models
Lecturer: Diganta Misra
Project presentation: Enhancing compositional understanding for vision-language model
Project lead: Le Zhang
Review paper: Visual Dialog
Paper presentation: Visual Dialog: Datasets & Models
Lecturer: Benno Krojer
Project presentation: Interactive Learning with Grounded Language Agents Utilizing World Models
Project lead: Arjun Vaithilingam
Review paper: Multimodal Explanations: Justifying Decisions and Pointing to the Evidence
Paper presentation: Interpretability and Explainability in VL
Lecturer: Vitaly Kondulukov
Project presentation: Zero-Shot Natural Language Explanations
Project lead: Vitaly Kondulukov
Review paper: ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Paper presentation: Finetuning-based VLP models
Lecturer: Vitaly Kondulukov
Project presentation: Are Diffusion Models General Image-Text Scorers?
Project lead: Benno Krojer
Review paper: Multimodal Few-Shot Learning with Frozen Language Models
Paper presentation: Zero-shot / few-shot VLP models
Lecturer: Arjun Vaithilingam
Project presentation: Weak language supervision fine-tuning of vision encoders
Project lead: Diganta Misra
Review paper: VirTex: Learning Visual Representations from Textual Annotations
Paper presentation: VLP models for vision
Lecturer: Diganta Misra
Project presentation: Enhancing compositional understanding for vision-language model
Project lead: Le Zhang
Review paper: Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision
Paper presentation: VLP models for language
Lecturer: Benno Krojer
Project presentation: Interactive Learning with Grounded Language Agents Utilizing World Models
Project lead: Arjun Vaithilingam
Review paper: Analyzing the Behavior of Visual Question Answering Models
Paper presentation: Shortcomings of Vision-Language models
Lecturer: Le Zhang
Project presentation: GQA with BLIP2
Project lead: Vitaly Kondulukov
Review paper: Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
Project presentation 1: Are Diffusion Models Vision-Language Reasoners?
Project lead: Benno Krojer
Project presentation 2: Weak language supervision fine-tuning of vision encoders
Project lead: Diganta Misra
Project presentation 3: Enhancing compositional understanding for vision-language model
Project lead: Le Zhang
Title: Are Diffusion Models Vision-Language Reasoners?
Project lead: Benno Krojer
Title: Weak language supervision fine-tuning of vision encoders
Project lead: Diganta Misra
Title: Enhancing compositional understanding for vision-language model
Project lead: Le Zhang
Title: Interactive Learning with Grounded Language Agents
Project lead: Arjun Vaithilingam
Title: Visual Encoder vs QFormer in BLIP2
Project lead: Vitaly Kondulukov