Computer Vision
Channel: #computer-vision
Co-leads:
Benedict - @Harkhymadhe on Discord, @Arkhymadhe on Twitter
Logistics:
Occurrences: Second Tuesday of each month at 8am PT
Feel free to add papers/articles you would like to read to the paper bank
Past Presentations
Apoorv Khandelwal - Analyzing Modular Approaches for Visual Question Decomposition
Lindsey Li presents Multimodal Understanding with Large Language Models.
Maxim Bonnaerens presents Learned Threshold Token Merging & Pruning for Vision Transformers.
Generating Images with Multimodal LMs with Jing Yu Koh
Ahmed Imtiaz Humayun discusses their work on SplineCam
Muhammad Maaz shares their work on Video-ChatGPT
Hila Chefer presents their work on explainable Vision Transformer networks
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Edwin (@sora) presents his work on fine-grained recognition.