Seminar 2023

A2FSeg: Adaptive Multi-Modal Fusion Network for Medical Image Segmentation

CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features

Deep Residual Learning for Image Recognition

You Don't Have to Be Perfect to Be Amazing: Unveil the Utility of Synthetic Images

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation

EGE-UNet: an Efficient Group Enhanced UNet for skin lesion segmentation

RepVGG: Making VGG-style ConvNets Great Again

Simple Online and Realtime Tracking with a Deep Association Metric

WaveNet: A Generative Model for Raw Audio

LoRA: Low-Rank Adaptation of Large Language Models

Segmentation Transformer: Object-Contextual Representations for Semantic Segmentation

High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions

Omni-Scale CNNs: a simple and effective kernel size configuration for time series classification

STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection

Deep Region and Multi-label Learning for Facial Action Unit Detection

Multimodal Few-Shot Learning with Frozen Language Models

Communication-Efficient Learning of Deep Networks from Decentralized Data

Masked Autoencoders Are Scalable Vision Learners

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images

BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks

Stride Length Estimation: Gaussian NLL Loss
Presenter: Min-Woo Tae (태민우)
Date: 11 May 2023

MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model

Turning a Blind Eye: Explicit Removal of Biases and Variation from Deep Neural Network Embeddings

MLP-Mixer: An all-MLP Architecture for Vision

Training Language Models to Follow Instructions with Human Feedback

GraFormer: Graph Convolution Transformer for 3D Pose Estimation

Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection

ImageNet Classification with Deep Convolutional Neural Networks

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

Vision GNN: An Image is Worth Graph of Nodes

DaViT: Dual Attention Vision Transformers

TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

MedCLIP: Contrastive Learning from Unpaired Medical Images and Text

Auto-Encoding Variational Bayes

Supervised Contrastive Learning

DART: Articulated Hand Model with Diverse Accessories and Rich Textures

Agent-Aware Dropout DQN for Safe and Efficient On-line Dialogue Policy Learning

TabNet: Attentive Interpretable Tabular Learning