Kimin Lee

I am an assistant professor at the Graduate School of AI at KAIST.  My current research goal is to develop safe and capable AI agents.  Recently, I've focused on the following topics:

Before joining KAIST, I was a research scientist at Google Research. I completed my postdoctoral training at UC Berkeley (working with Pieter Abbeel) and PhD at KAIST (advised by Jinwoo Shin). During PhD, I also interned and collaborated closely with Honglak Lee at University of Michigan. 

Publications (C: conference / J: journal / P: preprint / *: equal contribution / ^: equal advising)

2024

[C48] Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models [pdf][site][code]

[C47] By My Eyes: Grounding Multimodal Large Language Models with Sensor Data via Visual Prompting [pdf]

[C46] Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback [pdf]

[C45] Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation [pdf]

[C44] Promptable Behaviors: Personalizing Multi-Objective Rewards from Human Preferences [pdf][site]

[C43] Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models [pdf][code]

[P16] Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models [pdf]

[P15] MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control [pdf][code][site]

[P14] Latent Action Pretraining from Videos [pdf]

[P13] When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs [pdf]

[P12] Aligning Large Language Models with Self-generated Preference Data [pdf]

[P11] ADAPT^2: Adapting Pre-Trained Sensing Models to End-Users via Self-Supervision Replay [pdf]

[P10] DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing [pdf]

[P9] Benchmarking Mobile Device Control Agents across Diverse Configurations [pdf][site][code]

[P8] InstructBooth: Instruction-following Personalized Text-to-Image Generation [pdf]

2023

[C42] DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models [pdf][site][code]

[C41] Guide Your Agent with Adaptive Multimodal Rewards [pdf][site][code]

[C40] StyleDrop: Text-to-Image Generation in Any Style [pdf][site]

[C39] Multi-View Masked World Models for Visual Robotic Manipulation [pdf][code]

[C38] Controllability-Aware Unsupervised Skill Discovery [pdf][code]

[C37] Preference Transformer: Modeling Human Preferences using Transformers for RL [pdf][code]

[P7] Algorithms for Optimal Adaptation of Diffusion Models to Reward Functions [pdf]

[P6] Aligning Text-to-Image Models using Human Feedback [pdf]

2022

[C36] Masked World Models for Visual Control [pdf][code]

[C35] Reinforcement Learning with Action-Free Pre-Training from Videos [pdf][code]

[C34] SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning [pdf]

[C33] Reward Uncertainty for Exploration in Preference-based Reinforcement Learning [pdf]

[C32] Towards More Generalizable One-shot Visual Imitation Learning [pdf][code]

[C31] Programmatic Modeling and Generation of Real-time Strategic Soccer Environments for Reinforcement Learning [pdf]

[C30] HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator [pdf]

[P5] Instruction-Following Agents with Jointly Pre-Trained Vision-Language Models [pdf][code]

[P4] Dynamics-Augmented Decision Transformer for Offline Dynamics Generalization [pdf]

2021

[C29] B-Pref: Benchmarking Preference-Based Reinforcement Learning [pdf][code]

[C28] URLB: Unsupervised Reinforcement Learning Benchmark [pdf][code]

[C27] Decision Transformer: Reinforcement Learning via Sequence Modeling [pdf][code]

[C26] Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [pdf][code]

[C25] Improving Transferability of Representations via Augmentation-Aware Self-Supervision [pdf][code]

[C24] Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback [pdf][code]

[C23]  Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble [pdf][code]

[C22] PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [pdf][code][site]

[C21] State Entropy Maximization with Random Encoders for Efficient Exploration [pdf][code][site]

[C20] SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [pdf][code]

[C19] Decoupling Representation Learning from Reinforcement Learning [pdf][code]

[C18] Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environments [pdf][podcast]

[C17] Learning to Sample with Local and Global Contexts in Experience Replay Buffer [pdf][code]

[C16]  MASKER: Masked Keyword Regularization for Reliable Text Classification [pdf][code]

2020

[C15] Reinforcement Learning with Augmented Data [pdf][code][blog] [media]

[C14] Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning [pdf][code][site]

[C13] Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning [pdf][code][site]

[C12] Regularizing Class-wise Predictions via Self-knowledge Distillation [pdf] [code]

[C11] Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning [pdf] [code]

[P3] R-LAtte: Visual Control via Deep Reinforcement Learning with Attention Network [pdf]

[P2] Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [pdf]

2019

[C10] Overcoming Catastrophic Forgetting with Unlabeled Data in the Wild [pdf][code]

[C9] Robust Inference via Generative Classifiers for Handling Noisy Labels [pdf][code]

[C8] Using Pre-Training Can Improve Model Robustness and Uncertainty [pdf] [code]

[J1] Dynamic Control for On-demand Interference-managed WLAN Infrastructures [pdf]

2018

[C7] A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks  [pdf] [code]

[C6] Learning to Specialize with Knowledge Distillation for Visual Question Answering [pdf] [code]

[C5] Hierarchical Novelty Detection for Visual Object Recognition [pdf] [code]

[C4] Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples [pdf] [code]

2017

[C3] Confident Multiple Choice Learning [pdf] [code]

[P1] Simplified Stochastic Feedforward Neural Networks [pdf]

2016

[C2] TravelMiner: On the Benefit of Path-based Mobility Prediction

[C1] Just-in-time WLANs: On-demand Interference-managed WLAN Infrastructures [pdf] [slides]

Education

Academic Activities

Work Experience