Papers
❤️: First / co-first author | 🧡: My favorite collaborative papers!
❤️ [2026] "TextAway: Mask-Free Video Text Removal with End-to-End Text-Aware Generation"
[under review] project page / Paper coming soon
TextAway removes subtitles, captions, and other overlaid text from videos without masks. It is an end-to-end text-aware generation framework that restores clean videos directly from corrupted inputs, without OCR, text detection, or external mask generation at inference time.
❤️ [2026] "FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation"
[arXiv preprint] project page / arXiv
FlowBlending is a stage-aware diffusion sampling method that accelerates video generation by using large models only in the stages where capacity matters (early and late) and small models in the intermediate stages. It achieves significant speedups while preserving visual quality and temporal consistency.
❤️ [2025] "Syncphony: Synchronized Audio-to-Video Generation with Diffusion Transformers"
[ICLR 2026] project page / arxiv
Syncphony is an audio-to-video generation framework that produces high-quality videos with precise audio-motion synchronization. It improves temporal alignment through motion-aware training and audio-emphasized inference, while maintaining strong visual quality across diverse audio inputs.