Towards Diverse and Efficient Audio Captioning via Diffusion Models


Manjie Xu, Chenxing Li, Yong Ren, Xinyi Tu, Ruibo Fu, Wei Liang, Dong Yu
Tencent AI Lab; Beijing Institute of Technology; University of California, Berkeley; Institute of Automation, Chinese Academy of Sciences, Beijing, China