The banner image was generated with SDXL, a powerful diffusion model, using the prompt: ``Symphony emerging from white noise, digital art, intricate musical notes forming, dynamic and vibrant, surreal and dreamlike, high quality, detailed, digital art, symphony, music, surreal, dreamlike, intricate, vibrant, digital, dynamic, abstract, artistic, high quality, colorful, ethereal lighting''.
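For readers who would like to produce a similar banner, here is a minimal sketch using the Hugging Face diffusers library; the checkpoint name and sampling settings are illustrative assumptions, not the exact configuration used for this image.

```python
# Minimal sketch: generating a banner-style image with SDXL via Hugging Face diffusers.
# The checkpoint, step count, and guidance scale are assumptions for illustration only.
import torch
from diffusers import StableDiffusionXLPipeline

prompt = (
    "Symphony emerging from white noise, digital art, intricate musical notes forming, "
    "dynamic and vibrant, surreal and dreamlike, high quality, detailed, digital art, "
    "symphony, music, surreal, dreamlike, intricate, vibrant, digital, dynamic, abstract, "
    "artistic, high quality, colorful, ethereal lighting"
)

# Load the (assumed) public SDXL base checkpoint in half precision.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# Sample one image from the prompt and save it.
image = pipe(prompt=prompt, num_inference_steps=30, guidance_scale=7.0).images[0]
image.save("banner.png")
```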
Chieh-Hsin (Jesse) Lai earned his Ph.D. in Mathematics from the University of Minnesota in 2021. He is currently a research scientist at Sony AI and a visiting assistant professor in the Department of Applied Mathematics at National Yang Ming Chiao Tung University, Taiwan. His expertise is in deep generative models, especially diffusion models and their applications to media content restoration.
He organized an Expo workshop at NeurIPS 2023 on ``Media Content Restoration and Editing with Deep Generative Models and Beyond'', and a social event at ICLR 2024 on ``Recent Advances on Diffusion and GAN''. For more information, please visit his Google Scholar and Personal Website.
Please contact Chieh-Hsin (Jesse) Lai with any questions or concerns about the tutorial.
Koichi Saito is an AI engineer at Sony AI. He works on deep generative models for music and sound, in particular solving inverse problems for music signals with diffusion models and diffusion-based text-to-sound generation. He has extensive experience showcasing advanced diffusion model technologies to businesses and industries related to music.
Bac Nguyen Cong earned his M.Sc. degree (summa cum laude) in computer science from Universidad Central de Las Villas in 2015 and his Ph.D. from Ghent University in 2019. He joined Sony in 2019, focusing his research on representation learning, vision-language models, and generative modeling. With four years of hands-on industry experience in deep learning and machine learning, his work spans application domains such as text-to-speech and voice conversion.
Yuki Mitsufuji is a VP at Sony AI, leading two departments at Sony, and a specially appointed associate professor at Tokyo Institute of Technology (Tokyo Tech), where he lectures on generative models. He is an IEEE Senior Member and serves on the IEEE AASP Technical Committee (2023-2026). He has chaired multiple workshops on generative models for audio and speech at ICASSP and NeurIPS, and he is organizing a workshop at ECCV 2024 titled ``AVGenL: Audio-Visual Generation and Learning''.