References
[1] Joon Byun, Seungmin Shin, Youngcheol Park, Jongmo Sung, and Seungkwon Beack, “Development of a psychoacoustic loss function for the deep neural network (DNN)-based speech coder,” in Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech), 2021, pp. 1694–1698.
[2] Johannes Ballé, Valero Laparra, and Eero P. Simoncelli, “End-to-end optimization of nonlinear transform codes for perceptual quality,” in Picture Coding Symposium, 2016.
<Sample-1> Encoded at 48 kbps, fs = 32 kHz
Original
A. Ours
B. MP3
<Sample-2> Encoded at 56 kbps, fs = 32 kHz
Original
A. Ours
B. MP3
<Sample-3> Encoded at 64 kbps, fs = 32 kHz
Original
A. Ours
B. MP3