Sample Audio Clips for Neural Audio Coder Evaluation
We provide example clips from our experiments comparing neural audio coders (NACs). For each test clip, we include the original reference together with the AAC (FDK AAC-LC) and NAC (TD-NAC, FD-NAC) outputs at 48, 56, and 64 kbps. The clips are grouped into three categories (Speech, Mixed, and Music) to illustrate how each system handles different content types.
References
[4] Seungmin Shin, Joon Byun, Youngcheol Park, Jongmo Sung, and Seungkwon Beack, “Deep Neural Network (DNN) Audio Coder Using A Perceptually Improved Training Method,” IEEE ICASSP 2022.
[5] Joon Byun, Seungmin Shin, Youngcheol Park, Jongmo Sung, and Seungkwon Beack, “A perceptual neural audio coder with a mean-scale hyperprior,” IEEE ICASSP 2023.
[6] Seungmin Shin, Joon Byun, Jongmo Sung, Seungkwon Beack, and Youngcheol Park, “Quantization noise masking in perceptual neural audio coder,” IEEE ICASSP 2024.
[12] Joon Byun, Seungmin Shin, Seorim Hwang, Jongmo Sung, Seungkwon Beack, and Youngcheol Park, “Optimizations of neural audio coder toward perceptual transparency,” IEEE Journal of Selected Topics in Signal Processing, December 2024.
[*] Currently, a paper has been submitted to IEEE ICASSP 2026.
Encoded at 48 kbps (sampling rate: 44.1 kHz)
<Speech> [S1 (es01)] [S4 (te1_mg54_speech)]
Original
FDK AAC-LC
TD-NAC
FD-NAC
<Mixed> [X1 (twinkle_ff51)] [X3 (SpeechOverMusic4)]
Original
FDK AAC-LC
TD-NAC
FD-NAC
<Music> [M4 (music_3)] [M5 (phi7)]
Original
FDK AAC-LC
TD-NAC
FD-NAC
Encoded at 56 kbps (sampling rate: 44.1 kHz)
<Speech> [S1 (es01)] [S4 (te1_mg54_speech)]
Original
FDK AAC-LC
TD-NAC
FD-NAC
<Mixed> [X1 (twinkle_ff51)] [X3 (SpeechOverMusic4)]
Original
FDK AAC-LC
TD-NAC
FD-NAC
<Music> [M4 (music_3)] [M5 (phi7)]
Original
FDK AAC-LC
TD-NAC
FD-NAC
Encoded at 64 kbps (sampling rate: 44.1 kHz)
<Speech> [S1 (es01)] [S4 (te1_mg54_speech)]
Original
FDK AAC-LC
TD-NAC
FD-NAC
<Mixed> [X1 (twinkle_ff51)] [X3 (SpeechOverMusic4)]
Original
FDK AAC-LC
TD-NAC
FD-NAC
<Music> [M4 (music_3)] [M5 (phi7)]
Original
FDK AAC-LC
TD-NAC
FD-NAC
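For readers who want to script a listening session, the layout of the tables above can be written down as a small Python sketch. The bitrates, sampling rate, clip labels, and system names are copied from this page. The function names `listening_grid` and `bits_per_sample` are hypothetical helpers introduced only for illustration, and the sketch assumes 1 kbps = 1000 bit/s. The page does not state the channel count, so the bits-per-sample figure is the total budget per sampling instant across all channels, not a per-channel value.

```python
# Values copied from the page; everything else is illustrative.
BITRATES_KBPS = (48, 56, 64)
SAMPLE_RATE_HZ = 44_100
SYSTEMS = ("Original", "FDK AAC-LC", "TD-NAC", "FD-NAC")
CLIPS = {
    "Speech": ("S1 (es01)", "S4 (te1_mg54_speech)"),
    "Mixed": ("X1 (twinkle_ff51)", "X3 (SpeechOverMusic4)"),
    "Music": ("M4 (music_3)", "M5 (phi7)"),
}


def listening_grid():
    """Return one (bitrate_kbps, category, clip, system) tuple per table cell."""
    return [
        (kbps, category, clip, system)
        for kbps in BITRATES_KBPS
        for category, clips in CLIPS.items()
        for clip in clips
        for system in SYSTEMS
    ]


def bits_per_sample(kbps, sample_rate_hz=SAMPLE_RATE_HZ):
    """Average coded bits per sampling instant, assuming 1 kbps = 1000 bit/s."""
    return kbps * 1000 / sample_rate_hz


if __name__ == "__main__":
    grid = listening_grid()
    print(len(grid), "table cells")
    for kbps in BITRATES_KBPS:
        print(f"{kbps} kbps -> {bits_per_sample(kbps):.3f} bits per sample")
```

The grid covers 3 bitrates, 6 clips, and 4 rows per clip. The "Original" row is repeated under each bitrate as the unencoded reference, so it appears in the grid three times per clip.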