Audio Samples
Display Format:
input target singing voice | undefended output singing voice
input source singing voice | defended output singing voice
Dual Prevention
Dataset: NUS-CMS-48 (English); Target Singer A (Female) + Lyric a
Dataset: NUS-CMS-48 (English); Target Singer B (Male) + Lyric b
Dataset: NUS-CMS-48 (English); Target Singer C (Male) + Lyric c
Dataset: NUS-CMS-48 (English); Target Singer D (Male) + Lyric d
Dataset: OpenSinger (Chinese); Target Singer E (Male) + Lyric e
Dataset: OpenSinger (Chinese); Target Singer F (Female) + Lyric f
Dataset: OpenSinger (Chinese); Target Singer F (Female) + Lyric g
Transferability
Transferability of Identity Disruption
Target Singer G; Identity Encoder of Adversary: LSTM; Identity Encoder of Defender: XV
Target Singer G; Identity Encoder of Adversary: LSTM; Identity Encoder of Defender: Auto
Target Singer H; Identity Encoder of Adversary: LSTM; Identity Encoder of Defender: XV
Transferability of Lyric Disruption
Lyric h; Lyric Encoder of Adversary: Whisper-Medium; Lyric Encoder of Defender: Whisper-Tiny
Lyric i; Lyric Encoder of Adversary: Whisper-Medium; Lyric Encoder of Defender: Wav2vec2
Lyric j; Lyric Encoder of Adversary: Whisper-Medium; Lyric Encoder of Defender: Decoar2
Code
Our code is available at