Abstract: Deep noise suppression (DNS) models are widely used in a variety of high-stakes speech applications. However, in this paper, we show that four recent DNS models can each be reduced to outputting unintelligible gibberish through the addition of imperceptible adversarial noise. Furthermore, our results show the near-term plausibility of targeted attacks, which could induce models to output arbitrary utterances, as well as over-the-air attacks. While the success of these attacks varies by model and setting, and attacks appear to be strongest when model-specific (i.e., white-box and non-transferable), our results highlight a pressing need for practical countermeasures in open-source DNS systems.
Legend:
Original Audio: Normal noisy input to the model, x = r ∗ (y + b)
Initial Model Output: Normal model behavior f(x), i.e., ŷ
Attacked Audio: Original audio plus a perturbation: x + δ
Attacked Model Output: f(x + δ)
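The attacked audio above is produced by optimizing the perturbation δ against a specific model. For illustration only, here is a minimal PGD-style sketch of an untargeted waveform-domain attack, assuming a PyTorch DNS model `dns_model` that maps a noisy waveform x to an enhanced waveform f(x); the function name, loss, step size, and budget are illustrative assumptions, not the exact procedure used in the paper.

    # Minimal sketch (assumptions noted above): find delta with ||delta||_inf <= eps
    # that pushes the enhanced output f(x + delta) away from the clean speech y.
    import torch

    def pgd_attack(dns_model, x, y, eps=1e-3, step=2e-4, n_steps=100):
        """Return an adversarial perturbation delta with ||delta||_inf <= eps."""
        delta = torch.zeros_like(x, requires_grad=True)
        for _ in range(n_steps):
            enhanced = dns_model(x + delta)              # f(x + delta)
            # Untargeted objective: maximize distortion of the enhanced output
            # relative to the clean reference y (so we minimize its negation).
            loss = -torch.nn.functional.mse_loss(enhanced, y)
            loss.backward()
            with torch.no_grad():
                # Descent on the negated loss == ascent on the distortion.
                delta -= step * delta.grad.sign()
                delta.clamp_(-eps, eps)                  # keep delta imperceptible
            delta.grad.zero_()
        return delta.detach()

    # Usage: x is the noisy input r * (y + b); the attacked audio is x + delta.
    # delta = pgd_attack(dns_model, x, y)
    # attacked_output = dns_model(x + delta)             # f(x + delta)

A targeted variant would instead minimize the distance between f(x + δ) and an attacker-chosen utterance rather than maximizing distortion.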
[Audio examples: four sets, each containing Original Audio, Initial Model Output, Attacked Audio, and Attacked Model Output clips.]