The Attacker’s Perspective on Automatic Speaker Verification

Session Description:

Automatic speaker verification (ASV) systems have seen major breakthroughs in the past decade, leading to many practical real-world deployments. These systems are vulnerable to spoofing attacks, and the detection of such attacks has gained attention in recent years [1, 2]. We believe that, apart from developing robust countermeasures, it is worth studying the weak points of ASV systems in order to make them robust against various kinds of attacks. To find the loopholes in ASV systems, one needs to understand the scope of the various attacks from the perspective of an attacker trying to deceive them. Such investigations and findings can lead to the development of robust ASV systems for high-security applications. With this special session, we plan to explore the attacker’s perspective on ASV, which will in turn provide insights for improving the robustness of future ASV systems.

The conventional technical sessions on speaker recognition at Interspeech mainly focus on improving ASV system performance under various adverse conditions using the latest available standard corpora. There are also sessions on anti-spoofing, as well as the special sessions of the ASVspoof challenge series, held biennially, which study novel countermeasures for the detection of spoofing attacks. In contrast to these existing sessions, this special session is devoted to finding the weak spots of existing ASV systems from the attacker’s perspective. For an attacker, the most effective approach is to exploit knowledge of the ASV system itself. There are three different possibilities for performing attacks using such information, all well studied in the machine learning literature [3-8]. In addition, attacks on various spoofing countermeasures are also possible [9]. We would like to explore different possible attacks on ASV and spoofing countermeasures that showcase the weak points of existing systems, which can then be addressed to safeguard against such attacks.

We expect contributions to this special session from researchers in ASV as well as in different fields of machine learning. We plan to hold it as an oral session with presentations followed by a discussion. Prospective submissions can use any standard corpus available for ASV or spoofing countermeasures. Studies may focus on degrading the performance of existing systems through various possible attacks in order to highlight the loopholes in those systems.

Topics include, but are not limited to:

1/ Studying the loopholes of ASV systems from the attacker’s perspective

2/ Increasing the vulnerability of ASV to various spoofing attacks for unauthorized access

(i) Voice Conversion (ii) Text-to-speech (iii) Replay (iv) Impersonation

3/ Adversarial attacks: using ASV systems to assist spoofed speech. This may include black-box, grey-box, white-box, or other such attacks

4/ Attacks on spoofing countermeasures (CM)

5/ Joint attacks on ASV and spoofing CM


References

[1] Federico Alegre, Ravichander Vipperla, Nicholas Evans, and Benoît Fauve, “On the vulnerability of automatic speaker recognition to spoofing attacks with artificial signals,” Proc. 20th European Signal Processing Conference (EUSIPCO), pp. 36-40, 2012.

[2] Zhizheng Wu, Nicholas Evans, Tomi Kinnunen, Junichi Yamagishi, Federico Alegre, and Haizhou Li, “Spoofing and countermeasures for speaker verification: A survey,” Speech Communication, vol. 66, pp. 130-153, February 2015.

[3] Jean-François Bonastre, Driss Matrouf, and Corinne Fredouille, “Artificial impostor voice transformation effects on false acceptance rates,” Proc. Interspeech 2007, pp. 2053-2056, August 2007.

[4] Tomi Kinnunen, Rosa González Hautamäki, Ville Vestman, and Md Sahidullah, “Can we use speaker recognition technology to attack itself? Enhancing mimicry attacks using automatic target speaker selection,” Proc. IEEE ICASSP 2019, pp. 6146-6150, May 2019.

[5] Taiki Nakamura, Yuki Saito, Shinnosuke Takamichi, Yusuke Ijima, and Hiroshi Saruwatari, “V2S attack: building DNN-based voice conversion from automatic speaker verification,” Proc. 10th ISCA Speech Synthesis Workshop, pp. 161-165, September 2019.

[6] Ville Vestman, Tomi Kinnunen, Rosa González Hautamäki, and Md Sahidullah, “Voice mimicry attacks assisted by automatic speaker verification,” Computer Speech & Language, vol. 59, pp. 36-54, January 2020.

[7] Mirko Marras, Paweł Korus, Nasir Memon, and Gianni Fenu, “Adversarial Optimization for Dictionary Attacks on Speaker Verification,” Proc. Interspeech 2019, pp. 2913-2917, September 2019.

[8] Xiaohai Tian, Rohan Kumar Das, and Haizhou Li, “Black-box attacks on automatic speaker verification using feedback-controlled voice conversion,” accepted in Proc. Speaker Odyssey 2020.

[9] Songxiang Liu, Haibin Wu, Hung-yi Lee, and Helen Meng, “Adversarial attacks on spoofing countermeasures of automatic speaker verification,” Proc. IEEE ASRU, pp. 312-319, December 2019.


Organizers


Interspeech 2020 Important Dates [Revised]

  • Paper submission deadline: May 08, 2020
  • Paper acceptance/rejection notification: July 24, 2020
  • Interspeech 2020 Conference: October 25-29, 2020, Shanghai, China


Key Information

  • Papers submitted to the special session follow the same submission and reviewing procedure as regular Interspeech papers
  • The paper submission link: Submission Page