Special Session @ IEEE ASRU 2025
This special session aims to address the emerging challenges posed by recent advances in deepfake audio generation, and to promote research beyond conventional anti-spoofing tasks. We seek to foster discussion and innovation in tackling new forms of deepfakes while ensuring robustness against existing threats and mitigating catastrophic forgetting. Specifically, we welcome contributions that explore novel detection techniques, evaluation protocols, and datasets related to, but not limited to, the following topics:
Partial Spoof Detection, Localization, and Diarization
Although detection and localization in this scenario were preliminarily explored in the ADD challenge, that effort was limited to its predefined cases. How to defend against advanced partial-spoofing attacks that go beyond replacing individual words or randomly selected segments remains largely neglected.
New datasets or evaluation protocols developed on (but not limited to) PartialSpoof, HAD, ADD, Psynd, LlamaPartialSpoof, etc.
New datasets, metrics, toolkits, analyses on Detection, Localization, or Diarization.
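As a toy illustration of the localization/diarization view of partial spoofing (not part of the session description; the 20 ms frame duration is an illustrative assumption), frame-level bona fide/spoof decisions can be merged into time-stamped segments:

```python
def frames_to_segments(labels, frame_dur=0.02):
    """Merge per-frame labels (0 = bona fide, 1 = spoof) into
    (start_sec, end_sec, label) segments via run-length encoding."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the sequence or on a label change.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((round(start * frame_dur, 2),
                             round(i * frame_dur, 2),
                             labels[start]))
            start = i
    return segments

# A spoofed region spanning frames 2-4 of a 6-frame utterance:
print(frames_to_segments([0, 0, 1, 1, 1, 0]))
# → [(0.0, 0.04, 0), (0.04, 0.1, 1), (0.1, 0.12, 0)]
```

Segment-level outputs of this kind are what localization metrics and diarization-style scoring operate on.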
Detection of Codec-Based Deepfake Speech (CodecFake)
Codec-based speech generation systems are rapidly evolving. However, methods for detecting deepfake speech generated by such systems (known as CodecFake), or for tracing it back to the original codec algorithm, have not yet been fully explored.
New algorithms for CodecFake detection, developed with CodecFake+ and/or other CodecFake-style databases.
CodecFake source tracing
Proactive Defense
Disruption approaches to prevent misuse of speech data.
Other proactive deepfake defenses applied during content creation or distribution.
Singing Voice Deepfake Detection
Multimodal Deepfake Detection
Leveraging modalities beyond audio for more reliable detection.
Databases, frameworks, and fusion strategies for multimodal spoofing scenarios.
Adversarial Attacks and Defenses
Generalization and/or Domain Adaptation
Adaptation or generalization algorithms that improve performance under cross-domain and/or cross-condition settings, including but not limited to different languages/environments/speakers, unknown spoofing methods, etc.
Out-of-distribution detection, data drift detection, anomaly detection, etc.
Continual learning and catastrophic forgetting
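One common starting point for the out-of-distribution detection topic above is the maximum softmax probability baseline: flag an input as OOD when the detector's top-class confidence falls below a threshold. A minimal sketch (the threshold value is an illustrative assumption, not a recommendation):

```python
import numpy as np

def msp_ood_flag(logits, threshold=0.7):
    """Maximum softmax probability baseline: convert logits to a
    softmax distribution and flag the input as out-of-distribution
    when the top-class probability is below the threshold."""
    z = logits - logits.max()          # stabilize the exponentials
    p = np.exp(z) / np.exp(z).sum()
    return p.max() < threshold

print(msp_ood_flag(np.array([5.0, 0.0, 0.0])))    # confident → False
print(msp_ood_flag(np.array([0.1, 0.0, 0.05])))   # near-uniform → True
```

Energy-based scores, data-drift tests, or dedicated anomaly detectors are natural refinements of this baseline.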
Source Tracing
Out-of-domain source tracing (datasets, evaluation metric, benchmark)
Task definitions, especially for fake speech generated by emerging synthesis technologies, e.g., CodecFake.
Source tracing beyond attribute-based multiclass classification and spoofing method classification
Human Perception vs. Machine Detection
Analysis, Explainability, and Evaluation of Deepfake Detection. Most existing models are black boxes: their fake-or-real decisions cannot be explained, we do not know whether those decisions rely on spurious correlations, and they cannot be used to support the judgments of human users.
Detection models that are explainable by design.
Explainable AI that explores how models make decisions, using (but not limited to) Shapley additive explanations (SHAP), local interpretable model-agnostic explanations (LIME), and so on.
Reliable metrics for deepfake detection and emerging tasks, including but not limited to spoofing-aware speaker verification, source tracing, etc.
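The workhorse metric behind most of these evaluations is the equal error rate (EER), the operating point where the false-acceptance rate on spoofed audio equals the false-rejection rate on bona fide audio. A minimal sketch, assuming higher detector scores mean "more bona fide" (a common but not universal convention):

```python
import numpy as np

def compute_eer(bonafide_scores, spoof_scores):
    """Equal error rate via a sweep over all observed scores as
    candidate thresholds; returns the rate where FAR ≈ FRR."""
    thresholds = np.sort(np.concatenate([bonafide_scores, spoof_scores]))
    # FAR: spoofed trials accepted; FRR: bona fide trials rejected.
    far = np.array([(spoof_scores >= t).mean() for t in thresholds])
    frr = np.array([(bonafide_scores < t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2

# Perfectly separated scores give an EER of 0:
print(compute_eer(np.array([0.9, 0.8, 0.7]), np.array([0.1, 0.2, 0.3])))
# → 0.0
```

Production toolkits typically interpolate the ROC curve rather than sweep raw scores, but the crossing point being estimated is the same.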
Defenses Against Other Attacks (e.g., Shallow Fakes, Tampering, Fake Emotion, etc.)
Submission Link: Submit Paper
Instruction: We follow the same Author Instructions as ASRU 2025. When submitting, please be sure to select "SS2. Frontiers in Deepfake Voice Detection and Beyond" as your primary subject area to ensure your paper is properly considered for inclusion.
Tip: If you are interested in responsible generative AI, we encourage you to consider submitting to our partner special session at IEEE ASRU 2025: "Responsible Speech and Audio Generative AI." ⚔️ 🛡️
Paper submissions open: March 28, 2025 (Welcome your submissions!)
Paper submissions due: May 28, 2025
Paper revision due: June 4, 2025
Acceptance Notification: August 6, 2025
Camera-ready Deadline: August 13, 2025
Workshop Date: [TBA]