Audio CAPTCHAs are supposed to provide a strong defense for online resources; however, advances in speech-to-text mechanisms have rendered these defenses ineffective. Accordingly, demonstrably more robust audio CAPTCHAs are important to the future of a secure and accessible Web. We look to recent literature on attacks on speech-to-text systems for inspiration for the construction of robust, principle-driven audio defenses. We propose a new mechanism that is both comparatively intelligible (evaluated through a user study) and hard to automatically transcribe.
We compare our work against audio samples from the recent Kenansville paper and provide samples from both methods below. Listeners can clearly hear that our method produces audio that is far more intelligible than audio generated using the Kenansville method.
Our Work
Kenansville