Forced Alignment Models
I've created the following forced alignment dictionaries, acoustic models, and G2P models:
Catalan MFA Bundle
The bundle includes a Catalan Pronunciation Dictionary, Acoustic Model, and G2P Model for the Montreal Forced Aligner (McAuliffe et al. 2017). It also includes phone-level alignments of the training data, for easy acoustic analysis!
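As a rough illustration of the kind of acoustic analysis the phone-level alignments allow, here is a minimal Python sketch that computes mean phone durations. It assumes the alignments are Praat long-format TextGrid files with a tier named "phones" (the usual MFA output for single-speaker files); the function names and the tier name are assumptions for illustration, not part of the bundle.

```python
import re
from collections import defaultdict

# Matches one interval in a long-format TextGrid: xmin, xmax, then the label.
INTERVAL_RE = re.compile(r'xmin = ([\d.]+)\s+xmax = ([\d.]+)\s+text = "([^"]*)"')


def phone_intervals(textgrid_text, tier_name="phones"):
    """Return (label, start, end) for each non-empty interval on one tier.

    Assumes a long-format TextGrid; the tier name is an assumption.
    """
    # Each tier starts with a numbered 'item [n]:' header.
    for tier in re.split(r"item \[\d+\]:", textgrid_text):
        if f'name = "{tier_name}"' in tier:
            return [
                (label, float(start), float(end))
                for start, end, label in INTERVAL_RE.findall(tier)
                if label  # skip unlabeled (silent) intervals
            ]
    return []


def mean_durations(intervals):
    """Average duration in seconds for each phone label."""
    totals = defaultdict(lambda: [0.0, 0])
    for label, start, end in intervals:
        totals[label][0] += end - start
        totals[label][1] += 1
    return {label: total / count for label, (total, count) in totals.items()}
```

Reading a file's contents and passing them through phone_intervals and then mean_durations gives a dictionary from phone label to mean duration, which could serve as a starting point for duration studies.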
The acoustic model was trained on 90 hours of clean training data from the ParlamentParla Speech Corpus.
The dictionary was generated using the WikiPron Web Scraper Catalan Broad Transcription dataset (Lee et al. 2020) and expanded using the G2P model to generate pronunciations of additional out-of-vocabulary tokens from the ParlamentParla Corpus.
The dictionary contains pronunciations for 155,595 words and can easily be expanded using the G2P model. Two versions of the dictionary are provided, one with word probabilities and one without, for easy customization.
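Before expanding the dictionary, it helps to know which words in a new corpus are missing from it. The Python sketch below collects out-of-vocabulary words under the assumption that each dictionary line starts with the word followed by whitespace; the function names are hypothetical and the tokenization is deliberately simple (it keeps the Catalan punt volat inside words and splits on apostrophes), so it will not match MFA's own tokenization exactly.

```python
import re

# Runs of letters, optionally joined by the Catalan middle dot (as in "col·legi").
TOKEN_RE = re.compile(r"[^\W\d_]+(?:·[^\W\d_]+)*")


def load_dictionary_words(lines):
    """Collect the headword (first whitespace-separated field) of each line."""
    words = set()
    for line in lines:
        line = line.strip()
        if line:
            words.add(line.split(None, 1)[0].lower())
    return words


def find_oov_words(transcripts, known_words):
    """Return the sorted set of transcript tokens missing from the dictionary."""
    oov = set()
    for text in transcripts:
        for token in TOKEN_RE.findall(text.lower()):
            if token not in known_words:
                oov.add(token)
    return sorted(oov)
```

The resulting word list, written one word per line, is the kind of input a G2P model can then transcribe; the generated entries can be appended to the dictionary after a manual check.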
Made with MFA version 3.2.3, Conda 25.5.1, Python 3.12.3, Windows 11
Vargo, Julian (2025). Catalan Montreal Forced Alignment Bundle.
Access the Catalan MFA Bundle Here
Ladino/Judeo-Spanish MFA Bundle
The bundle includes a Ladino Pronunciation Dictionary and Acoustic Model for the Montreal Forced Aligner (McAuliffe et al. 2017).
The acoustic model was trained on roughly 1.5 hours of data from the Ladino Database Project of the Sephardic Center of Istanbul and from Wikitongues YouTube videos. While the model is serviceable, there are plans to improve it with more, and higher-quality, original audio.
The dictionary contains 13,018 unique words, generated from the Ladino Database Project, with pronunciations for common code-switches included.
Two versions of the dictionary are provided, one with word probabilities and one without, for easy customization.
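For readers who customize the probability version and then want a plain copy, the relationship between the two versions can be sketched in Python as follows. This assumes the probability version stores one or more numeric columns between the word and its phones, and that no phone symbol parses as a number; both the layout and the function names are assumptions for illustration, so check a few lines of the actual files first.

```python
def _is_number(field):
    """True if the field parses as a float (i.e. looks like a probability)."""
    try:
        float(field)
        return True
    except ValueError:
        return False


def strip_probabilities(line):
    """Drop numeric columns between the word and its phones.

    Returns 'word<TAB>phone phone ...'; blank lines come back as ''.
    """
    fields = line.split()
    if not fields:
        return ""
    word, rest = fields[0], fields[1:]
    # Skip leading numeric fields; whatever remains is treated as phones.
    while rest and _is_number(rest[0]):
        rest = rest[1:]
    return word + "\t" + " ".join(rest)
```

Applying strip_probabilities to every line of the probability version should yield entries in the same shape as the plain version, which makes it easy to edit one file and regenerate the other.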
Made with MFA version 3.2.3, Conda 25.5.1, Python 3.12.3, Windows 11
Developed in conjunction with Naomi Schroeter
Vargo, Julian & Schroeter, Naomi (2025). Ladino Montreal Forced Alignment Bundle.
Access the Ladino MFA Bundle Here
Romanian MFA Bundle
The bundle includes a Romanian Pronunciation Dictionary, Acoustic Model, and G2P model for the Montreal Forced Aligner (McAuliffe et al. 2017).
The acoustic model was trained on roughly 100 hours of data from the RSC Romanian Read Speech Corpus (Georgescu, Cucu, Buzo & Burileanu 2020).
The dictionary contains 18,252 unique words, generated from the RSC.
The G2P model was trained on the Romanian CV dictionary (Ahn & Chodroff 2022).
Made with MFA version 3.3.8, Conda 26.1.0, Python 3.13.11, Windows 11 ARM
Vargo, Julian (2026). Romanian Montreal Forced Alignment Bundle.