Ximena Gutierrez-Vasques
I am a computational linguist with an interdisciplinary focus on deepening the study of human language. I recently joined an interdisciplinary research center in Mexico (CEIICH, UNAM), where I work at the interface between the humanities and the field of AI.
My lines of research cover Multilingual NLP, Computational Morphology, and NLP for less-resourced languages of the Americas.
Previously, I was a postdoctoral researcher at the University of Zürich, where I specialized in information-theoretic approaches for modeling linguistic complexity and typology using text corpora.
--------------------------------------------------------------------------------------------------------------------------------------------------------
I come from a chaotic city, on a volcanic plateau at more than 2000 m above sea level, in a country where 68 different languages are spoken. Perhaps that's part of the reason why I'm captivated by the chaos, predictability, and diversity inherent to natural languages, and by how we can measure them.
In my free time, I like to collaborate with initiatives that encourage NLP for under-represented languages of Mexico.
I also enjoy learning about the history, languages, and cultures of the world (and within Mexico), bikes 🚲, axolotls ≽(◕ ᴗ ◕)≼ and more...
Current location: Ciudad Universitaria, Koyowakan, Mexico City
NEWS
Keynote Speaker @ SIGTYP 2024, EACL, Malta 2024
Keynote: Text-based Typology for Modeling Linguistic Diversity in NLP [slides]
March, 2024
We're organizing an NLP Summer School!
Mexican NLP Summer School, co-located with #NAACL2024 #MexicoCity
June 2024
Area chair
I'm an area chair for the Less-Resourced/Endangered/Less-studied Languages track at LREC-COLING 2024
EMNLP 2023 presentation
Ximena Gutierrez-Vasques, Christian Bentz, Tanja Samardžić. Languages through the Looking Glass of BPE Compression
New journal article
Ximena Gutierrez-Vasques, Christian Bentz, Tanja Samardžić. Languages through the Looking Glass of BPE Compression
Computational Linguistics (2023)
Academic services, NAACL 2024
I am one of the Publicity Chairs for the upcoming NAACL 2024, which will take place in Mexico City!
Oral presentation @ QUALICO 2023
Julia Lukasiewicz-Pater, Ximena Gutierrez-Vasques and Christian Bentz. Entropic analyses of the Voynich Manuscript using a diverse cross-linguistic corpus and neural networks
Paper accepted @ CoNLL 2022
Tanja Samardžić, Ximena Gutierrez-Vasques, Rob van der Goot, Max Müller-Eberstein, Olga Pelloni and Barbara Plank. On Language Spaces, Scales and Cross-Lingual Transfer of UD Parsers.
New journal article
Bentz, Christian, Gutierrez-Vasques, Ximena, Sozinova, Olga and Samardžić, Tanja. "Complexity trade-offs and equi-complexity in natural languages: a meta-analysis"
Paper accepted @ LREC 2022
Moran, S., Bentz, C., Gutierrez-Vasques, X., Sozinova, O., & Samardzic, T. TeDDi Sample: Text Data Diversity Sample for Language Comparison and Multilingual NLP.
Corpora repo: https://github.com/MorphDiv/TeDDi_sample
Book Chapter
“Relación tipo-token para contrastar la complejidad morfológica del español-náhuatl”.
Book: Ámbitos morfológicos: Descripciones y métodos. UNAM, May 2022
Authors: Haspelmath, Martín; Körtvélyessy, Lívia; Štekauer, Pavol; Orqueda, Verónica; Toro Varela, Francisca; Arriagada Anabalón, Silvana; Esquivel Brizuela, Shaila; Espinosa Ochoa, Mary Rosa; Velázquez Elizalde, Alejandro; Gallegos Shibya, Alfonso; Mijangos de la Cruz, Víctor; Hernández Quiroz, Anselmo; Zacarías Ponce de León, Ramón; Méndez Cruz, Carlos Francisco; Arroyo Fernández, Ignacio; Gutiérrez Vasques, Ximena
Workshop Information-Theoretic Analyses of Natural Languages
Held at the DGfS Conference in Tübingen 2022 by Christian Bentz and Ximena Gutierrez-Vasques.
Course repo: https://github.com/christianbentz/Workshop_DGfS2022