Publications

Ximena Gutierrez-Vasques, Christian Bentz, Tanja Samardžić; Languages through the Looking Glass of BPE Compression. Computational Linguistics 2023; doi: https://doi.org/10.1162/coli_a_00489

Tanja Samardzic, Ximena Gutierrez-Vasques, Rob van der Goot, Max Müller-Eberstein, Olga Pelloni and Barbara Plank. On Language Spaces, Scales and Cross-Lingual Transfer of UD Parsers. CoNLL, 2022

Kann, K., Ebrahimi, A., Mager, M., Oncevay, A., Ortega, J. E., Rios, A., Fan, A., Chiruzzo, L., Ramos, R., Meza Ruiz, I. V., Mager, E., Chaudhary, V., Neubig, G., Palmer, A., & Vu, N. T. (2022). AmericasNLI: Machine translation and natural language inference systems for Indigenous languages of the Americas. Frontiers in Artificial Intelligence, 5. https://doi.org/10.3389/frai.2022.99566

Bentz, Christian, Gutierrez-Vasques, Ximena, Sozinova, Olga and Samardžić, Tanja. "Complexity trade-offs and equi-complexity in natural languages: a meta-analysis" Linguistics Vanguard, 2022. https://doi.org/10.1515/lingvan-2021-0054

Adran Israel Lerma Mayer, Ximena Gutierrez-Vasques, Ernesto Priani Saiso, Hannu Salmi. Underlying Sentiments in 1867: A Study of News Flows on the Execution of Emperor Maximilian I of Mexico in Digitized Newspaper Corpora. Digital Humanities Quarterly (DHQ)

Moran, S., Bentz, C., Gutierrez-Vasques, X., Sozinova, O., & Samardzic, T. TeDDi Sample: Text Data Diversity Sample for Language Comparison and Multilingual NLP. LREC 2022

Book chapter: “Relación tipo-token para contrastar la complejidad morfológica del español-náhuatl”. Ámbitos morfológicos: Descripciones y métodos. UNAM, May 2022

Authors: Haspelmath, Martín; Körtvélyessy, Lívia; Štekauer, Pavol; Orqueda, Verónica; Toro Varela, Francisca; Arriagada Anabalón, Silvana; Esquivel Brizuela, Shaila; Espinosa Ochoa, Mary Rosa; Velázquez Elizalde, Alejandro; Gallegos Shibya, Alfonso; Mijangos de la Cruz, Víctor; Hernández Quiroz, Anselmo; Zacarías Ponce de León, Ramón; Méndez Cruz, Carlos Francisco; Arroyo Fernández, Ignacio; Gutiérrez Vasques, Ximena


Gutierrez-Vasques, X., Bentz, C., Sozinova, O., & Samardzic, T. (2021). From characters to words: the turning point of BPE merges. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Ruzsics, T., Sozinova, O., Gutierrez-Vasques, X., & Samardzic, T. (2021). Interpretability for Morphological Inflection: from Character-level Predictions to Subword-level Rules. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Mager, M., Oncevay, A., Ebrahimi, A., Ortega, J., Gonzales, A. R., Fan, A., ... & Kann, K. (2021). Findings of the AmericasNLP 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, NAACL.


Martínez, D. B., Mijangos, V., & Gutierrez-Vasques, X. (2021). Automatic Interlinear Glossing for Otomi language. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, NAACL


Gutierrez-Vasques, X., & Mijangos, V. (2020). Productivity and Predictability for Measuring Morphological Complexity. Entropy, 22(1), 48. [paper]


Rocío Castellanos Rueda, Laura Martínez Domínguez, Ernesto Priani, Laura Angélica López Méndez and Ximena Gutiérrez-Vasques (2020). “Si los telegramas no mienten”. Origen y circulación de las noticias de la explosión del Maine en la prensa mexicana, febrero 1898. Revista de historia de América, no. 159, pp. 255-287.


Gutierrez-Vasques, X., Medina-Urrea, A., & Sierra, G. (2019). Morphological segmentation for extracting Spanish-Nahuatl bilingual lexicon. Procesamiento del Lenguaje Natural, 63, 41-48. [paper] [slides]




Ximena Gutierrez-Vasques and Victor Mijangos. (2018). Comparing morphological complexity of Spanish, Otomi and Nahuatl. In Proceedings of the Workshop on Linguistic Complexity and Natural Language Processing. Association for Computational Linguistics, Santa Fe, New Mexico, pages 30–37. [paper] [slides]

Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra, and Ivan Meza. (2018). Challenges of language technologies for the indigenous languages of the Americas. Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018). [paper] [slides]

Ximena Gutierrez Vasques. “Corpus paralelo español-náhuatl y su uso en las tecnologías del lenguaje humano”. In Galina Russell, Isabel; Peña Pimentel, Miriam; Priani Saisó, Ernesto; Barrón Tovar, José Francisco; Domínguez Herbón, David; Álvarez Sánchez, Adriana (Coords), Humanidades digitales: lengua, texto, patrimonio y datos. México, Bonilla Artigas Editores. 2018.


Gutierrez-Vasques, X., & Mijangos, V. (2017). Low-resource bilingual lexicon extraction using graph based word embeddings. arXiv preprint arXiv:1710.02569. Presented at the Mexican International Conference on Artificial Intelligence (MICAI) 2017


Gutierrez-Vasques, X., Sierra, G., & Pompa, I. H. (2016, May). Axolotl: a web accessible parallel corpus for Spanish-Nahuatl. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16) (pp. 4210-4214). [paper]


Gutierrez-Vasques, X. (2015). Bilingual lexicon extraction for a distant language pair using a small parallel corpus. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop (pp. 154-160). 


Ximena Gutierrez-Vasques, Rocío Cerbón, Elena Carolina Vargas. Recopilación de un corpus paralelo electrónico para una lengua minoritaria: el caso del español-náhuatl. Primer Congreso Internacional el Patrimonio Cultural y las Nuevas Tecnologías (INAH, 2014) [paper]