Publications
2023
Ximena Gutierrez-Vasques, Christian Bentz, Tanja Samardžić; Languages through the Looking Glass of BPE Compression. Computational Linguistics 2023; doi: https://doi.org/10.1162/coli_a_00489
2022
Tanja Samardzic, Ximena Gutierrez-Vasques, Rob van der Goot, Max Müller-Eberstein, Olga Pelloni and Barbara Plank. On Language Spaces, Scales and Cross-Lingual Transfer of UD Parsers. CoNLL, 2022
Kann, K., Ebrahimi, A., Mager, M., Oncevay, A., Ortega, J. E., Rios, A., Fan, A., Chiruzzo, L., Ramos, R., Meza Ruiz, I. V., Mager, E., Chaudhary, V., Neubig, G., Palmer, A., & Vu, N. T. (2022). AmericasNLI: Machine translation and natural language inference systems for Indigenous languages of the Americas. Frontiers in Artificial Intelligence, 5. https://doi.org/10.3389/frai.2022.99566
Bentz, Christian, Gutierrez-Vasques, Ximena, Sozinova, Olga and Samardžić, Tanja. "Complexity trade-offs and equi-complexity in natural languages: a meta-analysis" Linguistics Vanguard, 2022. https://doi.org/10.1515/lingvan-2021-0054
Adran Israel Lerma Mayer, Ximena Gutierrez-Vasques, Ernesto Priani Saiso, Hannu Salmi. Underlying Sentiments in 1867: A Study of News Flows on the Execution of Emperor Maximilian I of Mexico in Digitized Newspaper Corpora. Digital Humanities Quarterly (DHQ)
Moran, S., Bentz, C., Gutierrez-Vasques, X., Sozinova, O., & Samardzic, T. TeDDi Sample: Text Data Diversity Sample for Language Comparison and Multilingual NLP. LREC 2022
Book chapter: “Relación tipo-token para contrastar la complejidad morfológica del español-náhuatl”. Ámbitos morfológicos: Descripciones y métodos. UNAM, May 2022
Authors: Haspelmath, Martín; Körtvélyessy, Lívia; Štekauer, Pavol; Orqueda, Verónica; Toro Varela, Francisca; Arriagada Anabalón, Silvana; Esquivel Brizuela, Shaila; Espinosa Ochoa, Mary Rosa; Velázquez Elizalde, Alejandro; Gallegos Shibya, Alfonso; Mijangos de la Cruz, Víctor; Hernández Quiroz, Anselmo; Zacarías Ponce de León, Ramón; Méndez Cruz, Carlos Francisco; Arroyo Fernández, Ignacio; Gutiérrez Vasques, Ximena
2021
Gutierrez-Vasques, X., Bentz, C., Sozinova, O., & Samardzic, T. (2021). From characters to words: the turning point of BPE merges. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Ruzsics, T., Sozinova, O., Gutierrez-Vasques, X., & Samardzic, T. (2021). Interpretability for Morphological Inflection: from Character-level Predictions to Subword-level Rules. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Mager, M., Oncevay, A., Ebrahimi, A., Ortega, J., Gonzales, A. R., Fan, A., ... & Kann, K. (2021). Findings of the AmericasNLP 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, NAACL.
Martínez, D. B., Mijangos, V., & Gutierrez-Vasques, X. (2021). Automatic Interlinear Glossing for Otomi language. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, NAACL
2020
Gutierrez-Vasques, X., & Mijangos, V. (2020). Productivity and Predictability for Measuring Morphological Complexity. Entropy, 22(1), 48. [paper]
Rocío Castellanos Rueda, Laura Martínez Domínguez, Ernesto Priani, Laura Angélica López Méndez and Ximena Gutiérrez-Vasques (2020). “Si los telegramas no mienten”. Origen y circulación de las noticias de la explosión del Maine en la prensa mexicana, febrero 1898. Revista de historia de América, no. 159, pp. 255-287.
2019
Gutierrez-Vasques, X., Medina-Urrea, A., & Sierra, G. (2019). Morphological segmentation for extracting Spanish-Nahuatl bilingual lexicon. Procesamiento del Lenguaje Natural, 63, 41-48. [paper] [slides]
2018
Ximena Gutierrez-Vasques and Victor Mijangos. (2018). Comparing morphological complexity of Spanish, Otomi and Nahuatl. In Proceedings of the Workshop on Linguistic Complexity and Natural Language Processing. Association for Computational Linguistics, Santa Fe, New Mexico, pages 30–37. [paper] [slides]
Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra, and Ivan Meza. (2018). Challenges of language technologies for the indigenous languages of the Americas. Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018). [paper] [slides]
Ximena Gutierrez Vasques. “Corpus paralelo español-náhuatl y su uso en las tecnologías del lenguaje humano”. In Galina Russell, Isabel; Peña Pimentel, Miriam; Priani Saisó, Ernesto; Barrón Tovar, José Francisco; Domínguez Herbón, David; Álvarez Sánchez, Adriana (Coords), Humanidades digitales: lengua, texto, patrimonio y datos. México, Bonilla Artigas Editores. 2018.
2017
Gutierrez-Vasques, X., & Mijangos, V. (2017). Low-resource bilingual lexicon extraction using graph based word embeddings. arXiv preprint arXiv:1710.02569. Presented at the Mexican International Conference on Artificial Intelligence (MICAI) 2017
2016 and before
Gutierrez-Vasques, X., Sierra, G., & Pompa, I. H. (2016, May). Axolotl: a web accessible parallel corpus for Spanish-Nahuatl. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16) (pp. 4210-4214). [paper]
Gutierrez-Vasques, X. (2015). Bilingual lexicon extraction for a distant language pair using a small parallel corpus. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop (pp. 154-160).
Ximena Gutierrez-Vasques, Rocío Cerbón, Elena Carolina Vargas. Recopilación de un corpus paralelo electrónico para una lengua minoritaria: el caso del español-náhuatl. Primer Congreso Internacional el Patrimonio Cultural y las Nuevas Tecnologías (INAH, 2014) [paper]