Publications
2025
Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez, and Elena Volodina. 2025. The Devil’s in the Details: the Detailedness of Classes Influences Personal Information Detection and Labeling. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pages 697–708, Tallinn, Estonia. University of Tartu Library.
Nikolai Ilinykh and Maria Irena Szawerna. 2025. “I Need More Context and an English Translation”: Analysing How LLMs Identify Personal Information in Komi, Polish, and English. In Proceedings of the Third Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2025), pages 165–178, Tallinn, Estonia. University of Tartu Library, Estonia.
Arianna Masciolini, Aleksandrs Berdicevskis, Maria Irena Szawerna, and Elena Volodina. 2025. Annotating Second Language in Universal Dependencies: a Review of Current Practices and Directions for Harmonized Guidelines. In Proceedings of the Eighth Workshop on Universal Dependencies (UDW, SyntaxFest 2025), pages 153–163, Ljubljana, Slovenia. Association for Computational Linguistics.
2024
Ricardo Muñoz Sánchez, David Alfter, Simon Dobnik, Maria Irena Szawerna, and Elena Volodina. 2024. Jingle BERT, Jingle BERT, Frozen All the Way: Freezing Layers to Identify CEFR Levels of Second Language Learners Using BERT. In Proceedings of the 13th Workshop on Natural Language Processing for Computer Assisted Language Learning, pages 137–152, Rennes, France. LiU Electronic Press.
Maria Irena Szawerna, Simon Dobnik, Therese Lindström Tiedemann, Ricardo Muñoz Sánchez, Xuan-Son Vu, and Elena Volodina. 2024. Pseudonymization Categories across Domain Boundaries. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 13303–13314, Torino, Italia. ELRA and ICCL.
Arianna Masciolini, Emilie Francis, and Maria Irena Szawerna. 2024. Synthetic-Error Augmented Parsing of Swedish as a Second Language: Experiments with Word Order. In Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, pages 43–49, Torino, Italia. ELRA and ICCL.
Maria Irena Szawerna. 2024. Can Stanza be Used for Part-of-Speech Tagging Historical Polish? In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 44–49, St. Julian’s, Malta. Association for Computational Linguistics.
Elena Volodina, David Alfter, Simon Dobnik, Therese Lindström Tiedemann, Ricardo Muñoz Sánchez, Maria Irena Szawerna, and Xuan-Son Vu. 2024. Proceedings of the Workshop on Computational Approaches to Language Data Pseudonymization (CALD-pseudo 2024). Association for Computational Linguistics, St. Julian’s, Malta.
Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez, Therese Lindström Tiedemann, and Elena Volodina. 2024. Detecting Personal Identifiable Information in Swedish Learner Essays. In Proceedings of the Workshop on Computational Approaches to Language Data Pseudonymization (CALD-pseudo 2024), pages 54–63, St. Julian’s, Malta. Association for Computational Linguistics.
Ricardo Muñoz Sánchez, Simon Dobnik, Maria Irena Szawerna, Therese Lindström Tiedemann, and Elena Volodina. 2024. Did the Names I Used within My Essay Affect My Score? Diagnosing Name Biases in Automated Essay Scoring. In Proceedings of the Workshop on Computational Approaches to Language Data Pseudonymization (CALD-pseudo 2024), pages 81–91, St. Julian’s, Malta. Association for Computational Linguistics.
Presentations
2025
"The Devil’s in the Details: the Detailedness of Classes Influences Personal Information Detection and Labeling" - a presentation at NoDaLiDa/Baltic-HLT 2025, in relation to the paper above (3. March)
"“I Need More Context and an English Translation”: Analysing How LLMs Identify Personal Information in Komi, Polish, and English" - a poster presentation at the RESOURCEFUL 2025 workshop, in relation to the paper above (2. March)
"Annotating Personal Information in Swedish Texts with SPARV" - a presentation at the LM4DH 2025 workshop (10. September)
"A Construction Grammar perspective on null subjects in Polish" - a presentation at the Formal Description of Slavic Languages 18 conference (24. September)
"Annotating Personal Information in Swedish Texts with SPARV" - a poster presentation at the CLARIN Annual Conference (1. October)
"A Construction Grammar perspective on null subjects in Polish(es)" - a presentation at the CASA|Plus workshop (9. October)
"Putting the "Im" Back in Personal Information" - a presentation at the CLT workshop (22. October)
2024
"The most stupid baseline for generating pseudonyms and how it does not work" - a presentation at the Språkbanken Text End of the Year Workshop 2024 (17. December)
"Swedish Learner Essays Revisited: Further Insights into Detecting Personal Information" - a presentation of a nonarchival submission at the Tenth Swedish Language Technology Conference (SLTC) (27. November)
"AI for open research data with Grandma Karl" - a presentation at the Privacy and AI: Towards a trustworthy eco-system (AI trust) workshop, WASP HS conference (19. November)
"As words have power, names have power" - a presentation at the CLEANUP seminar in Oslo presented together with Ricardo Muñoz Sánchez, Norway (8. October)
"AI for open research data with Grandma Karl" - a presentation at the Beyond Words Theoretical, Experimental, and Computational Approaches to Language, Contexts, and Modalities workshop (3. October)
"AI for open research data with Grandma Karl" - a presentation at the Symposium on ‘Humanistic AI’ workshop (19. June)
"Detecting Personal Identifiable Information in Swedish Learner Essays" - a presentation at the CALD-Pseudo workshop hosted at EACL 2024, in relation to the paper above (21. March)
"Can Stanza be Used for Part-of-Speech Tagging Historical Polish?" - a poster presentation at the Student Research Workshop at EACL 2024, in relation to the paper above (19. March)
2023
"Detecting Personal Identifiable Information (PII)" - a presentation at the Språkbanken Text End of the Year Workshop 2023 (13. December)
"Sense and Sensitivity: what do we need to turn private information into pseudonyms?" - a presentation at Mormor Karl Open House (29. November)
Blog posts
Maria Irena Szawerna. "Personal information detection in Sparv: towards a pseudonymization pipeline", in the Språkbanken Text blog (April 16th, 2025).
Maria Irena Szawerna and Ricardo Muñoz Sánchez. "The Lions, the Words, and the Workshops: Språkbanken Text at EACL 2024", in the Språkbanken Text blog (April 11th, 2024).
Proceedings editorial team
Elena Volodina, David Alfter, Simon Dobnik, Therese Lindström Tiedemann, Ricardo Muñoz Sánchez, Maria Irena Szawerna, and Xuan-Son Vu. 2024. Proceedings of the Workshop on Computational Approaches to Language Data Pseudonymization (CALD-pseudo 2024). Association for Computational Linguistics, St. Julian’s, Malta.
Organizing
Mind, Machine, Multimodality: Current Issues in Linguistics and Beyond 2025 Student Conference, co-organizer (24. June, 2025)
Privacy and AI: Towards a Trustworthy Ecosystem (AITrust) Workshop, co-located with WASP-HS 2024, organizing co-chair (19. November, 2024)
CALD-Pseudo Workshop, co-located with EACL 2024, organizing co-chair (21. March, 2024)
Workshop on ethics for research and teaching in natural language processing, local organizer (23. January, 2024)
Open House at Mormor Karl’s, co-organizer (29. November, 2023)
Reviewing
I have reviewed for the following venues:
NLP4CALL Workshop: 2025
RESOURCEFUL Workshop: 2025