Publications

2026

[Nat. Comm.’26] OntoLearner: A Modular Python Library for Ontology Learning with Large Language Models. Hamed Babaei Giglou, Jennifer D'Souza, Andrei Aioanei, Nandana Mihindukulasooriya, Sören Auer. In review at Nature Communications. (2026) preprint, code, documentation, python library, HF dataset

[DMLR’26] The Unreasonable Benchmark. Fernando Perez-Cruz, Briland Hitaj, Giovanna Maria Dimitri, Oishi Deb, Amine M'Charrak, Shikhar Srivastava, Christopher Kanan, Anup Kumar Gupta, Zefang Liu, Tarun Kumar, Álvaro Galisteo Bermúdez, Lisa Alazraki, Vasilios Mavroudis, Yuyang Xue, Brian Matejek, Swati Rajwal, Avinash Kumar Pandey, Ayyüce Begüm Bektaş, Anil B Murthy, Feng Chen, Hamed Babaei Giglou, Jennifer D'Souza, Huascar Sanchez, Michael Kraus, Isamu Lautenschläger, Ziyang Wang, Xuanli He, Hyun Song Shin. In: Data-centric Machine Learning Research. (2026) journal article, benchmark

[SWJ’26] A Framework for Assessing LLM Consistency in Knowledge Engineering. Mohammad Javad Saeedizade, Reham Alharbi, Hamed Babaei Giglou, Anna Sofia Lippolis, Eva Blomqvist, Valentina Tamma, Floriana Grasso, Terry Payne, Jennifer D'Souza, Sören Auer, Andrea Giovanni Nuzzolese, Robin Keskisärkkä, Zebah Valeyil. In: Semantic Web Journal. (2026). journal article, code

[JVST A’26] Designing an agentic AI workflow for structured information extraction from scientific text: A case study on atomic layer deposition of ZnO and IGZO. Sameer Sadruddin, Eleni Poupaki, Jennifer D’Souza, Sören Auer, Alex Watkins, Bora Karasulu, Adriaan J. M. Mackus, Erwin Kessels. (Jan 2026). J. Vac. Sci. Technol. A 1 May 2026; 44 (3): 032413. JVSTA special issue

[IRCDL'26] Diagnosing Structural Failures in LLM-based Evidence Extraction for Meta-analysis. Zhiyin Tan and Jennifer D’Souza. In: Proceedings of the 22nd Conference on Information and Research Science Connecting to Digital and Library Science (IRCDL 2026), Modena, Italy. (Feb 2026). CEUR proceedings, arxiv, slides (Winner: Best paper award, Award certificate)

[SWJ’26] SCHEMA-MINERpro: Agentic AI for Ontology Grounding over LLM-Discovered Scientific Schemas in a Human-in-the-Loop Workflow. Sameer Sadruddin, Jennifer D’Souza, Eleni Poupaki, Alex Watkins, Bora Karasulu, Sören Auer, Adrie Mackus, and Erwin Kessels. In: Semantic Web Journal, Special Issue on Large Language Models, Generative AI and Knowledge Graphs. (2026). journal article, code

[Information’26] KGEval: Evaluating Scientific Knowledge Graphs with Large Language Models. Vladyslav Nechakhin, Jennifer D’Souza, Steffen Eger, and Sören Auer. In Information 17, no. 1: 35. (2026). Open-access journal article

[JVST A’26] Publishing FAIR and Machine-actionable Reviews in Materials Science: The Case for Symbolic Knowledge in Neuro-symbolic Artificial Intelligence. Jennifer D’Souza, Sören Auer, Eleni Poupaki, Alex Watkins, Anjana Devi, Riikka L. Puurunen, Bora Karasulu, Adrie Mackus, Erwin Kessels. (Jan 2026). J. Vac. Sci. Technol. A 1 May 2026; 44 (3): 032408. arxiv, JVSTA special issue, code, dataset (Editor's pick)

2025

[arxiv] Transforming Science with Large Language Models: A Survey on AI-Assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation. Steffen Eger, Yong Cao, Jennifer D'Souza, Andreas Geiger, Christian Greisinger, Stephanie Gross, Yufang Hou, Brigitte Krenn, Anne Lauscher, Yizhi Li, Chenghua Lin, Nafise Sadat Moosavi, Wei Zhao, and Tristan Miller. preprint, repo, In review for ACM-CSUR

[Sci-K@ISWC’25] Towards AI-Supported Research: a Vision of the TIB AIssistant. Sören Auer, Allard Oelen, Mohamad Yaser Jaradeh, Mutahira Khalid, Farhana Keya, Sasi Kiran Gaddipati, Jennifer D’Souza, Lorenz Schlüter, Amirreza Alasti, Gollam Rabby, Azanzi Jiomekong and Oliver Karras. In: 5th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment (Sci-K), Nara, Japan. (Nov 2025). proceedings, slides, code

[LLMs4OL@ISWC’25] LLMs4OL 2025 Overview: The 2nd Large Language Models for Ontology Learning Challenge. Hamed Babaei Giglou, Jennifer D'Souza, Nandana Mihindukulasooriya, and Sören Auer. In: Open Conference Proceedings, 6, LLMs4OL 2025: The 2nd Large Language Models for Ontology Learning Challenge at the 24th ISWC, Nara, Japan. (Nov 2025). TIB proceedings, slides, dataset, website, codalab

[ISWC’25] MammoTab 25: A Large-Scale Dataset for Semantic Table Interpretation -- Training, Testing, and Detecting Weaknesses. Marco Cremaschi, Federico Belotti, Jennifer D'Souza and Matteo Palmonari. In: The Semantic Web -- 24th International Semantic Web Conference (ISWC 2025), Nara, Japan. (Nov 2025). Springer proceedings, slides, code, website, semtab'25 (Winner: Best resource paper award, Award certificate)

[OM’25] OntoAligner Meets Knowledge Graph Embedding Aligners. Hamed Babaei Giglou, Jennifer D'Souza, Sören Auer, Mahsa Sanaei. In: Proceedings of the 20th International Workshop on Ontology Matching (OM-2025), collocated with the 24th International Semantic Web Conference (ISWC-2025), Nara, Japan, November 2–3, 2025. CEUR proceedings, slides, arxiv

[IJDL’25] Toward Purpose-oriented Topic Model Evaluation enabled by Large Language Models. Zhiyin Tan and Jennifer D’Souza. In: International Journal on Digital Libraries, to appear (2025). proceedings, arxiv, code

[SymGenAI4Sci’25] DeepResearch^Eco: A Recursive Agentic Workflow for Complex Scientific Question Answering in Ecology. Jennifer D’Souza, Endres Keno Sander, Andrei Aioanei. In: Proceedings of the First International Workshop on Symbolic and Generative AI for Science (SymGenAI4Sci 2025), co-located with SEMANTiCS 2025, September 3–5, 2025, Vienna, Austria. Workshop Proceedings. proceedings, arxiv, slides, code, dataset, medium blogpost

[SemEval'25] SemEval-2025 Task 5: LLMs4Subjects -- LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog. Jennifer D'Souza, Sameer Sadruddin, Holger Israel, Mathias Begoin, Diana Slawig. In: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1082--1095, Vienna, Austria. Association for Computational Linguistics. proceedings, arxiv, code, dataset

[ACL'25] YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering. Jennifer D’Souza, Hamed Babaei Giglou, Quentin Münch. In: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13749–13783, Vienna, Austria. Association for Computational Linguistics. proceedings, arxiv, code, dataset

[Code4Lib'25] Taming the Generative AI Wild West: Integrating Knowledge Graphs in Digital Library Systems. Jennifer D'Souza. In: the Code4Lib Journal, Issue 60. (Apr 2025). article, ORKG comparison, ORKG review, code & dataset

[EcoEvoRxiv'25] Reflections from the 2025 EcoHack: AI & LLM Hackathon for Applications in Evidence-based Ecological Research & Practice. Jennifer D'Souza, Tarek Al Mustafa, Daphne Frederike Auer, Sarah T. Bachinger, Marc Brinner, Caren Daniel, Nayanika Das, Alexander Espig, Nadeen Fathallah, Edward Gow-Smith, Lorenz Gunreben, Nico Heider, Hrishikesh Jadhav, Basma Jalloul, Vamsi Krishna Kommineni, Samira Korani, Andrii Krutsylo, Zijian Ling, Vaishnavi Mendu, Shuhan Miao, Bartolome Ortiz-Viso, Anne Peter, Moritz Plenz, Javad Razavian, Moiz Khan Sherwani, Mir Nafis Sharear Shopnil, Will Woof, Birgitta König-Ries, and Tina Heger. In: EcoEvoRxiv, preprint. (April 2025). ecoevorxiv, code

[ESWC'25] LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models. Sameer Sadruddin, Jennifer D'Souza, Eleni Poupaki, Alex Watkins, Hamed Babaei Giglou, Anisa Rula, Bora Karasulu, Sören Auer, Adrie Mackus, and Erwin Kessels. In: The Semantic Web. ESWC 2025. Lecture Notes in Computer Science, vol 15719. Springer, Cham. Portorož, Slovenia. (Jun 2025). proceedings, arxiv, slides, code, dataset

[ESWC'25] OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment. Hamed Babaei Giglou, Jennifer D’Souza, Oliver Karras, and Sören Auer. In: The Semantic Web. ESWC 2025. Lecture Notes in Computer Science, vol 15719. Springer, Cham. Portorož, Slovenia. (Jun 2025). proceedings, arxiv, slides, code (Winner: Best resource paper award, Award certificate)

[ESWC'25] Research Knowledge Graphs: the Shifting Paradigm of Scholarly Information Representation. Matthäus Zloch, Danilo Dessì, Jennifer D'Souza, Leyla Jael Castro, Benjamin Zapilko, Saurav Karmakar, Brigitte Mathiak, Markus Stocker, Wolfgang Otto, Sören Auer and Stefan Dietze. In: The Semantic Web. ESWC 2025. Lecture Notes in Computer Science, vol 15719. Springer, Cham. Portorož, Slovenia. (Jun 2025). proceedings, slides (Nominee: Best In-Use Paper Award)

[NLP4Ecology'25] Mining for Species, Locations, Habitats, and Ecosystems from Scientific Papers in Invasion Biology: A Large-Scale Exploratory Study with Large Language Models. Jennifer D’Souza, Zachary Laubach, Tarek Al Mustafa, Sina Zarrieß, Robert Frühstückl, and Phyllis Illari. In: NLP4Ecology 2025 the 1st Workshop on Ecology, Environment, and Natural Language Processing, University of Tartu Library, pages 16–23, Tallinn, Estonia. (March 2025). ACL proceedings, arxiv, slides, code, dataset

[IRCDL'25] Bridging the Evaluation Gap: Leveraging Large Language Models for Topic Model Evaluation. Zhiyin Tan and Jennifer D’Souza. In: Proceedings of the 21st Conference on Information and Research Science Connecting to Digital and Library Science (IRCDL 2025), Udine, Italy. (Feb 2025). CEUR proceedings, arxiv, slides, code

[IOS Press'25] Open Research Knowledge Graph: A Large-Scale Neuro-Symbolic Knowledge Organization System. Sören Auer, Jennifer D'Souza, Kheir Eddine Farfar, Mohamad Yaser Jaradeh, Azanzi Jiomekong, Oliver Karras, Allard Oelen, Lauren Snyder, Markus Stocker, Lars Vogt. In: Handbook on Neurosymbolic AI and Knowledge Graphs, Frontiers in Artificial Intelligence and Applications (FAIA), vol. 400, IOS Press, pp. 385–420. (2025). article

2024

[JCDL'24] LLMs4Synthesis: Leveraging Large Language Models for Scientific Synthesis. Hamed Babaei Giglou, Jennifer D’Souza, and Sören Auer. In: Proceedings of the 24th ACM/IEEE Joint Conference on Digital Libraries. Association for Computing Machinery, Article 31, 1–12, Hong Kong. (Dec 2024). ACM proceedings, arxiv, slides, code, dataset

[LLMs4OL@ISWC'24] LLMs4OL 2024 Overview: The 1st Large Language Models for Ontology Learning Challenge. Hamed Babaei Giglou, Jennifer D’Souza, and Sören Auer. Open Conference Proceedings, 4, 3–16, Baltimore, MD, USA. (Nov 2024). TIB proceedings, arxiv, slides, website, codalab

[LLMs4OL@ISWC'24] LLMs4OL 2024 Datasets: Toward Ontology Learning with Large Language Models. Hamed Babaei Giglou, Jennifer D’Souza, Sameer Sadruddin, and Sören Auer. Open Conference Proceedings, 4, 17-30. (Nov 2024). TIB proceedings, dataset

[ISWC'24] From Keywords to Structured Summaries: Streamlining Scholarly Information Access. Mahsa Shamsabadi and Jennifer D’Souza. Posters, Demos, and Industry Tracks at ISWC 2024, 4, 17-30, Baltimore, MD, USA. (Nov 2024). CEUR proceedings, arxiv, frontend code, backend code, live web application

[KONVENS'24] Large Language Models as Evaluators for Scientific Synthesis. Julia Evans, Jennifer D’Souza, and Sören Auer. In Proceedings of the 20th Conference on Natural Language Processing, pp. 1–22, Vienna, Austria. (Sep 2024). ACL proceedings, arxiv

[CLEF'24] Overview of the CLEF 2024 SimpleText Track: Improving Access to Scientific Texts for Everyone. Liana Ermakova, Eric SanJuan, Stéphane Huet, Hosein Azarbonyad, Giorgio Maria Di Nunzio, Federica Vezzani, Jennifer D’Souza, and Jaap Kamps. In Experimental IR Meets Multilinguality, Multimodality, and Interaction: 15th International Conference of the CLEF Association, Grenoble, France. (Sep 2024). ACM proceedings, slides, website

[CLEF'24] Overview of the CLEF 2024 SimpleText Task 4: SOTA? Tracking the State-of-the-Art in Scholarly Publications. Jennifer D’Souza, Salomon Kabongo, Hamed Babaei Giglou, and Yue Zhang. In Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, pp. 3163-3173, Grenoble, France. (Sep 2024). CEUR proceedings, slides, website, codalab, dataset

[CLEF'24] Exploring the Latest LLMs for Leaderboard Extraction. Salomon Kabongo, Jennifer D’Souza, and Sören Auer. In Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings, pages 3246-3260, Grenoble, France. (Sep 2024). CEUR proceedings, slides, dataset

[NLDB'24] Effective Context Selection in LLM-Based Leaderboard Generation: An Empirical Study. Salomon Kabongo Kabenamualu, Jennifer D’Souza, and Sören Auer. In Natural Language Processing and Information Systems. Lecture Notes in Computer Science, vol 14763, Turin, Italy. (Jun 2024). Springer proceedings, arxiv, dataset

[NLDB'24] A FAIR and Free Prompt-based Research Assistant. Mahsa Shamsabadi and Jennifer D’Souza. In Natural Language Processing and Information Systems. Lecture Notes in Computer Science, vol 14763, Turin, Italy. (Jun 2024). Springer proceedings, arxiv, slides, code

[Information'24] Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph. Vladyslav Nechakhin, Jennifer D’Souza, and Steffen Eger. In Information 15, no. 6: 328. (Jun 2024). Information journal article, arxiv, dataset

[ESWC'24] LLMs4OM: Matching Ontologies with Large Language Models. Hamed Babaei Giglou, Jennifer D’Souza, Felix Engel, and Sören Auer. In Extended Semantic Web Conference, Special Track on Large Language Models for Knowledge Engineering, Crete, Greece. (May 2024). Springer proceedings, arxiv, slides, code

[NSLP'24] Scholarly Question Answering Using Large Language Models in the NFDI4DataScience Gateway. Hamed Babaei Giglou, Tilahun Abedissa Taffa, Rana Abdullah, Aida Usmanova, Ricardo Usbeck, Jennifer D'Souza, and Sören Auer. In Natural Scientific Language Processing and Research Knowledge Graphs. Lecture Notes in Computer Science, vol 14770, Crete, Greece. (May 2024). Springer proceedings, arxiv, slides, code, workshop website

[ECIR'24] Overview of the CLEF 2024 SimpleText Track: Improving Access to Scientific Texts for Everyone. Liana Ermakova, Eric SanJuan, Stéphane Huet, Hosein Azarbonyad, Giorgio Maria Di Nunzio, Federica Vezzani, Jennifer D’Souza, Salomon Kabongo, Hamed Babaei Giglou, Yue Zhang, Sören Auer and Jaap Kamps. In Advances in Information Retrieval. Lecture Notes in Computer Science, vol 14613, Glasgow, Scotland. (Mar 2024). Springer proceedings

[EACL-Findings'24] Large Language Models for Scientific Information Extraction: An Empirical Study for Virology. Mahsa Shamsabadi, Jennifer D’Souza, and Sören Auer. In Findings of the Association for Computational Linguistics: EACL, pp. 374–392, St. Julian’s, Malta. (Mar 2024). ACL proceedings, arxiv, slides, video, dataset, code, huggingface models

[JLIS'24] Quality Assessment of Research Comparisons in the Open Research Knowledge Graph: A Case Study. Jennifer D’Souza, Hassan Hussein, Julia Evans, Lars Vogt, Oliver Karras, Vinodh Ilangovan, Anna-Lena Lorenz, and Sören Auer. In JLIS.it, 15(1), pp. 126–143. (Jan 2024). JLIS journal article

[Knowledge'24] Agriculture Named Entity Recognition—Towards FAIR, Reusable Scholarly Contributions in Agriculture. Jennifer D’Souza. In Knowledge 4, no. 1: 1-26. (Jan 2024). Knowledge journal article, dataset, code

2023

[K-CAP'23] Procedural Text Mining with Large Language Models. Anisa Rula and Jennifer D'Souza. In Proceedings of the 12th Knowledge Capture Conference, Pensacola, Florida, USA. (Dec 2023). ACM proceedings, arxiv, dataset

[ICADL'23] Toward Semantic Publishing in Non-invasive Brain Stimulation: A Comprehensive Analysis of rTMS Studies. Swathi Anil and Jennifer D'Souza. In 25th International Conference on Asia-Pacific Digital Libraries. Lecture Notes in Computer Science, vol 14458. (Dec 2023). Springer proceedings, arxiv, slides, video, ORKG comparison parts: one, two, three, four, five, and six

[ISWC'23] LLMs4OL: Large Language Models for Ontology Learning. Hamed Babaei Giglou, Jennifer D’Souza and Sören Auer. In The Semantic Web -- 22nd International Semantic Web Conference, vol. 14265, Athens, Greece. (Nov 2023). Springer proceedings, arxiv, slides, code

[Informatik'23] Research Knowledge Graphs in NFDI4DS. Saurav Karmakar, Matthäus Zloch, Fidan Limani, Benjamin Zapilko, Sharmila Upadhyaya, Jennifer D'Souza, Leyla Jael Castro, Georg Rehm, Marcel R. Ackermann, Harald Sack, Zeyd Boukhers, Sonja Schimmler, Peter Mutschke, and Stefan Dietze. In INFORMATIK 2023 - Designing Futures: Zukünfte gestalten. Bonn: Gesellschaft für Informatik e.V. (Sep 2023). Informatik proceedings

[Onto4FAIR'23] Towards FAIR Semantic Publishing of Research Dataset Metadata in the Open Research Knowledge Graph. Raia Abu Ahmad, Jennifer D’Souza, Matthäus Zloch, Wolfgang Otto, Georg Rehm, Allard Oelen, Stefan Dietze, and Sören Auer. In Joint Proceedings of the Onto4FAIR 2023 Workshops: Collocated with 13th International Conference on Formal Ontology in Information Systems (FOIS 2023) and 19th International Conference on Semantic Systems (SEMANTICS 2023), pp.23-31, Leipzig, Germany. (Sep 2023). HAL Science proceedings, arxiv, slides, workshop website

[CoRDI'23] Organizing Scholarly Knowledge in the Open Research Knowledge Graph: An Open-Science Platform for FAIR Scholarly Knowledge. Sören Auer, Markus Stocker, Oliver Karras, Allard Oelen, Jennifer D'Souza, and Anna-Lena Lorenz. In Proceedings of the Conference on Research Data Infrastructure, vol. 1, Karlsruhe, Germany. (Sep 2023). TIB proceedings

[NLP4KGC'23] Similar Papers Recommendation for Research Comparisons. Vladyslav Nechakhin and Jennifer D’Souza. In Joint Workshop Proceedings of the 5th International Workshop on A Semantic Data Space For Transport (Sem4Tra) and 2nd NLP4KGC: Natural Language Processing for Knowledge Graph Construction (SEMANTiCS 2023), vol 3510, Leipzig, Germany. (Sep 2023). CEUR proceedings, slides, workshop website

[NLP4KGC'23] Probing Large Language Models for Scientific Synonyms. Freya Thießen, Jennifer D’Souza, and Markus Stocker. In Joint Workshop Proceedings of the 5th International Workshop on A Semantic Data Space For Transport (Sem4Tra) and 2nd NLP4KGC: Natural Language Processing for Knowledge Graph Construction (SEMANTiCS 2023), vol 3510, Leipzig, Germany. (Sep 2023). CEUR proceedings, workshop website

[DeXa'23] Evaluating Prompt-Based Question Answering for Object Prediction in the Open Research Knowledge Graph. Jennifer D’Souza, Moussab Hrou, and Sören Auer. In 34th Database and Expert Systems Applications. Lecture Notes in Computer Science, vol 14146, Penang, Malaysia. (Aug 2023). Springer proceedings, arxiv, slides, video, dataset, code, huggingface models

[JCDL'23] Zero-Shot Entailment of Leaderboards for Empirical AI Research. Salomon Kabongo, Jennifer D'Souza, and Sören Auer. In 2023 ACM/IEEE Joint Conference on Digital Libraries, Santa Fe, NM, USA. (Jun 2023). IEEE proceedings, arxiv, slides

[IJDL'23] ORKG-Leaderboards: A Systematic Workflow for Mining Leaderboards as a Knowledge Graph. Salomon Kabongo, Jennifer D'Souza, and Sören Auer. In International Journal on Digital Libraries, vol. 25, pp. 41–54. (Jun 2023). Springer journal article, arxiv

[FAIRConnect'23] FAIR scientific information with the Open Research Knowledge Graph. Markus Stocker, Allard Oelen, Mohamad Yaser Jaradeh, Muhammad Haris, Omar Arab Oghli, Golsa Heidari, Hassan Hussein, Anna-Lena Lorenz, Salomon Kabenamualu, Kheir Eddine Farfar, Manuel Prinz, Oliver Karras, Jennifer D’Souza, Lars Vogt, and Sören Auer. In FAIR Connect, vol. 1, no. 1, pp. 19-21. (Jan 2023). FAIR Connect journal article

2022

[Knowledge'22] Overview of STEM Science as Process, Method, Material, and Data Named Entities. Jennifer D’Souza. In Knowledge 2, no. 4: 735-754. (Dec 2022). Knowledge journal article, dataset

[ICADL'22] Computer Science Named Entity Recognition in the Open Research Knowledge Graph. Jennifer D’Souza and Sören Auer. In 24th International Conference on Asia-Pacific Digital Libraries. Lecture Notes in Computer Science, vol 13636, Hanoi, Vietnam. (Nov-Dec 2022). Springer proceedings, arxiv, slides, video, dataset, code, python package, rest api

[ICADL'22] Clustering Semantic Predicates in the Open Research Knowledge Graph. Omar Arab Oghli, Jennifer D'Souza, and Sören Auer. In 24th International Conference on Asia-Pacific Digital Libraries. Lecture Notes in Computer Science, vol 13636, Hanoi, Vietnam. (Nov-Dec 2022). Springer proceedings, arxiv, dataset, code

[WIESP'22] NLPSharedTasks: A Corpus of Shared Task Overview Papers in Natural Language Processing Domains. Anna Martin, Ted Pedersen, and Jennifer D’Souza. In Proceedings of the first Workshop on Information Extraction from Scientific Publications, pages 105–120, Online. (Nov 2022). ACL proceedings, dataset

[DeXa'22] The Digitalization of Bioassays in the Open Research Knowledge Graph. Jennifer D’Souza, Anita Monteverdi, Muhammad Haris, Marco Anteghini, Kheir Eddine Farfar, Markus Stocker, Vitor A. P. Martins dos Santos, and Sören Auer. In 33rd Database and Expert Systems Applications. Lecture Notes in Computer Science, vol 13426, Vienna, Austria. (Aug 2022). Springer proceedings, arxiv, slides

[NLE'22] Ranking Facts for Explaining Answers to Elementary Science Questions. Jennifer D’Souza. In Natural Language Engineering 29, no. 2: 228–53. (Jan 2022). NLE journal article

2021

[ICADL'21] Automated Mining of Leaderboards for Empirical AI Research. Salomon Kabongo, Jennifer D’Souza, and Sören Auer. In 23rd International Conference on Asia-Pacific Digital Libraries. Lecture Notes in Computer Science, vol 13133, Online. (Dec 2021). Springer proceedings, arxiv, slides, code (Winner: Best paper award)

[ICADL'21] Pattern-Based Acquisition of Scientific Entities from Scholarly Article Titles. Jennifer D’Souza and Sören Auer. In 23rd International Conference on Asia-Pacific Digital Libraries. Lecture Notes in Computer Science, vol 13133, Online. (Dec 2021). Springer proceedings, arxiv, slides, video, dataset

[AIxIA'21] Easy Semantification of Bioassays. Marco Anteghini, Jennifer D'Souza, Vitor A. P. Martins dos Santos, and Sören Auer. In AIxIA 2021 – Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol 13196, Milan, Italy & Online. (Nov-Dec 2021). Springer proceedings, arxiv

[IJDL'21] Evaluating BERT-based scientific relation classifiers for scholarly knowledge graph construction on digital library collections. Jennifer D’Souza. In International Journal on Digital Libraries 23, no. 2, pp. 197-215. (Nov 2021). IJDL journal article

[SemEval'21] SemEval-2021 Task 11: NLPContributionGraph - Structuring Scholarly NLP Contributions for a Research Knowledge Graph. Jennifer D’Souza, Sören Auer, and Ted Pedersen. In Proceedings of the 15th International Workshop on Semantic Evaluation, pages 364–376, Online. (Nov 2021). ACL proceedings, arxiv, slides, video, website, codalab, dataset, scoring program (Winner: Best system description paper award)

[JDIS'21] Sentence, Phrase, and Triple Annotations to Build a Knowledge Graph of Natural Language Processing Contributions—A Trial Dataset. Jennifer D’Souza and Sören Auer. In Journal of Data and Information Science 6, no. 3, pp. 6-34. (May 2021). JDIS journal article, dataset

2020

[Degruyter'20] Improving Access to Scientific Literature with Knowledge Graphs. Sören Auer, Allard Oelen, Muhammad Haris, Markus Stocker, Jennifer D’Souza, Kheir Eddine Farfar, Lars Vogt, Manuel Prinz, Vitalis Wiens and Mohamad Yaser Jaradeh. In Bibliothek Forschung und Praxis, vol. 44, no. 3, pp. 516-529. (Dec 2020). Degruyter journal article, video, orkg platform

[*SEM'20] Fine-tuning BERT with Focus Words for Explanation Regeneration. Isaiah Onando Mulang’, Jennifer D’Souza, and Sören Auer. In Proceedings of the Ninth Joint Conference on Lexical and Computational Semantics, pages 125–130, Barcelona, Spain & Online. (Dec 2020). ACL proceedings

[ICADL'20] Improving Scholarly Knowledge Representation: Evaluating BERT-Based Models for Scientific Relation Classification. Ming Jiang, Jennifer D’Souza, Sören Auer, and Stephen J. Downie. In 22nd International Conference on Asia-Pacific Digital Libraries. Lecture Notes in Computer Science, vol 12504, Online. (Nov-Dec 2020). Springer proceedings, arxiv (Winner: Best paper runner-up award)

[ICADL'20] Representing Semantified Biological Assays in the Open Research Knowledge Graph. Marco Anteghini, Jennifer D'Souza, Vitor A. P. Martins dos Santos, and Sören Auer. In 22nd International Conference on Asia-Pacific Digital Libraries. Lecture Notes in Computer Science, vol 12504, Online. (Nov-Dec 2020). Springer proceedings, arxiv, ORKG assay comparison

[ASIS&T'20] Targeting precision: A hybrid scientific relation extraction pipeline for improved scholarly knowledge organization. Ming Jiang, Jennifer D’Souza, Sören Auer, and Stephen J. Downie. In Proc Assoc Inf Sci Technol. 57:e303. (Oct 2020). Wiley proceedings

[EKAW'20] SciBERT-based Semantification of Bioassays in the Open Research Knowledge Graph. Marco Anteghini, Jennifer D'Souza, Vitor A. P. Martins dos Santos, and Sören Auer. In EKAW 2020 Posters and Demonstrations Session co-located with 22nd International Conference on Knowledge Engineering and Knowledge Management, Bozen-Bolzano, Italy & Online. (Sep 2020). CEUR proceedings, arxiv

[EEKE'20] NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature. Jennifer D'Souza and Sören Auer. In Proceedings of the 1st Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents co-located with the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL 2020), Hubei, China & Online. (Aug 2020). CEUR proceedings, arxiv, dataset, workshop website

[JCDL'20] Toward Representing Research Contributions in Scholarly Knowledge Graphs Using Knowledge Graph Cells. Lars Vogt, Jennifer D’Souza, Markus Stocker, and Sören Auer. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, Hubei, China & Online. (Aug 2020). ACM proceedings, slides

[LREC'20] The STEM-ECR Dataset: Grounding Scientific Entity References in STEM Scholarly Content to Authoritative Encyclopedic and Lexicographic Sources. Jennifer D'Souza, Anett Hoppe, Arthur Brack, Mohamad Yaser Jaradeh, Sören Auer, and Ralph Ewerth. In Proceedings of the 12th Language Resources and Evaluation Conference, pp. 2192-2203, Marseille, France. (May 2020). ACL proceedings, arxiv, dataset, annotation guidelines

[ECIR'20] Domain-Independent Extraction of Scientific Concepts from Research Articles. Arthur Brack, Jennifer D’Souza, Anett Hoppe, Sören Auer, and Ralph Ewerth. In Advances in Information Retrieval--the 42nd European Conference on Information Retrieval. Lecture Notes in Computer Science, vol 12035, pp 251–266, Online. (April 2020). Springer proceedings, arxiv, dataset, code

2019

[K-CAP'19] Open Research Knowledge Graph: Next Generation Infrastructure for Semantic Scholarly Knowledge. Mohamad Yaser Jaradeh, Allard Oelen, Kheir Eddine Farfar, Manuel Prinz, Jennifer D’Souza, Gábor Kismihók, Markus Stocker, and Sören Auer. In Proceedings of the 10th Knowledge Capture Conference, Marina del Rey, California, USA. (Nov 2019). ACM proceedings, arxiv, orkg platform

[TextGraphs'19] Team SVMrank: Leveraging Feature-rich Support Vector Machines for Ranking Explanations to Elementary Science Questions. Jennifer D’Souza, Isaiah Onando Mulang’, and Sören Auer. In Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13), pages 90–100, Hong Kong. (Nov 2019). ACL proceedings, code

Earlier

[EMNLP'15] Sieve-Based Spatial Relation Extraction with Expanding Parse Trees. Jennifer D'Souza and Vincent Ng. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 758-768, Lisbon, Portugal. (Sep 2015). ACL proceedings, poster, code

[ACL'15] Sieve-Based Entity Linking for the Biomedical Domain. Jennifer D'Souza and Vincent Ng. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pp. 297–302, Beijing, China. (Jul 2015). ACL proceedings, slides, code

[SemEval'15] UTD: Ensemble-Based Spatial Relation Extraction. Jennifer D'Souza and Vincent Ng. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pp. 862–869, Denver, Colorado. (Jun 2015). ACL proceedings

[PLOSOne'14] Three Journal Similarity Metrics and Their Application to Biomedical Journals. Jennifer L. D'Souza and Neil R. Smalheiser. In PLoS ONE 9(12): e115681. (Dec 2014). PLOS journal proceedings, online platform

[Database'14] Knowledge-rich temporal relation identification and classification in clinical notes. Jennifer D'Souza and Vincent Ng. In Database, Volume 2014, 2014, bau109. (Nov 2014). Database journal proceedings

[COLING'14] Ensemble-Based Medical Relation Classification. Jennifer D'Souza and Vincent Ng. In Proceedings of the 25th International Conference on Computational Linguistics: Technical Papers, pages 1682–1693, Dublin, Ireland. (Aug 2014). ACL proceedings, slides

[LREC'14] Annotating Inter-Sentence Temporal Relations in Clinical Notes. Jennifer D'Souza and Vincent Ng. In Proceedings of the Ninth International Conference on Language Resources and Evaluation, pp. 2758–2765, Reykjavik, Iceland. (May 2014). ACL proceedings, slides

[JBI'13] Classifying temporal relations in clinical data: A hybrid, knowledge-rich approach. Jennifer D'Souza and Vincent Ng. In Journal of Biomedical Informatics, Volume 46, pp. S29-S39, ISSN 1532-0464. (Dec 2013). ScienceDirect journal proceedings

[ACM-BCB'13] Temporal Relation Identification and Classification in Clinical Notes. Jennifer D'Souza and Vincent Ng. In Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics, pp. 392-401, Washington DC, USA. (Sep 2013). ACM proceedings, slides, poster

[NAACL'13] Classifying Temporal Relations with Rich Linguistic Knowledge. Jennifer D'Souza and Vincent Ng. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 918–927, Atlanta, Georgia. (Jun 2013). ACL proceedings, slides, video

[ACM-BCB'12] Anaphora Resolution in Biomedical Literature: A Hybrid Approach. Jennifer D'Souza and Vincent Ng. In Proceedings of the ACM conference on bioinformatics, computational biology and biomedicine, pp. 113-122, Orlando, Florida. (Oct 2012). ACM proceedings, slides

PhD Dissertation

Title: Extracting Time and Space Relations from Natural Language Text

Author: Jennifer Linda D'Souza

Advisor: Prof. Vincent Ng

Publisher: University of Texas at Dallas, 2015

Length: 152 pages

Abstract: Relation extraction is a core task in natural language processing that concerns the extraction of relations among the entities and events mentioned in a text document. Despite the vast amount of work on relation extraction from text, there has been relatively little work that focuses on understanding how entities and events are temporally and spatially related. This dissertation examines two key tasks in relation extraction, temporal relation extraction and spatial relation extraction. Temporal relation extraction involves determining the temporal ordering over events, dates, and other temporal entities. We focus on fine-grained temporal relation extraction, where we classify a pair of temporal entities as belonging to one of a predefined set of up to 14 temporal relation types. We propose a knowledge-rich, hybrid approach to this task. Specifically, we employ sophisticated linguistic knowledge derived from a variety of semantic and discourse relations, and leverage a hybrid system combining the strengths of rule-based and learning-based approaches. Experiments on newswire and medical data show that our approach yields a relative error reduction of about 15% over the state of the art. Spatial relation extraction, on the other hand, concerns the determination of how spatial entities are related to each other. While previous work on this task has focused on extracting relations involving stationary spatial entities, we examine a more challenging version of the task, in which we additionally identify spatial relations involving objects in motion. Unlike in many relation extraction tasks where exactly two entities can participate in a relation, in spatial relation extraction involving objects in motion, up to eight spatial elements can participate. To handle the complexity of extracting these relations, we propose a multi-pass sieve approach to spatial relation extraction, which can capture the partial dependencies among spatial elements without sacrificing computational tractability. When evaluated on a newly released corpus, our approach significantly outperforms state-of-the-art spatial relation extraction systems.

Links: UT Dallas Library Archives, ResearchGate https://doi.org/10.13140/RG.2.2.20018.89288
