LLM-based Subject Tagging for
the TIB Technical Library's Open-Access Catalog
Theme: The Development of Energy- and Compute-Efficient LLM Systems
The 2nd LLMs4Subjects Shared Task | GermEval'25 @ Konvens 2025, Hildesheim, Germany
The official shared task dataset is hosted on GitHub at: https://github.com/sciknoworg/llms4subjects. Extensive documentation is provided in the repository to help participants get familiar with the data.
The GND subjects taxonomy is officially used by TIB's subject matter experts to index records. We provide the taxonomy in human-readable form to the shared task participants. The aim is for participants to incorporate it as a comprehensive knowledge base into their LLMs for subject tagging. The GND root subfolder in the shared task repository is an excellent place to get started. There, participants are provided with the entire GND Subject Collection. It can be downloaded here: GND-subjects
With this dataset, participants are expected to equip their systems with knowledge about the GND taxonomy, enabling the LLMs to effectively understand and utilize the taxonomy's subjects and their nuanced semantics. We do not limit the methodologies that participants can use to develop their LLM-based solutions and encourage creativity, empiricism, and practicality in the developed solutions.
The TIBKAT root subfolder in the shared task repository is another key starting point. Participants are provided with the full TIBKAT collection for training. It can be downloaded here: TIBKAT-subjects-dataset
Now, given both the GND Subjects and TIBKAT Collections, we introduce the tasks for the LLMs4Subjects shared task.
Subtask 1: Given a human-readable record, the developed system must classify it into one or more of the 28 predefined domains. This subtask is formulated as a multi-label classification task, as each record may be associated with multiple relevant subject domains. The list of predefined subject domains can be found here. Participants should focus only on the subject domains related to the Subject Classification System LinSearch from the TIB terminology service.
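To make the multi-label formulation concrete, the following is a minimal sketch of how domain labels could be encoded as a multi-hot vector and how per-domain scores could be turned back into a set of predicted domains. The domain codes, the helper names, and the 0.5 threshold are assumptions made for illustration; they are not the official domain list or a prescribed method.

```python
from typing import Iterable

# Hypothetical placeholder codes. The real task defines 28 domains;
# only four invented codes are used here to keep the sketch short.
DOMAINS = ["dom_a", "dom_b", "dom_c", "dom_d"]
DOMAIN_INDEX = {code: i for i, code in enumerate(DOMAINS)}


def to_multi_hot(labels: Iterable[str]) -> list[int]:
    """Encode a record's domain labels as a 0/1 vector over DOMAINS."""
    vec = [0] * len(DOMAINS)
    for label in labels:
        vec[DOMAIN_INDEX[label]] = 1
    return vec


def from_scores(scores: list[float], threshold: float = 0.5) -> list[str]:
    """Keep every domain whose score reaches the threshold.

    Unlike single-label classification, zero, one, or several
    domains may be returned for the same record.
    """
    return [DOMAINS[i] for i, s in enumerate(scores) if s >= threshold]
```

Because each domain is decided independently, a record can receive several domains at once, which is the behaviour the multi-label formulation asks for.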
Most records in the TIBKAT collection have been assigned one or more subject domains based on their title and abstract. However, there are some exceptions. Specifically, a small subset of records does not include domain information in its metadata. Out of the 110,401 records in the train and development splits, only 314 records lack domain details. The list of these records can be found here. Participants should develop systems that recommend one or more subject domains by capturing the semantic relationship between the subject, title, and abstract of the record.
Subtask 2: Given a human-readable record, the developed system must generate relevant subject suggestions that accurately reflect its content.
Participants should align the subject tagging capability of their systems with the annotations provided in TIBKAT. The system should recommend GND subjects based on the semantic relationship between the subjects and the title + abstract of a technical record.
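As one possible illustration of recommending subjects from the title and abstract, the sketch below ranks candidate subjects by cosine similarity between bag-of-words vectors. Token overlap is only a crude stand-in for the semantic relationship described above; a real system would more likely use an LLM or an embedding model. The function names, the `subjects` mapping of identifier to descriptive text, and the example identifiers are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens (German umlauts included)."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_subjects(
    title: str, abstract: str, subjects: dict[str, str], top_k: int = 3
) -> list[str]:
    """Return up to top_k subject ids, best match first, dropping zero-score subjects."""
    query = Counter(tokenize(title + " " + abstract))
    scored = [
        (cosine(query, Counter(tokenize(desc))), sid)
        for sid, desc in subjects.items()
    ]
    # Sort by descending score, breaking ties by id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [sid for score, sid in scored[:top_k] if score > 0]
```

Swapping `cosine` over token counts for a similarity over dense embeddings would keep the same ranking interface while capturing meaning beyond shared words.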
Note 1: While the TIBKAT annotations, created by human subject experts, serve as a valuable reference, they are not an exhaustive representation of all possible GND subjects. As is common in NLP, human annotation can be prone to issues such as annotator fatigue, and the absence of AI assistance limits the consistency of tagging. The goal of this task is to streamline the subject tagging process, helping subject matter experts with LLM-based AI tools that improve both efficiency and accuracy in their workflows.
Note 2: We will accept submissions in the form of code repositories, which will be evaluated based on documentation, ease of deployment, and user experience.
Given subtasks 1 and 2 above, the LLMs4Subjects shared task does not limit the possibilities for how participants choose to train their systems. For example, participants may choose to first fine-tune their systems on the GND for both subtasks and then align them with TIBKAT annotations for downstream tasks, or they may opt to directly align LLMs with the TIBKAT dataset for subtask 1 or 2. Should agentic workflows be considered? Could Retrieval-Augmented Generation (RAG) techniques be applied? These are open-ended questions that participants will explore through their design and methodology preferences and via their submitted systems.
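For participants considering a RAG-style design, the following sketch shows one way the generation step might be framed: retrieved candidate subjects are placed into a prompt, and the model's answer is filtered so that only subjects from the candidate list survive. The prompt wording, the function names, and the one-subject-per-line answer format are assumptions for illustration, and the call to an actual LLM is deliberately left out.

```python
def build_prompt(
    title: str, abstract: str, candidates: list[str], max_subjects: int = 5
) -> str:
    """Assemble a prompt that asks the model to choose among retrieved candidates."""
    lines = [
        "You are assisting a librarian with subject indexing.",
        f"Select up to {max_subjects} subjects from the candidate list "
        "that best describe the record.",
        "Answer with one subject per line, copied exactly from the list.",
        "",
        f"Title: {title}",
        f"Abstract: {abstract}",
        "",
        "Candidate subjects:",
    ]
    lines += [f"- {c}" for c in candidates]
    return "\n".join(lines)


def parse_answer(answer: str, candidates: list[str]) -> list[str]:
    """Keep only answer lines that exactly match a candidate, without duplicates."""
    allowed = set(candidates)
    picked: list[str] = []
    for line in answer.splitlines():
        name = line.strip().lstrip("-").strip()
        if name in allowed and name not in picked:
            picked.append(name)
    return picked
```

Restricting the output to retrieved candidates keeps every suggestion inside the taxonomy, at the cost of missing any relevant subject that retrieval failed to surface.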
We hope that the systems submitted for this shared task push the boundaries of what is possible and offer innovative insights into developing LLM-based solutions for subject tagging. While the research possibilities are broad, we have provided a list of 10 research questions (in no particular order of preference) to guide participants as they develop their systems. You can view these questions on the Research Questions page.
The key to a successful solution will be finding a balance between research-driven innovation and the creation of a practical, usable system.
Due to the open-ended nature of system training for LLMs4Subjects, we recommend participants read the planned Evaluations page.