The official shared task dataset is hosted on GitHub (https://github.com/jd-coderepos/llms4subjects/). The repository includes extensive documentation to help participants get familiar with the data. Below, we introduce the tasks for the LLMs4Subjects shared task.
The GND subjects taxonomy is officially used by TIB's subject matter experts to index records. We provide the taxonomy in human-readable form to the shared task participants. The aim is for participants to incorporate it as a comprehensive knowledge base into their LLMs for subject tagging. The GND root subfolder in the shared task repository is an excellent place to get started. There, participants are provided with two versions of the GND taxonomy:
GND Subjects - TIB Core. A collection of GND subjects limited to TIB's core technical subjects. Download here: GND-subjects-tib-core
Full GND Subjects Collection. The entire GND subjects collection. Download here: GND-subjects-all
With these datasets, participants are expected to equip their systems with knowledge about the GND taxonomy, enabling the LLMs to effectively understand and utilize the taxonomy's subjects and their nuanced semantics. We do not limit the methodologies that participants can use to develop their LLM-based solutions and encourage creativity, empiricism, and practicality in the developed solutions.
The TIBKAT root subfolder in the shared task repository is another key starting point. Participants are provided with two versions of the TIBKAT data:
TIBKAT Core Subjects. A collection of TIBKAT records, each annotated with at least one of TIB's core technical subjects. Download here: TIBKAT-tib-core-subjects-dataset
Full TIBKAT Collection. The full TIBKAT dataset for training. Download here: TIBKAT-all-subjects-dataset
Participants should align the subject tagging capability of their systems with the annotations provided in TIBKAT. The system should recommend GND subjects based on the semantic relationship between the subjects and the title and abstract of a technical record. While the TIBKAT annotations, created by human subject experts, serve as a valuable reference, they are not an exhaustive representation of all possible GND subjects. As is common in NLP, human annotation is prone to issues such as annotator fatigue, and without AI assistance the consistency of tagging is limited. The goal of this task is to streamline the subject tagging process by supporting subject matter experts with LLM-based AI tools that improve both efficiency and accuracy in their workflows.
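To make the input and output of such a system concrete, here is a minimal baseline sketch, not part of the official task materials. It ranks candidate subjects by bag-of-words cosine similarity to the title and abstract; an LLM-based system would replace this lexical score with a learned notion of semantic relatedness. The function names, the subject identifiers, and the dictionary layout are all assumptions made for illustration and do not reflect the actual GND or TIBKAT file formats.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_subjects(title, abstract, subjects, top_k=3):
    """Rank subjects by lexical similarity to the record's title and abstract.

    `subjects` maps a hypothetical subject id to its label or description.
    Subjects with zero similarity are dropped from the result.
    """
    record_vec = Counter(tokenize(title + " " + abstract))
    scored = [
        (cosine(record_vec, Counter(tokenize(text))), subject_id)
        for subject_id, text in subjects.items()
    ]
    # Highest score first; ties are broken by subject id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [subject_id for score, subject_id in scored[:top_k] if score > 0]
```

Calling `recommend_subjects(title, abstract, subjects)` returns a ranked list of subject ids, which is the general shape of output a subject tagging system would be expected to produce for each record.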
Task 3 offers a creative divergence from LLM development, focusing on UI/UX design for subject tagging. Participants are encouraged to design frontend interfaces that provide a seamless interaction between users (subject matter experts) and the developed systems. These interfaces should follow TIB’s branding guidelines (https://www.tib.eu/en/) and be intuitive, efficient, and user-friendly.
Key interface features might include:
Auto-complete suggestions and predictive text for faster subject term selection.
Dynamic filtering to help users quickly navigate large subject lists.
The ability to add new subject terms that may not be included in the broader GND.
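The three features listed above can be prototyped on the backend side with a small in-memory index. The sketch below is only an illustration under stated assumptions: the class name `SubjectIndex` and its methods are hypothetical, and a real interface would likely load the labels from the GND files and expose these operations through a web API.

```python
import bisect


class SubjectIndex:
    """In-memory index of subject labels (hypothetical helper for a tagging UI)."""

    def __init__(self, labels):
        # Store (casefolded key, original label) pairs in sorted order so that
        # prefix lookups can use binary search.
        self._entries = sorted((label.casefold(), label) for label in labels)
        self._keys = [key for key, _ in self._entries]

    def autocomplete(self, prefix, limit=5):
        """Return up to `limit` labels that start with `prefix`, ignoring case."""
        p = prefix.casefold()
        start = bisect.bisect_left(self._keys, p)
        matches = []
        for key, label in self._entries[start:]:
            if len(matches) >= limit or not key.startswith(p):
                break
            matches.append(label)
        return matches

    def filter(self, query):
        """Return all labels that contain `query` anywhere, ignoring case."""
        q = query.casefold()
        return [label for key, label in self._entries if q in key]

    def add(self, label):
        """Add a new subject term, keeping the index sorted; duplicates are ignored."""
        entry = (label.casefold(), label)
        pos = bisect.bisect_left(self._entries, entry)
        if pos < len(self._entries) and self._entries[pos] == entry:
            return
        self._entries.insert(pos, entry)
        self._keys.insert(pos, entry[0])
```

A frontend could call `autocomplete` on each keystroke, `filter` when the user narrows a long subject list, and `add` when an expert proposes a term that is missing from the loaded taxonomy.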
Participants should consider responsive design and accessibility to create an interface that works well across devices and for diverse user groups. Submissions should include a clear guide for deploying the application on localhost environments, as testing will be performed locally.
We will accept submissions in the form of code repositories, which will be evaluated based on documentation, ease of deployment, and user experience.
Given Tasks 1 and 2 above, the LLMs4Subjects shared task does not restrict how participants train their systems. For example, participants may choose to first fine-tune their systems on the GND and then align them with TIBKAT annotations, or they may opt to directly align LLMs with the TIBKAT dataset. Should agentic workflows be considered? Could Retrieval-Augmented Generation (RAG) techniques be applied? These are open-ended questions that participants will explore through their design and methodology choices and through their submitted systems.
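As one illustration of the RAG option mentioned above, the sketch below first retrieves a few candidate subjects for a record and then assembles a prompt that asks a model to choose among them. It is a simplified example under stated assumptions, not a recommended or official pipeline: the function names and the prompt wording are hypothetical, retrieval uses plain Jaccard token overlap rather than dense embeddings, and the call to an actual LLM is deliberately left out.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_candidates(record_text, subject_labels, k=2):
    """Retrieval step: pick the k labels with the highest Jaccard overlap."""
    record_tokens = tokens(record_text)

    def jaccard(label):
        label_tokens = tokens(label)
        union = record_tokens | label_tokens
        return len(record_tokens & label_tokens) / len(union) if union else 0.0

    # Highest overlap first; ties are broken alphabetically for determinism.
    return sorted(subject_labels, key=lambda s: (-jaccard(s), s))[:k]


def build_prompt(title, abstract, candidates):
    """Generation step input: a prompt restricted to the retrieved candidates."""
    lines = [
        "Select the GND subjects that best describe this record.",
        f"Title: {title}",
        f"Abstract: {abstract}",
        "Candidate subjects:",
    ]
    lines += [f"- {candidate}" for candidate in candidates]
    lines.append("Answer with a subset of the candidates.")
    return "\n".join(lines)
```

Restricting the model to retrieved candidates keeps its answers inside the taxonomy, at the cost of missing any relevant subject that the retrieval step fails to surface.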
We hope that the systems submitted for this shared task push the boundaries of what is possible and offer innovative insights into developing LLM-based solutions for subject tagging. While the research possibilities are broad, we have provided a list of 10 research questions (in no particular order of preference) to guide participants as they develop their systems. You can view these questions on the Research Questions page.
The key to a successful solution will be finding a balance between research-driven innovation and the creation of a practical, usable system.
Due to the open-ended nature of system training for LLMs4Subjects, we recommend participants read the planned Evaluations page.