To inspire and guide participants in developing their systems, we have compiled a list of research questions exploring key aspects of LLM-based subject tagging in a bilingual context. These questions are intended to stimulate innovation and help advance the field. Participants are encouraged to address one or more of them, but are also welcome to pursue their own research objectives.
Fine-Tuning LLMs for Bilingual Subject Tagging
How can LLMs be fine-tuned effectively to perform accurate subject tagging using the GND taxonomy in a bilingual setting?
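As a starting point, the following is a minimal sketch of one possible approach: fine-tuning a multilingual encoder (xlm-roberta-base, chosen only as an example) for multi-label classification over a fixed set of GND codes. The GND identifiers, toy bilingual records, and training settings are placeholders, not challenge defaults.

```python
# Minimal sketch: fine-tuning a multilingual encoder for multi-label GND tagging.
# The model name, GND identifiers, toy records, and training settings are placeholders.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

gnd_codes = ["gnd:4113276-2", "gnd:4193754-5"]          # placeholder GND identifiers
code2idx = {c: i for i, c in enumerate(gnd_codes)}

records = [  # toy bilingual records: title/abstract text plus assigned GND codes
    {"text": "Einführung in Datenbanksysteme", "codes": ["gnd:4113276-2"]},
    {"text": "An introduction to machine learning", "codes": ["gnd:4193754-5"]},
]

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base",
    num_labels=len(gnd_codes),
    problem_type="multi_label_classification",  # sigmoid outputs + BCE loss,
)                                               # so several codes per record are possible

class GndDataset(torch.utils.data.Dataset):
    """Wraps records as tokenized inputs with multi-hot label vectors."""
    def __init__(self, records):
        self.records = records
    def __len__(self):
        return len(self.records)
    def __getitem__(self, i):
        rec = self.records[i]
        enc = tokenizer(rec["text"], truncation=True, padding="max_length",
                        max_length=128, return_tensors="pt")
        labels = torch.zeros(len(gnd_codes))            # float multi-hot vector
        for code in rec["codes"]:
            labels[code2idx[code]] = 1.0
        return {"input_ids": enc["input_ids"].squeeze(0),
                "attention_mask": enc["attention_mask"].squeeze(0),
                "labels": labels}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gnd-tagger", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=GndDataset(records),
)
trainer.train()
```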
Multilingual vs. Monolingual Models in Subject Tagging
How do multilingual pre-trained models compare to monolingual models in bilingual subject tagging tasks?
Mitigating Hallucinations and Overgeneralizations
How can we prevent LLMs from producing irrelevant or overly generalized subject tags?
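One simple, illustrative mitigation is to treat the GND vocabulary as a closed set and discard any generated tag that does not closely match a controlled label. The sketch below assumes a small placeholder vocabulary; the labels and the matching threshold are illustrative only.

```python
# Minimal sketch: post-filtering free-text LLM output against a controlled vocabulary
# so that only genuine GND labels survive. Vocabulary entries and tags are placeholders.
import difflib

gnd_vocabulary = {                      # placeholder: preferred label -> GND identifier
    "Maschinelles Lernen": "gnd:4193754-5",
    "Datenbanksystem": "gnd:4113276-2",
}

def filter_tags(generated_tags, vocabulary, cutoff=0.9):
    """Keep only tags that closely match a controlled GND label; drop the rest."""
    accepted = []
    for tag in generated_tags:
        match = difflib.get_close_matches(tag, list(vocabulary), n=1, cutoff=cutoff)
        if match:                        # close enough to a controlled label -> keep it
            accepted.append((match[0], vocabulary[match[0]]))
        # tags with no close match are treated as hallucinations and discarded
    return accepted

print(filter_tags(["Maschinelles Lernen", "Quantum Blockchain Pedagogy"], gnd_vocabulary))
# -> [('Maschinelles Lernen', 'gnd:4193754-5')]
```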
Integrating External Knowledge into LLMs
What methods are most effective for integrating external knowledge, such as the GND taxonomy, into LLMs to enhance subject tagging?
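For example, a retrieval-augmented setup might first retrieve candidate GND labels that are close to the record and then ask the LLM to choose only among those candidates. The sketch below uses a simple lexical retriever and placeholder labels; the actual LLM call is left out.

```python
# Minimal sketch: retrieval-augmented prompting. Candidate GND labels are retrieved by
# simple lexical similarity and injected into the prompt, so the model selects from the
# vocabulary instead of inventing terms. Labels and the record are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

gnd_labels = ["Maschinelles Lernen", "Datenbanksystem", "Informationskompetenz"]

def retrieve_candidates(record_text, labels, k=2):
    """Rank controlled labels by character n-gram similarity to the record text."""
    vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))
    matrix = vectorizer.fit_transform(labels + [record_text])
    sims = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    return [labels[i] for i in sims.argsort()[::-1][:k]]

record = "Maschinelles Lernen für die Optimierung von Datenbanksystemen"
candidates = retrieve_candidates(record, gnd_labels)
prompt = (
    "Assign subject tags to the record below, choosing only from these candidate "
    f"GND labels: {', '.join(candidates)}.\n\nRecord: {record}"
)
print(prompt)  # this prompt would then be sent to the LLM of choice
```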
Impact of Prompt Engineering on Subject Tagging
How does prompt engineering influence the performance of LLMs in subject tagging tasks?
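As an illustration of the kind of variation worth studying, the sketch below contrasts a zero-shot prompt with a few-shot prompt for the same bilingual record; the record text and example labels are placeholders.

```python
# Minimal sketch: two prompt variants for the same bilingual record, to study how
# prompt design alone changes tagging behaviour. Record text and labels are placeholders.
record = "Einführung in Datenbanksysteme / An introduction to database systems"

zero_shot = (
    "You are a subject indexer using the GND vocabulary. "
    f"List the GND subject labels for this record:\n{record}"
)

few_shot = (
    "You are a subject indexer using the GND vocabulary.\n"
    "Record: Grundlagen des maschinellen Lernens\n"
    "Labels: Maschinelles Lernen\n\n"
    f"Record: {record}\nLabels:"
)

for name, prompt in [("zero-shot", zero_shot), ("few-shot", few_shot)]:
    print(f"--- {name} ---\n{prompt}\n")  # send each variant to the same model and compare
```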
Handling Technical Jargon in LLMs
What challenges do LLMs face when processing technical jargon and domain-specific terminology, and how can these be mitigated?
Disambiguating Polysemous Terms with LLMs
How can LLMs be improved to disambiguate terms with multiple meanings in different technical contexts?
Effect of Training Data Size and Diversity
How do the size and diversity of the training data affect LLM performance in subject tagging?
Explainable AI for Subject Tagging
How can LLMs provide explanations or justifications for their subject tagging decisions to assist human experts?
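One illustrative direction is to prompt the model for machine-readable justifications that a subject expert can audit, for example a quoted evidence span per proposed tag. The record, labels, JSON schema, and model response in the sketch below are hypothetical.

```python
# Minimal sketch: prompting for machine-readable justifications so a subject expert can
# audit each proposed tag. Record, labels, JSON schema, and model response are hypothetical.
import json

record = "An introduction to database systems"
prompt = (
    "Assign GND subject labels to the record below. Return JSON of the form "
    '[{"label": ..., "evidence": "<quote from the record>", "reason": ...}].\n'
    f"Record: {record}"
)

# A hypothetical model response, parsed and displayed for expert review:
response = '[{"label": "Datenbanksystem", "evidence": "database systems", "reason": "core topic of the record"}]'
for item in json.loads(response):
    print(f"{item['label']}: quoted evidence '{item['evidence']}' ({item['reason']})")
```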
Impact of LLMs on Expert Workflow Efficiency
How does the use of LLM-based subject tagging tools affect the efficiency and accuracy of subject matter experts' workflows?