SCIDOCA 2025 Shared Task
SCIDOCA 2025 Shared Task Overview
This year’s shared task will focus on Citation Prediction, Discovery, and Placement within scientific documents. Participants will be challenged to develop models that can accurately predict relevant citations, discover masked citations, and identify the specific sentences in which citations should be inserted. The shared task is designed to assess models’ abilities to understand the intricate citation networks in scientific discourse while also exploring how well they handle domain-specific knowledge.
Motivation and Impact
The SCIDOCA 2025 Shared Task is designed to address the growing need for automated citation systems that assist researchers in managing the ever-expanding corpus of scientific literature. By improving citation discovery and placement, this task could lead to advancements in:
Efficient Literature Review: Helping researchers quickly find relevant work.
Improved Scientific Writing Tools: Automating citation insertion to enhance the drafting process.
Citation Network Analysis: Enabling better understanding of citation behaviors across scientific domains.
By focusing on these tasks, the shared task aims to advance the state of research in scientific document analysis and citation management.
Important Dates
November 8, 2024: Call for participation (data format finalized before distributing training and test data).
January 6, 2025 (TBA): Training data distributed.
February 3, 2025: Test input data distributed; registration closes.
February 10, 2025: System submission deadline (outputs + method summary).
February 12, 2025: Results and team rankings announced.
February 21, 2025: Technical paper submission deadline (LNAI publication option may be requested).
March 6, 2025: Notification of paper acceptance.
March 13, 2025: Camera-ready submission deadline.
Data Usage Rules Summary
No External Data Transmission: Systems must operate offline and cannot send any provided data (training or test) to external services or APIs.
No Human Intervention: Systems must function autonomously during test-time inference, with no manual adjustments or parameter tuning.
Restricted Use of Non-Organizer Data:
External citation-related datasets or services (e.g., CrossRef, PubMed) are prohibited.
General-purpose pretrained models (e.g., BERT) are allowed if unrelated to citations.
Citation-related pretrained models (e.g., SPECTER, Galactica) are prohibited.
Subtasks
Subtask 1: Citation Discovery
Predict relevant citations for a given paragraph without specifying the exact sentence.
Subtask 2: Masked Citation Prediction
Predict the correct citation for each masked citation slot in a paragraph.
Subtask 3: Citation Sentence Prediction
Identify the correct citation(s) for each sentence in a paragraph that requires a citation.
Subtask 1: Citation Discovery
Objective:
Predict relevant citations for a paragraph without specifying the exact sentence where the citation belongs.
Input:
Paragraph: A text passage from a scientific document, with its citation markers removed.
Candidate References: A list of potential references, which includes both the correct citations and distractors (irrelevant but plausible citations).
Example Input:
{
  "paragraph": "Recent advances in natural language processing have significantly improved the performance of models on various tasks such as machine translation and question answering.",
  "candidate_references": [
    "[Vaswani et al. 2017]",
    "[Devlin et al. 2019]",
    "[Brown et al. 2020]",
    "[Radford et al. 2018]"
  ]
}
Output:
Predicted Citations: A list of citations that are contextually relevant to the paragraph.
Example Output:
{
  "predicted_citations": [
    "[Vaswani et al. 2017]",
    "[Devlin et al. 2019]"
  ]
}
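To make the input/output format concrete, here is a minimal Python sketch of a trivial token-overlap baseline. The file name subtask1_input.json, the overlap heuristic, and the top_k cutoff are all illustrative assumptions, not part of the task definition; a real system would use the candidates' full bibliographic text and richer features.

import json

def predict_citations(example, top_k=2):
    """Rank candidate references by lexical overlap with the paragraph.
    Purely illustrative: it only demonstrates the expected I/O format."""
    paragraph_tokens = set(example["paragraph"].lower().split())
    scored = []
    for ref in example["candidate_references"]:
        ref_tokens = set(ref.strip("[]").lower().split())
        scored.append((len(paragraph_tokens & ref_tokens), ref))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return {"predicted_citations": [ref for _, ref in scored[:top_k]]}

# Hypothetical file name; the official distribution format may differ.
with open("subtask1_input.json") as f:
    print(json.dumps(predict_citations(json.load(f)), indent=2))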
Evaluation:
For a given paragraph i, calculate Precision, Recall, and F1-Score as follows:
Precision_i = TP_i / (TP_i + FP_i): the proportion of correctly predicted citations (TP) among all predicted citations (TP + FP). It reflects the relevance of the predicted citations.
Recall_i = TP_i / (TP_i + FN_i): the proportion of correctly predicted citations (TP) among all ground-truth citations (TP + FN). It reflects the completeness of the predictions.
F1_i = 2 · Precision_i · Recall_i / (Precision_i + Recall_i): the harmonic mean of Precision and Recall, balancing both measures.
Evaluation Across the Dataset: weight each paragraph's metrics by its number of ground-truth citations (GT_i); for example, the dataset-level F1 is Σ_i GT_i · F1_i / Σ_i GT_i.
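The following Python sketch implements this weighted evaluation, assuming predictions and ground truth are given as collections of citation keys; it is illustrative, not the official scorer.

def precision_recall_f1(predicted, gold):
    """Per-paragraph metrics over sets of citation keys."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def weighted_scores(pairs):
    """pairs: list of (predicted_citations, gold_citations) per paragraph.
    Each paragraph's metrics are weighted by its gold-citation count GT_i."""
    total_gt = sum(len(gold) for _, gold in pairs)
    totals = [0.0, 0.0, 0.0]
    for predicted, gold in pairs:
        for i, score in enumerate(precision_recall_f1(predicted, gold)):
            totals[i] += len(gold) * score
    return tuple(t / total_gt for t in totals)  # (precision, recall, f1)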
Subtask 2: Masked Citation Prediction
Objective: Participants will predict the correct citation for each masked citation slot within a paragraph where the citation has been removed.
Input:
Paragraph: A paragraph in which one or more citation slots have been masked (replaced by placeholders such as [MASK1], [MASK2]; a slot containing multiple adjacent citations may be split into sub-masks such as [MASK1a][MASK1b]).
Candidate References: A list of potential references, including both correct citations and distractors.
Example Input:
{
  "paragraph": "Transformer models like BERT [MASK1a][MASK1b] and GPT-3 [MASK2] have revolutionized natural language processing tasks. These models [MASK3] continue to set benchmarks across various domains.",
  "candidate_references": [
    "[Vaswani et al. 2017]",
    "[Devlin et al. 2019]",
    "[Brown et al. 2020]",
    "[Radford et al. 2018]"
  ]
}
Output:
Predicted Citations: A dictionary mapping each labeled mask to its corresponding citation.
Example Output:
{
  "predicted_citations": {
    "[MASK1a][MASK1b]": ["[Devlin et al. 2019]"],
    "[MASK2]": ["[Brown et al. 2020]"],
    "[MASK3]": ["[Radford et al. 2018]"]
  }
}
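A minimal Python sketch of how a system might read this format and fill each slot, treating runs of adjacent sub-masks as a single slot and scoring candidates by overlap with the surrounding context. The regex, the 50-character context window, and the scoring are illustrative assumptions only.

import re

def predict_masked(example):
    """Assign each mask slot the candidate whose tokens best overlap
    the text on either side of the slot."""
    paragraph = example["paragraph"]
    # Runs of adjacent masks such as [MASK1a][MASK1b] form one slot.
    slots = re.findall(r"(?:\[MASK\w+\])+", paragraph)
    predictions = {}
    for slot in slots:
        start = paragraph.index(slot)  # first occurrence; fine for a sketch
        context = set(
            paragraph[max(0, start - 50):start + len(slot) + 50].lower().split()
        )
        best = max(example["candidate_references"],
                   key=lambda ref: len(context & set(ref.strip("[]").lower().split())))
        predictions[slot] = [best]
    return {"predicted_citations": predictions}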
Subtask 3: Citation Sentence Prediction
Objective: Given a paragraph, participants will predict the correct citation(s) for each sentence that requires a citation; sentences that need no citation receive an empty prediction.
Input:
Paragraph: A multi-sentence paragraph, provided as a list of sentences, without any explicit citation markers.
Candidate References: A list of potential citations, including both correct citations and distractors.
Example Input:
{
  "paragraph": [
    "Transformer models have transformed the field of NLP.",
    "One of the most influential models is BERT.",
    "We will investigate the results of BERT models.",
    "GPT-3 has further pushed the boundaries of language modeling."
  ],
  "candidate_references": [
    "[Vaswani et al. 2017]",
    "[Devlin et al. 2019]",
    "[Brown et al. 2020]",
    "[Radford et al. 2018]"
  ]
}
Output:
Sentence Citations: A list mapping each sentence to its predicted citation(s); sentences that do not require a citation map to an empty list.
Example Output:
{
  "sentence_citations": [
    {
      "sentence": "Transformer models have transformed the field of NLP.",
      "predicted_citation": ["[Vaswani et al. 2017]"]
    },
    {
      "sentence": "One of the most influential models is BERT.",
      "predicted_citation": ["[Devlin et al. 2019]"]
    },
    {
      "sentence": "We will investigate the results of BERT models.",
      "predicted_citation": []
    },
    {
      "sentence": "GPT-3 has further pushed the boundaries of language modeling.",
      "predicted_citation": ["[Brown et al. 2020]"]
    }
  ]
}
Note: for the third sentence, either an empty prediction (no citation required) or [Devlin et al. 2019] may be judged correct.
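Finally, a minimal Python sketch that produces this structure, again using a token-overlap heuristic; the threshold that decides when a sentence receives no citation is an arbitrary assumption for illustration.

def predict_sentence_citations(example, threshold=1):
    """Predict at most one citation per sentence; return an empty list
    when no candidate shares at least `threshold` tokens with the sentence."""
    results = []
    for sentence in example["paragraph"]:
        tokens = set(sentence.lower().split())
        score, best = max(
            (len(tokens & set(ref.strip("[]").lower().split())), ref)
            for ref in example["candidate_references"]
        )
        results.append({
            "sentence": sentence,
            "predicted_citation": [best] if score >= threshold else []
        })
    return {"sentence_citations": results}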