As an associated event of KSE 2025, we are happy to announce the 5th Automated Legal Question Answering Competition (ALQAC 2025). ALQAC 2025 includes two tasks:
Legal Document Retrieval
Legal Question Answering
For the competition, we introduce a new Legal Question Answering dataset – a manually annotated dataset based on well-known statute laws in the Vietnamese language. Through the competition, we aim to develop a research community on legal support systems. While the data is in Vietnamese, we extend a warm invitation to international teams to join us in uncovering the potential of multilingual methods and models.
Task 1: Legal Document Retrieval
Task 1’s goal is to return the article(s) related to a given question. The article(s) are considered “relevant” to a question if and only if the question can be answered using them.
The training data is in JSON format as follows:
[
  {
    "question_id": "DS-101",
    "question_type": "Đúng/Sai",
    "text": "Cơ sở điện ảnh phát hành phim phải chịu trách nhiệm trước pháp luật về nội dung phim phát hành, đúng hay sai?",
    "relevant_articles": [
      {
        "law_id": "05/2022/QH15",
        "article_id": "15"
      }
    ]
  }
]
The test data of Task 1 is in JSON format as follows:
[
  {
    "question_id": "DS-1",
    "text": "Phim đã được Bộ Văn hóa, Thể thao và Du lịch, Ủy ban nhân dân cấp tỉnh cấp giấy phép phân loại phim sẽ có giá trị trên toàn quốc, đúng hay sai?"
  }
]
The system should retrieve all the relevant articles for each question. Please see the Submission Details section below for the format of the submissions.
Note that "relevant_articles" is the list of all articles relevant to the question.
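For illustration only, here is a minimal lexical retrieval baseline sketch in Python using the rank_bm25 package; the articles structure (a list of dicts with "law_id", "article_id", and "text" built from the released statute corpus) and the whitespace tokenizer are assumptions made for the sketch, not part of the official task.

# Minimal BM25 retrieval baseline (illustrative sketch, not an official method).
# Assumes `articles` is a list of dicts with "law_id", "article_id", and "text",
# built from the released statute corpus, whatever its exact format.
from rank_bm25 import BM25Okapi  # pip install rank-bm25

def tokenize(text: str) -> list:
    # Naive whitespace tokenization; a Vietnamese word segmenter would likely do better.
    return text.lower().split()

def build_index(articles: list) -> BM25Okapi:
    return BM25Okapi([tokenize(a["text"]) for a in articles])

def retrieve(bm25: BM25Okapi, articles: list, question_text: str, top_k: int = 1) -> list:
    # Score every article against the question and return the top_k as
    # {"law_id": ..., "article_id": ...} pairs, matching the submission format.
    scores = bm25.get_scores(tokenize(question_text))
    ranked = sorted(range(len(articles)), key=lambda i: scores[i], reverse=True)
    return [{"law_id": articles[i]["law_id"], "article_id": articles[i]["article_id"]}
            for i in ranked[:top_k]]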
The evaluation measures are precision, recall, and the F2-measure, defined as follows:
Precision_i = (the number of correctly retrieved articles for question i) / (the number of retrieved articles for question i)
Recall_i = (the number of correctly retrieved articles for question i) / (the number of relevant articles for question i)
F2_i = (5 × Precision_i × Recall_i) / (4 × Precision_i + Recall_i)
F2 = the average of F2_i over all questions
In addition to the above measures, ordinal information retrieval measures such as Mean Average Precision and R-precision may be used to discuss the characteristics of the submitted results. The macro-average F2-measure is the principal measure for Task 1.
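For concreteness, here is a minimal Python sketch of the macro-average F2 computation defined above, assuming the gold and predicted relevant articles of each question are represented as sets of (law_id, article_id) pairs:

# Macro-average F2 for Task 1 (sketch). `gold` and `pred` map each
# question_id to a set of (law_id, article_id) pairs.
def macro_f2(gold: dict, pred: dict) -> float:
    f2_scores = []
    for qid, gold_articles in gold.items():
        pred_articles = pred.get(qid, set())
        correct = len(gold_articles & pred_articles)
        precision = correct / len(pred_articles) if pred_articles else 0.0
        recall = correct / len(gold_articles) if gold_articles else 0.0
        # F2 weights recall more heavily than precision; 0 when nothing is correct.
        f2 = (5 * precision * recall) / (4 * precision + recall) if correct else 0.0
        f2_scores.append(f2)
    return sum(f2_scores) / len(f2_scores)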
Task 2: Legal Question Answering
Task 2’s goal is to answer a given legal question. In ALQAC 2025, there are three types of questions:
True/False questions (Câu hỏi Đúng/Sai). Here is a training example:
[
  {
    "question_id": "DS-101",
    "question_type": "Đúng/Sai",
    "text": "Cơ sở điện ảnh phát hành phim phải chịu trách nhiệm trước pháp luật về nội dung phim phát hành, đúng hay sai?",
    "relevant_articles": [
      {
        "law_id": "05/2022/QH15",
        "article_id": "15"
      }
    ],
    "answer": "Đúng"
  }
]
For the True/False questions, the answer must be "Đúng" or "Sai".
Multiple-choice questions (Câu hỏi trắc nghiệm). Here is a training example:
[
  {
    "question_id": "TN-102",
    "question_type": "Trắc nghiệm",
    "text": "Nam, nữ kết hôn với nhau phải từ đủ bao nhiêu tuổi trở lên?",
    "choices": {
      "A": "Nam từ đủ 20 tuổi trở lên, nữ từ đủ 18 tuổi trở lên.",
      "B": "Nam từ đủ 18 tuổi trở lên, nữ từ đủ 20 tuổi trở lên.",
      "C": "Nam từ đủ 21 tuổi trở lên, nữ từ đủ 19 tuổi trở lên.",
      "D": "Nam từ đủ 19 tuổi trở lên, nữ từ đủ 21 tuổi trở lên."
    },
    "relevant_articles": [
      {
        "law_id": "52/2014/QH13",
        "article_id": "8"
      }
    ],
    "answer": "A"
  }
]
For the multiple-choice questions, the answer must be "A", "B", "C" or "D".
Free-text questions (Câu hỏi tự luận). Here is a training example:
[
  {
    "question_id": "TL-103",
    "question_type": "Tự luận",
    "text": "Cơ quan nào có trách nhiệm thống nhất quản lý nhà nước về điện ảnh?",
    "relevant_articles": [
      {
        "law_id": "05/2022/QH15",
        "article_id": "45"
      }
    ],
    "answer": "Chính phủ"
  }
]
For the free-text questions, the answer is free-text and will be evaluated by human experts.
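As a convenience, the following Python sketch (illustrative, not official tooling) checks an answer string against the format rules above for each question type:

# Validate an answer string against its question type (sketch).
def is_valid_answer(question_type: str, answer: str) -> bool:
    if question_type == "Đúng/Sai":
        return answer in {"Đúng", "Sai"}
    if question_type == "Trắc nghiệm":
        return answer in {"A", "B", "C", "D"}
    if question_type == "Tự luận":
        # Free text: any non-empty answer; quality is judged by human experts.
        return bool(answer and answer.strip())
    return False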
The test data for Task 2, which includes the gold labels from Task 1, is provided in the following JSON format:
[
  {
    "question_id": "DS-1",
    "question_type": "Đúng/Sai",
    "text": "Phim đã được Bộ Văn hóa, Thể thao và Du lịch, Ủy ban nhân dân cấp tỉnh cấp giấy phép phân loại phim sẽ có giá trị trên toàn quốc, đúng hay sai?",
    "relevant_articles": [
      {
        "law_id": "05/2022/QH15",
        "article_id": "27"
      }
    ]
  }
]
The principal evaluation measure is accuracy:
Accuracy = (the number of correctly answered questions) / (the total number of questions)
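A minimal Python sketch of this computation, assuming gold and predicted answers are dictionaries keyed by question_id (exact string matching is used here for illustration; free-text answers are in fact judged by human experts):

# Accuracy for Task 2 (sketch). `gold` and `pred` map question_id to an answer string.
def accuracy(gold: dict, pred: dict) -> float:
    correct = sum(1 for qid, answer in gold.items() if pred.get(qid) == answer)
    return correct / len(gold)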
Note:
Once the submission deadline for Task 1 has passed, the gold labels for Task 1 will be released and may be used for Task 2.
Participants' submitted outputs will be published in a public GitHub repository so that legal and AI experts can refer to them for analysis. In cases of uncertainty, expert evaluation serves as the authoritative measure of the participants' system performance.
Participants are free to crawl Internet data as additional data; there are no limitations on the use of externally sourced data. (Not applicable in ALQAC 2025; see the Restrictions section below.)
Participants may use any large language models (LLMs) and pre-trained models. However, all participating teams are required to submit their source code for verification. (Not applicable in ALQAC 2025; see the Restrictions section below.)
Participants are responsible for ensuring that their result files adhere to the format requirements. The format should be as follows:
Task 1: Legal Document Retrieval
[
  {
    "question_id": "TN-2",
    "relevant_articles": [
      {
        "law_id": "05/2022/QH15",
        "article_id": "95"
      }
    ]
  },
  ...
]
Task 2: Legal Question Answering
[
  {
    "question_id": "TL-3",
    "answer": <the answer>
  },
  ...
]
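For reference, here is a minimal Python sketch that writes a prediction list in the required format; the output path is a placeholder, and ensure_ascii=False keeps the Vietnamese text human-readable:

import json

def write_submission(predictions: list, path: str) -> None:
    # `predictions` is the list of result objects for one task, e.g.
    # {"question_id": "TN-2", "relevant_articles": [...]} for Task 1 or
    # {"question_id": "TL-3", "answer": "..."} for Task 2.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(predictions, f, ensure_ascii=False, indent=2)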
Submission of Predictions: Participants must submit files containing their systems' predictions for each task via email. For each task, participants may submit a maximum of 3 files, corresponding to 3 different settings or methods for that task.
Submission of Source Code: Participants are required to submit the source code of their method.
Submission of Papers: Participants are required to submit a paper on their method and experimental results. Papers should conform to the standards set out on the KSE 2025 webpage (section Submission). At least one author of an accepted paper must present it at the ALQAC workshop of KSE 2025.
Inclusion in Proceedings: Papers authored by the task winners will be included in the main KSE 2025 proceedings, provided the ALQAC organizers confirm the papers' novelty after the review process.
In the spirit of fostering open research and reproducibility, participating teams are permitted to use any publicly available resources intended for the research community. This includes online legal databases such as vbpl.vn and open-weight large language models (LLMs) like LLaMA-3.
However, the following restrictions apply:
The use of closed or proprietary systems—such as ChatGPT, GPT-4, Claude, Gemini, or any other non-open models—is strictly prohibited.
To ensure fairness and accessibility, only open-weight models with fewer than 10 billion parameters are allowed. This encourages efficient, resource-conscious approaches and levels the playing field for teams with limited computational resources.
While online legal databases are permitted, the use of externally annotated datasets specifically created for legal question answering or legal entailment (e.g., labeled QA pairs or entailment examples) is not allowed.
Any results obtained in violation of these rules will be disregarded in the final team ranking.