5 – 7 November 2024
Automated Legal Question Answering Competition
(ALQAC 2024)
associated event of KSE 2024
held at the Eastin Hotel Kuala Lumpur, Malaysia
Overview
As an associated event of KSE 2024, we are happy to announce the 4th Automated Legal Question Answering Competition (ALQAC 2024). ALQAC 2024 includes two tasks:
Legal Document Retrieval
Legal Question Answering
For the competition, we introduce a new Legal Question Answering dataset: a manually annotated dataset based on well-known statute laws in the Vietnamese language. Through the competition, we aim to develop a research community on legal support systems. While the data is in Vietnamese, we warmly invite international teams to join us in exploring the potential of multilingual methods and models.
Tasks
Task 1: Legal Document Retrieval
Task 1’s goal is to return the article(s) that are relevant to a given question. Article(s) are considered “relevant” to a question if and only if the question can be answered using those article(s).
The training data is in JSON format as follows:
[
  {
    "question_id": "DS-101",
    "question_type": "Đúng/Sai",
    "text": "Cơ sở điện ảnh phát hành phim phải chịu trách nhiệm trước pháp luật về nội dung phim phát hành, đúng hay sai?",
    "relevant_articles": [
      {
        "law_id": "05/2022/QH15",
        "article_id": "15"
      }
    ]
  }
]
The test data is in JSON format as follows:
[
  {
    "question_id": "DS-1",
    "question_type": "Đúng/Sai",
    "text": "Phim đã được Bộ Văn hóa, Thể thao và Du lịch, Ủy ban nhân dân cấp tỉnh cấp giấy phép phân loại phim sẽ có giá trị trên toàn quốc, đúng hay sai?"
  }
]
The system should retrieve all the relevant articles; see the Submission Details section below for the format of the submissions.
Note that “relevant_articles” is the list of all articles relevant to the question.
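As an illustration, here is a minimal Python sketch of loading the training file and collecting the gold article sets per question (the file name "train.json" is an assumption, not an official name):

import json

# Load the Task 1 training data (file name is an assumption).
with open("train.json", encoding="utf-8") as f:
    questions = json.load(f)

# Collect the gold (law_id, article_id) pairs for each question.
gold_by_qid = {
    q["question_id"]: {(a["law_id"], a["article_id"]) for a in q["relevant_articles"]}
    for q in questions
}
print(len(gold_by_qid), "questions loaded")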
The evaluation measures are precision, recall, and the F2-measure, computed as follows:
$$\mathrm{Precision}_i = \frac{\text{the number of correctly retrieved articles of question } i}{\text{the number of retrieved articles of question } i}$$

$$\mathrm{Recall}_i = \frac{\text{the number of correctly retrieved articles of question } i}{\text{the number of relevant articles of question } i}$$

$$F2_i = \frac{5 \times \mathrm{Precision}_i \times \mathrm{Recall}_i}{4 \times \mathrm{Precision}_i + \mathrm{Recall}_i}$$

$$F2 = \frac{1}{N} \sum_{i=1}^{N} F2_i$$

where $N$ is the number of questions, i.e., $F2$ is the macro-average of the per-question $F2_i$.
In addition to the above evaluation measures, ordinal information retrieval measures such as Mean Average Precision and R-precision can be used for discussing the characteristics of the submission results. The macro-average F2-measure is the principal measure for Task 1.
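For concreteness, a minimal sketch of the macro-average F2 computation, assuming gold and predicted article sets keyed by question_id as in the loading sketch above (pred_by_qid would come from a participant's own retrieval system):

def f2_score(gold, predicted):
    # F2 for a single question; articles are (law_id, article_id) pairs.
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return (5 * precision * recall) / (4 * precision + recall)

def macro_f2(gold_by_qid, pred_by_qid):
    # Macro-average F2 over all questions (the principal Task 1 measure).
    scores = [f2_score(gold_by_qid[qid], pred_by_qid.get(qid, set()))
              for qid in gold_by_qid]
    return sum(scores) / len(scores)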
Task 2: Legal Question Answering
Given a legal question, the goal is to answer the question. In ALQAC 2024, there are three types of questions:
True/False questions (Câu hỏi Đúng/Sai). Here is a training example:
[
  {
    "question_id": "DS-101",
    "question_type": "Đúng/Sai",
    "text": "Cơ sở điện ảnh phát hành phim phải chịu trách nhiệm trước pháp luật về nội dung phim phát hành, đúng hay sai?",
    "relevant_articles": [
      {
        "law_id": "05/2022/QH15",
        "article_id": "15"
      }
    ],
    "answer": "Đúng"
  }
]
For True/False questions, the answer must be "Đúng" (True) or "Sai" (False).
Multiple-choice questions (Câu hỏi trắc nghiệm). Here is a training example:
[
  {
    "question_id": "TN-102",
    "question_type": "Trắc nghiệm",
    "text": "Nam, nữ kết hôn với nhau phải từ đủ bao nhiêu tuổi trở lên?",
    "choices": {
      "A": "Nam từ đủ 20 tuổi trở lên, nữ từ đủ 18 tuổi trở lên.",
      "B": "Nam từ đủ 18 tuổi trở lên, nữ từ đủ 20 tuổi trở lên.",
      "C": "Nam từ đủ 21 tuổi trở lên, nữ từ đủ 19 tuổi trở lên.",
      "D": "Nam từ đủ 19 tuổi trở lên, nữ từ đủ 21 tuổi trở lên."
    },
    "relevant_articles": [
      {
        "law_id": "52/2014/QH13",
        "article_id": "8"
      }
    ],
    "answer": "A"
  }
]
For multiple-choice questions, the answer must be "A", "B", "C", or "D".
Free-text questions (Câu hỏi tự luận). Here is a training example:
[
  {
    "question_id": "TL-103",
    "question_type": "Tự luận",
    "text": "Cơ quan nào có trách nhiệm thống nhất quản lý nhà nước về điện ảnh?",
    "relevant_articles": [
      {
        "law_id": "05/2022/QH15",
        "article_id": "45"
      }
    ],
    "answer": "Chính phủ"
  }
]
For free-text questions, the answer is free text and will be evaluated by human experts.
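Since each question type constrains the admissible answers, a small validation sketch (a hypothetical helper, not part of the official tooling) can catch format errors before submission:

def answer_is_valid(question_type, answer):
    # Check that an answer matches the format required by its question type.
    if question_type == "Đúng/Sai":
        return answer in ("Đúng", "Sai")
    if question_type == "Trắc nghiệm":
        return answer in ("A", "B", "C", "D")
    if question_type == "Tự luận":
        return isinstance(answer, str) and answer.strip() != ""
    return False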
The principal evaluation measure is accuracy:
$$\mathrm{Accuracy} = \frac{\text{the number of questions that were correctly answered}}{\text{the number of questions}}$$
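For the True/False and multiple-choice types, accuracy could be computed as in this sketch; exact string matching is a simplifying assumption here, and free-text answers, which are graded by human experts, are not covered:

def accuracy(gold_answers, predicted_answers):
    # gold_answers and predicted_answers map question_id -> answer string.
    correct = sum(1 for qid, gold in gold_answers.items()
                  if predicted_answers.get(qid) == gold)
    return correct / len(gold_answers)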
Note:
Once the submission deadline for Task 1 has passed, the gold labels for Task 1 will be made available and may be used for Task 2.
The outputs submitted by participants will be published in a public GitHub repository so that legal and AI experts can refer to them for analysis purposes. In cases of uncertainty, expert evaluation serves as the authoritative measure of a participating system's performance.
Participants are free to crawl Internet data as additional data; there are no limitations on the use of externally sourced data.
Participants may use any large language models (LLMs) and pre-trained models. However, all participating teams are required to submit their source code for verification.
Submission Details
Participants are responsible for ensuring that their result files adhere to the required formats, which are as follows:
Task 1: Legal Document Retrieval
[
  {
    "question_id": "TN-2",
    "relevant_articles": [
      {
        "law_id": "05/2022/QH15",
        "article_id": "95"
      }
    ]
  },
  ...
]
Task 2: Legal Question Answering
[
  {
    "question_id": "TL-3",
    "answer": <the answer>
  },
  ...
]
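As an illustration, here is a minimal sketch of serializing predictions into these formats (the helper and output file names are assumptions, not official tooling); ensure_ascii=False keeps the Vietnamese text human-readable:

import json

def write_task1_submission(pred_by_qid, path="task1_submission.json"):
    # pred_by_qid maps question_id -> iterable of (law_id, article_id) pairs.
    results = [
        {"question_id": qid,
         "relevant_articles": [{"law_id": law, "article_id": art}
                               for law, art in articles]}
        for qid, articles in pred_by_qid.items()
    ]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, ensure_ascii=False, indent=2)

def write_task2_submission(answers_by_qid, path="task2_submission.json"):
    # answers_by_qid maps question_id -> answer string.
    results = [{"question_id": qid, "answer": ans}
               for qid, ans in answers_by_qid.items()]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, ensure_ascii=False, indent=2)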
Submission of Predictions: Participants must submit the files containing their systems' predictions for each task via email. For each task, participants may submit at most 3 files, corresponding to 3 different settings or methods for that task.
Submission of Source Code: Participants are required to submit the source code of their method.
Submission of Papers: Participants are required to submit a paper describing their method and experimental results. Papers should conform to the standards set out on the KSE 2024 webpage (section Submission). At least one author of an accepted paper must present it at the ALQAC workshop of KSE 2024.
Inclusion in Proceedings: Papers authored by the task winners will be included in the main KSE 2024 proceedings, provided the ALQAC organizers confirm the papers' novelty after the review process.
Public Test Leaderboard
A leaderboard has been set up to allow participating teams to submit their system predictions on the public test. Note that the public test leaderboard serves only as a reference and may not accurately reflect the actual performance of participating systems; the final results will be validated on the private test set.
The designated website for the leaderboard can be accessed at https://eval.ai/web/challenges/challenge-page/2294/overview.