ALQAC 2026 introduces a single shared task on Vietnamese legal case understanding. Given a short case query derived from a Vietnamese court judgment, participating systems must predict whether the plaintiff or the defendant wins the case. In addition to the final prediction, systems are expected to retrieve supporting evidence from the case-content corpus and relevant legal provisions from the law corpus.
The task is designed to evaluate agentic legal AI systems that can combine case-level factual understanding, legal provision retrieval, evidence grounding, and outcome prediction. Instead of providing the full judgment directly to participants, the organizers expose segmented case content through official APIs. This setting encourages systems to actively search for relevant information, reason over retrieved evidence, and produce a verifiable prediction.
For each test instance, participants are given a short natural-language query describing the dispute. The query includes the main parties, the disputed legal relationship or asset, a brief summary of the plaintiff's claim and the defendant's position, and a question asking whether the plaintiff or the defendant is more likely to win.
Participants must build a system that:
1. Reads the provided case query.
2. Calls the official Case Content API to retrieve relevant case segments.
3. Retrieves relevant legal provisions from the provided law corpus.
4. Predicts the final outcome of the case.
5. Submits the predicted outcome together with supporting case evidence and legal provisions.
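The five steps above can be sketched as a small driver function. This is only an illustrative skeleton, not an official toolkit: `run_case`, `search_segments`, `retrieve_laws`, and `predict` are hypothetical names, and each team would supply its own implementations. Case evidence is left out of the returned entry because the submission example in this document does not show a field name for it.

```python
from typing import Callable, Dict, List


def run_case(
    case: Dict[str, str],
    search_segments: Callable[[str, str], List[dict]],
    retrieve_laws: Callable[[str], List[str]],
    predict: Callable[[str, List[dict], List[str]], str],
) -> dict:
    """Run steps 1-5 for one test case and return one submission entry."""
    query = case["case_query"]                          # step 1: read the query
    segments = search_segments(case["case_id"], query)  # step 2: Case Content API
    law_ids = retrieve_laws(query)                      # step 3: law corpus retrieval
    label = predict(query, segments, law_ids)           # step 4: outcome prediction
    if label not in ("A_WIN", "B_WIN"):
        raise ValueError(f"unexpected label: {label!r}")
    return {                                            # step 5: submission entry
        "case_id": case["case_id"],
        "prediction": label,
        "law_evidence": law_ids,
    }
```

Passing the three components in as callables keeps the loop independent of any particular retriever or model, which makes it easy to swap in stubs while testing.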
The competition contains only one task: Legal Case Outcome Prediction.
The expected prediction is binary:
- `A_WIN`: the plaintiff wins, meaning the court substantially accepts the plaintiff's main claim.
- `B_WIN`: the defendant wins, meaning the court substantially rejects the plaintiff's main claim or rules in favor of the defendant.
Legal case outcome prediction requires more than simple text classification. A strong system must understand the legal dispute, identify the claims of the parties, retrieve relevant factual evidence from the case record, retrieve applicable legal provisions, and reason about how the court is likely to resolve the dispute.
This task aims to encourage research on:
- Vietnamese legal judgment understanding.
- Retrieval-augmented legal reasoning.
- Agentic interaction with legal APIs.
- Evidence-grounded legal prediction.
- Vietnamese legal corpus retrieval.
- Transparent and verifiable legal AI systems.
The task setting reflects a realistic legal AI scenario: a system receives an initial case description, then must actively retrieve additional information before making a prediction.
Participants will work with two main resources.
*** Case Query Input ***
Each test case includes a short query generated from a Vietnamese court judgment. The query is intended to simulate the initial information given to a legal AI agent.
Example:
{
"case_id": "0001",
"case_query": "Ông Nguyễn Khắc Vũ H1 (nguyên đơn) và Chu Quang Nguyễn H2 (bị đơn) tranh chấp hợp đồng chuyển nhượng quyền sử dụng đất đối với một phần thửa 366. Nguyên đơn yêu cầu được công nhận hợp đồng chuyển nhượng cho diện tích nêu trên. Agent cần dự đoán nguyên đơn thắng kiện hay bị đơn thắng kiện?"
}
The query does not reveal the court's reasoning, the final decision, or the winner of the case.
1. Case Content Corpus
The case content is segmented into smaller chunks and hosted by the organizers. Participants do not receive the full raw judgments directly for the test set. Instead, they must retrieve case segments through the official Case Content API.
Case segments may contain information such as:
- Plaintiff's claims.
- Defendant's arguments.
- Statements from related parties.
- Case facts.
- Procedural information.
- Court reasoning.
- Final verdict.
Participants are expected to call the API to identify the most relevant case segments for each query.
2. Law Corpus
The law corpus is provided to all participating teams. Teams may build their own retrieval system over this corpus to identify relevant legal provisions.
Each legal provision may include fields such as:
{
"lawcorpus_id": "001abc",
"law_name": "Bộ luật Dân sự 2015",
"cited_law": "Điều 3 khoản 6 ..."
}
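As a starting point for retrieval over the law corpus, the sketch below ranks provisions by simple token overlap with the query. It assumes the corpus has been loaded as a list of dictionaries with the fields shown above; `tokenize` and `rank_provisions` are hypothetical helper names, and token overlap is only a baseline stand-in for whatever retriever a team actually builds.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it on non-word characters."""
    return {tok for tok in re.split(r"\W+", text.lower()) if tok}


def rank_provisions(query: str, corpus: List[Dict[str, str]], top_k: int = 5) -> List[str]:
    """Return up to top_k lawcorpus_id values ranked by token overlap with the query."""
    q_tokens = tokenize(query)
    scored = []
    for entry in corpus:
        text = f"{entry.get('law_name', '')} {entry.get('cited_law', '')}"
        overlap = len(q_tokens & tokenize(text))
        if overlap > 0:
            scored.append((overlap, entry["lawcorpus_id"]))
    # Highest overlap first; ties are broken by id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [law_id for _, law_id in scored[:top_k]]
```

A lexical baseline like this is cheap to run over the whole corpus and gives a reference point before moving to dense or hybrid retrieval.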
1. Search Case Segments
Endpoint:
POST /v1/case_segments/search
Authentication: API key sent as a Bearer token.
Request example:
{
"case_id": "0001",
"query": "hợp đồng chuyển nhượng quyền sử dụng đất thửa 366 nguyên đơn yêu cầu công nhận hợp đồng"
}
Response example:
{
"case_id": "0001",
"result":
{
"hash_id": "hashsdjfvhlisduhfliudh",
"text": "Ngày 04/5/2018 nguyên đơn có nhận chuyển nhượng của ông H2 diện tích 135m2..."
}
}
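The request and response shapes above can be wrapped in two small helpers. This is a sketch under stated assumptions: `BASE_URL` is a placeholder because the real host is not given here, the `Authorization: Bearer ...` header format is assumed from the authentication note, and `build_search_request` and `parse_search_response` are hypothetical names. The request is only constructed, not sent.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid"  # placeholder, not the real API host


def build_search_request(case_id: str, query: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/case_segments/search."""
    body = json.dumps({"case_id": case_id, "query": query}, ensure_ascii=False)
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/case_segments/search",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json; charset=utf-8",
            "Authorization": f"Bearer {api_key}",  # assumed header format
        },
    )


def parse_search_response(raw: bytes) -> dict:
    """Decode a response shaped like the example and return its `result` object."""
    payload = json.loads(raw.decode("utf-8"))
    return payload["result"]
```

Once the real host is known, the request could be sent with `urllib.request.urlopen(req)` and the returned bytes passed to `parse_search_response`.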
2. API Usage
Participants may call the Case Content API multiple times for each test case. The number of API calls may be used for analysis or as a tie-breaker. The organizers may also define a maximum number of API calls per case or per submission.
The public test input will be released as a JSON file.
Example:
[
{
"case_id": "0001",
"case_query": "Ông Nguyễn Khắc Vũ H1 (nguyên đơn) và Chu Quang Nguyễn H2 (bị đơn) tranh chấp hợp đồng chuyển nhượng quyền sử dụng đất đối với một phần thửa 366. Nguyên đơn yêu cầu được công nhận hợp đồng chuyển nhượng cho diện tích nêu trên. Agent cần dự đoán nguyên đơn thắng kiện hay bị đơn thắng kiện?"
},
{
"case_id": "0002",
"case_query": "..."
}
]
Field descriptions:
| Field | Type | Description |
|---|---|---|
| `case_id` | string | Public identifier of the test case. |
| `case_query` | string | Short natural-language description of the dispute and the prediction question. |
The input file will not include the gold verdict, court reasoning, court decision, or gold evidence.
1. Submission Format
Each team must submit a single JSON file named `submission.json`.
The submission must contain a list of predictions, one object per test case.
Example:
[
{
"case_id": "0001",
"prediction": "A_WIN",
"law_evidence": [
"001abc",
"001aac"
]
}
]
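a. Required Fields
Each object in the example above carries the following fields:
| Field | Type | Description |
|---|---|---|
| `case_id` | string | Public identifier of the test case, copied from the input file. |
| `prediction` | string | Predicted outcome label. |
| `law_evidence` | list of strings | Identifiers of the retrieved legal provisions from the law corpus. |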
b. Prediction Labels
The `prediction` field must be one of the following values:
- `A_WIN`: the plaintiff wins.
- `B_WIN`: the defendant wins.
If a case contains multiple claims, teams should focus on the main claim described in the `case_query`.
2. Evaluation
This section describes the official evaluation metrics for the Legal Case Outcome Prediction task.
The final score consists of three components:
- Outcome Accuracy: whether the system correctly predicts the winning side.
- Penalized Case Evidence Recall: whether the system retrieves the correct case-content evidence, with a penalty for excessive API calls.
- Micro Law Evidence F1: whether the system retrieves the correct legal provisions from the law corpus.
The final score is defined as:
2.1 Definitions
2.2 Outcome Accuracy
This component rewards systems that correctly predict whether the plaintiff or the defendant wins the case.
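A minimal sketch of this component, assuming it is plain accuracy over the test set: the share of cases whose predicted label equals the gold label. The function name `outcome_accuracy` and the choice to count a missing prediction as wrong are assumptions, not stated rules.

```python
from typing import Dict


def outcome_accuracy(pred: Dict[str, str], gold: Dict[str, str]) -> float:
    """Fraction of gold cases whose predicted label matches the gold label."""
    if not gold:
        return 0.0  # assumption: an empty test set scores zero
    # A case with no prediction is counted as incorrect (assumption).
    correct = sum(1 for cid, label in gold.items() if pred.get(cid) == label)
    return correct / len(gold)
```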
2.3 Case Evidence Recall
For each case, the system submits a set of case-content evidence segments.
This component measures how many gold case evidence segments are successfully retrieved by the system.
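Read as ordinary set recall, this is the number of gold segments found divided by the number of gold segments. The sketch below computes it for one case; `case_evidence_recall` is a hypothetical name, segments are assumed to be compared by identifier, and the value returned for a case with no gold evidence is an assumption because that edge case is not defined here. How per-case values are aggregated is likewise not specified.

```python
from typing import Iterable


def case_evidence_recall(submitted: Iterable[str], gold: Iterable[str]) -> float:
    """Fraction of gold segment ids that appear among the submitted ids."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0  # assumption: the edge case of no gold evidence is undefined
    return len(gold_set & set(submitted)) / len(gold_set)
```

For example, submitting three segments of which two are among four gold segments gives a recall of 0.5; the extra, non-gold segment does not lower recall.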
2.4 API Efficiency Penalty
The API call budget is case-dependent. Larger cases have more segments, so they are allowed more API calls.
2.5 Penalized Case Evidence Recall
2.6 Micro Law Evidence F1
Law evidence is evaluated using micro-averaged F1 over the full test set.
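Micro-averaging pools true positives, false positives, and false negatives across all cases before computing F1, so cases with many gold provisions weigh more than cases with few. The sketch below follows that standard definition, F1 = 2TP / (2TP + FP + FN); the function name `micro_law_f1` and the dictionary layout (case id mapped to a list of provision ids) are assumptions made for illustration.

```python
from typing import Dict, List


def micro_law_f1(pred: Dict[str, List[str]], gold: Dict[str, List[str]]) -> float:
    """Micro-averaged F1 over all cases, comparing provision ids as sets."""
    tp = fp = fn = 0
    for case_id, gold_ids in gold.items():
        gold_set = set(gold_ids)
        pred_set = set(pred.get(case_id, []))
        tp += len(pred_set & gold_set)  # provisions correctly retrieved
        fp += len(pred_set - gold_set)  # retrieved but not gold
        fn += len(gold_set - pred_set)  # gold but not retrieved
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Converting each list to a set means repeated identifiers in a submission are counted once.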
2.7 Full Final Score Formula
2.8 Example
2.9 Practical Interpretation
The metric is designed to reward systems that:
a. Predict the correct outcome.
b. Retrieve the correct case evidence.
c. Retrieve the relevant legal provisions.
d. Use the Case Content API efficiently.
3. Submission Validation
A submission may be rejected or partially ignored if it violates the required format.
The organizers may validate the following conditions:
- Every test case has exactly one submitted prediction.
- Every `case_id` exists in the official test set.
- There are no duplicate `case_id`s.
- The `prediction` value is either `A_WIN` or `B_WIN`.
- `law_evidence` is a list of valid legal provision identifiers from the law corpus.
- The JSON file is valid and can be parsed automatically.
Duplicate evidence items may be automatically deduplicated before scoring.
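Teams may want to run these checks locally before uploading. The sketch below mirrors the listed conditions; it is not the organizers' validator, and `validate_submission`, its parameters, and its message texts are hypothetical. It expects the set of official test `case_id` values and the set of law corpus identifiers to be supplied by the caller.

```python
import json
from typing import List, Set

VALID_LABELS = {"A_WIN", "B_WIN"}


def validate_submission(raw_json: str, test_ids: Set[str], law_ids: Set[str]) -> List[str]:
    """Return a list of problems found; an empty list means every check passed."""
    try:
        entries = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(entries, list):
        return ["top-level value must be a list"]

    problems: List[str] = []
    seen: Set[str] = set()
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: not a JSON object")
            continue
        cid = entry.get("case_id")
        if not isinstance(cid, str) or cid not in test_ids:
            problems.append(f"entry {i}: unknown case_id {cid!r}")
        elif cid in seen:
            problems.append(f"entry {i}: duplicate case_id {cid!r}")
        else:
            seen.add(cid)
        label = entry.get("prediction")
        if not isinstance(label, str) or label not in VALID_LABELS:
            problems.append(f"entry {i}: prediction must be A_WIN or B_WIN")
        laws = entry.get("law_evidence")
        if not isinstance(laws, list):
            problems.append(f"entry {i}: law_evidence must be a list")
        else:
            bad = [x for x in laws if not isinstance(x, str) or x not in law_ids]
            if bad:
                problems.append(f"entry {i}: unknown law ids {bad}")
    # Every official test case needs exactly one prediction.
    for cid in sorted(test_ids - seen):
        problems.append(f"missing prediction for case_id {cid!r}")
    return problems
```

Note that Python's `json` module rejects trailing commas, so a file with `"001aac",` before a closing bracket is reported as invalid JSON.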
*** Example Submission ***
[
{
"case_id": "0001",
"prediction": "A_WIN",
"law_evidence": [
"001abc",
"001aac"
]
},
{
"case_id": "0002",
"prediction": "B_WIN",
"law_evidence": [
"001abb",
"001aba"
]
}
]
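Generating the file with a JSON library rather than by hand avoids syntax slips such as trailing commas. The sketch below serializes a list of entries in the shape of the example above; `serialize_submission` and `write_submission` are hypothetical helper names, and removing duplicate law identifiers before writing is a precaution rather than a stated requirement.

```python
import json
from pathlib import Path
from typing import Dict, List


def serialize_submission(entries: List[Dict]) -> str:
    """Return the submission as a JSON string with one object per test case."""
    cleaned = []
    for entry in entries:
        cleaned.append({
            "case_id": entry["case_id"],
            "prediction": entry["prediction"],
            # dict.fromkeys drops duplicates while keeping first-seen order.
            "law_evidence": list(dict.fromkeys(entry["law_evidence"])),
        })
    return json.dumps(cleaned, ensure_ascii=False, indent=2)


def write_submission(entries: List[Dict], path: str = "submission.json") -> None:
    """Write the serialized submission to disk as UTF-8."""
    Path(path).write_text(serialize_submission(entries), encoding="utf-8")
```

Calling `write_submission(entries)` with the default path produces `submission.json` in the current working directory.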
1. System Requirements and Reproducibility
Participating teams are encouraged to submit a short technical report describing their method, including:
- Retrieval strategy for the Case Content API.
- Retrieval strategy for the law corpus.
- Reasoning and prediction method.
- Models and tools used.
- Prompting or agent design, if applicable.
- Post-processing and validation steps.
The organizers may request source code, configuration files, or logs for verification and reproducibility.
2. Notes for Participants
- The case query is not sufficient to solve the task reliably. Systems should retrieve additional case segments through the official API.
- The law corpus is provided separately and should be used to retrieve relevant legal provisions.
- The `explanation` field is optional and is not the principal scoring component, but it may be used for qualitative analysis.
- The official ranking is based on the final score defined above.
- Participants should ensure that their submission file strictly follows the required JSON format.