Dataset and Evaluation

Data for Task 1:

The training data consists of the following:

  1. Query_doc.txt -- contains the 50 queries, i.e., descriptions of legal situations

  2. Object_casedocs -- contains 2,914 prior case documents, some of which are relevant to the queries (to be used for Task 1)

  3. Object_statutes -- contains the titles and textual descriptions of 197 statutes (to be used for Task 2)

  4. Gold standard annotations for the relevant precedents

  5. A README.txt file that specifies the formats of the other files / folders

The test data will consist of additional queries. The same document collection (items 2 and 3 above) will be used for evaluation.

Training data: Download the train dataset for task 1 here

Test Data: Download the test dataset for both tasks here

Decryption key for the datasets can be obtained by registering for the task.

Data for Task 2:

The training data consists of 50 case documents. In each document, the sentences are labelled with one of the 7 categories mentioned here.

The test data will consist of additional documents which will be used for evaluation.

Training data: Download the train dataset for task 2 here

Test Data: Download the test dataset for both tasks here

Decryption key for the datasets can be obtained by registering for the task.

Evaluation plan:

Task 1

Standard information retrieval metrics such as Precision, Recall, Mean Average Precision (MAP), Discounted Cumulative Gain (DCG) and Mean Reciprocal Rank (MRR) will be used for evaluation in Task 1. This task is a continuation of AILA 2019, and more details can be found in last year's overview paper.

We will be using the official trec_eval toolkit for evaluation, and we encourage participants to use the same toolkit when validating their system outputs.
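For intuition about two of the ranking metrics above, here is a minimal Python sketch of MAP and MRR over per-query ranked lists. The query and document IDs are invented for illustration; the actual evaluation is performed with trec_eval against the gold standard annotations.

```python
# Toy sketch of MAP and MRR, two of the Task 1 ranking metrics.
# Query IDs and document IDs below are invented for illustration.

def average_precision(ranked_docs, relevant):
    """AP for one query: mean of precision@k at each relevant hit,
    divided by the total number of relevant documents."""
    hits, precisions = 0, []
    for k, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

def reciprocal_rank(ranked_docs, relevant):
    """1/rank of the first relevant document; 0 if none is retrieved."""
    for k, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            return 1.0 / k
    return 0.0

# One ranked list per query (system output) and the gold-standard relevant sets.
runs = {
    "Q1": ["C23", "C7", "C101"],
    "Q2": ["C42", "C23", "C9"],
}
qrels = {
    "Q1": {"C7"},
    "Q2": {"C42", "C9"},
}

map_score = sum(average_precision(runs[q], qrels[q]) for q in runs) / len(runs)
mrr_score = sum(reciprocal_rank(runs[q], qrels[q]) for q in runs) / len(runs)
print(f"MAP={map_score:.3f}  MRR={mrr_score:.3f}")
```

Note that trec_eval expects a qrels file and a run file in the standard TREC format, so system outputs should be written as one ranked document per line with query ID, document ID, rank and score.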

Task 2

Standard classification metrics, namely Precision, Recall and F1-Score, will be used for evaluation in Task 2.
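As a reference point for Task 2 scoring, the sketch below computes per-class precision, recall and F1 from parallel lists of gold and predicted sentence labels. The label names and the toy gold/predicted sequences are invented for illustration and are not the official category set or scorer.

```python
# Sketch of per-class precision, recall and F1 for sentence classification.
# Labels and gold/predicted sequences below are invented for illustration.
from collections import Counter

def prf_per_class(gold, pred):
    """Return {label: (precision, recall, f1)} over parallel label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for this class
        else:
            fp[p] += 1          # predicted class p, but gold was g
            fn[g] += 1          # missed an instance of class g
    scores = {}
    for label in set(gold) | set(pred):
        prec = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        rec = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[label] = (prec, rec, f1)
    return scores

gold = ["Facts", "Argument", "Facts", "Precedent"]
pred = ["Facts", "Facts", "Facts", "Precedent"]
for label, (p, r, f) in sorted(prf_per_class(gold, pred).items()):
    print(f"{label}: P={p:.2f} R={r:.2f} F1={f:.2f}")
```

The same quantities can be macro-averaged across the 7 categories to obtain a single summary score per system.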