An Evaluation Framework for Legal Document Summarization (LREC 2022)

Ankan Mullick*, Abhilash Nandy*, Manav Nitin Kapadnis*, Sohan Patnaik, R Raghav, Roshni Kar

*Contributed equally

Abstract

A law practitioner has to go through numerous lengthy legal case proceedings across various categories of practice, such as land disputes and corruption. Hence, it is important to summarize these documents and to ensure that each summary contains phrases whose intent matches the category of the case. To the best of our knowledge, no existing evaluation metric evaluates a summary based on its intent. We propose an automated intent-based summarization metric, which shows better agreement with human evaluation than automated metrics such as BLEU and ROUGE-L. We also curate a dataset by annotating intent phrases in legal documents, and show a proof of concept of how this system can be automated. Additionally, all the code and data needed to reproduce our results are available on GitHub.


Paper - https://arxiv.org/abs/2205.08478


Data, poster, code, demo, and video

Annotated Data of Intent Phrases in Legal Case Proceedings

Poster

GitHub repo can be found here.

Check out this demo.

Explanation video can be found here.


Cite as:

@InProceedings{mullick-EtAl:2022:LREC2,

author = {Mullick, Ankan and Nandy, Abhilash and Kapadnis, Manav and Patnaik, Sohan and R, Raghav and Kar, Roshni},

title = {An Evaluation Framework for Legal Document Summarization},

booktitle = {Proceedings of the Language Resources and Evaluation Conference},

month = {June},

year = {2022},

address = {Marseille, France},

publisher = {European Language Resources Association},

pages = {4747--4753},

abstract = {A law practitioner has to go through numerous lengthy legal case proceedings for their practices of various categories, such as land dispute, corruption, etc. Hence, it is important to summarize these documents, and ensure that summaries contain phrases with intent matching the category of the case. To the best of our knowledge, there is no evaluation metric that evaluates a summary based on its intent. We propose an automated intent-based summarization metric, which shows a better agreement with human evaluation as compared to other automated metrics like BLEU, ROUGE-L etc. in terms of human satisfaction. We also curate a dataset by annotating intent phrases in legal documents, and show a proof of concept as to how this system can be automated.},

url = {https://aclanthology.org/2022.lrec-1.508}

}