Mining and Learning in the Legal Domain

International Workshop on Mining and Learning in the Legal Domain (MLLD-2020)

In conjunction with the 20th IEEE International Conference on Data Mining, November 17-20, 2020, held virtually (originally Sorrento, Italy)

The increasing accessibility of large legal corpora and databases creates opportunities to develop data-driven techniques and more advanced tools that can support the work of researchers and practitioners in the legal domain. While recent advances in data mining and machine learning have found many applications in domains such as biomedicine, healthcare, and finance, there is still a noticeable gap in how widely state-of-the-art techniques are used in the legal domain. Closing this gap entails building a multi-disciplinary community that can benefit from the competencies of both law and computer science experts. The goal of this workshop is to bring researchers and practitioners of both disciplines together and provide an opportunity to share the latest research findings and innovative approaches in applying data analytics and machine learning to the legal domain.


We encourage submissions on novel mining- and learning-based solutions for analyzing legal data such as legislation, litigation, court cases, contracts, patents, non-disclosure agreements (NDAs), and bylaws. Topics of interest include, but are not limited to:

  • Applications of data mining techniques in the legal domain

    • case outcome prediction

    • classifying, clustering and identifying anomalies in big corpora of legal records

    • legal analytics

    • citation analysis for case law

    • eDiscovery

  • Applications of natural language processing and machine learning techniques for legal textual data

    • information extraction and entity extraction/resolution for legal document reviews

    • information retrieval and question answering in applications such as identifying relevant case law

    • summarization of legal documents

    • legal language modelling and legal document embedding and representation

    • recommender systems for legal applications

    • topic modelling in large amounts of legal documents

    • harnessing of deep learning approaches

  • Ethical issues in mining legal data

    • privacy and GDPR in legal analytics

    • bias in the applications of data mining

    • transparency in legal data mining

  • Training data for legal domain

    • acquisition, representation, indexing, storage, and management of legal data

    • automatic annotation and learning with human in the loop

    • data augmentation techniques for legal data

    • semi-supervised learning, domain adaptation, distant supervision and transfer learning

  • Emerging topics in the intersection of data mining and law

    • digital lawyers and legal machines

    • smart contracts

    • future of law practice in the age of AI


You are invited to submit your original research and application papers to the workshop. As per ICDM instructions, papers are limited to a maximum of 8 pages and must follow the IEEE ICDM format requirements. All accepted workshop papers will be published in the formal proceedings by the IEEE Computer Society Press. Each paper is reviewed by at least three reviewers from the program committee. Paper review is triple-blind. Manuscripts are to be submitted through CyberChair. Please forward your questions to the workshop organizers.


Important Dates

  • Paper submission due date: August 24, 2020

  • Notification of acceptance: September 18, 2020 (originally September 17, 2020)

  • Camera ready submission: September 27, 2020 (originally September 24, 2020)

  • MLLD-2020 Workshop: November 17, 2020


Program Committee

  • Sedef Akinli Kocak, Vector Institute

  • Kevin Ashley, University of Pittsburgh

  • Jack Conrad, Thomson Reuters

  • Rozita Dara, University of Guelph

  • Elham Dolatabadi, Vector Institute

  • Matthias Grabmair, Carnegie Mellon University

  • Diana Inkpen, University of Ottawa

  • Jaromír Šavelka, University of Pittsburgh

  • Frank Schilder, Thomson Reuters

  • Anne Tucker, Georgia State University

  • Bernhard Waltl, BMW AI Lab

  • Adam Wyner, Swansea University


Workshop Organizers

  • Shohreh Shaghaghian, Center for AI and Cognitive Computing at Thomson Reuters

  • Masoud Makrehchi, OntarioTech University and Center for AI and Cognitive Computing at Thomson Reuters


Accepted Papers

  • Unsupervised Extraction of Workplace Rights and Duties from Collective Bargaining Agreements, Elliott Ash, Jeff Jacobs, Bentley MacLeod, Suresh Naidu, and Dominik Stammbach

  • Immigration Document Classification and Automated Response Generation, Sourav Mukherjee, Tim Oates, Vince DiMascio, Huguens Jean, Rob Ares, David Widmark, and Jaclyn Harder

  • Tasks performed in the legal domain through Deep Learning: A bibliometric review (1987-2020), Alfredo Montelongo and Joao Luiz Becker

  • Using Unlabeled Data for US Supreme Court Case Classification, George Sanchez

  • Building knowledge graphs of homicide investigation chronologies, Ritika Pandey, P. Jeffrey Brantingham, Craig Uchida, and George Mohler


Keynote Speaker

Kevin D. Ashley

Professor of Law and Intelligent Systems, University of Pittsburgh

Kevin D. Ashley, Ph.D., is an expert on computer modeling of legal reasoning. He performs research in the field of legal text analytics and studies how to prepare law students for its effects on legal practice. In 2002 he was selected as a Fellow of the American Association for Artificial Intelligence “for significant contributions in computationally modeling case-based and analogical reasoning in law and practical ethics.” He is co-editor-in-chief of Artificial Intelligence and Law, the journal of record in the field of AI and Law, and has been a principal investigator on a number of National Science Foundation grants. He is the author of Modeling Legal Argument: Reasoning with Cases and Hypotheticals (MIT Press/Bradford Books, 1990) and of Artificial Intelligence and Legal Analytics: New Tools for Law Practice in the Digital Age (Cambridge University Press, 2017). In addition to his appointment at the School of Law, Professor Ashley is a senior scientist at the Learning Research and Development Center, an adjunct professor of computer science, and a faculty member of the Graduate Program in Intelligent Systems of the University of Pittsburgh. A former National Science Foundation Presidential Young Investigator, Professor Ashley has been a visiting scientist at the IBM Thomas J. Watson Research Center and a Senior Visiting Fellow at the Institute for Advanced Studies of the University of Bologna, where he is a frequent visiting professor of the Faculty of Law. He is also a former President of the International Association for Artificial Intelligence and Law.

Title: Progress in Teasing Meaning from Legal Texts

Abstract: Traditionally, the field of AI and Law has focused on representing legal knowledge in ways that computers can use to perform legal reasoning, or something like it, with legally intelligible results. Today, the research paradigm in AI and Law has largely shifted to applying new machine learning and natural language processing techniques to legal texts. Although for some time ML models have been predicting outcomes of cases directly from their texts, they cannot yet explain their predictions or support them with arguments. This talk surveys recent research efforts that tease elements of legal meaning, including legal concepts and argument structures, from legal texts. These methods can improve legal information retrieval and may eventually enable ML models to explain and justify their results.


*All session times are given in UTC±0.

UTC 15:00-16:00, November 17 (Tuesday), 2020

Live Keynote Session

  • Session Chair: Shohreh Shaghaghian

UTC 16:00-17:00, November 17 (Tuesday), 2020

Panel Session

  • Session Chair: Masoud Makrehchi