Mining and Learning in the Legal Domain

The 3rd International Workshop on Mining and Learning in the Legal Domain (MLLD-2023)


Keynote Talk


Ensuring Reliability in Legal LLM Applications


The usage of large language models (LLMs) has exploded over the past year, especially since OpenAI introduced ChatGPT in November 2022; however, ensuring accuracy and reliability in LLM-generated outputs remains a challenge, especially in knowledge-intensive domains such as law. In this talk, we will present some methods that we use to ensure reliability in CoCounsel, Casetext's GPT-4 based legal AI assistant, touching upon topics including retrieval-augmented generation for legal research, methods for reducing hallucinations, managing the cost vs. reliability tradeoff, evaluating LLMs in the legal context, and generating synthetic data from GPT-4.


Paper Presentations

Panel Discussion


LLM Meets LLP: An Industry Insider’s Experience over the Past 12 Months 


In this engaging session, our expert panelists from Thomson Reuters and Simmons & Simmons will explore the dynamic future of the legal industry in the wake of the advent of generative AI, shedding light on the profound impact it has on law firms and legal practitioners. We will delve into the opportunities and challenges AI presents within the legal sector, while also addressing the real-world hurdles in adopting AI solutions. This discussion should be of equal interest to industry professionals seeking valuable insights and to academics in pursuit of novel research questions uncovered in practical industry experience.




The increasing accessibility of legal corpora and databases creates opportunities to develop data-driven techniques and advanced tools that can facilitate a variety of tasks in the legal domain, such as legal search and research, legal document review and summary, legal contract drafting, and legal outcome prediction. Compared with other application domains, the legal domain is characterized by the huge scale of natural language text data, the high complexity of specialist knowledge, and the critical importance of ethical considerations. The MLLD workshop aims to bring together researchers and practitioners to share the latest research findings and innovative approaches in employing data mining, machine learning, information retrieval, and knowledge management techniques to transform the legal sector. Building on the success of previous editions, the third edition of the MLLD workshop will emphasize the exploration of new research opportunities brought about by recent rapid advances in Large Language Models and Generative AI. We encourage submissions that intersect computer science and law, from both academia and industry, embodying the interdisciplinary spirit of CIKM.


We encourage submissions on novel mining- and learning-based solutions in various aspects of legal data analysis, such as legislation, litigation, court cases, contracts, patents, NDAs and bylaws. Topics of interest include, but are not limited to:


All submissions must be in English, in PDF format, and in ACM two-column format (sigconf). The ACM LaTeX templates are available from the ACM website and the Overleaf online editor.

To enable double-blind reviewing, authors are required to take all reasonable measures to conceal their identity. The anonymous option of the acmart class must be used.  Furthermore, ACM copyright and permission information should be removed by using the nonacm option. Therefore, the first line of your main LaTeX document should be as follows.
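Combining the options named above (the sigconf format, anonymous, and nonacm) with the acmart class, that line would presumably read as in the sketch below. The order of the options inside the brackets is an assumption and does not affect the result.

```latex
% Assumed reconstruction from the options named in the text:
% sigconf   = ACM two-column conference format
% anonymous = hides author identity for double-blind review
% nonacm    = removes ACM copyright and permission information
\documentclass[sigconf,anonymous,nonacm]{acmart}
```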

To facilitate the exchange of ideas, this year we adopt a policy similar to that of ICTIR'23, which allows submissions of any length between 2 and 9 pages plus unrestricted space for references. Authors are expected to submit a paper whose length reflects what is needed for the content of the work, i.e., page length should be commensurate with contribution size. Reviewers will assess whether the contribution is appropriate for the given length. Consequently, there is no longer a distinction between long and short papers, nor any need to condense or enlarge medium-length ones. We will probably allocate more presentation time to longer papers during the workshop.

As in the previous editions of MLLD, each paper will be reviewed by at least 3 reviewers from the Program Committee. 

We are going to produce non-archival proceedings for this workshop, similar to IPA'20. Thus, authors can refine their accepted papers and submit them to formal conferences/journals after the workshop.

Submissions should be made electronically via EasyChair:

Important Dates


The CIKM-2023 conference will be held in person in Birmingham, UK. Therefore, it is expected that most (if not all) of the authors will present their accepted papers in person at this workshop. Some invited speakers and/or participants may have the flexibility to attend online.


Registration for the workshop is handled through the main conference, CIKM-2023.

CIKM will open registration in a few weeks. If you would like to express your interest and be notified when registration opens, please email Alina Petrova.

Programme Committee



If you have any questions regarding this workshop, please email

Previous Workshops