Legislation and Regulation
on the Semantic Web
June 17, 2019, Université de Montréal, Canada
in conjunction with ICAIL
Call for Participation
Description
This workshop will bring together academics, lawyers, government administrators, legal service providers, and corporate researchers to analyze and discuss a shared, linked corpus of legislation and regulation that is represented in a machine readable form, e.g. using Semantic Web/Knowledge Graph technologies. With such technologies, data is published in a standardized format such as Resource Description Framework (RDF), Web Ontology Language (OWL), Extensible Markup Language (XML), or others. Such formats facilitate automated querying, linking, and inferencing over the internet.
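As a rough illustration of what publishing in such a format can look like, the following minimal sketch writes two RDF statements about one article of a regulation in N-Triples syntax, using only the Python standard library. The article URI under example.org is hypothetical, and the choice of Dublin Core terms (`dcterms:isPartOf`, `dcterms:title`) is an assumption made for illustration, not a vocabulary prescribed by the workshop.

```python
# Sketch: describe one article of a regulation as RDF triples (N-Triples syntax).
# The example.org URI and the use of Dublin Core terms are illustrative assumptions.

DCTERMS = "http://purl.org/dc/terms/"
# ELI-style identifier for Regulation (EU) 2016/679 (treated here as an assumption).
REGULATION = "http://data.europa.eu/eli/reg/2016/679/oj"
# Hypothetical URI for Article 15; a real dataset would define its own scheme.
ARTICLE = "http://example.org/gdpr/art_15"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str, lang: str = "en") -> str:
    """Build a language-tagged literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = [
    (iri(ARTICLE), iri(DCTERMS + "isPartOf"), iri(REGULATION)),
    (iri(ARTICLE), iri(DCTERMS + "title"), literal("Right of access by the data subject")),
]

print(to_ntriples(triples))
```

Because every statement names its subject and object by URI, a second dataset can link to the same article simply by reusing the identifier.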
The goal of working with a shared corpus using the Semantic Web is that individual researchers can work with whatever tools, formats, approaches, or theories they wish, but still have a common basis for discussion, linking, and accessing resources. Over the course of the workshop (before, during and after), efforts will be made to integrate the contributions and demonstrate the utility of a prototype legislative/regulatory Semantic Web. The shared corpus will be a relatively small, coherent, multi-jurisdictional, related corpus of articles of international relevance as selected by the workshop organisers (see Shared Corpus below).
The workshop is very timely. There have been recent advances in natural language processing of legal texts, developments of legal XMLs (e.g., LegalDocML and LegalRuleML), creations of corpora of annotated texts (e.g., GDPR, financial regulations, smoking regulations), and formations of international online networks of collaborators (e.g., Better Rules, World Legal Information Institutes) which aim to create machine readable legislation and regulation. Yet there has not been a forum with the task of developing a shared, linked, diverse corpus that tests fundamental goals of the Semantic Web. This workshop addresses this gap in the research and development community.
Topics for the workshop:
Schedule
09:00-09:10
Introductory remarks on the meeting
09:10-10:00
Invited Speaker
Pierre-Paul Lemyre, Director of Business Development, Lexum.com
10:00-11:15 Session I
Wolfgang Alschner and Peter Zachar, University of Ottawa, Canada
Enrico Francesconi, Institute of Legal Information Theory and Techniques, Italy, and Publications Office of the European Union, Luxembourg
Mirna El Ghosh and Habib Abdulrab, National Institute of Applied Sciences of Rouen, France
11:15-11:30 Coffee Break
11:30-12:45 Session II
Guido Governatori, Silvano Colombo Tosatto, Gabriela Ferraro, Nick van Beest, Mohammad Badiul Islam, Ho-Pun Lam, Francesco Olivieri and Regis Riveret, The Commonwealth Scientific and Industrial Research Organisation and Data61, Australia
Adrian Kelly, Inland Revenue Department, New Zealand
Daniela Piana, University of Bologna, Italy
12:45-14:00 Lunch
14:00-14:50 Session III
Monica Palmirani, University of Bologna, Italy
Mark Stodder and Grant Vergottini, Xcential Legislative Technologies, United States
14:50-15:00 Break
15:00-15:45 Discussion
15:45-16:00 Break
16:00-17:00
Joint Meeting LegRegSW2019 and AIAS for Rules as Code panel with Jameson Dempsey, Michael Genesereth, and Roland Vogl. Venue - AIAS meeting room.
17:00-17:30
Discussion
Important Dates
Submission due date: May 20, 2019
Accept/Reject Notification: May 22, 2019
Suitability of submissions for the workshop will be evaluated by members of the organising and program committees.
Workshop: June 17, 2019
Paper Submission Format
We welcome submissions of titles and abstracts (up to 400 words) that address the workshop description and topics (above). The abstracts should indicate what machine readable resources have been created, discuss those resources (e.g. the tools and techniques that were applied to the corpus), and give links to them. See Workshop Format for further information about machine readable resources.
Submission link: easychair legregsw2019
The Organising Committee will review the abstracts for appropriateness for the workshop. Authors of accepted abstracts will be invited to present at the workshop.
Note that virtual presentation and participation are also welcome.
See the note below on Publication subsequent to the workshop.
Workshop Format
Prior to the workshop, participants should make their machine readable resources accessible on the web, queryable by others on the web, and (optimally) downloadable. Resources can be in formats such as XML, RDF, or JSON. By accessible and queryable on the web, it is meant that the files can be accessed by, e.g., an API, SPARQL, or a JSON query tool.
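To make "queryable by SPARQL" concrete, here is a minimal sketch of how another participant might prepare a query against a published resource, using only the Python standard library. The endpoint URL is hypothetical, and the query assumes the data uses Dublin Core terms and an ELI-style identifier for the regulation; a real endpoint may use different vocabularies. The request is built but not sent, so the sketch runs without network access.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; replace with the address of a real SPARQL service.
ENDPOINT = "https://example.org/sparql"

# Assumed vocabulary: Dublin Core terms linking articles to the regulation.
QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?article ?title WHERE {
  ?article dcterms:isPartOf <http://data.europa.eu/eli/reg/2016/679/oj> ;
           dcterms:title ?title .
}
LIMIT 10
"""


def build_request(endpoint: str, query: str) -> Request:
    """Build a GET request that passes the query in the `query` parameter
    and asks for results in the SPARQL JSON results format."""
    url = f"{endpoint}?{urlencode({'query': query.strip()})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.get_method())
print(request.full_url)
# To run the query against a live endpoint, pass `request` to
# urllib.request.urlopen and parse the JSON response body.
```

Passing the query as a URL parameter keeps the resource reachable from any HTTP client, which is the sense of "accessible and queryable on the web" used above.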
Practical Information
The workshop is held in conjunction with the International Conference on AI and Law (ICAIL), June 17-21, 2019, Montréal, Canada. See the ICAIL conference for registration, travel, accommodation, and program information.
Links to Tools and Resources
Social Media
Twitter: #legregsw
Publication
If there is sufficient interest among workshop participants, the organising committee will seek a publication outlet. The presentation and discussion at the workshop should provide an opportunity to further develop the work individually and collaboratively.
Organising Committee
Shared Corpus
The main purposes of the workshop are to create and discuss as much material in common as possible: 'coherent', machine-readable, web-accessible, linkable, and queryable analyses of the normative contents of a body of legal text.
To accomplish this, we have created a small, scoped, shared corpus. With such a small, scoped, shared corpus, participants can better appreciate the commonalities/differences of alternative analyses and tool application along with issues arising from integration and access. Such an approach is intended to contribute towards work of greater breadth, depth, completeness, and connectedness. Our choice of shared corpus is pragmatic and intended to serve the scientific purposes of the meeting.
To suit the main purposes of the workshop, the OC has selected a text based on the following guidelines:
Therefore, we propose to create a shared corpus:
While the focus is first and foremost on the shared primary corpus, discussions are welcome about the other levels, the selections, and alternatives.
Should participants wish to work with some other corpus rather than the primary (or secondary or tertiary) corpus, such discussion is also welcome, provided they explain why they cannot or do not want to work with the shared primary corpus.
The document we propose for the shared primary corpus is drawn from
Title:
Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance)
Document 32016R0679
The online source text is:
https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32016R0679
To create the corpus, we have selected chapters and articles from the source text. The intention is that the text facilitates discussion about analysis and linking rather than being representative of the whole legislation. A principled selection is impractical (other choices could be made) and ill-formed (the legislation per se comes as a whole); more to the point, a principled edit is not essential to the current technical agenda. Selections of any sort are likely to give rise to issues of analysis, processing, interpretation, or dependencies, which are valid topics of discussion at the meeting.
We have made the corpus available in two forms: the primary corpus in a text file (link below) or a roll-your-own corpus. To indicate our selections from the source text and to support creating a roll-your-own corpus, we have marked the table of contents with the following labels:
*0* out of scope - not to be included in the analysis of the text.
*1a* primary - the first portions of text to be analysed. This is a section of articles from Chapter 3 of the legislation.
*1* additional material that can be considered after *1a*.
*2* secondary - where the primary corpus is treated, the secondary corpus can be added.
*3* tertiary - where the secondary corpus is treated, the tertiary corpus can be added.
After working on *1a*, researchers are invited to add further sections, preferably following the order above. Definitions might be particularly important. Moreover, to create as much common analysis as possible, we advise a top-down methodology, working from the top of the document downwards.
The primary corpus text file contains only the *1a* components. Caveat: any errors, such as missing sections, are the responsibility of Adam Wyner.
One can use the Table of Contents with Indications to create the other corpora.
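The labelling scheme lends itself to a small script for rolling your own corpus. The sketch below assumes that each line of the annotated table of contents begins with one of the labels (`*0*`, `*1a*`, `*1*`, `*2*`, `*3*`) followed by a heading; the actual file layout may differ, and the sample headings are placeholders rather than entries from the real table of contents.

```python
import re

# Assumed line format: a label such as *1a* followed by a heading.
# "1a" is listed before "1" so the longer label is tried first.
MARKED_LINE = re.compile(r"^\*(0|1a|1|2|3)\*\s+(.*)$")

# Order in which levels are added, following the scheme described above.
LEVEL_ORDER = ["1a", "1", "2", "3"]


def select_headings(toc_lines, up_to="1a"):
    """Return headings labelled at or before `up_to` in LEVEL_ORDER.

    Lines labelled *0* (out of scope) and unlabelled lines are skipped.
    """
    wanted = set(LEVEL_ORDER[: LEVEL_ORDER.index(up_to) + 1])
    selected = []
    for line in toc_lines:
        match = MARKED_LINE.match(line.strip())
        if match and match.group(1) in wanted:
            selected.append(match.group(2))
    return selected


# Placeholder entries, not taken from the real table of contents.
sample_toc = [
    "*0* Hypothetical heading A",
    "*1a* Hypothetical heading B",
    "*1* Hypothetical heading C",
    "*2* Hypothetical heading D",
    "*3* Hypothetical heading E",
]

print(select_headings(sample_toc))             # primary corpus only
print(select_headings(sample_toc, up_to="2"))  # primary, additional, secondary
```

Selecting cumulatively by level mirrors the recommended order of work: start with `*1a*`, then widen the selection one level at a time.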