LegRegSW 2019

Legislation and Regulation

on the Semantic Web

June 17, 2019, Université de Montréal, Canada

in conjunction with ICAIL

Call for Participation

Schedule

Description

This workshop will bring together academics, lawyers, government administrators, legal service providers, and corporate researchers to analyze and discuss a shared, linked corpus of legislation and regulation that is represented in a machine-readable form, e.g. using Semantic Web/Knowledge Graph technologies. With such technologies, data is published in a standardized format such as Resource Description Framework (RDF), Web Ontology Language (OWL), Extensible Markup Language (XML), or others. Such formats facilitate automated querying, linking, and inferencing over the internet.

The goal of working with a shared corpus using the Semantic Web is that individual researchers can work with whatever tools, formats, approaches, or theories they wish, but still have a common basis for discussion, linking, and accessing resources. Over the course of the workshop (before, during, and after), efforts will be made to integrate the contributions and demonstrate the utility of a prototype legislative/regulatory Semantic Web. The shared corpus will be a relatively small, coherent, multi-jurisdictional corpus of related articles of international relevance, as selected by the workshop organisers (see Shared Corpus below).

The workshop is very timely. There have been recent advances in natural language processing of legal texts, the development of legal XML standards (e.g., LegalDocML and LegalRuleML), the creation of corpora of annotated texts (e.g., GDPR, financial regulations, smoking regulations), and the formation of international online networks of collaborators (e.g., Better Rules, World Legal Information Institutes) which aim to create machine-readable legislation and regulation. Yet there has not been a forum devoted to the task of developing a shared, linked, diverse corpus that tests fundamental goals of the Semantic Web. This workshop addresses this gap in the research and development community.

Topics for the workshop:

Ontologies
XML, JSON, RDF, etc.
Controlled natural language
Translation from natural language into a formal language
Methodologies and tools for translation
Interpretative issues of the texts
Querying
Inference
Practical matters related to how to promote and realise the LegRegSW vision
Abstract goals and position papers on machine-readable, linkable legislation/regulation
Specific suggestions and position papers, e.g. a high-level annotation language

Schedule

09:00-09:10

Introductory remarks on the meeting

09:10-10:00

Invited Speaker

Pierre-Paul Lemyre, Director of Business Development, Lexum.com

10:00-11:15 Session I

Wolfgang Alschner and Peter Zachar, University of Ottawa, Canada

Corpus Analysis of Canadian Federal Regulations

Enrico Francesconi, Institute of Legal Information Theory and Techniques, Italy, and Publications Office of the European Union, Luxembourg

Semantic Annotation for Reasoning on Legal Provisions and Norms: a GDPR regulation exercise

Mirna El Ghosh and Habib Abdulrab, National Institute of Applied Sciences of Rouen, France

Towards a Well-Founded Legal Domain Ontology for the Protection of Personal Data Grounded on the Unified Foundational Ontology

11:15-11:30 Coffee Break

11:30-12:45 Session II

Guido Governatori, Silvano Colombo Tosatto, Gabriela Ferraro, Nick van Beest, Mohammad Badiul Islam, Ho-Pun Lam, Francesco Olivieri and Regis Riveret, The Commonwealth Scientific and Industrial Research Organisation and Data61, Australia

Extracting Rules from Legal Texts: Challenges

Adrian Kelly, Inland Revenue Department, New Zealand

A Computer Language Model for Digitising New Zealand Statute Law

Daniela Piana, University of Bologna, Italy

Frames of Case Laws as Bottom Up Avenues toward Graph Technology in a Multiple Sourced Legal System

12:45-14:00 Lunch

14:00-14:50 Session III

Monica Palmirani, University of Bologna, Italy

Legal Knowledge Extraction from Legal Documents: GDPR Use-Case

Mark Stodder and Grant Vergottini, Xcential Legislative Technologies, United States

Xcential's Approach to Machine-readable Law

14:50-15:00 Break

15:00-15:45 Discussion

15:45-16:00 Break

16:00-17:00

Joint meeting of LegRegSW 2019 and AIAS for the Rules as Code panel with Jameson Dempsey, Michael Genesereth, and Roland Vogl. Venue: AIAS meeting room.

17:00-17:30

Discussion

Important Dates

Submission due date: ~~May 20, 2019~~

Accept/Reject Notification: ~~May 22, 2019~~

Suitability of submissions for the workshop will be evaluated by members of the organising and program committees.

Workshop: 17 June 2019

Paper Submission Format

We welcome submissions of titles and abstracts (up to 400 words) that address the workshop description and topics (above). Abstracts should indicate what machine-readable resources have been created, discuss those resources (e.g. the tools and techniques that were applied to the corpus), and provide links to them. See Workshop Format for further information about machine-readable resources.

Submission link: easychair legregsw2019

The Organising Committee will review the abstracts for appropriateness for the workshop. Authors of accepted abstracts will be invited to present at the workshop.

Note that virtual presentation and participation are also welcome.

See the note below on Publication subsequent to the workshop.

Workshop Format

Prior to the workshop, participants should make their machine-readable resources accessible on the web, queryable by others on the web, and (ideally) downloadable. Resources can be in formats such as XML, RDF, or JSON. By accessible and queryable on the web, we mean that the files can be accessed by, e.g., an API, a SPARQL endpoint, or a JSON query tool.
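
As an illustration of what "queryable on the web" could look like from a client's point of view, the following minimal Python sketch builds a SPARQL query request over HTTP GET without sending it. The endpoint URL and the query are hypothetical placeholders, not resources provided by the workshop; substitute the address of whichever endpoint hosts your data.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: replace with the SPARQL endpoint that hosts your resource.
ENDPOINT = "https://example.org/legregsw/sparql"

# Illustrative query: list a few triples from the published graph.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL query request using HTTP GET."""
    # The query text is passed URL-encoded in the `query` parameter.
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask for results in the SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
```

Against a real endpoint, the request could then be sent with `urllib.request.urlopen(request)` and the response body parsed with `json.load`. Other access routes (a REST API or a JSON query tool) would follow the same pattern of a documented URL plus a query.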

Practical Information

The workshop is held in conjunction with the International Conference on AI and Law (ICAIL), June 17-21, 2019, Montréal, Canada. See the ICAIL conference for registration, travel, accommodation, and program information.

Links to Tools and Resources

To Appear:

Social Media

Twitter: #legregsw

Publication

If there is sufficient interest by workshop participants, the organising committee will seek a publication outlet. The presentation and discussion at the workshop should provide an opportunity to further develop the work individually and collaboratively.

Organising Committee

Adam Wyner, Swansea University, Law and Computer Science
- First point of contact: a.z.wyner@swansea.ac.uk
Adeline Nazarenko, University of Paris 13, LIPN
Francois Levy, University of Paris 13, LIPN
Matt Lynch, Parliamentary Counsel Office, Scottish Government
Enrico Francesconi, Italian National Research Council (ITTIG-CNR), Publications Office of the EU
Monica Palmirani, University of Bologna, Legal Studies

Shared Corpus

The main purposes of the workshop are to create and to discuss as much material in common as possible, in the form of a 'coherent', machine-readable, web-accessible, linkable, and queryable analysis of the normative contents of a body of legal text.

To accomplish this, we have created a small, scoped, shared corpus. With such a small, scoped, shared corpus, participants can better appreciate the commonalities/differences of alternative analyses and tool application along with issues arising from integration and access. Such an approach is intended to contribute towards work of greater breadth, depth, completeness, and connectedness. Our choice of shared corpus is pragmatic and intended to serve the scientific purposes of the meeting.

To suit the main purposes of the workshop, the OC has selected a text based on the following guidelines:

a topic of widespread, common interest;
small and narrowly scoped;
extensible in terms of scale, related works, languages, and jurisdictions;
in a commonly used language;
normative content.

Therefore, we propose to create a shared corpus:

EU legislation on Data Protection
only selected passages
in English
articles that bear on norms
divided into what is out of scope and what is of primary, secondary, and tertiary interest.

While the focus is first and foremost on the shared primary corpus, discussions are welcome about the other levels, the selections, and alternatives.

Should participants wish to work with some other corpus rather than the primary (or secondary or tertiary) corpus, such discussion is also welcome, provided they explain why they cannot or do not want to work with the shared primary corpus.

The document we propose for the shared primary corpus is drawn from

Title:

Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance)

Document 32016R0679

The online source text is:

https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32016R0679

To create the corpus, we have selected chapters and articles from the source text. The intention is that the text facilitates discussion about analysis and linking rather than being representative of the whole legislation. A principled selection is impractical (other choices could be made) and ill-formed (the legislation per se comes as a whole); more to the point, a principled edit is not essential to the current technical agenda. Selections of any sort are likely to give rise to issues of analysis, processing, interpretation, or dependencies, which are valid topics of discussion at the meeting.

We have made the corpus available in two forms: the primary corpus in a text file (link below) and a roll-your-own corpus. To indicate our selections from the source text and to support the creation of a roll-your-own corpus, we have annotated the table of contents with the following markers:

*0* out of scope - not to be included in the analysis of the text.

*1a* primary - the first portions of text to be analysed. This is a section of articles from Chapter 3 of the legislation.

*1* additional material that can be considered after *1a*.

*2* secondary - once the primary corpus has been treated, the secondary corpus can be added.

*3* tertiary - once the secondary corpus has been treated, the tertiary corpus can be added.

After working on *1a*, researchers are invited to add further sections, preferably following the order above. Definitions might be particularly important. Moreover, so as to create as much common analysis as possible, we advise a top-down methodology, working from the top of the document downwards.

The primary corpus in a text file contains only the *1a* components. Caveat: any errors, such as missing sections, are the responsibility of Adam Wyner.

One can use the Table of Contents with Indications to create the other corpora.
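
One way to assemble a roll-your-own corpus from the annotated table of contents is sketched below in Python. The sketch assumes, hypothetically, that each table-of-contents line starts with its marker written with literal asterisks (for example `*1a*`) followed by the heading; the actual layout of the Table of Contents with Indications may differ. The sample entries are made up for illustration and do not reflect the organisers' actual assignments.

```python
import re

# Assumed line format (hypothetical): a marker such as *1a* followed by a heading.
MARKER = re.compile(r"^\*(0|1a|1|2|3)\*\s+(.*)$")

# Order in which tiers are added, following the order suggested above.
# Marker 0 (out of scope) is deliberately absent, so it is never selected.
TIER_ORDER = ["1a", "1", "2", "3"]


def select_entries(toc_lines, up_to):
    """Return the headings whose marker is at or before `up_to` in TIER_ORDER."""
    wanted = set(TIER_ORDER[: TIER_ORDER.index(up_to) + 1])
    selected = []
    for line in toc_lines:
        match = MARKER.match(line.strip())
        if match and match.group(1) in wanted:
            selected.append(match.group(2))
    return selected


# Made-up sample entries, for illustration only.
sample_toc = [
    "*0* Section A",
    "*1a* Section B",
    "*1* Section C",
    "*2* Section D",
    "*3* Section E",
]

print(select_entries(sample_toc, "1a"))
print(select_entries(sample_toc, "2"))
```

Selecting up to `"1a"` gives the primary corpus only; selecting up to `"2"` adds the additional and secondary material while preserving document order, in line with the top-down methodology advised above.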
