SEA
Search Engines Amsterdam - meetup, UvA - Science Park D
Fri 27 Oct, 16:00
Science Park 904, D1.115, Amsterdam
16:00
https://info.openlaws.com/openlaws-eu/
LEIBNIZ CENTRE FOR LAW Using network references, Text similarity, .. Future work, more users, different tasks
Typing the network links,
CASE SIMILARITY
Using Legal Data, Legal Citation Fields, ... Dutch legislative Portal, .. / https://www.government.nl
Radboud Winkels, working in the legal field; in terms of algorithms or new ways of searching, not much experience.
Legal Intelligence from the beginning,
Towards a legal recommender system
Meetup ILPS Search Engines.
http://www.leibnizcenter.org/projects
Articles in focus? in the network? in other network? .. centrality for relevance ranking ? importance of notes?
Anaphora
Ambiguous title abbreviations
Which version of the law?
Resolve to the work level ...
Include case law
All cases from official portal from ...
Immigration Law, 13.311 documents,
Recall / Precision - Check 25 random documents by hand.
Automatically located, Resolved?
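The hand check above comes down to comparing the references the extractor found with the ones a person marks in the same documents. A minimal sketch of that arithmetic, assuming references are compared as exact strings; the function name and the sample references are hypothetical and not taken from the talk:

```python
def precision_recall_f1(found, gold):
    """Compare automatically found references against hand-checked ones."""
    found, gold = set(found), set(gold)
    true_positives = len(found & gold)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-score (F1): harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sample: 4 references found, 4 marked by hand, 3 in common.
found = ["art. 6:162 BW", "art. 3:40 BW", "art. 8 EVRM", "art. 1 Vw"]
gold = ["art. 6:162 BW", "art. 3:40 BW", "art. 8 EVRM", "art. 29 Vw"]
precision, recall, f1 = precision_recall_f1(found, gold)
```

With only 25 documents checked by hand, the resulting figures are an estimate for the whole collection, not a measurement of it.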
Case > Articles, Law A / Law B .. applying law in practice.
LEGISLATION, CASELAW ACTS OF PARLIAMENT COMPANIES, PLACES, FIELDS OF LAW, LAWYERS, COURTS, MINISTRIES, ...
Case collections, procedural law,
looking for the pattern? interesting patterns?
Cluster of cases, Court Breaks a Line of Reasoning
Citations of ... Case over the years ...
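Counting citations of one case per year is a simple group-by once the references are resolved. A small sketch, assuming each resolved reference is stored as a (citing case, year of citing case, cited case) tuple; the record layout and all identifiers are made up for illustration:

```python
from collections import Counter

# Hypothetical citation records: (citing case id, year of citing case, cited case id).
citations = [
    ("case-A", 2001, "case-X"),
    ("case-B", 2001, "case-X"),
    ("case-C", 2003, "case-X"),
    ("case-D", 2003, "case-Y"),
    ("case-E", 2004, "case-X"),
]


def citations_per_year(citations, cited_id):
    """Count how often one case is cited, grouped by the year of the citing case."""
    counts = Counter(year for _, year, cited in citations if cited == cited_id)
    return dict(sorted(counts.items()))


timeline = citations_per_year(citations, "case-X")
```

A sudden drop or rise in such a timeline is the kind of pattern that could point to a court breaking a line of reasoning, but that interpretation still needs a lawyer.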
Dutch Supreme Court Cases - 1965-2008 as published in NJ
(>15000 cases)
>100.000 refs, F-score >98%
Official Dutch Portal - rechtspraak.nl
1999-2008: 89,179 cases
64.000 refs F-score >96%
Many sources of law available online
Not all links / metadata machine readable
Not inter-linked
Between types of documents
Cross-jurisdiction
Typically not task or user oriented
Classic key-word search not enough.
Human enrichment / crowd sourcing
Collections
Processes / lists
Annotations, highlighting
Translations
...
Natural Language Processing
Finding and resolving references / citations
Making metadata explicit
Analysing content of sources of law
= Domain and jurisdiction specific
Network Analysis.
Finding related sources of law
Determining importance / authority
Analysing trends
= Domain and jurisdiction independent (?)
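One way to make "importance / authority" concrete is a centrality score over the citation network. The notes do not say which measure was used; the sketch below picks PageRank by power iteration as an assumption, and the graph (cases citing articles and other cases) is invented for illustration:

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict {node: [cited nodes]}."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                # Pass this node's rank on to everything it cites.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node (cites nothing): spread its rank evenly.
                share = damping * rank[node] / len(nodes)
                for n in nodes:
                    new_rank[n] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: source document -> documents it cites.
graph = {
    "case-1": ["art-A", "art-B"],
    "case-2": ["art-A"],
    "case-3": ["art-A", "case-1"],
}
rank = pagerank(graph)
ranking = sorted(rank, key=rank.get, reverse=True)
```

Nothing in the computation depends on the legal domain or jurisdiction, which fits the "independent (?)" note; whether a high score actually means legal authority is a separate question.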
Lingua Characteristica Universalis - vision - to have a machine - calculate the output.
Leibniz calculator, there is no gold standard.
For people, a system, justice is not like medicine or other fields; in law there is no hypothesis testing,
How do you find, what is relevant, .. The correct result of a query?
A case with a different result might be better.
OpenLaws.eu and OpenLaws.com, legal recommender systems
Social enrichment
Case Law
Legislation
Combination.
Making sense of searching in case law, .. Journals publishing case law
16:40 Lawyers that Search - LegalIntelligence.com > search: onrechtmatige
How to leverage that? ( journals, laws, etc) build a graph of legal documents.
Get value from users?
A Hackathon - relevant related documents search interface /
Ongoing Research & Next steps, supervised ML for metadata, Law Area classification.
In a PoC, used 300k public + 550k commercial docs for training.
Classifier,
Garbage collecting,
Filtering internal content on legal relevancy aiming to identify subsets of relevant documents in collections.
Clusters found with SVD embeddings,..
SOPHISTICATED RECOGNITION > Legal Search > Applying Thesauri
Holiday - Christmas - Easter - Boxing day - search for Haviltex arrest / Privacy Directive
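The holiday example can be read as thesaurus-based query expansion: a search for the broader term also retrieves documents that only mention the narrower ones. A minimal sketch, assuming a plain dictionary as thesaurus; the structure and the function name are hypothetical and say nothing about how the product stores its thesauri:

```python
# Hypothetical thesaurus: broader term -> narrower terms, after the example in the notes.
THESAURUS = {
    "holiday": ["christmas", "easter", "boxing day"],
}


def expand_query(query, thesaurus=THESAURUS):
    """Return the query term plus its narrower thesaurus terms, without duplicates."""
    term = query.strip().lower()
    expanded = [term]
    for narrower in thesaurus.get(term, []):
        if narrower not in expanded:
            expanded.append(narrower)
    return expanded


terms = expand_query("Holiday")
# Combine into one boolean query string; the OR syntax is only illustrative.
boolean_query = " OR ".join(f'"{t}"' for t in terms)
```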
parse identifiers, Law references, BW / Art. / sequences ..
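A rough idea of what parsing "Art. ... BW" references can look like with a regular expression. This is a sketch under assumptions: the pattern covers only the simple "art. <number> <abbreviation>" shape, and the three-entry abbreviation table is hand-picked for the example; sequences of articles and ambiguous abbreviations are not handled:

```python
import re

# Small hand-picked abbreviation table; a real system would need a far larger one.
LAW_ABBREVIATIONS = {
    "BW": "Burgerlijk Wetboek",
    "Sr": "Wetboek van Strafrecht",
    "Awb": "Algemene wet bestuursrecht",
}

ARTICLE_REF = re.compile(
    r"\b[Aa]rt(?:ikel|\.)\s*"            # "art." or "artikel"
    r"(?P<article>\d+(?::\d+)?[a-z]?)"   # article number, e.g. 6:162 or 7a
    r"\s+(?P<law>" + "|".join(map(re.escape, LAW_ABBREVIATIONS)) + r")\b"
)


def find_article_refs(text):
    """Return (article number, full law title) pairs found in the text."""
    return [
        (m.group("article"), LAW_ABBREVIATIONS[m.group("law")])
        for m in ARTICLE_REF.finditer(text)
    ]


sample = "Zie art. 6:162 BW en artikel 3:40 BW; vgl. art. 310 Sr."
refs = find_article_refs(sample)
```

Mapping the abbreviation to the full title is a first step towards resolving a reference to the work level; picking the right version of the law would need the date of the citing document as well.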
Legal content integration, Public content, Commercial content, from all NL legal publishers,
Customer internal content
>9 million documents from >3000 sources
Market
Law firms (of all sizes)
Corporate law
Tax
Government
Courts
Advanced search optimised for legal professionals,
50 regexes applied? Interference
Complicated cases are hard to add
Might work query time, but not index time.
entities like law articles in text, .. expanded query is needed,
New approach: Chain of heuristic rules
Solr = Set of rules, Token Filters - to interpret tokens
Use a Finite State Transducer, to identify thesaurus terms in text
identify laws, synonyms, abbreviations, .. law names, thesaurus, sources, journals, labels, numbers,
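To illustrate spotting multi-word thesaurus terms in text, here is a sketch that uses a token-level trie with longest-match scanning. It is a stand-in for the idea only, not the Finite State Transducer implementation mentioned in the talk; tokenisation is a plain whitespace split and the term list is invented:

```python
END = None  # Key marking a complete term; tokens are strings, so it cannot collide.


def build_trie(terms):
    """Build a token-level trie from a list of (possibly multi-word) terms."""
    root = {}
    for term in terms:
        node = root
        for token in term.lower().split():
            node = node.setdefault(token, {})
        node[END] = term
    return root


def find_terms(text, trie):
    """Scan tokens left to right, keeping the longest term starting at each position."""
    tokens = text.lower().split()
    matches, i = [], 0
    while i < len(tokens):
        node, j, best = trie, i, None
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if END in node:
                best = (node[END], i, j)  # (term, start token, end token exclusive)
        if best:
            matches.append(best)
            i = best[2]
        else:
            i += 1
    return matches


trie = build_trie(["onrechtmatige daad", "burgerlijk wetboek", "wetboek"])
text = "schade door onrechtmatige daad volgens het burgerlijk wetboek"
matches = find_terms(text, trie)
```

Because matching walks the token stream once instead of applying dozens of independent patterns, the interference problem noted for the 50 regexes does not arise in the same way.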
Coded state machines
Checking patterns.
Good old regexes / ECLI / CELEX, EUROPEAN IDENTIFIER
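For identifiers with a fixed shape, a regex is still a reasonable tool. The sketch below covers ECLI only and assumes the usual five-part layout (ECLI, country code, court code, year, ordinal); it does not enforce the length limit on the ordinal and does not attempt CELEX numbers. The identifier in the sample text is made up:

```python
import re

ECLI = re.compile(
    r"\bECLI:(?P<country>[A-Z]{2})"          # two-letter country code
    r":(?P<court>[A-Z][A-Z0-9]{0,6})"        # court code, 1 to 7 characters
    r":(?P<year>\d{4})"                      # year of the decision
    r":(?P<ordinal>[A-Z0-9]+(?:\.[A-Z0-9]+)*)",  # ordinal, dots allowed inside
    re.IGNORECASE,
)


def find_eclis(text):
    """Return the parts of every ECLI-shaped identifier found in the text."""
    return [m.groupdict() for m in ECLI.finditer(text)]


# Hypothetical identifier, used only to exercise the pattern.
sample = "Zie ECLI:NL:HR:2008:AB1234, r.o. 3.2."
eclis = find_eclis(sample)
```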
Netherlands: 418.656, 467.802, 1.415.709, 45.045, 105.579, 61.965, 131.301