Date: Feb 10, 2017
Speaker: Michael Galkin, University of Bonn
Abstract
Michael will present a few projects developed in the Enterprise Information Systems (EIS) department at the University of Bonn, and give short demos of the following systems:
Michael will also talk about a large project they are working on:
Data Lakes have been proposed to provide scalable and flexible data discovery, analysis, and reporting, and to reduce the initial integration effort required before data sources can be used.
Although Data Lakes significantly reduce the cost of identifying, storing, cleansing, and integrating data and promote flexibility in data analysis, they make it harder to explore, query, and analyze the available raw data in a unified manner. In this work, we introduce an ontological architecture for Data Lakes (Semantic Data Lake, SDL) tailored for large-scale heterogeneous data.
SDL leverages mappings to a knowledge graph to provide a robust and universal query mechanism with SPARQL as a common query language.
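To make the idea of mapping raw sources to a common graph and querying them uniformly more concrete, here is a minimal sketch in plain Python. It is not the SDL implementation: the sources, the `ex:` vocabulary, the mapping format, and the helper names `to_triples` and `match` are all hypothetical, and the tiny pattern matcher only stands in for a SPARQL basic graph pattern. It also materializes triples in memory for simplicity, which a real system at this scale would likely avoid.

```python
import csv
import io
import json

# Hypothetical raw sources sitting in a lake: one CSV file, one JSON document.
CSV_SOURCE = "id,name,city\n1,Alice,Bonn\n2,Bob,Berlin\n"
JSON_SOURCE = '[{"person": "1", "employer": "OrgA"}, {"person": "2", "employer": "OrgB"}]'

# Hypothetical mappings: which field identifies the subject, and which
# predicate of a shared vocabulary each remaining field corresponds to.
CSV_MAPPING = {"subject": "id", "predicates": {"name": "ex:name", "city": "ex:city"}}
JSON_MAPPING = {"subject": "person", "predicates": {"employer": "ex:worksFor"}}


def to_triples(records, mapping):
    """Lift flat records into (subject, predicate, object) triples."""
    triples = []
    for record in records:
        subject = "ex:person/" + str(record[mapping["subject"]])
        for field, predicate in mapping["predicates"].items():
            triples.append((subject, predicate, record[field]))
    return triples


def match(patterns, triples):
    """Evaluate a conjunction of triple patterns; terms starting with '?' are variables."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in triples:
                candidate = dict(binding)
                ok = True
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Bind the variable, or check it agrees with an earlier binding.
                        if candidate.setdefault(term, value) != value:
                            ok = False
                            break
                    elif term != value:
                        ok = False
                        break
                if ok:
                    extended.append(candidate)
        solutions = extended
    return solutions


# Both sources end up in one graph, keyed by the same subject identifiers.
graph = to_triples(csv.DictReader(io.StringIO(CSV_SOURCE)), CSV_MAPPING)
graph += to_triples(json.loads(JSON_SOURCE), JSON_MAPPING)

# Roughly: SELECT ?name ?org WHERE { ?p ex:name ?name . ?p ex:worksFor ?org }
results = match([("?p", "ex:name", "?name"), ("?p", "ex:worksFor", "?org")], graph)
for row in results:
    print(row["?name"], row["?org"])
```

The point of the sketch is that the query joins data from the CSV and JSON sources without knowing their original formats; only the mappings are format-specific.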
SDL outperforms existing SPARQL query federation approaches on both synthetic and real datasets on the scale of hundreds of millions of triples.