Summary

To make better decisions in business analytics, organizations increasingly use external structured, semi-structured, and unstructured data in addition to their (mostly structured) internal data. Current Extract-Transform-Load (ETL) tools are not suitable for this “open world scenario” because they do not consider semantic issues in the integration process. They neither support processing semantic data nor create a semantic Data Warehouse (DW), a repository of semantically integrated data. Thus, a DW with semantic data sources in addition to traditional data sources requires more powerful techniques to define, integrate, transform, update, and load data semantically. These techniques are the research and technical solutions provided by SETL: A unified framework for semantic Extract-Transform-Load.

Our hypothesis is: “In the context of highly heterogeneous data integration processes, considering semantics as a first-class citizen facilitates the integration process by providing automation and lower entry barriers for non-technical users.”

The mapping between the objectives of this research and the solutions provided is illustrated in Figure 1: