Data Integration and Applications Workshop

DINA 2020

September 2020

Ghent, Belgium

Held in conjunction with ECML PKDD 2020

Data are at the core of research in many domains outside of computer science, such as healthcare, social sciences, and business. Combining diverse data sources can yield richer and more powerful data sets, but it is also a challenging research problem. Data integration poses a multitude of challenges: the data collections to be integrated may come from different sources; the collections may have been created by different groups; their characteristics can differ (different schemas, different data types); and the data may contain duplicates. Solving these challenges requires substantial effort, and domain experts need to be involved. In the era of Big Data, with organizations scaling up the volume of their data, it is critical to develop new and scalable approaches to deal with all these challenges. In addition, it is important to properly assess the quality of the source data as well as the integrated data, since the quality of the source data will drive the methods needed for its integration. Data integration is an important phase in the KDD process, as it creates new and enriched records from a multitude of sources. These new records can be queried, searched, mined, and analyzed to discover new, interesting, and useful patterns.

The goal of this workshop is to bring together computer scientists with researchers from other domains and practitioners from businesses and governments to present and discuss current research directions in multi-source data integration and its applications. The workshop will provide a forum for original, high-quality research papers on record linkage, data integration, population informatics, techniques for mining integrated data, and applications, as well as multidisciplinary research opportunities.

Topics of interest include (but are not limited to):

Data Integration Methodologies

  • Automating data cleaning and pre-processing
  • Algorithms and techniques for data integration
  • Entity resolution, record linkage, data matching, and duplicate detection
  • Big Data integration
  • Integrating complex data

Population Informatics

  • Algorithms and techniques for managing, processing, analyzing, and mining large population databases
  • Requirements analysis for population informatics
  • Models and algorithms for population informatics
  • Architectures and frameworks for population informatics
  • Research case studies of population informatics in health, demographics, ecology, economics, the social sciences, and other research domains

Evaluation, Quality and Privacy

  • Evaluation of linkage/matching/data integration methods
  • Data quality evaluation for source data and/or integrated data
  • Bias and quality of longitudinal data
  • Preserving privacy in data integration

Integrated Data and Longitudinal Data Applications

  • Mining and analysis of longitudinal data
  • Data integration applications for healthcare, social sciences, digital humanities, bioinformatics, genomics, etc.
  • Applications of population informatics in governments and businesses